Session 8: Policy & Interpretability

New member initiation

We introduced our newer members, Welsh and Kappie, to the basics of AI and AI alignment.

We spent the first half of the meeting (2pm to 3:45pm) researching and taking notes on topics that we found interesting. As per our established activity pipeline, the group was split into two streams: Stream 1 and Stream 2.
Here is what each of us researched:
- Welsh: AI watermarking
- Mogu: Mechanistic interpretability (overarching high-level methodology)
- Gump: Mechanistic interpretability (features, circuits, universality)
- Kappie: Data input control
- Qronox: Developmental interpretability
- Skrubz: Economic potential of AI

In the latter half of the meeting, each of us went up and presented what we learned from the research session.
During or after each presentation, we asked and answered questions for clarification and criticism. We learned a lot from these presentations!

[Figure: devinterp_tree. Overview tree of devinterp, by Qronox]