mogu_apartment

TensorTrust Brainstorming

  • We played TensorTrust again, joined the Discord server for the game, and tried out basic to less basic strategies in an attempt to understand how a prompt injection might work (1 hour)
  • We created a docs of strategies that ended up being successful, some more obscure than others.

Potential of AI & Introduction to AI Alignment

  • We went through week 1 + 2 in BlueDot Impact’s AI Safety Fundamentals Course (30 mins)
  • Potential of AI:
    • We deconstruct some arguments for why AGI might not be possible by making these ground assumptions:
      • AGI is something the world is trying to build
      • AGI doesn’t need to be sentient, just sufficiently intelligent
      • Intelligence is probably not a transcendental metaphysical property only humans have
      • Humans are probably not the most intelligent thing something in the universe can be
    • We talked about how AI could (and in many ways already can) have massive sway over our workspace, politics, and public opinion
  • Intro to Alignment:
    • We talked about outer and inner misalignment is
    • We also discussed some misconceptions that may make it difficult to talk about misalignment:
      • Anthropomorphisation: AI is not necessarily sentient, and talking about it with respect to human feelings is counterproductive.
      • The Terminator Effect: A rogue AI does not need a physical “body” to cause harm to our world, physical harm included.
      • Appeal to fiction: Probably the hardest part of beginner alignment discussion. The only thing we can base AGI expectations on is sci-fi, so people tend to have some unrealistic expectations.

Resources for learning/funding/career development

  • We shared some resources and opportunities for further career development.
    • Open Philanthropy funding for career development
    • Global Challenges Project X-risk workshops
    • AI safety training / research programmes like ARENA, SPAR, MATS
    • Rationality camps