PUMA Vault

Tag: inner-alignment

1 article with this tag.

  • Apr 13, 2026

    Risks from Learned Optimization in Advanced Machine Learning Systems

    • literature
    • ai-safety
    • inner-alignment
    • deceptive-alignment
    • mesa-optimization
    • learned-optimization
    • puma-core
    • agents
    • architecture
    • ethics
    • keshav
    • literature-note
    • llm
    • moc
    • red-teaming
    • research
    • safety

Created with Quartz v4.5.2 © 2026
