PUMA Vault

Tag: ai-safety

5 articles with this tag.

  • 01 May 2026

    RLHF and Constitutional AI — LLM Alignment Training Paradigms

    • permanent
    • rlhf
    • constitutional-ai
    • rlaif
    • ppo
    • reward-model
    • alignment
    • ai-safety
    • fine-tuning
    • sft
    • anthropic
    • openai
    • puma-core
    • research
    • training
    • ethics
    • llm
    • models
    • hitl
  • 14 Apr 2026

    The Coming Wave: Technology, Power, and the Twenty-first Century's Greatest Dilemma

    • literature
    • ai-policy
    • ai-safety
    • power-concentration
    • containment
    • ethics
    • societal-impact
    • synthetic-biology
    • proliferation
    • governance
    • puma-core
    • book
    • research
    • literature-note
    • keshav
    • moc
    • social-impact
  • 13 Apr 2026

    Risks from Learned Optimization in Advanced Machine Learning Systems

    • literature
    • ai-safety
    • inner-alignment
    • deceptive-alignment
    • mesa-optimization
    • learned-optimization
    • puma-core
    • agents
    • architecture
    • ethics
    • keshav
    • literature-note
    • llm
    • moc
    • red-teaming
    • research
    • safety
  • 13 Apr 2026

    Algorithmic Bias in AI-Assisted Project Management

    • permanent
    • algorithmic-bias
    • fairness
    • ethics
    • discrimination
    • project-management
    • ai-safety
    • puma-core
    • research
    • hitl
    • issue-triage
    • effort-estimation
    • accountability
    • social-impact
    • diversity
  • 13 Apr 2026

    Human-in-the-Loop (HITL) and Bounded Autonomy for AI Agents

    • permanent
    • hitl
    • human-in-the-loop
    • bounded-autonomy
    • ai-safety
    • ethics
    • oversight
    • control
    • human-ai-collaboration
    • puma-core
    • research
    • agents
    • architecture
    • project-management
    • accountability
    • alignment
