PUMA Vault
Tag: ai-safety
5 articles with this tag.
01 May 2026
RLHF and Constitutional AI — LLM Alignment Training Paradigms
permanent
rlhf
constitutional-ai
rlaif
ppo
reward-model
alignment
ai-safety
fine-tuning
sft
anthropic
openai
puma-core
research
training
ethics
llm
models
hitl
14 Apr 2026
The Coming Wave: Technology, Power, and the Twenty-first Century's Greatest Dilemma
literature
ai-policy
ai-safety
power-concentration
containment
ethics
societal-impact
synthetic-biology
proliferation
governance
puma-core
book
research
literature-note
keshav
moc
social-impact
13 Apr 2026
Risks from Learned Optimization in Advanced Machine Learning Systems
literature
ai-safety
inner-alignment
deceptive-alignment
mesa-optimization
learned-optimization
puma-core
agents
architecture
ethics
keshav
literature-note
llm
moc
red-teaming
research
safety
13 Apr 2026
Algorithmic Bias in AI-Assisted Project Management
permanent
algorithmic-bias
fairness
ethics
discrimination
project-management
ai-safety
puma-core
research
hitl
issue-triage
effort-estimation
accountability
social-impact
diversity
13 Apr 2026
Human-in-the-Loop (HITL) and Bounded Autonomy for AI Agents
permanent
hitl
human-in-the-loop
bounded-autonomy
ai-safety
ethics
oversight
control
human-ai-collaboration
puma-core
research
agents
architecture
project-management
accountability
alignment