PUMA Vault
Tag: ppo
1 article with this tag.
01 May 2026
RLHF and Constitutional AI — LLM Alignment Training Paradigms
permanent
rlhf
constitutional-ai
rlaif
ppo
reward-model
alignment
ai-safety
fine-tuning
sft
anthropic
openai
puma-core
research
training
ethics
llm
models
hitl