PUMA Vault

Tag: sft

1 article with this tag.

  • 01 May 2026

    RLHF and Constitutional AI — LLM Alignment Training Paradigms

    • permanent
    • rlhf
    • constitutional-ai
    • rlaif
    • ppo
    • reward-model
    • alignment
    • ai-safety
    • fine-tuning
    • sft
    • anthropic
    • openai
    • puma-core
    • research
    • training
    • ethics
    • llm
    • models
    • hitl

Created with Quartz v4.5.2 © 2026