PUMA Vault

Tag: agentbench

1 item with this tag.

  • Apr 13, 2026

    AgentBench: Evaluating LLMs as Agents

    • literature
    • llm-agents
    • benchmark
    • agentbench
    • evaluation
    • puma-core
    • agents
    • architecture
    • baseline
    • coding
    • critical-thinking
    • effort-estimation
    • gpt
    • keshav
    • literature-note
    • llm
    • local-llm
    • moc
    • multi-agent
    • ollama
    • open-source
    • project-management
    • react
    • red-teaming
    • research
    • story-points
    • triage
    • web
