PUMA Vault
Tag: agentbench
1 article with this tag.
13 Apr 2026
AgentBench: Evaluating LLMs as Agents
literature
llm-agents
benchmark
agentbench
evaluation
puma-core
agents
architecture
baseline
coding
critical-thinking
effort-estimation
gpt
keshav
literature-note
llm
local-llm
moc
multi-agent
ollama
open-source
project-management
react
red-teaming
research
story-points
triage
web