PUMA Vault
Tag: swe-bench
18 articles with this tag.
16 Apr 2026
📚 Bibliography: Spec-Driven Development (SDD) and Agentic Software Engineering
bibliography
apa7
references
supplement
verified
academic-writing
agentscope
agile
aiops
aiopslabs
architecture
autogen
automation
benchmark
chatdev
citation
devops
gaia
github
langgraph
llm
mas
masai
mcp
memgpt
metagpt
multi-agent
openhands
orchestration
planning
project-management
protocol
react
reasoning
reasoning-action
reinforcement-learning
research
root-cause-analysis
scheduling
security
swe-bench
tool-use
tree-of-thoughts
workflow
sdd
spec-driven-development
07 Apr 2026
Research Rabbit — Seed Expansion for PUMA Corpus
prompt
research-rabbit
citation-network
snowballing
puma
agile
automation
benchmark
bibliography
carbon-footprint
citation
clustering
effort-estimation
human-in-the-loop
hypothesis
issue-triage
literature-review
masai
moc
multi-agent
orchestration
pipeline
prisma
project-management
prompt-template
react
reasoning-action
research-methodology
research-tools
slr
story-points
sustainability
swe-bench
triage
workflow
zotero
07 Apr 2026
Connected Papers — Citation Map for PUMA
prompt
connected-papers
citation-map
scientific-mapping
puma
academic-writing
autogen
automation
benchmark
bibliography
citation
effort-estimation
human-in-the-loop
issue-triage
literature-review
llm
masai
metagpt
moc
multi-agent
observability
orchestration
pipeline
prisma
project-management
prompt-template
react
reasoning-action
research
research-tools
slr
story-points
swe-bench
thesis
tracing
triage
zotero
06 Apr 2026
MASAI: Modular Architecture for Software-Engineering AI Agents
literature
llm-agents
masai
software-engineering
modular
microsoft
puma-core
academic-writing
agents
architecture
benchmark
bibliography
citation
critical-thinking
effort-estimation
github
issue-triage
keshav
literature-note
llm
moc
multi-agent
orchestration
project-management
react
reading-method
reasoning
reasoning-action
red-teaming
research
scheduling
smart-pmo
story-points
swe-bench
triage
06 Apr 2026
SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
literature
benchmark
swe-bench
github
software-engineering
puma-core
academic-writing
bibliography
citation
dataset
jira
keshav
literature-note
llm
masai
moc
project-management
reading-method
research
tawos
06 Apr 2026
MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution
literature
multi-agent
github-issues
software-engineering
issue-resolution
puma-core
academic-writing
agents
benchmark
bibliography
citation
github
issue-triage
keshav
literature-note
llm
llm-agents
moc
pipeline
project-management
reading-method
research
swe-bench
triage
06 Apr 2026
OpenHands: An Open Platform for AI Software Developers as Generalist Agents
literature
open-source
platform
ai-agents
software-engineering
openhands
agents
architecture
benchmark
bibliography
citation
github
keshav
literature-note
llm
llm-agents
moc
multi-agent
project-management
reading-method
smart-pmo
swe-bench
tool-use
06 Apr 2026
AI Agent Specialization. RAG vs Fine-tuning — T3chFest 2026
video
agents
rag
fine-tuning
specialisation
academic-writing
benchmark
embeddings
issue-triage
jira
llm
metrics
moc
precision-recall
project-management
research
retrieval
swe-bench
triage
vector-db
video-note
06 Apr 2026
📦 Repository Notes — Analysed Reference Code & Architectural Patterns
tools
repositories
github
open-source
reference
code-patterns
puma
academic-writing
api
architecture
backlog
benchmark
crewai
dev-tools
devops
docker
effort-estimation
gpt
human-in-the-loop
ide
issue-triage
jira
langgraph
literature-note
llm
moc
multi-agent
nlp
openai
openhands
orchestration
planning
project-management
prompt-engineering
react
reasoning-action
research
rest-api
scrum
software-engineering
sprint
story-points
swe-bench
template
tool-use
triage
workflow
06 Apr 2026
📦 Repository Gist — Analysed Reference Gist
tools
repositories
github
open-source
reference
code-patterns
puma
academic-writing
api
architecture
backlog
benchmark
crewai
dev-tools
devops
docker
effort-estimation
gpt
human-in-the-loop
ide
issue-triage
jira
langgraph
literature-note
llm
moc
multi-agent
nlp
openai
openhands
orchestration
planning
project-management
prompt-engineering
react
reasoning-action
research
rest-api
scrum
software-engineering
sprint
story-points
swe-bench
template
tool-use
triage
workflow
06 Apr 2026
📊 Tools — Datasets, Benchmarks & Data Access
tools
datasets
benchmarks
jira-sr
tawos
swe-bench
puma
academic-writing
agile
baseline
benchmark
carbon-footprint
code-review
dataset
effort-estimation
evaluation
github
hypothesis
issue-triage
jira
literature-note
metrics
moc
nlp
non-parametric
observability
planning
precision-recall
python
research
research-methodology
smart-pmo
statistics
story-points
sustainability
tracing
triage
wilcoxon
workflow
06 Apr 2026
Multi-agent systems outperform single agents on PM tasks when agent roles match task specialisation boundaries
permanent-note
multi-agent
architecture
specialisation
pm-agents
masai
metagpt
artefact
backlog
baseline
benchmark
bmad
critical-thinking
dsr
effort-estimation
evaluation
gpt
human-in-the-loop
hypothesis
issue-triage
llm
moc
openai
orchestration
planning
project-management
react
reasoning
reasoning-action
red-teaming
research-methodology
sdd
smart-pmo
spec-driven-development
story-points
swe-bench
triage
06 Apr 2026
📖 Glossary Supplement v2 — Extended Technical Terms
glossary
reference
definitions
supplement
academic-writing
accuracy
agentscope
ai-tools
aiops
ami
anthropic
api
architecture
auc
autogen
backlog
baseline
benchmark
bias
chain-of-thought
claude
cot
crewai
data-formats
dataset
devops
drca
effect-size
effort-estimation
egi
embeddings
ethics
evaluation
few-shot
fine-tuning
github
gpt
human-in-the-loop
hypothesis
ict
iipr
issue-triage
jira
json
keshav
langchain
langgraph
literature-review
llama
llm
lm-studio
local-llm
mas
memory
meta
metrics
mistral
mit-ai-lab
multi-agent
nlp
non-parametric
ollama
one-shot
openai
orchestration
perplexity
pipeline
planning
precision-recall
project-management
prompting
python
rag
rcoif
react
reading-method
reasoning
reasoning-action
reinforcement-learning
research
research-methodology
rest-api
retrieval
security
slr
software-engineering
sprint
statistics
story-points
supervised-learning
swarm-intelligence
swe-bench
tawos
tool-use
transformer
tree-of-thoughts
triage
validity
vector-db
wilcoxon
wp316
zero-shot
06 Apr 2026
📚 Bibliography Supplement v3 — Verified New References
bibliography
apa7
references
supplement
verified
academic-writing
agentscope
agile
aiops
aiopslabs
architecture
autogen
automation
benchmark
chatdev
citation
devops
gaia
github
langgraph
llm
mas
masai
mcp
memgpt
metagpt
multi-agent
openhands
orchestration
planning
project-management
protocol
react
reasoning
reasoning-action
reinforcement-learning
research
root-cause-analysis
scheduling
security
swe-bench
tool-use
tree-of-thoughts
workflow
06 Apr 2026
📊 MOC — LLM Benchmarks, PM-AI Convergence & Agent Architectures (v2)
moc
llm-benchmarks
pm-ai
agents
architectures
academic-writing
agentscope
aiops
aiopslabs
architecture
autogen
baseline
benchmark
bibliography
chain-of-thought
chatdev
citation
cot
critical-thinking
devops
embeddings
evaluation
gaia
github
gpt
langgraph
llm
local-llm
mas
masai
mcp
memgpt
memory
metagpt
multi-agent
navigation
ollama
openai
openhands
orchestration
project-management
protocol
rag
react
reasoning
reasoning-action
red-teaming
research
retrieval
root-cause-analysis
security
smart-pmo
software-engineering
swarm-intelligence
swe-bench
tree-of-thoughts
vector-db
workflow
06 Apr 2026
🔧 MOC — PUMA Full Technology Stack
moc
tools
stack
technology
puma
ollama
claude
perplexity
zotero
academic-writing
ai-tools
anthropic
anythingllm
api
architecture
automation
benchmark
bibliography
chain-of-thought
cicd
citation
cot
crewai
dashboard
dataset
dev-tools
devops
docker
effort-estimation
elicit
embeddings
gemini
github
google
gpt
human-in-the-loop
ide
jira
knowledge-management
langgraph
literature-review
llama
llm
lm-studio
local-llm
meta
metrics
mistral
multi-agent
nlp
notebooklm
obsidian
openai
opencode
openhands
openspec
orchestration
pipeline
precision-recall
project-management
prompting
pydantic
python
rag
rcoif
react
reasoning
reasoning-action
research
research-tools
rest-api
retrieval
scrum
sdd
semantic-scholar
slr
software-engineering
spec-driven-development
story-points
swe-bench
tawos
template
vault
vector-db
01 Mar 2026
Chapter 2 — Literature Review (State of the Art)
project
chapter
literature-review
slr
state-of-art
academic-writing
agile
artefact
baseline
benchmark
carbon-footprint
chain-of-thought
code-review
codecarbon
cornell-notes
cot
dataset
devops
docker
dsr
effect-size
effort-estimation
evaluation
few-shot
finer
github
gpt
hypothesis
issue-triage
jira
llama
llm
local-llm
meta
metrics
mistral
moc
navigation
non-parametric
note-taking
ollama
openai
pec
pipeline
precision-recall
prisma
project-management
project-note
puma
python
reasoning
research
research-methodology
software-engineering
statistics
story-points
sustainability
swe-bench
tawos
triage
wilcoxon
workflow
zero-shot
01 Mar 2026
📖 MOC — Literature Review (SLR)
moc
literature
slr
prisma
state-of-art
academic-writing
agile
aiops
aiopslabs
benchmark
carbon-footprint
chain-of-thought
code-review
codecarbon
cot
dataset
devops
dsr
effort-estimation
few-shot
gaia
github
gpt
issue-triage
jira
literature-review
llm
mas
masai
metagpt
metrics
mit-ai-lab
multi-agent
openai
openhands
pipeline
precision-recall
project-management
react
reasoning
reasoning-action
research
research-methodology
scheduling
software-engineering
story-points
sustainability
swe-bench
tawos
triage
workflow
wp316