📊 MOC - LLM Benchmarks, PM-AI Convergence & Agent Architectures

Overview

Navigation map for all literature on LLM agents, PM-AI convergence, benchmarks, and agent architectures. Updated with verified references from bibliography supplement v3.


🎯 PUMA Core Papers (highest relevance)

| Paper | arXiv | Key Contribution | PUMA Stage |
| --- | --- | --- | --- |
| Cinkusz et al. 2025 (Cognitive Agents PM) | 2508.16678 | 5-task PM benchmark with LLMs; 3 limitations PUMA addresses | 1–3 |
| Assalaarachchi et al. 2026 (Agentic SPM) | 2601.16392 | "Agentic PM" vision; SPM 3.0 framework | 5 |
| Yao et al. 2022 (ReAct) | 2210.03629 | Base agent pattern: Thought-Action-Observation | 4–5 |
| Arora et al. 2024 (MASAI) | 2406.11638 | Modular sub-agent architecture for SE | 4–5 |
| Gao et al. 2024 (AgentScope) | 2402.14034 | Multi-agent platform; native Ollama support | 5 |
| Dorri et al. 2025 (Orchestrating HA Teams) | 2510.02557 | Manager Agent framework; GPT-5 vs GPT-4.1 | 5 |
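
The Thought-Action-Observation pattern noted for ReAct in the table above can be sketched as a small loop. This is only an illustration under stated assumptions, not the implementation from Yao et al.: `scripted_policy` stands in for an LLM call, and the `TOOLS` registry, the `Step` record, and the `run_react` function are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Step:
    """One Thought-Action-Observation cycle."""
    thought: str
    action: str
    argument: str
    observation: str


# Hypothetical tool registry; a real agent would wrap search, code execution, etc.
TOOLS: Dict[str, Callable[[str], str]] = {
    "count_tasks": lambda text: str(len([t for t in text.split(",") if t.strip()])),
}


def scripted_policy(question: str, history: List[Step]) -> Tuple[str, str, str]:
    """Stand-in for an LLM call (assumption): returns (thought, action, argument)."""
    if not history:
        return ("I need to count the tasks first.", "count_tasks", question)
    return ("I have the count, so I can answer.", "finish", history[-1].observation)


def run_react(
    question: str,
    policy: Callable[[str, List[Step]], Tuple[str, str, str]] = scripted_policy,
    max_steps: int = 5,
) -> Tuple[Optional[str], List[Step]]:
    """Alternate thought, action, and observation until 'finish' or the step limit."""
    history: List[Step] = []
    for _ in range(max_steps):
        thought, action, argument = policy(question, history)
        if action == "finish":
            return argument, history
        tool = TOOLS.get(action)
        # The observation is fed back to the policy on the next iteration.
        observation = tool(argument) if tool else f"unknown action: {action}"
        history.append(Step(thought, action, argument, observation))
    return None, history  # step budget exhausted without an answer


answer, trace = run_react("design, build, test")
```

Here `trace` records each cycle, so the reasoning path can be inspected after the run. Swapping `scripted_policy` for a function that prompts a model with the question and the accumulated history would give the usual LLM-driven variant.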

πŸ—οΈ Foundational LLM Architecture


🤖 Agent Architectures

Foundation Papers

Multi-Agent Frameworks

Software Engineering Agents

Architecture Surveys & Taxonomies

Self-Improvement & Reasoning

Memory & State

Collaboration & Coordination

Workflow & Orchestration

Security & Governance


📋 PM-AI Convergence

Agentic PM Vision

AIOps & DevOps

PM-AI & Human Collaboration


📊 Benchmarks & Evaluation


πŸ” AI Code Quality & Code Review

  • LN-CodeRabbit-2025-AIvsHumanCode - ⭐ CodeRabbit (2025): AI vs Human PRs - 470 PRs, 1.7× more issues, 2.74× security CVEs, 3× readability deficit, 7 mitigation strategies