LN — Theory Is All You Need (Felin & Holweg, 2024)
Full Reference: Felin, T., & Holweg, M. (2024). Theory is all you need: AI, human cognition, and causal reasoning. Strategy Science, 9(4), 346–371. https://doi.org/10.1287/stsc.2024.0189
Pass 1 — Bird’s Eye
Type: Theoretical / Position paper
Main Claim: Data-driven AI is fundamentally retrospective and cannot generate novel scientific theory. Human cognition is prospective and causal: it can formulate hypotheses that go beyond existing data. AI lacks this creative, theory-driven capability.
Relevance to PUMA: ⭐⭐⭐ Medium. Provides the key counterargument to Agentic Science claims; essential for PUMA Section 2 to present a balanced state of the art.
Pass 2 — Key Arguments
Core Thesis
- Human cognition = prospective, theory-first, causal: generates experiments designed to test specific causal mechanisms
- AI cognition = retrospective, data-first, correlational: finds patterns in existing data but cannot generate truly novel causal models
- “Theory is what allows scientists to know what data to collect and how to interpret it”
Evidence Used
- Historical science: major paradigm shifts (Copernicus, Darwin, Einstein) required theory-first thinking, not data-driven discovery
- Current AI: AlphaFold predicts structures but didn’t generate the theory of protein folding
- Large language models: generate plausible text from patterns but cannot distinguish causal from correlational relationships
Key Limitation of the Argument
- Written in 2024, so it predates several later results: the single-minus gluon result (2026), the First Proof project (2025-2026), and the Gemini IMO gold medal (2025)
- These subsequent results suggest some LLMs can go beyond interpolation in formal reasoning domains
Pass 3 — PUMA Re-implementation
PUMA’s position: Felin & Holweg are partially correct about AI limitations in open-ended scientific creativity. However, PUMA’s domain (PM task automation) does NOT require the kind of causal theory generation that Felin & Holweg argue AI cannot do.
PUMA’s claim is more modest:
- LLM agents can classify issue priority better than naive baselines using historical patterns (correlational is sufficient)
- LLM agents can estimate story points more accurately than team averages using similar historical issues
This is precisely the “retrospective pattern matching” that Felin & Holweg describe — and for PM automation, that is exactly what is needed.
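The two claims above amount to a baseline comparison, which can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not PUMA's actual pipeline: the toy issues, the function names, and the Jaccard-based nearest-neighbour estimator are all hypothetical stand-ins for whatever an LLM agent would do when it draws on similar historical issues.

```python
from collections import Counter
from statistics import mean


def majority_baseline_accuracy(train_labels, test_labels):
    """Naive classification baseline: always predict the most frequent label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return sum(label == majority for label in test_labels) / len(test_labels)


def mean_absolute_error(predicted, actual):
    """Average absolute gap between estimated and actual story points."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def jaccard(a, b):
    """Token-overlap similarity between two issue titles (0.0 to 1.0)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def estimate_from_similar(title, history, k=2):
    """Average the story points of the k most similar historical issues."""
    ranked = sorted(history, key=lambda item: jaccard(title, item[0]), reverse=True)
    return mean(points for _, points in ranked[:k])


# Hypothetical toy data: (issue title, story points).
history = [
    ("fix login bug", 2),
    ("fix logout bug", 2),
    ("build reporting dashboard", 8),
    ("build analytics dashboard", 13),
]
test_issues = [("fix signup bug", 2), ("build admin dashboard", 8)]
actual = [points for _, points in test_issues]

# Baseline: every new issue gets the team's historical average.
team_average = mean(points for _, points in history)
baseline_mae = mean_absolute_error([team_average] * len(actual), actual)

# Retrospective pattern matching: estimate from the most similar past issues.
similar_mae = mean_absolute_error(
    [estimate_from_similar(title, history) for title, _ in test_issues], actual
)

# Priority classification baseline on hypothetical labels.
baseline_acc = majority_baseline_accuracy(["low", "low", "high"], ["low", "high"])

print(f"team-average MAE: {baseline_mae}, similar-issue MAE: {similar_mae}")
print(f"majority-class accuracy to beat: {baseline_acc}")
```

Nothing in this sketch requires a causal model of why an issue is hard or urgent. It only reuses historical patterns, which is the sense in which correlational evidence suffices for these two tasks.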
MIT Critical Questions
- How can I use this? → Cite as the strongest counterargument in PUMA Section 2 (literature review); then explain why PUMA’s tasks don’t require the causal creativity Felin & Holweg require.
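- Does it really support its claims? → The argument is internally coherent, but its central empirical premise (that AI cannot go beyond retrospective pattern matching) is increasingly challenged by 2025-2026 empirical evidence.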
- What if Felin & Holweg are right for PM? → Even if AI cannot generate PM theories, it can still improve classification/estimation accuracy — which is PUMA’s actual claim.
APA7 Citation
Felin, T., & Holweg, M. (2024). Theory is all you need: AI, human cognition, and causal reasoning. Strategy Science, 9(4), 346–371. https://doi.org/10.1287/stsc.2024.0189
Related Notes
- PN-AI-Scientific-Knowledge-Generation — synthesised view including counter-arguments
- PN-PUMA-within-AgenticScience-Trajectory — why Level 2 suffices vs Felin & Holweg
- PN-Agentic-Science-Paradigm — contrasting view
- LN-Zhang-2025-AgenticScienceSurvey — contrasting survey
- PR-PUMA-Ch2-Ch3-Ch4-Ch5 — cited in §2 state of art
- PR-PUMA-Ch1-Introduction — §1.1 PUMA claim scoping