Chapter 1 — Introduction
Chapter Status
Overview
- ✅ PEC1 delivered 2026-03-08
- Chapter structure: Follows PUMA Project requirements for final thesis
Section Checklist
- 1.1 Context and justification (needs and problem)
- 1.2 Research objectives (general + specific, with SMART indicators)
- 1.3 Sustainability, ethical-social and diversity impact
- 1.4 Approach and method
- 1.5 Work plan (phases, tasks, milestones, risk management)
- 1.6 Brief summary of products
- 1.7 Brief description of other chapters
- 1.8 Declaration on generative AI use
1.1 Context — Key Arguments
Three quantified dimensions of the problem:
- Manual triage — Jira SR (Ortu et al., 2015) demonstrates inconsistent priority assignment across projects for equivalent issues → cognitive bias, measurable cost
- Inconsistent estimation — TAWOS (Tawosi et al., 2022) shows systematic over-estimation under sprint pressure, under-estimation for new projects
- Uniqueness Trap (Flyvbjerg & Gardner, 2023) — managers treat each project as unique, preventing statistical learning from historical patterns
Research gap (three limitations in existing work):
- Reproducibility: only 5/18 papers with published artefacts are executable (Angermeir et al., 2025 — ICSE 2026)
- No systematic comparison of prompting strategies for PM tasks
- No carbon footprint measurement per experimental condition
Three enabling conditions (why now):
- Open-weight local models (Llama 3.2, Mistral 7B via Ollama)
- Public verified datasets (Jira SR, TAWOS with stable DOIs)
- Carbon tracking tools (CodeCarbon, MIT licence)
1.2 Research Question & Hypotheses
Main RQ: Do statistically significant differences exist in automatic issue triage quality and effort estimation when using different LLMs and prompting strategies, evaluated on real ICT project datasets?
H1 (Triage): At least one configuration achieves F1-macro > heuristic baseline (p < 0.05, effect r ≥ 0.1) → EX-Hypotheses-H1-H2
H2 (Estimation): At least one few-shot configuration achieves MAE < historical-mean baseline (p < 0.05, delta ≥ 0.5 SP)
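The acceptance criteria in H1 and H2 can be sketched as a small decision rule. This is a minimal illustration, not the project's evaluation code: the function names are hypothetical, F1-macro is averaged over every label seen in either the gold or the predicted set (an assumption), and H2 is read as "the configuration's MAE is at least 0.5 SP below the MAE of a historical-mean baseline" (also an assumption). The p-value and effect size are taken as inputs from a separate paired test.

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mae(y_true, y_pred):
    """Mean absolute error, here in story points (SP)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def h1_supported(f1_config, f1_baseline, p_value, effect_r,
                 alpha=0.05, min_r=0.1):
    """H1: better F1-macro than the baseline, significant, non-trivial effect."""
    return f1_config > f1_baseline and p_value < alpha and effect_r >= min_r


def h2_supported(mae_config, mae_baseline, p_value,
                 alpha=0.05, min_delta_sp=0.5):
    """H2 (assumed reading): MAE improves on the baseline by >= 0.5 SP."""
    return (mae_baseline - mae_config) >= min_delta_sp and p_value < alpha
```

Keeping the thresholds as keyword arguments makes the pre-registered values (0.05, 0.1, 0.5 SP) visible at the call site and easy to audit against the hypotheses.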
1.4 Methodology Summary
- Paradigm: Design Science Research (DSR), following Hevner et al. (2004) + Peffers et al. (2007)
- Literature: SLR following PRISMA 2020 + PRISMA-DFLLM extension (AI-assisted screening documented)
- Experiment: Controlled, with pre-registered conditions, seed=42, temperature=0
- Stats: Wilcoxon signed-rank test, effect size r
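The statistical protocol (Wilcoxon signed-rank test with effect size r) can be illustrated with a standard-library sketch. It is written under stated assumptions: the two-sided p-value uses the normal approximation with tie correction, zero differences are dropped, and r is computed as |z| / sqrt(n) with n the number of non-zero pairs (conventions for n vary). What counts as a "pair" (for example, per-project or per-fold scores) is also an assumption here. For small samples an exact test, such as the one available through `scipy.stats.wilcoxon`, would be preferable.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (w_plus, z, p_value, r), where r = |z| / sqrt(n) and n is the
    number of non-zero paired differences.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    w_plus = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    r = abs(z) / math.sqrt(n)
    return w_plus, z, p_value, r


# Hypothetical paired scores: one configuration vs. the baseline.
config_scores = [0.71, 0.68, 0.74, 0.66, 0.70, 0.73, 0.69, 0.72]
baseline_scores = [0.64, 0.65, 0.70, 0.67, 0.63, 0.66, 0.62, 0.68]
w_plus, z, p_value, r = wilcoxon_signed_rank(config_scores, baseline_scores)
```

The returned `p_value` and `r` are the quantities the hypotheses compare against the thresholds p < 0.05 and r ≥ 0.1.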
Incremental strategy:
- Guaranteed MVP: Triage module (Stage 1) — Strategy C
- Target: Triage + Estimation (Stages 1+2) — Strategy D
- Optional: Backlog prioritisation (Stage 3)
1.5 Project Plan
| Phase | Period | Key Output |
|---|---|---|
| F0 Initiation | Feb 23 – Mar 8 | PEC1: Chapter 1 + environment verified |
| F1 Design | Mar 9–28 | Architecture + prompting strategies |
| F2 Prototype | Mar 29 – Apr 8 | PEC2: Working triage module |
| F3 Extension | Apr 9 – May 10 | PEC3: Estimation module |
| F3b Conditional | May 11–31 | Backlog module (if F3 on time) |
| F4 Analysis | May 1 – Jun 7 | PEC4: Full results + statistics |
| F5 Closure | Jun 8–23 | PEC5: Final submission + defence |
1.8 AI Use Declaration
- Framework: Marco Veritas (Codina, 2024)
- Principle: No delegation of judgement; AI generates options, author decides
- Validation: Every AI-sourced reference verified in primary source
- Tools declared: Claude, Perplexity, DeepSeek, Gemini, NotebookLM, AnythingLLM, GitHub Copilot, Cursor, OpenHands, Warp AI Terminal, OpenCode, Browser OS
Full log: AI-Use-Log
Writing Notes
Use this section for drafting observations, items to revise, and tutor feedback.
- Tutor feedback received on: {{date}}
- Main revision needed: {{notes}}
- Approved version submitted: {{date}}
🔗 Connected Notes
Navigation: MOC-PUMA-Master · PR-PUMA-Ch2-Ch3-Ch4-Ch5 · PR-PUMA-Ch3-Methods
Hypotheses & Experiments: EX-Hypotheses-H1-H2 · EX-Stages-Overview
Core Concepts (§1.1 problems): PN-IssueTriage-StoryPoints — Manual triage bias · PN-KeyConcepts-Agents-Reproducibility-RedTeam — Reproducibility gap + Uniqueness Trap
Datasets referenced: LN-Datasets-JiraSR-TAWOS — Jira SR + TAWOS · LN-Angermeir-2025-Reproducibility — Reproducibility crisis
Methodology (§1.4): PN-DSR-SLR-Methods — DSR + PRISMA · PN-Wilcoxon-FINER-Cornell-PRISMA — Statistical protocol
Project governance: SP-PUMA-Constitution — Non-negotiable principles · BMAD-Agent-Roster — Research team
Key thinkers: PER-Flyvbjerg-Bent — Uniqueness Trap · PER-Assalaarachchi-Nuwan — Agentic SPM