Chapter 1 — Introduction

Chapter Status

Overview

✅ PEC1 delivered 2026-03-08

Chapter structure: Follows PUMA Project requirements for final thesis


Section Checklist

  • 1.1 Context and justification (needs and problem)
  • 1.2 Research objectives (general + specific, with SMART indicators)
  • 1.3 Sustainability, ethical-social and diversity impact
  • 1.4 Approach and method
  • 1.5 Work plan (phases, tasks, milestones, risk management)
  • 1.6 Brief summary of products
  • 1.7 Brief description of other chapters
  • 1.8 Declaration on generative AI use

1.1 Context — Key Arguments

Three quantified dimensions of the problem:

  1. Manual triage — Jira SR (Ortu et al., 2015) demonstrates inconsistent priority assignment across projects for equivalent issues → cognitive bias, measurable cost
  2. Inconsistent estimation — TAWOS (Tawosi et al., 2022) shows systematic over-estimation under sprint pressure, under-estimation for new projects
  3. Uniqueness Trap (Flyvbjerg & Gardner, 2023) — managers treat each project as unique, preventing statistical learning from historical patterns

Research gap (three limitations in existing work):

  1. Reproducibility: only 5/18 papers with published artefacts are executable (Angermeir et al., 2025 — ICSE 2026)
  2. No systematic comparison of prompting strategies for PM tasks
  3. No carbon footprint measurement per experimental condition

Three enabling conditions (why now):

  1. Open-weight local models (Llama 3.2, Mistral 7B via Ollama)
  2. Public verified datasets (Jira SR, TAWOS with stable DOIs)
  3. Carbon tracking tools (CodeCarbon, MIT licence)
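
As a rough illustration of the first enabling condition, the sketch below sends one prompt to a locally running Ollama server with the fixed seed and temperature stated in section 1.4. It is a minimal sketch, not the project's client: the model tag `llama3.2`, the default port 11434, and the helper names `build_payload` and `generate` are assumptions made for this example.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if the server runs elsewhere (assumption).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model, prompt, seed=42, temperature=0.0):
    """Build a non-streaming generate request with fixed decoding options."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"seed": seed, "temperature": temperature},
    }


def generate(model, prompt, url=OLLAMA_URL, timeout=120):
    """Send one prompt to a running Ollama server and return the generated text."""
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


# Example call (needs a local Ollama server with the model already pulled):
#   text = generate("llama3.2", "Classify the priority of this issue as high, medium or low.")
```

Keeping the payload builder separate from the network call makes it possible to log the exact request for each experimental condition without contacting the server.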

1.2 Research Question & Hypotheses

Main RQ: Do statistically significant differences exist in automatic issue triage quality and effort estimation when using different LLMs and prompting strategies, evaluated on real ICT project datasets?
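
The comparison implied by the research question can be pictured as a factorial grid of conditions. The sketch below is only an illustration: the model tags follow the models named in section 1.1, but the strategy labels (`zero-shot`, `few-shot`) and the `Condition` class are placeholders invented here, not the pre-registered list.

```python
from dataclasses import dataclass
from itertools import product

# Model tags follow section 1.1; the strategy labels are illustrative
# placeholders, not the thesis's pre-registered strategies.
MODELS = ["llama3.2", "mistral:7b"]
STRATEGIES = ["zero-shot", "few-shot"]
TASKS = ["triage", "estimation"]


@dataclass(frozen=True)
class Condition:
    model: str
    strategy: str
    task: str
    seed: int = 42            # fixed seed, as stated in section 1.4
    temperature: float = 0.0  # deterministic decoding, as stated in section 1.4

    @property
    def label(self) -> str:
        return f"{self.task}/{self.model}/{self.strategy}"


def build_conditions():
    """Return the full factorial grid of task x model x strategy."""
    return [Condition(m, s, t) for t, m, s in product(TASKS, MODELS, STRATEGIES)]


conditions = build_conditions()
for c in conditions:
    print(c.label, c.seed, c.temperature)
```

With two tasks, two models and two strategies this yields eight conditions, each carrying the same seed and temperature so that runs differ only in the factors under study.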

H1 (Triage): At least one configuration achieves a higher F1-macro than the heuristic baseline (p < 0.05, effect size r ≥ 0.1) → EX-Hypotheses-H1-H2
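
For reference, F1-macro is the unweighted mean of the per-class F1 scores. The sketch below computes it on toy labels invented for illustration; the majority-class predictor stands in for the heuristic baseline only as an assumption, since the actual heuristic is defined elsewhere.

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class that is never predicted and never present scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data, invented for illustration only.
gold = ["high", "low", "medium", "high", "low", "medium"]
llm_pred = ["high", "low", "medium", "high", "medium", "medium"]
heuristic = ["medium"] * len(gold)  # hypothetical majority-class baseline

print(round(f1_macro(gold, llm_pred), 3))
print(round(f1_macro(gold, heuristic), 3))
```

Because every class counts equally, a predictor that ignores minority priority levels is penalised even when its overall accuracy looks acceptable.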

H2 (Estimation): At least one few-shot configuration achieves a lower MAE than the historical-mean baseline (p < 0.05, delta ≥ 0.5 SP)
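
The H2 comparison can be sketched as follows, using story-point values invented for illustration. Reading "delta" as the difference between the baseline MAE and the model MAE is an assumption of this example.

```python
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error between actual and estimated story points."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


# Toy story-point values, invented for illustration only.
history = [3, 5, 8, 2, 5, 13]  # past issues used to fit the baseline
actual = [5, 3, 8, 2, 13]      # held-out issues
llm_est = [5, 3, 5, 3, 8]      # hypothetical few-shot estimates

# Historical-mean baseline: predict the mean of past story points for every issue.
baseline_est = [mean(history)] * len(actual)

mae_llm = mae(actual, llm_est)
mae_base = mae(actual, baseline_est)
delta = mae_base - mae_llm  # assumed reading of "delta": improvement in MAE

print(mae_llm, mae_base, delta)
# Practical threshold from H2: improvement of at least 0.5 story points.
print(delta >= 0.5)
```

The threshold check is separate from the significance test: a configuration would need to pass both to support H2.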


1.4 Methodology Summary

Paradigm: Design Science Research (DSR), following Hevner et al. (2004) and Peffers et al. (2007)

Literature: SLR following PRISMA 2020 + PRISMA-DFLLM extension (AI-assisted screening documented)

Experiment: Controlled, with pre-registered conditions, seed=42, temperature=0

Stats: Wilcoxon signed-rank test, effect size r
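
To make the statistical protocol concrete, the sketch below implements a two-sided Wilcoxon signed-rank test with the normal approximation and reports r = |z| / sqrt(n). It is a simplified illustration under stated assumptions: no tie or continuity correction, n taken as the number of non-zero paired differences, and toy scores invented for the example. In practice a library routine such as `scipy.stats.wilcoxon` would normally be preferred.

```python
from math import sqrt
from statistics import NormalDist


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test via the normal approximation.

    Returns (w_plus, z, p_value, r), with r = |z| / sqrt(n) and n the number
    of non-zero paired differences. No tie or continuity correction is applied.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    mu = n * (n + 1) / 4
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mu) / sigma
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, z, p_value, abs(z) / sqrt(n)


# Toy paired scores (percentage points), invented for illustration only.
llm = [71, 68, 75, 80, 66, 77, 73, 70, 79, 74]
base = [65, 66, 64, 70, 67, 68, 69, 63, 71, 62]

w_plus, z, p_value, r = wilcoxon_signed_rank(llm, base)
print(w_plus, round(z, 3), round(p_value, 4), round(r, 3))
```

The test is paired, so each score from a configuration has to be matched with the baseline score obtained on the same items or fold.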

Incremental strategy:

  • Guaranteed MVP: Triage module (Stage 1) — Strategy C
  • Target: Triage + Estimation (Stages 1+2) — Strategy D
  • Optional: Backlog prioritisation (Stage 3)

1.5 Project Plan

Phase | Period | Key Output
F0 Initiation | Feb 23 – Mar 8 | PEC1: Chapter 1 + environment verified
F1 Design | Mar 9–28 | Architecture + prompting strategies
F2 Prototype | Mar 29 – Apr 8 | PEC2: Working triage module
F3 Extension | Apr 9 – May 10 | PEC3: Estimation module
F3b Conditional | May 11–31 | Backlog module (if F3 on time)
F4 Analysis | May 1 – Jun 7 | PEC4: Full results + statistics
F5 Closure | Jun 8–23 | PEC5: Final submission + defence
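
The plan can also be kept as data so that phase boundaries are easy to check. The sketch below is a hypothetical helper, not part of the project tooling: it transcribes the periods above, assumes every date falls in 2026 (inferred from the PEC1 delivery date), and lists phases that run concurrently, which may well be intentional.

```python
from datetime import date
from itertools import combinations

# Dates transcribed from the plan; the year 2026 is an assumption inferred
# from the PEC1 delivery date.
PHASES = [
    ("F0", date(2026, 2, 23), date(2026, 3, 8)),
    ("F1", date(2026, 3, 9), date(2026, 3, 28)),
    ("F2", date(2026, 3, 29), date(2026, 4, 8)),
    ("F3", date(2026, 4, 9), date(2026, 5, 10)),
    ("F3b", date(2026, 5, 11), date(2026, 5, 31)),
    ("F4", date(2026, 5, 1), date(2026, 6, 7)),
    ("F5", date(2026, 6, 8), date(2026, 6, 23)),
]


def overlapping_phases(phases):
    """Return (name_a, name_b, shared_days) for each pair of phases that overlap."""
    overlaps = []
    for (a, a_start, a_end), (b, b_start, b_end) in combinations(phases, 2):
        start, end = max(a_start, b_start), min(a_end, b_end)
        if start <= end:
            overlaps.append((a, b, (end - start).days + 1))
    return overlaps


for a, b, days in overlapping_phases(PHASES):
    print(f"{a} and {b} share {days} days")
```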

1.8 AI Use Declaration

Framework: Marco Veritas (Codina, 2024)

Principle: No delegation of judgement: AI generates options, author decides

Validation: Every AI-sourced reference verified in primary source

Tools declared: Claude, Perplexity, DeepSeek, Gemini, NotebookLM, AnythingLLM, GitHub Copilot, Cursor, OpenHands, Warp AI Terminal, OpenCode, Browser OS

Full log: AI-Use-Log


Writing Notes

Use this section for drafting observations, things to revise, feedback from tutor

  • Tutor feedback received on: {{date}}
  • Main revision needed: {{notes}}
  • Approved version submitted: {{date}}

🔗 Connected Notes

Navigation: MOC-PUMA-Master · PR-PUMA-Ch2-Ch3-Ch4-Ch5 · PR-PUMA-Ch3-Methods

Hypotheses & Experiments: EX-Hypotheses-H1-H2 · EX-Stages-Overview

Core Concepts (§1.1 problems): PN-IssueTriage-StoryPoints — Manual triage bias · PN-KeyConcepts-Agents-Reproducibility-RedTeam — Reproducibility gap + Uniqueness Trap

Datasets referenced: LN-Datasets-JiraSR-TAWOS — Jira SR + TAWOS · LN-Angermeir-2025-Reproducibility — Reproducibility crisis

Methodology (§1.4): PN-DSR-SLR-Methods — DSR + PRISMA · PN-Wilcoxon-FINER-Cornell-PRISMA — Statistical protocol

Project governance: SP-PUMA-Constitution — Non-negotiable principles · BMAD-Agent-Roster — Research team

Key thinkers: PER-Flyvbjerg-Bent — Uniqueness Trap · PER-Assalaarachchi-Nuwan — Agentic SPM