Design Science Research (DSR)

Atomic Claim

Overview

DSR is the appropriate research paradigm for PUMA because it explicitly produces and evaluates an artefact (the benchmark framework) alongside a knowledge contribution (empirical evidence about LLM capabilities), satisfying both engineering and academic validity requirements.

💡 The Paradigm

DSR (Hevner et al., 2004; Peffers et al., 2007) holds that information systems (IS) research must:

  1. Produce a useful artefact — not just knowledge, but something that works
  2. Evaluate against defined criteria — utility, novelty, rigour
  3. Communicate to both research and practice audiences

The Peffers Process (applied to PUMA)

| Step | PUMA Implementation |
|---|---|
| 1. Problem identification | Manual triage cost, inconsistent estimation, reproducibility gap |
| 2. Objectives | PUMA benchmark: F1 ≥ 0.55, MAE ≤ 3.0 SP, 100% reproducible |
| 3. Design & Development | Modular framework: Ollama + Jira SR + TAWOS + CodeCarbon |
| 4. Demonstration | Running experiments on real datasets |
| 5. Evaluation | Statistical analysis (Wilcoxon), comparison with baselines |
| 6. Communication | PUMA Project paper + GitHub repository (MIT licence) |
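
The step 2 objectives can be read as acceptance thresholds on experiment results. Below is a minimal sketch of such a check. It assumes macro-averaged F1 for the classification task and MAE in story points (reading "SP" as story points) for estimation. The note does not say which F1 variant PUMA reports, and the function names and sample data are hypothetical.

```python
from statistics import mean

F1_TARGET = 0.55   # objective from the table: F1 >= 0.55
MAE_TARGET = 3.0   # objective from the table: MAE <= 3.0 SP


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed variant)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return mean(scores)


def meets_objectives(f1, mae):
    """Compare measured metrics against the two numeric objectives."""
    return {"f1_ok": f1 >= F1_TARGET, "mae_ok": mae <= MAE_TARGET}


# Made-up illustration data, not PUMA results.
true_labels = ["bug", "feature", "bug", "task", "feature", "bug"]
pred_labels = ["bug", "feature", "task", "task", "bug", "bug"]
true_sp = [3, 5, 8, 2, 13]
pred_sp = [5, 5, 5, 3, 8]

f1 = macro_f1(true_labels, pred_labels)
mae = mean_absolute_error(true_sp, pred_sp)
print(meets_objectives(f1, mae))
```

The third objective (100% reproducible) is not a numeric metric and is left out of this check.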

Integration with SDD/BDD

In PUMA, DSR is extended by Spec-Driven Development (SDD): every artefact component is specified before implementation, using PN-SDD-Framework and Behaviour-Driven Development (BDD) scenarios. This makes the “Design” step more rigorous and reproducible.
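
To illustrate the spec-first idea, a BDD scenario can be written as an executable check before the component it describes exists. The sketch below uses plain Python with Given/When/Then comments. The function `estimate_story_points`, the allowed story-point scale, and the placeholder logic are all hypothetical stand-ins, not part of PUMA or PN-SDD-Framework.

```python
# Scenario: an estimate is always a valid story-point value
#   Given an issue with a title and a description
#   When the estimator is asked for story points
#   Then the result is one of the allowed story-point values

ALLOWED_STORY_POINTS = (1, 2, 3, 5, 8, 13)  # assumed scale, for illustration


def estimate_story_points(title: str, description: str) -> int:
    """Placeholder written after the scenario: snaps a crude
    word-count heuristic to the nearest allowed value."""
    rough = max(1, len(title.split()) + len(description.split()) // 10)
    return min(ALLOWED_STORY_POINTS, key=lambda sp: abs(sp - rough))


def test_estimate_is_valid_story_point():
    # Given
    title = "Add CSV export to the report page"
    description = "Users need to download the monthly report as a CSV file."
    # When
    result = estimate_story_points(title, description)
    # Then
    assert result in ALLOWED_STORY_POINTS


test_estimate_is_valid_story_point()
```

The scenario comment acts as the specification. The placeholder can later be swapped for a real component without changing the check.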

🔗 Connected Ideas

Produces: SP-Architecture | Evaluated by: PN-DSR-SLR-Methods (SLR section) | Extended by: PN-SDD-Framework
Applied in: PR-PUMA-Ch1-Introduction (§1.4) | PR-PUMA-Ch3-Methods (§3.1)


id: PN-SLR-PRISMA
title: "Systematic Literature Review + PRISMA 2020"
type: permanent-note
category: method
tags: [permanent, method, slr, prisma, literature-review, ebse]
aliases: ["SLR", "PRISMA", "Systematic Review"]
created: 2026-03-01
maturity: evergreen

Systematic Literature Review + PRISMA 2020

Atomic Claim

An SLR following the PRISMA 2020 protocol provides transparent, reproducible, and bias-controlled evidence synthesis, which is essential for establishing the research gap that motivates PUMA.

The PRISMA Flow for PUMA

IDENTIFICATION
  Search strings in: arXiv, IEEE Xplore, ACM DL, Semantic Scholar, Google Scholar
  Date range: 2022–2026
  Initial records: ~[N]
         ↓
SCREENING (Title + Abstract)
  Exclusion criteria:
  - No empirical evaluation
  - No LLM component
  - No PM or SE task
  - Not reproducible (no code/data)
  Remaining: ~[N]
         ↓
ELIGIBILITY (Full text)
  Inclusion criteria:
  - Reproducible artefact published
  - Uses public dataset with labels
  - Reports quantitative metrics
  - Local or open-access models (preferred)
  Remaining: ≥40
         ↓
INCLUDED
  Final corpus: [N] papers
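
The flow above amounts to a sequence of filters with a count recorded at each stage. Below is a minimal sketch of that bookkeeping, assuming each record is a dict of boolean flags. The field names and the three sample records are hypothetical and do not represent PUMA's actual corpus. The "preferred" local/open-model criterion is treated as non-mandatory here.

```python
from collections import Counter

# Screening exclusion criteria (title + abstract), in the order listed above.
SCREENING_EXCLUSIONS = [
    ("no empirical evaluation", lambda r: not r["empirical"]),
    ("no LLM component", lambda r: not r["uses_llm"]),
    ("no PM or SE task", lambda r: not r["pm_or_se_task"]),
    ("not reproducible", lambda r: not r["code_or_data"]),
]

# Mandatory eligibility criteria (full text).
ELIGIBILITY_REQUIRED = [
    "artefact_published",
    "public_labelled_dataset",
    "quantitative_metrics",
]


def prisma_flow(records):
    """Return per-stage counts and screening exclusion reasons."""
    counts = {"identified": len(records)}
    reasons = Counter()
    screened = []
    for r in records:
        # Record only the first exclusion criterion that applies.
        reason = next((name for name, hit in SCREENING_EXCLUSIONS if hit(r)), None)
        if reason is None:
            screened.append(r)
        else:
            reasons[reason] += 1
    counts["after_screening"] = len(screened)
    included = [r for r in screened if all(r[k] for k in ELIGIBILITY_REQUIRED)]
    counts["included"] = len(included)
    return counts, dict(reasons)


def record(**flags):
    """Build a sample record that passes everything unless overridden."""
    base = dict(
        empirical=True, uses_llm=True, pm_or_se_task=True, code_or_data=True,
        artefact_published=True, public_labelled_dataset=True,
        quantitative_metrics=True,
    )
    base.update(flags)
    return base


sample = [record(), record(uses_llm=False), record(quantitative_metrics=False)]
counts, reasons = prisma_flow(sample)
print(counts)
print(reasons)
```

Logging the first matching exclusion reason per record keeps the reason tallies consistent with the stage counts, which is what a PRISMA flow diagram needs.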

PRISMA-DFLLM Extension

For AI-assisted screening, document for each AI tool used:

  • Which screening task was automated (title/abstract/full-text)
  • Which model or tool was used (e.g. Claude, Perplexity, Elicit)
  • What human validation was applied
  • Inclusion/exclusion decisions that were AI-suggested vs. human-confirmed
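
The four items above map naturally onto one log entry per AI-assisted decision. Below is a minimal sketch, assuming a dataclass-based log. The class name, field names, record IDs, and sample entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScreeningDecision:
    record_id: str
    task: str            # "title", "abstract", or "full-text"
    tool: str            # e.g. "Claude", "Perplexity", "Elicit"
    ai_suggestion: str   # "include" or "exclude"
    human_decision: str  # "include" or "exclude"
    validation: str      # how a human checked the suggestion

    @property
    def ai_confirmed(self) -> bool:
        """True when the human kept the AI-suggested decision."""
        return self.ai_suggestion == self.human_decision


def agreement_rate(log):
    """Share of AI suggestions that the human reviewer confirmed."""
    if not log:
        return 0.0
    return sum(d.ai_confirmed for d in log) / len(log)


# Made-up entries for illustration only.
log = [
    ScreeningDecision("R001", "abstract", "Claude", "include", "include",
                      "second reviewer read the abstract"),
    ScreeningDecision("R002", "abstract", "Elicit", "exclude", "include",
                      "second reviewer read the abstract"),
    ScreeningDecision("R003", "full-text", "Claude", "exclude", "exclude",
                      "full text checked against criteria"),
]

overridden = [d.record_id for d in log if not d.ai_confirmed]
print(agreement_rate(log), overridden)
```

Keeping the AI suggestion and the human decision as separate fields makes overrides auditable instead of silently replaced.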

FINER Criteria (Research Question Validation)

| Criterion | PUMA Compliance |
|---|---|
| Feasible | Local compute + public datasets + 6-month timeline ✅ |
| Interesting | Novel benchmark, fills reproducibility gap ✅ |
| Novel | No existing open-source PM+LLM+CodeCarbon benchmark ✅ |
| Ethical | Public data, open models, human-in-loop ✅ |
| Relevant | Directly applicable to ICT PM practitioners ✅ |

🔗 Connected Ideas

Guides: PR-PUMA-Ch2-Ch3-Ch4-Ch5 | Uses: PN-Wilcoxon-FINER-Cornell-PRISMA (FINER + PRISMA) | Workflow: WF-SLR-Pipeline
Active log: PRISMA-Log | MOC: MOC-Methods-Frameworks
DSR source: LN-Hevner-2004-DSR (Hevner et al., 2004): 7 guidelines, 3 research cycles
SLR source: LN-Kitchenham-2007-SLR (Kitchenham, 2007): 3-phase SLR protocol, PICO
PRISMA source: LN-Page-2021-PRISMA2020 (Page et al., 2021): PRISMA 2020, 27-item checklist