🔧 Frameworks — Research & Engineering Frameworks

Overview

Permanent notes on all methodological frameworks. Sub-folders: Prompting/ · Research/ · Engineering/

Current Framework Notes

| Note | Framework | Maturity |
| --- | --- | --- |
| PN-RCOIF-Framework | RCOIF structured prompting | 🌳 Evergreen |
| PN-MIT-Student-Method | MIT Student Method | 🌳 Evergreen |
| PN-EGI-Framework | EGI guided exploration | 🌳 Evergreen |
| PN-AMI-DRCA-IIPR-Frameworks | AMI · DRCA · IIPR | 🌳 Evergreen |
| PN-SDD-Framework | SDD · BDD · BMAD | 🌳 Evergreen |
| PN-Wilcoxon-FINER-Cornell-PRISMA | PRISMA-trAIce · Cornell · FINER | 🌳 Evergreen |

Frameworks to Add

  • PN-Grounded-Theory.md — For qualitative AI use analysis
  • PN-CDD-Framework.md — Context-Driven Development detail
  • PN-Agent-OS.md — Agent OS orchestration patterns


id: RES-Triage-Stage1-Summary
title: "📊 Results Summary — Stage 1 Triage"
type: permanent-note
category: result
tags: [permanent, results, triage, stage1, f1-macro, h1]
created: 2026-03-01
maturity: seedling
status: pending

Results Summary — Stage 1: Issue Triage

Chapter Status

⏳ Pending experiment execution (Phase F2). This note will be populated once all 8 triage conditions have been run.


Summary Table (to fill after F2)

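| Model | Strategy | F1-macro | p-value | r | ≥0.55? |
| --- | --- | --- | --- | --- | --- |
| Heuristic | | TBD | | | |
| TF-IDF+SVM | | TBD | | | |
| llama3.2:8b | zero-shot | TBD | TBD | TBD | |
| llama3.2:8b | few-shot-3 | TBD | | | |
| llama3.2:8b | few-shot-6 | TBD | | | |
| llama3.2:8b | cot | TBD | | | |
| mistral:7b | zero-shot | TBD | | | |
| mistral:7b | few-shot-3 | TBD | | | |
| mistral:7b | few-shot-6 | TBD | | | |
| mistral:7b | cot | TBD | | | |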
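
For reference, the F1-macro column can be computed as the unweighted mean of per-class F1 scores. The following is a minimal sketch in plain Python, not the project's evaluation code: the function name `f1_macro` and the example labels are hypothetical, and the final comparison assumes the 0.55 threshold from the table's last column.

```python
from collections import Counter


def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    per_class = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical labels for illustration only (not PUMA data).
y_true = ["bug", "bug", "feature", "question"]
y_pred = ["bug", "feature", "feature", "question"]
score = f1_macro(y_true, y_pred)
meets_threshold = score >= 0.55  # assumed threshold, taken from the table header
```

Because every class contributes equally to the mean, a model that ignores a rare issue type is penalized more heavily than it would be under accuracy or micro-averaged F1.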

H1 Decision: Pending

Source experiments: EX-Stages-Overview



id: RES-Estimation-Stage2-Summary
title: "📊 Results Summary — Stage 2 Estimation"
type: permanent-note
category: result
tags: [permanent, results, estimation, stage2, mae, h2]
created: 2026-04-10
maturity: seedling
status: pending

Results Summary — Stage 2: Effort Estimation

Chapter Status

⏳ Pending experiment execution (Phase F3)

Summary Table

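| Model | Strategy | MAE | p-value | r | ≤3.0 SP? |
| --- | --- | --- | --- | --- | --- |
| Historical mean | | TBD | | | |
| Deep-SE | | ~3.2 | lit. | | |
| CoGEE/GPT-4 | | ~1.9 | lit. | | |
| llama3.2:8b | zero-shot | TBD | | | |
| llama3.2:8b | few-shot-3 | TBD | | | |
| llama3.2:8b | cot | TBD | | | |
| mistral:7b | zero-shot | TBD | | | |
| mistral:7b | few-shot-3 | TBD | | | |
| mistral:7b | cot | TBD | | | |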
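
The MAE column is the mean absolute difference between predicted and actual story points. A minimal sketch follows, assuming estimates and ground truth are expressed in story points (SP); the function name and the sample values are hypothetical and are not experimental results.

```python
def mean_absolute_error(actual, predicted):
    """Mean of |actual - predicted| over paired observations."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical story-point values for illustration only (not PUMA data).
actual_sp = [3, 5, 8, 2]
predicted_sp = [5, 5, 5, 3]
mae = mean_absolute_error(actual_sp, predicted_sp)  # (2 + 0 + 3 + 1) / 4 = 1.5
within_target = mae <= 3.0  # assumed threshold, taken from the table header
```

MAE is reported in the same unit as the estimates, so a value of 1.5 reads directly as "off by 1.5 story points on average".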

H2 Decision: Pending

Source experiments: EX-Stages-Overview



id: PN-Falsifiability-Popper
title: "Falsifiability — Popper’s Criterion of Demarcation"
type: permanent-note
category: concept
tags: [permanent, concept, falsifiability, popper, scientific-method, hypothesis]
aliases: ["Falsifiability", "Popper’s criterion", "Demarcation criterion"]
created: 2026-03-01
maturity: evergreen
sources: ["LN-Books-KeyReferences"]

Falsifiability — Popper’s Criterion of Demarcation

Atomic Claim

A hypothesis has scientific value only if it can be falsified by empirical evidence — meaning it must be possible to conceive an observation that would prove it wrong. This is the principle guiding how PUMA formulates H1 and H2.

The Principle

Karl Popper (1934/1959) argued that the boundary between science and non-science is falsifiability: a statement is scientific if and only if it is possible to imagine evidence that would prove it false.

Applied to PUMA:

  • H₀₁ is falsifiable: if any LLM condition achieves an F1-macro above the baseline with p < 0.05, H₀₁ is rejected
  • Failing to reject H₀₁ is also informative: if no condition beats the baseline, the experiment provides no evidence that local LLMs add value under these conditions, which is a meaningful negative result

The Null Hypothesis Structure

H₀ (null, refutable):    No effect exists
H₁ (alternative):        Effect exists with minimum magnitude X
Falsification condition: If the data are inconsistent with H₀ under the protocol,
                         H₀ is rejected; if they are consistent with H₀,
                         H₀ is not rejected. The experiment is informative either way.

Why This Matters for PUMA

A common mistake is formulating hypotheses that cannot be falsified:

  • ❌ “LLMs will be useful for triage” — too vague, no measurement
  • ✅ “At least one configuration achieves F1-macro > 0.55 vs heuristic baseline (p < 0.05)” — precise, falsifiable
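
The falsifiable formulation above translates directly into a mechanical decision rule. The sketch below is one possible encoding, not the project's analysis script: the `ConditionResult` type, the function name, and all numeric values are hypothetical placeholders. It also applies no multiple-comparison correction, because this note does not specify one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionResult:
    model: str
    strategy: str
    f1_macro: float
    p_value: float


def conditions_rejecting_h01(results, f1_threshold=0.55, alpha=0.05):
    """Return the conditions with F1-macro > threshold and p < alpha."""
    return [
        r for r in results
        if r.f1_macro > f1_threshold and r.p_value < alpha
    ]


# Placeholder numbers for illustration only (not experimental results).
results = [
    ConditionResult("llama3.2:8b", "zero-shot", f1_macro=0.48, p_value=0.40),
    ConditionResult("mistral:7b", "few-shot-3", f1_macro=0.61, p_value=0.01),
]

supporting = conditions_rejecting_h01(results)
# H01 is rejected if at least one condition meets both criteria.
decision = "reject H01" if supporting else "fail to reject H01"
```

Fixing the thresholds as explicit parameters before the data are collected is what keeps the hypothesis falsifiable: the outcome is determined by the measurements, not by a post-hoc reading of them.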

🔗 EX-Hypotheses-H1-H2 · PN-DSR-SLR-Methods