Checklist: SLR Quality Assessment Criteria

01 Mar 2026, 4 min read

checklist
slr
quality
prisma
ebse
academic-writing
ai-ethics
anthropic
api
baseline
bibliography
carbon-footprint
citation
codecarbon
data-formats
dataset
dev-tools
ethics
evaluation
ide
json
literature-review
local-llm
non-parametric
obsidian
ollama
prompting
python
rcoif
research
research-methodology
research-tools
statistics
sustainability
validity
wilcoxon
zotero


Overview

Apply to every paper included in the PUMA SLR. Score each criterion from 1 (poor) to 3 (good).

Quality Assessment Form

Paper: {{Title — Authors (Year)}}
Zotero key: {{key}}
Assessor: PUMA researcher
Date assessed: {{date}}

Q1 — Research Question Clarity

1 — RQ is absent or very vague
2 — RQ is stated but imprecise
3 — RQ is clear, specific, and answerable

Q2 — Study Design Appropriateness

1 — Design is misaligned with RQ
2 — Design is adequate but has significant gaps
3 — Design is appropriate and justified

Q3 — Dataset Quality & Transparency

1 — Dataset not described or not public
2 — Dataset described but not easily replicable
3 — Public dataset with DOI/URL, size and composition described

Q4 — Metric Appropriateness

1 — Metric choice not justified or inappropriate
2 — Standard metrics used without discussion of alternatives
3 — Metric choice justified with discussion of alternatives and limitations

Q5 — Baseline Comparison

1 — No baseline comparison
2 — Baseline used but not well-documented
3 — Clear, reproducible baseline with documented implementation

Q6 — Statistical Rigour

1 — No statistical testing
2 — Some statistical testing but incomplete (e.g., missing effect sizes)
3 — Full statistical analysis with test choice justified, effect sizes, confidence intervals

Q7 — Reproducibility

1 — Code/data not available
2 — Code/data partially available or incomplete
3 — Fully reproducible: code, data, and instructions publicly available

Q8 — Threats to Validity

1 — Threats not discussed
2 — Some threats mentioned
3 — Comprehensive validity analysis (internal, external, construct)

Total score: {{sum}}/24

Quality category:

18–24: High quality
12–17: Moderate quality
<12: Low quality

PUMA inclusion decision:

Include as primary evidence
Include as secondary evidence (acknowledge limitations)
Exclude (score < 12 AND critical flaw in Q3 or Q7)
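
The scoring arithmetic above can be sketched in Python. This is a minimal illustration, not part of the PUMA tooling: the function names are hypothetical, and reading a "critical flaw" as a score of 1 on Q3 or Q7 is an assumption, since the checklist does not define the term.

```python
CRITERIA = ("Q1", "Q2", "Q3", "Q4", "Q5", "Q6", "Q7", "Q8")


def total_score(scores):
    """Sum the eight criterion scores after checking each is 1, 2, or 3."""
    for q in CRITERIA:
        if scores.get(q) not in (1, 2, 3):
            raise ValueError(f"{q} must be scored 1, 2, or 3")
    return sum(scores[q] for q in CRITERIA)


def quality_category(total):
    """Map a total (8 to 24) to the checklist's quality bands."""
    if total >= 18:
        return "High quality"
    if total >= 12:
        return "Moderate quality"
    return "Low quality"


def meets_exclusion_rule(scores):
    """True when the total is below 12 AND Q3 or Q7 has a critical flaw.

    Assumption: a critical flaw is read here as a score of 1.
    """
    critical = scores["Q3"] == 1 or scores["Q7"] == 1
    return total_score(scores) < 12 and critical
```

Because every criterion scores at least 1, the lowest possible total is 8, so the "Low quality" band covers totals 8 to 11.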

id: CL-Reproducibility-Experiment
title: "Checklist: PUMA Experiment Reproducibility"
type: checklist
tags: [checklist, reproducibility, experiment, puma]
created: 2026-03-01

Checklist: PUMA Experiment Reproducibility

Complete for every experiment run: before, during, and after. All items must be checked.

Pre-Run Checklist

Environment

Python version logged: python --version → 3.11.x
All packages installed from pinned requirements.txt
Ollama version logged: ollama --version
Model versions pulled and verified: ollama list
seed=42 set in all random operations
temperature=0 confirmed in Ollama API calls
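
The environment items above could be captured in one record, roughly as follows. This is a sketch under assumptions: the function names and the record layout are hypothetical, and the `ollama_options` dictionary is only the settings block that would be passed under `options` in an Ollama API request, not a complete call.

```python
import json
import platform
import random
import shutil
import subprocess

SEED = 42  # value given in the checklist


def tool_version(command):
    """Return the output of `<command> --version`, or None if it is not on PATH."""
    if shutil.which(command) is None:
        return None
    result = subprocess.run(
        [command, "--version"], capture_output=True, text=True, check=False
    )
    return (result.stdout or result.stderr).strip()


def environment_record():
    """Collect the versions and determinism settings the checklist asks to log."""
    # Seeds only Python's own RNG; NumPy and other libraries need their own call.
    random.seed(SEED)
    return {
        "python_version": platform.python_version(),
        "ollama_version": tool_version("ollama"),
        "seed": SEED,
        "ollama_options": {"temperature": 0, "seed": SEED},
    }


if __name__ == "__main__":
    print(json.dumps(environment_record(), indent=2))
```

Writing this record into the experiment log at the start of a run keeps the version checks and the seed in one place.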

Dataset

Dataset loaded from original source file (not modified copy)
Stratified subset verified: 50 samples per class × 4 classes = 200 total
Subset saved as data/processed/jira_sr_subset_seed42.csv
SHA256 hash of subset file recorded in experiment log
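
One way to draw and fingerprint the subset with the standard library alone is sketched below. The column name `label` and all function names are assumptions for illustration; the real dataset schema is not given in this checklist.

```python
import csv
import hashlib
import random
from collections import defaultdict


def stratified_subset(rows, label_key="label", per_class=50, seed=42):
    """Draw `per_class` rows from each class using a fixed seed."""
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)
    rng = random.Random(seed)  # local RNG, independent of global state
    subset = []
    for label in sorted(by_class):  # sorted so the draw order is stable
        group = by_class[label]
        if len(group) < per_class:
            raise ValueError(f"class {label!r} has only {len(group)} rows")
        subset.extend(rng.sample(group, per_class))
    return subset


def write_csv(rows, path):
    """Write the subset as CSV, using the first row's keys as the header."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def sha256_of_file(path):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

With four classes and `per_class=50` the subset has 200 rows. Hashing the written file rather than the in-memory rows means the recorded digest matches what later runs actually load.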

Measurement

CodeCarbon tracker initialised with project name matching experiment ID
Carbon output directory set to results/carbon/
Latency timed with time.time() immediately before and after each Ollama call
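
The measurement setup might look like the sketch below. It assumes the `codecarbon` package is installed and uses its `EmissionsTracker` with `project_name` and `output_dir`; the helper names (`timed_call`, `make_tracker`, `measure`) are hypothetical, and `fn` stands in for whatever function performs the Ollama call.

```python
import os
import time


def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds), timed with time.time()."""
    start = time.time()
    result = fn(*args, **kwargs)
    return result, time.time() - start


def make_tracker(experiment_id, output_dir="results/carbon/"):
    """Create a CodeCarbon tracker whose project name is the experiment ID.

    codecarbon is imported lazily so the timing helper works without it.
    """
    from codecarbon import EmissionsTracker

    os.makedirs(output_dir, exist_ok=True)
    return EmissionsTracker(project_name=experiment_id, output_dir=output_dir)


def measure(experiment_id, fn, *args, **kwargs):
    """Wrap one call with carbon tracking and latency timing."""
    tracker = make_tracker(experiment_id)
    tracker.start()
    try:
        result, latency_s = timed_call(fn, *args, **kwargs)
    finally:
        emissions_kg = tracker.stop()  # kg of CO2-equivalent
    return result, latency_s, emissions_kg
```

`time.time()` follows the system clock, so a clock adjustment during a call would distort the reading; `time.perf_counter()` is a monotonic alternative if that becomes a concern.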

During-Run Checklist

Progress logged to structured JSON (not just console)
Any errors caught and logged (not silently swallowed)
Raw model responses saved (not just parsed labels)
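
A JSON Lines log is one simple way to satisfy all three items. The sketch below is illustrative only: `call_model` and `parse_label` are hypothetical callables supplied by the caller, and the field names are assumptions rather than an established PUMA log schema.

```python
import json
import time
import traceback


def log_event(log_path, **fields):
    """Append one timestamped JSON object per line (JSON Lines)."""
    record = {"ts": time.time(), **fields}
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def classify_and_log(log_path, sample_id, call_model, parse_label):
    """Call the model, keep the raw response, and log failures instead of hiding them."""
    raw = None
    try:
        raw = call_model(sample_id)
        label = parse_label(raw)
        return log_event(log_path, sample_id=sample_id, status="ok",
                         raw_response=raw, label=label)
    except Exception as exc:  # recorded in the log so the run can continue
        return log_event(log_path, sample_id=sample_id, status="error",
                         raw_response=raw, error=repr(exc),
                         traceback=traceback.format_exc())
```

If parsing fails after the model has answered, the error record still carries the raw response, so the label can be re-parsed later without re-running the model.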

Post-Run Checklist

Results saved with timestamp: results/triage_{model}_{strategy}_{timestamp}.json
Carbon report saved: results/carbon/
Git commit made with tag: run-{experiment_id}-{date}
Metrics calculated and recorded in experiment note
Wilcoxon test run and p-value recorded
Experiment note in Obsidian updated
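
For the Wilcoxon item, the sketch below assumes the paired signed-rank variant (two models or strategies scored on the same samples); the checklist does not say which variant is meant. In practice `scipy.stats.wilcoxon` is the usual route. This standard-library version is only meant to show what the test computes, and its exact enumeration is practical for small samples only.

```python
def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Returns (W, p), where W is the smaller of the positive and negative
    rank sums. Zero differences are dropped; tied absolute differences
    share an average rank, and p is exact for the observed ranks.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank |d|. Ranks are stored doubled so average ranks stay integers.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    doubled = [0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            doubled[order[k]] = i + j + 2  # 2 * mean of 1-based ranks i+1..j+1
        i = j + 1

    w_plus = sum(r for r, d in zip(doubled, diffs) if d > 0)
    w_min = min(w_plus, sum(doubled) - w_plus)

    # Null distribution: each rank is positive or negative with probability 1/2.
    counts = {0: 1}
    for r in doubled:
        nxt = dict(counts)
        for s, c in counts.items():
            nxt[s + r] = nxt.get(s + r, 0) + c
        counts = nxt
    tail = sum(c for s, c in counts.items() if s <= w_min)
    p = min(1.0, 2 * tail / 2 ** n)
    return w_min / 2, p
```

For six pairs whose differences all point the same way, W is 0 and the two-sided p-value is 2/64 = 0.03125, the smallest value attainable with n = 6.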

id: CL-AI-Validation
title: "Checklist: AI Output Validation (Marco Veritas)"
type: checklist
tags: [checklist, ai-use, validation, marco-veritas, ethics]
created: 2026-03-01

Checklist: AI Output Validation (Marco Veritas)

Apply whenever AI output is being considered for incorporation into the project or research decisions.

Before Using AI Output

I have a clear objective for this AI interaction
I have provided sufficient context (RCOIF structured prompt)
I understand what the AI cannot know or verify

Evaluating the Output

For factual claims (numbers, citations, statistics):

Every specific claim traced to a primary source
Every paper citation verified in Google Scholar / arXiv / IEEE
Every DOI verified as resolving to the correct paper
Any claim that cannot be verified → discarded or flagged

For analytical outputs (synthesis, gap analysis, argumentation):

I have read the primary sources myself, not relying solely on AI summary
AI analysis compared against my own prior assessment
Discrepancies investigated before accepting AI version
AI output rewritten entirely in my own words before incorporation

For code outputs (OpenCode, Copilot, Cursor):

I understand every line of generated code
I can explain what the code does and why
Tests written for generated code
Edge cases considered and handled

Logging

Interaction logged in AI-Use-Log
Tool, purpose, and validation method recorded
Any discarded output documented with reason
