AI Use Log — PUMA Project

Purpose: Complete log of all significant AI tool usage in the PUMA project, per Marco Veritas principles and PRISMA-trAIce requirements.
Principle: No delegation of judgement. AI generates options; author decides and validates all outputs.


Logging Protocol

For every significant AI interaction (not minor autocomplete), record:

| Field | Description |
|---|---|
| Date | YYYY-MM-DD |
| Phase | F0–F5 |
| Tool | Tool name and version/model |
| Purpose | What the AI was asked to do |
| Input given | Brief description of what was provided |
| Output received | Brief description of what was generated |
| Validation | How the output was verified |
| Action taken | What you did with the output |
| Discarded? | Any output discarded, and why |
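
The protocol fields above could be checked mechanically before an entry is committed. The following is a minimal sketch, not part of the PUMA tooling: the class name `AIUseEntry`, its attribute names, and the `problems` helper are hypothetical; only the nine fields, the YYYY-MM-DD date format, and the F0–F5 phase range come from the table.

```python
import datetime
from dataclasses import dataclass, fields

# Phases F0..F5, as listed in the protocol table.
VALID_PHASES = {f"F{i}" for i in range(6)}


@dataclass
class AIUseEntry:
    """One log entry. Attribute names are illustrative, not an official schema."""
    date: str             # YYYY-MM-DD
    phase: str            # F0..F5
    tool: str             # tool name and version/model
    purpose: str
    input_given: str
    output_received: str
    validation: str
    action_taken: str
    discarded: str        # "N/A" or the reason something was discarded

    def problems(self) -> list[str]:
        """Return human-readable problems; an empty list means the entry passes."""
        found = []
        # Every field must contain some non-whitespace text.
        for f in fields(self):
            if not getattr(self, f.name).strip():
                found.append(f"{f.name} is empty")
        # The date must parse as YYYY-MM-DD (an unfilled placeholder will not).
        try:
            datetime.datetime.strptime(self.date, "%Y-%m-%d")
        except ValueError:
            found.append(f"date {self.date!r} is not YYYY-MM-DD")
        if self.phase not in VALID_PHASES:
            found.append(f"phase {self.phase!r} is not one of F0-F5")
        return found
```

Calling `problems()` on an entry whose date is still a template placeholder returns a date problem, which makes unfilled entries easy to spot before they reach the log.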

Log Entries

Template Entry (copy to add new entries)

---
Date: YYYY-MM-DD
Phase: F0
Tool: Claude (claude.ai, claude-opus-4.6)
Purpose: [What task]
Input: [Brief description]
Output: [Brief description]
Validation: [How verified]
Action: [What done with it]
Discarded: [N/A or reason]
---
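
Entries that follow this template can be read back programmatically. The sketch below assumes, as in the template, that each entry starts at a `Date:` line, uses one `Key: value` pair per line, and ends at a `---` line. The function name `parse_entries` and the constant `ENTRY_KEYS` are hypothetical.

```python
# Keys taken from the template entry; anything else is ignored.
ENTRY_KEYS = ("Date", "Phase", "Tool", "Purpose", "Input",
              "Output", "Validation", "Action", "Discarded")


def parse_entries(text: str) -> list[dict[str, str]]:
    """Parse '---'-delimited 'Key: value' blocks into a list of dicts."""
    entries = []
    current = None  # None means we are currently outside an entry
    for raw in text.splitlines():
        line = raw.strip()
        if line == "---":
            # A separator closes the entry in progress, if there is one.
            if current:
                entries.append(current)
            current = None
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or key not in ENTRY_KEYS:
            continue
        if key == "Date":
            # A Date line always opens a new entry.
            if current:
                entries.append(current)
            current = {}
        if current is not None:
            current[key] = value.strip()
    if current:
        entries.append(current)
    return entries


sample = """Purpose: a header line that is not part of any entry
---
Date: 2026-02-24
Phase: F0
Tool: Claude (claude.ai)
Input: Query: "example"
---
Date: 2026-02-26
Phase: F0
Tool: Perplexity AI (Sonar model)
---
"""
parsed = parse_entries(sample)
```

Requiring a `Date:` line to open an entry keeps the note's top-level `Purpose:` line from being mistaken for a log entry.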

Phase F0 — Initiation

Date: 2026-02-24
Phase: F0
Tool: Claude (claude.ai)
Purpose: Initial state-of-the-art panoramic mapping using EGI prompting
Input: RCOIF prompt describing PUMA project + EGI Panoramic Mapping variant
Output: Map of 6 sub-areas in PM+LLM research with key papers per area
Validation: Verified 12/15 papers in Google Scholar; 3 could not be confirmed (flagged)
Action: Used as starting point for Zotero library; discarded 3 unverifiable papers
Discarded: 3 hallucinated paper details (correct titles, wrong DOIs)
---

Date: 2026-02-26
Phase: F0
Tool: Perplexity AI (Sonar model)
Purpose: Dataset discovery for issue triage
Input: Query: "Public datasets for Jira issue priority classification machine learning"
Output: Mentioned JIRA Social Repository, Eclipse Bug Dataset, Mozilla Bug Dataset
Validation: Verified all 3 exist in primary sources; Jira SR chosen for PUMA
Action: Added Jira SR and TAWOS to Zotero with #puma-include tag
Discarded: None — all outputs verified
---

Date: 2026-03-01
Phase: F0
Tool: NotebookLM (Google)
Purpose: Cross-paper synthesis of reproducibility limitations
Input: Uploaded 8 papers on LLM benchmarks in SE
Output: Summary identifying reproducibility as the most cited limitation
Validation: Manually checked each paper's limitations section to confirm
Action: Used to strengthen Section 1.1 gap argument; rewritten in own words
Discarded: Specific percentages cited by NotebookLM that could not be traced to specific pages
---

Phase F1 — Design

Date: {{DATE}}
Phase: F1
Tool: Claude Code (claude-sonnet-4.6)
Purpose: Generate SDD spec template for TriageAgent
Input: Described TriageAgent requirements in natural language
Output: Draft OpenSpec YAML for TriageAgent
Validation: Reviewed against BDD requirements and PUMA specs; significant revision made
Action: Used as starting point for SP-Triage-Agent-v1.md; 60% rewritten
Discarded: Initial output had unrealistic latency requirements (<5s); revised to <60s
---

Phase F2 — Prototype

(Entries to be added during implementation)


Summary Statistics

TABLE length(rows) AS "Interactions"
FROM "50 - Areas/51 Research"
WHERE type = "ai-use-log"
GROUP BY tool + " / " + phase AS "Tool / Phase"
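
The same per-tool, per-phase tally can be sketched outside Obsidian. This is an illustrative example only: the function name `interactions_by_tool_and_phase` and the dict keys `Tool` and `Phase` are assumptions, and the sample data simply repeats the tool and phase values of the four entries logged in this note.

```python
from collections import Counter


def interactions_by_tool_and_phase(entries):
    """Count entries per (tool, phase) pair."""
    return Counter((e["Tool"], e["Phase"]) for e in entries)


# Tool and phase values copied from the entries in this log.
entries = [
    {"Tool": "Claude (claude.ai)", "Phase": "F0"},
    {"Tool": "Perplexity AI (Sonar model)", "Phase": "F0"},
    {"Tool": "NotebookLM (Google)", "Phase": "F0"},
    {"Tool": "Claude Code (claude-sonnet-4.6)", "Phase": "F1"},
]

counts = interactions_by_tool_and_phase(entries)
for (tool, phase), n in sorted(counts.items()):
    print(f"{tool}\t{phase}\t{n}")
```

Keying the counter on a `(tool, phase)` tuple keeps both grouping columns separate, so totals per tool or per phase can still be derived from the same result.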

Tool Registry (All tools used in PUMA)

| Tool | Category | Phase | PRISMA-trAIce role |
|---|---|---|---|
| Claude (claude.ai) | LLM conversational | F0–F5 | Research synthesis, writing review |
| ChatGPT (OpenAI) | LLM conversational | F0–F2 | Cross-validation, alternative perspectives |
| DeepSeek-R1 | LLM reasoning | F0–F1 | Statistical and technical questions |
| Google Gemini | LLM multimodal | F0–F1 | Figure analysis, long documents |
| Perplexity AI | Web-grounded LLM | F0–F1 | State-of-art discovery |
| Consensus | Evidence AI | F0 | Literature evidence validation |
| Elicit | SLR assistant | F0 | Abstract screening |
| NotebookLM | Paper corpus | F0–F1 | Multi-paper synthesis |
| AnythingLLM | Local RAG | F1–F4 | Local document Q&A |
| Llama 3.2 8B (Ollama) | Experiment model | F2–F4 | Research subject, not research tool |
| Mistral 7B (Ollama) | Experiment model | F2–F4 | Research subject, not research tool |
| Claude Code | Coding agent | F2–F4 | Code generation (human-reviewed) |
| GitHub Copilot | Code completion | F2–F3 | Inline code suggestions |
| Cursor AI | Code assistant | F2–F3 | Code refactoring |
| OpenCode | Code agent | F2–F3 | Open-source alternative |
| OpenHands | Autonomous agent | F2 | Scaffolding generation |
| Warp AI Terminal | CLI assistant | F2–F4 | Terminal command generation |
| Browser OS | Web agent | F0–F1 | Research automation |
| Microsoft Copilot | Office assistant | F0–F5 | Minor document tasks |

PR-PUMA-Ch1-Introduction (§1.8 AI Use Declaration) · Ethics-Review-Log

PRISMA-Log · LN-Tools-AI-Assistants-LLMs

MOC-Tools-Stack · MOC-Research-Pipeline