🎬 Karpathy’s Autoresearch: We Achieved Near-Human Scores in 2 Hours!

Video Details

Channel: Onchain AI Garage
URL: https://www.youtube.com/watch?v=9jxrmk_Xses
Relevance: ⭐⭐⭐⭐


Summary

Practical implementation of Karpathy’s autoresearch methodology. The presenters report near-human benchmark scores on a research task after 2 hours, using Claude Code with Obsidian notes as context. Documents the exact workflow: structure notes → Claude reads → Claude proposes experiments → human validates → iterate.
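The workflow above can be sketched as a simple loop. This is a minimal illustration and not the video's actual code: `load_notes`, `research_loop`, and `ResearchLog` are hypothetical names, `propose` is a stand-in for whatever call hands the context to Claude Code, and `validate` is a stand-in for the human review step.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, List


@dataclass
class ResearchLog:
    """Tracks which experiment proposals the reviewer accepted or rejected."""
    accepted: List[str] = field(default_factory=list)
    rejected: List[str] = field(default_factory=list)


def load_notes(vault: Path) -> str:
    """Step 1-2: gather the structured notes into one context string.

    Assumes the vault is a folder of Markdown files (hypothetical layout).
    """
    parts = []
    for note in sorted(vault.rglob("*.md")):
        parts.append(f"# {note.stem}\n{note.read_text(encoding='utf-8')}")
    return "\n\n".join(parts)


def research_loop(
    context: str,
    propose: Callable[[str, List[str]], str],  # stand-in for the model call
    validate: Callable[[str], bool],           # stand-in for human review
    rounds: int,
) -> ResearchLog:
    """Steps 3-5: propose an experiment, have a human validate it, iterate."""
    log = ResearchLog()
    for _ in range(rounds):
        proposal = propose(context, log.accepted)
        if validate(proposal):
            log.accepted.append(proposal)
            # Accepted work becomes part of the context for the next round.
            context += f"\n\n# Accepted experiment\n{proposal}"
        else:
            log.rejected.append(proposal)
    return log
```

In practice `validate` could be as simple as `lambda p: input(f"{p}\nAccept? [y/n] ") == "y"`, which keeps the human in the loop on every iteration.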


PUMA Relevance

Provides a concrete proof-of-concept for the autoresearch approach used in PUMA’s research phase. The reported near-human benchmark score suggests that structured Obsidian notes can substitute for RAG in research contexts, though a single 2-hour run is evidence rather than proof. Relevant for justifying PUMA’s hybrid Obsidian+Qdrant design.


MOCs