🎬 Videos — Observability, LLM Evaluation, Testing & DevOps
Overview
Videos covering LLM evaluation frameworks, observability, AI-assisted testing, carbon tracking, DevOps, and output validation with PydanticAI. PUMA uses CodeCarbon, Arize Phoenix, Wilcoxon tests, and Promptfoo. See also: Carbon-Tracking-Log · LN-Tools-Dev-Environment
LLM Evaluation & Observability
| # | Title | Channel | URL | PUMA Relevance |
|---|---|---|---|---|
| 1 | Construyendo IA Fiable: Evals, Trazabilidad y Observabilidad \| LambdaCast 34 | LambdaLoopers | https://www.youtube.com/watch?v=qZ2Eu3kqA_g | ⭐⭐⭐⭐⭐ PUMA’s evaluation design; evals + traceability |
| 2 | Curso evaluacion LLM con Promptfoo - episodio 1 | La Hora Maker | https://www.youtube.com/watch?v=nGaHoH9HHu0 | ⭐⭐⭐⭐ Promptfoo for LLM evaluation |
| 3 | AI Testing Series Day 1 \|\| Test AI 10× Faster with promptfoo! | AB Automation Hub | https://www.youtube.com/watch?v=vfHu2-YLBWE | Promptfoo basics |
| 4 | AI Testing Series Day 2 \|\| Variable Injection & Assertions in promptfoo | AB Automation Hub | https://www.youtube.com/watch?v=9S9UbvxO60c | Promptfoo advanced |
| 5 | Introduction to Observability and Prometheus Tutorial | NullSafe Architect | https://www.youtube.com/watch?v=sNk9NkgTOLs | Observability fundamentals |
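The overview lists Wilcoxon tests among the tools PUMA uses for evaluation. As a rough illustration of what such a test computes, the sketch below implements an exact two-sided Wilcoxon signed-rank test on paired scores using only the standard library. The function name `wilcoxon_signed_rank` and the sample scores are hypothetical, not taken from PUMA, and the brute-force enumeration only suits small samples (roughly 20 pairs or fewer); for real analyses `scipy.stats.wilcoxon` is the usual choice.

```python
from itertools import product


def wilcoxon_signed_rank(a, b):
    """Return (W, p) for paired samples a and b (exact two-sided test).

    Illustrative sketch: zero differences are dropped and tied absolute
    differences share their average rank. Use integer scores, or round
    floats first, so that ties are detected reliably.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks across ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    # Under the null hypothesis every sign pattern is equally likely,
    # so count the patterns that are at least as extreme as the data.
    total = sum(ranks)
    hits = 0
    for signs in product((0, 1), repeat=n):
        s = sum(r for r, keep in zip(ranks, signs) if keep)
        if min(s, total - s) <= w:
            hits += 1
    return w, hits / 2 ** n


# Hypothetical per-prompt scores (0-100) for two prompt variants.
variant_a = [72, 65, 80, 58, 90, 77]
variant_b = [68, 66, 75, 50, 85, 70]
print(wilcoxon_signed_rank(variant_a, variant_b))  # (1.0, 0.0625)
```

With six pairs the smallest attainable two-sided p-value is 2/64, so a result like 0.0625 mostly signals that more paired samples are needed before drawing conclusions.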
AI Testing (Playwright & TestSprite)
| # | Title | Channel | URL | PUMA Relevance |
|---|---|---|---|---|
| 6 | Este IA hace el testing por ti (TestSprite) | Fazt Code | https://www.youtube.com/watch?v=-BKm_wUg9P8 | AI-automated testing |
| 7 | TestSprite MCP + GitHub Copilot CLI = Agentic AI Agent Test | Execute Automation | https://www.youtube.com/watch?v=iSZMfK6SqRI | MCP-based testing |
| 8 | Claude Code + Playwright = INSANE Browser Automations | Chase AI | https://www.youtube.com/watch?v=I9kO6-yPkfM | Playwright automation |
| 9 | Claude Code + Playwright CLI: Automate QA with Less Tokens | Eric Tech | https://www.youtube.com/watch?v=nN5R9DFYsXY | QA automation |
| 10 | Multi-Agent Code Review - AI Tools Compare Verilog Analysis | Craig Hollabaugh | https://www.youtube.com/watch?v=YdS45rcqHl0 | Multi-agent code review |
Carbon & Sustainability
| # | Title | Channel | URL | PUMA Relevance |
|---|---|---|---|---|
| 11 | (CodeCarbon official docs) | — | https://codecarbon.io | ⭐⭐⭐⭐⭐ Primary CodeCarbon resource |
| 12 | How AI is Automating AI Research: The Agentic Loop Explained! | AINexLayer | https://www.youtube.com/watch?v=KjbaFUjPkpM | Research automation measurement |
DevOps & CI/CD
| # | Title | Channel | URL | PUMA Relevance |
|---|---|---|---|---|
| 13 | Railway CLI + IA: Despliega TODO sin tocar nada 🤯 | Fazt Code | https://www.youtube.com/watch?v=kzeNxAdpV6g | Railway deployment |
| 14 | De saturar 4GB a Infraestructura Mínima: Refactorización de un MVP en Go | DevExpert | https://www.youtube.com/watch?v=zcID70f04g4 | MVP infrastructure |
| 15 | Google Cloud Tech: Monitoring configuration and automating detection & remediation | Google Cloud | https://www.youtube.com/watch?v=uaa6VNxcn2s | Cloud monitoring automation |
PydanticAI (Output Validation)
| # | Title | Channel | URL | PUMA Relevance |
|---|---|---|---|---|
| 16 | E124 - Creando agentes con PydanticAI | en_coders | https://www.youtube.com/watch?v=txRPLlkK4KE | ⭐⭐⭐⭐ PydanticAI agent creation |
| 17 | Pydantic AI + DeepSeek V3 - The BEST AI Agent Combo | Cole Medin | https://www.youtube.com/watch?v=zf_D2Eafvk0 | PydanticAI + DeepSeek |
| 18 | LLM Tutorial REVOLUTIONIZED with PydanticAI’s AI-Powered Tech Support | Atef Ataya | https://www.youtube.com/watch?v=hDoN9AetTms | PydanticAI for structured outputs |