LN: Zelikman et al. (2024) — Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

Bibliographic Reference

Citation: Zelikman, E., Harik, G., Shao, Y., Jayasiri, V., Haber, N., & Goodman, N. D. (2024). Quiet-STaR: Language models can teach themselves to think before speaking. arXiv:2403.09629. COLM 2024. https://arxiv.org/abs/2403.09629

Pass 1 — Bird’s Eye View (5 Cs)

C	Assessment
Category	Training methodology proposal
Context	Extends STaR (Zelikman et al., 2022); unsupervised chain-of-thought training
Correctness	Evaluated zero-shot on CommonsenseQA and GSM8K. The reported gains are incremental (abstract: GSM8K 5.9% → 10.9%, CommonsenseQA 36.3% → 47.2%).
Contributions	(1) LLMs generate internal rationales for every token during training; (2) Rationales that improve predictions are reinforced; (3) Emergent reasoning without supervised CoT examples
Clarity	Complex implementation but well-explained theory.
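To make contributions (1) and (2) concrete, here is a minimal sketch of the reinforcement signal. It assumes the reward for a sampled rationale is the log-likelihood of the true following tokens given that rationale, minus the average over all rationales sampled at the same position, and that only positive rewards are reinforced. The function names and toy numbers are hypothetical and are not taken from the authors' code.

```python
def rationale_rewards(logprobs_with_thought):
    """Score each sampled rationale against the average of its siblings.

    logprobs_with_thought: one log-likelihood of the true future tokens
    per rationale sampled at the same token position (toy inputs here).
    """
    baseline = sum(logprobs_with_thought) / len(logprobs_with_thought)
    return [lp - baseline for lp in logprobs_with_thought]


def reinforced_indices(rewards):
    """Indices of rationales that beat the baseline (assumed: positive reward only)."""
    return [i for i, r in enumerate(rewards) if r > 0]


# Three hypothetical rationales for one position; only the second one helps.
rewards = rationale_rewards([-2.0, -1.0, -3.0])
helpful = reinforced_indices(rewards)
```

In this toy case only the second rationale would receive a gradient update, since it is the only one that makes the true continuation more likely than average.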

Relevance: ⭐⭐⭐

Relevant as background on why CoT works. Not directly applicable to the PUMA MVP, which uses prompting only and no training. Useful for future work (fine-tuning section).

PUMA Connection

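Quiet-STaR does not analyze prompting directly, but it offers indirect support for why adding “Think step by step” to prompts (Zero-Shot CoT) can improve classification quality: rationales generated before a prediction are reinforced only when they make the following text more likely. This supports the theoretical justification for PUMA’s Strategy 4 (CoT prompting). Reference for Ch.2 (prompting strategies background).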
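As an illustration of the Zero-Shot CoT prompting referred to above, a classification prompt might be assembled as follows. This is a sketch only: the function name, the label set, and the answer format are hypothetical and do not describe PUMA's actual Strategy 4 prompt.

```python
COT_TRIGGER = "Think step by step."


def build_prompt(text, labels, use_cot=True):
    """Assemble a zero-shot classification prompt, optionally with a CoT trigger."""
    lines = [
        "Classify the text into one of: " + ", ".join(labels) + ".",
        "Text: " + text,
    ]
    if use_cot:
        # Ask for reasoning first, then a final line that is easy to parse.
        lines.append(COT_TRIGGER)
        lines.append("Finish with a line of the form 'Label: <label>'.")
    else:
        lines.append("Answer with the label only.")
    return "\n".join(lines)


# Hypothetical labels, for illustration only.
prompt = build_prompt("The invoice is overdue.", ["billing", "support"])
```

Keeping the trigger behind a flag makes it straightforward to compare the same prompt with and without CoT.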
