LN: Strubell, Ganesh & McCallum (2019) — Energy and Policy Considerations for Deep Learning in NLP

Bibliographic Reference

Citation: Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and policy considerations for deep learning in NLP. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 3645–3650). https://doi.org/10.18653/v1/P19-1355

Pass 1 — Bird’s Eye View (5 Cs)

C	Assessment
Category	Empirical study + policy argument
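Context	University of Massachusetts Amherst, ACL 2019. One of the first systematic estimates of the environmental cost of NLP model training
Correctness	Power draw was sampled empirically (GPU via nvidia-smi, CPU via Intel RAPL) and combined with reported training times; cloud prices were used only for the financial cost estimates. The qualitative findings have been supported by subsequent studies, though the NAS figure is an estimate rather than a direct measurement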
Contributions	(1) Quantified CO₂ cost of training large NLP models (BERT, Transformer-NAS: up to 626,155 lbs CO₂); (2) Comparison with equivalent car/flight emissions; (3) Policy recommendations for NLP research community; (4) Methodology for measuring ML energy consumption
Clarity	Excellent — concrete numbers, clear methodology, provocative framing

Relevance: ⭐⭐⭐⭐⭐

Strubell et al. provide the academic justification for PUMA’s carbon footprint measurement (CodeCarbon integration). PUMA’s sustainability reporting methodology is directly traceable to this paper.

Pass 2 — Key Concepts

The Carbon Cost of NLP Training

Key findings (2019 figures):

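Model	CO₂e (lbs)	Equivalent to
Transformer (base)	26	~1% of one NY↔SF passenger flight
BERT (base)	1,438	~0.7 of one NY↔SF passenger flight
Transformer-NAS (neural architecture search)	626,155	~5× lifetime car emissions
GPT-2	not reported	TPU power draw was unavailable, so the paper gives only a cloud cost estimate

Reference values from the paper: one passenger flight NY↔SF ≈ 1,984 lbs CO₂e; one average car lifetime including fuel ≈ 126,000 lbs CO₂e.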
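The "equivalent to" column can be checked with simple division. The sketch below assumes the paper's reference values of 1,984 lbs CO₂e for one passenger flight NY↔SF and 126,000 lbs CO₂e for one average car lifetime; the function name `equivalents` is only illustrative.

```python
# Reference values from the paper's comparison table (lbs CO2e).
FLIGHT_NY_SF_LBS = 1_984    # one passenger, New York <-> San Francisco
CAR_LIFETIME_LBS = 126_000  # average car including fuel, one lifetime


def equivalents(co2e_lbs: float) -> dict[str, float]:
    """Express an emissions figure as multiples of the two reference values."""
    return {
        "flights": co2e_lbs / FLIGHT_NY_SF_LBS,
        "car_lifetimes": co2e_lbs / CAR_LIFETIME_LBS,
    }


# Model figures as reported in the paper (lbs CO2e).
models = {
    "Transformer (base)": 26,
    "BERT (base)": 1_438,
    "Transformer-NAS": 626_155,
}

for name, lbs in models.items():
    eq = equivalents(lbs)
    print(f"{name}: {eq['flights']:.3f} flights, "
          f"{eq['car_lifetimes']:.3f} car lifetimes")
```

Only the NAS run exceeds a single flight; the other models are a fraction of one.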

The CO₂ Measurement Methodology

Strubell et al.’s approach (basis for CodeCarbon):

$\mathrm{CO_2eq} = E \times CI$

Where:

$E$ = energy consumed (kWh) = Power draw × Duration
$CI$ = carbon intensity of electricity grid (kg CO₂/kWh)

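The paper does not stop at this two-factor form. It also multiplies measured energy by a PUE (Power Usage Effectiveness) coefficient of 1.58 to account for data centre overhead, and converts to emissions with the U.S. average carbon intensity of 0.954 lbs CO₂e/kWh. CodeCarbon implements the same three-factor model:

$\mathrm{CO_2eq} = E \times PUE \times CI$

Where PUE is the ratio of total data centre energy to the energy delivered to the computing hardware, so it captures overhead such as cooling.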
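As a minimal sketch of the formula with PUE, the function below uses the two constants the paper reports (PUE 1.58 and 0.954 lbs CO₂e/kWh). The workload of 1,000 W for 10 hours is a hypothetical input chosen for round numbers, not a figure from the paper or from PUMA, and the function names are illustrative.

```python
PUE = 1.58              # data-centre overhead coefficient used in the paper
CI_LBS_PER_KWH = 0.954  # U.S. average carbon intensity used in the paper


def energy_kwh(power_watts: float, hours: float, pue: float = PUE) -> float:
    """Energy in kWh: average power draw times duration, scaled by PUE."""
    return pue * power_watts * hours / 1000.0


def co2e_lbs(power_watts: float, hours: float,
             pue: float = PUE, ci: float = CI_LBS_PER_KWH) -> float:
    """CO2eq = E x PUE x CI, in lbs when ci is given in lbs/kWh."""
    return energy_kwh(power_watts, hours, pue) * ci


# Hypothetical workload: 1,000 W average draw for 10 hours.
print(f"energy: {energy_kwh(1000, 10):.2f} kWh")
print(f"emissions: {co2e_lbs(1000, 10):.2f} lbs CO2e")
```

To work in kilograms instead, pass a carbon intensity in kg CO₂/kWh as `ci`; the structure of the calculation is unchanged.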

Policy Recommendations

Strubell et al. make three policy recommendations:

Reporting standards: NLP papers should report training time and sensitivity to hyperparameters alongside performance metrics
Equitable access: Academic researchers need equitable access to computation resources, because high compute cost creates barriers for those without industry resources
Efficiency incentives: Research community should prioritise computationally efficient hardware and algorithms, not just maximally accurate models

These recommendations are widely cited as a motivation for later ML sustainability work (Green AI, SustaiNLP workshops).

Inference vs. Training Cost

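A distinction that matters for PUMA (the paper itself measures training and model development, not inference):

Training is where the paper’s headline costs arise (626,155 lbs CO₂e for NAS)
Inference with a pre-trained model is orders of magnitude cheaper per run (PUMA uses pre-trained models, so it incurs inference cost only)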

PUMA’s carbon footprint comes entirely from inference — running already-trained models on 200–1000 issues. This is at the milligram CO₂ scale, not tonne scale. However, measuring it demonstrates scientific rigour and establishes baselines for production SmartPMO deployment.
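A back-of-the-envelope estimate shows how the inference footprint scales with the number of issues. Every input below (50 W average draw, 1 second per issue, 0.4 kg CO₂/kWh) is a hypothetical placeholder rather than a PUMA measurement, and the function name is illustrative; tracked values should replace them.

```python
JOULES_PER_KWH = 3_600_000  # 1 kWh = 3.6 MJ


def inference_co2e_grams(n_issues: int, seconds_per_issue: float,
                         power_watts: float, ci_kg_per_kwh: float,
                         pue: float = 1.0) -> float:
    """Estimate inference emissions in grams CO2e for a batch of issues."""
    joules = power_watts * n_issues * seconds_per_issue
    kwh = pue * joules / JOULES_PER_KWH
    return kwh * ci_kg_per_kwh * 1000.0  # kg -> g


# Hypothetical inputs, not measured values.
POWER_W = 50.0
SECONDS_PER_ISSUE = 1.0
CI_KG_PER_KWH = 0.4

for n in (200, 1000):
    total_g = inference_co2e_grams(n, SECONDS_PER_ISSUE, POWER_W, CI_KG_PER_KWH)
    per_issue_mg = total_g / n * 1000.0
    print(f"{n} issues: {total_g:.3f} g total, {per_issue_mg:.3f} mg per issue")
```

The result is linear in every input, so whether the total lands in milligrams or grams depends on the real power draw and per-issue latency.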

PUMA Integration

Ch.3 Methods / Sustainability subsection: Strubell et al. as the methodological basis for CodeCarbon integration
CO₂eq formula: Directly from this paper, including its PUE factor (detailed in PN-ComputationalSustainability)
Framing: PUMA measures inference cost, not training cost — proportionately tiny, but establishes methodology for production deployment

Related Notes

PN-ComputationalSustainability — full CodeCarbon integration, CO₂eq formula, hardware baselines
Ethics-Review-Log — sustainability as an ethics consideration in PUMA
PN-LLM-Models-PUMA — model parameter counts that determine inference cost
