LN — Plasma Control with Deep RL (Degrave et al., 2022)

Full Reference: Degrave, J., Felici, F., Buchli, J., et al. (2022). Magnetic control of tokamak plasmas through deep reinforcement learning. Nature, 602, 414–419. https://doi.org/10.1038/s41586-021-04301-9

Pass 1 — Bird’s Eye

Main Claim

A deep RL agent learns to control plasma configurations in the TCV tokamak, including configurations not previously achievable with classical linear controllers.

Property	Detail
Type	Research paper — Physics / Reinforcement Learning
Relevance to PUMA	⭐⭐ Medium — demonstrates AI discovering novel operational strategies in a domain with complex physics; analogous to PUMA discovering optimal triage strategies

Pass 2 — Key Content

System

Deep RL agent trained in simulation (MHD physics model), then deployed on the real TCV tokamak
Learns control policies: voltage commands to the magnetic control coils that maintain the plasma shape
Operates at a 10 kHz control frequency
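
A minimal sketch of what a 10 kHz loop implies: the policy gets roughly 100 microseconds per step to turn measurements into coil commands. Everything below except the control frequency is a hypothetical stand-in (sensor count, coil count, limits, the toy linear policy, the fake sensor model). It is not the architecture or code from the paper.

```python
import math
import random

CONTROL_HZ = 10_000                 # control frequency stated in this note
STEP_SECONDS = 1.0 / CONTROL_HZ     # time budget per control step (100 microseconds)

N_MEASUREMENTS = 8   # hypothetical sensor count, for illustration only
N_COILS = 3          # hypothetical coil count, for illustration only
CMD_MAX = 1.0        # hypothetical normalized command limit


def make_policy(seed=0):
    """Return a toy linear policy with tanh squashing (stand-in for a trained network)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(N_MEASUREMENTS)]
               for _ in range(N_COILS)]

    def policy(measurements):
        # One command per coil, bounded to [-CMD_MAX, CMD_MAX] by tanh.
        return [CMD_MAX * math.tanh(sum(w * m for w, m in zip(row, measurements)))
                for row in weights]

    return policy


def fake_sensors(t):
    # Hypothetical sensor model: slowly varying sinusoids, not real diagnostics.
    return [math.sin(2 * math.pi * 50 * t + i) for i in range(N_MEASUREMENTS)]


def run_control_loop(policy, read_sensors, n_steps):
    """Call the policy once per control step and collect the commands."""
    commands = []
    for step in range(n_steps):
        t = step * STEP_SECONDS
        commands.append(policy(read_sensors(t)))
    return commands


policy = make_policy()
commands = run_control_loop(policy, fake_sensors, n_steps=100)  # 10 ms of simulated control
```

The point of the sketch is the shape of the loop: a fixed-rate call from measurements to bounded actuator commands, with the learned policy as the only controller in between.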

Novel Contributions

First demonstration of RL control for tokamak plasma (milestone)
Agent produced novel plasma configurations not previously achievable with classical controllers
Multi-objective control (various plasma shapes) via single policy

Knowledge Generated

The novel configurations represent genuinely new operational knowledge for fusion physics
Plasma shapes discovered by RL were not anticipated by domain experts — constitutes new knowledge

PUMA Relevance

PUMA Analogy

The pattern is analogous to PUMA:

RL agent discovers novel plasma configurations not anticipated by human experts

PUMA’s CoT agent might discover novel triage reasoning chains not used by human PMs

The key difference: PUMA’s “discovery” is evaluated via F1-macro rather than physical plasma stability — but the knowledge generation dynamic is similar.
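
For reference, a minimal sketch of how F1-macro is computed: per-class F1 scores averaged with equal weight, so rare classes count as much as common ones. The label names are hypothetical; this note does not specify PUMA's triage classes or its evaluation code.

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical triage labels, for illustration only.
y_true = ["bug", "bug", "feature", "question"]
y_pred = ["bug", "feature", "feature", "question"]
score = f1_macro(y_true, y_pred)  # per-class F1: bug 2/3, feature 2/3, question 1
```

Unlike plasma stability, this score measures agreement with existing labels, so a genuinely novel reasoning chain only registers if it still lands on the labeled class.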

APA7 Citation

Degrave, J., Felici, F., Buchli, J., et al. (2022). Magnetic control of tokamak plasmas through deep reinforcement learning. Nature, 602, 414–419. https://doi.org/10.1038/s41586-021-04301-9

MOCs

MOC-AI-Knowledge-Generation
