PUMA Vault

Tag: switch-transformer

1 item with this tag.

  • Apr 14, 2026

    Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity

    • literature
    • mixture-of-experts
    • moe
    • sparse-model
    • switch-transformer
    • scaling
    • deepseek
    • mixtral
    • transformer
    • architecture
    • efficiency
    • puma-core
    • literature-note
    • moc
