Machine Learning for Astroparticle Physics:
A Crash-course in SBI

Lecture 1a - Machine Learning in Astroparticle Physics

Christoph Weniger — University of Amsterdam (GRAPPA)

Doing physics with data, and where ML now enters

Theory
GR, ΛCDM, ν oscillation Hamiltonian, dark-matter halo models, EFT of LSS

→

Model / Simulator
LALSimulation, CORSIKA, N-body, hydro, ν propagation

Instrument
LIGO/Virgo/KAGRA, IceCube, Fermi-LAT, Euclid, LSST, CTA, SKA, CMB-S4

→

Data
strain h(t), DOM hits, γ-ray photon lists, survey images, CMB maps, visibilities

Data + Model
forward model meets observation

→

Inference
posteriors on θ: BH masses/spins, mixing angles, Ω_m, σ₈

→

Insight
H₀, NS equation of state, dark-matter mass, ν hierarchy

Machine learning and artificial intelligence now play a role in defining new theories, setting up and accelerating models and simulations, instrument design, data acquisition and processing, statistical analysis and inference, and the interpretation of results. They have permeated the entire scientific workflow.

A short history of statistical / computational astronomy

1809 — Gauss, Theoria Motus: least squares rediscovers Ceres. ML-before-ML.
1911–1913 — Hertzsprung & Russell: statistical population study of stellar luminosities & colours.
1968 — Schmidt, V/V_max: testing population uniformity in flux limited surveys, selection effects built in.
1996 — Bertin & Arnouts, SExtractor: its CLASS_STAR star/galaxy classifier is a small feed-forward NN, trained on simulations. Shipped in every modern survey pipeline.
2010 — Ball & Brunner review: Sloan-era ML (RF, SVM, photo-z, classification).
2015 — Dieleman, Willett & Dambre: rotation-invariant CNN wins the Galaxy Zoo Kaggle challenge. The deep-learning era begins for astro.

Statistical inference and pattern recognition have been at work in astronomy for two centuries. The principles and questions remain the same; what has massively expanded is what is possible in practice.

Bertin & Arnouts 1996 (A&AS 117, 393); Ball & Brunner 2010 (arXiv:0906.2173); Dieleman et al. 2015 (arXiv:1503.07077).

ML for theory and the simulator

Learning the forward model: symbolic regression, emulators, surrogates.

ML for the simulator — gravitational-wave surrogates

DANSur generation time vs batch size — constant ∼0.18 ms/waveform over 5 orders of magnitude in input size.

NRSur7dq4 (Varma et al. 2019): 1528 NR sims, q ≤ 4, χ ≤ 0.8, generic spins. ≥1 order of magnitude more accurate than alternatives within training range. Used in GW190521.
mlgw (Schmidt 2020) and mlgw-bns (Tissino 2022): PCA+regression on EOB waveforms; 10–50× speedup. BNS validated on GW170817.
DANSur (Freitas et al. 2024): dual-stage neural network (NN), pretrained on approximants then fine-tuned with NR (numerical relativity sims). Millions of waveforms in <20 ms on GPU; mean NR mismatch ∼10⁻⁴.

Varma et al. 2019 (arXiv:1905.09300); Freitas et al. 2024 (arXiv:2412.06946).
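
The "PCA + regression" recipe mentioned for mlgw can be illustrated with a toy example. This is a minimal sketch under loud assumptions: `toy_amp_phase` is a made-up smooth amplitude/phase model (not an EOB or NR waveform), the regression is a Chebyshev polynomial least-squares fit rather than a neural network, and all function names are hypothetical. It only shows the structure: decompose waveforms into amplitude and phase, compress each with PCA, and regress the PCA coefficients on the source parameter.

```python
import numpy as np
from numpy.polynomial.chebyshev import chebvander

t = np.linspace(0.0, 1.0, 512)

def toy_amp_phase(q):
    """Hypothetical smooth amplitude and phase for mass ratio q (not a real waveform model)."""
    q = np.atleast_1d(q)[:, None]
    amp = (0.2 + t**2) * q / (1.0 + q) ** 2
    phase = 2 * np.pi * (5.0 * t + 4.0 * t**3 / q + 0.5 * np.log(q) * t**2)
    return amp, phase

def features(q, deg=8):
    """Chebyshev features of q, rescaled from [1, 4] to [-1, 1]."""
    return chebvander((np.atleast_1d(q) - 2.5) / 1.5, deg)

def fit(X, q, n_comp=3):
    """PCA (via SVD) of the rows of X, then least-squares regression of the PCA coefficients on q."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = vt[:n_comp]                      # (n_comp, n_time) principal components
    coeffs = (X - mean) @ basis.T            # (n_train, n_comp) reduced representation
    weights, *_ = np.linalg.lstsq(features(q), coeffs, rcond=None)
    return mean, basis, weights

def predict(model, q):
    mean, basis, weights = model
    return mean + features(q) @ weights @ basis

# Training set: 200 "expensive" waveforms on a grid in q.
q_train = np.linspace(1.0, 4.0, 200)
amp_train, phase_train = toy_amp_phase(q_train)
amp_model = fit(amp_train, q_train)
phase_model = fit(phase_train, q_train)

def surrogate(q):
    """Cheap waveform: predicted amplitude times cosine of predicted phase."""
    return predict(amp_model, q) * np.cos(predict(phase_model, q))

# Validate on held-out mass ratios inside the training range.
rng = np.random.default_rng(0)
q_test = rng.uniform(1.0, 4.0, 50)
amp_true, phase_true = toy_amp_phase(q_test)
h_true = amp_true * np.cos(phase_true)
h_sur = surrogate(q_test)
rel_err = np.linalg.norm(h_sur - h_true, axis=1) / np.linalg.norm(h_true, axis=1)
print(f"worst relative L2 error on held-out q: {rel_err.max():.2e}")
```

Working in amplitude and phase rather than on the oscillating strain keeps the quantities smooth in the parameters, which is why a handful of principal components suffices in this toy setting. Like the surrogates above, the fit is only trustworthy inside its training range.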

ML for the simulator — cosmology emulators

CosmoPower (Spurio Mancini, Piras et al. 2021): NN emulators for cosmological power spectra; the de-facto standard inside modern MCMC pipelines.
CAMELS (Villaescusa-Navarro et al. 2021): thousands of N-body+hydro sims as training set for emulators and SBI.

CosmoPower (blue) reproduces a full CLASS-based posterior (red) on KiDS-450 + GAMA, in 3 min vs 2.5 h on 16 cores.

CosmoPower (arXiv:2106.03846); CAMELS (arXiv:2010.00619).

ML for theory — can a network rediscover the equation?

Cranmer, Sanchez-Gonzalez et al. 2020. Train a graph net on N-body sims, distil its edge messages with symbolic regression. Recovers Newton; discovers a new analytic formula for dark-matter overdensity δ (MAE 0.088).
Lemos et al. 2022. Same trick on 30 years of real solar-system data: rediscovers Newton's law plus all planetary masses, unsupervised.
Tools: PySR (Cranmer 2023), AI Feynman (Udrescu & Tegmark 2020).

Cranmer et al. 2020 (arXiv:2006.11287); Lemos et al. 2022 (arXiv:2202.02306).
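
To give a feel for what "rediscovering the equation" means, here is a drastically reduced stand-in. It is not a graph network and not PySR: it is an ordinary least-squares fit in log space over the family of power laws F = G m1^a m2^b r^c, applied to synthetic noisy force measurements. The data, noise level and variable names are assumptions made for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
G_true = 6.674e-11

# Synthetic "observations": masses, separations, and forces with 5% log-normal scatter.
m1 = rng.uniform(1.0, 10.0, n)
m2 = rng.uniform(1.0, 10.0, n)
r = rng.uniform(0.5, 5.0, n)
force = G_true * m1 * m2 / r**2 * np.exp(0.05 * rng.standard_normal(n))

# Power-law ansatz: log F = log G + a log m1 + b log m2 + c log r.
design = np.column_stack([np.ones(n), np.log(m1), np.log(m2), np.log(r)])
coef, *_ = np.linalg.lstsq(design, np.log(force), rcond=None)
log_G, a, b, c = coef

# "Distillation" step: snap the fitted exponents to the nearest integers.
exponents = np.round([a, b, c]).astype(int)
print("fitted exponents (m1, m2, r):", a, b, c)
print("snapped exponents:", exponents, " G estimate:", np.exp(log_G))
```

Symbolic regression searches a far larger space of expressions than this single power-law family, but the final step is similar in spirit: a flexible fit is reduced to a compact closed-form law.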

ML for the instrument and the data

From controlling a real detector to denoising, classification, and reconstruction.

ML for the data — glitch classification (Gravity Spy)

New glitch classes ("Paired Doves", "Helix") discovered by Gravity Spy volunteers during O1 beta testing.

Advanced LIGO data contains instrumental/environmental transients that mask or mimic astrophysical signals.
Zevin et al. 2017: CNN + Zooniverse citizen-science labels — the canonical glitch classifier.
Wu et al. 2024 (O4 update): multi-time-window fusion with attention. Deployed in O4.

Zevin et al. 2017 (arXiv:1611.04596); Wu et al. 2024 (arXiv:2401.12913).

ML for the data — neutrino reconstruction at IceCube

True vs reconstructed cascade energy: standard likelihood (left) and CNN (right). Comparable accuracy (CNN more stable at high energy), ∼100× faster.

Cascade-CNN with hexagonal kernels (Abbasi et al. 2021, JINST 16). Tested on experimental data. 2–3 orders of magnitude faster than likelihood reconstruction, robust to simulation systematics.
2023 DeepCore CNN: 2D architecture exploiting time and depth translational symmetry, for flavour identification (track-like ν_μ vs. cascades) and inelasticity reconstruction at GeV scale. Outperforms conventional likelihood reconstruction.

Abbasi et al. 2021 (arXiv:2101.11589); IceCube DeepCore CNN 2023 (arXiv:2307.16373).

ML for the instrument — Deep Loop Shaping at LIGO Livingston

Top: cosmological reach gain. Bottom: strain noise — RL controller (blue) vs operational linear controller (red), Dec 2024 LIGO Livingston.

DeepMind + LIGO Instrument Team, Science 389, 6764 (2025). Reinforcement learning with a frequency-domain reward (=reduce noise power) controls mirror suspensions in real time.
>30× control-noise reduction in the 10–30 Hz band, up to 100× in sub-bands. Surpasses the quantum-limit-motivated design goal.

Buchli, Tracey et al. 2025 (arXiv:2509.14016; DOI 10.1126/science.adw1291).

ML for the inference

Posteriors on physical parameters

ML for the inference — GW detection, then & now

2018 (simulated noise): CNN matches matched-filter ROC.

2024 (real O3 noise): AresGW p_astro vs FAR — new candidates inside GWTC/OGC/IAS spread.

Gabbard et al. 2018 (PRL): CNN matches matched-filter ROC on simulated data.
AresGW (Nousi et al. 2022, Koloniari et al. 2024): 54-layer ResNet + Deep Adaptive Input Normalization + dynamic augmentation + curriculum learning. Detects precessing (non-aligned-spin) BBH in real LIGO O3a noise; reports eight new candidate events (p_astro > 0.5) consistent with the FAR–p_astro spread of catalogued GWTC/OGC/IAS events, in the 7–50 M_⊙ training range.

Gabbard et al. 2018 (arXiv:1712.06041); AresGW (arXiv:2211.01520 & arXiv:2407.07820).
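
The matched filter that these networks are benchmarked against can be sketched in a few lines. This is a minimal sketch, assuming white Gaussian noise of unit variance, a known template, and a toy windowed chirp invented for the example. Real searches whiten coloured detector noise and scan template banks, none of which is shown here.

```python
import numpy as np

rng = np.random.default_rng(2)
n_data, n_tmpl = 4096, 256
k0 = 1500          # injection start sample (assumption for the demo)
amp_snr = 12.0     # injected optimal signal-to-noise ratio

# Toy template: Hann-windowed chirp with frequency rising from 0.05 to about 0.3 cycles/sample.
n = np.arange(n_tmpl)
template = np.hanning(n_tmpl) * np.sin(2 * np.pi * (0.05 * n + 0.0005 * n**2))
template /= np.linalg.norm(template)   # unit norm, so the injected amplitude equals the optimal SNR

# Data: white noise with sigma = 1 plus the scaled template at offset k0.
data = rng.standard_normal(n_data)
data[k0:k0 + n_tmpl] += amp_snr * template

# Matched filter in white noise: slide the template over the data.
# snr[k] = sum_n data[k + n] * template[n]
snr = np.correlate(data, template, mode="valid")
k_hat = int(np.argmax(snr))
print(f"recovered offset {k_hat} (injected {k0}), peak SNR {snr.max():.1f}")
```

For a known signal shape in Gaussian noise, this statistic is the optimal detector, which is why matching its ROC curve is the natural first benchmark for a CNN.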

ML for the inference — DINGO for GW parameter estimation

Eight GWTC-1 events. Coloured contours: DINGO (neural posterior estimation). Grey: LALInference (gold-standard stochastic sampler).

DINGO = neural posterior estimation on GW strain.

Simulator: waveform model + LIGO/Virgo noise PSDs.
Embedding: whitened strain → 128-d summary (SVD-seeded ResNet).
Head: 15-D conditional normalising flow over masses, spins, distance, sky, inclination.

Headline: O(day) → 20 s/event. Mean JSD vs LALInference 0.0009 nat (sampler floor ∼0.002).

DINGO-IS (2022) adds importance sampling: unbiased posteriors + failure-case diagnostic. 42 BBH events, median ε ∼10%.
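
The importance-sampling step and the efficiency ε can be made concrete with a one-dimensional toy problem. This is a sketch under stated assumptions: a Gaussian prior and likelihood (so the exact answer is known), and a deliberately mis-centred Gaussian standing in for the neural posterior q(θ|d). The weights are w = p(d|θ) p(θ) / q(θ|d) and the efficiency is ε = (Σw)² / (N Σw²). The helper `log_normal_pdf` and all numbers are invented for the illustration.

```python
import numpy as np

def log_normal_pdf(x, mean, std):
    """Log density of a Gaussian with the given mean and standard deviation."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2 * np.pi))

rng = np.random.default_rng(3)
d_obs, sigma = 1.0, 0.5     # toy datum and noise level (assumptions)
n = 20_000

# Stand-in for samples from a slightly imperfect neural posterior q(theta | d).
q_mean, q_std = 0.9, 0.55
theta = rng.normal(q_mean, q_std, n)

# Importance weights: likelihood times prior, divided by the proposal density.
log_w = (log_normal_pdf(d_obs, theta, sigma)        # p(d | theta)
         + log_normal_pdf(theta, 0.0, 1.0)          # p(theta), standard normal prior
         - log_normal_pdf(theta, q_mean, q_std))    # q(theta | d)
w = np.exp(log_w - log_w.max())                     # rescaled for numerical stability

efficiency = w.sum() ** 2 / (n * (w**2).sum())      # effective sample fraction
post_mean = np.sum(w * theta) / w.sum()             # reweighted posterior mean
evidence = np.exp(log_w.max()) * w.mean()           # estimate of p(d)

# Exact values for this conjugate Gaussian toy model.
post_mean_true = (d_obs / sigma**2) / (1.0 + 1.0 / sigma**2)
evidence_true = np.exp(log_normal_pdf(d_obs, 0.0, np.sqrt(1.0 + sigma**2)))
print(f"efficiency {efficiency:.2f}, mean {post_mean:.3f} (exact {post_mean_true:.3f})")
print(f"evidence {evidence:.4f} (exact {evidence_true:.4f})")
```

A low efficiency signals that the proposal misses part of the true posterior, which is the failure-case diagnostic referred to above. The mean weight also yields an evidence estimate as a by-product.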

NRE alternative (Delaunoy et al. 2020): learns the likelihood-to-evidence ratio \(p(d\mid\theta)/p(d)\) rather than the posterior; same ∼1000× speedup. The ratio acts as an importance weight on prior samples, analogous to the weights DINGO-IS applies to its proposal samples.

Dax et al. 2021 (PRL 127, 241103); Dax et al. 2022 (arXiv:2210.05686); Delaunoy et al. 2020 (arXiv:2010.12931).

ML for the inference — beyond gravitational waves

Neural SBI is run and stress-tested with full instrumental forward models across astroparticle physics and cosmology. Three more examples:

Strong lensing → dark-matter substructure. Wagner-Carena et al. 2023: NPE on populations of HST-quality simulated lenses (full pipeline systematics); subhalo mass function recovered from 1000 lenses.

arXiv:2203.00690

21cm → reionization astrophysics. Saxena et al. 2023: marginal NRE on mock SKA 21cm P(k), constraining X-ray heating and EoR parameters.

arXiv:2303.07339

Galaxy clustering → Ω_m, σ₈. SimBIG (Hahn et al. 2023): normalising-flow SBI on 109,636 real BOSS CMASS galaxies down to non-linear scales; σ₈ 27% tighter than PT-likelihood baselines.

arXiv:2211.00723

Two ways to do Bayesian inference with a simulator

\[ p(\theta\mid d) \;=\; \frac{p(d\mid\theta)\,p(\theta)}{p(d)} \]

Likelihood-based — "MCMC on steroids"

Write down \(p(d\mid\theta)\) explicitly.
Replace the slow forward model with a neural emulator (CosmoPower).
Run MCMC / HMC over the emulator. The likelihood is still there.
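
The three steps above can be sketched end to end on a toy problem. All ingredients are assumptions chosen for illustration: `slow_model` is a made-up one-parameter damped spectrum standing in for an expensive Boltzmann code, the emulator is a Chebyshev polynomial fit rather than a neural network such as CosmoPower, and the sampler is a plain random-walk Metropolis algorithm.

```python
import numpy as np
from numpy.polynomial.chebyshev import chebvander

k = np.linspace(0.1, 2.0, 20)
lo, hi = 0.5, 2.5                      # flat prior range on theta

def slow_model(theta):
    """Stand-in for an expensive forward model (hypothetical): a damped spectrum."""
    return np.exp(-np.atleast_1d(theta)[:, None] * k)

def feats(theta, deg=8):
    """Chebyshev features of theta rescaled to [-1, 1]."""
    return chebvander((np.atleast_1d(theta) - 0.5 * (lo + hi)) / (0.5 * (hi - lo)), deg)

# Step 2: "train" the emulator on 50 simulations (polynomial fit in place of a network).
theta_train = np.linspace(lo, hi, 50)
weights, *_ = np.linalg.lstsq(feats(theta_train), slow_model(theta_train), rcond=None)

def emulator(theta):
    return feats(theta) @ weights

theta_check = np.linspace(lo, hi, 200)
emu_err = np.max(np.abs(emulator(theta_check) - slow_model(theta_check)))

# Mock observation with known Gaussian noise.
rng = np.random.default_rng(4)
theta_true, sigma = 1.3, 0.02
data = slow_model(theta_true)[0] + sigma * rng.standard_normal(k.size)

def log_post(theta):
    """Step 1: explicit Gaussian log-likelihood (flat prior), evaluated on the emulator."""
    if not lo <= theta <= hi:
        return -np.inf
    resid = data - emulator(theta)[0]
    return -0.5 * np.sum(resid**2) / sigma**2

# Step 3: random-walk Metropolis over the emulator.
chain, theta, lp = [], 1.5, log_post(1.5)
for _ in range(6000):
    prop = theta + 0.03 * rng.standard_normal()
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    chain.append(theta)
samples = np.array(chain[1000:])       # discard burn-in

print(f"emulator max error {emu_err:.1e}")
print(f"theta = {samples.mean():.3f} +/- {samples.std():.3f} (truth {theta_true})")
```

The sampler never calls `slow_model` after training, which is where the speedup comes from. The likelihood function itself is unchanged, so any error in the emulator propagates directly into the posterior.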

Simulation-based — "ABC on steroids"

No explicit likelihood. Only \((\theta_i, x_i) \sim p(\theta)\,p(x\mid\theta)\) pairs.
Learn the posterior (NPE), the likelihood (NLE), or the ratio (NRE) directly.
DINGO is NPE.
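
For reference, the three targets can be written as training objectives over the simulated pairs. The notation (density estimators \(q_\phi\), classifier \(d_\phi\)) is an assumption of this note, following common usage in the SBI literature rather than anything fixed by the slides.

\[ \text{NPE:}\quad \min_\phi\; \mathbb{E}_{p(\theta)\,p(x\mid\theta)}\big[-\log q_\phi(\theta\mid x)\big] \;\;\Rightarrow\;\; q_\phi(\theta\mid x)\approx p(\theta\mid x) \]

\[ \text{NLE:}\quad \min_\phi\; \mathbb{E}_{p(\theta)\,p(x\mid\theta)}\big[-\log q_\phi(x\mid\theta)\big] \;\;\Rightarrow\;\; q_\phi(x\mid\theta)\approx p(x\mid\theta) \]

\[ \text{NRE:}\quad \min_\phi\; \mathbb{E}_{p(\theta)\,p(x\mid\theta)}\big[-\log d_\phi(\theta,x)\big] + \mathbb{E}_{p(\theta)\,p(x)}\big[-\log\big(1-d_\phi(\theta,x)\big)\big] \;\;\Rightarrow\;\; \frac{d_\phi}{1-d_\phi}\approx\frac{p(x\mid\theta)}{p(x)}=\frac{p(\theta\mid x)}{p(\theta)} \]

NPE yields posterior samples directly. NLE and NRE return a likelihood or ratio that is combined with the prior afterwards, typically via MCMC or by reweighting prior samples.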

Cranmer, Brehmer & Louppe 2020, "The frontier of simulation-based inference," PNAS 117, 30055 (arXiv:1911.01429).

Foundation models for astro

Foundation models touching astro

AstroCLIP (Parker et al. 2024): cross-modal image-spectrum embedding; transfers to photo-z, morphology, redshift.
AstroLLaMA (Nguyen et al. 2023): LLaMA-2 domain-adapted on astro-ph abstracts.

AstroCLIP (arXiv:2310.03024); AstroLLaMA (arXiv:2309.06126).

Plan for this course

In this course we introduce the foundational concepts of simulation-based inference step by step, with multiple examples to develop intuition. Hands-on exercises and lecture notes support the lectures.

Monday

Lecture 1a: Motivation (this lecture)
Lecture 1b: Approximate Bayesian Computation
Lecture 2a: Linear regression
Lecture 2b: Deep learning
Hands-on exercises

Wednesday

Colloquium: Ongoing work
Lecture 3a: Density estimation
Lecture 3b: Diffusion models

Thursday

Lecture 4a: Summary networks (CNNs)
Lecture 4b: Summary networks (GNNs)

Friday

Lecture 5a: Sequential SBI
Lecture 5b: Diagnostics
Hands-on exercises
