Research

Machine learning accelerated Bayesian joint analysis of data from the Universe's first Gyr

Cosmic history

Timeline showing a Big Bang model for the origin and evolution of the Universe.

Image credit: NASA/WMAP Science Team

To better understand how the first galaxies formed and reionized the intergalactic medium (IGM), we develop machine learning–accelerated Bayesian frameworks for the joint analysis of high-redshift cosmological data.

In Sims et al. 2025, we develop Bayesian joint analysis and information-theoretic frameworks that combine data from across the electromagnetic spectrum to place improved bounds on the timing and drivers of cosmic reionization. The data sets include Cosmic Microwave Background measurements (Planck, SPT), 21-cm power spectrum upper limits (HERA, LOFAR, MWA), and Lyman-line-based estimates of the IGM neutral fraction.
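As a rough illustration of what a joint analysis does, the sketch below combines two independent constraints on a single parameter by summing their log-likelihoods on a grid. It is a toy under stated assumptions and not the pipeline of Sims et al. 2025: the parameter (a reionization midpoint redshift), the Gaussian likelihood shapes, the flat prior, and all numerical values are hypothetical and chosen only to show the mechanics.

```python
import numpy as np


def gaussian_loglike(theta, mean, sigma):
    """Log-likelihood of a Gaussian constraint, up to an additive constant."""
    return -0.5 * ((theta - mean) / sigma) ** 2


def joint_posterior(theta_grid, constraints):
    """Normalized posterior on a uniform grid, assuming a flat prior.

    For statistically independent data sets the joint log-likelihood is
    the sum of the individual log-likelihoods.
    """
    log_post = np.zeros_like(theta_grid)
    for mean, sigma in constraints:
        log_post += gaussian_loglike(theta_grid, mean, sigma)
    log_post -= log_post.max()  # avoid underflow before exponentiating
    post = np.exp(log_post)
    dtheta = theta_grid[1] - theta_grid[0]
    return post / (post.sum() * dtheta)


# Hypothetical (mean, sigma) constraints on a reionization midpoint redshift.
# These numbers are invented for illustration, not measured values.
toy_constraints = [(7.8, 0.8), (7.2, 0.5)]

theta_grid = np.linspace(4.0, 12.0, 4001)
posterior = joint_posterior(theta_grid, toy_constraints)

dtheta = theta_grid[1] - theta_grid[0]
post_mean = np.sum(theta_grid * posterior) * dtheta
post_std = np.sqrt(np.sum((theta_grid - post_mean) ** 2 * posterior) * dtheta)
print(f"joint constraint: {post_mean:.3f} +/- {post_std:.3f}")
```

The joint constraint is narrower than either input, which is the basic reason for combining probes. Real reionization likelihoods are generally non-Gaussian (upper limits, for example) and depend on many parameters, which is where sampling and machine-learning acceleration become necessary.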

Our analysis shows that reionization was driven primarily by galaxies forming in dark matter halos with masses above ~10⁹ M☉, and that the transition from a neutral to an ionized IGM occurred rapidly, completing by redshift ~6. Using neural density estimation methods, we sample directly from the CMB optical depth predicted by each model rather than assuming a fixed prescription for the ionizing photon escape fraction, which is a key astrophysical uncertainty.

This approach yields tighter, model-aware constraints on the timing, duration, and source populations of reionization. Future work will incorporate improved models of star formation and ionizing photon escape, calibrated by new data from JWST and the Lyman-alpha forest, to refine our understanding of galaxy formation in the first billion years.
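For readers unfamiliar with the CMB optical depth mentioned above, the following sketch shows how a Thomson scattering optical depth can be computed from a given ionization history by numerical integration. It is a minimal example under stated assumptions and does not reproduce the models in Sims et al. 2025: the flat ΛCDM parameters are illustrative, roughly Planck-like values, the tanh-shaped history in redshift is a common toy parameterization, helium is taken to be singly ionized alongside hydrogen, and the function names are our own.

```python
import numpy as np

# Physical constants (SI units)
C = 2.99792458e8           # speed of light, m/s
SIGMA_T = 6.6524587e-29    # Thomson cross-section, m^2
M_P = 1.67262192e-27       # proton mass, kg
MPC = 3.0856776e22         # metres per megaparsec
RHO_CRIT_H2 = 1.87834e-26  # critical density divided by h^2, kg/m^3

# Assumed flat LCDM parameters (illustrative values only)
H0 = 67.4 * 1.0e3 / MPC    # Hubble constant, s^-1
OMEGA_M = 0.315            # matter density parameter
OMEGA_B_H2 = 0.0224        # physical baryon density
Y_HE = 0.245               # helium mass fraction

# Present-day hydrogen number density (m^-3) and helium-to-hydrogen ratio
N_H0 = (1.0 - Y_HE) * OMEGA_B_H2 * RHO_CRIT_H2 / M_P
F_HE = Y_HE / (4.0 * (1.0 - Y_HE))


def hubble(z):
    """Hubble rate in s^-1 for flat LCDM, neglecting radiation."""
    return H0 * np.sqrt(OMEGA_M * (1.0 + z) ** 3 + 1.0 - OMEGA_M)


def tanh_history(z, z_re, delta_z=0.5):
    """Toy history: free electrons per hydrogen atom, tanh step in redshift."""
    return 0.5 * (1.0 + F_HE) * (1.0 + np.tanh((z_re - z) / delta_z))


def optical_depth(x_e_of_z, z_max=30.0, n_steps=20001):
    """tau = c * sigma_T * integral of n_e(z) / ((1 + z) H(z)) dz from 0 to z_max."""
    z = np.linspace(0.0, z_max, n_steps)
    # n_e(z) = x_e(z) * N_H0 * (1 + z)^3, so one power of (1 + z) cancels
    integrand = C * SIGMA_T * N_H0 * x_e_of_z(z) * (1.0 + z) ** 2 / hubble(z)
    # Trapezoidal rule written out explicitly
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(z)))


tau = optical_depth(lambda z: tanh_history(z, z_re=7.7))
print(f"tau = {tau:.4f}")
```

Earlier reionization gives a larger optical depth, which is why a CMB measurement of this quantity constrains the timing of reionization.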