California-Harvard Astrostatistics Collaboration:
In recent years, technological advances have dramatically increased the quality and quantity of data available to astronomers. Newly launched or soon-to-be-launched space-based telescopes are tailored to data-collection challenges associated with specific scientific goals. These instruments provide massive new surveys resulting in new catalogs containing terabytes of data, high-resolution spectrography and imaging across the electromagnetic spectrum, and incredibly detailed movies of dynamic and explosive processes in the solar atmosphere. The spectrum of new instruments is helping scientists make impressive strides in our understanding of the physical universe, but is at the same time generating massive data-analytic and data-mining challenges for the scientists who study the resulting data.
The complexity of the instruments, the complexity of the astronomical sources, and the complexity of the scientific questions lead to a subtle inference problem that requires sophisticated statistical tools. For example, data are typically subject to non-uniform stochastic censoring, heteroscedastic measurement errors, and background contamination. Astronomical sources exhibit complex and irregular spatial structure. Scientists wish to draw conclusions about the physical environment and structure of the source, the processes and laws that govern the birth and death of planets, stars, and galaxies, and ultimately the structure and evolution of the universe. Nonetheless, insufficient effort has been made to bring the strength of modern statistical methods to bear on these problems. The California-Harvard Astrostatistics Collaboration proposes to develop statistical methods, computational techniques, and freely available software to address outstanding inferential problems in high-energy astrophysics and in solar physics.
