Data science helps cross-check space discoveries 'across time and telescopes'

A large galaxy shines pink, with two nearby galaxies to its upper left. Dots of far more distant galaxies are scattered across the background.

The James Webb Space Telescope imaged the Cartwheel galaxy and its nearby companions. (Image credit: NASA, ESA, CSA, STScI, Webb ERO Production Team)

We are living in the age of information, and astronomy is no exception. Many telescopes now scan large sections of the sky, cataloging and imaging millions, even billions, of objects. Having this much information can do wonders for science, but it can also make things extremely difficult.

With so much data, it is often hard to match objects to one another across surveys. That’s why one group at Johns Hopkins University turned to data science to develop a new method of making such matches.

Matching astronomical objects is critical for space scientists because different surveys supply different information, such as wavelength coverage, exposure times or even the date of observation. Surveys such as the Sloan Digital Sky Survey, the Hubble Source Catalog, the Fermi Gamma-ray Space Telescope and the Evolutionary Map of the Universe each detect between thousands and billions of objects at a wide range of wavelengths and under very different conditions.

As you might imagine, problems often arise when researchers try to study an object that may be present in more than one of these surveys. For example, imagine observing a distant galaxy only to find that another foreground galaxy appears very close to your target. When looking at two different surveys, especially at multiple wavelengths, it may be difficult to determine which galaxy is which. To get the science right, the objects need to be matched correctly.

That’s not always easy.

This is where Jacob Feitelberg, Amitabh Basu and Tamás Budavári from Johns Hopkins University hope to step in. Using techniques often seen in data science, they paired observations from multiple surveys and estimated the likelihood that the recorded detections are indeed the same object. "For every observation from survey 1 and survey 2, we give this pair a 'score,' which measures the likelihood that these observations were of the same celestial object," Basu said in a statement. Scoring pairs this way lets the matching run quickly across enormous amounts of data. The team's method proved so effective, in fact, that the researchers could even match objects between 100 different catalogs.
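To make the idea of a pair "score" concrete, here is a minimal sketch in Python. It is not the Johns Hopkins team's method. It assumes a simple Gaussian model of positional error, uses hypothetical field names (`ra`, `dec`, `sigma`) and a hypothetical `min_score` cutoff, and resolves conflicts with a plain greedy pass rather than whatever optimization the researchers actually use. The made-up coordinates mimic the target-plus-foreground-galaxy case described earlier.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation between two sky positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for very small separations.
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return 2 * math.asin(min(1.0, math.sqrt(a))) * ARCSEC_PER_RAD


def pair_score(obs1, obs2):
    """Log of a 2D Gaussian density in the separation (assumed error model).

    Higher scores mean the two observations are more likely the same object.
    """
    sep = angular_separation_arcsec(obs1["ra"], obs1["dec"],
                                    obs2["ra"], obs2["dec"])
    var = obs1["sigma"] ** 2 + obs2["sigma"] ** 2  # combined variance, arcsec^2
    return -0.5 * sep ** 2 / var - math.log(2 * math.pi * var)


def match_catalogs(survey1, survey2, min_score=-10.0):
    """Greedy one-to-one matching: accept the best-scoring pairs first."""
    scored = sorted(
        ((pair_score(a, b), i, j)
         for i, a in enumerate(survey1)
         for j, b in enumerate(survey2)),
        reverse=True,
    )
    used1, used2, matches = set(), set(), []
    for score, i, j in scored:
        if score < min_score:  # hypothetical cutoff for implausible pairs
            break
        if i not in used1 and j not in used2:
            matches.append((i, j, score))
            used1.add(i)
            used2.add(j)
    return matches


# Illustrative, made-up positions: two galaxies about 1.5 arcseconds apart,
# listed in a different order in each survey. sigma is in arcseconds.
survey1 = [
    {"ra": 10.00000, "dec": -33.70000, "sigma": 0.1},
    {"ra": 10.00050, "dec": -33.70000, "sigma": 0.1},
]
survey2 = [
    {"ra": 10.00051, "dec": -33.70001, "sigma": 0.3},
    {"ra": 10.00001, "dec": -33.70001, "sigma": 0.3},
]

matches = match_catalogs(survey1, survey2)
for i, j, score in matches:
    print(f"survey1[{i}] <-> survey2[{j}]  score={score:.3f}")
```

A greedy pass is easy to read but can make poor choices in crowded fields, and scoring every possible pair grows quickly with catalog size, so a production matcher would likely need spatial indexing and a more careful assignment step.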