Data science helps cross-check space discoveries 'across time and telescopes'

A large galaxy shines pink, with two nearby galaxies to its upper left. Far more distant galaxies are dotted across the background.
The James Webb Space Telescope imaged the Cartwheel galaxy and its nearby companions. (Image credit: NASA, ESA, CSA, STScI, Webb ERO Production Team)

We are living in the age of information, and astronomy is no exception. Many telescopes are trained to scan large sections of our sky, cataloging and imaging millions, even billions, of objects. Having this much information can do wonders for science, but it can also make things extremely difficult.

With so much data, it is often hard to match objects to one another across surveys. That’s why one group at Johns Hopkins University turned to data science to develop a new method of making such matches. 

Matching astronomical objects is critical for space scientists because different surveys supply different information, whether that be wavelength data, exposure times, or even the date the survey was done. Surveys such as the Sloan Digital Sky Survey, the Hubble Source Catalog, the Fermi Gamma-ray Space Telescope and the Evolutionary Map of the Universe each detect between thousands and billions of objects at a wide range of wavelengths and under very different observing conditions.


As you might imagine, problems often arise when researchers try to study an object that may be present in more than one of these surveys. For example, imagine observing a distant galaxy only to find that another foreground galaxy appears very close to your target. When looking at two different surveys, especially at multiple wavelengths, it may be difficult to determine which galaxy is which. To get the science right, the objects need to be matched correctly. 

That’s not always easy.  

This is where Jacob Feitelberg, Amitabh Basu and Tamás Budavári of Johns Hopkins University hope to step in. Using techniques common in data science, they paired objects from multiple surveys and estimated the likelihood that the recorded observations are indeed of the same object. "For every observation from survey 1 and survey 2, we give this pair a 'score,' which measures the likelihood that these observations were of the same celestial object," Basu said in a statement. Scoring pairs this way lets the matching scale quickly to enormous amounts of data. The team's method proved so effective, in fact, that the researchers could match objects across 100 different catalogs.
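To give a feel for what scoring a pair of observations might look like, here is a minimal Python sketch. It is not the team's published code. It assumes each observation is a record with a sky position (`ra`, `dec`, in degrees) and a positional uncertainty (`sigma`, in arcseconds), and it uses a simple Gaussian log-likelihood of the angular offset as the score. The function names, field names and coordinates are all hypothetical, chosen only for illustration.

```python
import math

def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation of two sky positions (degrees in, arcseconds out)."""
    ra1, dec1, ra2, dec2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for very small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0

def pair_score(obs1, obs2):
    """Illustrative score (an assumption, not the paper's formula):
    log-likelihood of the offset under an isotropic 2D Gaussian whose
    variance is the sum of the two positional variances."""
    sep = angular_separation_arcsec(obs1["ra"], obs1["dec"], obs2["ra"], obs2["dec"])
    var = obs1["sigma"] ** 2 + obs2["sigma"] ** 2
    return -0.5 * sep ** 2 / var - math.log(2 * math.pi * var)

def best_matches(survey1, survey2):
    """For each observation in survey1, return the index of the
    highest-scoring observation in survey2."""
    return [
        max(range(len(survey2)), key=lambda j: pair_score(obs, survey2[j]))
        for obs in survey1
    ]

# Made-up example: a target galaxy and a foreground galaxy about
# 3.6 arcseconds apart, each seen by two hypothetical surveys.
survey1 = [
    {"ra": 10.0, "dec": 20.0, "sigma": 0.1},      # target
    {"ra": 10.0, "dec": 20.001, "sigma": 0.1},    # foreground galaxy
]
survey2 = [
    {"ra": 10.00001, "dec": 20.0, "sigma": 0.3},  # target, slightly offset
    {"ra": 10.0, "dec": 20.00102, "sigma": 0.3},  # foreground galaxy, slightly offset
]

matches = best_matches(survey1, survey2)
for i, j in enumerate(matches):
    print(f"survey 1 object {i} pairs best with survey 2 object {j}")
```

Picking the top score for each object independently, as this sketch does, can assign two objects to the same counterpart in a crowded field. A real cross-matching method has to resolve such conflicts across the whole catalog, which is where most of the difficulty lies.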

"These observations are fundamental to building theories about the universe, from the smallest particles to the vast cosmos. By matching observations across time and telescopes, researchers can extract more knowledge from the same data, contributing to a deeper understanding of the cosmos,"  Budavári said. 

The team's code is publicly available, too.

A paper about this research was published in September in The Astronomical Journal. 


Elizabeth Fernandez

Elizabeth is a freelance science writer. She has a Ph.D. in astrophysics from the University of Texas at Austin and has worked with telescopes all around the world and in space. Now she writes on astronomy, physics, geology, mathematics, and science and technology in society.