How We Found the Most Distant Quasar (Yet) Known

False-color image of the field around the quasar ULAS J1120+0641 (the faint red source indicated by the cross hairs). Only its color distinguishes the quasar from the other sources, mostly ordinary stars in Earth's Milky Way galaxy. (Image credit: The United Kingdom Infrared Telescope)

Daniel Mortlock worked on the Planck satellite project at the University of Cambridge before becoming a lecturer in astrostatistics at Imperial College London in 2005. Mortlock contributed this article to Space.com's Expert Voices: Op-Ed & Insights.

Just before midnight on Sept. 3, 2010, an astronomical database went live on the Web. The Eighth Data Release of the — take a breath now — United Kingdom Infrared Telescope (UKIRT) Infrared Deep Sky Survey (UKIDSS) wasn't particularly noteworthy in computing terms, but it was of considerable scientific significance: It contained new data on hundreds of millions of astronomical objects, many of them never previously seen.

The vast majority of these objects were ordinary sunlike stars in Earth's own Milky Way galaxy, but there was about a 10 percent chance that hidden somewhere in the terabytes of data was a single object more distant than any known. My job was to find it.

Catching a quasar

I was part of an international team led by my Imperial College colleague Steve Warren, and the particular type of object we were looking for was a quasar. This is the glowing accretion disk of gas that can form around a supermassive black hole at the center of an otherwise ordinary galaxy. The material being pulled into the black hole gets compressed and heated to the point that it easily outshines all the stars in the host galaxy. In many cases, that host galaxy is so faint it is not detected, leaving only the quasar visible.

The main reason for putting so much effort into finding distant quasars, in particular, is that they are by far the brightest, and hence most revealing, astronomical objects in the early universe. Back in 2010, the most distant quasar known appeared to astronomers as it was when the universe was 900 million years old, just 7 percent of its current age of 13.8 billion years. (The finite speed of light means that larger physical distances translate to greater distances in time, or look-back times.)

It is remarkable that a disk of glowing gas about the size of our solar system can be seen billions of light years away, but the comparatively small size of quasars also means they appear star-like when viewed from Earth, just unresolved points of light in the night sky. This is one reason that quasars can be so hard to find: In any astronomical image taken through a single-wavelength filter, they are indistinguishable from ordinary stars, which massively outnumber them.

The secret to finding quasars is looking for their distinctive colors. The most distant quasars are very red, being almost invisible at optical wavelengths while appearing bright in the near-infrared. (This is due to a combination of the cosmological expansion, which Doppler-shifts all light to longer wavelengths, and absorption by neutral, i.e., un-ionized, hydrogen atoms present in the early universe.) In contrast, stars like the sun mainly emit optical light, although cooler brown dwarfs (essentially "failed" stars in which hydrogen fusion never got going) are almost as red as the target quasars. So, quasar searches are typically done by comparing images of the same part of the sky taken with different wavelength filters.
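As a rough illustration of what "color" means in practice, the sketch below turns the brightness of one source in two filters into a single number, using the standard astronomical magnitude convention. The function name and the flux values are invented for the example; they are not UKIDSS measurements.

```python
import math

def color_index(flux_short, flux_long):
    """Color as a magnitude difference between two filters.

    flux_short: brightness through the shorter-wavelength (bluer) filter
    flux_long:  brightness through the longer-wavelength (redder) filter
    Larger values mean a redder source.
    """
    if flux_short <= 0 or flux_long <= 0:
        raise ValueError("fluxes must be positive")
    return -2.5 * math.log10(flux_short / flux_long)

# Hypothetical sources in arbitrary flux units (illustrative only).
sunlike_star = color_index(flux_short=8.0, flux_long=10.0)
red_candidate = color_index(flux_short=0.1, flux_long=10.0)

print(f"sunlike star color:  {sunlike_star:.2f}")
print(f"red candidate color: {red_candidate:.2f}")
```

A source that is a hundred times fainter in the bluer filter than in the redder one gets a color of 5 in these units, which is the kind of extreme redness a search would look for.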

If the UKIDSS data had been perfect, it might have been possible to identify any record-breaking quasars immediately. But all real astronomical data is noisy: The measured colors of the sources in the UKIDSS catalogue (and all other data sets) don't quite match their true values.

As a result, in a plot of measured brightness ratios from different filters, stars and brown dwarfs overlap with distant quasars. The traditional approach of identifying all objects with colors like the target objects, which had worked in previous searches at smaller distances, would have been hopelessly inefficient with UKIDSS.
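A toy simulation makes the contamination problem concrete. The sketch below assumes a single color, Gaussian measurement noise, and made-up population sizes and color values; none of these numbers come from UKIDSS, and the real search used several filters.

```python
import random

def count_color_cut_survivors(n_objects, true_color, noise_sigma, color_cut, rng):
    """Count objects whose noisy measured color lands above the cut."""
    survivors = 0
    for _ in range(n_objects):
        measured = rng.gauss(true_color, noise_sigma)
        if measured > color_cut:
            survivors += 1
    return survivors

rng = random.Random(42)  # fixed seed so the run is repeatable

# All numbers below are illustrative assumptions, not survey values.
stars_selected = count_color_cut_survivors(100_000, 1.0, 0.5, 2.5, rng)
quasars_selected = count_color_cut_survivors(10, 3.0, 0.5, 2.5, rng)

print(f"stars passing the cut:   {stars_selected}")
print(f"quasars passing the cut: {quasars_selected}")
```

Under these assumptions only a tiny fraction of stars scatter past the cut, but because stars are so numerous, that fraction still outnumbers the handful of real quasars many times over.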

That could easily have been fatal for the project, as there were far too many objects to study more closely through re-observation. What was needed was some way to prioritize the best candidates on the basis of the data at hand alone.

This sort of problem — how best to make use of limited astronomical data — is the subject of the emerging field of astrostatistics (which, the complaints of Microsoft Word 2011 notwithstanding, is spelled without a hyphen).

Astrostatistics sorts the Big Data

The solution we came up with was to use the statistical technique of Bayesian model comparison to assess each candidate, in turn, by considering which of two hypotheses was more consistent with the data: that a given object is a (cool) star or that the object is a (distant) quasar.

An additional vital ingredient in the method is Bayes' theorem, a fundamental mathematical result published posthumously by the Presbyterian minister Thomas Bayes (1701-1761). The theorem demands the inclusion of prior information, rather than just the data at hand. This requirement is often cited as a reason not to use Bayesian methods, because it can seem that no useful prior information is available. But in our case we actively needed to use the (prior) fact that stars outnumber quasars by many thousands to one. The odds of any object chosen randomly from the UKIDSS database being a distant quasar were correspondingly low, and so most apparently promising candidates would correctly be discarded.
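The role of the prior can be seen in a minimal sketch of a two-hypothesis comparison. This is a deliberately simplified stand-in, not the team's actual calculation: it assumes one color measurement, Gaussian noise, a single "true" color for each population, and a prior of 10,000 stars per quasar. The function name and all default values are hypothetical.

```python
import math

def quasar_probability(measured_color, sigma,
                       star_color=1.0, quasar_color=3.0,
                       stars_per_quasar=10_000.0):
    """Posterior probability that a source is a quasar rather than a star.

    Assumes (for illustration only) Gaussian noise of width sigma and one
    fixed true color per population.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Log likelihood ratio (quasar vs. star) for equal-width Gaussians.
    log_bayes_factor = ((measured_color - star_color) ** 2
                        - (measured_color - quasar_color) ** 2) / (2.0 * sigma ** 2)
    # Prior odds: stars outnumber quasars by an assumed factor.
    log_prior_odds = -math.log(stars_per_quasar)
    log_odds = log_prior_odds + log_bayes_factor
    # Convert log odds to a probability without overflowing exp().
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)

precise_red = quasar_probability(measured_color=3.0, sigma=0.3)
noisy_red = quasar_probability(measured_color=3.0, sigma=1.0)
halfway = quasar_probability(measured_color=2.0, sigma=0.3)

print(precise_red, noisy_red, halfway)
```

In this toy setup, a source measured exactly at the quasar color is almost certainly a quasar when the measurement is precise, but the same measured color with a large error bar is still most likely a star, because the prior dominates. A source halfway between the two populations carries no information either way, so its probability falls back to the prior of about 1 in 10,000.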

Measured colors (essentially the ratio of how bright objects appear in different wavelength filters) for objects detected in the United Kingdom Infrared Telescope Infrared Deep Sky Survey that passed researchers' initial selection criteria (shown by the dashed lines). Even though the sources are broadly consistent with being distant quasars, the vast majority are actually either stars or brown dwarfs in the Milky Way galaxy (the predicted properties of which are shown as the blue curve). The five distant quasars (ULAS J1120+0641 and ULAS J1148+0702, along with the three already known) are indicated in red, with error bars to illustrate the limited precision of the measurements. The predicted quasar properties are shown as the blue curve, with labels showing how these colors change with look-back time. (Image credit: Daniel Mortlock)

Another appealing aspect of the Bayesian approach is that it automatically encodes many of the criteria that we had been applying intuitively (and qualitatively) when we had first started the search. Fainter objects had been rejected because the color estimates were less precise; now they were objectively ranked lower because a star that faint could easily end up having the measured colors of a quasar. We had regarded ambiguous objects with measured colors halfway between the two populations with limited enthusiasm; now they were rejected for being so much more likely to have been "scattered" from the dominant stellar population.
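The ranking behavior described above can be sketched with a handful of invented candidates. As before, this assumes a single color, Gaussian errors, one fixed color per population and 10,000 stars per quasar; the candidate labels and numbers are hypothetical, chosen only to show the ordering.

```python
import math

def log_posterior_odds(color, sigma, star_color=1.0, quasar_color=3.0,
                       stars_per_quasar=10_000.0):
    """Log odds (quasar vs. star) for one noisy color measurement."""
    log_prior_odds = -math.log(stars_per_quasar)
    log_bayes_factor = ((color - star_color) ** 2
                        - (color - quasar_color) ** 2) / (2.0 * sigma ** 2)
    return log_prior_odds + log_bayes_factor

# (label, measured color, measurement error); fainter means larger error.
candidates = [
    ("bright, star-like", 1.2, 0.2),
    ("faint, very red", 3.0, 1.0),
    ("bright, halfway", 2.0, 0.2),
    ("bright, very red", 3.0, 0.2),
]

# Best candidates first.
ranked = sorted(candidates,
                key=lambda c: log_posterior_odds(c[1], c[2]),
                reverse=True)

for label, color, sigma in ranked:
    print(f"{label:20s} log odds = {log_posterior_odds(color, sigma):8.2f}")
```

Only the bright, very red source ends up with odds in favor of being a quasar. The faint source with the same measured color and the bright source halfway between the populations both stay below even odds, which mirrors the intuitive cuts without anyone having to apply them by hand.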

The result of applying the Bayesian ranking scheme to the UKIDSS data was that an input list of tens of thousands of apparently good candidates was reduced to fewer than 50 objects. Three of those already had been identified as very distant (but not quite record-breaking) quasars by the earlier Sloan Digital Sky Survey (SDSS), an important validation of our approach. Quick follow-up observations to confirm the UKIDSS measurements of the remainder allowed us to discard all but two of the other candidates; we sent the coordinates of the two survivors to the Gemini North Telescope for more precise spectroscopic measurements (in which the light is separated into different wavelengths).

Ancient quasar revealed

The first of the two objects, with the perhaps uninspiring name of ULAS J1120+0641, was observed on the night of Nov. 27, 2010, and the spectrum immediately revealed it to be easily the most distant quasar known, bettering the previous record holder by a full hundred million years.

We had found what we were looking for — and the short time between the initial data release and the confirmation was important, as there were other research groups with access to the same data attempting the same search. (The second object, ULAS J1148+0702, was also confirmed as a quasar, but was in the same distance range as the slightly closer quasars found earlier by SDSS.) In the time since its discovery, the quasar ULAS J1120+0641 has been observed using telescopes all around the planet, and the Hubble Space Telescope in orbit.

Scientists are still unraveling this quasar's secrets. Aside from revealing what conditions were like 800 million years after the Big Bang, ULAS J1120+0641 is also the home of the earliest supermassive black hole found to date, a monster with two billion times the mass of the sun that had, in contradiction with most standard theories of black hole formation, somehow coalesced in the cosmologically short time available. And none of this would have been possible without a piece of mathematics done by an 18th-century Presbyterian minister.

The views expressed are those of the author and do not necessarily reflect the views of the publisher. This version of the article was originally published on Space.com.