The AGN Data Challenge was created to motivate planning for AGN science with the Rubin Observatory Legacy Survey of Space and Time (LSST), focusing on parameterization of AGN light curves, AGN selection, and AGN photo-z. Below are the award recipients.
Selection of AGNs through variability and correlation
Awardees: Vincenzo Petrecca and Maurizio Paolillo
Award: $500
Variability is a hallmark of nuclear activity in galaxies at all frequencies, which makes it an efficient AGN selection tool. This notebook explores the parameter space of variable sources, showing that even the simplest statistics obtained from light curves perform very well in the selection of QSOs. We show that light-curve variance and correlation among bands alone allow us to produce samples of QSOs (and AGN) with a completeness of ~90%, albeit with low purity (~50%). By adding extendedness and color, it is possible to remove most of the contaminants, reaching a purity of ~90% while decreasing the completeness by less than 10%. Correlation analysis among different bands thus enables a very fast and cheap first-order selection of candidate QSOs (and possibly less luminous AGN); we propose including the correlation indices among the LSST data products, as they could be relevant features for more complex selection approaches based on traditional and machine learning methods.
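As a rough illustration of a variance-plus-correlation cut, the sketch below flags a source when its light curves in two bands are both variable and mutually correlated. It is not the awardees' notebook: the function names, the thresholds `min_var` and `min_corr`, the toy magnitudes, and the assumption that both bands are sampled at the same epochs are all hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant light curve carries no correlation signal
    return cov / (sx * sy)


def is_candidate(mag_g, mag_r, min_var=0.01, min_corr=0.5):
    """Flag a source whose g and r light curves are variable and correlated.

    Assumes paired epochs in both bands; thresholds are illustrative only.
    """
    variable = pvariance(mag_g) > min_var and pvariance(mag_r) > min_var
    return variable and pearson(mag_g, mag_r) > min_corr


# Toy magnitudes (made up): a variable, correlated source and a quiet one.
agn_g = [20.0, 20.3, 19.8, 20.5, 20.1]
agn_r = [19.6, 19.9, 19.5, 20.0, 19.7]
star_g = [18.00, 18.01, 17.99, 18.00, 18.01]
star_r = [17.50, 17.49, 17.51, 17.50, 17.50]

agn_flag = is_candidate(agn_g, agn_r)
star_flag = is_candidate(star_g, star_r)
```

Real LSST light curves are not sampled simultaneously across bands, so an actual pipeline would first need to pair or interpolate epochs before computing the correlation.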
ULISSE: a deep learning tool for the efficient initial exploration of large surveys
Awardees: Torbaniuk, Doorenbos, Paolillo, Cavuoti, Brescia, Longo, Pablo Márquez-Neila, Raphael Sznitman
Award: $1,000 x 2
Starting from a single prototype object, ULISSE identifies objects that share its morphological and photometric properties and thereby builds a list of candidate lookalikes. We applied ULISSE to the LSSTC AGN Challenge data for the detection of candidate Active Galactic Nuclei (AGNs). ULISSE employs transfer learning, directly using features extracted by a network pretrained on the ImageNet dataset. The strength of this method is that it rapidly produces a list of candidates from only a single image of a given prototype, without the need for any time-consuming neural network training. The method proved effective at identifying objects with similar properties and, in the specific application considered, showed good efficiency at detecting candidate AGNs.
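One simple way to picture a prototype-based search is a nearest-neighbour lookup in feature space, sketched below. This is an illustration under stated assumptions rather than the ULISSE code: the feature vectors are made-up three-element lists standing in for features from a pretrained network, and the Euclidean metric, the function names, and the object ids are hypothetical.

```python
from math import sqrt


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def lookalikes(prototype, catalog, k=2):
    """Return the ids of the k catalog objects closest to the prototype."""
    ranked = sorted(catalog, key=lambda obj_id: euclidean(prototype, catalog[obj_id]))
    return ranked[:k]


# Toy feature vectors (made up); in practice these would come from a
# pretrained network applied to each object's image.
features = {
    "obj_a": [0.9, 0.1, 0.4],
    "obj_b": [0.1, 0.8, 0.7],
    "obj_c": [0.8, 0.2, 0.5],
    "obj_d": [0.0, 0.9, 0.9],
}
prototype = [1.0, 0.0, 0.4]  # features of the single prototype object

candidates = lookalikes(prototype, features, k=2)
```

Because the features are computed once and only distances are compared, no network training is involved at search time, which is what makes this kind of approach fast.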
Deep neural network architecture
Awardee: Weixiang Yu
Award: $750
Rubin LSST is expected to find ~100 million AGN over the course of its 10-year survey. Confidently and efficiently selecting those AGN from a total of ~40 billion LSST sources without spectroscopy is an extremely challenging task. To tackle this challenge, my submission strives to make the most of the rich LSST dataset in a scalable manner. I constructed a deep neural network architecture that classifies sources into Quasar/AGN, Galaxy, and Star using properties that will be included in the future data release catalog (e.g., color and proper motion) and a 2D representation of their light curves in all photometric bands. My proposed algorithm works equally well for both densely sampled and sparsely sampled sources. High completeness and purity (>96%) were achieved for all three classes on the data challenge dataset.
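For readers unfamiliar with the metrics quoted above, completeness is the fraction of true members of a class that the classifier recovers (recall), and purity is the fraction of objects assigned to that class that truly belong to it (precision). The sketch below computes both per class; the function name and the toy labels are hypothetical and unrelated to the challenge data.

```python
def completeness_purity(true_labels, predicted_labels, target):
    """Return (completeness, purity) for one class.

    completeness = true positives / all true members of the class
    purity       = true positives / all objects predicted as the class
    """
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t == target and p == target)
    n_true = sum(1 for t, _ in pairs if t == target)
    n_pred = sum(1 for _, p in pairs if p == target)
    completeness = true_pos / n_true if n_true else 0.0
    purity = true_pos / n_pred if n_pred else 0.0
    return completeness, purity


# Toy labels (made up) for six sources.
truth = ["AGN", "AGN", "AGN", "Star", "Galaxy", "Galaxy"]
pred = ["AGN", "AGN", "Galaxy", "Star", "Galaxy", "AGN"]

agn_scores = completeness_purity(truth, pred, "AGN")
star_scores = completeness_purity(truth, pred, "Star")
```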
Machine learning methods
Awardees: Djordje Savic, Isidora Jankov
Award: $2,500
The amount of data the LSST will bring requires us to apply machine learning methods for data-driven science. Quasars are essentially galaxies in a phase of strong nuclear activity driven by a supermassive black hole. Separating quasars from other objects in the sky, namely stars and regular galaxies, is one of the fundamental tasks in cosmology and galaxy evolution. We extensively investigated how various machine learning methods (random forest, support vector machine, XGBoost, and artificial neural networks) perform on a variety of data, including tabular, time series, and imaging datasets. Including time series substantially increased the classification performance. We decided to participate as a way to learn progressively, starting almost from scratch. The whole concept of the data challenge was highly motivating, and we were looking forward to exchanging ideas with our international colleagues who are working in the same field. The combined results of the participating teams are highly complementary and beautifully demonstrate how the same problem can be approached from different perspectives.