Data Science Fellowship Program


The LSST Discovery Alliance (LSST-DA) Data Science Fellowship Program (DSFP) is an innovative training program launched in 2016.

Our vision is to position graduate students to be future leaders who can meet the scientific challenges posed by large astronomy datasets, specifically datasets with the velocity and volume anticipated from Rubin Observatory’s Legacy Survey of Space and Time (LSST). Meet our past and present Data Science Fellowship cohorts.

The DSFP is free for students, thanks to the generosity of the National Science Foundation, the Brinson Foundation, the Gordon and Betty Moore Foundation, and LSST-DA. Selected fellows receive a travel stipend, lodging for the duration of each session, and a per diem to offset the cost of meals and incidental expenses.

We are not accepting applications at this time, but anticipate the next call for applications will be in the fall of 2024. Look for announcements for future cohorts here and through Rubin Observatory community networks. We strive to create an inclusive program and particularly encourage applications from students from traditionally underrepresented groups in astronomy.

Fellowship Success

By all measures, the DSFP has been successful in meeting its goals:

  • Of the 90 students who completed DSFP as of September 2023, 99% reported that DSFP contributed to their PhD and/or securing their current position.
  • 56 have received a PhD, 30 are still working toward a PhD, 35 have become postdocs (14 won prize fellowships), and 4 have become tenure-track faculty.
  • The vast majority of past students who received PhDs and did not continue in academia are in data science industry jobs.
  • Applications for each DSFP student cohort are now >20x oversubscribed.

Program Overview

The DSFP is a two-year training program designed to teach skills required for Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) science that are not easily addressed by current astrophysics programs. Fellows learn a wide range of essential skills, including the basics of managing and building code, statistics, machine learning, scalable programming, data management, image processing, visualization, and communication. Our program is a supplement to graduate education, intended to teach students in astronomy-related fields (e.g., astrophysics, cosmology, and planetary science) essential skills for dealing with big data.

The DSFP consists of six one-week schools over a two-year period (three per year), each at a different host institution. This gives students ample time to master the skills taught. As this program is intended as a true supplement to graduate education, fellows make a two-year commitment. They have the opportunity not only to study these topics in far greater depth than traditional schools allow but also to foster new collaborations and professional networks. Beyond teaching our students the skills they need for modern survey astronomy, we aim to create a collaborative, supportive learning environment and to empower our students to teach the skills they learn to others.

The Curriculum

Below is a partial list of topics the DSFP covers. Our curriculum is developed and distributed openly. To facilitate the exploration of our materials, we have linked example lessons for the topics below (full material is available via our GitHub repository). We also film every lecture and post the lectures to our YouTube channel.

Software Engineering

  • Building code repositories
  • Object-oriented programming
  • Version control/GitHub
  • Issue tracking
  • Unit tests
  • Continuous integration

Statistics

  • Regression
  • Frequentist vs. Bayesian methods
  • Gaussian processes
  • Generative models
  • Hierarchical models
  • Missing information and selection effects

Machine Learning

  • Unsupervised methods, including density estimation, anomaly detection, feature extraction, and clustering techniques
  • Supervised methods
  • End-to-end automated classification models
  • Deep neural networks

Scalable Programming and Data Management

  • Parallel programming
  • Databases
  • Software profiling
  • Cloud computing

Time Series Analysis 

  • Understanding variable sources with incomplete and noisy sampling
  • Measures of periodicity
  • Gaussian processes
  • The LSST alert stream

Image Processing

  • Noisy astronomical detectors
  • Processing pipelines
  • Position, flux, and shape measurements
  • Hands-on experience with the LSST image-processing software stack

Visualization  

  • Visualization of high-dimensional datasets
  • Interactive visualization for exploration
  • Visual hierarchies
  • The effective use of space, color, contrast, and textures

Science Communication 

  • Understanding your audience
  • Effective body language for communication
  • Presentation design principles
  • Using data to tell a story

Program Leadership

The Data Science Fellowship Program is led by:

  • Adam Miller, CIERA/Northwestern (Director)
  • Lucianne Walkowicz, JustSpace Alliance (Founder, Deputy Director)
  • Bryan Scott, CIERA (DSFP Postdoctoral Fellow)
  • Vicky Kalogera, CIERA/Northwestern (Founding Member, Advising Director)

The program also has an advisory board:

  • Andrew Connolly, University of Washington
  • Chris Lintott, University of Oxford
  • Željko Ivezić, University of Washington
  • Phil Marshall, SLAC/Stanford University
  • Mario Jurić, University of Washington
  • Robert Lupton, Princeton University

For questions, use our contact form. Please select Data Science Fellowship Program from the dropdown.


LSST Discovery Alliance gratefully acknowledges support for the Data Science Fellowship Program from: