Characterizing Variable Stars with APOGEE + TESS#


Learning Goals#

By the end of this tutorial, you will:

  • Understand how to search the MAST Archive and download SDSS APOGEE data using astroquery.mast

  • Plot a stellar spectrum and an HR diagram and using APOGEE data

  • Understand how to search for TESS light curves complementing the APOGEE observations

  • Explore different types of variable stars using their APOGEE parameters and TESS light curves

Table of Contents#

  • Introduction

  • Imports

  • Accessing APOGEE data at MAST

    • Querying all APOGEE data

    • Searching for a specific star

    • Downloading APOGEE data products

    • Plotting an APOGEE spectrum

    • Downloading the APOGEE allStar catalog

    • Plotting an HR diagram from APOGEE

  • Searching for TESS data of this star

    • Coordinate search using astroquery.mast

    • Plotting a light curve from TESS

  • Combining APOGEE and TESS data

    • Plotting APOGEE and TESS data together

    • Exploring Different types of Variable Stars

      • RR-Lyrae Variable: 2M11361176+8117369

      • Rotationally Variable - 2M19181706+5141323

      • Eclipsing Binary - 2M19203184+3830492

  • Additional Resources

    • How to Cite

    • About This Notebook

Introduction#

The Apache Point Observatory Galactic Evolution Experiment (APOGEE) survey provides infrared-wavelength high-resolution spectroscopy for over 650,000 unique stars from the Milky Way and nearby dwarf galaxies. APOGEE collected data between 2011 - 2020 as part of the Sloan Digital Sky Survey (SDSS-IV) project. APOGEE data is now available at the Mikulski Archive for Space Telescopes (MAST) through the SDSS Legacy Archive at MAST.

In this notebook tutorial, we will demonstrate how to access APOGEE data at MAST using Python. One APOGEE star, a Cepheid Variable named V1154-Cyg will be used to demonstrate the basics of how to download and plot APOGEE data. We will then combine this APOGEE spectrum with light curves from the TESS mission, also accessible from MAST, to study this variable star.

Imports#

The main packages we’re using for this notebook and their use-cases are:

  • astroquery.mast Observations for searching the MAST archive

  • astropy.io fits for accessing FITS files

  • astropy.table Table for accessing FITS tables

  • astropy.units for working with astronomical units

  • astropy.coordinates for handling astronomical coordinates

  • matplotlib.pyplot for plotting data

  • numpy to handle array functions

%matplotlib inline

from astroquery.mast import Observations
import astropy.io.fits as fits
from astropy.table import Table
from astropy.coordinates import SkyCoord

import astropy.units as u
import matplotlib.pyplot as plt
import numpy as np

This cell updates some of the settings in matplotlib to use larger font sizes in the figures:

#Update Plotting Parameters
params = {'axes.labelsize': 12, 'xtick.labelsize': 12, 'ytick.labelsize': 12, 
          'text.usetex': False, 'lines.linewidth': 1,
          'axes.titlesize': 18, 'font.family': 'serif', 'font.size': 12}
plt.rcParams.update(params)

Accessing APOGEE data at MAST#

The SDSS Legacy Archive at MAST hosts all of the science-ready data products from the SDSS-IV APOGEE Survey, which includes spectra for over 650,000 unique stars in the Milky Way. APOGEE acquired observations in both the northern and southern hemispheres, targeting the disk, bulge, bar, and halo components of the Milky Way, as well as several hundred stars in nearby satellite galaxies (including the LMC and SMC). This notebook will demonstrate how to search and download APOGEE data using MAST!

Querying all APOGEE data#

Searching for APOGEE data is straightforward with astroquery.mast. In this example, we use Observations.query_criteria and search for provenance_name = 'APOGEE'. This will return a table describing all of the APOGEE data hosted by the MAST archive.

Other useful search parameters for APOGEE data might include:

  • obs_collection = 'SDSS': searches for all SDSS data

  • To specify telescope, use the instrument_name field. For example, instrument_name = 'apogee-lco25m' limits the search to the stars that were observed with the Las Campanas Observatory 2.5-m telescope in the southern hemisphere.

  • Use target_name to search for stars using their APOGEE ID (usually the 2MASS ID - for example '2M05320041-0017041')

  • obs_id can help search for specific targets or fields. Note that wild cards (*) are allowed in the search fields, for example:

    • obs_id='*n7789*' for anything in the “N7789” field

    • obs_id='*2M05320041-0017041*' for a specific star name (2MASS ID)

# Search for APOGEE data
# Using the pagesize and page parameters to only return first 10 results
apogee_obs_list = Observations.query_criteria(provenance_name='APOGEE', pagesize=10, page=1)

# Display First Ten Entries in Table
apogee_obs_list

Searching for a specific star#

Let’s narrow down the search to one particular star: cepheid variable “V1154-Cyg”. Cepheid Variables are pulsating stars, which physically grow larger and smaller, and as a result brighter and dimmer, on a regular rhythm of few days. Cepheid variables in particular are useful to study in astronomy because their pulsation period correlates with their intrinsic brightness, making it easy to determine how far away the star is! Later in this notebook, we will look at some other kinds of variable stars.

We can search for this star in particular using the target_name keyword:

# Search for APOGEE star V1154_Cyg
apogee_obs_list = Observations.query_criteria(provenance_name='APOGEE',
                                              target_name='V1154_Cyg')

# Display results
apogee_obs_list

From this results table, we can see some basic metadata related to this observation:

  • This star was observed with the APO 1-M telescope (instrument_name)

  • It’s coordinates are in the s_ra and s_dec columns

  • From the t_min column, we can see that this star was first observed on the date of MJD 58002 (Correpsonding to 2017-09-06) and last observed on MJD 58069 (2017-11-12 )

  • APOGEE provides infrared-wavelength (wavelength_region) spectra (dataproduct_type) with wavelength range of 1514 - 1696 nanometers (em_min, em_max)

Downloading APOGEE data products#

List all of the data products available for this star using Observations.get_product_list().

There are 8 total files available for this star, which include the individual spectra from each visit (APVISIT; apVisit-dr17-*-V1154_Cyg.fits), the combined spectrum (APSTAR; apStar-dr17-V1154_Cyg.fits), the processed and anaylzed spectrum (ASPCAPSTAR; aspcapStar-dr17-V1154_Cyg.fits) which contains the best-fit model and stellar parameters from the APOGEE Stellar Parameters and Chemical Abundance Pipeline (ASPCAP). Only the apStar and aspcapStar files are tagged as “Minimum Recommended Products’

More information on the APOGEE data products available at MAST can be found on the APOGEE Data Products in the Archive Manual, and more information on all of these products can be seen in the search results table:

# List all products available for this observation
products = Observations.get_product_list(apogee_obs_list)

# Show table
products

Now we will download the aspcapStar spectrum for this star using Observations.download_products(). The download will print a status message when completed.

manifest = Observations.download_products(products, productSubGroupDescription='ASPCAPSTAR')

Plotting an APOGEE Spectrum#

Now let’s take a look at the file and plot the spectrum.

Based on the descriptions in the aspcapStar Data Model, this file has four extensions:

  • HDU0: The Primary Header and file metadata

  • HDU1: The array containing the observed combined spectrum

  • HDU2: The error array for the spectrum

  • HDU3: The best fit model spectrum from ASPCAP

  • HDU4: The ASPCAP Data Table, containing the best fit stellar parameter values and other information for this star

# Open file
aspcap_star = fits.open(manifest['Local Path'][0])

# Display file information
aspcap_star.info()

We can plot the spectrum using this information! The wavelength information can be found in the file hearer (HUD0), and observed flux is contained in the first extension (HDU1).

plt.figure(figsize=(15, 5))

# Get wavelength information from the header
# CRVAL1 is the minimum wavlength, CDELT1 is the step size (in log space) and NAXIS1 is the number of pixels
apogee_wls = np.logspace(aspcap_star[1].header['CRVAL1'],
                         aspcap_star[1].header['CRVAL1']+aspcap_star[1].header['CDELT1']*aspcap_star[3].header['NAXIS1'],
                         aspcap_star[1].header['NAXIS1'])


# Get the observed and model flux from extensions 1 and 3
observed_flux = aspcap_star[1].data
model_flux = aspcap_star[3].data

# Mask the data using the error array in extenion 2
pixel_mask = (aspcap_star[2].data < 0.1) # use only pixels with small errors (10% or less)

# Plot the observed spectrum
plt.plot(apogee_wls[pixel_mask], observed_flux[pixel_mask], c='k', label='Observed Spectrum')

# Plot the model spectrum
plt.plot(apogee_wls[pixel_mask], model_flux[pixel_mask], c='darkred', label='Best Fit Model Spectrum')

# Set axes limits
plt.ylim(0.6, 1.1)
plt.xlim(np.min(apogee_wls), np.max(apogee_wls))

# Set labels
apogee_id = aspcap_star[4].data['APOGEE_ID'][0] # String containing star's name
plt.title(f"APOGEE Spectrum - {apogee_id}")
plt.xlabel(r'Wavelength [$\AA$]')
plt.ylabel('Flux')
plt.grid()
plt.legend()

plt.show()

This is the APOGEE spectrum for this star! It looks great. We can tell it must be a hot star because of the large Hydrogen Absorption lines (for example around 15346, 15443, 15560 and 16113 angstroms - this is the Hydrogen Brackett Series!) The gaps in the spectrum around 15800 and 16400 Angstroms are due to the APOGEE instrumentation: APOGEE observes across 3 ccds, which have a small gap in wavelength coverage between them.

Downloading the APOGEE allStar Catalog#

We can also download the APOGEE allStar catalog, which contains a lot of useful information for the full APOGEE sample, including stellar parameters, chemical abundances, positions, and observation data for all of the stars in APOGEE.

We can do this with the download_file function in astroquery, knowing the MAST data URL for the allStar catalog is mast:SDSS/apogee/allStar-dr17-synspec_rev1.fits.

This is a fairly large file (3.9 GB), so this download may take a few minutes!

# Download allStar file - this may take a few minutes!
Observations.download_file('mast:SDSS/apogee/allStar-dr17-synspec_rev1.fits')
# Open file 
allstar = Table.read('allStar-dr17-synspec_rev1.fits', hdu=1)

# Display table
allstar[:10]

Plotting an HR Diagram (or Kiel Diagram) with APOGEE#

When characterizing stars, a useful plot is the HR diagram, which helps classify stars by their different types (for example, a star can be a Sun-like main sequence star or a massive Red Giant star). Let’s plot an HR Diagram using APOGEE data.

Traditional HR diagrams plot a star’s color on the x-axis and its brightness (or magnitude) on the y-axis. For this excerise, we’ll be using the star’s temperature (\(T_{eff}\)) as a substitute for color and the surface gravity (\(\log g\)) as a proxy for brightness. This variation on the HR diagram is technically called a Kiel Diagram, but they are very similar in appearance and interpretation.

# Plot HR Diagram form APOGEE catalog
plt.figure(figsize=(10, 10))

# Plot Temperature (TEFF) against surface gravity (LOGG)
# And color points by their metallicity (FE_H)
im = plt.scatter(allstar['TEFF'], allstar['LOGG'], c=allstar['FE_H'],
                 marker='.', s=1, zorder=1,
                 cmap='jet', vmin=-1, vmax=0.5)
        
# Add colorbar legend to plot
plt.colorbar(im, location='bottom', label='Metallicity [Fe/H]')

        
# Labels and titles
plt.xlim(7000, 3000)
plt.ylim(5.5, -0.5)
plt.xlabel(r'Effective Temperature [$T_{\rm{eff}}$, K]')
plt.ylabel(r'Surface Gravity [$\log(g)$]')
plt.grid()
plt.title('APOGEE Stellar Parameters - allStar')

plt.show()

This is an HR Diagram made with APOGEE data! The Main Sequence stars are along the bottom, and the Red Giant Branch is in the upper-right portion. The Horizontal Branch stars are the upper-left group of dark blue (metal poor) points.

Searching for TESS Data of this Star#

Now let’s search MAST for a light curve of our Cepheid Variable star.

Coordinate search using astroquery.mast#

# Retrieve RA and DEC of APOGEE star
ra = apogee_obs_list['s_ra'][0]
dec = apogee_obs_list['s_dec'][0]
# make a SkyCoord object from these coordinates
coord = SkyCoord(ra=ra*u.deg, dec=dec*u.deg)
print(coord)

# Search for TESS observations based on coordinates
tess_obs = Observations.query_criteria(coordinates=coord, # Search by coordinates
                                       radius=(1/60/60)*u.deg, # search within 1 arcsecond
                                       # Search for TESS observations
                                       obs_collection='TESS',
                                       # Select only Science observations (not calibration files)
                                       intentType='science',
                                       # Search for time series data (light curves)
                                       dataproduct_type='timeseries',
                                       dataURL='*_lc.fits'
                                       )

# Display Results
tess_obs

There are a handful of results, but we can find the closest match to this star using the “distance” column which returns the distance (in arcsecs) of each match to our cone search. There are several results with distance of 0, (this star was observed in multiple TESS sectors), so we will just use the first result.

tess_obs[np.argmin(tess_obs['distance'])]

And download this file similar to what we did with APOGEE before:

# Get product list
tess_products = Observations.get_product_list(tess_obs[np.argmin(tess_obs['distance'])])

# Show products
tess_products
# Download Products
manifest = Observations.download_products(tess_products, description='Light curves')

Plotting a light curve from TESS#

The TESS light curve file has three extensions:

  • The primary header (HDU0)

  • The light curve data (HDU1)

  • aperture Information (HDU2)

# Open file
tess_lc = fits.open(manifest['Local Path'][0])

# Print file information
tess_lc.info()

We can plot the light curve by using the TIME and SAP_FLUX data from the first extension:

plt.figure(figsize=(15, 5))

# Plot the observed spectrum
plt.plot(tess_lc['LIGHTCURVE'].data['TIME'], tess_lc['LIGHTCURVE'].data['SAP_FLUX'], c='k', label='TESS Light Curve')

# Plot a vertical line for every day
for i in range(int(np.nanmin(tess_lc['LIGHTCURVE'].data['TIME'])), int(np.nanmax(tess_lc['LIGHTCURVE'].data['TIME']))):
    plt.axvline(i, c='lightgrey')
    
# Set labels
plt.title(f"TESS-SPOC Light Curve - {tess_lc[0].header['OBJECT']}")
plt.xlabel('Time [BTJD]')
plt.ylabel('Flux')
plt.grid()

plt.show()

This is a very typical light curve for a Cepheid Variable star, and TESS observed the variations in flux really well! This star cycles over a period of about 6 days.

Combining APOGEE and TESS data#

Now we are ready to combine everything: the HR Diagram, the spectrum, and the TESS light curve into a single plot!

Plotting APOGEE and TESS data together#

In the below cell, we define a helper function to plot all of the data together for our Cepheid Variable star:

def plot_spectrum_and_lc(apogee_spec, tess_lightcurve):
    """Helper function for plotting the HR Diagram, APOGEE Spectrum, and TESS Light Curve for a star"""
    # Set up axes for subplots
    plt.figure(figsize=(20, 10))
    ax1 = plt.subplot2grid((3, 5), (0, 0), colspan=2, rowspan=3)
    ax2 = plt.subplot2grid((3, 5), (0, 2), colspan=3, rowspan=1)
    ax3 = plt.subplot2grid((3, 5), (1, 2), colspan=3, rowspan=1)
    
    # ===================
    # AX1 - HR Diagram
    # ===================
    # Plot HR Diagram form APOGEE catalog
    # Only plot 1 out of every 10 points with [::10] to reduce plotting time
    im = ax1.scatter(allstar['TEFF'][::10], allstar['LOGG'][::10], c=allstar['FE_H'][::10],
                     marker='.', s=1, cmap='jet', vmin=-1, vmax=0.5, zorder=1)
            
    plt.colorbar(im, ax=ax1, location='bottom', label='Metallicity [Fe/H]')
            
    # Plot the specific star
    indx = np.where(allstar['APOGEE_ID'] == apogee_spec[4].data['APOGEE_ID'][0])[0][0]
    if allstar['TEFF'][indx]: # if the star has valid apogee parameters
        ax1.scatter(allstar['TEFF'][indx], allstar['LOGG'][indx], c=allstar['FE_H'][indx],
                    marker='*', s=1000, cmap='jet', edgecolor='k', lw=2,
                    vmin=-1, vmax=0.5, zorder=5, label='APOGEE Parameters')
    else: # params from aspcapstar (uncalibrated)
        print('WARNING: Quality Warning Flags on ASPCAP star parameters. ASPCAP parameters may not be reliable for this star.')
        ax1.scatter(apogee_spec[4].data['FPARAM'][0][0], apogee_spec[4].data['FPARAM'][0][1], c=apogee_spec[4].data['FPARAM'][0][3],
                    marker='*', s=1000, cmap='jet', edgecolor='k', lw=2, vmin=-1, vmax=0.5, zorder=5, label='APOGEE Parameters')
                
    # Labels and titles
    ax1.set_xlim(7000, 3000)
    ax1.set_ylim(5.5, -0.5)
    ax1.set_xlabel(r'Effective Temperature [$T_{\rm{eff}}$, K]')
    ax1.set_ylabel(r'Surface Gravity [$\log(g)$]')
    ax1.grid()
    ax1.set_title('APOGEE Stellar Parameters - allStar')
            
    # ===================
    # AX2 - APOGEE Spectrum
    # ===================        
    # Get the observed and model flux from extensions 1 and 3
    observed_flux = apogee_spec[1].data
    model_flux = apogee_spec[3].data
    
    # Mask the data using the error array in extenion 2
    pixel_mask = (apogee_spec[2].data < 0.1) # use only pixels with small errors (10% or less)
    
    # Plot the observed spectrum
    ax2.plot(apogee_wls[pixel_mask], observed_flux[pixel_mask], c='k', label='Observed Spectrum')
    
    # Plot the model spectrum
    ax2.plot(apogee_wls[pixel_mask], model_flux[pixel_mask], c='darkred', label='Best Fit Model Spectrum')
    
    # Set axes limits
    ax2.set_ylim(0.6, 1.1)
    ax2.set_xlim(np.min(apogee_wls), np.max(apogee_wls))
    
    apogee_id = apogee_spec[4].data['APOGEE_ID'][0] # String containing star's name
    ax2.set_title(f"APOGEE Spectrum - {apogee_id}")
    ax2.set_xlabel(r'Wavelength [$\AA$]')
    ax2.set_ylabel('Flux')
    ax2.grid()
    ax2.legend()
    
    # ===================
    # AX3 - TESS Light Curve
    # ===================
    # Plot Light Curve
    ax3.plot(tess_lightcurve['LIGHTCURVE'].data['TIME'], tess_lightcurve['LIGHTCURVE'].data['SAP_FLUX'], c='k', label='TESS Light Curve')
    
    # Plot a vertical line for every day
    for i in range(int(np.nanmin(tess_lightcurve['LIGHTCURVE'].data['TIME'])), int(np.nanmax(tess_lightcurve['LIGHTCURVE'].data['TIME']))):
        ax3.axvline(i, c='lightgrey')
        
    # Set labels
    ax3.set_title(f"TESS-SPOC Light Curve - {tess_lightcurve[0].header['OBJECT']}")
    ax3.set_xlabel('Time [BTJD]')
    ax3.set_ylabel('Flux')
    ax3.grid()
            
    plt.tight_layout()
    save_name = f"{apogee_id}_APOGEE_spec_TESS_lightcurve.png"
    plt.savefig(save_name, bbox_inches='tight')
    print(f"Saved to {save_name}")
    plt.show()
aspcap_star = fits.open('mastDownload/SDSS/sdss_apogee_apo1m_cepheid_v1154_cyg/aspcapStar-dr17-V1154_Cyg.fits')
tess_lc = fits.open(manifest['Local Path'][0])

plot_spectrum_and_lc(aspcap_star, tess_lc)

Like we saw before, this star is a metal-poor horiztonal branch star, and has a period of about 6 days. A very cool example of a Cepheid Variable!

Exploring different types of Variable Stars#

So far in this notebook, we’ve plotted the stellar parameters, spectrum, and light curve of a Cepheid variable star. But what about other types of variables? Here are a few more stars in APOGEE that we will explore and plot in this section!

  • 2M11361176+8117369: a RR Lyrae Variable Star, a type of quickly-pulsating old, low-mass star

  • 2M19181706+5141323: a “Rotationally Variable” Star, a main sequence dwarf star with large starspots blocking out some of the flux on one side of the star as it rotates

  • 2M19223768+5044541: an Eclipsing Binary Star, a type of close binary system with two stars rotating around each other

Since we’re going to be going through the same process for each star (search for APOGEE data -> search for TESS data -> plot results), Let’s write a helper function:

def plot_variable_star(apogee_id: str) -> None:
    """
    This function takes a star name from APOGEE and downloads the APOGEE and TESS
    data for this star. It then plots the APOGEE spectrum and TESS Lightcurve together.

    Parameters:
    ============
    apogee_id: str
        name of star we want to plot
    """
    print("="*50)
    print(f"Plotting Spectrum and Light Curve for {apogee_id}")
    print("="*50)

    # Search for APOGEE Data
    print("Searching For APOGEE Data...")
    apogee_obs_list = Observations.query_criteria(provenance_name='APOGEE',
                                                  target_name=apogee_id)
    print("Downloading APOGEE Data...")
    # Download APOGEE Spectrum
    product_list = Observations.get_product_list(apogee_obs_list)
    manifest = Observations.download_products(product_list, productSubGroupDescription='ASPCAPSTAR')
    # Open APOGEE Spectrum
    aspcapstar = fits.open(manifest['Local Path'][0])

    # Coordinates of Star
    coord = SkyCoord(ra=apogee_obs_list['s_ra'][0]*u.deg, dec=apogee_obs_list['s_dec'][0]*u.deg)

    # Search for TESS observations based on coordinates
    print("Searching For TESS Data...")
    tess_obs = Observations.query_criteria(coordinates=coord, # Search by coordinates
                                           radius=(1/60/60)*u.deg, # search within 1 arcsecond
                                           obs_collection='TESS', # Search for TESS observations
                                           intentType='science', # Select only Science observations (not calibration files)
                                           dataproduct_type='timeseries', dataURL="*lc.fits") # Search for time series data (light curves)

    # Pick closest match by distance
    tess_obs = tess_obs[np.argmin(tess_obs['distance'])]

    # Download TESS Light Curve
    print("Downloading TESS Data...")
    tess_products = Observations.get_product_list(tess_obs)
    manifest = Observations.download_products(tess_products, description='Light curves')

    # Open light curve
    tess_lc = fits.open(manifest['Local Path'][0])

    # Plot star
    print("Making Plot...")
    plot_spectrum_and_lc(aspcapstar, tess_lc)  

RR Lyrae Variable - 2M11361176+8117369#

# Define the star name we want to search for
star_name = '2M11361176+8117369'
# Call our helper function to download data and plot the results
plot_variable_star(star_name)

This star is also a horizontal branch star, but it is hotter and more metal-poor than the Cepheid Variable was. It pulsates a lot more quickly too - from the light curve, the pulsation period is about 12 hours!

Rotationally Variable - 2M19181706+5141323#

star_name = '2M19181706+5141323'
plot_variable_star(star_name)

This star is a main-sequence dwarf star, a lot like the Sun but a little bit cooler in temperature. Its light curve is not as periodic as the other two, because the variation in flux is caused by the stars rotation, not pulsation! Large star spots likely block out the flux on part of this star, causing it to dip in brightness roughly every 14 days.

Eclipsing Binary Star - 2M19203184+3830492#

star_name = '2M19203184+3830492'
plot_variable_star(star_name)

Every few days, this star’s flux dips dramatically in brightness by over 25%! This system is an eclipsing binary pair - a pair of stars orbiting around each other, and eclipsing one another along our line-of-sight, causing the fluctuations in brightness seen in the light curve.

Because this is an eclipsing binary pair, the ASPCAP parameters have a quality warning and may not be as reliable as the parameter estimates for other stars (which explains why the star is slightly off of the main sequence in the HR Diagram!). The single visit spectra (apVisit files) are probably more useful for most science cases involving binary stars than the combined spectrum (aspcapStar file) we plotted here. The apVisit files can be dowloaded using the productSubGroupDescription='APVISIT' keyword when downloading the products.


Congratulations! You have reached the end of this tutorial notebook. You have learned how to access and download APOGEE data from MAST, and combine it with TESS light curves to characterize different types of variable stars.

Additional Resources#

Additional resources are linked below:

Citations#

If you use MaNGA data for published research, see the following links for information on which citations to include in your paper:

About this Notebook#

Author(s): Julie Imig (jimig@stsci.edu)
Keyword(s): Tutorial, SDSS, APOGEE, TESS, stars
First published: January 2025
Last updated: January 2025


Top of Page Space Telescope Logo