Using Kepler Data to Plot a Light Curve#

This notebook tutorial demonstrates the process of loading and extracting information from Kepler light curve FITS files to plot a light curve and display the photometric aperture.

Table of Contents#

Getting the Data
Reading FITS Extensions
Plotting a Light Curve
The Aperture Extension
Additional Resources
About this Notebook


Light curve background: A light curve is a plot of flux versus time that shows the variability of light output from an object. This is one way to find planets that periodically transit a star. The light curves made here will plot the corrected and uncorrected fluxes from Kepler data of object KIC 11446443 (TrES-2).
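To make the idea of a transit dip concrete, here is a minimal sketch that builds a toy light curve and measures the depth of its dips. It uses no Kepler data: the period, duration, and depth below are hypothetical values chosen only for illustration, and the box-shaped dip is a simplification.

```python
from statistics import median

# Hypothetical values for illustration only (not measurements of any real star).
PERIOD_DAYS = 2.5      # time between successive transits
DURATION_DAYS = 0.1    # length of each transit
DEPTH = 0.01           # fractional drop in flux during a transit


def toy_flux(time_days):
    """Return a normalized flux of 1.0, dipping by DEPTH while in transit."""
    phase = time_days % PERIOD_DAYS
    return 1.0 - DEPTH if phase < DURATION_DAYS else 1.0


# Ten days of observations, sampled every 0.01 day.
times = [i * 0.01 for i in range(1000)]
fluxes = [toy_flux(t) for t in times]

# The median is a robust estimate of the out-of-transit flux level.
baseline = median(fluxes)
measured_depth = 1.0 - min(fluxes) / baseline
in_transit_times = [t for t, f in zip(times, fluxes) if f < baseline]

print(f"Baseline flux: {baseline}")
print(f"Measured fractional depth: {measured_depth:.4f}")
print(f"Number of in-transit samples: {len(in_transit_times)}")
```

Real light curves also contain noise and instrumental trends, which is why the corrected (PDCSAP) flux is usually the better starting point for this kind of measurement.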

Some notes about the file: kplr011446443-2009131110544_slc.fits
The filename contains phrases for identification, where

  • kplr = Kepler

  • 011446443 = Kepler ID number

  • 2009131110544 = year 2009, day 131, time 11:05:44

  • slc = short cadence
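The naming scheme above can be sketched in code. The following is a minimal, standard-library example that splits a light curve filename into its parts. The helper name parse_kepler_filename and the regular expression are hypothetical, and the "llc" suffix for long cadence files is an assumption that does not appear in this tutorial.

```python
import re
from datetime import datetime

# Hypothetical pattern for the naming scheme described above.
# "llc" (long cadence) is an assumption; only "slc" appears in this tutorial.
FILENAME_PATTERN = re.compile(
    r"^kplr(?P<kepler_id>\d{9})-(?P<stamp>\d{13})_(?P<cadence>slc|llc)\.fits$"
)
CADENCE_NAMES = {"slc": "short cadence", "llc": "long cadence"}


def parse_kepler_filename(name):
    """Split a Kepler light curve filename into ID, timestamp, and cadence."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognized Kepler filename: {name!r}")
    # %j is the day of the year, so 2009 + 131 becomes a calendar date.
    timestamp = datetime.strptime(match["stamp"], "%Y%j%H%M%S")
    return {
        "kepler_id": int(match["kepler_id"]),
        "timestamp": timestamp,
        "cadence": CADENCE_NAMES[match["cadence"]],
    }


parts = parse_kepler_filename("kplr011446443-2009131110544_slc.fits")
print(parts["kepler_id"], parts["timestamp"].isoformat(), parts["cadence"])
```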

Defining some terms:

  • Cadence: the frequency with which summed data are read out. Files are either short cadence (a 1 minute sum) or long cadence (a 30 minute sum).

  • SAP Flux: Simple Aperture Photometry flux; flux after summing the calibrated pixels within the optimal aperture

  • PDCSAP Flux: Pre-search Data Conditioned Simple Aperture Photometry; these are the flux values nominally corrected for instrumental variations.

  • BJD: Barycentric Julian Day; this is the Julian Date that has been corrected for differences in the Earth’s position with respect to the Solar System Barycentre (center of mass of the Solar System).

  • HDU: Header Data Unit; a FITS file is made up of Header or Data units that contain information, data, and metadata relating to the file. The first HDU is called the primary, and anything that follows is considered an extension.

For more information about the Kepler mission and collected data, visit the Kepler archive page. To read more details about light curves and relevant data terms, look in the Kepler archive manual.



Let’s start by importing some libraries to the environment:

  • %matplotlib notebook for creating interactive plots

  • astropy.io.fits for accessing FITS files

  • astropy.table Table for creating tidy tables of the data

  • matplotlib.pyplot for plotting data

%matplotlib notebook
from astropy.io import fits
from astropy.table import Table
import matplotlib.pyplot as plt


Getting the Data#

Start by importing libraries from Astroquery. For a longer, more detailed description of using Astroquery, please visit this tutorial or read the Astroquery documentation.

from astroquery.mast import Mast
from astroquery.mast import Observations

Next, we need to find the data file. This is similar to searching for the data using the MAST Portal in that we will be using certain keywords to find the file. The target name of the object we are looking for is kplr011446443, collected by the Kepler spacecraft.

keplerObs = Observations.query_criteria(target_name='kplr011446443', obs_collection='Kepler')
keplerProds = Observations.get_product_list(keplerObs[1])
yourProd = Observations.filter_products(keplerProds, extension='kplr011446443-2009131110544_slc.fits')
yourProd
Table masked=True length=1
601031 | Kepler | timeseries | kplr011446443_sc_Q113313330333033302 | Lightcurve Short Cadence (CSC) - Q0 | C | mast:KEPLER/url/missions/kepler/lightcurves/0114/011446443/kplr011446443-2009131110544_slc.fits | SCIENCE | Minimum Recommended Products | SLC | -- | Kepler | -- | EX_STKS | kplr011446443-2009131110544_slc.fits | 1457280 | 601031 | PUBLIC | 2 | KEPLER

Now that we’ve found the data file, we can download it using the results shown in the table above:

Observations.download_products(yourProd, mrp_only = False, cache = False) 
Downloading URL to ./mastDownload/Kepler/kplr011446443_sc_Q113313330333033302/kplr011446443-2009131110544_slc.fits ...
Table length=1
Local Path | Status | Message | URL

Click on the blue URL above to download the file. You are now ready to complete the rest of the notebook.


Reading FITS Extensions#

Now that we have the file, we can start working with the data. We will begin by assigning a shorter name to the file to make it easier to use. Then, using the info function from astropy.io.fits, we can see some information about the FITS Header Data Units:

filename = "./mastDownload/Kepler/kplr011446443_sc_Q113313330333033302/kplr011446443-2009131110544_slc.fits"
fits.info(filename)
Filename: ./mastDownload/Kepler/kplr011446443_sc_Q113313330333033302/kplr011446443-2009131110544_slc.fits
No.    Name      Ver    Type      Cards   Dimensions   Format
  0  PRIMARY       1 PrimaryHDU      58   ()      
  1  LIGHTCURVE    1 BinTableHDU    155   14280R x 20C   [D, E, J, E, E, E, E, E, E, J, D, E, D, E, D, E, D, E, E, E]   
  2  APERTURE      1 ImageHDU        48   (8, 9)   int32   
  • No. 0 (Primary):
    This HDU contains metadata related to the entire file.

  • No. 1 (Light curve):
    This HDU contains a binary table that holds data like flux measurements and times. We will extract information from here when we define the parameters for the light curve plot.

  • No. 2 (Aperture):
    This HDU contains the image extension with data collected from the aperture. We will also use this to display a bitmask plot that visually represents the optimal aperture used to create the SAP_FLUX column in HDU1.

For more detailed information about header extensions, look here.
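As a quick sanity check on the LIGHTCURVE dimensions listed above, the row count can be turned into an approximate time span. This is a rough sketch that uses the nominal cadence lengths quoted earlier in this tutorial (1 minute and 30 minutes); the variable names are illustrative and the exact integration times differ slightly from these round numbers.

```python
# Row count taken from the LIGHTCURVE dimensions (14280R x 20C).
ROWS_IN_TABLE = 14280

# Nominal cadence lengths as quoted in this tutorial (approximate values).
SHORT_CADENCE_MINUTES = 1.0
LONG_CADENCE_MINUTES = 30.0

MINUTES_PER_DAY = 60 * 24

# Each row is one short cadence sum, so rows times cadence gives the span.
span_minutes = ROWS_IN_TABLE * SHORT_CADENCE_MINUTES
span_days = span_minutes / MINUTES_PER_DAY

# The same time span observed at long cadence would need far fewer rows.
long_cadence_rows = span_minutes / LONG_CADENCE_MINUTES

print(f"Approximate time span: {span_days:.2f} days")
print(f"Equivalent long cadence rows: {long_cadence_rows:.0f}")
```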

Let’s say we wanted to see more information about the extensions than what the fits.info command gave us. For example, we can access information stored in the header of the Binary Table extension (No. 1, LIGHTCURVE). The following lines open the FITS file, store the header of the first extension in header1, and then close the file. Only the first 24 header cards are displayed here, but you can view them all by adjusting the range:

with fits.open(filename) as hdulist:
    header1 = hdulist[1].header

print(repr(header1[0:24]))  # repr() prints the header cards in neat columns
XTENSION= 'BINTABLE'           / marks the beginning of a new HDU               
BITPIX  =                    8 / array data type                                
NAXIS   =                    2 / number of array dimensions                     
NAXIS1  =                  100 / length of first array dimension                
NAXIS2  =                14280 / length of second array dimension               
PCOUNT  =                    0 / group parameter count (not used)               
GCOUNT  =                    1 / group count (not used)                         
TFIELDS =                   20 / number of table fields                         
TTYPE1  = 'TIME    '           / column title: data time stamps                 
TFORM1  = 'D       '           / column format: 64-bit floating point           
TUNIT1  = 'BJD - 2454833'      / column units: barycenter corrected JD          
TDISP1  = 'D14.7   '           / column display format                          
TTYPE2  = 'TIMECORR'           / column title: barycenter - timeslice correction
TFORM2  = 'E       '           / column format: 32-bit floating point           
TUNIT2  = 'd       '           / column units: day                              
TDISP2  = 'E13.6   '           / column display format                          
TTYPE3  = 'CADENCENO'          / column title: unique cadence number            
TFORM3  = 'J       '           / column format: signed 32-bit integer           
TDISP3  = 'I10     '           / column display format                          
TTYPE4  = 'SAP_FLUX'           / column title: aperture photometry flux         
TFORM4  = 'E       '           / column format: 32-bit floating point           
TUNIT4  = 'e-/s    '           / column units: electrons per second             
TDISP4  = 'E14.7   '           / column display format                          
TTYPE5  = 'SAP_FLUX_ERR'       / column title: aperture phot. flux error        
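Each line in the header output above is a FITS header card: a keyword in the first eight characters, then "= ", a value, and an optional comment after a slash. The sketch below parses such cards with the standard library only, to show how that layout works. The parse_card helper is hypothetical and handles only simple integer, float, logical, and quoted string values; in practice astropy does this parsing for you.

```python
def parse_card(card):
    """Split one FITS header card into (keyword, value, comment)."""
    keyword = card[:8].strip()
    if card[8:10] != "= ":
        # Cards without a value indicator (for example COMMENT) carry free text.
        return keyword, None, card[8:].strip()

    body = card[10:].strip()
    if body.startswith("'"):
        # Quoted string: read up to the closing quote ('' is an escaped quote),
        # so that a slash inside the string is not mistaken for a comment.
        chars, i = [], 1
        while i < len(body):
            if body[i] == "'":
                if body[i + 1:i + 2] == "'":
                    chars.append("'")
                    i += 2
                    continue
                break
            chars.append(body[i])
            i += 1
        value = "".join(chars).rstrip()
        comment = body[i + 1:].partition("/")[2].strip()
    else:
        raw, _, comment = body.partition("/")
        raw, comment = raw.strip(), comment.strip()
        if raw in ("T", "F"):
            value = raw == "T"
        else:
            try:
                value = int(raw)
            except ValueError:
                value = float(raw)
    return keyword, value, comment


cards = [
    "NAXIS2  =                14280 / length of second array dimension",
    "TUNIT4  = 'e-/s    '           / column units: electrons per second",
]
for card in cards:
    print(parse_card(card))
```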

We can also view a table of the data from the Binary Table extension. This is where we can find the flux and time columns to be plotted later. Here only the first four rows of the table are displayed:

with fits.open(filename) as hdulist:
    binaryext = hdulist[1].data

binarytable = Table(binaryext)
binarytable[0:4]  # display the first four rows
Table length=4


Plotting a Light Curve#

Now that we have seen and accessed the data, we can begin to plot a light curve:

  1. Open the file using the fits.open command. This will allow the program to read and store the data we will manipulate to be plotted. Here we’ve also renamed the file with a phrase that is easier to handle (see line 1).

  2. Start by calibrating the time. Because the Kepler data is in BKJD (Kepler Barycentric Julian Day), we need to convert it to Barycentric Julian Days (BJD) if we want to be able to compare it to outside data. For a more detailed explanation of time conversions, see page 13 or page 17 of the Kepler Archive Manual.

    • Read in the BJDREF values, both the integer part (BJDREFI) and the floating point part (BJDREFF). These are stored as keywords in the header of the binary table extension.

  3. Read in the columns of times and fluxes (both uncorrected and corrected) from the data.

with fits.open(filename, mode="readonly") as hdulist:
    # Read in the "BJDREF" which is the time offset of the time array.
    bjdrefi = hdulist[1].header['BJDREFI'] 
    bjdreff = hdulist[1].header['BJDREFF']

    # Read in the columns of data.
    times = hdulist[1].data['time'] 
    sap_fluxes = hdulist[1].data['SAP_FLUX']
    pdcsap_fluxes = hdulist[1].data['PDCSAP_FLUX']
  4. Now that the appropriate data has been read and stored, convert the times to BJDs by adding the BJDREF values to the array of times.

  5. Finally, we can plot the fluxes against time. We can also set a title and add a legend to the plot. We can label our fluxes accordingly and assign them colors and styles (“-k” for a black line, “-b” for a blue line).

# Convert the time array to full BJD by adding the offset back in.
bjds = times + bjdrefi + bjdreff 


# Plot the time, uncorrected and corrected fluxes.
plt.plot(bjds, sap_fluxes, '-k', label='SAP Flux') 
plt.plot(bjds, pdcsap_fluxes, '-b', label='PDCSAP Flux') 

plt.title('Kepler Light Curve')
plt.legend()
plt.xlabel('Time (days)')
plt.ylabel('Flux (electrons/second)')
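The time conversion used above can also be sketched without opening the file. The example below assumes an integer offset of 2454833, which matches the TIME column units shown in the header ('BJD - 2454833'), and assumes a fractional offset of zero; in practice both values should be read from BJDREFI and BJDREFF. The helper names and the sample time stamp are hypothetical, and the calendar conversion is only approximate because it ignores the difference between barycentric time and UTC.

```python
from datetime import datetime, timedelta

BJDREFI = 2454833   # integer offset, matching the units 'BJD - 2454833'
BJDREFF = 0.0       # fractional offset; assumed zero here, read it from the header

# Reference point for calendar conversion: JD 2451545.0 is 2000-01-01 12:00.
JD_AT_J2000 = 2451545.0
J2000 = datetime(2000, 1, 1, 12, 0, 0)


def bkjd_to_bjd(bkjd):
    """Add the reference offsets back to a Kepler time stamp."""
    return bkjd + BJDREFI + BJDREFF


def jd_to_calendar(jd):
    """Approximate calendar date for a Julian Date (time scale ignored)."""
    return J2000 + timedelta(days=jd - JD_AT_J2000)


example_bkjd = 120.5  # hypothetical time stamp, not taken from the file
example_bjd = bkjd_to_bjd(example_bkjd)

print(f"BKJD {example_bkjd} -> BJD {example_bjd}")
print(f"Approximate calendar date: {jd_to_calendar(example_bjd)}")
```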


The Aperture Extension#

We can also make a plot of the third HDU, the image extension (No. 2, APERTURE). This data is stored as an array of integers that encodes which pixels were collected from the spacecraft and which were used in the optimal aperture (look here for more information on the aperture extension).

First, we need to re-open the FITS file and read in the data of the image extension. Next, we print it as an array:

with fits.open(filename) as hdulist:
    imgdata = hdulist[2].data

print(imgdata)
[[1 1 1 1 1 1 1 0]
 [1 1 1 5 5 5 5 1]
 [1 1 5 5 7 7 5 5]
 [1 1 5 7 7 7 7 5]
 [1 1 5 7 7 7 7 5]
 [1 1 5 7 7 7 7 5]
 [1 1 5 7 7 7 5 5]
 [1 1 5 5 7 5 5 1]
 [1 1 1 5 5 5 1 0]]
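The integers in this array are bit flags, so individual bits can be tested to separate the pixel categories. The sketch below copies the array printed above into a plain Python list and counts pixels per category. The bit meanings are an assumption: bit value 1 is taken to mark pixels collected by the spacecraft and bit value 2 to mark pixels in the optimal aperture, which should be checked against the Kepler Archive Manual.

```python
# Aperture array copied from the output above.
APERTURE = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 5, 5, 5, 5, 1],
    [1, 1, 5, 5, 7, 7, 5, 5],
    [1, 1, 5, 7, 7, 7, 7, 5],
    [1, 1, 5, 7, 7, 7, 7, 5],
    [1, 1, 5, 7, 7, 7, 7, 5],
    [1, 1, 5, 7, 7, 7, 5, 5],
    [1, 1, 5, 5, 7, 5, 5, 1],
    [1, 1, 1, 5, 5, 5, 1, 0],
]

# Assumed bit meanings; verify against the Kepler Archive Manual.
COLLECTED_BIT = 1  # pixel was collected by the spacecraft
OPTIMAL_BIT = 2    # pixel was part of the optimal aperture

# A bitwise AND is non-zero only when the given bit is set in the value.
collected_count = sum(1 for row in APERTURE for v in row if v & COLLECTED_BIT)
optimal_count = sum(1 for row in APERTURE for v in row if v & OPTIMAL_BIT)

print(f"Collected pixels: {collected_count}")
print(f"Optimal aperture pixels: {optimal_count}")

# Text rendering of the optimal aperture: '#' inside, '.' outside.
for row in APERTURE:
    print("".join("#" if v & OPTIMAL_BIT else "." for v in row))
```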

We can also show the data in a plot:

plt.title('Kepler Aperture')
plt.imshow(imgdata)
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0x7f359c1bf190>


Additional Resources#

For more information about the MAST archive and details about mission data:

Kepler Archive Page (MAST)
Kepler Archive Manual
Exo.MAST website

About this Notebook#

Author: Josie Bunnell, STScI SASP Intern
Updated On: 08/10/2018
