Using Kepler Data to Plot a Light Curve


This notebook tutorial demonstrates the process of loading and extracting information from Kepler light curve FITS files to plot a light curve and display the photometric aperture.

Table of Contents


[Introduction](#intro_ID)
[Imports](#imports_ID)
[Getting the Data](#data_ID)
[Reading FITS Extensions](#header_ID)
[Plotting a Light Curve](#lightcurve_ID)
[The Aperture Extension](#aperture_ID)
[Additional Resources](#resources_ID)
[About this Notebook](#about_ID)

Introduction

Light curve background: A light curve is a plot of flux versus time that shows the variability of light output from an object. This is one way to find planets periodically transiting a star. The light curves made here will plot the corrected and uncorrected fluxes from Kepler data of object KIC 11446443 (TrES-2).
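
To make the idea of a transit concrete, here is a minimal sketch of a synthetic, box-shaped transit light curve. The period, mid-transit time, duration, and depth are hypothetical values chosen for illustration, not measured properties of TrES-2, and the helper name box_transit_flux is our own rather than part of any library.

```python
def box_transit_flux(t, period, t0, duration, depth):
    """Relative flux at time t for a box-shaped transit (1.0 out of transit)."""
    # Fold the time onto a phase centered on the mid-transit time t0.
    phase = ((t - t0 + 0.5 * period) % period) - 0.5 * period
    return 1.0 - depth if abs(phase) < 0.5 * duration else 1.0

# Hypothetical parameters (days and fractional depth), for illustration only.
period, t0, duration, depth = 2.5, 1.0, 0.1, 0.015

times = [i * 0.01 for i in range(1000)]  # 10 days sampled every 0.01 day
fluxes = [box_transit_flux(t, period, t0, duration, depth) for t in times]

# Count the dips: a transit starts wherever the flux first drops below 1.0.
n_transits = sum(
    1 for prev, cur in zip(fluxes, fluxes[1:]) if prev == 1.0 and cur < 1.0
)
print(n_transits, min(fluxes))
```

Real light curves are noisier and the dips are rounded rather than box-shaped, but the periodic drop in flux is the signature to look for.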

Some notes about the file: kplr011446443-2009131110544_slc.fits
The filename contains phrases for identification, where

  • kplr = Kepler
  • 011446443 = Kepler ID number
  • 2009131110544 = year 2009, day 131, time 11:05:44
  • slc = short cadence
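
The naming scheme above can be sketched as a small parser. This is only an illustration: the pattern is inferred from the single filename used in this tutorial, the "llc" (long cadence) suffix is an assumption because only an "slc" file appears here, and parse_kepler_filename is a hypothetical helper name.

```python
import re
from datetime import datetime

# Pattern inferred from the one filename in this tutorial; "llc" (long
# cadence) is an assumption, since only an "slc" file appears here.
KEPLER_LC_PATTERN = re.compile(r"^kplr(\d{9})-(\d{13})_(slc|llc)\.fits$")

def parse_kepler_filename(name):
    """Split a Kepler light curve filename into its identifying parts."""
    match = KEPLER_LC_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized Kepler light curve filename: {name}")
    kic_id, stamp, cadence = match.groups()
    return {
        "kepler_id": int(kic_id),
        # %Y = year, %j = day of year, then hours, minutes, seconds.
        "timestamp": datetime.strptime(stamp, "%Y%j%H%M%S"),
        "cadence": "short" if cadence == "slc" else "long",
    }

info = parse_kepler_filename("kplr011446443-2009131110544_slc.fits")
print(info["kepler_id"], info["timestamp"], info["cadence"])
```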

Defining some terms:

  • Cadence: the frequency with which summed data are read out. Files are either short cadence (a 1 minute sum) or long cadence (a 30 minute sum).
  • SAP Flux: Simple Aperture Photometry flux; flux after summing the calibrated pixels within the optimal aperture
  • PDCSAP Flux: Pre-search Data Conditioned Simple Aperture Photometry; these are the flux values nominally corrected for instrumental variations.
  • BJD: Barycentric Julian Day; this is the Julian Date that has been corrected for differences in the Earth's position with respect to the Solar System Barycentre (center of mass of the Solar System).
  • HDU: Header Data Unit; a FITS file is made up of one or more HDUs, each consisting of a header (metadata) and, optionally, a data section. The first HDU is called the primary, and anything that follows is considered an extension.

For more information about the Kepler mission and collected data, visit the Kepler archive page. To read more details about light curves and relevant data terms, look in the Kepler archive manual.


Imports

Let's start by importing some libraries to the environment:

  • %matplotlib notebook for creating interactive plots
  • fits from astropy.io for accessing FITS files
  • Table from astropy.table for creating tidy tables of the data
  • matplotlib.pyplot for plotting data
In [1]:
%matplotlib notebook
from astropy.io import fits
from astropy.table import Table 
import matplotlib.pyplot as plt

Getting the Data

Start by importing libraries from Astroquery. For a longer, more detailed description of using Astroquery, please visit this tutorial or read the Astroquery documentation.

In [2]:
from astroquery.mast import Mast
from astroquery.mast import Observations


Next, we need to find the data file. This is similar to searching for the data using the MAST Portal in that we will be using certain keywords to find the file. The target name of the object we are looking for is kplr011446443, collected by the Kepler spacecraft.

In [3]:
keplerObs = Observations.query_criteria(target_name='kplr011446443', obs_collection='Kepler')
keplerProds = Observations.get_product_list(keplerObs[1])
yourProd = Observations.filter_products(keplerProds, extension='kplr011446443-2009131110544_slc.fits', 
                                        mrp_only=False)
yourProd
Out[3]:
Table masked=True length=1
column                      dtype   value
obsID                       str10   9000210988
obs_collection              str6    Kepler
dataproduct_type            str10   timeseries
obs_id                      str36   kplr011446443_sc_Q113313330333033302
description                 str60   Lightcurve Short Cadence (CSC) - Q0
type                        str1    C
dataURI                     str110  mast:Kepler/url/missions/kepler/lightcurves/0114/011446443/kplr011446443-2009131110544_slc.fits
productType                 str7    SCIENCE
productGroupDescription     str28   --
productSubGroupDescription  str1    --
productDocumentationURL     str1    --
project                     str6    Kepler
prvversion                  str1    --
proposal_id                 str7    EX_STKS
productFilename             str44   kplr011446443-2009131110544_slc.fits
size                        int64   1457280
parent_obsid                str10   9000210988
dataRights                  str6    PUBLIC


Now that we've found the data file, we can download it using the results shown in the table above:

In [4]:
Observations.download_products(yourProd, mrp_only = False, cache = False) 
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:Kepler/url/missions/kepler/lightcurves/0114/011446443/kplr011446443-2009131110544_slc.fits to ./mastDownload/Kepler/kplr011446443_sc_Q113313330333033302/kplr011446443-2009131110544_slc.fits ... [Done]
Out[4]:
Table length=1
column      dtype   value
Local Path  str95   ./mastDownload/Kepler/kplr011446443_sc_Q113313330333033302/kplr011446443-2009131110544_slc.fits
Status      str8    COMPLETE
Message     object  None
URL         object  None


Running the cell above saves the file to the local path shown in the table. Alternatively, click on the URL above to download the file manually. You are now ready to complete the rest of the notebook.
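
As a rough sketch of how the pieces in the outputs above fit together, the download URL and the local path can be rebuilt from the product table fields. The endpoint string and the directory layout are copied from the download log in this notebook; treat them as observed values rather than a documented guarantee.

```python
# Values copied from the product table and download log above.
data_uri = ("mast:Kepler/url/missions/kepler/lightcurves/0114/011446443/"
            "kplr011446443-2009131110544_slc.fits")
obs_collection = "Kepler"
obs_id = "kplr011446443_sc_Q113313330333033302"
product_filename = "kplr011446443-2009131110544_slc.fits"

# Endpoint exactly as it appears in the download log (an observed value).
DOWNLOAD_ENDPOINT = "https://mast.stsci.edu/api/v0.1/Download/file?uri="

download_url = DOWNLOAD_ENDPOINT + data_uri
# Layout seen in the log: ./mastDownload/<obs_collection>/<obs_id>/<filename>
local_path = "/".join([".", "mastDownload", obs_collection, obs_id, product_filename])

print(download_url)
print(local_path)
```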


Reading FITS Extensions


Now that we have the file, we can start working with the data. We will begin by assigning a shorter name to the file to make it easier to use. Then, using the info function from astropy.io.fits, we can see some information about the FITS Header Data Units:

In [5]:
filename = "./mastDownload/Kepler/kplr011446443_sc_Q113313330333033302/kplr011446443-2009131110544_slc.fits"
fits.info(filename)
Filename: ./mastDownload/Kepler/kplr011446443_sc_Q113313330333033302/kplr011446443-2009131110544_slc.fits
No.    Name      Ver    Type      Cards   Dimensions   Format
  0  PRIMARY       1 PrimaryHDU      58   ()      
  1  LIGHTCURVE    1 BinTableHDU    155   14280R x 20C   [D, E, J, E, E, E, E, E, E, J, D, E, D, E, D, E, D, E, E, E]   
  2  APERTURE      1 ImageHDU        48   (8, 9)   int32   
  • No. 0 (Primary):
    This HDU contains meta-data related to the entire file.
  • No. 1 (Light curve):
    This HDU contains a binary table that holds data like flux measurements and times. We will extract information from here when we define the parameters for the light curve plot.
  • No. 2 (Aperture):
    This HDU contains the image extension with data collected from the aperture. We will also use this to display a bitmask plot that visually represents the optimal aperture used to create the SAP_FLUX column in HDU1.

For more detailed information about header extensions, look here.


Let's say we want more information about the extensions than the fits.info command gives us. For example, we can access information stored in the header of the Binary Table extension (No. 1, LIGHTCURVE). The following cell opens the FITS file, stores the header of the first extension in header1, and then closes the file. Only the first 24 header cards are displayed here, but you can view them all by adjusting the range:

In [6]:
with fits.open(filename) as hdulist: 
    header1 = hdulist[1].header
  
print(repr(header1[0:24])) #repr() prints the info into neat columns
XTENSION= 'BINTABLE'           / marks the beginning of a new HDU               
BITPIX  =                    8 / array data type                                
NAXIS   =                    2 / number of array dimensions                     
NAXIS1  =                  100 / length of first array dimension                
NAXIS2  =                14280 / length of second array dimension               
PCOUNT  =                    0 / group parameter count (not used)               
GCOUNT  =                    1 / group count (not used)                         
TFIELDS =                   20 / number of table fields                         
TTYPE1  = 'TIME    '           / column title: data time stamps                 
TFORM1  = 'D       '           / column format: 64-bit floating point           
TUNIT1  = 'BJD - 2454833'      / column units: barycenter corrected JD          
TDISP1  = 'D14.7   '           / column display format                          
TTYPE2  = 'TIMECORR'           / column title: barycenter - timeslice correction
TFORM2  = 'E       '           / column format: 32-bit floating point           
TUNIT2  = 'd       '           / column units: day                              
TDISP2  = 'E13.6   '           / column display format                          
TTYPE3  = 'CADENCENO'          / column title: unique cadence number            
TFORM3  = 'J       '           / column format: signed 32-bit integer           
TDISP3  = 'I10     '           / column display format                          
TTYPE4  = 'SAP_FLUX'           / column title: aperture photometry flux         
TFORM4  = 'E       '           / column format: 32-bit floating point           
TUNIT4  = 'e-/s    '           / column units: electrons per second             
TDISP4  = 'E14.7   '           / column display format                          
TTYPE5  = 'SAP_FLUX_ERR'       / column title: aperture phot. flux error        
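
The header keywords are tied to one another, which gives a quick consistency check. The sketch below assumes the byte widths implied by the header comments (D is a 64-bit float, E a 32-bit float, J a 32-bit integer) and uses the column formats listed by fits.info; the values are typed in by hand rather than read from the file.

```python
# Byte widths of the three TFORM codes used in this table, as described in
# the header comments above (D: 64-bit float, E: 32-bit float, J: 32-bit int).
TFORM_BYTES = {"D": 8, "E": 4, "J": 4}

# Column formats as listed by fits.info for the LIGHTCURVE extension.
formats = "D, E, J, E, E, E, E, E, E, J, D, E, D, E, D, E, D, E, E, E".split(", ")

row_bytes = sum(TFORM_BYTES[code] for code in formats)  # compare with NAXIS1
n_rows = 14280                                          # NAXIS2
table_bytes = row_bytes * n_rows                        # size of the table data

print(len(formats), row_bytes, table_bytes)
```

The row width agrees with NAXIS1 and the number of formats agrees with TFIELDS, so the table data account for most of the file size reported in the product table.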


We can also view a table of the data from the Binary Table extension. This is where we can find the flux and time columns to be plotted later. Here only four rows of the table are displayed (indices 1 through 4, since the slice [1:5] skips the first row):

In [7]:
with fits.open(filename) as hdulist:
    binaryext = hdulist[1].data

binarytable = Table(binaryext)
binarytable[1:5]
Out[7]:
Table length=4
column           dtype    row 1               row 2               row 3               row 4
TIME             float64  120.52992386784899  120.53060508973431  120.53128621167707  120.53196743356966
TIMECORR         float32  0.00096672785       0.00096674974       0.0009667717        0.00096679357
CADENCENO        int32    5501                5502                5503                5504
SAP_FLUX         float32  401288.16           401425.53           401172.0            401473.62
SAP_FLUX_ERR     float32  91.51187            91.53448            91.517265           91.53064
SAP_BKG          float32  2598.1086           2598.0261           2597.9438           2597.8613
SAP_BKG_ERR      float32  0.5752603           0.57525027          0.5752402           0.5752302
PDCSAP_FLUX      float32  406100.9            406242.22           405984.03           406293.0
PDCSAP_FLUX_ERR  float32  127.528824          125.212105          123.31339           121.84987
SAP_QUALITY      int32    0                   0                   0                   0
PSF_CENTR1       float64  nan                 nan                 nan                 nan
PSF_CENTR1_ERR   float32  nan                 nan                 nan                 nan
PSF_CENTR2       float64  nan                 nan                 nan                 nan
PSF_CENTR2_ERR   float32  nan                 nan                 nan                 nan
MOM_CENTR1       float64  621.2181187542091   621.2190207168474   621.2179228162092   621.2198263101714
MOM_CENTR1_ERR   float32  0.0002101067        0.00021018942       0.00021032244       0.00021020704
MOM_CENTR2       float64  848.8972881655714   848.8935790650855   848.894852260174    848.8961790332079
MOM_CENTR2_ERR   float32  0.00029125414       0.0002912894        0.0002914222        0.00029115527
POS_CORR1        float32  0.0002031729        0.00021144762       0.00021972114       0.00022799587
POS_CORR2        float32  -0.0025746305       -0.0025837936       -0.0025929555       -0.0026021185
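
As a quick sanity check that this is a short cadence file, we can look at the spacing between consecutive TIME values. The two timestamps below are copied by hand from the rows displayed above; the span estimate assumes all 14280 rows are evenly spaced, which ignores any gaps in the data.

```python
# First two TIME values (in days) from the rows displayed above.
t1 = 120.52992386784899
t2 = 120.53060508973431

SECONDS_PER_DAY = 86400
spacing_seconds = (t2 - t1) * SECONDS_PER_DAY

# Approximate time span covered if all 14280 rows were evenly spaced.
span_days = 14280 * (t2 - t1)

print(round(spacing_seconds, 2), round(span_days, 2))
```

A spacing of just under one minute is consistent with the short cadence definition given in the introduction.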

Plotting a Light Curve


Now that we have seen and accessed the data, we can begin to plot a light curve:

  1. Open the file using the fits.open command. This allows the program to read and store the data we will plot. Here we've also given the opened file a shorter name, hdulist, that is easier to handle (see line 1).

  2. Start by calibrating the time. Because the Kepler times are stored in BKJD (Barycentric Kepler Julian Date), we need to convert them to Barycentric Julian Days (BJD) if we want to be able to compare them with outside data. For a more detailed explanation of time conversions, see page 13 or page 17 of the Kepler Archive Manual.
    • Read in the BJDREF offsets, both the integer part (BJDREFI) and the fractional part (BJDREFF). These are stored as keywords in the header of the binary table extension.

  3. Read in the columns of times and fluxes (both uncorrected and corrected) from the data.
In [8]:
with fits.open(filename, mode="readonly") as hdulist:
    # Read in the "BJDREF" which is the time offset of the time array.
    bjdrefi = hdulist[1].header['BJDREFI'] 
    bjdreff = hdulist[1].header['BJDREFF']

    # Read in the columns of data.
    times = hdulist[1].data['time'] 
    sap_fluxes = hdulist[1].data['SAP_FLUX']
    pdcsap_fluxes = hdulist[1].data['PDCSAP_FLUX']
  4. Now that the appropriate data have been read and stored, convert the times to BJD by adding the BJDREF offsets to the time array.

  5. Finally, we can plot the fluxes against time. We can also set a title and add a legend to the plot. We can label our fluxes accordingly and assign them colors and styles ("-k" for a black line, "-b" for a blue line).
In [9]:
# Convert the time array to full BJD by adding the offset back in.
bjds = times + bjdrefi + bjdreff 

plt.figure(figsize=(9,4))

# Plot the time, uncorrected and corrected fluxes.
plt.plot(bjds, sap_fluxes, '-k', label='SAP Flux') 
plt.plot(bjds, pdcsap_fluxes, '-b', label='PDCSAP Flux') 

plt.title('Kepler Light Curve')
plt.legend()
plt.xlabel('Time (days)')
plt.ylabel('Flux (electrons/second)')
plt.show()
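
The time conversion used for the plot can be checked by hand on a single value. In this sketch the offsets are assumptions: the TUNIT1 keyword shown earlier reads 'BJD - 2454833', so the integer part is taken to be 2454833 and the fractional part 0.0. In practice, read BJDREFI and BJDREFF from the header as in the cell above rather than hard-coding them.

```python
# Assumed header values, inferred from TUNIT1 = 'BJD - 2454833'.
# Read BJDREFI and BJDREFF from the header rather than hard-coding them.
bjdrefi = 2454833
bjdreff = 0.0

# A TIME value from the table shown earlier, in BKJD.
bkjd = 120.52992386784899

# Full BJD is the stored time plus both parts of the reference offset.
bjd = bkjd + bjdrefi + bjdreff
print(f"{bjd:.6f}")
```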

The Aperture Extension


We can also make a plot of the third HDU, the image extension (No. 2, APERTURE). These data are stored as an array of integers that encodes which pixels were collected by the spacecraft and which were used in the optimal aperture (look here for more information on the aperture extension).

First, we need to re-open the FITS file and read in the data of the image extension. Then we print it as an array:

In [10]:
with fits.open(filename) as hdulist: 
    imgdata = hdulist[2].data
    
print(imgdata)
[[1 1 1 1 1 1 1 0]
 [1 1 1 5 5 5 5 1]
 [1 1 5 5 7 7 5 5]
 [1 1 5 7 7 7 7 5]
 [1 1 5 7 7 7 7 5]
 [1 1 5 7 7 7 7 5]
 [1 1 5 7 7 7 5 5]
 [1 1 5 5 7 5 5 1]
 [1 1 1 5 5 5 1 0]]
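
The integers in this array are bit flags, so individual bits can be tested with a bitwise AND. The sketch below assumes that value 1 marks a pixel collected by the spacecraft and value 2 marks a pixel in the optimal aperture; it does not interpret the remaining bit (value 4), and the array is typed in by hand from the output above. Note that fits.info lists the dimensions as (8, 9) while the array has 9 rows of 8 columns, because FITS lists the fastest-varying axis first.

```python
# The aperture array as printed above (9 rows of 8 columns).
imgdata = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 5, 5, 5, 5, 1],
    [1, 1, 5, 5, 7, 7, 5, 5],
    [1, 1, 5, 7, 7, 7, 7, 5],
    [1, 1, 5, 7, 7, 7, 7, 5],
    [1, 1, 5, 7, 7, 7, 7, 5],
    [1, 1, 5, 7, 7, 7, 5, 5],
    [1, 1, 5, 5, 7, 5, 5, 1],
    [1, 1, 1, 5, 5, 5, 1, 0],
]

# Assumed bit meanings: value 1 = pixel was collected,
# value 2 = pixel is in the optimal aperture.
COLLECTED, OPTIMAL = 1, 2

n_rows, n_cols = len(imgdata), len(imgdata[0])
n_collected = sum(1 for row in imgdata for v in row if v & COLLECTED)
n_optimal = sum(1 for row in imgdata for v in row if v & OPTIMAL)

# Print a text map marking optimal-aperture pixels with '#'.
for row in imgdata:
    print("".join("#" if v & OPTIMAL else "." for v in row))
print(n_rows, n_cols, n_collected, n_optimal)
```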

We can also show the data in a plot:

In [11]:
plt.figure(2)
plt.title('Kepler Aperture')
plt.imshow(imgdata, cmap=plt.cm.YlGnBu_r)
plt.xlabel('Column')
plt.ylabel('Row')
plt.colorbar()
Out[11]:
<matplotlib.colorbar.Colorbar at 0x7fe16994ea20>

Additional Resources

For more information about the MAST archive and details about mission data:

MAST API
Kepler Archive Page (MAST)
Kepler Archive Manual
Exo.MAST website


About this Notebook

Author: Josie Bunnell, STScI SASP Intern
Updated On: 08/10/2018