Using Kepler Light Curve Products with Lightkurve

Learning Goals

By the end of this tutorial, you will:

  • Understand how NASA’s Kepler Mission collected and released light curve data products.

  • Be able to download and plot light curve files from the data archive using Lightkurve.

  • Be able to access light curve metadata.

  • Understand the time and brightness units.


The Kepler, K2, and TESS telescopes observe stars for long periods of time, from just under a month to four years. By doing so they observe how the brightness of stars changes over time. A series of these brightness observations is referred to as a light curve of a star.

Light curves of stars observed by the Kepler, K2, or TESS missions are created from the raw images collected by these telescopes using software built for this purpose by the mission teams. In this tutorial, we will learn how to use the Lightkurve package to download these preprocessed light curves from Kepler’s data archive, plot them, and understand their properties and units.

Much of the explanation below is inspired by Kinemuchi et al. (2012), an excellent paper introducing and explaining the terminology surrounding the Kepler mission and its data. You can find detailed information on the mission and its data products in the official Kepler Instrument Handbook and the Kepler Data Processing Handbook.

We will use the Kepler mission as the main example, but these tools are extensible to TESS and K2 as well. For example, while in this tutorial we will learn to work with Lightkurve’s KeplerLightCurve objects, there are also TessLightCurve objects that work in the same way.


This tutorial only requires the Lightkurve package, which in turn uses matplotlib for plotting.

import lightkurve as lk
%matplotlib inline

1. About NASA’s Photometric Space Telescopes

In order to understand the data produced by NASA’s Kepler, K2, and TESS missions, it is useful to understand a little about how these data were obtained.

1.1. Kepler

During its nominal mission, the Kepler telescope made observations using 21 pairs of rectangular charge-coupled device (CCD) camera chips (also called modules), each consisting of four 1100 x 1024 pixel channels. Each observed star fell on one of these 84 CCD channels. Recording the channel number for each star was important, because the Kepler spacecraft rotated by 90 degrees roughly four times a year. These rotations separate what are referred to as observing quarters. While the same star may be observed in multiple quarters, it may fall on a different CCD channel each time.
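
The relationship between modules, outputs, and channels can be made concrete with a short sketch. It assumes the numbering described in the Kepler Instrument Handbook, in which the focal-plane positions are numbered 1 to 25 and positions 1, 5, 21, and 25 hold fine guidance sensors rather than science modules. The function name is hypothetical and is not part of Lightkurve.

```python
# Sketch: derive a Kepler channel number (1-84) from a module and output number.
# Assumption: science modules are focal-plane positions 2-24, excluding 5 and 21
# (those, together with 1 and 25, are fine guidance sensors).
SCIENCE_MODULES = [m for m in range(2, 25) if m not in (5, 21)]


def channel_number(module: int, output: int) -> int:
    """Return the channel for a science module (2-24) and an output (1-4)."""
    if module not in SCIENCE_MODULES:
        raise ValueError(f"{module} is not a science module")
    if output not in (1, 2, 3, 4):
        raise ValueError("output must be 1, 2, 3, or 4")
    return 4 * SCIENCE_MODULES.index(module) + output


print(len(SCIENCE_MODULES))   # 21
print(channel_number(10, 3))  # 31
```

The inputs MODULE = 10 and OUTPUT = 3 are taken from the metadata listing in Section 4, where the same file reports CHANNEL = 31.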

Kepler observed a single field in the sky, although not all stars in this field were recorded as light curves. Instead, pixels were selected around a predetermined list of target stars, which were then downloaded. These downloaded measurements are stored in target pixel files (TPFs). By adding up the flux (a measurement of an object’s brightness per unit time) measured by the pixels in which a target star appears, the total brightness of a star can be measured. If you make this measurement at different times, you obtain a light curve.
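
The idea of adding up pixel fluxes can be sketched in a few lines of plain Python. The pixel values and the aperture below are made up for illustration; real target pixel files hold many more pixels and cadences, and Lightkurve provides its own tools for this step.

```python
# Sketch: build a light curve by summing the flux inside an aperture.
# The pixel values are invented for illustration (electrons per second).
pixel_images = [
    [[1.0, 2.0, 1.0], [2.0, 50.0, 3.0], [1.0, 2.0, 1.0]],  # cadence 0
    [[1.0, 2.0, 1.0], [2.0, 49.0, 3.0], [1.0, 2.0, 1.0]],  # cadence 1
    [[1.0, 2.0, 1.0], [2.0, 45.0, 3.0], [1.0, 2.0, 1.0]],  # cadence 2 (dimmer)
]

# True marks the pixels in which the target star appears.
aperture = [
    [False, True, False],
    [True, True, True],
    [False, True, False],
]


def aperture_sum(image, mask):
    """Add up the flux of every pixel selected by the mask."""
    return sum(
        value
        for image_row, mask_row in zip(image, mask)
        for value, selected in zip(image_row, mask_row)
        if selected
    )


# One summed brightness per cadence: this sequence is the light curve.
light_curve = [aperture_sum(image, aperture) for image in pixel_images]
print(light_curve)  # [59.0, 58.0, 54.0]
```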

Kepler recorded the brightness measurements at two different cadences: a Short Cadence (SC, 58.85 seconds) and a Long Cadence (LC, 29.4 minutes). For more details, read: Kepler Instrument Handbook, Section 2.1. Mission Overview and 2.6. Pixels of Interest, and the Kepler Archive Manual Chapter 2: Kepler Data Products.
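
The two cadence lengths follow from the duration of a single frame. The sketch below uses the INT_TIME, READTIME, and NUM_FRM values from the short-cadence header listed in Section 4. The figure of 270 frames per long cadence is an assumption taken from the Kepler documentation rather than from this file.

```python
# Sketch: how Kepler's cadence lengths follow from the single-frame time.
INT_TIME = 6.01980290327    # seconds of integration per frame (from the header)
READTIME = 0.518948526144   # seconds of readout per frame (from the header)
SC_FRAMES = 9               # NUM_FRM in the short-cadence header
LC_FRAMES = 270             # assumption: frames co-added per long cadence

frame_time = INT_TIME + READTIME
short_cadence_s = SC_FRAMES * frame_time
long_cadence_min = LC_FRAMES * frame_time / 60

print(f"Frame time:    {frame_time:.4f} s")
print(f"Short cadence: {short_cadence_s:.2f} s")
print(f"Long cadence:  {long_cadence_min:.2f} min")
```

The frame time computed here equals the FRAMETIM header value, and the short cadence length in days matches TIMEDEL.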

Figure: The field of view of the Kepler mission. The rectangles represent the CCD modules described above.

1.2. K2

The Kepler mission ended in 2013 following the loss of two reaction wheels, leaving the spacecraft unable to stay fixed on one portion of the sky. Instead, under the name K2, the mission shifted its focus to the ecliptic plane and performed 80-day observing campaigns of 19 separate fields. K2 data are very similar to Kepler data, but are subject to higher levels of instrument noise due to the reduced pointing stability of the spacecraft. For more details, read the K2 Handbook, specifically Section 2: What’s New in K2.

1.3. TESS

The Transiting Exoplanet Survey Satellite (TESS) succeeded Kepler in 2018. The data it collects are very similar to those from Kepler and K2, but TESS covers a much larger area of the sky at a lower resolution. TESS observes large sectors of the sky for 27 days at a time. The overlap of these sectors means that stars near the ecliptic poles will receive a year of uninterrupted data, while those near the ecliptic receive only ~27 days. Compared to Kepler, TESS observes in several different cadence modes, including 20 seconds, 120 seconds, 10 minutes, and 30 minutes. For more details, see the Mission Overview and the TESS Instrument Handbook, specifically Section 2: Introduction to TESS.

Some stars that have been observed by TESS will also have been observed by Kepler, and in some rare cases K2.

2. Downloading a Light Curve File

The light curves of stars created by the Kepler mission are stored at the Mikulski Archive for Space Telescopes (MAST), along with metadata about the observations, such as which CCD channel was used at each time.

Lightkurve’s built-in tools allow us to search for light curve files in the archive, and download them and their metadata. In this example, we will start by downloading one quarter of Kepler data for a star named Kepler-8, a star somewhat larger than the Sun, and the host of a hot Jupiter planet.

Using Lightkurve’s search_lightcurve function, we can find an itemized list of different light curve file products available for Kepler-8:

search_result = lk.search_lightcurve("Kepler-8", mission="Kepler")
search_result

SearchResult containing 50 data products.

  #  mission            year  author      exptime [s]  target_name                   distance [arcsec]
---  -----------------  ----  ----------  -----------  ----------------------------  -----------------
  0  Kepler Quarter 03  2009  Kepler      60           kplr006922244                 0.0
  1  Kepler Quarter 02  2009  Kepler      60           kplr006922244                 0.0
  2  Kepler Quarter 02  2009  Kepler      60           kplr006922244                 0.0
  3  Kepler Quarter 02  2009  Kepler      60           kplr006922244                 0.0
  4  Kepler Quarter 03  2009  Kepler      60           kplr006922244                 0.0
  5  Kepler Quarter 03  2009  Kepler      60           kplr006922244                 0.0
  6  Kepler Quarter 00  2009  Kepler      1800         kplr006922244                 0.0
  7  Kepler Quarter 01  2009  Kepler      1800         kplr006922244                 0.0
  8  Kepler Quarter 02  2009  Kepler      1800         kplr006922244                 0.0
...
 40  Kepler Quarter 13  2012  Kepler      60           kplr006922244                 0.0
 41  Kepler Quarter 11  2012  Kepler      60           kplr006922244                 0.0
 42  Kepler Quarter 12  2012  Kepler      1800         kplr006922244                 0.0
 43  Kepler Quarter 11  2012  Kepler      1800         kplr006922244                 0.0
 44  Kepler Quarter 13  2012  Kepler      1800         kplr006922244                 0.0
 45  Kepler Quarter 14  2012  Kepler      1800         kplr006922244                 0.0
 46  Kepler Quarter 15  2013  Kepler      1800         kplr006922244                 0.0
 47  Kepler Quarter 17  2013  Kepler      1800         kplr006922244                 0.0
 48  Kepler Quarter 16  2013  Kepler      1800         kplr006922244                 0.0
 49  Kepler Quarter     2009  KBONUS-BKG  1765         Gaia DR3 2116730994965905280  0.0

Length = 50 rows

In this list, each row represents a different data product. We find that Kepler recorded the maximum of 18 quarters of data (Quarters 0 to 17) for this target across four years. Each row gives the Kepler Quarter, the year of observation, the author of the product, and the nominal exposure time in seconds (60 for short cadence and 1800 for long cadence). For products made by the Kepler pipeline, target_name is based on the Kepler Input Catalogue (KIC) ID of the target. The names of the FITS files downloaded from MAST are held in the productFilename column, which is not displayed in this summary. The distance column shows the separation on the sky between the searched coordinates and the downloaded objects; this is only relevant when you pass a radius argument to the search_lightcurve function to search for targets within a given search radius around a set of coordinates.

The search_lightcurve function takes several additional arguments, such as the quarter number or the mission name. You can find examples of its use in the online documentation for this function.

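The search function returns a SearchResult object which has several convenient operations. For example, indexing with search_result[4] selects the fifth data product in the list (indices start at zero):

SearchResult containing 1 data products.

  #  mission            year  author  exptime [s]  target_name    distance [arcsec]
---  -----------------  ----  ------  -----------  -------------  -----------------
  0  Kepler Quarter 03  2009  Kepler  60           kplr006922244  0.0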

We can download this data product using the download() method:

klc = search_result[4].download()

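A light curve can also be downloaded in a single step by passing additional criteria, such as the quarter number, directly to the search. If the search still matches more than one file, download() retrieves only the first one and issues a warning: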

klc = lk.search_lightcurve("Kepler-8", mission="Kepler", quarter=4).download()
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/lightkurve/ LightkurveWarning: Warning: 5 files available to download. Only the first file has been downloaded. Please use `download_all()` or specify additional criteria (e.g. quarter, campaign, or sector) to limit your search.

The klc variable we have obtained in this way is a KeplerLightCurve object. This object contains time, flux, and flux error information, as well as a number of additional columns describing spacecraft systematics. We can view all of them by evaluating klc on its own:

KeplerLightCurve length=45453 LABEL="KIC 6922244" QUARTER=4 AUTHOR=Kepler FLUX_ORIGIN=pdcsap_flux
[Table output not reproduced. Its unit row shows electron / s for the flux and background columns, d for the time correction, and pix for the centroid and position columns.]

This object provides a convenient way to interact with the data file that has been returned by the archive, which contains both the light curve data and metadata about the observations.

Before diving into the properties of the light curve file, we can plot the data, also using Lightkurve.

%matplotlib inline
klc.plot();

On this plot, the y-axis is flux in electrons per second. This unit may appear counterintuitive, as flux is a measure of brightness. However, the CCD cameras measure an electrical charge, so light is recorded as electrons rather than photons, as you might expect. On the x-axis we have time in Barycentric Kepler Julian Date (BKJD). In short, the x-axis values are days counted from a reference date shortly before Kepler's launch (noon on 1 January 2009). The repeating dips in brightness are transits, the effect of a planet orbiting Kepler-8 and passing between us and the star.


You can also download light curve FITS files from the archive by hand, store them on your local disk, and open them using the lk.read(<filename>) function. This function will return a KeplerLightCurve object just as in the above example. You can find out where Lightkurve stored a light curve file using the filename attribute (klc.filename).


3. The SAP and PDCSAP Light Curves

As you can see in the Table above, there are two different types of flux stored in the KeplerLightCurve object. These correspond to different levels of data treatment performed for this star by NASA’s Kepler Data Processing Pipeline: the simple aperture photometry (SAP) flux, and the presearch data conditioning SAP (PDCSAP) flux.

By default, a KeplerLightCurve will set the PDCSAP flux to its .flux property.

To compare the PDCSAP and the SAP flux, we can use the column keyword while plotting.

Note: alternatively, you can replace the flux column with the sap_flux column by using klc.flux = klc['sap_flux'].

ax = klc.plot(column='pdcsap_flux', label='PDCSAP Flux', normalize=True)
klc.plot(column='sap_flux', label='SAP Flux', normalize=True, ax=ax);

In brief:

  • The SAP light curve is calculated by summing together the brightness of pixels that fall within an aperture set by the Kepler mission. This is often referred to as the optimal aperture, but in spite of its name can sometimes be improved upon! Because the SAP light curve is a sum of the brightness in chosen pixels, it is still subject to systematic artifacts of the mission.

  • The PDCSAP light curve is subject to more treatment than the SAP light curve, and is specifically intended for detecting planets. The PDCSAP pipeline attempts to remove systematic artifacts while keeping planetary transits intact.

Looking at the figure we made above, you can see that the SAP light curve has a long-term change in brightness that has been removed in the PDCSAP light curve, while keeping the transits at the same depth. For most inspections, a PDCSAP light curve is what you want to use, but when looking at astronomical phenomena that aren’t planets (for example, long-term variability), the SAP flux may be preferred.
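
The two fluxes sit at different absolute levels, which is why the comparison plot was made with normalize=True. The sketch below illustrates the idea with made-up numbers, assuming that normalization means dividing each series by its median. That is my understanding of what Lightkurve does, but the helper here is a stand-in and not the library's own code.

```python
from statistics import median


def normalize(flux):
    """Divide a flux series by its median so that it is centred on 1."""
    middle = median(flux)
    return [value / middle for value in flux]


# Made-up fluxes in electrons per second, not real Kepler measurements.
sap_like = [40000.0, 40010.0, 39720.0, 40030.0, 40040.0]     # drift plus a dip
pdcsap_like = [48000.0, 48000.0, 47640.0, 48000.0, 48000.0]  # dip only

# After normalization both series can share one axis.
for label, flux in (("SAP-like", sap_like), ("PDCSAP-like", pdcsap_like)):
    print(label, [round(value, 4) for value in normalize(flux)])
```

In the normalized series both dips appear as a drop of under one percent below 1, while the slow drift remains visible only in the SAP-like values.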

For now, let’s continue to use the PDCSAP flux only. Because this is the default .flux property of our light curve object, we don’t need to change anything.


The plot() methods in Lightkurve always return a Matplotlib Axes object. This is useful because it lets us manipulate the plot using standard Matplotlib functions. For example, we can set the title as follows:

ax = klc.plot() 
ax.set_title("PDCSAP light curve of Kepler-8");

And the figure can be saved by calling ax.figure.savefig() with an output filename of your choice.


4. Accessing the Metadata

Data downloaded from MAST usually come in the form of FITS files. These FITS files carry a wealth of metadata about the observation. When they are loaded into Lightkurve to create a KeplerLightCurve, all of the metadata are stored in the .meta property of the object.

We can view these metadata by evaluating this property, klc.meta, which displays the following:

{'INHERIT': True,
 'EXTVER': 1,
 'TELESCOP': 'Kepler',
 'INSTRUME': 'Kepler Photometer',
 'OBJECT': 'KIC 6922244',
 'KEPLERID': 6922244,
 'RA_OBJ': 281.28812,
 'DEC_OBJ': 42.45108,
 'EQUINOX': 2000.0,
 'EXPOSURE': 28.59382564,
 'BJDREFI': 2454833,
 'BJDREFF': 0.0,
 'TIMEUNIT': 'd',
 'TELAPSE': 31.05881061,
 'LIVETIME': 28.59382564,
 'TSTART': 352.36610831,
 'TSTOP': 383.42491892,
 'LC_START': 55184.86785934,
 'LC_END': 55215.92624784,
 'DEADC': 0.92063492,
 'TIMEPIXR': 0.5,
 'TIERRELA': 5.78e-07,
 'TIERABSO': <astropy.io.fits.card.Undefined object at 0x7efd809d5d10>,
 'INT_TIME': 6.01980290327,
 'READTIME': 0.518948526144,
 'FRAMETIM': 6.538751429414,
 'NUM_FRM': 9,
 'TIMEDEL': 0.000681119940564,
 'DATE-OBS': '2009-12-19T20:49:13.622Z',
 'DATE-END': '2010-01-19T22:14:17.237Z',
 'BACKAPP': True,
 'DEADAPP': True,
 'VIGNAPP': True,
 'GAIN': 115.49,
 'READNOIS': 83.014212,
 'MEANBLCK': 715,
 'LCFXDOFF': 419400,
 'SCFXDOFF': 219400,
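 'CDPP3_0': <astropy.io.fits.card.Undefined object at 0x7efd809d5d10>,
 'CDPP6_0': <astropy.io.fits.card.Undefined object at 0x7efd809d5d10>,
 'CDPP12_0': <astropy.io.fits.card.Undefined object at 0x7efd809d5d10>,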
 'CROWDSAP': 1.0,
 'FLFRCSAP': 0.8286,
 'PDCVAR': 1.0116586685180664,
 'PDCMETHD': 'quickMap',
 'NUMBAND': 1,
 'FITTYPE1': 'prior',
 'PR_GOOD1': 0.9999775886535645,
 'PR_WGHT1': 0.022915253415703773,
 'PDC_TOT': 0.9629505276679993,
 'PDC_TOTP': 28.58700942993164,
 'PDC_COR': 0.9999974370002747,
 'PDC_CORP': 50.0,
 'PDC_VAR': 0.9998998641967773,
 'PDC_VARP': 71.4129867553711,
 'PDC_NOI': 0.8930104970932007,
 'PDC_NOIP': 28.58700942993164,
 'PDC_EPT': <astropy.io.fits.card.Undefined object at 0x7efd809d5d10>,
 'PDC_EPTP': <astropy.io.fits.card.Undefined object at 0x7efd809d5d10>,
 'SIMPLE': True,
 'BITPIX': 8,
 'NAXIS': 0,
 'EXTEND': True,
 'NEXTEND': 2,
 'ORIGIN': 'NASA/Ames',
 'DATE': '2016-05-18',
 'CREATOR': '1193565 FluxExporter2PipelineModule',
 'PROCVER': 'svn+ssh://murzim/repo/soc/tags/release/9.3.46 r61343',
 'FILEVER': '6.1',
 'TIMVERSN': 'OGIP/93-003',
 'CHANNEL': 31,
 'SKYGROUP': 31,
 'MODULE': 10,
 'OUTPUT': 3,
 'QUARTER': 4,
 'SEASON': 2,
 'DATA_REL': 25,
 'OBSMODE': 'short cadence',
 'MISSION': 'Kepler',
 'TTABLEID': 29,
 'PMRA': 0.0,
 'PMDEC': 0.0,
 'PMTOTAL': 0.0,
 'PARALLAX': None,
 'GLON': 71.6589,
 'GLAT': 19.012749,
 'GMAG': 13.886,
 'RMAG': 13.511,
 'IMAG': 13.424,
 'ZMAG': 13.413,
 'D51MAG': 13.7,
 'JMAG': 12.576,
 'HMAG': 12.323,
 'KMAG': 12.292,
 'KEPMAG': 13.563,
 'GRCOLOR': 0.375,
 'JKCOLOR': 0.284,
 'GKCOLOR': 1.594,
 'TEFF': 6225,
 'LOGG': 4.169,
 'FEH': -0.04,
 'EBMINUSV': 0.096,
 'AV': 0.297,
 'RADIUS': 1.451,
 'TMINDEX': 262064792,
 'SCPID': 262064792,
 'LABEL': 'KIC 6922244',
 'RA': 281.28812,
 'DEC': 42.45108,
 'FILENAME': '/home/runner/.lightkurve/cache/mastDownload/Kepler/kplr006922244_sc_Q003333310333330000/kplr006922244-2010019161129_slc.fits',
 'FLUX_ORIGIN': 'pdcsap_flux',
 'AUTHOR': 'Kepler',
 'TARGETID': 6922244,
 'QUALITY_BITMASK': 'default',
 'QUALITY_MASK': array([ True,  True,  True, ...,  True,  True,  True])}

As you can see, there is a lot here if you don’t know what you are looking for! These metadata don’t just include information about the observations, but also data from the Kepler Input Catalogue (KIC) used to select observing targets, such as their magnitudes and temperature.

The .meta property is a Python dictionary, which has some convenient features. For example, we can retrieve the value of an individual keyword by indexing with its name, such as klc.meta['QUARTER'] (keep in mind that dictionary keywords are case sensitive).


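Alternatively, we can use the .get() method, for example klc.meta.get('QUARTER'), which returns None (or a default value that you supply) instead of raising an error when the keyword isn’t in the dictionary.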
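
The difference between the two kinds of lookup can be tried out on a plain dictionary. The sketch below uses a small excerpt of the metadata shown above in place of a downloaded light curve; the CAMPAIGN keyword is simply an example of a key that is absent from this excerpt.

```python
# Sketch: a small excerpt of the metadata shown above, as a plain dictionary.
meta = {"MISSION": "Kepler", "QUARTER": 4, "CHANNEL": 31, "OBSMODE": "short cadence"}

print(meta["QUARTER"])              # 4

# Keys are case sensitive, so a lower-case lookup raises KeyError.
try:
    meta["quarter"]
except KeyError as error:
    print(f"KeyError: {error}")     # KeyError: 'quarter'

# .get() returns None, or a fallback of your choice, for a missing key.
print(meta.get("CAMPAIGN"))         # None
print(meta.get("CAMPAIGN", "n/a"))  # n/a
```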


A feature of the KeplerLightCurve object is that the metadata can also be accessed via user-friendly object properties for convenience. For example, the Kepler Quarter number is directly accessible via the quarter property, klc.quarter.


5. Understanding the Data Arrays and Units

As we saw above, the KeplerLightCurve object is a table that contains many arrays other than the PDCSAP and SAP fluxes. Detailed information on each of these can be found in the Kepler Archive Manual, Section 2.3.1. Light Curve Files.

The first six columns appear in all KeplerLightCurve objects, and contain the most commonly used information. These are:

  • time: the time measurements at each cadence.

  • flux: the flux of the target star at each time measurement. This is populated with PDCSAP flux by default.

  • flux_err: the statistical uncertainty on each flux data point.

  • quality: information on the data quality at each time measurement.

  • centroid_col & centroid_row: the position of the target star on the CCD at each observation. This changes over time due to, for example, small jitters of the spacecraft.

The remaining columns contain more detailed information on the observation. Some of these are duplicated in the six columns described above:

  • timecorr: correction values that allow users to revert back to non-barycentric timestamps.

  • cadenceno: these are mission-specific identifiers of each exposure.

  • sap_flux & sap_flux_err: the SAP flux and associated error.

  • sap_bkg & sap_bkg_err: the calculated background (and associated error) inside the aperture used to calculate the SAP flux.

  • pdcsap_flux & pdcsap_flux_err: the PDCSAP flux and associated error. Duplicated by default in flux and flux_err.

  • sap_quality: information on the data quality at each time measurement. Duplicated in quality.

  • psf_centr1 & psf_centr2 (and errors): the column and row centroid positions of a PSF model fit to the target star.

  • mom_centr1 & mom_centr2 (and errors): the column and row centroid positions of the target star, weighted by flux. Duplicated in centroid_col and centroid_row respectively.

  • pos_corr1 & pos_corr2: the column and row components of the calculated image motion.
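
The quality and sap_quality columns hold integers in which individual bits act as flags. A minimal sketch of how such a column can be used to reject cadences is given below. The flag names and bit positions are hypothetical, chosen only to illustrate the mechanism, and do not reproduce the mission's actual flag definitions.

```python
# Sketch: select cadences using an integer quality column made of bit flags.
# The flag names and bit values are hypothetical, chosen for illustration.
FLAG_A = 1 << 0   # hypothetical flag stored in bit 0 (value 1)
FLAG_B = 1 << 3   # hypothetical flag stored in bit 3 (value 8)

flux = [100.0, 101.0, 250.0, 99.0]
quality = [0, FLAG_A, FLAG_A | FLAG_B, 0]

# Reject any cadence that has at least one of the masked bits set.
bitmask = FLAG_A | FLAG_B
good_flux = [f for f, q in zip(flux, quality) if (q & bitmask) == 0]

print(good_flux)  # [100.0, 99.0]
```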

The columns listed above can be accessed as properties of the KeplerLightCurve, using the column name as the property name. The result is displayed together with its unit, for example:

[1673.4445, 1673.3635, 1673.2825, ..., 1828.2699, 1828.2723, 1828.2748] electron / s

The unit information of the arrays is stored using Astropy’s astropy.units module, which means that each array is an Astropy Quantity object. We can view the units as follows:

print(f'Centroid column unit: {klc.centroid_col.unit}')
print(f'Flux unit: {klc.flux.unit}')
Centroid column unit: pix
Flux unit: electron / s

You can access the data in the form of a standard NumPy array using the value attribute:

MaskedNDArray([682.69288948, 682.69073348, 682.6908351 , ...,
               682.61725999, 682.62236336, 682.61672471], dtype='>f8')

We can also plot the data using the KeplerLightCurve’s plot() method by passing a column keyword argument:

ax = klc.plot(column='mom_centr1', label='Flux-weighted column position')
klc.plot(ax=ax, column='psf_centr1', label='PSF centroid column position');

Finally, the .time property is a little different. Instead of an Astropy Quantity object, it is an Astropy Time object, and has some additional time scale and format information.

<Time object: scale='tdb' format='bkjd' value=[352.36644886 352.36712985 352.36781104 ... 383.42321606 383.42389716
print(f'Time scale: {klc.time.scale}')
print(f'Time format: {klc.time.format}')
Time scale: tdb
Time format: bkjd

Here, the time format describes how the time values are represented, in this case as Barycentric Kepler Julian Date (BKJD), which is the Barycentric Julian Date minus 2454833 days (the BJDREFI value in the metadata). The time scale indicates how the time is measured, in this case Barycentric Dynamical Time (TDB). This detailed information may be important when comparing observations of a periodic event (such as a planet transit) with observations made with other telescopes, for example on Earth.


Some stars, such as Kepler-10, have been observed both with Kepler and TESS. In this exercise, download and plot the TESS PDCSAP flux only. You can do this by either selecting it from the SearchResult returned by search_lightcurve() or by using the mission keyword argument when searching.

# search_result = lk.search_lightcurve(...)
# Solution:
search_result = lk.search_lightcurve('Kepler-10', mission='TESS')
SearchResult containing 43 data products.

  #  mission         year  author  exptime [s]  target_name  distance [arcsec]
---  --------------  ----  ------  -----------  -----------  -----------------
  0  TESS Sector 14  2019  SPOC    120          377780790    0.0
  1  TESS Sector 40  2021  SPOC    20           377780790    0.0
  2  TESS Sector 41  2021  SPOC    20           377780790    0.0
  3  TESS Sector 41  2021  SPOC    120          377780790    0.0
  4  TESS Sector 40  2021  SPOC    120          377780790    0.0
  5  TESS Sector 55  2022  SPOC    20           377780790    0.0
  6  TESS Sector 54  2022  SPOC    20           377780790    0.0
  7  TESS Sector 53  2022  SPOC    20           377780790    0.0
  8  TESS Sector 55  2022  SPOC    120          377780790    0.0
...
 33  TESS Sector 14  2019  TASOC   1800         377780790    0.0
 34  TESS Sector 15  2019  CDIPS   1800         377780790    0.0
 35  TESS Sector 26  2020  TASOC   1800         377780790    0.0
 36  TESS Sector 26  2020  CDIPS   1800         377780790    0.0
 37  TESS Sector 26  2020  TASOC   1800         377780790    0.0
 38  TESS Sector 41  2021  CDIPS   1800         377780790    0.0
 39  TESS Sector 40  2021  CDIPS   1800         377780790    0.0
 40  TESS Sector 55  2022  CDIPS   1800         377780790    0.0
 41  TESS Sector 54  2022  CDIPS   1800         377780790    0.0
 42  TESS Sector 53  2022  CDIPS   1800         377780790    0.0

Length = 43 rows
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/lightkurve/ LightkurveWarning: Warning: 43 files available to download. Only the first file has been downloaded. Please use `download_all()` or specify additional criteria (e.g. quarter, campaign, or sector) to limit your search.

About this Notebook

Authors: Oliver Hall, Geert Barentsen

Updated On: 2020-08-31

Citing Lightkurve and Astropy

If you use lightkurve or astropy for published research, please cite the authors.


When using Lightkurve, we kindly request that you cite the following packages:

  • lightkurve
  • astropy
  • astroquery — if you are using search_lightcurve() or search_targetpixelfile().
  • tesscut — if you are using search_tesscut().
