Download MAST TESS Light Curves Within an FFI Footprint Using TAP#

This notebook is a demo for accessing Transiting Exoplanet Survey Satellite (TESS) data in the Common Archive Observation Model (CAOM) catalog at MAST, using a Virtual Observatory standard Table Access Protocol (TAP) service.

Table of Contents#

  1. TAP Service Introduction

  2. Imports

  3. Service Specific Configuration

  4. Connecting to the TAP Service

  5. Use Case: Getting light curves from a sector, camera, and chip

    • Step 1: Getting the footprint

    • Step 2: Getting an inventory of TESS lightcurves within the footprint

  6. Additional Resources

  7. About This Notebook

TAP Service Introduction#

Table Access Protocol (TAP) services allow more direct and flexible access to astronomical data than the simpler types of IVOA standard data services. Queries are built with the SQL-like Astronomical Data Query Language (ADQL), and can include geometric (spatial) constraints as well as filters on other characteristics of the data. TAP also gives the user fine-grained control over the returned columns, unlike the fixed set of columns returned from cone, image, and spectral services.
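To make the idea of column control and spatial filtering concrete, here is a minimal sketch that only assembles an ADQL string and sends nothing to a service. The helper name build_adql is hypothetical, and the circle position and radius are arbitrary illustrative values; the table and column names are taken from the table listing later in this notebook.

```python
def build_adql(table, columns, conditions=(), top=None):
    """Assemble a simple ADQL SELECT statement from its parts (illustrative helper)."""
    select = "SELECT "
    if top is not None:
        # TOP limits the number of returned rows
        select += "TOP {} ".format(int(top))
    # The caller chooses exactly which columns come back
    select += ", ".join(columns)
    query = "{} FROM {}".format(select, table)
    if conditions:
        query += " WHERE " + " AND ".join(conditions)
    return query


query = build_adql(
    table="ivoa.obscore",
    columns=["obs_id", "s_ra", "s_dec"],
    conditions=[
        "obs_collection = 'TESS'",
        # Spatial filter: positions inside a circle (RA, Dec, radius in degrees; arbitrary values)
        "CONTAINS(POINT('ICRS', s_ra, s_dec), CIRCLE('ICRS', 330.0, -40.0, 0.5)) = 1",
    ],
    top=5,
)
print(query)
```

A string built this way could then be passed to a TAP client such as PyVO, as done in the queries further down.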

For this example, we’ll be using the astropy-affiliated PyVO client, which is interoperable with other valid TAP services, including those at MAST. PyVO documentation is available at ReadTheDocs.

We’ll be using PyVO to call the CAOM Catalog TAP service at MAST and filter the results for TESS-related information. The schema for this catalog is an IVOA standard, and is also described within the service itself.

Imports#


# Use the pyvo library as our client to the data service.
import pyvo as vo

# For handling ordinary astropy Tables in responses
from astropy.table import Table

# For displaying and manipulating some types of results
import requests
import astropy
import time

# suppress unimportant unit warnings from many TAP services
import warnings
warnings.filterwarnings("ignore", module="*")
warnings.filterwarnings("ignore", module="pyvo.utils.xml.elements")
warnings.filterwarnings("ignore", module="")

Service Specific Configuration#

Every TAP service has a “Base URL” plus associated endpoints for synchronous and asynchronous queries, as well as status and capability information, and sometimes service-provided sample queries. The endpoints are predefined in the TAP standard, so clients can infer them from the base URL. We therefore only have to provide PyVO with that base URL.

TAP_URL = ""
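As an illustration of how a client can infer the endpoints from the base URL, the sketch below appends the resource names defined by the TAP and related IVOA standards. The helper name tap_endpoints and the base URL https://example.org/tap are placeholders, not the MAST service; PyVO does the equivalent internally, though not necessarily in this way.

```python
# Resource names defined by the TAP, VOSI and DALI standards
STANDARD_RESOURCES = ("sync", "async", "tables", "capabilities", "availability", "examples")


def tap_endpoints(base_url):
    """Return a dict mapping each standard resource name to its full URL."""
    base = base_url.rstrip("/")  # tolerate a trailing slash on the base URL
    return {name: "{}/{}".format(base, name) for name in STANDARD_RESOURCES}


# Placeholder base URL, used only for illustration
endpoints = tap_endpoints("https://example.org/tap/")
for name, url in endpoints.items():
    print("{:>12}: {}".format(name, url))
```

Not every service implements the optional resources (for example, examples), so a real client would check the capabilities document first.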

Connecting to the TAP Service#

The PyVO library is able to connect to any TAP service, given the “base” URL as noted in metadata registry resources describing the service. The CAOM TAP service at MAST has access to TESS FFI and time series, including file URLs for download.

TAP_service = vo.dal.TAPService(TAP_URL)
TAP_service.describe()
Capability ivo://

Interface vs:ParamHTTP

Language ADQL
Output format application/x-votable+xml
    Also available as votable

Output format text/csv;header=present
    Also available as csv

Maximum size of resultsets
    Default 100000 row
    Maximum 100000 row

Capability ivo://

Interface vr:WebBrowser

List available tables#

TAP_tables = TAP_service.tables
for tablename in TAP_tables.keys():
    if "tap_schema" not in tablename:
        # These tables have no descriptions, but describe() still prints each table name
        TAP_tables[tablename].describe()
        print("Columns={}".format(sorted([k.name for k in TAP_tables[tablename].columns])))
No description

Columns=['calib_level', 'dataRights', 'dataURL', 'dataproduct_type', 'em_max', 'em_min', 'filters', 'instrument_name', 'intentType', 'jpegURL', 'mtFlag', 'objID', 'obs_collection', 'obs_id', 'obs_title', 'obsid', 'project', 'proposal_id', 'proposal_pi', 'proposal_type', 'provenance_name', 's_dec', 's_ra', 's_region', 'sequence_number', 'srcDen', 't_exptime', 't_max', 't_min', 't_obs_release', 'target_classification', 'target_name', 'wavelength_region']
No description

Columns=['calib_level', 'dataRights', 'dataURL', 'dataproduct_type', 'em_max', 'em_min', 'filters', 'instrument_name', 'intentType', 'jpegURL', 'mtFlag', 'objID', 'obs_collection', 'obs_id', 'obs_title', 'obsid', 'project', 'proposal_id', 'proposal_pi', 'proposal_type', 'provenance_name', 's_dec', 's_ra', 's_region', 'sequence_number', 'srcDen', 't_exptime', 't_max', 't_min', 't_obs_release', 'target_classification', 'target_name', 'wavelength_region']
No description

Columns=['calib_level', 'dataRights', 'dataURL', 'dataproduct_type', 'em_max', 'em_min', 'filters', 'instrument_name', 'intentType', 'jpegURL', 'mtFlag', 'objID', 'obs_collection', 'obs_id', 'obs_title', 'obsid', 'project', 'proposal_id', 'proposal_pi', 'proposal_type', 'provenance_name', 's_dec', 's_ra', 's_region', 'sequence_number', 'srcDen', 't_exptime', 't_max', 't_min', 't_obs_release', 'target_classification', 'target_name', 'wavelength_region']
No description

Columns=['accMetaChecksum', 'algName', 'collection', 'envAmbientTemp', 'envElevation', 'envHumidity', 'envPhotometric', 'envSeeing', 'envTau', 'envWavelengthTau', 'id', 'insKeywords', 'insName', 'intent', 'lastModified', 'linked', 'maxLastModified', 'maxLevel', 'metaChecksum', 'metaDataRights', 'metaProducer', 'metaReadGroups', 'metaRelease', 'nObs', 'obsType', 'observationID', 'observationTID', 'prpID', 'prpKeywords', 'prpPI', 'prpProject', 'prpTitle', 'recordCreated', 'recordModified', 'reqFlag', 'sequenceNumber', 'statusCode', 'tlsGeoLocationX', 'tlsGeoLocationY', 'tlsGeoLocationZ', 'tlsKeywords', 'tlsName', 'trgClassification', 'trgID', 'trgKeywords', 'trgMoving', 'trgName', 'trgPosDec', 'trgRedshift', 'trgStandard', 'trgType', 'trgposCoordSys', 'trgposEquinox', 'trgposRA', 'type']
No description

Columns=['collection', 'derivedTID', 'recordCreated', 'recordModified', 'simpleTID', 'statusCode']
No description

Columns=['accMetaChecksum', 'calibrationLevel', 'collection', 'creatorID', 'cstCtype', 'cstDimension', 'cstLower', 'cstUpper', 'dataProductType', 'dataReadGroups', 'dataRelease', 'dataRights', 'dqFlag', 'enrBandpassName', 'enrBoundsSTCS', 'enrDimension', 'enrEMBand', 'enrMax', 'enrMin', 'enrResolvingPower', 'enrResolvingPowerLower', 'enrResolvingPowerUpper', 'enrRestWavelength', 'enrSampleSize', 'enrTransition', 'id', 'lastModified', 'maxLastModified', 'metaChecksum', 'metaDataRights', 'metaProducer', 'metaReadGroups', 'metaRelease', 'midExpDate', 'mtrBackground', 'mtrBackgroundStdDev', 'mtrFluxDensityLimit', 'mtrMagLimit', 'mtrSampleSNR', 'mtrSourceNumberDensity', 'obsUCD', 'observationTID', 'observationUUID', 'planeTID', 'plrDimension', 'plrState', 'posBoundsSTCS', 'posDimension1', 'posDimension2', 'posResolution', 'posResolutionLower', 'posResolutionUpper', 'posSampleSize', 'posTimeDependant', 'previewURI', 'productID', 'productURI', 'prvInputs', 'prvKeywords', 'prvLastExecuted', 'prvName', 'prvProducer', 'prvProject', 'prvReference', 'prvRunID', 'prvVersion', 'recordCreated', 'recordModified', 'releaseDate', 'statusCode', 'timBoundsSTCS', 'timDimension', 'timExposure', 'timMax', 'timMin', 'timResolution', 'timResolutionLower', 'timResolutionUpper', 'timSampleSize']
No description

Columns=['accMetaChecksum', 'artifactTID', 'collection', 'contentChecksum', 'contentLength', 'contentReadGroups', 'contentRelease', 'contentRights', 'contentType', 'creationDate', 'dataURI', 'id', 'lastModified', 'maxLastModified', 'metaChecksum', 'metaProducer', 'planeTID', 'planeUUID', 'productFileName', 'productType', 'productTypeID', 'recordCreated', 'recordModified', 'releaseType', 'statusCode']
No description

Columns=['accMetaChecksum', 'artifactTID', 'artifactUUID', 'collection', 'id', 'lastModified', 'maxLastModified', 'metaChecksum', 'metaProducer', 'name', 'partTID', 'productType', 'recordCreated', 'recordModified', 'statusCode']
No description

Columns=['accMetaChecksum', 'chunkTID', 'collection', 'cstwcsCRPIX', 'cstwcsCRVAL', 'cstwcsCTYPE', 'cstwcsCUNIT', 'cstwcsDELTA', 'cstwcsNAXIS', 'cstwcsRNDER', 'cstwcsRangeEndPix', 'cstwcsRangeEndVal', 'cstwcsRangeStartPix', 'cstwcsRangeStartVal', 'cstwcsSYSER', 'customAxis', 'energyAxis', 'enrwcsBandpassName', 'enrwcsCRPIX', 'enrwcsCRVAL', 'enrwcsCTYPE', 'enrwcsCUNIT', 'enrwcsDELTA', 'enrwcsNAXIS', 'enrwcsRNDER', 'enrwcsRangeEndPix', 'enrwcsRangeEndVal', 'enrwcsRangeStartPix', 'enrwcsRangeStartVal', 'enrwcsResolvingPower', 'enrwcsRestfrq', 'enrwcsRestwav', 'enrwcsSYSER', 'enrwcsSpecsys', 'enrwcsSsysobs', 'enrwcsSsyssrc', 'enrwcsTransition', 'enrwcsVelang', 'enrwcsVelosys', 'enrwcsZsource', 'id', 'lastModified', 'maxLastModified', 'metaChecksum', 'metaProducer', 'naxis', 'observableAxis', 'obxDepBin', 'obxDepCTYPE', 'obxDepCUNIT', 'obxIndBin', 'obxIndCTYPE', 'obxIndCUNIT', 'partTID', 'partUUID', 'plrwcsCRPIX', 'plrwcsCRVAL', 'plrwcsCTYPE', 'plrwcsCUNIT', 'plrwcsDELTA', 'plrwcsNAXIS', 'plrwcsRNDER', 'plrwcsRangeEndPix', 'plrwcsRangeEndVal', 'plrwcsRangeStartPix', 'plrwcsRangeStartVal', 'plrwcsSYSER', 'polarizationAxis', 'positionAxis1', 'positionAxis2', 'poswcsCD11', 'poswcsCD12', 'poswcsCD21', 'poswcsCD22', 'poswcsCRPIX1', 'poswcsCRPIX2', 'poswcsCRVAL1', 'poswcsCRVAL2', 'poswcsCTYPE1', 'poswcsCTYPE2', 'poswcsCUNIT1', 'poswcsCUNIT2', 'poswcsCoordSys', 'poswcsEquinox', 'poswcsNAXIS1', 'poswcsNAXIS2', 'poswcsRNDER1', 'poswcsRNDER2', 'poswcsRangeEndPix1', 'poswcsRangeEndPix2', 'poswcsRangeEndVal1', 'poswcsRangeEndVal2', 'poswcsRangeStartPix1', 'poswcsRangeStartPix2', 'poswcsRangeStartVal1', 'poswcsRangeStartVal2', 'poswcsResolution', 'poswcsSYSER1', 'poswcsSYSER2', 'productType', 'recordCreated', 'recordModified', 'statusCode', 'timeAxis', 'timwcsCRPIX', 'timwcsCRVAL', 'timwcsCTYPE', 'timwcsCUNIT', 'timwcsDELTA', 'timwcsExposure', 'timwcsMJDref', 'timwcsNAXIS', 'timwcsRNDER', 'timwcsRangeEndPix', 'timwcsRangeEndVal', 'timwcsRangeStartPix', 'timwcsRangeStartVal', 
'timwcsResolution', 'timwcsSYSER', 'timwcsTimeSys', 'timwcsTrefpos']
No description

Columns=['collection', 'linkCollection', 'linkTID', 'linkType', 'obsTID', 'recordCreated']
No description

Columns=['contentType', 'description', 'documentationURL', 'groupDescription', 'mission', 'productType', 'project', 'subGroupDescription', 'typeID']
No description

Columns=['access_estsize', 'access_format', 'access_url', 'calib_level', 'dataproduct_type', 'em_max', 'em_min', 'em_res_power', 'em_xel', 'facility_name', 'instrument_name', 'o_ucd', 'objID', 'obs_collection', 'obs_id', 'obs_publisher_did', 'pol_states', 'pol_xel', 's_dec', 's_fov', 's_ra', 's_region', 's_resolution', 's_xel1', 's_xel2', 't_exptime', 't_max', 't_min', 't_resolution', 't_xel', 'target_name']
No description

Columns=['access_estsize', 'access_format', 'access_url', 'calib_level', 'dataproduct_type', 'em_max', 'em_min', 'em_res_power', 'em_xel', 'facility_name', 'instrument_name', 'o_ucd', 'objID', 'obs_collection', 'obs_id', 'obs_publisher_did', 'obs_release_date', 'pol_states', 'pol_xel', 's_dec', 's_fov', 's_ra', 's_region', 's_resolution', 's_xel1', 's_xel2', 't_exptime', 't_max', 't_min', 't_resolution', 't_xel', 'target_name']

Use Case: Getting light curves from a sector, camera, and chip#

Step 1: Getting the footprint#

For our purposes, any one footprint related to a sector, camera, and chip combination is good enough. We are not currently accounting for small movements of the spacecraft to form a composite footprint. Observation IDs for this mission are constructed based on sector, camera, and chip combination, and we can use this to launch our footprint search:

sector = '1'
camera = '1'
chip = '2'
# Zero-pad the sector to four digits so that sectors 10 and above also match (e.g. s0001, s0012)
observationIDwildcard = 'tess%-s{:0>4}-{}-{}-%-s'.format(sector, camera, chip)
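To see what such a wildcard matches, the following sketch translates a SQL LIKE pattern into a regular expression and tests it locally against the observation ID returned by the query below. The helper names like_to_regex and obs_id_pattern are hypothetical, and the four-digit zero-padded sector is an assumption based on the s0001 form seen in that ID.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern ('%' = any run of characters, '_' = one character)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("^" + "".join(parts) + "$")


def obs_id_pattern(sector, camera, chip):
    """Build the LIKE pattern, assuming the sector is zero-padded to four digits."""
    return "tess%-s{:04d}-{}-{}-%-s".format(int(sector), camera, chip)


pattern = obs_id_pattern(1, 1, 2)
matcher = like_to_regex(pattern)
print(pattern)
print(bool(matcher.match("tess2018207052942-s0001-1-2-0120-s")))  # same sector, camera, chip
print(bool(matcher.match("tess2018207052942-s0001-1-3-0120-s")))  # different chip
```

The actual matching is done by the database; this local check is only a way to sanity-test a pattern before sending the query.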

Here we start by querying for a single observation ID and its footprint. As filters, we use the TESS mission, the target type associated with full frame images (FIELD) rather than individual light curves, and the sector number. Note that the sector ID (sequence number) is stored in a different table in the CAOM database than most of the metadata we want, so we have to join these tables on the shared observation ID.

This query uses an asynchronous job with longer timeouts in case of connection issues.

job = TAP_service.run_async(f"""
            SELECT top 1 obs_id, s_region
            FROM dbo.caomobservation JOIN ivoa.obscore on dbo.caomobservation.observationID = ivoa.obscore.obs_id
            WHERE collection = 'TESS' and trgType = 'FIELD' and
            sequenceNumber = {sector} and
            observationID like '{observationIDwildcard}'
            """)

footprint_results = job.to_table()
Table length=1
obs_id                              s_region
tess2018207052942-s0001-1-2-0120-s  POLYGON 330.343765 -44.336318 324.663797 -33.277719 337.720578 -28.384567 344.128621 -38.561643 330.343765 -44.336318
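An asynchronous TAP query runs as a job that moves through phases defined by the IVOA Universal Worker Service (UWS) pattern, such as PENDING, QUEUED, EXECUTING, and finally COMPLETED, ERROR, or ABORTED. PyVO's run_async takes care of this for us; the sketch below only illustrates the general polling idea with a stand-in phase source. The function wait_for_job and its parameters are hypothetical and are not part of PyVO.

```python
import time

# Phases after which a UWS job will not change state any more
TERMINAL_PHASES = {"COMPLETED", "ERROR", "ABORTED"}


def wait_for_job(get_phase, delay=0.5, max_delay=30.0, timeout=600.0, sleep=time.sleep):
    """Poll get_phase() with exponential backoff until a terminal phase or a timeout."""
    waited = 0.0
    while True:
        phase = get_phase()
        if phase in TERMINAL_PHASES:
            return phase
        if waited >= timeout:
            raise TimeoutError("job still in phase {} after {:.0f} s".format(phase, waited))
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)  # back off so the service is not hammered


# Stand-in for a real service: replays a fixed sequence of phases without sleeping
phases = iter(["PENDING", "QUEUED", "EXECUTING", "EXECUTING", "COMPLETED"])
final_phase = wait_for_job(lambda: next(phases), sleep=lambda seconds: None)
print(final_phase)
```

With a real service, get_phase would be a call that reads the job's current phase, and a result of ERROR or ABORTED would need to be handled before fetching any table.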

Step 2: Getting an inventory of TESS lightcurves within the footprint#

Here we take the footprint from the above query and find all lightcurves intersecting with this footprint, in ALL sectors. Depending on where this is in the sky, there could be responses only in the original sector, or there could be overlaps with other sectors. There would be more sector overlap near the poles, for instance. By filtering on obs_collection = TESS, we filter based on the TESS mission and exclude High Level Science Products (including those based on TESS).

The footprint string must be reformatted for our next query. We separate the shape name from the list of vertices, which must themselves be comma-separated.

footprint = footprint_results['s_region'][0]
footprintShape = footprint[0:footprint.find(' ')]
footprintVertices = footprint[footprint.find(' '):].strip().replace(' ', ', ')

330.343765, -44.336318, 324.663797, -33.277719, 337.720578, -28.384567, 344.128621, -38.561643, 330.343765, -44.336318
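The same reformatting can be written as a small parsing function, which also gives the vertices as (RA, Dec) pairs for local checks. This is a sketch under the assumption that s_region has the form "SHAPE number number ..." with no coordinate-frame token after the shape name; the function name parse_s_region is hypothetical, and the footprint string is the one returned above.

```python
def parse_s_region(s_region):
    """Split an s_region string into its shape name and a list of (ra, dec) pairs."""
    tokens = s_region.split()
    shape = tokens[0]
    numbers = [float(token) for token in tokens[1:]]
    if len(numbers) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    vertices = list(zip(numbers[0::2], numbers[1::2]))
    return shape, vertices


footprint_text = (
    "POLYGON 330.343765 -44.336318 324.663797 -33.277719 "
    "337.720578 -28.384567 344.128621 -38.561643 330.343765 -44.336318"
)
shape, vertices = parse_s_region(footprint_text)

# Comma-separated coordinate list, in the form needed inside the ADQL geometry
adql_vertices = ", ".join(footprint_text.split()[1:])

print(shape)
print(vertices)
print(adql_vertices)
```

Note that the polygon is closed: its last vertex repeats the first.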

Once the footprint has been isolated and reformatted, we perform another query listing all lightcurves (minus the data validation timeseries, whose files end in _dvt) by their target name, sector, RA and Dec, as well as returning the access url for each FITS file and its estimated file size. We are doing this as an asynchronous query, which handles longer response times, just in case.

job = TAP_service.run_async("""
            SELECT target_name, sequenceNumber as sector, s_ra, s_dec, access_url, access_estsize, obs_id 
            FROM dbo.caomobservation JOIN ivoa.obscore on dbo.caomobservation.observationID = ivoa.obscore.obs_id 
            WHERE obs_collection = 'TESS' and dataproduct_type = 'timeseries' 
            and access_url like '%lc.fits' and 
            CONTAINS(POINT('ICRS',s_ra,s_dec),{}('ICRS', {}))=1
            ORDER BY obs_id
            """.format(footprintShape, footprintVertices))

TAP_results = job.to_table()
Table length=2575
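The CONTAINS(POINT, POLYGON) condition in the query keeps only targets whose position falls inside the footprint. As a rough local illustration, the sketch below applies a planar ray-casting test to the footprint vertices. This is a simplification: the service evaluates the geometry on the sphere, and this version ignores spherical effects and RA wrap-around at 0/360 degrees. The function name point_in_polygon and the two test positions are made up for the example.

```python
def point_in_polygon(x, y, vertices):
    """Planar ray-casting test: count polygon edges crossed by a ray going in +x."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Footprint vertices (RA, Dec in degrees) from the Step 1 result
footprint_vertices = [
    (330.343765, -44.336318),
    (324.663797, -33.277719),
    (337.720578, -28.384567),
    (344.128621, -38.561643),
    (330.343765, -44.336318),
]

near_centre = point_in_polygon(334.2, -36.1, footprint_vertices)  # roughly the middle
far_away = point_in_polygon(0.0, 0.0, footprint_vertices)         # well outside
print(near_centre, far_away)
```

A check like this can be handy for spot-checking a few returned rows, but the database result remains the reference.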

The returned data is in an astropy table; you can manipulate it to do more ordering or filtering. To download individual files or the whole set, you can use the access_url column, as below.

Python’s requests library lets you download files from a URL. The downloads will appear in the directory where your notebook is running.

# Example: first result row:
single_url = TAP_results['access_url'][0]
filename = TAP_results['obs_id'][0] + "_lc.fits"
r = requests.get(single_url, allow_redirects=True)
r.raise_for_status()  # stop here if the download failed
with open(filename, 'wb') as f:
    f.write(r.content)
print('File downloaded: {} bytes'.format(len(r.content)))

# Uncomment the code below to download every file in a loop.
# Warning: this can take some time, as the sample query above returned 2575 files of roughly 2 MB each
# (see the table length above).

#for row in TAP_results:
#    single_url = row['access_url']
#    filename = row['obs_id'] + "_lc.fits"
#    r = requests.get(single_url, allow_redirects=True)
#    r.raise_for_status()
#    with open(filename, 'wb') as f:
#        f.write(r.content)
#print('All files downloaded.')
File downloaded: 2039040 bytes

If you have problems running requests, or would rather save individual files through your browser, you can simply print clickable links instead, or wrap them in curl or wget calls, whose exact form may differ depending on your operating system.

# Example: first result row:
single_url = TAP_results['access_url'][0]
print(single_url)

# Uncomment the code below to display clickable links for every file in a loop.
#for row in TAP_results:
#    print(row['access_url'])
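For the curl option, a minimal sketch of wrapping URLs in shell commands is shown here. It assumes a POSIX-style shell (quoting rules differ on Windows), and the helper name curl_command and the example.org URLs are placeholders rather than real MAST links. The curl options used are -L (follow redirects) and -o (output file name).

```python
import posixpath
import shlex
from urllib.parse import urlparse


def curl_command(url, filename=None):
    """Build a curl command line that follows redirects and saves to filename."""
    if filename is None:
        # Fall back to the last path component of the URL
        filename = posixpath.basename(urlparse(url).path) or "download.fits"
    # shlex.quote protects characters such as '?' and '&' from the shell
    return "curl -L -o {} {}".format(shlex.quote(filename), shlex.quote(url))


# Placeholder URLs, used only for illustration
simple = curl_command("https://example.org/data/tess_example_lc.fits")
with_query = curl_command("https://example.org/download?uri=a&b=c", "out.fits")
print(simple)
print(with_query)
```

Printing one such line per row of the results table gives a script that can be saved and run outside the notebook.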

Additional Resources#

Table Access Protocol#

  • IVOA standard for RESTful web service access to tabular data


Astronomical Data Query Language (2.0)#

  • IVOA standard for querying astronomical data in tabular format, with geometric search support


Common Archive Observation Model (2)#

  • IVOA standard data model whose relational representation this catalog follows



PyVO#

  • an affiliated package for astropy

  • find and retrieve astronomical data available from archives that support standard IVOA virtual observatory service protocols.


About this Notebook#

Authors: Scott Fleming & Theresa Dower, STScI Archive Scientists & Software Engineer

Last Updated: Dec 2022
