Beginner: Searching MAST using astroquery.mast#

Introduction and Goals:#

This is a beginner tutorial on accessing the MAST Archive using the Astroquery API. We’ll cover the major search features you might find useful when querying for observations. By the end of this tutorial, you will:

  • Understand how to search for observations hosted on the MAST Archive

  • Download data products corresponding to your observations of interest

  • Create a visual display of the downloaded data

Table of Contents#

  • Imports

  • Three ways to search for MAST observations

    • By Region

    • By Object Name

    • By Criteria

  • Getting Associated Data Products

    • Performing a Product Query

    • Filtering Data Products

  • Downloading Products

  • Displaying Data

  • Further Reading

Imports#

Let’s start by importing the packages we need for this notebook.

  • astroquery.mast to access the MAST API

  • astropy to create coordinate objects and read FITS files

  • matplotlib to plot the data

from astropy.coordinates import SkyCoord
from astropy.io import fits
from astroquery.mast import Observations
from matplotlib.colors import SymLogNorm

import matplotlib.pyplot as plt

%matplotlib inline

Three Methods to Search for MAST Observations#

All three searches outlined below use astroquery.mast, an astronomer-friendly wrapper for our pure Python API. Specifically, these searches use the Observations class, which queries our entire multi-mission collection.

1. Query by Region#

To search by coordinates (and a radius), you can use Observations.query_region. The coordinates can be given as a string or astropy.coordinates object; the radius, as a string or float. If no radius is specified, the default is 0.2 degrees.

Let’s try an example search with the coordinates [290.213503, -11.800746] (arbitrarily chosen) and no radius.

# This will give a warning that the coordinates are being interpreted as an ICRS coordinate provided in degrees
obsByRegion = Observations.query_region("290.213503 -11.800746")
len(obsByRegion)
WARNING: InputWarning: Coordinate string is being interpreted as an ICRS coordinate provided in degrees. [astroquery.utils.commons]
358

Excellent! At the time of writing, this search returns 358 results. As the MAST Archive grows, this number increases; you might see a larger value. Let’s take a look at a subset of the results.

# Preview the first 3 results
obsByRegion[:3]
Table masked=True length=3
| intentType | obs_collection | provenance_name | instrument_name | project | filters | wavelength_region | target_name | target_classification | obs_id | s_ra | s_dec | dataproduct_type | proposal_pi | calib_level | t_min | t_max | t_exptime | em_min | em_max | obs_title | t_obs_release | proposal_id | proposal_type | sequence_number | s_region | jpegURL | dataURL | dataRights | mtFlag | srcDen | obsid | distance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| str7 | str5 | str9 | str13 | str4 | str4 | str16 | str18 | str4 | str78 | float64 | float64 | str10 | str19 | int64 | float64 | float64 | float64 | float64 | float64 | str63 | float64 | str6 | str3 | int64 | str115 | str120 | str116 | str6 | bool | float64 | str9 | float64 |
| science | TESS | SPOC | Photometer | TESS | TESS | Optical | TESS FFI | -- | tess-s0054-1-3 | 293.1472936457296 | -9.71249801098487 | image | Ricker, George | 3 | 59769.40067108796 | 59795.62954313657 | 475.199781 | 600.0 | 1000.0 | -- | 59828.0 | N/A | -- | 54 | POLYGON 288.297067 -16.349772 285.946199 -5.101134 297.968471 -2.757443 300.386369 -14.421799 288.297067 -16.349772 | -- | -- | PUBLIC | False | nan | 92616850 | 0.0 |
| science | TESS | SPOC | Photometer | TESS | TESS | Optical | TESS FFI | -- | tess-s0080-1-4 | 285.6629279299061 | -9.094913113849628 | image | Ricker, George | 3 | 60479.387118125 | 60505.84107659722 | 158.399925 | 600.0 | 1000.0 | -- | 60545.0 | N/A | -- | 80 | POLYGON 280.09193 -15.630142 279.064881 -3.763278 291.262336 -2.663712 292.216134 -14.120328 280.09193 -15.630142 | -- | -- | PUBLIC | False | nan | 229777516 | 0.0 |
| science | TESS | SPOC | Photometer | TESS | TESS | Optical | 98516825 | -- | tess2022190063128-s0054-0000000098516825-0227-s | 290.30057005371 | -11.6867245976452 | timeseries | Ricker, George | 3 | 59769.40350148148 | 59795.633632488425 | 120.0 | 600.0 | 1000.0 | -- | 59828.0 | G04103 | -- | 54 | CIRCLE 290.30057005 -11.6867246 0.00138889 | -- | mast:TESS/product/tess2022190063128-s0054-0000000098516825-0227-s_lc.fits | PUBLIC | False | nan | 91544945 | 507.5456440624232 |

The columns of the above table (there are many!) correspond to searchable fields in the MAST database. You can find an example of performing a search by criteria down below.

To avoid that pesky warning from the search above, we can create a coordinates object and pass it to our query. Let’s try that now, and in addition, add a radius of 1 arcsecond to this search.

# Set up our coordinates object
coord = SkyCoord(290.213503, -11.800746, unit='deg')

# Same search as above, now with a radius of 1 arcsecond
obsByRegion2 = Observations.query_region(coord, radius='1s')
len(obsByRegion2)
11

Consistency check: we expect that as the radius gets smaller, we will get fewer results. In this case, only 11 results are returned compared to 358 above.
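To put the two radii in perspective, the sketch below converts the default 0.2 degree radius to arcseconds and computes the angular separation between the search position and the light-curve target from the third row of the first preview (290.30057005371, -11.6867245976452). It uses only the standard library and the haversine formula. The helper name angular_separation_arcsec is an illustrative choice and is not part of astroquery.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for small separations
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600


# Default search radius of 0.2 degrees, expressed in arcseconds
default_radius_arcsec = 0.2 * 3600

# Search position, then the light-curve target from the first results preview
sep = angular_separation_arcsec(290.213503, -11.800746,
                                290.30057005371, -11.6867245976452)

print(f"Default radius: {default_radius_arcsec:.0f} arcsec")
print(f"Separation: {sep:.0f} arcsec")
print("Inside default cone:", sep < default_radius_arcsec)
print("Inside 1 arcsec cone:", sep < 1.0)
```

The target sits several hundred arcseconds from the search position: inside the default cone, but far outside a 1 arcsecond one. That is consistent with the light curve no longer appearing once the radius is reduced.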

Let’s take a look at the results again, this time limiting the displayed columns.

# Let's limit the number of columns we see at once
columns = ['obs_collection', 'intentType', 'instrument_name', 
           'target_name', 't_exptime', 'filters', 'dataproduct_type']

# Show the results with the above columns only
obsByRegion2[columns]
Table length=11
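| obs_collection | intentType | instrument_name | target_name | t_exptime | filters | dataproduct_type |
|---|---|---|---|---|---|---|
| str5 | str7 | str10 | str12 | float64 | str4 | str5 |
| TESS | science | Photometer | TESS FFI | 475.199781 | TESS | image |
| TESS | science | Photometer | TESS FFI | 158.399925 | TESS | image |
| PS1 | science | GPC1 | 1126.007 | 731.0 | g | image |
| PS1 | science | GPC1 | 1126.007 | 900.0 | i | image |
| PS1 | science | GPC1 | 1126.007 | 876.0 | r | image |
| PS1 | science | GPC1 | 1126.007 | 580.0 | y | image |
| PS1 | science | GPC1 | 1126.007 | 480.0 | z | image |
| HLSP | science | Photometer | TICA FFI | 475.2 | TESS | image |
| HLSP | science | Photometer | TICA FFI | 158.4 | TESS | image |
| GALEX | science | GALEX | AIS_246_1_35 | 176.0 | NUV | image |
| GALEX | science | GALEX | AIS_246_1_35 | 176.0 | FUV | image |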

This “streamlined” view is helpful: it avoids visual clutter and helps you to focus on the fields that are most relevant to your search.

2. Query by Object Name#

To search for an object by name, you can use Observations.query_object(). As before, you may optionally specify a radius.

The object name is first resolved to coordinates by a call to the Simbad and NED archives. Then, the search proceeds based on these coordinates.

obsByName = Observations.query_object("M51", radius=".005 deg")

# print total number of results; then show only the first five
print("Number of results:", len(obsByName))
obsByName[columns][:5]
Number of results: 7503
Table length=5
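| obs_collection | intentType | instrument_name | target_name | t_exptime | filters | dataproduct_type |
|---|---|---|---|---|---|---|
| str11 | str11 | str12 | str64 | float64 | str26 | str10 |
| TESS | science | Photometer | TESS FFI | 1425.599358 | TESS | image |
| TESS | science | Photometer | TESS FFI | 1425.599393 | TESS | image |
| TESS | science | Photometer | TESS FFI | 1425.599383 | TESS | image |
| TESS | science | Photometer | TESS FFI | 475.199787 | TESS | image |
| TESS | science | Photometer | TESS FFI | 158.399923 | TESS | image |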

For some special catalogs, like those used with Kepler, K2, and TESS, astroquery performs a direct lookup using the MAST catalog. It is important to include both the catalog identifier and the number when searching one of these datasets; for example, to query the TESS Input Catalog (TIC) you would use "TIC 261136679", rather than a plain "261136679".

obsByTessName = Observations.query_object("TIC 261136679", radius="1s")

# we'll use some new columns to de-clutter the table
columns = ['obs_collection', 'wavelength_region', 'provenance_name', 't_min', 't_max', 'obsid']

# print number of results and display the first five
print("Number of results:", len(obsByTessName))
obsByTessName[columns][:5]
Number of results: 430
Table length=5
| obs_collection | wavelength_region | provenance_name | t_min | t_max | obsid |
|---|---|---|---|---|---|
| str11 | str10 | str17 | float64 | float64 | str9 |
| TESS | Optical | SPOC | 58324.81304390046 | 58352.666903854166 | 60865631 |
| TESS | Optical | SPOC | 58410.41593530092 | 58436.33232072917 | 61132377 |
| TESS | Optical | SPOC | 58516.85312458334 | 58541.49921487269 | 62468248 |
| TESS | Optical | SPOC | 58596.27096596065 | 58623.37541814815 | 65153352 |
| TESS | Optical | SPOC | 58624.45954434028 | 58652.37651001157 | 65180725 |

It’s worth noting that even though we queried using the TESS Input Catalog, not all of the results are from the TESS mission. In fact, we can see exactly which missions are returned using Python’s built-in set function:

# return the unique set of missions found in this query
set(obsByTessName['obs_collection'])
{'GALEX', 'HLSP', 'SPITZER_SHA', 'SWIFT', 'TESS'}

The TIC is a catalog like any other, and the MAST Archive resolves the input name to a location on the sky. Other missions that have observed the same region will also return results.
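The set above tells you which missions appear, but not how many rows each one contributes. One way to get counts is collections.Counter from the standard library. The list below is a made-up stand-in for the obs_collection column so that the snippet runs without a query; with real results, passing obsByTessName['obs_collection'] should work the same way.

```python
from collections import Counter

# Hypothetical stand-in for obsByTessName['obs_collection'] (illustrative values only)
obs_collection = ["TESS", "TESS", "HLSP", "GALEX", "TESS", "HLSP", "SWIFT"]

counts = Counter(obs_collection)

# List the missions from most to fewest observations
for mission, n in counts.most_common():
    print(f"{mission}: {n}")
```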

To search for results that are specific to TESS, or your mission of choice, see the “Query by Other Criteria” section below.

3. Query by Other Criteria (with or without name/region) #

To search for observations based on additional parameters, you can use query_criteria(). In a sense, this is a more powerful version of the tools above: you can still search by coordinates and object name (objectname), but you can also include additional criteria. You may also run the search without specifying a name or coordinates.

To perform the search, give your criteria as keyword arguments. There are many valid criteria for MAST searches. (Note that these are columns of the results tables we saw above!) Some relevant examples are: “filters”, “t_exptime” (exposure time), “instrument_name”, “provenance_name”, and “sequence_number”.
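Two conventions described in the astroquery documentation are worth keeping in mind when writing criteria: floating-point fields such as t_exptime take a [min, max] range, and string fields accept wildcards such as *. The sketch below imitates both rules locally on a few made-up rows, purely to illustrate the semantics. The real matching happens on the MAST server; the helper names, the sample rows, and the inclusive treatment of the range bounds are all assumptions of this sketch.

```python
from fnmatch import fnmatchcase

# Made-up rows for illustration only
rows = [
    {"proposal_pi": "Mould, A.", "t_exptime": 1425.6},
    {"proposal_pi": "Ricker, George", "t_exptime": 1425.6},
    {"proposal_pi": "Mould, A.", "t_exptime": 120.0},
]


def in_range(value, bounds):
    """True if value lies within [min, max] (bounds assumed inclusive in this sketch)."""
    low, high = bounds
    return low <= value <= high


def matches(row, pi_pattern, exptime_range):
    """Apply a wildcard pattern to a string field and a range to a float field."""
    return (fnmatchcase(row["proposal_pi"], pi_pattern)
            and in_range(row["t_exptime"], exptime_range))


selected = [r for r in rows if matches(r, "Mould*", [1400, 1500])]
print(selected)
```

Only the first row satisfies both conditions: the second fails the wildcard and the third falls outside the exposure-time range.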

Let’s query for TESS Sector 9 data with an exposure time between 1400 and 1500s.

obsByCriteria = Observations.query_criteria(obs_collection=["TESS"], sequence_number=9,
                                            t_exptime=[1400, 1500])

columns = ['target_name', 's_ra', 's_dec', 't_exptime', 'obsid']
obsByCriteria[columns][:5]
Table length=5
| target_name | s_ra | s_dec | t_exptime | obsid |
|---|---|---|---|---|
| str8 | float64 | float64 | float64 | str8 |
| TESS FFI | 133.67905448025425 | -46.18239603008324 | 1425.599414 | 62896230 |
| TESS FFI | 143.74057778883315 | -36.89051396313313 | 1425.599414 | 62893522 |
| TESS FFI | 119.47374170204309 | -54.111065646216794 | 1425.599414 | 62896721 |
| TESS FFI | 148.8381670809981 | -53.640196587551905 | 1425.599414 | 62895676 |
| TESS FFI | 162.4696250638618 | -5.425652707919927 | 1425.599414 | 62892977 |

There’s no limit on the number of filters you can apply in a search. It may be an interesting exercise for the reader to go through the example below and figure out what exactly we’re searching for.

(Hint: check the field descriptions)

# Make sure to run this cell, as the data is used in the sections that follow

exByCriteria = Observations.query_criteria(obs_collection=["HLA"], s_dec=[50, 60], 
                                           calib_level=[3], proposal_pi="Mould*", 
                                           dataproduct_type="IMAGE", t_max=[49800, 49820])

columns = ['obs_collection', 'obs_id', 'target_name', 'filters', 'instrument_name', 'proposal_id']
exByCriteria[columns]
Table length=4
| obs_collection | obs_id | target_name | filters | instrument_name | proposal_id |
|---|---|---|---|---|---|
| str3 | str27 | str12 | str9 | str9 | str4 |
| HLA | hst_05766_04_wfpc2_f555w_pc | NGC5457-FLD2 | F555W | WFPC2/PC | 5766 |
| HLA | hst_05766_04_wfpc2_f555w_wf | NGC5457-FLD2 | F555W | WFPC2/WFC | 5766 |
| HLA | hst_05766_04_wfpc2_total_pc | NGC5457-FLD2 | DETECTION | WFPC2/PC | 5766 |
| HLA | hst_05766_04_wfpc2_total_wf | NGC5457-FLD2 | DETECTION | WFPC2/WFC | 5766 |
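Time fields such as t_min and t_max hold Modified Julian Dates (MJD), which is why the t_max range in the query above may look unfamiliar. As a small aid for the exercise, the sketch below converts an MJD to a calendar date with the standard library, using the fact that MJD 0 corresponds to 1858-11-17 00:00 UTC. The helper name mjd_to_datetime is illustrative; astropy.time.Time offers a more complete conversion.

```python
from datetime import datetime, timedelta

# MJD 0 corresponds to midnight on 17 November 1858
MJD_EPOCH = datetime(1858, 11, 17)


def mjd_to_datetime(mjd):
    """Convert a Modified Julian Date to a naive datetime (leap seconds ignored)."""
    return MJD_EPOCH + timedelta(days=mjd)


# The two bounds used for t_max in the query above
for mjd in (49800, 49820):
    print(mjd, "->", mjd_to_datetime(mjd).date())
```

The t_max range therefore selects observations that ended between late March and mid-April 1995.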

Getting Associated Data Products#

Each observation returned from a MAST query can have one or more associated data products. For example, a JWST observation might return an uncalibrated file, a guide-star file, and the science data you’re searching for.

Performing a Product Query#

Given an astropy table of observations, or one or more MAST product group IDs (the “obsid” column), get_product_list() returns a table containing the associated data products.

Since we already have a list of observations, we can use that as the starting point for our query. To keep it simple, let’s look at only the last observation from our search above.

# Let's select a small subset from our criteria search above
newObsList = exByCriteria[-1:]

# Now we get the list of products associated with that observation
dataProducts = Observations.get_product_list(newObsList)

# preview the first five
dataProducts[:5]
Table masked=True length=5
| obsID | obs_collection | dataproduct_type | obs_id | description | type | dataURI | productType | productGroupDescription | productSubGroupDescription | productDocumentationURL | project | prvversion | proposal_id | productFilename | size | parent_obsid | dataRights | calib_level | filters |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| str8 | str3 | str5 | str30 | str67 | str1 | str96 | str9 | str28 | str8 | str1 | str8 | str19 | str4 | str43 | int64 | str8 | str6 | int64 | str9 |
| 25153181 | HLA | image | hst_05766_04_wfpc2_f555w_wf_02 | Preview-Full | S | mast:HLA/url/cgi-bin/preview.cgi?dataset=hst_05766_04_wfpc2_f555w_wf_02 | PREVIEW | -- | -- | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_f555w_wf_02_drz.jpg | -- | 25579950 | PUBLIC | 2 | F555W |
| 25153181 | HLA | image | hst_05766_04_wfpc2_f555w_wf_02 | HLA simple fits science image | S | mast:HLA/url/cgi-bin/getdata.cgi?dataset=hst_05766_04_wfpc2_f555w_wf_02_drz.fits | SCIENCE | -- | DRZ | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_f555w_wf_02_drz.fits | 10281600 | 25579950 | PUBLIC | 2 | F555W |
| 25579950 | HLA | image | hst_05766_04_wfpc2_total_wf | Preview-Full | C | mast:HLA/url/cgi-bin/preview.cgi?dataset=hst_05766_04_wfpc2_total_wf | PREVIEW | -- | -- | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_total_wf_drz.jpg | -- | 25579950 | PUBLIC | 3 | DETECTION |
| 25579950 | HLA | image | hst_05766_04_wfpc2_total_wf | HLA DAOPHOT Catalog | C | mast:HLA/url/cgi-bin/getdata.cgi?download=1&filename=hst_05766_04_wfpc2_total_wf_daophot_trm.cat | SCIENCE | Minimum Recommended Products | DAOPHOT | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_total_wf_daophot_trm.cat | -- | 25579950 | PUBLIC | 3 | DETECTION |
| 25579950 | HLA | image | hst_05766_04_wfpc2_total_wf | HLA simple fits science image | C | mast:HLA/url/cgi-bin/getdata.cgi?dataset=hst_05766_04_wfpc2_total_wf_drz.fits | SCIENCE | Minimum Recommended Products | DRZ | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_total_wf_drz.fits | 30798720 | 25579950 | PUBLIC | 3 | DETECTION |

Filtering the Data Products#

After the data products have been retrieved, you can use filter_products() to keep only the products that meet your criteria. There are many filters for MAST products. Some examples are: mrp_only (Minimum Recommended Products) and extension (file extension).

A note on filtering: each listed filter is joined with an AND, but each option within that filter is joined with an OR. For example, the search below will return any products that are ‘science’ type AND have a calibration level of 2 OR 3.

scienceProducts = Observations.filter_products(dataProducts, productType=["SCIENCE"],
                                               calib_level=[2, 3], mrp_only=False)
scienceProducts[:5]
Table masked=True length=5
| obsID | obs_collection | dataproduct_type | obs_id | description | type | dataURI | productType | productGroupDescription | productSubGroupDescription | productDocumentationURL | project | prvversion | proposal_id | productFilename | size | parent_obsid | dataRights | calib_level | filters |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| str8 | str3 | str5 | str30 | str67 | str1 | str96 | str9 | str28 | str8 | str1 | str8 | str19 | str4 | str43 | int64 | str8 | str6 | int64 | str9 |
| 25153181 | HLA | image | hst_05766_04_wfpc2_f555w_wf_02 | HLA simple fits science image | S | mast:HLA/url/cgi-bin/getdata.cgi?dataset=hst_05766_04_wfpc2_f555w_wf_02_drz.fits | SCIENCE | -- | DRZ | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_f555w_wf_02_drz.fits | 10281600 | 25579950 | PUBLIC | 2 | F555W |
| 25579950 | HLA | image | hst_05766_04_wfpc2_total_wf | HLA DAOPHOT Catalog | C | mast:HLA/url/cgi-bin/getdata.cgi?download=1&filename=hst_05766_04_wfpc2_total_wf_daophot_trm.cat | SCIENCE | Minimum Recommended Products | DAOPHOT | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_total_wf_daophot_trm.cat | -- | 25579950 | PUBLIC | 3 | DETECTION |
| 25579950 | HLA | image | hst_05766_04_wfpc2_total_wf | HLA simple fits science image | C | mast:HLA/url/cgi-bin/getdata.cgi?dataset=hst_05766_04_wfpc2_total_wf_drz.fits | SCIENCE | Minimum Recommended Products | DRZ | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_total_wf_drz.fits | 30798720 | 25579950 | PUBLIC | 3 | DETECTION |
| 25579950 | HLA | image | hst_05766_04_wfpc2_total_wf | HLA SExtractor Catalog | C | mast:HLA/url/cgi-bin/getdata.cgi?download=1&filename=hst_05766_04_wfpc2_total_wf_sexphot_trm.cat | SCIENCE | Minimum Recommended Products | SEXPHOT | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_total_wf_sexphot_trm.cat | -- | 25579950 | PUBLIC | 3 | DETECTION |
| 24556184 | HST | image | u2ms0402t | DADS C0F file - Calibrated exposure WFPC/WFPC2/FOC/FOS/GHRS/HSP | S | mast:HST/product/u2ms0402t_c0f.fits | SCIENCE | -- | C0F | -- | CALWFPC2 | 2.5.3 (Sep 4, 2008) | 5766 | u2ms0402t_c0f.fits | 10307520 | 25579950 | PUBLIC | 2 | F555W |
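The AND/OR rule described above can be imitated in a few lines of plain Python. This is only a local sketch of the logic on made-up rows, not how filter_products() is implemented; the function name filter_rows and the sample data are hypothetical.

```python
# Made-up product rows for illustration only
products = [
    {"productType": "SCIENCE", "calib_level": 2},
    {"productType": "SCIENCE", "calib_level": 3},
    {"productType": "PREVIEW", "calib_level": 3},
    {"productType": "SCIENCE", "calib_level": 1},
]


def filter_rows(rows, **filters):
    """Keep rows that satisfy every filter (AND); each filter lists its allowed values (OR)."""
    return [
        row for row in rows
        if all(row[field] in allowed for field, allowed in filters.items())
    ]


kept = filter_rows(products, productType=["SCIENCE"], calib_level=[2, 3])
print(kept)
```

The PREVIEW row is rejected by the first filter and the level-1 row by the second, so only the two SCIENCE rows at calibration levels 2 and 3 remain.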

Downloading Products#

Passing a table of products (like the one above) to download_products() will download every product in the table. You can also pass a single obsid or a list of them, just as with get_product_list(). download_products() also allows you to filter data as you request the download. In the example below, we will only download the drizzled files (drz.fits).

Products will, by default, be downloaded into the current working directory, in a subdirectory called mastDownload. The full local file paths will have the form mastDownload/Mission/Observation ID/file.

# This is the filtered download of the scienceProducts table
manifest = Observations.download_products(scienceProducts, extension="drz.fits")

# Uncomment below for "plain" download of the scienceProducts table
# manifest = Observations.download_products(scienceProducts)
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HLA/url/cgi-bin/getdata.cgi?dataset=hst_05766_04_wfpc2_f555w_wf_02_drz.fits to ./mastDownload/HLA/hst_05766_04_wfpc2_f555w_wf_02/hst_05766_04_wfpc2_f555w_wf_02_drz.fits ...
 [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HLA/url/cgi-bin/getdata.cgi?dataset=hst_05766_04_wfpc2_total_wf_drz.fits to ./mastDownload/HLA/hst_05766_04_wfpc2_total_wf/hst_05766_04_wfpc2_total_wf_drz.fits ...
 [Done]
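The local paths printed above follow the mastDownload/Mission/Observation ID/file pattern. If you want to predict where a file will land before downloading it, that pattern can be sketched with pathlib. The helper name expected_local_path is hypothetical, and the sketch assumes the default directory layout described in this tutorial.

```python
from pathlib import PurePosixPath


def expected_local_path(mission, obs_id, filename, download_dir="."):
    """Build <download_dir>/mastDownload/<mission>/<obs_id>/<filename>."""
    return PurePosixPath(download_dir) / "mastDownload" / mission / obs_id / filename


# Values taken from the second download above
path = expected_local_path("HLA", "hst_05766_04_wfpc2_total_wf",
                           "hst_05766_04_wfpc2_total_wf_drz.fits")
print(path)
```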

Note: download_products includes caching by default. If you have downloaded the files before, they will not be downloaded again unless caching is turned off. This may cause issues if the data is updated and the filename remains the same!

manifest
Table length=2
| Local Path | Status | Message | URL |
|---|---|---|---|
| str89 | str8 | object | object |
| ./mastDownload/HLA/hst_05766_04_wfpc2_f555w_wf_02/hst_05766_04_wfpc2_f555w_wf_02_drz.fits | COMPLETE | None | None |
| ./mastDownload/HLA/hst_05766_04_wfpc2_total_wf/hst_05766_04_wfpc2_total_wf_drz.fits | COMPLETE | None | None |

The manifest returns useful information about the status of the files. You can find the local path, along with a status. This will either be COMPLETE, SKIPPED, or ERROR. If the status is ERROR, there will be additional information in the ‘Message’ column. The URL field includes a link to directly download the data.
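Before opening any files, it can be worth checking the manifest for rows that did not finish. The sketch below separates completed downloads from everything else. It runs on made-up rows stored as dictionaries so that no download is needed; since rows of an astropy Table can also be indexed by column name, the same function should work on a real manifest. The function name split_by_status and the sample rows are hypothetical.

```python
# Made-up manifest rows for illustration; a real manifest is an astropy Table
manifest_rows = [
    {"Local Path": "./mastDownload/HLA/obs_a/file_a_drz.fits",
     "Status": "COMPLETE", "Message": None},
    {"Local Path": "./mastDownload/HLA/obs_b/file_b_drz.fits",
     "Status": "ERROR", "Message": "example failure"},
]


def split_by_status(rows):
    """Return (paths that downloaded, rows with any status other than COMPLETE)."""
    ok = [r["Local Path"] for r in rows if r["Status"] == "COMPLETE"]
    problems = [r for r in rows if r["Status"] != "COMPLETE"]
    return ok, problems


ok_paths, problems = split_by_status(manifest_rows)
for row in problems:
    print(f"{row['Local Path']}: {row['Status']} ({row['Message']})")
```

Note that this sketch also reports SKIPPED rows, which may or may not need attention depending on why they were skipped.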

Displaying Data#

At this point the data is ready for analysis, and we are done querying the MAST Archive.

Below we take a look at the data files using astropy.io.fits and matplotlib.

# Get the filenames from the manifest
filename0 = manifest['Local Path'][0]
filename1 = manifest['Local Path'][1]

# Open the files using astropy.io.fits
file1 = fits.open(filename0)
file2 = fits.open(filename1)

With the files opened, we can create a plot:

# initialize the plot
f, (ax1, ax2) = plt.subplots(1, 2)

# set height and width
f.set_figheight(5)
f.set_figwidth(12)

# plot the data
ax1.imshow(file1[0].data, cmap="inferno", norm=SymLogNorm(linthresh=0.03, vmin=0, vmax=1.5))
ax2.imshow(file2['SCI'].data, cmap="inferno", norm=SymLogNorm(linthresh=0.03, vmin=-0.01, vmax=1.5))
[Figure: the two downloaded drizzled images displayed side by side, with the F555W image on the left and the total detection image on the right.]

We’ve plotted the same region of the sky, with subtle differences caused by the differing filters.

Further Reading#

astroquery.mast readthedocs

About this Notebook#

For additional questions, comments, or feedback, please email archive@stsci.edu.

Authors: Thomas Dutkiewicz, Scott Fleming
Keywords: MAST, astroquery
Latest update Mar 2025
