Beginner: Searching MAST using astroquery.mast#

Introduction and Goals:#

This is a beginner tutorial on accessing the MAST Archive using the Astroquery API. We’ll cover the major search features you might find useful when querying for observations. By the end of this tutorial, you will:

  • Understand how to search for observations hosted on the MAST Archive

  • Download data products corresponding to your observations of interest

  • Create a visual display of the downloaded data

Table of Contents#

  • Imports

  • Three ways to search for MAST observations

    • By Region

    • By Object Name

    • By Criteria

  • Getting Associated Data Products

    • Performing a Product Query

    • Filtering Data Products

  • Downloading Products

  • Displaying Data

  • Further Reading

Imports#

Let’s start by importing the packages we need for this notebook.

  • astroquery.mast to access the MAST API

  • astropy to create coordinate objects and access the data

  • matplotlib to plot the data

from astropy.coordinates import SkyCoord
from astropy.io import fits
from astroquery.mast import Observations
from matplotlib.colors import SymLogNorm

import matplotlib.pyplot as plt

%matplotlib inline

Three Ways to Search for MAST Observations#

1. By Region#

To search by coordinates (and a radius), you can use query_region.

The coordinates can be given as a string or astropy.coordinates object, and the radius as a string or float object. If no radius is specified, the default is 0.2 degrees.
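To make the idea of a search radius concrete, here is a minimal sketch of a cone test written with the standard library only. The helper names angular_separation_deg and in_cone are hypothetical and are not part of astroquery. The archive itself appears to match against observation footprints rather than single points, so treat this as an illustration of the geometry, not of how MAST implements the search.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees; all inputs are in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula: stays accurate for the small separations typical of cone searches
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def in_cone(center, point, radius_deg=0.2):
    """True if `point` (ra, dec) lies within `radius_deg` of `center` (ra, dec)."""
    return angular_separation_deg(*center, *point) <= radius_deg


center = (322.49324, 12.16683)
print(in_cone(center, (322.49324, 12.26683)))  # point 0.1 deg north of the center
print(in_cone(center, (322.49324, 12.46683)))  # point 0.3 deg north of the center
```

The default of 0.2 degrees in in_cone mirrors the default radius mentioned above; shrinking it excludes more points, which is why a smaller radius returns fewer results.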

Let’s try an example search with the coordinates (322.49324, 12.16683) and no radius.

# This will give a warning that the coordinates are being interpreted as an ICRS coordinate provided in degrees
obsByRegion = Observations.query_region("322.49324 12.16683")
len(obsByRegion)
WARNING: InputWarning: Coordinate string is being interpreted as an ICRS coordinate provided in degrees. [astroquery.utils.commons]
12978

Excellent! At the time of writing, this search returns 12978 results. As the MAST archive grows, this number increases; you might see a larger value. Let’s take a look at a subset of the results.

# Preview the first 3 results
obsByRegion[:3]
Table masked=True length=3
intentType | obs_collection | provenance_name | instrument_name | project | filters | wavelength_region | target_name | target_classification | obs_id | s_ra | s_dec | dataproduct_type | proposal_pi | calib_level | t_min | t_max | t_exptime | em_min | em_max | obs_title | t_obs_release | proposal_id | proposal_type | sequence_number | s_region | jpegURL | dataURL | dataRights | mtFlag | srcDen | obsid | distance
str11 | str11 | str31 | str13 | str12 | str27 | str16 | str33 | str56 | str92 | float64 | float64 | str10 | str25 | int64 | float64 | float64 | float64 | float64 | float64 | str106 | float64 | str20 | str12 | int64 | str17043 | str185 | str186 | str6 | bool | float64 | str9 | float64
science | TESS | SPOC | Photometer | TESS | TESS | Optical | TESS FFI | -- | tess-s0055-1-1 | 325.9351193555199 | 11.77496602082765 | image | Ricker, George | 3 | 59796.6013437963 | 59823.76825269676 | 475.199786 | 600.0 | 1000.0 | -- | 59841.0 | N/A | -- | 55 | POLYGON 329.837843 19.006512 333.753586 8.161861 322.106193 4.225655 318.039501 15.437198 329.837843 19.006512 | -- | -- | PUBLIC | False | nan | 95133321 | 0.0
science | TESS | SPOC | Photometer | TESS | TESS | Optical | TESS FFI | -- | tess-s0082-1-2 | 317.17154347395564 | 10.091041157822387 | image | Ricker, George | 3 | 60532.891556863426 | 60558.749996122686 | 158.399927 | 600.0 | 1000.0 | -- | 60613.0 | N/A | -- | 82 | POLYGON 320.550581 17.936678 325.068597 6.875186 313.680238 2.326972 309.407072 13.006235 320.550581 17.936678 | -- | -- | PUBLIC | False | nan | 232881350 | 0.0
science | TESS | SPOC | Photometer | TESS | TESS | Optical | 96703881 | -- | tess2022217014003-s0055-0000000096703881-0242-s | 322.493171 | 12.166862 | timeseries | Ricker, George | 3 | 59796.60416888889 | 59823.76815293982 | 120.0 | 600.0 | 1000.0 | -- | 59841.0 | G04046 | -- | 55 | CIRCLE 322.493171 12.166862 0.00138889 | -- | mast:TESS/product/tess2022217014003-s0055-0000000096703881-0242-s_lc.fits | PUBLIC | False | nan | 93770500 | 0.0

The columns of the above table (there are many!) correspond to searchable fields in the MAST database. You can find the full list of criteria here and an example of performing a search by criteria down below.

If we want to avoid that pesky warning from the search above we can create a coordinates object and pass it to our search. Let’s try that now, and in addition, let’s add a radius of 1 arcsecond to this search.

# Set up our coordinates object
coord = SkyCoord(290.213503, -11.800746, unit='deg')

# Same kind of search as above, at these coordinates and with a radius of 1 arcsecond
obsByRegion2 = Observations.query_region(coord, radius='1s')
len(obsByRegion2)
11

Sanity check: we expect that as the radius gets smaller, we get fewer results. In this case, only 11 results are returned.

Let’s take a look at the results again, but let’s limit the columns that get displayed.

# Let's limit the number of columns we see at once
columns = ['obs_collection', 'intentType', 'instrument_name', 
           'target_name', 't_exptime', 'filters', 'dataproduct_type']

# Show the results with the above columns only
obsByRegion2[columns]
Table length=11
obs_collection | intentType | instrument_name | target_name | t_exptime | filters | dataproduct_type
str5 | str7 | str10 | str12 | float64 | str4 | str5
TESS | science | Photometer | TESS FFI | 475.199781 | TESS | image
TESS | science | Photometer | TESS FFI | 158.399925 | TESS | image
PS1 | science | GPC1 | 1126.007 | 731.0 | g | image
PS1 | science | GPC1 | 1126.007 | 900.0 | i | image
PS1 | science | GPC1 | 1126.007 | 876.0 | r | image
PS1 | science | GPC1 | 1126.007 | 580.0 | y | image
PS1 | science | GPC1 | 1126.007 | 480.0 | z | image
HLSP | science | Photometer | TICA FFI | 475.2 | TESS | image
HLSP | science | Photometer | TICA FFI | 158.4 | TESS | image
GALEX | science | GALEX | AIS_246_1_35 | 176.0 | NUV | image
GALEX | science | GALEX | AIS_246_1_35 | 176.0 | FUV | image

This “streamlined” view is helpful: it avoids visual clutter and helps you to focus on the fields that are most relevant to your search.

2. By Object Name#

To search for an object by name, you can use the query_object function. This function also optionally allows you to specify a radius.

The object name is first resolved to coordinates by a call to the Simbad and NED archives. Then, the search proceeds based on these coordinates.

obsByName = Observations.query_object("M51", radius=".005 deg")
print("Number of results:", len(obsByName))

obsByName[:5]
Number of results: 7503
Table masked=True length=5
intentType | obs_collection | provenance_name | instrument_name | project | filters | wavelength_region | target_name | target_classification | obs_id | s_ra | s_dec | dataproduct_type | proposal_pi | calib_level | t_min | t_max | t_exptime | em_min | em_max | obs_title | t_obs_release | proposal_id | proposal_type | sequence_number | s_region | jpegURL | dataURL | dataRights | mtFlag | srcDen | obsid | distance
str11 | str11 | str31 | str12 | str13 | str26 | str19 | str64 | str49 | str53 | float64 | float64 | str10 | str23 | int64 | float64 | float64 | float64 | float64 | float64 | str107 | float64 | str16 | str13 | int64 | str17792 | str175 | str176 | str6 | bool | float64 | str9 | float64
science | TESS | SPOC | Photometer | TESS | TESS | Optical | TESS FFI | -- | tess-s0016-4-3 | 200.6066940322731 | 50.01251405441704 | image | Ricker, George | 3 | 58738.14190212 | 58762.80885051 | 1425.599358 | 600.0 | 1000.0 | -- | 58782.3333334 | N/A | -- | 16 | POLYGON 188.96005900 47.31461100 195.80207500 58.03466100 213.76319300 51.45788000 203.63432700 41.78187500 188.96005900 47.31461100 | -- | -- | PUBLIC | False | nan | 27545566 | 0.0
science | TESS | SPOC | Photometer | TESS | TESS | Optical | TESS FFI | -- | tess-s0022-2-1 | 198.24826184396446 | 45.21473093959813 | image | Ricker, George | 3 | 58898.81559378 | 58925.98257992 | 1425.599393 | 600.0 | 1000.0 | -- | 58945.0 | N/A | -- | 22 | POLYGON 209.81203800 45.13366400 199.37165100 36.75577700 186.47721200 44.07435900 197.22646800 53.68842400 209.81203800 45.13366400 | -- | -- | PUBLIC | False | nan | 27347170 | 0.0
science | TESS | SPOC | Photometer | TESS | TESS | Optical | TESS FFI | -- | tess-s0023-2-2 | 206.11364361798434 | 41.534991636349474 | image | Ricker, George | 3 | 58927.60756876 | 58954.37813731 | 1425.599383 | 600.0 | 1000.0 | -- | 58973.0 | N/A | -- | 23 | POLYGON 217.19928400 43.86714800 209.71103500 33.48941000 195.67778400 38.51512800 201.79979000 49.12765700 217.19928400 43.86714800 | -- | -- | PUBLIC | False | nan | 27386992 | 0.0
science | TESS | SPOC | Photometer | TESS | TESS | Optical | TESS FFI | -- | tess-s0049-2-1 | 204.06913817292593 | 42.4477011962157 | image | Ricker, George | 3 | 59636.975013854164 | 59663.81560847222 | 475.199787 | 600.0 | 1000.0 | -- | 59695.0 | N/A | -- | 49 | POLYGON 215.129288 42.6299 205.417683 34.012968 192.868429 41.149492 202.762088 50.900239 215.129288 42.6299 | -- | -- | PUBLIC | False | nan | 83163378 | 0.0
science | TESS | SPOC | Photometer | TESS | TESS | Optical | TESS FFI | -- | tess-s0076-1-1 | 209.07126289767237 | 46.689635318292865 | image | Ricker, George | 3 | 60367.19414883102 | 60394.280102627316 | 158.399923 | 600.0 | 1000.0 | -- | 60438.0 | N/A | -- | 76 | POLYGON 220.895636 46.249072 209.783858 38.201258 196.899205 45.853449 208.607539 55.186936 220.895636 46.249072 | -- | -- | PUBLIC | False | nan | 214650609 | 0.0

For some special catalogs, like those used with Kepler, K2, and TESS, astroquery performs a direct lookup using the MAST catalog. It is important to include both the catalog identifier and the number when searching one of these catalogs; for example, to query the TESS Input Catalog you would use “TIC 261136679”, rather than a plain “261136679”.

obsByTessName = Observations.query_object("TIC 261136679", radius="1s")
# As before, we use columns to limit the display
columns = ['obs_collection', 'wavelength_region', 'provenance_name', 't_min', 't_max', 'obsid']
obsByTessName[columns][:10]
Table length=10
obs_collection | wavelength_region | provenance_name | t_min | t_max | obsid
str11 | str10 | str17 | float64 | float64 | str9
TESS | Optical | SPOC | 58324.81304390046 | 58352.666903854166 | 60865631
TESS | Optical | SPOC | 58410.41593530092 | 58436.33232072917 | 61132377
TESS | Optical | SPOC | 58516.85312458334 | 58541.49921487269 | 62468248
TESS | Optical | SPOC | 58596.27096596065 | 58623.37541814815 | 65153352
TESS | Optical | SPOC | 58624.45954434028 | 58652.37651001157 | 65180725
TESS | Optical | SPOC | 58653.41816987268 | 58681.85536055556 | 65462606
TESS | Optical | SPOC | 59035.7791077 | 59060.13994723 | 27797249
TESS | Optical | SPOC | 59061.34744424 | 59086.59716169 | 27844486
TESS | Optical | SPOC | 59144.0128561 | 59169.44312553 | 28131771
TESS | Optical | SPOC | 59228.2486492 | 59253.56140963 | 28364165

It’s worth noting that even though we queried using the TESS Input Catalog, not all of the results are from the TESS mission. The TIC is a catalog like any other, and the MAST Archive resolves the input name to a location on the sky. Other missions that have observed the same region will also return results.

To search for results that are specific to TESS, or your mission of choice, see the ‘by other criteria’ section below.
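The t_min and t_max values in the tables above are Modified Julian Dates (MJD), which are not easy to read at a glance. As a minimal sketch, the hypothetical helper mjd_to_datetime below converts an MJD to a calendar date using only the standard library. It ignores leap seconds and time-scale subtleties, so for precise work a dedicated tool such as astropy.time.Time is the better choice.

```python
from datetime import datetime, timedelta

# MJD 0 corresponds to 1858-11-17 00:00, by definition of the Modified Julian Date
MJD_EPOCH = datetime(1858, 11, 17)


def mjd_to_datetime(mjd):
    """Convert a Modified Julian Date to a naive datetime (leap seconds ignored)."""
    return MJD_EPOCH + timedelta(days=mjd)


# Rounded t_min and t_max of the first TESS row in the table above
print(mjd_to_datetime(58324.813).date())
print(mjd_to_datetime(58352.667).date())
```

The same conversion is handy when choosing t_min or t_max ranges for a criteria search, since those are also expressed in MJD.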

3. By Other Criteria (with or without name/region)#

To search for observations based on additional parameters, you can use query_criteria. In a sense, this is a more powerful version of the tools above: you can still search by coordinates and objectname, but you can also include additional criteria. You may also run the search without specifying a name or coordinates.

To perform the search, give your criteria as keyword arguments. Valid criteria and their descriptions are listed here. (Note that these are the columns of the results tables we saw above!) Some relevant examples are: “filters”, “t_exptime” (exposure time), “instrument_name”, “provenance_name”, and “sequence_number”.
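To illustrate how keyword criteria can be read, here is a small local sketch that applies them to a plain dictionary. It assumes, based on my reading of the astroquery documentation, that continuous fields take a [min, max] range, that other fields take a single value or a list of accepted values, and that "*" acts as a wildcard in strings. The row_matches helper, the CONTINUOUS_FIELDS set, and the inclusive bounds are all assumptions made for illustration, not the archive's actual matching logic.

```python
from fnmatch import fnmatchcase

# Hypothetical set of "continuous" fields, matched as [min, max] ranges;
# every other field is matched against one value or a list of accepted values.
CONTINUOUS_FIELDS = {"t_exptime", "s_dec", "t_max"}


def row_matches(row, **criteria):
    """Return True if `row` (a dict) satisfies every criterion."""
    for field, wanted in criteria.items():
        value = row[field]
        if field in CONTINUOUS_FIELDS:
            low, high = wanted
            ok = low <= value <= high  # inclusive bounds: an assumption
        else:
            options = wanted if isinstance(wanted, (list, tuple)) else [wanted]
            # "*" wildcards only make a difference for string values
            ok = any(fnmatchcase(str(value), str(opt)) for opt in options)
        if not ok:
            return False
    return True


row = {"obs_collection": "TESS", "sequence_number": 9, "t_exptime": 1425.599414}
print(row_matches(row, obs_collection=["TESS"], sequence_number=9, t_exptime=[1400, 1500]))
print(row_matches(row, obs_collection=["TESS"], t_exptime=[100, 200]))
```

In a real search the matching happens on the MAST servers; the point here is only how a range differs from a list of accepted values.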

Let’s search for TESS Sector 9 data with an exposure time between 1400 and 1500 s.

obsByCriteria = Observations.query_criteria(obs_collection=["TESS"], sequence_number=9,
                                            t_exptime=[1400, 1500])
columns = ['target_name', 's_ra', 's_dec', 't_exptime', 'obsid']
obsByCriteria[columns][:5]
Table length=5
target_name | s_ra | s_dec | t_exptime | obsid
str8 | float64 | float64 | float64 | str8
TESS FFI | 148.8381670809981 | -53.640196587551905 | 1425.599414 | 62895676
TESS FFI | 162.4696250638618 | -5.425652707919927 | 1425.599414 | 62892977
TESS FFI | 151.37494470193366 | -26.74595469529746 | 1425.599414 | 62895165
TESS FFI | 173.83652057983417 | -10.27752013936934 | 1425.599414 | 62892431
TESS FFI | 169.0707180032819 | -21.387226979894706 | 1425.599414 | 62891912

There’s no limit on the number of filters you can apply in a search. It may be an interesting exercise for the reader to go through the example below and figure out what exactly we’re searching for.

(Hint: check the field descriptions)

# Make sure to run this cell, as the data is used in the sections that follow

exByCriteria = Observations.query_criteria(obs_collection=["HLA"], s_dec=[50, 60], 
                                           calib_level=[3], proposal_pi="Mould*", 
                                           dataproduct_type="IMAGE", t_max=[49800, 49820])
columns = ['obs_collection', 'obs_id', 'target_name', 'filters', 'instrument_name', 'proposal_id']
exByCriteria[columns]
Table length=4
obs_collection | obs_id | target_name | filters | instrument_name | proposal_id
str3 | str27 | str12 | str9 | str9 | str4
HLA | hst_05766_04_wfpc2_f555w_pc | NGC5457-FLD2 | F555W | WFPC2/PC | 5766
HLA | hst_05766_04_wfpc2_f555w_wf | NGC5457-FLD2 | F555W | WFPC2/WFC | 5766
HLA | hst_05766_04_wfpc2_total_pc | NGC5457-FLD2 | DETECTION | WFPC2/PC | 5766
HLA | hst_05766_04_wfpc2_total_wf | NGC5457-FLD2 | DETECTION | WFPC2/WFC | 5766

Getting Associated Data Products#

Performing a Product Query#

Each observation returned from a MAST query can have one or more associated data products. For example, a JWST observation might return an uncalibrated file, a guide-star file, and the science data you’re searching for.

You can input a table of observations or a list of observation IDs (“obsid”), and get_product_list will return a table containing the associated data products.

Since we already have a list of observations, we can use that as the starting point for our query. To keep it simple, let’s look at only the last observation from our search above.

# Let's select a small subset from our criteria search above
newObsList = exByCriteria[-1:]

# Now we get the list of products associated with that observation
dataProducts = Observations.get_product_list(newObsList)
dataProducts[:5]
Table masked=True length=5
obsID | obs_collection | dataproduct_type | obs_id | description | type | dataURI | productType | productGroupDescription | productSubGroupDescription | productDocumentationURL | project | prvversion | proposal_id | productFilename | size | parent_obsid | dataRights | calib_level | filters
str8 | str3 | str5 | str30 | str67 | str1 | str96 | str9 | str28 | str8 | str1 | str8 | str19 | str4 | str43 | int64 | str8 | str6 | int64 | str9
25153181 | HLA | image | hst_05766_04_wfpc2_f555w_wf_02 | Preview-Full | S | mast:HLA/url/cgi-bin/preview.cgi?dataset=hst_05766_04_wfpc2_f555w_wf_02 | PREVIEW | -- | -- | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_f555w_wf_02_drz.jpg | -- | 25579950 | PUBLIC | 2 | F555W
25153181 | HLA | image | hst_05766_04_wfpc2_f555w_wf_02 | HLA simple fits science image | S | mast:HLA/url/cgi-bin/getdata.cgi?dataset=hst_05766_04_wfpc2_f555w_wf_02_drz.fits | SCIENCE | -- | DRZ | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_f555w_wf_02_drz.fits | 10281600 | 25579950 | PUBLIC | 2 | F555W
25579950 | HLA | image | hst_05766_04_wfpc2_total_wf | Preview-Full | C | mast:HLA/url/cgi-bin/preview.cgi?dataset=hst_05766_04_wfpc2_total_wf | PREVIEW | -- | -- | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_total_wf_drz.jpg | -- | 25579950 | PUBLIC | 3 | DETECTION
25579950 | HLA | image | hst_05766_04_wfpc2_total_wf | HLA DAOPHOT Catalog | C | mast:HLA/url/cgi-bin/getdata.cgi?download=1&filename=hst_05766_04_wfpc2_total_wf_daophot_trm.cat | SCIENCE | Minimum Recommended Products | DAOPHOT | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_total_wf_daophot_trm.cat | -- | 25579950 | PUBLIC | 3 | DETECTION
25579950 | HLA | image | hst_05766_04_wfpc2_total_wf | HLA simple fits science image | C | mast:HLA/url/cgi-bin/getdata.cgi?dataset=hst_05766_04_wfpc2_total_wf_drz.fits | SCIENCE | Minimum Recommended Products | DRZ | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_total_wf_drz.fits | 30798720 | 25579950 | PUBLIC | 3 | DETECTION

Filtering the Data Products#

After the data products have been retrieved, you can use filter_products to keep only the data products that meet your given criteria. Available filters are listed here. Some examples are: “mrp_only” (Minimum Recommended Products) and “extension” (file extension).

A note on filtering: each listed filter is joined with an AND, but each option within that filter is joined with an OR. For example, the search below will return any products that are ‘science’ type and have a calibration level of 2 or 3.
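The AND/OR rule can be sketched locally on a list of dictionaries. The filter_rows helper and the trimmed product rows below are hypothetical and only mimic the behavior described above; they are not the filter_products implementation.

```python
def filter_rows(rows, **filters):
    """Keep rows matching every filter (AND); a filter matches if any option does (OR)."""
    kept = []
    for row in rows:
        if all(row[field] in options for field, options in filters.items()):
            kept.append(row)
    return kept


# Hypothetical, heavily trimmed product rows
products = [
    {"productFilename": "a_drz.jpg", "productType": "PREVIEW", "calib_level": 3},
    {"productFilename": "a_drz.fits", "productType": "SCIENCE", "calib_level": 3},
    {"productFilename": "b_c0f.fits", "productType": "SCIENCE", "calib_level": 2},
    {"productFilename": "b_raw.fits", "productType": "SCIENCE", "calib_level": 1},
]

science = filter_rows(products, productType=["SCIENCE"], calib_level=[2, 3])
print([row["productFilename"] for row in science])
```

The preview is dropped because it fails the productType filter, and the level 1 file is dropped because it fails the calib_level filter, even though each passes the other one.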

scienceProducts = Observations.filter_products(dataProducts, productType=["SCIENCE"],
                                               calib_level=[2, 3], mrp_only=False)
scienceProducts[:5]
Table masked=True length=5
obsID | obs_collection | dataproduct_type | obs_id | description | type | dataURI | productType | productGroupDescription | productSubGroupDescription | productDocumentationURL | project | prvversion | proposal_id | productFilename | size | parent_obsid | dataRights | calib_level | filters
str8 | str3 | str5 | str30 | str67 | str1 | str96 | str9 | str28 | str8 | str1 | str8 | str19 | str4 | str43 | int64 | str8 | str6 | int64 | str9
25153181 | HLA | image | hst_05766_04_wfpc2_f555w_wf_02 | HLA simple fits science image | S | mast:HLA/url/cgi-bin/getdata.cgi?dataset=hst_05766_04_wfpc2_f555w_wf_02_drz.fits | SCIENCE | -- | DRZ | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_f555w_wf_02_drz.fits | 10281600 | 25579950 | PUBLIC | 2 | F555W
25579950 | HLA | image | hst_05766_04_wfpc2_total_wf | HLA DAOPHOT Catalog | C | mast:HLA/url/cgi-bin/getdata.cgi?download=1&filename=hst_05766_04_wfpc2_total_wf_daophot_trm.cat | SCIENCE | Minimum Recommended Products | DAOPHOT | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_total_wf_daophot_trm.cat | -- | 25579950 | PUBLIC | 3 | DETECTION
25579950 | HLA | image | hst_05766_04_wfpc2_total_wf | HLA simple fits science image | C | mast:HLA/url/cgi-bin/getdata.cgi?dataset=hst_05766_04_wfpc2_total_wf_drz.fits | SCIENCE | Minimum Recommended Products | DRZ | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_total_wf_drz.fits | 30798720 | 25579950 | PUBLIC | 3 | DETECTION
25579950 | HLA | image | hst_05766_04_wfpc2_total_wf | HLA SExtractor Catalog | C | mast:HLA/url/cgi-bin/getdata.cgi?download=1&filename=hst_05766_04_wfpc2_total_wf_sexphot_trm.cat | SCIENCE | Minimum Recommended Products | SEXPHOT | -- | HLA | -- | 5766 | hst_05766_04_wfpc2_total_wf_sexphot_trm.cat | -- | 25579950 | PUBLIC | 3 | DETECTION
24556184 | HST | image | u2ms0402t | DADS C0F file - Calibrated exposure WFPC/WFPC2/FOC/FOS/GHRS/HSP | S | mast:HST/product/u2ms0402t_c0f.fits | SCIENCE | -- | C0F | -- | CALWFPC2 | 2.5.3 (Sep 4, 2008) | 5766 | u2ms0402t_c0f.fits | 10307520 | 25579950 | PUBLIC | 2 | F555W

Downloading Products#

Passing a table of products (like the one above) to download_products will download every product in the table. You can also pass a list of observation IDs (obsid) if you know them.

download_products also allows you to filter data as you request the download. In the example below, we will only download the drizzled files (drz.fits).

Products will by default be downloaded into the current working directory, in a subdirectory called “mastDownload.”
The full local file paths will have the form “mastDownload/Mission/Observation ID/file.”
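As a small sketch of that directory layout, the hypothetical helper expected_local_path below assembles the path where a product should land. It only mirrors the pattern described above, so always take the authoritative path from the download manifest.

```python
from pathlib import PurePosixPath


def expected_local_path(mission, obs_id, filename, download_dir="."):
    """Build <download_dir>/mastDownload/<mission>/<obs_id>/<filename>."""
    return PurePosixPath(download_dir) / "mastDownload" / mission / obs_id / filename


path = expected_local_path("HLA", "hst_05766_04_wfpc2_total_wf",
                           "hst_05766_04_wfpc2_total_wf_drz.fits")
print(path)
```

PurePosixPath is used so the result prints with forward slashes on every platform; pathlib.Path would be the natural choice when touching the real file system.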

# This is the filtered download of the scienceProducts table
manifest = Observations.download_products(scienceProducts, extension="drz.fits")

# Uncomment below for "plain" download of the scienceProducts table
# manifest = Observations.download_products(scienceProducts)
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HLA/url/cgi-bin/getdata.cgi?dataset=hst_05766_04_wfpc2_f555w_wf_02_drz.fits to ./mastDownload/HLA/hst_05766_04_wfpc2_f555w_wf_02/hst_05766_04_wfpc2_f555w_wf_02_drz.fits ...
 [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HLA/url/cgi-bin/getdata.cgi?dataset=hst_05766_04_wfpc2_total_wf_drz.fits to ./mastDownload/HLA/hst_05766_04_wfpc2_total_wf/hst_05766_04_wfpc2_total_wf_drz.fits ...
 [Done]

Note: download_products includes caching by default. If you have downloaded the files before, they will not be downloaded again unless caching is turned off. This may cause issues if the data is updated and the filename remains the same!

manifest
Table length=2
Local Path | Status | Message | URL
str89 | str8 | object | object
./mastDownload/HLA/hst_05766_04_wfpc2_f555w_wf_02/hst_05766_04_wfpc2_f555w_wf_02_drz.fits | COMPLETE | None | None
./mastDownload/HLA/hst_05766_04_wfpc2_total_wf/hst_05766_04_wfpc2_total_wf_drz.fits | COMPLETE | None | None

The manifest returns useful information about the status of the files. You can find the local path, along with a status. This will either be COMPLETE, SKIPPED, or ERROR. If the status is ERROR, there will be additional information in the ‘Message’ column. The URL field includes a link to directly download the data.
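When many files are downloaded, it can help to group the manifest by status before moving on. The sketch below does this for plain dictionaries; group_by_status and the example rows are hypothetical, and only the column names are taken from the manifest shown above. Rows of an astropy Table can be indexed by column name in the same way, so the same loop should also work on the real manifest.

```python
from collections import defaultdict


def group_by_status(manifest_rows):
    """Map each status (e.g. COMPLETE, SKIPPED, ERROR) to the local paths that have it."""
    groups = defaultdict(list)
    for row in manifest_rows:
        groups[row["Status"]].append(row["Local Path"])
    return dict(groups)


# Hypothetical manifest rows; only the column names match the real manifest
rows = [
    {"Local Path": "./mastDownload/HLA/obs_a/a_drz.fits", "Status": "COMPLETE", "Message": None},
    {"Local Path": "./mastDownload/HLA/obs_b/b_drz.fits", "Status": "ERROR", "Message": "example failure"},
]

groups = group_by_status(rows)
failed = groups.get("ERROR", [])
print(f"{len(groups.get('COMPLETE', []))} complete, {len(failed)} failed")
```

Checking the failed list before opening any files avoids confusing errors later in the analysis.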

Displaying Data#

At this point the data is ready for analysis, and we are done querying the MAST Archive.

Below we take a look at the data files using astropy.io.fits and matplotlib.

# Get the filenames from the manifest
filename0 = manifest['Local Path'][0]
filename1 = manifest['Local Path'][1]

# Open the files using astropy.io.fits
file1 = fits.open(filename0)
file2 = fits.open(filename1)
f, (ax1, ax2) = plt.subplots(1, 2)
f.set_figheight(5)
f.set_figwidth(12)
ax1.imshow(file1[0].data, cmap="inferno", norm=SymLogNorm(linthresh=0.03, vmin=0, vmax=1.5))
ax2.imshow(file2['SCI'].data, cmap="inferno", norm=SymLogNorm(linthresh=0.03, vmin=-0.01, vmax=1.5))
[Figure: the two downloaded drizzled images displayed side by side]

We see the same region of the sky, with subtle differences caused by the differing filters.

Further Reading#

Full documentation on astroquery.mast can be found here.

About this Notebook#

For additional questions, comments, or feedback, please email archive@stsci.edu.

Authors: Thomas Dutkiewicz, Scott Fleming
Keywords: MAST, astroquery
Latest update Oct 2022
Next Review: Apr 2023
