Accessing JWST Proposal Data in astroquery.mast#
Learning Goals#
This tutorial is aimed at researchers of any level looking for specific observations from a particular program ID. It will cover the basics of authentication, data search, and data downloads.
By the end of this tutorial, you will:
Know how to log in and out to access data with astroquery.
Be able to search for data based on proposal ID.
Download filtered data products from the MAST Archive.
Table of Contents#
The workflow for this notebook consists of:
Logging in/out
Searching for Data by ID
Data Products
Filtering and Downloading Data
Filtering
Downloading Directly
Downloading via Curl Script
Additional Resources
Imports#
We need to import Observations from astroquery.mast to access the MAST Archive:
from astroquery.mast import Observations
Logging in and out#
Most data in the MAST Archive is public, and can be accessed without logging in. However, some data is restricted during its ‘exclusive access period’ (EAP), during which time it is only available to the PI team. During this period, your team will need to sign in to access the data.
To begin, you should make sure that you have an authorized MyST Account.
In order to access data programmatically, we will also need to obtain an API token. To create and view tokens associated with your account, visit https://auth.mast.stsci.edu/tokens.
There are several ways to enter your token, including:
Manual response to the prompt from Observations.login() (must be done every time)
Python keyring; either through the keyring library or Observations.login
Storing it in the bash environment variable $MAST_API_TOKEN
This flexibility can be overwhelming at first; let’s take a look at some examples of these methods below.
# Option 1: Respond to prompt. Uncomment the line below
#Observations.login()
This works well for infrequent API users, but storing the token is far more convenient for repeated logins. You can store the token using keyring or use the built-in store_token flag:
# Option 2: Store Token. Uncomment the line below
#Observations.login(store_token=True)
Using store_token allows us to log in automatically, without re-entering the token, for as long as the token remains valid. Note that tokens expire after 10 days of inactivity, or 60 days after creation, whichever comes first. Once a token expires, you should use reenter_token=True to overwrite the old token with the new one.
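The two expiry rules above can be combined into a quick date calculation. The sketch below is only an illustration of the "whichever comes first" arithmetic; the function name and the dates are hypothetical, and MAST remains the authority on when a token actually lapses.

```python
from datetime import datetime, timedelta

# Lifetimes quoted in the text above.
INACTIVITY_LIMIT = timedelta(days=10)
MAX_LIFETIME = timedelta(days=60)


def token_expiry(created, last_used):
    """Return the earlier of the two expiry deadlines (hypothetical helper)."""
    return min(created + MAX_LIFETIME, last_used + INACTIVITY_LIMIT)


# Made-up dates, for illustration only.
created = datetime(2024, 1, 1)
print(token_expiry(created, last_used=datetime(2024, 1, 5)))   # inactivity rule applies
print(token_expiry(created, last_used=datetime(2024, 2, 25)))  # 60-day cap applies
```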
The third option is to store the token as the bash environment variable $MAST_API_TOKEN. This method varies from system to system; for more details, you can check out this guide (links to a non-STScI site).
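Once the variable is set, it can be worth confirming from Python that it is visible to your session. The helper below is a minimal sketch using only the standard library; read_mast_token is a hypothetical name, not part of astroquery. The commented line at the end shows how a token could be passed to Observations.login explicitly through its token argument.

```python
import os


def read_mast_token(environ=os.environ):
    """Return the token stored in MAST_API_TOKEN, or None if unset or blank."""
    token = environ.get("MAST_API_TOKEN", "").strip()
    return token or None


token = read_mast_token()
print("Token found" if token else "MAST_API_TOKEN is not set")

# With a token in hand, it can be passed explicitly:
# Observations.login(token=token)
```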
Let’s take a minute to verify that our login was successful:
session_info = Observations.session_info()
eppn:
ezid: anonymous
anon: True
scopes: []
session: None
token: None
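If you are logged in, you should see your account information above; the anonymous values shown here are what appears when no login has taken place. If you expected to be logged in, verify that your token and MyST account are active.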
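For scripts, it can be handy to test the login state rather than read it by eye. The sketch below assumes that session_info() returns a dictionary with the same keys it prints (in particular anon); the helper name is_authenticated is hypothetical, and the sample dictionary simply mirrors the anonymous output shown above.

```python
def is_authenticated(session_info):
    """Return True when the session is not anonymous.

    Assumes `session_info` is a dict with an 'anon' key, as in the
    printed output above. A missing key is treated as anonymous.
    """
    return not session_info.get("anon", True)


# Stand-in for the value returned by Observations.session_info()
anonymous_session = {
    "eppn": "",
    "ezid": "anonymous",
    "anon": True,
    "scopes": [],
    "session": None,
    "token": None,
}
print(is_authenticated(anonymous_session))

# In the notebook, this might be used as:
# if not is_authenticated(Observations.session_info(verbose=False)):
#     Observations.login()
```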
And of course, if the need arises, we can logout:
Observations.logout()
session_info = Observations.session_info()
eppn:
ezid: anonymous
anon: True
scopes: []
session: None
token: None
Searching for Data by ID#
We can use a program ID to query the MAST Archive for data. In the example below, we’ll use 2733 as the ID. This is the program that produced the stunning images of the Southern Ring Nebula!
# Let's get a list of all observations associated with this proposal
obs_list = Observations.query_criteria(proposal_id=2733)
# We can choose the columns we want to display in our table
disp_col = ['dataproduct_type','calib_level','obs_id',
'target_name','filters','proposal_pi', 'obs_collection']
obs_list[disp_col].show_in_notebook()
WARNING: AstropyDeprecationWarning: show_in_notebook() is deprecated as of 6.1 and to create
interactive tables it is recommended to use dedicated tools like:
- https://github.com/bloomberg/ipydatagrid
- https://docs.bokeh.org/en/latest/docs/user_guide/interaction/widgets.html#datatable
- https://dash.plotly.com/datatable [warnings]
idx | dataproduct_type | calib_level | obs_id | target_name | filters | proposal_pi | obs_collection |
---|---|---|---|---|---|---|---|
0 | image | 3 | STSCI_PR_2022-033 | NGC 3132 \| Southern Ring Nebula \| Eight-Burst Nebula | -- | -- | OPO |
1 | image | 3 | STSCI_PR_2022-059 | Southern Ring Nebula \| NGC 3132 | -- | -- | OPO |
2 | image | 3 | jw02733-o001_t001_nircam_clear-f187n | NGC-3132 | F187N | Pontoppidan, Klaus M. | JWST |
3 | image | 3 | jw02733-o001_t001_nircam_clear-f090w | NGC-3132 | F090W | Pontoppidan, Klaus M. | JWST |
4 | image | 3 | jw02733-o001_t001_nircam_clear-f356w | NGC-3132 | F356W | Pontoppidan, Klaus M. | JWST |
5 | image | 3 | jw02733-o001_t001_nircam_clear-f212n | NGC-3132 | F212N | Pontoppidan, Klaus M. | JWST |
6 | image | 3 | jw02733-o001_t001_nircam_f405n-f444w | NGC-3132 | F444W;F405N | Pontoppidan, Klaus M. | JWST |
7 | image | 3 | jw02733-o001_t001_nircam_f444w-f470n | NGC-3132 | F444W;F470N | Pontoppidan, Klaus M. | JWST |
8 | image | 3 | jw02733-o002_t001_miri_f1130w | NGC-3132 | F1130W | Pontoppidan, Klaus M. | JWST |
9 | image | 3 | jw02733-o002_t001_miri_f770w | NGC-3132 | F770W | Pontoppidan, Klaus M. | JWST |
10 | image | 3 | jw02733-o002_t001_miri_f1800w | NGC-3132 | F1800W | Pontoppidan, Klaus M. | JWST |
11 | image | 3 | jw02733-o002_t001_miri_f1280w | NGC-3132 | F1280W | Pontoppidan, Klaus M. | JWST |
We have limited the display columns in the above table for conciseness. For a complete list of observation fields (the columns in the above table) and their descriptions, read here.
We can verify that we have the right observations by looking at the 'proposal_pi' column above. The first two rows have no PI listed: they are press release images (the first from the Webb Science Launch), which is why they are marked as part of the “Office of Public Outreach” (OPO) collection.
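The JWST obs_id values in the table above appear to share a regular pattern: a program number, an observation number, a target number, an instrument, and the optical elements. The sketch below splits an ID into those parts. The pattern is inferred only from the IDs shown in this table, and the field names are hypothetical labels, so treat it as an illustration rather than an official parser.

```python
import re

# Pattern inferred from IDs such as 'jw02733-o001_t001_nircam_clear-f187n'.
OBS_ID_PATTERN = re.compile(
    r"^jw(?P<program>\d{5})"
    r"-o(?P<observation>\d{3})"
    r"_t(?P<target>\d{3})"
    r"_(?P<instrument>[a-z]+)"
    r"_(?P<elements>.+)$"
)


def parse_obs_id(obs_id):
    """Return the parts of a level-3 style obs_id, or None if it does not match."""
    match = OBS_ID_PATTERN.match(obs_id)
    return match.groupdict() if match else None


print(parse_obs_id("jw02733-o001_t001_nircam_clear-f187n"))
print(parse_obs_id("STSCI_PR_2022-033"))  # press release IDs do not match
```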
Data Products#
Level 3 products are the result of combining and processing multiple lower level products. These two categories are distinct; level 3 products are target-based (sometimes called source-based), while levels 2 and 1 are directly associated with an exposure. A great starting point to understand JWST files and the processing pipeline is available on the Jdox website.
For level 3 observations, it’s likely that there are many associated (levels 2 and 1) data products. Let’s take a look at how many products are associated with one of the MIRI observations from our search above.
# We explicitly select the MIRI F1130W observation by its obs_id in this cell.
mask = (obs_list['obs_id'] == 'jw02733-o002_t001_miri_f1130w')
data_products = Observations.get_product_list(obs_list[mask])
print(len(data_products))
3470
This produces over 3000 data products associated with this observation! This is not uncommon for a JWST level-3 observation. In the next section, we’ll take a look at how we can filter down the number of results before we download them.
Filtering and Downloading Data#
Filtering#
You can apply filter keyword arguments to download only data products that meet your given criteria. Available filters are “mrp_only” (minimum recommended products), “extension” (file extension), “calib_level” (calibration level), and all product fields listed here.
In this example, let’s try filtering for only the level 2, calibrated exposures. It is important that we also filter by “SCIENCE” type products; otherwise, our results will include guide star acquisition images.
filtered_prod = Observations.filter_products(data_products, calib_level=[2], productType="SCIENCE")
# Again, we choose columns of interest for convenience
disp_col = ['obsID','dataproduct_type','productFilename','size','calib_level']
filtered_prod[disp_col].show_in_notebook(display_length=10)
WARNING: AstropyDeprecationWarning: show_in_notebook() is deprecated as of 6.1 and to create
interactive tables it is recommended to use dedicated tools like:
- https://github.com/bloomberg/ipydatagrid
- https://docs.bokeh.org/en/latest/docs/user_guide/interaction/widgets.html#datatable
- https://dash.plotly.com/datatable [warnings]
idx | obsID | dataproduct_type | productFilename | size | calib_level |
---|---|---|---|---|---|
0 | 87599751 | image | jw02733002002_02103_00004_mirimage_o002_crf.fits | 29689920 | 2 |
1 | 87599751 | image | jw02733002002_02103_00004_mirimage_cal.fits | 29689920 | 2 |
2 | 87599751 | image | jw02733002002_02103_00004_mirimage_i2d.fits | 29445120 | 2 |
3 | 87599751 | image | jw02733002002_02103_00004_mirimage_rate.fits | 21191040 | 2 |
4 | 87599751 | image | jw02733002002_02103_00004_mirimage_rateints.fits | 42336000 | 2 |
5 | 87599752 | image | jw02733002002_02103_00005_mirimage_o002_crf.fits | 29689920 | 2 |
6 | 87599752 | image | jw02733002002_02103_00005_mirimage_cal.fits | 29689920 | 2 |
7 | 87599752 | image | jw02733002002_02103_00005_mirimage_i2d.fits | 29445120 | 2 |
8 | 87599752 | image | jw02733002002_02103_00005_mirimage_rate.fits | 21191040 | 2 |
9 | 87599752 | image | jw02733002002_02103_00005_mirimage_rateints.fits | 42336000 | 2 |
10 | 87599767 | image | jw02733002001_02103_00004_mirimage_o002_crf.fits | 29689920 | 2 |
11 | 87599767 | image | jw02733002001_02103_00004_mirimage_cal.fits | 29689920 | 2 |
12 | 87599767 | image | jw02733002001_02103_00004_mirimage_i2d.fits | 29445120 | 2 |
13 | 87599767 | image | jw02733002001_02103_00004_mirimage_rate.fits | 21191040 | 2 |
14 | 87599767 | image | jw02733002001_02103_00004_mirimage_rateints.fits | 42336000 | 2 |
15 | 87599771 | image | jw02733002001_02103_00001_mirimage_o002_crf.fits | 29689920 | 2 |
16 | 87599771 | image | jw02733002001_02103_00001_mirimage_cal.fits | 29689920 | 2 |
17 | 87599771 | image | jw02733002001_02103_00001_mirimage_i2d.fits | 29445120 | 2 |
18 | 87599771 | image | jw02733002001_02103_00001_mirimage_rate.fits | 21191040 | 2 |
19 | 87599771 | image | jw02733002001_02103_00001_mirimage_rateints.fits | 42336000 | 2 |
20 | 87600168 | image | jw02733002001_02103_00005_mirimage_o002_crf.fits | 29689920 | 2 |
21 | 87600168 | image | jw02733002001_02103_00005_mirimage_cal.fits | 29689920 | 2 |
22 | 87600168 | image | jw02733002001_02103_00005_mirimage_i2d.fits | 29445120 | 2 |
23 | 87600168 | image | jw02733002001_02103_00005_mirimage_rate.fits | 21191040 | 2 |
24 | 87600168 | image | jw02733002001_02103_00005_mirimage_rateints.fits | 42336000 | 2 |
25 | 87600176 | image | jw02733002002_02103_00002_mirimage_o002_crf.fits | 29689920 | 2 |
26 | 87600176 | image | jw02733002002_02103_00002_mirimage_cal.fits | 29689920 | 2 |
27 | 87600176 | image | jw02733002002_02103_00002_mirimage_i2d.fits | 29445120 | 2 |
28 | 87600176 | image | jw02733002002_02103_00002_mirimage_rate.fits | 21191040 | 2 |
29 | 87600176 | image | jw02733002002_02103_00002_mirimage_rateints.fits | 42336000 | 2 |
30 | 87600439 | image | jw02733002002_02103_00007_mirimage_o002_crf.fits | 29689920 | 2 |
31 | 87600439 | image | jw02733002002_02103_00007_mirimage_cal.fits | 29689920 | 2 |
32 | 87600439 | image | jw02733002002_02103_00007_mirimage_i2d.fits | 29445120 | 2 |
33 | 87600439 | image | jw02733002002_02103_00007_mirimage_rate.fits | 21191040 | 2 |
34 | 87600439 | image | jw02733002002_02103_00007_mirimage_rateints.fits | 42336000 | 2 |
35 | 87600443 | image | jw02733002002_02103_00006_mirimage_o002_crf.fits | 29689920 | 2 |
36 | 87600443 | image | jw02733002002_02103_00006_mirimage_cal.fits | 29689920 | 2 |
37 | 87600443 | image | jw02733002002_02103_00006_mirimage_i2d.fits | 29445120 | 2 |
38 | 87600443 | image | jw02733002002_02103_00006_mirimage_rate.fits | 21191040 | 2 |
39 | 87600443 | image | jw02733002002_02103_00006_mirimage_rateints.fits | 42336000 | 2 |
40 | 87600445 | image | jw02733002002_02103_00003_mirimage_o002_crf.fits | 29689920 | 2 |
41 | 87600445 | image | jw02733002002_02103_00003_mirimage_cal.fits | 29689920 | 2 |
42 | 87600445 | image | jw02733002002_02103_00003_mirimage_i2d.fits | 29445120 | 2 |
43 | 87600445 | image | jw02733002002_02103_00003_mirimage_rate.fits | 21191040 | 2 |
44 | 87600445 | image | jw02733002002_02103_00003_mirimage_rateints.fits | 42336000 | 2 |
45 | 87602147 | image | jw02733002001_02103_00006_mirimage_o002_crf.fits | 29689920 | 2 |
46 | 87602147 | image | jw02733002001_02103_00006_mirimage_cal.fits | 29689920 | 2 |
47 | 87602147 | image | jw02733002001_02103_00006_mirimage_i2d.fits | 29445120 | 2 |
48 | 87602147 | image | jw02733002001_02103_00006_mirimage_rate.fits | 21191040 | 2 |
49 | 87602147 | image | jw02733002001_02103_00006_mirimage_rateints.fits | 42336000 | 2 |
50 | 87602171 | image | jw02733002001_02103_00007_mirimage_o002_crf.fits | 29689920 | 2 |
51 | 87602171 | image | jw02733002001_02103_00007_mirimage_cal.fits | 29689920 | 2 |
52 | 87602171 | image | jw02733002001_02103_00007_mirimage_i2d.fits | 29445120 | 2 |
53 | 87602171 | image | jw02733002001_02103_00007_mirimage_rate.fits | 21191040 | 2 |
54 | 87602171 | image | jw02733002001_02103_00007_mirimage_rateints.fits | 42336000 | 2 |
55 | 87602190 | image | jw02733002001_02103_00008_mirimage_o002_crf.fits | 29689920 | 2 |
56 | 87602190 | image | jw02733002001_02103_00008_mirimage_cal.fits | 29689920 | 2 |
57 | 87602190 | image | jw02733002001_02103_00008_mirimage_i2d.fits | 29445120 | 2 |
58 | 87602190 | image | jw02733002001_02103_00008_mirimage_rate.fits | 21191040 | 2 |
59 | 87602190 | image | jw02733002001_02103_00008_mirimage_rateints.fits | 42336000 | 2 |
60 | 87602196 | image | jw02733002002_02103_00001_mirimage_o002_crf.fits | 29689920 | 2 |
61 | 87602196 | image | jw02733002002_02103_00001_mirimage_cal.fits | 29689920 | 2 |
62 | 87602196 | image | jw02733002002_02103_00001_mirimage_i2d.fits | 29445120 | 2 |
63 | 87602196 | image | jw02733002002_02103_00001_mirimage_rate.fits | 21191040 | 2 |
64 | 87602196 | image | jw02733002002_02103_00001_mirimage_rateints.fits | 42336000 | 2 |
65 | 87602200 | image | jw02733002001_02103_00003_mirimage_o002_crf.fits | 29689920 | 2 |
66 | 87602200 | image | jw02733002001_02103_00003_mirimage_cal.fits | 29689920 | 2 |
67 | 87602200 | image | jw02733002001_02103_00003_mirimage_i2d.fits | 29445120 | 2 |
68 | 87602200 | image | jw02733002001_02103_00003_mirimage_rate.fits | 21191040 | 2 |
69 | 87602200 | image | jw02733002001_02103_00003_mirimage_rateints.fits | 42336000 | 2 |
70 | 87602206 | image | jw02733002001_02103_00002_mirimage_o002_crf.fits | 29689920 | 2 |
71 | 87602206 | image | jw02733002001_02103_00002_mirimage_cal.fits | 29689920 | 2 |
72 | 87602206 | image | jw02733002001_02103_00002_mirimage_i2d.fits | 29445120 | 2 |
73 | 87602206 | image | jw02733002001_02103_00002_mirimage_rate.fits | 21191040 | 2 |
74 | 87602206 | image | jw02733002001_02103_00002_mirimage_rateints.fits | 42336000 | 2 |
75 | 87602208 | image | jw02733002002_02103_00008_mirimage_o002_crf.fits | 29689920 | 2 |
76 | 87602208 | image | jw02733002002_02103_00008_mirimage_cal.fits | 29689920 | 2 |
77 | 87602208 | image | jw02733002002_02103_00008_mirimage_i2d.fits | 29445120 | 2 |
78 | 87602208 | image | jw02733002002_02103_00008_mirimage_rate.fits | 21191040 | 2 |
79 | 87602208 | image | jw02733002002_02103_00008_mirimage_rateints.fits | 42336000 | 2 |
Well, that was effective! We now have 80 files, instead of over 3000.
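To make the filtering step above more concrete: the keyword filters are combined with AND logic, while a list of values for a single keyword acts as OR. The sketch below reproduces that logic in plain Python on a few stand-in rows. It is not astroquery's implementation, the helper name matches is hypothetical, and all rows except the first are invented for illustration.

```python
def matches(product, **filters):
    """Return True if `product` satisfies every filter.

    Each filter value may be a single value or a list of accepted values:
    AND across keywords, OR within a list.
    """
    for field, accepted in filters.items():
        if not isinstance(accepted, (list, tuple, set)):
            accepted = [accepted]
        if product[field] not in accepted:
            return False
    return True


# Stand-in rows; only the first filename comes from the table above.
products = [
    {"productFilename": "jw02733002002_02103_00004_mirimage_cal.fits",
     "calib_level": 2, "productType": "SCIENCE"},
    {"productFilename": "example_level3_i2d.fits",
     "calib_level": 3, "productType": "SCIENCE"},
    {"productFilename": "example_guide_star_cal.fits",
     "calib_level": 2, "productType": "AUXILIARY"},
]

kept = [p for p in products if matches(p, calib_level=[2], productType="SCIENCE")]
print([p["productFilename"] for p in kept])
```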
As a final check before we proceed to the download, let’s find the total file size of our results:
total = sum(filtered_prod['size'])
print('{:.2f} GB'.format(total/10**9))
2.44 GB
For downloads larger than a GB, it is highly recommended that you follow the steps in Downloading via Curl Script rather than attempting to download the data directly.
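The size check and the 1 GB guideline can be folded into one small helper. This is a sketch only: recommend_method and its threshold argument are hypothetical names, and the sizes are the five per-exposure file sizes from the table above, repeated for the 16 exposures listed.

```python
def recommend_method(sizes_in_bytes, threshold_bytes=10**9):
    """Return (total_bytes, suggestion) using the 1 GB guideline from the text."""
    total = sum(sizes_in_bytes)
    method = "curl script" if total > threshold_bytes else "direct download"
    return total, method


# File sizes (bytes) for one exposure, taken from the filtered table above.
per_exposure = [29689920, 29689920, 29445120, 21191040, 42336000]
total, method = recommend_method(per_exposure * 16)
print("{:.2f} GB -> {}".format(total / 10**9, method))

# In the notebook, the sizes would come from the product table:
# total, method = recommend_method(filtered_prod['size'])
```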
Downloading Data Directly#
We’ll use the filtered product list to select our downloads. This method will immediately send a request to the MAST Archive, and download the data to this notebook’s folder.
Note: As written, the cell below downloads only the first five files. This reduces download time for the purposes of the tutorial while still demonstrating a successful download.
# Don't forget to login, if accessing non-public data! You can un-comment the line below:
# Observations.login()
# You can download all of the products by removing the '[:5]' from the line below:
manifest = Observations.download_products(filtered_prod[:5])
print(manifest['Status'])
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:JWST/product/jw02733002002_02103_00004_mirimage_o002_crf.fits to ./mastDownload/JWST/jw02733002002_02103_00004_mirimage/jw02733002002_02103_00004_mirimage_o002_crf.fits ...
[Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:JWST/product/jw02733002002_02103_00004_mirimage_cal.fits to ./mastDownload/JWST/jw02733002002_02103_00004_mirimage/jw02733002002_02103_00004_mirimage_cal.fits ...
[Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:JWST/product/jw02733002002_02103_00004_mirimage_i2d.fits to ./mastDownload/JWST/jw02733002002_02103_00004_mirimage/jw02733002002_02103_00004_mirimage_i2d.fits ...
[Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:JWST/product/jw02733002002_02103_00004_mirimage_rate.fits to ./mastDownload/JWST/jw02733002002_02103_00004_mirimage/jw02733002002_02103_00004_mirimage_rate.fits ...
[Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:JWST/product/jw02733002002_02103_00004_mirimage_rateints.fits to ./mastDownload/JWST/jw02733002002_02103_00004_mirimage/jw02733002002_02103_00004_mirimage_rateints.fits ...
[Done]
Status
--------
COMPLETE
COMPLETE
COMPLETE
COMPLETE
COMPLETE
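When downloading many files, scanning the status column by eye becomes impractical. The sketch below lists any entries whose status is not COMPLETE. It relies on the manifest's 'Local Path' and 'Status' columns; the helper name is hypothetical, and the stand-in rows (including the "ERROR" entry) are invented so that the example runs without contacting the archive.

```python
def incomplete_downloads(manifest):
    """Return the local paths of manifest rows whose status is not COMPLETE."""
    return [row["Local Path"] for row in manifest if row["Status"] != "COMPLETE"]


# Stand-in manifest; in the notebook this would be the table returned by
# Observations.download_products().
example_manifest = [
    {"Local Path": "./mastDownload/JWST/example_a/example_a_cal.fits", "Status": "COMPLETE"},
    {"Local Path": "./mastDownload/JWST/example_b/example_b_cal.fits", "Status": "ERROR"},
]

for path in incomplete_downloads(example_manifest):
    print("Needs another attempt:", path)
```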
Downloading via Curl Script#
Rather than downloading the files directly, we can instead download a curl script. You can run the script at any time to download your data.
This method supports larger data volumes (and downloads more quickly!) than a traditional portal download.
manifest = Observations.download_products(filtered_prod, curl_flag=True)
Downloading URL https://mast.stsci.edu/api/v0.1/Download/bundle.sh to ./mastDownload_20241021185433.sh ...
[Done]
You can run the script in your terminal by navigating to the desired download location and typing bash [filename].sh. For Windows users, this will require cygwin or other programs that support bash scripts. You may be prompted for your API token.
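If you generate several scripts, the timestamp in the filename makes it easy to pick out the most recent one. The sketch below assumes the mastDownload_<timestamp>.sh naming seen in the output above; latest_curl_script is a hypothetical helper, and the commented lines show one way the script might then be launched from Python.

```python
from pathlib import Path


def latest_curl_script(directory="."):
    """Return the newest mastDownload_*.sh script in `directory`, or None.

    Relies on the timestamp in the filename: equal-length timestamps
    sort chronologically when sorted as text.
    """
    scripts = sorted(Path(directory).glob("mastDownload_*.sh"))
    return scripts[-1] if scripts else None


script = latest_curl_script()
print(script if script else "No curl script found in this folder")

# To run it from Python rather than the terminal:
# import subprocess
# subprocess.run(["bash", str(script)], check=True)
```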
Additional Resources#
Within the current directory, there is a companion script that unifies all of the code from this notebook. It runs in the terminal with two arguments: the program ID, and whether you should download a curl script.
For example, you might run python3 companion_script.py 2733 True to download the above data via a curl script.
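The companion script itself is not reproduced here. As a purely hypothetical sketch of how two such arguments could be read, the example below uses argparse; none of the names are taken from the actual script. One detail worth noting is the explicit conversion of the text "True" or "False", since Python's bool("False") evaluates to True.

```python
import argparse


def parse_bool(text):
    """Convert 'True'/'False' style text to a bool."""
    value = text.strip().lower()
    if value in {"true", "1", "yes"}:
        return True
    if value in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError("expected True or False, got {!r}".format(text))


def parse_args(argv=None):
    """Parse a program ID and a curl-script flag (hypothetical interface)."""
    parser = argparse.ArgumentParser(description="Fetch JWST data by program ID")
    parser.add_argument("program_id", type=int, help="e.g. 2733")
    parser.add_argument("use_curl", type=parse_bool, help="True to get a curl script")
    return parser.parse_args(argv)


args = parse_args(["2733", "True"])
print(args.program_id, args.use_curl)
```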
For additional details about astroquery.mast, see the readthedocs page.
About this Notebook#
For additional questions, comments, or feedback, please email archive@stsci.edu.
Authors: Thomas Dutkiewicz, Susan Mullally
Keywords: JWST, MAST, authentication
Last Updated: Jul 2022
Next Review: Jan 2023
Citations#
If you use astroquery for published research, please cite the authors.