Astroquery: Exploring Metadata from the James Webb Space Telescope#
Learning Goals#
By the end of this tutorial, you will:
Understand how to use the astroquery.mast module to access metadata from the James Webb Space Telescope (JWST).
Run metadata queries based on coordinates, an object name, or non-positional criteria.
Use optional search parameters to further refine query results.
Automated testing has found an error in this Notebook. The authors have been notified and are working on the issue; in the meantime, please use this as a reference only.
Table of Contents#
Introduction
Querying MAST for JWST Metadata
Setup
Optional Search Parameters
Query by Object Name
Query by Region
Query by Criteria
Additional Resources
Exercise Solutions
Introduction#
Welcome! This tutorial focuses on using the astroquery.mast module to search for metadata from the James Webb Space Telescope (JWST). Launched in December 2021, JWST is an advanced space observatory designed to observe in the infrared.
The Mikulski Archive for Space Telescopes (MAST) hosts publicly accessible data products from space telescopes like JWST. astroquery.mast provides access to a broad set of JWST metadata, including header keywords, proposal information, and observational parameters. The available metadata can also be found using the MAST JWST Search interface.
Please note that astroquery.mast.MastMissions and the MAST JWST Search API do not yet support data product downloads.
Imports#
This notebook uses the following packages:
astroquery.mast to query the MAST Archive
astropy.coordinates to assign coordinates of interest
from astroquery.mast import MastMissions
from astropy.coordinates import SkyCoord
Querying MAST for JWST Metadata#
Setup#
Before we can query JWST metadata, we need to perform some setup. First, we will instantiate an object of the MastMissions class and set its mission to 'jwst'. Its service is left at the default of 'search'.
# Create MastMissions object and assign mission to 'jwst'
missions = MastMissions(mission='jwst')
print(f'Mission: {missions.mission}')
print(f'Service: {missions.service}')
When writing queries, keyword arguments can be used to specify output characteristics (see the following section) and filter on values like instrument, exposure type, and proposal ID. The available column names for a mission are returned by the get_column_list function. Below, we will print out the name, data type, and description for the first 10 columns in JWST metadata.
# Get available columns for JWST mission
columns = missions.get_column_list()
columns[:10]
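With a long column list, a keyword search over names and descriptions can make it easier to find the column to filter on. The following is a minimal sketch, not part of astroquery: find_columns is a hypothetical helper, and it assumes each row can be indexed with 'name' and 'description', which appears to match the table shown above. The sample rows are stand-ins so the block runs offline; their descriptions are illustrative only.

```python
def find_columns(rows, keyword):
    """Return names of columns whose name or description mentions keyword.

    `rows` is any iterable of row-like objects that support row['name'] and
    row['description'] (an assumption about the column table's layout).
    """
    needle = keyword.lower()
    return [
        str(row['name'])
        for row in rows
        if needle in str(row['name']).lower()
        or needle in str(row['description']).lower()
    ]


# Stand-in rows so the sketch runs without a network connection.
# The names come from this tutorial; the descriptions are illustrative only.
sample_columns = [
    {'name': 'instrume', 'description': 'Instrument used to acquire the data'},
    {'name': 'exp_type', 'description': 'Type of exposure'},
    {'name': 'targ_ra', 'description': 'Target right ascension'},
]

print(find_columns(sample_columns, 'exposure'))
```

In a live session, passing the table returned by missions.get_column_list() in place of sample_columns should behave the same way, provided the table uses those column names.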
Optional Search Parameters#
Before we dive into the actual queries, it’s important to know how we can refine our results with optional keyword arguments. The following parameters are available:
radius: For positional searches only. Only return results within a certain distance from an object or set of coordinates. Default is 3 arcminutes.
limit: The maximum number of results to return. Default is 5000.
offset: Skip the first n results. Useful for paging through results.
select_cols: A list of columns to be returned in the response.
As we walk through different types of queries, we will see these parameters in action!
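To make the paging idea behind limit and offset concrete, here is a minimal sketch of a paging loop. The iter_pages helper and the fake_fetch stand-in are hypothetical names introduced for illustration; fake_fetch simply slices a local list so the block runs offline.

```python
def iter_pages(fetch, page_size):
    """Yield successive pages from fetch(limit=..., offset=...) until exhausted."""
    offset = 0
    while True:
        page = fetch(limit=page_size, offset=offset)
        if len(page) == 0:
            break
        yield page
        if len(page) < page_size:
            break  # A short page means there is nothing left to fetch
        offset += page_size


# Stand-in for a real query so the sketch runs offline: 12 fake dataset names.
fake_archive = [f'dataset_{i:02d}' for i in range(12)]


def fake_fetch(limit, offset):
    """Mimic a query that honors limit and offset by slicing a local list."""
    return fake_archive[offset:offset + limit]


pages = list(iter_pages(fake_fetch, page_size=5))
print([len(page) for page in pages])
```

Against the real service, fetch could be a small wrapper such as lambda limit, offset: missions.query_criteria(program=1189, limit=limit, offset=offset). That variant is untested here and assumes results come back in a stable order between calls.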
Query by Object Name#
We’ve reached our first query! We can use object names to perform metadata queries using the query_object function.
To start, let’s query for the Messier 1 object, a supernova remnant in the Taurus constellation. You may know it better as the Crab Nebula!
# Query for Messier 1 ('M1')
results = missions.query_object('M1')
# Display the first 5 results
print(f'Total number of results: {len(results)}')
results[:5]
There were 250 total results, meaning that 250 JWST datasets have target coordinates within the default search radius of the Crab Nebula. Now, let’s try refining our search a bit more.
Each dataset is associated with a celestial coordinate, given by targ_ra (right ascension) and targ_dec (declination). By default, the query returns all datasets that fall within 3 arcminutes of the object’s coordinates. Let’s set the radius parameter to 1 arcminute instead.
Say that we’re not interested in the first 4 results. We can assign offset to skip a certain number of rows.
By default, a subset of recommended columns is returned for each query. However, we can specify exactly which columns to return using the select_cols keyword argument. The ArchiveFileID column is included automatically.
# Refined query for Messier 1 ('M1')
results = missions.query_object('M1',
                                radius=1,  # Search within a 1 arcminute radius
                                offset=4,  # Skip the first 4 results
                                select_cols=['fileSetName', 'targprop', 'date_obs'])  # Select certain columns
# Display the first 5 results
print(f'Total number of results: {len(results)}')
results[:5]
Exercise 1#
Now it’s your turn! Try querying for the Whirlpool Galaxy. Search within a radius of 1 arcminute, skip the first 300 results, and select the fileSetName and opticalElements columns.
# # Query for Whirlpool Galaxy
# results = missions.query_object(...) # Write your query!
# # Display the first 5 results
# print(f'Total number of results: {len(results)}')
# results[:5]
Query by Region#
The missions object also allows us to query by a region in the sky. By passing in a set of coordinates to the query_region function, we can return datasets that fall within a certain radius value of that point. This type of search is also known as a cone search.
# Create coordinate object
coords = SkyCoord(210.80227, 54.34895, unit=('deg'))
# Query for results within 10 arcminutes of coords
results = missions.query_region(coords, radius=10)
# Display results
print(f'Total number of results: {len(results)}')
results[:5]
395 JWST datasets fall within our cone search. In other words, their target coordinates are within 10 arcminutes of the coordinates that we defined.
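To illustrate what "within 10 arcminutes" means geometrically, the sketch below computes the great-circle separation between two sky positions with the haversine formula and checks it against a radius. The function names and the nearby target are hypothetical, and treating the boundary as inclusive is an assumption rather than a statement about how MAST evaluates the cone.

```python
import math


def angular_separation_arcmin(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcminutes between two points given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for small separations
    sin_ddec = math.sin((dec2 - dec1) / 2)
    sin_dra = math.sin((ra2 - ra1) / 2)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    separation_rad = 2 * math.asin(min(1.0, math.sqrt(a)))
    return math.degrees(separation_rad) * 60


def in_cone(center, target, radius_arcmin):
    """True if target (ra, dec) lies within radius_arcmin of center (ra, dec)."""
    return angular_separation_arcmin(*center, *target) <= radius_arcmin


center = (210.80227, 54.34895)        # Coordinates used in the query above
nearby = (210.80227, 54.34895 + 0.1)  # Hypothetical target 0.1 degrees (6 arcminutes) north
print(in_cone(center, nearby, 10))
print(in_cone(center, nearby, 5))
```

When astropy is available, SkyCoord.separation performs the same calculation and handles units for us.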
Exercise 2#
JWST has observed the star Vega, which has a right ascension of 279.23473 degrees and a declination of 38.78369 degrees. Use the query_region function to search for datasets within 15 arcminutes of Vega. Select the fileSetName, targprop, targ_ra, and targ_dec columns.
# # Vega coordinates
# vega = SkyCoord(_, _, unit=('deg')) # Fill in with Vega's coordinates
# # Query for datasets around Vega
# results = missions.query_region(...) # Write your query!
# # Display the first 5 results
# print(f'Total number of results: {len(results)}')
# results[:5]
Query by Criteria#
In some cases, we may want to run queries with non-positional parameters. To accomplish this, we use the query_criteria function.
For any of our query functions, we can filter our results by the value of columns in the dataset.
Let’s say that we only want observations from JWST’s Near Infrared Camera (NIRCam) instrument, and that we only want datasets connected to program number 1189.
# Query with column criteria
results = missions.query_criteria(instrume='NIRCAM',  # From Near Infrared Camera
                                  program=1189,  # Connected to program number 1189
                                  select_cols=['fileSetName', 'instrume', 'exp_type', 'program', 'pi_name'])
# Display the first 5 results
print(f'Total number of results: {len(results)}')
results[:5]
To exclude a certain value from the results, we can prefix the value with !.
Let’s run the same query as above, but this time, we will filter out datasets coming from the NIRCam instrument.
# Filtered query, excluding NIRCam datasets
results = missions.query_criteria(program=1189,
                                  instrume='!NIRCAM',  # Exclude datasets from the NIRCam instrument
                                  select_cols=['fileSetName', 'instrume', 'exp_type', 'program', 'pi_name'])
# Display the first 5 results
print(f'Total number of results: {len(results)}')
results[:5]
We can also use wildcards for more advanced filtering. Let’s use the same query from above, but we will add an exposure type filter for fixed slits (FS) spectroscopy.
# Filtered query with wildcard
results = missions.query_criteria(program=1189,
                                  instrume='!NIRCAM',  # Exclude datasets from the NIRCam instrument
                                  exp_type='*FIXEDSLIT*',  # Any exposure type that contains 'FIXEDSLIT'
                                  select_cols=['fileSetName', 'instrume', 'exp_type', 'program', 'pi_name'])
# Display the first 10 results
print(f'Total number of results: {len(results)}')
results[:10]
To filter by multiple values for a single column, we use a string of the values delimited by commas.
To illustrate this, we will use a slightly different query. We query for datasets that have a fixed slits spectroscopy exposure type and targets with moving coordinates (targtype='MOVING'). We will add another filter to match three different last names for principal investigators (PIs).
# Filtered query with multiple values
results = missions.query_criteria(exp_type='*FIXEDSLIT*',  # Any exposure type that contains 'FIXEDSLIT'
                                  targtype='MOVING',  # Only return moving targets
                                  pi_name='Stansberry, Parker, Lunine',  # Last name of PI can be any of these 3 values
                                  select_cols=['fileSetName', 'targtype', 'instrume', 'exp_type', 'program', 'pi_name'])
# Display the first 10 results
print(f'Total number of results: {len(results)}')
results[:10]
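The string filters used so far (a leading ! for exclusion, * as a wildcard, and comma-delimited lists) can be approximated locally to build intuition. This is only a sketch of the semantics as described in this tutorial, not the MAST implementation. The matches helper is hypothetical, and two behaviors are assumptions: matching is case-sensitive, and a leading ! negates the whole list. Note that fnmatchcase also treats ? and [ ] as special characters.

```python
from fnmatch import fnmatchcase


def matches(value, criterion):
    """Approximate the string filters locally: '!', '*' wildcards, comma lists."""
    negate = criterion.startswith('!')
    if negate:
        criterion = criterion[1:]
    # Split a comma-delimited list into individual patterns
    patterns = [part.strip() for part in criterion.split(',')]
    hit = any(fnmatchcase(value, pattern) for pattern in patterns)
    return not hit if negate else hit


print(matches('NIRCAM', '!NIRCAM'))             # Excluded value
print(matches('NRS_FIXEDSLIT', '*FIXEDSLIT*'))  # Wildcard on an example exposure type
print(matches('MIRI', 'MIRI, FGS'))             # One of several allowed values
```

A helper like this can be handy for sanity-checking a filter string against a few known values before sending the query.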
For columns with numeric or date values, we can filter using comparison operators:
<: Return values less than or before the given number/date
>: Return values greater than or after the given number/date
<=: Return values less than or equal to the given number/date
>=: Return values greater than or equal to the given number/date
As an example, let’s write a query to return all datasets with an observation date before February 1, 2022.
# Query using comparison operator
results = missions.query_criteria(date_obs='<2022-02-01',  # Must be observed before February 1, 2022
                                  select_cols=['fileSetName', 'program', 'date_obs'])
# Display results
print(f'Total number of results: {len(results)}')
results
For numeric or date data types, we can also filter with ranges. This requires the following syntax: '#..#'.
Let’s write a query that uses range syntax to return datasets that belong to a program number between 1150 and 1155. We will also select for exposure durations that are greater than or equal to 100 seconds.
# Query using range operator
results = missions.query_criteria(program='1150..1155',  # Program number between 1150 and 1155
                                  duration='>=100',  # Exposure duration is greater than or equal to 100 seconds
                                  select_cols=['fileSetName', 'program', 'duration'])
# Display results
print(f'Total number of results: {len(results)}')
results
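Because comparison and range filters are plain strings, it can be convenient to build them with small helpers rather than by hand. The sketch below is one possible approach; between and compare are hypothetical names, not astroquery functions, and they only format strings in the syntax described above.

```python
def between(low, high):
    """Range filter in the '#..#' form, e.g. '1150..1155'."""
    return f'{low}..{high}'


def compare(operator, value):
    """Comparison filter such as '<2022-02-01' or '>=100'."""
    if operator not in ('<', '>', '<=', '>='):
        raise ValueError(f'Unsupported operator: {operator!r}')
    return f'{operator}{value}'


# Collect filters in a dictionary keyed by column name
criteria = {
    'program': between(1150, 1155),
    'duration': compare('>=', 100),
    'date_obs': compare('<', '2022-02-01'),
}
print(criteria)
```

The resulting dictionary could then be unpacked into a query, for example missions.query_criteria(**criteria, select_cols=['fileSetName', 'program', 'duration']). Combining these three filters is purely illustrative and has not been run against the archive.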
Exercise 3#
It’s time to apply all that you’ve learned! Write a non-positional query based on the following:
Fixed targets (HINT: targtype='FIXED')
Instrument is Mid-Infrared Instrument (MIRI) or Fine Guidance Sensor (FGS)
Proposal type should NOT include General Observers (GO)
Exposure type includes the string 'IMAGE'
Right ascension is between 70 and 75 degrees
Program number is less than 1200.
Skip the first 5 entries.
Select the following columns: fileSetName, targtype, instrume, proposal_type, exp_type, targ_ra, program
# # A non-positional query with column criteria
# results = missions.query_criteria(...) # Write your query here!
# # Display results
# print(f'Total number of results: {len(results)}')
# results
Additional Resources#
Exercise Solutions#
Exercise 1#
# Query for Whirlpool Galaxy
results = missions.query_object('Whirlpool',
                                radius=1,  # Search radius of 1 arcminute
                                offset=300,  # Skip the first 300 rows
                                select_cols=['fileSetName', 'opticalElements'])
# Display the first 5 results
print(f'Total number of results: {len(results)}')
results[:5]
Exercise 2#
# Vega coordinates
vega = SkyCoord(279.23473, 38.78369, unit=('deg'))
# Query for datasets around Vega
results = missions.query_region(vega,
                                radius=15,  # Search radius of 15 arcminutes
                                select_cols=['fileSetName', 'targprop', 'targ_ra', 'targ_dec'])
# Display the first 5 results
print(f'Total number of results: {len(results)}')
results[:5]
Exercise 3#
# A non-positional query with column criteria
results = missions.query_criteria(targtype='FIXED',  # Fixed target
                                  instrume='MIRI, FGS',  # Select MIRI or FGS observations
                                  proposal_type='!GO',  # Not from a general observer proposal
                                  exp_type='*IMAGE*',  # Contains the string "IMAGE"
                                  targ_ra='70..75',  # Between 70 and 75
                                  program='<1200',  # Less than 1200
                                  offset=5,  # Skip the first 5 results
                                  select_cols=['fileSetName', 'targtype', 'instrume', 'proposal_type',
                                               'exp_type', 'targ_ra', 'program'])
# Display results
print(f'Total number of results: {len(results)}')
results
Citations#
If you use astroquery for published research, please cite the authors. Follow these links for more information about citing astroquery:
About this Notebook#
Author(s): Sam Bianco
Keyword(s): Tutorial, JWST, Astroquery, MastMissions
First published: June 2024
Last updated: June 2024