Choosing how to access the data

This Notebook explains three methods of accessing COS data hosted by the STScI Mikulski Archive for Space Telescopes (MAST). You may read through all three, or you may wish to focus on the method which best suits your needs. Use the table below to decide which section to focus on.

| | The Classic HST Search (Web Interface) | The MAST Portal (Web Interface) | The Astroquery (Python Interface) |
|---|---|---|---|
| | - User-friendly point-and-click searching | - Very user-friendly point-and-click searching | - Requires a bit of Python experience |
| | - Advanced mission-specific search parameters, including: central wavelength, detector, etc. | - Lacks some mission-specific search parameters | - Allows for programmatic searching and downloads |
| | - Can be difficult to download the data if not on the STScI network | - Easy to download selected data | - Best for large datasets |
| Use this method if... | ...You're unfamiliar with Python and need to search for data by cenwave | ...You're exploring the data and you don't need to search by cenwave | ...You know Python and have an idea of what data you're looking for, or you have a lot of data |
| Described in... | Section 1.1 | Section 1.3 | Section 2.1 |

Note that these are only recommendations, and you may prefer another option. For most purposes, the writer of this tutorial recommends the Astroquery Python interface, unless you are not at all comfortable using Python or are doing purely exploratory work.

0. Introduction

The Cosmic Origins Spectrograph (COS) is an ultraviolet spectrograph onboard the Hubble Space Telescope (HST) with capabilities in the near ultraviolet (NUV) and far ultraviolet (FUV).

This tutorial aims to prepare you to access the existing COS data of your choice by walking you through downloading a processed spectrum, as well as various calibration files obtained with COS.

We will define a few directories in which to place our data.

To create new directories, we'll import pathlib.Path:

In [1]:
#Import for: working with system paths
from pathlib import Path

# This will be an important directory for the Notebook, where we save data
data_dir = Path('./data/')
data_dir.mkdir(exist_ok=True)
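
If you would like to keep files from different download methods apart, the same pattern extends to nested directories. The following is only a minimal sketch: the subdirectory names (web_downloads, astroquery_downloads) are hypothetical and are not used elsewhere in this Notebook.

```python
from pathlib import Path

# Base data directory, matching the one defined above
data_dir = Path('./data/')

# Hypothetical subdirectory names, used here only for illustration
for name in ('web_downloads', 'astroquery_downloads'):
    subdir = data_dir / name
    # parents=True also creates data_dir itself if it is missing;
    # exist_ok=True makes the call safe to re-run
    subdir.mkdir(parents=True, exist_ok=True)
```

Passing parents=True means the order in which the directories are created does not matter, which is convenient when cells are run out of order.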

1. Downloading the data through the browser interface

One can search for COS data from both a browser-based Graphical User Interface (GUI) and a Python interface. This Section (1) will examine two web interfaces. Section 2 will explain the Python interface.

Note, there are other, more specialized ways to query the MAST API not discussed in this Notebook. An in-depth MAST API tutorial can be found here.

A browser GUI for searching specifically through HST archival data can be found here. We will be discussing this HST search below.

A newer and more general MAST GUI, which also allows access to data from other telescopes such as TESS, but does not offer all HST-specific search parameters, is also available. We will discuss this interface in Section 1.3.

The search page of the HST interface is laid out as in Fig. 1.1:

Fig 1.1

where we have indicated that we would like to find all archival science data from the COS far-ultraviolet (FUV) configuration, taken with any grating while looking at Quasi-Stellar Objects (QSO) within a 3 arcminute radius of (01h 37m 40s, +33d 09m 32s). The output columns we have selected to see are visible in the bottom left of Fig 1.1.

Note that if you have a list of coordinates, Observation ID(s), etc. for a series of targets you can click on the "File Upload Form" and attach your list of OBSIDs or identifying features. Then specify which type of data your list contains using the "File Contents" drop-down menu.

Figure 1.2 shows the results of our search shown in Fig 1.1.

Fig 1.2

We now choose our dataset. We rather arbitrarily select LCXV13050 because of its long exposure time. It was taken under an observing program described as:

"Project AMIGA: Mapping the Circumgalactic Medium of Andromeda"

The target is a quasar known as 3C48, one of the first quasars discovered.

Clicking on the dataset, we are taken to a page displaying a preview spectrum (Fig 1.3).

Fig 1.3

We now return to the search page and enter LCXV13050 under "Dataset" with no other parameters set. After clicking "search", we see a single-row table with just our dataset, and the option to download datasets. We mark the row we wish to download and click "Submit marked data for retrieval from STDADS". See Fig 1.4.

Fig 1.4

Now we see a page like that in Fig 1.5, where we can either sign in with STScI credentials or simply provide our email to proceed without credentials. Make sure to select "Deliver the data to the Archive staging area". Click "Send Retrieval Request to STDADS" and you will receive an email with instructions on downloading with ftp. You will need to do this step from the command line.

Fig 1.5

In the case of this request, the command to retrieve the data depends on how you are connected to the internet.

If you are connected to the STScI network, either in person or via VPN, you should use the wget command as in the example below:

wget -r --ftp-user=[anonymous] --ask-password [ftps://archive.stsci.edu/stage/anonymous/anonymous42822] --directory-prefix=[data_dir]

where the password was the email address used, and data_dir is the directory defined at the beginning of this Notebook. Now all the data is in a subdirectory "./archive.stsci.edu/stage/anonymous/anonymous42822/"
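
If you prefer to assemble this command from within Python, a minimal sketch follows. It assumes the square brackets in the command above simply mark values to fill in; the helper name build_wget_command is hypothetical, and the sketch only builds and prints the command rather than running it.

```python
import shlex

def build_wget_command(staging_url, data_dir, user="anonymous"):
    """Return the wget argument list for a staged retrieval (does not run it)."""
    return [
        "wget", "-r",
        f"--ftp-user={user}",
        "--ask-password",  # wget will prompt for the password (your email address)
        staging_url,
        f"--directory-prefix={data_dir}",
    ]

# Staging URL taken from the example above; yours will differ
cmd = build_wget_command(
    "ftps://archive.stsci.edu/stage/anonymous/anonymous42822", "./data")
print(shlex.join(cmd))
```

The resulting list could be passed to subprocess.run(cmd) from a terminal session; because of the interactive password prompt, running it inside a Notebook cell is less convenient.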

If you are not on the STScI network, downloading from the staged data is slightly more complicated and the best way to download staged data may vary. Below are several options for downloading this data. If you only need to download a few files, the Web Browser approach requires very little setup. If you are unfamiliar with ftp protocols and need to download many files, you may try using an ftp client, such as Cyberduck. However, if you are still having difficulty downloading the staged data you may simply opt to use the Astroquery Python method described in Section 2.

Cyberduck ftp client
  • Download the Cyberduck software (free with suggested donation)
  • Click "Open Connection"
  • Set Server = archive.stsci.edu, select "Anonymous login"
  • Click "Connect"
  • Navigate to stage/anonymous/<anonymousnnnnn>
  • Select the files you wish to download
  • Right-click and select "Download" or "Download To"

Mac/UNIX terminal or Windows CLI
  • Install the gnu ftp tool onto MacOS with brew (Windows users can skip this step)
  • ftp archive.stsci.edu
  • Enter username (anonymous) and password (email address)
  • cd /stage/<username>/<usernamennnnn>
  • prompt
  • binary
  • mget *
  • quit

Web Browser (system-independent but must download files one-at-a-time)
  • Open a browser
  • In the address bar, type: ftp://your_email_address@archive.stsci.edu/stage/<your_directory_name>, i.e. ftp://anonymous@archive.stsci.edu:/stage/anonymous/<anonymousnnnnn>
  • Allow to open folder and sign in with anonymous and email address

Direct Download from the Newer MAST Portal (simple but can be slow for large downloads)
  • Navigate to the MAST portal - described in detail in Section 1.3
  • Search for your datasets, either by narrowing down with the filters described above, or much more quickly: search for the Observation IDs you previously found by querying the "MAST Observations by Observation ID" at the top left
  • Select the products you would like to download and click the button at the top right "Add data products to Downloads Basket"
  • Click "My Download Basket" (upper left)
  • Select the data and data filters you want to download
  • Click "Download selected items" (upper right); note that the "Batch Retrieval" button will stage the data as the previous HST Search interface did
  • Select your preferred compression and click "Download"
  • Make sure to allow the pop-up window, which will begin the download

Well done making it this far!

Attempt the exercise below for some extra practice.

Exercise 1: Searching the archive for TRAPPIST-1 data

TRAPPIST-1 is a cool red dwarf with a multiple-exoplanet system.

  • Find its coordinates using the SIMBAD Basic Search.
  • Use those coordinates in the HST web search or the MAST portal to find all COS exposures of the system.
  • Limit the search terms to find the COS dataset taken in the COS far-UV configuration with the grating G130M.

What is the dataset ID, and how long was the exposure?

Place your answer in the cell below.

In [2]:
# Your answer here

Now let's try using the web interface's file upload form to search for a series of observations by their dataset IDs. We're going to look for three observations of the same object, the white dwarf WD1057+719, taken with three different COS gratings. Two are in the FUV and one in the NUV. The dataset IDs are

  • LDYR52010
  • LBNM01040
  • LBBD04040

So that we have an example list of datasets to input to the web search, we make a comma-separated-value txt file with these three obs_ids, and save it as obsId_list.txt.

In [3]:
obsIdList = ['LDYR52010','LBNM01040','LBBD04040'] # The three observation IDs we want to gather

with open('./obsId_list.txt', 'w') as f: # Open up this new file in "write" mode
    # A comma and newline go between obs_ids, but not after the last one,
    # so that we don't end the file with a blank line
    f.write(",\n".join(obsIdList))
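
Before uploading the list, it can be worth confirming that the file contains what we expect. The snippet below is a small self-contained sketch: it rewrites the same three IDs with the same separators and then reads them back. The variable names obs_ids, list_path and recovered are introduced here for illustration only.

```python
from pathlib import Path

obs_ids = ['LDYR52010', 'LBNM01040', 'LBBD04040']
list_path = Path('./obsId_list.txt')

# Write the IDs separated by a comma and a newline, with nothing after the last ID
list_path.write_text(",\n".join(obs_ids))

# Read the file back and recover the individual IDs
text = list_path.read_text()
recovered = [entry.strip() for entry in text.split(",")]

print(recovered)
print("Ends with a newline:", text.endswith("\n"))
```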

Then we link to this file under the Local File Name browse menu on the file upload form. We must set the File Contents term to Data ID, as that is the identifier we have provided in our file, and we change the delimiter to a comma. Because we are searching by Dataset ID, we don't need to specify any additional parameters to narrow down the data.

Fig 1.6

We now can access all the datasets, as shown in Fig. 1.7:

Fig 1.7

Now, to download all of the relevant files, we can check the mark box for all of them, and again hit "Submit marked data for retrieval from STDADS". This time, we want to retrieve all the calibration files associated with each dataset, so we check the following boxes:

  • Uncalibrated
  • Calibrated
  • Used Reference Files

(See Fig. 1.8)

Fig 1.8

The procedure from here is the same as that described above in Section 1.1. Now, when we download the staged data, we obtain multiple subdirectories, one for each dataset.

1.3. The MAST Portal

STScI hosts another web-based GUI for accessing data, the MAST Portal. This is a newer interface which hosts data from across many missions and allows the user to visualize the target in survey images, take quick looks at spectra or lightcurves, and manage multiple search tabs at once. Additionally, it handles downloads in a slightly more beginner-friendly manner than the current implementation of the Classic HST Search. This guide will only cover the basics of accessing COS data through the MAST Portal; you can find more in-depth documentation in the form of helpful video guides on the MAST YouTube Channel.

Let's find the same data we found in Section 1.1, on the QSO 3C48:

Navigate to the MAST Portal at https://mast.stsci.edu/portal/Mashup/Clients/Mast/Portal.html, and you will be greeted by a screen where the top looks like Fig. 1.9.

Fig 1.9

Click on "Advanced Search" (boxed in red in Fig. 1.9). This will open up a new search tab, as shown in Fig. 1.10:

Fig 1.10

Fig 1.10 (above) shows the default search fields which appear. Depending on what you are looking for, these may or may not be the most helpful search fields. By unchecking the fields we are not currently interested in searching by (boxed in green), and then entering a value into the box of each parameter by which we want to narrow the search, we generate Fig. 1.11 (below). One of the six fields (Mission) by which we are narrowing is boxed in a dashed blue line. The list of applied filters is boxed in red. A dashed pink box at the top left indicates that 2 records were found matching all of these parameters. To its left is an orange box around the "Search" button, which you press to bring up the list of results.

Here we are searching by:

| Search Parameter | Value |
|---|---|
| Mission | HST |
| Instrument | COS/FUV |
| Filters | G160M |
| Target Name | 3C48 |
| Observation ID | LCXV* (the star is a "wild card" value, so the search will find any file whose obs_id begins with LCXV) |
| Product Type | spectrum |
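
To get a feel for what the wild card in the Observation ID field does, the sketch below applies the same pattern to a handful of dataset IDs that appear in this Notebook. It uses Python's fnmatch module as an analogy only; we are assuming the Portal's "*" behaves like a shell-style wildcard, as the description above suggests.

```python
from fnmatch import fnmatchcase

# Observation IDs taken from this Notebook
obs_ids = ['LCXV13040', 'LCXV13050', 'LDYR52010', 'LBNM01040']

pattern = 'LCXV*'  # "*" stands for any run of characters, including none
matches = [obs_id for obs_id in obs_ids if fnmatchcase(obs_id, pattern)]
print(matches)
```

Only the two IDs beginning with LCXV are kept, which mirrors the 2 records reported by the Portal for this search.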

Fig 1.11

Click the "Search" button (boxed in orange), and you will be brought to a page resembling Fig. 1.12.

Fig 1.12

Above, in Fig 1.12:

  • The yellow box to the right shows the AstroView panel, where you can interactively explore the area around your target:
    • click and drag to pan around
    • scroll to zoom in/out
  • The dashed-blue box highlights additional filters you can use to narrow your search results.
  • The red box highlights a button you can click with some spectral datasets to pull up an interactive spectrum.
  • The green box highlights the "Mark" checkboxes for each dataset.
  • The black circle highlights the single dataset download button:
    • If you only need to download one or two datasets, you may simply click this button for each dataset
    • Clicking the single dataset download button will attempt to open a "pop-up" window, which you must allow in order to download the file. Some browsers will require you to manually allow pop-ups.

To download multiple datasets: The MAST portal acts a bit like an online shopping website, where you add your data products to the checkout cart/basket, then open up your cart to checkout and download the files.

Using the checkboxes, mark all the datasets you wish to download (in this case, we'll download both LCXV13040 and LCXV13050). Then, click the "Add data products to Download Basket" button (circled in a dashed-purple line), which will take you to a "Download Basket" screen resembling Fig 1.13:

Fig 1.13

Each dataset contains many files, most of which are calibration files or intermediate processing files. You may or may not want some of these intermediate files in addition to the final product file. In the leftmost "Filters" section of the Download Basket page, you can narrow which files will be downloaded (boxed in red). By default, only the minimum recommended products (mrp) will be selected. In the case of most COS data, this will be the final spectrum x1dsum file and association asn file for each dataset. The mrp files for the first dataset (LCXV13040) are highlighted in yellow. These two mrp filetypes are fine for our purposes here; however if you want to download files associated with specific exposures, or any calibration files or intermediate files, you can select those you wish to download with the checkboxes in the file tree system (boxed in dashed-green).

For this tutorial, we simply select "Minimum Recommended Products" at the top left. With this box checked, all of the folders representing individual exposures are no longer visible. Check the box labelled "HST" to select all files included by the filters, and click the "Download Selected Items" button at the top right (dashed-black circle). This will bring up a small window asking you what format to download your files as. For datasets smaller than several Gigabytes, the Zip format will do fine. Click Download, and a pop-up window will try to open to download the files. If no download begins, make sure to enable this particular pop-up, or allow pop-ups on the MAST page.

Your files should now be downloaded as a compressed Zip folder. If you need help uncompressing the Zipped files, check out these links for Windows and Mac. There are numerous ways to do this on Linux; however, we have not vetted them.
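
Since this Notebook already uses Python, one system-independent alternative is the standard library's zipfile module. The following is a minimal sketch rather than a vetted procedure: the helper name unzip_download and the file names are hypothetical, and the demonstration builds a tiny stand-in archive so that it runs without a real download.

```python
import tempfile
import zipfile
from pathlib import Path

def unzip_download(zip_path, out_dir):
    """Extract every file in zip_path into out_dir and return the archived names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
        return archive.namelist()

# Demonstration on a small stand-in archive (hypothetical file names)
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / 'demo_download.zip'
    with zipfile.ZipFile(demo_zip, 'w') as archive:
        archive.writestr('HST/example.txt', 'placeholder contents')
    extracted = unzip_download(demo_zip, Path(tmp) / 'unzipped')
    extracted_ok = (Path(tmp) / 'unzipped' / 'HST' / 'example.txt').is_file()
    print(extracted)
```

For a real download, you would instead point unzip_download at the Zip file saved by your browser and at the directory where you want the data to go, such as data_dir.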

2. The Python Package astroquery.mast

Another way to search for and download archived datasets is from within Python using the module astroquery.mast. We will import this module's main query interface: Observations.

Please note that the canonical source of information on this package is the astroquery docs - please look there for the most up-to-date instructions.

We will import the following packages:

  • astroquery.mast's Observations interface for finding and downloading data from the MAST archive
  • csv's reader function for reading in from a csv file of source names.
In [4]:
# Downloading data from archive
from astroquery.mast import Observations

# Reading in multiple source names from a csv file
from csv import reader

2.1. Searching for a single source with Astroquery

There are many options for searching the archive with astroquery, but we will begin with a very general search using the coordinates we found for WD1057+719 in the last section. Our aim is to find the dataset with the longest exposure time taken in the COS/FUV mode through the G160M filter. We could also search by object name and have it resolved to a set of coordinates, for example with Observations.query_object(objectname = '3C48').

  • Our coordinates were: (11:00:34.126 +71:38:02.80).
    • We can search these coordinates as sexagesimal coordinates, or convert them to decimal degrees.
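
The conversion to decimal degrees mentioned above is simple arithmetic: one hour of right ascension is 15 degrees, and minutes and seconds are divided by 60 and 3600 respectively. The sketch below works this out by hand; the helper names hms_to_degrees and dms_to_degrees are our own, and in practice a library such as astropy.coordinates can do the same job.

```python
def hms_to_degrees(hours, minutes, seconds):
    """Right ascension in hours:minutes:seconds -> decimal degrees (1 hour = 15 degrees)."""
    return 15.0 * (hours + minutes / 60.0 + seconds / 3600.0)

def dms_to_degrees(degrees, arcmin, arcsec, sign=1):
    """Declination in degrees:arcminutes:arcseconds -> decimal degrees (sign is +1 or -1)."""
    return sign * (degrees + arcmin / 60.0 + arcsec / 3600.0)

# The coordinates listed above: 11:00:34.126 +71:38:02.80
ra_deg = hms_to_degrees(11, 0, 34.126)
dec_deg = dms_to_degrees(71, 38, 2.80)

print(f"RA  = {ra_deg:.4f} deg")
print(f"Dec = {dec_deg:.4f} deg")
```

The results, roughly 165.142 and 71.634 degrees, fall inside the right ascension and declination ranges used to narrow the search in Section 2.2.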
In [5]:
query_1 = Observations.query_object("11:00:34.126 +71:38:02.80", radius="5 sec")

This command has generated a table of objects called "query_1". We can see what information we have on the objects in the table by printing its keys, and see how many objects are in the table with len(query_1).

In [6]:
print(f"We have table information on {len(query_1)} observations in the following categories/columns:\n")
q1_keys = (query_1.keys())
q1_keys
We have table information on 746 observations in the following categories/columns:

Out[6]:
['intentType',
 'obs_collection',
 'provenance_name',
 'instrument_name',
 'project',
 'filters',
 'wavelength_region',
 'target_name',
 'target_classification',
 'obs_id',
 's_ra',
 's_dec',
 'dataproduct_type',
 'proposal_pi',
 'calib_level',
 't_min',
 't_max',
 't_exptime',
 'em_min',
 'em_max',
 'obs_title',
 't_obs_release',
 'proposal_id',
 'proposal_type',
 'sequence_number',
 's_region',
 'jpegURL',
 'dataURL',
 'dataRights',
 'mtFlag',
 'srcDen',
 'obsid',
 'distance']

2.2. Narrowing Search with Observational Parameters

Now we narrow down a bit with some additional parameters and sort by exposure time. The parameter limits we add to the search are:

  • Only look for sources in the coordinate range of right ascension 165 to 166 degrees and declination +71 to +72 degrees
  • Only find observations in the UV
  • Only find observations taken with the COS instrument (either in its FUV or NUV configuration).
  • Only find spectrographic observations
  • Only find observations made using the COS grating "G160M"
In [7]:
query_2 = Observations.query_criteria(s_ra=[165., 166.], s_dec=[+71.,+72.],
                                        wavelength_region="UV", instrument_name=["COS/NUV","COS/FUV"], 
                                        dataproduct_type = "spectrum", filters = 'G160M')

# The next lines limit the columns we display to those we will look at right now
limq2 = query_2['obsid','obs_id', 'target_name', 'dataproduct_type', 'instrument_name',
                'project', 'filters', 'wavelength_region', 't_exptime'] 
sort_order = query_2.argsort('t_exptime') # This is the index list in order of exposure time, increasing
print(limq2[sort_order])
chosenObs = limq2[sort_order][-1] # Grab the last value of the sorted list
print(f"\n\nThe longest COS/FUV exposure with the G160M filter is: \n\n{chosenObs}") 
 obsid     obs_id  target_name ... filters wavelength_region     t_exptime    
-------- --------- ----------- ... ------- ----------------- -----------------
26242800 ldyr52030  WD1057+719 ...   G160M                UV               0.0
24843807 lbn133010  WD1057+719 ...   G160M                UV               0.0
24139526 lbb916lbq  WD1057+719 ...   G160M                UV               1.0
24139542 lbb918scq  WD1057+719 ...   G160M                UV               1.0
24134176 la9r02dfq WD-1057+719 ...   G160M                UV               1.0
24139587 lbb9x3ckq  WD1057+719 ...   G160M                UV               1.0
24139534 lbb917k7q  WD1057+719 ...   G160M                UV               1.0
24140064 lbe702iqs  WD1057+719 ...   G160M                UV               1.0
24134175 la9r02deq WD-1057+719 ...   G160M                UV             108.0
24139525 lbb916laq  WD1057+719 ...   G160M                UV             108.0
     ...       ...         ... ...     ...               ...               ...
24843895 lbnm03030  WD1057+719 ...   G160M                UV          2800.832
24843884 lbnm01020 WD-1057+719 ...   G160M                UV          2850.208
24843894 lbnm03020  WD1057+719 ...   G160M                UV          2900.608
24843896 lbnm03040  WD1057+719 ...   G160M                UV           3000.64
24140062 lbe702ios  WD1057+719 ...   G160M                UV            3179.0
24843893 lbnm03010  WD1057+719 ...   G160M                UV          3200.768
24839998 la9r02010 WD-1057+719 ...   G160M                UV          3280.128
24843887 lbnm01050 WD-1057+719 ...   G160M                UV          3350.368
24843897 lbnm03050  WD1057+719 ...   G160M                UV            3700.8
24842491 lbek02010  WD1057+719 ...   G160M                UV            4252.0
24842492 lbek02020  WD1057+719 ...   G160M                UV 5401.599999999999
Length = 159 rows


The longest COS/FUV exposure with the G160M filter is: 

 obsid     obs_id  target_name dataproduct_type instrument_name project filters wavelength_region     t_exptime    
-------- --------- ----------- ---------------- --------------- ------- ------- ----------------- -----------------
24842492 lbek02020  WD1057+719         spectrum         COS/FUV     HST   G160M                UV 5401.599999999999
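
The sort-then-take-the-last-row logic used above can be illustrated without the archive. The sketch below uses a few (obs_id, t_exptime) pairs copied from the printed table, with the longest exposure time rounded to 5401.6, and plain Python lists standing in for the Astropy table.

```python
# (obs_id, t_exptime in seconds) pairs copied from the printed table above
exposures = [
    ('lbek02010', 4252.0),
    ('lbnm03050', 3700.8),
    ('lbek02020', 5401.6),
    ('lbb916lbq', 1.0),
]

# Equivalent of argsort: the row indices ordered by increasing exposure time
sort_order = sorted(range(len(exposures)), key=lambda i: exposures[i][1])

# The last index in the sorted order points at the longest exposure
longest = exposures[sort_order[-1]]

# max() reaches the same row without sorting everything
longest_via_max = max(exposures, key=lambda row: row[1])

print(longest)
```

Sorting is useful when you want to inspect the whole ordered table, as we did above; if only the single longest exposure is needed, taking the maximum is enough.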

Caution!

Please note that these query results are Astropy tables, which do not always behave like other data structures such as Pandas DataFrames. For instance, the first way of filtering a table shown below is correct, but the second will consistently produce the wrong result. You must search and filter these tables by masking them, as in the first example below.

In [8]:
# Searching a table generated with a query
## First, correct way using masking
mask = (query_1['obs_id'] == 'lbbd01020') # NOTE, obs_id must be lower-case
print("Correct way yields: \n" , query_1[mask]['obs_id'],"\n\n")

# Second INCORRECT way
print("Incorrect way yields: \n" , query_1['obs_id' == 'LBBD01020']['obs_id'], "\nwhich is NOT what we're looking for!")
Correct way yields: 
   obs_id 
---------
lbbd01020 


Incorrect way yields: 
 tess-s0014-4-3 
which is NOT what we're looking for!
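
Why does the second approach return the first row of the table? The sketch below reproduces the effect with a plain Python list standing in for the table rows. It assumes, consistent with the output above, that the table treats a bare False as the integer index 0; the list contents other than the first entry are stand-ins.

```python
# Comparing two different string literals is evaluated first, giving a plain bool
index = ('obs_id' == 'LBBD01020')
print(index)  # False

# A stand-in for the table rows; the first entry mirrors the output above
rows = ['tess-s0014-4-3', 'lbbd01020', 'lbbd04040']

# bool is a subclass of int, so False acts as index 0 and selects the first row
print(rows[index])

# The masking approach compares every row's value instead
mask = [row == 'lbbd01020' for row in rows]
selected = [row for row, keep in zip(rows, mask) if keep]
print(selected)
```

In other words, the incorrect form never looks at the contents of the obs_id column at all, which is why it returns the same row no matter which ID you ask for.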

2.3. Choosing and Downloading Data Products

Now we can choose and download our data products from the archive dataset.

We will first generate a list of data products in the dataset: product_list. This will generate a large list, but we will only show the first 10 values.

In [9]:
product_list = Observations.get_product_list(chosenObs)
product_list[:10] #Not the whole dataset, just first 10 lines/observations
Out[9]:
Table masked=True length=10
obsID | obs_collection | dataproduct_type | obs_id | description | type | dataURI | productType | productGroupDescription | productSubGroupDescription | productDocumentationURL | project | prvversion | proposal_id | productFilename | size | parent_obsid | dataRights | calib_level
str8 | str3 | str8 | str9 | str62 | str1 | str44 | str9 | str28 | str12 | str1 | str6 | str6 | str5 | str27 | int64 | str8 | str6 | int64
24842492 | HST | spectrum | lbek02020 | DADS JIF file | D | mast:HST/product/lbek02020_jif.fits | AUXILIARY | -- | JIF | -- | CALCOS | -- | 12086 | lbek02020_jif.fits | 216000 | 24842492 | PUBLIC | 1
24842492 | HST | spectrum | lbek02020 | DADS JIT file | D | mast:HST/product/lbek02020_jit.fits | AUXILIARY | -- | JIT | -- | CALCOS | -- | 12086 | lbek02020_jit.fits | 279360 | 24842492 | PUBLIC | 1
24842492 | HST | spectrum | lbek02020 | DADS TRL file - Processing log | D | mast:HST/product/lbek02020_trl.fits | AUXILIARY | -- | TRL | -- | CALCOS | -- | 12086 | lbek02020_trl.fits | 8640 | 24842492 | PUBLIC | 1
24842492 | HST | spectrum | lbek02020 | DADS X1S file - Summed 1D spectrum COS | D | mast:HST/product/lbek02020_x1dsum1.fits | AUXILIARY | -- | X1DSUM1 | -- | CALCOS | 3.3.11 | 12086 | lbek02020_x1dsum1.fits | 1811520 | 24842492 | PUBLIC | 2
24842492 | HST | spectrum | lbek02020 | DADS X2S file - Summed 1D spectrum COS | D | mast:HST/product/lbek02020_x1dsum2.fits | AUXILIARY | -- | X1DSUM2 | -- | CALCOS | 3.3.11 | 12086 | lbek02020_x1dsum2.fits | 1811520 | 24842492 | PUBLIC | 2
24842492 | HST | spectrum | lbek02020 | DADS X3S file - Summed 1D spectrum COS | D | mast:HST/product/lbek02020_x1dsum3.fits | AUXILIARY | -- | X1DSUM3 | -- | CALCOS | 3.3.11 | 12086 | lbek02020_x1dsum3.fits | 1811520 | 24842492 | PUBLIC | 2
24842492 | HST | spectrum | lbek02020 | DADS X4S file - Summed 1D spectrum COS | D | mast:HST/product/lbek02020_x1dsum4.fits | AUXILIARY | -- | X1DSUM4 | -- | CALCOS | 3.3.11 | 12086 | lbek02020_x1dsum4.fits | 1811520 | 24842492 | PUBLIC | 2
24842492 | HST | spectrum | lbek02020 | DADS ASN file - Association ACS/WFC3/STIS | D | mast:HST/product/lbek02020_asn.fits | AUXILIARY | Minimum Recommended Products | ASN | -- | CALCOS | 3.3.11 | 12086 | lbek02020_asn.fits | 11520 | 24842492 | PUBLIC | 3
24842492 | HST | spectrum | lbek02020 | DADS LOG file | D | mast:HST/product/lbek02020_log.txt | INFO | -- | LOG | -- | CALCOS | -- | 12086 | lbek02020_log.txt | 1732 | 24842492 | PUBLIC | 1
24842492 | HST | spectrum | lbek02020 | Preview-Full | D | mast:HST/product/lbek02020_x1dsum.png | PREVIEW | -- | -- | -- | CALCOS | 3.3.11 | 12086 | lbek02020_x1dsum.png | 111791 | 24842492 | PUBLIC | 3

Now, we will download just the minimum recommended products (mrp), which are the fully calibrated spectrum (denoted by the suffix _x1d or, here, _x1dsum) and the association file (denoted by the suffix _asn). We do this by setting the parameter mrp_only to True. The association file contains no data, but rather the metadata explaining which exposures produced the x1dsum dataset. The x1dsum file is the final product summed across all of the fixed pattern noise positions (FP-POS). The x1d and x1dsum<n> files are intermediate spectra. Much more information can be found in the COS Instrument Handbook.

We would set mrp_only to False, if we wanted to download all the data from the observation, including:

  • support files such as the spacecraft's pointing data over time (jit files).
  • intermediate data products such as calibrated TIME-TAG data (corrtag or corrtag_a/corrtag_b files) and extracted 1-dimensional spectra averaged over exposures with a specific FP-POS value (x1dsum<n> files).

However, use caution with downloading all files, as in this case, setting mrp_only to False results in the transfer of 7 Gigabytes of data, which can take a long time to transfer and eat away at your computer's storage! In general, only download the files you need. On the other hand, often researchers will download only the raw data, so that they can process it for themselves. Since here we only need the final x1dsum and asn files, we only need to download 2 Megabytes.
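
One way to judge a download before starting it is to add up the size column of the products you intend to fetch. The sketch below does this with plain Python on a few rows copied from the product list above (file name, productSubGroupDescription, and size in bytes); the selection in wanted is hypothetical and chosen only for illustration. With a real product list, astroquery's own filtering tools are the better route, so please check the astroquery docs for those.

```python
# (productFilename, productSubGroupDescription, size in bytes), from the product list above
products = [
    ('lbek02020_jif.fits', 'JIF', 216000),
    ('lbek02020_jit.fits', 'JIT', 279360),
    ('lbek02020_x1dsum1.fits', 'X1DSUM1', 1811520),
    ('lbek02020_asn.fits', 'ASN', 11520),
]

wanted = {'ASN', 'X1DSUM1'}  # hypothetical selection, for illustration only

# Keep only the rows whose subgroup is in the wanted set, then total their sizes
selected = [p for p in products if p[1] in wanted]
total_bytes = sum(size for _, _, size in selected)

print(f"{len(selected)} files, {total_bytes / 1e6:.2f} MB")
```

A summed spectrum plus its association file comes to a little under 2 megabytes here, in line with the figure quoted above.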

In [10]:
downloads = Observations.download_products(product_list, download_dir=str(data_dir) , extension='fits', mrp_only=True, cache=False)
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/lbek02020_asn.fits to data/mastDownload/HST/lbek02020/lbek02020_asn.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/lbek02020_x1dsum.fits to data/mastDownload/HST/lbek02020/lbek02020_x1dsum.fits ... [Done]

Exercise 2: Download the raw counts data on TRAPPIST-1

In the previous exercise, we found an observation COS took of the TRAPPIST-1 system. In case you skipped Exercise 1, the observation's Dataset ID is LDLM40010.

Use astroquery.mast to download the raw TIME-TAG data, rather than the x1d spectra files. See the COS Data Handbook Ch. 2 for details on TIME-TAG data files. Make sure to get the data from both segments of the FUV detector (i.e. both RAWTAG_A and RAWTAG_B files). If you do this correctly, there should be five data files for each detector segment.

Note that some of the obs_id values may appear in the table in a slightly different form, e.g. ldlm40alq and ldlm40axq rather than ldlm40010. The main obs_id they fall under is still ldlm40010, and this will still work as a search term. They are linked together by the association file described in Section 2.3.

In [11]:
# Your answer here

2.4. Using astroquery to find data on a series of sources

In this case, we'll look for COS data around several bright globular clusters:

  • Omega Centauri
  • M5
  • M13
  • M15
  • M53

We will first write a comma-separated-value (csv) file, objectname_list.csv, listing these sources by their common names. This is a bit redundant here, as we will immediately read back in what we have written; however, we do it deliberately to teach both sides of the writing/reading process, since many users will find themselves with a csv source list they must search.

In [12]:
sourcelist = ['omega Centauri', 'M5', 'M13', 'M15', 'M53'] # The 5 sources we want to look for

with open('./objectname_list.csv', 'w') as f: # Open this new file in "write" mode
    f.write(",".join(sourcelist)) # A comma between source names, but none after the last entry
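
As an alternative to placing the commas by hand, Python's csv module can write the row for us. This is a sketch of that approach rather than the method used in this Notebook. It writes the same five names to the same file, and the variable name recovered is introduced here for illustration.

```python
import csv

sourcelist = ['omega Centauri', 'M5', 'M13', 'M15', 'M53']

# csv.writer adds the delimiters (and any quoting a name might need) for us
with open('./objectname_list.csv', 'w', newline='') as f:
    csv.writer(f).writerow(sourcelist)

# Reading the single row back recovers the original list
with open('./objectname_list.csv', 'r', newline='') as f:
    recovered = next(csv.reader(f))

print(recovered == sourcelist)
```

The main advantage of csv.writer is that a source name containing a comma would be quoted automatically, so it would still be read back as a single entry.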
In [13]:
with open('./objectname_list.csv', 'r', newline = '') as csvFile: # Open the file we just wrote in "read" mode
    objList = list(reader(csvFile, delimiter = ','))[0] # This is the exact same list as `sourcelist`!

print("The input csv file contained the following sources:\n", objList)

globular_cluster_queries = {} # Make a dictionary, where each source name (i.e. "M15") corresponds to a list of its observations with COS
for obj in objList: # each "obj" is a source name
    query_x = Observations.query_criteria(objectname = obj, radius = "5 min", instrument_name=['COS/FUV', 'COS/NUV']) # query the area in +/- 5 arcminutes
    globular_cluster_queries[obj] = (query_x) # add this entry to the dictionary
    
globular_cluster_queries # show the dictionary
The input csv file contained the following sources:
 ['omega Centauri', 'M5', 'M13', 'M15', 'M53']
Out[13]:
{'omega Centauri': <Table masked=True length=15>
 dataproduct_type calib_level obs_collection ...   objID1       distance     
       str8          int64         str3      ...    str9        float64      
 ---------------- ----------- -------------- ... --------- ------------------
            image           2            HST ... 108405360 293.85414993790886
         spectrum           1            HST ... 117331263  87.65983662909147
         spectrum           1            HST ... 117331267  87.65983662909147
            image           2            HST ... 117331272  94.47272986343471
         spectrum           1            HST ... 117331412  87.65983662909147
            image           2            HST ... 117331415  97.04717859213693
         spectrum           3            HST ... 117332211  97.04717859213693
         spectrum           3            HST ... 117332236  87.65983662909147
         spectrum           3            HST ... 117332441  87.65983662909147
         spectrum           3            HST ... 117332545  94.47272986343471
         spectrum           3            HST ... 117332570  97.04717859213693
         spectrum           3            HST ... 117332623  94.47272986343471
            image           2            HST ... 117340338 295.35269365022396
         spectrum           2            HST ... 117347625  97.04717859213693
            image           2            HST ... 117348858  294.8382904103309,
 'M5': <Table masked=True length=5>
 dataproduct_type calib_level obs_collection ...   objID1       distance    
       str8          int64         str3      ...    str9        float64     
 ---------------- ----------- -------------- ... --------- -----------------
         spectrum           1            HST ... 117320430 51.62395577766772
         spectrum           1            HST ... 117320565 51.62395577766772
         spectrum           3            HST ... 117320601 51.62395577766772
         spectrum           1            HST ... 117323571 51.62395577766772
         spectrum           3            HST ... 117325528 51.62395577766772,
 'M13': <Table masked=True length=5>
 dataproduct_type calib_level obs_collection ...   objID1       distance     
       str8          int64         str3      ...    str9        float64      
 ---------------- ----------- -------------- ... --------- ------------------
         spectrum           1            HST ... 108249358 130.64769900000118
         spectrum           1            HST ... 117319280 130.64769900000118
         spectrum           1            HST ... 117319283 130.64769900000118
         spectrum           3            HST ... 117320115 130.64769900053992
         spectrum           3            HST ... 117320427 130.64769900053992,
 'M15': <Table masked=True length=10>
 dataproduct_type calib_level obs_collection ...   objID1       distance     
       str8          int64         str3      ...    str9        float64      
 ---------------- ----------- -------------- ... --------- ------------------
         spectrum           1            HST ... 117319172  99.67537720410724
         spectrum           1            HST ... 117319271 27.096650306913773
         spectrum           1            HST ... 117319429 27.096650306913773
         spectrum           1            HST ... 117319437  99.67537720410724
         spectrum           1            HST ... 117319823  99.67537720410724
         spectrum           3            HST ... 117320519  27.09665030692082
         spectrum           3            HST ... 117320813  27.09665030692082
         spectrum           3            HST ... 117321106  99.67537720435718
         spectrum           3            HST ... 117340643  99.67537720435718
         spectrum           1            HST ... 117349963 27.096650306913773,
 'M53': <Table masked=True length=3>
 dataproduct_type calib_level obs_collection ...   objID1       distance    
       str8          int64         str3      ...    str9        float64     
 ---------------- ----------- -------------- ... --------- -----------------
         spectrum           3            HST ... 108382566 261.1902814694652
         spectrum           3            HST ... 108384563 261.1902814694652
            image           2            HST ... 115712512  263.628661035198}
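
A dictionary of result tables like the one printed above is easier to compare once it is condensed to a few numbers per target. The sketch below is illustrative only: `product_types_by_target` and `summarize` are hypothetical names, and the plain lists stand in for the `dataproduct_type` column of each table, transcribed by hand from the output above rather than taken from a live query.

```python
from collections import Counter

# Stand-in for the dictionary printed above: target name -> values of the
# 'dataproduct_type' column (row order is not preserved, only the counts).
# These lists are transcribed from the displayed output, not queried live.
product_types_by_target = {
    "omega Centauri": ["image"] * 5 + ["spectrum"] * 10,
    "M5": ["spectrum"] * 5,
    "M13": ["spectrum"] * 5,
    "M15": ["spectrum"] * 10,
    "M53": ["spectrum"] * 2 + ["image"],
}


def summarize(types_by_target):
    """Count how many products of each type were found for each target."""
    return {target: Counter(types) for target, types in types_by_target.items()}


summary = summarize(product_types_by_target)
for target, counts in summary.items():
    total = sum(counts.values())
    print(f"{target:>15}: {total:2d} observations "
          f"({counts['spectrum']} spectra, {counts['image']} images)")
```

With real astropy tables, the same idea should work by passing each table's `dataproduct_type` column (converted to plain strings) to `Counter` in place of the hand-written lists.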

Excellent! You've now done the hardest part - finding and downloading the right data. From here, it's generally straightforward to read in and plot the spectrum. We recommend you look into our tutorial on Viewing a COS Spectrum.

Congratulations! You finished this Notebook!

There are more COS data walkthrough Notebooks on different topics. You can find them here.


About this Notebook

Author: Nat Kerman nkerman@stsci.edu

Updated On: 2021-07-06

This tutorial was written to comply with the STScI style guides, and draws on the Jupyter guide in particular.

Citations

If you use astropy, matplotlib, astroquery, or numpy for published research, please cite the authors. Follow these links for more information about citations:


Exercise Solutions:

Note that for many of these exercises, there are multiple ways to get an answer.

We will import:

  • numpy to handle array functions
  • astropy.table Table for reading FITS files into tidy tables of the data
In [14]:
# Manipulating arrays
import numpy as np
# Reading in data
from astropy.table import Table
In [15]:
## Ex. 1 soln:
dataset_id_ = 'LDLM40010'
exptime_ = 12403.904
print(f"The TRAPPIST-1 COS data is in dataset {dataset_id_}, taken with an exposure time of {exptime_} seconds")
The TRAPPIST-1 COS data is in dataset LDLM40010, taken with an exposure time of 12403.904 seconds
In [16]:
## Ex. 2 soln:
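# Find the G130M COS/FUV observation with this dataset ID
query_3 = Observations.query_criteria(obs_id='LDLM40010', wavelength_region="UV",
                                      instrument_name="COS/FUV", filters='G130M')

# List every data product associated with that observation
product_list2 = Observations.get_product_list(query_3)
# Row indices of the raw TIME-TAG files for FUV segments A and B
rawRowsA = np.where(product_list2['productSubGroupDescription'] == "RAWTAG_A")
rawRowsB = np.where(product_list2['productSubGroupDescription'] == "RAWTAG_B")
rawRows = np.append(rawRowsA, rawRowsB)
# Create the output directory (no error if it already exists)
(data_dir / 'Ex2').mkdir(parents=True, exist_ok=True)
# Download the raw files, then the minimum recommended products (asn and x1dsum)
downloads2 = Observations.download_products(product_list2[rawRows], download_dir=str(data_dir / 'Ex2'),
                                            extension='fits', mrp_only=False, cache=True)
downloads3 = Observations.download_products(product_list2, download_dir=str(data_dir / 'Ex2'),
                                            extension='fits', mrp_only=True, cache=True)

# Read the association table from the first extension of the asn file
asn_data = Table.read('./data/Ex2/mastDownload/HST/ldlm40010/ldlm40010_asn.fits', hdu=1)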
print(asn_data)
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldlm40alq_rawtag_a.fits to data/Ex2/mastDownload/HST/ldlm40alq/ldlm40alq_rawtag_a.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldlm40axq_rawtag_a.fits to data/Ex2/mastDownload/HST/ldlm40axq/ldlm40axq_rawtag_a.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldlm40blq_rawtag_a.fits to data/Ex2/mastDownload/HST/ldlm40blq/ldlm40blq_rawtag_a.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldlm40c2q_rawtag_a.fits to data/Ex2/mastDownload/HST/ldlm40c2q/ldlm40c2q_rawtag_a.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldlm40d2q_rawtag_a.fits to data/Ex2/mastDownload/HST/ldlm40d2q/ldlm40d2q_rawtag_a.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldlm40alq_rawtag_b.fits to data/Ex2/mastDownload/HST/ldlm40alq/ldlm40alq_rawtag_b.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldlm40axq_rawtag_b.fits to data/Ex2/mastDownload/HST/ldlm40axq/ldlm40axq_rawtag_b.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldlm40blq_rawtag_b.fits to data/Ex2/mastDownload/HST/ldlm40blq/ldlm40blq_rawtag_b.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldlm40c2q_rawtag_b.fits to data/Ex2/mastDownload/HST/ldlm40c2q/ldlm40c2q_rawtag_b.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldlm40d2q_rawtag_b.fits to data/Ex2/mastDownload/HST/ldlm40d2q/ldlm40d2q_rawtag_b.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldlm40010_asn.fits to data/Ex2/mastDownload/HST/ldlm40010/ldlm40010_asn.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldlm40010_x1dsum.fits to data/Ex2/mastDownload/HST/ldlm40010/ldlm40010_x1dsum.fits ... [Done]
   MEMNAME        MEMTYPE     MEMPRSNT
-------------- -------------- --------
     LDLM40ALQ         EXP-FP        1
     LDLM40AXQ         EXP-FP        1
     LDLM40BLQ         EXP-FP        1
     LDLM40C2Q         EXP-FP        1
     LDLM40D2Q         EXP-FP        1
     LDLM40010        PROD-FP        1
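
The association table lists the five exposures (EXP-FP) that were combined into the product (PROD-FP), and the download log above shows each file landing in `mastDownload/HST/<rootname>/<rootname>_<suffix>.fits`. As a minimal sketch under that assumption, the member names can be turned into the local paths we expect to find on disk. The helper `expected_paths` and the hard-coded `asn_rows` list are hypothetical and only illustrate the idea; the rows are copied from the printed table.

```python
from pathlib import Path

# Rows copied from the association table printed above:
# (MEMNAME, MEMTYPE, MEMPRSNT)
asn_rows = [
    ("LDLM40ALQ", "EXP-FP", 1),
    ("LDLM40AXQ", "EXP-FP", 1),
    ("LDLM40BLQ", "EXP-FP", 1),
    ("LDLM40C2Q", "EXP-FP", 1),
    ("LDLM40D2Q", "EXP-FP", 1),
    ("LDLM40010", "PROD-FP", 1),
]


def expected_paths(rows, download_dir,
                   exp_suffixes=("rawtag_a", "rawtag_b"),
                   prod_suffixes=("asn", "x1dsum")):
    """Build the local paths implied by the download layout seen in the log.

    Exposure members get the raw-file suffixes; the product member gets the
    association and summed-spectrum suffixes.
    """
    root = Path(download_dir) / "mastDownload" / "HST"
    paths = []
    for memname, memtype, present in rows:
        if not present:
            continue  # skip members flagged as not present
        name = memname.strip().lower()
        is_product = memtype.strip().startswith("PROD")
        for suffix in (prod_suffixes if is_product else exp_suffixes):
            paths.append(root / name / f"{name}_{suffix}.fits")
    return paths


paths = expected_paths(asn_rows, "data/Ex2")
missing = [p for p in paths if not p.exists()]
print(f"{len(paths)} files expected, {len(missing)} not found locally")
```

The twelve expected paths correspond to the twelve download lines in the log above. With the real `asn_data` table, the rows could presumably be built by zipping its `MEMNAME`, `MEMTYPE`, and `MEMPRSNT` columns instead of typing them in by hand.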