Editing the extraction boxes in a spectral extraction file (XTRACTAB)

Learning Goals

This Notebook is designed to walk the user (you) through: Altering the extraction box used to extract your spectrum from a COS `TIME-TAG` exposure file.

1. Investigating the exposure

- 1.1. Understanding the XTRACTAB and examining a 2D spectrum

- 1.2. Defining some useful functions

- 1.3. Examining the extraction boxes

2. Editing the extraction boxes

- 2.1. Defining an editing function

- 2.2. Making the edits

- 2.3. Confirming the changes

3. Running the CalCOS Pipeline with the new XTRACTAB

- 3.1. Editing the XTRACTAB header value

- 3.2. Running the pipeline

4. Example using FUV Data

0. Introduction

The Cosmic Origins Spectrograph (COS) is an ultraviolet spectrograph on-board the Hubble Space Telescope (HST) with capabilities in the near ultraviolet (NUV) and far ultraviolet (FUV).

This tutorial aims to prepare you to work with the COS data of your choice by walking you through altering the extraction box sizes in the XTRACTAB/_1dx file to make sure you are extracting the cleanest possible signal from your source and background. We will demonstrate this in both the NUV and FUV.

Note that some COS modes which use the FUV detector can be better extracted using the TWOZONE method, which is not directly discussed in this Notebook. All COS/NUV modes use the BOXCAR method discussed in this Notebook.

We will import the following packages:

  • numpy to handle arrays and functions
  • astropy.io fits and astropy.table Table for accessing FITS files
  • glob, os, and shutil for working with system files
  • astroquery.mast Observations for finding and downloading data from the MAST archive
  • matplotlib.pyplot for plotting
  • matplotlib.image for reading in images
  • calcos to run the CalCOS pipeline for COS data reduction
  • scipy.interpolate interp1d for interpolating datasets to the same sampling
  • pathlib Path for managing system paths

These Python packages are installed standard with the the STScI conda distribution. For more information, see our Notebook tutorial on setting up an environment.

We'll also filter out two unhelpful warnings about a deprecation and dividing by zero which do not affect our data processing.

In [1]:
# Import for: array manipulation
import numpy as np
# Import for: reading fits files
from astropy.table import Table
from astropy.io import fits

# Import for: downloading the data
from astroquery.mast import Observations

# Import for: plotting
from matplotlib import pyplot as plt

# Import for: showing images from within Python
from matplotlib import image as mpimg

# Import for: dealing with system files
import glob, os, shutil

# Import for: running the CalCOS pipeline
import calcos

# Import for: comparing the old and new CalCOS values
from scipy.interpolate import interp1d 

#Import for: working with system paths
from pathlib import Path

# We will also suppress a warning that won't affect our data processing:
np.seterr(divide = 'ignore') 
np.warnings.filterwarnings('ignore', category=np.VisibleDeprecationWarning)

We will also define a few directories we will need:

In [2]:
# These will be important directories for the Notebook
datadir = Path('./data/')
outputdir = Path('./output/')
plotsdir = Path('./output/plots/')

# Make the directories if they don't exist
datadir.mkdir(exist_ok=True), outputdir.mkdir(exist_ok=True), plotsdir.mkdir(exist_ok=True)
Out[2]:
(None, None, None)

And we will need to download the data we wish to work with:

We choose the exposures with the association obs_id: LE4B04010 and download all the _rawtag data. This dataset happens to be COS/NUV data taken with the G185M grating, observing the star: LS IV -13 30. For more information on downloading COS data, see our notebook tutorial on downloading COS data.

Note, we're working with the _rawtags because they are smaller files and quicker to download than the _corrtag files. However, this workflow translates very well to using _corrtag files, as you likely will want to do when working with your actual data. If you wish to use the default corrections converting from raw to corrected TIME-TAG data, you may instead download and work with CORRTAG files directly.

In [3]:
pl = Observations.get_product_list(Observations.query_criteria(obs_id='LE4B04010'))
masked_pl = pl[np.isin(pl['productSubGroupDescription'],['RAWTAG', 'ASN', 'X1DSUM'])] # You could put 'CORRTAG' here
# Now download:
Observations.download_products(masked_pl)
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/le4b04010_asn.fits to ./mastDownload/HST/le4b04010/le4b04010_asn.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/le4b04010_x1dsum.fits to ./mastDownload/HST/le4b04010/le4b04010_x1dsum.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/le4b04zfq_rawtag.fits to ./mastDownload/HST/le4b04zfq/le4b04zfq_rawtag.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/le4b04zkq_rawtag.fits to ./mastDownload/HST/le4b04zkq/le4b04zkq_rawtag.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/le4b04zoq_rawtag.fits to ./mastDownload/HST/le4b04zoq/le4b04zoq_rawtag.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/le4b04zqq_rawtag.fits to ./mastDownload/HST/le4b04zqq/le4b04zqq_rawtag.fits ... [Done]
Out[3]:
Table length=6
Local PathStatusMessageURL
str50str8objectobject
./mastDownload/HST/le4b04010/le4b04010_asn.fitsCOMPLETENoneNone
./mastDownload/HST/le4b04010/le4b04010_x1dsum.fitsCOMPLETENoneNone
./mastDownload/HST/le4b04zfq/le4b04zfq_rawtag.fitsCOMPLETENoneNone
./mastDownload/HST/le4b04zkq/le4b04zkq_rawtag.fitsCOMPLETENoneNone
./mastDownload/HST/le4b04zoq/le4b04zoq_rawtag.fitsCOMPLETENoneNone
./mastDownload/HST/le4b04zqq/le4b04zqq_rawtag.fitsCOMPLETENoneNone

We gather a list of all the _rawtag files we have downloaded, as well as the _asnfile and _x1dsum file:

In [4]:
rawtags = glob.glob('./mastDownload/HST/**/*_rawtag.fits', 
                    recursive=True)
asnfile = glob.glob('./mastDownload/HST/**/*_asn.fits', 
                    recursive=True)[0]
old_x1dsum = glob.glob('./mastDownload/HST/**/*_x1dsum.fits', 
                    recursive=True)[0]

1. Investigating the exposure

1.1. Understanding the XTRACTAB and examining a 2D spectrum

The raw data from the COS instrument is a series of events, each corresponding to a photon interacting with the detector at a specific X, Y point, (and at a specific time if in TIME-TAG mode). We generally wish to translate this to a 1-dimensional spectrum (*Flux or Intensity on the y axis vs. Wavelength on the x axis*). To do this, we can first make a 2-dimensional spectrum, by plotting all the X,Y points of the spectrum onto a 2D image of the detector. The different spectra (i.e. of the NUV of FUV target, the wavelength calibration source) then appear of stripes of high count density on this image. We can then simply draw extraction boxes around these stripes, and integrate to collapse the data onto the wavelenth axis.

The XTRACTAB is a fits file which contains a series of parallelogram "boxes" to be used for different COS modes.

These are the boxes which we collapse to create a 1-dimensional spectrum. For each combination of COS lifetime position, grating, cenwave, etc., the extraction box is specified by giving the slope and y-intercept of a line, and the height of the parallelogram which should be centered on the line. Similar boxes are specified for background regions. For more information on the XTRACTAB, see the COS Data Handbook.

For many reasons, we may wish to use an extraction box different from the one specified by the default XTRACTAB, and instead set the boxes manually.

We need to see where the NUV stripes fall in order to determine where we should place the extraction boxes. First, let's plot this as a 2D image of the raw counts. To begin, we select and plot the raw counts data from the 0th rawtag file:

In [5]:
# Read the data from the first rawtag
rawtag = rawtags[0]
rtd=Table.read(rawtag,1)
###

plt.figure(figsize=(10,8))
# Plot the raw counts:
plt.scatter(rtd['RAWX'],rtd['RAWY'], s= 0.1, alpha=0.1, c='C0')

# Plot lines roughly centered on the 3 NUV stripes:
for i, (line,label) in enumerate(zip([187,285,422],['NUVA', 'NUVB', 'NUVC'])): 
    plt.axhline(line, color='myr'[i], linewidth=3, alpha=0.8, linestyle='dotted', label=label)

plt.xlim(0,1024)
plt.ylim(0,1024)

plt.xlabel('Dispersion axis Pixel', size=20)
plt.ylabel('Cross-dispersion axis Pixel', size=20)
plt.title("Fig 1.1\n2D spectrum of all raw (unfiltered) counts", size=25)
plt.legend(loc='upper right')
plt.tight_layout()
plt.show()