MIRI LRS Slit Spectroscopy: Spectral Extraction using the JWST Pipeline#
July 2023
Use case: Spectral extraction of slit spectra with the JWST calibration pipeline.
Data: Publicly available science data
Tools: jwst, matplotlib, astropy.
Cross-intrument: NIRSpec, MIRI.
Documentation: This notebook is part of a STScI’s larger post-pipeline Data Analysis Tools Ecosystem and can be downloaded directly from the JDAT Notebook Github directory.
Introduction: Spectral extraction in the JWST calibration pipeline#
The JWST calibration pipeline performs spectrac extraction for all spectroscopic data using basic default assumptions that are tuned to produce accurately calibrated spectra for the majority of science cases. This default method is a simple fixed-width boxcar extraction, where the spectrum is summed over a number of pixels along the cross-dispersion axis, over the valid wavelength range. An aperture correction is applied at each pixel along the spectrum to account for flux lost from the finite-width aperture.
The extract_1d
step uses the following inputs for its algorithm:
the spectral extraction reference file: this is a json-formatted file, available as a reference file from the JWST CRDS system
the bounding box: the
assign_wcs
step attaches a bounding box definition to the data, which defines the region over which a valid calibration is available. We will demonstrate below how to visualize this region.
However the extract_1d
step has the capability to perform more complex spectral extractions, requiring some manual editing of parameters and re-running of the pipeline step.
Aims#
This notebook will demonstrate how to re-run the spectral extraction step with different settings to illustrate the capabilities of the JWST calibration pipeline.
Assumptions#
We will demonstrate the spectral extraction methods on resampled, calibrated spectral images. The basic demo and two examples run on Level 3 data, in which the nod exposures have been combined into a single spectral image. Two examples will use the Level 2b data - one of the nodded exposures.
Test data#
The data used in this notebook is an observation of the Type Ia supernova SN2021aefx, observed by Jha et al in PID 2072 (Obs 1). These data were taken with zero exclusive access period, and published in Kwok et al 2023. You can retrieve the data from this Box folder, and we recommend you place the files in the data/
folder of this repository, or change the directory settings in the notebook prior to running.
You can of course use your own data instead of the demo data.
JWST pipeline version and CRDS context#
This notebook was written using the calibration pipeline version 1.10.2. We set the CRDS context explicitly to 1089 to match the current latest version in MAST. If you use different pipeline versions or CRDS context, please read the relevant release notes (here for pipeline, here for CRDS) for possibly relevant changes.
Contents#
The Level 3 data products
The spectral extraction reference file
Example 1: Changing the aperture width
Example 2: Changing the aperture location
Example 3: Extraction with background subtraction
Example 4: Tapered column extraction
Import Packages#
astropy.io
fits for accessing FITS filesos
for managing system pathsmatplotlib
for plotting dataurllib
for downloading datatarfile
for unpacking datanumpy
for basic array manipulationjwst
for running JWST pipeline and handling data productsjson
for working with json filescrds
for working with JWST reference files
# Set CRDS variables first
import os
os.environ['CRDS_CONTEXT'] = 'jwst_1089.pmap'
os.environ['CRDS_PATH'] = os.environ['HOME']+'/crds_cache'
os.environ['CRDS_SERVER_URL'] = 'https://jwst-crds.stsci.edu'
print(f'CRDS cache location: {os.environ["CRDS_PATH"]}')
%matplotlib inline
import urllib.request
import tarfile
import numpy as np
import matplotlib.pyplot as plt
import astropy.io.fits as fits
import astropy.units as u
from astropy.modeling import models, fitting
import jwst
from jwst import datamodels
from jwst.extract_1d import Extract1dStep
from matplotlib.patches import Rectangle
import json
import crds
print(f'Using JWST calibration pipeline version {jwst.__version__}')
data_tar_url = 'https://data.science.stsci.edu/redirect/JWST/jwst-data_analysis_tools/MIRI_LRS_notebook/data.tar.gz'
# Download and unpack data if needed
if not os.path.exists("data.tar.gz"):
print("Downloading Data")
urllib.request.urlretrieve(data_tar_url, 'data.tar.gz')
if not os.path.exists("data/"):
print("Unpacking Data")
with tarfile.open('./data.tar.gz', "r:gz") as tar:
tar.extractall(filter='data')
1. The Level 3 Data Products #
Let’s start by plotting the main default Level 3 output products:
the
s2d
file: this is the 2D image built from the co-added resampled individual nod exposures.the
x1d
file: this is the 1-D extracted spectrum, extracted from the Level 3s2d
file.
The s2d
image shows a bright central trace, flanked by two negative traces. These result from the combination of the nod exposures, each of which also contains a positive and negative trace due to being mutually subtracted for background subtraction.
We restrict the short-wavelength end of the x-axis to 5 micron, as our calibration is very poor below this wavelength. The Level 3 spectrum is extracted from the resampled, dither-combined, calibrated exposure.
l3_s2d_file = 'data/jw02072-o001_t010_miri_p750l_s2d_1089.fits'
l3_s2d = datamodels.open(l3_s2d_file)
fig, ax = plt.subplots(figsize=[2, 8])
im2d = ax.imshow(l3_s2d.data, origin='lower', aspect='auto', cmap='gist_gray')
ax.set_xlabel('column')
ax.set_ylabel('row')
ax.set_title('SN2021aefx - Level 3 resampled 2D spectral image')
fig.colorbar(im2d)
fig.show()
l3_file = 'data/jw02072-o001_t010_miri_p750l_x1d_1089.fits'
l3_spec = datamodels.open(l3_file)
fig2, ax2 = plt.subplots(figsize=[12, 4])
ax2.plot(l3_spec.spec[0].spec_table['WAVELENGTH'], l3_spec.spec[0].spec_table['FLUX'])
ax2.set_xlabel('wavelength (um)')
ax2.set_ylabel('flux (Jy)')
ax2.set_title('SN2021aefx - Level 3 spectrum in MAST (pmap 1089)')
ax2.set_xlim(5., 14.)
fig2.show()
The spectral extraction reference file #
The reference file that tells the extract_1d
algorithm what parameters to use is a text file using the json
format that is available in CRDS. The second reference file used in the extraction is the aperture correction; this corrects for the flux lost as a function of wavelength for the extraction aperture size used. You can use the datamodel attributes of the x1d
file to check which extraction reference file was called by the algorithm.
We show below how to examine the file programmatically to see what aperture was used to produce the default Level 3 spectrum shown above. Note: this json file can easily be opened and edited with a simple text editor.
Full documentation of the extract_1d
reference file is available here. We recommend you read this page and any links therein carefully to understand how the parameters in the file are applied to the data.
print(f'Spectral extraction reference file used: {l3_spec.meta.ref_file.extract1d.name}')
file_path = 'data/jw02072-o001_t010_miri_p750l_x1d_1089.fits'
with fits.open(file_path) as hdul:
header = hdul[0].header
json_ref_default = crds.getreferences(header)['extract1d']
with open(json_ref_default) as json_ref:
x1dref_default = json.load(json_ref)
print('Settings for SLIT data: {}'.format(x1dref_default['apertures'][0]))
print(' ')
print('Settings for SLITLESS data: {}'.format(x1dref_default['apertures'][1]))
Let’s look at what’s in this file.
id: identification label, in this case specifying the exposure type the parameters will be applied to.
region_type: optional, if included must be set to ‘target’
disp_axis: defines the direction of dispersion (1 for x-axis, 2 for y-axis). For MIRI LRS this should always be set to 2.
xstart (int): first pixel in the horizontal direction (x-axis; 0-indexed)
xstop (int): last pixel in the horizontal direction (x-axis; 0-indexed; limit is inclusive)
bkg_order:
use_source_posn (True/False): if True, this will use the target coordinates to locate the target in the field, and offset the extraction aperture to this location. We recommend this is set to False.
bkg_order: the polynomial order to be used for background fitting. if the accompanying parameter bkg_coeff is not provided, no background fitting will be performed. For MIRI LRS slit data, default background subtraction is achieved in the Spec2Pipeline, by mutually subtracting nod expsosures.
As for MIRI LRS the dispersion is in the vertical direction (i.e. disp_axis
= 2), the extraction aperture width is specified with the coordinates xstart
and xstop
. If no coordinates ystart
and ystop
are provided, the spectrum will be extracted over the full height of the s2d
cutout region. We can illustrate the default extraction parameters on the Level 3 s2d
file.
xstart = x1dref_default['apertures'][0]['xstart']
xstop = x1dref_default['apertures'][0]['xstop']
ap_height = np.shape(l3_s2d.data)[0]
ap_width = xstop - xstart + 1
x1d_rect = Rectangle(xy=(xstart, 0), width=ap_width, height=ap_height, angle=0., edgecolor='red',
facecolor='None', ls='-', lw=1.5)
fig, ax = plt.subplots(figsize=[2, 8])
im2d = ax.imshow(l3_s2d.data, origin='lower', aspect='auto', cmap='gist_gray')
ax.add_patch(x1d_rect)
ax.set_xlabel('column')
ax.set_ylabel('row')
ax.set_title('SN2021aefx - Level 3 resampled 2D spectral image')
fig.colorbar(im2d)
fig.show()
Example 1: Changing the extraction width #
In this example, we demonstrate how to change the extraction width from the default. Instead of 8 pixels, we’ll extract 12, keeping the aperture centred on the trace.
We will modify the values in the json files in python in this notebook, but the file can also simply be edited in a text editor.
xstart2 = xstart - 2
xstop2 = xstop + 2
print('New xstart, xstop values = {0},{1}'.format(xstart2, xstop2))
with open(json_ref_default) as json_ref:
x1dref_default = json.load(json_ref)
x1dref_ex1 = x1dref_default.copy()
x1dref_ex1['apertures'][0]['xstart'] = xstart2
x1dref_ex1['apertures'][0]['xstop'] = xstop2
with open('x1d_reffile_example1.json', 'w') as jsrefout:
json.dump(x1dref_ex1, jsrefout, indent=4)
ap_width2 = xstop2 - xstart2 + 1
x1d_rect1 = Rectangle(xy=(xstart, 0), width=ap_width, height=ap_height, angle=0., edgecolor='red',
facecolor='None', ls='-', lw=1, label='8-px aperture (default)')
x1d_rect2 = Rectangle(xy=(xstart2, 0), width=ap_width2, height=ap_height, angle=0., edgecolor='cyan',
facecolor='None', ls='-', lw=1, label='12-px aperture')
fig4, ax4 = plt.subplots(figsize=[2, 8])
im2d = ax4.imshow(l3_s2d.data, origin='lower', aspect='auto', cmap='gist_gray')
# ax4.add_collection(aps_collection)
ax4.add_patch(x1d_rect1)
ax4.add_patch(x1d_rect2)
ax4.set_xlabel('column')
ax4.set_ylabel('row')
ax4.set_title('Example 1: Default vs modified extraction aperture')
ax4.legend(loc=3)
fig.colorbar(im2d)
fig.show()
Next we run the spectral extraction step, using this modified reference file. Note: when a step is run individually the file name suffix is different from when we run the Spec3Pipeline in its entirety. The extracted spectrum will now have extract1dstep.fits
in the filename. The custom parameters we pass to the step call:
output_file
: we provide a custom output filename for this example (including an output filename renders thesave_results
parameter obsolete)override_extract1d
: here we provide the name of the custom reference file we created above
We will plot the output against the default extracted product. We expect the spectra to be almost identical; differences can be apparent at the longer wavelengths as our path loss correction is less well calibrated in this low SNR region.
sp3_ex1 = Extract1dStep.call(l3_s2d, output_dir='data/',
output_file='lrs_slit_extract_example1', override_extract1d='x1d_reffile_example1.json')
print(sp3_ex1)
fig5, ax5 = plt.subplots(figsize=[12, 4])
ax5.plot(l3_spec.spec[0].spec_table['WAVELENGTH'], l3_spec.spec[0].spec_table['FLUX'], label='8-px aperture')
ax5.plot(sp3_ex1.spec[0].spec_table['WAVELENGTH'], sp3_ex1.spec[0].spec_table['FLUX'], label='12-px aperture')
ax5.set_xlabel('wavelength (um)')
ax5.set_ylabel('flux (Jy)')
ax5.set_title('Example 1: Difference aperture sizes')
ax5.set_xlim(5., 14.)
ax5.legend()
fig5.show()
Example 2: Changing aperture location#
In this example we will demonstrate spectral extraction at a different location in the slit. A good use case for this is to extract a spectrum from one of the nodded exposures, prior to combination of the nods in the Spec3Pipeline. We will take the s2d
output from the Spec2Pipeline, and extract the spectrum. In the nod 1 exposure we see the spectrum peak is located in column 13 (0-indexed), and we extract a default 8-px fixed-width aperture.
l2_s2d_file = 'data/jw02072001001_06101_00001_mirimage_s2d.fits'
l2_s2d = datamodels.open(l2_s2d_file)
xstart3 = 9
xstop3 = 17
with open(json_ref_default) as json_ref:
x1dref_default = json.load(json_ref)
x1dref_ex2 = x1dref_default.copy()
x1dref_ex2['apertures'][0]['xstart'] = xstart3
x1dref_ex2['apertures'][0]['xstop'] = xstop3
with open('x1d_reffile_example2.json', 'w') as jsrefout:
json.dump(x1dref_ex2, jsrefout, indent=4)
ap_width3 = xstop3 - xstart3 + 1
x1d_rect3 = Rectangle(xy=(xstart3, 0), width=ap_width3, height=ap_height, angle=0., edgecolor='red',
facecolor='None', ls='-', lw=1, label='8-px aperture at nod 1 location')
fig6, ax6 = plt.subplots(figsize=[2, 8])
im2d = ax6.imshow(l2_s2d.data, origin='lower', aspect='auto', cmap='gist_gray')
ax6.add_patch(x1d_rect3)
ax6.set_xlabel('column')
ax6.set_ylabel('row')
ax6.set_title('Example 2: Different aperture location')
ax6.legend(loc=3)
fig6.colorbar(im2d)
fig6.show()
sp2_ex2 = Extract1dStep.call(l2_s2d_file, output_dir='data/', output_file='lrs_slit_extract_example2',
override_extract1d='x1d_reffile_example2.json')
Let’s again plot the output against the default extracted product. We expect this 1-nod spectrum to be noisier but not significantly different from the combined product. The spectrum may have more bad pixels that manifest as spikes or dips in the spectrum.
fig7, ax7 = plt.subplots(figsize=[12, 4])
ax7.plot(l3_spec.spec[0].spec_table['WAVELENGTH'], l3_spec.spec[0].spec_table['FLUX'], label='default location (nods combined)')
ax7.plot(sp2_ex2.spec[0].spec_table['WAVELENGTH'], sp2_ex2.spec[0].spec_table['FLUX'], label='nod 1 location (single nod)')
ax7.set_xlabel('wavelength (um)')
ax7.set_ylabel('flux (Jy)')
ax7.set_title('Example 2: Different aperture locations')
ax7.set_xlim(5., 14.)
ax7.legend()
fig7.show()
Example 3: Extraction with background subtraction#
For LRS slit observations, the default background subtraction strategy is performed in the background
step in the Spec2Pipeline; the 2 nodded exposures are mutually subtracted, resulting in each returning a 2D spectral image with a positive and a negative trace, and the background subtracted.
For non-standard cases or slitless LRS data it is however possible to subtract a background as part of the spectral extraction in extract_1d
. In the extract_1d
reference file we can pass specific parameters for the background:
bkg_coeff (list or list of floats): the regions to be used as background. This is the main parameter required for background subtraction
bkg_fit (string): the type or method of the background computation. (e.g. None, ‘poly’, ‘mean’ or ‘median’)
bkg_order (int): the order of polynomial to fit to background regions. if bkg_fit is not set to ‘poly’, this parameter will be ignored.
smoothing_length (odd int; optional): the width of the boxcar filter that will be used to smooth the background signal in the dispersion direction. This can provide a better quality in case of noisy data.
The ‘poly’ option for the bkg_fit
parameter will take the value of all pixels in the background region on a given row, and fit a polynomial of order bkg_order
to them. This option can be useful in cases where a gradient is present in the background.
The data we’re using here already has the background subtracted so we expect the impact of this to be minimal, but we provide a demonstration using the nod 1, level 2b spectral image. In this example we will calculate the background from 2 4-column windows, setting the bkg_fit
to ‘median’.
rows = [140, 200, 325]
fig8, ax8 = plt.subplots(figsize=[8, 4])
ncols = np.shape(l2_s2d.data)[1]
pltx = np.arange(ncols)
for rr in rows:
label = 'row {}'.format(rr)
ax8.plot(pltx, l2_s2d.data[rr, :], label=label)
ax8.axvline(x=1, ymin=0, ymax=1, ls='--', lw=1., color='coral', label='background regions')
ax8.axvline(x=5, ymin=0, ymax=1, ls='--', lw=1., color='coral')
ax8.axvline(x=39, ymin=0, ymax=1, ls='--', lw=1., color='coral')
ax8.axvline(x=43, ymin=0, ymax=1, ls='--', lw=1., color='coral')
ax8.legend()
fig8.show()
with open(json_ref_default) as json_ref:
x1dref_default = json.load(json_ref)
x1dref_ex3 = x1dref_default.copy()
x1dref_ex3['apertures'][0]['xstart'] = xstart3
x1dref_ex3['apertures'][0]['xstop'] = xstop3
x1dref_ex3['apertures'][0]['bkg_coeff'] = [[0.5], [4.5], [38.5], [43.5]]
x1dref_ex3['apertures'][0]['bkg_fit'] = 'median'
with open('x1d_reffile_example3.json', 'w') as jsrefout:
json.dump(x1dref_ex3, jsrefout, indent=4)
sp2_ex3 = Extract1dStep.call(l2_s2d_file, output_dir='data/', output_file='lrs_slit_extract_example3',
override_extract1d='x1d_reffile_example3.json')
When the extract_1d
step performs a background subtraction, the background spectrum is part of the output product, so you can check what was subtracted. In the plot below we can see that, as expected, the background for this particular exposure is near-zero (apart from the noisy long-wavelength end), as the subtraction was already performed.
fig9, ax9 = plt.subplots(nrows=2, ncols=1, figsize=[12, 4])
# ax9.plot(l3_spec.spec[0].spec_table['WAVELENGTH'], l3_spec.spec[0].spec_table['FLUX'], label='default location (nods combined)')
ax9[0].plot(sp2_ex2.spec[0].spec_table['WAVELENGTH'], sp2_ex2.spec[0].spec_table['FLUX'], label='nod 1 spectrum - no bkg sub')
ax9[0].plot(sp2_ex3.spec[0].spec_table['WAVELENGTH'], sp2_ex3.spec[0].spec_table['FLUX'], label='nod 1 spectrum - with bkg sub')
ax9[1].plot(sp2_ex3.spec[0].spec_table['WAVELENGTH'], sp2_ex3.spec[0].spec_table['BACKGROUND'], label='background')
ax9[1].set_xlabel('wavelength (um)')
ax9[0].set_ylabel('flux (Jy)')
ax9[0].set_title('Example 3: Extraction with background subtraction')
ax9[0].set_xlim(5., 14.)
ax9[1].set_xlim(5., 14.)
ax9[0].legend()
ax9[1].legend()
fig9.show()
Example 4: Tapered column extraction#
In this example we will use the JWST calibration pipeline to perform a spectral extraction in a tapered column aperture. The way to specify this in the extraction reference file is to use the src_soeff
parameter instead of the simpler xstart
, xstop
settings. The src_coeff
parameter can take polynomial coefficients rather than fixed pixel values. In this example, we will define a tapered column aperture corresponding to 3 * the FWHM of the spatial profile.
Polynomial definitions for the extraction aperture can be specified as a function of pixels or wavelength, which is defined in the independent_var
parameter.
We will use pre-measured FWHM values as a function of wavelength to fit a straight line to the FWHM(\(\lambda\)) profile, and set the extraction parameters according to this fit. The FWHM can of course also be measured directly from the data as well.
def calc_xap_fit():
# these are values measured from commissioning data. FWHM is in arcsec.
lam = [5.0, 7.5, 10.0, 12.0]
fwhm = [0.29, 0.3, 0.36, 0.42]
# convert from arcsec to pixel using MIRI pixel scaling of 0.11 arcsec/px
fwhm_px = fwhm / (0.11*u.arcsec/u.pixel)
# we want to extract 3 * fwhm, which means 1.5 * fwhm on either side of the trace
xap_pix = 1.5 * fwhm_px
# now we want to fit a line to these points
line_init = models.Linear1D()
fit = fitting.LinearLSQFitter()
fitted_line = fit(line_init, lam, xap_pix.value)
print(fitted_line)
fig, ax = plt.subplots(figsize=[8, 4])
xplt = np.linspace(4.0, 14., num=50)
ax.plot(lam, xap_pix.value, 'rx', label='1.5 * FWHM(px)')
ax.plot(xplt, fitted_line(xplt), 'b-', label='best-fit line')
ax.set_xlabel('wavelength')
ax.set_ylabel('px')
ax.legend()
return fitted_line
poly_pos = calc_xap_fit()
print(poly_pos.slope, poly_pos.intercept)
The above polynomial defines the relationship between wavelength and the number of pixels to extract. To ensure that the extractio location is centred on the location of the spectrum, we add to the intercept value the central location of the trace, which is at column 30.5.
In the next cell, we provide these parameters to the src_coeff
parameter in the extraction reference file. Note: The src_coeff
parameter takes precedence over the xstart
and xstop
parameters if all 3 are present; for clarity we remove the latter from our reference file.
trace_cen = 30.5
with open(json_ref_default) as json_ref:
x1dref_default = json.load(json_ref)
x1dref_ex4 = x1dref_default.copy()
x1dref_ex4['apertures'][0]['xstart'] = None
x1dref_ex4['apertures'][0]['xstop'] = None
x1dref_ex4['apertures'][0]['independent_var'] = 'wavelength'
x1dref_ex4['apertures'][0]['src_coeff'] = [[-1*poly_pos.intercept.value + trace_cen, -1*poly_pos.slope.value], [poly_pos.intercept.value + trace_cen, poly_pos.slope.value]]
with open('x1d_reffile_example4.json', 'w') as jsrefout:
json.dump(x1dref_ex4, jsrefout, indent=4)
sp3_ex4 = Extract1dStep.call(l3_s2d, output_dir='data/',
output_file='lrs_slit_extract_example4', override_extract1d='x1d_reffile_example4.json')
fig10, ax10 = plt.subplots(figsize=[12, 4])
ax10.plot(l3_spec.spec[0].spec_table['WAVELENGTH'], l3_spec.spec[0].spec_table['FLUX'], label='default fixed-width aperture')
ax10.plot(sp3_ex4.spec[0].spec_table['WAVELENGTH'], sp3_ex4.spec[0].spec_table['FLUX'], label='tapered column aperture')
ax10.set_xlabel('wavelength (um)')
ax10.set_ylabel('flux (Jy)')
ax10.set_title('Example 4: Tapered column vs. fixed-width extraction aperture')
ax10.set_xlim(5., 14.)
ax10.legend()
fig10.show()
The output spectrum also contains a reference to the number of pixels used for the extraction as a function of wavelength. Let’s visualize that too.
fig11, ax11 = plt.subplots(figsize=[12, 4])
ax11.plot(l3_spec.spec[0].spec_table['WAVELENGTH'], l3_spec.spec[0].spec_table['NPIXELS'], label='default fixed-width aperture')
ax11.plot(sp3_ex4.spec[0].spec_table['WAVELENGTH'], sp3_ex4.spec[0].spec_table['NPIXELS'], label='tapered column aperture')
ax11.set_xlabel('wavelength (um)')
ax11.set_ylabel('number of pixels')
ax11.set_title('Example 4: Number of pixels extracted')
ax11.set_xlim(5., 14.)
ax11.legend()
fig11.show()
Summary#
We hope this notebook was useful in helping you understand the capabilities of the JWST calibration for spectral extraction. The above examples are not an exhaustive list of all the possibilities: different methods of source and background extraction can be combined for more complex extraction operations.
If you have any questions, comments, or requests for further demos of these capabilities, please contact the JWST Helpdesk.