Masking Persistence in WFC3/IR Images#


Learning Goals#

This notebook shows how to use the Hubble Space Telescope WFC3/IR persistence model to flag pixels affected by persistence in the calibrated (FLT) science images. When the images are sufficiently dithered to step over the observed persistence artifacts, AstroDrizzle may be used to exclude those flagged pixels when combining the FLT frames.

By the end of this tutorial, you will:

  • Download images and persistence products from MAST.

  • Flag affected pixels in the data quality arrays of the FLT images.

  • Redrizzle the FLT images to produce a “clean” DRZ combined product.

Table of Contents#

Introduction
1. Imports
2. Data

  • 2.1 Download the WFC3/IR observations from MAST

  • 2.2 Download the persistence model products

3. Analysis

  • 3.1 Display the images

  • 3.2 Use the persistence model to add DQ flags

  • 3.3 Redrizzle the FLT data and apply the new DQ flags

  • 3.4 Compare the original and corrected DRZ files

  • 3.5 Compare the original and corrected WHT files

4. Conclusions
Additional Resources
About this Notebook
Citations

Introduction #

Image persistence in the IR array occurs whenever a pixel is exposed to light that fills it beyond about half of its full well. Persistence can appear within a single visit, as the dithered exposures step across the detector. It can also carry over from observations of a completely different field in a previous visit.

Image persistence is seen in a small but non-negligible fraction of WFC3/IR exposures. Its properties are discussed in Section 5.7.9 of the WFC3 Instrument Handbook and in Chapter 8 of the WFC3 Data Handbook. Persistence is primarily a function of the degree to which a pixel is filled (in electrons) and the time since this occurred. Additional information is available from the WFC3 Persistence Webpage.

As described in Section 8.3 of the Data Handbook, there are two possible ways to mitigate persistence: (1) exclude the affected pixels from the analysis, or (2) subtract the persistence model directly from the image.

This notebook illustrates the first method and shows how to use the model to flag affected pixels in the data quality (DQ) array of each FLT image. When the images are sufficiently dithered, affected regions of the detector may be replaced with ‘good’ pixels from other exposures in the visit when combining the exposures with AstroDrizzle. Note that this reduces the effective exposure time in those regions of the combined image.

In the second method, the persistence-corrected FLT frames, downloaded in Section 2.2 of this notebook, may be used directly for analysis. Alternatively, a scaled version of the persistence model may be subtracted from each FLT image until an adequate correction is achieved. In this case, flagging the affected pixels in the DQ arrays would not be required.

1. Imports #

This notebook assumes you have installed the required libraries as described here.

We import:

  • glob for finding lists of files

  • os for handling file paths and removing files

  • shutil for managing directories

  • urllib for obtaining the Persistence products from MAST

  • matplotlib.pyplot for plotting data

  • astropy.io fits for accessing FITS files

  • astroquery for downloading data from MAST

  • ccdproc for querying keyword values in the FITS headers

  • drizzlepac astrodrizzle for combining images

import glob 
import os 
import shutil
import urllib.request

import matplotlib.pyplot as plt

from astropy.io import fits
from astroquery.mast import Observations
from ccdproc import ImageFileCollection

from drizzlepac import astrodrizzle

2. Data #

2.1 Download the WFC3/IR observations from MAST #

Here, we obtain WFC3/IR observations from the Grism Lens-Amplified Survey from Space (GLASS) program 13459, Visit 29 in the F140W filter. These exposures were impacted by persistence from grism G102 exposures obtained just prior to these.

The following commands query the Mikulski Archive for Space Telescopes (MAST) and then download the FLT and DRZ data products to the current directory.

data_list = Observations.query_criteria(obs_id="ica529*", filters="F140W")

Observations.download_products(
    data_list["obsid"],
    mrp_only=False,
    download_dir="./data",
    productSubGroupDescription=["FLT", "DRZ"],
)

science_files = glob.glob("data/mastDownload/HST/*/*fits")

for im in science_files:
    filename = os.path.basename(im)
    new_path = os.path.join(".", filename)
    os.rename(im, new_path)

data_directory = "./data"

try:
    if os.path.isdir(data_directory):
        shutil.rmtree(data_directory)
except Exception as e:
    print(f"An error occurred while deleting the directory {data_directory}: {e}")

This Visit contains 4 consecutive dithered FLT exposures in the F140W filter, which are obtained in a single orbit. The following commands print the values of keywords describing those data, where the POSTARG* values represent the commanded x-axis and y-axis offsets in arcseconds.

image_collection = ImageFileCollection(
    "./",
    keywords=[
        "asn_id",
        "targname",
        "filter",
        "samp_seq",
        "nsamp",
        "exptime",
        "postarg1",
        "postarg2",
        "date-obs",
        "time-obs",
    ],
    glob_include="ica529*flt.fits",
    ext=0,
)

try:
    summary_table = image_collection.summary
    if summary_table:
        print(summary_table)
    else:
        print("No FITS files matched the pattern or no relevant data found.")
except Exception as e:
    print(f"An error occurred while creating the summary table: {e}")
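
To judge whether the dithers are large enough to step over a persistence artifact, it can help to express the POSTARG offsets in pixels. The following is a minimal sketch, assuming an approximate WFC3/IR pixel scale of 0.13 arcsec/pixel and POSTARG axes roughly aligned with the detector axes; the helper name `postarg_to_pixels` and the sample offsets are hypothetical and are not taken from this visit.

```python
# Approximate WFC3/IR pixel scale in arcsec/pixel (assumed nominal value).
IR_PIXEL_SCALE = 0.13


def postarg_to_pixels(postarg1, postarg2, pixel_scale=IR_PIXEL_SCALE):
    """Convert commanded POSTARG offsets (arcsec) to approximate pixel shifts."""
    return postarg1 / pixel_scale, postarg2 / pixel_scale


# Hypothetical offsets, not the values from this visit.
dx, dy = postarg_to_pixels(1.3, -0.65)
print(f"shift = ({dx:.1f}, {dy:.1f}) pixels")
```

Comparing these shifts with the width of the affected region gives a rough idea of whether other exposures can fill in the flagged pixels.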

2.2 Download the persistence model products #

To find the URL of the tar file containing the persistence fits files, visit the PERSIST Search Form and search for dataset=’ica529*’.

You may hover over the link in the Visit column to get the URL to the gzipped tar file for Visit 29. This URL is called in the Python code below. The persistence model data products we will use to create a mask are named ‘rootname_persist.fits’ and contain any contributions from either external or internal persistence.

External persistence is defined as residual signal that is generated by an earlier visit, and internal persistence as that generated within the same visit as the image in question. External persistence typically comes from a prior scheduled WFC3/IR program and is not within the control of the observer. Internal persistence can be mitigated by the observer by dithering the exposures within a given visit.

This cell may take several minutes to complete, as it will download persistence products for all images in the visit, and not just those for the F140W filter.

url = "https://archive.stsci.edu/pub/wfc3_persist/13459/Visit29/13459.Visit29.tar.gz"
filename = "13459.Visit29.tar.gz"

try:
    with urllib.request.urlopen(url) as response:
        with open(filename, "wb") as out_file:
            out_file.write(response.read())
    print("Extracting files....")

    # Extract files:
    !tar -zxvf {filename}

    # Remove the files after extraction
    os.remove(filename)
except Exception as e:
    print(f"An error occurred: {e}")
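
The `!tar` line relies on IPython shell syntax, so it only works inside a notebook. As an alternative, the extraction can be sketched with the standard-library `tarfile` module. The helper name `extract_archive` is hypothetical, and the demo builds a tiny stand-in archive in a temporary directory rather than contacting MAST.

```python
import os
import tarfile
import tempfile


def extract_archive(archive_path, dest_dir="."):
    """Extract a gzipped tar archive into dest_dir and return the member names."""
    with tarfile.open(archive_path, mode="r:gz") as tar:
        names = tar.getnames()
        tar.extractall(path=dest_dir)
    return names


# Demo on a small stand-in archive (not the real persistence products).
workdir = tempfile.mkdtemp()
member = os.path.join(workdir, "example_persist.txt")
with open(member, "w") as f:
    f.write("placeholder")

archive = os.path.join(workdir, "example.tar.gz")
with tarfile.open(archive, mode="w:gz") as tar:
    tar.add(member, arcname="example_visit/example_persist.txt")

outdir = os.path.join(workdir, "out")
os.makedirs(outdir)
extracted = extract_archive(archive, outdir)
print(extracted)
```

In this notebook the call would be `extract_archive(filename)` after the download completes.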

This creates a subdirectory named 13459.Visit29/ in the working directory with the following products for each calibrated ‘flt.fits’ image:

  • persist.fits: The persistence model, including both internal and external persistence

  • extper.fits: The persistence model, including only external persistence

  • flt_corr.fits: The corrected FLT image, equal to the difference between the image and the model (‘flt.fits’ - ‘persist.fits’).
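
Since persist.fits holds both components and extper.fits holds only the external one, the internal contribution can be estimated as their difference. The sketch below uses small synthetic arrays as stand-ins for the SCI extensions of the two files; the values are illustrative only.

```python
import numpy as np

# Synthetic stand-ins for the SCI arrays of persist.fits and extper.fits (e-/s).
persist = np.array([[0.000, 0.012], [0.004, 0.030]])
extper = np.array([[0.000, 0.010], [0.000, 0.005]])

# Internal persistence is whatever the total model holds beyond the external part.
internal = persist - extper

# Fraction of the total modeled signal contributed by earlier visits.
external_fraction = extper.sum() / persist.sum()
print(internal)
print(f"external fraction = {external_fraction:.2f}")
```

A large external fraction suggests that dithering within the visit alone could not have avoided the contamination.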

Note that limitations in the accuracy of the model can result in corrected FLT images in which persistence is not properly removed. This is especially true when prior observations were obtained in scanning mode, where the model significantly underestimates the level of persistence. In this case the model may be iteratively scaled and subtracted from the FLT frame until the residual signal is fully removed.
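
The scaled subtraction described above can be sketched as follows. This is an illustration on synthetic data, not the procedure used to produce flt_corr.fits. The helper `subtract_scaled_model`, the trial scale factors, and the residual metric (median of affected pixels minus median of unaffected pixels) are all assumptions made for the example.

```python
import numpy as np


def subtract_scaled_model(flt, persist, scale=1.0):
    """Return flt - scale * persist without modifying the inputs."""
    return flt - scale * persist


# Synthetic data: a flat 1.0 e-/s background plus persistence that the
# model underestimates by a factor of two (an assumed, illustrative case).
persist = np.zeros((4, 4))
persist[1, :] = 0.02
flt = 1.0 + 2.0 * persist

affected = persist > 0.005
for scale in (1.0, 1.5, 2.0):
    corrected = subtract_scaled_model(flt, persist, scale)
    # Residual: affected pixels relative to the unaffected background
    residual = np.median(corrected[affected]) - np.median(corrected[~affected])
    print(f"scale={scale:.1f}  residual={residual:.3f} e-/s")
```

On real data the background is not flat, so the residual should be judged on source-free pixels and by visual inspection.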

Alternatively, this notebook shows how the user can flag the impacted pixels in the FLT data quality array and then reprocess with AstroDrizzle to improve the combined image.

3. Analysis #

3.1 Display the images #

The drizzled (DRZ) product combines the 4 individual FLT exposures and shows faint residual persistence from grism observations obtained just prior to these data. An example of the ‘Persistence Removal Evaluation’ for the first dataset ica529rmq may be found here.

The DRZ pipeline product is shown on the left, and we can see that the persistence was partially, but not completely, removed by the cosmic-ray rejection functionality in AstroDrizzle. The persistence model for a single FLT dataset (ica529rsq) is shown on the right. A pair of red boxes is overplotted on each image to highlight regions of the detector with the most visible grism persistence in the combined DRZ image.

drz = fits.getdata('ica529030_drz.fits', ext=1)
per1 = fits.getdata('13459.Visit29/ica529rsq_persist.fits', ext=1)

fig = plt.figure(figsize=(20, 8))
ax1 = fig.add_subplot(1, 2, 1)
ax2 = fig.add_subplot(1, 2, 2)

ax1.imshow(drz, vmin=0.85, vmax=1.4, cmap='Greys_r', origin='lower')
ax2.imshow(per1, vmin=0.0, vmax=0.005, cmap='Greys_r', origin='lower')

ax1.set_title('ica529030_drz.fits (Drizzled Product = 4 FLTs)', fontsize=20)
ax2.set_title('ica529rsq_persist.fits (Model for a single FLT)', fontsize=20)

ax1.plot([0, 280, 280, 0, 0], [250, 250, 350, 350, 250], c='red')
ax2.plot([0, 280, 280, 0, 0], [250, 250, 350, 350, 250], c='red')
ax1.plot([200, 480, 480, 200, 200], [360, 360, 460, 460, 360], c='red')
ax2.plot([200, 480, 480, 200, 200], [360, 360, 460, 460, 360], c='red')
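
The same two boxes are drawn in several cells of this notebook, so the corner lists can be generated by a small helper instead of being typed out each time. The function name `box_outline` is hypothetical; the coordinates are the ones used above.

```python
def box_outline(x0, x1, y0, y1):
    """Return closed-polygon x and y lists for a rectangle, ready for ax.plot."""
    xs = [x0, x1, x1, x0, x0]
    ys = [y0, y0, y1, y1, y0]
    return xs, ys


# The two regions highlighted in this notebook (pixel coordinates).
regions = [(0, 280, 250, 350), (200, 480, 360, 460)]
outlines = [box_outline(*region) for region in regions]
print(outlines[0])
```

Each outline can then be drawn on both panels with a loop such as `for ax in (ax1, ax2): ax.plot(xs, ys, c='red')`.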

3.2 Use the persistence model to add DQ flags #

For any pixels in the model with a signal greater than 0.005 e-/s, we add a flag value of 16384 to the DQ array of each FLT frame. This threshold is flexible and should be determined empirically by the user based on the science objective and the fraction of pixels impacted. Note that the IR dark rate is 0.049 e-/s, so a threshold of 0.05 e-/s or 0.01 e-/s may be a more reasonable starting value in order to avoid flagging more pixels than can be filled in with the associated dithered FLT frames.
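
One way to choose the threshold empirically is to check what fraction of the detector each candidate value would flag. The sketch below does this on a synthetic stand-in for a persistence model; the helper `flagged_fraction`, the noise level, and the band amplitude are assumptions made for the example.

```python
import numpy as np


def flagged_fraction(persist, threshold):
    """Fraction of pixels whose modeled persistence exceeds threshold (e-/s)."""
    return np.count_nonzero(persist > threshold) / persist.size


# Synthetic stand-in for a persistence model: mostly zero, with a bright band.
rng = np.random.default_rng(0)
persist = np.abs(rng.normal(0.0, 0.002, size=(100, 100)))
persist[40:50, :] += 0.03

for threshold in (0.005, 0.01, 0.02):
    frac = flagged_fraction(persist, threshold)
    print(f"threshold={threshold:.3f} e-/s  flagged={100 * frac:.1f}%")
```

With real data, `persist` would be the array returned by `fits.getdata` for a persist.fits file.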

# Rootnames of the four F140W FLT exposures in Visit 29
rootnames = ['ica529rmq', 'ica529rsq', 'ica529s0q', 'ica529s6q']

for root in rootnames:
    per = fits.getdata(f'13459.Visit29/{root}_persist.fits', ext=1)

    # Open in update mode so the modified DQ array is written back on close
    with fits.open(f'{root}_flt.fits', mode='update') as flt:
        # Extension 3 is the DQ array. Bitwise OR sets the flag without
        # altering pixels that already carry it (e.g. if this cell is re-run).
        flt[3].data[per > 0.005] |= 16384
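
Because DQ values are bit flags, it matters whether a flag is combined with bitwise OR or with addition when the same file is updated more than once. The sketch below applies the persistence flag twice to a small synthetic DQ array, assuming 16-bit integer storage; the arrays are stand-ins, not real FLT data.

```python
import numpy as np

PERSIST_FLAG = 16384  # DQ value used for persistence in this notebook

# Synthetic stand-ins: a small int16 DQ array and a persistence model (e-/s).
dq = np.array([[0, 512], [16, 0]], dtype=np.int16)
persist = np.array([[0.010, 0.001], [0.020, 0.000]])
mask = persist > 0.005

# Apply the flag twice to mimic re-running the cell on an already-updated file.
for _ in range(2):
    dq[mask] |= PERSIST_FLAG

print(dq)
```

With `+=` in place of `|=`, the second pass would add 16384 again, giving a value that no longer represents the persistence bit and that exceeds the range of a signed 16-bit integer.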

3.3 Redrizzle the FLT data and apply the new DQ flags #

Now, we recombine the FLT data with AstroDrizzle using the updated DQ arrays and compare the result with the pipeline DRZ data products. The following cell uses the default parameter values recommended for the IR detector, where final_bits tells AstroDrizzle which DQ flags to ignore (i.e., to treat as good data). All other flagged (non-zero) values will be treated as bad pixels and excluded from the combined image. Because 16384 is not listed in final_bits, the pixels flagged for persistence are excluded.

astrodrizzle.AstroDrizzle('ica529*flt.fits', output='ica529030_pcorr',
                          preserve=False, build=True, clean=True, 
                          skymethod='match', sky_bits='16',
                          driz_sep_bits='512,16', combine_type='median',
                          driz_cr_snr='5.0 4.0', driz_cr_scale='3.0 2.4', 
                          final_bits='512,16', num_cores=1)
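
The effect of a bit list such as final_bits='512,16' can be illustrated with a few lines of NumPy. This is a sketch of the concept, not drizzlepac's internal implementation; the helper `good_pixel_mask` and the sample DQ values are hypothetical.

```python
import numpy as np


def good_pixel_mask(dq, ignore_bits):
    """True where a pixel has no DQ flags set other than those in ignore_bits."""
    ignore = 0
    for bit in ignore_bits:
        ignore |= bit  # combine the ignorable flags into one bitmask
    return (dq & ~ignore) == 0


# Synthetic DQ values: clean, 16, 512, persistence flag, and 512 + persistence.
dq = np.array([0, 16, 512, 16384, 512 + 16384], dtype=np.int16)
good = good_pixel_mask(dq, ignore_bits=(512, 16))
print(good)
```

Pixels carrying only the ignored flags remain usable, while any pixel with the 16384 flag set is rejected regardless of its other flags.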

3.4 Compare the original and corrected DRZ files #

Here, we display the drizzled image from the pipeline and the reprocessed drizzled image with persistence masked.

drz = fits.getdata('ica529030_drz.fits', ext=1)
drz_corr = fits.getdata('ica529030_pcorr_drz.fits', ext=1)

fig = plt.figure(figsize=(20, 8))
ax1 = fig.add_subplot(1, 2, 1)
ax2 = fig.add_subplot(1, 2, 2)

ax1.imshow(drz, vmin=0.85, vmax=1.4, cmap='Greys_r', origin='lower')
ax2.imshow(drz_corr, vmin=0.85, vmax=1.4, cmap='Greys_r', origin='lower')

ax1.set_title('ica529030_drz.fits (Pipeline SCI)', fontsize=20)
ax2.set_title('ica529030_pcorr_drz.fits (Corrected SCI)', fontsize=20)

ax1.plot([0, 280, 280, 0, 0], [250, 250, 350, 350, 250], c='red')
ax2.plot([0, 280, 280, 0, 0], [250, 250, 350, 350, 250], c='red')
ax1.plot([200, 480, 480, 200, 200], [360, 360, 460, 460, 360], c='red')
ax2.plot([200, 480, 480, 200, 200], [360, 360, 460, 460, 360], c='red')

3.5 Compare the original and corrected WHT files #

When final_wht_type='EXP', the drizzled weight (WHT) images provide an effective exposure time map of the combined array.

In the plots below, the pipeline product has lower weight in the region impacted by the grism persistence, but only a single frame was flagged and rejected there by the cosmic-ray algorithm. The grey horizontal bars have a weight of ~500 seconds, compared to the total exposure of ~700 seconds.

In the redrizzled ‘corrected’ WHT image, only a single exposure contributed to the darkest horizontal bars, which have a value of ~200 seconds. The adjacent grey bars have a weight of ~500 seconds, and the rest of the WHT image is ~700 seconds. Users will need to experiment with the persistence masking thresholds and consider the size of the dithers in their individual datasets to determine the best masking strategy.

wht = fits.getdata('ica529030_drz.fits', ext=2)
wht_corr = fits.getdata('ica529030_pcorr_drz.fits', ext=2)

fig = plt.figure(figsize=(20, 8))
ax1 = fig.add_subplot(1, 2, 1)
ax2 = fig.add_subplot(1, 2, 2)

ax1.imshow(wht, vmin=0, vmax=800, cmap='Greys_r', origin='lower')
ax2.imshow(wht_corr, vmin=0, vmax=800, cmap='Greys_r', origin='lower')

ax1.set_title('ica529030_drz.fits (Pipeline WHT)', fontsize=20)
ax2.set_title('ica529030_pcorr_drz.fits (Corrected WHT)', fontsize=20)

ax1.plot([0, 280, 280, 0, 0], [250, 250, 350, 350, 250], c='red')
ax2.plot([0, 280, 280, 0, 0], [250, 250, 350, 350, 250], c='red')
ax1.plot([200, 480, 480, 200, 200], [360, 360, 460, 460, 360], c='red')
ax2.plot([200, 480, 480, 200, 200], [360, 360, 460, 460, 360], c='red')
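
The way masked pixels lower an exposure-time weight map can be sketched with a simple sum over frames. This ignores the resampling that AstroDrizzle performs and assumes the per-frame masks are already aligned in the output frame. The per-exposure time of 175 s and the mask layout are illustrative assumptions, chosen so that the totals resemble the ~700, ~500, and ~200 second levels described above.

```python
import numpy as np

EXPTIME = 175.0  # assumed per-exposure time in seconds (illustrative only)

# One boolean "bad pixel" mask per exposure, already in the output frame.
n_frames, shape = 4, (8, 8)
bad = np.zeros((n_frames,) + shape, dtype=bool)
bad[0, 2:4, :] = True  # frame 0 flags rows 2 and 3
bad[1, 3, :] = True    # frames 1 and 2 also flag row 3
bad[2, 3, :] = True

# Effective exposure time: sum EXPTIME over frames where the pixel is good.
wht = (~bad).sum(axis=0) * EXPTIME
print(wht[:, 0])
```

Rows flagged in one frame keep three exposures, rows flagged in three frames keep only one, and unflagged rows retain the full exposure time.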

4. Conclusions #

Thank you for walking through this notebook. When working with WFC3 data, you should now be familiar with:

  • Examining the persistence models for a given dataset.

  • Defining a threshold to use for masking pixels in the DQ array of FLT science frames.

  • Reprocessing dithered frames with the new DQ flags to produce an improved combined DRZ image.

Congratulations, you have completed the notebook.

Additional Resources #

Below are some additional resources that may be helpful. Please send any questions through the HST Helpdesk.

About this Notebook #

Author: Jennifer Mack, WFC3 Instrument Team

Created On: 2021-09-28

Updated On: 2023-11-06

Source: The notebook is sourced from hst_notebooks/notebooks/WFC3/persistence.

Citations #

If you use astropy, astroquery, ccdproc, or drizzlepac for published research, please cite the authors. Follow these links for more information about citing the libraries below:

