Optimizing Image Alignment for Multiple HST Visits#

This notebook currently fails to execute, use as reference only

Note: The notebook in this repository 'Initialization.ipynb' goes over many of the basic concepts such as the setup of the environment/package installation and should be read first if you are new to HST images, DrizzlePac, or Astroquery.

Introduction#

This notebook demonstrates how to use TweakReg and AstroDrizzle tasks to align and combine images. While this example focuses on ACS/WFC data, the procedure can be almost identically applied to WFC3/UVIS images. This notebook is based on a prior example from the 2012 DrizzlePac Handbook, but has been updated for compatibility with the STScI AstroConda software distribution. There is a good deal of explanatory text in this notebook, and users new to DrizzlePac are encouraged to start with this tutorial. Additional DrizzlePac software documentation is available at the readthedocs webpage.

Before running the notebook, you will need to install the astroquery package, which is used to retrieve the data from the MAST archive and the ccdproc package which is used to query the image headers.

Summary of steps:

  1. Download the data from MAST using astroquery.

  2. Align the images to a common reference frame, and update the WCS using TweakReg.

  3. Combine the aligned images using AstroDrizzle.

import glob
import os
import shutil

from astropy.io import ascii
from astropy.io import fits
from astropy.table import Table
from astropy.visualization import ZScaleInterval
from astroquery.mast import Observations
from drizzlepac import tweakreg
from drizzlepac import astrodrizzle
from IPython.display import Image
import matplotlib.pyplot as plt

%matplotlib inline

1. Download the Data#

In this example, we align three F606W full-frame 339 second ACS/WFC images of the globular cluster NGC 104 from ACS/CAL Program 10737. These observations were acquired over a 3-month period in separate visits and at different orientations. We will use calibrated, CTE-corrected *_flc.fits files, which are available for both WFC3/UVIS and ACS/WFC detectors.

        j9irw3fwq_flc.fits
        j9irw4b1q_flc.fits
        j9irw5kaq_flc.fits

First, we use astroquery to look for the desired datasets and retrieve them from MAST.

You may query on a large number of parameters, but to obtain these specific datasets we will only need to pass in a few: obstype = 'all' to include both calibration and GO programs in the search, obs_collection = 'HST' to exclude Hubble Legacy Archive images, and finally obs_id to search for a list of the dataset IDs. We finally select only ‘FLC’ data products using the parameter productSubGroupDescription while downloading the files. Because the second ‘FLC’ file for visit W4 is part of a dithered association, the obs_id for that assocation will need to be used instead of the individual filename.

Note: The next cell may take awhile to complete or may need to be run more than once, as errors may occasionally arise from a failed server connection.

# Query the data
obsTable = Observations.query_criteria(obstype='all', 
    obs_collection='HST',
    obs_id=['j9irw3fwq', 'j9irw4040', 'j9irw5kaq'])

# Download the files
products = Observations.get_product_list(obsTable)
Observations.download_products(products,download_dir='',
                               mrp_only=False,
                               productSubGroupDescription='FLC')

# Move to working directory (not necessary, but outputs will all be in the same place if this is done)
input_flcs = glob.glob(os.path.join('mastDownload', 'HST', '*', '*flc.fits'))
print(input_flcs)
for flc in input_flcs:
    shutil.copy(flc, os.path.basename(flc))
shutil.rmtree('mastDownload') #remove mast download dir now that we've moved the files

#Remove the second file in the W4 association to simplify this notebook example and define the input file list.
os.remove('j9irw4b3q_flc.fits')
input_flcs = glob.glob('*flc.fits')
print(input_flcs)

The files were downloaded into the subdirectory ‘mastDownload’ within the current working directory and then moved to the current working directory.

Let’s take a look at the file headers to show the orientation differences between frames, as reflected by the ‘PA_V3’ keyword:

for f in glob.glob('*flc.fits'):
    print('filename:', fits.getval(f, 'rootname'), 'PA_V3:', fits.getval(f, 'PA_V3'))

2. Align the Images with TweakReg#

In order to correctly align and combine multiple images into a distortion-free final drizzled product, AstroDrizzle relies on the World Coordinate System (WCS) information stored in the header of each image. Because the WCS information for an image is tied to the positions of guide stars used for the observation, each image will have small pointing errors offsets due to the uncertainty in guide star positions. The TweakReg software can align a set of images to better than subpixel accuracy (<0.1 pixel). All input images are aligned to one chosen reference image’s WCS. The reference image is typically selected as the first image in the input list, but can be chosen explicitly via the refimage parameter, if desired.

The TweakReg algorithm consists of a few steps:

1. Makes a catalog of source positions for each input (and reference) image using a source finding algorithm        similar to DAOFind. 
2. Converts the pixel position catalogs to sky position catalogs.
3. Finds common source positions between reference and input image catalogs.
4. Calculates the shifts, rotation, and scale needed to align sky positions of sources in the input images.
5. (Optionally) Updates the input image headers with the newly calculated WCS information.

TweakReg has numerous parameters that are listed in the documentation. For this first pass, we will run TweakReg with default parameters and show how it fails for this dataset. In subsequent attempts, we will change critical parameters and show how this improves the alignment. The 5 parameters listed below are commented out when left at their default values and uncommented when set to non-defaults.

'conv_width' : 2x FWHM (~3.5 pix for point sources in ACS/WFC or WFC3/UVIS)
'threshold'  : Signal-to-noise above background. Start large and reduce until number of objects is acceptable.
'peakmax'    : Source brightness cutoff, set to avoid saturated sources. 
'searchrad'  : Search radius for finding common sources between images.
'fitgeometry': Geometry used to fit offsets, rotations and/or scale changes from the matched object lists. 
               The default `fitgeometry` of 'rscale' allows for an x/y shift, scale and rotation, and the 
               'general' fit allows for an xy/shift and an independent scale and rotation for each axis.

Note that both TweakReg and AstroDrizzle will by default load in settings from any existing .cfg files if you’ve run them before. Default parameters are restored by setting configobj = None.

2a. First test: Use ‘default’ parameters.#

NOTE: DONT ACTUALLY RUN THE CELLS IN THIS EXAMPLE, THEY USE TOO MUCH MEMORY!#

The most basic, out-of-the-box call to TweakReg looks like this. You TweakReg on your list of input files.

# tweakreg.TweakReg(input_flcs)

Normally, you’ll want to do some customization though. There are a variety of parameters that can be set, see the documenation for all possibilities. Here we will set a few that are helpful. Again, because we aren’t setting any souce finding parameters in showing the basic call to TweakReg, this will likely crash your computer so don’t proceed with caution if you choose to uncomment and run the cell below.

   > 'interactive' is False by default. This switch controls whether the program stops and waits for the user to examine any generated plots before continuing on to the next image. If turned off, plots will still be displayed, but they will also be saved to disk automatically as a PNG image with an autogenerated name without requiring any user input.

   > `shiftfile` is False by default, here we have set it to true to produce an intermediate output product of a file containing the shifts.
   
   > 'outshifts' is the name of the output file created when `shiftfile` is set to True.
   
   > 'updatehdr' is False by default, and specifies whether or not to update the headers of each input image directly with the shifts that were determined. This will allow the input images to be combined by AstroDrizzle without having to provide the shiftfile as well.

There are many more parameters that can be set, but the example below shows how to run TweakReg and override some of these defaults.

# running tweakreg with some parameters changed from their defaults.
# tweakreg.TweakReg(input_flcs,
#                     interactive=True,
#                     shiftfile=True, 
#                     outshifts='shift_default.txt',              
#                     updatehdr=True)

2b. Second test: Setting parameters for source finding.#

One of the crucial steps for aligning images is creating a catalog of common sources in each image to do the alignment. TweakReg uses a set of default parameters to do this source finding, but you will likely want to override these and tailor them to your dataset. As described above, using the defaults on this dataset of a cluster will simply produce too many sources and use too much memory for the average user’s machine. In this example, we will adjust some of these to parameters.

In the above example, none of the source finding parameters were adjusted. TweakReg, by default, looks for sources which are 4-sigma above the background, but this threshold value is too low, since the catalogs contain ~60,000 objects per chip in each frame, and ~6500 matched sources between each exposure. Setting the threshold to a very low value generally does not translate to a better solution, since all sources are weighted equally when computing fits to match catalogs. This is especially relevant for ACS/WFC and WFC3/UVIS data where CTE tails can shift the centroid position slightly along the readout direction for faint sources and potentially bias the fit.

In this test, the threshold value is adjusted to a larger value, and the peakmax value is set to 70,000 electrons to exclude full-well saturated objects for gain=2. TweakReg now finds a more reasonable number of sources per ACS chip (~500).

Note that there are other source finding paratmeters that can be adjusted and that you may need to adjust. For example, you may need to also adjust searchrad up if your images have large offsets.

tweakreg.TweakReg(input_flcs, 
                    threshold=4000, 
                    peakmax=70000,
                    configobj = None, 
                    interactive=False,
                    shiftfile=True,
                    outshifts='shift_thresh.txt',              
                    updatehdr=False)

2c. Inspecting the quality of the fit#

In your work, you will likely have to play around with TweakReg parameters to find the optimal set for your data. This requires inspecting the quality of the fit after each run. If you scroll to the bottom of the last cell run, you can see some diagnostic plots there including a residual and vector plot that show the quality of fit. Here, we will inspect some of the outputs that are saved.

The shift file below shows that the xrms and yrms of the computed fit are decent, around ~0.04 pixels.

shift_tab = Table.read('shift_thresh.txt',
                       format='ascii.no_header',
                       names=['file','dx','dy','rot','scale','xrms','yrms'])

formats = ['.2f', '.2f', '.3f', '.5f', '.2f', '.2f']
for i, col in enumerate(shift_tab.colnames[1:]):
    shift_tab[col].format = formats[i]
shift_tab

When interactive = False, TweakReg outputs several diagnostic plots in the working directory for you to inspect the fit quality. For each input file, except the reference file, TweakReg creates an x/y plot of fit residuals and a vector plot. Let’s take a look at these.

Image(filename='residuals_j9irw5kaq_flc.png', width=500, height=600) 
Image(filename='vector_j9irw4b1q_flc.png', width=500, height=600) 

The residual plots show a decent RMS of fit around 0.04 pixels, but there is still a systematic skew in the fit residuals, which is believed to be caused by an uncorrected ‘skew’ in the ACS distortion solution.

As of creation of this notebook in 2018, the following calibration reference files were used:

    IDCTAB  = 'jref$0461802ej_idc.fits' / Image Distortion Correction Table
D2IMFILE = 'jref$02c1450oj_d2i.fits' / Column Correction Reference File
    NPOLFILE = 'jref$02c14514j_npl.fits' / Non-polynomial Offsets Reference File (F606W)

New distortion solutions, based on the latest Gaia DR2, are in the process of being derived by the ACS team, so observations retrieved from MAST at a later date may or may not show these systematic skew residuals. Fortunately, these can be corrected by allowing for a higher order fitgeometry as shown in the next test.

More information about the distortion reference files used by AstroDrizzle may be found here.

2d. Fourth test: Adjust the fitgeometry#

In this test, the fitgeometry parameter is changed from the default ‘rscale’ to ‘general’ in order to allows for an xy/shift and an independent scale and rotation for each axis.

tweakreg.TweakReg(input_flcs, 
    threshold=4000, 
    searchrad=4.0,
    peakmax=70000,
    fitgeometry='general',
    configobj = None, 
    interactive=False,
    shiftfile=True, 
    outshifts='shift_general.txt',              
    updatehdr=False)
shift_tab=Table.read('shift_general.txt',
                     format='ascii.no_header',
                     names=['file','dx','dy','rot','scale','xrms','yrms'])

formats = ['.2f', '.2f', '.3f', '.5f', '.2f', '.2f']
for i, col in enumerate(shift_tab.colnames[1:]):
    shift_tab[col].format = formats[i]
shift_tab

While the shift file looks nearly identical to the prior run, this is because the axis-dependent rotation and scale values are averaged together in the shiftfile. To determine the actual values, it is necessary to inspect the logfile.

j9irw5kaq  SCALE_X: 1.000008792  SCALE_Y: 1.00000757  ROT_X: 0.002818152  ROT_Y: 359.998708  SKEW: 359.9958898
j9irw4b1q  SCALE_X: 0.999986877  SCALE_Y: 1.00003072  ROT_X: 0.004877188  ROT_Y: 0.00405990  SKEW:  -0.0008173   

The ‘general’ fit reduces the fit rms to ~0.03 pixels from the prior value of ~0.04 pixels and removes systematics from the residuals.

#Give the 'fit residual plots' a unique name for comparison with other tests.

os.rename('residuals_j9irw4b1q_flc.png', 'residuals_j9irw4b1q_flc_general.png')
Image(filename='residuals_j9irw4b1q_flc_general.png', width=500, height=600) 

The plot above shows residuals in X and Y plotted as functions of X and Y. Each point represents a source that was used for the alignment. The residuals should look fairly random - if any correlation is seen, this is an indicator of a poor alignment solution. In some cases you will notice a wavy pattern in the residuals in X and Y. This is often seen for UVIS images and is a result of lithographic patterns of the detector that are not fully corrected for in the distortion solutions. The RMS in X and Y are also printed. For a good alignment, we are looking for an RMS on the order of 0.1 pixel or less.

#Give the 'vector residual plots' a unique name for comparison with other tests.

os.rename('vector_j9irw4b1q_flc.png', 'vector_j9irw4b1q_flc_general.png')
Image(filename='vector_j9irw4b1q_flc_general.png', width=500, height=600) 

The direction and the magnitude of the arrows in the vector plot above represent the offsets in the source position between the image in question and the reference image. These should visually appear random if the alignment was sucessful.

You may need to run TweakReg run several times with varying parameters until a good fit is found. Notice in the above tests that updatehdr = False. This allows you to attempt the alignment and inspect the results without actually updating the WCS to solidify the alignment. Once you are satisfied with the results, TweakReg is run a final time with updatehdr = True, and a new WCS will be inserted in the file. The default name of this new WCS is ‘TWEAK’, but can be changed by setting the wcsname.

2e. Update header once optimal parameters are found#

Once you’ve decided on the optimal set of parameters for your alignment, it is safe to update the header of the input data.

# Final run with ideal parameters, updatehdr = True

tweakreg.TweakReg(input_flcs, 
    threshold=4000, 
    peakmax=70000,
    fitgeometry='general',
    configobj = None, 
    interactive=False,
    shiftfile=False, 
    updatehdr=True)

2f. Overplot Matched Sources on the Image#

Let’s plot the sources that were matched between all images on the bottom chip. Confusingly, this is referred to as ‘chip 2’ or SCI,1 or extension 1.

The cell below shows how to read in the *_fit.match file as an astropy table. Unfortunatley, it doesn’t automatically name columns so you’ll have to look at the header to grab the columns.

plt.figure(figsize = (20, 10))
chip1_data = fits.open('j9irw4b1q_flc.fits')['SCI', 1].data
zscale = ZScaleInterval()
z1, z2 = zscale.get_limits(chip1_data)
plt.imshow(chip1_data, cmap='Greys',origin='lower', vmin=z1, vmax=z2)

match_tab = ascii.read('j9irw4b1q_flc_catalog_fit.match')            #load match file in astropy table
match_tab_chip2 = match_tab[match_tab['col15'] == 1]                 #filter table for sources on chip 2 (on ext 1)
x_cord, y_cord = match_tab_chip2['col11'], match_tab_chip2['col12']

plt.scatter(x_cord, y_cord, s=30, edgecolor='r', facecolor='None',label='Matched Sources, Chip 2')
plt.ylim(0,2051)
plt.xlim(0,4096)
plt.legend(loc='best', fontsize=20)

3. Combine the Images using AstroDrizzle#

Now that the images are aligned to a common WCS, they are ready for combination with AstroDrizzle. All of the input exposures taken in a single filter will contribute to a single drizzled output file.

The AstroDrizzle steps after alignment are summarized below.

1. A static pixel mask is created to flag bad detector pixels.
2. Sky subtraction is performed on masked images. 
3. Each image is individually drizzled, with geometric distortion corrections, to a common reference frame.
4. The distortion-free drizzled images are combined to create a median image.
5. The median image is blotted, or reverse-drizzled, back to the frame of each input image.
6. By comparing each input image with its counterpart blotted median image, the software locates bad pixels in
   each of the original frames & creates bad pixel masks (typically cosmic rays and bad pixels in the detector)
7. In the final step, input images are drizzled together onto a single output image.

Let’s first run AstroDrizzle on the aligned input images in the next cell to create a combined image f606w_combined_drc, then go into further detail about the drizzle process.

Note that astrodrizzle supports the TEAL GUI interface for setting parameters, as well as loading in a custom configuration file (.cfg) files, but we will be using the command-line syntax interface to set the parameters in this example, where parameters are passed into the function directly. Any existing .cfg file will be overridden by setting configobj = None so that unless explicitly set, parameters will be reset to default.

astrodrizzle.AstroDrizzle(input_flcs,
    output='f606w_combined',
    preserve=False,
    driz_sep_bits='64,16',
    driz_cr_corr=True,
    final_bits='64,16',
    clean=False,
    configobj=None,
    build=True)

You will see the following files (among others) output by AstroDrizzle in the working directory.

  1. ‘f606w_combined_drc.fits’ : Multi-extension FITS file.

    When build = True, the drizzled science image, context image, weight image, and median image will be combined into a single multi-extension fits file. This file contains the following extensions:

     Final drizzled science image is contained in ['SCI', 1].
     Final drizzled weight  image is contained in ['WHT', 1].
     Final drizzled context image is contained in ['CTX', 1].
    

    When build = False, these will be output as separate files (_sci.fits, _wht.fits, _ctx.fits).

    When final_wht_type = EXP (default), the weight image it is effectively an exposure time map of each pixel in the final drizzled image. Other options are error ‘ERR’ and inverse variance ‘IVM’ weighting, as described in more detail here.

    The context image is a map showing which images contribute to the final drizzled stack. Each input image chip is identified by a bit in a 32-bit integer. For example, image1/chip1 = 2^0 = 1, image1/chip2 = 2^1 = 2. Each context image pixel is an additive combination of these bits, depending on which images contributed to the corresponding pixel in the drizzled image.

  2. astrodrizzle.log : Log file containing details of the drizzle processing steps.

  3. f606w_combined_med.fits : Median image computed from the sky-subtracted, separately-drizzled input images.

  4. j*_crclean.fits : Cosmic-ray cleaned versions of the original input flc images.

Because we set clean = False, there will be various other intermediate output files in the directory, including masks, blotted frames, etc. This behavior can be modified with the clean and in_memory parameters.

Inspect the final drizzled image#

plt.figure(figsize = (10, 10))
drc_dat = fits.open('f606w_combined_drc.fits')['SCI', 1].data #final drizzled image in SCI,1 extension
z1, z2 = zscale.get_limits(drc_dat)
plt.imshow(drc_dat, origin='lower', vmin=z1, vmax=z2, cmap='Greys')
plt.title('F606W drizzled image', fontsize=30)

Inspect the final weight image#

plt.figure(figsize = (10, 10))
drc_dat = fits.open('f606w_combined_drc.fits')['WHT', 1].data #final drizzled image in WHT,1 extension
z1, z2 = zscale.get_limits(drc_dat)
plt.imshow(drc_dat, origin='lower', vmin=z1, vmax=z2, cmap='Greys')
plt.title('F606W weight image', fontsize=30)

Discussion of AstroDrizzle#

In this example, we imported and ran the AstroDrizzle task on our input images, which were previously aligned with TweakReg. By setting configobj to None, we ensured that AstroDrizzle was not picking up any existing configuration files and parameters were restored to default values. We then set a select few parameters to non-default values:

  1. output = 'f606w_combined' : Output file name root (which will be appended with various suffixes). Defaults to input file name.

  2. driz_sep_bits = '64,16' : Data quality flags in the flt.fits file, which were set during calibration, can be used as bit mask when drizzling. The user may specify which bit values should actually be considered “good” and included in image combination. In astrodrizzle, this parameter may be given as the sum of those DQ flags or as a comma-separated list, as shown in this example. In this example, 64 and 16 are set, so that both warm pixels and stable hot pixels, which are corrected by the dark, are treated as valid input pixels.

  3. driz_cr_corr = True : When set to True, the task will create both a cosmic ray mask image (suffix crmask.fits) and a clean version of the original input images (suffix crclean.fits), where flagged pixels are replaced by pixels from the blotted median. It is strongly recommended that the quality of the cosmic ray masks be verified by blinking the original flt.fits input image with both the cosmic ray-cleaned image (crclean.fits) and the cosmic-ray mask (crmask.fits).

  4. final_bits = '64,16' : Similar to driz_sep_bits, but for the last step when all the input images are combined.

  5. clean = False : intermediate output files (e.g masks) will be kept. If True, only main outputs will be kept. Also see in_memory to control this behavior.

  6. configobj = None : ignore any TEAL inputs / config files and refresh all parameters to default values.

AstroDrizzle has a large number of parameters, which are described here. Running AstroDrizzle using default parameter values is not recommended, as these defaults may not provide optimal science products. Users should also inspect the quality of the sky subtraction and cosmic ray rejection. For dithered data, users may experiment with the output final_scale and final_pixfrac parameters in the final_drizzle step. For more details, see the notebook in this repository to ‘Optimize Image Sampling’.

About this Notebook#

Author: C. Shanahan, STScI WFC3 Team  
Updated: Jan. 20, 2022