Sky Matching for HST Mosaics #


This notebook currently fails to execute, use as reference only

This notebook assumes you have created and activated a virtual environment using the requirements file in this notebook's repository. Please make sure you have read the contents of the README file before continuing the notebook. Note that the GIF file "sky_matching_comparison.gif" is one of the downloads needed for this notebook:

Learning Goals:#

In this tutorial we explore different options for handling the background sky with AstroDrizzle.
By the end of this notebook you will:
    • Download data with astroquery
    • Align data with TweakReg
    • Compare background sky options using the AstroDrizzle parameter skymethod

Table of Contents#

Introduction

1. Download the Observations with astroquery
    1.1 Check image header data
    1.2 Inspect the alignment
2. Align the visit-level drizzled data with TweakReg
    2.1 Create a catalog of Gaia DR3 sources
    2.2 Create a catalog of Gaia DR3 sources with Proper Motion Data
    2.3 Run Tweakreg
    2.4 Inspect the shift file and fit quality
    2.5 Overplot matched sources and inspect fit residuals
    2.6 Rerun TweakReg and update the header WCS
    2.7 Run TweakBack to propogate the WCS to the FLT files
3. Compare skymethod options in AstroDrizzle
    3.1 skymethod = 'localmin'
    3.2 skymethod = 'match'
    3.3 skymethod = 'globalmin+match'
    3.4 skymethod = 'globalmin'
4. Compare the MDRIZSKY values for each method
5. Display the ‘sky matched’ science mosaic and weight image
6. Conclusion

Additional Resources
About this notebook
Citations

Introduction #

When creating an image mosaic, AstroDrizzle has the ability to compute the sky and then either subtract or equalize the background in input images. Users may select the algorithm used for the sky subtraction via the skymethod parameter.

There are four methods available in sky matching: localmin, match, globalmin, and globalmin+match.

By applying drizzlepac.sky.sky(), or using the skymethod parameter in the call to drizzlepac.astrodrizzle.AstroDrizzle(), AstroDrizzle will update the keyword MDRIZSKY in the headers of the input files but it will not change the science data.

For images of sparse fields with few astronomical sources, the default skymethod = 'localmin' may be used, although this method can slightly oversubtract the background. For images with complicated backgrounds, such as nebulae and large host galaxies, skymethod = 'match' is recommended.

For more information on the specifics of this function, please refer to the documentation here

Below, each of the four methods is demonstrated using a single example dataset, and differences between the methods is highlighted.

# All imports needed through out this notebook are included at the beginning. 
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

from collections import defaultdict
from IPython.display import clear_output 
import glob
import os
import shutil 
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import pandas

from astropy.coordinates import SkyCoord
from astropy.io import ascii, fits
from astropy.table import Table
from astropy.units import Quantity
import astropy.units as u
from astroquery.gaia import Gaia
from astroquery.mast import Observations
from drizzlepac import astrodrizzle, tweakback, tweakreg 
from drizzlepac.haputils.astrometric_utils import create_astrometric_catalog


Gaia.MAIN_GAIA_TABLE = 'gaiadr3.gaia_source'   # Change if different data release is desired
Gaia.ROW_LIMIT = 100000

1. Download the Observations with astroquery #

MAST queries may be done using query_criteria, where we specify:

    –> obs_id, proposal_id, and filters

MAST data products may be downloaded by using download_products, where we specify:

    –> products = calibrated (FLT, FLC) or drizzled (DRZ, DRC) files

    –> type = standard products (CALxxx) or advanced products (HAP-SVM)


WFC3/IR observations of the Horsehead Nebula in the F160W filter obtained in HST proposal program 12812 will be used for this demonstration.

Nine visits were acquired in a 3x3 mosaic pattern on the sky, with two dither positions per visit in two IR filters. High level science products for these datasets were delivered to MAST in 2013, and this notebook is based on that user tutorial but has been updated to align these data to Gaia.

The 18 FLT images ibxl5*_flt.fits have been processed by the HST WFC3 pipeline (calwf3), which includes bias subtraction, dark current correction, cosmic-ray rejection, and flatfielding. The 9 DRZ files ibxl5*_drz.fits have been processed with AstroDrizzle to remove distortion and to combine the 2 dithered FLT frames by filter for each vist.

THIS IS A LARGE DOWNLOAD (~400 MB). Depending on your connection speed, the next cell may take a few minutes to execute.
obs_ids = ['ibxl5*']
props = ['12812']
filts = ['F160W']

obsTable = Observations.query_criteria(obs_id=obs_ids, proposal_id=props, filters=filts)
products = Observations.get_product_list(obsTable)

data_prod = ['FLT', 'DRZ']     # ['FLC','FLT','DRC','DRZ']                  
data_type = ['CALWF3']         # ['CALACS','CALWF3','CALWP2','HAP-SVM']    

Observations.download_products(products, download_dir='./science',
                               productSubGroupDescription=data_prod, 
                               project=data_type)

Move the files to the local working directory

files = glob.glob(os.path.join(os.curdir, 'science', 'mastDownload', 'HST', '*', '*fits'))
for im in files:
    root = os.path.basename(im)
    os.rename(im, './' + root)
    
if os.path.exists('./science'):
    shutil.rmtree('science/')

1.1 Check image header data #

Here we will look at important keywords in the image headers.

files = sorted(glob.glob('*fl?.fits'))
keywords_ext0 = ["ROOTNAME", "ASN_ID", "TARGNAME", "DETECTOR", "FILTER", "EXPTIME", 
                 "RA_TARG", "DEC_TARG", "POSTARG1", "POSTARG2", "DATE-OBS"]
keywords_ext1 = ["ORIENTAT"]
data = []

for file in files:
    path_data = []
    for keyword in keywords_ext0:
        path_data.append(fits.getval(file, keyword, ext=0))
    for keyword in keywords_ext1:
        path_data.append(fits.getval(file, keyword, ext=1))
    data.append(path_data)
    
keywords = keywords_ext0 + keywords_ext1
table = Table(np.array(data), names=keywords, dtype=['str', 'str', 'str', 'str', 'str', 'f8', 'f8', 'f8', 'f8', 'f8', 'str', 'f8'])
table['EXPTIME'].format = '7.1f' 
table['RA_TARG'].format = table['DEC_TARG'].format = '7.4f'
table['POSTARG1'].format = table['POSTARG2'].format = '7.3f' 
table['ORIENTAT'].format = '7.2f'
table.show_in_notebook()

1.2 Inspect the Alignment #

Check the active WCS solution in the image header. If the image is aligned to a catalog, list the number of matches and the fit RMS in mas.
Convert the fit RMS values to pixels for comparison with the alignment results performed later in this notebook.

ext_0_kws = ['DETECTOR']
ext_1_kws = ['WCSNAME', 'NMATCHES', 'RMS_RA', 'RMS_DEC']

det_scale = {'IR': 0.1283, 'UVIS': 0.0396, 'WFC': 0.05}                  # plate scale (arcsec/pixel)

format_dict = {}
col_dict = defaultdict(list)

for f in sorted(glob.glob('*dr?.fits')):
    col_dict['FILENAME'].append(f)
    hdr0 = fits.getheader(f, 0)
    hdr1 = fits.getheader(f, 1)
    
    for kw in ext_0_kws:                                                # extension 0 keywords
        col_dict[kw].append(hdr0[kw])
    for kw in ext_1_kws:                                                # extension 1 keywords
        if 'RMS' in kw:
            val = np.around(hdr1[kw], 1)
        else: 
            val = hdr1[kw]
        col_dict[kw].append(val)
        
    for kw in ['RMS_RA', 'RMS_DEC']:
        val = np.round(hdr1[kw]/1000./det_scale[hdr0['DETECTOR']], 2)  # convert RMS from mas to pixels
        col_dict[f'{kw}_pix'].append(val)

wcstable = Table(col_dict)
wcstable.show_in_notebook()
Note that there are different WCS solutions for each visit, with Gaia eDR3 as the reference catalog for all but two which were fit to GSC v2.4.2 and which have a much larger fit rms values (>0.5 pixels). Since the WCS solutions are inconsistent for this target, we wish to realign the data to use a common reference catalog.

2. Align the visit-level drizzled data with TweakReg #

Here we will use TweakReg to align the DRZ files to Gaia DR3 and then use TweakBack to propagate those solutions back to the FLT image headers prior to combining with AstroDrizzle.

2.1 Create a catalog of Gaia DR3 sources #

This method uses the RA/Dec of the first image and a radius of 5’.

RA = table['RA_TARG'][0]
Dec = table['DEC_TARG'][0]

coord = SkyCoord(ra=RA, dec=Dec, unit=(u.deg, u.deg))
radius = Quantity(5., u.arcmin)

gaia_query = Gaia.query_object_async(coordinate=coord, radius=radius)
gaia_query
reduced_query = gaia_query['ra', 'dec', 'phot_g_mean_mag']
reduced_query.write('gaia_no_pm.cat', format='ascii.commented_header', overwrite=True)
reduced_query

2.2 Create a catalog of Gaia DR3 sources with Proper Motion Data #

This method uses the image FLT footprints and gives 161 sources, compared to 183 with the prior method.

pm_cat = create_astrometric_catalog(sorted(glob.glob('*flt.fits')))
pm_cat.write('gaia_pm.cat', overwrite=True, format='ascii.no_header')
len(pm_cat)

2.3 Run TweakReg #

Next we run TweakReg on the visit-level drizzled (DRZ) images and align to the Gaia catalog with proper motion data.

Because the fit RMS values for the MAST products were large for some visits, we allow for a larger than usual search radius of 1”. We also set the conv_width value slightly higher than the recommended value of 2.5 for WFC3/IR data in order to use barely resolved sources for alignment.

refcat = 'gaia_pm.cat'                    # Use the catalog with proper motion data

drz_files = sorted(glob.glob('*drz.fits'))

tweakreg.TweakReg(drz_files, 
                  imagefindcfg={'threshold': 4, 'conv_width': 4.5}, 
                  minobj=3,
                  shiftfile=True, 
                  outshifts='shift_drz.txt',
                  refcat=refcat,
                  searchrad=1,
                  ylimit=0.4, 
                  nclip=1,
                  updatehdr=False,       # change later when you verify the alignment works
                  interactive=False)
# clear_output()
# If the alignment is unsuccessful, stop the notebook
with open('shift_drz.txt', 'r') as shift:
    for line_number, line in enumerate(shift, start=1):
        if "nan" in line:
            raise ValueError(f'nan found in line {line_number} in shift file')

2.4 Inspect the shift file and fit quality #

# Read the shift file just created by tweakreg
# There are 7 columns including: filename, x-shift, y-shift, rotation, scale, x-RMS, and y-RMS
shift_table = Table.read('shift_drz.txt',
                         format='ascii.no_header', 
                         names=['file', 'dx', 'dy', 'rot', 'scale', 'xrms', 'yrms'])

# Define the format for each column (excluding 'file').
formats = ['.2f', '.2f', '.3f', '.5f', '.2f', '.2f']

# Iterate over the columns 'dx', 'dy', 'rot', 'scale', 'xrms', 'yrms'
for i, col in enumerate(shift_table.colnames[1:]):
    # Apply the format to the current column 
    shift_table[col].format = formats[i]
    
# Display the table in the notebook
shift_table

Note that there are large residual rotation and scale terms in the shift file for several visits in the MAST data products. These will be corrected when we run TweakReg an additional time and update the WCS in Section 2.6 below.

match_files = sorted(glob.glob('*_drz_catalog_fit.match'))
for f in match_files:
    input = ascii.read(f)
    print(f'Number of matches for {f} {len(input)}')

Next we compare with the MAST WCS solutions, number of matches and fit RMS values. Since there are only 5 Gaia sources in Visits 53 and 56, these did not have a successful Gaia fit during MAST processing and instead were aligned to the next catalog in the priority list, GSC v2.4.2. (Currently the minimum number of matches for a successful fit is 6, but this will updated to 10 in summer 2024.

wcstable

2.5 Overplot matched sources and inspect fit residuals #

Here we overplot the HST sources which were successfully matched with Gaia eDR3 and we look at the astrometric fit residual PNG plots.

While we inspect only two visits (52 and 53) at a time, and additional visits may be uncommented in the cell below.

# Define the rootnames for the two FITS files to be compared
# Uncomment the pairs you want to use

# rootname_A = 'ibxl50030'
# rootname_B = 'ibxl51030'
rootname_A = 'ibxl52030'
rootname_B = 'ibxl53030'
# rootname_A = 'ibxl54030'
# rootname_B = 'ibxl55030'
# rootname_A = 'ibxl56030'
# rootname_B = 'ibxl57030'
# rootname_A = 'ibxl57030'
# rootname_B = 'ibxl58030'

# Create subplots with 1 row and 2 columns
fig, [ax1, ax2] = plt.subplots(1, 2, figsize=(13, 10))

# Open the FITS files and extract the data from the 'SCI' extension
data_a = fits.open(rootname_A+'_drz.fits')['SCI', 1].data
data_b = fits.open(rootname_B+'_drz.fits')['SCI', 1].data

# Display the images in grayscale with a stretch from 0 to 2 
ax1.imshow(data_a, cmap='Greys', origin='lower', vmin=0, vmax=2)
ax2.imshow(data_b, cmap='Greys', origin='lower', vmin=0, vmax=2)

# Read the matching catalog files for both rootnames
match_tab_a = ascii.read(rootname_A+'_drz_catalog_fit.match')
match_tab_b = ascii.read(rootname_B+'_drz_catalog_fit.match')

# Extract x and y coordinates from the matching catalogs
x_coord_a, y_coord_a = match_tab_a['col11'], match_tab_a['col12']
x_coord_b, y_coord_b = match_tab_b['col11'], match_tab_b['col12']

# Plot the coordinates on the images with red circles
ax1.scatter(x_coord_a, y_coord_a, s=30, edgecolor='r', facecolor='None')
ax2.scatter(x_coord_b, y_coord_b, s=30, edgecolor='r', facecolor='None')

# Set the titles for the subplots, including the number of matches
ax1.set_title(rootname_A+f'  N = {len(match_tab_a)}  Gaia Matches', fontsize=20)
ax2.set_title(rootname_B+f'  N = {len(match_tab_b)}  Gaia Matches', fontsize=20)
fig.tight_layout()

# Load, display, and inspect fit residual PNG files
img_A = mpimg.imread(f'residuals_{rootname_A}_drz.png')
img_B = mpimg.imread(f'residuals_{rootname_B}_drz.png')

# Create subplots for the residual images
fig, axs = plt.subplots(1, 2, figsize=(20, 10), dpi=200)
axs[0].imshow(img_A)
axs[1].imshow(img_B)

# Remove unnecessary axis from the residual images
axs[0].axis('off') 
axs[1].axis('off')

# Adjust layout to minimize margins and display plots
fig.tight_layout()
plt.show()

2.6 Rerun TweakReg and update the header WCS #

Once we are happy with the alignment, we run TweakReg again with the same parameters but with the updatehdr=True.

refcat = 'gaia_pm.cat'    # Use the catalog with proper motion data

tweakreg.TweakReg(drz_files, 
                  imagefindcfg={'threshold': 4, 'conv_width': 4.5}, 
                  minobj=4,
                  shiftfile=False, 
                  refcat=refcat,
                  searchrad=1,
                  ylimit=0.4, 
                  nclip=1,
                  updatehdr=True,       # update header
                  interactive=False)
clear_output()

2.7 Run TweakBack to propogate the WCS to the FLT files #

Finally, we run Tweakback on the aligned DRZ files to propogate the updated WCS information back to the FLT files.

for f in drz_files:
    tb_input = f+'[sci,1]'
    with fits.open(f) as hdu:
        tweakback.apply_tweak(tb_input, orig_wcs_name=hdu[1].header['WCSNAME'])
    
clear_output()

3. Compare skymethod options in AstroDrizzle #

Now that the FLT files contain the updated WCS solutions, we explore different algorithms for estimating the sky.

3.1 skymethod = 'localmin' #

When using AstroDrizzle to compute the sky in each frame, ‘localmin’ will compute a common sky value for all chips of a given exposure, using the minimum sky value from all chips. This process is repeated for each input image. This algorithm is recommended when images are dominated by blank sky instead of extended, diffuse sources.

See readthedocs for more details on sky subtraction options.

In the command below, the aligned FLT frames are sky subtracted and drizzled together.

Because the WFC3/IR data products are already cleaned of cosmic rays during calwf3 processing, cosmic-ray rejection is turned off in AstroDrizzle by setting the parameters driz_separate, median, blot, and driz_cr to ‘False’. Note that final_bits=16 means only the stable hot pixels are treated as good.

sky = 'localmin'
astrodrizzle.AstroDrizzle('*flt.fits', 
                          output='f160w_localmin',
                          preserve=False, 
                          context=False,
                          skymethod=sky, 
                          driz_separate=False, median=False, blot=False, driz_cr=False, # CR-rej turned off
                          final_bits='16',
                          final_wcs=True, 
                          final_rot=257.)
clear_output()

3.2 skymethod = 'match' #

When skymethod is set to ‘match’, differences in sky values between images in common sky regions will be computed. Thus, sky values will be relative (delta) to the sky computed in one of the input images whose sky value will be set to and reported as 0. This setting “equalizes” sky values between the images in large mosaics.

This is the recommended setting for images containing diffuse sources (e.g., galaxies, nebulae) covering significant parts of the image.

sky = 'match'
astrodrizzle.AstroDrizzle('*flt.fits', 
                          output='f160w_match',
                          preserve=False, 
                          context=False,
                          skymethod=sky, 
                          driz_separate=False, median=False, blot=False, driz_cr=False, # CRREJ=None
                          final_bits='16',
                          final_wcs=True, 
                          final_rot=257.)
clear_output()

3.3 skymethod = 'globalmin+match' #

When skymethod is set to ‘globalmin+match’, AstroDrizzle will first find a minimum “global” sky value in all input images and then use the ‘match’ method to equalize sky values between images.

sky = 'globalmin+match'
astrodrizzle.AstroDrizzle('*flt.fits', 
                          output='f160w_globalmin_match',
                          preserve=False, 
                          context=False,
                          skymethod=sky, 
                          driz_separate=False, median=False, blot=False, driz_cr=False, # CRREJ=None
                          final_bits='16',
                          final_wcs=True, 
                          final_rot=257.)
clear_output()

3.4 skymethod = 'globalmin' #

When skymethod is set to ‘globalmin’, a common sky value will be computed for all exposures. AstroDrizzle will compute sky values for each chip/image extension, find the minimum sky value from all the exposures, and then subtract that minimum sky value from all chips in all images.

This method may be useful when input images already have matched background values.

sky = 'globalmin'
astrodrizzle.AstroDrizzle('*flt.fits', 
                          output='f160w_globalmin',
                          preserve=False, 
                          context=False,
                          skymethod=sky, 
                          driz_separate=False, median=False, blot=False, driz_cr=False, # CRREJ=None
                          final_bits='16',
                          final_wcs=True, 
                          final_rot=257.)
clear_output()

4. Compare the MDRIZSKY values for each method #

Below we provide a gif comparing the upper portion of the final drizzled image. We cycle through three drizzled images produced using different skymethod algorithms:

The top of the horsehead nebula is shown in grayscale with a stretch of about -0.4 to 2 and the animated gif transitions between displaying the "localmin", "match", and "globalmin+match" sky methods from astrodrizzle. Localmin looks the worst with unmatched backgrounds in the top left and right corners where the outline of separate pointings (dithers) are apparent.

Next we print the sky values computed for each image using the four different methods.

mdrizsky_val = pandas.DataFrame({'rootname': fits.getdata('f160w_globalmin_drz_sci.fits', 1)['rootname'],
                                 'local': fits.getdata('f160w_localmin_drz_sci.fits', 1)['mdrizsky'],
                                 'globalmin': fits.getdata('f160w_globalmin_drz_sci.fits', 1)['mdrizsky'],
                                 'globalmin_match': fits.getdata('f160w_globalmin_match_drz_sci.fits', 1)['mdrizsky'],
                                 'match': fits.getdata('f160w_match_drz_sci.fits', 1)['mdrizsky']})
mdrizsky_val

These computed sky values can be visualized in the plot below. To reiterate, the MDRIZSKY keyword reports the value subtracted from each FLC image prior to drizzling, and not the sky level itself. Thus the values for skymethod='match' are close to zero.

We also note that varying background levels across the individual tiles result in inaccurate sky background determination when skymethod='localmin' and thus a mismatched sky in the final mosaic.

index = mdrizsky_val.index.tolist()
globalmin = list(mdrizsky_val['globalmin'])
globalmin_match = list(mdrizsky_val['globalmin_match'])
match = list(mdrizsky_val['match'])
local = list(mdrizsky_val['local'])

# Plotting code: 
fig = plt.figure(figsize=[7, 7])
plt.scatter(index, globalmin_match, color='magenta', label='Globalmin + Match')
plt.scatter(index, match, color='navy', label='Match')
plt.scatter(index, local, color='olive', label='Local')
plt.scatter(index, globalmin, color='orange', label='Globalmin')
plt.xlabel('Individual Images')
plt.ylabel('MDRIZSKY Value')
plt.legend(loc='best')
plt.xticks(index)
plt.tight_layout()
plt.show()

5. Display the ‘sky matched’ science mosaic and weight image #

Finally, we display the science and weight images for the combined mosaic.

sci = fits.getdata('f160w_match_drz_sci.fits')
fig = plt.figure(figsize=(10, 10), dpi=130)
plt.imshow(sci, vmin=0.5, vmax=3, cmap='Greys_r', origin='lower')
plt.colorbar(shrink=0.85, pad=0.01)
plt.show()
sci = fits.getdata('f160w_match_drz_wht.fits')
fig = plt.figure(figsize=(10, 10), dpi=130)
plt.imshow(sci, vmin=0, vmax=4000, cmap='Greys_r', origin='lower')
plt.colorbar(shrink=0.85, pad=0.01)
plt.show()

6. Conclusion #

Thank you for going through this notebook. You should now have all the necessary tools for assessing the
appropriate skymethod parameter to use when combining images. After completing this notebook you
should be more familiar with:
    • How to effectively use astroquery to download FLT and DRZ files.
    • Checking the WCS of images and aligning them to a reference catalog with TweakReg.
    • Combining data with AstroDrizzle taking into account the skymethod

Congratulations, you have completed the notebook!

Additional Resources #

About this Notebook #

Created: 14 Dec 2018; C. Martlin & J. Mack
Updated: 16 Nov 2023; K. Huynh & J. Mack
Updated: 23 Jul 2024; J. Mack & B. Kuhn

Source: https://github.com/spacetelescope/hst_notebooks

Citations #

If you use Python packages for published research, please cite the authors. Follow these links for more
information about citing packages such as astropy, astroquery, matplotlib, pandas, etc.:

Top of Page Space Telescope Logo