Sky Matching for HST Mosaics #
Learning Goals:#
In this tutorial we explore different options for handling the background sky with AstroDrizzle
.
By the end of this notebook you will:
• Download data with astroquery
• Align data with TweakReg
• Compare background sky options using the AstroDrizzle
parameter skymethod
Table of Contents#
Introduction
1. Download the Observations with astroquery
1.1 Check image header data
1.2 Inspect the alignment
2. Align the visit-level drizzled data with TweakReg
2.1 Create a catalog of Gaia DR3 sources
2.2 Create a catalog of Gaia DR3 sources with Proper Motion Data
2.3 Run Tweakreg
2.4 Inspect the shift file and fit quality
2.5 Overplot matched sources and inspect fit residuals
2.6 Rerun TweakReg
and update the header WCS
2.7 Run TweakBack
to propogate the WCS to the FLT files
3. Compare skymethod
options in AstroDrizzle
3.1 skymethod = 'localmin'
3.2 skymethod = 'match'
3.3 skymethod = 'globalmin+match'
3.4 skymethod = 'globalmin'
4. Compare the MDRIZSKY values for each method
5. Display the ‘sky matched’ science mosaic and weight image
6. Conclusion
Additional Resources
About this notebook
Citations
Introduction #
When creating an image mosaic, AstroDrizzle
has the ability to compute the sky and then either subtract or equalize the background in input images. Users may select the algorithm used for the sky subtraction via the skymethod
parameter.
There are four methods available in sky matching: localmin
, match
, globalmin
, and globalmin+match
.
By applying drizzlepac.sky.sky()
, or using the skymethod
parameter in the call to drizzlepac.astrodrizzle.AstroDrizzle()
, AstroDrizzle will update the keyword MDRIZSKY
in the headers of the input files but it will not change the science data.
For images of sparse fields with few astronomical sources, the default skymethod = 'localmin'
may be used, although this method can slightly oversubtract the background. For images with complicated backgrounds, such as nebulae and large host galaxies, skymethod = 'match'
is recommended.
For more information on the specifics of this function, please refer to the documentation here
Below, each of the four methods is demonstrated using a single example dataset, and differences between the methods is highlighted.
# All imports needed through out this notebook are included at the beginning.
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
from collections import defaultdict
from IPython.display import clear_output
import glob
import os
import shutil
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import pandas
from astropy.coordinates import SkyCoord
from astropy.io import ascii, fits
from astropy.table import Table
from astropy.units import Quantity
import astropy.units as u
from astroquery.gaia import Gaia
from astroquery.mast import Observations
from drizzlepac import astrodrizzle, tweakback, tweakreg
from drizzlepac.haputils.astrometric_utils import create_astrometric_catalog
Gaia.MAIN_GAIA_TABLE = 'gaiadr3.gaia_source' # Change if different data release is desired
Gaia.ROW_LIMIT = 100000
1. Download the Observations with astroquery
#
MAST queries may be done using query_criteria
, where we specify:
–> obs_id, proposal_id, and filters
MAST data products may be downloaded by using download_products
, where we specify:
–> products = calibrated (FLT, FLC) or drizzled (DRZ, DRC) files
–> type = standard products (CALxxx) or advanced products (HAP-SVM)
WFC3/IR observations of the Horsehead Nebula in the F160W filter obtained in HST proposal program 12812 will be used for this demonstration.
Nine visits were acquired in a 3x3 mosaic pattern on the sky, with two dither positions per visit in two IR filters. High level science products for these datasets were delivered to MAST in 2013, and this notebook is based on that user tutorial but has been updated to align these data to Gaia.
The 18 FLT images ibxl5*_flt.fits have been processed by the HST WFC3 pipeline (calwf3), which includes bias subtraction, dark current correction, cosmic-ray rejection, and flatfielding. The 9 DRZ files ibxl5*_drz.fits have been processed with AstroDrizzle
to remove distortion and to combine the 2 dithered FLT frames by filter for each vist.
obs_ids = ['ibxl5*']
props = ['12812']
filts = ['F160W']
obsTable = Observations.query_criteria(obs_id=obs_ids, proposal_id=props, filters=filts)
products = Observations.get_product_list(obsTable)
data_prod = ['FLT', 'DRZ'] # ['FLC','FLT','DRC','DRZ']
data_type = ['CALWF3'] # ['CALACS','CALWF3','CALWP2','HAP-SVM']
Observations.download_products(products, download_dir='./science',
productSubGroupDescription=data_prod,
project=data_type)
Move the files to the local working directory
files = glob.glob(os.path.join(os.curdir, 'science', 'mastDownload', 'HST', '*', '*fits'))
for im in files:
root = os.path.basename(im)
os.rename(im, './' + root)
if os.path.exists('./science'):
shutil.rmtree('science/')
1.1 Check image header data #
Here we will look at important keywords in the image headers.
files = sorted(glob.glob('*fl?.fits'))
keywords_ext0 = ["ROOTNAME", "ASN_ID", "TARGNAME", "DETECTOR", "FILTER", "EXPTIME",
"RA_TARG", "DEC_TARG", "POSTARG1", "POSTARG2", "DATE-OBS"]
keywords_ext1 = ["ORIENTAT"]
data = []
for file in files:
path_data = []
for keyword in keywords_ext0:
path_data.append(fits.getval(file, keyword, ext=0))
for keyword in keywords_ext1:
path_data.append(fits.getval(file, keyword, ext=1))
data.append(path_data)
keywords = keywords_ext0 + keywords_ext1
table = Table(np.array(data), names=keywords, dtype=['str', 'str', 'str', 'str', 'str', 'f8', 'f8', 'f8', 'f8', 'f8', 'str', 'f8'])
table['EXPTIME'].format = '7.1f'
table['RA_TARG'].format = table['DEC_TARG'].format = '7.4f'
table['POSTARG1'].format = table['POSTARG2'].format = '7.3f'
table['ORIENTAT'].format = '7.2f'
table.show_in_notebook()
1.2 Inspect the Alignment #
Check the active WCS solution in the image header. If the image is aligned to a catalog, list the number of matches and the fit RMS in mas.
Convert the fit RMS values to pixels for comparison with the alignment results performed later in this notebook.
ext_0_kws = ['DETECTOR']
ext_1_kws = ['WCSNAME', 'NMATCHES', 'RMS_RA', 'RMS_DEC']
det_scale = {'IR': 0.1283, 'UVIS': 0.0396, 'WFC': 0.05} # plate scale (arcsec/pixel)
format_dict = {}
col_dict = defaultdict(list)
for f in sorted(glob.glob('*dr?.fits')):
col_dict['FILENAME'].append(f)
hdr0 = fits.getheader(f, 0)
hdr1 = fits.getheader(f, 1)
for kw in ext_0_kws: # extension 0 keywords
col_dict[kw].append(hdr0[kw])
for kw in ext_1_kws: # extension 1 keywords
if 'RMS' in kw:
val = np.around(hdr1[kw], 1)
else:
val = hdr1[kw]
col_dict[kw].append(val)
for kw in ['RMS_RA', 'RMS_DEC']:
val = np.round(hdr1[kw]/1000./det_scale[hdr0['DETECTOR']], 2) # convert RMS from mas to pixels
col_dict[f'{kw}_pix'].append(val)
wcstable = Table(col_dict)
wcstable.show_in_notebook()
2. Align the visit-level drizzled data with TweakReg
#
Here we will use TweakReg
to align the DRZ files to Gaia DR3 and then use TweakBack
to propagate those solutions back to the FLT image headers prior to combining with AstroDrizzle
.
2.1 Create a catalog of Gaia DR3 sources #
This method uses the RA/Dec of the first image and a radius of 5’.
RA = table['RA_TARG'][0]
Dec = table['DEC_TARG'][0]
coord = SkyCoord(ra=RA, dec=Dec, unit=(u.deg, u.deg))
radius = Quantity(5., u.arcmin)
gaia_query = Gaia.query_object_async(coordinate=coord, radius=radius)
gaia_query
reduced_query = gaia_query['ra', 'dec', 'phot_g_mean_mag']
reduced_query.write('gaia_no_pm.cat', format='ascii.commented_header', overwrite=True)
reduced_query
2.2 Create a catalog of Gaia DR3 sources with Proper Motion Data #
This method uses the image FLT footprints and gives 161 sources, compared to 183 with the prior method.
pm_cat = create_astrometric_catalog(sorted(glob.glob('*flt.fits')))
pm_cat.write('gaia_pm.cat', overwrite=True, format='ascii.no_header')
len(pm_cat)
2.3 Run TweakReg
#
Next we run TweakReg
on the visit-level drizzled (DRZ) images and align to the Gaia catalog with proper motion data.
Because the fit RMS values for the MAST products were large for some visits, we allow for a larger than usual search radius of 1”. We also set the conv_width
value slightly higher than the recommended value of 2.5 for WFC3/IR data in order to use barely resolved sources for alignment.
refcat = 'gaia_pm.cat' # Use the catalog with proper motion data
drz_files = sorted(glob.glob('*drz.fits'))
tweakreg.TweakReg(drz_files,
imagefindcfg={'threshold': 4, 'conv_width': 4.5},
minobj=3,
shiftfile=True,
outshifts='shift_drz.txt',
refcat=refcat,
searchrad=1,
ylimit=0.4,
nclip=1,
updatehdr=False, # change later when you verify the alignment works
interactive=False)
# clear_output()
# If the alignment is unsuccessful, stop the notebook
with open('shift_drz.txt', 'r') as shift:
for line_number, line in enumerate(shift, start=1):
if "nan" in line:
raise ValueError(f'nan found in line {line_number} in shift file')
2.4 Inspect the shift file and fit quality #
# Read the shift file just created by tweakreg
# There are 7 columns including: filename, x-shift, y-shift, rotation, scale, x-RMS, and y-RMS
shift_table = Table.read('shift_drz.txt',
format='ascii.no_header',
names=['file', 'dx', 'dy', 'rot', 'scale', 'xrms', 'yrms'])
# Define the format for each column (excluding 'file').
formats = ['.2f', '.2f', '.3f', '.5f', '.2f', '.2f']
# Iterate over the columns 'dx', 'dy', 'rot', 'scale', 'xrms', 'yrms'
for i, col in enumerate(shift_table.colnames[1:]):
# Apply the format to the current column
shift_table[col].format = formats[i]
# Display the table in the notebook
shift_table
Note that there are large residual rotation and scale terms in the shift file for several visits in the MAST data products. These will be corrected when we run TweakReg
an additional time and update the WCS in Section 2.6 below.
match_files = sorted(glob.glob('*_drz_catalog_fit.match'))
for f in match_files:
input = ascii.read(f)
print(f'Number of matches for {f} {len(input)}')
Next we compare with the MAST WCS solutions, number of matches and fit RMS values. Since there are only 5 Gaia sources in Visits 53 and 56, these did not have a successful Gaia fit during MAST processing and instead were aligned to the next catalog in the priority list, GSC v2.4.2. (Currently the minimum number of matches for a successful fit is 6, but this will updated to 10 in summer 2024.
wcstable
2.5 Overplot matched sources and inspect fit residuals #
Here we overplot the HST sources which were successfully matched with Gaia eDR3 and we look at the astrometric fit residual PNG plots.
While we inspect only two visits (52 and 53) at a time, and additional visits may be uncommented in the cell below.
# Define the rootnames for the two FITS files to be compared
# Uncomment the pairs you want to use
# rootname_A = 'ibxl50030'
# rootname_B = 'ibxl51030'
rootname_A = 'ibxl52030'
rootname_B = 'ibxl53030'
# rootname_A = 'ibxl54030'
# rootname_B = 'ibxl55030'
# rootname_A = 'ibxl56030'
# rootname_B = 'ibxl57030'
# rootname_A = 'ibxl57030'
# rootname_B = 'ibxl58030'
# Create subplots with 1 row and 2 columns
fig, [ax1, ax2] = plt.subplots(1, 2, figsize=(13, 10))
# Open the FITS files and extract the data from the 'SCI' extension
data_a = fits.open(rootname_A+'_drz.fits')['SCI', 1].data
data_b = fits.open(rootname_B+'_drz.fits')['SCI', 1].data
# Display the images in grayscale with a stretch from 0 to 2
ax1.imshow(data_a, cmap='Greys', origin='lower', vmin=0, vmax=2)
ax2.imshow(data_b, cmap='Greys', origin='lower', vmin=0, vmax=2)
# Read the matching catalog files for both rootnames
match_tab_a = ascii.read(rootname_A+'_drz_catalog_fit.match')
match_tab_b = ascii.read(rootname_B+'_drz_catalog_fit.match')
# Extract x and y coordinates from the matching catalogs
x_coord_a, y_coord_a = match_tab_a['col11'], match_tab_a['col12']
x_coord_b, y_coord_b = match_tab_b['col11'], match_tab_b['col12']
# Plot the coordinates on the images with red circles
ax1.scatter(x_coord_a, y_coord_a, s=30, edgecolor='r', facecolor='None')
ax2.scatter(x_coord_b, y_coord_b, s=30, edgecolor='r', facecolor='None')
# Set the titles for the subplots, including the number of matches
ax1.set_title(rootname_A+f' N = {len(match_tab_a)} Gaia Matches', fontsize=20)
ax2.set_title(rootname_B+f' N = {len(match_tab_b)} Gaia Matches', fontsize=20)
fig.tight_layout()
# Load, display, and inspect fit residual PNG files
img_A = mpimg.imread(f'residuals_{rootname_A}_drz.png')
img_B = mpimg.imread(f'residuals_{rootname_B}_drz.png')
# Create subplots for the residual images
fig, axs = plt.subplots(1, 2, figsize=(20, 10), dpi=200)
axs[0].imshow(img_A)
axs[1].imshow(img_B)
# Remove unnecessary axis from the residual images
axs[0].axis('off')
axs[1].axis('off')
# Adjust layout to minimize margins and display plots
fig.tight_layout()
plt.show()
2.6 Rerun TweakReg
and update the header WCS #
Once we are happy with the alignment, we run TweakReg
again with the same parameters but with the updatehdr=True
.
refcat = 'gaia_pm.cat' # Use the catalog with proper motion data
tweakreg.TweakReg(drz_files,
imagefindcfg={'threshold': 4, 'conv_width': 4.5},
minobj=4,
shiftfile=False,
refcat=refcat,
searchrad=1,
ylimit=0.4,
nclip=1,
updatehdr=True, # update header
interactive=False)
clear_output()
2.7 Run TweakBack
to propogate the WCS to the FLT files #
Finally, we run Tweakback
on the aligned DRZ files to propogate the updated WCS information back to the FLT files.
for f in drz_files:
tb_input = f+'[sci,1]'
with fits.open(f) as hdu:
tweakback.apply_tweak(tb_input, orig_wcs_name=hdu[1].header['WCSNAME'])
clear_output()
3. Compare skymethod
options in AstroDrizzle #
Now that the FLT files contain the updated WCS solutions, we explore different algorithms for estimating the sky.
3.1 skymethod = 'localmin'
#
When using AstroDrizzle
to compute the sky in each frame, ‘localmin’ will compute a common sky value for all chips of a given exposure, using the minimum sky value from all chips. This process is repeated for each input image. This algorithm is recommended when images are dominated by blank sky instead of extended, diffuse sources.
See readthedocs for more details on sky subtraction options.
In the command below, the aligned FLT frames are sky subtracted and drizzled together.
Because the WFC3/IR data products are already cleaned of cosmic rays during calwf3 processing, cosmic-ray rejection is turned off in AstroDrizzle by setting the parameters driz_separate
, median
, blot
, and driz_cr
to ‘False’. Note that final_bits=16
means only the stable hot pixels are treated as good.
sky = 'localmin'
astrodrizzle.AstroDrizzle('*flt.fits',
output='f160w_localmin',
preserve=False,
context=False,
skymethod=sky,
driz_separate=False, median=False, blot=False, driz_cr=False, # CR-rej turned off
final_bits='16',
final_wcs=True,
final_rot=257.)
clear_output()
3.2 skymethod = 'match'
#
When skymethod
is set to ‘match’, differences in sky values between images in common sky regions will be computed. Thus, sky values will be relative (delta) to the sky computed in one of the input images whose sky value will be set to and reported as 0. This setting “equalizes” sky values between the images in large mosaics.
This is the recommended setting for images containing diffuse sources (e.g., galaxies, nebulae) covering significant parts of the image.
sky = 'match'
astrodrizzle.AstroDrizzle('*flt.fits',
output='f160w_match',
preserve=False,
context=False,
skymethod=sky,
driz_separate=False, median=False, blot=False, driz_cr=False, # CRREJ=None
final_bits='16',
final_wcs=True,
final_rot=257.)
clear_output()
3.3 skymethod = 'globalmin+match'
#
When skymethod
is set to ‘globalmin+match’, AstroDrizzle will first find a minimum “global” sky value in all input images and then use the ‘match’ method to equalize sky values between images.
sky = 'globalmin+match'
astrodrizzle.AstroDrizzle('*flt.fits',
output='f160w_globalmin_match',
preserve=False,
context=False,
skymethod=sky,
driz_separate=False, median=False, blot=False, driz_cr=False, # CRREJ=None
final_bits='16',
final_wcs=True,
final_rot=257.)
clear_output()
3.4 skymethod = 'globalmin'
#
When skymethod
is set to ‘globalmin’, a common sky value will be computed for all exposures. AstroDrizzle will compute sky values for each chip/image extension, find the minimum sky value from all the exposures, and then subtract that minimum sky value from all chips in all images.
This method may be useful when input images already have matched background values.
sky = 'globalmin'
astrodrizzle.AstroDrizzle('*flt.fits',
output='f160w_globalmin',
preserve=False,
context=False,
skymethod=sky,
driz_separate=False, median=False, blot=False, driz_cr=False, # CRREJ=None
final_bits='16',
final_wcs=True,
final_rot=257.)
clear_output()
4. Compare the MDRIZSKY values for each method #
Below we provide a gif comparing the upper portion of the final drizzled image. We cycle through three drizzled images produced using different skymethod
algorithms:
Next we print the sky values computed for each image using the four different methods.
mdrizsky_val = pandas.DataFrame({'rootname': fits.getdata('f160w_globalmin_drz_sci.fits', 1)['rootname'],
'local': fits.getdata('f160w_localmin_drz_sci.fits', 1)['mdrizsky'],
'globalmin': fits.getdata('f160w_globalmin_drz_sci.fits', 1)['mdrizsky'],
'globalmin_match': fits.getdata('f160w_globalmin_match_drz_sci.fits', 1)['mdrizsky'],
'match': fits.getdata('f160w_match_drz_sci.fits', 1)['mdrizsky']})
mdrizsky_val
These computed sky values can be visualized in the plot below. To reiterate, the MDRIZSKY keyword reports the value subtracted from each FLC image prior to drizzling, and not the sky level itself. Thus the values for skymethod='match'
are close to zero.
We also note that varying background levels across the individual tiles result in inaccurate sky background determination when skymethod='localmin'
and thus a mismatched sky in the final mosaic.
index = mdrizsky_val.index.tolist()
globalmin = list(mdrizsky_val['globalmin'])
globalmin_match = list(mdrizsky_val['globalmin_match'])
match = list(mdrizsky_val['match'])
local = list(mdrizsky_val['local'])
# Plotting code:
fig = plt.figure(figsize=[7, 7])
plt.scatter(index, globalmin_match, color='magenta', label='Globalmin + Match')
plt.scatter(index, match, color='navy', label='Match')
plt.scatter(index, local, color='olive', label='Local')
plt.scatter(index, globalmin, color='orange', label='Globalmin')
plt.xlabel('Individual Images')
plt.ylabel('MDRIZSKY Value')
plt.legend(loc='best')
plt.xticks(index)
plt.tight_layout()
plt.show()
5. Display the ‘sky matched’ science mosaic and weight image #
Finally, we display the science and weight images for the combined mosaic.
sci = fits.getdata('f160w_match_drz_sci.fits')
fig = plt.figure(figsize=(10, 10), dpi=130)
plt.imshow(sci, vmin=0.5, vmax=3, cmap='Greys_r', origin='lower')
plt.colorbar(shrink=0.85, pad=0.01)
plt.show()
sci = fits.getdata('f160w_match_drz_wht.fits')
fig = plt.figure(figsize=(10, 10), dpi=130)
plt.imshow(sci, vmin=0, vmax=4000, cmap='Greys_r', origin='lower')
plt.colorbar(shrink=0.85, pad=0.01)
plt.show()
6. Conclusion #
Thank you for going through this notebook. You should now have all the necessary tools for assessing the
appropriate skymethod
parameter to use when combining images. After completing this notebook you
should be more familiar with:
• How to effectively use astroquery
to download FLT and DRZ files.
• Checking the WCS of images and aligning them to a reference catalog with TweakReg
.
• Combining data with AstroDrizzle
taking into account the skymethod
Congratulations, you have completed the notebook!
Additional Resources #
About this Notebook #
Created: 14 Dec 2018; C. Martlin & J. Mack
Updated: 16 Nov 2023; K. Huynh & J. Mack
Updated: 23 Jul 2024; J. Mack & B. Kuhn
Source: https://github.com/spacetelescope/hst_notebooks
Citations #
If you use Python packages for published research, please cite the authors. Follow these links for more
information about citing packages such as astropy
, astroquery
, matplotlib
, pandas
, etc.: