Hubble Catalog of Variables Notebook (API version)#

2019 - 2022, Rick White, Steve Lubow, Trenton McKinney#

This notebook shows how to access the Hubble Catalogs of Variables (HCV). The HCV is a large catalog of faint variable objects extracted from version 3 of the Hubble Source Catalog. The HCV project at the National Observatory of Athens was funded by the European Space Agency (PI: Alceste Bonanos). The data products for the HCV are available both at the ESA Hubble Archive at ESAC through the HCV Explorer interface and at STScI.

This notebook uses the MAST HSC catalog interface, which supports queries to the current and previous versions of the Hubble Source Catalog. It allows searches of several different tables including the HCV summary and detailed tables. It also has an associated API that is used for data access in this notebook.

For similar examples using the MAST CasJobs interface, a SQL database query interface that is more complex to use, but more powerful than the API, see HCV_CasJobs.

Instructions#

  • Complete the initialization steps described below.

  • Run the initialization steps before running the rest of the notebook.

  • Modify and rerun any sections of the Table of Contents below.

Table of Contents#

  • Intialization

  • Variable objects in IC 1613

    • Name resolver

    • Select objects from HCV

    • Information on HCV variable classification

    • Sky coverage

    • Properties of variable objects

    • Color magnitude diagram

  • Light curve for a nova in M87

    • Extract and plot light curve for the nova

    • HLA cutout images for selected measurements

  • Compare the HCV automatic classification to expert validations

    • Plot MAD variability index distribution

    • Plot fraction of artifacts vs. MAD

  • Plot light curve for most variable high-quality candidate in the HCV

Initialization #

Install Python modules#

  1. This notebook requires the use of Python 3.

  2. Modules can be installed with conda, if using the Anaconda distribution of python, or with pip.

    • If you are using conda, do not install / update / remove a module with pip, that exists in a conda channel.

    • If a module is not available with conda, then it’s okay to install it with pip

import astropy
from astropy.coordinates import SkyCoord
import time
import sys
import os
import requests
import json
import numpy as np
import matplotlib.pyplot as plt

from PIL import Image
from io import BytesIO

from astropy.table import Table, join
from astropy.io import ascii

# set width for pprint
astropy.conf.max_width = 150
# set universal matplotlib parameters
plt.rcParams.update({'font.size': 16})

MAST API functions#

  • Here we define several interrelated functions for retrieving information from the MAST API.

    • The hcvcone(ra, dec, radius [, keywords]) function searches the HCV catalog near a position.

    • The hcvsearch() function performs general non-positional queries.

    • The hcvmetadata() function gives information about the columns available in a table.

hscapiurl = "https://catalogs.mast.stsci.edu/api/v0.1/hsc"


def hcvcone(ra, dec, radius, table="hcvsummary", release="v3", format="csv", magtype="magaper2",
            columns=None, baseurl=hscapiurl, verbose=False, **kw):
    """Do a cone search of the HSC catalog (including the HCV)
    
    Parameters
    ----------
    ra (float): (degrees) J2000 Right Ascension
    dec (float): (degrees) J2000 Declination
    radius (float): (degrees) Search radius (<= 0.5 degrees)
    table (string): hcvsummary, hcv, summary, detailed, propermotions, or sourcepositions
    release (string): v3 or v2
    magtype (string): magaper2 or magauto (only applies to summary table)
    format: csv, votable, json
    columns: list of column names to include (None means use defaults)
    baseurl: base URL for the request
    verbose: print info about request
    **kw: other parameters (e.g., 'numimages.gte':2)
    """
    
    data = kw.copy()
    data['ra'] = ra
    data['dec'] = dec
    data['radius'] = radius
    return hcvsearch(table=table, release=release, format=format, magtype=magtype,
                     columns=columns, baseurl=baseurl, verbose=verbose, **data)


def hcvsearch(table="hcvsummary", release="v3", magtype="magaper2", format="csv",
              columns=None, baseurl=hscapiurl, verbose=False, **kw):
    """Do a general search of the HSC catalog (possibly without ra/dec/radius)
    
    Parameters
    ----------
    table (string): hcvsummary, hcv, summary, detailed, propermotions, or sourcepositions
    release (string): v3 or v2
    magtype (string): magaper2 or magauto (only applies to summary table)
    format: csv, votable, json
    columns: list of column names to include (None means use defaults)
    baseurl: base URL for the request
    verbose: print info about request
    **kw: other parameters (e.g., 'numimages.gte':2).  Note this is required!
    """
    
    data = kw.copy()
    if not data:
        raise ValueError("You must specify some parameters for search")
    if format not in ("csv", "votable", "json"):
        raise ValueError("Bad value for format")
    url = f"{cat2url(table, release, magtype, baseurl=baseurl)}.{format}"
    if columns:
        # check that column values are legal
        # create a dictionary to speed this up
        dcols = {}
        for col in hcvmetadata(table, release, magtype)['name']:
            dcols[col.lower()] = 1
        badcols = []
        for col in columns:
            if col.lower().strip() not in dcols:
                badcols.append(col)
        if badcols:
            raise ValueError(f"Some columns not found in table: {', '.join(badcols)}")
        # two different ways to specify a list of column values in the API
        # data['columns'] = columns
        data['columns'] = f"[{','.join(columns)}]"

    # either get or post works
    # r = requests.post(url, data=data)
    r = requests.get(url, params=data)

    if verbose:
        print(r.url)
    r.raise_for_status()
    if format == "json":
        return r.json()
    else:
        return r.text


def hcvmetadata(table="hcvsummary", release="v3", magtype="magaper2", baseurl=hscapiurl):
    """Return metadata for the specified catalog and table
    
    Parameters
    ----------
    table (string): hcvsummary, hcv, summary, detailed, propermotions, or sourcepositions
    release (string): v3 or v2
    magtype (string): magaper2 or magauto (only applies to summary table)
    baseurl: base URL for the request
    
    Returns an astropy table with columns name, type, description
    """
    url = f"{cat2url(table,release,magtype,baseurl=baseurl)}/metadata"
    r = requests.get(url)
    r.raise_for_status()
    v = r.json()
    # convert to astropy table
    tab = Table(rows=[(x['name'], x['type'], x['description']) for x in v],
                names=('name', 'type', 'description'))
    return tab


def cat2url(table="hcvsummary", release="v3", magtype="magaper2", baseurl=hscapiurl):
    """Return URL for the specified catalog and table
    
    Parameters
    ----------
    table (string): hcvsummary, hcv, summary, detailed, propermotions, or sourcepositions
    release (string): v3 or v2
    magtype (string): magaper2 or magauto (only applies to summary table)
    baseurl: base URL for the request
    
    Returns a string with the base URL for this request
    """
    checklegal(table, release, magtype)
    if table == "summary":
        url = f"{baseurl}/{release}/{table}/{magtype}"
    else:
        url = f"{baseurl}/{release}/{table}"
    return url


def checklegal(table, release, magtype):
    """Checks if this combination of table, release and magtype is acceptable
    
    Raises a ValueError exception if there is problem
    """
    
    releaselist = ("v2", "v3")
    if release not in releaselist:
        raise ValueError(f"Bad value for release (must be one of {', '.join(releaselist)})")
    if release == "v2":
        tablelist = ("summary", "detailed")
    else:
        tablelist = ("summary", "detailed", "propermotions", "sourcepositions", "hcvsummary", "hcv")
    if table not in tablelist:
        raise ValueError(f"Bad value for table (for {release} must be one of {', '.join(tablelist)})")
    if table == "summary":
        magtypelist = ("magaper2", "magauto")
        if magtype not in magtypelist:
            raise ValueError(f"Bad value for magtype (must be one of {', '.join(magtypelist)})")

Variable objects near IC 1613 #

Use MAST name resolver to get position of IC 1613 #

target = 'IC 1613'
coord_ic1613 = SkyCoord.from_name(target)

ra_ic1613 = coord_ic1613.ra.degree
dec_ic1613 = coord_ic1613.dec.degree
print(f'ra: {ra_ic1613}\ndec: {dec_ic1613}')
ra: 16.2016962
dec: 2.1194959

Select objects near IC 1613 from HCV #

This searches the HCV summary table for objects within 0.5 degrees of the galaxy center. Note that this returns both variable and non-variable objects.

radius = 0.5 # degrees
t0 = time.time()
tab = ascii.read(hcvcone(ra_ic1613, dec_ic1613, radius, table="hcvsummary"))
print("Completed in {:.1f} sec".format(time.time()-t0))

# clean up the output format
tab['MeanMag'].format = "{:.3f}"
tab['MeanCorrMag'].format = "{:.3f}"
tab['MAD'].format = "{:.4f}"
tab['Chi2'].format = "{:.4f}"
tab['RA'].format = "{:.6f}"
tab['Dec'].format = "{:.6f}"

# show some of the variable sources
tab[tab['AutoClass'] > 0]
Completed in 3.2 sec
Table length=533
MatchIDFilterGroupIDSubGroupIDRADecAutoClassExpertClassVarQualFlagFilterDetFlagNumLCMeanMagMeanCorrMagMADChi2
int64str11int64int64float64float64int64int64str5int64int64float64float64float64float64
12131288ACS_F475W69810-516.0956592.16124611BACAA11225.24825.2500.129611.8544
12131288ACS_F814W69810-516.0956592.16124611AAAAC01225.21525.2170.06112.3024
24652319ACS_F475W69810-516.1074122.16441112BABAC0825.22525.2230.04853.9213
24652319ACS_F814W69810-516.1074122.16441112BACAA1725.76325.7640.15484.7097
59656371ACS_F475W69810-516.1059362.15739721AACAA11225.40825.4090.125773.8505
59656371ACS_F814W69810-516.1059362.15739721AACAA11225.18525.1860.107614.6299
82653722ACS_F475W69810-516.0972392.16206621AACAA11225.23725.2380.231264.6089
82653722ACS_F814W69810-516.0972392.16206621BACAA11225.06925.0700.141620.4986
64824760ACS_F475W69810-516.0960662.17045114AAAAC01224.87424.8740.04362.6037
.............................................
29704448ACS_F475W69810-516.1136442.13132912ABAAC01223.66423.6650.031113.4779
29704448ACS_F814W69810-516.1136442.13132912CBCAA11222.42222.4230.0788191.6911
30717339ACS_F475W69810-516.1056592.13319821CACAA11225.48125.4810.118942.3048
30717339ACS_F814W69810-516.1056592.13319821AABAB11225.19625.1980.09386.4070
73244023ACS_F475W69810-516.1024252.12608321AACAA11125.39625.3970.189534.9265
73244023ACS_F814W69810-516.1024252.12608321AABAB11125.20625.2080.08688.4067
99563258ACS_F475W69810-516.1019402.13086714AAAAC11223.92623.9270.043413.8320
99563258ACS_F814W69810-516.1019402.13086714ABAAC01122.55022.5510.015015.7979
26648962ACS_F475W69810-516.1049192.13147721AACBA11125.46125.4610.158219.5376
26648962ACS_F814W69810-516.1049192.13147721BAAAB11225.21725.2190.10195.4801

Description of the variable classification columns #

Several of the table columns have information on the variability.

  • The columns AutoClass and ExpertClass have summary information on the variability for a given MatchID object.

    • AutoClass: Classification as provided by the system: 0=constant 1=single filter variable candidate (SFVC) 2=multi-filter variable candidate (MFVC)

    • ExpertClass: Classification as provided by expert: 0=not classified by expert, 1=high confidence variable, 2=probable variable, 4=possible artifact

  • The columns MAD and Chi2 are variability indices using the median absolute deviation and the \(\chi^2\) parameter for the given filter.

  • The column VarQualFlag is a variability quality flag (see Section 5 of the paper). The five letters correspond to CI, D, MagerrAper2, MagAper2-MagAuto, p2p; AAAAA corresponds to the highest quality flag.

  • The column FilterDetFlag is the filter detection flag: 1=source is variable in this filter, 0=source is not variable in this filter.

See the HCV paper by Bonanos et al. (2019, AAp) for more details on the computation and meaning of these quantities.

Find objects with measurements in both F475W and F814W#

This could be done in a SQL query through the CasJobs interface. Here we use the Astropy.table.join function instead.

# the only key needed to do the join is MatchID, but we include other common columns
# so that join includes only one copy of them
jtab = join(tab[tab['Filter'] == 'ACS_F475W'], tab[tab['Filter'] == 'ACS_F814W'],
            keys=['MatchID', 'GroupID', 'SubGroupID', 'RA', 'Dec', 'AutoClass', 'ExpertClass'],
            table_names=['f475', 'f814'])
print(len(jtab), "matched F475W+F814W objects")
jtab[jtab['AutoClass'] > 0]
17090 matched F475W+F814W objects
Table length=258
MatchIDFilter_f475GroupIDSubGroupIDRADecAutoClassExpertClassVarQualFlag_f475FilterDetFlag_f475NumLC_f475MeanMag_f475MeanCorrMag_f475MAD_f475Chi2_f475Filter_f814VarQualFlag_f814FilterDetFlag_f814NumLC_f814MeanMag_f814MeanCorrMag_f814MAD_f814Chi2_f814
int64str11int64int64float64float64int64int64str5int64int64float64float64float64float64str11str5int64int64float64float64float64float64
96457ACS_F475W69810-516.1415162.17781521AACAA11223.19223.1920.0686424.7097ACS_F814WAAAAC11222.94622.9470.040277.5733
813653ACS_F475W69810-516.1283532.16014714AAAAC01025.50825.5070.02430.2945ACS_F814WBACAB11124.90624.9080.07958.5797
1012692ACS_F475W69810-516.1348092.14472021AACAA1725.43925.4380.144617.9867ACS_F814WAABAB1825.20825.2100.11526.6471
1085386ACS_F475W69810-516.1185442.16084514AAAAC01124.35124.3520.01743.1311ACS_F814WBABBB11123.47623.4760.069740.7630
1286857ACS_F475W69810-516.1192052.18425212AABAB11223.13723.1380.047753.2282ACS_F814WAAAAC01221.07321.0750.012846.8982
1309271ACS_F475W69810-516.1305712.15251221AACAA11225.34725.3480.139920.0434ACS_F814WAAAAC11225.04325.0440.08045.2954
1479646ACS_F475W69810-516.1208522.15273712CACAA01223.69123.6920.033241.9525ACS_F814WAABAB11222.50122.5030.036268.2414
1661315ACS_F475W69810-516.1105712.14352614AACAB11225.22025.2210.07805.8648ACS_F814WAAAAC01225.88725.8890.05950.8037
1826474ACS_F475W69810-516.1015322.17185512AAAAC01125.73525.7370.02590.5195ACS_F814WBACAB11124.88924.8900.08229.2131
.....................................................................
102132800ACS_F475W69810-516.1261602.14544221AACAA11223.69123.6920.0750129.5717ACS_F814WAACBA11224.50324.5050.103827.1660
102239423ACS_F475W69810-516.1387062.15565212AABAA11222.81022.8110.0982307.5226ACS_F814WAABAA01222.73522.7370.038296.9001
103232694ACS_F475W69810-516.1075652.17439914AAAAC01122.42222.4220.00601.7118ACS_F814WAAAAC11122.42522.4260.033527.9930
104300195ACS_F475W69810-516.1249812.17177221AACAA11225.54025.5410.105458.7587ACS_F814WAACAA11225.38225.3840.12179.4811
105173757ACS_F475W69810-516.1337032.18450621AAAAA11221.37221.3720.15892810.1572ACS_F814WAACAA11222.24022.2420.1825829.2399
106466795ACS_F475W69810-516.1270562.16639011AACAA11225.35725.3580.142513.4065ACS_F814WAAAAC01225.25725.2590.04622.8032
106640363ACS_F475W69810-516.1357962.14909912CACAA1825.57725.5790.11615.8168ACS_F814WAABAB0924.94424.9460.06618.3312
106843213ACS_F475W69810-516.1103422.15037314CACCB01223.69123.6910.026930.7690ACS_F814WCACBA11224.42924.4310.071025.8158
107834538ACS_F475W69810-516.1071052.16008411AABAC11225.40025.4010.08183.4254ACS_F814WAAAAC01225.12825.1290.05781.7916
108048053ACS_F475W69810-516.1505722.14259012AACCA11025.29725.2970.069722.9085ACS_F814WAAAAC01124.54424.5460.02731.6361

Plot object positions on the sky #

We mark the galaxy center as well. Note that this field is in the outskirts of IC 1613. The 0.5 degree search radius (which is the maximum allowed in the API) allows finding these objects.

fig, ax = plt.subplots(figsize=(10, 10))
ax.plot('RA', 'Dec', 'bo', markersize=1, label=f'{len(tab):,} HCV measurements', data=jtab)
ax.plot(ra_ic1613, dec_ic1613, 'rx', label=target, markersize=10)
ax.invert_xaxis()
ax.set(aspect='equal', xlabel='RA [deg]', ylabel='Dec [deg]')
ax.legend(loc='best')
<matplotlib.legend.Legend at 0x7fc89c763690>
../../../_images/dab12e81063726532784434a4396a5a06863b38f69f13021c87e8613f582fbbe.png

Plot HCV MAD variability index versus magnitude in F475W #

The median absolute deviation variability index is used by the HCV to identify variables. It measures the scatter among the multi-epoch measurements. Some scatter is expected from noise (which increases for fainter objects). Objects with MAD values that are high are likely to be variable.

This plots single-filter and multi-filter variable candidates (SFVC and MFVC) in different colors. Note that variable objects with low F475W MAD values are variable in a different filter (typically F814W in this field).

This plot is similar to the upper panel of Figure 4 in Bonanos et al. (2019, AAp).

# define plot parameter lists
auto_class = np.unique(jtab['AutoClass'])
markers = ['x', 'o', 'o']
colors = ['silver', 'blue', 'tab:cyan']
labels = ['non-', 'single-filter ', 'multi-filter ']

fig, ax = plt.subplots(figsize=(15, 10))
for ac, marker, color, label in zip(auto_class, markers, colors, labels):
    data = jtab[jtab['AutoClass'] == ac]
    ax.plot('MeanCorrMag_f475', 'MAD_f475', marker, markersize=4, color=color,
            label=f'{len(data):,} {label}variable candidates', data=data)

ax.set(xlabel='ACS_F475W [mag]', ylabel='ACS_F475W MAD [mag]')
ax.legend(loc='best', title=f'{len(jtab)} HSC measurements near {target}')
<matplotlib.legend.Legend at 0x7fc8915ec390>
../../../_images/0cba3b6881351ef10a97a598dd9ea52d7771d9ee922a1f537eda2b00e93f3e8e.png

Plot variables in color-magnitude diagram #

Many of the candidate variables lie on the instability strip.

This plot is similar to the lower panel of Figure 4 in Bonanos et al. (2019, AAp).

# add a new column to jtab
jtab['MCMf475-MCMf814'] = jtab['MeanCorrMag_f475'] - jtab['MeanCorrMag_f814']

fig, ax = plt.subplots(figsize=(15, 10))
# uses same plot parameters defined in the previous plot
for ac, marker, color, label in zip(auto_class, markers, colors, labels):
    data = jtab[jtab['AutoClass'] == ac]
    ax.plot('MCMf475-MCMf814', 'MeanCorrMag_f475', marker, markersize=4, color=color,
            label=f'{len(data):,} {label}variable candidates', data=data)
    
ax.invert_yaxis()
ax.set(xlim=(-1.1, 4), xlabel='ACS F475W - F814W [mag]', ylabel='ACS F475W [mag]')
ax.legend(loc='best', title=f'{len(jtab)} HSC measurements near {target}')
<matplotlib.legend.Legend at 0x7fc891646850>
../../../_images/be1ea5a9994ca05adeca5eabac0cf4e405a8b92a02d1e6c3c49a868db3698dd6.png

Get a light curve for a nova in M87 #

Extract light curve for a given MatchID #

Note that the MatchID could be determined by positional searches, filtering the catalog, etc. This object comes from the top left panel of Figure 9 in Bonanos et al. (2019, AAp).

matchid = 1905457

t0 = time.time()

# get light curves for F606W and F814W
nova_606 = ascii.read(hcvsearch(table='hcv', MatchID=matchid, Filter='ACS_F606W'))
print(f"{(time.time()-t0):.1f} sec: retrieved {len(nova_606)} F606W measurements")

nova_814 = ascii.read(hcvsearch(table='hcv', MatchID=matchid, Filter='ACS_F814W'))
print(f"{(time.time()-t0):.1f} sec: retrieved {len(nova_814)} F814W measurements")

# get the object RA and Dec as well
nova_tab = ascii.read(hcvsearch(table='hcvsummary', MatchID=matchid, Filter='ACS_F814W'))
print(f"{(time.time()-t0):.1f} sec: retrieved object info")

nova_606
0.4 sec: retrieved 21 F606W measurements
0.7 sec: retrieved 22 F814W measurements
1.1 sec: retrieved object info
Table length=21
MatchIDFilterMJDImageNameMagCorrMagMagErrCID
int64str9float64str26float64float64float64float64float64
1905457ACS_F606W53767.4197952871hst_10543_29_acs_wfc_f606w25.32725.32671328029930.13050.84064811468124413.499119758606
1905457ACS_F606W53768.4190663833hst_10543_30_acs_wfc_f606w23.969423.96761098021960.03941.0437963008880611.9895544052124
1905457ACS_F606W53769.3576541713hst_10543_31_acs_wfc_f606w23.600523.60008421247370.03060.9741666316986089.30478286743164
1905457ACS_F606W53771.1017052617hst_10543_33_acs_wfc_f606w23.610523.61575105175430.0303000010.966574072837832.96933674812317
1905457ACS_F606W53772.8098534283hst_10543_35_acs_wfc_f606w23.62179923.6029938853950.03280.9920370578765874.45478820800781
1905457ACS_F606W53774.4746564087hst_10543_37_acs_wfc_f606w24.01549924.01238680238810.0425999981.02462971210483.4874792098999
1905457ACS_F606W53775.2799112014hst_10543_38_acs_wfc_f606w23.903723.90775473410580.03810.9851852059364323.49740219116211
1905457ACS_F606W53776.0893441574hst_10543_39_acs_wfc_f606w24.01549924.01821088403150.04141.053148150444038.44466972351074
1905457ACS_F606W53776.9439852673hst_10543_40_acs_wfc_f606w24.06570124.07438894903280.0445999991.241574048995977.6749701499939
1905457ACS_F606W53778.562434162hst_10543_42_acs_wfc_f606w24.43079924.42749372240490.0595000011.074259281158453.26662158966064
1905457ACS_F606W53779.4097260174hst_10543_43_acs_wfc_f606w24.41589924.41505845731050.0626000020.9652777910232542.09669971466064
1905457ACS_F606W53780.2090778507hst_10543_44_acs_wfc_f606w24.69759924.69448381196230.0804999990.8106481432914736.5630521774292
1905457ACS_F606W53782.61915897hst_10543_47_acs_wfc_f606w24.56080124.55707489945920.0653999971.099074125289922.88692712783813
1905457ACS_F606W53784.2176541714hst_10543_73_acs_wfc_f606w24.782724.77701107022670.0790000040.8881481885910033.16040587425232
1905457ACS_F606W53786.1563230369hst_10543_86_acs_wfc_f606w24.909124.85326845876910.09021.037777781486518.54914951324463
1905457ACS_F606W53787.0225386124hst_10543_92_acs_wfc_f606w25.11459925.08782309605640.11270.9244444370269784.14372444152832
1905457ACS_F606W53792.6850963715hst_10543_49_acs_wfc_f606w25.22800125.2318322366430.11471.1442593336105311.9895496368408
1905457ACS_F606W53793.5260915786hst_10543_a1_acs_wfc_f606w25.15360125.16043598617860.10710.9256481528282177.5528678894043
1905457ACS_F606W53796.1486378878hst_10543_b8_acs_wfc_f606w24.017223.99859200518370.04361.5117592811584510.1193161010742
1905457ACS_F606W53798.5892174062hst_10543_50_acs_wfc_f606w25.56360125.57089652347360.153500010.95324075222015412.9010400772095
1905457ACS_F606W53799.4608942058hst_10543_c4_acs_wfc_f606w25.589525.59087074177370.16020.929814815521245.97602415084839
fig, ax = plt.subplots(figsize=(15, 10))

ax.errorbar(x='MJD', y='CorrMag', yerr='MagErr', fmt='ob', ecolor='k', elinewidth=1, markersize=8, label='ACS F606W', data=nova_606)
ax.errorbar(x='MJD', y='CorrMag', yerr='MagErr', fmt='or', ecolor='k', elinewidth=1, markersize=8, label='ACS F814W', data=nova_814)

ax.invert_yaxis()
ax.set(xlabel='MJD [days]', ylabel='magnitude')
ax.legend(loc='best', title=f'Nova in M87 (MatchID: {matchid})')
<matplotlib.legend.Legend at 0x7fc89c52a750>
../../../_images/c6beef1aaef71a71398e37f51f6643d014f3c8850745c25d6d246077945dc616.png

Get HLA image cutouts for the nova #

The Hubble Legacy Archive (HLA) images were the source of the measurements in the HSC and HCV, and it can be useful to look at the images. Examination of the images can be useful to identified cosmic-ray contamination and other possible image artifacts. In this case, no issues are seen, so the light curve is reliable.

Note that the ACS F606W images of M87 have only a single exposure, so they do have cosmic ray contamination. The accompanying F814W images have multiple exposures, allowing CRs to be removed. In this case the F814W combined image is used to find objects, while the F606W exposure is used only for photometry. That reduces the effects of F606W CRs on the catalog but it is still a good idea to confirm the quality of the images.

The get_hla_cutout function reads a single cutout image (as a JPEG grayscale image) and returns a PIL image object. See the documentation on the fitscut image cutout service for more information on the web service being used.

def get_hla_cutout(imagename, ra, dec, size=33, autoscale=99.5, asinh=True, zoom=1):
    """Get JPEG cutout for an image"""
    url = "https://hla.stsci.edu/cgi-bin/fitscut.cgi"
    r = requests.get(url, params=dict(ra=ra, dec=dec, size=size,
                                      format="jpeg", red=imagename, autoscale=autoscale, asinh=asinh, zoom=zoom))
    im = Image.open(BytesIO(r.content))
    return im
# sort images by magnitude from brightest to faintest
phot = nova_606
isort = np.argsort(phot['CorrMag'])
# select the brightest, median and faintest magnitudes
ind = [isort[0], isort[len(isort)//2], isort[-1]]

# we plot zoomed-in and zoomed-out views side-by-side for each selected image
nim = len(ind)*2
ncols = 2 # images per row
nrows = (nim+ncols-1)//ncols

imsize1 = 19
imsize2 = 101
mra = nova_tab['RA'][0]
mdec = nova_tab['Dec'][0]

# define figure and axes
fig, axes = plt.subplots(nrows, ncols, figsize=(12, (12/ncols)*nrows), tight_layout=True)

t0 = time.time()

# iterate through each set of two subplots in axes
for (ax1, ax2), k in zip(axes, ind):
    
    # get the images
    im1 = get_hla_cutout(phot['ImageName'][k], mra, mdec, size=imsize1)
    im2 = get_hla_cutout(phot['ImageName'][k], mra, mdec, size=imsize2)
    
    # plot left column   
    ax1.imshow(im1, origin="upper", cmap="gray")
    ax1.set_title(f"{phot['ImageName'][k]} m={phot['CorrMag'][k]:.3f}", fontsize=14)
    
    # plot right column
    ax2.imshow(im2, origin="upper", cmap="gray")
    xbox = np.array([-1, 1])*imsize1/2 + (imsize2-1)//2
    ax2.plot(xbox[[0, 1, 1, 0, 0]], xbox[[0, 0, 1, 1, 0]], 'r-', linewidth=1)
    ax2.set_title(f"{phot['ImageName'][k]} m={phot['CorrMag'][k]:.3f}", fontsize=14)    
    
print(f"{(time.time()-t0):.1f} s: got {nrows*ncols} cutouts")
6.2 s: got 6 cutouts
../../../_images/ee2d70664ed531c9620772da108d2922e2378936bea0eace1b73714967a75f55.png

Compare the HCV automatic classification to expert validations #

The HCV includes an automatic classification AutoClass for candidate variables as well as an expert validation for some fields that were selected for visual examination. For this example, we select all the objects in the HCV that have expert classification information.

t0 = time.time()

# get data for objects with an expert validation
constraints = {"ExpertClass.gte": 1}
tab = ascii.read(hcvsearch(table="hcvsummary", **constraints))
print(f"Retrieved {len(tab)} rows in {(time.time()-t0):.1f} sec")

# clean up the output format
tab['MeanMag'].format = "{:.3f}"
tab['MeanCorrMag'].format = "{:.3f}"
tab['MAD'].format = "{:.4f}"
tab['Chi2'].format = "{:.4f}"
tab['RA'].format = "{:.6f}"
tab['Dec'].format = "{:.6f}"

# tab includes 1 row for each filter (so multiple rows for objects with multiple filters)
# get an array that has only one row per object
mval, uindex = np.unique(tab['MatchID'], return_index=True)
utab = tab[uindex]
print(f"{len(utab)} unique MatchIDs in table")

tab
Retrieved 31258 rows in 1.4 sec
13533 unique MatchIDs in table
Table length=31258
MatchIDFilterGroupIDSubGroupIDRADecAutoClassExpertClassVarQualFlagFilterDetFlagNumLCMeanMagMeanCorrMagMADChi2
int64str11int64int64float64float64int64int64str5int64int64float64float64float64float64
875ACS_F435W1040153-564.149673-24.11035322AAAAA01324.16624.1660.03847.1169
875ACS_F606W1040153-564.149673-24.11035322AAAAC0922.83622.8350.034981.3676
875ACS_F814W1040153-564.149673-24.11035322AAAAA02322.15622.1560.0325141.7171
875WFC3_F105W1040153-564.149673-24.11035322CAAAB11421.84321.8430.0503223.4961
875WFC3_F125W1040153-564.149673-24.11035322CBCCA1621.81321.8140.0655792.8662
875WFC3_F140W1040153-564.149673-24.11035322AAAAA0621.70521.7040.0205127.6242
875WFC3_F160W1040153-564.149673-24.11035322BABAA11321.62321.6240.0322112.8544
2006WFPC2_F555W521507-5268.112152-17.68428222AAAAC1721.81121.8210.154585.2812
2006WFPC2_F814W521507-5268.112152-17.68428222AAAAA1720.51320.5190.0988105.2348
.............................................
108116850ACS_F475W10459048511.62759342.06187411AAAAA0524.77224.7740.049852.0622
108116850ACS_F814W10459048511.62759342.06187411AACAA1524.60524.6070.112056.7844
108127522ACS_F606W1043478-5211.15429754.52158011AACAA1626.78626.7860.149612.5680
108141967ACS_F606W1036556-534.188431-5.20318914BAACA1526.17426.1750.18065.3338
108141967ACS_F814W1036556-534.188431-5.20318914AAAAC0525.97025.9690.05610.2748
108154109ACS_F606W1084533-553.141266-27.71063012CAAAC0724.47724.4770.01613.4616
108154109ACS_F775W1084533-553.141266-27.71063012CAAAC01123.65023.6500.05882.6932
108154109ACS_F814W1084533-553.141266-27.71063012CAAAC0523.57723.5770.00372.9994
108154109ACS_F850LP1084533-553.141266-27.71063012CAAAA11223.45523.4550.06393.4468
108154109WFC3_F105W1084533-553.141266-27.71063012AAAAC0621.54121.5410.00650.4013

An ExpertClass value of 1 indicates that the object is confidently confirmed to be a variable; 2 means that the measurements do not have apparent problems and so the object is likely to be variable (usually the variability is too small to be obvious in the image); 4 means that the variability is likely to be the result of artifacts in the image (e.g., residual cosmic rays or diffraction spikes from nearby bright stars).

Compare the distributions for single-filter variable candidates (SFVC, AutoClass=1) and multi-filter variable candidates (MFVC, AutoClass=2). The fraction of artifacts is lower in the MFVC sample.

sfcount = np.bincount(utab['ExpertClass'][utab['AutoClass'] == 1])
mfcount = np.bincount(utab['ExpertClass'][utab['AutoClass'] == 2])
sfrat = sfcount/sfcount.sum()
mfrat = mfcount/mfcount.sum()

print("Type Variable Likely Artifact Total")
print("SFVC {:8d} {:6d} {:8d} {:5d} counts".format(sfcount[1], sfcount[2], sfcount[4], sfcount.sum()))
print("MFVC {:8d} {:6d} {:8d} {:5d} counts".format(mfcount[1], mfcount[2], mfcount[4], mfcount.sum()))
print("SFVC {:8.3f} {:6.3f} {:8.3f} {:5.3f} fraction".format(sfrat[1], sfrat[2], sfrat[4], sfrat.sum()))
print("MFVC {:8.3f} {:6.3f} {:8.3f} {:5.3f} fraction".format(mfrat[1], mfrat[2], mfrat[4], mfrat.sum()))
Type Variable Likely Artifact Total
SFVC     3323   3055     1761  8139 counts
MFVC     2101   2442      851  5394 counts
SFVC    0.408  0.375    0.216 1.000 fraction
MFVC    0.390  0.453    0.158 1.000 fraction

Plot the MAD variability index distribution with expert classifications #

Note that only the filters identified as variable (FilterDetFlag > 0) are included here.

This version of the plot shows the distributions for the various ExpertClass values along with, for comparison, the distribution for all objects in gray (which is identical in each panel). Most objects are classified as confident or likely variables. Objects with lower MAD values (indicating a lower amplitude of variability) are less likely to be identified as confident variables because low-level variability is more difficult to confirm via visual examination.

Data & Bins#

w = np.where(tab['FilterDetFlag'] > 0)
mad = tab['MAD'][w]
e = tab['ExpertClass'][w]

xrange = [7.e-3, 2.0]
bins = xrange[0]*(xrange[1]/xrange[0])**np.linspace(0.0, 1.0, 50)

Plots#

fig, axes = plt.subplots(3, 1, figsize=(12, 12))

labels = ['Confident', 'Likely', 'Artifact']
colors = ['C2', 'C1', 'C0']

for ax, v, label, color in zip(axes, np.unique(e), labels, colors):
    ax.hist(mad, bins=bins, log=True, color='lightgray', label='All')
    ax.hist(mad[e == v], bins=bins, log=True, label=label, color=color)
    ax.set(xscale='log', ylabel='Count')
    ax.legend(loc='upper left')
    
fig.suptitle('HCV Expert Validation', y=0.9)
_ = axes[2].set_xlabel('MAD Variability Index [mag]')
../../../_images/17952b69c4a18d12259061cfc6a36344af96b1adb5fe43c1fdf0ab799524aa3f.png

The plot below shows the same distributions, but plotted as stacked histograms. The top panel uses a linear scale on the y-axis and the bottom panel uses a log y scale.

fig, axes = plt.subplots(2, 1, figsize=(15, 12))

ylogs = [False, True]

for ax, ylog in zip(axes, ylogs):
    ax.hist(mad, bins=bins, log=ylog, label='Artifact')
    ax.hist(mad[e < 4], bins=bins, log=ylog, label='Likely Variable')
    ax.hist(mad[e == 1], bins=bins, log=ylog, label='Confident Variable')

    ax.set_xscale('log')
    ax.set_xlabel('MAD Variability Index [mag]')
    ax.set_ylabel('Count')
    ax.legend(loc='upper right', title='HCV Expert Validation')
../../../_images/91bdcfe309a0cd9a1e17bad615a4a0ed75efdad64ad474d8bd554deaede53a23.png

Plot the fraction of artifacts as a function of MAD variability index #

This shows how the fraction of artifacts varies with the MAD value. For larger MAD values the fraction decreases sharply, presumably because such large values are less likely to result from the usual artifacts. Interestingly, the artifact fraction also declines for smaller MAD values (MAD < 0.1 mag). Probably that happens because typical artifacts are more likely to produce strong signals than the weaker signals indicated by a low MAD value.

w = np.where(tab['FilterDetFlag'] > 0)
mad = tab['MAD'][w]
e = tab['ExpertClass'][w]

xrange = [7.e-3, 2.0]
bins = xrange[0]*(xrange[1]/xrange[0])**np.linspace(0.0, 1.0, 30)

all_count, bin_edges = np.histogram(mad, bins=bins)
artifact_count, bin_edges = np.histogram(mad[e == 4], bins=bins)
wnz = np.where(all_count > 0)[0]
nnz = len(wnz)

artifact_count = artifact_count[wnz]
all_count = all_count[wnz]
xerr = np.empty((2, nnz), dtype=float)
xerr[0] = bin_edges[wnz]
xerr[1] = bin_edges[wnz+1]

# combine bins at edge into one big bin to improve the statistics there
iz = np.where(all_count.cumsum() > 10)[0][0]
if iz > 0:
    all_count[iz] += all_count[:iz].sum()
    artifact_count[iz] += artifact_count[:iz].sum()
    xerr[0, iz] = xerr[0, 0]
    all_count = all_count[iz:]
    artifact_count = artifact_count[iz:]
    xerr = xerr[:, iz:]
    
iz = np.where(all_count[::-1].cumsum() > 40)[0][0]
if iz > 0:
    all_count[-iz-1] += all_count[-iz:].sum()
    artifact_count[-iz-1] = artifact_count[-iz:].sum()
    xerr[1, -iz-1] = xerr[1, -1]
    all_count = all_count[:-iz]
    artifact_count = artifact_count[:-iz]
    xerr = xerr[:, :-iz]

x = np.sqrt(xerr[0]*xerr[1])
xerr[0] = x - xerr[0]
xerr[1] = xerr[1] - x

frac = artifact_count/all_count
# error on fraction using binomial distribution (approximate)
ferr = np.sqrt(frac*(1-frac)/all_count)

Create the plot

fig, ax = plt.subplots(figsize=(12, 12))
ax.errorbar(x, frac, xerr=xerr, yerr=ferr, fmt='ob', markersize=5, label='Artifact fraction')
ax.set(xscale='log', xlabel='MAD Variability Index [mag]', ylabel='Artifact Fraction')
ax.legend(loc='upper right', title='HCV Expert Validation')
<matplotlib.legend.Legend at 0x7fc8946d2050>
../../../_images/ff1c2dbe53ff170e9c61a3f35864a01fbcb2ca60fbb90d585f3cb76b46a7cf6a.png

Plot light curve for the most variable high quality candidate in the HCV #

Select the candidate variable with the largest MAD value and VarQualFlag = ‘AAAAA’. To find the highest MAD value, we sort by MAD in descending order and select the first result.

constraints = {'VarQualFlag': 'AAAAA', 'sort_by': 'MAD.desc', 'pagesize': 1}

t0 = time.time()
tab = ascii.read(hcvsearch(table='hcvsummary', **constraints))
print("Completed in {:.1f} sec".format(time.time()-t0))

# clean up the output format
tab['MeanMag'].format = "{:.3f}"
tab['MeanCorrMag'].format = "{:.3f}"
tab['MAD'].format = "{:.4f}"
tab['Chi2'].format = "{:.4f}"
tab['RA'].format = "{:.6f}"
tab['Dec'].format = "{:.6f}"

print("MatchID {} has largest MAD value = {:.2f}".format(
    tab['MatchID'][0], tab['MAD'][0]))
tab
Completed in 2.6 sec
MatchID 5742711 has largest MAD value = 0.86
Table length=1
MatchIDFilterGroupIDSubGroupIDRADecAutoClassExpertClassVarQualFlagFilterDetFlagNumLCMeanMagMeanCorrMagMADChi2
int64str9int64int64float64float64int64int64str5int64int64float64float64float64float64
5742711ACS_F814W10459041610.92860141.16429510AAAAA1522.26422.2650.85816698.4430

Get the light curve.

matchid = tab['MatchID'][0]
mfilter = tab['Filter'][0]

t0 = time.time()
lc = ascii.read(hcvsearch(table="hcv", MatchID=matchid, Filter=mfilter))
print(f"{(time.time()-t0):.1f} sec: retrieved {len(lc)} {mfilter} measurements")
0.5 sec: retrieved 5 ACS_F814W measurements

Plot the light curve.

fig, ax = plt.subplots(figsize=(15, 10))

ax.errorbar(x='MJD', y='CorrMag', yerr='MagErr', fmt='ob', ecolor='k', elinewidth=1, markersize=8, label=mfilter, data=lc)

ax.invert_yaxis()
ax.set(xlabel='MJD [days]', ylabel='magnitude')
ax.legend(loc='best', title=f"MatchID: {matchid} MAD={tab['MAD'][0]:.2f}")
<matplotlib.legend.Legend at 0x7fc897d62750>
../../../_images/e257e0f2efb1de5b1ca1ae397bc2e17cf0246e741a4498cc74050224e50f40ec.png

Extract cutout images for the entire light curve (since it does not have many points).

# sort images in MJD order
ind = np.argsort(lc['MJD'])

# we plot zoomed-in and zoomed-out views side-by-side for each selected image
nim = len(ind)*2
ncols = 2 # images per row
nrows = (nim+ncols-1)//ncols

imsize1 = 19
imsize2 = 101
mra = tab['RA'][0]
mdec = tab['Dec'][0]

# define figure and axes
fig, axes = plt.subplots(nrows, ncols, figsize=(12, (12/ncols)*nrows), tight_layout=True)

t0 = time.time()

# iterate through each set of two subplots in axes
for i, ((ax1, ax2), k) in enumerate(zip(axes, ind), 1):
    
    # get the images
    im1 = get_hla_cutout(lc['ImageName'][k], mra, mdec, size=imsize1)
    im2 = get_hla_cutout(lc['ImageName'][k], mra, mdec, size=imsize2)
    
    # plot left column    
    ax1.imshow(im1, origin="upper", cmap="gray")
    ax1.set_title(lc['ImageName'][k], fontsize=14)
    
    # plot right column
    ax2.imshow(im2, origin="upper", cmap="gray")
    xbox = np.array([-1, 1])*imsize1/2 + (imsize2-1)//2
    ax2.plot(xbox[[0, 1, 1, 0, 0]], xbox[[0, 0, 1, 1, 0]], 'r-', linewidth=1)
    ax2.set_title(f"m={lc['CorrMag'][k]:.3f} MJD={lc['MJD'][k]:.2f}", fontsize=14)
    
    print(f"{(time.time()-t0):.1f} s: finished {i} of {len(ind)} epochs")

print(f"{(time.time()-t0):.1f} s: got {nrows*ncols} cutouts")
2.2 s: finished 1 of 5 epochs
4.2 s: finished 2 of 5 epochs
6.3 s: finished 3 of 5 epochs
8.5 s: finished 4 of 5 epochs
10.5 s: finished 5 of 5 epochs
10.5 s: got 10 cutouts
../../../_images/7b5eced6c4bbe6ec70d563bfeffee886deacf284cc7037c3e58187d65cf5a9f3.png