Hubble Source Catalog API Notebook#

2019 - 2022, Rick White, Trenton McKinney#

A new MAST interface supports queries to the current and previous versions of the Hubble Source Catalog. It allows searches of the summary table (with multi-filter mean photometry) and the detailed table (with all the multi-epoch measurements). It also has an associated API, which is used in this notebook.

This is based on HSC Use Case #3.

  • It searches the HSC for variable objects in the vicinity of dwarf galaxy IC 1613,

  • shows the positions of those objects in a color-magnitude diagram,

  • extracts light curves for an example object, and

  • displays cutout images from the Hubble observations that were used for the light curve measurements.

The whole process takes only 30 seconds to complete.

Another notebook generates a color-magnitude diagram for the Small Magellanic Cloud in only a couple of minutes. A more complex notebook that shows how to access the proper motion tables using the HSC API is also available.


  • Complete the initialization steps described below.

  • Run the notebook.

Running the notebook from top to bottom takes about 30 seconds.

Table of Contents#

  • Initialization

  • Get metadata on available HSC columns

  • Find variable objects in IC 1613

    • Use astropy name resolver

    • Search HSC summary table

    • Plot variability index versus magnitude

    • Show variable objects in a color-magnitude diagram

  • Get HSC light curve for a variable

  • Extract HLA cutout images for the variable

Initialization #

Install Python modules#

  1. This notebook requires the use of Python 3.

  2. Modules can be installed with conda (if you are using the Anaconda distribution of Python) or with pip.

    • If you are using conda, do not use pip to install, update, or remove a module that is available in a conda channel.

    • If a module is not available through conda, it is fine to install it with pip.

import astropy
from astropy.coordinates import SkyCoord
import time
import sys
import os
import requests
import json
import numpy as np
import matplotlib.pyplot as plt

from pprint import pprint

from astropy.table import Table
import pandas as pd

from PIL import Image
from io import BytesIO, StringIO

# set width for pprint
astropy.conf.max_width = 150
# set universal matplotlib parameters
plt.rcParams.update({'font.size': 16})

MAST API functions#

  • Execute HSC searches and resolve names using MAST query.

  • Here we define several interrelated functions for retrieving information from the MAST API.

    • The hsccone(ra, dec, radius [, keywords]) function searches the HSC catalog near a position.

    • The hscsearch() function performs general non-positional queries.

    • The hscmetadata() function gives information about the columns available in a table.

hscapiurl = ""

def hsccone(ra, dec, radius, table="summary", release="v3", format="csv", magtype="magaper2",
            columns=None, baseurl=hscapiurl, verbose=False, **kw):
    """Do a cone search of the HSC catalog

    ra (float): (degrees) J2000 Right Ascension
    dec (float): (degrees) J2000 Declination
    radius (float): (degrees) Search radius (<= 0.5 degrees)
    table (string): summary, detailed, propermotions, or sourcepositions
    release (string): v3 or v2
    magtype (string): magaper2 or magauto (only applies to summary table)
    format: csv, votable, json, table
    columns: list of column names to include (None means use defaults)
    baseurl: base URL for the request
    verbose: print info about request
    **kw: other parameters (e.g., 'numimages.gte':2)
    """

    data = kw.copy()
    data['ra'] = ra
    data['dec'] = dec
    data['radius'] = radius
    return hscsearch(table=table, release=release, format=format, magtype=magtype,
                     columns=columns, baseurl=baseurl, verbose=verbose, **data)

def hscsearch(table="summary", release="v3", magtype="magaper2", format="csv",
              columns=None, baseurl=hscapiurl, verbose=False, **kw):
    """Do a general search of the HSC catalog (possibly without ra/dec/radius)

    table (string): summary, detailed, propermotions, or sourcepositions
    release (string): v3 or v2
    magtype (string): magaper2 or magauto (only applies to summary table)
    format: csv, votable, json, table
    columns: list of column names to include (None means use defaults)
    baseurl: base URL for the request
    verbose: print info about request
    **kw: other parameters (e.g., 'numimages.gte':2).  Note this is required!
    """

    data = kw.copy()
    if not data:
        raise ValueError("You must specify some parameters for search")
    if format not in ("csv", "votable", "json", 'table'):
        raise ValueError("Bad value for format")
    if format == "table":
        # the API has no "table" format: request CSV and convert it afterwards
        rformat = "csv"
    else:
        rformat = format
    url = f"{cat2url(table, release, magtype, baseurl=baseurl)}.{rformat}"
    if columns:
        # check that column values are legal
        # create a dictionary to speed this up
        dcols = {}
        for col in hscmetadata(table, release, magtype)['name']:
            dcols[col.lower()] = 1
        badcols = []
        for col in columns:
            if col.lower().strip() not in dcols:
                badcols.append(col)
        if badcols:
            raise ValueError(f"Some columns not found in table: {', '.join(badcols)}")
        # two different ways to specify a list of column values in the API
        # data['columns'] = columns
        data['columns'] = f"[{','.join(columns)}]"

    # either get or post works
    # r = requests.post(url, data=data)
    r = requests.get(url, params=data)

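    if verbose:
        print(r.url)
    # stop here with an exception if the server reported an error
    r.raise_for_status()
    if format == "json":
        return r.json()
    elif format == "table":
        # use pandas to parse the CSV (works around a bug on Windows)
        return Table.from_pandas(pd.read_csv(StringIO(r.text)))
    else:
        return r.text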

def hscmetadata(table="summary", release="v3", magtype="magaper2", baseurl=hscapiurl):
    """Return metadata for the specified catalog and table
    table (string): summary, detailed, propermotions, or sourcepositions
    release (string): v3 or v2
    magtype (string): magaper2 or magauto (only applies to summary table)
    baseurl: base URL for the request
    Returns an astropy table with columns name, type, description
    """
    url = f"{cat2url(table, release, magtype, baseurl=baseurl)}/metadata"
    r = requests.get(url)
    v = r.json()
    # convert to astropy table
    tab = Table(rows=[(x['name'], x['type'], x['description']) for x in v],
                names=('name', 'type', 'description'))
    return tab

def cat2url(table="summary", release="v3", magtype="magaper2", baseurl=hscapiurl):
    """Return URL for the specified catalog and table
    table (string): summary, detailed, propermotions, or sourcepositions
    release (string): v3 or v2
    magtype (string): magaper2 or magauto (only applies to summary table)
    baseurl: base URL for the request
    Returns a string with the base URL for this request
    """
    checklegal(table, release, magtype)
    if table == "summary":
        url = f"{baseurl}/{release}/{table}/{magtype}"
    else:
        url = f"{baseurl}/{release}/{table}"
    return url

def checklegal(table, release, magtype):
    """Checks if this combination of table, release and magtype is acceptable
    Raises a ValueError exception if there is a problem
    """
    releaselist = ("v2", "v3")
    if release not in releaselist:
        raise ValueError(f"Bad value for release (must be one of {', '.join(releaselist)})")
    if release == "v2":
        tablelist = ("summary", "detailed")
    else:
        tablelist = ("summary", "detailed", "propermotions", "sourcepositions")
    if table not in tablelist:
        raise ValueError(f"Bad value for table (for {release} must be one of {', '.join(tablelist)})")
    if table == "summary":
        magtypelist = ("magaper2", "magauto")
        if magtype not in magtypelist:
            raise ValueError(f"Bad value for magtype (must be one of {', '.join(magtypelist)})")
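
How these helpers assemble a request can be illustrated without touching the network. The sketch below mirrors the URL layout used by cat2url and the way keyword constraints are passed along as query parameters. The base URL https://example.invalid/hsc and the function name build_query_url are placeholders introduced here for illustration; they are not the real MAST endpoint or part of the notebook's API.

```python
from urllib.parse import urlencode

# Placeholder base URL (an assumption for illustration, not the real endpoint).
EXAMPLE_BASEURL = "https://example.invalid/hsc"


def build_query_url(table="summary", release="v3", magtype="magaper2",
                    fmt="csv", baseurl=EXAMPLE_BASEURL, **constraints):
    """Hypothetical helper: build the URL that a search would request."""
    # The summary table has one variant per magnitude type; other tables do not.
    if table == "summary":
        url = f"{baseurl}/{release}/{table}/{magtype}.{fmt}"
    else:
        url = f"{baseurl}/{release}/{table}.{fmt}"
    # Constraints such as 'A_F475W_N.gte' become ordinary query parameters.
    return f"{url}?{urlencode(constraints)}"


cone = {"ra": 16.2, "dec": 2.12, "radius": 0.5, "A_F475W_N.gte": 10}
print(build_query_url(**cone))
```

Passing the constraints as a dictionary with ** is what allows keys containing a dot, such as 'A_F475W_N.gte', which could not be written as ordinary keyword arguments.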

Get metadata on available columns #

The metadata query returns information on the columns in the table. It works for any of the tables in the API (summary, detailed, propermotions, sourcepositions).

Note that the summary table has a huge number of columns! Each of the 133 filter/detector combinations has 3 columns: the magnitude, the median absolute deviation (MAD, a robust measure of the scatter among the measurements), and the number of independent measurements in the filter. The filter name includes a prefix for the detector (A=ACS/WFC, W3=WFC3/UVIS or WFC3/IR, W2=WFPC2) followed by the standard name of the filter. For instance, all three instruments have an F814W filter, so there are columns for A_F814W, W3_F814W, and W2_F814W.

meta = hscmetadata("summary")
print(len(meta), "columns in summary")
filterlist = meta['name'][19::3].tolist()
print(len(filterlist), "filters")
pprint(filterlist, compact=True)
418 columns in summary
133 filters
['W3_BLANK', 'W2_F122M', 'W2_F160BN15', 'W2_F160BW', 'W2_F170W', 'W2_F185W',
 'W3_F200LP', 'W3_F218W', 'W2_F218W', 'W3_F225W', 'W3_FQ232N', 'W3_FQ243N',
 'W2_F255W', 'W3_F275W', 'W3_F280N', 'W3_G280', 'W2_F300W', 'W3_F300X',
 'W3_F336W', 'W2_F336W', 'W3_F343N', 'W2_F343N', 'W3_F350LP', 'W3_F373N',
 'W2_F375N', 'W3_FQ378N', 'W2_F380W', 'W3_FQ387N', 'W3_F390M', 'W2_F390N',
 'W3_F390W', 'W3_F395N', 'W3_F410M', 'W2_F410M', 'W3_FQ422M', 'A_F435W',
 'W3_FQ436N', 'W3_FQ437N', 'W2_F437N', 'W3_F438W', 'W2_F439W', 'W2_F450W',
 'W3_F467M', 'W2_F467M', 'W3_F469N', 'W2_F469N', 'A_F475W', 'W3_F475W',
 'W3_F475X', 'W3_F487N', 'W2_F487N', 'W3_FQ492N', 'A_F502N', 'W3_F502N',
 'W2_F502N', 'W3_FQ508N', 'W3_F547M', 'W2_F547M', 'A_F550M', 'A_F555W',
 'W3_F555W', 'W2_F555W', 'W2_F569W', 'W3_FQ575N', 'W2_F588N', 'W3_F600LP',
 'A_F606W', 'W3_F606W', 'W2_F606W', 'W3_FQ619N', 'W3_F621M', 'W2_F622W',
 'A_F625W', 'W3_F625W', 'W3_F631N', 'W2_F631N', 'W3_FQ634N', 'W3_F645N',
 'W3_F656N', 'W2_F656N', 'W3_F657N', 'A_F658N', 'W3_F658N', 'W2_F658N',
 'A_F660N', 'W3_F665N', 'W3_F665N_F6', 'W3_FQ672N', 'W3_F673N', 'W2_F673N',
 'W3_FQ674N', 'W2_F675W', 'W3_F680N', 'W3_F689M', 'W2_F702W', 'W3_FQ727N',
 'W3_FQ750N', 'W3_F763M', 'A_F775W', 'W3_F775W', 'W2_F785LP', 'W2_F791W',
 'A_F814W', 'W3_F814W', 'W2_F814W', 'W3_F845M', 'A_F850LP', 'W3_F850LP',
 'W2_F850LP', 'W3_FQ889N', 'W3_FQ906N', 'W3_FQ924N', 'W3_FQ937N', 'W3_F953N',
 'W2_F953N', 'W3_F098M', 'W3_G102', 'W2_F1042M', 'W3_F105W', 'W3_F110W',
 'W3_F125W', 'W3_F126N', 'W3_F127M', 'W3_F128N', 'W3_F130N', 'W3_F132N',
 'W3_F139M', 'W3_F140W', 'W3_G141', 'W3_F153M', 'W3_F160W', 'W3_F164N',
Table length=19
name              type   description
MatchID           long   identifier for the match
MatchRA           float  right ascension coordinate of the match position
MatchDec          float  declination coordinate of the match position
DSigma            float  standard deviation of source positions in match
AbsCorr           char   indicator of whether the match contains sources that are aligned to a standard catalog
NumFilters        int    number of filters in match with sources detected in the aper2 aperture
NumVisits         int    number of visits in match with sources detected in the aper2 aperture
NumImages         int    number of Hubble Legacy Archive single filter, visit-combined (level 2) images in match with sources detected in the aper2 aperture
StartTime         char   earliest start time of exposures in match with sources detected in the aper2 aperture
StopTime          char   latest stop time of exposures in match with sources detected in the aper2 aperture
StartMJD          float  modified Julian date (MJD) for earliest start time of exposures in match with sources detected in the aper2 aperture
StopMJD           float  modified Julian date (MJD) for latest stop time of exposures in match with sources detected in the aper2 aperture
TargetName        char   name of a target for an exposure in match
CI                float  average normalized concentration index for sources detected in the aper2 aperture within the match
CI_Sigma          float  standard deviation of normalized concentration index values for sources detected in the aper2 aperture within the match
KronRadius        float  average Kron radius for sources detected in the aper2 aperture within the match
KronRadius_Sigma  float  standard deviation of Kron radius values for sources detected in the aper2 aperture within the match
Extinction        float  extinction, obtained from the NASA/IPAC Extragalactic Database (NED), along the line of sight to the match position
SpectrumFlag      char   Y/N indicator of whether there is a spectrum in the Hubble Legacy Archive for this match. If the value is Y, then there is an entry in table SpecCat for the match.
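
The column naming convention described above can be unpacked programmatically. The following is a minimal sketch, assuming the per-filter columns use the suffixes _MAD and _N (as in A_F475W_MAD and A_F475W_N, which are requested later in this notebook) and that the bare name holds the magnitude. The helper parse_filter_column is hypothetical and not part of the API.

```python
# Detector prefixes as described in the text above.
DETECTORS = {"A": "ACS/WFC", "W3": "WFC3", "W2": "WFPC2"}


def parse_filter_column(name):
    """Hypothetical helper: split a summary column into (detector, filter, kind)."""
    kind = "mag"
    # Strip a trailing _MAD or _N suffix if present (assumed naming scheme).
    for suffix, label in (("_MAD", "MAD"), ("_N", "N")):
        if name.endswith(suffix):
            name, kind = name[: -len(suffix)], label
            break
    # Everything before the first underscore is the detector prefix.
    prefix, _, filt = name.partition("_")
    return DETECTORS[prefix], filt, kind


for col in ("A_F814W", "W3_F814W_MAD", "W2_F814W_N"):
    print(col, "->", parse_filter_column(col))

# 19 general columns plus 3 columns for each of the 133 filters.
print(19 + 3 * 133)
```

The final line reproduces the column count reported by the metadata query: the 19 general columns listed in the table plus three columns per filter.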

Find variable objects in the dwarf irregular galaxy IC 1613 #

This is based on HSC Use Case #3, which shows an example of selecting objects from the HSC in the MAST Portal. The same selection is simple to do using the HSC API.

Use astropy name resolver to get position of IC 1613 #

target = 'IC 1613'
coord_ic1613 = SkyCoord.from_name(target)

# extract the coordinates in decimal degrees
ra_ic1613 = coord_ic1613.ra.degree
dec_ic1613 = coord_ic1613.dec.degree
print(f'ra: {ra_ic1613}\ndec: {dec_ic1613}')
ra: 16.2016962
dec: 2.1194959

Select objects with enough measurements to determine variability #

This searches the summary table for objects within 0.5 degrees of the galaxy center that have at least 10 measurements in both ACS F475W and F814W.

# save typing a quoted list of columns
columns = """MatchID,MatchRA,MatchDec,NumFilters,NumVisits,NumImages,StartMJD,StopMJD,
    A_F475W, A_F475W_N, A_F475W_MAD,
    A_F814W, A_F814W_N, A_F814W_MAD""".split(",")
columns = [x.strip() for x in columns]
columns = [x for x in columns if x and not x.startswith('#')]

constraints = {'A_F475W_N.gte': 10, 'A_F814W_N.gte': 10}

t0 = time.time()
tab = hsccone(ra_ic1613, dec_ic1613, 0.5, table="summary", release='v3', columns=columns, verbose=True, format="table", **constraints)
print(f"{(time.time()-t0):.1f} s: retrieved data and converted to {len(tab)}-row astropy table")

# clean up the output format
tab['A_F475W'].format = "{:.3f}"
tab['A_F475W_MAD'].format = "{:.3f}"
tab['A_F814W'].format = "{:.3f}"
tab['A_F814W_MAD'].format = "{:.3f}"
tab['MatchRA'].format = "{:.6f}"
tab['MatchDec'].format = "{:.6f}"
tab['StartMJD'].format = "{:.5f}"
tab['StopMJD'].format = "{:.5f}"
7.9 s: retrieved data and converted to 18666-row astropy table

Plot object positions on the sky#

We mark the galaxy center as well. Note that this field is in the outskirts of IC 1613; the 0.5 degree search radius (the maximum allowed by the API) is large enough to include these objects.

fig, ax = plt.subplots(figsize=(10, 10))
ax.plot('MatchRA', 'MatchDec', 'bo', markersize=1, data=tab, label=f'{len(tab)} HSC measurements')
ax.plot(ra_ic1613, dec_ic1613, 'rx', label=target, markersize=10)
ax.set(xlabel='RA [deg]', ylabel='Dec [deg]', aspect='equal')
_ = ax.legend(loc='best')
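
The cone selection itself is done by the server, but the geometry is easy to check locally. The sketch below computes the great-circle separation between two sky positions with the haversine formula. The function name angular_separation_deg and the test position are made up for illustration.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))


center = (16.2016962, 2.1194959)  # IC 1613 position printed above
# Made-up test position, offset 0.4 degrees north of the center.
sep = angular_separation_deg(*center, center[0], center[1] + 0.4)
print(sep <= 0.5)
```

With astropy available, SkyCoord.separation gives the same quantity; the explicit formula is shown only to make the calculation visible.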

Plot MAD variability index versus magnitude in F475W #

The median absolute deviation is measured among the ~12 magnitude measurements in the catalog. Some scatter is expected from noise (which increases for fainter objects). Objects with high MAD values are likely to be variable.

Select variable candidates (MAD > 0.1 mag) that are not too faint, keeping objects with 21 < F475W < 24 mag.

wvar = np.where((tab['A_F475W_MAD'] > 0.1) & (tab['A_F475W'] < 24) & (tab['A_F475W'] > 21))[0]

fig, ax = plt.subplots(figsize=(10, 10))
ax.plot('A_F475W', 'A_F475W_MAD', 'bo', markersize=2, alpha=0.1, data=tab,
        label=f'{len(tab)} HSC measurements near {target}')
ax.plot('A_F475W', 'A_F475W_MAD', 'ro', markersize=5, data=tab[wvar],
        label=f'{len(wvar)} variable candidates')
ax.set(xlabel='A_F475W [mag]', ylabel='A_F475W_MAD [mag]')
_ = ax.legend(loc='best')
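
To make the variability index concrete, here is a minimal sketch of the median absolute deviation, assuming the plain (unscaled) definition: the median of the absolute offsets from the median. The magnitude lists are made-up values, not catalog data.

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    values = list(values)
    med = median(values)
    return median(abs(v - med) for v in values)


# Made-up magnitudes for a steady source and a variable one.
steady = [22.50, 22.52, 22.49, 22.51, 22.50, 22.48]
variable = [22.10, 22.60, 22.35, 22.80, 22.05, 22.70]

print(median_absolute_deviation(steady) > 0.1)
print(median_absolute_deviation(variable) > 0.1)
```

Because it relies on medians, the MAD is barely affected by a single outlying measurement (for example a cosmic-ray hit), which is why it is a robust choice for flagging variables.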

Check positions of variable objects in a color-magnitude diagram #

Note that these objects are generally located in the Cepheid instability strip.

b_minus_i = tab['A_F475W'] - tab['A_F814W']

fig, ax = plt.subplots(figsize=(10, 10))
ax.plot(b_minus_i, tab['A_F475W'], 'bo', markersize=2, alpha=0.1,
        label=f'{len(tab)} HSC measurements near {target}')
ax.plot(b_minus_i[wvar], tab['A_F475W'][wvar], 'ro', markersize=5,
        label=f'{len(wvar)} variable candidates')
ax.set(xlabel='A_F475W - A_F814W [mag]', ylabel='A_F475W [mag]')
_ = ax.legend(loc='best')

Query the API for the light curve for one of the objects #

Select the most variable object as an example.

wvar = wvar[np.argsort(-tab['A_F475W_MAD'][wvar])]
iselect = wvar[0]
print(f"MatchID {tab['MatchID'][iselect]} B = {tab['A_F475W'][iselect]:.3f} B-I = {b_minus_i[iselect]:.3f}")
MatchID 80189155 B = 22.451 B-I = 0.450

Get column metadata for detailed observation table (which has time-dependent magnitudes).

meta = hscmetadata("detailed")
print(len(meta), "columns in detailed")
pprint(meta['name'].tolist(), compact=True)
39 columns in detailed
['CatID', 'MatchID', 'MemID', 'SourceID', 'ImageID', 'Det', 'MatchRA',
 'MatchDec', 'SourceRA', 'SourceDec', 'D', 'DSigma', 'AbsCorr', 'XImage',
 'YImage', 'ImageName', 'Instrument', 'Mode', 'Detector', 'Aperture',
 'ExposureTime', 'StartTime', 'StopTime', 'StartMJD', 'StopMJD', 'WaveLength',
 'Filter', 'TargetName', 'FluxAper2', 'MagAper2', 'MagAuto', 'PropID', 'CI',
 'KronRadius', 'Flags', 'HTMID', 'X', 'Y', 'Z']

Get separate light curves for F475W and F814W from the detailed table#

columns = """MatchID,SourceID,StartMJD,Detector,Filter,MagAper2,Flags,ImageName""".split(",")
columns = [x.strip() for x in columns]
columns = [x for x in columns if x and not x.startswith('#')]

constraints = {'MatchID': tab['MatchID'][iselect], 'Detector': 'ACS/WFC'}
t0 = time.time()
f475 = hscsearch(table="detailed", release='v3', columns=columns, Filter='F475W', format="table", **constraints)
f814 = hscsearch(table="detailed", release='v3', columns=columns, Filter='F814W', format="table", **constraints)
print(f"{time.time()-t0:.1f} s: retrieved data and converted to {len(f475)} (F475W) and {len(f814)} (F814W) row astropy tables")

f475['MagAper2'].format = "{:.3f}"
f475['StartMJD'].format = "{:.5f}"
f814['MagAper2'].format = "{:.3f}"
f814['StartMJD'].format = "{:.5f}"

0.5 s: retrieved data and converted to 12 (F475W) and 12 (F814W) row astropy tables

Plot the light curves#

The light curves appear well-behaved and are closely correlated in the two filters.

fig, ax = plt.subplots(figsize=(10, 6), tight_layout=True)

ax.plot('StartMJD', 'MagAper2', 'bo', data=f475, label='ACS/WFC F475W')
ax.plot('StartMJD', 'MagAper2', 'ro', data=f814, label='ACS/WFC F814W')

ax.set(xlabel='MJD [days]', ylabel='[mag]')
_ = ax.legend(loc='best')
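
One way to put a number on how closely two light curves track each other is to interpolate one onto the epochs of the other and compute a correlation coefficient. The sketch below does this on synthetic data; the period, amplitudes, sampling times, and the helper filter_correlation are all hypothetical and are not taken from the catalog.

```python
import numpy as np

# Synthetic (made-up) light curves: the same sinusoidal variation sampled at
# slightly different times in each filter. Period and amplitudes are hypothetical.
period = 1.5  # days
t475 = np.linspace(0.0, 3.0, 12)
t814 = t475 + 0.01
m475 = 22.5 + 0.40 * np.sin(2 * np.pi * t475 / period)
m814 = 22.0 + 0.25 * np.sin(2 * np.pi * t814 / period)


def filter_correlation(t_a, m_a, t_b, m_b):
    """Pearson correlation after interpolating light curve b onto the epochs of a."""
    order = np.argsort(t_b)  # np.interp needs increasing sample times
    m_b_on_a = np.interp(t_a, np.asarray(t_b)[order], np.asarray(m_b)[order])
    return np.corrcoef(m_a, m_b_on_a)[0, 1]


r = filter_correlation(t475, m475, t814, m814)
print(f"correlation = {r:.3f}")
```

Linear interpolation is only reasonable when the two filters are sampled at nearly the same times compared with the variability timescale; for sparse or widely separated epochs a different comparison would be needed.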

Extract HLA cutout images for the F475W images #

Get HLA F475W cutout images for the example variable. The get_hla_cutout function reads a single cutout image (as a JPEG grayscale image) and returns a PIL image object. See the documentation on the fitscut image cutout service for more information on the web service being used.

Examining the images can be useful for identifying cosmic-ray contamination and other possible image artifacts. In this case, no issues are seen, so the light curve is likely to be reliable.

def get_hla_cutout(imagename, ra, dec, size=33, autoscale=99.5, asinh=1, zoom=1):
    """Get JPEG cutout for an image"""
    url = ""
    r = requests.get(url, params=dict(ra=ra, dec=dec, size=size, format="jpeg",
                                      red=imagename, autoscale=autoscale, asinh=asinh, zoom=zoom))
    # decode the returned JPEG bytes into a PIL image
    im = Image.open(BytesIO(r.content))
    return im

# sort images by magnitude from faintest to brightest
isort = np.argsort(-f475['MagAper2'])

imagename = f475['ImageName'][isort]
mag = f475['MagAper2'][isort]
mjd = f475['StartMJD'][isort]

nim = len(imagename)
ncols = 4 # images per row
nrows = (nim+ncols-1)//ncols

imsize = 15
mra = tab['MatchRA'][iselect]
mdec = tab['MatchDec'][iselect]
# download list of images; might take a minute
images = [get_hla_cutout(imagename[k], mra, mdec, size=imsize) for k in range(nim)]
plt.rcParams.update({"font.size": 11})
fig, axes = plt.subplots(nrows=nrows, ncols=ncols, figsize=(15, (15/ncols)*nrows), tight_layout=True)

axes = axes.flat

for i, (ax, img) in enumerate(zip(axes, images)):
    ax.imshow(img, origin="upper", cmap="gray")
    ax.set_title(f'{mjd[i]:.5f} f475w={mag[i]:.3f}')
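
Two small pieces of arithmetic in the cell above are worth spelling out: the ceiling division that sets the number of grid rows, and the sort on negated magnitudes that puts the faintest image first (larger magnitudes are fainter). A minimal sketch with made-up values follows; the helper grid_shape is hypothetical.

```python
def grid_shape(n_images, ncols=4):
    """Rows and columns needed to show n_images with ncols images per row."""
    nrows = (n_images + ncols - 1) // ncols  # ceiling division
    return nrows, ncols


print(grid_shape(12))  # a full grid
print(grid_shape(13))  # one extra image needs one extra row

# Made-up magnitudes: sort indices from faintest (largest value) to brightest.
mags = [22.1, 22.8, 22.4]
order = sorted(range(len(mags)), key=lambda i: -mags[i])
print(order)
```

When the number of images is not a multiple of ncols, the last row has unused axes; calling ax.axis('off') on those is one way to hide them.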