%% Generated by Sphinx.
\def\sphinxdocclass{report}
\documentclass[letterpaper,10pt,english]{sphinxmanual}
\ifdefined\pdfpxdimen
\let\sphinxpxdimen\pdfpxdimen\else\newdimen\sphinxpxdimen
\fi \sphinxpxdimen=.75bp\relax
\PassOptionsToPackage{warn}{textcomp}
\usepackage[utf8]{inputenc}
\ifdefined\DeclareUnicodeCharacter
% support both utf8 and utf8x syntaxes
\ifdefined\DeclareUnicodeCharacterAsOptional
\def\sphinxDUC#1{\DeclareUnicodeCharacter{"#1}}
\else
\let\sphinxDUC\DeclareUnicodeCharacter
\fi
\sphinxDUC{00A0}{\nobreakspace}
\sphinxDUC{2500}{\sphinxunichar{2500}}
\sphinxDUC{2502}{\sphinxunichar{2502}}
\sphinxDUC{2514}{\sphinxunichar{2514}}
\sphinxDUC{251C}{\sphinxunichar{251C}}
\sphinxDUC{2572}{\textbackslash}
\fi
\usepackage{cmap}
\usepackage[T1]{fontenc}
\usepackage{amsmath,amssymb,amstext}
\usepackage{babel}
\usepackage{times}
\expandafter\ifx\csname T@LGR\endcsname\relax
\else
% LGR was declared as font encoding
\substitutefont{LGR}{\rmdefault}{cmr}
\substitutefont{LGR}{\sfdefault}{cmss}
\substitutefont{LGR}{\ttdefault}{cmtt}
\fi
\expandafter\ifx\csname T@X2\endcsname\relax
\expandafter\ifx\csname T@T2A\endcsname\relax
\else
% T2A was declared as font encoding
\substitutefont{T2A}{\rmdefault}{cmr}
\substitutefont{T2A}{\sfdefault}{cmss}
\substitutefont{T2A}{\ttdefault}{cmtt}
\fi
\else
% X2 was declared as font encoding
\substitutefont{X2}{\rmdefault}{cmr}
\substitutefont{X2}{\sfdefault}{cmss}
\substitutefont{X2}{\ttdefault}{cmtt}
\fi
\usepackage[Bjarne]{fncychap}
\usepackage[,numfigreset=1,mathnumfig]{sphinx}
\fvset{fontsize=\small}
\usepackage{geometry}
% Include hyperref last.
\usepackage{hyperref}
% Fix anchor placement for figures with captions.
\usepackage{hypcap}% it must be loaded after hyperref.
% Set up styles of URL: it should be placed after hyperref.
\urlstyle{same}
\usepackage{sphinxmessages}
\title{Astronomical Data in Python}
\date{Nov 04, 2020}
\release{}
\author{Allen B.\@{} Downey}
\newcommand{\sphinxlogo}{\vbox{}}
\renewcommand{\releasename}{}
\makeindex
\begin{document}
\pagestyle{empty}
\sphinxmaketitle
\pagestyle{plain}
\sphinxtableofcontents
\pagestyle{normal}
\phantomsection\label{\detokenize{README::doc}}
\sphinxstyleemphasis{Astronomical Data in Python} is an introduction to tools and practices for working with astronomical data. Topics covered include:
\begin{itemize}
\item {}
Writing queries that select and download data from a database.
\item {}
Using data stored in an Astropy \sphinxcode{\sphinxupquote{Table}} or Pandas \sphinxcode{\sphinxupquote{DataFrame}}.
\item {}
Working with coordinates and other quantities with units.
\item {}
Storing data in various formats.
\item {}
Performing database join operations that combine data from multiple tables.
\item {}
Visualizing data and preparing publication\sphinxhyphen{}quality figures.
\end{itemize}
As a running example, we will replicate part of the analysis in a recent paper, “\sphinxhref{https://arxiv.org/abs/1805.00425}{Off the beaten path: Gaia reveals GD\sphinxhyphen{}1 stars outside of the main stream}” by Adrian M. Price\sphinxhyphen{}Whelan and Ana Bonaca.
This material was developed in collaboration with \sphinxhref{https://carpentries.org/}{The Carpentries} and the Astronomy Curriculum Development Committee, and supported by funding from the American Institute of Physics through the American Astronomical Society.
I am grateful for contributions from the members of the committee \textendash{} Azalee Bostroem, Rodolfo Montez, and Phil Rosenfield \textendash{} and from Erin Becker, Brett Morris and Adrian Price\sphinxhyphen{}Whelan.
The original format of this material is a series of Jupyter notebooks. Using the
links below, you can read the notebooks on NBViewer or run them on Colab. If you
want to run the notebooks in your own environment, you can download them from
this repository and follow the instructions below to set up your environment.
This material is also available in the form of \sphinxhref{https://datacarpentry.github.io/astronomy-python}{Carpentries lessons}, but you should be
aware that these versions might diverge in the future.
\sphinxstylestrong{Prerequisites}
This material should be accessible to people familiar with basic Python, but not necessarily the libraries we will use, like Astropy or Pandas. If you are familiar with Python lists and dictionaries, and you know how to write a function that takes parameters and returns a value, that should be enough.
We assume that you are familiar with astronomy at the undergraduate level, but we will not assume specialized knowledge of the datasets or analysis methods we'll use.
\sphinxstylestrong{Notebook 1}
This notebook demonstrates the following steps:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
Making a connection to the Gaia server,
\item {}
Exploring information about the database and the tables it contains,
\item {}
Writing a query and sending it to the server, and finally
\item {}
Downloading the response from the server as an Astropy \sphinxcode{\sphinxupquote{Table}}.
\end{enumerate}
Press this button to run this notebook on Colab:
\sphinxhref{https://colab.research.google.com/github/AllenDowney/AstronomicalData/blob/main/01\_query.ipynb}{Run this notebook on Colab}
\sphinxhref{https://nbviewer.jupyter.org/github/AllenDowney/AstronomicalData/blob/main/01\_query.ipynb}{or click here to read it on NBViewer}
\sphinxstylestrong{Notebook 2}
This notebook starts with an example that does a “cone search”; that is, it selects stars that appear in a circular region of the sky.
Then, to select stars in the vicinity of GD\sphinxhyphen{}1, we:
\begin{itemize}
\item {}
Use \sphinxcode{\sphinxupquote{Quantity}} objects to represent measurements with units.
\item {}
Use the \sphinxcode{\sphinxupquote{Gala}} library to convert coordinates from one frame to another.
\item {}
Use the ADQL keywords \sphinxcode{\sphinxupquote{POLYGON}}, \sphinxcode{\sphinxupquote{CONTAINS}}, and \sphinxcode{\sphinxupquote{POINT}} to select stars that fall within a polygonal region.
\item {}
Submit a query and download the results.
\item {}
Store the results in a FITS file.
\end{itemize}
Press this button to run this notebook on Colab:
\sphinxhref{https://colab.research.google.com/github/AllenDowney/AstronomicalData/blob/main/02\_coords.ipynb}{Run this notebook on Colab}
\sphinxhref{https://nbviewer.jupyter.org/github/AllenDowney/AstronomicalData/blob/main/02\_coords.ipynb}{or click here to read it on NBViewer}
\sphinxstylestrong{Notebook 3}
Here are the steps in this notebook:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
We'll read back the results from the previous notebook, which we saved in a FITS file.
\item {}
Then we'll transform the coordinates and proper motion data from ICRS back to the coordinate frame of GD\sphinxhyphen{}1.
\item {}
We'll put those results into a Pandas \sphinxcode{\sphinxupquote{DataFrame}}, which we'll use to select stars near the centerline of GD\sphinxhyphen{}1.
\item {}
Plotting the proper motion of those stars, we'll identify a region of proper motion for stars that are likely to be in GD\sphinxhyphen{}1.
\item {}
Finally, we'll select and plot the stars whose proper motion is in that region.
\end{enumerate}
Press this button to run this notebook on Colab:
\sphinxhref{https://colab.research.google.com/github/AllenDowney/AstronomicalData/blob/main/03\_motion.ipynb}{Run this notebook on Colab}
\sphinxhref{https://nbviewer.jupyter.org/github/AllenDowney/AstronomicalData/blob/main/03\_motion.ipynb}{or click here to read it on NBViewer}
\sphinxstylestrong{Notebook 4}
Here are the steps in this notebook:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
Using data from the previous notebook, we'll identify the values of proper motion for stars likely to be in GD\sphinxhyphen{}1.
\item {}
Then we'll compose an ADQL query that selects stars based on proper motion, so we can download only the data we need.
\item {}
We'll also see how to write the results to a CSV file.
\end{enumerate}
That will make it possible to search a bigger region of the sky in a single query.
Press this button to run this notebook on Colab:
\sphinxhref{https://colab.research.google.com/github/AllenDowney/AstronomicalData/blob/main/04\_select.ipynb}{Run this notebook on Colab}
\sphinxhref{https://nbviewer.jupyter.org/github/AllenDowney/AstronomicalData/blob/main/04\_select.ipynb}{or click here to read it on NBViewer}
\sphinxstylestrong{Notebook 5}
Here are the steps in this notebook:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
We'll reload the candidate stars we identified in the previous notebook.
\item {}
Then we'll run a query on the Gaia server that uploads the table of candidates and uses a \sphinxcode{\sphinxupquote{JOIN}} operation to select photometry data for the candidate stars.
\item {}
We'll write the results to a file for use in the next notebook.
\end{enumerate}
Press this button to run this notebook on Colab:
\sphinxhref{https://colab.research.google.com/github/AllenDowney/AstronomicalData/blob/main/05\_join.ipynb}{Run this notebook on Colab}
\sphinxhref{https://nbviewer.jupyter.org/github/AllenDowney/AstronomicalData/blob/main/05\_join.ipynb}{or click here to read it on NBViewer}
\sphinxstylestrong{Notebook 6}
Here are the steps in this notebook:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
We'll reload the data from the previous notebook and make a color\sphinxhyphen{}magnitude diagram.
\item {}
Then we'll specify a polygon in the diagram that contains stars with the photometry we expect.
\item {}
Then we'll merge the photometry data with the list of candidate stars, storing the result in a Pandas \sphinxcode{\sphinxupquote{DataFrame}}.
\end{enumerate}
Press this button to run this notebook on Colab:
\sphinxhref{https://colab.research.google.com/github/AllenDowney/AstronomicalData/blob/main/06\_photo.ipynb}{Run this notebook on Colab}
\sphinxhref{https://nbviewer.jupyter.org/github/AllenDowney/AstronomicalData/blob/main/06\_photo.ipynb}{or click here to read it on NBViewer}
\sphinxstylestrong{Notebook 7}
Here are the steps in this notebook:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
Starting with the figure from the previous notebook, we'll add annotations to present the results more clearly.
\item {}
Then we'll see several ways to customize figures to make them more appealing and effective.
\item {}
Finally, we'll see how to make a figure with multiple panels or subplots.
\end{enumerate}
Press this button to run this notebook on Colab:
\sphinxhref{https://colab.research.google.com/github/AllenDowney/AstronomicalData/blob/main/07\_plot.ipynb}{Run this notebook on Colab}
\sphinxhref{https://nbviewer.jupyter.org/github/AllenDowney/AstronomicalData/blob/main/07\_plot.ipynb}{or click here to read it on NBViewer}
\sphinxstylestrong{Installation instructions}
Coming soon.
\chapter{Chapter 1}
\label{\detokenize{01_query:chapter-1}}\label{\detokenize{01_query::doc}}
\sphinxstyleemphasis{Astronomical Data in Python} is an introduction to tools and practices for working with astronomical data. Topics covered include:
\begin{itemize}
\item {}
Writing queries that select and download data from a database.
\item {}
Using data stored in an Astropy \sphinxcode{\sphinxupquote{Table}} or Pandas \sphinxcode{\sphinxupquote{DataFrame}}.
\item {}
Working with coordinates and other quantities with units.
\item {}
Storing data in various formats.
\item {}
Performing database join operations that combine data from multiple tables.
\item {}
Visualizing data and preparing publication\sphinxhyphen{}quality figures.
\end{itemize}
As a running example, we will replicate part of the analysis in a recent paper, “\sphinxhref{https://arxiv.org/abs/1805.00425}{Off the beaten path: Gaia reveals GD\sphinxhyphen{}1 stars outside of the main stream}” by Adrian M. Price\sphinxhyphen{}Whelan and Ana Bonaca.
As the abstract explains, “Using data from the Gaia second data release combined with Pan\sphinxhyphen{}STARRS photometry, we present a sample of highly\sphinxhyphen{}probable members of the longest cold stream in the Milky Way, GD\sphinxhyphen{}1.”
GD\sphinxhyphen{}1 is a \sphinxhref{https://en.wikipedia.org/wiki/List\_of\_stellar\_streams}{stellar stream}, which is “an association of stars orbiting a galaxy that was once a globular cluster or dwarf galaxy that has now been torn apart and stretched out along its orbit by tidal forces.”
\sphinxhref{https://www.sciencemag.org/news/2018/10/streams-stars-reveal-galaxy-s-violent-history-and-perhaps-its-unseen-dark-matter}{This article in \sphinxstyleemphasis{Science} magazine} explains some of the background, including the process that led to the paper and a discussion of the scientific implications:
\begin{itemize}
\item {}
“The streams are particularly useful for … galactic archaeology — rewinding the cosmic clock to reconstruct the assembly of the Milky Way.”
\item {}
“They also are being used as exquisitely sensitive scales to measure the galaxy's mass.”
\item {}
“… the streams are well\sphinxhyphen{}positioned to reveal the presence of dark matter … because the streams are so fragile, theorists say, collisions with marauding clumps of dark matter could leave telltale scars, potential clues to its nature.”
\end{itemize}
\section{Data}
\label{\detokenize{01_query:data}}
The datasets we will work with are:
\begin{itemize}
\item {}
\sphinxhref{https://en.wikipedia.org/wiki/Gaia\_(spacecraft)}{Gaia}, which is “a space observatory of the European Space Agency (ESA), launched in 2013 … designed for astrometry: measuring the positions, distances and motions of stars with unprecedented precision”, and
\item {}
\sphinxhref{https://en.wikipedia.org/wiki/Pan-STARRS}{Pan\sphinxhyphen{}STARRS}, the Panoramic Survey Telescope and Rapid Response System, which is a survey designed to monitor the sky for transient objects, producing a catalog with accurate astrometry and photometry of detected sources.
\end{itemize}
Both of these datasets are very large, which can make them challenging to work with. It might not be possible, or practical, to download the entire dataset.
One of the goals of this workshop is to provide tools for working with large datasets.
\section{Prerequisites}
\label{\detokenize{01_query:prerequisites}}
These notebooks are meant for people who are familiar with basic Python, but not necessarily the libraries we will use, like Astropy or Pandas. If you are familiar with Python lists and dictionaries, and you know how to write a function that takes parameters and returns a value, you know enough Python to get started.
We assume that you have some familiarity with operating systems, like the ability to use a command\sphinxhyphen{}line interface. But we don't assume you have any prior experience with databases.
We assume that you are familiar with astronomy at the undergraduate level, but we will not assume specialized knowledge of the datasets or analysis methods we'll use.
\section{Outline}
\label{\detokenize{01_query:outline}}
The first lesson demonstrates the steps for selecting and downloading data from the Gaia Database:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
First we'll make a connection to the Gaia server,
\item {}
We will explore information about the database and the tables it contains,
\item {}
We will write a query and send it to the server, and finally
\item {}
We will download the response from the server.
\end{enumerate}
After completing this lesson, you should be able to
\begin{itemize}
\item {}
Compose a basic query in ADQL.
\item {}
Use queries to explore a database and its tables.
\item {}
Use queries to download data.
\item {}
Develop, test, and debug a query incrementally.
\end{itemize}
\section{Query Language}
\label{\detokenize{01_query:query-language}}
In order to select data from a database, you have to compose a query, which is like a program written in a “query language”.
The query language we'll use is ADQL, which stands for “Astronomical Data Query Language”.
ADQL is a dialect of \sphinxhref{https://en.wikipedia.org/wiki/SQL}{SQL} (Structured Query Language), which is by far the most commonly used query language. Almost everything you will learn about ADQL also works in SQL.
\sphinxhref{http://www.ivoa.net/documents/ADQL/20180112/PR-ADQL-2.1-20180112.html}{The reference manual for ADQL is here}.
But you might find it easier to learn from \sphinxhref{https://www.gaia.ac.uk/data/gaia-data-release-1/adql-cookbook}{this ADQL Cookbook}.
\section{Installing libraries}
\label{\detokenize{01_query:installing-libraries}}
The library we'll use to get Gaia data is \sphinxhref{https://astroquery.readthedocs.io/en/latest/}{Astroquery}.
If you are running this notebook on Colab, you can run the following cell to install Astroquery and the other libraries we'll use.
If you are running this notebook on your own computer, you might have to install these libraries yourself.
If you are using this notebook as part of a Carpentries workshop, you should have received setup instructions.
TODO: Add a link to the instructions.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} If we\PYGZsq{}re running on Colab, install libraries}
\PYG{k+kn}{import} \PYG{n+nn}{sys}
\PYG{n}{IN\PYGZus{}COLAB} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{google.colab}\PYG{l+s+s1}{\PYGZsq{}} \PYG{o+ow}{in} \PYG{n}{sys}\PYG{o}{.}\PYG{n}{modules}
\PYG{k}{if} \PYG{n}{IN\PYGZus{}COLAB}\PYG{p}{:}
    \PYG{o}{!}pip install astroquery astro\PYGZhy{}gala pyia
\end{sphinxVerbatim}
\section{Connecting to Gaia}
\label{\detokenize{01_query:connecting-to-gaia}}
Astroquery provides \sphinxcode{\sphinxupquote{Gaia}}, which is an \sphinxhref{https://astroquery.readthedocs.io/en/latest/gaia/gaia.html}{object that represents a connection to the Gaia database}.
We can connect to the Gaia database like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{from} \PYG{n+nn}{astroquery}\PYG{n+nn}{.}\PYG{n+nn}{gaia} \PYG{k+kn}{import} \PYG{n}{Gaia}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
Created TAP+ (v1.2.1) \PYGZhy{} Connection:
Host: gea.esac.esa.int
Use HTTPS: True
Port: 443
SSL Port: 443
Created TAP+ (v1.2.1) \PYGZhy{} Connection:
Host: geadata.esac.esa.int
Use HTTPS: True
Port: 443
SSL Port: 443
\end{sphinxVerbatim}
Running this import statement has the effect of creating a \sphinxhref{http://www.ivoa.net/documents/TAP/}{TAP+} connection; TAP stands for “Table Access Protocol”. It is a network protocol for sending queries to the database and getting back the results. We're not sure why it seems to create two connections.
\section{Databases and Tables}
\label{\detokenize{01_query:databases-and-tables}}
What is a database, anyway? Most generally, it can be any collection of data, but when we are talking about ADQL or SQL:
\begin{itemize}
\item {}
A database is a collection of one or more named tables.
\item {}
Each table is a 2\sphinxhyphen{}D array with one or more named columns of data.
\end{itemize}
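To make these two definitions concrete without contacting the Gaia server, here is a minimal sketch using Python's built\sphinxhyphen{}in \sphinxcode{\sphinxupquote{sqlite3}} module. The table \sphinxcode{\sphinxupquote{toy\_source}} and its rows are invented for illustration, and SQLite speaks SQL rather than ADQL, but the structure is the same: a database holds named tables, and each table has named columns.

```python
import sqlite3

# Build a throwaway in-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# A table is a 2-D array with named columns (toy_source is a made-up name).
conn.execute("CREATE TABLE toy_source (source_id INTEGER, ra REAL, dec REAL)")
conn.executemany(
    "INSERT INTO toy_source VALUES (?, ?, ?)",
    [(1, 10.5, -3.2), (2, 11.0, -3.1)],
)

# A database is a collection of named tables: list their names.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Each table has named columns: read them from the table's metadata.
column_names = [row[1] for row in conn.execute("PRAGMA table_info(toy_source)")]

print(table_names)
print(column_names)
conn.close()
```

Listing names this way reads only metadata, not the rows themselves.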
We can use \sphinxcode{\sphinxupquote{Gaia.load\_tables}} to get the names of the tables in the Gaia database. With the option \sphinxcode{\sphinxupquote{only\_names=True}}, it loads information about the tables, called the “metadata”, not the data itself.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{tables} \PYG{o}{=} \PYG{n}{Gaia}\PYG{o}{.}\PYG{n}{load\PYGZus{}tables}\PYG{p}{(}\PYG{n}{only\PYGZus{}names}\PYG{o}{=}\PYG{k+kc}{True}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
INFO: Retrieving tables... [astroquery.utils.tap.core]
INFO: Parsing tables... [astroquery.utils.tap.core]
INFO: Done. [astroquery.utils.tap.core]
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k}{for} \PYG{n}{table} \PYG{o+ow}{in} \PYG{p}{(}\PYG{n}{tables}\PYG{p}{)}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{table}\PYG{o}{.}\PYG{n}{get\PYGZus{}qualified\PYGZus{}name}\PYG{p}{(}\PYG{p}{)}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
external.external.apassdr9
external.external.gaiadr2\PYGZus{}geometric\PYGZus{}distance
external.external.galex\PYGZus{}ais
external.external.ravedr5\PYGZus{}com
external.external.ravedr5\PYGZus{}dr5
external.external.ravedr5\PYGZus{}gra
external.external.ravedr5\PYGZus{}on
external.external.sdssdr13\PYGZus{}photoprimary
external.external.skymapperdr1\PYGZus{}master
external.external.tmass\PYGZus{}xsc
public.public.hipparcos
public.public.hipparcos\PYGZus{}newreduction
public.public.hubble\PYGZus{}sc
public.public.igsl\PYGZus{}source
public.public.igsl\PYGZus{}source\PYGZus{}catalog\PYGZus{}ids
public.public.tycho2
public.public.dual
tap\PYGZus{}config.tap\PYGZus{}config.coord\PYGZus{}sys
tap\PYGZus{}config.tap\PYGZus{}config.properties
tap\PYGZus{}schema.tap\PYGZus{}schema.columns
tap\PYGZus{}schema.tap\PYGZus{}schema.key\PYGZus{}columns
tap\PYGZus{}schema.tap\PYGZus{}schema.keys
tap\PYGZus{}schema.tap\PYGZus{}schema.schemas
tap\PYGZus{}schema.tap\PYGZus{}schema.tables
gaiadr1.gaiadr1.aux\PYGZus{}qso\PYGZus{}icrf2\PYGZus{}match
gaiadr1.gaiadr1.ext\PYGZus{}phot\PYGZus{}zero\PYGZus{}point
gaiadr1.gaiadr1.allwise\PYGZus{}best\PYGZus{}neighbour
gaiadr1.gaiadr1.allwise\PYGZus{}neighbourhood
gaiadr1.gaiadr1.gsc23\PYGZus{}best\PYGZus{}neighbour
gaiadr1.gaiadr1.gsc23\PYGZus{}neighbourhood
gaiadr1.gaiadr1.ppmxl\PYGZus{}best\PYGZus{}neighbour
gaiadr1.gaiadr1.ppmxl\PYGZus{}neighbourhood
gaiadr1.gaiadr1.sdss\PYGZus{}dr9\PYGZus{}best\PYGZus{}neighbour
gaiadr1.gaiadr1.sdss\PYGZus{}dr9\PYGZus{}neighbourhood
gaiadr1.gaiadr1.tmass\PYGZus{}best\PYGZus{}neighbour
gaiadr1.gaiadr1.tmass\PYGZus{}neighbourhood
gaiadr1.gaiadr1.ucac4\PYGZus{}best\PYGZus{}neighbour
gaiadr1.gaiadr1.ucac4\PYGZus{}neighbourhood
gaiadr1.gaiadr1.urat1\PYGZus{}best\PYGZus{}neighbour
gaiadr1.gaiadr1.urat1\PYGZus{}neighbourhood
gaiadr1.gaiadr1.cepheid
gaiadr1.gaiadr1.phot\PYGZus{}variable\PYGZus{}time\PYGZus{}series\PYGZus{}gfov
gaiadr1.gaiadr1.phot\PYGZus{}variable\PYGZus{}time\PYGZus{}series\PYGZus{}gfov\PYGZus{}statistical\PYGZus{}parameters
gaiadr1.gaiadr1.rrlyrae
gaiadr1.gaiadr1.variable\PYGZus{}summary
gaiadr1.gaiadr1.allwise\PYGZus{}original\PYGZus{}valid
gaiadr1.gaiadr1.gsc23\PYGZus{}original\PYGZus{}valid
gaiadr1.gaiadr1.ppmxl\PYGZus{}original\PYGZus{}valid
gaiadr1.gaiadr1.sdssdr9\PYGZus{}original\PYGZus{}valid
gaiadr1.gaiadr1.tmass\PYGZus{}original\PYGZus{}valid
gaiadr1.gaiadr1.ucac4\PYGZus{}original\PYGZus{}valid
gaiadr1.gaiadr1.urat1\PYGZus{}original\PYGZus{}valid
gaiadr1.gaiadr1.gaia\PYGZus{}source
gaiadr1.gaiadr1.tgas\PYGZus{}source
gaiadr2.gaiadr2.aux\PYGZus{}allwise\PYGZus{}agn\PYGZus{}gdr2\PYGZus{}cross\PYGZus{}id
gaiadr2.gaiadr2.aux\PYGZus{}iers\PYGZus{}gdr2\PYGZus{}cross\PYGZus{}id
gaiadr2.gaiadr2.aux\PYGZus{}sso\PYGZus{}orbit\PYGZus{}residuals
gaiadr2.gaiadr2.aux\PYGZus{}sso\PYGZus{}orbits
gaiadr2.gaiadr2.dr1\PYGZus{}neighbourhood
gaiadr2.gaiadr2.allwise\PYGZus{}best\PYGZus{}neighbour
gaiadr2.gaiadr2.allwise\PYGZus{}neighbourhood
gaiadr2.gaiadr2.apassdr9\PYGZus{}best\PYGZus{}neighbour
gaiadr2.gaiadr2.apassdr9\PYGZus{}neighbourhood
gaiadr2.gaiadr2.gsc23\PYGZus{}best\PYGZus{}neighbour
gaiadr2.gaiadr2.gsc23\PYGZus{}neighbourhood
gaiadr2.gaiadr2.hipparcos2\PYGZus{}best\PYGZus{}neighbour
gaiadr2.gaiadr2.hipparcos2\PYGZus{}neighbourhood
gaiadr2.gaiadr2.panstarrs1\PYGZus{}best\PYGZus{}neighbour
gaiadr2.gaiadr2.panstarrs1\PYGZus{}neighbourhood
gaiadr2.gaiadr2.ppmxl\PYGZus{}best\PYGZus{}neighbour
gaiadr2.gaiadr2.ppmxl\PYGZus{}neighbourhood
gaiadr2.gaiadr2.ravedr5\PYGZus{}best\PYGZus{}neighbour
gaiadr2.gaiadr2.ravedr5\PYGZus{}neighbourhood
gaiadr2.gaiadr2.sdssdr9\PYGZus{}best\PYGZus{}neighbour
gaiadr2.gaiadr2.sdssdr9\PYGZus{}neighbourhood
gaiadr2.gaiadr2.tmass\PYGZus{}best\PYGZus{}neighbour
gaiadr2.gaiadr2.tmass\PYGZus{}neighbourhood
gaiadr2.gaiadr2.tycho2\PYGZus{}best\PYGZus{}neighbour
gaiadr2.gaiadr2.tycho2\PYGZus{}neighbourhood
gaiadr2.gaiadr2.urat1\PYGZus{}best\PYGZus{}neighbour
gaiadr2.gaiadr2.urat1\PYGZus{}neighbourhood
gaiadr2.gaiadr2.sso\PYGZus{}observation
gaiadr2.gaiadr2.sso\PYGZus{}source
gaiadr2.gaiadr2.vari\PYGZus{}cepheid
gaiadr2.gaiadr2.vari\PYGZus{}classifier\PYGZus{}class\PYGZus{}definition
gaiadr2.gaiadr2.vari\PYGZus{}classifier\PYGZus{}definition
gaiadr2.gaiadr2.vari\PYGZus{}classifier\PYGZus{}result
gaiadr2.gaiadr2.vari\PYGZus{}long\PYGZus{}period\PYGZus{}variable
gaiadr2.gaiadr2.vari\PYGZus{}rotation\PYGZus{}modulation
gaiadr2.gaiadr2.vari\PYGZus{}rrlyrae
gaiadr2.gaiadr2.vari\PYGZus{}short\PYGZus{}timescale
gaiadr2.gaiadr2.vari\PYGZus{}time\PYGZus{}series\PYGZus{}statistics
gaiadr2.gaiadr2.panstarrs1\PYGZus{}original\PYGZus{}valid
gaiadr2.gaiadr2.gaia\PYGZus{}source
gaiadr2.gaiadr2.ruwe
\end{sphinxVerbatim}
So that's a lot of tables. The ones we'll use are:
\begin{itemize}
\item {}
\sphinxcode{\sphinxupquote{gaiadr2.gaia\_source}}, which contains Gaia data from \sphinxhref{https://www.cosmos.esa.int/web/gaia/data-release-2}{data release 2},
\item {}
\sphinxcode{\sphinxupquote{gaiadr2.panstarrs1\_original\_valid}}, which contains the photometry data we'll use from PanSTARRS, and
\item {}
\sphinxcode{\sphinxupquote{gaiadr2.panstarrs1\_best\_neighbour}}, which we'll use to cross\sphinxhyphen{}match each star observed by Gaia with the same star observed by PanSTARRS.
\end{itemize}
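With a list this long, it can help to filter the names in Python rather than by eye. The following sketch works on a small hand\sphinxhyphen{}copied subset of the output above, so it runs without a connection; the variable names are chosen for illustration only. With a live connection, you could build the same list by calling \sphinxcode{\sphinxupquote{get\_qualified\_name}} on each entry of \sphinxcode{\sphinxupquote{tables}}.

```python
# A hand-copied subset of the qualified names printed above.
qualified_names = [
    "external.external.apassdr9",
    "gaiadr1.gaiadr1.gaia_source",
    "gaiadr2.gaiadr2.panstarrs1_best_neighbour",
    "gaiadr2.gaiadr2.panstarrs1_neighbourhood",
    "gaiadr2.gaiadr2.panstarrs1_original_valid",
    "gaiadr2.gaiadr2.gaia_source",
    "gaiadr2.gaiadr2.ruwe",
]

# Keep only the tables in the gaiadr2 schema.
dr2_names = [name for name in qualified_names if name.startswith("gaiadr2.")]

# Within those, find the tables related to Pan-STARRS.
panstarrs_names = [name for name in dr2_names if "panstarrs1" in name]

for name in panstarrs_names:
    print(name)
```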
We can use \sphinxcode{\sphinxupquote{load\_table}} (not \sphinxcode{\sphinxupquote{load\_tables}}) to get the metadata for a single table. The name of this function is misleading, because it only downloads metadata.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{meta} \PYG{o}{=} \PYG{n}{Gaia}\PYG{o}{.}\PYG{n}{load\PYGZus{}table}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gaiadr2.gaia\PYGZus{}source}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{meta}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
Retrieving table \PYGZsq{}gaiadr2.gaia\PYGZus{}source\PYGZsq{}
Parsing table \PYGZsq{}gaiadr2.gaia\PYGZus{}source\PYGZsq{}...
Done.
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}astroquery.utils.tap.model.taptable.TapTableMeta at 0x7f922376e0a0\PYGZgt{}
\end{sphinxVerbatim}
Jupyter shows that the result is an object of type \sphinxcode{\sphinxupquote{TapTableMeta}}, but it does not display the contents.
To see the metadata, we have to print the object.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{meta}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
TAP Table name: gaiadr2.gaiadr2.gaia\PYGZus{}source
Description: This table has an entry for every Gaia observed source as listed in the
Main Database accumulating catalogue version from which the catalogue
release has been generated. It contains the basic source parameters,
that is only final data (no epoch data) and no spectra (neither final
nor epoch).
Num. columns: 96
\end{sphinxVerbatim}
Notice one gotcha: in the list of table names, this table appears as \sphinxcode{\sphinxupquote{gaiadr2.gaiadr2.gaia\_source}}, but when we load the metadata, we refer to it as \sphinxcode{\sphinxupquote{gaiadr2.gaia\_source}}.
\sphinxstylestrong{Exercise:} Go back and try
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{meta} \PYG{o}{=} \PYG{n}{Gaia}\PYG{o}{.}\PYG{n}{load\PYGZus{}table}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gaiadr2.gaiadr2.gaia\PYGZus{}source}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
What happens? Is the error message helpful? If you had not made this error deliberately, would you have been able to figure it out?
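One way to sidestep this gotcha is a small helper that converts a name as it appears in the list of tables into the shorter form we passed to \sphinxcode{\sphinxupquote{load\_table}}. This is only a sketch: the function \sphinxcode{\sphinxupquote{short\_table\_name}} is hypothetical, not part of Astroquery, and it assumes the only difference between the two forms is a repeated schema prefix, as in the output above.

```python
def short_table_name(qualified_name):
    """Drop a repeated leading schema, e.g. 'a.a.t' -> 'a.t'.

    Hypothetical helper, not part of Astroquery.
    """
    parts = qualified_name.split(".")
    # Only collapse the prefix when the first two parts are identical.
    if len(parts) >= 3 and parts[0] == parts[1]:
        parts = parts[1:]
    return ".".join(parts)


print(short_table_name("gaiadr2.gaiadr2.gaia_source"))
print(short_table_name("gaiadr2.gaia_source"))
```

Names that are already in the short form pass through unchanged, so the helper is safe to apply to either kind of name.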
\section{Columns}
\label{\detokenize{01_query:columns}}
The following loop prints the names of the columns in the table.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k}{for} \PYG{n}{column} \PYG{o+ow}{in} \PYG{n}{meta}\PYG{o}{.}\PYG{n}{columns}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{column}\PYG{o}{.}\PYG{n}{name}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
solution\PYGZus{}id
designation
source\PYGZus{}id
random\PYGZus{}index
ref\PYGZus{}epoch
ra
ra\PYGZus{}error
dec
dec\PYGZus{}error
parallax
parallax\PYGZus{}error
parallax\PYGZus{}over\PYGZus{}error
pmra
pmra\PYGZus{}error
pmdec
pmdec\PYGZus{}error
ra\PYGZus{}dec\PYGZus{}corr
ra\PYGZus{}parallax\PYGZus{}corr
ra\PYGZus{}pmra\PYGZus{}corr
ra\PYGZus{}pmdec\PYGZus{}corr
dec\PYGZus{}parallax\PYGZus{}corr
dec\PYGZus{}pmra\PYGZus{}corr
dec\PYGZus{}pmdec\PYGZus{}corr
parallax\PYGZus{}pmra\PYGZus{}corr
parallax\PYGZus{}pmdec\PYGZus{}corr
pmra\PYGZus{}pmdec\PYGZus{}corr
astrometric\PYGZus{}n\PYGZus{}obs\PYGZus{}al
astrometric\PYGZus{}n\PYGZus{}obs\PYGZus{}ac
astrometric\PYGZus{}n\PYGZus{}good\PYGZus{}obs\PYGZus{}al
astrometric\PYGZus{}n\PYGZus{}bad\PYGZus{}obs\PYGZus{}al
astrometric\PYGZus{}gof\PYGZus{}al
astrometric\PYGZus{}chi2\PYGZus{}al
astrometric\PYGZus{}excess\PYGZus{}noise
astrometric\PYGZus{}excess\PYGZus{}noise\PYGZus{}sig
astrometric\PYGZus{}params\PYGZus{}solved
astrometric\PYGZus{}primary\PYGZus{}flag
astrometric\PYGZus{}weight\PYGZus{}al
astrometric\PYGZus{}pseudo\PYGZus{}colour
astrometric\PYGZus{}pseudo\PYGZus{}colour\PYGZus{}error
mean\PYGZus{}varpi\PYGZus{}factor\PYGZus{}al
astrometric\PYGZus{}matched\PYGZus{}observations
visibility\PYGZus{}periods\PYGZus{}used
astrometric\PYGZus{}sigma5d\PYGZus{}max
frame\PYGZus{}rotator\PYGZus{}object\PYGZus{}type
matched\PYGZus{}observations
duplicated\PYGZus{}source
phot\PYGZus{}g\PYGZus{}n\PYGZus{}obs
phot\PYGZus{}g\PYGZus{}mean\PYGZus{}flux
phot\PYGZus{}g\PYGZus{}mean\PYGZus{}flux\PYGZus{}error
phot\PYGZus{}g\PYGZus{}mean\PYGZus{}flux\PYGZus{}over\PYGZus{}error
phot\PYGZus{}g\PYGZus{}mean\PYGZus{}mag
phot\PYGZus{}bp\PYGZus{}n\PYGZus{}obs
phot\PYGZus{}bp\PYGZus{}mean\PYGZus{}flux
phot\PYGZus{}bp\PYGZus{}mean\PYGZus{}flux\PYGZus{}error
phot\PYGZus{}bp\PYGZus{}mean\PYGZus{}flux\PYGZus{}over\PYGZus{}error
phot\PYGZus{}bp\PYGZus{}mean\PYGZus{}mag
phot\PYGZus{}rp\PYGZus{}n\PYGZus{}obs
phot\PYGZus{}rp\PYGZus{}mean\PYGZus{}flux
phot\PYGZus{}rp\PYGZus{}mean\PYGZus{}flux\PYGZus{}error
phot\PYGZus{}rp\PYGZus{}mean\PYGZus{}flux\PYGZus{}over\PYGZus{}error
phot\PYGZus{}rp\PYGZus{}mean\PYGZus{}mag
phot\PYGZus{}bp\PYGZus{}rp\PYGZus{}excess\PYGZus{}factor
phot\PYGZus{}proc\PYGZus{}mode
bp\PYGZus{}rp
bp\PYGZus{}g
g\PYGZus{}rp
radial\PYGZus{}velocity
radial\PYGZus{}velocity\PYGZus{}error
rv\PYGZus{}nb\PYGZus{}transits
rv\PYGZus{}template\PYGZus{}teff
rv\PYGZus{}template\PYGZus{}logg
rv\PYGZus{}template\PYGZus{}fe\PYGZus{}h
phot\PYGZus{}variable\PYGZus{}flag
l
b
ecl\PYGZus{}lon
ecl\PYGZus{}lat
priam\PYGZus{}flags
teff\PYGZus{}val
teff\PYGZus{}percentile\PYGZus{}lower
teff\PYGZus{}percentile\PYGZus{}upper
a\PYGZus{}g\PYGZus{}val
a\PYGZus{}g\PYGZus{}percentile\PYGZus{}lower
a\PYGZus{}g\PYGZus{}percentile\PYGZus{}upper
e\PYGZus{}bp\PYGZus{}min\PYGZus{}rp\PYGZus{}val
e\PYGZus{}bp\PYGZus{}min\PYGZus{}rp\PYGZus{}percentile\PYGZus{}lower
e\PYGZus{}bp\PYGZus{}min\PYGZus{}rp\PYGZus{}percentile\PYGZus{}upper
flame\PYGZus{}flags
radius\PYGZus{}val
radius\PYGZus{}percentile\PYGZus{}lower
radius\PYGZus{}percentile\PYGZus{}upper
lum\PYGZus{}val
lum\PYGZus{}percentile\PYGZus{}lower
lum\PYGZus{}percentile\PYGZus{}upper
datalink\PYGZus{}url
epoch\PYGZus{}photometry\PYGZus{}url
\end{sphinxVerbatim}
You can probably guess what many of these columns are by looking at the names, but you should resist the temptation to guess.
To find out what the columns mean, \sphinxhref{https://gea.esac.esa.int/archive/documentation/GDR2/Gaia\_archive/chap\_datamodel/sec\_dm\_main\_tables/ssec\_dm\_gaia\_source.html}{read the documentation}.
If you want to know what can go wrong when you don't read the documentation, \sphinxhref{https://www.vox.com/future-perfect/2019/6/4/18650969/married-women-miserable-fake-paul-dolan-happiness}{you might like this article}.
\sphinxstylestrong{Exercise:} One of the other tables we'll use is \sphinxcode{\sphinxupquote{gaiadr2.gaiadr2.panstarrs1\_original\_valid}}. Use \sphinxcode{\sphinxupquote{load\_table}} to get the metadata for this table. How many columns are there and what are their names?
Hint: Remember the gotcha we mentioned earlier.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} Solution}
\PYG{k}{for} \PYG{n}{column} \PYG{o+ow}{in} \PYG{n}{meta2}\PYG{o}{.}\PYG{n}{columns}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{column}\PYG{o}{.}\PYG{n}{name}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
obj\PYGZus{}name
obj\PYGZus{}id
ra
dec
ra\PYGZus{}error
dec\PYGZus{}error
epoch\PYGZus{}mean
g\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag
g\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag\PYGZus{}error
g\PYGZus{}flags
r\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag
r\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag\PYGZus{}error
r\PYGZus{}flags
i\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag
i\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag\PYGZus{}error
i\PYGZus{}flags
z\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag
z\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag\PYGZus{}error
z\PYGZus{}flags
y\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag
y\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag\PYGZus{}error
y\PYGZus{}flags
n\PYGZus{}detections
zone\PYGZus{}id
obj\PYGZus{}info\PYGZus{}flag
quality\PYGZus{}flag
\end{sphinxVerbatim}
\section{Writing queries}
\label{\detokenize{01_query:writing-queries}}
By now you might be wondering how we actually download the data. With tables this big, you generally don't. Instead, you use queries to select only the data you want.
A query is a string written in a query language like SQL; for the Gaia database, the query language is a dialect of SQL called ADQL.
Here's an example of an ADQL query.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{query1} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}\PYG{l+s+s2}{SELECT }
\PYG{l+s+s2}{TOP 10}
\PYG{l+s+s2}{source\PYGZus{}id, ref\PYGZus{}epoch, ra, dec, parallax }
\PYG{l+s+s2}{FROM gaiadr2.gaia\PYGZus{}source}\PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
\sphinxstylestrong{Python note:} We use a \sphinxhref{https://docs.python.org/3/tutorial/introduction.html\#strings}{triple\sphinxhyphen{}quoted string} here so we can include line breaks in the query, which makes it easier to read.
The words in uppercase are ADQL keywords:
\begin{itemize}
\item {}
\sphinxcode{\sphinxupquote{SELECT}} indicates that we are selecting data (as opposed to adding or modifying data).
\item {}
\sphinxcode{\sphinxupquote{TOP}} indicates that we only want the first 10 rows of the table, which is useful for testing a query before asking for all of the data.
\item {}
\sphinxcode{\sphinxupquote{FROM}} specifies which table we want data from.
\end{itemize}
The third line is a list of column names, indicating which columns we want.
In this example, the keywords are capitalized and the column names are lowercase. This is a common style, but it is not required. ADQL and SQL are not case\sphinxhyphen{}sensitive.
To run this query, we use the \sphinxcode{\sphinxupquote{Gaia}} object, which represents our connection to the Gaia database, and invoke \sphinxcode{\sphinxupquote{launch\_job}}:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{job1} \PYG{o}{=} \PYG{n}{Gaia}\PYG{o}{.}\PYG{n}{launch\PYGZus{}job}\PYG{p}{(}\PYG{n}{query1}\PYG{p}{)}
\PYG{n}{job1}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}astroquery.utils.tap.model.job.Job at 0x7f9222e9cb20\PYGZgt{}
\end{sphinxVerbatim}
The result is an object that represents the job running on a Gaia server.
If you print it, it displays metadata for the forthcoming table.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{job1}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}Table length=10\PYGZgt{}
name dtype unit description
\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}
source\PYGZus{}id int64 Unique source identifier (unique within a particular Data Release)
ref\PYGZus{}epoch float64 yr Reference epoch
ra float64 deg Right ascension
dec float64 deg Declination
parallax float64 mas Parallax
Jobid: None
Phase: COMPLETED
Owner: None
Output file: sync\PYGZus{}20201005090721.xml.gz
Results: None
\end{sphinxVerbatim}
Don't worry about \sphinxcode{\sphinxupquote{Results: None}}. That does not actually mean there are no results.
However, \sphinxcode{\sphinxupquote{Phase: COMPLETED}} indicates that the job is complete, so we can get the results like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{results1} \PYG{o}{=} \PYG{n}{job1}\PYG{o}{.}\PYG{n}{get\PYGZus{}results}\PYG{p}{(}\PYG{p}{)}
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{results1}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
astropy.table.table.Table
\end{sphinxVerbatim}
\sphinxstylestrong{Optional detail:} Why is \sphinxcode{\sphinxupquote{table}} repeated three times? The first is the name of the module, the second is the name of the submodule, and the third is the name of the class. Most of the time we only care about the last one. It's like the Linnean name for the western lowland gorilla, which is \sphinxstyleemphasis{Gorilla gorilla gorilla}.
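To see how such a dotted name is assembled, here is a minimal sketch that joins a class's \sphinxcode{\sphinxupquote{\_\_module\_\_}} and \sphinxcode{\sphinxupquote{\_\_qualname\_\_}} attributes. It uses standard\sphinxhyphen{}library classes rather than Astropy, and the helper name \sphinxcode{\sphinxupquote{full\_name}} is made up for this illustration.

```python
from datetime import datetime
from fractions import Fraction


def full_name(cls):
    """Return the dotted path of a class: module name, then class name."""
    return f"{cls.__module__}.{cls.__qualname__}"


# A module and the class it defines can share a name,
# which is why the same word can show up more than once.
print(full_name(datetime))
print(full_name(Fraction))
```

When a package, a module inside it, and a class all use the same word, the repetition simply gets one step longer.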
The result is an \sphinxhref{https://docs.astropy.org/en/stable/table/}{Astropy Table}, which is similar to a table in an SQL database except:
\begin{itemize}
\item {}
SQL databases are stored on disk drives, so they are persistent; that is, they “survive” even if you turn off the computer. An Astropy \sphinxcode{\sphinxupquote{Table}} is stored in memory; it disappears when you turn off the computer (or shut down this Jupyter notebook).
\item {}
SQL databases are designed to process queries. An Astropy \sphinxcode{\sphinxupquote{Table}} can perform some query\sphinxhyphen{}like operations, like selecting columns and rows. But these operations use Python syntax, not SQL.
\end{itemize}
Jupyter knows how to display the contents of a \sphinxcode{\sphinxupquote{Table}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{results1}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}Table length=10\PYGZgt{}
source\PYGZus{}id ref\PYGZus{}epoch ... dec parallax
yr ... deg mas
int64 float64 ... float64 float64
\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} ... \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}
4530738361793769600 2015.5 ... 20.40682117430378 0.9785380604519425
4530752651135081216 2015.5 ... 20.523350496351846 0.2674800612552977
4530743343951405568 2015.5 ... 20.474147574053124 \PYGZhy{}0.43911323550176806
4530755060627162368 2015.5 ... 20.558523922346158 1.1422630184554958
4530746844341315968 2015.5 ... 20.377852388898184 1.0092247424630945
4530768456615026432 2015.5 ... 20.31829694530366 \PYGZhy{}0.06900136127674149
4530763513119137280 2015.5 ... 20.20956829578524 0.1266016679823622
4530736364618539264 2015.5 ... 20.346579041327693 0.3894019486060072
4530735952305177728 2015.5 ... 20.311030903719928 0.2041189982608354
4530751281056022656 2015.5 ... 20.460309556214753 0.10294642821734962
\end{sphinxVerbatim}
Each column has a name, units, and a data type.
For example, the units of \sphinxcode{\sphinxupquote{ra}} and \sphinxcode{\sphinxupquote{dec}} are degrees, and their data type is \sphinxcode{\sphinxupquote{float64}}, which is a 64\sphinxhyphen{}bit floating\sphinxhyphen{}point number, used to store measurements with a fraction part.
This information comes from the Gaia database, and has been stored in the Astropy \sphinxcode{\sphinxupquote{Table}} by Astroquery.
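For a feel of what a 64\sphinxhyphen{}bit floating\sphinxhyphen{}point number is, note that Python's built\sphinxhyphen{}in \sphinxcode{\sphinxupquote{float}} uses the same format on common platforms. The sketch below checks that with the standard library only; it is an aside under that assumption and does not involve Astropy.

```python
import struct
import sys

# Size of a C double, which backs Python's float on common platforms.
size_bits = struct.calcsize("d") * 8
print(size_bits)

# Bits in the significand, and decimal digits that are always preserved.
print(sys.float_info.mant_dig, sys.float_info.dig)

# A declination from the table above survives a round trip through text.
dec = 20.40682117430378
print(float(repr(dec)) == dec)
```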
\sphinxstylestrong{Exercise:} Read \sphinxhref{https://gea.esac.esa.int/archive/documentation/GDR2/Gaia\_archive/chap\_datamodel/sec\_dm\_main\_tables/ssec\_dm\_gaia\_source.html}{the documentation of this table} and choose a column that looks interesting to you. Add the column name to the query and run it again. What are the units of the column you selected? What is its data type?
\section{Asynchronous queries}
\label{\detokenize{01_query:asynchronous-queries}}
\sphinxcode{\sphinxupquote{launch\_job}} asks the server to run the job “synchronously”, which normally means it runs immediately. But synchronous jobs are limited to 2000 rows. For queries that return more rows, you should run “asynchronously”, which means they might take longer to get started.
If you are not sure how many rows a query will return, you can use the SQL command \sphinxcode{\sphinxupquote{COUNT}} to find out how many rows are in the result without actually returning them. We'll see an example of this later.
The results of an asynchronous query are stored in a file on the server, so you can start a query and come back later to get the results.
For anonymous users, files are kept for three days.
As an example, let's try a query that's similar to \sphinxcode{\sphinxupquote{query1}}, with two changes:
\begin{itemize}
\item {}
It selects the first 3000 rows, so it is bigger than we should run synchronously.
\item {}
It uses a new keyword, \sphinxcode{\sphinxupquote{WHERE}}.
\end{itemize}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{query2} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}\PYG{l+s+s2}{SELECT TOP 3000}
\PYG{l+s+s2}{source\PYGZus{}id, ref\PYGZus{}epoch, ra, dec, parallax}
\PYG{l+s+s2}{FROM gaiadr2.gaia\PYGZus{}source}
\PYG{l+s+s2}{WHERE parallax \PYGZlt{} 1}
\PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
A \sphinxcode{\sphinxupquote{WHERE}} clause indicates which rows we want; in this case, the query selects only rows “where” \sphinxcode{\sphinxupquote{parallax}} is less than 1. This has the effect of selecting stars with relatively low parallax, which are farther away. We'll use this clause to exclude nearby stars that are unlikely to be part of GD\sphinxhyphen{}1.
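To make the link between parallax and distance concrete, here is a minimal sketch of the standard conversion: a distance in parsecs is 1000 divided by the parallax in milliarcseconds. The function name \sphinxcode{\sphinxupquote{distance\_pc}} is made up for this example, and the simple inversion is only a rough guide that assumes a positive, well\sphinxhyphen{}measured parallax.

```python
def distance_pc(parallax_mas):
    """Naive distance in parsecs from a parallax in milliarcseconds.

    Assumes a positive, well-measured parallax; this is only a rough guide.
    """
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive for a naive inversion")
    return 1000.0 / parallax_mas


# A parallax of 1 mas corresponds to 1000 pc, so smaller parallaxes
# correspond to naive distances beyond about 1 kpc.
for parallax in [2.0, 1.0, 0.5]:
    print(parallax, distance_pc(parallax))
```

Note that the condition \sphinxcode{\sphinxupquote{parallax < 1}} also keeps rows with zero or negative parallax, for which this conversion is not meaningful.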
\sphinxcode{\sphinxupquote{WHERE}} is one of the most common clauses in ADQL/SQL, and one of the most useful, because it allows us to select only the rows we need from the database.
We use \sphinxcode{\sphinxupquote{launch\_job\_async}} to submit an asynchronous query.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{job2} \PYG{o}{=} \PYG{n}{Gaia}\PYG{o}{.}\PYG{n}{launch\PYGZus{}job\PYGZus{}async}\PYG{p}{(}\PYG{n}{query2}\PYG{p}{)}
\PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{job2}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
INFO: Query finished. [astroquery.utils.tap.core]
\PYGZlt{}Table length=3000\PYGZgt{}
name dtype unit description
\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}
source\PYGZus{}id int64 Unique source identifier (unique within a particular Data Release)
ref\PYGZus{}epoch float64 yr Reference epoch
ra float64 deg Right ascension
dec float64 deg Declination
parallax float64 mas Parallax
Jobid: 1601903242219O
Phase: COMPLETED
Owner: None
Output file: async\PYGZus{}20201005090722.vot
Results: None
\end{sphinxVerbatim}
And here are the results.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{results2} \PYG{o}{=} \PYG{n}{job2}\PYG{o}{.}\PYG{n}{get\PYGZus{}results}\PYG{p}{(}\PYG{p}{)}
\PYG{n}{results2}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}Table length=3000\PYGZgt{}
source\PYGZus{}id ref\PYGZus{}epoch ... dec parallax
yr ... deg mas
int64 float64 ... float64 float64
\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} ... \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}
4530738361793769600 2015.5 ... 20.40682117430378 0.9785380604519425
4530752651135081216 2015.5 ... 20.523350496351846 0.2674800612552977
4530743343951405568 2015.5 ... 20.474147574053124 \PYGZhy{}0.43911323550176806
4530768456615026432 2015.5 ... 20.31829694530366 \PYGZhy{}0.06900136127674149
4530763513119137280 2015.5 ... 20.20956829578524 0.1266016679823622
4530736364618539264 2015.5 ... 20.346579041327693 0.3894019486060072
4530735952305177728 2015.5 ... 20.311030903719928 0.2041189982608354
4530751281056022656 2015.5 ... 20.460309556214753 0.10294642821734962
4530740938774409344 2015.5 ... 20.436140058941206 0.9242670062090182
... ... ... ... ...
4467710915011802624 2015.5 ... 1.1429085038160882 0.42361471245557913
4467706551328679552 2015.5 ... 1.0565747323689927 0.922888231734588
4467712255037300096 2015.5 ... 0.6581664892880896 \PYGZhy{}2.669179465293931
4467735001181761792 2015.5 ... 0.8947079323599124 0.6117399163086398
4467737101421916672 2015.5 ... 0.9806225910160181 \PYGZhy{}0.39818224846127004
4467707547757327488 2015.5 ... 1.0212759940136962 0.7741412301054209
4467732772094573056 2015.5 ... 0.9037072088489417 \PYGZhy{}1.7920417800164183
4467732355491087744 2015.5 ... 0.9197224705139885 \PYGZhy{}0.3464446494840354
4467717099766944512 2015.5 ... 0.726277659009568 0.05443955111134051
4467719058265781248 2015.5 ... 0.8205551921782785 0.3733943917490343
\end{sphinxVerbatim}
You might notice that some values of \sphinxcode{\sphinxupquote{parallax}} are negative. As \sphinxhref{https://www.cosmos.esa.int/web/gaia/archive-tips\#negative\%20parallax}{this FAQ explains}, “Negative parallaxes are caused by errors in the observations.” Negative parallaxes have “no physical meaning,” but they can be a “useful diagnostic on the quality of the astrometric solution.”
Later we will see an example where we use \sphinxcode{\sphinxupquote{parallax}} and \sphinxcode{\sphinxupquote{parallax\_error}} to identify stars where the distance estimate is likely to be inaccurate.
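One simple way to think about this is the ratio of a parallax to its standard error. The sketch below computes that ratio for three sources whose \sphinxcode{\sphinxupquote{parallax}} and \sphinxcode{\sphinxupquote{parallax\_error}} values appear in this notebook's outputs. The cutoff \sphinxcode{\sphinxupquote{MIN\_RATIO}} and the function name are assumptions made for illustration, and this is not necessarily the approach used in the later example.

```python
# (parallax, parallax_error) in mas for three sources from this notebook's outputs
measurements = {
    4467710915011802624: (0.42361471245557913, 0.470352406647465),
    4467706551328679552: (0.922888231734588, 0.927008559859825),
    4467712255037300096: (-2.669179465293931, 0.9719742773203504),
}

MIN_RATIO = 5.0  # hypothetical cutoff, chosen only for illustration


def parallax_over_error(parallax, error):
    """Ratio of a parallax measurement to its standard error."""
    return parallax / error


for source_id, (parallax, error) in measurements.items():
    ratio = parallax_over_error(parallax, error)
    reliable = ratio >= MIN_RATIO
    print(source_id, round(ratio, 2), reliable)
```

For all three sources the error is comparable to or larger than the parallax itself, so none of them passes this illustrative cutoff.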
\sphinxstylestrong{Exercise:} The clauses in a query have to be in the right order. Go back and change the order of the clauses in \sphinxcode{\sphinxupquote{query2}} and run it again.
The query should fail, but notice that you don't get much useful debugging information.
For this reason, developing and debugging ADQL queries can be really hard. A few suggestions that might help:
\begin{itemize}
\item {}
Whenever possible, start with a working query, either an example you find online or a query you have used in the past.
\item {}
Make small changes and test each change before you continue.
\item {}
While you are debugging, use \sphinxcode{\sphinxupquote{TOP}} to limit the number of rows in the result. That will make each attempt run faster, which reduces your testing time.
\item {}
Launching test queries synchronously might make them start faster, too.
\end{itemize}
\section{Operators}
\label{\detokenize{01_query:operators}}
In a \sphinxcode{\sphinxupquote{WHERE}} clause, you can use any of the \sphinxhref{https://www.w3schools.com/sql/sql\_operators.asp}{SQL comparison operators}; here are the most common ones:
\begin{savenotes}\sphinxattablestart
\centering
\begin{tabulary}{\linewidth}[t]{|T|T|}
\hline
\sphinxstyletheadfamily
Symbol
&\sphinxstyletheadfamily
Operation
\\
\hline
\sphinxcode{\sphinxupquote{\textgreater{}}}
&
greater than
\\
\hline
\sphinxcode{\sphinxupquote{\textless{}}}
&
less than
\\
\hline
\sphinxcode{\sphinxupquote{\textgreater{}=}}
&
greater than or equal
\\
\hline
\sphinxcode{\sphinxupquote{\textless{}=}}
&
less than or equal
\\
\hline
\sphinxcode{\sphinxupquote{=}}
&
equal
\\
\hline
\sphinxcode{\sphinxupquote{!=}} or \sphinxcode{\sphinxupquote{\textless{}\textgreater{}}}
&
not equal
\\
\hline
\end{tabulary}
\par
\sphinxattableend\end{savenotes}
Most of these are the same as Python, but some are not. In particular, notice that the equality operator is \sphinxcode{\sphinxupquote{=}}, not \sphinxcode{\sphinxupquote{==}}.
Be careful to keep your Python out of your ADQL!
You can combine comparisons using the logical operators:
\begin{itemize}
\item {}
AND: true if both comparisons are true
\item {}
OR: true if either or both comparisons are true
\end{itemize}
Finally, you can use \sphinxcode{\sphinxupquote{NOT}} to invert the result of a comparison.
\sphinxstylestrong{Exercise:} \sphinxhref{https://www.w3schools.com/sql/sql\_operators.asp}{Read about SQL operators here} and then modify the previous query to select rows where \sphinxcode{\sphinxupquote{bp\_rp}} is between \sphinxcode{\sphinxupquote{\sphinxhyphen{}0.75}} and \sphinxcode{\sphinxupquote{2}}.
You can \sphinxhref{https://gea.esac.esa.int/archive/documentation/GDR2/Gaia\_archive/chap\_datamodel/sec\_dm\_main\_tables/ssec\_dm\_gaia\_source.html}{read about this variable here}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} Solution}
\PYG{c+c1}{\PYGZsh{} This is what most people will probably do}
\PYG{n}{query} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}\PYG{l+s+s2}{SELECT TOP 10}
\PYG{l+s+s2}{source\PYGZus{}id, ref\PYGZus{}epoch, ra, dec, parallax}
\PYG{l+s+s2}{FROM gaiadr2.gaia\PYGZus{}source}
\PYG{l+s+s2}{WHERE parallax \PYGZlt{} 1 }
\PYG{l+s+s2}{ AND bp\PYGZus{}rp \PYGZgt{} \PYGZhy{}0.75 AND bp\PYGZus{}rp \PYGZlt{} 2}
\PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} Solution}
\PYG{c+c1}{\PYGZsh{} But if someone notices the BETWEEN operator, }
\PYG{c+c1}{\PYGZsh{} they might do this}
\PYG{n}{query} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}\PYG{l+s+s2}{SELECT TOP 10}
\PYG{l+s+s2}{source\PYGZus{}id, ref\PYGZus{}epoch, ra, dec, parallax}
\PYG{l+s+s2}{FROM gaiadr2.gaia\PYGZus{}source}
\PYG{l+s+s2}{WHERE parallax \PYGZlt{} 1 }
\PYG{l+s+s2}{ AND bp\PYGZus{}rp BETWEEN \PYGZhy{}0.75 AND 2}
\PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
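The two solutions are not quite identical: in SQL, \sphinxcode{\sphinxupquote{BETWEEN}} includes both endpoints, while the first solution uses strict inequalities. The following Python sketch mimics both conditions to show where they differ; the function names are made up for this example, and the code runs locally rather than on the Gaia server.

```python
def strict_range(bp_rp):
    """Mimics: bp_rp > -0.75 AND bp_rp < 2"""
    return bp_rp > -0.75 and bp_rp < 2


def between(bp_rp):
    """Mimics: bp_rp BETWEEN -0.75 AND 2, which includes both endpoints."""
    return -0.75 <= bp_rp <= 2


# The two conditions disagree only at the endpoints themselves.
for value in [-0.75, 0.0, 2.0, 2.5]:
    print(value, strict_range(value), between(value))
```

For continuous measurements like \sphinxcode{\sphinxupquote{bp\_rp}}, values that fall exactly on an endpoint are rare, so the two queries should select nearly the same rows.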
This \sphinxhref{https://sci.esa.int/web/gaia/-/60198-gaia-hertzsprung-russell-diagram}{Hertzsprung\sphinxhyphen{}Russell diagram} shows the BP\sphinxhyphen{}RP color and luminosity of stars in the Gaia catalog.
Selecting stars with \sphinxcode{\sphinxupquote{bp\_rp}} less than 2 excludes many \sphinxhref{https://xkcd.com/2360/}{class M dwarf stars}, which have low temperature and low luminosity. A star like that at GD\sphinxhyphen{}1's distance would be hard to detect, so if it is detected, it is more likely to be in the foreground.
\section{Cleaning up}
\label{\detokenize{01_query:cleaning-up}}
Asynchronous jobs have a \sphinxcode{\sphinxupquote{jobid}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{job1}\PYG{o}{.}\PYG{n}{jobid}\PYG{p}{,} \PYG{n}{job2}\PYG{o}{.}\PYG{n}{jobid}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
(None, \PYGZsq{}1601903242219O\PYGZsq{})
\end{sphinxVerbatim}
You can use the \sphinxcode{\sphinxupquote{jobid}} to remove a job from the server.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{Gaia}\PYG{o}{.}\PYG{n}{remove\PYGZus{}jobs}\PYG{p}{(}\PYG{p}{[}\PYG{n}{job2}\PYG{o}{.}\PYG{n}{jobid}\PYG{p}{]}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
Removed jobs: \PYGZsq{}[\PYGZsq{}1601903242219O\PYGZsq{}]\PYGZsq{}.
\end{sphinxVerbatim}
If you don't remove the job from the server, it will be removed eventually, so don't feel too bad if you don't clean up after yourself.
\section{Formatting queries}
\label{\detokenize{01_query:formatting-queries}}
So far the queries have been string “literals”, meaning that the entire string is part of the program.
But writing queries yourself can be slow, repetitive, and error\sphinxhyphen{}prone.
It is often a good idea to write Python code that assembles a query for you. One useful tool for that is the \sphinxhref{https://www.w3schools.com/python/ref\_string\_format.asp}{string \sphinxcode{\sphinxupquote{format}} method}.
As an example, we'll divide the previous query into two parts: a list of column names and a “base” for the query that contains everything except the column names.
Here's the list of columns we'll select.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{columns} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{source\PYGZus{}id, ra, dec, pmra, pmdec, parallax, parallax\PYGZus{}error, radial\PYGZus{}velocity}\PYG{l+s+s1}{\PYGZsq{}}
\end{sphinxVerbatim}
And here's the base; it's a string that contains at least one format specifier in curly brackets (braces).
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{query3\PYGZus{}base} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}\PYG{l+s+s2}{SELECT TOP 10 }
\PYG{l+s+si}{\PYGZob{}columns\PYGZcb{}}
\PYG{l+s+s2}{FROM gaiadr2.gaia\PYGZus{}source}
\PYG{l+s+s2}{WHERE parallax \PYGZlt{} 1}
\PYG{l+s+s2}{ AND bp\PYGZus{}rp BETWEEN \PYGZhy{}0.75 AND 2}
\PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
This base query contains one format specifier, \sphinxcode{\sphinxupquote{\{columns\}}}, which is a placeholder for the list of column names we will provide.
To assemble the query, we invoke \sphinxcode{\sphinxupquote{format}} on the base string and provide a keyword argument that assigns a value to \sphinxcode{\sphinxupquote{columns}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{query3} \PYG{o}{=} \PYG{n}{query3\PYGZus{}base}\PYG{o}{.}\PYG{n}{format}\PYG{p}{(}\PYG{n}{columns}\PYG{o}{=}\PYG{n}{columns}\PYG{p}{)}
\end{sphinxVerbatim}
The result is a string with line breaks. If you display it, the line breaks appear as \sphinxcode{\sphinxupquote{\textbackslash{}n}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{query3}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZsq{}SELECT TOP 10 \PYGZbs{}nsource\PYGZus{}id, ra, dec, pmra, pmdec, parallax, parallax\PYGZus{}error, radial\PYGZus{}velocity\PYGZbs{}nFROM gaiadr2.gaia\PYGZus{}source\PYGZbs{}nWHERE parallax \PYGZlt{} 1\PYGZbs{}n AND bp\PYGZus{}rp BETWEEN \PYGZhy{}0.75 AND 2\PYGZbs{}n\PYGZsq{}
\end{sphinxVerbatim}
But if you print it, the line breaks appear as… line breaks.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{query3}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
SELECT TOP 10
source\PYGZus{}id, ra, dec, pmra, pmdec, parallax, parallax\PYGZus{}error, radial\PYGZus{}velocity
FROM gaiadr2.gaia\PYGZus{}source
WHERE parallax \PYGZlt{} 1
AND bp\PYGZus{}rp BETWEEN \PYGZhy{}0.75 AND 2
\end{sphinxVerbatim}
Notice that the format specifier has been replaced with the value of \sphinxcode{\sphinxupquote{columns}}.
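The difference between displaying and printing a string comes down to \sphinxcode{\sphinxupquote{repr}} versus \sphinxcode{\sphinxupquote{print}}. Here is a small sketch with a shortened query; the string is made up for this example and is not sent to the server.

```python
query = """SELECT TOP 10
source_id, ra
FROM gaiadr2.gaia_source
"""

# repr() is what Jupyter shows for a bare expression: escapes are visible.
print(repr(query))

# print() writes the string itself, so each \n becomes a real line break.
print(query)
```

Either way the string is the same; only the way it is shown differs.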
Let's run it and see if it works:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{job3} \PYG{o}{=} \PYG{n}{Gaia}\PYG{o}{.}\PYG{n}{launch\PYGZus{}job}\PYG{p}{(}\PYG{n}{query3}\PYG{p}{)}
\PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{job3}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}Table length=10\PYGZgt{}
name dtype unit description n\PYGZus{}bad
\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}
source\PYGZus{}id int64 Unique source identifier (unique within a particular Data Release) 0
ra float64 deg Right ascension 0
dec float64 deg Declination 0
pmra float64 mas / yr Proper motion in right ascension direction 0
pmdec float64 mas / yr Proper motion in declination direction 0
parallax float64 mas Parallax 0
parallax\PYGZus{}error float64 mas Standard error of parallax 0
radial\PYGZus{}velocity float64 km / s Radial velocity 10
Jobid: None
Phase: COMPLETED
Owner: None
Output file: sync\PYGZus{}20201005090726.xml.gz
Results: None
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{results3} \PYG{o}{=} \PYG{n}{job3}\PYG{o}{.}\PYG{n}{get\PYGZus{}results}\PYG{p}{(}\PYG{p}{)}
\PYG{n}{results3}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}Table length=10\PYGZgt{}
source\PYGZus{}id ra ... parallax\PYGZus{}error radial\PYGZus{}velocity
deg ... mas km / s
int64 float64 ... float64 float64
\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} ... \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}
4467710915011802624 269.9680969307347 ... 0.470352406647465 \PYGZhy{}\PYGZhy{}
4467706551328679552 270.033164589881 ... 0.927008559859825 \PYGZhy{}\PYGZhy{}
4467712255037300096 270.7724717923047 ... 0.9719742773203504 \PYGZhy{}\PYGZhy{}
4467735001181761792 270.3628606248308 ... 0.509812721702093 \PYGZhy{}\PYGZhy{}
4467737101421916672 270.5110834661444 ... 0.7549581886719651 \PYGZhy{}\PYGZhy{}
4467707547757327488 269.88746280594927 ... 0.3022057897812064 \PYGZhy{}\PYGZhy{}
4467732355491087744 270.6730790702491 ... 0.4937921513912002 \PYGZhy{}\PYGZhy{}
4467717099766944512 270.57667173120825 ... 0.8867339293525688 \PYGZhy{}\PYGZhy{}
4467719058265781248 270.7248052971514 ... 0.390952370410666 \PYGZhy{}\PYGZhy{}
4467722326741572352 270.87431291888504 ... 0.1660452431882023 \PYGZhy{}\PYGZhy{}
\end{sphinxVerbatim}
Good so far.
\sphinxstylestrong{Exercise:} This query always selects sources with \sphinxcode{\sphinxupquote{parallax}} less than 1. But suppose you want to take that upper bound as an input.
Modify \sphinxcode{\sphinxupquote{query3\_base}} to replace \sphinxcode{\sphinxupquote{1}} with a format specifier like \sphinxcode{\sphinxupquote{\{max\_parallax\}}}. Now, when you call \sphinxcode{\sphinxupquote{format}}, add a keyword argument that assigns a value to \sphinxcode{\sphinxupquote{max\_parallax}}, and confirm that the format specifier gets replaced with the value you provide.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} Solution}
\PYG{n}{query4\PYGZus{}base} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}\PYG{l+s+s2}{SELECT TOP 10}
\PYG{l+s+si}{\PYGZob{}columns\PYGZcb{}}
\PYG{l+s+s2}{FROM gaiadr2.gaia\PYGZus{}source}
\PYG{l+s+s2}{WHERE parallax \PYGZlt{} }\PYG{l+s+si}{\PYGZob{}max\PYGZus{}parallax\PYGZcb{}}\PYG{l+s+s2}{ AND }
\PYG{l+s+s2}{bp\PYGZus{}rp BETWEEN \PYGZhy{}0.75 AND 2}
\PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} Solution}
\PYG{n}{query4} \PYG{o}{=} \PYG{n}{query4\PYGZus{}base}\PYG{o}{.}\PYG{n}{format}\PYG{p}{(}\PYG{n}{columns}\PYG{o}{=}\PYG{n}{columns}\PYG{p}{,}
\PYG{n}{max\PYGZus{}parallax}\PYG{o}{=}\PYG{l+m+mf}{0.5}\PYG{p}{)}
\PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{query4}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
SELECT TOP 10
source\PYGZus{}id, ra, dec, pmra, pmdec, parallax, parallax\PYGZus{}error, radial\PYGZus{}velocity
FROM gaiadr2.gaia\PYGZus{}source
WHERE parallax \PYGZlt{} 0.5 AND
bp\PYGZus{}rp BETWEEN \PYGZhy{}0.75 AND 2
\end{sphinxVerbatim}
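Taking this one step further, the template and the call to \sphinxcode{\sphinxupquote{format}} can be wrapped in a function that accepts the column names as a Python list. This is only a sketch: the names \sphinxcode{\sphinxupquote{QUERY\_BASE}} and \sphinxcode{\sphinxupquote{make\_query}} and the extra \sphinxcode{\sphinxupquote{\{top\}}} specifier are assumptions made for this example, and the function only builds the string without contacting the server.

```python
QUERY_BASE = """SELECT TOP {top}
{columns}
FROM gaiadr2.gaia_source
WHERE parallax < {max_parallax}
  AND bp_rp BETWEEN -0.75 AND 2
"""


def make_query(column_names, max_parallax, top=10):
    """Fill in the template; column_names is a list of strings."""
    columns = ", ".join(column_names)
    return QUERY_BASE.format(top=top, columns=columns, max_parallax=max_parallax)


query = make_query(["source_id", "ra", "dec", "parallax"], max_parallax=0.5)
print(query)
```

Keeping the column names in a list makes it easy to add or remove one without editing the query text by hand.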
\sphinxstylestrong{Style note:} You might notice that the variable names in this notebook are numbered, like \sphinxcode{\sphinxupquote{query1}}, \sphinxcode{\sphinxupquote{query2}}, etc.
The advantage of this style is that it isolates each section of the notebook from the others, so if you go back and run the cells out of order, it's less likely that you will get unexpected interactions.
A drawback of this style is that it can be a nuisance to update the notebook if you add, remove, or reorder a section.
What do you think of this choice? Are there alternatives you prefer?
\section{Summary}
\label{\detokenize{01_query:summary}}
This notebook demonstrates the following steps:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
Making a connection to the Gaia server,
\item {}
Exploring information about the database and the tables it contains,
\item {}
Writing a query and sending it to the server, and finally
\item {}
Downloading the response from the server as an Astropy \sphinxcode{\sphinxupquote{Table}}.
\end{enumerate}
\section{Best practices}
\label{\detokenize{01_query:best-practices}}\begin{itemize}
\item {}
If you can't download an entire dataset (or it's not practical), use queries to select the data you need.
\item {}
Read the metadata and the documentation to make sure you understand the tables, their columns, and what they mean.
\item {}
Develop queries incrementally: start with something simple, test it, and add a little bit at a time.
\item {}
Use ADQL features like \sphinxcode{\sphinxupquote{TOP}} and \sphinxcode{\sphinxupquote{COUNT}} to test before you run a query that might return a lot of data.
\item {}
If you know your query will return fewer than 2000 rows, you can run it synchronously, which might complete faster (but it doesn't seem to make much difference). If it might return more than 2000 rows, you should run it asynchronously.
\item {}
ADQL and SQL are not case\sphinxhyphen{}sensitive, so you don't have to capitalize the keywords, but you should.
\item {}
ADQL and SQL don't require you to break a query into multiple lines, but you should.
\end{itemize}
Jupyter notebooks can be good for developing and testing code, but they have some drawbacks. In particular, if you run the cells out of order, you might find that variables don't have the values you expect.
There are a few things you can do to mitigate these problems:
\begin{itemize}
\item {}
Make each section of the notebook self\sphinxhyphen{}contained. Try not to use the same variable name in more than one section.
\item {}
Keep notebooks short. Look for places where you can break your analysis into phases with one notebook per phase.
\end{itemize}
\chapter{Chapter 2}
\label{\detokenize{02_coords:chapter-2}}\label{\detokenize{02_coords::doc}}
This is the second in a series of notebooks related to astronomy data.
As a running example, we are replicating parts of the analysis in a recent paper, “\sphinxhref{https://arxiv.org/abs/1805.00425}{Off the beaten path: Gaia reveals GD\sphinxhyphen{}1 stars outside of the main stream}” by Adrian M. Price\sphinxhyphen{}Whelan and Ana Bonaca.
In the first notebook, we wrote ADQL queries and used them to select and download data from the Gaia server.
In this notebook, we'll pick up where we left off and write a query to select stars from the region of the sky where we expect GD\sphinxhyphen{}1 to be.
\section{Outline}
\label{\detokenize{02_coords:outline}}
We'll start with an example that does a “cone search”; that is, it selects stars that appear in a circular region of the sky.
Then, to select stars in the vicinity of GD\sphinxhyphen{}1, we'll:
\begin{itemize}
\item {}
Use \sphinxcode{\sphinxupquote{Quantity}} objects to represent measurements with units.
\item {}
Use the \sphinxcode{\sphinxupquote{Gala}} library to convert coordinates from one frame to another.
\item {}
Use the ADQL keywords \sphinxcode{\sphinxupquote{POLYGON}}, \sphinxcode{\sphinxupquote{CONTAINS}}, and \sphinxcode{\sphinxupquote{POINT}} to select stars that fall within a polygonal region.
\item {}
Submit a query and download the results.
\item {}
Store the results in a FITS file.
\end{itemize}
After completing this lesson, you should be able to
\begin{itemize}
\item {}
Use Python string formatting to compose more complex ADQL queries.
\item {}
Work with coordinates and other quantities that have units.
\item {}
Download the results of a query and store them in a file.
\end{itemize}
\section{Installing libraries}
\label{\detokenize{02_coords:installing-libraries}}
If you are running this notebook on Colab, you can run the following cell to install Astroquery and the other libraries we'll use.
If you are running this notebook on your own computer, you might have to install these libraries yourself.
If you are using this notebook as part of a Carpentries workshop, you should have received setup instructions.
TODO: Add a link to the instructions.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} If we\PYGZsq{}re running on Colab, install libraries}
\PYG{k+kn}{import} \PYG{n+nn}{sys}
\PYG{n}{IN\PYGZus{}COLAB} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{google.colab}\PYG{l+s+s1}{\PYGZsq{}} \PYG{o+ow}{in} \PYG{n}{sys}\PYG{o}{.}\PYG{n}{modules}
\PYG{k}{if} \PYG{n}{IN\PYGZus{}COLAB}\PYG{p}{:}
\PYG{o}{!}pip install astroquery astro\PYGZhy{}gala pyia
\end{sphinxVerbatim}
\section{Selecting a region}
\label{\detokenize{02_coords:selecting-a-region}}
One of the most common ways to restrict a query is to select stars in a particular region of the sky.
For example, here's a query from the \sphinxhref{https://gea.esac.esa.int/archive-help/adql/examples/index.html}{Gaia archive documentation} that selects “all the objects … in a circular region centered at (266.41683, \sphinxhyphen{}29.00781) with a search radius of 5 arcmin (0.08333 deg).”
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{query} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\PYG{l+s+s2}{SELECT }
\PYG{l+s+s2}{TOP 10 source\PYGZus{}id}
\PYG{l+s+s2}{FROM gaiadr2.gaia\PYGZus{}source}
\PYG{l+s+s2}{WHERE 1=CONTAINS(}
\PYG{l+s+s2}{ POINT(ra, dec),}
\PYG{l+s+s2}{ CIRCLE(266.41683, \PYGZhy{}29.00781, 0.08333333))}
\PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
This query uses three keywords that are specific to ADQL (not SQL):
\begin{itemize}
\item {}
\sphinxcode{\sphinxupquote{POINT}}: a location in \sphinxhref{https://en.wikipedia.org/wiki/International\_Celestial\_Reference\_System}{ICRS coordinates}, specified in degrees of right ascension and declination.
\item {}
\sphinxcode{\sphinxupquote{CIRCLE}}: a circle where the first two values are the coordinates of the center and the third is the radius in degrees.
\item {}
\sphinxcode{\sphinxupquote{CONTAINS}}: a function that returns \sphinxcode{\sphinxupquote{1}} if a \sphinxcode{\sphinxupquote{POINT}} is contained in a shape and \sphinxcode{\sphinxupquote{0}} otherwise.
\end{itemize}
Here is the \sphinxhref{http://www.ivoa.net/documents/ADQL/20180112/PR-ADQL-2.1-20180112.html\#tth\_sEc4.2.12}{documentation of \sphinxcode{\sphinxupquote{CONTAINS}}}.
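To make the meaning of these keywords concrete, here is a minimal Python sketch of the test that \sphinxcode{\sphinxupquote{CONTAINS(POINT(...), CIRCLE(...))}} expresses: is the angular separation between a point and the center of the circle no larger than the radius? The function names and the two test points are hypothetical, the haversine formula is one common way to compute the separation, and treating the boundary as inside the circle is an assumption. This is not a description of how the Gaia server evaluates the query.

```python
from math import asin, cos, degrees, radians, sin, sqrt


def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(radians, (ra1, dec1, ra2, dec2))
    h = (sin((dec2 - dec1) / 2) ** 2
         + cos(dec1) * cos(dec2) * sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return degrees(2 * asin(sqrt(min(1.0, h))))


def contains_in_circle(ra, dec, center_ra, center_dec, radius):
    """Return 1 if the point lies within radius degrees of the center, else 0."""
    sep = angular_separation(ra, dec, center_ra, center_dec)
    return 1 if sep <= radius else 0


# Center and radius from the example query; 5 arcmin expressed in degrees.
center_ra, center_dec = 266.41683, -29.00781
radius = 5 / 60

# Two made-up points: one close to the center, one well outside the circle.
inside = contains_in_circle(266.45, -29.0, center_ra, center_dec, radius)
outside = contains_in_circle(266.7, -29.0, center_ra, center_dec, radius)
print(inside, outside)
```

Note that the separation is measured on the sphere, so a difference in right ascension counts for less as the declination moves away from zero.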
A query like this is called a cone search because it selects stars in a cone.
Here's how we run it.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{from} \PYG{n+nn}{astroquery}\PYG{n+nn}{.}\PYG{n+nn}{gaia} \PYG{k+kn}{import} \PYG{n}{Gaia}
\PYG{n}{job} \PYG{o}{=} \PYG{n}{Gaia}\PYG{o}{.}\PYG{n}{launch\PYGZus{}job}\PYG{p}{(}\PYG{n}{query}\PYG{p}{)}
\PYG{n}{result} \PYG{o}{=} \PYG{n}{job}\PYG{o}{.}\PYG{n}{get\PYGZus{}results}\PYG{p}{(}\PYG{p}{)}
\PYG{n}{result}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
Created TAP+ (v1.2.1) \PYGZhy{} Connection:
Host: gea.esac.esa.int
Use HTTPS: True
Port: 443
SSL Port: 443
Created TAP+ (v1.2.1) \PYGZhy{} Connection:
Host: geadata.esac.esa.int
Use HTTPS: True
Port: 443
SSL Port: 443
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}Table length=10\PYGZgt{}
source\PYGZus{}id
int64
\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}
4057468321929794432
4057468287575835392
4057482027171038976
4057470349160630656
4057470039924301696
4057469868125641984
4057468351995073024
4057469661959554560
4057470520960672640
4057470555320409600
\end{sphinxVerbatim}
\sphinxstylestrong{Exercise:} When you are debugging queries like this, you can use \sphinxcode{\sphinxupquote{TOP}} to limit the size of the results, but then you still don't know how big the results will be.
An alternative is to use \sphinxcode{\sphinxupquote{COUNT}}, which asks for the number of rows that would be selected, but it does not return them.
In the previous query, replace \sphinxcode{\sphinxupquote{TOP 10 source\_id}} with \sphinxcode{\sphinxupquote{COUNT(source\_id)}} and run the query again. How many stars has Gaia identified in the cone we searched?
\section{Getting GD\sphinxhyphen{}1 Data}
\label{\detokenize{02_coords:getting-gd-1-data}}
From the Price\sphinxhyphen{}Whelan and Bonaca paper, we will try to reproduce Figure 1, which includes this representation of stars likely to belong to GD\sphinxhyphen{}1:
Along the axis of right ascension (\(\phi_1\)) the figure extends from \sphinxhyphen{}100 to 20 degrees.
Along the axis of declination (\(\phi_2\)) the figure extends from about \sphinxhyphen{}8 to 4 degrees.
Ideally, we would select all stars from this rectangle, but there are more than 10 million of them, so
\begin{itemize}
\item {}
That would be difficult to work with,
\item {}
As anonymous users, we are limited to 3 million rows in a single query, and
\item {}
While we are developing and testing code, it will be faster to work with a smaller dataset.
\end{itemize}
So we'll start by selecting stars in a smaller rectangle, from \sphinxhyphen{}55 to \sphinxhyphen{}45 degrees right ascension and \sphinxhyphen{}8 to 4 degrees of declination.
But first, let's see how to represent quantities with units like degrees.
\section{Working with coordinates}
\label{\detokenize{02_coords:working-with-coordinates}}
Coordinates are physical quantities, which means that they have two parts, a value and a unit.
For example, the coordinate \(30^{\circ}\) has value 30 and its units are degrees.
Until recently, most scientific computation was done with values only; units were left out of the program altogether, \sphinxhref{https://en.wikipedia.org/wiki/Mars\_Climate\_Orbiter\#Cause\_of\_failure}{often with disastrous results}.
Astropy provides tools for including units explicitly in computations, which makes it possible to detect errors before they cause disasters.
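The kind of mistake that unit\sphinxhyphen{}free values allow can be sketched in plain Python (deliberately not Astropy). The variable names and numbers here are made up for illustration: a radius meant as arcminutes is compared with a separation meant as degrees, and nothing in the program objects.

```python
ARCMIN_PER_DEG = 60

radius = 5.0      # meant as arcminutes, but nothing in the code says so
separation = 0.1  # meant as degrees

# The bare comparison silently mixes arcminutes and degrees.
naive_inside = separation < radius

# Converting the radius to degrees first gives the intended comparison.
radius_deg = radius / ARCMIN_PER_DEG
correct_inside = separation < radius_deg

print(naive_inside, correct_inside)  # prints: True False
```

When values carry their units with them, a mismatch like this can be converted or rejected instead of producing a plausible but wrong answer.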
To use Astropy units, we import them like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{astropy}\PYG{n+nn}{.}\PYG{n+nn}{units} \PYG{k}{as} \PYG{n+nn}{u}
\PYG{n}{u}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}module \PYGZsq{}astropy.units\PYGZsq{} from \PYGZsq{}/home/downey/anaconda3/envs/AstronomicalData/lib/python3.8/site\PYGZhy{}packages/astropy/units/\PYGZus{}\PYGZus{}init\PYGZus{}\PYGZus{}.py\PYGZsq{}\PYGZgt{}
\end{sphinxVerbatim}
\sphinxcode{\sphinxupquote{u}} is a module that contains most common units and all SI units.
You can use \sphinxcode{\sphinxupquote{dir}} to list them, but you should also \sphinxhref{https://docs.astropy.org/en/stable/units/}{read the documentation}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n+nb}{dir}\PYG{p}{(}\PYG{n}{u}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
[\PYGZsq{}A\PYGZsq{},
\PYGZsq{}AA\PYGZsq{},
\PYGZsq{}AB\PYGZsq{},
\PYGZsq{}ABflux\PYGZsq{},
\PYGZsq{}ABmag\PYGZsq{},
\PYGZsq{}AU\PYGZsq{},
\PYGZsq{}Angstrom\PYGZsq{},
\PYGZsq{}B\PYGZsq{},
\PYGZsq{}Ba\PYGZsq{},
\PYGZsq{}Barye\PYGZsq{},
\PYGZsq{}Bi\PYGZsq{},
\PYGZsq{}Biot\PYGZsq{},
\PYGZsq{}Bol\PYGZsq{},
\PYGZsq{}Bq\PYGZsq{},
\PYGZsq{}C\PYGZsq{},
\PYGZsq{}Celsius\PYGZsq{},
\PYGZsq{}Ci\PYGZsq{},
\PYGZsq{}CompositeUnit\PYGZsq{},
\PYGZsq{}D\PYGZsq{},
\PYGZsq{}Da\PYGZsq{},
\PYGZsq{}Dalton\PYGZsq{},
\PYGZsq{}Debye\PYGZsq{},
\PYGZsq{}Decibel\PYGZsq{},
\PYGZsq{}DecibelUnit\PYGZsq{},
\PYGZsq{}Dex\PYGZsq{},
\PYGZsq{}DexUnit\PYGZsq{},
\PYGZsq{}EA\PYGZsq{},
\PYGZsq{}EAU\PYGZsq{},
\PYGZsq{}EB\PYGZsq{},
\PYGZsq{}EBa\PYGZsq{},
\PYGZsq{}EC\PYGZsq{},
\PYGZsq{}ED\PYGZsq{},
\PYGZsq{}EF\PYGZsq{},
\PYGZsq{}EG\PYGZsq{},
\PYGZsq{}EGal\PYGZsq{},
\PYGZsq{}EH\PYGZsq{},
\PYGZsq{}EHz\PYGZsq{},
\PYGZsq{}EJ\PYGZsq{},
\PYGZsq{}EJy\PYGZsq{},
\PYGZsq{}EK\PYGZsq{},
\PYGZsq{}EL\PYGZsq{},
\PYGZsq{}EN\PYGZsq{},
\PYGZsq{}EOhm\PYGZsq{},
\PYGZsq{}EP\PYGZsq{},
\PYGZsq{}EPa\PYGZsq{},
\PYGZsq{}ER\PYGZsq{},
\PYGZsq{}ERy\PYGZsq{},
\PYGZsq{}ES\PYGZsq{},
\PYGZsq{}ESt\PYGZsq{},
\PYGZsq{}ET\PYGZsq{},
\PYGZsq{}EV\PYGZsq{},
\PYGZsq{}EW\PYGZsq{},
\PYGZsq{}EWb\PYGZsq{},
\PYGZsq{}Ea\PYGZsq{},
\PYGZsq{}Eadu\PYGZsq{},
\PYGZsq{}Earcmin\PYGZsq{},
\PYGZsq{}Earcsec\PYGZsq{},
\PYGZsq{}Eau\PYGZsq{},
\PYGZsq{}Eb\PYGZsq{},
\PYGZsq{}Ebarn\PYGZsq{},
\PYGZsq{}Ebeam\PYGZsq{},
\PYGZsq{}Ebin\PYGZsq{},
\PYGZsq{}Ebit\PYGZsq{},
\PYGZsq{}Ebyte\PYGZsq{},
\PYGZsq{}Ecd\PYGZsq{},
\PYGZsq{}Echan\PYGZsq{},
\PYGZsq{}Ecount\PYGZsq{},
\PYGZsq{}Ect\PYGZsq{},
\PYGZsq{}Ed\PYGZsq{},
\PYGZsq{}Edeg\PYGZsq{},
\PYGZsq{}Edyn\PYGZsq{},
\PYGZsq{}EeV\PYGZsq{},
\PYGZsq{}Eerg\PYGZsq{},
\PYGZsq{}Eg\PYGZsq{},
\PYGZsq{}Eh\PYGZsq{},
\PYGZsq{}EiB\PYGZsq{},
\PYGZsq{}Eib\PYGZsq{},
\PYGZsq{}Eibit\PYGZsq{},
\PYGZsq{}Eibyte\PYGZsq{},
\PYGZsq{}Ek\PYGZsq{},
\PYGZsq{}El\PYGZsq{},
\PYGZsq{}Elm\PYGZsq{},
\PYGZsq{}Elx\PYGZsq{},
\PYGZsq{}Elyr\PYGZsq{},
\PYGZsq{}Em\PYGZsq{},
\PYGZsq{}Emag\PYGZsq{},
\PYGZsq{}Emin\PYGZsq{},
\PYGZsq{}Emol\PYGZsq{},
\PYGZsq{}Eohm\PYGZsq{},
\PYGZsq{}Epc\PYGZsq{},
\PYGZsq{}Eph\PYGZsq{},
\PYGZsq{}Ephoton\PYGZsq{},
\PYGZsq{}Epix\PYGZsq{},
\PYGZsq{}Epixel\PYGZsq{},
\PYGZsq{}Erad\PYGZsq{},
\PYGZsq{}Es\PYGZsq{},
\PYGZsq{}Esr\PYGZsq{},
\PYGZsq{}Eu\PYGZsq{},
\PYGZsq{}Evox\PYGZsq{},
\PYGZsq{}Evoxel\PYGZsq{},
\PYGZsq{}Eyr\PYGZsq{},
\PYGZsq{}F\PYGZsq{},
\PYGZsq{}Farad\PYGZsq{},
\PYGZsq{}Fr\PYGZsq{},
\PYGZsq{}Franklin\PYGZsq{},
\PYGZsq{}FunctionQuantity\PYGZsq{},
\PYGZsq{}FunctionUnitBase\PYGZsq{},
\PYGZsq{}G\PYGZsq{},
\PYGZsq{}GA\PYGZsq{},
\PYGZsq{}GAU\PYGZsq{},
\PYGZsq{}GB\PYGZsq{},
\PYGZsq{}GBa\PYGZsq{},
\PYGZsq{}GC\PYGZsq{},
\PYGZsq{}GD\PYGZsq{},
\PYGZsq{}GF\PYGZsq{},
\PYGZsq{}GG\PYGZsq{},
\PYGZsq{}GGal\PYGZsq{},
\PYGZsq{}GH\PYGZsq{},
\PYGZsq{}GHz\PYGZsq{},
\PYGZsq{}GJ\PYGZsq{},
\PYGZsq{}GJy\PYGZsq{},
\PYGZsq{}GK\PYGZsq{},
\PYGZsq{}GL\PYGZsq{},
\PYGZsq{}GN\PYGZsq{},
\PYGZsq{}GOhm\PYGZsq{},
\PYGZsq{}GP\PYGZsq{},
\PYGZsq{}GPa\PYGZsq{},
\PYGZsq{}GR\PYGZsq{},
\PYGZsq{}GRy\PYGZsq{},
\PYGZsq{}GS\PYGZsq{},
\PYGZsq{}GSt\PYGZsq{},
\PYGZsq{}GT\PYGZsq{},
\PYGZsq{}GV\PYGZsq{},
\PYGZsq{}GW\PYGZsq{},
\PYGZsq{}GWb\PYGZsq{},
\PYGZsq{}Ga\PYGZsq{},
\PYGZsq{}Gadu\PYGZsq{},
\PYGZsq{}Gal\PYGZsq{},
\PYGZsq{}Garcmin\PYGZsq{},
\PYGZsq{}Garcsec\PYGZsq{},
\PYGZsq{}Gau\PYGZsq{},
\PYGZsq{}Gauss\PYGZsq{},
\PYGZsq{}Gb\PYGZsq{},
\PYGZsq{}Gbarn\PYGZsq{},
\PYGZsq{}Gbeam\PYGZsq{},
\PYGZsq{}Gbin\PYGZsq{},
\PYGZsq{}Gbit\PYGZsq{},
\PYGZsq{}Gbyte\PYGZsq{},
\PYGZsq{}Gcd\PYGZsq{},
\PYGZsq{}Gchan\PYGZsq{},
\PYGZsq{}Gcount\PYGZsq{},
\PYGZsq{}Gct\PYGZsq{},
\PYGZsq{}Gd\PYGZsq{},
\PYGZsq{}Gdeg\PYGZsq{},
\PYGZsq{}Gdyn\PYGZsq{},
\PYGZsq{}GeV\PYGZsq{},
\PYGZsq{}Gerg\PYGZsq{},
\PYGZsq{}Gg\PYGZsq{},
\PYGZsq{}Gh\PYGZsq{},
\PYGZsq{}GiB\PYGZsq{},
\PYGZsq{}Gib\PYGZsq{},
\PYGZsq{}Gibit\PYGZsq{},
\PYGZsq{}Gibyte\PYGZsq{},
\PYGZsq{}Gk\PYGZsq{},
\PYGZsq{}Gl\PYGZsq{},
\PYGZsq{}Glm\PYGZsq{},
\PYGZsq{}Glx\PYGZsq{},
\PYGZsq{}Glyr\PYGZsq{},
\PYGZsq{}Gm\PYGZsq{},
\PYGZsq{}Gmag\PYGZsq{},
\PYGZsq{}Gmin\PYGZsq{},
\PYGZsq{}Gmol\PYGZsq{},
\PYGZsq{}Gohm\PYGZsq{},
\PYGZsq{}Gpc\PYGZsq{},
\PYGZsq{}Gph\PYGZsq{},
\PYGZsq{}Gphoton\PYGZsq{},
\PYGZsq{}Gpix\PYGZsq{},
\PYGZsq{}Gpixel\PYGZsq{},
\PYGZsq{}Grad\PYGZsq{},
\PYGZsq{}Gs\PYGZsq{},
\PYGZsq{}Gsr\PYGZsq{},
\PYGZsq{}Gu\PYGZsq{},
\PYGZsq{}Gvox\PYGZsq{},
\PYGZsq{}Gvoxel\PYGZsq{},
\PYGZsq{}Gyr\PYGZsq{},
\PYGZsq{}H\PYGZsq{},
\PYGZsq{}Henry\PYGZsq{},
\PYGZsq{}Hertz\PYGZsq{},
\PYGZsq{}Hz\PYGZsq{},
\PYGZsq{}IrreducibleUnit\PYGZsq{},
\PYGZsq{}J\PYGZsq{},
\PYGZsq{}Jansky\PYGZsq{},
\PYGZsq{}Joule\PYGZsq{},
\PYGZsq{}Jy\PYGZsq{},
\PYGZsq{}K\PYGZsq{},
\PYGZsq{}Kayser\PYGZsq{},
\PYGZsq{}Kelvin\PYGZsq{},
\PYGZsq{}KiB\PYGZsq{},
\PYGZsq{}Kib\PYGZsq{},
\PYGZsq{}Kibit\PYGZsq{},
\PYGZsq{}Kibyte\PYGZsq{},
\PYGZsq{}L\PYGZsq{},
\PYGZsq{}L\PYGZus{}bol\PYGZsq{},
\PYGZsq{}L\PYGZus{}sun\PYGZsq{},
\PYGZsq{}LogQuantity\PYGZsq{},
\PYGZsq{}LogUnit\PYGZsq{},
\PYGZsq{}Lsun\PYGZsq{},
\PYGZsq{}MA\PYGZsq{},
\PYGZsq{}MAU\PYGZsq{},
\PYGZsq{}MB\PYGZsq{},
\PYGZsq{}MBa\PYGZsq{},
\PYGZsq{}MC\PYGZsq{},
\PYGZsq{}MD\PYGZsq{},
\PYGZsq{}MF\PYGZsq{},
\PYGZsq{}MG\PYGZsq{},
\PYGZsq{}MGal\PYGZsq{},
\PYGZsq{}MH\PYGZsq{},
\PYGZsq{}MHz\PYGZsq{},
\PYGZsq{}MJ\PYGZsq{},
\PYGZsq{}MJy\PYGZsq{},
\PYGZsq{}MK\PYGZsq{},
\PYGZsq{}ML\PYGZsq{},
\PYGZsq{}MN\PYGZsq{},
\PYGZsq{}MOhm\PYGZsq{},
\PYGZsq{}MP\PYGZsq{},
\PYGZsq{}MPa\PYGZsq{},
\PYGZsq{}MR\PYGZsq{},
\PYGZsq{}MRy\PYGZsq{},
\PYGZsq{}MS\PYGZsq{},
\PYGZsq{}MSt\PYGZsq{},
\PYGZsq{}MT\PYGZsq{},
\PYGZsq{}MV\PYGZsq{},
\PYGZsq{}MW\PYGZsq{},
\PYGZsq{}MWb\PYGZsq{},
\PYGZsq{}M\PYGZus{}bol\PYGZsq{},
\PYGZsq{}M\PYGZus{}e\PYGZsq{},
\PYGZsq{}M\PYGZus{}earth\PYGZsq{},
\PYGZsq{}M\PYGZus{}jup\PYGZsq{},
\PYGZsq{}M\PYGZus{}jupiter\PYGZsq{},
\PYGZsq{}M\PYGZus{}p\PYGZsq{},
\PYGZsq{}M\PYGZus{}sun\PYGZsq{},
\PYGZsq{}Ma\PYGZsq{},
\PYGZsq{}Madu\PYGZsq{},
\PYGZsq{}MagUnit\PYGZsq{},
\PYGZsq{}Magnitude\PYGZsq{},
\PYGZsq{}Marcmin\PYGZsq{},
\PYGZsq{}Marcsec\PYGZsq{},
\PYGZsq{}Mau\PYGZsq{},
\PYGZsq{}Mb\PYGZsq{},
\PYGZsq{}Mbarn\PYGZsq{},
\PYGZsq{}Mbeam\PYGZsq{},
\PYGZsq{}Mbin\PYGZsq{},
\PYGZsq{}Mbit\PYGZsq{},
\PYGZsq{}Mbyte\PYGZsq{},
\PYGZsq{}Mcd\PYGZsq{},
\PYGZsq{}Mchan\PYGZsq{},
\PYGZsq{}Mcount\PYGZsq{},
\PYGZsq{}Mct\PYGZsq{},
\PYGZsq{}Md\PYGZsq{},
\PYGZsq{}Mdeg\PYGZsq{},
\PYGZsq{}Mdyn\PYGZsq{},
\PYGZsq{}MeV\PYGZsq{},
\PYGZsq{}Mearth\PYGZsq{},
\PYGZsq{}Merg\PYGZsq{},
\PYGZsq{}Mg\PYGZsq{},
\PYGZsq{}Mh\PYGZsq{},
\PYGZsq{}MiB\PYGZsq{},
\PYGZsq{}Mib\PYGZsq{},
\PYGZsq{}Mibit\PYGZsq{},
\PYGZsq{}Mibyte\PYGZsq{},
\PYGZsq{}Mjup\PYGZsq{},
\PYGZsq{}Mjupiter\PYGZsq{},
\PYGZsq{}Mk\PYGZsq{},
\PYGZsq{}Ml\PYGZsq{},
\PYGZsq{}Mlm\PYGZsq{},
\PYGZsq{}Mlx\PYGZsq{},
\PYGZsq{}Mlyr\PYGZsq{},
\PYGZsq{}Mm\PYGZsq{},
\PYGZsq{}Mmag\PYGZsq{},
\PYGZsq{}Mmin\PYGZsq{},
\PYGZsq{}Mmol\PYGZsq{},
\PYGZsq{}Mohm\PYGZsq{},
\PYGZsq{}Mpc\PYGZsq{},
\PYGZsq{}Mph\PYGZsq{},
\PYGZsq{}Mphoton\PYGZsq{},
\PYGZsq{}Mpix\PYGZsq{},
\PYGZsq{}Mpixel\PYGZsq{},
\PYGZsq{}Mrad\PYGZsq{},
\PYGZsq{}Ms\PYGZsq{},
\PYGZsq{}Msr\PYGZsq{},
\PYGZsq{}Msun\PYGZsq{},
\PYGZsq{}Mu\PYGZsq{},
\PYGZsq{}Mvox\PYGZsq{},
\PYGZsq{}Mvoxel\PYGZsq{},
\PYGZsq{}Myr\PYGZsq{},
\PYGZsq{}N\PYGZsq{},
\PYGZsq{}NamedUnit\PYGZsq{},
\PYGZsq{}Newton\PYGZsq{},
\PYGZsq{}Ohm\PYGZsq{},
\PYGZsq{}P\PYGZsq{},
\PYGZsq{}PA\PYGZsq{},
\PYGZsq{}PAU\PYGZsq{},
\PYGZsq{}PB\PYGZsq{},
\PYGZsq{}PBa\PYGZsq{},
\PYGZsq{}PC\PYGZsq{},
\PYGZsq{}PD\PYGZsq{},
\PYGZsq{}PF\PYGZsq{},
\PYGZsq{}PG\PYGZsq{},
\PYGZsq{}PGal\PYGZsq{},
\PYGZsq{}PH\PYGZsq{},
\PYGZsq{}PHz\PYGZsq{},
\PYGZsq{}PJ\PYGZsq{},
\PYGZsq{}PJy\PYGZsq{},
\PYGZsq{}PK\PYGZsq{},
\PYGZsq{}PL\PYGZsq{},
\PYGZsq{}PN\PYGZsq{},
\PYGZsq{}POhm\PYGZsq{},
\PYGZsq{}PP\PYGZsq{},
\PYGZsq{}PPa\PYGZsq{},
\PYGZsq{}PR\PYGZsq{},
\PYGZsq{}PRy\PYGZsq{},
\PYGZsq{}PS\PYGZsq{},
\PYGZsq{}PSt\PYGZsq{},
\PYGZsq{}PT\PYGZsq{},
\PYGZsq{}PV\PYGZsq{},
\PYGZsq{}PW\PYGZsq{},
\PYGZsq{}PWb\PYGZsq{},
\PYGZsq{}Pa\PYGZsq{},
\PYGZsq{}Padu\PYGZsq{},
\PYGZsq{}Parcmin\PYGZsq{},
\PYGZsq{}Parcsec\PYGZsq{},
\PYGZsq{}Pascal\PYGZsq{},
\PYGZsq{}Pau\PYGZsq{},
\PYGZsq{}Pb\PYGZsq{},
\PYGZsq{}Pbarn\PYGZsq{},
\PYGZsq{}Pbeam\PYGZsq{},
\PYGZsq{}Pbin\PYGZsq{},
\PYGZsq{}Pbit\PYGZsq{},
\PYGZsq{}Pbyte\PYGZsq{},
\PYGZsq{}Pcd\PYGZsq{},
\PYGZsq{}Pchan\PYGZsq{},
\PYGZsq{}Pcount\PYGZsq{},
\PYGZsq{}Pct\PYGZsq{},
\PYGZsq{}Pd\PYGZsq{},
\PYGZsq{}Pdeg\PYGZsq{},
\PYGZsq{}Pdyn\PYGZsq{},
\PYGZsq{}PeV\PYGZsq{},
\PYGZsq{}Perg\PYGZsq{},
\PYGZsq{}Pg\PYGZsq{},
\PYGZsq{}Ph\PYGZsq{},
\PYGZsq{}PiB\PYGZsq{},
\PYGZsq{}Pib\PYGZsq{},
\PYGZsq{}Pibit\PYGZsq{},
\PYGZsq{}Pibyte\PYGZsq{},
\PYGZsq{}Pk\PYGZsq{},
\PYGZsq{}Pl\PYGZsq{},
\PYGZsq{}Plm\PYGZsq{},
\PYGZsq{}Plx\PYGZsq{},
\PYGZsq{}Plyr\PYGZsq{},
\PYGZsq{}Pm\PYGZsq{},
\PYGZsq{}Pmag\PYGZsq{},
\PYGZsq{}Pmin\PYGZsq{},
\PYGZsq{}Pmol\PYGZsq{},
\PYGZsq{}Pohm\PYGZsq{},
\PYGZsq{}Ppc\PYGZsq{},
\PYGZsq{}Pph\PYGZsq{},
\PYGZsq{}Pphoton\PYGZsq{},
\PYGZsq{}Ppix\PYGZsq{},
\PYGZsq{}Ppixel\PYGZsq{},
\PYGZsq{}Prad\PYGZsq{},
\PYGZsq{}PrefixUnit\PYGZsq{},
\PYGZsq{}Ps\PYGZsq{},
\PYGZsq{}Psr\PYGZsq{},
\PYGZsq{}Pu\PYGZsq{},
\PYGZsq{}Pvox\PYGZsq{},
\PYGZsq{}Pvoxel\PYGZsq{},
\PYGZsq{}Pyr\PYGZsq{},
\PYGZsq{}Quantity\PYGZsq{},
\PYGZsq{}QuantityInfo\PYGZsq{},
\PYGZsq{}QuantityInfoBase\PYGZsq{},
\PYGZsq{}R\PYGZsq{},
\PYGZsq{}R\PYGZus{}earth\PYGZsq{},
\PYGZsq{}R\PYGZus{}jup\PYGZsq{},
\PYGZsq{}R\PYGZus{}jupiter\PYGZsq{},
\PYGZsq{}R\PYGZus{}sun\PYGZsq{},
\PYGZsq{}Rayleigh\PYGZsq{},
\PYGZsq{}Rearth\PYGZsq{},
\PYGZsq{}Rjup\PYGZsq{},
\PYGZsq{}Rjupiter\PYGZsq{},
\PYGZsq{}Rsun\PYGZsq{},
\PYGZsq{}Ry\PYGZsq{},
\PYGZsq{}S\PYGZsq{},
\PYGZsq{}ST\PYGZsq{},
\PYGZsq{}STflux\PYGZsq{},
\PYGZsq{}STmag\PYGZsq{},
\PYGZsq{}Siemens\PYGZsq{},
\PYGZsq{}SpecificTypeQuantity\PYGZsq{},
\PYGZsq{}St\PYGZsq{},
\PYGZsq{}Sun\PYGZsq{},
\PYGZsq{}T\PYGZsq{},
\PYGZsq{}TA\PYGZsq{},
\PYGZsq{}TAU\PYGZsq{},
\PYGZsq{}TB\PYGZsq{},
\PYGZsq{}TBa\PYGZsq{},
\PYGZsq{}TC\PYGZsq{},
\PYGZsq{}TD\PYGZsq{},
\PYGZsq{}TF\PYGZsq{},
\PYGZsq{}TG\PYGZsq{},
\PYGZsq{}TGal\PYGZsq{},
\PYGZsq{}TH\PYGZsq{},
\PYGZsq{}THz\PYGZsq{},
\PYGZsq{}TJ\PYGZsq{},
\PYGZsq{}TJy\PYGZsq{},
\PYGZsq{}TK\PYGZsq{},
\PYGZsq{}TL\PYGZsq{},
\PYGZsq{}TN\PYGZsq{},
\PYGZsq{}TOhm\PYGZsq{},
\PYGZsq{}TP\PYGZsq{},
\PYGZsq{}TPa\PYGZsq{},
\PYGZsq{}TR\PYGZsq{},
\PYGZsq{}TRy\PYGZsq{},
\PYGZsq{}TS\PYGZsq{},
\PYGZsq{}TSt\PYGZsq{},
\PYGZsq{}TT\PYGZsq{},
\PYGZsq{}TV\PYGZsq{},
\PYGZsq{}TW\PYGZsq{},
\PYGZsq{}TWb\PYGZsq{},
\PYGZsq{}Ta\PYGZsq{},
\PYGZsq{}Tadu\PYGZsq{},
\PYGZsq{}Tarcmin\PYGZsq{},
\PYGZsq{}Tarcsec\PYGZsq{},
\PYGZsq{}Tau\PYGZsq{},
\PYGZsq{}Tb\PYGZsq{},
\PYGZsq{}Tbarn\PYGZsq{},
\PYGZsq{}Tbeam\PYGZsq{},
\PYGZsq{}Tbin\PYGZsq{},
\PYGZsq{}Tbit\PYGZsq{},
\PYGZsq{}Tbyte\PYGZsq{},
\PYGZsq{}Tcd\PYGZsq{},
\PYGZsq{}Tchan\PYGZsq{},
\PYGZsq{}Tcount\PYGZsq{},
\PYGZsq{}Tct\PYGZsq{},
\PYGZsq{}Td\PYGZsq{},
\PYGZsq{}Tdeg\PYGZsq{},
\PYGZsq{}Tdyn\PYGZsq{},
\PYGZsq{}TeV\PYGZsq{},
\PYGZsq{}Terg\PYGZsq{},
\PYGZsq{}Tesla\PYGZsq{},
\PYGZsq{}Tg\PYGZsq{},
\PYGZsq{}Th\PYGZsq{},
\PYGZsq{}TiB\PYGZsq{},
\PYGZsq{}Tib\PYGZsq{},
\PYGZsq{}Tibit\PYGZsq{},
\PYGZsq{}Tibyte\PYGZsq{},
\PYGZsq{}Tk\PYGZsq{},
\PYGZsq{}Tl\PYGZsq{},
\PYGZsq{}Tlm\PYGZsq{},
\PYGZsq{}Tlx\PYGZsq{},
\PYGZsq{}Tlyr\PYGZsq{},
\PYGZsq{}Tm\PYGZsq{},
\PYGZsq{}Tmag\PYGZsq{},
\PYGZsq{}Tmin\PYGZsq{},
\PYGZsq{}Tmol\PYGZsq{},
\PYGZsq{}Tohm\PYGZsq{},
\PYGZsq{}Tpc\PYGZsq{},
\PYGZsq{}Tph\PYGZsq{},
\PYGZsq{}Tphoton\PYGZsq{},
\PYGZsq{}Tpix\PYGZsq{},
\PYGZsq{}Tpixel\PYGZsq{},
\PYGZsq{}Trad\PYGZsq{},
\PYGZsq{}Ts\PYGZsq{},
\PYGZsq{}Tsr\PYGZsq{},
\PYGZsq{}Tu\PYGZsq{},
\PYGZsq{}Tvox\PYGZsq{},
\PYGZsq{}Tvoxel\PYGZsq{},
\PYGZsq{}Tyr\PYGZsq{},
\PYGZsq{}Unit\PYGZsq{},
\PYGZsq{}UnitBase\PYGZsq{},
\PYGZsq{}UnitConversionError\PYGZsq{},
\PYGZsq{}UnitTypeError\PYGZsq{},
\PYGZsq{}UnitsError\PYGZsq{},
\PYGZsq{}UnitsWarning\PYGZsq{},
\PYGZsq{}UnrecognizedUnit\PYGZsq{},
\PYGZsq{}V\PYGZsq{},
\PYGZsq{}Volt\PYGZsq{},
\PYGZsq{}W\PYGZsq{},
\PYGZsq{}Watt\PYGZsq{},
\PYGZsq{}Wb\PYGZsq{},
\PYGZsq{}Weber\PYGZsq{},
\PYGZsq{}YA\PYGZsq{},
\PYGZsq{}YAU\PYGZsq{},
\PYGZsq{}YB\PYGZsq{},
\PYGZsq{}YBa\PYGZsq{},
\PYGZsq{}YC\PYGZsq{},
\PYGZsq{}YD\PYGZsq{},
\PYGZsq{}YF\PYGZsq{},
\PYGZsq{}YG\PYGZsq{},
\PYGZsq{}YGal\PYGZsq{},
\PYGZsq{}YH\PYGZsq{},
\PYGZsq{}YHz\PYGZsq{},
\PYGZsq{}YJ\PYGZsq{},
\PYGZsq{}YJy\PYGZsq{},
\PYGZsq{}YK\PYGZsq{},
\PYGZsq{}YL\PYGZsq{},
\PYGZsq{}YN\PYGZsq{},
\PYGZsq{}YOhm\PYGZsq{},
\PYGZsq{}YP\PYGZsq{},
\PYGZsq{}YPa\PYGZsq{},
\PYGZsq{}YR\PYGZsq{},
\PYGZsq{}YRy\PYGZsq{},
\PYGZsq{}YS\PYGZsq{},
\PYGZsq{}YSt\PYGZsq{},
\PYGZsq{}YT\PYGZsq{},
\PYGZsq{}YV\PYGZsq{},
\PYGZsq{}YW\PYGZsq{},
\PYGZsq{}YWb\PYGZsq{},
\PYGZsq{}Ya\PYGZsq{},
\PYGZsq{}Yadu\PYGZsq{},
\PYGZsq{}Yarcmin\PYGZsq{},
\PYGZsq{}Yarcsec\PYGZsq{},
\PYGZsq{}Yau\PYGZsq{},
\PYGZsq{}Yb\PYGZsq{},
\PYGZsq{}Ybarn\PYGZsq{},
\PYGZsq{}Ybeam\PYGZsq{},
\PYGZsq{}Ybin\PYGZsq{},
\PYGZsq{}Ybit\PYGZsq{},
\PYGZsq{}Ybyte\PYGZsq{},
\PYGZsq{}Ycd\PYGZsq{},
\PYGZsq{}Ychan\PYGZsq{},
\PYGZsq{}Ycount\PYGZsq{},
\PYGZsq{}Yct\PYGZsq{},
\PYGZsq{}Yd\PYGZsq{},
\PYGZsq{}Ydeg\PYGZsq{},
\PYGZsq{}Ydyn\PYGZsq{},
\PYGZsq{}YeV\PYGZsq{},
\PYGZsq{}Yerg\PYGZsq{},
\PYGZsq{}Yg\PYGZsq{},
\PYGZsq{}Yh\PYGZsq{},
\PYGZsq{}Yk\PYGZsq{},
\PYGZsq{}Yl\PYGZsq{},
\PYGZsq{}Ylm\PYGZsq{},
\PYGZsq{}Ylx\PYGZsq{},
\PYGZsq{}Ylyr\PYGZsq{},
\PYGZsq{}Ym\PYGZsq{},
\PYGZsq{}Ymag\PYGZsq{},
\PYGZsq{}Ymin\PYGZsq{},
\PYGZsq{}Ymol\PYGZsq{},
\PYGZsq{}Yohm\PYGZsq{},
\PYGZsq{}Ypc\PYGZsq{},
\PYGZsq{}Yph\PYGZsq{},
\PYGZsq{}Yphoton\PYGZsq{},
\PYGZsq{}Ypix\PYGZsq{},
\PYGZsq{}Ypixel\PYGZsq{},
\PYGZsq{}Yrad\PYGZsq{},
\PYGZsq{}Ys\PYGZsq{},
\PYGZsq{}Ysr\PYGZsq{},
\PYGZsq{}Yu\PYGZsq{},
\PYGZsq{}Yvox\PYGZsq{},
\PYGZsq{}Yvoxel\PYGZsq{},
\PYGZsq{}Yyr\PYGZsq{},
\PYGZsq{}ZA\PYGZsq{},
\PYGZsq{}ZAU\PYGZsq{},
\PYGZsq{}ZB\PYGZsq{},
\PYGZsq{}ZBa\PYGZsq{},
\PYGZsq{}ZC\PYGZsq{},
\PYGZsq{}ZD\PYGZsq{},
\PYGZsq{}ZF\PYGZsq{},
\PYGZsq{}ZG\PYGZsq{},
\PYGZsq{}ZGal\PYGZsq{},
\PYGZsq{}ZH\PYGZsq{},
\PYGZsq{}ZHz\PYGZsq{},
\PYGZsq{}ZJ\PYGZsq{},
\PYGZsq{}ZJy\PYGZsq{},
\PYGZsq{}ZK\PYGZsq{},
\PYGZsq{}ZL\PYGZsq{},
\PYGZsq{}ZN\PYGZsq{},
\PYGZsq{}ZOhm\PYGZsq{},
\PYGZsq{}ZP\PYGZsq{},
\PYGZsq{}ZPa\PYGZsq{},
\PYGZsq{}ZR\PYGZsq{},
\PYGZsq{}ZRy\PYGZsq{},
\PYGZsq{}ZS\PYGZsq{},
\PYGZsq{}ZSt\PYGZsq{},
\PYGZsq{}ZT\PYGZsq{},
\PYGZsq{}ZV\PYGZsq{},
\PYGZsq{}ZW\PYGZsq{},
\PYGZsq{}ZWb\PYGZsq{},
\PYGZsq{}Za\PYGZsq{},
\PYGZsq{}Zadu\PYGZsq{},
\PYGZsq{}Zarcmin\PYGZsq{},
\PYGZsq{}Zarcsec\PYGZsq{},
\PYGZsq{}Zau\PYGZsq{},
\PYGZsq{}Zb\PYGZsq{},
\PYGZsq{}Zbarn\PYGZsq{},
\PYGZsq{}Zbeam\PYGZsq{},
\PYGZsq{}Zbin\PYGZsq{},
\PYGZsq{}Zbit\PYGZsq{},
\PYGZsq{}Zbyte\PYGZsq{},
\PYGZsq{}Zcd\PYGZsq{},
\PYGZsq{}Zchan\PYGZsq{},
\PYGZsq{}Zcount\PYGZsq{},
\PYGZsq{}Zct\PYGZsq{},
\PYGZsq{}Zd\PYGZsq{},
\PYGZsq{}Zdeg\PYGZsq{},
\PYGZsq{}Zdyn\PYGZsq{},
\PYGZsq{}ZeV\PYGZsq{},
\PYGZsq{}Zerg\PYGZsq{},
\PYGZsq{}Zg\PYGZsq{},
\PYGZsq{}Zh\PYGZsq{},
\PYGZsq{}Zk\PYGZsq{},
\PYGZsq{}Zl\PYGZsq{},
\PYGZsq{}Zlm\PYGZsq{},
\PYGZsq{}Zlx\PYGZsq{},
\PYGZsq{}Zlyr\PYGZsq{},
\PYGZsq{}Zm\PYGZsq{},
\PYGZsq{}Zmag\PYGZsq{},
\PYGZsq{}Zmin\PYGZsq{},
\PYGZsq{}Zmol\PYGZsq{},
\PYGZsq{}Zohm\PYGZsq{},
\PYGZsq{}Zpc\PYGZsq{},
\PYGZsq{}Zph\PYGZsq{},
\PYGZsq{}Zphoton\PYGZsq{},
\PYGZsq{}Zpix\PYGZsq{},
\PYGZsq{}Zpixel\PYGZsq{},
\PYGZsq{}Zrad\PYGZsq{},
\PYGZsq{}Zs\PYGZsq{},
\PYGZsq{}Zsr\PYGZsq{},
\PYGZsq{}Zu\PYGZsq{},
\PYGZsq{}Zvox\PYGZsq{},
\PYGZsq{}Zvoxel\PYGZsq{},
\PYGZsq{}Zyr\PYGZsq{},
\PYGZsq{}\PYGZus{}\PYGZus{}builtins\PYGZus{}\PYGZus{}\PYGZsq{},
\PYGZsq{}\PYGZus{}\PYGZus{}cached\PYGZus{}\PYGZus{}\PYGZsq{},
\PYGZsq{}\PYGZus{}\PYGZus{}doc\PYGZus{}\PYGZus{}\PYGZsq{},
\PYGZsq{}\PYGZus{}\PYGZus{}file\PYGZus{}\PYGZus{}\PYGZsq{},
\PYGZsq{}\PYGZus{}\PYGZus{}loader\PYGZus{}\PYGZus{}\PYGZsq{},
\PYGZsq{}\PYGZus{}\PYGZus{}name\PYGZus{}\PYGZus{}\PYGZsq{},
\PYGZsq{}\PYGZus{}\PYGZus{}package\PYGZus{}\PYGZus{}\PYGZsq{},
\PYGZsq{}\PYGZus{}\PYGZus{}path\PYGZus{}\PYGZus{}\PYGZsq{},
\PYGZsq{}\PYGZus{}\PYGZus{}spec\PYGZus{}\PYGZus{}\PYGZsq{},
\PYGZsq{}a\PYGZsq{},
\PYGZsq{}aA\PYGZsq{},
\PYGZsq{}aAU\PYGZsq{},
\PYGZsq{}aB\PYGZsq{},
\PYGZsq{}aBa\PYGZsq{},
\PYGZsq{}aC\PYGZsq{},
\PYGZsq{}aD\PYGZsq{},
\PYGZsq{}aF\PYGZsq{},
\PYGZsq{}aG\PYGZsq{},
\PYGZsq{}aGal\PYGZsq{},
\PYGZsq{}aH\PYGZsq{},
\PYGZsq{}aHz\PYGZsq{},
\PYGZsq{}aJ\PYGZsq{},
\PYGZsq{}aJy\PYGZsq{},
\PYGZsq{}aK\PYGZsq{},
\PYGZsq{}aL\PYGZsq{},
\PYGZsq{}aN\PYGZsq{},
\PYGZsq{}aOhm\PYGZsq{},
\PYGZsq{}aP\PYGZsq{},
\PYGZsq{}aPa\PYGZsq{},
\PYGZsq{}aR\PYGZsq{},
\PYGZsq{}aRy\PYGZsq{},
\PYGZsq{}aS\PYGZsq{},
\PYGZsq{}aSt\PYGZsq{},
\PYGZsq{}aT\PYGZsq{},
\PYGZsq{}aV\PYGZsq{},
\PYGZsq{}aW\PYGZsq{},
\PYGZsq{}aWb\PYGZsq{},
\PYGZsq{}aa\PYGZsq{},
\PYGZsq{}aadu\PYGZsq{},
\PYGZsq{}aarcmin\PYGZsq{},
\PYGZsq{}aarcsec\PYGZsq{},
\PYGZsq{}aau\PYGZsq{},
\PYGZsq{}ab\PYGZsq{},
\PYGZsq{}abA\PYGZsq{},
\PYGZsq{}abC\PYGZsq{},
\PYGZsq{}abampere\PYGZsq{},
\PYGZsq{}abarn\PYGZsq{},
\PYGZsq{}abcoulomb\PYGZsq{},
\PYGZsq{}abeam\PYGZsq{},
\PYGZsq{}abin\PYGZsq{},
\PYGZsq{}abit\PYGZsq{},
\PYGZsq{}abyte\PYGZsq{},
\PYGZsq{}acd\PYGZsq{},
\PYGZsq{}achan\PYGZsq{},
\PYGZsq{}acount\PYGZsq{},
\PYGZsq{}act\PYGZsq{},
\PYGZsq{}ad\PYGZsq{},
\PYGZsq{}add\PYGZus{}enabled\PYGZus{}equivalencies\PYGZsq{},
\PYGZsq{}add\PYGZus{}enabled\PYGZus{}units\PYGZsq{},
\PYGZsq{}adeg\PYGZsq{},
\PYGZsq{}adu\PYGZsq{},
\PYGZsq{}adyn\PYGZsq{},
\PYGZsq{}aeV\PYGZsq{},
\PYGZsq{}aerg\PYGZsq{},
\PYGZsq{}ag\PYGZsq{},
\PYGZsq{}ah\PYGZsq{},
\PYGZsq{}ak\PYGZsq{},
\PYGZsq{}al\PYGZsq{},
\PYGZsq{}allclose\PYGZsq{},
\PYGZsq{}alm\PYGZsq{},
\PYGZsq{}alx\PYGZsq{},
\PYGZsq{}alyr\PYGZsq{},
\PYGZsq{}am\PYGZsq{},
\PYGZsq{}amag\PYGZsq{},
\PYGZsq{}amin\PYGZsq{},
\PYGZsq{}amol\PYGZsq{},
\PYGZsq{}amp\PYGZsq{},
\PYGZsq{}ampere\PYGZsq{},
\PYGZsq{}angstrom\PYGZsq{},
\PYGZsq{}annum\PYGZsq{},
\PYGZsq{}aohm\PYGZsq{},
\PYGZsq{}apc\PYGZsq{},
\PYGZsq{}aph\PYGZsq{},
\PYGZsq{}aphoton\PYGZsq{},
\PYGZsq{}apix\PYGZsq{},
\PYGZsq{}apixel\PYGZsq{},
\PYGZsq{}arad\PYGZsq{},
\PYGZsq{}arcmin\PYGZsq{},
\PYGZsq{}arcminute\PYGZsq{},
\PYGZsq{}arcsec\PYGZsq{},
\PYGZsq{}arcsecond\PYGZsq{},
\PYGZsq{}asr\PYGZsq{},
\PYGZsq{}astronomical\PYGZus{}unit\PYGZsq{},
\PYGZsq{}astrophys\PYGZsq{},
\PYGZsq{}attoBarye\PYGZsq{},
\PYGZsq{}attoDa\PYGZsq{},
\PYGZsq{}attoDalton\PYGZsq{},
\PYGZsq{}attoDebye\PYGZsq{},
\PYGZsq{}attoFarad\PYGZsq{},
\PYGZsq{}attoGauss\PYGZsq{},
\PYGZsq{}attoHenry\PYGZsq{},
\PYGZsq{}attoHertz\PYGZsq{},
\PYGZsq{}attoJansky\PYGZsq{},
\PYGZsq{}attoJoule\PYGZsq{},
\PYGZsq{}attoKayser\PYGZsq{},
\PYGZsq{}attoKelvin\PYGZsq{},
\PYGZsq{}attoNewton\PYGZsq{},
\PYGZsq{}attoOhm\PYGZsq{},
\PYGZsq{}attoPascal\PYGZsq{},
\PYGZsq{}attoRayleigh\PYGZsq{},
\PYGZsq{}attoSiemens\PYGZsq{},
\PYGZsq{}attoTesla\PYGZsq{},
\PYGZsq{}attoVolt\PYGZsq{},
\PYGZsq{}attoWatt\PYGZsq{},
\PYGZsq{}attoWeber\PYGZsq{},
\PYGZsq{}attoamp\PYGZsq{},
\PYGZsq{}attoampere\PYGZsq{},
\PYGZsq{}attoannum\PYGZsq{},
\PYGZsq{}attoarcminute\PYGZsq{},
\PYGZsq{}attoarcsecond\PYGZsq{},
\PYGZsq{}attoastronomical\PYGZus{}unit\PYGZsq{},
\PYGZsq{}attobarn\PYGZsq{},
\PYGZsq{}attobarye\PYGZsq{},
\PYGZsq{}attobit\PYGZsq{},
\PYGZsq{}attobyte\PYGZsq{},
\PYGZsq{}attocandela\PYGZsq{},
\PYGZsq{}attocoulomb\PYGZsq{},
\PYGZsq{}attocount\PYGZsq{},
\PYGZsq{}attoday\PYGZsq{},
\PYGZsq{}attodebye\PYGZsq{},
\PYGZsq{}attodegree\PYGZsq{},
\PYGZsq{}attodyne\PYGZsq{},
\PYGZsq{}attoelectronvolt\PYGZsq{},
\PYGZsq{}attofarad\PYGZsq{},
\PYGZsq{}attogal\PYGZsq{},
\PYGZsq{}attogauss\PYGZsq{},
\PYGZsq{}attogram\PYGZsq{},
\PYGZsq{}attohenry\PYGZsq{},
\PYGZsq{}attohertz\PYGZsq{},
\PYGZsq{}attohour\PYGZsq{},
\PYGZsq{}attohr\PYGZsq{},
\PYGZsq{}attojansky\PYGZsq{},
\PYGZsq{}attojoule\PYGZsq{},
\PYGZsq{}attokayser\PYGZsq{},
\PYGZsq{}attolightyear\PYGZsq{},
\PYGZsq{}attoliter\PYGZsq{},
\PYGZsq{}attolumen\PYGZsq{},
\PYGZsq{}attolux\PYGZsq{},
\PYGZsq{}attometer\PYGZsq{},
\PYGZsq{}attominute\PYGZsq{},
\PYGZsq{}attomole\PYGZsq{},
\PYGZsq{}attonewton\PYGZsq{},
\PYGZsq{}attoparsec\PYGZsq{},
\PYGZsq{}attopascal\PYGZsq{},
\PYGZsq{}attophoton\PYGZsq{},
\PYGZsq{}attopixel\PYGZsq{},
\PYGZsq{}attopoise\PYGZsq{},
\PYGZsq{}attoradian\PYGZsq{},
\PYGZsq{}attorayleigh\PYGZsq{},
\PYGZsq{}attorydberg\PYGZsq{},
\PYGZsq{}attosecond\PYGZsq{},
\PYGZsq{}attosiemens\PYGZsq{},
\PYGZsq{}attosteradian\PYGZsq{},
\PYGZsq{}attostokes\PYGZsq{},
\PYGZsq{}attotesla\PYGZsq{},
\PYGZsq{}attovolt\PYGZsq{},
\PYGZsq{}attovoxel\PYGZsq{},
\PYGZsq{}attowatt\PYGZsq{},
\PYGZsq{}attoweber\PYGZsq{},
\PYGZsq{}attoyear\PYGZsq{},
\PYGZsq{}au\PYGZsq{},
\PYGZsq{}avox\PYGZsq{},
\PYGZsq{}avoxel\PYGZsq{},
\PYGZsq{}ayr\PYGZsq{},
\PYGZsq{}b\PYGZsq{},
\PYGZsq{}bar\PYGZsq{},
\PYGZsq{}barn\PYGZsq{},
\PYGZsq{}barye\PYGZsq{},
\PYGZsq{}beam\PYGZsq{},
\PYGZsq{}beam\PYGZus{}angular\PYGZus{}area\PYGZsq{},
\PYGZsq{}becquerel\PYGZsq{},
\PYGZsq{}bin\PYGZsq{},
\PYGZsq{}binary\PYGZus{}prefixes\PYGZsq{},
\PYGZsq{}bit\PYGZsq{},
\PYGZsq{}bol\PYGZsq{},
\PYGZsq{}brightness\PYGZus{}temperature\PYGZsq{},
\PYGZsq{}byte\PYGZsq{},
\PYGZsq{}cA\PYGZsq{},
\PYGZsq{}cAU\PYGZsq{},
\PYGZsq{}cB\PYGZsq{},
\PYGZsq{}cBa\PYGZsq{},
\PYGZsq{}cC\PYGZsq{},
\PYGZsq{}cD\PYGZsq{},
\PYGZsq{}cF\PYGZsq{},
\PYGZsq{}cG\PYGZsq{},
\PYGZsq{}cGal\PYGZsq{},
\PYGZsq{}cH\PYGZsq{},
\PYGZsq{}cHz\PYGZsq{},
\PYGZsq{}cJ\PYGZsq{},
\PYGZsq{}cJy\PYGZsq{},
\PYGZsq{}cK\PYGZsq{},
\PYGZsq{}cL\PYGZsq{},
\PYGZsq{}cN\PYGZsq{},
\PYGZsq{}cOhm\PYGZsq{},
\PYGZsq{}cP\PYGZsq{},
\PYGZsq{}cPa\PYGZsq{},
\PYGZsq{}cR\PYGZsq{},
\PYGZsq{}cRy\PYGZsq{},
\PYGZsq{}cS\PYGZsq{},
\PYGZsq{}cSt\PYGZsq{},
\PYGZsq{}cT\PYGZsq{},
\PYGZsq{}cV\PYGZsq{},
\PYGZsq{}cW\PYGZsq{},
\PYGZsq{}cWb\PYGZsq{},
\PYGZsq{}ca\PYGZsq{},
\PYGZsq{}cadu\PYGZsq{},
\PYGZsq{}candela\PYGZsq{},
\PYGZsq{}carcmin\PYGZsq{},
\PYGZsq{}carcsec\PYGZsq{},
\PYGZsq{}cau\PYGZsq{},
\PYGZsq{}cb\PYGZsq{},
\PYGZsq{}cbarn\PYGZsq{},
\PYGZsq{}cbeam\PYGZsq{},
\PYGZsq{}cbin\PYGZsq{},
\PYGZsq{}cbit\PYGZsq{},
\PYGZsq{}cbyte\PYGZsq{},
\PYGZsq{}ccd\PYGZsq{},
\PYGZsq{}cchan\PYGZsq{},
\PYGZsq{}ccount\PYGZsq{},
\PYGZsq{}cct\PYGZsq{},
\PYGZsq{}cd\PYGZsq{},
\PYGZsq{}cdeg\PYGZsq{},
\PYGZsq{}cdyn\PYGZsq{},
\PYGZsq{}ceV\PYGZsq{},
\PYGZsq{}centiBarye\PYGZsq{},
\PYGZsq{}centiDa\PYGZsq{},
\PYGZsq{}centiDalton\PYGZsq{},
\PYGZsq{}centiDebye\PYGZsq{},
\PYGZsq{}centiFarad\PYGZsq{},
\PYGZsq{}centiGauss\PYGZsq{},
\PYGZsq{}centiHenry\PYGZsq{},
\PYGZsq{}centiHertz\PYGZsq{},
\PYGZsq{}centiJansky\PYGZsq{},
\PYGZsq{}centiJoule\PYGZsq{},
\PYGZsq{}centiKayser\PYGZsq{},
\PYGZsq{}centiKelvin\PYGZsq{},
\PYGZsq{}centiNewton\PYGZsq{},
\PYGZsq{}centiOhm\PYGZsq{},
\PYGZsq{}centiPascal\PYGZsq{},
\PYGZsq{}centiRayleigh\PYGZsq{},
\PYGZsq{}centiSiemens\PYGZsq{},
\PYGZsq{}centiTesla\PYGZsq{},
\PYGZsq{}centiVolt\PYGZsq{},
\PYGZsq{}centiWatt\PYGZsq{},
\PYGZsq{}centiWeber\PYGZsq{},
\PYGZsq{}centiamp\PYGZsq{},
\PYGZsq{}centiampere\PYGZsq{},
\PYGZsq{}centiannum\PYGZsq{},
\PYGZsq{}centiarcminute\PYGZsq{},
\PYGZsq{}centiarcsecond\PYGZsq{},
\PYGZsq{}centiastronomical\PYGZus{}unit\PYGZsq{},
\PYGZsq{}centibarn\PYGZsq{},
\PYGZsq{}centibarye\PYGZsq{},
\PYGZsq{}centibit\PYGZsq{},
\PYGZsq{}centibyte\PYGZsq{},
\PYGZsq{}centicandela\PYGZsq{},
\PYGZsq{}centicoulomb\PYGZsq{},
\PYGZsq{}centicount\PYGZsq{},
\PYGZsq{}centiday\PYGZsq{},
\PYGZsq{}centidebye\PYGZsq{},
\PYGZsq{}centidegree\PYGZsq{},
\PYGZsq{}centidyne\PYGZsq{},
\PYGZsq{}centielectronvolt\PYGZsq{},
\PYGZsq{}centifarad\PYGZsq{},
\PYGZsq{}centigal\PYGZsq{},
\PYGZsq{}centigauss\PYGZsq{},
\PYGZsq{}centigram\PYGZsq{},
\PYGZsq{}centihenry\PYGZsq{},
\PYGZsq{}centihertz\PYGZsq{},
\PYGZsq{}centihour\PYGZsq{},
\PYGZsq{}centihr\PYGZsq{},
\PYGZsq{}centijansky\PYGZsq{},
\PYGZsq{}centijoule\PYGZsq{},
\PYGZsq{}centikayser\PYGZsq{},
\PYGZsq{}centilightyear\PYGZsq{},
\PYGZsq{}centiliter\PYGZsq{},
\PYGZsq{}centilumen\PYGZsq{},
\PYGZsq{}centilux\PYGZsq{},
\PYGZsq{}centimeter\PYGZsq{},
\PYGZsq{}centiminute\PYGZsq{},
\PYGZsq{}centimole\PYGZsq{},
\PYGZsq{}centinewton\PYGZsq{},
\PYGZsq{}centiparsec\PYGZsq{},
\PYGZsq{}centipascal\PYGZsq{},
\PYGZsq{}centiphoton\PYGZsq{},
\PYGZsq{}centipixel\PYGZsq{},
\PYGZsq{}centipoise\PYGZsq{},
\PYGZsq{}centiradian\PYGZsq{},
\PYGZsq{}centirayleigh\PYGZsq{},
\PYGZsq{}centirydberg\PYGZsq{},
\PYGZsq{}centisecond\PYGZsq{},
\PYGZsq{}centisiemens\PYGZsq{},
\PYGZsq{}centisteradian\PYGZsq{},
\PYGZsq{}centistokes\PYGZsq{},
\PYGZsq{}centitesla\PYGZsq{},
\PYGZsq{}centivolt\PYGZsq{},
\PYGZsq{}centivoxel\PYGZsq{},
\PYGZsq{}centiwatt\PYGZsq{},
\PYGZsq{}centiweber\PYGZsq{},
\PYGZsq{}centiyear\PYGZsq{},
\PYGZsq{}cerg\PYGZsq{},
\PYGZsq{}cg\PYGZsq{},
\PYGZsq{}cgs\PYGZsq{},
\PYGZsq{}ch\PYGZsq{},
\PYGZsq{}chan\PYGZsq{},
\PYGZsq{}ck\PYGZsq{},
\PYGZsq{}cl\PYGZsq{},
\PYGZsq{}clm\PYGZsq{},
\PYGZsq{}clx\PYGZsq{},
\PYGZsq{}clyr\PYGZsq{},
\PYGZsq{}cm\PYGZsq{},
\PYGZsq{}cmag\PYGZsq{},
\PYGZsq{}cmin\PYGZsq{},
\PYGZsq{}cmol\PYGZsq{},
\PYGZsq{}cohm\PYGZsq{},
\PYGZsq{}core\PYGZsq{},
\PYGZsq{}coulomb\PYGZsq{},
\PYGZsq{}count\PYGZsq{},
\PYGZsq{}cpc\PYGZsq{},
\PYGZsq{}cph\PYGZsq{},
\PYGZsq{}cphoton\PYGZsq{},
\PYGZsq{}cpix\PYGZsq{},
\PYGZsq{}cpixel\PYGZsq{},
\PYGZsq{}crad\PYGZsq{},
\PYGZsq{}cs\PYGZsq{},
\PYGZsq{}csr\PYGZsq{},
\PYGZsq{}ct\PYGZsq{},
\PYGZsq{}cu\PYGZsq{},
\PYGZsq{}curie\PYGZsq{},
\PYGZsq{}cvox\PYGZsq{},
\PYGZsq{}cvoxel\PYGZsq{},
\PYGZsq{}cy\PYGZsq{},
\PYGZsq{}cycle\PYGZsq{},
\PYGZsq{}cyr\PYGZsq{},
\PYGZsq{}d\PYGZsq{},
\PYGZsq{}dA\PYGZsq{},
\PYGZsq{}dAU\PYGZsq{},
\PYGZsq{}dB\PYGZsq{},
\PYGZsq{}dBa\PYGZsq{},
\PYGZsq{}dC\PYGZsq{},
\PYGZsq{}dD\PYGZsq{},
\PYGZsq{}dF\PYGZsq{},
\PYGZsq{}dG\PYGZsq{},
\PYGZsq{}dGal\PYGZsq{},
\PYGZsq{}dH\PYGZsq{},
\PYGZsq{}dHz\PYGZsq{},
\PYGZsq{}dJ\PYGZsq{},
\PYGZsq{}dJy\PYGZsq{},
\PYGZsq{}dK\PYGZsq{},
\PYGZsq{}dL\PYGZsq{},
\PYGZsq{}dN\PYGZsq{},
\PYGZsq{}dOhm\PYGZsq{},
\PYGZsq{}dP\PYGZsq{},
\PYGZsq{}dPa\PYGZsq{},
\PYGZsq{}dR\PYGZsq{},
\PYGZsq{}dRy\PYGZsq{},
\PYGZsq{}dS\PYGZsq{},
\PYGZsq{}dSt\PYGZsq{},
\PYGZsq{}dT\PYGZsq{},
...]
\end{sphinxVerbatim}
To create a quantity, we multiply a value by a unit.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{coord} \PYG{o}{=} \PYG{l+m+mi}{30} \PYG{o}{*} \PYG{n}{u}\PYG{o}{.}\PYG{n}{deg}
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{coord}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
astropy.units.quantity.Quantity
\end{sphinxVerbatim}
The result is a \sphinxcode{\sphinxupquote{Quantity}} object.
Jupyter knows how to display \sphinxcode{\sphinxupquote{Quantities}} like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{coord}
\end{sphinxVerbatim}
\begin{equation*}
\begin{split}30 \; \mathrm{{}^{\circ}}\end{split}
\end{equation*}
\section{Selecting a rectangle}
\label{\detokenize{02_coords:selecting-a-rectangle}}
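Now we'll select a rectangle that spans \sphinxhyphen{}55 to \sphinxhyphen{}45 degrees in \sphinxcode{\sphinxupquote{phi1}} and \sphinxhyphen{}8 to 4 degrees in \sphinxcode{\sphinxupquote{phi2}}, the two coordinates of the GD\sphinxhyphen{}1 frame.
We'll define variables to contain these limits.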
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{phi1\PYGZus{}min} \PYG{o}{=} \PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{55}
\PYG{n}{phi1\PYGZus{}max} \PYG{o}{=} \PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{45}
\PYG{n}{phi2\PYGZus{}min} \PYG{o}{=} \PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{8}
\PYG{n}{phi2\PYGZus{}max} \PYG{o}{=} \PYG{l+m+mi}{4}
\end{sphinxVerbatim}
To represent a rectangle, we'll use two lists of coordinates and multiply by their units.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{phi1\PYGZus{}rect} \PYG{o}{=} \PYG{p}{[}\PYG{n}{phi1\PYGZus{}min}\PYG{p}{,} \PYG{n}{phi1\PYGZus{}min}\PYG{p}{,} \PYG{n}{phi1\PYGZus{}max}\PYG{p}{,} \PYG{n}{phi1\PYGZus{}max}\PYG{p}{]} \PYG{o}{*} \PYG{n}{u}\PYG{o}{.}\PYG{n}{deg}
\PYG{n}{phi2\PYGZus{}rect} \PYG{o}{=} \PYG{p}{[}\PYG{n}{phi2\PYGZus{}min}\PYG{p}{,} \PYG{n}{phi2\PYGZus{}max}\PYG{p}{,} \PYG{n}{phi2\PYGZus{}max}\PYG{p}{,} \PYG{n}{phi2\PYGZus{}min}\PYG{p}{]} \PYG{o}{*} \PYG{n}{u}\PYG{o}{.}\PYG{n}{deg}
\end{sphinxVerbatim}
\sphinxcode{\sphinxupquote{phi1\_rect}} and \sphinxcode{\sphinxupquote{phi2\_rect}} represent the coordinates of the corners of a rectangle.
But they are in “\sphinxhref{https://gala-astro.readthedocs.io/en/latest/\_modules/gala/coordinates/gd1.html}{a Heliocentric spherical coordinate system defined by the orbit of the GD1 stream}”.
In order to use them in a Gaia query, we have to convert them to \sphinxhref{https://en.wikipedia.org/wiki/International\_Celestial\_Reference\_System}{International Celestial Reference System} (ICRS) coordinates. We can do that by storing the coordinates in a \sphinxcode{\sphinxupquote{GD1Koposov10}} object provided by \sphinxhref{https://gala-astro.readthedocs.io/en/latest/coordinates/}{Gala}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{gala}\PYG{n+nn}{.}\PYG{n+nn}{coordinates} \PYG{k}{as} \PYG{n+nn}{gc}
\PYG{n}{corners} \PYG{o}{=} \PYG{n}{gc}\PYG{o}{.}\PYG{n}{GD1Koposov10}\PYG{p}{(}\PYG{n}{phi1}\PYG{o}{=}\PYG{n}{phi1\PYGZus{}rect}\PYG{p}{,} \PYG{n}{phi2}\PYG{o}{=}\PYG{n}{phi2\PYGZus{}rect}\PYG{p}{)}
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{corners}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
gala.coordinates.gd1.GD1Koposov10
\end{sphinxVerbatim}
We can display the result like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{corners}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}GD1Koposov10 Coordinate: (phi1, phi2) in deg
[(\PYGZhy{}55., \PYGZhy{}8.), (\PYGZhy{}55., 4.), (\PYGZhy{}45., 4.), (\PYGZhy{}45., \PYGZhy{}8.)]\PYGZgt{}
\end{sphinxVerbatim}
Now we can use \sphinxcode{\sphinxupquote{transform\_to}} to convert to ICRS coordinates.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{astropy}\PYG{n+nn}{.}\PYG{n+nn}{coordinates} \PYG{k}{as} \PYG{n+nn}{coord}
\PYG{n}{corners\PYGZus{}icrs} \PYG{o}{=} \PYG{n}{corners}\PYG{o}{.}\PYG{n}{transform\PYGZus{}to}\PYG{p}{(}\PYG{n}{coord}\PYG{o}{.}\PYG{n}{ICRS}\PYG{p}{)}
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{corners\PYGZus{}icrs}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
astropy.coordinates.builtin\PYGZus{}frames.icrs.ICRS
\end{sphinxVerbatim}
The result is an \sphinxcode{\sphinxupquote{ICRS}} object.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{corners\PYGZus{}icrs}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}ICRS Coordinate: (ra, dec) in deg
[(146.27533314, 19.26190982), (135.42163944, 25.87738723),
(141.60264825, 34.3048303 ), (152.81671045, 27.13611254)]\PYGZgt{}
\end{sphinxVerbatim}
Notice that a rectangle in one coordinate system is not necessarily a rectangle in another. In this example, the result is a polygon.
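To see why, it can help to rotate the same four corners with a much simpler transformation. The following is a minimal sketch in plain Python, assuming an arbitrary 30\sphinxhyphen{}degree rotation about the x\sphinxhyphen{}axis rather than the actual GD\sphinxhyphen{}1 transformation that Gala implements; the function name \sphinxcode{\sphinxupquote{rotate\_about\_x}} is hypothetical.

```python
import math

def rotate_about_x(lon_deg, lat_deg, angle_deg):
    """Rotate a point on the unit sphere about the x-axis (angles in degrees)."""
    lon, lat, a = (math.radians(v) for v in (lon_deg, lat_deg, angle_deg))
    # Spherical to Cartesian coordinates
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    # Rotation about the x-axis: x is unchanged
    y2 = y * math.cos(a) - z * math.sin(a)
    z2 = y * math.sin(a) + z * math.cos(a)
    # Cartesian back to spherical coordinates
    new_lon = math.degrees(math.atan2(y2, x)) % 360
    new_lat = math.degrees(math.asin(max(-1.0, min(1.0, z2))))
    return new_lon, new_lat

# The same corner order as phi1_rect and phi2_rect
phi1_rect = [-55, -55, -45, -45]
phi2_rect = [-8, 4, 4, -8]
corners = list(zip(phi1_rect, phi2_rect))

# An arbitrary rotation, chosen only for illustration
rotated = [rotate_about_x(lon, lat, 30) for lon, lat in corners]
for old, new in zip(corners, rotated):
    print(old, '->', new)
```

In the original frame the first two corners share the same longitude, but after the rotation they do not, so the edge between them is no longer aligned with the new coordinate grid.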
\section{Selecting a polygon}
\label{\detokenize{02_coords:selecting-a-polygon}}
In order to use this polygon as part of an ADQL query, we have to convert it to a string with a comma\sphinxhyphen{}separated list of coordinates, as in this example:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{l+s+sd}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\PYG{l+s+sd}{POLYGON(143.65, 20.98, }
\PYG{l+s+sd}{ 134.46, 26.39, }
\PYG{l+s+sd}{ 140.58, 34.85, }
\PYG{l+s+sd}{ 150.16, 29.01)}
\PYG{l+s+sd}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
\sphinxcode{\sphinxupquote{corners\_icrs}} behaves like a list, so we can use a \sphinxcode{\sphinxupquote{for}} loop to iterate through the points.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k}{for} \PYG{n}{point} \PYG{o+ow}{in} \PYG{n}{corners\PYGZus{}icrs}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{point}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}ICRS Coordinate: (ra, dec) in deg
(146.27533314, 19.26190982)\PYGZgt{}
\PYGZlt{}ICRS Coordinate: (ra, dec) in deg
(135.42163944, 25.87738723)\PYGZgt{}
\PYGZlt{}ICRS Coordinate: (ra, dec) in deg
(141.60264825, 34.3048303)\PYGZgt{}
\PYGZlt{}ICRS Coordinate: (ra, dec) in deg
(152.81671045, 27.13611254)\PYGZgt{}
\end{sphinxVerbatim}
From that, we can select the coordinates \sphinxcode{\sphinxupquote{ra}} and \sphinxcode{\sphinxupquote{dec}}:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k}{for} \PYG{n}{point} \PYG{o+ow}{in} \PYG{n}{corners\PYGZus{}icrs}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{point}\PYG{o}{.}\PYG{n}{ra}\PYG{p}{,} \PYG{n}{point}\PYG{o}{.}\PYG{n}{dec}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
146d16m31.1993s 19d15m42.8754s
135d25m17.902s 25d52m38.594s
141d36m09.5337s 34d18m17.3891s
152d49m00.1576s 27d08m10.0051s
\end{sphinxVerbatim}
The results are quantities with units, but if we select the \sphinxcode{\sphinxupquote{value}} part, we get a dimensionless floating\sphinxhyphen{}point number.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k}{for} \PYG{n}{point} \PYG{o+ow}{in} \PYG{n}{corners\PYGZus{}icrs}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{point}\PYG{o}{.}\PYG{n}{ra}\PYG{o}{.}\PYG{n}{value}\PYG{p}{,} \PYG{n}{point}\PYG{o}{.}\PYG{n}{dec}\PYG{o}{.}\PYG{n}{value}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
146.27533313607782 19.261909820533692
135.42163944306296 25.87738722767213
141.60264825107333 34.304830296257144
152.81671044675923 27.136112541397996
\end{sphinxVerbatim}
We can use string \sphinxcode{\sphinxupquote{format}} to convert these numbers to strings.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{point\PYGZus{}base} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}}\PYG{l+s+si}{\PYGZob{}point.ra.value\PYGZcb{}}\PYG{l+s+s2}{, }\PYG{l+s+si}{\PYGZob{}point.dec.value\PYGZcb{}}\PYG{l+s+s2}{\PYGZdq{}}
\PYG{n}{t} \PYG{o}{=} \PYG{p}{[}\PYG{n}{point\PYGZus{}base}\PYG{o}{.}\PYG{n}{format}\PYG{p}{(}\PYG{n}{point}\PYG{o}{=}\PYG{n}{point}\PYG{p}{)}
     \PYG{k}{for} \PYG{n}{point} \PYG{o+ow}{in} \PYG{n}{corners\PYGZus{}icrs}\PYG{p}{]}
\PYG{n}{t}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
[\PYGZsq{}146.27533313607782, 19.261909820533692\PYGZsq{},
\PYGZsq{}135.42163944306296, 25.87738722767213\PYGZsq{},
\PYGZsq{}141.60264825107333, 34.304830296257144\PYGZsq{},
\PYGZsq{}152.81671044675923, 27.136112541397996\PYGZsq{}]
\end{sphinxVerbatim}
The result is a list of strings, which we can join into a single string using \sphinxcode{\sphinxupquote{join}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{point\PYGZus{}list} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{, }\PYG{l+s+s1}{\PYGZsq{}}\PYG{o}{.}\PYG{n}{join}\PYG{p}{(}\PYG{n}{t}\PYG{p}{)}
\PYG{n}{point\PYGZus{}list}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZsq{}146.27533313607782, 19.261909820533692, 135.42163944306296, 25.87738722767213, 141.60264825107333, 34.304830296257144, 152.81671044675923, 27.136112541397996\PYGZsq{}
\end{sphinxVerbatim}
Notice that we invoke \sphinxcode{\sphinxupquote{join}} on a string and pass the list as an argument.
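The formatting and joining steps can be combined in a small helper. This is a sketch, assuming the corners are already available as plain (ra, dec) pairs of floats; the function name \sphinxcode{\sphinxupquote{make\_point\_list}} is hypothetical, and the rounded corner values are for illustration only.

```python
def make_point_list(points):
    """Return 'ra1, dec1, ra2, dec2, ...' for a sequence of (ra, dec) pairs."""
    # Format each pair as 'ra, dec', then join the pairs with ', '
    return ', '.join(f'{ra}, {dec}' for ra, dec in points)

# Corner values rounded from the ICRS output above (illustration only)
example_points = [(146.28, 19.26), (135.42, 25.88),
                  (141.6, 34.3), (152.82, 27.14)]
example_list = make_point_list(example_points)
print(example_list)
```

The separator string comes first and the sequence is the argument, just as in the call to \sphinxcode{\sphinxupquote{join}} above.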
Before we can assemble the query, we need \sphinxcode{\sphinxupquote{columns}} again (as we saw in the previous notebook).
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{columns} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{source\PYGZus{}id, ra, dec, pmra, pmdec, parallax, parallax\PYGZus{}error, radial\PYGZus{}velocity}\PYG{l+s+s1}{\PYGZsq{}}
\end{sphinxVerbatim}
Here's the base for the query, with format specifiers for \sphinxcode{\sphinxupquote{columns}} and \sphinxcode{\sphinxupquote{point\_list}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{query\PYGZus{}base} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}\PYG{l+s+s2}{SELECT }\PYG{l+s+si}{\PYGZob{}columns\PYGZcb{}}
\PYG{l+s+s2}{FROM gaiadr2.gaia\PYGZus{}source}
\PYG{l+s+s2}{WHERE parallax \PYGZlt{} 1}
\PYG{l+s+s2}{ AND bp\PYGZus{}rp BETWEEN \PYGZhy{}0.75 AND 2 }
\PYG{l+s+s2}{ AND 1 = CONTAINS(POINT(ra, dec), }
\PYG{l+s+s2}{ POLYGON(}\PYG{l+s+si}{\PYGZob{}point\PYGZus{}list\PYGZcb{}}\PYG{l+s+s2}{))}
\PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
And here's the result:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{query} \PYG{o}{=} \PYG{n}{query\PYGZus{}base}\PYG{o}{.}\PYG{n}{format}\PYG{p}{(}\PYG{n}{columns}\PYG{o}{=}\PYG{n}{columns}\PYG{p}{,}
                          \PYG{n}{point\PYGZus{}list}\PYG{o}{=}\PYG{n}{point\PYGZus{}list}\PYG{p}{)}
\PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{query}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
SELECT source\PYGZus{}id, ra, dec, pmra, pmdec, parallax, parallax\PYGZus{}error, radial\PYGZus{}velocity
FROM gaiadr2.gaia\PYGZus{}source
WHERE parallax \PYGZlt{} 1
AND bp\PYGZus{}rp BETWEEN \PYGZhy{}0.75 AND 2
AND 1 = CONTAINS(POINT(ra, dec),
POLYGON(146.27533313607782, 19.261909820533692, 135.42163944306296, 25.87738722767213, 141.60264825107333, 34.304830296257144, 152.81671044675923, 27.136112541397996))
\end{sphinxVerbatim}
As always, we should take a minute to proof\sphinxhyphen{}read the query before we launch it.
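Part of that proofreading can be automated. The following is a minimal sketch using only the standard library; it is not part of Astroquery, the function name \sphinxcode{\sphinxupquote{check\_polygon}} is hypothetical, and it only checks that the \sphinxcode{\sphinxupquote{POLYGON}} clause contains an even number of numeric values making up at least three pairs.

```python
import re

def check_polygon(query):
    """Return the polygon coordinates as floats, or raise ValueError."""
    match = re.search(r'POLYGON\(([^)]*)\)', query)
    if match is None:
        raise ValueError('no POLYGON clause found')
    # float() raises ValueError for anything that is not a number,
    # including a format specifier that was never filled in
    values = [float(s) for s in match.group(1).split(',')]
    if len(values) % 2 != 0 or len(values) < 6:
        raise ValueError('POLYGON needs at least three (ra, dec) pairs')
    return values

# A shortened query with rounded coordinates (illustration only)
example_query = """SELECT source_id
FROM gaiadr2.gaia_source
WHERE 1 = CONTAINS(POINT(ra, dec),
                   POLYGON(146.28, 19.26, 135.42, 25.88, 141.6, 34.3, 152.82, 27.14))
"""
coords = check_polygon(example_query)
print(len(coords) // 2, 'corners')
```

A check like this catches mechanical mistakes only; it does not tell us whether the polygon covers the region we intended.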
The result will be bigger than our previous queries, so it will take a little longer.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{job} \PYG{o}{=} \PYG{n}{Gaia}\PYG{o}{.}\PYG{n}{launch\PYGZus{}job\PYGZus{}async}\PYG{p}{(}\PYG{n}{query}\PYG{p}{)}
\PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{job}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
INFO: Query finished. [astroquery.utils.tap.core]
\PYGZlt{}Table length=140340\PYGZgt{}
name dtype unit description n\PYGZus{}bad
\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}
source\PYGZus{}id int64 Unique source identifier (unique within a particular Data Release) 0
ra float64 deg Right ascension 0
dec float64 deg Declination 0
pmra float64 mas / yr Proper motion in right ascension direction 0
pmdec float64 mas / yr Proper motion in declination direction 0
parallax float64 mas Parallax 0
parallax\PYGZus{}error float64 mas Standard error of parallax 0
radial\PYGZus{}velocity float64 km / s Radial velocity 139374
Jobid: 1603114980658O
Phase: COMPLETED
Owner: None
Output file: async\PYGZus{}20201019094300.vot
Results: None
\end{sphinxVerbatim}
Here are the results.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{results} \PYG{o}{=} \PYG{n}{job}\PYG{o}{.}\PYG{n}{get\PYGZus{}results}\PYG{p}{(}\PYG{p}{)}
\PYG{n+nb}{len}\PYG{p}{(}\PYG{n}{results}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
140340
\end{sphinxVerbatim}
There are more than 100,000 stars in this polygon, but that's a manageable size to work with.
\section{Saving results}
\label{\detokenize{02_coords:saving-results}}
This is the set of stars we'll work with in the next step. But since we have a substantial dataset now, this is a good time to save it.
Storing the data in a file means we can shut down this notebook and pick up where we left off without running the previous query again.
Astropy \sphinxcode{\sphinxupquote{Table}} objects provide \sphinxcode{\sphinxupquote{write}}, which writes the table to disk.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}results.fits}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{results}\PYG{o}{.}\PYG{n}{write}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{n}{overwrite}\PYG{o}{=}\PYG{k+kc}{True}\PYG{p}{)}
\end{sphinxVerbatim}
Because the filename ends with \sphinxcode{\sphinxupquote{fits}}, the table is written in the \sphinxhref{https://en.wikipedia.org/wiki/FITS}{FITS format}, which preserves the metadata associated with the table.
If the file already exists, the \sphinxcode{\sphinxupquote{overwrite}} argument causes it to be overwritten.
To see how big the file is, we can use \sphinxcode{\sphinxupquote{ls}} with the \sphinxcode{\sphinxupquote{\sphinxhyphen{}lh}} option, which prints information about the file including its size in human\sphinxhyphen{}readable form.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{o}{!}ls \PYGZhy{}lh gd1\PYGZus{}results.fits
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZhy{}rw\PYGZhy{}rw\PYGZhy{}r\PYGZhy{}\PYGZhy{} 1 downey downey 8.6M Oct 19 09:43 gd1\PYGZus{}results.fits
\end{sphinxVerbatim}
The file is about 8.6 MB. If you are using Windows, \sphinxcode{\sphinxupquote{ls}} might not work; in that case, try:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
!dir gd1\PYGZus{}results.fits
\end{sphinxVerbatim}
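A portable alternative is to ask Python for the size directly. This is a sketch using only the standard library; the helper \sphinxcode{\sphinxupquote{human\_readable\_size}} is hypothetical and imitates the units printed by \sphinxcode{\sphinxupquote{ls \sphinxhyphen{}lh}}, which are powers of 1024.

```python
import os

def human_readable_size(n_bytes):
    """Format a size in bytes with a one-letter unit, using powers of 1024."""
    for unit in ['B', 'K', 'M', 'G']:
        if n_bytes < 1024:
            return f'{n_bytes:.1f}{unit}'
        n_bytes /= 1024
    return f'{n_bytes:.1f}T'

filename = 'gd1_results.fits'
# Only report the size if the file from the previous step is present
if os.path.exists(filename):
    print(human_readable_size(os.path.getsize(filename)))
```

Because it does not rely on a shell command, the same cell works on Windows, macOS, and Linux.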
\section{Summary}
\label{\detokenize{02_coords:summary}}
In this notebook, we composed more complex queries to select stars within a polygonal region of the sky. Then we downloaded the results and saved them in a FITS file.
In the next notebook, we'll reload the data from this file and replicate the next step in the analysis, using proper motion to identify stars likely to be in GD\sphinxhyphen{}1.
\section{Best practices}
\label{\detokenize{02_coords:best-practices}}\begin{itemize}
\item {}
For measurements with units, use \sphinxcode{\sphinxupquote{Quantity}} objects that represent units explicitly and check for errors.
\item {}
Use the \sphinxcode{\sphinxupquote{format}} function to compose queries; it is often faster and less error\sphinxhyphen{}prone than assembling the query string by hand.
\item {}
Develop queries incrementally: start with something simple, test it, and add a little bit at a time.
\item {}
Once you have a query working, save the data in a local file. If you shut down the notebook and come back to it later, you can reload the file; you don't have to run the query again.
\end{itemize}
\chapter{Chapter 3}
\label{\detokenize{03_motion:chapter-3}}\label{\detokenize{03_motion::doc}}
This is the third in a series of notebooks related to astronomy data.
As a running example, we are replicating parts of the analysis in a recent paper, “\sphinxhref{https://arxiv.org/abs/1805.00425}{Off the beaten path: Gaia reveals GD\sphinxhyphen{}1 stars outside of the main stream}” by Adrian M. Price\sphinxhyphen{}Whelan and Ana Bonaca.
In the first lesson, we wrote ADQL queries and used them to select and download data from the Gaia server.
In the second lesson, we wrote a query to select stars from the region of the sky where we expect GD\sphinxhyphen{}1 to be, and saved the results in a FITS file.
Now we'll read that data back and implement the next step in the analysis, identifying stars with the proper motion we expect for GD\sphinxhyphen{}1.
\section{Outline}
\label{\detokenize{03_motion:outline}}
Here are the steps in this lesson:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
We'll read back the results from the previous lesson, which we saved in a FITS file.
\item {}
Then we'll transform the coordinates and proper motion data from ICRS back to the coordinate frame of GD\sphinxhyphen{}1.
\item {}
We'll put those results into a Pandas \sphinxcode{\sphinxupquote{DataFrame}}, which we'll use to select stars near the centerline of GD\sphinxhyphen{}1.
\item {}
Plotting the proper motion of those stars, we'll identify a region of proper motion for stars that are likely to be in GD\sphinxhyphen{}1.
\item {}
Finally, we'll select and plot the stars whose proper motion is in that region.
\end{enumerate}
After completing this lesson, you should be able to
\begin{itemize}
\item {}
Select rows and columns from an Astropy \sphinxcode{\sphinxupquote{Table}}.
\item {}
Use Matplotlib to make a scatter plot.
\item {}
Use Gala to transform coordinates.
\item {}
Make a Pandas \sphinxcode{\sphinxupquote{DataFrame}} and use a Boolean \sphinxcode{\sphinxupquote{Series}} to select rows.
\item {}
Save a \sphinxcode{\sphinxupquote{DataFrame}} in an HDF5 file.
\end{itemize}
\section{Installing libraries}
\label{\detokenize{03_motion:installing-libraries}}
If you are running this notebook on Colab, you can run the following cell to install Astroquery and the other libraries we'll use.
If you are running this notebook on your own computer, you might have to install these libraries yourself.
If you are using this notebook as part of a Carpentries workshop, you should have received setup instructions.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} If we\PYGZsq{}re running on Colab, install libraries}
\PYG{k+kn}{import} \PYG{n+nn}{sys}
\PYG{n}{IN\PYGZus{}COLAB} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{google.colab}\PYG{l+s+s1}{\PYGZsq{}} \PYG{o+ow}{in} \PYG{n}{sys}\PYG{o}{.}\PYG{n}{modules}
\PYG{k}{if} \PYG{n}{IN\PYGZus{}COLAB}\PYG{p}{:}
    \PYG{o}{!}pip install astroquery astro\PYGZhy{}gala pyia python\PYGZhy{}wget
\end{sphinxVerbatim}
\section{Reload the data}
\label{\detokenize{03_motion:reload-the-data}}
In the previous lesson, we ran a query on the Gaia server and downloaded data for more than 100,000 stars. We saved the data in a FITS file so that now, picking up where we left off, we can read the data from a local file rather than running the query again.
If you ran the previous lesson successfully, you should already have a file called \sphinxcode{\sphinxupquote{gd1\_results.fits}} that contains the data we downloaded.
If not, you can run the following cell, which downloads the data from our repository.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{os}
\PYG{k+kn}{from} \PYG{n+nn}{wget} \PYG{k+kn}{import} \PYG{n}{download}
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}results.fits}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{path} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{https://github.com/AllenDowney/AstronomicalData/raw/main/data/}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{k}{if} \PYG{o+ow}{not} \PYG{n}{os}\PYG{o}{.}\PYG{n}{path}\PYG{o}{.}\PYG{n}{exists}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{download}\PYG{p}{(}\PYG{n}{path}\PYG{o}{+}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{)}
\end{sphinxVerbatim}
Now here's how we can read the data from the file back into an Astropy \sphinxcode{\sphinxupquote{Table}}:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{from} \PYG{n+nn}{astropy}\PYG{n+nn}{.}\PYG{n+nn}{table} \PYG{k+kn}{import} \PYG{n}{Table}
\PYG{n}{results} \PYG{o}{=} \PYG{n}{Table}\PYG{o}{.}\PYG{n}{read}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{)}
\end{sphinxVerbatim}
The result is an Astropy \sphinxcode{\sphinxupquote{Table}}.
We can use \sphinxcode{\sphinxupquote{info}} to refresh our memory of the contents.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{results}\PYG{o}{.}\PYG{n}{info}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}Table length=140340\PYGZgt{}
name dtype unit description
\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}
source\PYGZus{}id int64 Unique source identifier (unique within a particular Data Release)
ra float64 deg Right ascension
dec float64 deg Declination
pmra float64 mas / yr Proper motion in right ascension direction
pmdec float64 mas / yr Proper motion in declination direction
parallax float64 mas Parallax
parallax\PYGZus{}error float64 mas Standard error of parallax
radial\PYGZus{}velocity float64 km / s Radial velocity
\end{sphinxVerbatim}
\section{Selecting rows and columns}
\label{\detokenize{03_motion:selecting-rows-and-columns}}
In this section we'll see operations for selecting columns and rows from an Astropy \sphinxcode{\sphinxupquote{Table}}. You can find more information about these operations in the \sphinxhref{https://docs.astropy.org/en/stable/table/access\_table.html}{Astropy documentation}.
We can get the names of the columns like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{results}\PYG{o}{.}\PYG{n}{colnames}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
[\PYGZsq{}source\PYGZus{}id\PYGZsq{},
\PYGZsq{}ra\PYGZsq{},
\PYGZsq{}dec\PYGZsq{},
\PYGZsq{}pmra\PYGZsq{},
\PYGZsq{}pmdec\PYGZsq{},
\PYGZsq{}parallax\PYGZsq{},
\PYGZsq{}parallax\PYGZus{}error\PYGZsq{},
\PYGZsq{}radial\PYGZus{}velocity\PYGZsq{}]
\end{sphinxVerbatim}
And select an individual column like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{results}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ra}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}Column name=\PYGZsq{}ra\PYGZsq{} dtype=\PYGZsq{}float64\PYGZsq{} unit=\PYGZsq{}deg\PYGZsq{} description=\PYGZsq{}Right ascension\PYGZsq{} length=140340\PYGZgt{}
142.48301935991023
142.25452941346344
142.64528557468074
142.57739430926034
142.58913564478618
141.81762228999614
143.18339801317677
142.9347319464589
142.26769745823267
142.89551292869012
142.2780935768316
142.06138786534987
...
143.05456487172972
144.0436496516182
144.06566578919313
144.13177563215973
143.77696341662764
142.945956347594
142.97282480557786
143.4166017695258
143.64484588686904
143.41554585481808
143.6908739159247
143.7702681295401
\end{sphinxVerbatim}
The result is a \sphinxcode{\sphinxupquote{Column}} object that contains the data, and also the data type, units, and name of the column.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{results}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ra}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
astropy.table.column.Column
\end{sphinxVerbatim}
The rows in the \sphinxcode{\sphinxupquote{Table}} are numbered from 0 to \sphinxcode{\sphinxupquote{n\sphinxhyphen{}1}}, where \sphinxcode{\sphinxupquote{n}} is the number of rows. We can select the first row like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{results}\PYG{p}{[}\PYG{l+m+mi}{0}\PYG{p}{]}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}Row index=0\PYGZgt{}
source\PYGZus{}id ra dec pmra pmdec parallax parallax\PYGZus{}error radial\PYGZus{}velocity
deg deg mas / yr mas / yr mas mas km / s
int64 float64 float64 float64 float64 float64 float64 float64
\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}
637987125186749568 142.48301935991023 21.75771616932985 \PYGZhy{}2.5168384683875766 2.941813096629439 \PYGZhy{}0.2573448962333354 0.823720794509811 1e+20
\end{sphinxVerbatim}
As you might have guessed, the result is a \sphinxcode{\sphinxupquote{Row}} object.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{results}\PYG{p}{[}\PYG{l+m+mi}{0}\PYG{p}{]}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
astropy.table.row.Row
\end{sphinxVerbatim}
Notice that the bracket operator selects both columns and rows. You might wonder how it knows which to select.
If the expression in brackets is a string, it selects a column; if the expression is an integer, it selects a row.
If you apply the bracket operator twice, you can select a column and then an element from the column.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{results}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ra}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{p}{[}\PYG{l+m+mi}{0}\PYG{p}{]}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
142.48301935991023
\end{sphinxVerbatim}
Or you can select a row and then an element from the row.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{results}\PYG{p}{[}\PYG{l+m+mi}{0}\PYG{p}{]}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ra}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
142.48301935991023
\end{sphinxVerbatim}
You get the same result either way.
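The dispatch on the type of the key can be illustrated with a toy class. This is a sketch only, not how Astropy implements \sphinxcode{\sphinxupquote{Table}}, which supports more kinds of keys; the class name \sphinxcode{\sphinxupquote{ToyTable}} is hypothetical and the values are made up for illustration.

```python
class ToyTable:
    """A toy table that stores its data as a dict of columns."""

    def __init__(self, columns):
        self.columns = columns

    def __getitem__(self, key):
        if isinstance(key, str):
            # A string selects a column (a list of values)
            return self.columns[key]
        if isinstance(key, int):
            # An integer selects a row (a dict from column name to value)
            return {name: values[key]
                    for name, values in self.columns.items()}
        raise TypeError(f'unsupported key type: {type(key).__name__}')

toy = ToyTable({'ra': [142.48, 142.25], 'dec': [21.76, 22.48]})

# Column first, then element; or row first, then element
print(toy['ra'][0], toy[0]['ra'])
```

Python calls \sphinxcode{\sphinxupquote{\_\_getitem\_\_}} whenever the bracket operator is applied, so a class can inspect the key and decide what to return.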
\section{Scatter plot}
\label{\detokenize{03_motion:scatter-plot}}
To see what the results look like, we'll use a scatter plot. The library we'll use is \sphinxhref{https://matplotlib.org/}{Matplotlib}, which is the most widely used plotting library for Python.
The Matplotlib interface is based on MATLAB (hence the name), so if you know MATLAB, some of it will be familiar.
We'll import it like this.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{matplotlib}\PYG{n+nn}{.}\PYG{n+nn}{pyplot} \PYG{k}{as} \PYG{n+nn}{plt}
\end{sphinxVerbatim}
Pyplot is part of the Matplotlib library. It is conventional to import it using the shortened name \sphinxcode{\sphinxupquote{plt}}.
Pyplot provides two functions that can make scatter plots, \sphinxhref{https://matplotlib.org/3.3.0/api/\_as\_gen/matplotlib.pyplot.scatter.html}{plt.scatter} and \sphinxhref{https://matplotlib.org/api/\_as\_gen/matplotlib.pyplot.plot.html}{plt.plot}.
\begin{itemize}
\item {}
\sphinxcode{\sphinxupquote{scatter}} is more versatile; for example, you can make every point in a scatter plot a different color.
\item {}
\sphinxcode{\sphinxupquote{plot}} is more limited, but for simple cases, it can be substantially faster.
\end{itemize}
Jake Vanderplas explains these differences in \sphinxhref{https://jakevdp.github.io/PythonDataScienceHandbook/04.02-simple-scatter-plots.html}{The Python Data Science Handbook}.
Since we are plotting more than 100,000 points and they are all the same size and color, we'll use \sphinxcode{\sphinxupquote{plot}}.
Here's a scatter plot with right ascension on the x\sphinxhyphen{}axis and declination on the y\sphinxhyphen{}axis, both ICRS coordinates in degrees.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{x} \PYG{o}{=} \PYG{n}{results}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ra}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{y} \PYG{o}{=} \PYG{n}{results}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{dec}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{x}\PYG{p}{,} \PYG{n}{y}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ra (degree ICRS)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{dec (degree ICRS)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}\PYG{p}{;}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{03_motion_28_0}.png}
The arguments to \sphinxcode{\sphinxupquote{plt.plot}} are \sphinxcode{\sphinxupquote{x}}, \sphinxcode{\sphinxupquote{y}}, and a string that specifies the style. In this case, the letters \sphinxcode{\sphinxupquote{ko}} indicate that we want a black, round marker (\sphinxcode{\sphinxupquote{k}} is for black because \sphinxcode{\sphinxupquote{b}} is for blue).
The functions \sphinxcode{\sphinxupquote{xlabel}} and \sphinxcode{\sphinxupquote{ylabel}} put labels on the axes.
This scatter plot has a problem. It is “\sphinxhref{https://python-graph-gallery.com/134-how-to-avoid-overplotting-with-python/}{overplotted}”, which means that there are so many overlapping points that we can't distinguish between high\sphinxhyphen{}density and low\sphinxhyphen{}density areas.
To fix this, we can provide optional arguments to control the size and transparency of the points.
\sphinxstylestrong{Exercise:} In the call to \sphinxcode{\sphinxupquote{plt.plot}}, add the keyword argument \sphinxcode{\sphinxupquote{markersize=0.1}} to make the markers smaller.
Then add the argument \sphinxcode{\sphinxupquote{alpha=0.1}} to make the markers nearly transparent.
Adjust these arguments until you think the figure shows the data most clearly.
Note: Once you have made these changes, you might notice that the figure shows stripes with lower density of stars. These stripes are caused by the way Gaia scans the sky, which \sphinxhref{https://www.cosmos.esa.int/web/gaia/scanning-law}{you can read about here}. The dataset we are using, \sphinxhref{https://www.cosmos.esa.int/web/gaia/dr2}{Gaia Data Release 2}, covers 22 months of observations; during this time, some parts of the sky were scanned more than others.
\section{Transform back}
\label{\detokenize{03_motion:transform-back}}
Remember that we selected data from a rectangle of coordinates in the \sphinxcode{\sphinxupquote{GD1Koposov10}} frame, then transformed them to ICRS when we constructed the query.
The coordinates in \sphinxcode{\sphinxupquote{results}} are in ICRS.
To plot them, we will transform them back to the \sphinxcode{\sphinxupquote{GD1Koposov10}} frame; that way, the axes of the figure are aligned with GD\sphinxhyphen{}1, which will make it easy to select stars near the centerline of the stream.
To do that, we'll put the results into a \sphinxcode{\sphinxupquote{GaiaData}} object, provided by the \sphinxhref{https://pyia.readthedocs.io/en/latest/api/pyia.GaiaData.html}{pyia library}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{from} \PYG{n+nn}{pyia} \PYG{k+kn}{import} \PYG{n}{GaiaData}
\PYG{n}{gaia\PYGZus{}data} \PYG{o}{=} \PYG{n}{GaiaData}\PYG{p}{(}\PYG{n}{results}\PYG{p}{)}
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{gaia\PYGZus{}data}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
pyia.data.GaiaData
\end{sphinxVerbatim}
Now we can extract sky coordinates from the \sphinxcode{\sphinxupquote{GaiaData}} object, like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{astropy}\PYG{n+nn}{.}\PYG{n+nn}{units} \PYG{k}{as} \PYG{n+nn}{u}
\PYG{n}{skycoord} \PYG{o}{=} \PYG{n}{gaia\PYGZus{}data}\PYG{o}{.}\PYG{n}{get\PYGZus{}skycoord}\PYG{p}{(}
\PYG{n}{distance}\PYG{o}{=}\PYG{l+m+mi}{8}\PYG{o}{*}\PYG{n}{u}\PYG{o}{.}\PYG{n}{kpc}\PYG{p}{,}
\PYG{n}{radial\PYGZus{}velocity}\PYG{o}{=}\PYG{l+m+mi}{0}\PYG{o}{*}\PYG{n}{u}\PYG{o}{.}\PYG{n}{km}\PYG{o}{/}\PYG{n}{u}\PYG{o}{.}\PYG{n}{s}\PYG{p}{)}
\end{sphinxVerbatim}
We provide \sphinxcode{\sphinxupquote{distance}} and \sphinxcode{\sphinxupquote{radial\_velocity}} to prepare the data for reflex correction, which we explain below.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{skycoord}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
astropy.coordinates.sky\PYGZus{}coordinate.SkyCoord
\end{sphinxVerbatim}
The result is an Astropy \sphinxcode{\sphinxupquote{SkyCoord}} object (\sphinxhref{https://docs.astropy.org/en/stable/api/astropy.coordinates.SkyCoord.html\#astropy.coordinates.SkyCoord}{documentation here}), which provides \sphinxcode{\sphinxupquote{transform\_to}}, so we can transform the coordinates to other frames.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{gala}\PYG{n+nn}{.}\PYG{n+nn}{coordinates} \PYG{k}{as} \PYG{n+nn}{gc}
\PYG{n}{transformed} \PYG{o}{=} \PYG{n}{skycoord}\PYG{o}{.}\PYG{n}{transform\PYGZus{}to}\PYG{p}{(}\PYG{n}{gc}\PYG{o}{.}\PYG{n}{GD1Koposov10}\PYG{p}{)}
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{transformed}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
astropy.coordinates.sky\PYGZus{}coordinate.SkyCoord
\end{sphinxVerbatim}
The result is another \sphinxcode{\sphinxupquote{SkyCoord}} object, now in the \sphinxcode{\sphinxupquote{GD1Koposov10}} frame.
The next step is to correct the proper motion measurements from Gaia for reflex due to the motion of our solar system around the Galactic center.
When we created \sphinxcode{\sphinxupquote{skycoord}}, we provided \sphinxcode{\sphinxupquote{distance}} and \sphinxcode{\sphinxupquote{radial\_velocity}} as arguments, which means we ignore the measurements provided by Gaia and replace them with these fixed values.
That might seem like a strange thing to do, but here's the motivation:
\begin{itemize}
\item {}
Because the stars in GD\sphinxhyphen{}1 are so far away, the distance estimates we get from Gaia, which are based on parallax, are not very precise. So we replace them with our current best estimate of the mean distance to GD\sphinxhyphen{}1, about 8 kpc. See \sphinxhref{https://ui.adsabs.harvard.edu/abs/2010ApJ...712..260K/abstract}{Koposov, Rix, and Hogg, 2010}.
\item {}
For the other stars in the table, this distance estimate will be inaccurate, so reflex correction will not be correct. But that should have only a small effect on our ability to identify stars with the proper motion we expect for GD\sphinxhyphen{}1.
\item {}
The measurement of radial velocity has no effect on the correction for proper motion; the value we provide is arbitrary, but we have to provide a value to avoid errors in the reflex correction calculation.
\end{itemize}
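To get a feeling for why the parallax\sphinxhyphen{}based distances are imprecise, here is a rough check that is our own illustration, not part of the original analysis. It assumes the standard relation that parallax in milliarcseconds is the reciprocal of distance in kiloparsecs, and it compares the parallax expected at 8 kpc with the \sphinxcode{\sphinxupquote{parallax\_error}} values from the first five rows of the results (displayed later in this lesson). The variable names are hypothetical.

```python
# Rough check: how large is the parallax of a star 8 kpc away,
# compared to the parallax errors reported for stars in this dataset?

distance_kpc = 8.0                      # mean distance to GD-1 used in this lesson
expected_parallax = 1.0 / distance_kpc  # parallax in mas = 1 / (distance in kpc)

# parallax_error values (mas) copied from the first five rows of the results
parallax_errors = [0.823721, 0.297472, 0.544584, 1.059607, 0.486224]

for err in parallax_errors:
    ratio = err / expected_parallax
    print(f"error {err:.3f} mas is {ratio:.1f} times the expected parallax")
```

For these five stars, the reported error is several times larger than the parallax we would expect at 8 kpc, which is consistent with the decision to replace the Gaia\sphinxhyphen{}based distances with a fixed value.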
We are grateful to Adrian Price\sphinxhyphen{}Whelan for his help explaining this step in the analysis.
With this preparation, we can use \sphinxcode{\sphinxupquote{reflex\_correct}} from Gala (\sphinxhref{https://gala-astro.readthedocs.io/en/latest/api/gala.coordinates.reflex\_correct.html}{documentation here}) to correct for solar reflex motion.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{gd1\PYGZus{}coord} \PYG{o}{=} \PYG{n}{gc}\PYG{o}{.}\PYG{n}{reflex\PYGZus{}correct}\PYG{p}{(}\PYG{n}{transformed}\PYG{p}{)}
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{gd1\PYGZus{}coord}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
astropy.coordinates.sky\PYGZus{}coordinate.SkyCoord
\end{sphinxVerbatim}
The result is a \sphinxcode{\sphinxupquote{SkyCoord}} object that contains
\begin{itemize}
\item {}
The transformed coordinates as attributes named \sphinxcode{\sphinxupquote{phi1}} and \sphinxcode{\sphinxupquote{phi2}}, which represent right ascension and declination in the \sphinxcode{\sphinxupquote{GD1Koposov10}} frame.
\item {}
The transformed and corrected proper motions as \sphinxcode{\sphinxupquote{pm\_phi1\_cosphi2}} and \sphinxcode{\sphinxupquote{pm\_phi2}}.
\end{itemize}
We can select the coordinates like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{phi1} \PYG{o}{=} \PYG{n}{gd1\PYGZus{}coord}\PYG{o}{.}\PYG{n}{phi1}
\PYG{n}{phi2} \PYG{o}{=} \PYG{n}{gd1\PYGZus{}coord}\PYG{o}{.}\PYG{n}{phi2}
\end{sphinxVerbatim}
And plot them like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{phi1}\PYG{p}{,} \PYG{n}{phi2}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.1}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.2}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ra (degree GD1)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{dec (degree GD1)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}\PYG{p}{;}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{03_motion_45_0}.png}
Remember that we started with a rectangle in GD\sphinxhyphen{}1 coordinates. When transformed to ICRS, it's a non\sphinxhyphen{}rectangular polygon. Now that we have transformed back to GD\sphinxhyphen{}1 coordinates, it's a rectangle again.
\section{Pandas DataFrame}
\label{\detokenize{03_motion:pandas-dataframe}}
At this point we have three objects containing different subsets of the data.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{results}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
astropy.table.table.Table
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{gaia\PYGZus{}data}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
pyia.data.GaiaData
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{gd1\PYGZus{}coord}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
astropy.coordinates.sky\PYGZus{}coordinate.SkyCoord
\end{sphinxVerbatim}
On one hand, this makes sense, since each object provides different capabilities. But working with three different object types can be awkward.
It will be more convenient to choose one object and get all of the data into it. We'll use a Pandas DataFrame, for two reasons:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
It provides capabilities that are pretty much a superset of the other data structures, so it's the all\sphinxhyphen{}in\sphinxhyphen{}one solution.
\item {}
Pandas is a general\sphinxhyphen{}purpose tool that is useful in many domains, especially data science. If you are going to develop expertise in one tool, Pandas is a good choice.
\end{enumerate}
However, compared to an Astropy \sphinxcode{\sphinxupquote{Table}}, Pandas has one big drawback: it does not keep the metadata associated with the table, including the units for the columns.
It's easy to convert a \sphinxcode{\sphinxupquote{Table}} to a Pandas \sphinxcode{\sphinxupquote{DataFrame}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{pandas} \PYG{k}{as} \PYG{n+nn}{pd}
\PYG{n}{df} \PYG{o}{=} \PYG{n}{results}\PYG{o}{.}\PYG{n}{to\PYGZus{}pandas}\PYG{p}{(}\PYG{p}{)}
\PYG{n}{df}\PYG{o}{.}\PYG{n}{shape}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
(140340, 8)
\end{sphinxVerbatim}
\sphinxcode{\sphinxupquote{DataFrame}} provides \sphinxcode{\sphinxupquote{shape}}, which shows the number of rows and columns.
It also provides \sphinxcode{\sphinxupquote{head}}, which displays the first few rows. It is useful for spot\sphinxhyphen{}checking large results as you go along.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{df}\PYG{o}{.}\PYG{n}{head}\PYG{p}{(}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
source\PYGZus{}id ra dec pmra pmdec parallax \PYGZbs{}
0 637987125186749568 142.483019 21.757716 \PYGZhy{}2.516838 2.941813 \PYGZhy{}0.257345
1 638285195917112960 142.254529 22.476168 2.662702 \PYGZhy{}12.165984 0.422728
2 638073505568978688 142.645286 22.166932 18.306747 \PYGZhy{}7.950660 0.103640
3 638086386175786752 142.577394 22.227920 0.987786 \PYGZhy{}2.584105 \PYGZhy{}0.857327
4 638049655615392384 142.589136 22.110783 0.244439 \PYGZhy{}4.941079 0.099625
parallax\PYGZus{}error radial\PYGZus{}velocity
0 0.823721 1.000000e+20
1 0.297472 1.000000e+20
2 0.544584 1.000000e+20
3 1.059607 1.000000e+20
4 0.486224 1.000000e+20
\end{sphinxVerbatim}
Python detail: \sphinxcode{\sphinxupquote{shape}} is an attribute, so we can display its value without calling it as a function; \sphinxcode{\sphinxupquote{head}} is a method, so we need the parentheses to call it.
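The difference between an attribute and a function call is easier to see on a small example. The following sketch uses a tiny \sphinxcode{\sphinxupquote{DataFrame}} with made\sphinxhyphen{}up values; the names \sphinxcode{\sphinxupquote{toy}}, \sphinxcode{\sphinxupquote{n\_rows}}, \sphinxcode{\sphinxupquote{n\_cols}}, and \sphinxcode{\sphinxupquote{first\_two}} are hypothetical.

```python
import pandas as pd

# A tiny DataFrame with made-up values, just for illustration
toy = pd.DataFrame({'ra': [142.48, 142.25, 142.65],
                    'dec': [21.76, 22.48, 22.17]})

n_rows, n_cols = toy.shape   # shape is read without parentheses
first_two = toy.head(2)      # head is called, here asking for two rows

print(n_rows, n_cols)
print(first_two)
```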
Now we can extract the columns we want from \sphinxcode{\sphinxupquote{gd1\_coord}} and add them as columns in the \sphinxcode{\sphinxupquote{DataFrame}}. \sphinxcode{\sphinxupquote{phi1}} and \sphinxcode{\sphinxupquote{phi2}} contain the transformed coordinates.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{=} \PYG{n}{gd1\PYGZus{}coord}\PYG{o}{.}\PYG{n}{phi1}
\PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{=} \PYG{n}{gd1\PYGZus{}coord}\PYG{o}{.}\PYG{n}{phi2}
\PYG{n}{df}\PYG{o}{.}\PYG{n}{shape}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
(140340, 10)
\end{sphinxVerbatim}
\sphinxcode{\sphinxupquote{pm\_phi1\_cosphi2}} and \sphinxcode{\sphinxupquote{pm\_phi2}} contain the components of proper motion in the transformed frame.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pm\PYGZus{}phi1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{=} \PYG{n}{gd1\PYGZus{}coord}\PYG{o}{.}\PYG{n}{pm\PYGZus{}phi1\PYGZus{}cosphi2}
\PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pm\PYGZus{}phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{=} \PYG{n}{gd1\PYGZus{}coord}\PYG{o}{.}\PYG{n}{pm\PYGZus{}phi2}
\PYG{n}{df}\PYG{o}{.}\PYG{n}{shape}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
(140340, 12)
\end{sphinxVerbatim}
\sphinxstylestrong{Detail:} If you notice that \sphinxcode{\sphinxupquote{SkyCoord}} has an attribute called \sphinxcode{\sphinxupquote{proper\_motion}}, you might wonder why we are not using it.
We could have: \sphinxcode{\sphinxupquote{proper\_motion}} contains the same data as \sphinxcode{\sphinxupquote{pm\_phi1\_cosphi2}} and \sphinxcode{\sphinxupquote{pm\_phi2}}, but in a different format.
\section{Plot proper motion}
\label{\detokenize{03_motion:plot-proper-motion}}
Now we are ready to replicate one of the panels in Figure 1 of the Price\sphinxhyphen{}Whelan and Bonaca paper, the one that shows the components of proper motion as a scatter plot:
In this figure, the shaded area is a high\sphinxhyphen{}density region of stars with the proper motion we expect for stars in GD\sphinxhyphen{}1.
\begin{itemize}
\item {}
Due to the nature of tidal streams, we expect the proper motion for most stars to be along the axis of the stream; that is, we expect motion in the direction of \sphinxcode{\sphinxupquote{phi2}} to be near 0.
\item {}
In the direction of \sphinxcode{\sphinxupquote{phi1}}, we don't have a prior expectation for proper motion, except that it should form a cluster at a non\sphinxhyphen{}zero value.
\end{itemize}
To locate this cluster, we'll select stars near the centerline of GD\sphinxhyphen{}1 and plot their proper motion.
\section{Selecting the centerline}
\label{\detokenize{03_motion:selecting-the-centerline}}
As we can see in the following figure, many stars in GD\sphinxhyphen{}1 are less than 1 degree of declination from the line \sphinxcode{\sphinxupquote{phi2=0}}.
If we select stars near this line, they are more likely to be in GD\sphinxhyphen{}1.
We'll start by selecting the \sphinxcode{\sphinxupquote{phi2}} column from the \sphinxcode{\sphinxupquote{DataFrame}}:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{phi2} \PYG{o}{=} \PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{phi2}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
pandas.core.series.Series
\end{sphinxVerbatim}
The result is a \sphinxcode{\sphinxupquote{Series}}, which is the structure Pandas uses to represent columns.
We can use a comparison operator, \sphinxcode{\sphinxupquote{\textgreater{}}}, to compare the values in a \sphinxcode{\sphinxupquote{Series}} to a constant.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{phi2\PYGZus{}min} \PYG{o}{=} \PYG{o}{\PYGZhy{}}\PYG{l+m+mf}{1.0} \PYG{o}{*} \PYG{n}{u}\PYG{o}{.}\PYG{n}{deg}
\PYG{n}{phi2\PYGZus{}max} \PYG{o}{=} \PYG{l+m+mf}{1.0} \PYG{o}{*} \PYG{n}{u}\PYG{o}{.}\PYG{n}{deg}
\PYG{n}{mask} \PYG{o}{=} \PYG{p}{(}\PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{\PYGZgt{}} \PYG{n}{phi2\PYGZus{}min}\PYG{p}{)}
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{mask}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
pandas.core.series.Series
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{mask}\PYG{o}{.}\PYG{n}{dtype}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
dtype(\PYGZsq{}bool\PYGZsq{})
\end{sphinxVerbatim}
The result is a \sphinxcode{\sphinxupquote{Series}} of Boolean values, that is, \sphinxcode{\sphinxupquote{True}} and \sphinxcode{\sphinxupquote{False}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{mask}\PYG{o}{.}\PYG{n}{head}\PYG{p}{(}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
0 False
1 False
2 False
3 False
4 False
Name: phi2, dtype: bool
\end{sphinxVerbatim}
A Boolean \sphinxcode{\sphinxupquote{Series}} is sometimes called a “mask” because we can use it to mask out some of the rows in a \sphinxcode{\sphinxupquote{DataFrame}} and select the rest, like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{selected} \PYG{o}{=} \PYG{n}{df}\PYG{p}{[}\PYG{n}{mask}\PYG{p}{]}
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{selected}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
pandas.core.frame.DataFrame
\end{sphinxVerbatim}
\sphinxcode{\sphinxupquote{selected}} is a \sphinxcode{\sphinxupquote{DataFrame}} that contains only the rows from \sphinxcode{\sphinxupquote{df}} that correspond to \sphinxcode{\sphinxupquote{True}} values in \sphinxcode{\sphinxupquote{mask}}.
The previous mask selects all stars where \sphinxcode{\sphinxupquote{phi2}} exceeds \sphinxcode{\sphinxupquote{phi2\_min}}; now we'll select stars where \sphinxcode{\sphinxupquote{phi2}} falls between \sphinxcode{\sphinxupquote{phi2\_min}} and \sphinxcode{\sphinxupquote{phi2\_max}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{phi\PYGZus{}mask} \PYG{o}{=} \PYG{p}{(}\PYG{p}{(}\PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{\PYGZgt{}} \PYG{n}{phi2\PYGZus{}min}\PYG{p}{)} \PYG{o}{\PYGZam{}}
\PYG{p}{(}\PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{\PYGZlt{}} \PYG{n}{phi2\PYGZus{}max}\PYG{p}{)}\PYG{p}{)}
\end{sphinxVerbatim}
The \sphinxcode{\sphinxupquote{\&}} operator computes “logical AND”, which means the result is true where elements from both Boolean \sphinxcode{\sphinxupquote{Series}} are true.
The sum of a Boolean \sphinxcode{\sphinxupquote{Series}} is the number of \sphinxcode{\sphinxupquote{True}} values, so we can use \sphinxcode{\sphinxupquote{sum}} to see how many stars are in the selected region.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{phi\PYGZus{}mask}\PYG{o}{.}\PYG{n}{sum}\PYG{p}{(}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
25084
\end{sphinxVerbatim}
And we can use \sphinxcode{\sphinxupquote{phi\_mask}} to select stars near the centerline, which are more likely to be in GD\sphinxhyphen{}1.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{centerline} \PYG{o}{=} \PYG{n}{df}\PYG{p}{[}\PYG{n}{phi\PYGZus{}mask}\PYG{p}{]}
\PYG{n+nb}{len}\PYG{p}{(}\PYG{n}{centerline}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
25084
\end{sphinxVerbatim}
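The mask operations above can be tried on a small example. This is a minimal sketch with made\sphinxhyphen{}up values (the names \sphinxcode{\sphinxupquote{toy}}, \sphinxcode{\sphinxupquote{toy\_mask}}, \sphinxcode{\sphinxupquote{n\_true}}, and \sphinxcode{\sphinxupquote{near\_line}} are hypothetical); it builds a mask from two comparisons, counts the \sphinxcode{\sphinxupquote{True}} values, and uses the mask to select rows.

```python
import pandas as pd

# Made-up phi2 values in degrees, just for illustration
toy = pd.DataFrame({'phi2': [-2.5, -0.4, 0.0, 0.9, 3.1]})

# Each comparison yields a Boolean Series; & combines them elementwise
toy_mask = (toy['phi2'] > -1.0) & (toy['phi2'] < 1.0)

n_true = toy_mask.sum()     # True counts as 1, False as 0
near_line = toy[toy_mask]   # keep only the rows where the mask is True

print(toy_mask.tolist())
print(n_true, len(near_line))
```

The parentheses around each comparison are needed because \sphinxcode{\sphinxupquote{\&}} has higher precedence than the comparison operators in Python.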
Here's a scatter plot of proper motion for the selected stars.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{pm1} \PYG{o}{=} \PYG{n}{centerline}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pm\PYGZus{}phi1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{pm2} \PYG{o}{=} \PYG{n}{centerline}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pm\PYGZus{}phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{pm1}\PYG{p}{,} \PYG{n}{pm2}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.1}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.1}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{Proper motion phi1 (GD1 frame)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{Proper motion phi2 (GD1 frame)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}\PYG{p}{;}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{03_motion_79_0}.png}
Looking at these results, we see a large cluster around (0, 0), and a smaller cluster near (\sphinxhyphen{}8, 0).
We can use \sphinxcode{\sphinxupquote{xlim}} and \sphinxcode{\sphinxupquote{ylim}} to set the limits on the axes and zoom in on the region near (0, 0).
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{pm1} \PYG{o}{=} \PYG{n}{centerline}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pm\PYGZus{}phi1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{pm2} \PYG{o}{=} \PYG{n}{centerline}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pm\PYGZus{}phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{pm1}\PYG{p}{,} \PYG{n}{pm2}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{Proper motion phi1 (GD1 frame)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{Proper motion phi2 (GD1 frame)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlim}\PYG{p}{(}\PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{12}\PYG{p}{,} \PYG{l+m+mi}{8}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylim}\PYG{p}{(}\PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{10}\PYG{p}{,} \PYG{l+m+mi}{10}\PYG{p}{)}\PYG{p}{;}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{03_motion_81_0}.png}
Now we can see the smaller cluster more clearly.
You might notice that our figure is less dense than the one in the paper. That's because we started with a set of stars from a relatively small region. The figure in the paper is based on a region about 10 times bigger.
In the next lesson we'll go back and select stars from a larger region. But first we'll use the proper motion data to identify stars likely to be in GD\sphinxhyphen{}1.
\section{Filtering based on proper motion}
\label{\detokenize{03_motion:filtering-based-on-proper-motion}}
The next step is to select stars in the “overdense” region of proper motion, which are candidates to be in GD\sphinxhyphen{}1.
In the original paper, Price\sphinxhyphen{}Whelan and Bonaca used a polygon to cover this region, as shown in this figure.
We'll use a simple rectangle for now, but in a later lesson we'll see how to select a polygonal region as well.
Here are bounds on proper motion we chose by eye:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{pm1\PYGZus{}min} \PYG{o}{=} \PYG{o}{\PYGZhy{}}\PYG{l+m+mf}{8.9}
\PYG{n}{pm1\PYGZus{}max} \PYG{o}{=} \PYG{o}{\PYGZhy{}}\PYG{l+m+mf}{6.9}
\PYG{n}{pm2\PYGZus{}min} \PYG{o}{=} \PYG{o}{\PYGZhy{}}\PYG{l+m+mf}{2.2}
\PYG{n}{pm2\PYGZus{}max} \PYG{o}{=} \PYG{l+m+mf}{1.0}
\end{sphinxVerbatim}
To draw these bounds, we'll make two lists containing the coordinates of the corners of the rectangle.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{pm1\PYGZus{}rect} \PYG{o}{=} \PYG{p}{[}\PYG{n}{pm1\PYGZus{}min}\PYG{p}{,} \PYG{n}{pm1\PYGZus{}min}\PYG{p}{,} \PYG{n}{pm1\PYGZus{}max}\PYG{p}{,} \PYG{n}{pm1\PYGZus{}max}\PYG{p}{,} \PYG{n}{pm1\PYGZus{}min}\PYG{p}{]} \PYG{o}{*} \PYG{n}{u}\PYG{o}{.}\PYG{n}{mas}\PYG{o}{/}\PYG{n}{u}\PYG{o}{.}\PYG{n}{yr}
\PYG{n}{pm2\PYGZus{}rect} \PYG{o}{=} \PYG{p}{[}\PYG{n}{pm2\PYGZus{}min}\PYG{p}{,} \PYG{n}{pm2\PYGZus{}max}\PYG{p}{,} \PYG{n}{pm2\PYGZus{}max}\PYG{p}{,} \PYG{n}{pm2\PYGZus{}min}\PYG{p}{,} \PYG{n}{pm2\PYGZus{}min}\PYG{p}{]} \PYG{o}{*} \PYG{n}{u}\PYG{o}{.}\PYG{n}{mas}\PYG{o}{/}\PYG{n}{u}\PYG{o}{.}\PYG{n}{yr}
\end{sphinxVerbatim}
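The order of the corners matters: each point has to be adjacent to the next one, and the first corner is repeated at the end so that the outline closes. As a sketch of the same idea in plain Python, without units, the following hypothetical helper \sphinxcode{\sphinxupquote{rectangle\_corners}} (not part of any library) builds the two lists in the same order used above.

```python
def rectangle_corners(x_min, x_max, y_min, y_max):
    """Return x and y lists that trace a closed rectangle.

    The first corner is repeated at the end so a line plot
    draws all four sides.
    """
    xs = [x_min, x_min, x_max, x_max, x_min]
    ys = [y_min, y_max, y_max, y_min, y_min]
    return xs, ys

# The same bounds as in the lesson, as plain numbers
xs, ys = rectangle_corners(-8.9, -6.9, -2.2, 1.0)
corners = list(zip(xs, ys))
print(corners)
```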
Here's what the plot looks like with the bounds we chose.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{pm1}\PYG{p}{,} \PYG{n}{pm2}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{pm1\PYGZus{}rect}\PYG{p}{,} \PYG{n}{pm2\PYGZus{}rect}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{\PYGZhy{}}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{Proper motion phi1 (GD1 frame)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{Proper motion phi2 (GD1 frame)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlim}\PYG{p}{(}\PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{12}\PYG{p}{,} \PYG{l+m+mi}{8}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylim}\PYG{p}{(}\PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{10}\PYG{p}{,} \PYG{l+m+mi}{10}\PYG{p}{)}\PYG{p}{;}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{03_motion_88_0}.png}
To select rows that fall within these bounds, we'll use the following function, which uses Pandas operators to make a mask that selects rows where \sphinxcode{\sphinxupquote{series}} falls between \sphinxcode{\sphinxupquote{low}} and \sphinxcode{\sphinxupquote{high}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k}{def} \PYG{n+nf}{between}\PYG{p}{(}\PYG{n}{series}\PYG{p}{,} \PYG{n}{low}\PYG{p}{,} \PYG{n}{high}\PYG{p}{)}\PYG{p}{:}
\PYG{l+s+sd}{\PYGZdq{}\PYGZdq{}\PYGZdq{}Make a Boolean Series.}
\PYG{l+s+sd}{ }
\PYG{l+s+sd}{ series: Pandas Series}
\PYG{l+s+sd}{ low: lower bound}
\PYG{l+s+sd}{ high: upper bound}
\PYG{l+s+sd}{ }
\PYG{l+s+sd}{ returns: Boolean Series}
\PYG{l+s+sd}{ \PYGZdq{}\PYGZdq{}\PYGZdq{}}
\PYG{k}{return} \PYG{p}{(}\PYG{n}{series} \PYG{o}{\PYGZgt{}} \PYG{n}{low}\PYG{p}{)} \PYG{o}{\PYGZam{}} \PYG{p}{(}\PYG{n}{series} \PYG{o}{\PYGZlt{}} \PYG{n}{high}\PYG{p}{)}
\end{sphinxVerbatim}
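To check how this function treats values that fall exactly on a bound, here is a small test with made\sphinxhyphen{}up values. The function is repeated so the example is self\sphinxhyphen{}contained; the names \sphinxcode{\sphinxupquote{pm\_toy}} and \sphinxcode{\sphinxupquote{mask\_toy}} are hypothetical.

```python
import pandas as pd

def between(series, low, high):
    """Make a Boolean Series that is True where low < series < high."""
    return (series > low) & (series < high)

# Made-up proper motion values; the first and fourth sit exactly on the bounds
pm_toy = pd.Series([-8.9, -8.0, -7.2, -6.9, 0.3])

mask_toy = between(pm_toy, -8.9, -6.9)
print(mask_toy.tolist())
```

Because the function uses strict inequalities, values equal to \sphinxcode{\sphinxupquote{low}} or \sphinxcode{\sphinxupquote{high}} are excluded. Pandas also provides a built\sphinxhyphen{}in \sphinxcode{\sphinxupquote{Series.between}} method, whose treatment of the endpoints is controlled by its \sphinxcode{\sphinxupquote{inclusive}} parameter.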
The following mask selects stars with proper motion in the region we chose.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{pm\PYGZus{}mask} \PYG{o}{=} \PYG{p}{(}\PYG{n}{between}\PYG{p}{(}\PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pm\PYGZus{}phi1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{p}{,} \PYG{n}{pm1\PYGZus{}min}\PYG{p}{,} \PYG{n}{pm1\PYGZus{}max}\PYG{p}{)} \PYG{o}{\PYGZam{}}
\PYG{n}{between}\PYG{p}{(}\PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pm\PYGZus{}phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{p}{,} \PYG{n}{pm2\PYGZus{}min}\PYG{p}{,} \PYG{n}{pm2\PYGZus{}max}\PYG{p}{)}\PYG{p}{)}
\end{sphinxVerbatim}
Again, the sum of a Boolean series is the number of \sphinxcode{\sphinxupquote{True}} values.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{pm\PYGZus{}mask}\PYG{o}{.}\PYG{n}{sum}\PYG{p}{(}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
1049
\end{sphinxVerbatim}
Now we can use this mask to select rows from \sphinxcode{\sphinxupquote{df}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{selected} \PYG{o}{=} \PYG{n}{df}\PYG{p}{[}\PYG{n}{pm\PYGZus{}mask}\PYG{p}{]}
\PYG{n+nb}{len}\PYG{p}{(}\PYG{n}{selected}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
1049
\end{sphinxVerbatim}
These are the stars we think are likely to be in GD\sphinxhyphen{}1. Let's see what they look like, plotting their coordinates (not their proper motion).
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{phi1} \PYG{o}{=} \PYG{n}{selected}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{phi2} \PYG{o}{=} \PYG{n}{selected}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{phi1}\PYG{p}{,} \PYG{n}{phi2}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.5}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.5}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ra (degree GD1)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{dec (degree GD1)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}\PYG{p}{;}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{03_motion_98_0}.png}
Now that's starting to look like a tidal stream!
\section{Saving the DataFrame}
\label{\detokenize{03_motion:saving-the-dataframe}}
At this point we have run a successful query and cleaned up the results; this is a good time to save the data.
To save a Pandas \sphinxcode{\sphinxupquote{DataFrame}}, one option is to convert it to an Astropy \sphinxcode{\sphinxupquote{Table}}, like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{selected\PYGZus{}table} \PYG{o}{=} \PYG{n}{Table}\PYG{o}{.}\PYG{n}{from\PYGZus{}pandas}\PYG{p}{(}\PYG{n}{selected}\PYG{p}{)}
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{selected\PYGZus{}table}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
astropy.table.table.Table
\end{sphinxVerbatim}
Then we could write the \sphinxcode{\sphinxupquote{Table}} to a FITS file, as we did in the previous lesson.
But Pandas provides functions to write DataFrames in other formats; to see what they are, \sphinxhref{https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html}{find the functions here that begin with \sphinxcode{\sphinxupquote{to\_}}}.
One of the best options is HDF5, which is Version 5 of \sphinxhref{https://en.wikipedia.org/wiki/Hierarchical\_Data\_Format}{Hierarchical Data Format}.
HDF5 is a binary format, so files are small and fast to read and write (like FITS, but unlike XML).
An HDF5 file is similar to an SQL database in the sense that it can contain more than one table, although in HDF5 vocabulary, a table is called a Dataset. (\sphinxhref{https://www.stsci.edu/itt/review/dhb\_2011/Intro/intro\_ch23.html}{Multi\sphinxhyphen{}extension FITS files} can also contain more than one table.)
And HDF5 stores the metadata associated with the table, including column names, row labels, and data types (like FITS).
Finally, HDF5 is a cross\sphinxhyphen{}language standard, so if you write an HDF5 file with Pandas, you can read it back with many other software tools (more than FITS).
Before we write the HDF5 file, let's delete the old one, if it exists.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{o}{!}rm \PYGZhy{}f gd1\PYGZus{}dataframe.hdf5
\end{sphinxVerbatim}
We can write a Pandas \sphinxcode{\sphinxupquote{DataFrame}} to an HDF5 file like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}dataframe.hdf5}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{df}\PYG{o}{.}\PYG{n}{to\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{df}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
Because an HDF5 file can contain more than one Dataset, we have to provide a name, or “key”, that identifies the Dataset in the file.
We could use any string as the key, but in this example I use the variable name \sphinxcode{\sphinxupquote{df}}.
\sphinxstylestrong{Exercise:} We're going to need \sphinxcode{\sphinxupquote{centerline}} and \sphinxcode{\sphinxupquote{selected}} later as well. Write a line or two of code to add them as additional Datasets in the HDF5 file.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} Solution}
\PYG{n}{centerline}\PYG{o}{.}\PYG{n}{to\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{centerline}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{selected}\PYG{o}{.}\PYG{n}{to\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{selected}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
\sphinxstylestrong{Detail:} Reading and writing HDF5 tables requires a library called \sphinxcode{\sphinxupquote{PyTables}} that is not always installed with Pandas. You can install it with pip like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{pip} \PYG{n}{install} \PYG{n}{tables}
\end{sphinxVerbatim}
If you install it using Conda, the name of the package is \sphinxcode{\sphinxupquote{pytables}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{conda} \PYG{n}{install} \PYG{n}{pytables}
\end{sphinxVerbatim}
We can use \sphinxcode{\sphinxupquote{ls}} to confirm that the file exists and check the size:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{o}{!}ls \PYGZhy{}lh gd1\PYGZus{}dataframe.hdf5
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZhy{}rw\PYGZhy{}rw\PYGZhy{}r\PYGZhy{}\PYGZhy{} 1 downey downey 17M Oct 19 12:05 gd1\PYGZus{}dataframe.hdf5
\end{sphinxVerbatim}
If you are using Windows, \sphinxcode{\sphinxupquote{ls}} might not work; in that case, try:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
!dir gd1\PYGZus{}dataframe.hdf5
\end{sphinxVerbatim}
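If you would rather not depend on shell commands, the same check can be done in Python. The following is a minimal sketch that uses only the standard library; the helper name \sphinxcode{\sphinxupquote{file\_size\_mb}} is made up for this example and is not part of Pandas.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
import os

def file_size_mb(path):
    """Return the size of a file in megabytes, or None if it is missing."""
    if not os.path.exists(path):
        return None
    # getsize reports bytes; divide by 1e6 to get (decimal) megabytes
    return os.path.getsize(path) / 1e6

size = file_size_mb('gd1_dataframe.hdf5')
print('file not found' if size is None else round(size, 1))
\end{sphinxVerbatim}
This behaves the same way on Windows, macOS, and Linux, which is the main reason to prefer it over \sphinxcode{\sphinxupquote{ls}} or \sphinxcode{\sphinxupquote{dir}}.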
We can read the file back like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{read\PYGZus{}back\PYGZus{}df} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{read\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{df}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{read\PYGZus{}back\PYGZus{}df}\PYG{o}{.}\PYG{n}{shape}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
(140340, 12)
\end{sphinxVerbatim}
Pandas can write a variety of other formats, \sphinxhref{https://pandas.pydata.org/pandas-docs/stable/user\_guide/io.html}{which you can read about here}.
\section{Summary}
\label{\detokenize{03_motion:summary}}
In this lesson, we re\sphinxhyphen{}loaded the Gaia data we saved from a previous query.
We transformed the coordinates and proper motion from ICRS to a frame aligned with GD\sphinxhyphen{}1, and stored the results in a Pandas \sphinxcode{\sphinxupquote{DataFrame}}.
Then we replicated the selection process from the Price\sphinxhyphen{}Whelan and Bonaca paper:
\begin{itemize}
\item {}
We selected stars near the centerline of GD\sphinxhyphen{}1 and made a scatter plot of their proper motion.
\item {}
We identified a region of proper motion that contains stars likely to be in GD\sphinxhyphen{}1.
\item {}
We used a Boolean \sphinxcode{\sphinxupquote{Series}} as a mask to select stars whose proper motion is in that region.
\end{itemize}
So far, we have used data from a relatively small region of the sky. In the next lesson, we'll write a query that selects stars based on proper motion, which will allow us to explore a larger region.
\section{Best practices}
\label{\detokenize{03_motion:best-practices}}\begin{itemize}
\item {}
When you make a scatter plot, adjust the size of the markers and their transparency so the figure is not overplotted; otherwise it can misrepresent the data badly.
\item {}
For simple scatter plots in Matplotlib, \sphinxcode{\sphinxupquote{plot}} is faster than \sphinxcode{\sphinxupquote{scatter}}.
\item {}
An Astropy \sphinxcode{\sphinxupquote{Table}} and a Pandas \sphinxcode{\sphinxupquote{DataFrame}} are similar in many ways and they provide many of the same functions. They have pros and cons, but for many projects, either one would be a reasonable choice.
\end{itemize}
\chapter{Chapter 4}
\label{\detokenize{04_select:chapter-4}}\label{\detokenize{04_select::doc}}
This is the fourth in a series of notebooks related to astronomy data.
As a running example, we are replicating parts of the analysis in a recent paper, “\sphinxhref{https://arxiv.org/abs/1805.00425}{Off the beaten path: Gaia reveals GD\sphinxhyphen{}1 stars outside of the main stream}” by Adrian M. Price\sphinxhyphen{}Whelan and Ana Bonaca.
In the first lesson, we wrote ADQL queries and used them to select and download data from the Gaia server.
In the second lesson, we wrote a query to select stars from the region of the sky where we expect GD\sphinxhyphen{}1 to be, and saved the results in a FITS file.
In the third lesson, we read that data back and identified stars with the proper motion we expect for GD\sphinxhyphen{}1.
\section{Outline}
\label{\detokenize{04_select:outline}}
Here are the steps in this lesson:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
Using data from the previous lesson, we'll identify the values of proper motion for stars likely to be in GD\sphinxhyphen{}1.
\item {}
Then we'll compose an ADQL query that selects stars based on proper motion, so we can download only the data we need.
\item {}
We'll also see how to write the results to a CSV file.
\end{enumerate}
That will make it possible to search a bigger region of the sky in a single query.
After completing this lesson, you should be able to
\begin{itemize}
\item {}
Convert proper motion between frames.
\item {}
Write an ADQL query that selects based on proper motion.
\end{itemize}
\section{Installing libraries}
\label{\detokenize{04_select:installing-libraries}}
If you are running this notebook on Colab, you can run the following cell to install Astroquery and the other libraries we'll use.
If you are running this notebook on your own computer, you might have to install these libraries yourself.
If you are using this notebook as part of a Carpentries workshop, you should have received setup instructions.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} If we\PYGZsq{}re running on Colab, install libraries}
\PYG{k+kn}{import} \PYG{n+nn}{sys}
\PYG{n}{IN\PYGZus{}COLAB} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{google.colab}\PYG{l+s+s1}{\PYGZsq{}} \PYG{o+ow}{in} \PYG{n}{sys}\PYG{o}{.}\PYG{n}{modules}
\PYG{k}{if} \PYG{n}{IN\PYGZus{}COLAB}\PYG{p}{:}
    \PYG{o}{!}pip install astroquery astro\PYGZhy{}gala pyia python\PYGZhy{}wget
\end{sphinxVerbatim}
\section{Reload the data}
\label{\detokenize{04_select:reload-the-data}}
The following cells download the data from the previous lesson, if necessary, and load it into a Pandas \sphinxcode{\sphinxupquote{DataFrame}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{os}
\PYG{k+kn}{from} \PYG{n+nn}{wget} \PYG{k+kn}{import} \PYG{n}{download}
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}dataframe.hdf5}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{path} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{https://github.com/AllenDowney/AstronomicalData/raw/main/data/}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{k}{if} \PYG{o+ow}{not} \PYG{n}{os}\PYG{o}{.}\PYG{n}{path}\PYG{o}{.}\PYG{n}{exists}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{download}\PYG{p}{(}\PYG{n}{path}\PYG{o}{+}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{)}
\end{sphinxVerbatim}
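The cell above relies on the third\sphinxhyphen{}party \sphinxcode{\sphinxupquote{wget}} package. If that package is not available, a similar download\sphinxhyphen{}if\sphinxhyphen{}missing step can be sketched with the standard library's \sphinxcode{\sphinxupquote{urllib.request.urlretrieve}}. The helper name \sphinxcode{\sphinxupquote{download\_if\_missing}} and its \sphinxcode{\sphinxupquote{fetch}} parameter are assumptions made for this example, not part of the lesson's code.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
import os
from urllib.request import urlretrieve

def download_if_missing(url, filename, fetch=urlretrieve):
    """Download url to filename unless the file already exists.

    Returns True if a download happened, False if the file was already there.
    The fetch argument exists so the function can be tried without a network.
    """
    if os.path.exists(filename):
        return False
    fetch(url, filename)
    return True
\end{sphinxVerbatim}
With the variables defined above, the call would be \sphinxcode{\sphinxupquote{download\_if\_missing(path + filename, filename)}}.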
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{pandas} \PYG{k}{as} \PYG{n+nn}{pd}
\PYG{n}{df} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{read\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{df}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{centerline} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{read\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{centerline}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{selected} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{read\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{selected}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
\section{Selection by proper motion}
\label{\detokenize{04_select:selection-by-proper-motion}}
At this point we have downloaded data for a relatively large number of stars (more than 100,000) and selected a relatively small number (around 1000).
It would be more efficient to use ADQL to select only the stars we need. That would also make it possible to download data covering a larger region of the sky.
However, the selection we did was based on proper motion in the \sphinxcode{\sphinxupquote{GD1Koposov10}} frame. In order to do the same selection in ADQL, we have to work with proper motions in ICRS.
As a reminder, here's the rectangle we selected based on proper motion in the \sphinxcode{\sphinxupquote{GD1Koposov10}} frame.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{pm1\PYGZus{}min} \PYG{o}{=} \PYG{o}{\PYGZhy{}}\PYG{l+m+mf}{8.9}
\PYG{n}{pm1\PYGZus{}max} \PYG{o}{=} \PYG{o}{\PYGZhy{}}\PYG{l+m+mf}{6.9}
\PYG{n}{pm2\PYGZus{}min} \PYG{o}{=} \PYG{o}{\PYGZhy{}}\PYG{l+m+mf}{2.2}
\PYG{n}{pm2\PYGZus{}max} \PYG{o}{=} \PYG{l+m+mf}{1.0}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{astropy}\PYG{n+nn}{.}\PYG{n+nn}{units} \PYG{k}{as} \PYG{n+nn}{u}
\PYG{n}{pm1\PYGZus{}rect} \PYG{o}{=} \PYG{p}{[}\PYG{n}{pm1\PYGZus{}min}\PYG{p}{,} \PYG{n}{pm1\PYGZus{}min}\PYG{p}{,} \PYG{n}{pm1\PYGZus{}max}\PYG{p}{,} \PYG{n}{pm1\PYGZus{}max}\PYG{p}{,} \PYG{n}{pm1\PYGZus{}min}\PYG{p}{]} \PYG{o}{*} \PYG{n}{u}\PYG{o}{.}\PYG{n}{mas}\PYG{o}{/}\PYG{n}{u}\PYG{o}{.}\PYG{n}{yr}
\PYG{n}{pm2\PYGZus{}rect} \PYG{o}{=} \PYG{p}{[}\PYG{n}{pm2\PYGZus{}min}\PYG{p}{,} \PYG{n}{pm2\PYGZus{}max}\PYG{p}{,} \PYG{n}{pm2\PYGZus{}max}\PYG{p}{,} \PYG{n}{pm2\PYGZus{}min}\PYG{p}{,} \PYG{n}{pm2\PYGZus{}min}\PYG{p}{]} \PYG{o}{*} \PYG{n}{u}\PYG{o}{.}\PYG{n}{mas}\PYG{o}{/}\PYG{n}{u}\PYG{o}{.}\PYG{n}{yr}
\end{sphinxVerbatim}
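The two lists above visit the corners of the rectangle in order and then repeat the first corner, so that a line drawn through the points closes the outline. As a small sketch of that pattern, here is a helper that builds both lists from the four bounds; the name \sphinxcode{\sphinxupquote{closed\_rectangle}} is made up for this example, and it leaves out the Astropy units.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
def closed_rectangle(x_min, x_max, y_min, y_max):
    """Return x and y lists that trace a rectangle and return to the start."""
    # Corner order: lower left, upper left, upper right, lower right,
    # then back to lower left so the plotted outline is closed.
    xs = [x_min, x_min, x_max, x_max, x_min]
    ys = [y_min, y_max, y_max, y_min, y_min]
    return xs, ys

xs, ys = closed_rectangle(-8.9, -6.9, -2.2, 1.0)
print(xs)
print(ys)
\end{sphinxVerbatim}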
The following figure shows:
\begin{itemize}
\item {}
Proper motion for the stars we selected along the center line of GD\sphinxhyphen{}1,
\item {}
The rectangle we selected, and
\item {}
The stars inside the rectangle highlighted in green.
\end{itemize}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{matplotlib}\PYG{n+nn}{.}\PYG{n+nn}{pyplot} \PYG{k}{as} \PYG{n+nn}{plt}
\PYG{n}{pm1} \PYG{o}{=} \PYG{n}{centerline}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pm\PYGZus{}phi1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{pm2} \PYG{o}{=} \PYG{n}{centerline}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pm\PYGZus{}phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{pm1}\PYG{p}{,} \PYG{n}{pm2}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{)}
\PYG{n}{pm1} \PYG{o}{=} \PYG{n}{selected}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pm\PYGZus{}phi1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{pm2} \PYG{o}{=} \PYG{n}{selected}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pm\PYGZus{}phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{pm1}\PYG{p}{,} \PYG{n}{pm2}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gx}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{pm1\PYGZus{}rect}\PYG{p}{,} \PYG{n}{pm2\PYGZus{}rect}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{\PYGZhy{}}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{Proper motion phi1 (GD1 frame)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{Proper motion phi2 (GD1 frame)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlim}\PYG{p}{(}\PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{12}\PYG{p}{,} \PYG{l+m+mi}{8}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylim}\PYG{p}{(}\PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{10}\PYG{p}{,} \PYG{l+m+mi}{10}\PYG{p}{)}\PYG{p}{;}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{04_select_11_0}.png}
Now we'll make the same plot using proper motions in the ICRS frame, which are stored in columns \sphinxcode{\sphinxupquote{pmra}} and \sphinxcode{\sphinxupquote{pmdec}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{pm1} \PYG{o}{=} \PYG{n}{centerline}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pmra}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{pm2} \PYG{o}{=} \PYG{n}{centerline}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pmdec}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{pm1}\PYG{p}{,} \PYG{n}{pm2}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{)}
\PYG{n}{pm1} \PYG{o}{=} \PYG{n}{selected}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pmra}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{pm2} \PYG{o}{=} \PYG{n}{selected}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pmdec}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{pm1}\PYG{p}{,} \PYG{n}{pm2}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gx}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mi}{1}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{Proper motion ra (ICRS frame)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{Proper motion dec (ICRS frame)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlim}\PYG{p}{(}\PYG{p}{[}\PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{10}\PYG{p}{,} \PYG{l+m+mi}{5}\PYG{p}{]}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylim}\PYG{p}{(}\PYG{p}{[}\PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{20}\PYG{p}{,} \PYG{l+m+mi}{5}\PYG{p}{]}\PYG{p}{)}\PYG{p}{;}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{04_select_13_0}.png}
The proper motions of the selected stars are more spread out in this frame, which is why it was preferable to do the selection in the GD\sphinxhyphen{}1 frame.
But now we can define a polygon that encloses the proper motions of these stars in ICRS,
and use the polygon as a selection criterion in an ADQL query.
SciPy provides a function that computes the \sphinxhref{https://en.wikipedia.org/wiki/Convex\_hull}{convex hull} of a set of points, which is the smallest convex polygon that contains all of the points.
To use it, I'll select columns \sphinxcode{\sphinxupquote{pmra}} and \sphinxcode{\sphinxupquote{pmdec}} and convert them to a NumPy array.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{numpy} \PYG{k}{as} \PYG{n+nn}{np}
\PYG{n}{points} \PYG{o}{=} \PYG{n}{selected}\PYG{p}{[}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pmra}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pmdec}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{p}{]}\PYG{o}{.}\PYG{n}{to\PYGZus{}numpy}\PYG{p}{(}\PYG{p}{)}
\PYG{n}{points}\PYG{o}{.}\PYG{n}{shape}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
(1049, 2)
\end{sphinxVerbatim}
We'll pass the points to \sphinxcode{\sphinxupquote{ConvexHull}}, which returns an object that contains the results.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{from} \PYG{n+nn}{scipy}\PYG{n+nn}{.}\PYG{n+nn}{spatial} \PYG{k+kn}{import} \PYG{n}{ConvexHull}
\PYG{n}{hull} \PYG{o}{=} \PYG{n}{ConvexHull}\PYG{p}{(}\PYG{n}{points}\PYG{p}{)}
\PYG{n}{hull}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}scipy.spatial.qhull.ConvexHull at 0x7f446b1e8bb0\PYGZgt{}
\end{sphinxVerbatim}
\sphinxcode{\sphinxupquote{hull.vertices}} contains the indices of the points that fall on the perimeter of the hull.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{hull}\PYG{o}{.}\PYG{n}{vertices}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
array([ 692, 873, 141, 303, 42, 622, 45, 83, 127, 182, 1006,
971, 967, 1001, 969, 940], dtype=int32)
\end{sphinxVerbatim}
We can use them as an index into the original array to select the corresponding rows.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{pm\PYGZus{}vertices} \PYG{o}{=} \PYG{n}{points}\PYG{p}{[}\PYG{n}{hull}\PYG{o}{.}\PYG{n}{vertices}\PYG{p}{]}
\PYG{n}{pm\PYGZus{}vertices}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
array([[ \PYGZhy{}4.05037121, \PYGZhy{}14.75623261],
[ \PYGZhy{}3.41981085, \PYGZhy{}14.72365546],
[ \PYGZhy{}3.03521988, \PYGZhy{}14.44357135],
[ \PYGZhy{}2.26847919, \PYGZhy{}13.7140236 ],
[ \PYGZhy{}2.61172203, \PYGZhy{}13.24797471],
[ \PYGZhy{}2.73471401, \PYGZhy{}13.09054471],
[ \PYGZhy{}3.19923146, \PYGZhy{}12.5942653 ],
[ \PYGZhy{}3.34082546, \PYGZhy{}12.47611926],
[ \PYGZhy{}5.67489413, \PYGZhy{}11.16083338],
[ \PYGZhy{}5.95159272, \PYGZhy{}11.10547884],
[ \PYGZhy{}6.42394023, \PYGZhy{}11.05981295],
[ \PYGZhy{}7.09631023, \PYGZhy{}11.95187806],
[ \PYGZhy{}7.30641519, \PYGZhy{}12.24559977],
[ \PYGZhy{}7.04016696, \PYGZhy{}12.88580702],
[ \PYGZhy{}6.00347705, \PYGZhy{}13.75912098],
[ \PYGZhy{}4.42442296, \PYGZhy{}14.74641176]])
\end{sphinxVerbatim}
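To build some intuition for what a convex hull computation does, here is a pure\sphinxhyphen{}Python sketch of one well\sphinxhyphen{}known method, Andrew's monotone chain. It is not the algorithm SciPy uses (SciPy wraps the Qhull library), and the function names are made up for this example. It takes a list of \sphinxcode{\sphinxupquote{(x, y)}} tuples and returns the hull vertices in counterclockwise order.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
def cross(o, a, b):
    """Z component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Return the convex hull of 2-D points, counterclockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    # Build the lower hull from left to right, dropping any point
    # that would make a clockwise (or straight) turn.
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    # Build the upper hull from right to left in the same way.
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]

square_plus_center = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]
print(convex_hull(square_plus_center))
\end{sphinxVerbatim}
The interior point \sphinxcode{\sphinxupquote{(0.5, 0.5)}} is dropped and only the four corners remain, which is the same role \sphinxcode{\sphinxupquote{hull.vertices}} plays for the proper motions of the selected stars.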
To plot the resulting polygon, we have to pull out the x and y coordinates.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{pmra\PYGZus{}poly}\PYG{p}{,} \PYG{n}{pmdec\PYGZus{}poly} \PYG{o}{=} \PYG{n}{np}\PYG{o}{.}\PYG{n}{transpose}\PYG{p}{(}\PYG{n}{pm\PYGZus{}vertices}\PYG{p}{)}
\end{sphinxVerbatim}
The following figure shows proper motion in ICRS again, along with the convex hull we just computed.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{pm1} \PYG{o}{=} \PYG{n}{centerline}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pmra}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{pm2} \PYG{o}{=} \PYG{n}{centerline}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pmdec}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{pm1}\PYG{p}{,} \PYG{n}{pm2}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{)}
\PYG{n}{pm1} \PYG{o}{=} \PYG{n}{selected}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pmra}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{pm2} \PYG{o}{=} \PYG{n}{selected}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pmdec}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{pm1}\PYG{p}{,} \PYG{n}{pm2}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gx}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{pmra\PYGZus{}poly}\PYG{p}{,} \PYG{n}{pmdec\PYGZus{}poly}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{Proper motion ra (ICRS frame)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{Proper motion dec (ICRS frame)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlim}\PYG{p}{(}\PYG{p}{[}\PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{10}\PYG{p}{,} \PYG{l+m+mi}{5}\PYG{p}{]}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylim}\PYG{p}{(}\PYG{p}{[}\PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{20}\PYG{p}{,} \PYG{l+m+mi}{5}\PYG{p}{]}\PYG{p}{)}\PYG{p}{;}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{04_select_25_0}.png}
To use \sphinxcode{\sphinxupquote{pm\_vertices}} as part of an ADQL query, we have to convert it to a string.
We'll use \sphinxcode{\sphinxupquote{flatten}} to convert from a 2\sphinxhyphen{}D array to a 1\sphinxhyphen{}D array, and \sphinxcode{\sphinxupquote{str}} to convert each element to a string.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{t} \PYG{o}{=} \PYG{p}{[}\PYG{n+nb}{str}\PYG{p}{(}\PYG{n}{x}\PYG{p}{)} \PYG{k}{for} \PYG{n}{x} \PYG{o+ow}{in} \PYG{n}{pm\PYGZus{}vertices}\PYG{o}{.}\PYG{n}{flatten}\PYG{p}{(}\PYG{p}{)}\PYG{p}{]}
\PYG{n}{t}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
[\PYGZsq{}\PYGZhy{}4.050371212154984\PYGZsq{},
\PYGZsq{}\PYGZhy{}14.75623260987968\PYGZsq{},
\PYGZsq{}\PYGZhy{}3.4198108491382455\PYGZsq{},
\PYGZsq{}\PYGZhy{}14.723655456335619\PYGZsq{},
\PYGZsq{}\PYGZhy{}3.035219883740934\PYGZsq{},
\PYGZsq{}\PYGZhy{}14.443571352854612\PYGZsq{},
\PYGZsq{}\PYGZhy{}2.268479190206636\PYGZsq{},
\PYGZsq{}\PYGZhy{}13.714023598831554\PYGZsq{},
\PYGZsq{}\PYGZhy{}2.611722027231764\PYGZsq{},
\PYGZsq{}\PYGZhy{}13.247974712069263\PYGZsq{},
\PYGZsq{}\PYGZhy{}2.7347140078529106\PYGZsq{},
\PYGZsq{}\PYGZhy{}13.090544709622938\PYGZsq{},
\PYGZsq{}\PYGZhy{}3.199231461993783\PYGZsq{},
\PYGZsq{}\PYGZhy{}12.594265302440828\PYGZsq{},
\PYGZsq{}\PYGZhy{}3.34082545787549\PYGZsq{},
\PYGZsq{}\PYGZhy{}12.476119260818695\PYGZsq{},
\PYGZsq{}\PYGZhy{}5.674894125178565\PYGZsq{},
\PYGZsq{}\PYGZhy{}11.160833381392624\PYGZsq{},
\PYGZsq{}\PYGZhy{}5.95159272432137\PYGZsq{},
\PYGZsq{}\PYGZhy{}11.105478836426514\PYGZsq{},
\PYGZsq{}\PYGZhy{}6.423940229776128\PYGZsq{},
\PYGZsq{}\PYGZhy{}11.05981294804957\PYGZsq{},
\PYGZsq{}\PYGZhy{}7.096310230579248\PYGZsq{},
\PYGZsq{}\PYGZhy{}11.951878058650085\PYGZsq{},
\PYGZsq{}\PYGZhy{}7.306415190921692\PYGZsq{},
\PYGZsq{}\PYGZhy{}12.245599765990594\PYGZsq{},
\PYGZsq{}\PYGZhy{}7.040166963232815\PYGZsq{},
\PYGZsq{}\PYGZhy{}12.885807024935527\PYGZsq{},
\PYGZsq{}\PYGZhy{}6.0034770546523735\PYGZsq{},
\PYGZsq{}\PYGZhy{}13.759120984106968\PYGZsq{},
\PYGZsq{}\PYGZhy{}4.42442296194263\PYGZsq{},
\PYGZsq{}\PYGZhy{}14.7464117578883\PYGZsq{}]
\end{sphinxVerbatim}
Now \sphinxcode{\sphinxupquote{t}} is a list of strings; we can use \sphinxcode{\sphinxupquote{join}} to make a single string with commas between the elements.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{pm\PYGZus{}point\PYGZus{}list} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{, }\PYG{l+s+s1}{\PYGZsq{}}\PYG{o}{.}\PYG{n}{join}\PYG{p}{(}\PYG{n}{t}\PYG{p}{)}
\PYG{n}{pm\PYGZus{}point\PYGZus{}list}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZsq{}\PYGZhy{}4.050371212154984, \PYGZhy{}14.75623260987968, \PYGZhy{}3.4198108491382455, \PYGZhy{}14.723655456335619, \PYGZhy{}3.035219883740934, \PYGZhy{}14.443571352854612, \PYGZhy{}2.268479190206636, \PYGZhy{}13.714023598831554, \PYGZhy{}2.611722027231764, \PYGZhy{}13.247974712069263, \PYGZhy{}2.7347140078529106, \PYGZhy{}13.090544709622938, \PYGZhy{}3.199231461993783, \PYGZhy{}12.594265302440828, \PYGZhy{}3.34082545787549, \PYGZhy{}12.476119260818695, \PYGZhy{}5.674894125178565, \PYGZhy{}11.160833381392624, \PYGZhy{}5.95159272432137, \PYGZhy{}11.105478836426514, \PYGZhy{}6.423940229776128, \PYGZhy{}11.05981294804957, \PYGZhy{}7.096310230579248, \PYGZhy{}11.951878058650085, \PYGZhy{}7.306415190921692, \PYGZhy{}12.245599765990594, \PYGZhy{}7.040166963232815, \PYGZhy{}12.885807024935527, \PYGZhy{}6.0034770546523735, \PYGZhy{}13.759120984106968, \PYGZhy{}4.42442296194263, \PYGZhy{}14.7464117578883\PYGZsq{}
\end{sphinxVerbatim}
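The two steps above can be wrapped in one small function. The following sketch works on any sequence of \sphinxcode{\sphinxupquote{(x, y)}} pairs, including the rows of \sphinxcode{\sphinxupquote{pm\_vertices}}. The function name \sphinxcode{\sphinxupquote{polygon\_string}} and the optional \sphinxcode{\sphinxupquote{ndigits}} argument are assumptions made for this example; rounding is not part of the original lesson, but it keeps the query shorter when full precision is not needed.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
def polygon_string(vertices, ndigits=None):
    """Join polygon vertices into 'x1, y1, x2, y2, ...' for an ADQL POLYGON."""
    coords = []
    for vertex in vertices:
        for value in vertex:
            # Optionally shorten each coordinate before converting to text
            if ndigits is not None:
                value = round(value, ndigits)
            coords.append(str(value))
    return ', '.join(coords)

vertices = [(-4.05, -14.76), (-3.42, -14.72), (-2.27, -13.71)]
print(polygon_string(vertices))
\end{sphinxVerbatim}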
\section{Selecting the region}
\label{\detokenize{04_select:selecting-the-region}}
Let's review how we got to this point.
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
We made an ADQL query to the Gaia server to get data for stars in the vicinity of GD\sphinxhyphen{}1.
\item {}
We transformed to \sphinxcode{\sphinxupquote{GD1}} coordinates so we could select stars along the centerline of GD\sphinxhyphen{}1.
\item {}
We plotted the proper motion of the centerline stars to identify the bounds of the overdense region.
\item {}
We made a mask that selects stars whose proper motion is in the overdense region.
\end{enumerate}
The problem is that we downloaded data for more than 100,000 stars and selected only about 1000 of them.
It will be more efficient if we select on proper motion as part of the query. That will allow us to work with a larger region of the sky in a single query, and download less unneeded data.
This query will select on the following conditions:
\begin{itemize}
\item {}
\sphinxcode{\sphinxupquote{parallax \textless{} 1}}
\item {}
\sphinxcode{\sphinxupquote{bp\_rp BETWEEN \sphinxhyphen{}0.75 AND 2}}
\item {}
Coordinates within a rectangle in the GD\sphinxhyphen{}1 frame, transformed to ICRS.
\item {}
Proper motion within the polygon we just computed.
\end{itemize}
The first three conditions are the same as in the previous query. Only the last one is new.
Here's the rectangle in the GD\sphinxhyphen{}1 frame we'll select.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{phi1\PYGZus{}min} \PYG{o}{=} \PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{70}
\PYG{n}{phi1\PYGZus{}max} \PYG{o}{=} \PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{20}
\PYG{n}{phi2\PYGZus{}min} \PYG{o}{=} \PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{5}
\PYG{n}{phi2\PYGZus{}max} \PYG{o}{=} \PYG{l+m+mi}{5}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{phi1\PYGZus{}rect} \PYG{o}{=} \PYG{p}{[}\PYG{n}{phi1\PYGZus{}min}\PYG{p}{,} \PYG{n}{phi1\PYGZus{}min}\PYG{p}{,} \PYG{n}{phi1\PYGZus{}max}\PYG{p}{,} \PYG{n}{phi1\PYGZus{}max}\PYG{p}{]} \PYG{o}{*} \PYG{n}{u}\PYG{o}{.}\PYG{n}{deg}
\PYG{n}{phi2\PYGZus{}rect} \PYG{o}{=} \PYG{p}{[}\PYG{n}{phi2\PYGZus{}min}\PYG{p}{,} \PYG{n}{phi2\PYGZus{}max}\PYG{p}{,} \PYG{n}{phi2\PYGZus{}max}\PYG{p}{,} \PYG{n}{phi2\PYGZus{}min}\PYG{p}{]} \PYG{o}{*} \PYG{n}{u}\PYG{o}{.}\PYG{n}{deg}
\end{sphinxVerbatim}
Here's how we transform it to ICRS, as we saw in the previous lesson.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{gala}\PYG{n+nn}{.}\PYG{n+nn}{coordinates} \PYG{k}{as} \PYG{n+nn}{gc}
\PYG{k+kn}{import} \PYG{n+nn}{astropy}\PYG{n+nn}{.}\PYG{n+nn}{coordinates} \PYG{k}{as} \PYG{n+nn}{coord}
\PYG{n}{corners} \PYG{o}{=} \PYG{n}{gc}\PYG{o}{.}\PYG{n}{GD1Koposov10}\PYG{p}{(}\PYG{n}{phi1}\PYG{o}{=}\PYG{n}{phi1\PYGZus{}rect}\PYG{p}{,} \PYG{n}{phi2}\PYG{o}{=}\PYG{n}{phi2\PYGZus{}rect}\PYG{p}{)}
\PYG{n}{corners\PYGZus{}icrs} \PYG{o}{=} \PYG{n}{corners}\PYG{o}{.}\PYG{n}{transform\PYGZus{}to}\PYG{p}{(}\PYG{n}{coord}\PYG{o}{.}\PYG{n}{ICRS}\PYG{p}{)}
\end{sphinxVerbatim}
To use \sphinxcode{\sphinxupquote{corners\_icrs}} as part of an ADQL query, we have to convert it to a string. Here's how we do that, as we saw in the previous lesson.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{point\PYGZus{}base} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}}\PYG{l+s+si}{\PYGZob{}point.ra.value\PYGZcb{}}\PYG{l+s+s2}{, }\PYG{l+s+si}{\PYGZob{}point.dec.value\PYGZcb{}}\PYG{l+s+s2}{\PYGZdq{}}
\PYG{n}{t} \PYG{o}{=} \PYG{p}{[}\PYG{n}{point\PYGZus{}base}\PYG{o}{.}\PYG{n}{format}\PYG{p}{(}\PYG{n}{point}\PYG{o}{=}\PYG{n}{point}\PYG{p}{)}
\PYG{k}{for} \PYG{n}{point} \PYG{o+ow}{in} \PYG{n}{corners\PYGZus{}icrs}\PYG{p}{]}
\PYG{n}{point\PYGZus{}list} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{, }\PYG{l+s+s1}{\PYGZsq{}}\PYG{o}{.}\PYG{n}{join}\PYG{p}{(}\PYG{n}{t}\PYG{p}{)}
\PYG{n}{point\PYGZus{}list}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZsq{}135.30559858565638, 8.398623940157561, 126.50951508623503, 13.44494195652069, 163.0173655836748, 54.24242734020255, 172.9328536286811, 46.47260492416258\PYGZsq{}
\end{sphinxVerbatim}
Now we have everything we need to assemble the query.
\section{Assemble the query}
\label{\detokenize{04_select:assemble-the-query}}
Here's the base string we used for the query in the previous lesson.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{query\PYGZus{}base} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}\PYG{l+s+s2}{SELECT }
\PYG{l+s+si}{\PYGZob{}columns\PYGZcb{}}
\PYG{l+s+s2}{FROM gaiadr2.gaia\PYGZus{}source}
\PYG{l+s+s2}{WHERE parallax \PYGZlt{} 1}
\PYG{l+s+s2}{ AND bp\PYGZus{}rp BETWEEN \PYGZhy{}0.75 AND 2 }
\PYG{l+s+s2}{ AND 1 = CONTAINS(POINT(ra, dec), }
\PYG{l+s+s2}{ POLYGON(}\PYG{l+s+si}{\PYGZob{}point\PYGZus{}list\PYGZcb{}}\PYG{l+s+s2}{))}
\PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
\sphinxstylestrong{Exercise:} Modify \sphinxcode{\sphinxupquote{query\_base}} by adding a new clause to select stars whose coordinates of proper motion, \sphinxcode{\sphinxupquote{pmra}} and \sphinxcode{\sphinxupquote{pmdec}}, fall within the polygon defined by \sphinxcode{\sphinxupquote{pm\_point\_list}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} Solution}
\PYG{n}{query\PYGZus{}base} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}\PYG{l+s+s2}{SELECT }
\PYG{l+s+si}{\PYGZob{}columns\PYGZcb{}}
\PYG{l+s+s2}{FROM gaiadr2.gaia\PYGZus{}source}
\PYG{l+s+s2}{WHERE parallax \PYGZlt{} 1}
\PYG{l+s+s2}{ AND bp\PYGZus{}rp BETWEEN \PYGZhy{}0.75 AND 2 }
\PYG{l+s+s2}{ AND 1 = CONTAINS(POINT(ra, dec), }
\PYG{l+s+s2}{ POLYGON(}\PYG{l+s+si}{\PYGZob{}point\PYGZus{}list\PYGZcb{}}\PYG{l+s+s2}{))}
\PYG{l+s+s2}{ AND 1 = CONTAINS(POINT(pmra, pmdec),}
\PYG{l+s+s2}{ POLYGON(}\PYG{l+s+si}{\PYGZob{}pm\PYGZus{}point\PYGZus{}list\PYGZcb{}}\PYG{l+s+s2}{))}
\PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
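The new clause asks the server to keep only the rows whose \sphinxcode{\sphinxupquote{(pmra, pmdec)}} point lies inside the polygon. The server evaluates \sphinxcode{\sphinxupquote{CONTAINS}} itself, and its geometry functions are designed for positions on the sky, so the following is only a flat\sphinxhyphen{}plane analogy: a ray\sphinxhyphen{}casting sketch that could be used to spot\sphinxhyphen{}check downloaded rows locally. The function name \sphinxcode{\sphinxupquote{point\_in\_polygon}} is made up for this example.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
def point_in_polygon(x, y, vertices):
    """Return True if (x, y) is inside the polygon given by its vertices.

    Casts a horizontal ray to the right of the point and counts how many
    polygon edges it crosses; an odd count means the point is inside.
    """
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the ray's height can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(point_in_polygon(1, 1, square))
print(point_in_polygon(3, 1, square))
\end{sphinxVerbatim}
Passing the rows of \sphinxcode{\sphinxupquote{pm\_vertices}} as \sphinxcode{\sphinxupquote{vertices}} would apply the same test to the proper motion polygon.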
Here again are the columns we want to select.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{columns} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{source\PYGZus{}id, ra, dec, pmra, pmdec, parallax, parallax\PYGZus{}error, radial\PYGZus{}velocity}\PYG{l+s+s1}{\PYGZsq{}}
\end{sphinxVerbatim}
\sphinxstylestrong{Exercise:} Use \sphinxcode{\sphinxupquote{format}} to format \sphinxcode{\sphinxupquote{query\_base}} and define \sphinxcode{\sphinxupquote{query}}, filling in the values of \sphinxcode{\sphinxupquote{columns}}, \sphinxcode{\sphinxupquote{point\_list}}, and \sphinxcode{\sphinxupquote{pm\_point\_list}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} Solution}
\PYG{n}{query} \PYG{o}{=} \PYG{n}{query\PYGZus{}base}\PYG{o}{.}\PYG{n}{format}\PYG{p}{(}\PYG{n}{columns}\PYG{o}{=}\PYG{n}{columns}\PYG{p}{,}
\PYG{n}{point\PYGZus{}list}\PYG{o}{=}\PYG{n}{point\PYGZus{}list}\PYG{p}{,}
\PYG{n}{pm\PYGZus{}point\PYGZus{}list}\PYG{o}{=}\PYG{n}{pm\PYGZus{}point\PYGZus{}list}\PYG{p}{)}
\PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{query}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
SELECT
source\PYGZus{}id, ra, dec, pmra, pmdec, parallax, parallax\PYGZus{}error, radial\PYGZus{}velocity
FROM gaiadr2.gaia\PYGZus{}source
WHERE parallax \PYGZlt{} 1
AND bp\PYGZus{}rp BETWEEN \PYGZhy{}0.75 AND 2
AND 1 = CONTAINS(POINT(ra, dec),
POLYGON(135.30559858565638, 8.398623940157561, 126.50951508623503, 13.44494195652069, 163.0173655836748, 54.24242734020255, 172.9328536286811, 46.47260492416258))
AND 1 = CONTAINS(POINT(pmra, pmdec),
POLYGON(\PYGZhy{}4.050371212154984, \PYGZhy{}14.75623260987968, \PYGZhy{}3.4198108491382455, \PYGZhy{}14.723655456335619, \PYGZhy{}3.035219883740934, \PYGZhy{}14.443571352854612, \PYGZhy{}2.268479190206636, \PYGZhy{}13.714023598831554, \PYGZhy{}2.611722027231764, \PYGZhy{}13.247974712069263, \PYGZhy{}2.7347140078529106, \PYGZhy{}13.090544709622938, \PYGZhy{}3.199231461993783, \PYGZhy{}12.594265302440828, \PYGZhy{}3.34082545787549, \PYGZhy{}12.476119260818695, \PYGZhy{}5.674894125178565, \PYGZhy{}11.160833381392624, \PYGZhy{}5.95159272432137, \PYGZhy{}11.105478836426514, \PYGZhy{}6.423940229776128, \PYGZhy{}11.05981294804957, \PYGZhy{}7.096310230579248, \PYGZhy{}11.951878058650085, \PYGZhy{}7.306415190921692, \PYGZhy{}12.245599765990594, \PYGZhy{}7.040166963232815, \PYGZhy{}12.885807024935527, \PYGZhy{}6.0034770546523735, \PYGZhy{}13.759120984106968, \PYGZhy{}4.42442296194263, \PYGZhy{}14.7464117578883))
\end{sphinxVerbatim}
Here's how we run it.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{from} \PYG{n+nn}{astroquery}\PYG{n+nn}{.}\PYG{n+nn}{gaia} \PYG{k+kn}{import} \PYG{n}{Gaia}
\PYG{n}{job} \PYG{o}{=} \PYG{n}{Gaia}\PYG{o}{.}\PYG{n}{launch\PYGZus{}job\PYGZus{}async}\PYG{p}{(}\PYG{n}{query}\PYG{p}{)}
\PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{job}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
Created TAP+ (v1.2.1) \PYGZhy{} Connection:
Host: gea.esac.esa.int
Use HTTPS: True
Port: 443
SSL Port: 443
Created TAP+ (v1.2.1) \PYGZhy{} Connection:
Host: geadata.esac.esa.int
Use HTTPS: True
Port: 443
SSL Port: 443
INFO: Query finished. [astroquery.utils.tap.core]
\PYGZlt{}Table length=7346\PYGZgt{}
name dtype unit description n\PYGZus{}bad
\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}
source\PYGZus{}id int64 Unique source identifier (unique within a particular Data Release) 0
ra float64 deg Right ascension 0
dec float64 deg Declination 0
pmra float64 mas / yr Proper motion in right ascension direction 0
pmdec float64 mas / yr Proper motion in declination direction 0
parallax float64 mas Parallax 0
parallax\PYGZus{}error float64 mas Standard error of parallax 0
radial\PYGZus{}velocity float64 km / s Radial velocity 7295
Jobid: 1603132746237O
Phase: COMPLETED
Owner: None
Output file: async\PYGZus{}20201019143906.vot
Results: None
\end{sphinxVerbatim}
And get the results.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{candidate\PYGZus{}table} \PYG{o}{=} \PYG{n}{job}\PYG{o}{.}\PYG{n}{get\PYGZus{}results}\PYG{p}{(}\PYG{p}{)}
\PYG{n+nb}{len}\PYG{p}{(}\PYG{n}{candidate\PYGZus{}table}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
7346
\end{sphinxVerbatim}
\section{Plotting one more time}
\label{\detokenize{04_select:plotting-one-more-time}}
Let's see what the results look like.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{x} \PYG{o}{=} \PYG{n}{candidate\PYGZus{}table}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ra}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{y} \PYG{o}{=} \PYG{n}{candidate\PYGZus{}table}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{dec}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{x}\PYG{p}{,} \PYG{n}{y}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ra (degree ICRS)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{dec (degree ICRS)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}\PYG{p}{;}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{04_select_51_0}.png}
Here we can see why it was useful to transform these coordinates. In ICRS, it is more difficult to identify the stars near the centerline of GD\sphinxhyphen{}1.
So, before we move on to the next step, let's collect the code we used to transform the coordinates and make a Pandas \sphinxcode{\sphinxupquote{DataFrame}}:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{astropy}\PYG{n+nn}{.}\PYG{n+nn}{units} \PYG{k}{as} \PYG{n+nn}{u}
\PYG{k+kn}{import} \PYG{n+nn}{gala}\PYG{n+nn}{.}\PYG{n+nn}{coordinates} \PYG{k}{as} \PYG{n+nn}{gc}
\PYG{k+kn}{from} \PYG{n+nn}{pyia} \PYG{k+kn}{import} \PYG{n}{GaiaData}

\PYG{k}{def} \PYG{n+nf}{make\PYGZus{}dataframe}\PYG{p}{(}\PYG{n}{table}\PYG{p}{)}\PYG{p}{:}
    \PYG{l+s+sd}{\PYGZdq{}\PYGZdq{}\PYGZdq{}Transform coordinates from ICRS to GD\PYGZhy{}1 frame.}
\PYG{l+s+sd}{    }
\PYG{l+s+sd}{    table: Astropy Table}
\PYG{l+s+sd}{    }
\PYG{l+s+sd}{    returns: Pandas DataFrame}
\PYG{l+s+sd}{    \PYGZdq{}\PYGZdq{}\PYGZdq{}}
    \PYG{n}{gaia\PYGZus{}data} \PYG{o}{=} \PYG{n}{GaiaData}\PYG{p}{(}\PYG{n}{table}\PYG{p}{)}
    \PYG{n}{c\PYGZus{}sky} \PYG{o}{=} \PYG{n}{gaia\PYGZus{}data}\PYG{o}{.}\PYG{n}{get\PYGZus{}skycoord}\PYG{p}{(}\PYG{n}{distance}\PYG{o}{=}\PYG{l+m+mi}{8}\PYG{o}{*}\PYG{n}{u}\PYG{o}{.}\PYG{n}{kpc}\PYG{p}{,}
                                   \PYG{n}{radial\PYGZus{}velocity}\PYG{o}{=}\PYG{l+m+mi}{0}\PYG{o}{*}\PYG{n}{u}\PYG{o}{.}\PYG{n}{km}\PYG{o}{/}\PYG{n}{u}\PYG{o}{.}\PYG{n}{s}\PYG{p}{)}
    \PYG{n}{c\PYGZus{}gd1} \PYG{o}{=} \PYG{n}{gc}\PYG{o}{.}\PYG{n}{reflex\PYGZus{}correct}\PYG{p}{(}
        \PYG{n}{c\PYGZus{}sky}\PYG{o}{.}\PYG{n}{transform\PYGZus{}to}\PYG{p}{(}\PYG{n}{gc}\PYG{o}{.}\PYG{n}{GD1Koposov10}\PYG{p}{)}\PYG{p}{)}
    \PYG{n}{df} \PYG{o}{=} \PYG{n}{table}\PYG{o}{.}\PYG{n}{to\PYGZus{}pandas}\PYG{p}{(}\PYG{p}{)}
    \PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{=} \PYG{n}{c\PYGZus{}gd1}\PYG{o}{.}\PYG{n}{phi1}
    \PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{=} \PYG{n}{c\PYGZus{}gd1}\PYG{o}{.}\PYG{n}{phi2}
    \PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pm\PYGZus{}phi1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{=} \PYG{n}{c\PYGZus{}gd1}\PYG{o}{.}\PYG{n}{pm\PYGZus{}phi1\PYGZus{}cosphi2}
    \PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pm\PYGZus{}phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{=} \PYG{n}{c\PYGZus{}gd1}\PYG{o}{.}\PYG{n}{pm\PYGZus{}phi2}
    \PYG{k}{return} \PYG{n}{df}
\end{sphinxVerbatim}
Here's how we can use this function:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{candidate\PYGZus{}df} \PYG{o}{=} \PYG{n}{make\PYGZus{}dataframe}\PYG{p}{(}\PYG{n}{candidate\PYGZus{}table}\PYG{p}{)}
\end{sphinxVerbatim}
And let's see the results.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{x} \PYG{o}{=} \PYG{n}{candidate\PYGZus{}df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{y} \PYG{o}{=} \PYG{n}{candidate\PYGZus{}df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{x}\PYG{p}{,} \PYG{n}{y}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.5}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.5}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ra (degree GD1)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{dec (degree GD1)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}\PYG{p}{;}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{04_select_57_0}.png}
We're starting to see GD\sphinxhyphen{}1 more clearly.
We can compare this figure with one of the panels in Figure 1 from the original paper:
The top panel shows stars selected based on proper motion only, so it is comparable to our figure (although notice that it covers a wider region).
In the next lesson, we will use photometry data from Pan\sphinxhyphen{}STARRS to do a second round of filtering, and see if we can replicate the bottom panel.
We'll also learn how to add annotations like the ones in the figure from the paper, and customize the style of the figure to present the results clearly and compellingly.
\section{Saving the DataFrame}
\label{\detokenize{04_select:saving-the-dataframe}}
Let's save this \sphinxcode{\sphinxupquote{DataFrame}} so we can pick up where we left off without running this query again.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{o}{!}rm \PYGZhy{}f gd1\PYGZus{}candidates.hdf5
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}candidates.hdf5}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{candidate\PYGZus{}df}\PYG{o}{.}\PYG{n}{to\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{candidate\PYGZus{}df}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
We can use \sphinxcode{\sphinxupquote{ls}} to confirm that the file exists and check the size:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{o}{!}ls \PYGZhy{}lh gd1\PYGZus{}candidates.hdf5
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZhy{}rw\PYGZhy{}rw\PYGZhy{}r\PYGZhy{}\PYGZhy{} 1 downey downey 756K Oct 19 14:39 gd1\PYGZus{}candidates.hdf5
\end{sphinxVerbatim}
If you are using Windows, \sphinxcode{\sphinxupquote{ls}} might not work; in that case, try:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
!dir gd1\PYGZus{}candidates.hdf5
\end{sphinxVerbatim}
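If neither shell command is convenient, a portable alternative is to ask Python for the file size. The following is a minimal sketch: the helper name \sphinxcode{\sphinxupquote{file\_size\_kb}} and the temporary demo file are assumptions made for illustration and are not part of the original notebook.

```python
import os
import tempfile


def file_size_kb(path):
    """Return the size of the file at `path` in kilobytes (1 KB = 1024 bytes)."""
    return os.path.getsize(path) / 1024


# Demonstrate with a throwaway file so the example runs anywhere;
# in the notebook you would pass 'gd1_candidates.hdf5' instead.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b'x' * 2048)
    demo_path = f.name

demo_size = file_size_kb(demo_path)
print(f'{demo_size:.1f} KB')

# Clean up the throwaway file.
os.remove(demo_path)
```

Because it uses only the standard library, this check behaves the same way on Windows, macOS, and Linux.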
\section{CSV}
\label{\detokenize{04_select:csv}}
Pandas can write a variety of other formats, \sphinxhref{https://pandas.pydata.org/pandas-docs/stable/user\_guide/io.html}{which you can read about here}.
We won't cover all of them, but one other important one is \sphinxhref{https://en.wikipedia.org/wiki/Comma-separated\_values}{CSV}, which stands for “comma\sphinxhyphen{}separated values”.
CSV is a plain\sphinxhyphen{}text format with minimal formatting requirements, so it can be read and written by pretty much any tool that works with data. In that sense, it is the “least common denominator” of data formats.
However, it has an important limitation: some information about the data gets lost in translation, notably the data types. If you read a CSV file from someone else, you might need some additional information to make sure you are getting it right.
Also, CSV files tend to be big, and slow to read and write.
With those caveats, here's how to write one:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{candidate\PYGZus{}df}\PYG{o}{.}\PYG{n}{to\PYGZus{}csv}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}candidates.csv}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
We can check the file size like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{o}{!}ls \PYGZhy{}lh gd1\PYGZus{}candidates.csv
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZhy{}rw\PYGZhy{}rw\PYGZhy{}r\PYGZhy{}\PYGZhy{} 1 downey downey 1.6M Oct 19 14:39 gd1\PYGZus{}candidates.csv
\end{sphinxVerbatim}
The CSV file is about twice as big as the HDF5 file (1.6 MB compared to 756 KB), so that's not too bad.
We can see the first few lines like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{o}{!}head \PYGZhy{}3 gd1\PYGZus{}candidates.csv
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
,source\PYGZus{}id,ra,dec,pmra,pmdec,parallax,parallax\PYGZus{}error,radial\PYGZus{}velocity,phi1,phi2,pm\PYGZus{}phi1,pm\PYGZus{}phi2
0,635559124339440000,137.58671691646745,19.1965441084838,\PYGZhy{}3.770521900009566,\PYGZhy{}12.490481778113859,0.7913934419894347,0.2717538145759051,,\PYGZhy{}59.63048941944396,\PYGZhy{}1.21648525150429,\PYGZhy{}7.361362712556612,\PYGZhy{}0.5926328820420083
1,635860218726658176,138.5187065217173,19.09233926905897,\PYGZhy{}5.941679495793577,\PYGZhy{}11.346409129876392,0.30745551377348623,0.19946557779138105,,\PYGZhy{}59.247329893833296,\PYGZhy{}2.0160784008206476,\PYGZhy{}7.527126084599517,1.7487794924398758
\end{sphinxVerbatim}
The CSV file contains the names of the columns, but not the data types.
We can read the CSV file back like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{read\PYGZus{}back\PYGZus{}csv} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{read\PYGZus{}csv}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}candidates.csv}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
Let's compare the first few rows of \sphinxcode{\sphinxupquote{candidate\_df}} and \sphinxcode{\sphinxupquote{read\_back\_csv}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{candidate\PYGZus{}df}\PYG{o}{.}\PYG{n}{head}\PYG{p}{(}\PYG{l+m+mi}{3}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
source\PYGZus{}id ra dec pmra pmdec parallax \PYGZbs{}
0 635559124339440000 137.586717 19.196544 \PYGZhy{}3.770522 \PYGZhy{}12.490482 0.791393
1 635860218726658176 138.518707 19.092339 \PYGZhy{}5.941679 \PYGZhy{}11.346409 0.307456
2 635674126383965568 138.842874 19.031798 \PYGZhy{}3.897001 \PYGZhy{}12.702780 0.779463
parallax\PYGZus{}error radial\PYGZus{}velocity phi1 phi2 pm\PYGZus{}phi1 pm\PYGZus{}phi2
0 0.271754 NaN \PYGZhy{}59.630489 \PYGZhy{}1.216485 \PYGZhy{}7.361363 \PYGZhy{}0.592633
1 0.199466 NaN \PYGZhy{}59.247330 \PYGZhy{}2.016078 \PYGZhy{}7.527126 1.748779
2 0.223692 NaN \PYGZhy{}59.133391 \PYGZhy{}2.306901 \PYGZhy{}7.560608 \PYGZhy{}0.741800
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{read\PYGZus{}back\PYGZus{}csv}\PYG{o}{.}\PYG{n}{head}\PYG{p}{(}\PYG{l+m+mi}{3}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
Unnamed: 0 source\PYGZus{}id ra dec pmra pmdec \PYGZbs{}
0 0 635559124339440000 137.586717 19.196544 \PYGZhy{}3.770522 \PYGZhy{}12.490482
1 1 635860218726658176 138.518707 19.092339 \PYGZhy{}5.941679 \PYGZhy{}11.346409
2 2 635674126383965568 138.842874 19.031798 \PYGZhy{}3.897001 \PYGZhy{}12.702780
parallax parallax\PYGZus{}error radial\PYGZus{}velocity phi1 phi2 pm\PYGZus{}phi1 \PYGZbs{}
0 0.791393 0.271754 NaN \PYGZhy{}59.630489 \PYGZhy{}1.216485 \PYGZhy{}7.361363
1 0.307456 0.199466 NaN \PYGZhy{}59.247330 \PYGZhy{}2.016078 \PYGZhy{}7.527126
2 0.779463 0.223692 NaN \PYGZhy{}59.133391 \PYGZhy{}2.306901 \PYGZhy{}7.560608
pm\PYGZus{}phi2
0 \PYGZhy{}0.592633
1 1.748779
2 \PYGZhy{}0.741800
\end{sphinxVerbatim}
Notice that the index in \sphinxcode{\sphinxupquote{candidate\_df}} has become an unnamed column in \sphinxcode{\sphinxupquote{read\_back\_csv}}. The Pandas functions for writing and reading CSV files provide options to avoid that problem, but this is an example of the kind of thing that can go wrong with CSV files.
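As a sketch of the options mentioned above, the following example shows two ways to avoid the extra column: leaving the index out when writing (\sphinxcode{\sphinxupquote{index=False}}), or telling \sphinxcode{\sphinxupquote{read\_csv}} which column holds the index (\sphinxcode{\sphinxupquote{index\_col=0}}). The small \sphinxcode{\sphinxupquote{DataFrame}} and the in\sphinxhyphen{}memory buffer are stand\sphinxhyphen{}ins chosen for illustration; with a real file you would pass a filename instead.

```python
import io

import pandas as pd

# A tiny stand-in for candidate_df; the values are made up for illustration.
df = pd.DataFrame({'source_id': [101, 102, 103],
                   'phi1': [-59.6, -59.2, -59.1]})

# Option 1: leave the index out of the file when writing.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
no_index = pd.read_csv(buffer)

# Option 2: write the index, then tell read_csv which column holds it.
buffer = io.StringIO()
df.to_csv(buffer)
buffer.seek(0)
with_index = pd.read_csv(buffer, index_col=0)

# Neither result contains an 'Unnamed: 0' column.
print(list(no_index.columns))
print(list(with_index.columns))
```

Which option is preferable depends on whether the index carries information; for a default integer index like the one in \sphinxcode{\sphinxupquote{candidate\_df}}, omitting it is usually simpler.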
\section{Summary}
\label{\detokenize{04_select:summary}}
In the previous lesson we downloaded data for a large number of stars and then selected a small fraction of them based on proper motion.
In this lesson, we improved this process by writing a more complex query that uses the database to select stars based on proper motion. This process requires more computation on the Gaia server, but then we're able to either:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
Search the same region and download less data, or
\item {}
Search a larger region while still downloading a manageable amount of data.
\end{enumerate}
In the next lesson, we'll learn about the database \sphinxcode{\sphinxupquote{JOIN}} operation and use it to download photometry data from Pan\sphinxhyphen{}STARRS.
\section{Best practices}
\label{\detokenize{04_select:best-practices}}\begin{itemize}
\item {}
When possible, “move the computation to the data”; that is, do as much of the work as possible on the database server before downloading the data.
\item {}
For most applications, saving data in FITS or HDF5 is better than CSV. FITS and HDF5 are binary formats, so the files are usually smaller, and they store metadata, so you don't lose anything when you read the file back.
\item {}
On the other hand, CSV is a “least common denominator” format; that is, it can be read by practically any application that works with data.
\end{itemize}
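To make the point about lost metadata concrete, here is a small sketch of how a data type can change on a round trip through CSV. The column name \sphinxcode{\sphinxupquote{code}} and its values are invented for this example. Strings that look like numbers come back as integers unless we supply the type ourselves.

```python
import io

import pandas as pd

# Hypothetical identifier column stored as strings with leading zeros.
original = pd.DataFrame({'code': ['007', '042']})

# Round trip through CSV using an in-memory buffer instead of a file.
text = original.to_csv(index=False)
guessed = pd.read_csv(io.StringIO(text))

# read_csv infers integers, so the leading zeros are gone.
print(guessed['code'].tolist())

# Supplying the data type by hand restores the original strings.
restored = pd.read_csv(io.StringIO(text), dtype={'code': str})
print(restored['code'].tolist())
```

The second call works only because we already know what type the column should have, which is exactly the “additional information” a CSV file does not carry.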
\chapter{Chapter 5}
\label{\detokenize{05_join:chapter-5}}\label{\detokenize{05_join::doc}}
This is the fifth in a series of notebooks related to astronomy data.
As a continuing example, we will replicate part of the analysis in a recent paper, “\sphinxhref{https://arxiv.org/abs/1805.00425}{Off the beaten path: Gaia reveals GD\sphinxhyphen{}1 stars outside of the main stream}” by Adrian M. Price\sphinxhyphen{}Whelan and Ana Bonaca.
Picking up where we left off, the next step in the analysis is to select candidate stars based on photometry. The following figure from the paper is a color\sphinxhyphen{}magnitude diagram for the stars selected based on proper motion:
In red is a theoretical isochrone, showing where we expect the stars in GD\sphinxhyphen{}1 to fall based on the metallicity and age of their original globular cluster.
By selecting stars in the shaded area, we can further distinguish the main sequence of GD\sphinxhyphen{}1 from younger background stars.
\section{Outline}
\label{\detokenize{05_join:outline}}
Here are the steps in this notebook:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
We'll reload the candidate stars we identified in the previous notebook.
\item {}
Then we'll run a query on the Gaia server that uploads the table of candidates and uses a \sphinxcode{\sphinxupquote{JOIN}} operation to select photometry data for the candidate stars.
\item {}
We'll write the results to a file for use in the next notebook.
\end{enumerate}
After completing this lesson, you should be able to
\begin{itemize}
\item {}
Upload a table to the Gaia server.
\item {}
Write ADQL queries involving \sphinxcode{\sphinxupquote{JOIN}} operations.
\end{itemize}
\section{Installing libraries}
\label{\detokenize{05_join:installing-libraries}}
If you are running this notebook on Colab, you can run the following cell to install Astroquery and the other libraries we'll use.
If you are running this notebook on your own computer, you might have to install these libraries yourself.
If you are using this notebook as part of a Carpentries workshop, you should have received setup instructions.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} If we\PYGZsq{}re running on Colab, install libraries}
\PYG{k+kn}{import} \PYG{n+nn}{sys}
\PYG{n}{IN\PYGZus{}COLAB} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{google.colab}\PYG{l+s+s1}{\PYGZsq{}} \PYG{o+ow}{in} \PYG{n}{sys}\PYG{o}{.}\PYG{n}{modules}
\PYG{k}{if} \PYG{n}{IN\PYGZus{}COLAB}\PYG{p}{:}
    \PYG{o}{!}pip install astroquery astro\PYGZhy{}gala pyia python\PYGZhy{}wget
\end{sphinxVerbatim}
\section{Reloading the data}
\label{\detokenize{05_join:reloading-the-data}}
The following cell downloads the data from the previous notebook.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{os}
\PYG{k+kn}{from} \PYG{n+nn}{wget} \PYG{k+kn}{import} \PYG{n}{download}
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}candidates.hdf5}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{path} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{https://github.com/AllenDowney/AstronomicalData/raw/main/data/}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{k}{if} \PYG{o+ow}{not} \PYG{n}{os}\PYG{o}{.}\PYG{n}{path}\PYG{o}{.}\PYG{n}{exists}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{download}\PYG{p}{(}\PYG{n}{path}\PYG{o}{+}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{)}
\end{sphinxVerbatim}
And we can read it back.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{pandas} \PYG{k}{as} \PYG{n+nn}{pd}
\PYG{n}{candidate\PYGZus{}df} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{read\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{candidate\PYGZus{}df}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
\sphinxcode{\sphinxupquote{candidate\_df}} is the Pandas \sphinxcode{\sphinxupquote{DataFrame}} that contains results from the query in the previous notebook, which selects stars likely to be in GD\sphinxhyphen{}1 based on proper motion. It also includes position and proper motion transformed to the GD\sphinxhyphen{}1 frame.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{matplotlib}\PYG{n+nn}{.}\PYG{n+nn}{pyplot} \PYG{k}{as} \PYG{n+nn}{plt}
\PYG{n}{x} \PYG{o}{=} \PYG{n}{candidate\PYGZus{}df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{y} \PYG{o}{=} \PYG{n}{candidate\PYGZus{}df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{x}\PYG{p}{,} \PYG{n}{y}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ra (degree GD1)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{dec (degree GD1)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}\PYG{p}{;}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{05_join_9_0}.png}
This is the same figure we saw at the end of the previous notebook. GD\sphinxhyphen{}1 is visible against the background stars, but we will be able to see it more clearly after selecting based on photometry data.
\section{Getting photometry data}
\label{\detokenize{05_join:getting-photometry-data}}
The Gaia dataset contains some photometry data, including the variable \sphinxcode{\sphinxupquote{bp\_rp}}, which we used in the original query to select stars with BP \sphinxhyphen{} RP color between \sphinxhyphen{}0.75 and 2.
Selecting stars with \sphinxcode{\sphinxupquote{bp\_rp}} less than 2 excludes many class M dwarf stars, which have low temperature and low luminosity. A star like that at GD\sphinxhyphen{}1's distance would be hard to detect, so if it is detected, it is more likely to be in the foreground.
Now, to select stars with the age and metal richness we expect in GD\sphinxhyphen{}1, we will use \sphinxcode{\sphinxupquote{g \sphinxhyphen{} i}} color and apparent \sphinxcode{\sphinxupquote{g}}\sphinxhyphen{}band magnitude, which are available from the Pan\sphinxhyphen{}STARRS survey.
Conveniently, the Gaia server provides data from Pan\sphinxhyphen{}STARRS as a table in the same database we have been using, so we can access it by making ADQL queries.
In general, looking up a star from the Gaia catalog and finding the corresponding star in the Pan\sphinxhyphen{}STARRS catalog is not easy. This kind of cross matching is not always possible, because a star might appear in one catalog and not the other. And even when both stars are present, there might not be a clear one\sphinxhyphen{}to\sphinxhyphen{}one relationship between stars in the two catalogs.
Fortunately, smart people have worked on this problem, and the Gaia database includes cross\sphinxhyphen{}matching tables that suggest a best neighbor in the Pan\sphinxhyphen{}STARRS catalog for many stars in the Gaia catalog.
\sphinxhref{https://gea.esac.esa.int/archive/documentation/GDR2/Catalogue\_consolidation/chap\_cu9val\_cu9val/ssec\_cu9xma/sssec\_cu9xma\_extcat.html}{This document describes the cross matching process}. Briefly, it uses a cone search to find possible matches in approximately the right position, then uses attributes like color and magnitude to choose pairs of stars most likely to be identical.
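To give a feel for the first stage, here is a toy sketch of a cone search: it computes the angular separation between a target and each entry in a small catalog and keeps the entries within a given radius. This is not the algorithm used to build the Gaia cross\sphinxhyphen{}match tables. The function names, the positions, and the radius are all assumptions made for illustration, and the second stage (comparing color and magnitude) is omitted.

```python
import math


def angular_separation(ra1, dec1, ra2, dec2):
    """Angular separation in degrees between two sky positions (degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which is numerically stable for small separations.
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))


def cone_search(target, catalog, radius_deg):
    """Return names of catalog entries within radius_deg of target, nearest first."""
    matches = []
    for name, ra, dec in catalog:
        sep = angular_separation(target[0], target[1], ra, dec)
        if sep <= radius_deg:
            matches.append((sep, name))
    return [name for sep, name in sorted(matches)]


# Hypothetical positions (ra, dec in degrees), invented for this sketch.
target = (137.5867, 19.1965)
catalog = [('A', 137.5868, 19.1966),
           ('B', 137.5900, 19.2000),
           ('C', 140.0000, 25.0000)]

nearby = cone_search(target, catalog, radius_deg=0.01)
print(nearby)
```

In this toy catalog two entries fall inside the cone, which illustrates why position alone does not always identify a unique match.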
So the hard part of cross\sphinxhyphen{}matching has been done for us. However, using the results is a little tricky.
But it is also an opportunity to learn about one of the most important tools for working with databases: “joining” tables.
In general, a “join” is an operation where you match up records from one table with records from another table using as a “key” a piece of information that is common to both tables, usually some kind of ID code.
In this example:
\begin{itemize}
\item {}
Stars in the Gaia dataset are identified by \sphinxcode{\sphinxupquote{source\_id}}.
\item {}
Stars in the Pan\sphinxhyphen{}STARRS dataset are identified by \sphinxcode{\sphinxupquote{obj\_id}}.
\end{itemize}
For each candidate star we have selected so far, we have the \sphinxcode{\sphinxupquote{source\_id}}; the goal is to find the \sphinxcode{\sphinxupquote{obj\_id}} for the same star (we hope) in the Pan\sphinxhyphen{}STARRS catalog.
To do that we will:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
Make a table that contains the \sphinxcode{\sphinxupquote{source\_id}} for each candidate star and upload the table to the Gaia server;
\item {}
Use the \sphinxcode{\sphinxupquote{JOIN}} operator to look up each \sphinxcode{\sphinxupquote{source\_id}} in the \sphinxcode{\sphinxupquote{gaiadr2.panstarrs1\_best\_neighbour}} table, which contains the \sphinxcode{\sphinxupquote{obj\_id}} of the best match for each star in the Gaia catalog; then
\item {}
Use the \sphinxcode{\sphinxupquote{JOIN}} operator again to look up each \sphinxcode{\sphinxupquote{obj\_id}} in the \sphinxcode{\sphinxupquote{gaiadr2.panstarrs1\_original\_valid}} table, which contains the Pan\sphinxhyphen{}STARRS photometry data we want.
\end{enumerate}
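Before running these steps on the server, it may help to see the same two joins on toy data. The following sketch uses Pandas \sphinxcode{\sphinxupquote{merge}} in place of ADQL. The three small tables, their values, and the simplified column names \sphinxcode{\sphinxupquote{g\_mag}} and \sphinxcode{\sphinxupquote{i\_mag}} are invented for illustration and do not reflect the actual schema of the Gaia tables.

```python
import pandas as pd

# Toy stand-ins for the three tables; all values are invented.
candidates = pd.DataFrame({'source_id': [1, 2, 3]})

best_neighbour = pd.DataFrame({'source_id': [1, 3, 4],
                               'obj_id': [901, 903, 904]})

photometry = pd.DataFrame({'obj_id': [901, 902, 903],
                           'g_mag': [18.2, 19.5, 17.9],
                           'i_mag': [17.6, 18.8, 17.5]})

# First join: look up each source_id in the best-neighbour table.
step1 = candidates.merge(best_neighbour, on='source_id', how='inner')

# Second join: look up each obj_id in the photometry table.
step2 = step1.merge(photometry, on='obj_id', how='inner')

print(step2)
```

Notice that the candidate with \sphinxcode{\sphinxupquote{source\_id}} 2 has no best neighbor, so an inner join drops it; only candidates that appear in all three tables survive.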
Let's start with the first step, uploading a table.
\section{Preparing a table for uploading}
\label{\detokenize{05_join:preparing-a-table-for-uploading}}
For each candidate star, we want to find the corresponding row in the \sphinxcode{\sphinxupquote{gaiadr2.panstarrs1\_best\_neighbour}} table.
In order to do that, we have to:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
Write the table in a local file as an XML VOTable, which is a format suitable for transmitting a table over a network.
\item {}
Write an ADQL query that refers to the uploaded table.
\item {}
Change the way we submit the job so it uploads the table before running the query.
\end{enumerate}
The first step is not too difficult because Astropy provides a function called \sphinxcode{\sphinxupquote{writeto}} that can write a \sphinxcode{\sphinxupquote{Table}} in \sphinxcode{\sphinxupquote{XML}}.
\sphinxhref{https://docs.astropy.org/en/stable/io/votable/}{The documentation of this process is here}.
First we have to convert our Pandas \sphinxcode{\sphinxupquote{DataFrame}} to an Astropy \sphinxcode{\sphinxupquote{Table}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{from} \PYG{n+nn}{astropy}\PYG{n+nn}{.}\PYG{n+nn}{table} \PYG{k+kn}{import} \PYG{n}{Table}
\PYG{n}{candidate\PYGZus{}table} \PYG{o}{=} \PYG{n}{Table}\PYG{o}{.}\PYG{n}{from\PYGZus{}pandas}\PYG{p}{(}\PYG{n}{candidate\PYGZus{}df}\PYG{p}{)}
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{candidate\PYGZus{}table}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
astropy.table.table.Table
\end{sphinxVerbatim}
To write the file, we can use \sphinxcode{\sphinxupquote{Table.write}} with \sphinxcode{\sphinxupquote{format=\textquotesingle{}votable\textquotesingle{}}}, \sphinxhref{https://docs.astropy.org/en/stable/io/unified.html\#vo-tables}{as described here}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{table} \PYG{o}{=} \PYG{n}{candidate\PYGZus{}table}\PYG{p}{[}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{source\PYGZus{}id}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{p}{]}
\PYG{n}{table}\PYG{o}{.}\PYG{n}{write}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{candidate\PYGZus{}df.xml}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n+nb}{format}\PYG{o}{=}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{votable}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{overwrite}\PYG{o}{=}\PYG{k+kc}{True}\PYG{p}{)}
\end{sphinxVerbatim}
Notice that we select a single column from the table, \sphinxcode{\sphinxupquote{source\_id}}.
We could write the entire table to a file, but that would take longer to transmit over the network, and we really only need one column.
This process, taking a structure like a \sphinxcode{\sphinxupquote{Table}} and translating it into a form that can be transmitted over a network, is called \sphinxhref{https://en.wikipedia.org/wiki/Serialization}{serialization}.
XML is one of the most common serialization formats. One nice feature is that XML data is plain text, as opposed to binary digits, so you can read the file we just wrote:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{o}{!}head candidate\PYGZus{}df.xml
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}?xml version=\PYGZdq{}1.0\PYGZdq{} encoding=\PYGZdq{}utf\PYGZhy{}8\PYGZdq{}?\PYGZgt{}
\PYGZlt{}!\PYGZhy{}\PYGZhy{} Produced with astropy.io.votable version 4.0.1.post1
http://www.astropy.org/ \PYGZhy{}\PYGZhy{}\PYGZgt{}
\PYGZlt{}VOTABLE version=\PYGZdq{}1.4\PYGZdq{} xmlns=\PYGZdq{}http://www.ivoa.net/xml/VOTable/v1.4\PYGZdq{} xmlns:xsi=\PYGZdq{}http://www.w3.org/2001/XMLSchema\PYGZhy{}instance\PYGZdq{} xsi:noNamespaceSchemaLocation=\PYGZdq{}http://www.ivoa.net/xml/VOTable/v1.4\PYGZdq{}\PYGZgt{}
\PYGZlt{}RESOURCE type=\PYGZdq{}results\PYGZdq{}\PYGZgt{}
\PYGZlt{}TABLE\PYGZgt{}
\PYGZlt{}FIELD ID=\PYGZdq{}source\PYGZus{}id\PYGZdq{} datatype=\PYGZdq{}long\PYGZdq{} name=\PYGZdq{}source\PYGZus{}id\PYGZdq{}/\PYGZgt{}
\PYGZlt{}DATA\PYGZgt{}
\PYGZlt{}TABLEDATA\PYGZgt{}
\PYGZlt{}TR\PYGZgt{}
\end{sphinxVerbatim}
XML is a general format, so different XML files contain different kinds of data. In order to read an XML file, it's not enough to know that it's XML; you also have to know the data format, which is called a \sphinxhref{https://en.wikipedia.org/wiki/XML\_schema}{schema}.
In this example, the schema is VOTable; notice that one of the first tags in the file specifies the schema, and even includes the URL where you can get its definition.
So this is an example of a self\sphinxhyphen{}documenting format.
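As a rough illustration of what “self\sphinxhyphen{}documenting” means in practice, the following sketch uses Python's standard \sphinxcode{\sphinxupquote{xml.etree.ElementTree}} module to recover the schema URL, the column description, and the data from a hand\sphinxhyphen{}written fragment shaped like the file above. The fragment and its single row are assumptions made for illustration; for real work, Astropy's VOTable reader is the better tool.

```python
import xml.etree.ElementTree as ET

# A hand-written fragment shaped like the file above; the single row is
# included only for illustration.
votable_text = """<?xml version="1.0" encoding="utf-8"?>
<VOTABLE version="1.4" xmlns="http://www.ivoa.net/xml/VOTable/v1.4">
 <RESOURCE type="results">
  <TABLE>
   <FIELD ID="source_id" datatype="long" name="source_id"/>
   <DATA>
    <TABLEDATA>
     <TR><TD>635559124339440000</TD></TR>
    </TABLEDATA>
   </DATA>
  </TABLE>
 </RESOURCE>
</VOTABLE>
"""

# Parse from bytes so the encoding declaration is honored.
root = ET.fromstring(votable_text.encode('utf-8'))

# ElementTree writes a namespaced tag as '{namespace}localname'.
namespace, _, tag = root.tag[1:].partition('}')
ns = {'v': namespace}

# The FIELD elements describe each column: its name and data type.
fields = [(f.get('name'), f.get('datatype'))
          for f in root.iterfind('.//v:FIELD', ns)]

# The TD elements hold the values themselves.
values = [int(td.text) for td in root.iterfind('.//v:TD', ns)]

print(namespace)
print(fields)
print(values)
```

Everything needed to interpret the values (the schema, the column name, and the data type) comes from the file itself, which is what distinguishes this format from CSV.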
A drawback of XML is that it tends to be big, which is why we wrote just the \sphinxcode{\sphinxupquote{source\_id}} column rather than the whole table.
The size of the file is about 400 KB, so that's not too bad.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{o}{!}ls \PYGZhy{}lh candidate\PYGZus{}df.xml
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZhy{}rw\PYGZhy{}rw\PYGZhy{}r\PYGZhy{}\PYGZhy{} 1 downey downey 396K Oct 19 14:48 candidate\PYGZus{}df.xml
\end{sphinxVerbatim}
If you are using Windows, \sphinxcode{\sphinxupquote{ls}} might not work; in that case, try:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
!dir candidate\PYGZus{}df.xml
\end{sphinxVerbatim}
\sphinxstylestrong{Exercise:} There's a gotcha here we want to warn you about. Why do you think we used double brackets to specify the column we wanted? What happens if you use single brackets?
Run these cells to find out.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{table} \PYG{o}{=} \PYG{n}{candidate\PYGZus{}table}\PYG{p}{[}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{source\PYGZus{}id}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{p}{]}
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{table}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
astropy.table.table.Table
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{column} \PYG{o}{=} \PYG{n}{candidate\PYGZus{}table}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{source\PYGZus{}id}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n+nb}{type}\PYG{p}{(}\PYG{n}{column}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
astropy.table.column.Column
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} writeto(column, \PYGZsq{}candidate\PYGZus{}df.xml\PYGZsq{})}
\end{sphinxVerbatim}
\section{Uploading a table}
\label{\detokenize{05_join:uploading-a-table}}
The next step is to upload this table to the Gaia server and use it as part of a query.
\sphinxhref{https://astroquery.readthedocs.io/en/latest/gaia/gaia.html\#synchronous-query-on-an-on-the-fly-uploaded-table}{Here's the documentation that explains how to run a query with an uploaded table}.
In the spirit of incremental development and testing, let's start with the simplest possible query.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{query} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}\PYG{l+s+s2}{SELECT *}
\PYG{l+s+s2}{FROM tap\PYGZus{}upload.candidate\PYGZus{}df}
\PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
This query downloads all rows and all columns from the uploaded table. The name of the table has two parts: \sphinxcode{\sphinxupquote{tap\_upload}} specifies a table that was uploaded using TAP+ (remember that's the name of the protocol we're using to talk to the Gaia server).
And \sphinxcode{\sphinxupquote{candidate\_df}} is the name of the table, which we get to choose (unlike \sphinxcode{\sphinxupquote{tap\_upload}}, which we didn't get to choose).
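Because the table name appears in two places (in the query text and in the call that uploads the file), one way to keep them in sync is to build the query from a single variable. The following is a minimal sketch; the helper \sphinxcode{\sphinxupquote{upload\_query}} is hypothetical and not part of Astroquery, and the check on the name is a simplification based on Python's identifier rules rather than ADQL's.

```python
def upload_query(table_name, columns='*'):
    """Build an ADQL query that reads from an uploaded table.

    table_name: the name we choose for the uploaded table
    columns: string listing the columns to select
    """
    # Simplified sanity check (Python identifier rules, not ADQL rules).
    if not table_name.isidentifier():
        raise ValueError(f'not a simple table name: {table_name!r}')
    return f'SELECT {columns}\nFROM tap_upload.{table_name}\n'


# The same variable can later be passed as upload_table_name.
upload_table_name = 'candidate_df'
query = upload_query(upload_table_name)
print(query)
```

The resulting string is the same query shown above, so it can be passed to the server in the same way.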
Here's how we run the query:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{from} \PYG{n+nn}{astroquery}\PYG{n+nn}{.}\PYG{n+nn}{gaia} \PYG{k+kn}{import} \PYG{n}{Gaia}
\PYG{n}{job} \PYG{o}{=} \PYG{n}{Gaia}\PYG{o}{.}\PYG{n}{launch\PYGZus{}job\PYGZus{}async}\PYG{p}{(}\PYG{n}{query}\PYG{o}{=}\PYG{n}{query}\PYG{p}{,}
\PYG{n}{upload\PYGZus{}resource}\PYG{o}{=}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{candidate\PYGZus{}df.xml}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,}
\PYG{n}{upload\PYGZus{}table\PYGZus{}name}\PYG{o}{=}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{candidate\PYGZus{}df}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
Created TAP+ (v1.2.1) \PYGZhy{} Connection:
Host: gea.esac.esa.int
Use HTTPS: True
Port: 443
SSL Port: 443
Created TAP+ (v1.2.1) \PYGZhy{} Connection:
Host: geadata.esac.esa.int
Use HTTPS: True
Port: 443
SSL Port: 443
INFO: Query finished. [astroquery.utils.tap.core]
\end{sphinxVerbatim}
\sphinxcode{\sphinxupquote{upload\_resource}} specifies the name of the file we want to upload, which is the file we just wrote.
\sphinxcode{\sphinxupquote{upload\_table\_name}} is the name we assign to this table, which is the name we used in the query.
And here are the results:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{results} \PYG{o}{=} \PYG{n}{job}\PYG{o}{.}\PYG{n}{get\PYGZus{}results}\PYG{p}{(}\PYG{p}{)}
\PYG{n}{results}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}Table length=7346\PYGZgt{}
source\PYGZus{}id
int64
\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}
635559124339440000
635860218726658176
635674126383965568
635535454774983040
635497276810313600
635614168640132864
635821843194387840
635551706931167104
635518889086133376
635580294233854464
...
612282738058264960
612485911486166656
612386332668697600
612296172717818624
612250375480101760
612394926899159168
612288854091187712
612428870024913152
612256418500423168
612429144902815104
\end{sphinxVerbatim}
If things go according to plan, the result should contain the same rows and columns as the uploaded table.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n+nb}{len}\PYG{p}{(}\PYG{n}{candidate\PYGZus{}table}\PYG{p}{)}\PYG{p}{,} \PYG{n+nb}{len}\PYG{p}{(}\PYG{n}{results}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
(7346, 7346)
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n+nb}{set}\PYG{p}{(}\PYG{n}{candidate\PYGZus{}table}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{source\PYGZus{}id}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{p}{)} \PYG{o}{==} \PYG{n+nb}{set}\PYG{p}{(}\PYG{n}{results}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{source\PYGZus{}id}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
True
\end{sphinxVerbatim}
In this example, we uploaded a table and then downloaded it again, so that's not too useful.
But now that we can upload a table, we can join it with other tables on the Gaia server.
\section{Joining with an uploaded table}
\label{\detokenize{05_join:joining-with-an-uploaded-table}}
Here's the first example of a query that contains a \sphinxcode{\sphinxupquote{JOIN}} clause.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{query1} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}\PYG{l+s+s2}{SELECT *}
\PYG{l+s+s2}{FROM gaiadr2.panstarrs1\PYGZus{}best\PYGZus{}neighbour as best}
\PYG{l+s+s2}{JOIN tap\PYGZus{}upload.candidate\PYGZus{}df as candidate\PYGZus{}df}
\PYG{l+s+s2}{ON best.source\PYGZus{}id = candidate\PYGZus{}df.source\PYGZus{}id}
\PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
Let's break that down one clause at a time:
\begin{itemize}
\item {}
\sphinxcode{\sphinxupquote{SELECT *}} means we will download all columns from both tables.
\item {}
\sphinxcode{\sphinxupquote{FROM gaiadr2.panstarrs1\_best\_neighbour as best}} means that we'll get the columns from the Pan\sphinxhyphen{}STARRS best neighbor table, which we'll refer to using the short name \sphinxcode{\sphinxupquote{best}}.
\item {}
\sphinxcode{\sphinxupquote{JOIN tap\_upload.candidate\_df as candidate\_df}} means that we'll also get columns from the uploaded table, which we'll refer to using the short name \sphinxcode{\sphinxupquote{candidate\_df}}.
\item {}
\sphinxcode{\sphinxupquote{ON best.source\_id = candidate\_df.source\_id}} specifies that we will use \sphinxcode{\sphinxupquote{source\_id}} to match up the rows from the two tables.
\end{itemize}
Here's the \sphinxhref{https://gea.esac.esa.int/archive/documentation/GDR2/Gaia\_archive/chap\_datamodel/sec\_dm\_crossmatches/ssec\_dm\_panstarrs1\_best\_neighbour.html}{documentation of the best neighbor table}.
Let's run the query:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{job1} \PYG{o}{=} \PYG{n}{Gaia}\PYG{o}{.}\PYG{n}{launch\PYGZus{}job\PYGZus{}async}\PYG{p}{(}\PYG{n}{query}\PYG{o}{=}\PYG{n}{query1}\PYG{p}{,}
\PYG{n}{upload\PYGZus{}resource}\PYG{o}{=}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{candidate\PYGZus{}df.xml}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,}
\PYG{n}{upload\PYGZus{}table\PYGZus{}name}\PYG{o}{=}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{candidate\PYGZus{}df}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
INFO: Query finished. [astroquery.utils.tap.core]
\end{sphinxVerbatim}
And get the results.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{results1} \PYG{o}{=} \PYG{n}{job1}\PYG{o}{.}\PYG{n}{get\PYGZus{}results}\PYG{p}{(}\PYG{p}{)}
\PYG{n}{results1}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}Table length=3724\PYGZgt{}
source\PYGZus{}id original\PYGZus{}ext\PYGZus{}source\PYGZus{}id ... source\PYGZus{}id\PYGZus{}2
...
int64 int64 ... int64
\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} ... \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}
635860218726658176 130911385187671349 ... 635860218726658176
635674126383965568 130831388428488720 ... 635674126383965568
635535454774983040 130631378377657369 ... 635535454774983040
635497276810313600 130811380445631930 ... 635497276810313600
635614168640132864 130571395922140135 ... 635614168640132864
635598607974369792 130341392091279513 ... 635598607974369792
635737661835496576 131001399333502136 ... 635737661835496576
635850945892748672 132011398654934147 ... 635850945892748672
635600532119713664 130421392285893623 ... 635600532119713664
... ... ... ...
612241781249124608 129751343755995561 ... 612241781249124608
612332147361443072 130141341458538777 ... 612332147361443072
612426744016802432 130521346852465656 ... 612426744016802432
612331739340341760 130111341217793839 ... 612331739340341760
612282738058264960 129741340445933519 ... 612282738058264960
612386332668697600 130351354570219774 ... 612386332668697600
612296172717818624 129691338006168780 ... 612296172717818624
612250375480101760 129741346475897464 ... 612250375480101760
612394926899159168 130581355199751795 ... 612394926899159168
612256418500423168 129931349075297310 ... 612256418500423168
\end{sphinxVerbatim}
This table contains all of the columns from the best neighbor table, plus the single column from the uploaded table.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{results1}\PYG{o}{.}\PYG{n}{colnames}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
[\PYGZsq{}source\PYGZus{}id\PYGZsq{},
\PYGZsq{}original\PYGZus{}ext\PYGZus{}source\PYGZus{}id\PYGZsq{},
\PYGZsq{}angular\PYGZus{}distance\PYGZsq{},
\PYGZsq{}number\PYGZus{}of\PYGZus{}neighbours\PYGZsq{},
\PYGZsq{}number\PYGZus{}of\PYGZus{}mates\PYGZsq{},
\PYGZsq{}best\PYGZus{}neighbour\PYGZus{}multiplicity\PYGZsq{},
\PYGZsq{}gaia\PYGZus{}astrometric\PYGZus{}params\PYGZsq{},
\PYGZsq{}source\PYGZus{}id\PYGZus{}2\PYGZsq{}]
\end{sphinxVerbatim}
Because one of the column names appears in both tables, the second instance of \sphinxcode{\sphinxupquote{source\_id}} has been appended with the suffix \sphinxcode{\sphinxupquote{\_2}}.
The results table has 3724 rows, about half of the 7346 candidates we uploaded, which means we were not able to find matches for all of the candidate stars.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n+nb}{len}\PYG{p}{(}\PYG{n}{results1}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
3724
\end{sphinxVerbatim}
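The drop in row count is the expected behavior of a plain \sphinxcode{\sphinxupquote{JOIN}}, which keeps only the rows that have a match in both tables. The following is a minimal sketch of that behavior using Python's built\sphinxhyphen{}in \sphinxcode{\sphinxupquote{sqlite3}} module rather than the Gaia server; the table names and the tiny made\sphinxhyphen{}up IDs are assumptions chosen for illustration, not Gaia data.

```python
import sqlite3

# In-memory toy database; table names and values are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE candidates (source_id INTEGER)")
conn.execute("CREATE TABLE best (source_id INTEGER, original_ext_source_id INTEGER)")

# Four candidates, but only two of them (2 and 4) appear in the match table.
conn.executemany("INSERT INTO candidates VALUES (?)", [(1,), (2,), (3,), (4,)])
conn.executemany("INSERT INTO best VALUES (?, ?)", [(2, 200), (4, 400), (5, 500)])

# A plain JOIN is an inner join: rows without a partner are dropped.
rows = conn.execute("""
    SELECT c.source_id, b.original_ext_source_id
    FROM best AS b
    JOIN candidates AS c
    ON b.source_id = c.source_id
    ORDER BY c.source_id
""").fetchall()

print(len(rows), "of 4 candidates matched:", rows)
```

Candidates 1 and 3 have no entry in the toy match table, so they do not appear in the result, just as candidates without a Pan\sphinxhyphen{}STARRS best neighbor do not appear in \sphinxcode{\sphinxupquote{results1}}.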
To get more information about the matching process, we can inspect \sphinxcode{\sphinxupquote{best\_neighbour\_multiplicity}}, which indicates for each star in Gaia how many stars in Pan\sphinxhyphen{}STARRS are equally likely matches.
For this kind of data exploration, we'll convert a column from the table to a Pandas \sphinxcode{\sphinxupquote{Series}} so we can use \sphinxcode{\sphinxupquote{value\_counts}}, which counts the number of times each value appears in a \sphinxcode{\sphinxupquote{Series}}, like a histogram.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{pandas} \PYG{k}{as} \PYG{n+nn}{pd}
\PYG{n}{nn} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{Series}\PYG{p}{(}\PYG{n}{results1}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{best\PYGZus{}neighbour\PYGZus{}multiplicity}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{p}{)}
\PYG{n}{nn}\PYG{o}{.}\PYG{n}{value\PYGZus{}counts}\PYG{p}{(}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
1 3724
dtype: int64
\end{sphinxVerbatim}
The result shows that \sphinxcode{\sphinxupquote{1}} is the only value in the \sphinxcode{\sphinxupquote{Series}}, appearing 3724 times.
That means that in every case where a match was found, the matching algorithm identified a single neighbor as the most likely match.
Similarly, \sphinxcode{\sphinxupquote{number\_of\_mates}} indicates the number of other stars in Gaia that match with the same star in Pan\sphinxhyphen{}STARRS.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{nm} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{Series}\PYG{p}{(}\PYG{n}{results1}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{number\PYGZus{}of\PYGZus{}mates}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{p}{)}
\PYG{n}{nm}\PYG{o}{.}\PYG{n}{value\PYGZus{}counts}\PYG{p}{(}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
0 3724
dtype: int64
\end{sphinxVerbatim}
For this set of candidates, every Pan\sphinxhyphen{}STARRS star we've matched is matched with only a single star in the Gaia catalog.
\sphinxstylestrong{Detail} The table also contains \sphinxcode{\sphinxupquote{number\_of\_neighbours}}, which is the number of stars in Pan\sphinxhyphen{}STARRS that match in terms of position, before using other criteria to choose the most likely match.
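If \sphinxcode{\sphinxupquote{value\_counts}} is unfamiliar, the following sketch shows the same idea with \sphinxcode{\sphinxupquote{collections.Counter}} from the standard library. It is an analogue, not the Pandas implementation, and the list of values is made up so that more than one value appears.

```python
from collections import Counter

# Made-up multiplicity values; the real column contains only 1s.
multiplicity = [1, 1, 2, 1, 3, 1, 2]

# Count how many times each distinct value appears.
counts = Counter(multiplicity)

# most_common() lists (value, count) pairs from most to least frequent,
# which is also the default ordering of Series.value_counts.
for value, n in counts.most_common():
    print(value, n)
```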
\section{Getting the photometry data}
\label{\detokenize{05_join:getting-the-photometry-data}}
The most important column in \sphinxcode{\sphinxupquote{results1}} is \sphinxcode{\sphinxupquote{original\_ext\_source\_id}} which is the \sphinxcode{\sphinxupquote{obj\_id}} we will use to look up the likely matches in Pan\sphinxhyphen{}STARRS to get photometry data.
The process is similar to what we just did to look up the matches. We will:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
Make a table that contains \sphinxcode{\sphinxupquote{source\_id}} and \sphinxcode{\sphinxupquote{original\_ext\_source\_id}}.
\item {}
Write the table to an XML VOTable file.
\item {}
Write a query that joins the uploaded table with \sphinxcode{\sphinxupquote{gaiadr2.panstarrs1\_original\_valid}} and selects the photometry data we want.
\item {}
Run the query using the uploaded table.
\end{enumerate}
Since we've done everything here before, we'll do these steps as an exercise.
\sphinxstylestrong{Exercise:} Select \sphinxcode{\sphinxupquote{source\_id}} and \sphinxcode{\sphinxupquote{original\_ext\_source\_id}} from \sphinxcode{\sphinxupquote{results1}} and write the resulting table as a file named \sphinxcode{\sphinxupquote{external.xml}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} Solution}
\PYG{n}{table} \PYG{o}{=} \PYG{n}{results1}\PYG{p}{[}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{source\PYGZus{}id}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{original\PYGZus{}ext\PYGZus{}source\PYGZus{}id}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{p}{]}
\PYG{n}{table}\PYG{o}{.}\PYG{n}{write}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{external.xml}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n+nb}{format}\PYG{o}{=}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{votable}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{overwrite}\PYG{o}{=}\PYG{k+kc}{True}\PYG{p}{)}
\end{sphinxVerbatim}
Use \sphinxcode{\sphinxupquote{!head}} to confirm that the file exists and contains an XML VOTable.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{o}{!}head external.xml
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}?xml version=\PYGZdq{}1.0\PYGZdq{} encoding=\PYGZdq{}utf\PYGZhy{}8\PYGZdq{}?\PYGZgt{}
\PYGZlt{}!\PYGZhy{}\PYGZhy{} Produced with astropy.io.votable version 4.0.1.post1
http://www.astropy.org/ \PYGZhy{}\PYGZhy{}\PYGZgt{}
\PYGZlt{}VOTABLE version=\PYGZdq{}1.4\PYGZdq{} xmlns=\PYGZdq{}http://www.ivoa.net/xml/VOTable/v1.4\PYGZdq{} xmlns:xsi=\PYGZdq{}http://www.w3.org/2001/XMLSchema\PYGZhy{}instance\PYGZdq{} xsi:noNamespaceSchemaLocation=\PYGZdq{}http://www.ivoa.net/xml/VOTable/v1.4\PYGZdq{}\PYGZgt{}
\PYGZlt{}RESOURCE type=\PYGZdq{}results\PYGZdq{}\PYGZgt{}
\PYGZlt{}TABLE\PYGZgt{}
\PYGZlt{}FIELD ID=\PYGZdq{}source\PYGZus{}id\PYGZdq{} datatype=\PYGZdq{}long\PYGZdq{} name=\PYGZdq{}source\PYGZus{}id\PYGZdq{} ucd=\PYGZdq{}meta.id;meta.main\PYGZdq{}\PYGZgt{}
\PYGZlt{}DESCRIPTION\PYGZgt{}
Unique Gaia source identifier
\PYGZlt{}/DESCRIPTION\PYGZgt{}
\end{sphinxVerbatim}
\sphinxstylestrong{Exercise:} Read \sphinxhref{https://gea.esac.esa.int/archive/documentation/GDR2/Gaia\_archive/chap\_datamodel/sec\_dm\_external\_catalogues/ssec\_dm\_panstarrs1\_original\_valid.html}{the documentation of the Pan\sphinxhyphen{}STARRS table} and make note of \sphinxcode{\sphinxupquote{obj\_id}}, which contains the object IDs we'll use to find the rows we want.
Write a query that uses each value of \sphinxcode{\sphinxupquote{original\_ext\_source\_id}} from the uploaded table to find a row in \sphinxcode{\sphinxupquote{gaiadr2.panstarrs1\_original\_valid}} with the same value in \sphinxcode{\sphinxupquote{obj\_id}}, and select all columns from both tables.
Suggestion: Develop and test your query incrementally. For example:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
Write a query that downloads all columns from the uploaded table. Test to make sure we can read the uploaded table.
\item {}
Write a query that downloads the first 10 rows from \sphinxcode{\sphinxupquote{gaiadr2.panstarrs1\_original\_valid}}. Test to make sure we can access Pan\sphinxhyphen{}STARRS data.
\item {}
Write a query that joins the two tables and selects all columns. Test that the join works as expected.
\end{enumerate}
As a bonus exercise, write a query that joins the two tables and selects just the columns we need:
\begin{itemize}
\item {}
\sphinxcode{\sphinxupquote{source\_id}} from the uploaded table
\item {}
\sphinxcode{\sphinxupquote{g\_mean\_psf\_mag}} from \sphinxcode{\sphinxupquote{gaiadr2.panstarrs1\_original\_valid}}
\item {}
\sphinxcode{\sphinxupquote{i\_mean\_psf\_mag}} from \sphinxcode{\sphinxupquote{gaiadr2.panstarrs1\_original\_valid}}
\end{itemize}
Hint: When you select a column from a join, you have to specify which table the column is in.
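To see why the hint matters, here is a small sketch using Python's built\sphinxhyphen{}in \sphinxcode{\sphinxupquote{sqlite3}} module as a stand\sphinxhyphen{}in for the Gaia server. The two toy tables are assumptions that mimic the earlier join, where both tables have a column named \sphinxcode{\sphinxupquote{source\_id}}; the error message shown comes from SQLite and may differ from what a TAP service reports.

```python
import sqlite3

# Toy tables that share a column name, as in the earlier join.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE best (source_id INTEGER, original_ext_source_id INTEGER)")
conn.execute("CREATE TABLE candidate_df (source_id INTEGER)")
conn.execute("INSERT INTO best VALUES (1, 100)")
conn.execute("INSERT INTO candidate_df VALUES (1)")

join = "FROM best JOIN candidate_df ON best.source_id = candidate_df.source_id"

# Without a table prefix, the database cannot tell which source_id we mean.
try:
    conn.execute("SELECT source_id " + join)
    ambiguous = False
except sqlite3.OperationalError as err:
    ambiguous = True
    print("unqualified name rejected:", err)

# Prefixing the column with its table name resolves the ambiguity.
qualified = conn.execute("SELECT candidate_df.source_id " + join).fetchall()
print(qualified)
```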
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} Solution}
\PYG{n}{query2} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}\PYG{l+s+s2}{SELECT *}
\PYG{l+s+s2}{FROM tap\PYGZus{}upload.external as external}
\PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} Solution}
\PYG{n}{query2} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}\PYG{l+s+s2}{SELECT TOP 10 *}
\PYG{l+s+s2}{FROM gaiadr2.panstarrs1\PYGZus{}original\PYGZus{}valid}
\PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} Solution}
\PYG{n}{query2} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}\PYG{l+s+s2}{SELECT *}
\PYG{l+s+s2}{FROM gaiadr2.panstarrs1\PYGZus{}original\PYGZus{}valid as ps}
\PYG{l+s+s2}{JOIN tap\PYGZus{}upload.external as external}
\PYG{l+s+s2}{ON ps.obj\PYGZus{}id = external.original\PYGZus{}ext\PYGZus{}source\PYGZus{}id}
\PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} Solution}
\PYG{n}{query2} \PYG{o}{=} \PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}\PYG{l+s+s2}{SELECT}
\PYG{l+s+s2}{external.source\PYGZus{}id, ps.g\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag, ps.i\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag}
\PYG{l+s+s2}{FROM gaiadr2.panstarrs1\PYGZus{}original\PYGZus{}valid as ps}
\PYG{l+s+s2}{JOIN tap\PYGZus{}upload.external as external}
\PYG{l+s+s2}{ON ps.obj\PYGZus{}id = external.original\PYGZus{}ext\PYGZus{}source\PYGZus{}id}
\PYG{l+s+s2}{\PYGZdq{}\PYGZdq{}\PYGZdq{}}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{query2}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
SELECT
external.source\PYGZus{}id, ps.g\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag, ps.i\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag
FROM gaiadr2.panstarrs1\PYGZus{}original\PYGZus{}valid as ps
JOIN tap\PYGZus{}upload.external as external
ON ps.obj\PYGZus{}id = external.original\PYGZus{}ext\PYGZus{}source\PYGZus{}id
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{job2} \PYG{o}{=} \PYG{n}{Gaia}\PYG{o}{.}\PYG{n}{launch\PYGZus{}job\PYGZus{}async}\PYG{p}{(}\PYG{n}{query}\PYG{o}{=}\PYG{n}{query2}\PYG{p}{,}
\PYG{n}{upload\PYGZus{}resource}\PYG{o}{=}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{external.xml}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,}
\PYG{n}{upload\PYGZus{}table\PYGZus{}name}\PYG{o}{=}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{external}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
INFO: Query finished. [astroquery.utils.tap.core]
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{results2} \PYG{o}{=} \PYG{n}{job2}\PYG{o}{.}\PYG{n}{get\PYGZus{}results}\PYG{p}{(}\PYG{p}{)}
\PYG{n}{results2}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZlt{}Table length=3724\PYGZgt{}
source\PYGZus{}id g\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag i\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag
mag
int64 float64 float64
\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{} \PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}\PYGZhy{}
635860218726658176 17.8978004455566 17.5174007415771
635674126383965568 19.2873001098633 17.6781005859375
635535454774983040 16.9237995147705 16.478099822998
635497276810313600 19.9242000579834 18.3339996337891
635614168640132864 16.1515998840332 14.6662998199463
635598607974369792 16.5223999023438 16.1375007629395
635737661835496576 14.5032997131348 13.9849004745483
635850945892748672 16.5174999237061 16.0450000762939
635600532119713664 20.4505996704102 19.5177001953125
... ... ...
612241781249124608 20.2343997955322 18.6518001556396
612332147361443072 21.3848991394043 20.3076000213623
612426744016802432 17.8281002044678 17.4281005859375
612331739340341760 21.8656997680664 19.5223007202148
612282738058264960 22.5151996612549 19.9743995666504
612386332668697600 19.3792991638184 17.9923000335693
612296172717818624 17.4944000244141 16.926700592041
612250375480101760 15.3330001831055 14.6280002593994
612394926899159168 16.4414005279541 15.8212003707886
612256418500423168 20.8715991973877 19.9612007141113
\end{sphinxVerbatim}
\sphinxstylestrong{Challenge exercise}
Do both joins in one query.
There's an \sphinxhref{https://github.com/smoh/Getting-started-with-Gaia/blob/master/gaia-adql-snippets.md}{example here} you could start with.
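As one possible starting point, the two queries in this notebook can be chained by joining the uploaded candidates to the best neighbor table and then joining that result to the Pan\sphinxhyphen{}STARRS table. The sketch below only builds the query string; it has not been run against the Gaia server, and the variable name \sphinxcode{\sphinxupquote{query3}} is hypothetical. The table and column names are the ones used earlier in this notebook.

```python
# Hypothetical variable name; this query is a sketch that has not been
# run against the Gaia server.
query3 = """SELECT
candidate_df.source_id, ps.g_mean_psf_mag, ps.i_mean_psf_mag
FROM tap_upload.candidate_df AS candidate_df
JOIN gaiadr2.panstarrs1_best_neighbour AS best
  ON candidate_df.source_id = best.source_id
JOIN gaiadr2.panstarrs1_original_valid AS ps
  ON best.original_ext_source_id = ps.obj_id
"""

print(query3)
```

Running it would presumably follow the same pattern as before: pass the query to \sphinxcode{\sphinxupquote{Gaia.launch\_job\_async}} along with \sphinxcode{\sphinxupquote{upload\_resource}} and \sphinxcode{\sphinxupquote{upload\_table\_name}} for the candidate table.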
\section{Write the data}
\label{\detokenize{05_join:write-the-data}}
Since we have the data in an Astropy \sphinxcode{\sphinxupquote{Table}}, let's store it in a FITS file.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}photo.fits}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{results2}\PYG{o}{.}\PYG{n}{write}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{n}{overwrite}\PYG{o}{=}\PYG{k+kc}{True}\PYG{p}{)}
\end{sphinxVerbatim}
We can check that the file exists, and see how big it is.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{o}{!}ls \PYGZhy{}lh gd1\PYGZus{}photo.fits
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZhy{}rw\PYGZhy{}rw\PYGZhy{}r\PYGZhy{}\PYGZhy{} 1 downey downey 96K Oct 19 14:49 gd1\PYGZus{}photo.fits
\end{sphinxVerbatim}
At around 96 KB, it is smaller than some of the other files we've been working with.
If you are using Windows, \sphinxcode{\sphinxupquote{ls}} might not work; in that case, try:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
!dir gd1\PYGZus{}photo.fits
\end{sphinxVerbatim}
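A portable alternative that does not depend on the shell is to ask Python for the file size. The following sketch uses only the standard library; the helper name \sphinxcode{\sphinxupquote{size\_in\_kb}} is hypothetical, and the demonstration uses a throwaway file so it runs even if \sphinxcode{\sphinxupquote{gd1\_photo.fits}} is not present.

```python
import os
import tempfile

def size_in_kb(path):
    """Return the size of the file at `path` in kilobytes (1 KB = 1024 bytes)."""
    return os.path.getsize(path) / 1024

# Demonstrate on a throwaway 2048-byte file so the sketch runs anywhere;
# in the notebook you would call size_in_kb('gd1_photo.fits') instead.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 2048)
    demo_path = f.name

demo_kb = size_in_kb(demo_path)
print(demo_kb)

# Clean up the throwaway file.
os.remove(demo_path)
```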
\section{Summary}
\label{\detokenize{05_join:summary}}
In this notebook, we used database \sphinxcode{\sphinxupquote{JOIN}} operations to select photometry data for the stars we've identified as candidates to be in GD\sphinxhyphen{}1.
In the next notebook, we'll use this data for a second round of selection, identifying stars that have photometry data consistent with GD\sphinxhyphen{}1.
\section{Best practice}
\label{\detokenize{05_join:best-practice}}\begin{itemize}
\item {}
Use \sphinxcode{\sphinxupquote{JOIN}} operations to combine data from multiple tables in a database, using some kind of identifier to match up records from one table with records from another.
\item {}
This is another example of a practice we saw in the previous notebook, moving the computation to the data.
\end{itemize}
\chapter{Chapter 6}
\label{\detokenize{06_photo:chapter-6}}\label{\detokenize{06_photo::doc}}
This is the sixth in a series of notebooks related to astronomy data.
As a continuing example, we will replicate part of the analysis in a recent paper, “\sphinxhref{https://arxiv.org/abs/1805.00425}{Off the beaten path: Gaia reveals GD\sphinxhyphen{}1 stars outside of the main stream}” by Adrian M. Price\sphinxhyphen{}Whelan and Ana Bonaca.
In the previous lesson we downloaded photometry data from Pan\sphinxhyphen{}STARRS, which is available from the same server we've been using to get Gaia data.
The next step in the analysis is to select candidate stars based on the photometry data. The following figure from the paper is a color\sphinxhyphen{}magnitude diagram for the stars selected based on proper motion:
In red is a theoretical isochrone, showing where we expect the stars in GD\sphinxhyphen{}1 to fall based on the metallicity and age of their original globular cluster.
By selecting stars in the shaded area, we can further distinguish the main sequence of GD\sphinxhyphen{}1 from younger background stars.
\section{Outline}
\label{\detokenize{06_photo:outline}}
Here are the steps in this notebook:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
We'll reload the data from the previous notebook and make a color\sphinxhyphen{}magnitude diagram.
\item {}
Then we'll specify a polygon in the diagram that contains stars with the photometry we expect.
\item {}
Then we'll merge the photometry data with the list of candidate stars, storing the result in a Pandas \sphinxcode{\sphinxupquote{DataFrame}}.
\end{enumerate}
After completing this lesson, you should be able to
\begin{itemize}
\item {}
Use Matplotlib to specify a \sphinxcode{\sphinxupquote{Polygon}} and determine which points fall inside it.
\item {}
Use Pandas to merge data from multiple \sphinxcode{\sphinxupquote{DataFrames}}, much like a database \sphinxcode{\sphinxupquote{JOIN}} operation.
\end{itemize}
\section{Installing libraries}
\label{\detokenize{06_photo:installing-libraries}}
If you are running this notebook on Colab, you can run the following cell to install Astroquery and the other libraries we'll use.
If you are running this notebook on your own computer, you might have to install these libraries yourself.
If you are using this notebook as part of a Carpentries workshop, you should have received setup instructions.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} If we\PYGZsq{}re running on Colab, install libraries}
\PYG{k+kn}{import} \PYG{n+nn}{sys}
\PYG{n}{IN\PYGZus{}COLAB} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{google.colab}\PYG{l+s+s1}{\PYGZsq{}} \PYG{o+ow}{in} \PYG{n}{sys}\PYG{o}{.}\PYG{n}{modules}
\PYG{k}{if} \PYG{n}{IN\PYGZus{}COLAB}\PYG{p}{:}
\PYG{o}{!}pip install astroquery astro\PYGZhy{}gala pyia python\PYGZhy{}wget
\end{sphinxVerbatim}
\section{Reload the data}
\label{\detokenize{06_photo:reload-the-data}}
The following cell downloads the photometry data we created in the previous notebook.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{os}
\PYG{k+kn}{from} \PYG{n+nn}{wget} \PYG{k+kn}{import} \PYG{n}{download}
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}photo.fits}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{filepath} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{https://github.com/AllenDowney/AstronomicalData/raw/main/data/}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{k}{if} \PYG{o+ow}{not} \PYG{n}{os}\PYG{o}{.}\PYG{n}{path}\PYG{o}{.}\PYG{n}{exists}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{:}
\PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{download}\PYG{p}{(}\PYG{n}{filepath}\PYG{o}{+}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{)}
\end{sphinxVerbatim}
Now we can read the data back into an Astropy \sphinxcode{\sphinxupquote{Table}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{from} \PYG{n+nn}{astropy}\PYG{n+nn}{.}\PYG{n+nn}{table} \PYG{k+kn}{import} \PYG{n}{Table}
\PYG{n}{photo\PYGZus{}table} \PYG{o}{=} \PYG{n}{Table}\PYG{o}{.}\PYG{n}{read}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{)}
\end{sphinxVerbatim}
\section{Plotting photometry data}
\label{\detokenize{06_photo:plotting-photometry-data}}
Now that we have photometry data from Pan\sphinxhyphen{}STARRS, we can replicate the \sphinxhref{https://en.wikipedia.org/wiki/Galaxy\_color\%E2\%80\%93magnitude\_diagram}{color\sphinxhyphen{}magnitude diagram} from the original paper:
The y\sphinxhyphen{}axis shows the apparent magnitude of each source with the \sphinxhref{https://en.wikipedia.org/wiki/Photometric\_system}{g filter}.
The x\sphinxhyphen{}axis shows the difference in apparent magnitude between the g and i filters, which indicates color.
Stars with lower values of (g\sphinxhyphen{}i) are brighter in g\sphinxhyphen{}band than in i\sphinxhyphen{}band, compared to other stars, which means they are bluer.
Stars in the lower\sphinxhyphen{}left quadrant of this diagram are less bright and less metallic than the others, which means they are \sphinxhref{http://spiff.rit.edu/classes/ladder/lectures/ordinary\_stars/ordinary.html}{likely to be older}.
Since we expect the stars in GD\sphinxhyphen{}1 to be older than the background stars, the stars in the lower\sphinxhyphen{}left are more likely to be in GD\sphinxhyphen{}1.
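As a quick worked example of the color index, the following sketch computes (g\sphinxhyphen{}i) for two sources. The magnitudes are the first two rows of the photometry table displayed in the previous notebook; the variable names are chosen here for illustration.

```python
# (g, i) PSF magnitudes for two sources, copied from the first two rows
# of the photometry table shown in the previous notebook.
g1, i1 = 17.8978004455566, 17.5174007415771
g2, i2 = 19.2873001098633, 17.6781005859375

# The color index is the difference between the two magnitudes.
color1 = g1 - i1
color2 = g2 - i2

# A lower value of (g - i) means the source is relatively bluer.
bluer = "first" if color1 < color2 else "second"
print(round(color1, 3), round(color2, 3), bluer)
```

The first source has the smaller (g\sphinxhyphen{}i), so it is the bluer of the two; it also has the smaller g magnitude, so it is the brighter one in the g band.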
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{matplotlib}\PYG{n+nn}{.}\PYG{n+nn}{pyplot} \PYG{k}{as} \PYG{n+nn}{plt}
\PYG{k}{def} \PYG{n+nf}{plot\PYGZus{}cmd}\PYG{p}{(}\PYG{n}{table}\PYG{p}{)}\PYG{p}{:}
\PYG{l+s+sd}{\PYGZdq{}\PYGZdq{}\PYGZdq{}Plot a color magnitude diagram.}
\PYG{l+s+sd}{ }
\PYG{l+s+sd}{ table: Table or DataFrame with photometry data}
\PYG{l+s+sd}{ \PYGZdq{}\PYGZdq{}\PYGZdq{}}
\PYG{n}{y} \PYG{o}{=} \PYG{n}{table}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{g\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{x} \PYG{o}{=} \PYG{n}{table}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{g\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{\PYGZhy{}} \PYG{n}{table}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{i\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{x}\PYG{p}{,} \PYG{n}{y}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlim}\PYG{p}{(}\PYG{p}{[}\PYG{l+m+mi}{0}\PYG{p}{,} \PYG{l+m+mf}{1.5}\PYG{p}{]}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylim}\PYG{p}{(}\PYG{p}{[}\PYG{l+m+mi}{14}\PYG{p}{,} \PYG{l+m+mi}{22}\PYG{p}{]}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{gca}\PYG{p}{(}\PYG{p}{)}\PYG{o}{.}\PYG{n}{invert\PYGZus{}yaxis}\PYG{p}{(}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{\PYGZdl{}g\PYGZus{}0\PYGZdl{}}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{\PYGZdl{}(g\PYGZhy{}i)\PYGZus{}0\PYGZdl{}}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
\sphinxcode{\sphinxupquote{plot\_cmd}} uses a new function, \sphinxcode{\sphinxupquote{invert\_yaxis}}, to invert the \sphinxcode{\sphinxupquote{y}} axis, which is conventional when plotting magnitudes, since lower magnitude indicates higher brightness.
\sphinxcode{\sphinxupquote{invert\_yaxis}} is a little different from the other functions we've used. You can't call it like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{invert\PYGZus{}yaxis}\PYG{p}{(}\PYG{p}{)} \PYG{c+c1}{\PYGZsh{} doesn\PYGZsq{}t work}
\end{sphinxVerbatim}
You have to call it like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{gca}\PYG{p}{(}\PYG{p}{)}\PYG{o}{.}\PYG{n}{invert\PYGZus{}yaxis}\PYG{p}{(}\PYG{p}{)} \PYG{c+c1}{\PYGZsh{} works}
\end{sphinxVerbatim}
\sphinxcode{\sphinxupquote{gca}} stands for “get current axes”. It returns an object that represents the axes of the current figure, and that object provides \sphinxcode{\sphinxupquote{invert\_yaxis}}.
\sphinxstylestrong{In case anyone asks:} The most likely reason for this inconsistency in the interface is that \sphinxcode{\sphinxupquote{invert\_yaxis}} is a lesser\sphinxhyphen{}used function, so it's not made available at the top level of the interface.
Here's what the results look like.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plot\PYGZus{}cmd}\PYG{p}{(}\PYG{n}{photo\PYGZus{}table}\PYG{p}{)}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{06_photo_12_0}.png}
Our figure does not look exactly like the one in the paper because we are working with a smaller region of the sky, so we don't have as many stars. But we can see an overdense region in the lower left that contains stars with the photometry we expect for GD\sphinxhyphen{}1.
The authors of the original paper derive a detailed polygon that defines a boundary between stars that are likely to be in GD\sphinxhyphen{}1 or not.
As a simplification, we'll choose a boundary by eye that seems to contain the overdense region.
\section{Drawing a polygon}
\label{\detokenize{06_photo:drawing-a-polygon}}
Matplotlib provides a function called \sphinxcode{\sphinxupquote{ginput}} that lets us click on the figure and make a list of coordinates.
It's a little tricky to use \sphinxcode{\sphinxupquote{ginput}} in a Jupyter notebook. Before calling \sphinxcode{\sphinxupquote{plt.ginput}} we have to tell Matplotlib to use \sphinxcode{\sphinxupquote{TkAgg}} to draw the figure in a new window.
When you run the following cell, a figure should appear in a new window. Click on it 10 times to draw a polygon around the overdense area. A red cross should appear where you click.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{matplotlib} \PYG{k}{as} \PYG{n+nn}{mpl}
\PYG{k}{if} \PYG{n}{IN\PYGZus{}COLAB}\PYG{p}{:}
    \PYG{n}{coords} \PYG{o}{=} \PYG{k+kc}{None}
\PYG{k}{else}\PYG{p}{:}
    \PYG{n}{mpl}\PYG{o}{.}\PYG{n}{use}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{TkAgg}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
    \PYG{n}{plot\PYGZus{}cmd}\PYG{p}{(}\PYG{n}{photo\PYGZus{}table}\PYG{p}{)}
    \PYG{n}{coords} \PYG{o}{=} \PYG{n}{plt}\PYG{o}{.}\PYG{n}{ginput}\PYG{p}{(}\PYG{l+m+mi}{10}\PYG{p}{)}
    \PYG{n}{mpl}\PYG{o}{.}\PYG{n}{use}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{agg}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
The argument to \sphinxcode{\sphinxupquote{ginput}} is the number of times the user has to click on the figure.
The result from \sphinxcode{\sphinxupquote{ginput}} is a list of coordinate pairs.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{coords}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
[(0.2150537634408602, 17.548197203826344),
(0.3897849462365591, 18.94628403237675),
(0.5376344086021505, 19.902869757174393),
(0.7034050179211468, 20.601913171449596),
(0.8288530465949819, 21.300956585724798),
(0.6630824372759856, 21.52170713760118),
(0.4301075268817204, 20.785871964679913),
(0.27329749103942647, 19.71891096394408),
(0.17473118279569888, 18.688741721854306),
(0.17473118279569888, 17.95290654893304)]
\end{sphinxVerbatim}
If \sphinxcode{\sphinxupquote{ginput}} doesn't work for you, you could use the following coordinates.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k}{if} \PYG{n}{coords} \PYG{o+ow}{is} \PYG{k+kc}{None}\PYG{p}{:}
    \PYG{n}{coords} \PYG{o}{=} \PYG{p}{[}\PYG{p}{(}\PYG{l+m+mf}{0.2}\PYG{p}{,} \PYG{l+m+mf}{17.5}\PYG{p}{)}\PYG{p}{,}
              \PYG{p}{(}\PYG{l+m+mf}{0.2}\PYG{p}{,} \PYG{l+m+mf}{19.5}\PYG{p}{)}\PYG{p}{,}
              \PYG{p}{(}\PYG{l+m+mf}{0.65}\PYG{p}{,} \PYG{l+m+mi}{22}\PYG{p}{)}\PYG{p}{,}
              \PYG{p}{(}\PYG{l+m+mf}{0.75}\PYG{p}{,} \PYG{l+m+mi}{21}\PYG{p}{)}\PYG{p}{,}
              \PYG{p}{(}\PYG{l+m+mf}{0.4}\PYG{p}{,} \PYG{l+m+mi}{19}\PYG{p}{)}\PYG{p}{,}
              \PYG{p}{(}\PYG{l+m+mf}{0.4}\PYG{p}{,} \PYG{l+m+mf}{17.5}\PYG{p}{)}\PYG{p}{]}
\end{sphinxVerbatim}
The next step is to convert the coordinates to a format we can use to plot them, which is a sequence of \sphinxcode{\sphinxupquote{x}} coordinates and a sequence of \sphinxcode{\sphinxupquote{y}} coordinates. The NumPy function \sphinxcode{\sphinxupquote{transpose}} does what we want.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{numpy} \PYG{k}{as} \PYG{n+nn}{np}
\PYG{n}{xs}\PYG{p}{,} \PYG{n}{ys} \PYG{o}{=} \PYG{n}{np}\PYG{o}{.}\PYG{n}{transpose}\PYG{p}{(}\PYG{n}{coords}\PYG{p}{)}
\PYG{n}{xs}\PYG{p}{,} \PYG{n}{ys}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
(array([0.21505376, 0.38978495, 0.53763441, 0.70340502, 0.82885305,
0.66308244, 0.43010753, 0.27329749, 0.17473118, 0.17473118]),
array([17.5481972 , 18.94628403, 19.90286976, 20.60191317, 21.30095659,
21.52170714, 20.78587196, 19.71891096, 18.68874172, 17.95290655]))
\end{sphinxVerbatim}
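To see why \sphinxcode{\sphinxupquote{transpose}} gives us what we need, here is a small sketch with three made\sphinxhyphen{}up vertices. The names \sphinxcode{\sphinxupquote{pairs}}, \sphinxcode{\sphinxupquote{xs\_demo}}, and \sphinxcode{\sphinxupquote{ys\_demo}} are illustrative only.

```python
import numpy as np

# Made-up polygon vertices as (x, y) pairs, like the list ginput returns
pairs = [(0.2, 17.5), (0.4, 19.0), (0.75, 21.0)]

# transpose turns N pairs (shape N x 2) into 2 rows (shape 2 x N),
# which unpack into one array of x values and one array of y values
xs_demo, ys_demo = np.transpose(pairs)

# zip(*pairs) produces the same grouping using only built-in Python,
# but the results are tuples rather than NumPy arrays
xs_zip, ys_zip = zip(*pairs)

print(xs_demo, ys_demo)
```

The NumPy version is convenient here because the results are arrays, which we can pass directly to \sphinxcode{\sphinxupquote{plt.plot}}.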
To display the polygon, we'll draw the figure again and use \sphinxcode{\sphinxupquote{plt.plot}} to draw the polygon on top of it.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plot\PYGZus{}cmd}\PYG{p}{(}\PYG{n}{photo\PYGZus{}table}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{xs}\PYG{p}{,} \PYG{n}{ys}\PYG{p}{)}\PYG{p}{;}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{06_photo_23_0}.png}
If it looks like your polygon does a good job surrounding the overdense area, go on to the next section. Otherwise you can try again.
If you want a polygon with more points (or fewer), you can change the argument to \sphinxcode{\sphinxupquote{ginput}}.
The polygon does not have to be “closed”. When we use this polygon in the next section, the last and first points will be connected by a straight line.
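Because \sphinxcode{\sphinxupquote{plt.plot}} only draws segments between consecutive points, the outline in the figure stays open. If you would like the figure to show the closed shape, one option is to repeat the first vertex at the end before plotting. This is a sketch using the fallback coordinates from above; the names \sphinxcode{\sphinxupquote{xs\_open}}, \sphinxcode{\sphinxupquote{xs\_closed}}, and so on are illustrative.

```python
import numpy as np

# The fallback polygon from this notebook, split into x and y sequences
xs_open = np.array([0.2, 0.2, 0.65, 0.75, 0.4, 0.4])
ys_open = np.array([17.5, 19.5, 22.0, 21.0, 19.0, 17.5])

# Repeat the first vertex at the end so a line plot returns to its start
xs_closed = np.append(xs_open, xs_open[0])
ys_closed = np.append(ys_open, ys_open[0])

print(len(xs_open), len(xs_closed))
```

Passing \sphinxcode{\sphinxupquote{xs\_closed}} and \sphinxcode{\sphinxupquote{ys\_closed}} to \sphinxcode{\sphinxupquote{plt.plot}} would then draw the final edge as well. This only affects the picture, not the selection.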
\section{Which points are in the polygon?}
\label{\detokenize{06_photo:which-points-are-in-the-polygon}}
Matplotlib provides a \sphinxcode{\sphinxupquote{Path}} object that we can use to check which points fall in the polygon we selected.
Here's how we make a \sphinxcode{\sphinxupquote{Path}} using a list of coordinates.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{from} \PYG{n+nn}{matplotlib}\PYG{n+nn}{.}\PYG{n+nn}{path} \PYG{k+kn}{import} \PYG{n}{Path}
\PYG{n}{path} \PYG{o}{=} \PYG{n}{Path}\PYG{p}{(}\PYG{n}{coords}\PYG{p}{)}
\PYG{n}{path}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
Path(array([[ 0.21505376, 17.5481972 ],
[ 0.38978495, 18.94628403],
[ 0.53763441, 19.90286976],
[ 0.70340502, 20.60191317],
[ 0.82885305, 21.30095659],
[ 0.66308244, 21.52170714],
[ 0.43010753, 20.78587196],
[ 0.27329749, 19.71891096],
[ 0.17473118, 18.68874172],
[ 0.17473118, 17.95290655]]), None)
\end{sphinxVerbatim}
\sphinxcode{\sphinxupquote{Path}} provides \sphinxcode{\sphinxupquote{contains\_points}}, which figures out which points are inside the polygon.
To test it, we'll create a list with two points, one inside the polygon and one outside.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{points} \PYG{o}{=} \PYG{p}{[}\PYG{p}{(}\PYG{l+m+mf}{0.4}\PYG{p}{,} \PYG{l+m+mi}{20}\PYG{p}{)}\PYG{p}{,}
          \PYG{p}{(}\PYG{l+m+mf}{0.4}\PYG{p}{,} \PYG{l+m+mi}{30}\PYG{p}{)}\PYG{p}{]}
\end{sphinxVerbatim}
Now we can make sure \sphinxcode{\sphinxupquote{contains\_points}} does what we expect.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{inside} \PYG{o}{=} \PYG{n}{path}\PYG{o}{.}\PYG{n}{contains\PYGZus{}points}\PYG{p}{(}\PYG{n}{points}\PYG{p}{)}
\PYG{n}{inside}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
array([ True, False])
\end{sphinxVerbatim}
The result is an array of Boolean values.
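If you want to check your understanding of \sphinxcode{\sphinxupquote{contains\_points}} on a shape where the answer is obvious, the following sketch uses a made\sphinxhyphen{}up unit square. The names \sphinxcode{\sphinxupquote{square}} and \sphinxcode{\sphinxupquote{test\_points}} are illustrative.

```python
from matplotlib.path import Path

# A made-up unit square; the first vertex is not repeated at the end,
# because Path treats the polygon as closed when testing containment
square = Path([(0, 0), (1, 0), (1, 1), (0, 1)])

test_points = [(0.5, 0.5),    # center of the square
               (0.5, 1.5),    # above the square
               (-0.2, 0.5)]   # left of the square

# One Boolean per point, in the same order as test_points
inside_square = square.contains_points(test_points)
print(inside_square)
```

Only the first point falls inside the square, so the result contains one \sphinxcode{\sphinxupquote{True}} followed by two \sphinxcode{\sphinxupquote{False}} values.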
We are almost ready to select stars whose photometry data falls in this polygon. But first we need to do some data cleaning.
\section{Reloading the data}
\label{\detokenize{06_photo:reloading-the-data}}
Now we need to combine the photometry data with the list of candidate stars we identified in a previous notebook. The following cell downloads it:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{os}
\PYG{k+kn}{from} \PYG{n+nn}{wget} \PYG{k+kn}{import} \PYG{n}{download}
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}candidates.hdf5}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{filepath} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{https://github.com/AllenDowney/AstronomicalData/raw/main/data/}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{k}{if} \PYG{o+ow}{not} \PYG{n}{os}\PYG{o}{.}\PYG{n}{path}\PYG{o}{.}\PYG{n}{exists}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{download}\PYG{p}{(}\PYG{n}{filepath}\PYG{o}{+}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{pandas} \PYG{k}{as} \PYG{n+nn}{pd}
\PYG{n}{candidate\PYGZus{}df} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{read\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{candidate\PYGZus{}df}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
\sphinxcode{\sphinxupquote{candidate\_df}} is the Pandas DataFrame that contains the results from Notebook XX, which selects stars likely to be in GD\sphinxhyphen{}1 based on proper motion. It also includes position and proper motion transformed to the ICRS frame.
\section{Merging photometry data}
\label{\detokenize{06_photo:merging-photometry-data}}
Before we select stars based on photometry data, we have to solve two problems:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
We only have Pan\sphinxhyphen{}STARRS data for some stars in \sphinxcode{\sphinxupquote{candidate\_df}}.
\item {}
Even for the stars where we have Pan\sphinxhyphen{}STARRS data in \sphinxcode{\sphinxupquote{photo\_table}}, some photometry data is missing.
\end{enumerate}
We will solve these problems in two steps:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
We'll merge the data from \sphinxcode{\sphinxupquote{candidate\_df}} and \sphinxcode{\sphinxupquote{photo\_table}} into a single Pandas \sphinxcode{\sphinxupquote{DataFrame}}.
\item {}
We'll use Pandas functions to deal with missing data.
\end{enumerate}
\sphinxcode{\sphinxupquote{candidate\_df}} is already a \sphinxcode{\sphinxupquote{DataFrame}}, but \sphinxcode{\sphinxupquote{photo\_table}} is an Astropy \sphinxcode{\sphinxupquote{Table}}. Let's convert it to Pandas:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{photo\PYGZus{}df} \PYG{o}{=} \PYG{n}{photo\PYGZus{}table}\PYG{o}{.}\PYG{n}{to\PYGZus{}pandas}\PYG{p}{(}\PYG{p}{)}
\PYG{k}{for} \PYG{n}{colname} \PYG{o+ow}{in} \PYG{n}{photo\PYGZus{}df}\PYG{o}{.}\PYG{n}{columns}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{colname}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
source\PYGZus{}id
g\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag
i\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag
\end{sphinxVerbatim}
Now we want to combine \sphinxcode{\sphinxupquote{candidate\_df}} and \sphinxcode{\sphinxupquote{photo\_df}} into a single table, using \sphinxcode{\sphinxupquote{source\_id}} to match up the rows.
You might recognize this task; it's the same as the JOIN operation in ADQL/SQL.
Pandas provides a function called \sphinxcode{\sphinxupquote{merge}} that does what we want. Here's how we use it.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{merged} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{merge}\PYG{p}{(}\PYG{n}{candidate\PYGZus{}df}\PYG{p}{,}
                  \PYG{n}{photo\PYGZus{}df}\PYG{p}{,}
                  \PYG{n}{on}\PYG{o}{=}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{source\PYGZus{}id}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,}
                  \PYG{n}{how}\PYG{o}{=}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{left}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{merged}\PYG{o}{.}\PYG{n}{head}\PYG{p}{(}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
source\PYGZus{}id ra dec pmra pmdec parallax \PYGZbs{}
0 635559124339440000 137.586717 19.196544 \PYGZhy{}3.770522 \PYGZhy{}12.490482 0.791393
1 635860218726658176 138.518707 19.092339 \PYGZhy{}5.941679 \PYGZhy{}11.346409 0.307456
2 635674126383965568 138.842874 19.031798 \PYGZhy{}3.897001 \PYGZhy{}12.702780 0.779463
3 635535454774983040 137.837752 18.864007 \PYGZhy{}4.335041 \PYGZhy{}14.492309 0.314514
4 635497276810313600 138.044516 19.009471 \PYGZhy{}7.172931 \PYGZhy{}12.291499 0.425404
parallax\PYGZus{}error radial\PYGZus{}velocity phi1 phi2 pm\PYGZus{}phi1 pm\PYGZus{}phi2 \PYGZbs{}
0 0.271754 NaN \PYGZhy{}59.630489 \PYGZhy{}1.216485 \PYGZhy{}7.361363 \PYGZhy{}0.592633
1 0.199466 NaN \PYGZhy{}59.247330 \PYGZhy{}2.016078 \PYGZhy{}7.527126 1.748779
2 0.223692 NaN \PYGZhy{}59.133391 \PYGZhy{}2.306901 \PYGZhy{}7.560608 \PYGZhy{}0.741800
3 0.102775 NaN \PYGZhy{}59.785300 \PYGZhy{}1.594569 \PYGZhy{}9.357536 \PYGZhy{}1.218492
4 0.337689 NaN \PYGZhy{}59.557744 \PYGZhy{}1.682147 \PYGZhy{}9.000831 2.334407
g\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag i\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag
0 NaN NaN
1 17.8978 17.517401
2 19.2873 17.678101
3 16.9238 16.478100
4 19.9242 18.334000
\end{sphinxVerbatim}
The first argument is the “left” table, the second argument is the “right” table, and the keyword argument \sphinxcode{\sphinxupquote{on=\textquotesingle{}source\_id\textquotesingle{}}} specifies a column to use to match up the rows.
The argument \sphinxcode{\sphinxupquote{how=\textquotesingle{}left\textquotesingle{}}} means that the result should have all rows from the left table, even if some of them don't match up with a row in the right table.
If you are interested in the other options for \sphinxcode{\sphinxupquote{how}}, you can \sphinxhref{https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html}{read the documentation of \sphinxcode{\sphinxupquote{merge}}}.
You can also do different types of join in ADQL/SQL; \sphinxhref{https://www.w3schools.com/sql/sql\_join.asp}{you can read about that here}.
The result is a \sphinxcode{\sphinxupquote{DataFrame}} that contains the same number of rows as \sphinxcode{\sphinxupquote{candidate\_df}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n+nb}{len}\PYG{p}{(}\PYG{n}{candidate\PYGZus{}df}\PYG{p}{)}\PYG{p}{,} \PYG{n+nb}{len}\PYG{p}{(}\PYG{n}{photo\PYGZus{}df}\PYG{p}{)}\PYG{p}{,} \PYG{n+nb}{len}\PYG{p}{(}\PYG{n}{merged}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
(7346, 3724, 7346)
\end{sphinxVerbatim}
It also contains all columns from both tables.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k}{for} \PYG{n}{colname} \PYG{o+ow}{in} \PYG{n}{merged}\PYG{o}{.}\PYG{n}{columns}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{colname}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
source\PYGZus{}id
ra
dec
pmra
pmdec
parallax
parallax\PYGZus{}error
radial\PYGZus{}velocity
phi1
phi2
pm\PYGZus{}phi1
pm\PYGZus{}phi2
g\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag
i\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag
\end{sphinxVerbatim}
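To make the effect of the \sphinxcode{\sphinxupquote{how}} argument concrete, here is a small sketch with two made\sphinxhyphen{}up tables. The column values and the names \sphinxcode{\sphinxupquote{left}}, \sphinxcode{\sphinxupquote{right}}, and \sphinxcode{\sphinxupquote{g\_mag}} are illustrative and not taken from the real data.

```python
import pandas as pd

# Two made-up tables that share only some source_id values
left = pd.DataFrame({'source_id': [1, 2, 3],
                     'pmra': [-3.8, -5.9, -3.9]})
right = pd.DataFrame({'source_id': [2, 3, 4],
                      'g_mag': [17.9, 19.3, 16.9]})

# how='left' keeps every row of the left table; unmatched rows get NaN
left_join = pd.merge(left, right, on='source_id', how='left')

# how='inner' keeps only rows whose source_id appears in both tables
inner_join = pd.merge(left, right, on='source_id', how='inner')

print(len(left_join), len(inner_join))
print(left_join['g_mag'].isnull().sum())
```

The left join has three rows, one of which is missing \sphinxcode{\sphinxupquote{g\_mag}}; the inner join has only the two rows that match. This mirrors what we see above, where \sphinxcode{\sphinxupquote{merged}} has as many rows as \sphinxcode{\sphinxupquote{candidate\_df}}.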
\sphinxstylestrong{Detail:} You might notice that Pandas also provides a function called \sphinxcode{\sphinxupquote{join}}; it does almost the same thing, but the interface is slightly different. We think \sphinxcode{\sphinxupquote{merge}} is a little easier to use, so that's what we chose. It's also more consistent with JOIN in SQL, so if you learn how to use \sphinxcode{\sphinxupquote{pd.merge}}, you are also learning how to use SQL JOIN.
Also, someone might ask why we have to use Pandas to do this join; why didn't we do it in ADQL? The answer is that we could have done that, but since we already have the data we need, we should probably do the computation locally rather than make another round trip to the Gaia server.
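For comparison, here is a sketch of the same left join written with \sphinxcode{\sphinxupquote{join}}, again on made\sphinxhyphen{}up tables with illustrative names. The main difference is that \sphinxcode{\sphinxupquote{join}} matches against the index of the other table, so we set that index first.

```python
import pandas as pd

# Made-up tables, for illustration only
left = pd.DataFrame({'source_id': [1, 2, 3],
                     'pmra': [-3.8, -5.9, -3.9]})
right = pd.DataFrame({'source_id': [2, 3, 4],
                      'g_mag': [17.9, 19.3, 16.9]})

# merge matches a named column in both tables
merged_demo = pd.merge(left, right, on='source_id', how='left')

# join matches a column of the left table against the index of the right
# table, so the key column has to become the index first; how='left' is
# the default for join
joined_demo = left.join(right.set_index('source_id'), on='source_id')

print(list(merged_demo.columns) == list(joined_demo.columns))
```

Both results have the same columns and the same rows, which is one way to see that the two functions differ mostly in interface.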
\section{Missing data}
\label{\detokenize{06_photo:missing-data}}
Let's add columns to the merged table for magnitude and color.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{merged}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{mag}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{=} \PYG{n}{merged}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{g\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{merged}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{color}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{=} \PYG{n}{merged}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{g\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{\PYGZhy{}} \PYG{n}{merged}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{i\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\end{sphinxVerbatim}
These columns contain the special value \sphinxcode{\sphinxupquote{NaN}} where we are missing data.
We can use \sphinxcode{\sphinxupquote{notnull}} to see which rows contain valid data, that is, non\sphinxhyphen{}null values.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{merged}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{color}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{o}{.}\PYG{n}{notnull}\PYG{p}{(}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
0 False
1 True
2 True
3 True
4 True
...
7341 True
7342 False
7343 False
7344 True
7345 False
Name: color, Length: 7346, dtype: bool
\end{sphinxVerbatim}
And we can use \sphinxcode{\sphinxupquote{sum}} to count the number of valid values.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{merged}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{color}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{o}{.}\PYG{n}{notnull}\PYG{p}{(}\PYG{p}{)}\PYG{o}{.}\PYG{n}{sum}\PYG{p}{(}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
3724
\end{sphinxVerbatim}
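The following sketch shows how missing values propagate into a computed column and how \sphinxcode{\sphinxupquote{notnull}} and \sphinxcode{\sphinxupquote{sum}} work together. The table \sphinxcode{\sphinxupquote{demo}} and its column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up magnitudes with one missing value in each band
demo = pd.DataFrame({'g_mag': [17.9, np.nan, 19.3, 16.9],
                     'i_mag': [17.5, 17.7, np.nan, 16.5]})

# Arithmetic with NaN yields NaN, so color is missing
# whenever either magnitude is missing
demo['color'] = demo['g_mag'] - demo['i_mag']

# notnull gives a Boolean Series; True counts as 1 when summed
has_color = demo['color'].notnull()
n_valid = has_color.sum()
n_missing = demo['color'].isnull().sum()

print(n_valid, n_missing)
```

In this made\sphinxhyphen{}up table, two of the four rows have a valid color and two do not.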
For scientific purposes, it's not obvious what we should do with candidate stars if we don't have photometry data. Should we give them the benefit of the doubt or leave them out?
In part the answer depends on the goal: are we trying to identify more stars that might be in GD\sphinxhyphen{}1, or a smaller set of stars that have higher probability?
In the next section, we'll leave them out, but you can experiment with the alternative.
\section{Selecting based on photometry}
\label{\detokenize{06_photo:selecting-based-on-photometry}}
Now let's see how many of these points are inside the polygon we chose.
We can use a list of column names to select \sphinxcode{\sphinxupquote{color}} and \sphinxcode{\sphinxupquote{mag}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{points} \PYG{o}{=} \PYG{n}{merged}\PYG{p}{[}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{color}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{mag}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{p}{]}
\PYG{n}{points}\PYG{o}{.}\PYG{n}{head}\PYG{p}{(}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
    color      mag
0     NaN      NaN
1  0.3804  17.8978
2  1.6092  19.2873
3  0.4457  16.9238
4  1.5902  19.9242
\end{sphinxVerbatim}
The result is a \sphinxcode{\sphinxupquote{DataFrame}} that can be treated as a sequence of coordinates, so we can pass it to \sphinxcode{\sphinxupquote{contains\_points}}:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{inside} \PYG{o}{=} \PYG{n}{path}\PYG{o}{.}\PYG{n}{contains\PYGZus{}points}\PYG{p}{(}\PYG{n}{points}\PYG{p}{)}
\PYG{n}{inside}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
array([False, False, False, ..., False, False, False])
\end{sphinxVerbatim}
The result is a Boolean array. We can use \sphinxcode{\sphinxupquote{sum}} to see how many stars fall in the polygon.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{inside}\PYG{o}{.}\PYG{n}{sum}\PYG{p}{(}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
496
\end{sphinxVerbatim}
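The call above passes rows with missing values straight to \sphinxcode{\sphinxupquote{contains\_points}}. If you would rather not depend on how that method treats \sphinxcode{\sphinxupquote{NaN}}, one option is to test only the rows that have both coordinates and decide explicitly what to do with the rest. This is a sketch on made\sphinxhyphen{}up data; \sphinxcode{\sphinxupquote{polygon}}, \sphinxcode{\sphinxupquote{demo\_points}}, and the other names are illustrative.

```python
import numpy as np
import pandas as pd
from matplotlib.path import Path

# A made-up unit square and four made-up points, two with missing data
polygon = Path([(0, 0), (1, 0), (1, 1), (0, 1)])
demo_points = pd.DataFrame({'color': [0.5, np.nan, 2.0, 0.5],
                            'mag': [0.5, 0.5, 0.5, np.nan]})

# True for rows where both color and mag are present
has_data = demo_points.notnull().all(axis=1).to_numpy()

# Start with everything excluded, then test only the complete rows
inside_demo = np.zeros(len(demo_points), dtype=bool)
inside_demo[has_data] = polygon.contains_points(
    demo_points[has_data].to_numpy())

# Alternative policy: give rows with missing data the benefit of the doubt
inside_or_missing = inside_demo | ~has_data

print(inside_demo, inside_or_missing)
```

With the first mask, rows with missing photometry are left out, as in this notebook. With the second, they are kept, which is the alternative mentioned in the previous section.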
Now we can use \sphinxcode{\sphinxupquote{inside}} as a mask to select stars that fall inside the polygon.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{selected} \PYG{o}{=} \PYG{n}{merged}\PYG{p}{[}\PYG{n}{inside}\PYG{p}{]}
\end{sphinxVerbatim}
Let's make a color\sphinxhyphen{}magnitude plot one more time, highlighting the selected stars with green \sphinxcode{\sphinxupquote{x}} marks.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plot\PYGZus{}cmd}\PYG{p}{(}\PYG{n}{photo\PYGZus{}table}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{xs}\PYG{p}{,} \PYG{n}{ys}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{selected}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{color}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{p}{,} \PYG{n}{selected}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{mag}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gx}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}\PYG{p}{;}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{06_photo_61_0}.png}
It looks like the selected stars are, in fact, inside the polygon, which means they have photometry data consistent with GD\sphinxhyphen{}1.
Finally, we can plot the coordinates of the selected stars:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{figure}\PYG{p}{(}\PYG{n}{figsize}\PYG{o}{=}\PYG{p}{(}\PYG{l+m+mi}{10}\PYG{p}{,}\PYG{l+m+mf}{2.5}\PYG{p}{)}\PYG{p}{)}
\PYG{n}{x} \PYG{o}{=} \PYG{n}{selected}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{y} \PYG{o}{=} \PYG{n}{selected}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{x}\PYG{p}{,} \PYG{n}{y}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.7}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.9}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ra (degree GD1)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{dec (degree GD1)}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{axis}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{equal}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}\PYG{p}{;}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{06_photo_63_0}.png}
This example includes two new Matplotlib commands:
\begin{itemize}
\item {}
\sphinxcode{\sphinxupquote{figure}} creates the figure. In previous examples, we didn't have to use this function; the figure was created automatically. But when we call it explicitly, we can provide arguments like \sphinxcode{\sphinxupquote{figsize}}, which sets the size of the figure.
\item {}
\sphinxcode{\sphinxupquote{axis}} with the parameter \sphinxcode{\sphinxupquote{equal}} sets up the axes so a unit is the same size along the \sphinxcode{\sphinxupquote{x}} and \sphinxcode{\sphinxupquote{y}} axes.
\end{itemize}
In an example like this, where \sphinxcode{\sphinxupquote{x}} and \sphinxcode{\sphinxupquote{y}} represent coordinates in space, equal axes ensure that the distance between points is represented accurately.
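Here is a minimal sketch of both commands on made\sphinxhyphen{}up coordinates, assuming the non\sphinxhyphen{}interactive \sphinxcode{\sphinxupquote{Agg}} backend so it runs without a display. It also reads the settings back from the figure and axes objects to confirm they were applied.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Create the figure explicitly so we can choose its size in inches
fig = plt.figure(figsize=(10, 2.5))

# Made-up coordinates, for illustration only
plt.plot([-60, -50, -40], [-1, 0, 1], 'ko')

# Make one data unit the same length along x and y
plt.axis('equal')

# Read the settings back from the figure and the current axes
width, height = fig.get_size_inches()
ax = plt.gca()
print(width, height, ax.get_aspect())
```

Depending on the Matplotlib version, \sphinxcode{\sphinxupquote{get\_aspect}} reports the equal setting either as the string \sphinxcode{\sphinxupquote{\textquotesingle{}equal\textquotesingle{}}} or as the number 1.0.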
\section{Write the data}
\label{\detokenize{06_photo:write-the-data}}
Let's write the merged DataFrame, along with the selected stars, to a file.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}merged.hdf5}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{merged}\PYG{o}{.}\PYG{n}{to\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{merged}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{selected}\PYG{o}{.}\PYG{n}{to\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{selected}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{o}{!}ls \PYGZhy{}lh gd1\PYGZus{}merged.hdf5
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYGZhy{}rw\PYGZhy{}rw\PYGZhy{}r\PYGZhy{}\PYGZhy{} 1 downey downey 2.0M Oct 19 17:21 gd1\PYGZus{}merged.hdf5
\end{sphinxVerbatim}
If you are using Windows, \sphinxcode{\sphinxupquote{ls}} might not work; in that case, try:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
!dir gd1\PYGZus{}merged.hdf5
\end{sphinxVerbatim}
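If you prefer a check that works the same way on every operating system, you can ask Python for the file size instead of using a shell command. This is a sketch; \sphinxcode{\sphinxupquote{file\_size\_mb}} is a hypothetical helper, not part of any library, and the demonstration uses a temporary file rather than the real data file.

```python
import os
import tempfile

def file_size_mb(filename):
    """Return the size of a file in megabytes (hypothetical helper)."""
    return os.path.getsize(filename) / 1024**2

# Demonstrate on a temporary file that holds exactly one megabyte
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'\0' * 1024**2)
demo_name = tmp.name

size_mb = file_size_mb(demo_name)
print(size_mb)

# Clean up the temporary file
os.remove(demo_name)
```

In this notebook, the equivalent call would be \sphinxcode{\sphinxupquote{file\_size\_mb(\textquotesingle{}gd1\_merged.hdf5\textquotesingle{})}}.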
\section{Save the polygon}
\label{\detokenize{06_photo:save-the-polygon}}
\sphinxhref{https://en.wikipedia.org/wiki/Reproducibility\#Reproducible\_research}{Reproducible research} is “the idea that … the full computational environment used to produce the results in the paper such as the code, data, etc. can be used to reproduce the results and create new work based on the research.”
This Jupyter notebook is an example of reproducible research because it contains all of the code needed to reproduce the results, including the database queries that download the data and the code that analyzes it.
However, when we used \sphinxcode{\sphinxupquote{ginput}} to define a polygon by hand, we introduced a non\sphinxhyphen{}reproducible element to the analysis. If someone running this notebook chooses a different polygon, they will get different results. So it is important to record the polygon we chose as part of the data analysis pipeline.
Since \sphinxcode{\sphinxupquote{coords}} is a list of coordinate pairs, not a \sphinxcode{\sphinxupquote{DataFrame}}, we can't use \sphinxcode{\sphinxupquote{to\_hdf}} to save it in a file. But we can convert it to a Pandas \sphinxcode{\sphinxupquote{DataFrame}} and save that.
As an alternative, we could use \sphinxhref{http://www.pytables.org/index.html}{PyTables}, which is the library Pandas uses to read and write HDF5 files. It is a powerful library, but not easy to use directly. So let's take advantage of Pandas.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{coords\PYGZus{}df} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{DataFrame}\PYG{p}{(}\PYG{n}{coords}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}polygon.hdf5}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{coords\PYGZus{}df}\PYG{o}{.}\PYG{n}{to\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{coords\PYGZus{}df}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
We can read it back like this.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{coords2\PYGZus{}df} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{read\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{coords\PYGZus{}df}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{coords2} \PYG{o}{=} \PYG{n}{coords2\PYGZus{}df}\PYG{o}{.}\PYG{n}{to\PYGZus{}numpy}\PYG{p}{(}\PYG{p}{)}
\end{sphinxVerbatim}
And verify that the data we read back is the same.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{np}\PYG{o}{.}\PYG{n}{all}\PYG{p}{(}\PYG{n}{coords2} \PYG{o}{==} \PYG{n}{coords}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
True
\end{sphinxVerbatim}
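As a small variation, you could give the columns names when you build the \sphinxcode{\sphinxupquote{DataFrame}}, so the saved polygon documents what each column means. The sketch below uses the fallback coordinates from earlier and skips the file step; the column names \sphinxcode{\sphinxupquote{color}} and \sphinxcode{\sphinxupquote{mag}} are our assumption about how you might label them.

```python
import numpy as np
import pandas as pd

# The fallback polygon from this notebook
coords_demo = [(0.2, 17.5), (0.2, 19.5), (0.65, 22),
               (0.75, 21), (0.4, 19), (0.4, 17.5)]

# Name the columns so the table is self-describing
coords_demo_df = pd.DataFrame(coords_demo, columns=['color', 'mag'])

# Convert back to a NumPy array with one row per vertex
coords_back = coords_demo_df.to_numpy()

# allclose is a common choice for comparing floating-point arrays
same = np.allclose(coords_back, coords_demo)
print(coords_back.shape, same)
```

The array that comes back has one row per vertex and can be passed to \sphinxcode{\sphinxupquote{Path}} just like the original list.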
\section{Summary}
\label{\detokenize{06_photo:summary}}
In this notebook, we worked with two datasets: the list of candidate stars from Gaia and the photometry data from Pan\sphinxhyphen{}STARRS.
We drew a color\sphinxhyphen{}magnitude diagram and used it to identify stars we think are likely to be in GD\sphinxhyphen{}1.
Then we used a Pandas \sphinxcode{\sphinxupquote{merge}} operation to combine the data into a single \sphinxcode{\sphinxupquote{DataFrame}}.
\section{Best practices}
\label{\detokenize{06_photo:best-practices}}\begin{itemize}
\item {}
If you want to perform something like a database \sphinxcode{\sphinxupquote{JOIN}} operation with data that is in a Pandas \sphinxcode{\sphinxupquote{DataFrame}}, you can use the \sphinxcode{\sphinxupquote{join}} or \sphinxcode{\sphinxupquote{merge}} function. In many cases, \sphinxcode{\sphinxupquote{merge}} is easier to use because the arguments are more like SQL.
\item {}
Use Matplotlib options to control the size and aspect ratio of figures to make them easier to interpret. In this example, we scaled the axes so the size of a degree is equal along both axes.
\item {}
Matplotlib also provides operations for working with points, polygons, and other geometric entities, so it's not just for making figures.
\item {}
Be sure to record every element of the data analysis pipeline that would be needed to replicate the results.
\end{itemize}
\chapter{Chapter 7}
\label{\detokenize{07_plot:chapter-7}}\label{\detokenize{07_plot::doc}}
This is the seventh in a series of notebooks related to astronomy data.
As a continuing example, we will replicate part of the analysis in a recent paper, “\sphinxhref{https://arxiv.org/abs/1805.00425}{Off the beaten path: Gaia reveals GD\sphinxhyphen{}1 stars outside of the main stream}” by Adrian M. Price\sphinxhyphen{}Whelan and Ana Bonaca.
In the previous notebook we selected photometry data from Pan\sphinxhyphen{}STARRS and used it to identify stars we think are likely to be in GD\sphinxhyphen{}1.
In this notebook, we'll take the results from previous lessons and use them to make a figure that tells a compelling scientific story.
\section{Outline}
\label{\detokenize{07_plot:outline}}
Here are the steps in this notebook:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
Starting with the figure from the previous notebook, we'll add annotations to present the results more clearly.
\item {}
Then we'll see several ways to customize figures to make them more appealing and effective.
\item {}
Finally, we'll see how to make a figure with multiple panels or subplots.
\end{enumerate}
After completing this lesson, you should be able to
\begin{itemize}
\item {}
Design a figure that tells a compelling story.
\item {}
Use Matplotlib features to customize the appearance of figures.
\item {}
Generate a figure with multiple subplots.
\end{itemize}
\section{Installing libraries}
\label{\detokenize{07_plot:installing-libraries}}
If you are running this notebook on Colab, you can run the following cell to install Astroquery and the other libraries we'll use.
If you are running this notebook on your own computer, you might have to install these libraries yourself.
If you are using this notebook as part of a Carpentries workshop, you should have received setup instructions.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} If we\PYGZsq{}re running on Colab, install libraries}
\PYG{k+kn}{import} \PYG{n+nn}{sys}
\PYG{n}{IN\PYGZus{}COLAB} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{google.colab}\PYG{l+s+s1}{\PYGZsq{}} \PYG{o+ow}{in} \PYG{n}{sys}\PYG{o}{.}\PYG{n}{modules}
\PYG{k}{if} \PYG{n}{IN\PYGZus{}COLAB}\PYG{p}{:}
    \PYG{o}{!}pip install astroquery astro\PYGZhy{}gala pyia python\PYGZhy{}wget
\end{sphinxVerbatim}
\section{Making Figures That Tell a Story}
\label{\detokenize{07_plot:making-figures-that-tell-a-story}}
So far the figures we've made have been “quick and dirty”. Mostly we have used Matplotlib's default style, although we have adjusted a few parameters, like \sphinxcode{\sphinxupquote{markersize}} and \sphinxcode{\sphinxupquote{alpha}}, to improve legibility.
Now that the analysis is done, it's time to think more about:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
Making professional\sphinxhyphen{}looking figures that are ready for publication, and
\item {}
Making figures that communicate a scientific result clearly and compellingly.
\end{enumerate}
Not necessarily in that order.
Let's start by reviewing Figure 1 from the original paper. We've seen the individual panels, but now let's look at the whole thing, along with the caption:
\sphinxstylestrong{Exercise:} Think about the following questions:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
What is the primary scientific result of this work?
\item {}
What story is this figure telling?
\item {}
In the design of this figure, can you identify 1\sphinxhyphen{}2 choices the authors made that you think are effective? Think about big\sphinxhyphen{}picture elements, like the number of panels and how they are arranged, as well as details like the choice of typeface.
\item {}
Can you identify 1\sphinxhyphen{}2 elements that could be improved, or that you might have done differently?
\end{enumerate}
Some topics that might come up in this discussion:
\begin{enumerate}
\sphinxsetlistlabels{\arabic}{enumi}{enumii}{}{.}%
\item {}
The primary result is that the multiple stages of selection make it possible to separate likely candidates from the background more effectively than in previous work, which makes it possible to see the structure of GD\sphinxhyphen{}1 in “unprecedented detail”.
\item {}
The figure documents the selection process as a sequence of steps. Reading right\sphinxhyphen{}to\sphinxhyphen{}left, top\sphinxhyphen{}to\sphinxhyphen{}bottom, we see selection based on proper motion, the results of the first selection, selection based on color and magnitude, and the results of the second selection. So this figure documents the methodology and presents the primary result.
\item {}
It's mostly black and white, with minimal use of color, so it will work well in print. The annotations in the bottom left panel guide the reader to the most important results. It contains enough technical detail for a professional audience, but most of it is also comprehensible to a more general audience. The two left panels have the same dimensions and their axes are aligned.
\item {}
Since the panels represent a sequence, it might be better to arrange them left\sphinxhyphen{}to\sphinxhyphen{}right. The placement and size of the axis labels could be tweaked. The entire figure could be a little bigger to match the width and proportion of the caption. The top left panel has unused white space (but that leaves space for the annotations in the bottom left).
\end{enumerate}
\section{Plotting GD\sphinxhyphen{}1}
\label{\detokenize{07_plot:plotting-gd-1}}
Let's start with the panel in the lower left. The following cell reloads the data.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{os}
\PYG{k+kn}{from} \PYG{n+nn}{wget} \PYG{k+kn}{import} \PYG{n}{download}
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}merged.hdf5}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{path} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{https://github.com/AllenDowney/AstronomicalData/raw/main/data/}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{k}{if} \PYG{o+ow}{not} \PYG{n}{os}\PYG{o}{.}\PYG{n}{path}\PYG{o}{.}\PYG{n}{exists}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{download}\PYG{p}{(}\PYG{n}{path}\PYG{o}{+}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{pandas} \PYG{k}{as} \PYG{n+nn}{pd}
\PYG{n}{selected} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{read\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{selected}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{matplotlib}\PYG{n+nn}{.}\PYG{n+nn}{pyplot} \PYG{k}{as} \PYG{n+nn}{plt}
\PYG{k}{def} \PYG{n+nf}{plot\PYGZus{}second\PYGZus{}selection}\PYG{p}{(}\PYG{n}{df}\PYG{p}{)}\PYG{p}{:}
    \PYG{n}{x} \PYG{o}{=} \PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
    \PYG{n}{y} \PYG{o}{=} \PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{x}\PYG{p}{,} \PYG{n}{y}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.7}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.9}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{\PYGZdl{}}\PYG{l+s+s1}{\PYGZbs{}}\PYG{l+s+s1}{phi\PYGZus{}1\PYGZdl{} [deg]}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{\PYGZdl{}}\PYG{l+s+s1}{\PYGZbs{}}\PYG{l+s+s1}{phi\PYGZus{}2\PYGZdl{} [deg]}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{title}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{Proper motion + photometry selection}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{fontsize}\PYG{o}{=}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{medium}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{axis}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{equal}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
And here's what it looks like.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{figure}\PYG{p}{(}\PYG{n}{figsize}\PYG{o}{=}\PYG{p}{(}\PYG{l+m+mi}{10}\PYG{p}{,}\PYG{l+m+mf}{2.5}\PYG{p}{)}\PYG{p}{)}
\PYG{n}{plot\PYGZus{}second\PYGZus{}selection}\PYG{p}{(}\PYG{n}{selected}\PYG{p}{)}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{07_plot_13_0}.png}
\section{Annotations}
\label{\detokenize{07_plot:annotations}}
The figure in the paper uses three other features to present the results more clearly and compellingly:
\begin{itemize}
\item {}
A vertical dashed line to distinguish the previously undetected region of GD\sphinxhyphen{}1,
\item {}
A label that identifies the new region, and
\item {}
Several annotations that combine text and arrows to identify features of GD\sphinxhyphen{}1.
\end{itemize}
As an exercise, choose any or all of these features and add them to the figure:
\begin{itemize}
\item {}
To draw vertical lines, see \sphinxhref{https://matplotlib.org/3.3.1/api/\_as\_gen/matplotlib.pyplot.vlines.html}{\sphinxcode{\sphinxupquote{plt.vlines}}} and \sphinxhref{https://matplotlib.org/3.3.1/api/\_as\_gen/matplotlib.pyplot.axvline.html\#matplotlib.pyplot.axvline}{\sphinxcode{\sphinxupquote{plt.axvline}}}.
\item {}
To add text, see \sphinxhref{https://matplotlib.org/3.3.1/api/\_as\_gen/matplotlib.pyplot.text.html}{\sphinxcode{\sphinxupquote{plt.text}}}.
\item {}
To add an annotation with text and an arrow, see \sphinxcode{\sphinxupquote{plt.annotate}}.
\end{itemize}
And here is some \sphinxhref{https://matplotlib.org/3.3.1/tutorials/text/annotations.html\#plotting-guide-annotation}{additional information about text and arrows}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} Solution}
\PYG{c+c1}{\PYGZsh{} plt.axvline(\PYGZhy{}55, ls=\PYGZsq{}\PYGZhy{}\PYGZhy{}\PYGZsq{}, color=\PYGZsq{}gray\PYGZsq{}, }
\PYG{c+c1}{\PYGZsh{} alpha=0.4, dashes=(6,4), lw=2)}
\PYG{c+c1}{\PYGZsh{} plt.text(\PYGZhy{}60, 5.5, \PYGZsq{}Previously\PYGZbs{}nundetected\PYGZsq{}, }
\PYG{c+c1}{\PYGZsh{} fontsize=\PYGZsq{}small\PYGZsq{}, ha=\PYGZsq{}right\PYGZsq{}, va=\PYGZsq{}top\PYGZsq{});}
\PYG{c+c1}{\PYGZsh{} arrowprops=dict(color=\PYGZsq{}gray\PYGZsq{}, shrink=0.05, width=1.5, }
\PYG{c+c1}{\PYGZsh{} headwidth=6, headlength=8, alpha=0.4)}
\PYG{c+c1}{\PYGZsh{} plt.annotate(\PYGZsq{}Spur\PYGZsq{}, xy=(\PYGZhy{}33, 2), xytext=(\PYGZhy{}35, 5.5),}
\PYG{c+c1}{\PYGZsh{} arrowprops=arrowprops,}
\PYG{c+c1}{\PYGZsh{} fontsize=\PYGZsq{}small\PYGZsq{})}
\PYG{c+c1}{\PYGZsh{} plt.annotate(\PYGZsq{}Gap\PYGZsq{}, xy=(\PYGZhy{}22, \PYGZhy{}1), xytext=(\PYGZhy{}25, \PYGZhy{}5.5),}
\PYG{c+c1}{\PYGZsh{} arrowprops=arrowprops,}
\PYG{c+c1}{\PYGZsh{} fontsize=\PYGZsq{}small\PYGZsq{})}
\end{sphinxVerbatim}
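The three calls named above can also be tried on their own, away from the GD\sphinxhyphen{}1 figure. The following cell is a minimal sketch that uses made\sphinxhyphen{}up data; the points, coordinates, and label text are placeholders chosen only for illustration, not values from the paper.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
import matplotlib.pyplot as plt

# Made-up data for illustration only; these are not the GD-1 stars
xs = [-80, -60, -40, -20, 0]
ys = [1, 0, -1, 0, 1]
plt.plot(xs, ys, 'ko')

# A vertical dashed line at an arbitrary x position
plt.axvline(-55, ls='--', color='gray', alpha=0.4)

# Text placed in data coordinates
plt.text(-60, 0.8, 'Left region', fontsize='small', ha='right', va='top')

# Text plus an arrow: xy is the arrow tip, xytext is where the text goes
arrowprops = dict(arrowstyle='->', color='gray')
plt.annotate('Dip', xy=(-40, -1), xytext=(-30, 0.5),
             arrowprops=arrowprops, fontsize='small')
\end{sphinxVerbatim}
In \sphinxcode{\sphinxupquote{annotate}}, \sphinxcode{\sphinxupquote{xy}} is the point being annotated and \sphinxcode{\sphinxupquote{xytext}} is the location of the label, so moving \sphinxcode{\sphinxupquote{xytext}} changes the direction of the arrow.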
\section{Customization}
\label{\detokenize{07_plot:customization}}
Matplotlib provides a default style that determines things like the colors of lines, the placement of labels and ticks on the axes, and many other properties.
There are several ways to override these defaults and customize your figures:
\begin{itemize}
\item {}
To customize only the current figure, you can call functions like \sphinxcode{\sphinxupquote{tick\_params}}, which we'll demonstrate below.
\item {}
To customize all figures in a notebook, you use \sphinxcode{\sphinxupquote{rcParams}}.
\item {}
To override more than a few defaults at the same time, you can use a style sheet.
\end{itemize}
As a simple example, notice that Matplotlib puts ticks on the outside of the figures by default, and only on the left and bottom sides of the axes.
To change this behavior, you can use \sphinxcode{\sphinxupquote{gca()}} to get the current axes and \sphinxcode{\sphinxupquote{tick\_params}} to change the settings.
Here's how you can put the ticks on the inside of the figure:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{gca}\PYG{p}{(}\PYG{p}{)}\PYG{o}{.}\PYG{n}{tick\PYGZus{}params}\PYG{p}{(}\PYG{n}{direction}\PYG{o}{=}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{in}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
\sphinxstylestrong{Exercise:} Read the documentation of \sphinxhref{https://matplotlib.org/3.1.1/api/\_as\_gen/matplotlib.axes.Axes.tick\_params.html}{\sphinxcode{\sphinxupquote{tick\_params}}} and use it to put ticks on the top and right sides of the axes.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} Solution}
\PYG{c+c1}{\PYGZsh{} plt.gca().tick\PYGZus{}params(top=True, right=True)}
\end{sphinxVerbatim}
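Several options can be passed to \sphinxcode{\sphinxupquote{tick\_params}} in a single call. As a sketch of how the settings in this section combine, the following cell applies them to an empty figure; the argument \sphinxcode{\sphinxupquote{which='both'}} is an addition here, and it applies the settings to minor ticks as well as major ticks.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
import matplotlib.pyplot as plt

# Start a new, empty figure so the effect is easy to see
plt.figure()

# Ticks point inward and appear on all four sides of the axes
plt.gca().tick_params(direction='in', top=True, right=True, which='both')
\end{sphinxVerbatim}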
\section{rcParams}
\label{\detokenize{07_plot:rcparams}}
If you want to make a customization that applies to all figures in a notebook, you can use \sphinxcode{\sphinxupquote{rcParams}}.
Here's an example that reads the current font size from \sphinxcode{\sphinxupquote{rcParams}}:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{rcParams}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{font.size}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
10.0
\end{sphinxVerbatim}
And sets it to a new value:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{rcParams}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{font.size}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{=} \PYG{l+m+mi}{14}
\end{sphinxVerbatim}
\sphinxstylestrong{Exercise:} Plot the previous figure again, and see what font sizes have changed. Look up any other element of \sphinxcode{\sphinxupquote{rcParams}}, change its value, and check the effect on the figure.
If you find yourself making the same customizations in several notebooks, you can put changes to \sphinxcode{\sphinxupquote{rcParams}} in a \sphinxcode{\sphinxupquote{matplotlibrc}} file, \sphinxhref{https://matplotlib.org/3.3.1/tutorials/introductory/customizing.html\#customizing-with-matplotlibrc-files}{which you can read about here}.
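When experimenting with \sphinxcode{\sphinxupquote{rcParams}}, it helps to know how to undo a change. The following sketch uses two functions that are not discussed above: \sphinxcode{\sphinxupquote{plt.rc}}, which sets parameters by group, and \sphinxcode{\sphinxupquote{plt.rcdefaults}}, which restores Matplotlib's built\sphinxhyphen{}in defaults. The variable names \sphinxcode{\sphinxupquote{changed}} and \sphinxcode{\sphinxupquote{restored}} are introduced only for illustration.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
import matplotlib.pyplot as plt

# Equivalent to plt.rcParams['font.size'] = 14
plt.rc('font', size=14)
changed = plt.rcParams['font.size']

# Restore Matplotlib's built-in defaults
plt.rcdefaults()
restored = plt.rcParams['font.size']
\end{sphinxVerbatim}
Note that \sphinxcode{\sphinxupquote{rcdefaults}} discards every customization made so far, not just the font size, so it is a blunt tool.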
\section{Style sheets}
\label{\detokenize{07_plot:style-sheets}}
The \sphinxcode{\sphinxupquote{matplotlibrc}} file is read when you import Matplotlib, so it is not easy to switch from one set of options to another.
The solution to this problem is style sheets, \sphinxhref{https://matplotlib.org/3.1.1/tutorials/introductory/customizing.html}{which you can read about here}.
Matplotlib provides a set of predefined style sheets, or you can make your own.
The following cell displays a list of style sheets installed on your system.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{style}\PYG{o}{.}\PYG{n}{available}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
[\PYGZsq{}Solarize\PYGZus{}Light2\PYGZsq{},
\PYGZsq{}\PYGZus{}classic\PYGZus{}test\PYGZus{}patch\PYGZsq{},
\PYGZsq{}bmh\PYGZsq{},
\PYGZsq{}classic\PYGZsq{},
\PYGZsq{}dark\PYGZus{}background\PYGZsq{},
\PYGZsq{}fast\PYGZsq{},
\PYGZsq{}fivethirtyeight\PYGZsq{},
\PYGZsq{}ggplot\PYGZsq{},
\PYGZsq{}grayscale\PYGZsq{},
\PYGZsq{}seaborn\PYGZsq{},
\PYGZsq{}seaborn\PYGZhy{}bright\PYGZsq{},
\PYGZsq{}seaborn\PYGZhy{}colorblind\PYGZsq{},
\PYGZsq{}seaborn\PYGZhy{}dark\PYGZsq{},
\PYGZsq{}seaborn\PYGZhy{}dark\PYGZhy{}palette\PYGZsq{},
\PYGZsq{}seaborn\PYGZhy{}darkgrid\PYGZsq{},
\PYGZsq{}seaborn\PYGZhy{}deep\PYGZsq{},
\PYGZsq{}seaborn\PYGZhy{}muted\PYGZsq{},
\PYGZsq{}seaborn\PYGZhy{}notebook\PYGZsq{},
\PYGZsq{}seaborn\PYGZhy{}paper\PYGZsq{},
\PYGZsq{}seaborn\PYGZhy{}pastel\PYGZsq{},
\PYGZsq{}seaborn\PYGZhy{}poster\PYGZsq{},
\PYGZsq{}seaborn\PYGZhy{}talk\PYGZsq{},
\PYGZsq{}seaborn\PYGZhy{}ticks\PYGZsq{},
\PYGZsq{}seaborn\PYGZhy{}white\PYGZsq{},
\PYGZsq{}seaborn\PYGZhy{}whitegrid\PYGZsq{},
\PYGZsq{}tableau\PYGZhy{}colorblind10\PYGZsq{}]
\end{sphinxVerbatim}
Note that \sphinxcode{\sphinxupquote{seaborn\sphinxhyphen{}paper}}, \sphinxcode{\sphinxupquote{seaborn\sphinxhyphen{}talk}} and \sphinxcode{\sphinxupquote{seaborn\sphinxhyphen{}poster}} are particularly intended to prepare versions of a figure with text sizes and other features that work well in papers, talks, and posters.
To use any of these style sheets, run \sphinxcode{\sphinxupquote{plt.style.use}} like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{style}\PYG{o}{.}\PYG{n}{use}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{fivethirtyeight}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
The style sheet you choose will affect the appearance of all figures you plot after calling \sphinxcode{\sphinxupquote{use}}, unless you override any of the options or call \sphinxcode{\sphinxupquote{use}} again.
\sphinxstylestrong{Exercise:} Choose one of the styles on the list and select it by calling \sphinxcode{\sphinxupquote{use}}. Then go back and plot one of the figures above and see what effect it has.
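If you want to try a style without changing the rest of the notebook, one option is \sphinxcode{\sphinxupquote{plt.style.context}}, which applies a style only inside a \sphinxcode{\sphinxupquote{with}} block. The following sketch uses the \sphinxcode{\sphinxupquote{ggplot}} style from the list above; the plotted values are placeholders, and the variables \sphinxcode{\sphinxupquote{before}}, \sphinxcode{\sphinxupquote{inside}}, and \sphinxcode{\sphinxupquote{after}} exist only to show that the setting is restored.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
import matplotlib.pyplot as plt

before = plt.rcParams['axes.facecolor']

# The style applies only inside the with block
with plt.style.context('ggplot'):
    inside = plt.rcParams['axes.facecolor']
    plt.plot([1, 2, 3], [2, 4, 3])

# Outside the block, the previous setting is back
after = plt.rcParams['axes.facecolor']
\end{sphinxVerbatim}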
If you can't find a style sheet that's exactly what you want, you can make your own. This repository includes a style sheet called \sphinxcode{\sphinxupquote{az\sphinxhyphen{}paper\sphinxhyphen{}twocol.mplstyle}}, with customizations chosen by Azalee Bostroem for publication in astronomy journals.
The following cell downloads the style sheet.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{os}
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{az\PYGZhy{}paper\PYGZhy{}twocol.mplstyle}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{path} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{https://github.com/AllenDowney/AstronomicalData/raw/main/data/}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{k}{if} \PYG{o+ow}{not} \PYG{n}{os}\PYG{o}{.}\PYG{n}{path}\PYG{o}{.}\PYG{n}{exists}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{download}\PYG{p}{(}\PYG{n}{path}\PYG{o}{+}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{)}
\end{sphinxVerbatim}
You can use it like this:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{style}\PYG{o}{.}\PYG{n}{use}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{./az\PYGZhy{}paper\PYGZhy{}twocol.mplstyle}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
The prefix \sphinxcode{\sphinxupquote{./}} tells Matplotlib to look for the file in the current directory.
As an alternative, you can install a style sheet for your own use by putting it in your configuration directory. To find out where that is, you can run the following command:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{matplotlib} \PYG{k}{as} \PYG{n+nn}{mpl}
\PYG{n}{mpl}\PYG{o}{.}\PYG{n}{get\PYGZus{}configdir}\PYG{p}{(}\PYG{p}{)}
\end{sphinxVerbatim}
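According to the Matplotlib documentation, user style sheets belong in a subdirectory of the configuration directory named \sphinxcode{\sphinxupquote{stylelib}}. The following cell is a sketch of how you might locate it; the variable name \sphinxcode{\sphinxupquote{stylelib}} is introduced here for illustration, and the directory might not exist until you create it.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
import os
import matplotlib as mpl
import matplotlib.pyplot as plt

# Where Matplotlib looks for user-installed style sheets
stylelib = os.path.join(mpl.get_configdir(), 'stylelib')
print(stylelib, os.path.isdir(stylelib))

# After copying a .mplstyle file there, refresh the list of styles
plt.style.reload_library()
\end{sphinxVerbatim}
Once a style sheet is in that directory, you should be able to select it by name, without the \sphinxcode{\sphinxupquote{./}} prefix or the \sphinxcode{\sphinxupquote{.mplstyle}} extension.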
\section{LaTeX fonts}
\label{\detokenize{07_plot:latex-fonts}}
When you include mathematical expressions in titles, labels, and annotations, Matplotlib uses \sphinxhref{https://matplotlib.org/3.1.0/tutorials/text/mathtext.html}{\sphinxcode{\sphinxupquote{mathtext}}} to typeset them. \sphinxcode{\sphinxupquote{mathtext}} uses the same syntax as LaTeX, but it provides only a subset of its features.
If you need features that are not provided by \sphinxcode{\sphinxupquote{mathtext}}, or you prefer the way LaTeX typesets mathematical expressions, you can customize Matplotlib to use LaTeX.
In \sphinxcode{\sphinxupquote{matplotlibrc}} or in a style sheet, you can add the following line:
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{text}\PYG{o}{.}\PYG{n}{usetex} \PYG{p}{:} \PYG{n}{true}
\end{sphinxVerbatim}
Or in a notebook you can run the following code.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{rcParams}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{text.usetex}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{=} \PYG{k+kc}{True}
\end{sphinxVerbatim}
If you go back and draw the figure again, you should see the difference.
If you get an error message like
\begin{sphinxVerbatim}[commandchars=\\\{\}]
LaTeX Error: File `type1cm.sty\PYGZsq{} not found.
\end{sphinxVerbatim}
you might have to install a package that contains the fonts LaTeX needs. On some systems, the packages \sphinxcode{\sphinxupquote{texlive\sphinxhyphen{}latex\sphinxhyphen{}extra}} or \sphinxcode{\sphinxupquote{cm\sphinxhyphen{}super}} might be what you need. \sphinxhref{https://stackoverflow.com/questions/11354149/python-unable-to-render-tex-in-matplotlib}{See here for more help with this}.
In case you are curious, \sphinxcode{\sphinxupquote{cm}} stands for \sphinxhref{https://en.wikipedia.org/wiki/Computer\_Modern}{Computer Modern}, the font LaTeX uses to typeset math.
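Because this option depends on a LaTeX installation, it can be useful to check for one before turning the option on. The following cell is a sketch, not part of the original notebook: it uses \sphinxcode{\sphinxupquote{shutil.which}} from the standard library to look for an executable named \sphinxcode{\sphinxupquote{latex}} on the search path. It does not check for the additional packages mentioned above, so rendering might still fail.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
import shutil
import matplotlib.pyplot as plt

# True only if an executable named latex is found on the search path
have_latex = shutil.which('latex') is not None

# Fall back to mathtext when LaTeX is not available
plt.rcParams['text.usetex'] = have_latex
\end{sphinxVerbatim}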
\section{Multiple panels}
\label{\detokenize{07_plot:multiple-panels}}
So far we've been working with one figure at a time, but the figure we are replicating contains multiple panels, also known as “subplots”.
Confusingly, Matplotlib provides \sphinxstyleemphasis{three} functions for making figures like this: \sphinxcode{\sphinxupquote{subplot}}, \sphinxcode{\sphinxupquote{subplots}}, and \sphinxcode{\sphinxupquote{subplot2grid}}.
\begin{itemize}
\item {}
\sphinxhref{https://matplotlib.org/3.3.1/api/\_as\_gen/matplotlib.pyplot.subplot.html}{\sphinxcode{\sphinxupquote{subplot}}} is simple and similar to MATLAB, so if you are familiar with that interface, you might like \sphinxcode{\sphinxupquote{subplot}}.
\item {}
\sphinxhref{https://matplotlib.org/3.3.1/api/\_as\_gen/matplotlib.pyplot.subplots.html}{\sphinxcode{\sphinxupquote{subplots}}} is more object\sphinxhyphen{}oriented, which some people prefer.
\item {}
\sphinxhref{https://matplotlib.org/3.3.1/api/\_as\_gen/matplotlib.pyplot.subplot2grid.html}{\sphinxcode{\sphinxupquote{subplot2grid}}} is most convenient if you want to control the relative sizes of the subplots.
\end{itemize}
So we'll use \sphinxcode{\sphinxupquote{subplot2grid}}.
All of these functions are easier to use if we put the code that generates each panel in a function.
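To make the differences concrete, the following sketch calls each of the three functions on an empty figure. The grid shapes and the variable names \sphinxcode{\sphinxupquote{axes}}, \sphinxcode{\sphinxupquote{wide}}, and \sphinxcode{\sphinxupquote{narrow}} are arbitrary choices for illustration; no data are plotted.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
import matplotlib.pyplot as plt

# subplot: rows, columns, and a panel index that starts at 1
plt.figure()
plt.subplot(1, 2, 1)
plt.subplot(1, 2, 2)

# subplots: returns the figure and an array of axes objects
fig, axes = plt.subplots(1, 2)

# subplot2grid: grid shape, location, and an optional column span
plt.figure()
wide = plt.subplot2grid((1, 3), (0, 0), colspan=2)
narrow = plt.subplot2grid((1, 3), (0, 2))
\end{sphinxVerbatim}
In the last figure, the first panel spans two of the three columns, so it is wider than the second.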
\section{Upper right}
\label{\detokenize{07_plot:upper-right}}
To make the panel in the upper right, we have to reload \sphinxcode{\sphinxupquote{centerline}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{os}
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}dataframe.hdf5}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{path} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{https://github.com/AllenDowney/AstronomicalData/raw/main/data/}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{k}{if} \PYG{o+ow}{not} \PYG{n}{os}\PYG{o}{.}\PYG{n}{path}\PYG{o}{.}\PYG{n}{exists}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{download}\PYG{p}{(}\PYG{n}{path}\PYG{o}{+}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{pandas} \PYG{k}{as} \PYG{n+nn}{pd}
\PYG{n}{centerline} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{read\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{centerline}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
And define the coordinates of the rectangle we selected.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{pm1\PYGZus{}min} \PYG{o}{=} \PYG{o}{\PYGZhy{}}\PYG{l+m+mf}{8.9}
\PYG{n}{pm1\PYGZus{}max} \PYG{o}{=} \PYG{o}{\PYGZhy{}}\PYG{l+m+mf}{6.9}
\PYG{n}{pm2\PYGZus{}min} \PYG{o}{=} \PYG{o}{\PYGZhy{}}\PYG{l+m+mf}{2.2}
\PYG{n}{pm2\PYGZus{}max} \PYG{o}{=} \PYG{l+m+mf}{1.0}
\PYG{n}{pm1\PYGZus{}rect} \PYG{o}{=} \PYG{p}{[}\PYG{n}{pm1\PYGZus{}min}\PYG{p}{,} \PYG{n}{pm1\PYGZus{}min}\PYG{p}{,} \PYG{n}{pm1\PYGZus{}max}\PYG{p}{,} \PYG{n}{pm1\PYGZus{}max}\PYG{p}{]}
\PYG{n}{pm2\PYGZus{}rect} \PYG{o}{=} \PYG{p}{[}\PYG{n}{pm2\PYGZus{}min}\PYG{p}{,} \PYG{n}{pm2\PYGZus{}max}\PYG{p}{,} \PYG{n}{pm2\PYGZus{}max}\PYG{p}{,} \PYG{n}{pm2\PYGZus{}min}\PYG{p}{]}
\end{sphinxVerbatim}
To plot this rectangle, we'll use a feature we have not seen before: \sphinxcode{\sphinxupquote{Polygon}}, which is provided by Matplotlib.
To create a \sphinxcode{\sphinxupquote{Polygon}}, we have to put the coordinates in an array with \sphinxcode{\sphinxupquote{x}} values in the first column and \sphinxcode{\sphinxupquote{y}} values in the second column.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{numpy} \PYG{k}{as} \PYG{n+nn}{np}
\PYG{n}{vertices} \PYG{o}{=} \PYG{n}{np}\PYG{o}{.}\PYG{n}{transpose}\PYG{p}{(}\PYG{p}{[}\PYG{n}{pm1\PYGZus{}rect}\PYG{p}{,} \PYG{n}{pm2\PYGZus{}rect}\PYG{p}{]}\PYG{p}{)}
\PYG{n}{vertices}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
array([[\PYGZhy{}8.9, \PYGZhy{}2.2],
[\PYGZhy{}8.9, 1. ],
[\PYGZhy{}6.9, 1. ],
[\PYGZhy{}6.9, \PYGZhy{}2.2]])
\end{sphinxVerbatim}
The following function takes a \sphinxcode{\sphinxupquote{DataFrame}} as a parameter, plots the proper motion for each star, and adds a shaded \sphinxcode{\sphinxupquote{Polygon}} to show the region we selected.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{from} \PYG{n+nn}{matplotlib}\PYG{n+nn}{.}\PYG{n+nn}{patches} \PYG{k+kn}{import} \PYG{n}{Polygon}
\PYG{k}{def} \PYG{n+nf}{plot\PYGZus{}proper\PYGZus{}motion}\PYG{p}{(}\PYG{n}{df}\PYG{p}{)}\PYG{p}{:}
    \PYG{n}{pm1} \PYG{o}{=} \PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pm\PYGZus{}phi1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
    \PYG{n}{pm2} \PYG{o}{=} \PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{pm\PYGZus{}phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{pm1}\PYG{p}{,} \PYG{n}{pm2}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{)}
    \PYG{n}{poly} \PYG{o}{=} \PYG{n}{Polygon}\PYG{p}{(}\PYG{n}{vertices}\PYG{p}{,} \PYG{n}{closed}\PYG{o}{=}\PYG{k+kc}{True}\PYG{p}{,}
                   \PYG{n}{facecolor}\PYG{o}{=}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{C1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.4}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{gca}\PYG{p}{(}\PYG{p}{)}\PYG{o}{.}\PYG{n}{add\PYGZus{}patch}\PYG{p}{(}\PYG{n}{poly}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{\PYGZdl{}}\PYG{l+s+s1}{\PYGZbs{}}\PYG{l+s+s1}{mu\PYGZus{}}\PYG{l+s+s1}{\PYGZob{}}\PYG{l+s+s1}{\PYGZbs{}}\PYG{l+s+s1}{phi\PYGZus{}1\PYGZcb{} [}\PYG{l+s+s1}{\PYGZbs{}}\PYG{l+s+s1}{mathrm}\PYG{l+s+s1}{\PYGZob{}}\PYG{l+s+s1}{mas\PYGZti{}yr\PYGZcb{}\PYGZca{}}\PYG{l+s+s1}{\PYGZob{}}\PYG{l+s+s1}{\PYGZhy{}1\PYGZcb{}]\PYGZdl{}}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{\PYGZdl{}}\PYG{l+s+s1}{\PYGZbs{}}\PYG{l+s+s1}{mu\PYGZus{}}\PYG{l+s+s1}{\PYGZob{}}\PYG{l+s+s1}{\PYGZbs{}}\PYG{l+s+s1}{phi\PYGZus{}2\PYGZcb{} [}\PYG{l+s+s1}{\PYGZbs{}}\PYG{l+s+s1}{mathrm}\PYG{l+s+s1}{\PYGZob{}}\PYG{l+s+s1}{mas\PYGZti{}yr\PYGZcb{}\PYGZca{}}\PYG{l+s+s1}{\PYGZob{}}\PYG{l+s+s1}{\PYGZhy{}1\PYGZcb{}]\PYGZdl{}}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlim}\PYG{p}{(}\PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{12}\PYG{p}{,} \PYG{l+m+mi}{8}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylim}\PYG{p}{(}\PYG{o}{\PYGZhy{}}\PYG{l+m+mi}{10}\PYG{p}{,} \PYG{l+m+mi}{10}\PYG{p}{)}
\end{sphinxVerbatim}
Notice that \sphinxcode{\sphinxupquote{add\_patch}} is like \sphinxcode{\sphinxupquote{invert\_yaxis}}; in order to call it, we have to use \sphinxcode{\sphinxupquote{gca}} to get the current axes.
Here's what the new version of the figure looks like. We've changed the labels on the axes to be consistent with the paper.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{rcParams}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{text.usetex}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{=} \PYG{k+kc}{False}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{style}\PYG{o}{.}\PYG{n}{use}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{default}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{plot\PYGZus{}proper\PYGZus{}motion}\PYG{p}{(}\PYG{n}{centerline}\PYG{p}{)}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{07_plot_50_0}.png}
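For an axis\sphinxhyphen{}aligned rectangle like the one in this panel, Matplotlib also provides \sphinxcode{\sphinxupquote{Rectangle}}, which takes the lower\sphinxhyphen{}left corner, a width, and a height. The following sketch is an alternative to the \sphinxcode{\sphinxupquote{Polygon}} used in \sphinxcode{\sphinxupquote{plot\_proper\_motion}}, not the approach taken in the rest of this notebook; it repeats the bounds so the cell runs on its own, and it draws only the shaded region, without the stars.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

# Same bounds as the rectangle defined earlier in this section
pm1_min, pm1_max = -8.9, -6.9
pm2_min, pm2_max = -2.2, 1.0

# Lower-left corner, then width and height
rect = Rectangle((pm1_min, pm2_min),
                 pm1_max - pm1_min, pm2_max - pm2_min,
                 facecolor='C1', alpha=0.4)

plt.figure()
plt.gca().add_patch(rect)
plt.xlim(-12, 8)
plt.ylim(-10, 10)
\end{sphinxVerbatim}
\sphinxcode{\sphinxupquote{Polygon}} is the more general tool, which is why it is also used for the region in the color\sphinxhyphen{}magnitude diagram.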
\section{Upper left}
\label{\detokenize{07_plot:upper-left}}
Now let's work on the panel in the upper left. We have to reload \sphinxcode{\sphinxupquote{candidates}}.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{os}
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}candidates.hdf5}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{path} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{https://github.com/AllenDowney/AstronomicalData/raw/main/data/}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{k}{if} \PYG{o+ow}{not} \PYG{n}{os}\PYG{o}{.}\PYG{n}{path}\PYG{o}{.}\PYG{n}{exists}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{download}\PYG{p}{(}\PYG{n}{path}\PYG{o}{+}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{pandas} \PYG{k}{as} \PYG{n+nn}{pd}
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}candidates.hdf5}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{candidate\PYGZus{}df} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{read\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{candidate\PYGZus{}df}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
Here's a function that takes a \sphinxcode{\sphinxupquote{DataFrame}} of candidate stars and plots their positions in GD\sphinxhyphen{}1 coordinates.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k}{def} \PYG{n+nf}{plot\PYGZus{}first\PYGZus{}selection}\PYG{p}{(}\PYG{n}{df}\PYG{p}{)}\PYG{p}{:}
    \PYG{n}{x} \PYG{o}{=} \PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
    \PYG{n}{y} \PYG{o}{=} \PYG{n}{df}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{phi2}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{x}\PYG{p}{,} \PYG{n}{y}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{\PYGZdl{}}\PYG{l+s+s1}{\PYGZbs{}}\PYG{l+s+s1}{phi\PYGZus{}1\PYGZdl{} [deg]}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{\PYGZdl{}}\PYG{l+s+s1}{\PYGZbs{}}\PYG{l+s+s1}{phi\PYGZus{}2\PYGZdl{} [deg]}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{title}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{Proper motion selection}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{fontsize}\PYG{o}{=}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{medium}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{axis}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{equal}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
And here's what it looks like.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plot\PYGZus{}first\PYGZus{}selection}\PYG{p}{(}\PYG{n}{candidate\PYGZus{}df}\PYG{p}{)}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{07_plot_57_0}.png}
\section{Lower right}
\label{\detokenize{07_plot:lower-right}}
For the figure in the lower right, we need to reload the merged \sphinxcode{\sphinxupquote{DataFrame}}, which contains data from Gaia and photometry data from Pan\sphinxhyphen{}STARRS.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{pandas} \PYG{k}{as} \PYG{n+nn}{pd}
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}merged.hdf5}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{merged} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{read\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{merged}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
From the previous notebook, here's the function that plots the color\sphinxhyphen{}magnitude diagram.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{matplotlib}\PYG{n+nn}{.}\PYG{n+nn}{pyplot} \PYG{k}{as} \PYG{n+nn}{plt}
\PYG{k}{def} \PYG{n+nf}{plot\PYGZus{}cmd}\PYG{p}{(}\PYG{n}{table}\PYG{p}{)}\PYG{p}{:}
    \PYG{l+s+sd}{\PYGZdq{}\PYGZdq{}\PYGZdq{}Plot a color magnitude diagram.}
\PYG{l+s+sd}{    }
\PYG{l+s+sd}{    table: Table or DataFrame with photometry data}
\PYG{l+s+sd}{    \PYGZdq{}\PYGZdq{}\PYGZdq{}}
    \PYG{n}{y} \PYG{o}{=} \PYG{n}{table}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{g\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
    \PYG{n}{x} \PYG{o}{=} \PYG{n}{table}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{g\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]} \PYG{o}{\PYGZhy{}} \PYG{n}{table}\PYG{p}{[}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{i\PYGZus{}mean\PYGZus{}psf\PYGZus{}mag}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{]}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{plot}\PYG{p}{(}\PYG{n}{x}\PYG{p}{,} \PYG{n}{y}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{ko}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{markersize}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.3}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlim}\PYG{p}{(}\PYG{p}{[}\PYG{l+m+mi}{0}\PYG{p}{,} \PYG{l+m+mf}{1.5}\PYG{p}{]}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylim}\PYG{p}{(}\PYG{p}{[}\PYG{l+m+mi}{14}\PYG{p}{,} \PYG{l+m+mi}{22}\PYG{p}{]}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{gca}\PYG{p}{(}\PYG{p}{)}\PYG{o}{.}\PYG{n}{invert\PYGZus{}yaxis}\PYG{p}{(}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{ylabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{\PYGZdl{}g\PYGZus{}0\PYGZdl{}}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
    \PYG{n}{plt}\PYG{o}{.}\PYG{n}{xlabel}\PYG{p}{(}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{\PYGZdl{}(g\PYGZhy{}i)\PYGZus{}0\PYGZdl{}}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\end{sphinxVerbatim}
And here's what it looks like.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plot\PYGZus{}cmd}\PYG{p}{(}\PYG{n}{merged}\PYG{p}{)}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{07_plot_63_0}.png}
\sphinxstylestrong{Exercise:} Add a few lines to \sphinxcode{\sphinxupquote{plot\_cmd}} to show the Polygon we selected as a shaded area.
Run these cells to get the polygon coordinates we saved in the previous notebook.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{k+kn}{import} \PYG{n+nn}{os}
\PYG{n}{filename} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{gd1\PYGZus{}polygon.hdf5}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{n}{path} \PYG{o}{=} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{https://github.com/AllenDowney/AstronomicalData/raw/main/data/}\PYG{l+s+s1}{\PYGZsq{}}
\PYG{k}{if} \PYG{o+ow}{not} \PYG{n}{os}\PYG{o}{.}\PYG{n}{path}\PYG{o}{.}\PYG{n}{exists}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{:}
    \PYG{n+nb}{print}\PYG{p}{(}\PYG{n}{download}\PYG{p}{(}\PYG{n}{path}\PYG{o}{+}\PYG{n}{filename}\PYG{p}{)}\PYG{p}{)}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{coords\PYGZus{}df} \PYG{o}{=} \PYG{n}{pd}\PYG{o}{.}\PYG{n}{read\PYGZus{}hdf}\PYG{p}{(}\PYG{n}{filename}\PYG{p}{,} \PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{coords\PYGZus{}df}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{)}
\PYG{n}{coords} \PYG{o}{=} \PYG{n}{coords\PYGZus{}df}\PYG{o}{.}\PYG{n}{to\PYGZus{}numpy}\PYG{p}{(}\PYG{p}{)}
\PYG{n}{coords}
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
array([[ 0.21505376, 17.5481972 ],
[ 0.38978495, 18.94628403],
[ 0.53763441, 19.90286976],
[ 0.70340502, 20.60191317],
[ 0.82885305, 21.30095659],
[ 0.66308244, 21.52170714],
[ 0.43010753, 20.78587196],
[ 0.27329749, 19.71891096],
[ 0.17473118, 18.68874172],
[ 0.17473118, 17.95290655]])
\end{sphinxVerbatim}
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{c+c1}{\PYGZsh{} Solution}
\PYG{c+c1}{\PYGZsh{}poly = Polygon(coords, closed=True, }
\PYG{c+c1}{\PYGZsh{} facecolor=\PYGZsq{}C1\PYGZsq{}, alpha=0.4)}
\PYG{c+c1}{\PYGZsh{}plt.gca().add\PYGZus{}patch(poly)}
\end{sphinxVerbatim}
\section{Subplots}
\label{\detokenize{07_plot:subplots}}
Now we're ready to put it all together. To make a figure with four subplots, we'll use \sphinxcode{\sphinxupquote{subplot2grid}}, \sphinxhref{https://matplotlib.org/3.3.1/api/\_as\_gen/matplotlib.pyplot.subplot2grid.html}{which requires two arguments}:
\begin{itemize}
\item {}
\sphinxcode{\sphinxupquote{shape}}, which is a tuple with the number of rows and columns in the grid, and
\item {}
\sphinxcode{\sphinxupquote{loc}}, which is a tuple identifying the location in the grid we're about to fill.
\end{itemize}
In this example, \sphinxcode{\sphinxupquote{shape}} is \sphinxcode{\sphinxupquote{(2, 2)}} to create two rows and two columns.
For the first panel, \sphinxcode{\sphinxupquote{loc}} is \sphinxcode{\sphinxupquote{(0, 0)}}, which indicates row 0 and column 0, which is the upper\sphinxhyphen{}left panel.
Here's how we use it to draw the four panels.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{shape} \PYG{o}{=} \PYG{p}{(}\PYG{l+m+mi}{2}\PYG{p}{,} \PYG{l+m+mi}{2}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{subplot2grid}\PYG{p}{(}\PYG{n}{shape}\PYG{p}{,} \PYG{p}{(}\PYG{l+m+mi}{0}\PYG{p}{,} \PYG{l+m+mi}{0}\PYG{p}{)}\PYG{p}{)}
\PYG{n}{plot\PYGZus{}first\PYGZus{}selection}\PYG{p}{(}\PYG{n}{candidate\PYGZus{}df}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{subplot2grid}\PYG{p}{(}\PYG{n}{shape}\PYG{p}{,} \PYG{p}{(}\PYG{l+m+mi}{0}\PYG{p}{,} \PYG{l+m+mi}{1}\PYG{p}{)}\PYG{p}{)}
\PYG{n}{plot\PYGZus{}proper\PYGZus{}motion}\PYG{p}{(}\PYG{n}{centerline}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{subplot2grid}\PYG{p}{(}\PYG{n}{shape}\PYG{p}{,} \PYG{p}{(}\PYG{l+m+mi}{1}\PYG{p}{,} \PYG{l+m+mi}{0}\PYG{p}{)}\PYG{p}{)}
\PYG{n}{plot\PYGZus{}second\PYGZus{}selection}\PYG{p}{(}\PYG{n}{selected}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{subplot2grid}\PYG{p}{(}\PYG{n}{shape}\PYG{p}{,} \PYG{p}{(}\PYG{l+m+mi}{1}\PYG{p}{,} \PYG{l+m+mi}{1}\PYG{p}{)}\PYG{p}{)}
\PYG{n}{plot\PYGZus{}cmd}\PYG{p}{(}\PYG{n}{merged}\PYG{p}{)}
\PYG{n}{poly} \PYG{o}{=} \PYG{n}{Polygon}\PYG{p}{(}\PYG{n}{coords}\PYG{p}{,} \PYG{n}{closed}\PYG{o}{=}\PYG{k+kc}{True}\PYG{p}{,}
               \PYG{n}{facecolor}\PYG{o}{=}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{C1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.4}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{gca}\PYG{p}{(}\PYG{p}{)}\PYG{o}{.}\PYG{n}{add\PYGZus{}patch}\PYG{p}{(}\PYG{n}{poly}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{tight\PYGZus{}layout}\PYG{p}{(}\PYG{p}{)}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{07_plot_69_0}.png}
We use \sphinxhref{https://matplotlib.org/3.3.1/tutorials/intermediate/tight\_layout\_guide.html}{\sphinxcode{\sphinxupquote{plt.tight\_layout}}} at the end, which adjusts the sizes of the panels to make sure the titles and axis labels don't overlap.
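The plotting functions above depend on data from earlier notebooks, so here is a minimal, self\sphinxhyphen{}contained sketch of the same grid layout. The data and titles are placeholders invented for illustration, and the \sphinxcode{\sphinxupquote{Agg}} backend is chosen only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for running headless
import matplotlib.pyplot as plt

shape = (2, 2)                            # two rows, two columns
locs = [(0, 0), (0, 1), (1, 0), (1, 1)]   # (row, column) of each panel

fig = plt.figure()
axes = []
for loc in locs:
    # subplot2grid creates the panel at loc and makes it the current axes
    ax = plt.subplot2grid(shape, loc)
    ax.plot([0, 1, 2], [0, 1, 4])         # placeholder data
    ax.set_title(f"loc = {loc}")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    axes.append(ax)

# Adjust panel sizes so titles and axis labels do not overlap
plt.tight_layout()
```

Because \sphinxcode{\sphinxupquote{subplot2grid}} returns the new axes, you can also keep a reference to each panel, as this sketch does, instead of relying on the current axes.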
\sphinxstylestrong{Exercise:} See what happens if you leave out \sphinxcode{\sphinxupquote{tight\_layout}}.
\section{Adjusting proportions}
\label{\detokenize{07_plot:adjusting-proportions}}
In the previous figure, the panels are all the same size. To get a better view of GD\sphinxhyphen{}1, we'd like to stretch the panels on the left and compress the ones on the right.
To do that, we'll use the \sphinxcode{\sphinxupquote{colspan}} argument to make a panel that spans multiple columns in the grid.
In the following example, \sphinxcode{\sphinxupquote{shape}} is \sphinxcode{\sphinxupquote{(2, 4)}}, which means 2 rows and 4 columns.
The panels on the left span three columns, so they are about three times as wide as the panels on the right.
At the same time, we use \sphinxcode{\sphinxupquote{figsize}} to adjust the aspect ratio of the whole figure.
\begin{sphinxVerbatim}[commandchars=\\\{\}]
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{figure}\PYG{p}{(}\PYG{n}{figsize}\PYG{o}{=}\PYG{p}{(}\PYG{l+m+mi}{9}\PYG{p}{,} \PYG{l+m+mf}{4.5}\PYG{p}{)}\PYG{p}{)}
\PYG{n}{shape} \PYG{o}{=} \PYG{p}{(}\PYG{l+m+mi}{2}\PYG{p}{,} \PYG{l+m+mi}{4}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{subplot2grid}\PYG{p}{(}\PYG{n}{shape}\PYG{p}{,} \PYG{p}{(}\PYG{l+m+mi}{0}\PYG{p}{,} \PYG{l+m+mi}{0}\PYG{p}{)}\PYG{p}{,} \PYG{n}{colspan}\PYG{o}{=}\PYG{l+m+mi}{3}\PYG{p}{)}
\PYG{n}{plot\PYGZus{}first\PYGZus{}selection}\PYG{p}{(}\PYG{n}{candidate\PYGZus{}df}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{subplot2grid}\PYG{p}{(}\PYG{n}{shape}\PYG{p}{,} \PYG{p}{(}\PYG{l+m+mi}{0}\PYG{p}{,} \PYG{l+m+mi}{3}\PYG{p}{)}\PYG{p}{)}
\PYG{n}{plot\PYGZus{}proper\PYGZus{}motion}\PYG{p}{(}\PYG{n}{centerline}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{subplot2grid}\PYG{p}{(}\PYG{n}{shape}\PYG{p}{,} \PYG{p}{(}\PYG{l+m+mi}{1}\PYG{p}{,} \PYG{l+m+mi}{0}\PYG{p}{)}\PYG{p}{,} \PYG{n}{colspan}\PYG{o}{=}\PYG{l+m+mi}{3}\PYG{p}{)}
\PYG{n}{plot\PYGZus{}second\PYGZus{}selection}\PYG{p}{(}\PYG{n}{selected}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{subplot2grid}\PYG{p}{(}\PYG{n}{shape}\PYG{p}{,} \PYG{p}{(}\PYG{l+m+mi}{1}\PYG{p}{,} \PYG{l+m+mi}{3}\PYG{p}{)}\PYG{p}{)}
\PYG{n}{plot\PYGZus{}cmd}\PYG{p}{(}\PYG{n}{merged}\PYG{p}{)}
\PYG{n}{poly} \PYG{o}{=} \PYG{n}{Polygon}\PYG{p}{(}\PYG{n}{coords}\PYG{p}{,} \PYG{n}{closed}\PYG{o}{=}\PYG{k+kc}{True}\PYG{p}{,}
               \PYG{n}{facecolor}\PYG{o}{=}\PYG{l+s+s1}{\PYGZsq{}}\PYG{l+s+s1}{C1}\PYG{l+s+s1}{\PYGZsq{}}\PYG{p}{,} \PYG{n}{alpha}\PYG{o}{=}\PYG{l+m+mf}{0.4}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{gca}\PYG{p}{(}\PYG{p}{)}\PYG{o}{.}\PYG{n}{add\PYGZus{}patch}\PYG{p}{(}\PYG{n}{poly}\PYG{p}{)}
\PYG{n}{plt}\PYG{o}{.}\PYG{n}{tight\PYGZus{}layout}\PYG{p}{(}\PYG{p}{)}
\end{sphinxVerbatim}
\noindent\sphinxincludegraphics{{07_plot_72_0}.png}
This is looking more and more like the figure in the paper.
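To check the effect of \sphinxcode{\sphinxupquote{colspan}} numerically, the following sketch builds only the top row of the grid with empty panels and compares their widths. The variable names are made up for illustration, and the \sphinxcode{\sphinxupquote{Agg}} backend is an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for running headless
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(9, 4.5))
shape = (2, 4)  # 2 rows, 4 columns

# The wide panel starts at column 0 and spans three columns
wide = plt.subplot2grid(shape, (0, 0), colspan=3)
# The narrow panel occupies the remaining column
narrow = plt.subplot2grid(shape, (0, 3))

# get_position() returns each panel's bounding box in figure coordinates
wide_width = wide.get_position().width
narrow_width = narrow.get_position().width
ratio = wide_width / narrow_width
print(f"width ratio: {ratio:.2f}")
```

With the default spacing, the ratio comes out somewhat larger than 3, because a panel that spans three columns also covers the two gaps between them.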
\sphinxstylestrong{Exercise:} In this example, the ratio of the widths of the panels is 3:1. How would you adjust it if you wanted the ratio to be 3:2?
\section{Summary}
\label{\detokenize{07_plot:summary}}
In this notebook, we reverse\sphinxhyphen{}engineered the figure we've been replicating, identifying elements that seem effective and others that could be improved.
We explored features Matplotlib provides for adding annotations to figures \textendash{} including text, lines, arrows, and polygons \textendash{} and several ways to customize the appearance of figures. And we learned how to create figures that contain multiple panels.
\section{Best practices}
\label{\detokenize{07_plot:best-practices}}\begin{itemize}
\item {}
The most effective figures focus on telling a single story clearly and compellingly.
\item {}
Consider using annotations to guide the reader's attention to the most important elements of a figure.
\item {}
The default Matplotlib style generates good\sphinxhyphen{}quality figures, but there are several ways you can override the defaults.
\item {}
If you find yourself making the same customizations on several projects, you might want to create your own style sheet.
\end{itemize}
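As a sketch of the last point, the following example writes a small style sheet to a temporary file and applies it with \sphinxcode{\sphinxupquote{plt.style.context}}. The file name \sphinxcode{\sphinxupquote{my\_style.mplstyle}} and the particular settings are hypothetical, chosen only for illustration.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for running headless
import matplotlib.pyplot as plt

# Hypothetical customizations, one "key: value" pair per line
style_text = """\
font.size: 14
axes.grid: True
lines.linewidth: 2
"""

# Write the style sheet to a temporary directory
style_path = Path(tempfile.mkdtemp()) / "my_style.mplstyle"
style_path.write_text(style_text)

default_size = plt.rcParams["font.size"]

# Apply the style sheet only inside the with block
with plt.style.context(str(style_path)):
    size_in_context = plt.rcParams["font.size"]
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])  # placeholder data
    ax.set_title("Styled figure")
    plt.close(fig)

# Outside the block, the previous settings are restored
size_after = plt.rcParams["font.size"]
```

Using \sphinxcode{\sphinxupquote{plt.style.context}} keeps the customizations local to one figure. To apply a style sheet for the rest of a session, you could call \sphinxcode{\sphinxupquote{plt.style.use}} with the same path.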
\renewcommand{\indexname}{Index}
\printindex
\end{document}