<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Transformation and Selection &#8212; Astronomical Data in Python</title>
<link rel="stylesheet" href="_static/css/index.d431a4ee1c1efae0e38bdfebc22debff.css">
<link rel="stylesheet"
href="_static/vendor/fontawesome/5.13.0/css/all.min.css">
<link rel="preload" as="font" type="font/woff2" crossorigin
href="_static/vendor/fontawesome/5.13.0/webfonts/fa-solid-900.woff2">
<link rel="preload" as="font" type="font/woff2" crossorigin
href="_static/vendor/fontawesome/5.13.0/webfonts/fa-brands-400.woff2">
<link rel="stylesheet"
href="_static/vendor/open-sans_all/1.44.1/index.css">
<link rel="stylesheet"
href="_static/vendor/lato_latin-ext/1.44.1/index.css">
<link rel="stylesheet" href="_static/sphinx-book-theme.bfb7730f9caf2ec0b46a44615585038c.css" type="text/css" />
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
<link rel="stylesheet" type="text/css" href="_static/togglebutton.css" />
<link rel="stylesheet" type="text/css" href="_static/copybutton.css" />
<link rel="stylesheet" type="text/css" href="_static/mystnb.css" />
<link rel="stylesheet" type="text/css" href="_static/sphinx-thebe.css" />
<link rel="stylesheet" type="text/css" href="_static/panels-main.c949a650a448cc0ae9fd3441c0e17fb0.css" />
<link rel="stylesheet" type="text/css" href="_static/panels-variables.06eb56fa6e07937060861dad626602ad.css" />
<link rel="preload" as="script" href="_static/js/index.30270b6e4c972e43c488.js">
<script id="documentation_options" data-url_root="./" src="_static/documentation_options.js"></script>
<script src="_static/jquery.js"></script>
<script src="_static/underscore.js"></script>
<script src="_static/doctools.js"></script>
<script src="_static/language_data.js"></script>
<script src="_static/togglebutton.js"></script>
<script src="_static/clipboard.min.js"></script>
<script src="_static/copybutton.js"></script>
<script >var togglebuttonSelector = '.toggle, .admonition.dropdown, .tag_hide_input div.cell_input, .tag_hide-input div.cell_input, .tag_hide_output div.cell_output, .tag_hide-output div.cell_output, .tag_hide_cell.cell, .tag_hide-cell.cell';</script>
<script src="_static/sphinx-book-theme.be0a4a0c39cd630af62a2fcf693f3f06.js"></script>
<script async="async" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.7/latest.js?config=TeX-AMS-MML_HTMLorMML"></script>
<script type="text/x-mathjax-config">MathJax.Hub.Config({"tex2jax": {"inlineMath": [["\\(", "\\)"]], "displayMath": [["\\[", "\\]"]], "processRefs": false, "processEnvironments": false}})</script>
<script async="async" src="https://unpkg.com/thebelab@latest/lib/index.js"></script>
<script >
const thebe_selector = ".thebe"
const thebe_selector_input = "pre"
const thebe_selector_output = ".output"
</script>
<script async="async" src="_static/sphinx-thebe.js"></script>
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="Joining Tables" href="05_join.html" />
<link rel="prev" title="Proper Motion" href="03_motion.html" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="docsearch:language" content="en" />
</head>
<body data-spy="scroll" data-target="#bd-toc-nav" data-offset="80">
<div class="container-xl">
<div class="row">
<div class="col-12 col-md-3 bd-sidebar site-navigation show" id="site-navigation">
<div class="navbar-brand-box">
<a class="navbar-brand text-wrap" href="index.html">
<h1 class="site-logo" id="site-title">Astronomical Data in Python</h1>
</a>
</div>
<form class="bd-search d-flex align-items-center" action="search.html" method="get">
<i class="icon fas fa-search"></i>
<input type="search" class="form-control" name="q" id="search-input" placeholder="Search this book..." aria-label="Search this book..." autocomplete="off" >
</form>
<nav class="bd-links" id="bd-docs-nav" aria-label="Main navigation">
<ul class="nav sidenav_l1">
<li class="toctree-l1">
<a class="reference internal" href="README.html">
Astronomical Data in Python
</a>
</li>
</ul>
<ul class="current nav sidenav_l1">
<li class="toctree-l1">
<a class="reference internal" href="01_query.html">
Queries
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="02_coords.html">
Coordinates and units
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="03_motion.html">
Proper Motion
</a>
</li>
<li class="toctree-l1 current active">
<a class="current reference internal" href="#">
Transformation and Selection
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="05_join.html">
Joining Tables
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="06_photo.html">
Photometry
</a>
</li>
<li class="toctree-l1">
<a class="reference internal" href="07_plot.html">
Visualization
</a>
</li>
</ul>
</nav>
<!-- To handle the deprecated key -->
<div class="navbar_extra_footer">
Powered by <a href="https://jupyterbook.org">Jupyter Book</a>
</div>
</div>
<main class="col py-md-3 pl-md-4 bd-content overflow-auto" role="main">
<div class="row topbar fixed-top container-xl">
<div class="col-12 col-md-3 bd-topbar-whitespace site-navigation show">
</div>
<div class="col pl-2 topbar-main">
<button id="navbar-toggler" class="navbar-toggler ml-0" type="button" data-toggle="collapse"
data-toggle="tooltip" data-placement="bottom" data-target=".site-navigation" aria-controls="navbar-menu"
aria-expanded="true" aria-label="Toggle navigation" aria-controls="site-navigation"
title="Toggle navigation" data-toggle="tooltip" data-placement="left">
<i class="fas fa-bars"></i>
<i class="fas fa-arrow-left"></i>
<i class="fas fa-arrow-up"></i>
</button>
<div class="dropdown-buttons-trigger">
<button id="dropdown-buttons-trigger" class="btn btn-secondary topbarbtn" aria-label="Download this page"><i
class="fas fa-download"></i></button>
<div class="dropdown-buttons">
<!-- ipynb file if we had a myst markdown file -->
<!-- Download raw file -->
<a class="dropdown-buttons" href="_sources/04_select.ipynb"><button type="button"
class="btn btn-secondary topbarbtn" title="Download source file" data-toggle="tooltip"
data-placement="left">.ipynb</button></a>
<!-- Download PDF via print -->
<button type="button" id="download-print" class="btn btn-secondary topbarbtn" title="Print to PDF"
onClick="window.print()" data-toggle="tooltip" data-placement="left">.pdf</button>
</div>
</div>
<!-- Source interaction buttons -->
<div class="dropdown-buttons-trigger">
<button id="dropdown-buttons-trigger" class="btn btn-secondary topbarbtn"
aria-label="Connect with source repository"><i class="fab fa-github"></i></button>
<div class="dropdown-buttons sourcebuttons">
<a class="repository-button"
href="https://github.com/AllenDowney/AstronomicalData"><button type="button" class="btn btn-secondary topbarbtn"
data-toggle="tooltip" data-placement="left" title="Source repository"><i
class="fab fa-github"></i>repository</button></a>
</div>
</div>
<!-- Full screen (wrap in <a> to have style consistency -->
<a class="full-screen-button"><button type="button" class="btn btn-secondary topbarbtn" data-toggle="tooltip"
data-placement="bottom" onclick="toggleFullScreen()" title="Fullscreen mode"><i
class="fas fa-expand"></i></button></a>
<!-- Launch buttons -->
<div class="dropdown-buttons-trigger">
<button id="dropdown-buttons-trigger" class="btn btn-secondary topbarbtn"
aria-label="Launch interactive content"><i class="fas fa-rocket"></i></button>
<div class="dropdown-buttons">
<a class="binder-button" href="https://mybinder.org/v2/gh/AllenDowney/AstronomicalData/master?urlpath=tree/04_select.ipynb"><button type="button"
class="btn btn-secondary topbarbtn" title="Launch Binder" data-toggle="tooltip"
data-placement="left"><img class="binder-button-logo"
src="_static/images/logo_binder.svg"
alt="Interact on binder">Binder</button></a>
<a class="colab-button" href="https://colab.research.google.com/github/AllenDowney/AstronomicalData/blob/master/04_select.ipynb"><button type="button" class="btn btn-secondary topbarbtn"
title="Launch Colab" data-toggle="tooltip" data-placement="left"><img class="colab-button-logo"
src="_static/images/logo_colab.png"
alt="Interact on Colab">Colab</button></a>
</div>
</div>
</div>
<!-- Table of contents -->
<div class="d-none d-md-block col-md-2 bd-toc show">
<div class="tocsection onthispage pt-5 pb-3">
<i class="fas fa-list"></i> Contents
</div>
<nav id="bd-toc-nav">
<ul class="nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#outline">
Outline
</a>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#reload-the-data">
Reload the data
</a>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#selection-by-proper-motion">
Selection by proper motion
</a>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#convex-hull">
Convex Hull
</a>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#assembling-the-query">
Assembling the query
</a>
<ul class="nav section-nav flex-column">
<li class="toc-h3 nav-item toc-entry">
<a class="reference internal nav-link" href="#exercise">
Exercise
</a>
</li>
<li class="toc-h3 nav-item toc-entry">
<a class="reference internal nav-link" href="#id1">
Exercise
</a>
</li>
</ul>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#plotting-one-more-time">
Plotting one more time
</a>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#saving-the-dataframe">
Saving the DataFrame
</a>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#csv">
CSV
</a>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#summary">
Summary
</a>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#best-practices">
Best practices
</a>
</li>
</ul>
</nav>
</div>
</div>
<div id="main-content" class="row">
<div class="col-12 col-md-9 pl-md-3 pr-md-0">
<div>
<div class="section" id="transformation-and-selection">
<h1>Transformation and Selection<a class="headerlink" href="#transformation-and-selection" title="Permalink to this headline"></a></h1>
<p>This is the fourth in a series of notebooks related to astronomy data.</p>
<p>As a running example, we are replicating parts of the analysis in a recent paper, “<a class="reference external" href="https://arxiv.org/abs/1805.00425">Off the beaten path: Gaia reveals GD-1 stars outside of the main stream</a>” by Adrian M. Price-Whelan and Ana Bonaca.</p>
<p>In the first lesson, we wrote ADQL queries and used them to select and download data from the Gaia server.</p>
<p>In the second lesson, we wrote a query to select stars from the region of the sky where we expect GD-1 to be, and saved the results in a FITS file.</p>
<p>In the third lesson, we read that data back and identified stars with the proper motion we expect for GD-1.</p>
<div class="section" id="outline">
<h2>Outline<a class="headerlink" href="#outline" title="Permalink to this headline"></a></h2>
<p>Here are the steps in this lesson:</p>
<ol class="simple">
<li><p>Using data from the previous lesson, we’ll identify the values of proper motion for stars likely to be in GD-1.</p></li>
<li><p>Then we’ll compose an ADQL query that selects stars based on proper motion, so we can download only the data we need.</p></li>
<li><p>We’ll also see how to write the results to a CSV file.</p></li>
</ol>
<p>That will make it possible to search a bigger region of the sky in a single query.</p>
<p>After completing this lesson, you should be able to</p>
<ul class="simple">
<li><p>Transform proper motions from one frame to another.</p></li>
<li><p>Compute the convex hull of a set of points.</p></li>
<li><p>Write an ADQL query that selects based on proper motion.</p></li>
<li><p>Save data in CSV format.</p></li>
</ul>
</div>
<div class="section" id="reload-the-data">
<h2>Reload the data<a class="headerlink" href="#reload-the-data" title="Permalink to this headline"></a></h2>
<p>The following cells download the data from the previous lesson, if necessary, and load it into a Pandas <code class="docutils literal notranslate"><span class="pre">DataFrame</span></code>.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">os</span>
<span class="kn">from</span> <span class="nn">wget</span> <span class="kn">import</span> <span class="n">download</span>
<span class="n">filename</span> <span class="o">=</span> <span class="s1">&#39;gd1_dataframe.hdf5&#39;</span>
<span class="n">path</span> <span class="o">=</span> <span class="s1">&#39;https://github.com/AllenDowney/AstronomicalData/raw/main/data/&#39;</span>
<span class="k">if</span> <span class="ow">not</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">exists</span><span class="p">(</span><span class="n">filename</span><span class="p">):</span>
    <span class="nb">print</span><span class="p">(</span><span class="n">download</span><span class="p">(</span><span class="n">path</span><span class="o">+</span><span class="n">filename</span><span class="p">))</span>
</pre></div>
</div>
</div>
</div>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="nn">pd</span>
<span class="n">centerline_df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_hdf</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="s1">&#39;centerline_df&#39;</span><span class="p">)</span>
<span class="n">selected_df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_hdf</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="s1">&#39;selected_df&#39;</span><span class="p">)</span>
</pre></div>
</div>
</div>
</div>
</div>
<div class="section" id="selection-by-proper-motion">
<h2>Selection by proper motion<a class="headerlink" href="#selection-by-proper-motion" title="Permalink to this headline"></a></h2>
<p>Let’s review how we got to this point.</p>
<ol class="simple">
<li><p>We made an ADQL query to the Gaia server to get data for stars in the vicinity of GD-1.</p></li>
<li><p>We transformed the coordinates to the <code class="docutils literal notranslate"><span class="pre">GD1Koposov10</span></code> frame so we could select stars along the centerline of GD-1.</p></li>
<li><p>We plotted the proper motion of the centerline stars to identify the bounds of the overdense region.</p></li>
<li><p>We made a mask that selects stars whose proper motion is in the overdense region.</p></li>
</ol>
<p>At this point we have downloaded data for a relatively large number of stars (more than 100,000) and selected a relatively small number (around 1000).</p>
<p>It would be more efficient to use ADQL to select only the stars we need. That would also make it possible to download data covering a larger region of the sky.</p>
<p>However, the selection we did was based on proper motion in the <code class="docutils literal notranslate"><span class="pre">GD1Koposov10</span></code> frame. In order to do the same selection in ADQL, we have to work with proper motions in ICRS.</p>
<p>As a reminder, here’s the rectangle we selected based on proper motion in the <code class="docutils literal notranslate"><span class="pre">GD1Koposov10</span></code> frame.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pm1_min</span> <span class="o">=</span> <span class="o">-</span><span class="mf">8.9</span>
<span class="n">pm1_max</span> <span class="o">=</span> <span class="o">-</span><span class="mf">6.9</span>
<span class="n">pm2_min</span> <span class="o">=</span> <span class="o">-</span><span class="mf">2.2</span>
<span class="n">pm2_max</span> <span class="o">=</span> <span class="mf">1.0</span>
</pre></div>
</div>
</div>
</div>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">make_rectangle</span><span class="p">(</span><span class="n">x1</span><span class="p">,</span> <span class="n">x2</span><span class="p">,</span> <span class="n">y1</span><span class="p">,</span> <span class="n">y2</span><span class="p">):</span>
    <span class="sd">&quot;&quot;&quot;Return the corners of a rectangle.&quot;&quot;&quot;</span>
    <span class="n">xs</span> <span class="o">=</span> <span class="p">[</span><span class="n">x1</span><span class="p">,</span> <span class="n">x1</span><span class="p">,</span> <span class="n">x2</span><span class="p">,</span> <span class="n">x2</span><span class="p">,</span> <span class="n">x1</span><span class="p">]</span>
    <span class="n">ys</span> <span class="o">=</span> <span class="p">[</span><span class="n">y1</span><span class="p">,</span> <span class="n">y2</span><span class="p">,</span> <span class="n">y2</span><span class="p">,</span> <span class="n">y1</span><span class="p">,</span> <span class="n">y1</span><span class="p">]</span>
    <span class="k">return</span> <span class="n">xs</span><span class="p">,</span> <span class="n">ys</span>
</pre></div>
</div>
</div>
</div>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pm1_rect</span><span class="p">,</span> <span class="n">pm2_rect</span> <span class="o">=</span> <span class="n">make_rectangle</span><span class="p">(</span>
    <span class="n">pm1_min</span><span class="p">,</span> <span class="n">pm1_max</span><span class="p">,</span> <span class="n">pm2_min</span><span class="p">,</span> <span class="n">pm2_max</span><span class="p">)</span>
</pre></div>
</div>
</div>
</div>
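<p>As a minimal sketch of how a mask like the one in step 4 works, the following cell applies the same rectangle bounds to a small made-up table. The names <code class="docutils literal notranslate"><span class="pre">toy_df</span></code>, <code class="docutils literal notranslate"><span class="pre">in_rect</span></code>, and <code class="docutils literal notranslate"><span class="pre">toy_selected</span></code> and the data values are assumptions for illustration only; the mask in the previous lesson may have been built differently (for example, with strict inequalities, whereas <code class="docutils literal notranslate"><span class="pre">between</span></code> includes the endpoints by default).</p>

```python
import pandas as pd

# Bounds of the rectangle in the GD-1 frame (same values as above)
pm1_min, pm1_max = -8.9, -6.9
pm2_min, pm2_max = -2.2, 1.0

# Made-up proper motions, for illustration only (not Gaia data)
toy_df = pd.DataFrame({
    'pm_phi1': [-7.5, -3.0, -8.0, -10.0],
    'pm_phi2': [0.2, 0.5, -3.0, 0.0],
})

# Boolean mask: True where both components fall inside the rectangle
in_rect = (toy_df['pm_phi1'].between(pm1_min, pm1_max) &
           toy_df['pm_phi2'].between(pm2_min, pm2_max))

# Keep only the rows where the mask is True
toy_selected = toy_df[in_rect]
print(len(toy_selected))
```

<p>Only rows where both conditions hold are kept; a star that satisfies one bound but not the other is excluded.</p>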
<p>The following figure shows:</p>
<ul class="simple">
<li><p>Proper motion for the stars we selected along the centerline of GD-1,</p></li>
<li><p>The rectangle we selected, and</p></li>
<li><p>The stars inside the rectangle highlighted in green.</p></li>
</ul>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
<span class="n">pm1</span> <span class="o">=</span> <span class="n">centerline_df</span><span class="p">[</span><span class="s1">&#39;pm_phi1&#39;</span><span class="p">]</span>
<span class="n">pm2</span> <span class="o">=</span> <span class="n">centerline_df</span><span class="p">[</span><span class="s1">&#39;pm_phi2&#39;</span><span class="p">]</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">pm1</span><span class="p">,</span> <span class="n">pm2</span><span class="p">,</span> <span class="s1">&#39;ko&#39;</span><span class="p">,</span> <span class="n">markersize</span><span class="o">=</span><span class="mf">0.3</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.3</span><span class="p">)</span>
<span class="n">pm1</span> <span class="o">=</span> <span class="n">selected_df</span><span class="p">[</span><span class="s1">&#39;pm_phi1&#39;</span><span class="p">]</span>
<span class="n">pm2</span> <span class="o">=</span> <span class="n">selected_df</span><span class="p">[</span><span class="s1">&#39;pm_phi2&#39;</span><span class="p">]</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">pm1</span><span class="p">,</span> <span class="n">pm2</span><span class="p">,</span> <span class="s1">&#39;gx&#39;</span><span class="p">,</span> <span class="n">markersize</span><span class="o">=</span><span class="mf">0.3</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.3</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">pm1_rect</span><span class="p">,</span> <span class="n">pm2_rect</span><span class="p">,</span> <span class="s1">&#39;-&#39;</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s1">&#39;Proper motion phi1 (GD1 frame)&#39;</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s1">&#39;Proper motion phi2 (GD1 frame)&#39;</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">xlim</span><span class="p">(</span><span class="o">-</span><span class="mi">12</span><span class="p">,</span> <span class="mi">8</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">ylim</span><span class="p">(</span><span class="o">-</span><span class="mi">10</span><span class="p">,</span> <span class="mi">10</span><span class="p">);</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<img alt="_images/04_select_13_0.png" src="_images/04_select_13_0.png" />
</div>
</div>
<p>Now we’ll make the same plot using proper motions in the ICRS frame, which are stored in columns <code class="docutils literal notranslate"><span class="pre">pmra</span></code> and <code class="docutils literal notranslate"><span class="pre">pmdec</span></code>.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pm1</span> <span class="o">=</span> <span class="n">centerline_df</span><span class="p">[</span><span class="s1">&#39;pmra&#39;</span><span class="p">]</span>
<span class="n">pm2</span> <span class="o">=</span> <span class="n">centerline_df</span><span class="p">[</span><span class="s1">&#39;pmdec&#39;</span><span class="p">]</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">pm1</span><span class="p">,</span> <span class="n">pm2</span><span class="p">,</span> <span class="s1">&#39;ko&#39;</span><span class="p">,</span> <span class="n">markersize</span><span class="o">=</span><span class="mf">0.3</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.3</span><span class="p">)</span>
<span class="n">pm1</span> <span class="o">=</span> <span class="n">selected_df</span><span class="p">[</span><span class="s1">&#39;pmra&#39;</span><span class="p">]</span>
<span class="n">pm2</span> <span class="o">=</span> <span class="n">selected_df</span><span class="p">[</span><span class="s1">&#39;pmdec&#39;</span><span class="p">]</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">pm1</span><span class="p">,</span> <span class="n">pm2</span><span class="p">,</span> <span class="s1">&#39;gx&#39;</span><span class="p">,</span> <span class="n">markersize</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.3</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s1">&#39;Proper motion ra (ICRS frame)&#39;</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s1">&#39;Proper motion dec (ICRS frame)&#39;</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">xlim</span><span class="p">([</span><span class="o">-</span><span class="mi">10</span><span class="p">,</span> <span class="mi">5</span><span class="p">])</span>
<span class="n">plt</span><span class="o">.</span><span class="n">ylim</span><span class="p">([</span><span class="o">-</span><span class="mi">20</span><span class="p">,</span> <span class="mi">5</span><span class="p">]);</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<img alt="_images/04_select_15_0.png" src="_images/04_select_15_0.png" />
</div>
</div>
<p>The proper motions of the selected stars are more spread out in this frame, which is why it was preferable to do the selection in the GD-1 frame.</p>
<p>But now we can define a polygon that encloses the proper motions of these stars in ICRS, and use that polygon as a selection criterion in an ADQL query.</p>
</div>
<div class="section" id="convex-hull">
<h2>Convex Hull<a class="headerlink" href="#convex-hull" title="Permalink to this headline"></a></h2>
<p>SciPy provides a function that computes the <a class="reference external" href="https://en.wikipedia.org/wiki/Convex_hull">convex hull</a> of a set of points, which is the smallest convex polygon that contains all of the points.</p>
<p>To use it, we’ll select columns <code class="docutils literal notranslate"><span class="pre">pmra</span></code> and <code class="docutils literal notranslate"><span class="pre">pmdec</span></code> and convert them to a NumPy array.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
<span class="n">points</span> <span class="o">=</span> <span class="n">selected_df</span><span class="p">[[</span><span class="s1">&#39;pmra&#39;</span><span class="p">,</span><span class="s1">&#39;pmdec&#39;</span><span class="p">]]</span><span class="o">.</span><span class="n">to_numpy</span><span class="p">()</span>
<span class="n">points</span><span class="o">.</span><span class="n">shape</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>(1049, 2)
</pre></div>
</div>
</div>
</div>
<p>NOTE: If you are using an older version of Pandas, you might not have <code class="docutils literal notranslate"><span class="pre">to_numpy()</span></code>; you can use <code class="docutils literal notranslate"><span class="pre">values</span></code> instead, like this:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">points</span> <span class="o">=</span> <span class="n">selected_df</span><span class="p">[[</span><span class="s1">&#39;pmra&#39;</span><span class="p">,</span><span class="s1">&#39;pmdec&#39;</span><span class="p">]]</span><span class="o">.</span><span class="n">values</span>
</pre></div>
</div>
<p>We’ll pass the points to <code class="docutils literal notranslate"><span class="pre">ConvexHull</span></code>, which returns an object that contains the results.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">scipy.spatial</span> <span class="kn">import</span> <span class="n">ConvexHull</span>
<span class="n">hull</span> <span class="o">=</span> <span class="n">ConvexHull</span><span class="p">(</span><span class="n">points</span><span class="p">)</span>
<span class="n">hull</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>&lt;scipy.spatial.qhull.ConvexHull at 0x7fa7c4c03a90&gt;
</pre></div>
</div>
</div>
</div>
<p><code class="docutils literal notranslate"><span class="pre">hull.vertices</span></code> contains the indices of the points that fall on the perimeter of the hull.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">hull</span><span class="o">.</span><span class="n">vertices</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>array([ 692, 873, 141, 303, 42, 622, 45, 83, 127, 182, 1006,
971, 967, 1001, 969, 940], dtype=int32)
</pre></div>
</div>
</div>
</div>
<p>We can use them as an index into the original array to select the corresponding rows.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pm_vertices</span> <span class="o">=</span> <span class="n">points</span><span class="p">[</span><span class="n">hull</span><span class="o">.</span><span class="n">vertices</span><span class="p">]</span>
<span class="n">pm_vertices</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>array([[ -4.05037121, -14.75623261],
[ -3.41981085, -14.72365546],
[ -3.03521988, -14.44357135],
[ -2.26847919, -13.7140236 ],
[ -2.61172203, -13.24797471],
[ -2.73471401, -13.09054471],
[ -3.19923146, -12.5942653 ],
[ -3.34082546, -12.47611926],
[ -5.67489413, -11.16083338],
[ -5.95159272, -11.10547884],
[ -6.42394023, -11.05981295],
[ -7.09631023, -11.95187806],
[ -7.30641519, -12.24559977],
[ -7.04016696, -12.88580702],
[ -6.00347705, -13.75912098],
[ -4.42442296, -14.74641176]])
</pre></div>
</div>
</div>
</div>
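<p>To see the indexing step in isolation, here is a small sketch on made-up points: the four corners of a unit square plus one point in the middle. The names <code class="docutils literal notranslate"><span class="pre">toy_points</span></code>, <code class="docutils literal notranslate"><span class="pre">toy_hull</span></code>, and <code class="docutils literal notranslate"><span class="pre">toy_vertices</span></code> are assumptions for illustration and are not part of the GD-1 analysis.</p>

```python
import numpy as np
from scipy.spatial import ConvexHull

# Four corners of a unit square plus one interior point (made-up data)
toy_points = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
    [1.0, 1.0],
    [0.0, 1.0],
    [0.5, 0.5],   # interior point, so it should not be a hull vertex
])

toy_hull = ConvexHull(toy_points)

# Indices of the points on the hull; the interior point (index 4) is absent
print(sorted(toy_hull.vertices))

# Index into the original array to get the coordinates of the corners
toy_vertices = toy_points[toy_hull.vertices]
print(toy_vertices.shape)
```

<p>For two-dimensional input, SciPy documents that <code class="docutils literal notranslate"><span class="pre">vertices</span></code> are listed in counterclockwise order, which is why the selected rows can be used directly as the outline of a polygon.</p>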
<p>To plot the resulting polygon, we have to pull out the x and y coordinates.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pmra_poly</span><span class="p">,</span> <span class="n">pmdec_poly</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">transpose</span><span class="p">(</span><span class="n">pm_vertices</span><span class="p">)</span>
</pre></div>
</div>
</div>
</div>
<p>This use of <code class="docutils literal notranslate"><span class="pre">transpose</span></code> is a bit of a NumPy trick. Because <code class="docutils literal notranslate"><span class="pre">pm_vertices</span></code> has two columns, its transpose has two rows, which are assigned to the two variables <code class="docutils literal notranslate"><span class="pre">pmra_poly</span></code> and <code class="docutils literal notranslate"><span class="pre">pmdec_poly</span></code>.</p>
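<p>A small sketch with made-up numbers may make the unpacking easier to follow. The names <code class="docutils literal notranslate"><span class="pre">toy_vertices</span></code>, <code class="docutils literal notranslate"><span class="pre">xs</span></code>, and <code class="docutils literal notranslate"><span class="pre">ys</span></code> are assumptions for illustration.</p>

```python
import numpy as np

# Three made-up (x, y) pairs, one per row
toy_vertices = np.array([
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
])

# The transpose has shape (2, 3); unpacking assigns one row to each name
xs, ys = np.transpose(toy_vertices)

print(xs)
print(ys)

# Slicing the columns directly gives the same arrays
same_xs = np.array_equal(xs, toy_vertices[:, 0])
same_ys = np.array_equal(ys, toy_vertices[:, 1])
print(same_xs, same_ys)
```

<p>Slicing the columns is an equivalent alternative; the transpose form is simply more compact when both columns are needed at once.</p>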
<p>The following figure shows proper motion in ICRS again, along with the convex hull we just computed.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pm1</span> <span class="o">=</span> <span class="n">centerline_df</span><span class="p">[</span><span class="s1">&#39;pmra&#39;</span><span class="p">]</span>
<span class="n">pm2</span> <span class="o">=</span> <span class="n">centerline_df</span><span class="p">[</span><span class="s1">&#39;pmdec&#39;</span><span class="p">]</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">pm1</span><span class="p">,</span> <span class="n">pm2</span><span class="p">,</span> <span class="s1">&#39;ko&#39;</span><span class="p">,</span> <span class="n">markersize</span><span class="o">=</span><span class="mf">0.3</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.3</span><span class="p">)</span>
<span class="n">pm1</span> <span class="o">=</span> <span class="n">selected_df</span><span class="p">[</span><span class="s1">&#39;pmra&#39;</span><span class="p">]</span>
<span class="n">pm2</span> <span class="o">=</span> <span class="n">selected_df</span><span class="p">[</span><span class="s1">&#39;pmdec&#39;</span><span class="p">]</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">pm1</span><span class="p">,</span> <span class="n">pm2</span><span class="p">,</span> <span class="s1">&#39;gx&#39;</span><span class="p">,</span> <span class="n">markersize</span><span class="o">=</span><span class="mf">0.3</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.3</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">pmra_poly</span><span class="p">,</span> <span class="n">pmdec_poly</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s1">&#39;Proper motion ra (ICRS frame)&#39;</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s1">&#39;Proper motion dec (ICRS frame)&#39;</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">xlim</span><span class="p">([</span><span class="o">-</span><span class="mi">10</span><span class="p">,</span> <span class="mi">5</span><span class="p">])</span>
<span class="n">plt</span><span class="o">.</span><span class="n">ylim</span><span class="p">([</span><span class="o">-</span><span class="mi">20</span><span class="p">,</span> <span class="mi">5</span><span class="p">]);</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<img alt="_images/04_select_29_0.png" src="_images/04_select_29_0.png" />
</div>
</div>
<p>So <code class="docutils literal notranslate"><span class="pre">pm_vertices</span></code> represents the polygon we want to select.
The next step is to use it as part of an ADQL query.</p>
</div>
<div class="section" id="assembling-the-query">
<h2>Assembling the query<a class="headerlink" href="#assembling-the-query" title="Permalink to this headline"></a></h2>
<p>Here's the base string we used for the query in the previous lesson.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">query_base</span> <span class="o">=</span> <span class="s2">&quot;&quot;&quot;SELECT </span>
<span class="si">{columns}</span><span class="s2"></span>
<span class="s2">FROM gaiadr2.gaia_source</span>
<span class="s2">WHERE parallax &lt; 1</span>
<span class="s2"> AND bp_rp BETWEEN -0.75 AND 2 </span>
<span class="s2"> AND 1 = CONTAINS(POINT(ra, dec), </span>
<span class="s2"> POLYGON(</span><span class="si">{point_list}</span><span class="s2">))</span>
<span class="s2">&quot;&quot;&quot;</span>
</pre></div>
</div>
</div>
</div>
<p>And here are the changes we'll make in this lesson:</p>
<ol class="simple">
<li><p>We will add another clause to select stars whose proper motion is in the polygon we just computed, <code class="docutils literal notranslate"><span class="pre">pm_vertices</span></code>.</p></li>
<li><p>We will select stars with coordinates in a larger region.</p></li>
</ol>
<p>To use <code class="docutils literal notranslate"><span class="pre">pm_vertices</span></code> as part of an ADQL query, we have to convert it to a string.
Using <code class="docutils literal notranslate"><span class="pre">flatten</span></code> and <code class="docutils literal notranslate"><span class="pre">array2string</span></code>, we can almost get the format we need.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">s</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array2string</span><span class="p">(</span><span class="n">pm_vertices</span><span class="o">.</span><span class="n">flatten</span><span class="p">(),</span>
<span class="n">max_line_width</span><span class="o">=</span><span class="mi">1000</span><span class="p">,</span>
<span class="n">separator</span><span class="o">=</span><span class="s1">&#39;,&#39;</span><span class="p">)</span>
<span class="n">s</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>&#39;[ -4.05037121,-14.75623261, -3.41981085,-14.72365546, -3.03521988,-14.44357135, -2.26847919,-13.7140236 , -2.61172203,-13.24797471, -2.73471401,-13.09054471, -3.19923146,-12.5942653 , -3.34082546,-12.47611926, -5.67489413,-11.16083338, -5.95159272,-11.10547884, -6.42394023,-11.05981295, -7.09631023,-11.95187806, -7.30641519,-12.24559977, -7.04016696,-12.88580702, -6.00347705,-13.75912098, -4.42442296,-14.74641176]&#39;
</pre></div>
</div>
</div>
</div>
<p>We just have to remove the brackets.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pm_point_list</span> <span class="o">=</span> <span class="n">s</span><span class="o">.</span><span class="n">strip</span><span class="p">(</span><span class="s1">&#39;[]&#39;</span><span class="p">)</span>
<span class="n">pm_point_list</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>&#39; -4.05037121,-14.75623261, -3.41981085,-14.72365546, -3.03521988,-14.44357135, -2.26847919,-13.7140236 , -2.61172203,-13.24797471, -2.73471401,-13.09054471, -3.19923146,-12.5942653 , -3.34082546,-12.47611926, -5.67489413,-11.16083338, -5.95159272,-11.10547884, -6.42394023,-11.05981295, -7.09631023,-11.95187806, -7.30641519,-12.24559977, -7.04016696,-12.88580702, -6.00347705,-13.75912098, -4.42442296,-14.74641176&#39;
</pre></div>
</div>
</div>
</div>
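<p>As an alternative to <code class="docutils literal notranslate"><span class="pre">array2string</span></code> followed by <code class="docutils literal notranslate"><span class="pre">strip</span></code>, the same kind of string can be built with <code class="docutils literal notranslate"><span class="pre">join</span></code>, which avoids the brackets and the uneven padding. This is only a sketch: the helper name <code class="docutils literal notranslate"><span class="pre">pairs_to_polygon_string</span></code>, its <code class="docutils literal notranslate"><span class="pre">precision</span></code> parameter, and the sample values are hypothetical.</p>

```python
def pairs_to_polygon_string(vertices, precision=8):
    """Flatten (x, y) pairs into a comma-separated string of numbers."""
    # Walk the pairs in order so each x is followed by its y
    values = [coord for pair in vertices for coord in pair]
    # Format every number with a fixed number of decimal places
    return ', '.join(f'{value:.{precision}f}' for value in values)

# Hypothetical vertices, rounded for readability
demo_vertices = [[-4.05, -14.75], [-3.42, -14.72], [-2.27, -13.71]]
demo_point_list = pairs_to_polygon_string(demo_vertices, precision=2)
print(demo_point_list)
```

<p>Because the function only iterates over rows and formats numbers, it should also accept a two-column NumPy array such as <code class="docutils literal notranslate"><span class="pre">pm_vertices</span></code>.</p>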
<p>We'll add this string to the query soon, but first let's compute the other polygon, the one that specifies the region of the sky we want.</p>
<p>Here are the coordinates of the rectangle we'll select, in the GD-1 frame.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">astropy.units</span> <span class="k">as</span> <span class="nn">u</span>
<span class="n">phi1_min</span> <span class="o">=</span> <span class="o">-</span><span class="mi">70</span> <span class="o">*</span> <span class="n">u</span><span class="o">.</span><span class="n">degree</span>
<span class="n">phi1_max</span> <span class="o">=</span> <span class="o">-</span><span class="mi">20</span> <span class="o">*</span> <span class="n">u</span><span class="o">.</span><span class="n">degree</span>
<span class="n">phi2_min</span> <span class="o">=</span> <span class="o">-</span><span class="mi">5</span> <span class="o">*</span> <span class="n">u</span><span class="o">.</span><span class="n">degree</span>
<span class="n">phi2_max</span> <span class="o">=</span> <span class="mi">5</span> <span class="o">*</span> <span class="n">u</span><span class="o">.</span><span class="n">degree</span>
</pre></div>
</div>
</div>
</div>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">phi1_rect</span><span class="p">,</span> <span class="n">phi2_rect</span> <span class="o">=</span> <span class="n">make_rectangle</span><span class="p">(</span>
<span class="n">phi1_min</span><span class="p">,</span> <span class="n">phi1_max</span><span class="p">,</span> <span class="n">phi2_min</span><span class="p">,</span> <span class="n">phi2_max</span><span class="p">)</span>
</pre></div>
</div>
</div>
</div>
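<p>The function <code class="docutils literal notranslate"><span class="pre">make_rectangle</span></code> comes from the previous lesson and is not shown in this section. The sketch below is one plausible definition, stated here as an assumption: it returns five corners, repeating the first one to close the loop, which is consistent with the five-point polygon that appears later in the query. The name <code class="docutils literal notranslate"><span class="pre">make_rectangle_sketch</span></code> is hypothetical, and plain numbers stand in for the Astropy quantities.</p>

```python
def make_rectangle_sketch(x1, x2, y1, y2):
    """Return the corners of a rectangle as two lists of coordinates.

    The first corner is repeated at the end so the outline is closed.
    """
    xs = [x1, x1, x2, x2, x1]
    ys = [y1, y2, y2, y1, y1]
    return xs, ys

# Plain numbers in degrees, standing in for values like -70 * u.degree
demo_phi1, demo_phi2 = make_rectangle_sketch(-70, -20, -5, 5)
print(demo_phi1)
print(demo_phi2)
```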
<p>Here's how we transform it to ICRS, as we saw in the previous lesson.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">gala.coordinates</span> <span class="kn">import</span> <span class="n">GD1Koposov10</span>
<span class="kn">from</span> <span class="nn">astropy.coordinates</span> <span class="kn">import</span> <span class="n">SkyCoord</span>
<span class="n">corners</span> <span class="o">=</span> <span class="n">SkyCoord</span><span class="p">(</span><span class="n">phi1</span><span class="o">=</span><span class="n">phi1_rect</span><span class="p">,</span>
<span class="n">phi2</span><span class="o">=</span><span class="n">phi2_rect</span><span class="p">,</span>
<span class="n">frame</span><span class="o">=</span><span class="n">GD1Koposov10</span><span class="p">)</span>
<span class="n">corners_icrs</span> <span class="o">=</span> <span class="n">corners</span><span class="o">.</span><span class="n">transform_to</span><span class="p">(</span><span class="s1">&#39;icrs&#39;</span><span class="p">)</span>
</pre></div>
</div>
</div>
</div>
<p>To use <code class="docutils literal notranslate"><span class="pre">corners_icrs</span></code> as part of an ADQL query, we have to convert it to a string. Here's how we do that, as we saw in the previous lesson.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">skycoord_to_string</span><span class="p">(</span><span class="n">skycoord</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Convert SkyCoord to string.&quot;&quot;&quot;</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">skycoord</span><span class="o">.</span><span class="n">to_string</span><span class="p">()</span>
<span class="n">s</span> <span class="o">=</span> <span class="s1">&#39; &#39;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">t</span><span class="p">)</span>
<span class="k">return</span> <span class="n">s</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s1">&#39; &#39;</span><span class="p">,</span> <span class="s1">&#39;, &#39;</span><span class="p">)</span>
</pre></div>
</div>
</div>
</div>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">point_list</span> <span class="o">=</span> <span class="n">skycoord_to_string</span><span class="p">(</span><span class="n">corners_icrs</span><span class="p">)</span>
<span class="n">point_list</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>&#39;135.306, 8.39862, 126.51, 13.4449, 163.017, 54.2424, 172.933, 46.4726, 135.306, 8.39862&#39;
</pre></div>
</div>
</div>
</div>
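<p>The two string operations inside <code class="docutils literal notranslate"><span class="pre">skycoord_to_string</span></code> can be tried without Astropy. In the sketch below, the list <code class="docutils literal notranslate"><span class="pre">t</span></code> is a hand-written stand-in for what <code class="docutils literal notranslate"><span class="pre">to_string</span></code> returns, assuming one string per coordinate with right ascension and declination separated by a space.</p>

```python
# Hand-written stand-in for the list returned by skycoord.to_string():
# one 'ra dec' string per coordinate, in decimal degrees (an assumption)
t = ['135.306 8.39862', '126.51 13.4449', '163.017 54.2424']

# Step 1: glue the strings together, separated by single spaces
s = ' '.join(t)

# Step 2: turn every space into a comma followed by a space
result = s.replace(' ', ', ')
print(result)
```

<p>Joining with a space first means a single <code class="docutils literal notranslate"><span class="pre">replace</span></code> handles both the separator inside each pair and the separator between pairs.</p>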
<p>Now we have everything we need to assemble the query.
Here's the base query from the previous lesson again:</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">query_base</span> <span class="o">=</span> <span class="s2">&quot;&quot;&quot;SELECT </span>
<span class="si">{columns}</span><span class="s2"></span>
<span class="s2">FROM gaiadr2.gaia_source</span>
<span class="s2">WHERE parallax &lt; 1</span>
<span class="s2"> AND bp_rp BETWEEN -0.75 AND 2 </span>
<span class="s2"> AND 1 = CONTAINS(POINT(ra, dec), </span>
<span class="s2"> POLYGON(</span><span class="si">{point_list}</span><span class="s2">))</span>
<span class="s2">&quot;&quot;&quot;</span>
</pre></div>
</div>
</div>
</div>
<div class="section" id="exercise">
<h3>Exercise<a class="headerlink" href="#exercise" title="Permalink to this headline"></a></h3>
<p>Modify <code class="docutils literal notranslate"><span class="pre">query_base</span></code> by adding a new clause to select stars whose coordinates of proper motion, <code class="docutils literal notranslate"><span class="pre">pmra</span></code> and <code class="docutils literal notranslate"><span class="pre">pmdec</span></code>, fall within the polygon defined by <code class="docutils literal notranslate"><span class="pre">pm_point_list</span></code>.</p>
<div class="cell tag_hide-cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="c1"># Solution</span>
<span class="n">query_base</span> <span class="o">=</span> <span class="s2">&quot;&quot;&quot;SELECT </span>
<span class="si">{columns}</span><span class="s2"></span>
<span class="s2">FROM gaiadr2.gaia_source</span>
<span class="s2">WHERE parallax &lt; 1</span>
<span class="s2"> AND bp_rp BETWEEN -0.75 AND 2 </span>
<span class="s2"> AND 1 = CONTAINS(POINT(ra, dec), </span>
<span class="s2"> POLYGON(</span><span class="si">{point_list}</span><span class="s2">))</span>
<span class="s2"> AND 1 = CONTAINS(POINT(pmra, pmdec),</span>
<span class="s2"> POLYGON(</span><span class="si">{pm_point_list}</span><span class="s2">))</span>
<span class="s2">&quot;&quot;&quot;</span>
</pre></div>
</div>
</div>
</div>
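<p>To build intuition for what a clause like <code class="docutils literal notranslate"><span class="pre">CONTAINS(POINT(...),</span> <span class="pre">POLYGON(...))</span></code> asks, here is a sketch of a point-in-polygon test in a flat plane, using the ray-casting method. It is only an illustration of the idea, not a description of how the Gaia archive evaluates the clause; the function name and the sample square are hypothetical.</p>

```python
def point_in_polygon(x, y, vertices):
    """Ray-casting test in a flat plane.

    Counts how many polygon edges a horizontal ray starting at the
    point and heading toward larger x would cross; an odd count
    means the point is inside.
    """
    inside = False
    j = len(vertices) - 1
    for i in range(len(vertices)):
        xi, yi = vertices[i]
        xj, yj = vertices[j]
        # Does the edge from vertex j to vertex i straddle the line through y?
        if (yi > y) != (yj > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x_cross > x:
                inside = not inside
        j = i
    return inside

# Hypothetical polygon: a square with corners at 0 and 2
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(point_in_polygon(1, 1, square))
print(point_in_polygon(3, 1, square))
```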
<p>Here again are the columns we want to select.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">columns</span> <span class="o">=</span> <span class="s1">&#39;source_id, ra, dec, pmra, pmdec, parallax, radial_velocity&#39;</span>
</pre></div>
</div>
</div>
</div>
</div>
<div class="section" id="id1">
<h3>Exercise<a class="headerlink" href="#id1" title="Permalink to this headline"></a></h3>
<p>Use <code class="docutils literal notranslate"><span class="pre">format</span></code> to format <code class="docutils literal notranslate"><span class="pre">query_base</span></code> and define <code class="docutils literal notranslate"><span class="pre">query</span></code>, filling in the values of <code class="docutils literal notranslate"><span class="pre">columns</span></code>, <code class="docutils literal notranslate"><span class="pre">point_list</span></code>, and <code class="docutils literal notranslate"><span class="pre">pm_point_list</span></code>.</p>
<div class="cell tag_hide-cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="c1"># Solution</span>
<span class="n">query</span> <span class="o">=</span> <span class="n">query_base</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">columns</span><span class="o">=</span><span class="n">columns</span><span class="p">,</span>
<span class="n">point_list</span><span class="o">=</span><span class="n">point_list</span><span class="p">,</span>
<span class="n">pm_point_list</span><span class="o">=</span><span class="n">pm_point_list</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">query</span><span class="p">)</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>SELECT
source_id, ra, dec, pmra, pmdec, parallax, radial_velocity
FROM gaiadr2.gaia_source
WHERE parallax &lt; 1
AND bp_rp BETWEEN -0.75 AND 2
AND 1 = CONTAINS(POINT(ra, dec),
POLYGON(135.306, 8.39862, 126.51, 13.4449, 163.017, 54.2424, 172.933, 46.4726, 135.306, 8.39862))
AND 1 = CONTAINS(POINT(pmra, pmdec),
POLYGON( -4.05037121,-14.75623261, -3.41981085,-14.72365546, -3.03521988,-14.44357135, -2.26847919,-13.7140236 , -2.61172203,-13.24797471, -2.73471401,-13.09054471, -3.19923146,-12.5942653 , -3.34082546,-12.47611926, -5.67489413,-11.16083338, -5.95159272,-11.10547884, -6.42394023,-11.05981295, -7.09631023,-11.95187806, -7.30641519,-12.24559977, -7.04016696,-12.88580702, -6.00347705,-13.75912098, -4.42442296,-14.74641176))
</pre></div>
</div>
</div>
</div>
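<p>The named placeholders in <code class="docutils literal notranslate"><span class="pre">query_base</span></code> are ordinary Python string formatting. The sketch below uses a made-up template (the table name and the placeholder names other than <code class="docutils literal notranslate"><span class="pre">columns</span></code> are hypothetical) to show that each placeholder is filled by the keyword argument of the same name, and that omitting one raises <code class="docutils literal notranslate"><span class="pre">KeyError</span></code> instead of silently leaving a gap.</p>

```python
template = """SELECT {columns}
FROM {table}
WHERE parallax BETWEEN {low} AND {high}
"""

# Each keyword argument fills the placeholder with the same name
demo_query = template.format(columns='ra, dec', table='demo_table',
                             low=0, high=1)
print(demo_query)

# Forgetting a placeholder is an error, which helps catch typos early
try:
    template.format(columns='ra, dec')
except KeyError as err:
    missing = err.args[0]
    print('Missing value for:', missing)
```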
<p>Now we can run the query like this:</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">astroquery.gaia</span> <span class="kn">import</span> <span class="n">Gaia</span>
<span class="n">job</span> <span class="o">=</span> <span class="n">Gaia</span><span class="o">.</span><span class="n">launch_job_async</span><span class="p">(</span><span class="n">query</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">job</span><span class="p">)</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>INFO: Query finished. [astroquery.utils.tap.core]
&lt;Table length=7345&gt;
name dtype unit description n_bad
--------------- ------- -------- ------------------------------------------------------------------ -----
source_id int64 Unique source identifier (unique within a particular Data Release) 0
ra float64 deg Right ascension 0
dec float64 deg Declination 0
pmra float64 mas / yr Proper motion in right ascension direction 0
pmdec float64 mas / yr Proper motion in declination direction 0
parallax float64 mas Parallax 0
radial_velocity float64 km / s Radial velocity 7294
Jobid: 1609278364817O
Phase: COMPLETED
Owner: None
Output file: async_20201229164604.vot
Results: None
</pre></div>
</div>
</div>
</div>
<p>And get the results.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">candidate_table</span> <span class="o">=</span> <span class="n">job</span><span class="o">.</span><span class="n">get_results</span><span class="p">()</span>
<span class="nb">len</span><span class="p">(</span><span class="n">candidate_table</span><span class="p">)</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>7345
</pre></div>
</div>
</div>
</div>
</div>
</div>
<div class="section" id="plotting-one-more-time">
<h2>Plotting one more time<a class="headerlink" href="#plotting-one-more-time" title="Permalink to this headline"></a></h2>
<p>Let's see what the results look like.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">x</span> <span class="o">=</span> <span class="n">candidate_table</span><span class="p">[</span><span class="s1">&#39;ra&#39;</span><span class="p">]</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">candidate_table</span><span class="p">[</span><span class="s1">&#39;dec&#39;</span><span class="p">]</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="s1">&#39;ko&#39;</span><span class="p">,</span> <span class="n">markersize</span><span class="o">=</span><span class="mf">0.3</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.3</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s1">&#39;ra (degree ICRS)&#39;</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s1">&#39;dec (degree ICRS)&#39;</span><span class="p">);</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<img alt="_images/04_select_58_0.png" src="_images/04_select_58_0.png" />
</div>
</div>
<p>Here we can see why it was useful to transform these coordinates. In ICRS, it is more difficult to identify the stars near the centerline of GD-1.</p>
<p>So, before we move on to the next step, let's collect the code we used to transform the coordinates and make a Pandas <code class="docutils literal notranslate"><span class="pre">DataFrame</span></code>:</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">gala.coordinates</span> <span class="kn">import</span> <span class="n">reflex_correct</span>
<span class="k">def</span> <span class="nf">make_dataframe</span><span class="p">(</span><span class="n">table</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;Transform coordinates from ICRS to GD-1 frame.</span>
<span class="sd"> </span>
<span class="sd"> table: Astropy Table</span>
<span class="sd"> </span>
<span class="sd"> returns: Pandas DataFrame</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="n">skycoord</span> <span class="o">=</span> <span class="n">SkyCoord</span><span class="p">(</span>
<span class="n">ra</span><span class="o">=</span><span class="n">table</span><span class="p">[</span><span class="s1">&#39;ra&#39;</span><span class="p">],</span>
<span class="n">dec</span><span class="o">=</span><span class="n">table</span><span class="p">[</span><span class="s1">&#39;dec&#39;</span><span class="p">],</span>
<span class="n">pm_ra_cosdec</span><span class="o">=</span><span class="n">table</span><span class="p">[</span><span class="s1">&#39;pmra&#39;</span><span class="p">],</span>
<span class="n">pm_dec</span><span class="o">=</span><span class="n">table</span><span class="p">[</span><span class="s1">&#39;pmdec&#39;</span><span class="p">],</span>
<span class="n">distance</span><span class="o">=</span><span class="mi">8</span><span class="o">*</span><span class="n">u</span><span class="o">.</span><span class="n">kpc</span><span class="p">,</span>
<span class="n">radial_velocity</span><span class="o">=</span><span class="mi">0</span><span class="o">*</span><span class="n">u</span><span class="o">.</span><span class="n">km</span><span class="o">/</span><span class="n">u</span><span class="o">.</span><span class="n">s</span><span class="p">)</span>
<span class="n">transformed</span> <span class="o">=</span> <span class="n">skycoord</span><span class="o">.</span><span class="n">transform_to</span><span class="p">(</span><span class="n">GD1Koposov10</span><span class="p">)</span>
<span class="n">gd1_coord</span> <span class="o">=</span> <span class="n">reflex_correct</span><span class="p">(</span><span class="n">transformed</span><span class="p">)</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">table</span><span class="o">.</span><span class="n">to_pandas</span><span class="p">()</span>
<span class="n">df</span><span class="p">[</span><span class="s1">&#39;phi1&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">gd1_coord</span><span class="o">.</span><span class="n">phi1</span>
<span class="n">df</span><span class="p">[</span><span class="s1">&#39;phi2&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">gd1_coord</span><span class="o">.</span><span class="n">phi2</span>
<span class="n">df</span><span class="p">[</span><span class="s1">&#39;pm_phi1&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">gd1_coord</span><span class="o">.</span><span class="n">pm_phi1_cosphi2</span>
<span class="n">df</span><span class="p">[</span><span class="s1">&#39;pm_phi2&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">gd1_coord</span><span class="o">.</span><span class="n">pm_phi2</span>
<span class="k">return</span> <span class="n">df</span>
</pre></div>
</div>
</div>
</div>
<p>Here's how we can use this function:</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">candidate_df</span> <span class="o">=</span> <span class="n">make_dataframe</span><span class="p">(</span><span class="n">candidate_table</span><span class="p">)</span>
</pre></div>
</div>
</div>
</div>
<p>And let's see the results.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">x</span> <span class="o">=</span> <span class="n">candidate_df</span><span class="p">[</span><span class="s1">&#39;phi1&#39;</span><span class="p">]</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">candidate_df</span><span class="p">[</span><span class="s1">&#39;phi2&#39;</span><span class="p">]</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="s1">&#39;ko&#39;</span><span class="p">,</span> <span class="n">markersize</span><span class="o">=</span><span class="mf">0.5</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s1">&#39;phi1 (degree GD1)&#39;</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s1">&#39;phi2 (degree GD1)&#39;</span><span class="p">);</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<img alt="_images/04_select_64_0.png" src="_images/04_select_64_0.png" />
</div>
</div>
<p>We're starting to see GD-1 more clearly.</p>
<p>We can compare this figure with one of these panels in Figure 1 from the original paper:</p>
<a class="reference internal image-reference" href="https://github.com/datacarpentry/astronomy-python/raw/gh-pages/fig/gd1-2.png"><img alt="https://github.com/datacarpentry/astronomy-python/raw/gh-pages/fig/gd1-2.png" src="https://github.com/datacarpentry/astronomy-python/raw/gh-pages/fig/gd1-2.png" style="height: 150px;" /></a>
<a class="reference internal image-reference" href="https://github.com/datacarpentry/astronomy-python/raw/gh-pages/fig/gd1-4.png"><img alt="https://github.com/datacarpentry/astronomy-python/raw/gh-pages/fig/gd1-4.png" src="https://github.com/datacarpentry/astronomy-python/raw/gh-pages/fig/gd1-4.png" style="height: 150px;" /></a>
<p>The top panel shows stars selected based on proper motion only, so it is comparable to our figure (although notice that it covers a wider region).</p>
<p>In the next lesson, we will use photometry data from Pan-STARRS to do a second round of filtering, and see if we can replicate the bottom panel.</p>
<p>We'll also learn how to add annotations like the ones in the figure from the paper, and customize the style of the figure to present the results clearly and compellingly.</p>
</div>
<div class="section" id="saving-the-dataframe">
<h2>Saving the DataFrame<a class="headerlink" href="#saving-the-dataframe" title="Permalink to this headline"></a></h2>
<p>Let's save this <code class="docutils literal notranslate"><span class="pre">DataFrame</span></code> so we can pick up where we left off without running this query again.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">filename</span> <span class="o">=</span> <span class="s1">&#39;gd1_candidates.hdf5&#39;</span>
<span class="n">candidate_df</span><span class="o">.</span><span class="n">to_hdf</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="s1">&#39;candidate_df&#39;</span><span class="p">,</span> <span class="n">mode</span><span class="o">=</span><span class="s1">&#39;w&#39;</span><span class="p">)</span>
</pre></div>
</div>
</div>
</div>
<p>We can use <code class="docutils literal notranslate"><span class="pre">ls</span></code> to confirm that the file exists and check the size:</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="o">!</span>ls -lh gd1_candidates.hdf5
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>-rw-rw-r-- 1 downey downey 698K Dec 29 16:46 gd1_candidates.hdf5
</pre></div>
</div>
</div>
</div>
<p>If you are using Windows, <code class="docutils literal notranslate"><span class="pre">ls</span></code> might not work; in that case, try:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span>!dir gd1_candidates.hdf5
</pre></div>
</div>
</div>
<div class="section" id="csv">
<h2>CSV<a class="headerlink" href="#csv" title="Permalink to this headline"></a></h2>
<p>Pandas can write a variety of other formats, <a class="reference external" href="https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html">which you can read about here</a>.</p>
<p>We won't cover all of them, but one other important one is <a class="reference external" href="https://en.wikipedia.org/wiki/Comma-separated_values">CSV</a>, which stands for “comma-separated values”.</p>
<p>CSV is a plain-text format with minimal formatting requirements, so it can be read and written by pretty much any tool that works with data. In that sense, it is the “least common denominator” of data formats.</p>
<p>However, it has an important limitation: some information about the data gets lost in translation, notably the data types. If you read a CSV file from someone else, you might need some additional information to make sure you are getting it right.</p>
<p>Also, CSV files tend to be big, and slow to read and write.</p>
<p>With those caveats, here's how to write one:</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">candidate_df</span><span class="o">.</span><span class="n">to_csv</span><span class="p">(</span><span class="s1">&#39;gd1_candidates.csv&#39;</span><span class="p">)</span>
</pre></div>
</div>
</div>
</div>
<p>We can check the file size like this:</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="o">!</span>ls -lh gd1_candidates.csv
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>-rw-rw-r-- 1 downey downey 1.4M Dec 29 16:46 gd1_candidates.csv
</pre></div>
</div>
</div>
</div>
<p>The CSV file is about 2 times bigger than the HDF5 file (so that’s not that bad, really).</p>
<p>We can see the first few lines like this:</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="o">!</span>head -3 gd1_candidates.csv
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>,source_id,ra,dec,pmra,pmdec,parallax,radial_velocity,phi1,phi2,pm_phi1,pm_phi2
0,635559124339440000,137.58671691646745,19.1965441084838,-3.770521900009566,-12.490481778113859,0.7913934419894347,,-59.63048941944402,-1.2164852515042963,-7.361362712597496,-0.592632882064492
1,635860218726658176,138.5187065217173,19.09233926905897,-5.941679495793577,-11.346409129876392,0.30745551377348623,,-59.247329893833296,-2.016078400820631,-7.527126084640531,1.7487794924176672
</pre></div>
</div>
</div>
</div>
<p>The CSV file contains the names of the columns, but not the data types.</p>
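<p>To see what losing the data types can look like, here is a minimal sketch that uses a small made-up <code class="docutils literal notranslate"><span class="pre">DataFrame</span></code> rather than the GD-1 data. The column names and values are assumptions chosen for illustration, and the CSV is written to an in-memory buffer instead of a file.</p>

```python
import io

import pandas as pd

# Made-up data: string labels with leading zeros and a datetime column
df = pd.DataFrame({
    'label': ['007', '042'],
    'observed': pd.to_datetime(['2020-12-29', '2020-12-30']),
})

# Write to an in-memory text buffer, then read it back with default settings
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
read_back = pd.read_csv(buffer)

# The labels come back as integers (leading zeros are gone),
# and the dates come back as plain text rather than datetimes
print(read_back.dtypes)

# Supplying the missing type information restores the original types
buffer.seek(0)
restored = pd.read_csv(buffer, dtype={'label': str},
                       parse_dates=['observed'])
print(restored.dtypes)
```

<p>This is the kind of additional information you might need when reading a CSV file from someone else.</p>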
<p>We can read the CSV file back like this:</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">read_back_csv</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s1">&#39;gd1_candidates.csv&#39;</span><span class="p">)</span>
</pre></div>
</div>
</div>
</div>
<p>Let’s compare the first few rows of <code class="docutils literal notranslate"><span class="pre">candidate_df</span></code> and <code class="docutils literal notranslate"><span class="pre">read_back_csv</span></code>.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">candidate_df</span><span class="o">.</span><span class="n">head</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output text_html"><div>
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>source_id</th>
<th>ra</th>
<th>dec</th>
<th>pmra</th>
<th>pmdec</th>
<th>parallax</th>
<th>radial_velocity</th>
<th>phi1</th>
<th>phi2</th>
<th>pm_phi1</th>
<th>pm_phi2</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>635559124339440000</td>
<td>137.586717</td>
<td>19.196544</td>
<td>-3.770522</td>
<td>-12.490482</td>
<td>0.791393</td>
<td>NaN</td>
<td>-59.630489</td>
<td>-1.216485</td>
<td>-7.361363</td>
<td>-0.592633</td>
</tr>
<tr>
<th>1</th>
<td>635860218726658176</td>
<td>138.518707</td>
<td>19.092339</td>
<td>-5.941679</td>
<td>-11.346409</td>
<td>0.307456</td>
<td>NaN</td>
<td>-59.247330</td>
<td>-2.016078</td>
<td>-7.527126</td>
<td>1.748779</td>
</tr>
<tr>
<th>2</th>
<td>635674126383965568</td>
<td>138.842874</td>
<td>19.031798</td>
<td>-3.897001</td>
<td>-12.702780</td>
<td>0.779463</td>
<td>NaN</td>
<td>-59.133391</td>
<td>-2.306901</td>
<td>-7.560608</td>
<td>-0.741800</td>
</tr>
</tbody>
</table>
</div></div></div>
</div>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">read_back_csv</span><span class="o">.</span><span class="n">head</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
</pre></div>
</div>
</div>
<div class="cell_output docutils container">
<div class="output text_html"><div>
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>Unnamed: 0</th>
<th>source_id</th>
<th>ra</th>
<th>dec</th>
<th>pmra</th>
<th>pmdec</th>
<th>parallax</th>
<th>radial_velocity</th>
<th>phi1</th>
<th>phi2</th>
<th>pm_phi1</th>
<th>pm_phi2</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>0</td>
<td>635559124339440000</td>
<td>137.586717</td>
<td>19.196544</td>
<td>-3.770522</td>
<td>-12.490482</td>
<td>0.791393</td>
<td>NaN</td>
<td>-59.630489</td>
<td>-1.216485</td>
<td>-7.361363</td>
<td>-0.592633</td>
</tr>
<tr>
<th>1</th>
<td>1</td>
<td>635860218726658176</td>
<td>138.518707</td>
<td>19.092339</td>
<td>-5.941679</td>
<td>-11.346409</td>
<td>0.307456</td>
<td>NaN</td>
<td>-59.247330</td>
<td>-2.016078</td>
<td>-7.527126</td>
<td>1.748779</td>
</tr>
<tr>
<th>2</th>
<td>2</td>
<td>635674126383965568</td>
<td>138.842874</td>
<td>19.031798</td>
<td>-3.897001</td>
<td>-12.702780</td>
<td>0.779463</td>
<td>NaN</td>
<td>-59.133391</td>
<td>-2.306901</td>
<td>-7.560608</td>
<td>-0.741800</td>
</tr>
</tbody>
</table>
</div></div></div>
</div>
<p>Notice that the index in <code class="docutils literal notranslate"><span class="pre">candidate_df</span></code> has become an unnamed column in <code class="docutils literal notranslate"><span class="pre">read_back_csv</span></code>. The Pandas functions for writing and reading CSV files provide options to avoid that problem, but this is an example of the kind of thing that can go wrong with CSV files.</p>
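<p>As a sketch of two such options, the following example uses a small made-up <code class="docutils literal notranslate"><span class="pre">DataFrame</span></code> and an in-memory buffer (both assumptions for illustration). Either <code class="docutils literal notranslate"><span class="pre">index=False</span></code> in <code class="docutils literal notranslate"><span class="pre">to_csv</span></code> leaves the index out of the file, or <code class="docutils literal notranslate"><span class="pre">index_col=0</span></code> in <code class="docutils literal notranslate"><span class="pre">read_csv</span></code> treats the first column as the index when reading.</p>

```python
import io

import pandas as pd

# Made-up data standing in for a table like candidate_df
df = pd.DataFrame({'ra': [137.5, 138.5], 'dec': [19.25, 19.125]})

# Option 1: leave the index out when writing
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
no_index = pd.read_csv(buffer)

# Option 2: write the index, then use the first column as the index when reading
buffer = io.StringIO()
df.to_csv(buffer)
buffer.seek(0)
with_index = pd.read_csv(buffer, index_col=0)

# Neither result contains an "Unnamed: 0" column
print(list(no_index.columns))
print(list(with_index.columns))
```

<p>The first option is simplest when the index is just a row counter; the second is useful when the index carries information worth keeping.</p>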
</div>
<div class="section" id="summary">
<h2>Summary<a class="headerlink" href="#summary" title="Permalink to this headline"></a></h2>
<p>In the previous lesson we downloaded data for a large number of stars and then selected a small fraction of them based on proper motion.</p>
<p>In this lesson, we improved this process by writing a more complex query that uses the database to select stars based on proper motion. This process requires more computation on the Gaia server, but then we’re able to either:</p>
<ol class="simple">
<li><p>Search the same region and download less data, or</p></li>
<li><p>Search a larger region while still downloading a manageable amount of data.</p></li>
</ol>
<p>In the next lesson, we’ll learn about the database <code class="docutils literal notranslate"><span class="pre">JOIN</span></code> operation and use it to download photometry data from Pan-STARRS.</p>
</div>
<div class="section" id="best-practices">
<h2>Best practices<a class="headerlink" href="#best-practices" title="Permalink to this headline"></a></h2>
<ul class="simple">
<li><p>When possible, “move the computation to the data”; that is, do as much of the work as possible on the database server before downloading the data.</p></li>
<li><p>For most applications, saving data in FITS or HDF5 is better than CSV. FITS and HDF5 are binary formats, so the files are usually smaller, and they store metadata, so you don’t lose anything when you read the file back.</p></li>
<li><p>On the other hand, CSV is a “least common denominator” format; that is, it can be read by practically any application that works with data.</p></li>
</ul>
</div>
</div>
<script type="text/x-thebe-config">
{
requestKernel: true,
binderOptions: {
repo: "binder-examples/jupyter-stacks-datascience",
ref: "master",
},
codeMirrorConfig: {
theme: "abcdef",
mode: "python"
},
kernelOptions: {
kernelName: "python3",
path: "./."
},
predefinedOutput: true
}
</script>
<script>kernelName = 'python3'</script>
</div>
</div>
</div>
<div class='prev-next-bottom'>
<a class='left-prev' id="prev-link" href="03_motion.html" title="previous page">Proper Motion</a>
<a class='right-next' id="next-link" href="05_join.html" title="next page">Joining Tables</a>
</div>
<footer class="footer mt-5 mt-md-0">
<div class="container">
<p>
By Allen B. Downey<br/>
&copy; Copyright 2020.<br/>
</p>
</div>
</footer>
</main>
</div>
</div>
<script src="_static/js/index.30270b6e4c972e43c488.js"></script>
</body>
</html>