diff --git a/01_query.html b/01_query.html
index 951c84c..b7aab37 100644
--- a/01_query.html
+++ b/01_query.html
@@ -109,27 +109,27 @@
Read the documentation of this table and choose a column that looks interesting to you. Add the column name to the query and run it again. What are the units of the column you selected? What is its data type?
+# Solution
+While you are debugging, use TOP to limit the number of rows in the result. That will make each test run faster, which reduces your development time.
Launching test queries synchronously might make them start faster, too.
# Solution
+When you are debugging queries like this, you can use TOP to limit the size of the results, but then you still don’t know how big the results will be.
An alternative is to use COUNT, which asks for the number of rows that would be selected, but it does not return them.
In the previous query, replace TOP 10 source_id with COUNT(source_id) and run the query again. How many stars has Gaia identified in the cone we searched?
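The TOP-versus-COUNT pattern described above can be sketched as follows. This is only an illustration: the helper name `cone_query` is hypothetical (it is not part of the lesson), and the sketch only builds the two ADQL strings. Submitting them to the Gaia archive is left out because it needs network access.

```python
# Hypothetical helper (an assumption, not from the lesson): builds the
# cone-search query used above with a configurable SELECT clause.
def cone_query(select_clause):
    return f"""
SELECT
{select_clause}
FROM gaiadr2.gaia_source
WHERE 1=CONTAINS(
  POINT(ra, dec),
  CIRCLE(266.41683, -29.00781, 0.08333333))
"""

# While debugging: ask for at most 10 rows so each test run is quick.
debug_query = cone_query("TOP 10 source_id")

# To learn how big the full result would be without downloading it.
count_query = cone_query("COUNT(source_id)")

print(count_query)
```

Keeping the WHERE clause in one place means the debugging query and the counting query are guaranteed to describe the same cone.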
# Solution
+
+query = """
+SELECT
+COUNT(source_id)
+FROM gaiadr2.gaia_source
+WHERE 1=CONTAINS(
+ POINT(ra, dec),
+ CIRCLE(266.41683, -29.00781, 0.08333333))
+"""
+In the call to plt.plot, add the keyword argument markersize=0.1 to make the markers smaller.
Then add the argument alpha=0.1 to make the markers nearly transparent.
Adjust these arguments until you think the figure shows the data most clearly.
Note: Once you have made these changes, you might notice that the figure shows stripes with lower density of stars. These stripes are caused by the way Gaia scans the sky, which you can read about here. The dataset we are using, Gaia Data Release 2, covers 22 months of observations; during this time, some parts of the sky were scanned more than others.
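As a rough sketch of the two keyword arguments in this exercise, the following plots synthetic points with small, nearly transparent markers. The data are random stand-ins (an assumption), not Gaia results, and the axis labels are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the query results (assumption: uniform random points).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=5000)
y = rng.uniform(0, 10, size=5000)

fig, ax = plt.subplots()
# 'ko' draws black circles; markersize shrinks them and alpha makes
# overlapping points accumulate into visible density.
lines = ax.plot(x, y, 'ko', markersize=0.1, alpha=0.1)
ax.set_xlabel('x (placeholder)')
ax.set_ylabel('y (placeholder)')
```

With real data, larger values of either argument make sparse regions easier to see, while smaller values keep dense regions from saturating.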
+# Solution
+
+
Remember that we started with a rectangle in GD-1 coordinates. When transformed to ICRS, it’s a non-rectangular polygon. Now that we have transformed back to GD-1 coordinates, it’s a rectangle again.
@@ -1222,7 +1233,7 @@ Name: phi2, dtype: bool
+
Looking at these results, we see a large cluster around (0, 0), and a smaller cluster near (0, -10).
@@ -1243,7 +1254,7 @@ Name: phi2, dtype: bool
+
Now we can see the smaller cluster more clearly.
@@ -1291,7 +1302,7 @@ Name: phi2, dtype: bool
+
To select rows that fall within these bounds, we’ll use the following function, which uses Pandas operators to make a mask that selects rows where series falls between low and high.
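The function itself does not appear in this excerpt. A minimal sketch of what such a mask-building function might look like, assuming strict inequalities and using invented toy values:

```python
import pandas as pd

def between(series, low, high):
    """Return a Boolean Series that is True where low < series < high."""
    return (series > low) & (series < high)

# Toy data (assumption): a handful of made-up values.
pm = pd.Series([-9.5, -7.0, -5.2, 0.3, 1.1])

mask = between(pm, -8.9, -6.0)   # Boolean mask, one entry per row
selected_values = pm[mask]       # keep only the rows where the mask is True
```

The `&` operator combines the two comparisons element by element, which is why each comparison needs its own parentheses.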
+
Now that’s starting to look like a tidal stream!
@@ -1402,9 +1413,8 @@ Name: phi2, dtype: bool
Because an HDF5 file can contain more than one Dataset, we have to provide a name, or “key”, that identifies the Dataset in the file.
We could use any string as the key, but in this example I use the variable name df.
We’re going to need centerline and selected later as well. Write a line or two of code to add them to the HDF5 file as additional Datasets.
Pandas can write a variety of other formats, which you can read about here.
In this lesson, we re-loaded the Gaia data we saved from a previous query.
diff --git a/05_join.html b/05_join.html
index 3748847..379d234 100644
--- a/05_join.html
+++ b/05_join.html
@@ -1021,12 +1021,12 @@ dtype: float64
 ON ps.obj_id = best.original_ext_source_id
 """
-job3 = Gaia.launch_job_async(query=query3,
+# job3 = Gaia.launch_job_async(query=query3,
 upload_resource='candidate_df.xml',
 upload_table_name='candidate_df')
-results3 = job3.get_results()
-results3
+# results3 = job3.get_results()
+# results3
Not necessarily in that order.
Let’s start by reviewing Figure 1 from the original paper. We’ve seen the individual panels, but now let’s look at the whole thing, along with the caption:
-Exercise: Think about the following questions:
+Think about the following questions:
What is the primary scientific result of this work?
What story is this figure telling?
In the design of this figure, can you identify 1-2 choices the authors made that you think are effective? Think about big-picture elements, like the number of panels and how they are arranged, as well as details like the choice of typeface.
Can you identify 1-2 elements that could be improved, or that you might have done differently?
Some topics that might come up in this discussion:
-The primary result is that the multiple stages of selection make it possible to separate likely candidates from the background more effectively than in previous work, which makes it possible to see the structure of GD-1 in “unprecedented detail”.
The figure documents the selection process as a sequence of steps. Reading right-to-left, top-to-bottom, we see selection based on proper motion, the results of the first selection, selection based on color and magnitude, and the results of the second selection. So this figure documents the methodology and presents the primary result.
It’s mostly black and white, with minimal use of color, so it will work well in print. The annotations in the bottom left panel guide the reader to the most important results. It contains enough technical detail for a professional audience, but most of it is also comprehensible to a more general audience. The two left panels have the same dimensions and their axes are aligned.
Since the panels represent a sequence, it might be better to arrange them left-to-right. The placement and size of the axis labels could be tweaked. The entire figure could be a little bigger to match the width and proportion of the caption. The top left panel has unused white space (but that leaves space for the annotations in the bottom left).
# Solution
+
+# Some topics that might come up in this discussion:
+
+# 1. The primary result is that the multiple stages of selection
+# make it possible to separate likely candidates from the
+# background more effectively than in previous work, which makes
+# it possible to see the structure of GD-1 in "unprecedented detail".
+
+# 2. The figure documents the selection process as a sequence of
+# steps. Reading right-to-left, top-to-bottom, we see selection
+# based on proper motion, the results of the first selection,
+# selection based on color and magnitude, and the results of the
+# second selection. So this figure documents the methodology and
+# presents the primary result.
+
+# 3. It's mostly black and white, with minimal use of color, so
+# it will work well in print. The annotations in the bottom
+# left panel guide the reader to the most important results.
+# It contains enough technical detail for a professional audience,
+# but most of it is also comprehensible to a more general audience.
+# The two left panels have the same dimensions and their axes are
+# aligned.
+
+# 4. Since the panels represent a sequence, it might be better to
+# arrange them left-to-right. The placement and size of the axis
+# labels could be tweaked. The entire figure could be a little
+# bigger to match the width and proportion of the caption.
+# The top left panel has unused white space (but that leaves
+# space for the annotations in the bottom left).
+gd1_merged.hdf5
+A label that identifies the new region, and
Several annotations that combine text and arrows to identify features of GD-1.
As an exercise, choose any or all of these features and add them to the figure:
+Choose any or all of these features and add them to the figure:
To draw vertical lines, see plt.vlines and plt.axvline.
To add text, see plt.text.
Matplotlib provides a default style that determines things like the colors of lines, the placement of labels and ticks on the axes, and many other properties.
@@ -487,7 +562,9 @@
plt.gca().tick_params(direction='in')
Exercise: Read the documentation of tick_params and use it to put ticks on the top and right sides of the axes.
Read the documentation of tick_params and use it to put ticks on the top and right sides of the axes.
# Solution
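The solution cell is empty in this excerpt. One possible approach, offered as a sketch rather than the lesson's own answer, uses the `top` and `right` keyword arguments of `tick_params` on a toy plot:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])  # toy data (assumption)

# direction='in' points the ticks into the axes;
# top=True and right=True also draw ticks on those two sides.
ax.tick_params(direction='in', top=True, right=True)

# The second tick line of each axis is the one on the top / right side.
top_ticks_on = ax.xaxis.get_major_ticks()[0].tick2line.get_visible()
right_ticks_on = ax.yaxis.get_major_ticks()[0].tick2line.get_visible()
```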
@@ -498,6 +575,7 @@
If you want to make a customization that applies to all figures in a notebook, you can use rcParams.
Exercise: Plot the previous figure again, and see what font sizes have changed. Look up any other element of rcParams, change its value, and check the effect on the figure.
As an exercise, plot the previous figure again, and see what font sizes have changed. Look up any other element of rcParams, change its value, and check the effect on the figure.
If you find yourself making the same customizations in several notebooks, you can put changes to rcParams in a matplotlibrc file, which you can read about here.
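A small sketch of changing `rcParams` for every later figure in a session. The specific value chosen here is arbitrary (an assumption for illustration), not a recommendation from the lesson.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Remember the current value so it can be compared or restored later.
old_size = plt.rcParams['font.size']

# Change the base font size; text sizes given as 'medium', 'large', etc.
# are scaled relative to this value.
plt.rcParams['font.size'] = 14

fig, ax = plt.subplots()
ax.set_xlabel('x (placeholder)')
new_size = plt.rcParams['font.size']
```

Because `rcParams` is global, the change applies to every figure created afterwards until the value is set again.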
The style sheet you choose will affect the appearance of all figures you plot after calling use, unless you override any of the options or call use again.
Exercise: Choose one of the styles on the list and select it by calling use. Then go back and plot one of the figures above and see what effect it has.
As an exercise, choose one of the styles on the list and select it by calling use. Then go back and plot one of the figures above and see what effect it has.
If you can’t find a style sheet that’s exactly what you want, you can make your own. This repository includes a style sheet called az-paper-twocol.mplstyle, with customizations chosen by Azalee Bostroem for publication in astronomy journals.
The following cell downloads the style sheet.
gd1_dataframe.hdf5
+
+
gd1_candidates.hdf5
+
+
+
Exercise: Add a few lines to plot_cmd to show the Polygon we selected as a shaded area.
Run these cells to get the polygon coordinates we saved in the previous notebook.
+The following cell downloads an HDF file that contains the polygon we used to select stars in the color-magnitude diagram, if it doesn’t already exist.
import os
@@ -862,42 +949,301 @@
gd1_polygon.hdf5
+And here’s how we read it back.
+loop = pd.read_hdf(filename, 'loop')
+loop
+gi
+0.587571 21.411746
+0.567801 21.322466
+0.548134 21.233380
+0.528693 21.144427
+0.509300 21.054549
+ ...
+0.773743 21.054549
+0.798829 21.144427
+0.824000 21.233380
+0.849503 21.322466
+0.875220 21.411746
+Name: g, Length: 234, dtype: float64
+coords_df = pd.read_hdf(filename, 'coords_df')
-coords = coords_df.to_numpy()
+coords = loop.reset_index().to_numpy()
coords
array([[ 0.26433692, 17.84253127],
- [ 0.35394265, 18.799117 ],
- [ 0.47491039, 19.68211921],
- [ 0.63172043, 20.45474614],
- [ 0.76612903, 20.78587196],
- [ 0.80645161, 21.41133186],
- [ 0.58691756, 21.30095659],
- [ 0.39426523, 20.56512141],
- [ 0.22401434, 19.2406181 ],
- [ 0.19713262, 18.02649007]])
+array([[ 0.58757135, 21.41174601],
+ [ 0.56780097, 21.32246601],
+ [ 0.54813409, 21.23338001],
+ [ 0.5286928 , 21.14442701],
+ [ 0.50929987, 21.05454901],
+ [ 0.48991266, 20.96383501],
+ [ 0.47084777, 20.87386601],
+ [ 0.45222635, 20.78511001],
+ [ 0.43438902, 20.69865301],
+ [ 0.42745198, 20.66469601],
+ [ 0.42067029, 20.63135301],
+ [ 0.41402867, 20.59850601],
+ [ 0.40738016, 20.56529901],
+ [ 0.40088387, 20.53264001],
+ [ 0.39449608, 20.50023501],
+ [ 0.38843797, 20.46871801],
+ [ 0.38251577, 20.43765101],
+ [ 0.3766547 , 20.40653701],
+ [ 0.37088531, 20.37564701],
+ [ 0.36522325, 20.34505401],
+ [ 0.35962415, 20.31443001],
+ [ 0.35413292, 20.28413501],
+ [ 0.34871894, 20.25390101],
+ [ 0.34339273, 20.22385701],
+ [ 0.33815825, 20.19395801],
+ [ 0.33305724, 20.16427301],
+ [ 0.32820637, 20.13508501],
+ [ 0.32348139, 20.10604901],
+ [ 0.31883343, 20.07716101],
+ [ 0.31425423, 20.04833101],
+ [ 0.30974976, 20.01961701],
+ [ 0.30531997, 19.99097001],
+ [ 0.30097354, 19.96246401],
+ [ 0.29669999, 19.93401801],
+ [ 0.29250157, 19.90573101],
+ [ 0.28837983, 19.87746501],
+ [ 0.28441584, 19.84955001],
+ [ 0.28065057, 19.82188301],
+ [ 0.27700644, 19.79450101],
+ [ 0.27342328, 19.76713801],
+ [ 0.26989305, 19.73985301],
+ [ 0.26641258, 19.71265801],
+ [ 0.26298257, 19.68540001],
+ [ 0.25960216, 19.65824401],
+ [ 0.2562733 , 19.63113701],
+ [ 0.25299978, 19.60409301],
+ [ 0.24977307, 19.57714401],
+ [ 0.24660506, 19.55024001],
+ [ 0.24348829, 19.52341001],
+ [ 0.24042159, 19.49666601],
+ [ 0.23741737, 19.46998501],
+ [ 0.23447423, 19.44339301],
+ [ 0.23158726, 19.41688701],
+ [ 0.22876474, 19.39045101],
+ [ 0.22600432, 19.36410901],
+ [ 0.22330395, 19.33786601],
+ [ 0.220663 , 19.31170101],
+ [ 0.21808571, 19.28560101],
+ [ 0.21557456, 19.25960101],
+ [ 0.21312279, 19.23368701],
+ [ 0.21073349, 19.20785601],
+ [ 0.20840975, 19.18210401],
+ [ 0.20614799, 19.15640601],
+ [ 0.20395119, 19.13076401],
+ [ 0.20182156, 19.10523201],
+ [ 0.19975572, 19.07977101],
+ [ 0.19775195, 19.05436401],
+ [ 0.19581903, 19.02902801],
+ [ 0.19395701, 19.00376101],
+ [ 0.19216276, 18.97857301],
+ [ 0.19044513, 18.95347601],
+ [ 0.1888007 , 18.92850001],
+ [ 0.18723796, 18.90368201],
+ [ 0.18576648, 18.87905401],
+ [ 0.18438763, 18.85466301],
+ [ 0.18310871, 18.83056001],
+ [ 0.18193706, 18.80672701],
+ [ 0.18087817, 18.78327401],
+ [ 0.17993184, 18.76015001],
+ [ 0.17910244, 18.73740501],
+ [ 0.17838817, 18.71496101],
+ [ 0.17779005, 18.69282101],
+ [ 0.177312 , 18.67099501],
+ [ 0.17694971, 18.64944001],
+ [ 0.1767112 , 18.62815801],
+ [ 0.17659065, 18.60714001],
+ [ 0.17658939, 18.58636601],
+ [ 0.17671618, 18.56585701],
+ [ 0.17696696, 18.54562201],
+ [ 0.17733781, 18.52565801],
+ [ 0.1778346 , 18.50597901],
+ [ 0.17846661, 18.48656801],
+ [ 0.17922891, 18.46742401],
+ [ 0.18012796, 18.44859001],
+ [ 0.18116197, 18.43005501],
+ [ 0.18233604, 18.41181501],
+ [ 0.18363223, 18.39379401],
+ [ 0.18506009, 18.37602901],
+ [ 0.18660932, 18.35862101],
+ [ 0.18829849, 18.34153201],
+ [ 0.19012805, 18.32480701],
+ [ 0.19210919, 18.30851301],
+ [ 0.19422686, 18.29250401],
+ [ 0.1964951 , 18.27685701],
+ [ 0.19890209, 18.26156301],
+ [ 0.20145338, 18.24666001],
+ [ 0.20417715, 18.23260501],
+ [ 0.20705285, 18.21898101],
+ [ 0.21005661, 18.20562501],
+ [ 0.21319339, 18.19254201],
+ [ 0.22126873, 18.16185301],
+ [ 0.2300065 , 18.13259301],
+ [ 0.23950909, 18.10508001],
+ [ 0.24974677, 18.07932501],
+ [ 0.26066153, 18.05527801],
+ [ 0.27224553, 18.03295501],
+ [ 0.28447607, 18.01227601],
+ [ 0.40566013, 18.01227601],
+ [ 0.39412682, 18.03295501],
+ [ 0.38329907, 18.05527801],
+ [ 0.37320316, 18.07932501],
+ [ 0.36384734, 18.10508001],
+ [ 0.35529237, 18.13259301],
+ [ 0.34756872, 18.16185301],
+ [ 0.34056407, 18.19254201],
+ [ 0.33788593, 18.20562501],
+ [ 0.33535176, 18.21898101],
+ [ 0.33295648, 18.23260501],
+ [ 0.33072983, 18.24666001],
+ [ 0.32870734, 18.26156301],
+ [ 0.32684482, 18.27685701],
+ [ 0.3251355 , 18.29250401],
+ [ 0.32359167, 18.30851301],
+ [ 0.32219665, 18.32480701],
+ [ 0.32097089, 18.34153201],
+ [ 0.31990093, 18.35862101],
+ [ 0.31898485, 18.37602901],
+ [ 0.3182056 , 18.39379401],
+ [ 0.31756993, 18.41181501],
+ [ 0.31706705, 18.43005501],
+ [ 0.31671781, 18.44859001],
+ [ 0.3165174 , 18.46742401],
+ [ 0.31646817, 18.48656801],
+ [ 0.3165622 , 18.50597901],
+ [ 0.31680458, 18.52565801],
+ [ 0.31718682, 18.54562201],
+ [ 0.31770268, 18.56585701],
+ [ 0.31835632, 18.58636601],
+ [ 0.31915162, 18.60714001],
+ [ 0.32007915, 18.62815801],
+ [ 0.3211385 , 18.64944001],
+ [ 0.32233599, 18.67099501],
+ [ 0.32366367, 18.69282101],
+ [ 0.32512771, 18.71496101],
+ [ 0.32672398, 18.73740501],
+ [ 0.32845154, 18.76015001],
+ [ 0.33031546, 18.78327401],
+ [ 0.33230964, 18.80672701],
+ [ 0.33443651, 18.83056001],
+ [ 0.3366864 , 18.85466301],
+ [ 0.3390529 , 18.87905401],
+ [ 0.34152681, 18.90368201],
+ [ 0.34410502, 18.92850001],
+ [ 0.34677677, 18.95347601],
+ [ 0.34953217, 18.97857301],
+ [ 0.35237348, 19.00376101],
+ [ 0.35529144, 19.02902801],
+ [ 0.35828883, 19.05436401],
+ [ 0.36136575, 19.07977101],
+ [ 0.36451277, 19.10523201],
+ [ 0.36773241, 19.13076401],
+ [ 0.37102978, 19.15640601],
+ [ 0.37440044, 19.18210401],
+ [ 0.37784139, 19.20785601],
+ [ 0.38135736, 19.23368701],
+ [ 0.38494552, 19.25960101],
+ [ 0.388603 , 19.28560101],
+ [ 0.39233725, 19.31170101],
+ [ 0.39614435, 19.33786601],
+ [ 0.40002069, 19.36410901],
+ [ 0.40396796, 19.39045101],
+ [ 0.40798805, 19.41688701],
+ [ 0.41208235, 19.44339301],
+ [ 0.41624335, 19.46998501],
+ [ 0.42047622, 19.49666601],
+ [ 0.42478124, 19.52341001],
+ [ 0.42914714, 19.55024001],
+ [ 0.43357463, 19.57714401],
+ [ 0.43806989, 19.60409301],
+ [ 0.44262347, 19.63113701],
+ [ 0.44724247, 19.65824401],
+ [ 0.4519225 , 19.68540001],
+ [ 0.45666424, 19.71265801],
+ [ 0.46146067, 19.73985301],
+ [ 0.46631851, 19.76713801],
+ [ 0.47124047, 19.79450101],
+ [ 0.47623175, 19.82188301],
+ [ 0.48136578, 19.84955001],
+ [ 0.48671855, 19.87746501],
+ [ 0.49225451, 19.90573101],
+ [ 0.49787627, 19.93401801],
+ [ 0.50358931, 19.96246401],
+ [ 0.50938655, 19.99097001],
+ [ 0.51528266, 20.01961701],
+ [ 0.52126534, 20.04833101],
+ [ 0.52733726, 20.07716101],
+ [ 0.53348957, 20.10604901],
+ [ 0.53973535, 20.13508501],
+ [ 0.54612384, 20.16427301],
+ [ 0.55279781, 20.19395801],
+ [ 0.55962597, 20.22385701],
+ [ 0.56656311, 20.25390101],
+ [ 0.57360789, 20.28413501],
+ [ 0.58074299, 20.31443001],
+ [ 0.5880138 , 20.34505401],
+ [ 0.59535596, 20.37564701],
+ [ 0.60283203, 20.40653701],
+ [ 0.61042265, 20.43765101],
+ [ 0.61808231, 20.46871801],
+ [ 0.62591386, 20.50023501],
+ [ 0.63413647, 20.53264001],
+ [ 0.64249372, 20.56529901],
+ [ 0.65104657, 20.59850601],
+ [ 0.659584 , 20.63135301],
+ [ 0.66830253, 20.66469601],
+ [ 0.67722496, 20.69865301],
+ [ 0.70017638, 20.78511001],
+ [ 0.72413715, 20.87386601],
+ [ 0.74870785, 20.96383501],
+ [ 0.77374297, 21.05454901],
+ [ 0.7988286 , 21.14442701],
+ [ 0.8240001 , 21.23338001],
+ [ 0.84950281, 21.32246601],
+ [ 0.8752204 , 21.41174601]])
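The conversion from an indexed Series to a two-column array can be sketched on a toy example. The values below are invented stand-ins for `loop`, whose index is named `gi` and whose values are named `g`.

```python
import pandas as pd

# Toy stand-in for `loop` (values invented for illustration).
loop = pd.Series([21.4, 18.0, 18.0, 21.4],
                 index=pd.Index([0.59, 0.28, 0.41, 0.88], name='gi'),
                 name='g')

# reset_index moves the index into an ordinary column, giving a DataFrame
# with columns ['gi', 'g']; to_numpy then yields an (N, 2) array.
coords = loop.reset_index().to_numpy()
```

Each row of `coords` is then a (color, magnitude) pair, which is the vertex format a polygon patch expects.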
Add a few lines to plot_cmd to show the polygon we selected as a shaded area.
Hint: pass coords as an argument to Polygon and plot it using add_patch.
# Solution
-#poly = Polygon(coords, closed=True,
-# facecolor='C1', alpha=0.4)
-#plt.gca().add_patch(poly)
+# poly = Polygon(coords, closed=True,
+# facecolor='C1', alpha=0.4)
+# plt.gca().add_patch(poly)
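For readers who want to try the commented-out solution outside the notebook, here is a self-contained sketch. The four vertices are made up (an assumption); in the lesson, `coords` comes from the saved polygon.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Polygon

# Made-up polygon vertices as (x, y) rows; stands in for the real coords.
coords = np.array([[0.2, 18.0], [0.4, 18.0], [0.9, 21.4], [0.6, 21.4]])

fig, ax = plt.subplots()
poly = Polygon(coords, closed=True, facecolor='C1', alpha=0.4)
ax.add_patch(poly)

# Set the limits explicitly so the patch is in view; the y axis is
# inverted here on the assumption that it represents magnitudes.
ax.set_xlim(0, 1.5)
ax.set_ylim(22, 14)
```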
Now we’re ready to put it all together. To make a figure with four subplots, we’ll use subplot2grid, which requires two arguments:
+
We use plt.tight_layout at the end, which adjusts the sizes of the panels to make sure the titles and axis labels don’t overlap.
Exercise: See what happens if you leave out tight_layout.
As an exercise, see what happens if you leave out tight_layout.
+
This is looking more and more like the figure in the paper.
-Exercise: In this example, the ratio of the widths of the panels is 3:1. How would you adjust it if you wanted the ratio to be 3:2?
+In this example, the ratio of the widths of the panels is 3:1. How would you adjust it if you wanted the ratio to be 3:2?
+# Solution
+
+# plt.figure(figsize=(9, 4.5))
+
+# shape = (2, 5) # CHANGED
+# plt.subplot2grid(shape, (0, 0), colspan=3)
+# plot_first_selection(candidate_df)
+
+# plt.subplot2grid(shape, (0, 3), colspan=2) # CHANGED
+# plot_proper_motion(centerline)
+
+# plt.subplot2grid(shape, (1, 0), colspan=3)
+# plot_second_selection(selected)
+
+# plt.subplot2grid(shape, (1, 3), colspan=2) # CHANGED
+# plot_cmd(merged)
+# poly = Polygon(coords, closed=True,
+# facecolor='C1', alpha=0.4)
+# plt.gca().add_patch(poly)
+
+# plt.tight_layout()
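The same layout can be tried without the lesson's data. This sketch keeps the `(2, 5)` grid and the `colspan` values from the solution above but replaces the plotting functions with placeholder titles (an assumption for illustration).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(9, 4.5))

# Five grid columns: the left panels span 3 and the right panels span 2,
# which gives a width ratio of roughly 3:2.
shape = (2, 5)
axes = [
    plt.subplot2grid(shape, (0, 0), colspan=3),
    plt.subplot2grid(shape, (0, 3), colspan=2),
    plt.subplot2grid(shape, (1, 0), colspan=3),
    plt.subplot2grid(shape, (1, 3), colspan=2),
]
for i, ax in enumerate(axes):
    ax.set_title(f'panel {i}')  # placeholder for the real plotting calls

plt.tight_layout()

left_w = axes[0].get_position().width
right_w = axes[1].get_position().width
```

The ratio of the drawn widths is only approximately 3:2, because `tight_layout` also reserves space between panels for labels.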
+import statements to check whether you have everything installed that we need.
A cell where you will paste a line of code you copy from Slack, to check for a potential problem with “smart” quotes.
At the end there’s a link to a survey where you can let us know you’re done, or if you have any problems.
This is a Jupyter notebook, which is a computational document that contains text, code, and results.
@@ -412,6 +418,10 @@ If it runs without producing an error, you are all set.
Otherwise, you might have to change your system settings so it does not convert straight quotes to smart quotes. If you have trouble with this, let us know and we will provide more details.
Please fill out this survey to let us know when you are done.
+