May 14, 2019
Many organizations maintain large data warehouses full of analytics, sales numbers, performance metrics, and more. But nature gives us other massive datasets, including a night sky full of stars. While BigQuery GIS was explicitly designed to serve the needs of geospatial users here on Earth, its spherical coordinate systems and built-in transformation functions are equally well suited to another domain for spherical coordinates: astronomy.
What makes BigQuery a great platform for analyzing astronomy datasets?
- BigQuery is intended for online analytical processing (OLAP) and is optimized to work with massive datasets that are not transactional. That describes most work with astronomy catalogs, which are released every year or so, depending on the project.
- BigQuery supports queries on spherical geometry, using BigQuery GIS. Locating objects on the celestial sphere requires spherical geometry.
- BigQuery GIS can query astronomy data nearly as fast as more specialized database platforms, and may be faster when used to perform full table scans.
And there’s no lack of astronomy data to explore. For example, catalog data organizes the observations of a telescope project into giant tables. Some of the larger catalog datasets comprise a billion or so objects with many observed features, and for some features, these datasets include observations that span hours or years. WISE and Gaia are satellite-based telescopes that provide us with high-resolution image data. LSST, a major new ground-based telescope, will soon come online. It is mandated to release catalogs of observed objects over the 10-year life of the project. Later in this post, we’ll explore how to use BigQuery GIS with this kind of catalog data.
Understanding the celestial coordinate system
But before we show you examples of how to query astronomy catalog data with BigQuery, let’s take a step back and discuss the broad set of functions implemented in BigQuery GIS to support your GIS needs.
Look down for a second
Consider that the Earth is a sphere, and that you find yourself on the two-dimensional surface of our planet. Your position there is given by latitude and longitude, easily obtained from a global positioning system (GPS) that locates you and guides you to where you want to go using “lat and long” coordinates.
If you want to find out how long a trip is, remembering your high school geometry, you might think you can find the total distance using the Pythagorean theorem. In some cases, that might seem to work at first, but the farther you travel, the more complex your situation becomes. First, you need to convert your source and destination, lat and long, to Cartesian coordinates on a Euclidean plane, and convert angles to meters or miles. And worse, Euclidean distance is all about planar geometry, but the surface of the Earth is not flat (rather, it’s spherical), so Pythagoras’ theorem doesn’t work. The ancient Greek and Islamic mathematicians had most of the math worked out 1,000 years ago, but that doesn’t make it any easier. The good news is that BigQuery GIS takes advantage of Google’s S2 Geometry library to perform these calculations, so you can access all that above-mentioned messy geometry in much simpler Standard SQL. You can calculate the distance between points on Earth, and get fancier still doing work with regions, polygons, and so on. It’s very powerful, and pretty easy to use.
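To make the gap between planar and spherical distance concrete, here is a minimal Python sketch. It is not how BigQuery GIS or the S2 library computes distances. It uses the textbook haversine formula and assumes a perfectly spherical Earth with a mean radius of 6,371 km. The function names are illustrative only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean radius of a perfectly spherical Earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def naive_planar_km(lat1, lon1, lat2, lon2):
    """Pythagoras applied directly to degrees, scaled by km per degree at the equator."""
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360
    return km_per_degree * math.hypot(lat2 - lat1, lon2 - lon1)


if __name__ == "__main__":
    # Two points on the 60th parallel, 90 degrees of longitude apart.
    print(haversine_km(60, 0, 60, 90))
    print(naive_planar_km(60, 0, 60, 90))
```

Along the equator the two functions agree, but at 60° latitude the naive version overstates the distance by more than a factor of two, because lines of longitude converge toward the poles.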
Ad astra
Now that you have an understanding of terrestrial geometry, let’s look back up to the stars! BigQuery GIS uses the same basic concepts to track celestial bodies as it does to track things on Earth. In other words, to locate a star in the sky, you assign a coordinate, like lat and long, that points you to exactly where you will find the star in space. But hold on, space is not a sphere! Space is literally a fully three-dimensional sort-of-infinite expanse of stars, galaxies, black holes, planets, quasars, pulsars, and nebulae. They’re all spread out, light years away, nothing like the surface of the Earth, where I am trying to get from my house to the nearest Google office using GPS coordinates.
Here’s where it gets interesting: all the celestial objects described above are so distant that we can’t easily tell the difference between a closer object and a farther object. They might as well be points of light on a giant black sphere with the Earth at its center, which is kind of what it looks like at night when you look up at the sky. (Although we’re not here to discuss the history of astronomy, avid historians of science will recall that this is exactly the model the ancient Greeks, and up until quite recently all their intellectual descendants, used to describe the heavens. If you are interested, I recommend The Structure of Scientific Revolutions, by Thomas S. Kuhn.)
So, back to the celestial sphere. If the night sky and all the celestial bodies are indistinguishable from a giant sphere with the Earth at its center, my earlier proposal to assign a latitude and longitude to locate objects seems reasonable. In fact, astronomers do exactly that. They assign what they call the coordinates right ascension (ra) and declination (dec). These coordinates work just like longitude and latitude, respectively. Sometimes, right ascension is written in more historical notation using hours, minutes, and seconds.
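One practical wrinkle when treating ra and dec as longitude and latitude: catalogs usually store right ascension in the range 0 to 360 degrees, while geographic tools commonly expect longitude between -180 and 180. The Python sketch below shows one way to wrap the values, assuming that target range. The helper name is hypothetical and is not part of any BigQuery API.

```python
def ra_dec_to_lon_lat(ra_deg, dec_deg):
    """Map (ra, dec) in degrees to a (lon, lat) pair with lon in [-180, 180)."""
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("declination must lie between -90 and 90 degrees")
    # Shift, wrap into [0, 360), then shift back so the result is in [-180, 180).
    lon = (ra_deg + 180.0) % 360.0 - 180.0
    return lon, dec_deg


if __name__ == "__main__":
    print(ra_dec_to_lon_lat(270.0, -5.0))  # ra of 270 degrees wraps to lon -90
```

The wrap changes only how the angle is labeled, so angular distances between objects are unaffected.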
Let’s look at an example. Vega (a star famous from the movie Contact) can be found at RA 18h 36m 56s, Dec +38° 47′ 1″. Fortunately, modern astronomical data typically uses degrees and decimal points to store coordinates, just like modern geographers do. In modern notation, Vega has nearly the same declination (+39°) as the latitude (39° N) of Kansas City. This means that once a day Vega passes almost directly overhead for people in Kansas City, who can look straight up to see it (if it’s night time). This daily rotation clearly hints at the historical use of the 24-hour system for right ascension.
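The conversion from the historical notation to decimal degrees is simple arithmetic: 24 hours of right ascension span 360 degrees, so one hour is 15 degrees. Here is a short Python sketch applied to the Vega coordinates quoted above. The function names are illustrative, and the sign of the declination is passed separately so that values between -1° and 0° keep their sign.

```python
def ra_hms_to_degrees(hours, minutes, seconds):
    """Right ascension in hours/minutes/seconds to decimal degrees (1 h = 15 degrees)."""
    return 15.0 * (hours + minutes / 60.0 + seconds / 3600.0)


def dec_dms_to_degrees(sign, degrees, arcminutes, arcseconds):
    """Declination in degrees/arcminutes/arcseconds to decimal degrees; sign is +1 or -1."""
    return sign * (degrees + arcminutes / 60.0 + arcseconds / 3600.0)


if __name__ == "__main__":
    # Vega: RA 18h 36m 56s, Dec +38 deg 47' 1"
    vega_ra = ra_hms_to_degrees(18, 36, 56)
    vega_dec = dec_dms_to_degrees(+1, 38, 47, 1)
    print(round(vega_ra, 4), round(vega_dec, 4))
```

For Vega this gives a right ascension of about 279.23 degrees and a declination of about 38.78 degrees.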