16 April 2011

Dreaming of Easy-to-Use Data

Posted by Ryan Anderson

Let me tell you about the way things are and the way things should be.

In planetary science, and especially lunar and Mars science, we have tons of data. Multiple missions, all carrying multiple instruments, are sending back terabytes of information about other worlds. Unfortunately, this information is not always available, and when it is available it is not easy to use. Yes, missions are required to release their data to the Planetary Data System (PDS) after a certain period of time, but the data products in the PDS often require a great deal of expertise to use. Non-imaging data in particular is usually impenetrable to someone not on the instrument team, but even images are often released in a bewildering number of versions, formats, and map projections.

As a graduate student, I have used data from many different missions and instruments: imaging spectrometer data from OMEGA and CRISM, wide-angle multispectral pushbroom imaging from MARCI, spectacular high-resolution images from MOC, HiRISE and CTX, infrared images from THEMIS, topography from MOLA and HRSC, radar profiles from SHARAD, images from Pancam, and elemental compositions from APXS. Each of these datasets has its own learning curve that must be overcome before it can be used. HiRISE and Pancam are probably the easiest because the images from these instruments are publicly released, and since they are images they are pretty intuitive. But even with these, a special program is needed to view the JP2 HiRISE images, and Pancam images are only easy for me to use because I have access to a whole suite of special programs written for the Pancam team.

One of the most powerful things that a planetary scientist can do is combine multiple datasets to look at a problem in several ways, but this is almost never trivial. Last year I published a study of the geomorphology of Gale Crater using a whole bunch of different orbital imaging data. To combine this data, I used the program ArcGIS. This program is commonly used by people who do mapping, but it has a very steep learning curve, it is plagued by bugs and confusing options, and in my experience it is slow and crashes far too often for comfort. Plus it costs quite a lot of money and has pretty strict licensing control.

And ArcGIS is for mapping, not for quantitative analysis. So if I want to look at the spectrum in a CRISM image that I’ve loaded into ArcGIS, I have to open that same image in a programming environment like IDL (or IDL-based ENVI, which has its own learning curve) even to do something as simple as finding what the albedo is at a certain location.

So that’s the way it is: datasets each have their own steep learning curve and are far too difficult and time-consuming to combine, and it is especially hard to combine them and still retain the ability to do quantitative science with them.

To me, this is not acceptable. Scientists should not have to waste their time getting data into a usable format. Scientists should be spending their time using the data, not wrestling with disparate computer programs (or writing their own inevitably poorly coded programs).

Here is the way I think it should be: A single program acts as the interface to all publicly released data for a given planet. All datasets are projected onto a globe of the planet and coregistered, and you can click and drag and zoom easily with a mouse. Simply clicking on a point on the globe should bring up things like the spectrum of that location in available spectrometer datasets, the albedo in various calibrated image datasets, the elevation in topographic data, etc. Zooming in on landing sites should provide access to images, spectra, etc. of targets encountered on the ground.
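To make the "click on a point, get the numbers" idea concrete, here is a rough sketch in Python of what the lookup behind such a click might look like. Everything in it is my own invention for illustration: the `GlobalGrid` class, the `query_point` function, and the assumption that every dataset has already been coregistered onto a simple latitude/longitude grid. It is not how JMars, Google Earth, or any PDS tool actually works, and the numbers in the example are placeholders, not real measurements.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class GlobalGrid:
    """A hypothetical coregistered dataset on a simple lat/lon grid.

    Rows run from north (+90) to south (-90); columns run eastward
    from 0 to 360 degrees east longitude. This layout is an assumption
    made for the sketch, not a real data format.
    """
    name: str
    units: str
    values: List[List[float]]

    def sample(self, lat: float, lon: float) -> Optional[float]:
        """Return the value of the grid cell containing (lat, lon)."""
        if not -90.0 <= lat <= 90.0:
            return None  # no such latitude on the globe
        n_rows = len(self.values)
        n_cols = len(self.values[0])
        lon = lon % 360.0  # wrap longitude into [0, 360)
        # Convert coordinates to cell indices, clamping the far edges.
        row = min(int((90.0 - lat) / 180.0 * n_rows), n_rows - 1)
        col = min(int(lon / 360.0 * n_cols), n_cols - 1)
        return self.values[row][col]


def query_point(layers: List[GlobalGrid], lat: float, lon: float) -> Dict[str, Optional[float]]:
    """Look up one location in every loaded dataset at once."""
    return {layer.name: layer.sample(lat, lon) for layer in layers}


# Placeholder layers with made-up values, just to exercise the lookup.
elevation = GlobalGrid("elevation", "m", [[1.0, 2.0, 3.0, 4.0],
                                          [5.0, 6.0, 7.0, 8.0]])
albedo = GlobalGrid("albedo", "unitless", [[0.10, 0.20, 0.30, 0.40],
                                           [0.50, 0.60, 0.70, 0.80]])

if __name__ == "__main__":
    # One "click" returns a value from every layer for that location.
    print(query_point([elevation, albedo], lat=-5.0, lon=137.0))
```

The point of the sketch is the shape of the interface: once the datasets share one coordinate system, a single call can answer "what is here?" across all of them, and the values that come back are numbers you can do science with rather than just pixels on a screen.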

A 3D view of Valles Marineris in JMars.

So far, nothing with these capabilities exists, but two programs are worth mentioning because they come close. The first is JMars, a free Java-based program distributed by Arizona State University. JMars is a great way to find released datasets and project them onto a map of Mars. It even lets you do things like take topographic profiles or load your own images. Where JMars falls short for me is mainly in its panning and zooming. You can’t just click and drag to move around the map, and to zoom you have to right-click and select a zoom level rather than just scrolling a mouse wheel. The lack of a scale bar is also a nuisance, as is the inability to render HiRISE images at full resolution. It is also unfortunately stuck in cylindrical projection. To me, it seems like computers are good enough these days to render a globe pretty easily, so it makes more sense to just display the data on a globe. Avoiding wrestling with the various types of map projections unless you’re working with global-scale data would save a lot of hassle. Still, JMars is my go-to tool for finding up-to-date publicly released data.

Even closer to my dream program is Google Earth in Mars Mode. I love Google Earth. It gives a glimpse of how easy it could be to use planetary data: it’s very intuitive to move around and zoom in, and there are multiple datasets that can be projected onto the globe, including some HiRISE at nearly full resolution! Google Earth even has rover traverses and some Pancam images if you zoom in enough! It’s extremely powerful, and I use it when I want to quickly and easily look at a location and most of the available images. Unfortunately, the images available in Google Earth are not up to date, so it’s easy to overlook more recent data. Most images also can’t be rendered in Google Earth itself: you have to follow a link to an external webpage where the file can be downloaded and then viewed with the appropriate program.

And of course, neither of these programs lets you do things like analyze spectra or get at the actual data values. These are primarily for overlaying visual data. Still, the fact that JMars and Google Earth exist makes me optimistic, and they are a vast improvement over only a few years ago, when the only way to access data was to FTP it from the PDS.

I hope that someday soon, we will see a tool that is as easy to use as Google Earth but as powerful as any quantitative analysis software. Then maybe, just maybe, planetary scientists can spend most of their time using the data rather than making the data usable.