16 April 2011
Dreaming of Easy-to-Use Data
Posted by Ryan Anderson
Let me tell you about the way things are and the way things should be.
In planetary science, and especially lunar and Mars science, we have tons of data. Multiple missions, all carrying multiple instruments, are sending back terabytes of information about other worlds. Unfortunately, this information is not always available, and when it is available it is not easy to use. Yes, missions are required to release their data to the Planetary Data System after a certain period of time, but the data products in the PDS often require a great deal of expertise to use. Non-imaging data in particular is usually impenetrable to someone not on the instrument team, but even images are often released in a bewildering number of versions, formats and map projections.
As a graduate student, I have used data from many different missions and instruments. Imaging spectrometer data from OMEGA and CRISM, wide angle multispectral pushbroom imaging from MARCI, spectacular high resolution images from MOC, HiRISE and CTX, infrared images from THEMIS, topography from MOLA and HRSC, radar profiles from SHARAD, images from Pancam, elemental compositions from APXS. Each of these datasets has its own learning curve that must be overcome before it can be used. HiRISE and Pancam are probably the easiest because the images from these instruments are publicly released, and since they are images they are pretty intuitive. But even with these, a special program is needed to view the JP2 HiRISE images, and Pancam images are only easy for me to use because I have access to a whole suite of special programs written for the Pancam team to use.
One of the most powerful things that a planetary scientist can do is combine multiple datasets to look at a problem in several ways, but this is almost never trivial. Last year I published a study that I did of the geomorphology at Gale Crater using a whole bunch of different orbital imaging data. To combine this data, I used the program ArcGIS. This program is commonly used by people who do mapping but it has a very steep learning curve, it is plagued by bugs and confusing options, and in my experience it is slow and crashes far too often for comfort. Plus it costs quite a lot of money and has pretty strict licensing control.
And ArcGIS is for mapping, not for quantitative analysis. So if I want to look at the spectrum in a CRISM image that I’ve loaded into ArcGIS, I have to open that same image in a programming environment like IDL (or IDL-based ENVI, which has its own learning curve) even to do something as simple as finding what the albedo is at a certain location.
So that’s the way it is: datasets each have their own steep learning curve and are far too difficult and time-consuming to combine, and it is especially hard to combine them and still retain the ability to do quantitative science with them.
To me, this is not acceptable. Scientists should not have to waste their time getting data into a usable format. Scientists should be spending their time using the data, not wrestling with disparate computer programs (or writing their own inevitably poorly-coded programs).
Here is the way I think it should be: A single program acts as the interface to all publicly-released data for a given planet. All datasets are projected onto a globe of the planet and coregistered, and you can click and drag and zoom easily with a mouse. Simply clicking on a point on the globe should be able to bring up things like the spectrum of that location in available spectrometer datasets, the albedo in various calibrated image datasets, the elevation in topographic data, etc. Zooming in on landing sites should provide access to images, spectra, etc. of targets encountered on the ground.
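As a rough illustration of the click-to-query idea, here is a minimal Python sketch. It assumes every dataset has already been coregistered onto a regular global latitude/longitude grid, which is the hard part in practice. The names `GriddedLayer` and `PlanetData`, the layer names, and all the numbers are hypothetical placeholders, not part of any existing tool or real Mars data.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class GriddedLayer:
    """A global dataset on a regular lat/lon grid (hypothetical container)."""
    name: str
    units: str
    # Rows run north to south; columns run eastward from 0 to 360 degrees.
    values: List[List[float]]

    def sample(self, lat: float, lon_east: float) -> Optional[float]:
        """Return the value of the grid cell containing the point, or None."""
        if not -90.0 <= lat <= 90.0:
            return None
        n_rows = len(self.values)
        n_cols = len(self.values[0])
        # Row 0 starts at the north pole; clamp so lat = -90 lands in the last row.
        row = min(int((90.0 - lat) / 180.0 * n_rows), n_rows - 1)
        # Wrap longitude into [0, 360) so -10 and 350 refer to the same cell.
        col = min(int((lon_east % 360.0) / 360.0 * n_cols), n_cols - 1)
        return self.values[row][col]


class PlanetData:
    """One interface to every coregistered layer for a planet (hypothetical)."""

    def __init__(self) -> None:
        self.layers: Dict[str, GriddedLayer] = {}

    def add_layer(self, layer: GriddedLayer) -> None:
        self.layers[layer.name] = layer

    def query(self, lat: float, lon_east: float) -> Dict[str, Optional[float]]:
        """Simulate a click: sample every layer at the same location."""
        return {name: layer.sample(lat, lon_east)
                for name, layer in self.layers.items()}


# Tiny 2 x 4 grids filled with made-up placeholder numbers, not real data.
mars = PlanetData()
mars.add_layer(GriddedLayer("elevation", "m",
                            [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]))
mars.add_layer(GriddedLayer("albedo", "unitless",
                            [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]))

print(mars.query(lat=-45.0, lon_east=100.0))
```

The point of the design is that the caller never touches file formats or map projections: one `query` call returns actual data values from every layer at the clicked location. A real tool would also need per-pixel spectra, proper resampling, and coverage gaps for instruments that have not imaged a given spot.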
So far, nothing with these capabilities exists, but two programs are worth mentioning because they come close. The first is JMars, a free Java-based program distributed by Arizona State University. JMars is a great way to find released datasets and project them onto a map of Mars. It even lets you do things like take topographic profiles or load your own images. Where JMars falls short for me is mainly in its panning and zooming. You can’t just click and drag to move around the map, and to zoom you have to right click and select a zoom level rather than just scrolling a mouse wheel. The lack of a scalebar is also a nuisance, as is the inability to render HiRISE images at full resolution. It is also unfortunately stuck in cylindrical projection. To me, it seems like computers are good enough these days to render a globe pretty easily, so it makes more sense to just display the data on a globe. Not having to wrestle with the various types of map projections unless you’re working with global-scale data would save a lot of hassle. Still, JMars is my go-to tool for finding up-to-date publicly released data.
Even closer to my dream program is Google Earth in Mars Mode. I love Google Earth. It gives a glimpse of how easy it could be to use planetary data: it’s very intuitive to move around and zoom in, and there are multiple datasets that can be projected onto the globe, including some HiRISE at nearly full resolution! Google Earth even has rover traverses and some Pancam images if you zoom in enough! It’s extremely powerful, and I use it when I want to quickly and easily look at a location and most of the available images. Unfortunately, the images available in Google Earth are not up-to-date, so it’s easy to overlook more recent data. Most images also can’t be rendered in Google Earth itself; you have to follow a link to an external webpage where the file can be downloaded and then viewed with the appropriate program.
And of course, neither of these programs lets you do things like analyze spectra or get the actual values in the data. These are primarily for overlaying visual data. Still, the fact that JMars and Google Earth exist makes me optimistic, and they are a vast improvement over only a few years ago when the only way to access data was to FTP it from the PDS.
I hope that someday soon, we will see a tool that is as easy to use as Google Earth but as powerful as any quantitative analysis software. Then maybe, just maybe, planetary scientists can spend most of their time using the data rather than making the data usable.
Is Dot Astronomy (http://dotastronomy.com/) relevant to this? I think they have more of an astronomy than planetary science focus, but at some level I think the problems are the same (except, perhaps, that these are the people who are complementary to you: they seem to enjoy the programming as much as the science, if not more).
Not that I really know very much about it.
I love both JMars and Google Mars. I use JMars quite a bit, and the HRSC images in Google are great for sightseeing and geological context at medium resolutions.
Note: JMars does have a moveable scalebar; it’s on a separate layer. My biggest problem with JMars is that you have to type in long. and lat. coordinates to get stamps of various images. You should just be able to left-click and drag an area, with the coordinates of interest automatically entered for you.
Tom
In case you don’t already know this, if you right click over the lat or lon boxes when loading stamps in JMars, you can set the lon/lat bounds to the main view. I always do that instead of entering coordinates manually.
Have you considered upgrading to Google Earth Pro? It has more traditional GIS functionality.
I concur, and that sounds really frustrating and like something worth investing in (having all the data reasonably accessible on a single map). Organizing things in computers in general is tough. We spent a lot of time my sophomore year just sort of wrangling electrophysiology data around, from raw data to Excel spreadsheets to SPSS.
I totally agree with everything you mention here. I also work with multiple Mars datasets and try to integrate them in ArcGIS. It is incredible how long it takes just to get the data into a form where the datasets can be compared. While we are fortunate that the data is available for free, as you say, some of it is practically inaccessible. I am now starting to work with spectrometer data, and I still have not been able to use data from OMEGA. I hope someone soon develops software like Google Earth with the ability to download projected data! I would not ask for more.
I absolutely 10000% agree with this article. As a non-scientist but avid consumer of NASA consumables, as it were, I was completely shocked at how palpably hostile to the public the Mars Orbiter’s imagery system was. What I needed at the time was a (quelle surprise!) topographic map of the Mariner Valley for a 3DCG project, a seemingly pedestrian goal I thought. But the NASA team had set up their image file structure (for the public BTW) as an enormous set of folders-within-folders-within-folders, all numeric names based on their filing codec, which basically turned away in utter frustration anyone who had a sincere wish to peruse the library in any way. You went down eight or nine levels of folders, at an agonizingly slow network speed & blindly guessing at what the folder names might refer to... only to find an empty folder, because the Orbiter just hadn’t taken a picture of that area yet. Of course, THEY knew where to go because they’d set it up, but to anyone else it was a hopeless labyrinth.