27 July 2015
Europe Doesn’t Just Have Better Roads, Faster Trains, and Nicer Airports. They Have Better Weather Satellites, and More Accurate Weather Models.
Posted by Dan Satterfield
..and don’t get me started about roundabouts either, because that’s a whole other blog post!
Seriously though, our once number-one position in atmospheric science is long gone, and there are few signs of that changing. Yes, we will launch a weather satellite next year that will be as good as or perhaps better than Europe’s Meteosat, but they have an even better one on the drawing board, while Japan has us beat now and has plans for more as well. I want to talk about numerical weather models, though, and when it comes to the weather prediction gap, the person to listen to is University of Washington professor Dr. Cliff Mass.
First, some background on where we are now. I spent this weekend watching some of the talks at the AMS Conference on Weather and Forecasting held in Chicago in late June. I did not make that meeting, having been at the AMS Conference on Broadcast Meteorology a couple of weeks earlier, but most of the talks are online and I will link to a few here.
WHO’S IN FIRST
Ask any forecaster what the best medium-range weather model is, and they will almost certainly tell you that it’s the model run in Reading, England by the European Centre for Medium-Range Weather Forecasts. We just call it the ECMWF model for short. If, by chance, someone tells you otherwise, they’re wrong, and the data proves it.
Before someone says something along the lines of “It must be nice to get paid for being wrong all the time,” let me point out two things. First, my batting average is better than any MLB player’s, and I’ll gladly swap paychecks with them. Secondly, numerical models have improved amazingly over the last 20 years. A three-day model forecast today is as good as a two-day forecast from just 13 years ago. The Hurricane WRF model has improved dramatically as well (see image below from the same talk).
That said, there is no doubt that Europe is ahead of us, and in spite of new, more powerful supercomputers at NOAA, they will likely stay there. The reason is that early next year the ECMWF model will be upgraded to a much higher resolution with even better physics. They will upgrade their long-range ensemble forecasting as well. FYI: long-range predictions are more accurate when you run the model many times, each from a very slightly different starting point, and then look at the solutions that are most common. A high spread can also tell you that predictability is low, and this too is valuable information. See this talk by David Richardson at ECMWF; the images below are from that talk.
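The ensemble idea can be sketched in a few lines. Below is a toy version in Python using the Lorenz-63 equations, a classic chaotic stand-in for the atmosphere; the perturbation size, member count, and step counts are all made up for illustration and have nothing to do with any operational configuration:

```python
# Toy ensemble forecast: integrate the chaotic Lorenz-63 system from many
# slightly perturbed starting points and watch the ensemble spread grow
# with lead time. All parameters here are illustrative choices.
import random
import statistics

def lorenz63_step(x, y, z, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the Lorenz-63 equations."""
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return x + dt * dx, y + dt * dy, z + dt * dz

def run_member(x, y, z, nsteps):
    """Integrate one ensemble member and return its final x value."""
    for _ in range(nsteps):
        x, y, z = lorenz63_step(x, y, z)
    return x

random.seed(42)
# Tiny perturbations to the starting point -- the analogue of our
# imperfect knowledge of the atmosphere's initial state.
starts = [(1.0 + random.gauss(0, 1e-4), 1.0, 1.0) for _ in range(50)]

for nsteps in (100, 500, 1500):
    finals = [run_member(x, y, z, nsteps) for x, y, z in starts]
    print(f"lead {nsteps:4d} steps: ensemble spread = "
          f"{statistics.stdev(finals):.4f}")
```

At short lead times the members agree closely; at longer leads the spread grows toward the size of the attractor itself, which is exactly the "predictability is low" signal mentioned above.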
For those not familiar with the terminology, the grid size is like the number of pixels in your digital camera. Smaller grid equals better resolution. You also have vertical resolution, and the ECM model now has 137 layers starting at the surface and extending to a height that will freeze you to death about the time you die from lack of oxygen!
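To put rough numbers on the pixel analogy, here is a back-of-envelope estimate in Python. The surface-area arithmetic is my own simplification, not how ECMWF actually lays out its grid; only the 137-level figure comes from the text above:

```python
# Rough count of grid cells in a global model at a given horizontal grid
# spacing, with 137 vertical levels (the ECMWF figure quoted above).
# Dividing the Earth's surface area by the cell area is a crude estimate,
# not the actual ECMWF grid layout.
import math

EARTH_RADIUS_KM = 6371.0
LEVELS = 137

def global_cells(grid_km, levels=LEVELS):
    surface_area = 4 * math.pi * EARTH_RADIUS_KM ** 2  # km^2
    columns = surface_area / grid_km ** 2              # horizontal cells
    return columns * levels

for grid_km in (25, 13, 9):
    print(f"{grid_km:2d} km grid: ~{global_cells(grid_km):.2e} cells")
```

Note that halving the grid spacing quadruples the number of columns, before you even account for the shorter time step a finer grid requires.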
Models are only as good as their starting point, though, and this is one of the reasons the ECM is so good. The starting point it is given is more realistic than the one NOAA uses. We do not know the state of the atmosphere over the entire globe, so we have to estimate it using the available data from surface stations, ships, weather balloons and, ever more importantly, satellites. They do a very good job of this at ECMWF and are planning to improve it further.
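The core idea behind that estimation step (data assimilation) is to blend a model "background" guess with observations, weighting each by how much you trust it. Here is a minimal one-variable sketch; real systems do this for millions of values at once, and every number below is invented for illustration:

```python
# Minimal sketch of the weighting at the heart of data assimilation:
# blend a model background with an observation according to their error
# variances. One variable only; all numbers are made up.

def assimilate(background, obs, bg_var, obs_var):
    """Optimally blend background and observation for one variable."""
    gain = bg_var / (bg_var + obs_var)   # how much to trust the obs
    analysis = background + gain * (obs - background)
    analysis_var = (1 - gain) * bg_var   # the analysis is more certain
    return analysis, analysis_var

# Model first guess: 15.0 C (error variance 4); thermometer: 17.0 C (variance 1)
analysis, var = assimilate(15.0, 17.0, bg_var=4.0, obs_var=1.0)
print(f"analysis = {analysis:.1f} C, variance = {var:.2f}")
# -> analysis = 16.6 C, variance = 0.80
```

The analysis lands closer to the more trustworthy observation, and its error variance is smaller than either input's: better starting point, better forecast.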
The Charles Emerson Winchester Technique of Weather Prediction
The M*A*S*H character Charles Emerson Winchester had a line in one episode where he says, “I do one thing at a time, I do it very well, and then I move on.” This is how the ECMWF folks approach forecasting. They have one main model, and they work to make it better. Look at the distribution of computer time on NOAA’s mainframe for the different numerical models (see below, from an image that Bill Lapenta, head of NOAA’s National Centers for Environmental Prediction, which runs the NOAA models, showed in his talk; watch it here).
NOAA folks realize that we need to step back and take a hard look at the future, and they have put together a committee to do just that. (I’m hoping they have at least one broadcast meteorologist on it, and I’d bet they do!)
NOW THE BAD NEWS
Remember Dr. Cliff Mass, whom I mentioned at the top of this post? He made some remarks in Chicago as well, and I hope everyone was listening, because watching online it sure made me sit up and pay attention! Look at the graph he showed: the models have not really improved much in the past five years. None of them. You can watch the talk by Dr. Mass here, but some of the images he showed are below:
I think he is spot on. It was Cliff Mass who first made public what we forecasters have known for a while about the state of our atmospheric modeling, and that actually helped to get NOAA new computers. I asked him about his talk and he said, “I have blogged about these issues quite a bit and I wrote a paper in BAMS (the uncoordinated giant) that deals with many of the issues. The bottom line is really straightforward: the US has fallen behind because we have divided our resources on too many models and systems. No system of coordination and combination of resources. A complacency among NWS leadership and a willingness to be third rate. But I think there are encouraging signs, like the acquisition of the new computer and the current NGGPS effort.”
The threat of the Soviet space program took us to the Moon. Here’s hoping China’s forecast system gets even better than the ECM model soon!
Robert had trouble posting this comment so I am posting for him. DS
A thought (speaking just for myself) that I’ve had recently, and for which I’m still working out what, if anything, can be done.
Namely, while the computational demand of running a model increases as N^4 when resolution improves by a factor of N (resolution increasing by 2x, from 26 to 13 km, for instance), the computational demand of a development experiment increases as N^6. Fundamental to this is atmospheric chaos.
The scores from day to day will _always_ have a fair amount of variability, and we are stuck with that because of chaos. On the other hand, as the models get better, there is less room for improvement. Suppose (which more or less accords with my experience) that the 5-day anomaly correlation (AC) improves in any model generation by 10% of the difference between where it is and perfect. In the days when the AC was about 0.7, that was 3 points. Today, with ACs around 0.9, it’s only 1 point. So something like 10 times as many test runs are needed to find your improvement. But that’s 10 times as many runs of a vastly more computationally demanding model.
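The sample-size arithmetic behind that "10 times" can be sketched quickly. In the code below the day-to-day noise level and the confidence factor are assumed figures of my own, not anything from the comment; only the 3-point and 1-point improvements come from the text:

```python
# Rough estimate of the runs needed to detect a model improvement against
# chaotic day-to-day score variability. With fixed noise, required sample
# size scales as (noise / improvement)^2, so a 3-point signal shrinking
# to 1 point needs ~9x the runs. Noise level and z are assumed values.

def runs_needed(improvement, score_noise, z=2.0):
    """Runs so the mean improvement stands ~z sigma above the noise."""
    return (z * score_noise / improvement) ** 2

NOISE = 0.04  # assumed day-to-day stdev of the anomaly correlation
old = runs_needed(improvement=0.03, score_noise=NOISE)  # AC ~0.7 era
new = runs_needed(improvement=0.01, score_noise=NOISE)  # AC ~0.9 era
print(f"old era: ~{old:.0f} runs; today: ~{new:.0f} runs "
      f"({new / old:.0f}x more)")
```

Whatever noise level you assume cancels out of the ratio: shrinking the detectable signal by 3x multiplies the required runs by 9, close enough to the factor of 10 above.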
Back in 1994, I published a note in Weather and Forecasting about improving the sea ice albedo algorithm in the MRF (precursor to GFS). 40 runs were sufficient to show the improvement to the satisfaction of all. No great computational demand even at the time. That was the era of T126L28. Today, with T1534LXX, the computational demand for the equivalent experiment goes from negligible to large.
Of course computers are much faster now. But this is where the N^6 gets us in trouble. Computer power has been increasing only about fast enough to keep pace with running the operational models (and strong arguments can be made that it isn’t even that fast); call that the N^4. But that still leaves us with another N^2 to deal with. If back then my experiment constituted, say, 1% of the operational computer, the equivalent experiment would now take about 100% of the operational computer.
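That arithmetic can be checked in a couple of lines. This sketch just works through the scaling stated above (compute keeping pace with N^4, experiments costing N^6, T126 to T1534 as roughly a 12x resolution increase); it is my illustration of the comment's numbers, not an official calculation:

```python
# The N^4 vs N^6 arithmetic: if computer power only keeps pace with the
# operational model's N^4 growth, the experiment's N^6 demand leaves an
# unpaid factor of N^2 in the machine share it consumes.

def experiment_share(old_share, resolution_factor):
    """Fraction of the machine an equivalent experiment needs after a
    resolution increase, assuming compute grows as N^4 and experiments
    cost N^6 (so the share grows as N^2)."""
    return old_share * resolution_factor ** 2

old_share = 0.01          # ~1% of the operational machine, circa 1994
factor = 1534 / 126       # T126 -> T1534, roughly a 12x resolution jump
print(f"equivalent experiment today: "
      f"~{experiment_share(old_share, factor):.0%} of the machine")
```

With those numbers the 1994-scale experiment now wants on the order of the entire operational computer, which is the point of the comment.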
We (the NWP community) can somewhat moderate the issue by being more clever about how we coordinate research and development. But only somewhat. In another 20 years, when we want to be running globally at 1 km, the development/research computer would have to be 100x the power of the operational computer of that day. I don’t think the community is currently so inefficient that we can simply become 100x smarter than we are now. And I can’t envision any Congress giving funds for an NWP research computer on the order of $1 billion per year.
Two things, I think, might get us around this problem. One is to de-emphasize the 5-day 500 mb AC in favor of something else that is more challenging (i.e., with more room for improvement). The second is related: pay much more attention to the oceans, land, sea ice, and coupled systems. But real attention, not lip service while still mostly looking at 500 mb scores (though my ice experiment did show improvement up there, most of the improvement was near the surface, where the ice, and the people, are). Maybe a better score is the 5-day SST anomaly prediction? Certainly there’s a lot of room for improvement on that.
Disclosure: I work in NWS, at EMC. Currently acting head of the Marine Modeling and Analysis Branch, and team lead for the ocean/waves/ice group in the NGGPS. Still, these thoughts are entirely mine, not official policy of anybody. But you see why I mention things like ice albedo and SST scores. :-)