WeatherZine #23 Guest Editorial

How much "skill" is there in forecasting El Niño?

Chris Landsea
NOAA/AOML/Hurricane Research Division
landsea@aoml.noaa.gov

John Knaff
NOAA/Cooperative Institute for Research in the Atmosphere
Colorado State University
Knaff@CIRA.colostate.edu

The 1997-98 El Niño had dramatic impacts, leading to drought in Indonesia, extreme rains in Peru and Ecuador, and a quiet Atlantic hurricane season. Conventional wisdom holds that predictions of the event's onset, magnitude, decay, and impacts were generally accurate. But a close look at the forecasts reveals that although the impacts of the event, once it had begun, were accurately anticipated based on the climatology of past El Niños, none of the available forecast techniques accurately predicted the event's onset, magnitude, and decay.

What is an accurate forecast? One definition of forecast accuracy is based on a concept called "skill." Atmospheric scientists define "skill" as a prediction's improvement upon some naïve baseline. For example, absent other information, a best guess for the high temperature in Washington, DC on September 1 might be the historical average high temperature for that date. A forecast that is closer than the historical average to the actual temperature on September 1 thus has skill. A forecast methodology that consistently improves upon a naïve baseline is a skillful methodology. Consequently, the choice of baseline against which skill is measured is crucial to any claim of forecast accuracy.
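The definition above can be made concrete with a minimal sketch. The function names and all temperature values below are hypothetical, chosen only to illustrate the comparison of a forecast against a climatological baseline for a single date.

```python
def absolute_error(forecast: float, observed: float) -> float:
    """Distance between a forecast and what actually happened."""
    return abs(forecast - observed)


def has_skill(forecast: float, baseline: float, observed: float) -> bool:
    """True if the forecast is closer to the observation than the naive baseline is."""
    return absolute_error(forecast, observed) < absolute_error(baseline, observed)


# Hypothetical values (degrees F) for a September 1 high in Washington, DC.
climatological_high = 84.0  # assumed historical average for the date
observed_high = 90.0        # assumed verifying observation

# Forecast off by 2 degrees, baseline off by 6: the forecast has skill.
print(has_skill(88.0, climatological_high, observed_high))  # True

# Forecast off by 15 degrees, baseline off by 6: no skill.
print(has_skill(75.0, climatological_high, observed_high))  # False
```

A single case like this says nothing about a methodology; judging a methodology requires repeating the comparison over many forecasts.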

Traditionally, scientists have judged seasonal El Niño forecasts of sea surface temperature (SST) in the equatorial eastern Pacific Ocean "skillful" if they improve upon a baseline based upon "persistence." A forecast of "persistence" simply uses current conditions as a predictor of future conditions. For example, if the SSTs were 0.7°C above average, persistence would simply forecast 0.7°C above average for the following months and seasons. Because SSTs associated with El Niño are part of a cycle (the El Niño-Southern Oscillation, or ENSO), anomalies do not stay fixed for long, and persistence turns out to be a very easy baseline to beat.
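A rough illustration of why persistence degrades for a cyclic quantity is sketched below. The sine wave is only a stand-in for an SST anomaly series; its 48-month period and 1°C amplitude are assumptions made for the example, not properties of ENSO taken from the evaluation discussed here.

```python
import math


def persistence_forecast(current_anomaly: float, lead_months: int) -> float:
    """Persistence: forecast the current anomaly unchanged at every lead time."""
    return current_anomaly


def rmse(forecasts, observations):
    """Root mean square error over paired forecasts and observations."""
    pairs = list(zip(forecasts, observations))
    return math.sqrt(sum((f - o) ** 2 for f, o in pairs) / len(pairs))


# Assumed idealized cycle: 48-month period, amplitude 1 degree C, 40 years of months.
PERIOD_MONTHS = 48
series = [math.sin(2 * math.pi * m / PERIOD_MONTHS) for m in range(480)]


def persistence_rmse(series, lead):
    """RMSE of persistence forecasts issued every month and verified `lead` months later."""
    forecasts = [persistence_forecast(series[t], lead) for t in range(len(series) - lead)]
    verifying = series[lead:]
    return rmse(forecasts, verifying)


for lead in (3, 6, 12, 24):
    print(f"lead {lead:2d} months: persistence RMSE = {persistence_rmse(series, lead):.2f}")
```

In this idealized series the persistence error grows steadily as the lead approaches half the period, which is the sense in which a cyclic signal makes persistence an undemanding benchmark.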

To provide a more stringent but still naïve baseline of skill in El Niño forecasting, we developed the El Niño-Southern Oscillation CLImatology and PERsistence (ENSO-CLIPER) model, a simple statistical tool that takes advantage of the climatology of past El Niño events, persistence, and contemporary trends. We recommend that the output of ENSO-CLIPER replace persistence as the skill threshold. There are, of course, other simple statistical models that could be used to set this threshold. In our proposal, "skill" is defined as the ability of a forecast or forecast methodology to improve upon ENSO-CLIPER, which is a more difficult task.
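To give a feel for what a climatology-and-persistence baseline involves, here is a deliberately simplified sketch. It is not the published ENSO-CLIPER scheme, whose predictors and fitting procedure are described in Knaff and Landsea (1997). The choice of two predictors (current anomaly and a three-month trend), the no-intercept fit, and every function name are assumptions made for illustration.

```python
def fit_two_predictor_regression(rows, targets):
    """Least-squares fit of target = a * x1 + b * x2 (no intercept; inputs are anomalies)."""
    s11 = sum(x1 * x1 for x1, _ in rows)
    s12 = sum(x1 * x2 for x1, x2 in rows)
    s22 = sum(x2 * x2 for _, x2 in rows)
    s1y = sum(x1 * y for (x1, _), y in zip(rows, targets))
    s2y = sum(x2 * y for (_, x2), y in zip(rows, targets))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("predictors are collinear; cannot fit")
    # Solve the 2x2 normal equations with Cramer's rule.
    a = (s1y * s22 - s12 * s2y) / det
    b = (s11 * s2y - s12 * s1y) / det
    return a, b


def build_training_set(history, lead, trend_window=3):
    """Pair (current anomaly, recent trend) with the anomaly observed `lead` months later."""
    rows, targets = [], []
    for t in range(trend_window, len(history) - lead):
        rows.append((history[t], history[t] - history[t - trend_window]))
        targets.append(history[t + lead])
    return rows, targets


def cliper_like_forecast(history, lead, trend_window=3):
    """Forecast the anomaly `lead` months ahead from the historical record alone."""
    rows, targets = build_training_set(history, lead, trend_window)
    a, b = fit_two_predictor_regression(rows, targets)
    current = history[-1]
    trend = history[-1] - history[-1 - trend_window]
    return a * current + b * trend
```

Given a list of monthly SST anomalies, `cliper_like_forecast(history, lead=6)` returns a six-month forecast. The point of the design is that the coefficients come entirely from past behavior, so the baseline stays naïve while still knowing that anomalies tend to rise and fall.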

In next month's Bulletin of the American Meteorological Society, we evaluate twelve statistical and dynamical models that were available in real time for the 1997-98 event. We conclude that some of the models were able to outperform ENSO-CLIPER in predicting either the onset or the decay of the 1997-98 El Niño, but none was successful at predicting both onset and decay at a medium-range (6-11 months) lead time. Also, no predictive approach, including the ENSO-CLIPER baseline, was able to anticipate even one-half of the actual magnitude of the El Niño at medium-range (6-11 months) lead. In addition, none of the models showed skill at short- to medium-range lead times (0-8 months). No dynamical model and only two of the statistical models outperformed ENSO-CLIPER by more than 5% (of the root mean square error) at 9 to 14 months lead time.
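One way to read the 5% threshold is as a percentage reduction in root mean square error relative to the baseline. The sketch below assumes that reading; the function names and the example error values are hypothetical and are not results from the evaluation.

```python
import math


def rmse(forecasts, observations):
    """Root mean square error over paired forecasts and observations."""
    if len(forecasts) != len(observations) or not forecasts:
        raise ValueError("need equal-length, non-empty sequences")
    squared = [(f - o) ** 2 for f, o in zip(forecasts, observations)]
    return math.sqrt(sum(squared) / len(squared))


def percent_improvement(model_rmse: float, baseline_rmse: float) -> float:
    """Percentage reduction in RMSE relative to the baseline (negative if the model is worse)."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


def clears_threshold(model_rmse: float, baseline_rmse: float,
                     threshold_percent: float = 5.0) -> bool:
    """True if the model beats the baseline by more than the given margin."""
    return percent_improvement(model_rmse, baseline_rmse) > threshold_percent


# Hypothetical RMSE values in degrees C for one lead time.
baseline_rmse = 1.00
print(clears_threshold(0.90, baseline_rmse))  # roughly 10% better than the baseline
print(clears_threshold(0.97, baseline_rmse))  # roughly 3% better, under the 5% margin
```

Requiring a margin rather than any improvement at all guards against counting differences that are too small to matter as skill.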

A lesson to be learned from this evaluation is that, since the best-performing models for the very strong 1997-98 El Niño event were statistical ones, the use of more complex, physically realistic dynamical models does not automatically provide more accurate forecasts. Increased complexity can multiply the sources of error by orders of magnitude, which can degrade skill. Despite the lack of skill in forecasting ENSO up to 8 months in advance, once the 1997-98 El Niño had begun, national meteorological centers were able to anticipate many of the impacts correctly because of the tendency for El Niño events to persist into and peak during the winter. Indeed, the U.S. Climate Prediction Center's most skillful tools for predicting U.S. seasonal precipitation were statistical rather than dynamical models. For seasonal temperature anomalies in the United States, the statistical and dynamical approaches were about equal in skill. This implies that, in this case, dynamical models were not needed to anticipate a wet and stormy winter for the southern tier of the United States and a warm winter for the northern tier of states.

We have two recommendations based on this work. 1) The forecasting community needs to debate and agree on the naïve baseline against which "skill" is to be measured in forecasts of ENSO phenomena. Simple persistence is much too easy a benchmark. If not ENSO-CLIPER, then some other more rigorous but simple test is essential for evaluating ENSO forecasting in a useful manner. 2) For the 1997-98 event, none of the more sophisticated models, whether other statistical schemes or numerical techniques, outperformed the naïve ENSO-CLIPER baseline for short to medium lead times (0 to 8 months). Thus these more complex models may not be doing much more than carrying out pattern recognition and extrapolation. National meteorological centers and research agencies may wish to consider carefully their resource priorities (personnel, computers, and budgets) when the most accurate tools presently appear to be the relatively cheap statistical systems rather than the dynamical models, which are expensive both to develop and to run.

These results may be surprising given the general perception that seasonal El Niño forecasts from dynamical models have been quite successful and may even be considered a solved problem. A particular report in Science in 1998 – "Models win big in forecasting El Niño" – generated widespread publicity for the success in forecasting the 1997-98 El Niño's onset by the comprehensive dynamical models. This report was based upon a conference paper by NOAA's Tony Barnston, which only considered El Niño's onset at the time of the report in October 1997. No follow-up in Science was forthcoming when Barnston and colleagues published a paper in the Bulletin of the American Meteorological Society last year showing that the comprehensive dynamical models did not "win big" after all. (It is worth mentioning that the results from Barnston and colleagues do indeed agree quite well in general with our paper, though the interpretation is very different.)

Also disturbing is the use of the supposed success in dynamical El Niño forecasting to support other agendas. For example, a 1999 overview paper by Ledley and colleagues in support of the American Geophysical Union's "Position Statement on Climate Change and Greenhouse Gases" said the following:

"Confidence in [comprehensive coupled] models [for anthropogenic global warming scenarios] is also gained from their emerging predictive capability. An example of this capability is the development of a hierarchy of models to study the El Niño-Southern Oscillation (ENSO) phenomena.....These models can predict the lower frequency responses of the climate system, such as anomalies in monthly and season averages of the sea surface temperatures in the tropical Pacific."

To the contrary, under this logic and with the results of our study, one could even have less confidence in anthropogenic global warming predictions because of the lack of skill in predicting El Niño. The inability of dynamical models to outperform a relatively simple statistical scheme for ENSO calls into question the consensus opinion that coupled dynamical models are the best way to accurately predict short-term climate variability. The bottom line is that the successes in ENSO forecasting have been overstated (sometimes drastically) and misapplied in other arenas.

We are now engaged in an assessment of forecast skill for the strong 1998-2000 La Niña event, which immediately followed the 1997-98 El Niño. Once the most recent complete ENSO warm and cold cycle has been assessed, it may turn out that truly skillful predictions from models are available. But the current answer to the question posed in this article's title is that there was essentially no skill in forecasting the very strong 1997-98 El Niño at lead times ranging from 0 to 8 months, using the performance of ENSO-CLIPER as the naïve baseline. Moreover, the lack of skill at short- to medium-range lead times confirms what was also observed in independent tests of real-time ENSO prediction models for the period 1993-96 in our earlier work.

For further reading:

Barnston, A. G., M. H. Glantz, and Y. He, 1999: Predictive skill of statistical and dynamical climate models in SST forecasts during the 1997-98 El Niño episode and the 1998 La Niña onset. Bull. Amer. Meteor. Soc., 80, 217-243.
Kerr, R. A., 1998: Models win big in forecasting El Niño. Science, 280, 522-523.
Knaff, J. A., and C. W. Landsea, 1997: An El Niño-Southern Oscillation CLImatology and PERsistence (CLIPER) forecasting scheme. Wea. Forecasting, 12, 633-652.
Landsea, C. W., and J. A. Knaff, 2000: How much skill was there in forecasting the very strong 1997-98 El Niño? Bull. Amer. Meteor. Soc., (in press, September issue).
Ledley, T. S., E. T. Sundquist, S. E. Schwartz, D. K. Hall, J. D. Fellows, and T. L. Killeen, 1999: Climate change and greenhouse gases. Eos, 80, 453-458.

— Chris Landsea
NOAA/AOML/Hurricane Research Division
Miami, Florida
landsea@aoml.noaa.gov

— John Knaff
NOAA/Cooperative Institute for Research in the Atmosphere
Colorado State University
Fort Collins, Colorado
Knaff@CIRA.colostate.edu
