For those puzzling over the various hurricane computer forecast models to figure out which one to believe, the best answer is: Don’t believe any of them. Put your trust in the National Hurricane Center, or NHC, forecast.
Although an individual model may outperform the official NHC forecast in some situations, the 2021 National Hurricane Center Forecast Verification Report documents that overall, it is very difficult for any one model to consistently beat the NHC forecasts for track and for intensity.
During the 2021 Atlantic hurricane season, NHC track forecasts were about as accurate as, or more accurate than, the five-year average, with two-day and three-day track forecasts setting new records for accuracy. Over the past 30 years, one- to three-day track forecast errors have been reduced by about 75%; over the past 20 years, four-day and five-day track forecast errors have been reduced by 50 – 60%. Those numbers amount to an extraordinary accomplishment, one undoubtedly leading to huge savings in lives, damage, and emotional angst. The improvement in track forecast accuracy has slowed down in recent years, however, suggesting that forecasts may be nearing their limit in accuracy because of the chaotic nature of the atmosphere.
Best track model in 2021: the GFS
As usual, the official NHC track forecasts for Atlantic storms in 2021 were tough to beat: none of the individual models outperformed the official forecast at any forecast period when skill was measured against a “no-skill” baseline model called CLIPER5 (Figure 2). The CLIPER5 model (whose name combines the words “climatology” and “persistence” to show the nature of the forecasts it makes) is tough to outperform at short-term forecasts, since a hurricane will tend to keep moving in the same direction and at the same speed as at its initial point (this is called persistence). For that reason, the skill curve in Figure 2 shows relatively low skill for NHC forecasts out to one day; skill increases for forecasts between one and three days, when persistence tends not to be a good forecast (hurricanes generally don’t move in a straight line at a constant speed for days on end). Beyond three-day forecasts, NHC forecast skill starts to drop off, as the CLIPER5 model starts weighting its forecasts using climatology, which becomes tougher to beat at long ranges.
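Skill against a no-skill baseline such as CLIPER5 is conventionally expressed as the percentage by which a forecast's error improves on the baseline's error. Here is a minimal Python sketch of that calculation, assuming the standard percent-improvement formula. The function name and all error values are hypothetical placeholders for illustration, not figures from the NHC verification report.

```python
def skill_percent(forecast_error, baseline_error):
    """Percent improvement of a forecast over a no-skill baseline.

    Positive values mean the forecast beat the baseline; 0 means no skill.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - forecast_error) / baseline_error


# Hypothetical mean track errors in nautical miles, keyed by lead time in hours.
# These are illustrative placeholders, NOT values from the verification report.
cliper5_error = {24: 80.0, 72: 250.0, 120: 400.0}
official_error = {24: 40.0, 72: 75.0, 120: 160.0}

for hours in sorted(cliper5_error):
    skill = skill_percent(official_error[hours], cliper5_error[hours])
    print(f"{hours:3d} h: skill = {skill:.0f}%")
```

With these made-up numbers, skill rises between the 24-hour and 72-hour forecasts and then drops at 120 hours, the same general shape as the curve described above.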
The GFS model was the best model in 2021, followed by the European model. The HWRF, HMON, COAMPS-TC, and CMC models did respectably for forecasts up to 72 hours; at longer time periods, the CMC and COAMPS-TC models performed poorly. The official 2021 NHC Atlantic track forecasts tended to have a northeast bias of 7-21 miles for one- to three-day forecasts (i.e., the official forecast tended to fall to the northeast of the verifying position).
Here is a list of some of the top hurricane forecast models used by NHC:
Euro: The European Centre for Medium-Range Weather Forecasts (ECMWF) global forecast model
GFS: The National Oceanic and Atmospheric Administration (NOAA) Global Forecast System model
UKMET: The United Kingdom Met Office’s global forecast model
HMON: Hurricanes in a Multi-scale Ocean-coupled Non-hydrostatic regional model, initialized using GFS data
HWRF: Hurricane Weather Research and Forecasting regional model, initialized using GFS data
COAMPS-TC: The U.S. Navy’s Coupled Ocean/Atmosphere Mesoscale Prediction System for Tropical Cyclones regional model, initialized using GFS data
If one averages the track forecasts from three or more of these six models, the official NHC forecast will rarely depart much from that average. These six models are used as input to various “consensus” models, such as “TVCN,” often referenced in NHC discussions for a storm. Improved consensus modeling techniques are one major reason NHC track forecasts have improved so much in the past 30 years.
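The averaging idea can be sketched in a few lines of Python. This is a simple equal-weight average and is not a reproduction of how TVCN or any operational consensus is actually built. The function name, the three-model minimum as a parameter, and the positions are all hypothetical.

```python
def consensus_track(model_tracks, min_models=3):
    """Equal-weight average of model positions at each lead time.

    model_tracks maps model name -> {lead_hours: (lat, lon)}.
    A lead time is included only if at least min_models supply a position.
    Averaging lat/lon directly is a simplification that is reasonable when
    the positions are close together and do not straddle the date line.
    """
    by_lead = {}
    for track in model_tracks.values():
        for lead, position in track.items():
            by_lead.setdefault(lead, []).append(position)

    consensus = {}
    for lead, positions in sorted(by_lead.items()):
        if len(positions) >= min_models:
            mean_lat = sum(p[0] for p in positions) / len(positions)
            mean_lon = sum(p[1] for p in positions) / len(positions)
            consensus[lead] = (mean_lat, mean_lon)
    return consensus


# Hypothetical forecast positions (degrees north, degrees east); not real data.
tracks = {
    "GFS":   {24: (25.0, -80.0), 48: (27.0, -82.0)},
    "Euro":  {24: (25.5, -80.5), 48: (28.0, -83.0)},
    "UKMET": {24: (24.5, -79.5)},
}
print(consensus_track(tracks))
```

In this example only the 24-hour position is returned, because just two models supply a 48-hour position and the sketch requires at least three.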
NHC intensity forecasts improved considerably in 2021
Though intensity forecasts have not improved as dramatically as track forecasts over the past 30 years, intensity errors have decreased notably since around 2010. Official NHC intensity forecast errors in the Atlantic in 2021 were 13-24% smaller than the five-year average for all forecast times, and records for intensity accuracy were set for forecasts 12 to 60 hours ahead. Mean intensity forecast errors in 2021 were about 7 mph at 24 hours, and increased to 14 mph for five-day forecasts. The official forecasts had little bias through four-day forecasts, but were biased too low for five-day forecasts.
Best intensity model in 2021: HMON
In 2021, the official NHC intensity forecast outperformed the five top intensity models at all forecast times, save for four-day forecasts, for which the HMON model did slightly better. The five top intensity models are the regional/dynamical models HWRF, HMON, and COAMPS-TC (which subdivide the atmosphere into a 3-D grid around the storm and solve the atmospheric equations of fluid flow at each point on the grid), and the statistics-based LGEM and DSHP models (DSHP is the SHIPS model with inland decay of a storm factored in). The HMON model was the clear top performer at all time periods except for one-day forecasts, when it was slightly outperformed by the COAMPS-TC model. The COAMPS-TC and LGEM models did poorly at four- and five-day forecasts. Most of the models had little bias, either high or low.
Two of the top-performing global dynamical models for hurricane track, the European (ECMWF) and GFS models, are typically not considered by NHC forecasters when making intensity forecasts. These models made poor intensity forecasts in 2021, as evident in Figure 4.
Sources of free model data
– Tropical Tidbits, which has become the best source of free model data on the web;
– Weathernerds, another excellent model data site;
– cyclonicwx.com, similar to Tropical Tidbits and Weathernerds;
– pivotalweather.com, similar to Tropical Tidbits and Weathernerds;
– ECMWF forecasts from the ECMWF web site;
– ECMWF ensemble forecasts from ECMWF;
– FSU’s model page (CMC, ECMWF, GFS, HWRF, HMON, and NAVGEM models);
– NOAA’s HWRF model page;
– NOAA’s HWRF and HMON model data page;
– The Navy’s COAMPS-TC model data page;
– Experimental HFIP models (note that the HAFS model is scheduled to replace the HWRF model in 2023); and
– UKMET text forecast.
Additional information about the guidance models used at the NHC is available at NHC (updated in 2019).
For a small monthly fee, users can access a great variety of model data at weathermodels.com.
About ensemble models
Ensemble model runs are available for most of the top global models. An ensemble model is created by taking the forecast from the high-resolution version of a model like the GFS or European, then running multiple versions of the model with slightly different initial conditions to generate an ensemble of potential forecasts that suggest uncertainties that may exist. These ensemble members are run at a lower resolution to save computer time. The European model has 51 ensemble members, and the GFS has 31. The 0Z GFS run (called GEFS) goes out to Day 35 (note: there is approximately a 24-hour delay for Days 17-35 to be recorded). Note that Days 17-35 ensemble forecasts should be taken with a large grain of salt for now but may still be useful for tracking long-term or seasonal shifts.
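To see why slightly different initial conditions produce a spread of outcomes, here is a toy Python sketch. It uses the logistic map, a simple chaotic equation, as a stand-in for a forecast model; it is emphatically not a weather model. The function names, member count, and perturbation size are arbitrary assumptions chosen for illustration.

```python
def logistic_step(x, r=3.9):
    """One step of the logistic map, a toy chaotic system (not a weather model)."""
    return r * x * (1.0 - x)


def run_ensemble(control_x0, n_members=5, perturbation=1e-6, n_steps=50):
    """Run a control member plus perturbed members; return the spread per step.

    Spread here is simply max minus min across members at each step.
    """
    half = n_members // 2
    # Members differ from the control only by tiny offsets in the starting value.
    members = [control_x0 + (i - half) * perturbation for i in range(n_members)]
    spread = []
    for _ in range(n_steps):
        members = [logistic_step(x) for x in members]
        spread.append(max(members) - min(members))
    return spread


spread = run_ensemble(control_x0=0.3)
for step in (1, 10, 20, 30, 40, 50):
    print(f"step {step:2d}: spread = {spread[step - 1]:.6f}")
```

The members start out nearly identical, yet their differences grow as the toy system steps forward, which is the behavior that makes long-range ensemble forecasts less certain.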
Ensembles are especially useful for setups such as weak steering flow, where the varied starting conditions across a model ensemble may shed light on important features that the observing grid hasn’t yet captured directly. When the spread in a model ensemble decreases as a storm evolves, it’s a good sign that the forecast from that operational model is becoming more reliable. Keep in mind that one model’s ensemble tracks can sometimes be in tight agreement while another model’s ensemble is in tight agreement on a completely different solution. In such a case, it’s often the different physics within each model that are driving the difference, which makes it especially important to watch how the consensus model output evolves (the consensus being the average of the forecasts from three or more separate models, such as the GFS, European, and UKMET).
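One simple way to put a number on ensemble spread is the average distance of each member's forecast position from the ensemble-mean position. The Python sketch below assumes that definition; operational centers may define spread differently. The function names and member positions are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def ensemble_spread_km(positions):
    """Mean distance of ensemble members from the ensemble-mean position."""
    n = len(positions)
    mean_lat = sum(lat for lat, _ in positions) / n
    mean_lon = sum(lon for _, lon in positions) / n
    return sum(haversine_km(mean_lat, mean_lon, lat, lon) for lat, lon in positions) / n


# Hypothetical 72-hour member positions from two successive runs; not real data.
earlier_run = [(24.0, -78.0), (26.0, -82.0), (28.0, -80.0), (25.0, -84.0)]
later_run = [(26.0, -81.0), (26.5, -81.5), (25.5, -80.5), (26.0, -81.0)]

print(f"earlier run spread: {ensemble_spread_km(earlier_run):.0f} km")
print(f"later run spread:   {ensemble_spread_km(later_run):.0f} km")
```

A smaller value for the later run would correspond to the tightening ensemble described above.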
Tropical cyclone genesis forecasts
NHC issues a Tropical Weather Outlook four times per day, offering two-day and five-day forecasts of tropical cyclone genesis. For the Atlantic in 2021, these forecasts were pretty reliable for five-day genesis forecasts of 10 – 70%. For example, when NHC gave a 40% chance a tropical cyclone would form within five days, one actually did form 38% of the time.
However, NHC’s genesis forecasts were too conservative at the upper end of the distribution: 90% of the systems that NHC gave a 70% chance of development did in fact develop, as did 95% of the systems given an 80% chance.
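Reliability of this kind is checked by grouping forecasts by their stated probability and comparing each group with how often a storm actually formed. Here is a minimal Python sketch of that bookkeeping. The function name and the sample records are hypothetical and merely echo the percentages quoted above, not actual NHC verification data.

```python
from collections import defaultdict


def reliability_table(forecasts):
    """Observed formation frequency (%) for each forecast probability (%).

    forecasts is an iterable of (forecast_probability_percent, formed) pairs,
    where formed is True if a tropical cyclone developed within the window.
    """
    counts = defaultdict(lambda: [0, 0])  # probability -> [n_forecasts, n_formed]
    for probability, formed in forecasts:
        counts[probability][0] += 1
        counts[probability][1] += int(formed)
    return {
        probability: 100.0 * n_formed / n_total
        for probability, (n_total, n_formed) in sorted(counts.items())
    }


# Hypothetical records for illustration only.
records = (
    [(40, True)] * 2 + [(40, False)] * 3      # 2 of 5 formed
    + [(70, True)] * 9 + [(70, False)] * 1    # 9 of 10 formed
)
for probability, observed in reliability_table(records).items():
    print(f"forecast {probability}% -> observed {observed:.0f}%")
```

A perfectly reliable set of forecasts would show the observed percentage matching the forecast percentage in every group; an observed value well above the forecast value indicates the forecasts were too conservative.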
A 2016 study by a group of scientists led by Florida State’s Daniel Halperin found that four models can make decent forecasts out to five days in advance of the genesis of new tropical cyclones in the Atlantic. The model with the highest success ratio (rewarding correct genesis forecasts combined with fewest false alarms) was the European (ECMWF), followed by the UKMET, the GFS, and Canadian models.
The scientists authoring that study found that skill declined markedly for forecasts beyond two days into the future, and skill was lowest for small tropical cyclones. The European model had the lowest probability of correctly making a genesis forecast (near 20%) but had the fewest false alarms. The GFS correctly made genesis forecasts 20 – 25% of the time, but had more false alarms. The Canadian model had the best chance of making a correct genesis forecast, but also had the highest number of false alarms. The take-home message: A genesis forecast from the Canadian model suggests something may be afoot, but don’t bet on it until the European model comes on board. In general, when two or more models make the same genesis forecast, the odds of the event actually occurring increase considerably, the study authors found.
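The trade-off between catching genesis events and raising false alarms can be expressed with two standard verification scores: probability of detection (hits divided by hits plus misses) and success ratio (hits divided by hits plus false alarms). The Python sketch below assumes those textbook definitions; the study's exact methodology may differ. The function names and all counts are hypothetical.

```python
def probability_of_detection(hits, misses):
    """Fraction of observed genesis events that the model forecast."""
    return hits / (hits + misses)


def success_ratio(hits, false_alarms):
    """Fraction of the model's genesis forecasts that verified."""
    return hits / (hits + false_alarms)


# Hypothetical counts for two made-up models; NOT results from the study.
models = {
    "cautious model":   {"hits": 20, "misses": 80, "false_alarms": 10},
    "aggressive model": {"hits": 40, "misses": 60, "false_alarms": 60},
}

for name, c in models.items():
    pod = probability_of_detection(c["hits"], c["misses"])
    sr = success_ratio(c["hits"], c["false_alarms"])
    print(f"{name}: POD = {pod:.2f}, success ratio = {sr:.2f}")
```

In this made-up example the cautious model detects fewer events but is right more often when it does call for genesis, the same kind of contrast the study describes between the European and Canadian models.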
Sources of tropical cyclone genesis forecasts
– NHC 5-day Tropical Weather Outlook;
– NOAA/CIRA (two-day forecasts);
– Florida State University Experimental Tropical Cyclone Genesis Guidance page; and
– SUNY Albany 10-day Experimental Genesis Probabilities (Alan Brammer).
Bob Henson contributed to this post.