Much like the invigorating passage of a strong cold front, major changes are afoot in the weather forecasting community. And the endgame is nothing short of revolutionary: an entirely new way to forecast weather based on artificial intelligence that can run on a desktop computer.
Today's artificial intelligence systems require one resource more than any other to operate: data. For example, large language models such as ChatGPT voraciously consume data to improve answers to queries. The more data they have, and the higher its quality, the better their training and the sharper the results.
However, there is a finite limit to quality data, even on the Internet. These large language models have hoovered up so much data that they're being sued widely for copyright infringement. And as they're running out of data, the operators of these AI models are turning to ideas such as synthetic data to keep feeding the beast and produce ever more capable results for users.
If data is king, what about other applications for AI technology similar to large language models? Are there untapped pools of data? One of the most promising that has emerged in the last 18 months is weather forecasting, and recent advances have sent shockwaves through the field of meteorology.
That's because there's a secret weapon: an extremely rich data set. The European Centre for Medium-Range Weather Forecasts, the premier organization in the world for numerical weather prediction, maintains a record of atmospheric, land, and oceanic conditions for every day, at points around the world, every few hours, going back to 1940. The last 50 years of data, after the advent of global satellite coverage, is especially rich. This dataset is known as ERA5, and it is publicly available.
It was not created to fuel AI applications, but ERA5 has turned out to be incredibly useful for this purpose. Computer scientists only really got serious about using this data to train AI models to forecast the weather in 2022. Since then, the technology has made rapid strides. In some cases, the output of these models is already superior to that of global weather models that scientists have labored for decades to design and build, and which require some of the most powerful supercomputers in the world to run.
"It is clear that machine learning is a significant part of the future of weather forecasting," said Matthew Chantry, who leads AI forecasting efforts at the European weather center known as ECMWF, in an interview with Ars.
It’s moving fast
John Dean and Kai Marshland met as undergraduates at Stanford University in the late 2010s. Dean, an electrical engineer, interned at SpaceX during the summer of 2017. Marshland, a computer scientist, interned at the launch company the next summer. Both graduated in 2019 and were trying to figure out what to do with their lives.
"We decided we wanted to solve the problem of weather uncertainty," Marshland said, so they co-founded a company called WindBorne Systems.
The premise of the company was simple: For about 85 percent of the Earth and its atmosphere, we have no good data about weather conditions. A lack of quality data, which establishes initial conditions, represents a major handicap for global weather forecast models. The company's proposed solution was in its name: wind borne.
Dean and Marshland set about designing small weather balloons that they could release into the atmosphere and that would fly around the world for up to 40 days, relaying useful atmospheric data that could be packaged and sold to the operators of large, government-funded weather models.
Weather balloons provide invaluable data about atmospheric conditions (readings such as temperature, dew point, and pressure) that cannot be captured by surface observations or satellites. Such atmospheric "profiles" are helpful in setting the initial conditions models start with. The problem is that traditional weather balloons are cumbersome and only operate for a few hours. Because of this, the National Weather Service only launches them twice daily from about 100 locations in the United States.