Machine Learning
Machine Learning / Artificial Intelligence for S2S prediction
There is currently a lot of excitement in the weather and climate communities about exploring the potential of data-driven approaches based on Artificial Intelligence, Machine Learning and Deep Learning for S2S prediction through, for instance, improved parameterization, improved calibration and multi-model calibration, extreme event attribution and verification. The publicly available S2S database, which contains a considerable amount of data (re-forecasts and real-time forecasts from 11 operational centres), is an ideal testbed for these data-driven methods. The SubX database provides another such opportunity (http://cola.gmu.edu/subx/index.html).
Potential applications:
1. Improved data assimilation (e.g. better quality control of observations)
2. Improved parameterization (e.g. radiative schemes)
3. Improved post-processing (model calibration, bias-correction, multi-ensemble combination...)
4. Predictability diagnostics (e.g. teleconnections)
5. S2S event attribution (e.g. origins of extreme events)
6. Empirical forecasts
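As a point of reference for item 3, the simplest form of post-processing is a mean bias correction estimated from re-forecasts. The sketch below is illustrative only: the function name `bias_correct` and the synthetic numbers are assumptions, not part of any S2S or SubX tooling, and the ML methods described on this page aim to improve on baselines of this kind.

```python
import numpy as np

def bias_correct(forecast, reforecasts, observations):
    """Subtract the mean re-forecast-minus-observation bias from a forecast.

    forecast:      array of shape (members,), one real-time ensemble forecast
    reforecasts:   array of shape (years,), re-forecast ensemble means
    observations:  array of shape (years,), matching verifying observations
    """
    bias = np.mean(reforecasts - observations)
    return forecast - bias

# Synthetic illustration (assumed data): a model that runs about 1.5 units too warm.
rng = np.random.default_rng(0)
truth = rng.normal(10.0, 2.0, size=20)
hindcast = truth + 1.5 + rng.normal(0.0, 0.5, size=20)
members = np.array([12.0, 12.5, 11.8])
corrected = bias_correct(members, hindcast, truth)
```

The same offset is removed from every ensemble member, so the ensemble spread is left unchanged; calibrating the spread requires a more elaborate method.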
Ongoing Community Research on Machine Learning / Artificial Intelligence for S2S prediction
1. Scripps Institution of Oceanography (From Dr. Peter Gibson. Updated on April 1, 2020) The project explores the potential for modern machine learning tools to improve seasonal prediction skill of precipitation over the Western US. Modern machine learning approaches are 'data hungry', while the observational record is relatively short for the purposes of seasonal forecasting. To circumvent this issue, we train a variety of machine learning tools on perturbed-initial-condition climate model ensembles that span several thousand years, then use these 'learnt' teleconnections to make seasonal predictions. We are testing a hierarchy of machine learning approaches from simple to complex: simple logistic regression, LASSO, Random Forests, Gradient Boosted decision trees, and convolutional neural networks. This project is a collaboration between researchers at Scripps CW3E and JPL, and is funded by the California Department of Water Resources.
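The train-on-model, predict-on-observations idea can be sketched with the simplest method in that hierarchy, logistic regression. This is a minimal illustration under stated assumptions, not the project's code: the predictors, the `synthetic` data generator, and the sample sizes are all hypothetical stand-ins for a long climate model ensemble and a short observational record.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

rng = np.random.default_rng(1)
true_w = np.array([1.5, -1.0])  # assumed "teleconnection" strengths

def synthetic(n):
    """Hypothetical data: two climate indices and a wet/dry season label."""
    X = rng.normal(size=(n, 2))
    y = (rng.random(n) < sigmoid(X @ true_w)).astype(float)
    return X, y

X_model, y_model = synthetic(5000)  # stand-in for thousands of model years
X_obs, y_obs = synthetic(40)        # stand-in for a short observed record

w, b = fit_logistic(X_model, y_model)
prob_obs = sigmoid(X_obs @ w + b)   # probabilistic predictions for observed years
```

The approach only transfers to the real world if the climate model reproduces the observed teleconnections, which is an assumption built into this synthetic example.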
2. Australian Bureau of Meteorology (From Catherine de Burgh-Day with Oscar Alves and Debbie Hudson. Updated on April 2, 2020) We are at the early stages of developing an ML-based vegetation model which uses outputs of the Bureau's seasonal prediction system ACCESS-S. The purpose of this work is twofold:
We intend to start by trying an LSTM neural network for the vegetation model, potentially also including some convolutional layers. We will, however, investigate what is most effective as we go. Initially we will train on the ACCESS-S1 hindcast; if a larger training set is needed, we may train on a larger set and then use transfer learning techniques to update the model to ACCESS-S.
3. APEC Climate Center (From Dr. Hyung Jin Kim with Dr. Uran Chung and Dr. Kyungwon Park. Updated on April 3, 2020) Our project is to develop a deep learning ensemble technique to improve subseasonal forecasts over the Korean Peninsula. Deep learning is now recognized as a technique for improving climate forecasting, especially subseasonal climate prediction; however, its application is limited because the available subseasonal forecast data are too few to train and test deep learning models. Therefore, we are testing ensemble techniques for constructing sufficient subseasonal prediction data for the Korean Peninsula from climate models, and applying machine learning and various deep learning algorithms (e.g. SVM, RF, RNN, LSTM, and Convolutional LSTM) to the multi-model-ensemble-based subseasonal prediction data, to improve forecasts of daily maximum and minimum temperature and precipitation over the Korean Peninsula.
4. NOAA (ESRL/PSD) (From Dr. Michael Scheuerer. Updated on April 3, 2020) 'Using artificial neural networks for generating probabilistic subseasonal precipitation forecasts over California'. Data: We have NOT obtained our data from the publicly available S2S database mentioned in the email below. For this study, we have used:
- Subseasonal retrospective forecasts by the IFS ensemble, Cycle 43r3, that we retrieved from the ECMWF MARS archive system
- The daily accumulated PRISM precipitation data set, obtained from http://prism.oregonstate.edu/
- ERA5 reanalysis data, obtained from the Copernicus Climate Change Service
ML/AI methodology used:
In our work (a paper has been submitted recently) we propose two new approaches for statistical post-processing of subseasonal ensemble forecasts:
- The first approach uses an artificial neural network to translate subseasonal IFS precipitation forecasts into reliable probabilistic forecasts of week-2, week-3, and week-4 precipitation accumulations over California
- The second approach uses a convolutional neural network to link large-scale predictors (geopotential height and total column water over the north-eastern Pacific) calculated from ERA5 analyses to precipitation amounts over California; these relationships are then used to derive week-2, week-3, and week-4 precipitation forecasts from subseasonal IFS forecasts of these large-scale weather variables
5. Colorado State University (From Prof. Elizabeth A. Barnes. Updated on April 3, 2020) I have multiple members of my group using ML for S2S prediction. Specifically, we are focused on interpretable neural networks, so the goal is not only to make better empirical predictions, but also to understand where the predictability is coming from. We are also working on using ML to leverage climate model information to improve observational predictions.
6. Climate Prediction Center, NOAA/NWS/NCEP (From Dr. Yun Fan with Dr. Jon Gottschalck. Updated on April 4, 2020) Benefiting from recent advances in machine learning, such as more flexible and capable algorithms and the availability of large datasets, we designed neural network setups that enable us not only to explore nonlinear impacts in big data, but also to extract more sophisticated patterns and co-variability relationships hidden in the multi-dimensional predictors and predictands. These learned, more complicated relationships and high-level statistical information are then used to correct the original bias-corrected NOAA NCEP Climate Forecast System (CFSv2) Week 3-4 precipitation and 2-meter temperature forecasts. The results show that neural network techniques can, to some extent, improve Week 3-4 forecast accuracy and greatly increase efficiency over the traditional pointwise multiple linear regression methods. The dataset currently used is the NOAA NCEP CFSv2. In the near future, we will work on the NCEP GEFS, ECMWF, CMC and other real-time datasets available at NOAA CPC. Our short paper on the neural network approach is on pages 59-63 of https://www.nws.noaa.gov/ost/climate/STIP/43CDPW/43CDPW_Digest.pdf. A paper has been submitted to the AMS journal WAF (under revision): Yun Fan, Vladimir Krasnopolsky, Huug van den Dool, Chung-Yu Wu and Jon Gottschalck; 2020: Using Artificial Neural Networks to Improve CFS Week 3-4 Precipitation and 2 Meter Air Temperature Forecasts.
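For readers unfamiliar with the pointwise multiple linear regression baseline mentioned in this entry, the sketch below fits an independent least-squares regression at every grid point. It is a generic illustration, not CPC's implementation: the function name `pointwise_regression`, the array layout, and the synthetic data are assumptions.

```python
import numpy as np

def pointwise_regression(predictors, target):
    """Fit an independent least-squares regression at every grid point.

    predictors: array of shape (time, n_predictors, n_points)
    target:     array of shape (time, n_points)
    Returns coefficients of shape (n_predictors + 1, n_points); row 0 is the intercept.
    """
    n_time, n_pred, n_points = predictors.shape
    coefs = np.empty((n_pred + 1, n_points))
    for j in range(n_points):
        # Design matrix for this grid point: intercept column plus local predictors.
        A = np.column_stack([np.ones(n_time), predictors[:, :, j]])
        solution = np.linalg.lstsq(A, target[:, j], rcond=None)[0]
        coefs[:, j] = solution
    return coefs

# Synthetic illustration (assumed data): 200 forecasts, 2 predictors, 5 grid points.
rng = np.random.default_rng(2)
predictors = rng.normal(size=(200, 2, 5))
target = (0.5 + 2.0 * predictors[:, 0, :] - 1.0 * predictors[:, 1, :]
          + rng.normal(0.0, 0.1, size=(200, 5)))
coefs = pointwise_regression(predictors, target)
```

Because each grid point is fitted separately, this baseline cannot use spatial patterns shared across points, and the loop over points grows with the size of the grid.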
7. Royal Dutch Meteorological Institute (KNMI) and the Institute for Environmental Studies at the Vrije Universiteit Amsterdam (IVM) (From Chiem van Straaten. Updated on April 6, 2020) At KNMI and IVM we run a research project called ‘Improvement of sub-seasonal probabilistic forecasts of European high-impact weather events using machine learning techniques’. The project uses ML for post-processing and diagnostics (mainly dimension reduction and learning connections).
We evaluate whether probabilistic forecasts at the sub-seasonal timescale contain skill for surface variables in Europe (e.g. 2-meter temperature), and how this skill depends on scale, location and extremity. Then, for events in which some predictability is found (predictability is expected for hot extremes), we try to find their physical precursors in other variables. Ridge regression and unsupervised clustering are used for dimension reduction in SSTs, geopotential height and more.
Lastly, we combine the information on observed driving factors with information on shortcomings of the ensemble prediction systems (e.g. propagation of waves from the tropics to the mid-latitudes) to post-process the forecasts. We have experience with RFs and CNNs for post-processing at shorter timescales. Regarding data: forecast evaluation was done on ECMWF cycle 45r1, precursors are currently searched for in ERA5, and we might apply our post-processing to the EPSs in the S2S database.
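As a rough illustration of the ridge regression step mentioned above, the sketch below fits a closed-form ridge model to many candidate predictors and ranks them by coefficient magnitude. How the project actually applies ridge regression is not described here, so the ranking step, the function name `ridge_fit`, and the synthetic data are all assumptions.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression on centred data; returns (coef, intercept)."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(gram, Xc.T @ (y - y_mean))
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Synthetic illustration (assumed data): 50 candidate precursors, only two matter.
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 50))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0.0, 0.5, size=100)

coef_weak, _ = ridge_fit(X, y, alpha=0.01)
coef_strong, intercept = ridge_fit(X, y, alpha=10.0)

# Rank candidate precursors by absolute coefficient and keep the two largest.
top_two = set(np.argsort(np.abs(coef_strong))[-2:].tolist())
```

A larger `alpha` shrinks all coefficients toward zero, which stabilises the fit when there are many correlated predictors and few samples, as is typical for sub-seasonal records.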