Increasing the Accuracy and Resilience of Streamflow Forecasts through Data Augmentation and High Resolution Weather Inputs

dc.contributor.authorDavid Lambl
dc.contributor.authorSimon Topp
dc.contributor.authorPhil Butcher
dc.contributor.authorMostafa Elkurdy
dc.contributor.authorLaura Reed
dc.contributor.authorAlden Keefe Sampson
dc.coverage.spatialBolivia
dc.date.accessioned2026-03-22T20:50:48Z
dc.date.available2026-03-22T20:50:48Z
dc.date.issued2025
dc.description.abstractAccurately forecasting streamflow is essential for effectively managing water resources. High-quality operational forecasts allow us to prepare for extreme weather events, optimize hydropower generation, and minimize the impact of human development on the natural environment. However, streamflow forecasts are inherently limited by the quality and availability of upstream weather sources. The weather forecasts that drive hydrological modeling vary in their temporal resolutions and are prone to outages, such as the ECMWF data outage in November 2023. Here, we present HydroForecast Short Term 3 (ST-3), a state-of-the-art probabilistic deep learning model for medium-term (10-day) streamflow forecasts. ST-3 combines a long short-term memory (LSTM) architecture with Boolean tensors that represent data availability and dense embeddings that process the information in these tensors. This architecture allows for a training routine that uses data augmentation to simulate varying levels of weather input availability. The result is a model that 1) makes accurate forecasts even in the case of an upstream data outage, 2) achieves higher accuracy by leveraging data of varying temporal resolutions, including regional weather inputs with shorter lead times than the most common medium-term weather inputs, and 3) generates a separate forecast trace for each weather source, facilitating inference across regions where weather data availability is limited. Initial results across CAMELS sites in North America indicate that incorporating near-term, high-resolution weather data increases early-horizon forecast Kling-Gupta efficiency (KGE) by nearly 0.25, with meaningful improvements in metrics also seen across our customers’ operational sites. Validation metrics across individual weather sources, together with model interrogation through integrated gradients, indicate a high level of fidelity in the model’s learned physical relationships across forecast scenarios.
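The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names: availability-based data augmentation and the KGE metric. The function names, the source labels (`global_model`, `regional_model`), the `drop_prob` parameter, and the choice to zero-fill a dropped source are all illustrative assumptions, not the ST-3 implementation. The KGE function follows the standard definition, 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where r is the linear correlation, alpha the ratio of standard deviations, and beta the ratio of means between simulated and observed flow.

```python
import math
import random


def augment_availability(sources, rng, drop_prob=0.3):
    """Randomly hide whole weather sources to mimic upstream outages.

    `sources` maps a (hypothetical) source name to a list of floats.
    Returns (masked, available): `masked` has dropped sources zero-filled,
    and `available` is the Boolean flag per source that a model could
    consume alongside the inputs. Zero-filling is an assumption.
    """
    names = list(sources)
    available = {name: rng.random() >= drop_prob for name in names}
    if not any(available.values()):
        # Keep at least one source so the sample is still usable.
        available[rng.choice(names)] = True
    masked = {
        name: list(values) if available[name] else [0.0] * len(values)
        for name, values in sources.items()
    }
    return masked, available


def kge(sim, obs):
    """Kling-Gupta efficiency of simulated vs. observed flow (1.0 is perfect)."""
    n = len(obs)
    mean_s, mean_o = sum(sim) / n, sum(obs) / n
    std_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs)) / n
    r = cov / (std_s * std_o)   # linear correlation
    alpha = std_s / std_o       # variability ratio
    beta = mean_s / mean_o      # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


if __name__ == "__main__":
    rng = random.Random(42)
    inputs = {"global_model": [1.0, 2.0, 3.0], "regional_model": [1.5, 2.5, 3.5]}
    masked, available = augment_availability(inputs, rng, drop_prob=0.5)
    print(available, masked)
    print(kge([1.1, 2.0, 3.9, 3.2], [1.0, 2.0, 4.0, 3.0]))
```

Applying a mask like this to each training sample exposes a model to many combinations of present and missing sources, which is one plausible way to obtain the outage resilience the abstract describes.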
dc.identifier.doi10.5194/egusphere-egu25-12110
dc.identifier.urihttps://doi.org/10.5194/egusphere-egu25-12110
dc.identifier.urihttps://andeanlibrary.org/handle/123456789/84416
dc.language.isoen
dc.sourceUniversidad Pública de El Alto
dc.subjectStreamflow
dc.subjectResilience (materials science)
dc.subjectEnvironmental science
dc.subjectClimatology
dc.subjectResolution (logic)
dc.subjectMeteorology
dc.subjectHigh resolution
dc.subjectComputer science
dc.titleIncreasing the Accuracy and Resilience of Streamflow Forecasts through Data Augmentation and High Resolution Weather Inputs
dc.typepreprint
