Martin Wright

VAE-GAN for Hydrological Stochastic Generation

October 2025 Deep Learning • Hydrology • Climate Modeling
Contemporary approaches to stochastically generating rainfall patterns under climate change rely on site-based methods. This paper builds on recent IBM Research work that uses variational autoencoder (VAE) models to better capture co-variant rainfall patterns through prior-posterior sampling in latent space. The approach addresses key limitations of existing weather generators, particularly the difficulty of maintaining spatial co-variance while capturing temporal dynamics and extreme weather events.

Latent Space Explorer

The VAE encodes each daily rainfall field into a 64-dimensional latent vector. The scatter below shows a PCA projection of 500 encoded days. Hover a point to see the spatial rainfall pattern it decodes to — demonstrating how distinct weather regimes cluster in latent space.

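[Interactive figure: PCA projection of 500 encoded days. Axes: PCA 1, PCA 2. Legend: Dry to Wet. Selecting a point shows the rainfall pattern it decodes to.]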
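
The projection behind the scatter can be sketched in a few lines. This is a minimal illustration, assuming NumPy is available; the function name `pca_project` is hypothetical, and random numbers stand in for the encoder output because the trained encoder is not part of this text.

```python
import numpy as np

def pca_project(latents: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project the rows of `latents` onto their top principal components."""
    centered = latents - latents.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
# Placeholder for encoder output: 500 days x 64 latent dimensions (random, not real data).
latents = rng.normal(size=(500, 64))
coords = pca_project(latents)  # one (PCA 1, PCA 2) pair per day
```

With real latent vectors, each row of `coords` would give the position of one day in the scatter.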

Background & Motivation

Traditional stochastic weather generators (SWGs) cannot simultaneously produce spatially co-variant rainfall across multiple sites while preserving multi-variable temporal dependencies. For water supply systems like Auckland's — spanning the Hunua and Waitakere Ranges plus the Waikato River — accurately modelling how rainfall interacts across catchments during drought is essential for yield assessment and infrastructure planning.

Existing tools like the Stochastic Climate Library can generate either multi-site rainfall or co-variant rainfall and potential evapotranspiration (PET) at a single site, but not both. This project explores VAEs as a unified alternative.

Methodology

A convolutional VAE encodes daily spatial rainfall fields (from CHIRPS and CCAM gridded datasets) into a regularized 64-dimensional latent space. A GRU-based transition model learns temporal dynamics $p(z_t \mid z_{t-1}, c_t)$ conditioned on day-of-year and climate scenario. Decoding reconstructs the full spatial field.
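
One plausible way to build the conditioning input for the transition model is a cyclic day-of-year encoding concatenated with a one-hot scenario flag. The sketch below is an assumption for illustration: the text does not say how the conditioning is encoded, and the scenario labels and the `condition_vector` helper are hypothetical.

```python
import math

# Hypothetical scenario labels; the source does not name its scenarios.
SCENARIOS = ("historical", "mid_emissions", "high_emissions")

def condition_vector(day_of_year: int, scenario: str, days_in_year: int = 365) -> list[float]:
    """Build a conditioning vector: cyclic day-of-year encoding plus one-hot scenario."""
    one_hot = [1.0 if name == scenario else 0.0 for name in SCENARIOS]
    if sum(one_hot) != 1.0:
        raise ValueError(f"unknown scenario: {scenario!r}")
    # Sine/cosine keep 31 December and 1 January close together in input space.
    angle = 2.0 * math.pi * (day_of_year - 1) / days_in_year
    return [math.sin(angle), math.cos(angle)] + one_hot

c_t = condition_vector(day_of_year=1, scenario="historical")
```

The cyclic encoding avoids an artificial jump at the year boundary, which a raw day index from 1 to 365 would introduce.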

Training uses β-VAE weighting with KL warm-up over 30 epochs to balance reconstruction fidelity against latent structure. Extreme-scenario synthesis is achieved by sampling from the tails of the learned Gaussian prior — points far from the mean correspond to unusual weather states.
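
The KL warm-up can be illustrated with a short schedule. This is a sketch under stated assumptions: the text gives the 30-epoch warm-up but not the shape of the schedule or the final β, so the linear ramp and `beta_max = 1.0` below are placeholders, and the function names are hypothetical.

```python
import math

def kl_weight(epoch: int, warmup_epochs: int = 30, beta_max: float = 1.0) -> float:
    """Linear KL warm-up: 0 at epoch 0, beta_max from `warmup_epochs` onward."""
    return beta_max * min(1.0, epoch / warmup_epochs)

def gaussian_kl(mu: list[float], logvar: list[float]) -> float:
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, logvar))

def beta_vae_loss(recon_error: float, mu: list[float], logvar: list[float], epoch: int) -> float:
    """Reconstruction error plus the warmed-up KL penalty."""
    return recon_error + kl_weight(epoch) * gaussian_kl(mu, logvar)
```

Early in training the KL term contributes little, so the encoder is free to learn useful reconstructions first; the penalty then grows until the latent space is pulled toward the Gaussian prior.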

Results

The model captures orographic effects and spatial dependencies that site-independent methods miss. QQ plots show good distributional agreement with observed rainfall across percentiles. By varying the sampling standard deviation from 0.3 to 1.3 in latent space, the decoder produces a continuous range from drought to extreme monsoon conditions while maintaining physical spatial coherence.
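
The effect of the sampling standard deviation can be sketched without the decoder. The example below assumes the latent vector is drawn from an isotropic Gaussian whose standard deviation is scaled directly. The helper names are hypothetical, and passing the samples through the trained decoder is left out because the decoder is not part of this text.

```python
import math
import random

LATENT_DIM = 64  # latent size stated in the text

def sample_latent(scale: float, rng: random.Random, dim: int = LATENT_DIM) -> list[float]:
    """Draw z ~ N(0, scale^2 I); larger scales place more mass far from the prior mean."""
    return [rng.gauss(0.0, scale) for _ in range(dim)]

def radius(z: list[float]) -> float:
    """Euclidean distance of z from the prior mean (the origin)."""
    return math.sqrt(sum(v * v for v in z))

rng = random.Random(0)
for scale in (0.3, 0.8, 1.3):
    z = sample_latent(scale, rng)
    print(f"scale={scale}: distance from prior mean = {radius(z):.2f}")
```

In 64 dimensions the expected distance from the mean grows roughly in proportion to the scale, so sweeping the scale moves samples steadily outward from typical states toward unusual ones.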