Abstract

Evaluating decadal climate predictions against observations is crucial to establishing their value to stakeholders. While the skill of such forecasts has been verified for several atmospheric variables, land hydrological states such as terrestrial water storage (TWS) have not yet been investigated extensively because long observational records are lacking. TWS anomalies are observed globally by the satellite missions GRACE (2002–2017) and GRACE-FO (since 2018). Using a GRACE-like reconstruction of TWS spanning 41 years, we demonstrate that this data type can be used to evaluate the skill of decadal prediction experiments from different Earth system models contributed to both CMIP5 and CMIP6. Analysis of correlation and root-mean-square deviation (RMSD) reveals that, for the global land average, the initialized simulations outperform the historical experiments in the first three forecast years. This advantage originates mainly from equatorial regions, where we assume that initialization has a longer-lasting influence because of longer soil memory times. Evaluated for individual grid cells, initialization has a largely positive effect on TWS states in forecast year 1; however, this study could not identify a general grid-scale prediction skill for TWS beyond 2 years for CMIP5. First results from decadal hindcasts of three CMIP6 models indicate a predictive skill of the multimodel mean comparable to that of CMIP5 in general, and a distinct positive influence of the improved soil–hydrology scheme implemented in the MPI-ESM for CMIP6 in particular.
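As a minimal sketch of the two skill metrics named above, the following Python snippet computes the Pearson correlation and the RMSD of two simulated series against a reference series. The function names and the short anomaly series are hypothetical, invented purely for illustration. They are not the data or the processing chain used in the study, which may include steps (for example detrending or ensemble averaging) that are not shown here.

```python
import math


def pearson_correlation(forecast, reference):
    """Pearson correlation coefficient between two equally long series."""
    n = len(reference)
    f_mean = sum(forecast) / n
    r_mean = sum(reference) / n
    cov = sum((f - f_mean) * (r - r_mean) for f, r in zip(forecast, reference))
    f_var = sum((f - f_mean) ** 2 for f in forecast)
    r_var = sum((r - r_mean) ** 2 for r in reference)
    return cov / math.sqrt(f_var * r_var)


def rmsd(forecast, reference):
    """Root-mean-square deviation between two equally long series."""
    squared = sum((f - r) ** 2 for f, r in zip(forecast, reference))
    return math.sqrt(squared / len(reference))


# Invented TWS anomalies (arbitrary units), one value per year.
reference = [1.0, -0.5, 0.3, 1.2, -0.8, 0.1]    # stand-in for the reconstruction
initialized = [0.8, -0.4, 0.5, 1.0, -0.6, 0.0]  # stand-in for an initialized hindcast
historical = [0.2, 0.3, -0.1, 0.4, 0.1, -0.2]   # stand-in for a historical run

for name, series in (("initialized", initialized), ("historical", historical)):
    print(name, pearson_correlation(series, reference), rmsd(series, reference))
```

In this toy setup, an experiment counts as more skillful when its correlation with the reference is higher and its RMSD is lower. The same comparison could be made per forecast year and per grid cell.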