Abstract
Modelling studies are evaluated by comparing simulation outputs to an observational reference. In climate science, the number and complexity of models and the volume of data have led the community to develop standardised methods and automated tools, such as the Climate Variability Diagnostics Package and the ESMValTool. However, these tools are mostly designed to evaluate simulations of the instrumental period. Different methods are required to compare palaeoclimate simulations to palaeodata. For example, new variables such as vegetation, ice sheet extent, and isotopic ratios are now being modelled and used for evaluation. Changing boundary conditions in transient simulations further complicate the evaluation process: traditional indices that characterise circulation (e.g. monsoon) or modes of variability (e.g. NAO, ENSO) need to be adapted, while new ones are needed to investigate longer-timescale modes and abrupt events. Finally, the palaeodata themselves present challenges: various types of uncertainty, a complex relationship to climate variables, and a spatio-temporal representativeness that differs from that of model outputs. Here, we summarise the challenges of model-data comparison in palaeoclimate studies. We then review some of the methods and tools already developed by the community, such as biome comparison and Bayesian approaches to quantifying model-data deviation. Finally, we discuss the implementation of an evaluation framework that aims to provide the community with both adaptable tools and automated, standardised analyses.