Abstract

Twenty-seven models participated in the Earth System Model - Snow Model Intercomparison Project (ESM-SnowMIP), the most data-rich MIP dedicated to snow modelling. Our findings do not support the hypothesis advanced by previous snow MIPs: evaluating models against more variables, and providing evaluation datasets extended temporally and spatially, does not facilitate identification of key new processes requiring improvement to model snow mass and energy budgets, even at point scales. In fact, the same modelling issues identified by previous snow MIPs arose: albedo is a major source of uncertainty, surface exchange parametrizations are problematic, and individual model performance is inconsistent. This lack of progress is attributed partly to the large number of human errors that led to anomalous model behaviour and to numerous resubmissions. It is unclear how widespread such errors are in our field and others; dedicated time and resources will be needed to tackle this issue and to prevent highly sophisticated models and their research outputs from being vulnerable to avoidable human mistakes. The design of successive snow MIPs, and the data available to them, were also questioned. Evaluation of models against bulk snow properties was found to be sufficient for some models but inappropriate for more complex snow models, whose skill at simulating internal snow properties remained untested. Discussions between the authors of this paper on the purpose of MIPs revealed varied, and sometimes contradictory, motivations behind their participation. These findings started a collaborative effort to adapt future snow MIPs to respond to the diverse needs of the community.