This paper validates water level forecasts for the Gulf of Finland. Daily forecasts produced by four setups of operational, three-dimensional Baltic Sea oceanographic models are evaluated with statistical measures and compared with water level observations at three Finnish stations on the northern coast of the Gulf of Finland. Overall, the operational systems were skillful in forecasting water level variations during the study period from November 1, 2003, to January 31, 2005. The factors causing differences between the water level forecasts of the different models are also discussed. An important task of operational sea level forecasting services is to provide accurate and early information about extreme water levels, both positive and negative surges. Two major winter storms occurred during the study period and caused coastal flooding in the region. According to our analysis, the operational models forecast the rise of water levels during these events rather successfully. Operational forecasts can now provide early warnings of extreme water levels at least 1 day in advance, which may be regarded as a minimum requirement for an operational forecasting system. The models generally performed very well: over 93% of the hourly water level forecasts were within ±15 cm of the observed water levels, and the timing of the water level peaks was accurately predicted. Further discussion and studies on assessing the skill of both operational meteorological and oceanographic forecasts, especially in connection with rare surge events, will be necessary. Skill assessment of operational oceanographic models would be relatively easy if acceptable error limits or a quality system were developed for the Baltic Sea operational models.
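
The 15 cm criterion quoted above can be read as a simple hit rate: the share of hourly forecast and observation pairs whose absolute difference does not exceed the tolerance. The following is a minimal sketch of that calculation, not the procedure used in the study. The function name `fraction_within_tolerance`, the inclusive comparison at exactly 15 cm, and the sample values are all assumptions made for illustration.

```python
from typing import Sequence


def fraction_within_tolerance(
    forecast_cm: Sequence[float],
    observed_cm: Sequence[float],
    tolerance_cm: float = 15.0,
) -> float:
    """Return the share of forecast/observation pairs whose absolute
    difference is at most tolerance_cm (inclusive bound is an assumption)."""
    if len(forecast_cm) != len(observed_cm):
        raise ValueError("forecast and observation series must have equal length")
    if len(forecast_cm) == 0:
        raise ValueError("series must not be empty")
    hits = sum(
        1 for f, o in zip(forecast_cm, observed_cm) if abs(f - o) <= tolerance_cm
    )
    return hits / len(forecast_cm)


# Made-up hourly water levels in cm, for illustration only (not study data).
forecast = [12.0, 30.0, 55.0, 80.0, 41.0]
observed = [10.0, 38.0, 75.0, 84.0, 40.0]
share = fraction_within_tolerance(forecast, observed)
print(f"{share:.0%} of forecasts within 15 cm of the observations")
```

Applied per station and per model setup, a figure of this kind would correspond to the "over 93%" statistic reported here, provided the forecast and observed series are aligned to the same hourly time stamps and the same vertical reference level.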