Results of the meteorological model WRF-ARW over Catalonia, using different parameterizations of convection and cloud microphysics

The meteorological model WRF-ARW (Weather Research and Forecasting Advanced Research WRF) is a new generation model that has a worldwide growing community of users. In the framework of a project that studies the feasibility of implementing it operationally at the Meteorological Service of Catalonia, a verification of the forecasts produced by the model in several cases of precipitation observed over Catalonia has been carried out. Indeed, given the importance of precipitation forecasts in this area, one of the main objectives was to study the sensitivity of the model in different configurations of its parameterizations of convection and cloud microphysics. In this paper, we present the results of this verification for two domains, a 36-km grid size and one of 12 km grid size, unidirectionally nested to the previous one. In the external domain, the evaluation was based on the analysis of the main statistical parameters (ME and RMSE) for temperature, relative humidity, geopotential and wind, and it has been determined that the combination using the Kain-Fritsch convective scheme with the WSM5 microphysical scheme has provided the best results. Then, with this configuration set for the external domain, some forecasts at the nested domain have been done, by combining different convection and cloud microphysics schemes, leading to the conclusion that the most accurate configuration is the one combining the convective parameterization of Kain-Fritsch and the Thompson microphysics scheme.


Introduction
In recent years, several limited area meteorological models have been used for operational weather forecasting in Catalonia.The Meteorological Service of Catalonia (SMC), for example, has been working with MASS, MM5 and Lokal Modell .
Recently, under the leadership of the National Center for Atmospheric Research (NCAR), United States, a new limited area model has been developed, the Weather Research and Forecasting Model (WRF). This is a new generation model, one of whose features is its dual role as a research and forecasting model. According to Klemp (2006), one of the WRF objectives is to accelerate the transfer of advances in research to operational meteorology. All these features, along with the fact of being a free distribution model, have attracted a large number of users.
For these reasons, the study of the behavior of the WRF in Catalonia was considered to be useful, especially considering a future operational use. In this context, the practical significance that the field of precipitation has for a variety of users of forecasts, from meteorologists to the general public, should be taken into consideration (Davis et al., 2006). Indeed, in the case of Catalonia, a considerable proportion of the recorded precipitation comes from convective clouds, which represent up to the 70-80% of the total precipitation in summer (Llasat and Puigcerver, 1997), and in the geographical area of the Western Mediterranean, where most episodes of extreme rainfall and floods are caused by convective episodes (Llasat, 2001). Due to the importance of the convective phenomena in Catalonia, and thus the interest in obtaining the highest accuracy in rainfall forecasts, the study focused on assessing the sensitivity of the model to different combinations of convection and cloud microphysics parameterizations, in order to find the most accurate configuration in the prognosis of several variables. Specifically, the model simulations for a set of eleven study cases have been studied, combining the three convective schemes available in the version 2.2 of the WRF with two microphysical schemes: the so called WSM5 (Hong et al., 2004) and that of Thompson et al. (2004).
Therefore, it was decided that the number of possible combinations of physical schemes available in the model had to be limited in order to evaluate a number of case studies and, thus, increase the representativeness of the results with regard to that obtained if only a few were considered. Consequently, no studies of sensitivity to other physical parameterizations that could also have been of great interest for the design of the operational configuration of the WRF in Catalonia have been carried out, especially boundary layer (PBL) and other surface processes schemes. Indeed, Wisse and Vilà-Guerau de Arellano (2004) showed that the PBL schemes had a great impact in the rain fields simulated by the MM5 model in the study of a case of severe convective precipitation also in Catalonia.
Recently, especially in the United States, similar studies have been conducted, which evaluated the sensitivity of the precipitation field predicted by the WRF for different configurations of the physical parameterizations, among which are those of convection and cloud microphysics (Koo and Hong, 2008;Jankov et al., 2007;Gallus and Bresch, 2006). However, works with high-resolution domains are also found, such as those of Kain (2006) or Weisman et al. (2008), which opted for an explicit treatment of convection, so that the convective scheme is left outside sensitivity studies. For Catalonia, no work is known to evaluate the sensitivity of the predictions to the configuration of the WRF model for the variables analyzed in this work. Nevertheless, two recent publications verify the WRF forecasts for several surface variables (such as temperature and humidity at 2 meters or wind at 10 meters), one of which compares the results of the forecasts produced by the two dynamic cores of the model (ARW and NMM) on Europe (Jorba et al., 2008) while the other examines the sensitivity to a set of 23 configurations of the WRF model over the Iberian Peninsula (Borge et al., 2008).
In the following section, the selected study cases are detailed and afterwards there is a description of the configuration of the model used for the execution of simulations.In section 4 the verification methodology is explained and section 5 presents and evaluates the results. The last section is devoted to conclusions and future lines of work.

Selected case studies
To evaluate the model, 11 case studies (Table 1) between 15 June 2006 and 16 March 2007 were selected, during which rainfall was observed over Catalonia, some of them with a well defined convective character.
Two high-impact episodes stand out among them: that of 25 August 2006, when an isolated thunderstorm caused a flood that dragged many cars to the stream of Calella, Maresme, and the episode between 12 and 16 September  2006, when many observatories recorded daily accumulations exceeding 100 mm (Mercader et al., 2007). The same episode of September has also been thoroughly studied by Mateo et al. (2009).

Configuration of the domains
The simulations were run on two domains ( Figure 1) identical to those used at the SMC for operative forecasts of the MM5 model  in order to ease, in the future, the comparison between the performance of WRF and MM5. The characteristics of both domains, the external one, of 36-km grid size, and the internal one, nested to the previous domain, of 12 km, are specified in Table 2. The set of 31 vertical levels defined by default by the WRF-ARW model .

Initial and boundary data
As noted in Table 2, the initial and boundary conditions of the external domain are obtained from the forecasts of the GFS global model (1 • × 1 • horizontal resolution) initialized 12 hours before the start time of the simulation.The initial field, which corresponds to a forecast at 12 hours of the GFS, is improved through the assimilation of observational data (METAR and radiosoundings) with the 3DVAR method (Barker et al., 2004). The boundary conditions of the nested domain are interpolated from the external domain with a pe-riod equal to the timepassing of integration of the external domain.

Configuration of the physical parameterizations
Since the goal of this work is to evaluate the performance of the WRF forecasts based on the parameterization of convection and cloud microphysics, several simulations were performed for each initialization instant keeping fixed parameterizations of radiation, surface layer, subsoil and boundary layer, and varying convective and cloud microphysics schemes. Table 3 summarizes all the options of physics used in the simulations, both variables and fixed. The latter were chosen, on the one hand, taking those with a similar or prior version in the MM5 model that is used in the operational configuration of the SMC, and on the other hand keeping in mind the results of other investigators who have worked with the WRF.
In the case of the boundary layer (PBL), there are 3 schemes available in the 2.2 version of the model: first, the parameterization of the Medium Range Forecast model (MRF) described in Hong and Pan (1996), and also implemented in the MM5 model, but that will be most likely deactivated in the WRF ; secondly, the PBL scheme of Yonsei University (YSU), developed by Hong et al. (2006) as a revised version of the MRF parametrization, and finally the Mellor-Yamada-Janjic (MYJ) scheme. The YSU scheme has been chosen for this study because it is the new generation of the MRF scheme, used by the MM5 model in the SMC operative version. For the surface processes, each parameterization option is linked to a PBL scheme (Skamarock et al., 2005); so for YSU, the parameterization is the one based on the Monin-Obukhov similarity theory originally developed for the MM5 model.
The combination of all variable schemes outlines six possible configurations that can be defined as in Table 4.
Finally, it is important to note that the verification work consisted of two well-defined stages: first, tests to determine the optimum configuration for the external domain were performed, and once it was found, tests for the internal domain were conducted, so that the boundary conditions provided from the external domain came from the selected configuration.

External domain (∆x = 36 km)
In the 36 km grid size domain the temperature, relative humidity, geopotential height and wind speed forecast have been verified by calculating the mean error (ME), that helps to determine whether the model as a whole shows any bias in the prognosis of these variables, and the root mean square deviation (RMSE), which can be interpreted as the typical magnitude of the error (Wilks, 1995), as it has the units of the evaluated variable. The mean vector wind error (MVWE) module has also been determined, which is defined as: where: where N i , N j are the X and Y dimensions of the grid, respectively, u(i, j) and v(i, j) are the components of the wind in the grid point (i, j) and the subscripts F and A refer, respectively, to the fields predicted and analyzed. To calculate all these indexes two methods have been followed: the verification grid to grid and the verification point to point. Specifically, the first method consisted of comparing, for each variable, the predicted values with the values analyzed. The latter were obtained from the interpolation in the external domain of fields from the analysis of the GFS model (FNL-GFS), enhanced with the ingestion of observational data with the WRF-3DVAR. As a result of this comparison, the values of statistical indices for each of the grid points have been calculated, as well as the average values of these indices over the entire external domain.
In the point to point verification, however, the expected values of each variable in the standard vertical levels were checked with regard to the observations provided by radiosoundings from all stations within the domain.
While the grid to grid verification makes it possible to get both global values of the different statistical indices over the entire domain as a spatial distribution of the error, the point to point verification has the advantage of allowing for direct comparison of the output model with observations.

Internal domain (∆x = 12 km)
In the case of the internal domain, the accuracy of forecasts for temperature, relative humidity and wind have been evaluated only through the point to point methodology, and the calculation of the ME and RMSE (also MVWE for the wind) was done with regard to the observed vertical profiles of these variables from three radiosounding stations: in Barcelona, Palma and Zaragoza.
Finally, the quantitative precipitation forecasting, which is the most interesting variable for this work, was also verified by using various statistical indices obtained from a comparison of the predicted field with the field observed. To carry out this verification it was necessary, firstly, to construct an analysis of the rainfall observed over a grid of 32 × 24 points over Catalonia, which has also been interpolated with the predicted field. Then a mask was applied on both fields in order to compare only the area corresponding to Catalonia, as the field analyzed was obtained from rain gauge observations from the Xarxa d'Estacions Meteorològiques Automàtiques (Automatic Weather Stations network, XEMA according to Catalan acronym) of the SMC (Figure 2).
-4.4 (11.7) -4.5 (11.7) -3.8 (11.6) -3.8 (11.2) -4.4 (11.8) -4.6 (11.9) Z 700 -5.3 (12.1) -5.5 (12.3) -5.9 (12.6) -5.5 (12.1) -5.6 (12.5) -5.7 (12.5) (m) 500 -5.6 (14.3) -5.7 (14.4) -6.3 (14.6) -5.9 (13.9) -5.9 (14.5) -6.0 (14. With the fields resulting from applying the mask, the quantitative precipitation forecasts were evaluated from two complementary perspectives: firstly, the classical approach, based on a point to point comparison of the predicted and observed fields and the calculation of various statistical indices from contingency tables, such as the probability of detection (POD), the false alarm ratio (FAR), the bias (BIAS) or the Critical Success Index (CSI), all of them defined for example in Ebert (2008); secondly, the approach based on partial verification (better known as fuzzy) that rewards the closeness between predictions and observations while relaxing the requirement of exact match (Ebert, 2008). In other words, partial or fuzzy verification assumes that a forecast can be equally useful if it is slightly shifted with respect to the observation and this shift is determined by the size of neighboring or window points of the grid around the point of interest. Ebert (2008) collected a range of verification techniques that fall into this latter approach, and classified them according to the matching strategy (as if neighboring is considered only around forecast or also around observation) and of the decision model, which is the criteria for determining whether there is agreement between prediction and observation. Specifically, this work has chosen to implement two partial verification techniques that fall in the second group of agreement strategies, so an area around the observation point is considered to account for the uncertainty associated to the observed field. These are the following: • Minimum coverage: Assume that a forecast is useful if the event is predicted in a minimum number of points within the window. This work takes the most lenient requirement, since if at least at one point inside the window, the event is predicted and observed, it is considered to be a success. Following this criteria, contingency tables can be constructed based on the correspondence between windows and the verification indices used in the classical approach (POD, FAR, BIAS, CSI) can be calculated. This technique therefore allows the use of these indices under the concept of fuzzy verification, and in the case of windows made of a single cell, the classical approximation values are recovered. • Fractions Skill Score (FSS): developed by Roberts and Lean (2008), states that a forecast is useful if the frequency of the events predicted and those observed are similar. The statistical index is calculated according to: where F i and O i are the fraction of events predicted and observed, respectively, in each of the N windows i of a certain size. Therefore, it is considered that an event occurs if rainfall exceeds a certain threshold of intensity. The FSS has a range between 0 and 1 (null and maximum skill, respectively), and for samples that are large enough, tends to grow as we increase the level of verification, determined by the size of the neighboring window. Roberts and Lean (2008) also proposed a threshold value (F SS unif orm ) for this index from which it can be considered that the model has an acceptable accuracy, a fact that allows to determine, if FSS is calculated for different scales, the forecast from which it is sufficiently skilled. F SS unif orm is determined by the FSS index that would be obtained with single cell windows for a forecast in which the fraction in each cell was equal to the fraction of cells of the domain where rain is observed (f o ) so that: where: and N 1 is the total number of cells in the domain.
The verification of precipitation has been done for various forecasting horizons (up to 48 hours) and threshold intensities (between 0.5 mm 6h −1 and 50 mm 1h −1 ).

External domain (∆x = 36 km)
In this section, the results of the verification of the temperature, relative humidity, geopotential altitude and wind forecasts are summarized. Those results were obtained from the two comparison methods described above: grid to grid and point to point. For example, Table 5 shows the main statistical indices to evaluate the 24-hour prognosis (initialized at 00 Z) of these variables at different vertical levels, obtained from the grid to grid verification.
Some of the features found in the analysis of the results of the 36 km verification are common to all variables such as: • More sensitivity to the parameterization of convection at low levels; at high levels, to distinguish the configuration that provides better results becomes more difficult. • At 300 hPa, the forecasts seem to be more sensitive to the cloud microphysics scheme than to the parameterization of convection.
More specifically, the temperature forecast is characterized by a RMSE that increases with the forecast horizon and, as illustrated in Table 5, decreases with altitude, although this latter behavior shows an exception among the highest levels (500 and 300 hPa). A bias with negative sign was also observed at those levels closest to the surface and with positive sign at 300 hPa. Given these indices, the best results on the  Moreover, as seen in Table 5, in any configuration, the geopotential altitude is underestimated throughout the vertical profile, as is the atmospheric pressure reduced to sea level. From the RMSE and ME indexes it is difficult to figure out what is the best setup, but if the forecast of this variable is evaluated with the index S1 (Wilks, 1995), the best results (values not shown) are provided by the setting KF·WSM5.
As for the wind, this is seen as one of the variables less sensitive to the parameterization of convection, and indeed, the MVWE index gives very similar results in all settings (see Table 5).
The relative humidity forecasts are shown as the most sensitive to convective schemes, a feature that has also been detected by, for example, Koo and Hong (2008) in their evaluation of WRF. This variable has a smaller value bias at low levels where all configurations tend to be more humid (see Table 5 and Figure 3). It is also observed that setups with the convective scheme of BMJ are the driest at low levels and the wettest at high levels. The RMSE, in turn, increases with altitude (see Table 5 and Figure 3) but the differences between settings are too small to decide which of them has the best results.
Finally, from the results obtained, it can be concluded that the two settings with the convective schema of Kain-Fritsch provide the best forecasts of temperature, geopotential altitude and wind; between the two, the one that uses the WSM5 microphysics parameterization shows a slight advantage over the other one.

Internal domain (∆x = 12 km)
The results of the verification of the conventional variables on the profiles observed through radiosoundings are shown first, and then the statistical indices of verification of quantitative precipitation forecasts are analyzed.

Conventional variables
Generally, the values of ME and RMSE for temperatures are low, the variable tends to be overestimated at high levels, while its behavior at low levels depends on the radiosounding station with which it is compared. However, it is observed that configurations that include GD and KF convective schemes have a more regular behavior than the rest (not shown).
As for the wind, the MVWE is more elevated at low and high levels than at middle levels, and the configurations that use the parameterizations of KF and BMJ have greater success. Meanwhile, the verification of the relative humidity shows that the mean error (Figures 4a -6a) is low (-10, 15%) in the three locations and at all levels. However, in Barcelona and Palma the model is more humid at low levels and very humid at the highest levels, while in Zaragoza the entire vertical profile is very humid.As for the RMSE, (Figures 4b -6b), its value is increased with altitude at the lower levels, while it tends to decrease between 500 and 300 hPa in Barcelona and Palma.
Finally, it can be concluded that almost all configurations give similar results in its verification.

Verification of quantitative forecast of precipitation
Due to the large number of data to analyze, this verification was done in two steps. In the first phase, traditional statistical indices were analyzed for the more significant hourly accumulation intervals (3 and 6 hours) in order to select a subset, among the 6 combinations available, with the configurations that provide the most satisfactory results.
In the second phase, the selected configurations have been evaluated in greater detail: first, the two partial verification or fuzzy techniques described in Section 4.2 have been applied, in addition to the classical techniques, for several accumulation thresholds over 1, 3 and 6 hours and, secondly, a visual comparison of the predicted and observed fields has been done. This way, it has been decided which settings provide the best forecasts.
In the first selection, configurations using the BMJ convective scheme were discarded, as they show little success when predicting high intensities, as seen in Table 6. However, it should be noted that these settings work well when there is low intensity rainfall. In fact, this behavior has been also detected in other verification studies of rainfall forecast of the WRF. For example, Gallus and Bresch (2006) detected that the parameterization of BMJ tends to give a high BIAS for low intensity thresholds and a small BIAS for higher thresholds, and according to Jankov et al. (2007), the BMJ scheme has a tendency, first, to generate large areas of weak precipitation, giving rise to such high values of BIAS that can be associated with high scores in some indexes of success, and secondly, tends to underestimate higher accumulations of precipitation.
Among the other four combinations (with the KF and GD convective parameterizations), it is difficult to discern which one works better, but only those that are combined with the Thompson microphysics are selected, because their results in the verification of precipitation, although similar, are slightly better than those provided by the combinations with WSM5. At this point, the choice of two configurations that are analyzed more in depth already involves a discrepancy between the microphysics parameterizations chosen for the external domain and the internal domain, which cannot be interpreted as a proof that different configurations give better results depending on the domain where applicable. To make such a statement it would be necessary to apply parallel verification strategies in both domains and also consider the field of precipitation at 36 km. In this sense, the main goal for the internal domain was to have some optimal precipitation forecasts, while in the external domain, which pro-vides the boundary conditions to the internal domain, the main goal was that conventional variables were correctly predicted.
After the first selection, KF·Thom and GD·Thom configurations can be further analyzed. If we observe the temporal evolution of POD and CSI indices, calculated under the classical approach, we detect two maxima at 18 and 36 hour forecast horizons (see Figure 7) in the simulations that are initialized at 00 Z, so that they correspond to intervals ranging between 12 Z and 18 Z of the first day and between 06 Z and 12 Z of the second day, both during daylight hours, during which convective activity is more important. These two ranges of maximum accuracy, each separated in time, suggest an influence of the larger scale atmospheric systems, through the forecast used to feed the boundary conditions on the convective activity simulated by the model between 30 and 36 hour forecasting horizons.
Moreover, the comparison of these two configurations shows that KF·Thom has a better accuracy than GD·Thom for almost all forecast horizons (Figure 7) as its higher POD index corresponds to a FAR index identical or inferior, a circumstance that gives rise to a higher CSI. However, both configurations show that in periods when POD and CSI indices  increase, FAR index decreases, which shows good performance, in general, of the precipitation forecasts.
If these indices are calculated from the partial verification approach with the agreement model of minimum coverage, we can clearly see how the relative performance between both configurations is maintained for most thresholds although the spatial coincidence between predictions and observations is relaxed, expanding the verification window to squares of 3 and 5 side cells. Figure 8, to name one example, shows the CSI index corresponding to predictions of rainfall accumulated in 6 hours for the period between 36 and 42 hour forecast horizons, and finds, as to be expected, that its value increases as the criteria of correspondence is relaxed. However, the values of the KF·Thom configuration remain higher on most scales and thresholds.
The contrasted evaluation of precipitation forecasts given by the two configurations can be completed by analyzing the given values for the FSS index, which is also part of the partial verification techniques or fuzzy defined above. This statistic makes it possible to determine the scale from which a forecast has an acceptable accuracy, which is met when FSS exceeds the value of F SS unif orm (Equation 4). From the values of this index, how the configuration of KF·Thom tends to overcome the F SS unif orm for scales smaller than GD·Thom is detected and, in most cases, it gives higher FSS values for the same scales. At larger scales, the FSS values of KF·Thom are closer to the unity, indicating a bias of smaller magnitude. Figure 9 shows an example of the FSS index values of both configurations for the 6 hour interval corresponding to the morning of the second forecast day, and the features that have just been mentioned about the relative behavior of the two models can be observed, with the exception of the higher intensity threshold (50 mm 6h −1 ), for which GD·Thom makes a clearly better forecast.
As for the visual verification of the precipitation field (not shown), some homogeneity between the two configura- Figure 10. For the configuration of KF + Thompson, CSI index, calculated from the classical approach, over all simulations initialized at 00 Z, for each 6-hour forecast interval, depending on the accumulation threshold.
tions in the precipitation forecast in the NW quadrant of Catalonia (which includes one of the rainfall maxima of the region) has been detected, and also at the whole of the domain in those cases in which the explicit precipitation represents an important part of the total rainfall predicted; these latter situations are mostly associated with the passage of frontal systems over the study area. However, when the simulated precipitation is mostly of convective origin, configurations differ substantially in areas of the NE quadrant and the precoastal and coastal central area, with differences in rainfall patterns that are stressed as the forecast horizon is extended.
In short, it can be concluded that the two selected configurations show a satisfactory accuracy, but the three verification methods used to compare them confirm that the combination of the Kain-Fritsch convective parameterization with the Thompson microphysical scheme shows the most consistent behavior and the most accurate forecasts.
At this point, it is interesting to characterize the particular behavior of this configuration representing the CSI index, calculated with the classical approach, depending on the threshold of intensity for each 6-hour interval, as shown in Figure 10. For weak rainfall, it is observed that the best forecast is obtained between 12 Z and 18 Z on the first day, for moderate rainfall (3 to 15 mm in 6 hours) between 06 Z and 12 Z on the second day and the best forecast for more intense rainfall (30 mm 6h −1 ) is obtained between 12 Z and 18 Z on the second day.

Conclusions
In this paper the sensitivity of the WRF-ARW 2.2 model to the combinations of different convective parameterizations available in the model with the microphysics schemes WSM5 and Thompson has been evaluated in 11 case studies distributed along different seasons. The main purpose was to find a stable configuration of the model for operational forecasts at the SMC.
Both in the external (36 km grid size) and the internal domain (12 km), the conventional variables were verified through the calculation of the indices ME and RMSE, and MVWE for wind. The best results for the external domain were obtained with the configuration that combines the parameterization of Kain-Fritsch convection with the WSM5 microphysical scheme. For the internal domain no highlighted setting has been found in the forecast of the same variables verified in the external domain, but after evaluating the quantitative precipitation forecast with classical statistical indices (POD, CSI, FAR and BIAS), the two configurations with better results have been selected (those that combine Kain-Fritsch and Grell-Devenyi with Thompson) and a more thorough verification has been done, in which two partial verification of fuzzy techniques (minimum coverage and Fractions Skill Score) have been used to complement the classical methodology. The analysis of these results showed that the configuration with the best success combines Kain-Fritsch with Thompson microphysics.
However, it must be said that the significance of these results has certain limitations. First, the case studies correspond to a particular time period (2006 and 2007) and interannual meteorological variability, particularly with regard to precipitation, is extremely high in the region studied. On the other hand, the analysis of the forecast sensitivity to the parameterizations of the boundary layer available at the model have been left out of the scope of this paper, although they play a key role in the simulation of convective processes.
The most immediate practical application of this work is the implementation of the two selected settings (KF·WSM5 for the external domain and KF·Thom for the nested domain) in the operational forecasts of the WRF at the SMC. This will later allow for the verification of the WRF forecasts in contrast with the other models that currently run operationally at the SMC.
Moreover, a starting point is established in order to continue working with the WRF-ARW model, with the aim of improving operational forecasts made by the SMC. On the one hand, as this is a model in constant evolution, a verification of the forecasts produced by the updated versions of the model will have to be pursuit, and more sensitivity studies should be carried out if new parameterizations are incorporated, which should include PBL processes that have been left out of this work. On the other hand, since the convective parameterization at the WRF is the one that gave the best results in the area of interest, a new way to investigate in depth the Kain-Fritsch scheme is open, while performing studies of the forecasts sensitivity to various internal parameters of the scheme in order to introduce changes that result in an improvement of the forecasts of convective rainfall over Catalonia.
of the work and comments provided by two anonymous reviewers who helped to improve the quality of the article.