Prediction of Childhood Asthma Using Conditional Probability and Discrete Event Simulation

Asthma prevalence in children and adolescents in Spain is 10-17%. It is the most common chronic illness during childhood. Prevalence has been increasing over the last 40 years and there is considerable evidence that, among other factors, continued exposure to cigarette smoke results in asthma in children. No statistical or simulation model exists to forecast the evolution of childhood asthma in Europe. Such a model needs to incorporate the main risk factors that can be managed by medical authorities, such as tobacco (OR = 1.44), to establish how they affect the present generation of children. A simulation model for childhood asthma using conditional probability and discrete event simulation was developed and validated by simulating realistic scenarios. The parameters used for the model (input data) were those found in the bibliography, especially those related to the incidence of smoking in Spain. We also used data from a panel of experts from the Hospital del Mar (Barcelona) related to actual evolution and asthma phenotypes. The simulation results established a threshold of 15-20% smoking prevalence in the population for a reduction in the prevalence of asthma. This is still far from the current level in Spain, where 24% of people smoke. We conclude that more effort must be made to combat smoking and other childhood asthma risk factors, in order to significantly reduce the number of cases. Once completed, this simulation methodology can realistically be used to forecast the evolution of childhood asthma as a function of variation in different risk factors.


INTRODUCTION
Asthma, which affects between 10% and 17% of children and adolescents in Spain [1], especially in cities, is a chronic inflammatory alteration of the airway with consequent narrowing of the bronchial conducts, leading to a feeling of breathlessness, coughing and difficulty breathing. It has become the leading cause of consultation in emergency hospitals and primary health centres in Spain at state level. The prevalence of childhood asthma makes this disease the most common chronic condition in childhood and adolescence; it is therefore essential to predict its future evolution. Childhood asthma has still not been well defined and demarcated, which makes it difficult to study its epidemiology, diagnosis and treatment. Doctors have always had doubts about establishing the diagnosis of bronchial asthma in a child who has bronchial obstructive episodes.
In Spain, there are no data on the evolution of the prevalence of asthma in children. However, it is considered that, as in surrounding countries, the prevalence must be rising [2]. A mean prevalence of childhood asthma of ten per cent has been observed in western countries, but there is great variability among countries. This prevalence appears to have been increasing over the last 30-40 years, although there is debate on whether this is a real increase in prevalence or whether an increasing number of children are being diagnosed with asthma. Asthma is the leading cause of hospitalization in children and the leading cause of school absenteeism due to chronic illness.
Currently, there is no doubt that asthma is genetically determined. Its mode of transmission (inheritance) is polygenic (several genes on several chromosomes), which explains why children of parents with asthma may or may not have asthma and why, among those who have the condition, the severity and presentation vary [3][4][5].
Nevertheless, the temporal nature of this phenomenon (the rapid rise in incidence and the differences in prevalence in countries with similar genes) indicates that environmental causes are important risk factors. Several such factors have been proposed to explain asthma: the frequency and intensity of viral and bacterial infections, vaccinations or exposure to pollutants and allergens inside and outside homes, including NO2 and passive smoking. To date, the role of each of these factors has not been clearly identified, as most research has involved cross-sectional studies whose basic subjects were selected on the basis of risk factors, such as allergies or asthma in adults. Figure 1 presents a summary of the different risk factors implicated in the onset and progression of asthma in children from birth to adulthood [5].
There is considerable evidence that continuing exposure to cigarette smoke results in the induction of asthma in children. Two large cross-sectional studies in the USA involving a total of about 8,000 children and adolescents resulted in odds ratios (OR) of approximately two for the presence of asthma with parental smoking [6] or maternal smoking of more than 10 cigarettes a day [7]. In a longitudinal investigation of asthma incidence among 774 children up to 5 years of age at entry, Martínez et al. [8] reported a relative risk of 2.5 (95% CI = 1.4-4.6) when maternal smoking exceeded 10 cigarettes/day and the mother had at most a high-school education. The US EPA, see [9], and other studies reviewed this research and stated that the evidence leads to the conclusion that environmental tobacco smoke (ETS) is a risk factor for inducing new cases of asthma. The evidence is suggestive of a causal association, but is not conclusive [10]. The association of exposure to ETS with childhood asthma was examined in different studies, concluding that exposure to ETS is associated with wheezing symptoms, medical therapy for wheezing and wheezing-related emergency department visits in US and Canadian children [11].
Tobacco is the leading cause of preventable death in developed countries and also the most significant cause of years of life lost prematurely and years lived with disability in Spain [12]. Smoking prevention and the fight against tobacco consumption are primary objectives of health policies in the international community and consequently in Europe and Spain. There is continuous implementation of smoke-free legislation and information campaigns against smoking, particularly in relation to Spanish law 28/2005. A decreasing exposure to ETS has been observed in the last 10 years and the benefits to the population are well-known and have been studied in other neighbouring countries such as Italy [13]. This study is interested in how smoke-free policy can be an effective strategy for reducing childhood asthma in the next generation. In particular, tobacco control through legislation in Spain in 2008 for promoting a smoke-free environment is a recommended strategy for reducing the prevalence of smoking and consequently ETS in the community [20].
Furthermore, there are no statistical and/or simulation models that can be used to forecast the evolution of the prevalence of childhood asthma in the European context. Such models should incorporate the main risk factors, such as ETS. The bibliography only contains the work of Balemans et al. [21] on predicting asthma in young adults using childhood characteristics and the development of a prediction rule using multivariate statistics.
The aim of this study was to create a discrete event simulation (DES) model which would reproduce the current situation of childhood asthma in Spain, in order to reproduce this pattern from birth to adolescence. This model included tobacco, one of the main risk factors for asthma, so as to be able to forecast the likelihood of the illness in the future as a function of ETS and to examine how asthma prevalence is affected by different levels of parental smoking prevalence, which has been decreasing in Spain in recent years. The consequences of parental smoking on childhood asthma were also discussed. This model includes a probabilistic interrelation between the disease and the risk factor. It also reproduces the distribution of asthma-related episodes of wheezing, and considers the phenotype distribution in the time horizon.

Simulation Models in Epidemiology
Data found in the bibliography were used as parameters of the model (inputs). In addition, data from a panel of experts in a hospital in Barcelona were used. The model input was as follows: 1) childhood asthma prevalence reported in the bibliography (10-17% in Spain); 2) level of ETS or parental tobacco prevalence as a risk of asthma, using OR = 1.44 (95% CI 1.27-1.64) (see [10]) or OR = 2 [6,7]; 3) parental smoking prevalence (%) among the Spanish population, reviewed from several years before the application of the 2005 Spanish anti-tobacco law up to now (the last date was in 2007) [20].
The studies offer different values of asthma risk (OR) on the basis of tobacco. Primarily, the ETS and prevalence of habitual consumption of tobacco by parents was considered. As it is easier to use the prevalence of tobacco consumed (daily consumption), this was taken as the main risk factor in the model. The time horizon of the model was 11 years.

Probabilistic Basis of the Model: OR and Risk Ratio as Conditional Probability
To examine the implications of smoking as a risk factor in the development of childhood asthma in a given population, we need to plan a simulation study. After selecting a sample of subjects, the likelihood of developing the disease in the study population is $P(D)$. This probability increases if the risk factor (F) is present, which occurs with probability $P(F)$, until reaching the conditional probability $P(D/F)$. If the risk factor is not present, the probability is $P(D/\bar{F})$. $P(D/F)$, $P(D/\bar{F})$ and $P(D \cap F)$ were not reported in the bibliography for the Spanish population.
In epidemiology, the association between a risk factor or protective factor (exposure) and a disease may be evaluated by the "risk ratio" (RR) or the "odds ratio" (OR). Both are measures of "relative risk", the general concept of comparing disease risks in exposed vs. unexposed individuals [22][23][24]. The odds ratio (OR) is a measure of effect size which allows the prevalence of the disease to be linked with the conditional probability of the disease and the factor. OR is defined as a combination of conditional probabilities:

$$OR = \frac{P(D/F)\,/\,\bigl(1 - P(D/F)\bigr)}{P(D/\bar{F})\,/\,\bigl(1 - P(D/\bar{F})\bigr)}$$

The total population studied ($n = a + b + c + d$) and the relation between disease events and factor presence can be represented in a DxF 2 by 2 table, as follows:

                      F present    F absent    Total
    D present             a            b       a + b
    D absent              c            d       c + d
    Total               a + c        b + d       n

Where:

$$P(D) = \frac{a + b}{n}$$ is the marginal probability of disease (2),

$$P(F) = \frac{a + c}{n}$$ is the marginal probability of exposed subjects (4).

OR can be defined as:

$$OR = \frac{a/c}{b/d} = \frac{a\,d}{b\,c}$$

Where $a/c$ are the exposed subjects' odds of acquiring the disease and $b/d$ are the unexposed population's odds of acquiring the disease. An odds ratio of 1 implies that the event is equally likely in both groups. An OR greater than one implies that the event is more likely in the first group, $P(D/F) > P(D)$. An OR of less than one implies that the event is less likely in the first group, $P(D/F) < P(D)$.
On the other hand, using the conditional probability theorem:

$$P(D/F) = \frac{P(D \cap F)}{P(F)}, \qquad P(D/\bar{F}) = \frac{P(D \cap \bar{F})}{P(\bar{F})} \qquad (7)$$

Where $P(D \cap F)$ corresponds to the joint probability of having the disease D and presenting the risk factor F. This allows us to calculate the conditional probabilities of interest, $P(D/F)$ and $P(D/\bar{F})$. $P(D \cap F)$ corresponds to $a/n$ in the previous DxF 2 by 2 risk table.
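The relations between the 2 by 2 table, the marginal probabilities, the OR and the conditional probabilities can be checked with a few lines of code. The following is a minimal Python sketch; the class name `TwoByTwo` and the counts used are hypothetical and chosen only for illustration, they are not data from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of a disease (D) by risk factor (F) 2 by 2 table."""
    a: float  # D present, F present
    b: float  # D present, F absent
    c: float  # D absent, F present
    d: float  # D absent, F absent

    @property
    def n(self) -> float:
        return self.a + self.b + self.c + self.d

    def p_disease(self) -> float:
        # Marginal probability of disease, P(D) = (a + b) / n
        return (self.a + self.b) / self.n

    def p_factor(self) -> float:
        # Marginal probability of exposure, P(F) = (a + c) / n
        return (self.a + self.c) / self.n

    def odds_ratio(self) -> float:
        # Odds of disease in the exposed divided by odds in the unexposed
        return (self.a / self.c) / (self.b / self.d)

    def p_disease_given_factor(self) -> float:
        # P(D/F) = P(D and F) / P(F) = (a / n) / ((a + c) / n)
        return self.a / (self.a + self.c)

    def p_disease_given_no_factor(self) -> float:
        # P(D/not F) = (b / n) / ((b + d) / n)
        return self.b / (self.b + self.d)


# Hypothetical counts, for illustration only
table = TwoByTwo(a=40, b=60, c=260, d=640)
print(table.p_disease(), table.p_factor(), table.odds_ratio())
print(table.p_disease_given_factor(), table.p_disease_given_no_factor())
```

With these counts, weighting the two conditional probabilities by $P(F)$ and $1 - P(F)$ recovers $P(D)$, which is the law of total probability used implicitly in the model.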
However, the calculation of $P(D/F)$ and $P(D/\bar{F})$ using (7) is not possible if only the following are known: the prevalence of the disease $P(D)$, the prevalence of the risk factor $P(F)$ and the OR. $P(D/F)$ and $P(D/\bar{F})$ can be numerically calculated using a simulation of 3 uniform random variables, $y_1 \sim U(1, 10000)$, $y_2 \sim U(1, 10000)$ and $y_3 \sim U(1, 10000)$, to generate $2 \times 10^6$ values in order to obtain $a$, $b$ and $d$ values under the restrictions outlined in the previous 2 by 2 table (disease x risk) and in (2), (3), (4), (5) and (7), with $P(D) = 0.1$-$0.17$. Finally, we consider that the event asthma (D) in the simulated population follows a Bernoulli distribution [$y^* \sim \mathrm{Be}(Z_1)$ if $F = 1$, $y^* \sim \mathrm{Be}(Z_2)$ if $F = 0$], with the conditional probability given the risk factor, tobacco (F), as $P(D/F)$.
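The numerical search described above can be sketched as a simple rejection procedure. The Python code below is a sketch under stated assumptions, not the authors' R algorithm: it assumes that $a$, $b$ and $d$ are drawn uniformly on 1..10000, that $c$ is derived from the OR so that the odds ratio holds exactly, and that a draw is accepted when both marginal probabilities fall within a tolerance of 0.005 of their targets. The function name `search_conditional_probs` and its parameters are hypothetical.

```python
import numpy as np


def search_conditional_probs(p_d, p_f, odds_ratio, eps=0.005,
                             n_draws=2_000_000, seed=0):
    """Rejection search for 2 by 2 tables matching P(D), P(F) and OR.

    Assumption: a, b and d are uniform on 1..10000 and c is derived from
    OR = (a * d) / (b * c); the source does not spell out this step.
    """
    rng = np.random.default_rng(seed)
    a = rng.integers(1, 10_001, size=n_draws).astype(float)
    b = rng.integers(1, 10_001, size=n_draws).astype(float)
    d = rng.integers(1, 10_001, size=n_draws).astype(float)
    c = a * d / (b * odds_ratio)              # enforces the odds ratio exactly
    n = a + b + c + d
    # Keep only tables whose marginals are within eps of the targets
    keep = (np.abs((a + b) / n - p_d) <= eps) & (np.abs((a + c) / n - p_f) <= eps)
    z1 = a[keep] / (a[keep] + c[keep])        # P(D/F)
    z2 = b[keep] / (b[keep] + d[keep])        # P(D/not F)
    return z1, z2


# Illustrative targets taken from the ranges quoted in the text
z1, z2 = search_conditional_probs(p_d=0.10, p_f=0.30, odds_ratio=1.4)
print(len(z1), z1.max(), z2.mean())
```

Only a small fraction of the draws is accepted, which is consistent with the large number of generated values mentioned in the text; the accepted tables can then be summarised, for example by the maximum of the $P(D/F)$ values.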

Phenotype Sub-Model and Wheezing Time Event Distribution: Discrete Event Simulation (DES)
The discrete event simulation (DES) paradigm [25,26] was used to construct the simulation model. DES is one way of building models to observe the time-based (or dynamic) behaviour of a system, based on a probabilistic model using random variables of a disease associated with a risk factor. In DES, the operation of a system is represented as a chronological sequence of probabilistic events (asthma [D] or wheezing event [W] in the present study, associated with risk event [F]). Each event occurs at an instant in time and marks a change of state in the system. If a paediatric asthma episode is simulated, an event could be D, according to a Bernoulli distribution, or W, according to a two-parameter Weibull time distribution, $t \sim \mathrm{We}(\cdot, \cdot)$.
Moreover, the simulation model takes into account that a child with asthma (D) may have an average of 5 episodes of wheezing (W) per year. Asthma is considered as 3 or more episodes of breathlessness and wheezing per year. Thus, in the simulation model we considered, for children with asthma,

$$W \sim N(5, 3)$$

Where $N(5, 3)$ is the Normal (Gaussian) distribution with an average of 5 W episodes, a standard deviation of 3 and positive truncation of values; and, for children without asthma,

$$W \sim U(0, 2)$$

Where $U(0, 2)$ is the uniform distribution between 0 and 2 W episodes.
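The rule for the yearly number of wheezing episodes can be illustrated as follows. This is a minimal Python sketch under two assumptions that the text does not state: $U(0,2)$ is read as a discrete uniform on 0, 1 and 2 episodes for children without asthma, and the truncated Normal draws for children with asthma are rounded up to whole episodes. The function name `yearly_wheezing_episodes` is hypothetical.

```python
import numpy as np


def yearly_wheezing_episodes(has_asthma, rng):
    """Draw yearly wheezing counts W for an array of children."""
    has_asthma = np.asarray(has_asthma, dtype=bool)
    # Children without asthma: discrete U(0, 2), i.e. 0, 1 or 2 episodes
    w = rng.integers(0, 3, size=has_asthma.shape)
    n_asthma = int(has_asthma.sum())
    draws = np.empty(0)
    # Rejection sampling: keep only positive N(5, 3) values (positive truncation)
    while draws.size < n_asthma:
        cand = rng.normal(5.0, 3.0, size=max(2 * n_asthma, 16))
        draws = np.concatenate([draws, cand[cand > 0]])
    # Assumption: counts are rounded up to whole episodes
    w[has_asthma] = np.ceil(draws[:n_asthma]).astype(int)
    return w


rng = np.random.default_rng(1)
flags = np.array([True] * 1000 + [False] * 1000)   # illustrative asthma status
w_demo = yearly_wheezing_episodes(flags, rng)
print(w_demo[flags].mean(), w_demo[~flags].mean())
```

Rejection sampling is used here only because it is short and transparent; any other truncated-Normal sampler would serve the same purpose.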
In addition, there was a constant risk (OR) in the incidence of smoking on the various phenotypes of asthma, supposing a prevalence of 80, 15 and 5% for transient wheezing (Phenotype 1), wheezing / non atopy (Phenotype 2) and wheezing / atopy phenotypes (Phenotype 3) respectively. To increase the sensitivity of results and compare the disparity of OR in the bibliography, OR = 1.4 and 2 were used. Thus, in the simulation model we considered a Weibull-distributed time for each phenotype, $t_i \sim \mathrm{We}(\cdot, \cdot)$, $i = 1, 2, 3$, where $t_1$, $t_2$ and $t_3$ are random variables allowing different W episodes associated with D to be generated as a function of phenotype. $t_1$, $t_2$ and $t_3$ were adjusted empirically using the median of values for 3, 5 and 10 years old, truncated to 11 years old, for simulated wheezing episodes (W) in the three phenotypes contemplated. A synthesis of the DES results is presented in Figure 3, which includes D, F, P(D/F), W, t, r and phenotypes.
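The event-driven part of the model, with phenotype assignment and Weibull-distributed times between wheezing episodes truncated at the 11-year horizon, can be sketched with a standard future event list. The Python code below is only an illustration: the 80/15/5% phenotype split and the 11-year horizon come from the text, whereas the Weibull scale and shape values in `WEIBULL_PARAMS` are hypothetical placeholders, since the fitted parameters are not given here.

```python
import heapq
import random

PHENOTYPES = (1, 2, 3)
PHENOTYPE_WEIGHTS = (0.80, 0.15, 0.05)   # prevalence split stated in the text
# Hypothetical Weibull (scale in years, shape) per phenotype; not from the source
WEIBULL_PARAMS = {1: (1.5, 1.2), 2: (0.8, 1.0), 3: (0.4, 0.9)}
HORIZON_YEARS = 11.0


def simulate_wheezing_events(n_children, seed=0):
    """Return (phenotype list, chronologically ordered wheezing events)."""
    rng = random.Random(seed)
    phenotypes = rng.choices(PHENOTYPES, weights=PHENOTYPE_WEIGHTS, k=n_children)
    queue = []  # future event list holding (time, child_id)
    for child, ph in enumerate(phenotypes):
        scale, shape = WEIBULL_PARAMS[ph]
        heapq.heappush(queue, (rng.weibullvariate(scale, shape), child))
    events = []
    while queue:
        t, child = heapq.heappop(queue)       # next event in simulated time
        if t > HORIZON_YEARS:
            continue                          # truncated at the time horizon
        events.append((t, child, phenotypes[child]))
        scale, shape = WEIBULL_PARAMS[phenotypes[child]]
        # Schedule this child's next wheezing episode
        heapq.heappush(queue, (t + rng.weibullvariate(scale, shape), child))
    return phenotypes, events


phenotypes, events = simulate_wheezing_events(n_children=2000, seed=1)
print(len(events), events[0])
```

Because events are always taken from the heap in time order, the resulting list is already the chronological sequence that a DES engine processes.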
We used simulation to generate 10,000 individuals and 500 replicates of each simulation scenario. This series of simulation scenarios was created to study the consequences of tobacco exposure on future levels of childhood asthma, as well as to make the model sensitive and robust. Replicates of the simulation were generated for each experiment and 95% confidence intervals of the prevalence of asthma were calculated among the simulated individuals.
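The replication scheme can be sketched as follows: in each replicate, exposure F is drawn for every simulated child, asthma D is then drawn from a Bernoulli distribution whose probability depends on F, and the prevalence is summarised across replicates. This Python sketch uses hypothetical names (`simulate_prevalence`) and illustrative input values for $Z_1$, $Z_2$ and $P(F)$ that are not results from this study; the number of replicates is left as a parameter.

```python
import numpy as np


def simulate_prevalence(z1, z2, p_f, n_children=10_000, n_reps=100, seed=0):
    """Replicated Bernoulli simulation of asthma prevalence.

    z1 = P(D/F), z2 = P(D/not F), p_f = parental smoking prevalence.
    Returns the mean prevalence and a Gaussian 95% CI for that mean.
    """
    rng = np.random.default_rng(seed)
    prevalences = np.empty(n_reps)
    for r in range(n_reps):
        exposed = rng.random(n_children) < p_f          # F ~ Be(P(F))
        p_asthma = np.where(exposed, z1, z2)            # Be(Z1) if F=1, else Be(Z2)
        prevalences[r] = np.mean(rng.random(n_children) < p_asthma)
    mean = prevalences.mean()
    half = 1.96 * prevalences.std(ddof=1) / np.sqrt(n_reps)
    return mean, (mean - half, mean + half)


# Illustrative inputs only (not values reported in the source)
mean_prev, ci = simulate_prevalence(z1=0.122, z2=0.090, p_f=0.30)
print(mean_prev, ci)
```

By the law of total probability the simulated mean should lie close to $Z_1 P(F) + Z_2 (1 - P(F))$, which gives a quick sanity check on any scenario.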
The simulations to obtain $Z_1 = P(D/F) = f(P(D), P(F), OR)$ and $Z_2 = P(D/\bar{F}) = f(P(D), P(F), OR)$ were carried out in R, and the simulation of the DES model was performed with SAS v.9.1 (SAS Institute Inc.) and R v. 2.1.0 (R Development Core Team).

RESULTS
A DES simulation model for childhood asthma was validated with respect to data in the bibliography, with a prevalence of asthma of around 10% to 17%, based on published studies [1, 3, 5, 14]. The results of 14 different simulation scenarios with 10,000 simulated children and 100 replications of each scenario are presented in Table 1.
The scenarios reflect the behaviour of the prevalence of childhood asthma and allow it to be observed throughout childhood and for different phenotypes. The scenarios also cover the following changes in smoking prevalence: a rise in daily tobacco consumption, from 32.9% in 1997 to 35.1% in 2001; a reduction of 5-10% of daily smokers in Spain (from 35.1% in 2001 to 26.4% in 2006, during the first application of Spanish law 28/2005); and a reduction to 23.7% in 2007 after complete application of Spanish law 28/2005. We also carried out simulations for scenarios of 20, 15 and 10% smoking prevalence. Each value of tobacco prevalence was taken into account for OR = 1.4 and OR = 2.
The rows of Table 1 correspond to the input model data: 1) experiment number, 2) childhood asthma prevalence reported in the bibliography, P(D), 3) OR for the risk of asthma associated with parental smoking, 4) parental smoking prevalence (%), P(F). They also include the output model data: 6) total childhood prevalence of asthma at 0-11 years (%) (95% CI), 7) prevalence of childhood asthma for Phenotype 1, 8) prevalence of childhood asthma for Phenotype 2, 9) prevalence of childhood asthma for Phenotype 3, and 10) empirical P(D), P(F), P(D/F) and P(D/NF), which were estimated using the multivariate function $Z_1$ (Figure 2).
The results of the simulated prevalence of asthma, presented in Table 1, are summarised in Figure 4, where the different values of real smoking prevalence (X) are represented in comparison to the asthma prevalence obtained by simulation (Y) for OR = 1.4 and OR = 2. The curves were fitted using a third-order inverse polynomial model and are shown with their $R^2$ values, where $f(x)$ = % asthma prevalence and $x$ = % smoking prevalence.
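A curve fit of this kind can be reproduced with ordinary least squares, because an inverse polynomial is linear in its coefficients. The sketch below assumes the common form $f(x) = y_0 + a/x + b/x^2 + c/x^3$ for a third-order inverse polynomial; the exact parameterisation used for Figure 4 is not stated here. The y values are synthetic, generated from arbitrary known coefficients purely to exercise the fit, and are not simulation results from this study.

```python
import numpy as np


def fit_inverse_cubic(x, y):
    """Least-squares fit of f(x) = y0 + a/x + b/x**2 + c/x**3 and its R^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # The model is linear in (y0, a, b, c), so a design matrix suffices
    design = np.column_stack([np.ones_like(x), 1 / x, 1 / x**2, 1 / x**3])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot


# x: smoking prevalence levels (%) mentioned in the text
x_demo = np.array([10.0, 15.0, 20.0, 23.7, 26.4, 32.9, 35.1])
# y: SYNTHETIC values from arbitrary coefficients, not results of the study
true_coef = np.array([13.0, -5.0, 40.0, -300.0])
y_demo = (true_coef[0] + true_coef[1] / x_demo
          + true_coef[2] / x_demo**2 + true_coef[3] / x_demo**3)
coef, r2 = fit_inverse_cubic(x_demo, y_demo)
print(coef, r2)
```

With noiseless synthetic data the fit recovers the generating coefficients, which confirms that the design matrix matches the assumed model form.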
As shown in Figure 4 and the results obtained in Table 1, the prevalence of asthma obtained by simulation varied from 14.5-16% for OR = 2 and 12-13% for OR = 1.4. Thus, between 1997 and 2007 the prevalence of asthma either increased or fluctuated randomly around a stable level for the curves OR = 2 and OR = 1.4, as reflected in the bibliography. The curves fitted to the simulation values show that the slope was negative at smoking prevalences below 20% for OR = 2 and below 15% for the curve OR = 1.4.
The distribution of phenotypes of simulated asthma conforms to the 80%:15%:5% model in the literature, as shown in Table 1, which also indicates the 95% confidence interval.
The sub-distribution of asthma phenotypes and the distribution of wheezing events were not studied.However, they may be useful in a future improved model, in which the prevalence of tobacco for each different phenotype could be considered, or the effect of tobacco on wheezing events.

DISCUSSION
Childhood asthma is genetically determined, but factors that make some individuals develop asthma early are currently the subject of research. Environmental causes are important risk factors for asthma. It is important to discover these causes, in order to reduce the prevalence and incidence of asthma, which are currently increasing.
No statistical or simulation models exist to forecast the evolution of childhood asthma in Europe. There are formal methods for constructing simulation models in an epidemiologic framework and methods to assure that these models are credible. These models basically follow validation, verification and accreditation phases [27]. During the experimental phase, the models are executed (run over time) in order to generate results. The results can then be used to provide an insight into the system and as a basis for decision-making. Such models need to incorporate the main risk factors that can be managed by medical authorities, such as tobacco (OR for asthma diagnosis = 1.44 or 2), to establish how they affect the present generation of children. Thus, the number of cases can be forecast from an epidemiological perspective, as can their possible treatment, and whether to assign more or less public resources to asthma treatment.
To investigate the relationship between ETS exposure and childhood asthma more thoroughly, a meta-analysis of studies purporting to examine this issue was undertaken by the US Department of Health and Human Services [10]. A Medline search was conducted to identify all epidemiologic studies published between 1975 and 1995 examining ETS exposure as a risk factor for the induction of childhood asthma. Of the 37 studies included in this analysis, all but three reported a risk ratio (RR) greater than 1.0, although many were not statistically significant at α = 0.05. The pooled RR for those studies with clinically diagnosed asthma as the outcome was 1.44 (95% CI = 1.27-1.64). This study concludes that ETS increased the risk of asthma (OR = 1.44, 95% CI 1.27-1.64) [11].
Other studies [28] hypothesized that the joint effect of genetic propensity to asthma and exposure to ETS on the risk of childhood asthma was greater than expected on the basis of their independent effects and constituted an important risk factor for this disease. A population-based 4-year cohort study of 2,531 children born in Oslo, Norway was analyzed. Information on the child's health and environmental exposure at birth and when the child was 6, 12, 18, and 24 months as well as 4 years of age was collected. The outcomes of interest were bronchial obstruction during the first 2 years and asthma at the age of 4 years. The study found, in a logistic regression analysis adjusted for confounding, that parental atopy alone increased the risk of bronchial obstruction (OR 1.62; 95% confidence interval [CI] 1.10-2.40) and asthma (OR 1.66; 95% CI, 1.08-2.54). The results are consistent with the hypothesized joint effect of parental atopy and exposure to ETS. This phenomenon, denoted as effect modification of environmental exposure by genetic constitution or gene by environment interaction, suggests that some genetic markers could indicate susceptibility to environmental factors.
Finally, other epidemiological studies concluded that involuntary smoke exposure was associated with increased asthma severity and worsened lung function in a nationally representative group of US children with asthma. Asthmatic children with high levels of smoke exposure, compared with those with low levels of exposure, were more likely to have moderate or severe asthma (OR 2.7; 95% confidence interval [CI], 1.1 to 6.8) and decreased lung function, with a mean FEV1 decrement of 213 mL or 8.1% (95% CI, -14.7 to -3.5) [29].
The model presented here includes the main risk factor affecting asthma (tobacco), so as to be able to forecast the likelihood of the illness in the future as a function of the risk factors and how these factors change.Parameters of the model (input) data were taken from the bibliography along with data from a panel of experts from a hospital in Barcelona.
DES was used to construct the simulation model. The simulation model for childhood asthma has been validated with respect to data reported in the bibliography. The preliminary results (10,000 simulated cases) for the scenarios enabled the behaviour of the prevalence and phenotypes of childhood asthma to be observed.
The reduction of 5-10% of smokers in Spain (from 35.1% in 2001 to 23.7% in 2007) did not have a significant effect on reducing the prevalence of childhood asthma in the following years. The simulations established a threshold for a notable reduction in the prevalence of asthma at a smoking prevalence of between 15 and 20%.
The prevalence of asthma obtained using a probabilistic simulation varied from 14.5-16% for OR = 2 and 12-13% for OR = 1.4. From 1997 to 2007, asthma prevalence either increased or fluctuated randomly around a stable level, both for the curve OR = 2 and OR = 1.4, as reflected in the bibliography, despite the change in the prevalence of tobacco from 32.9% in 1997 to 35.1% in 2001 and its decrease to 23.7% in 2007. No clear drops in asthma prevalence according to tobacco prevalence were observed at 20% for OR = 2, or at 15% for OR = 1.4. However, they were observed at values under 10% prevalence of tobacco. The curves fitted to the simulated values showed a negative slope at smoking prevalences below 20% for OR = 2 and below 15% for the curve OR = 1.4.
This study adds a simulation methodology that includes modifiable asthma risk factors and can be applied to help reduce the number of cases significantly. Once completed, validated and adjusted, this simulation model can realistically be used to forecast the evolution of childhood asthma as a function of variation in different risk factors. The model also deals with problems associated with risk factors in the field of epidemiology. Fluctuations in probabilities over time can be assessed in large populations.

Figure 1: Risk factors implicated in the onset and progression of asthma in children from birth to adulthood (see [5]) from Díaz-Vazquez (2005).
The quantities known in this study are $P(D) = (a + b)/n$, the prevalence of D (10-17% in this study); $P(F) = (a + c)/n$, the prevalence of the risk factor in the population (25-35% in this study); and the OR (1.4 or 2 in this study).

Figure 2: Probabilistic empirical functions $Z_1 = P(D/F) = f(P(D), P(F), OR)$ for OR = 1.4 (left) and OR = 2 (right) under the restriction of disease prevalence in the Spanish population ($P(D) = 0.1$-$0.17$), prevalence of smoking ($P(F) = 0.1$-$0.4$) and risk relationship between asthma and tobacco, as OR = 1.4 or 2, using a simulation algorithm in R.
The restrictions also included $P(F) = 0.1$-$0.4$ and OR = 1.4 and 2. The algorithm was developed in R and the different approximations to $P(D/F)$ and $P(D/\bar{F})$ were obtained as probabilistic empirical functions $Z_1 = P(D/F) = f(P(D), P(F), OR)$ and $Z_2 = P(D/\bar{F}) = f(P(D), P(F), OR)$. The results of $Z_1$ are shown in Figure 2, where they are represented for OR = 1.4 and OR = 2. The error ($\varepsilon$) between the values of P(D) and P(F) is shown, where $\varepsilon \le 0.005$. The maximum of all the values of P(D/F) was selected.

Figure 3: Childhood asthma DES model which takes into account the prevalence and phenotypes (D is the asthma event, F is the tobacco risk factor, OR is the odds ratio of asthma related to smoking, $t_1$, $t_2$, $t_3$ are random variables with a Weibull distribution, used to generate wheezing episodes as a function of parents' smoking prevalence).

Figure 4: Results of the simulated prevalence of asthma (%) for n = 10,000 cases, obtained for different levels of tobacco prevalence in Spain (1997, 2001, 2006, 2007 and 20%, 15% and 10%). The simulations were performed using two odds ratios (OR) for the risk of tobacco in asthma. The Gaussian 95% confidence intervals for the mean prevalence of the 100 simulation replications are also represented, although they are very narrow due to the low dispersion. The fitted curves, from a third-order inverse polynomial model, and the $R^2$ values are represented.

Figure 5: Frequency histogram of the different episodes of wheezing, by asthma phenotype, in the simulation model of childhood asthma built in this work. The time horizon is 11 years.
The Spanish anti-tobacco law 28/2005, of December 26, came into force on January 1, 2006, although some aspects of this act did not come into effect until September 2006 and January 2007.The most important measure in this law is the ban on smoking in places such as workplaces (both public and private) or cultural centres.