Nature of the epidemic threshold for the susceptible-infected-susceptible dynamics in networks

We develop an analytical approach to the susceptible-infected-susceptible (SIS) epidemic model that allows us to unravel the true origin of the absence of an epidemic threshold in heterogeneous networks. We find that a delicate balance between the number of high degree nodes in the network and the topological distance between them dictates the existence or absence of such a threshold. In particular, small-world random networks with a degree distribution decaying slower than an exponential have a vanishing epidemic threshold in the thermodynamic limit.

We develop a analytical approach to the susceptible-infected-susceptible epidemic model that allows us to unravel the true origin of the absence of an epidemic threshold in heterogeneous networks. We find that a delicate balance between the number of high degree nodes in the network and the topological distance between them dictates the existence or absence of such a threshold. In particular, small-world random networks with a degree distribution decaying slower than an exponential have a vanishing epidemic threshold in the thermodynamic limit. The accurate theoretical understanding of epidemic thresholds on complex networks is a pressing challenge in the field of network science [1][2][3]. Indeed, such a knowledge has potential practical applications in the design of optimal immunization programs [4,5] and may shed light on the behavior of the viral spreading of rumors, fads and beliefs [6,7]. In this respect, a large research effort has been recently devoted to the study of the susceptibleinfected-susceptible (SIS) epidemic model [8], the simplest model of epidemic spreading showing an absorbing state phase transition between a healthy and an endemic phase at a critical value of the effective infective rate λ [9] (see the Supplementary Information (SI) for a precise definition [10]). The behavior of the SIS model is particularly relevant in the case of highly heterogeneous networks, for which a vanishing epidemic threshold in the thermodynamic limit has been pointed out [11]. Recently, a scientific controversy has arisen concerning the location and the real nature of the epidemic threshold in this kind of networks [12][13][14][15]. Here we settle this issue by showing that the threshold asymptotically vanishes in any network with a degree distribution decaying slower than exponentially, thus clarifying the physical origin of this behavior.
The original approach to the dynamics of the SIS model [11] was based on the so-called heterogeneous mean-field (HMF) theory [16,17], which neglects both dynamical and topological correlations. To do so, the actual quenched structure of the network-given by its adjacency matrix A ij [3]-is replaced by an annealed version, in which edges are constantly rewired at a rate much faster than that of the epidemics, while preserving the degree distribution P (k). According to HMF theory, the epidemic threshold of the SIS model takes the form λ HMF c = k / k 2 [11], where k and k 2 are the first and second moments of P (k) [3]. Many real networks have a heterogeneous degree distribution, often scaling as a power-law (PL), P (k) ∼ k −γ [1][2][3]. This implies that the second moment diverges with the maximum de-gree k max for a degree exponent γ < 3, leading to a threshold scaling λ HMF c ∼ k γ−3 max , which vanishes in the thermodynamic limit. On the other hand, for γ > 3, the second moment is finite and consequently so the epidemic threshold.
While HMF theory represents an exact result in the case of annealed networks [18,19], its validity for real (quenched) networks is a matter of debate. Indeed, an improvement over HMF theory is given by the quenched mean-field theory (QMF) [20,21] which, while still neglecting dynamical correlations, takes into account the full form of A ij . Within this framework, the epidemic threshold is predicted to be λ QMF c = 1/Λ N , where Λ N is the largest eigenvalue of the adjacency matrix. Given the scaling of Λ N with the maximum degree, Λ N ∼ max{ √ k max , k 2 / k } [22], QMF theory predicts the same result as HMF theory for γ < 5/2, while for γ > 5/2 it leads to λ QMF c ∼ 1/ √ k max , that is, to a vanishing threshold for any value of γ (even for γ > 3) in the thermodynamic limit [12]. The prediction of QMF theory has been validated for γ < 3 by means of large scale numerical simulations based on the quasi-stationary state method [19]. Numerical evidence for γ > 3 is, however, less convincing and has led to two different criticisms.
Goltsev et al. [14] have considered, within the QMF framework, the effects of eigenvector localization on the steady state of the SIS model. According to their observations, in PL networks with γ < 5/2, the principal eigenvector is delocalized, which implies that the density of infected nodes is finite above λ QMF c the true threshold to the endemic state would have a finite value for γ > 5/2, at odds with the naive interpretation of QMF theory.
This view is further pursued by Lee et al. [15] by partly taking into account dynamical correlations. Their argument is as follows: Slightly above the QMF threshold, hubs in a PL network become active but their activity is restricted to their immediate neighborhood. This activity has a characteristic lifetime τ (k, λ) depending on the degree and the value of the spreading rate λ. When hubs are directly connected to each other (the case of a clustered network, in the nomenclature of Ref. [15]), activity can be transferred between hubs if the lifetime τ (k, λ) is sufficiently large. In this case, above λ QMF c the network is able to support an endemic state, characterized by the mutual reinfection of connected hubs. In the case of unclustered networks, however, hubs are not directly connected and the reinfection mechanism does not work. Thus, the authors of Ref. [15] claim that the state above λ QMF c is just a Griffiths phase [23], where the density of infected nodes decays with time more slowly than exponentially (logarithmically indeed), while the actual epidemic threshold is located at a higher, finite value of λ. These arguments are checked by means of numerical simulations in a deterministic hierarchical network model, the (u, v)-flower model [24]. In order to make contact with the case of general random PL networks, Lee et al. take an indirect approach, performing a degree-ordered percolation (DOP) analysis [25] in which vertices are ordered in decreasing degree value. A zero DOP percolation threshold indicates that a finite fraction of nodes are neighbors of hubs directly connected to each other (a clustered network), while a finite threshold is indicative of hubs being separated. An application of DOP analysis to random PL networks indicates a finite DOP threshold for γ > 3. This observation leads Lee at al. to claim that the SIS model shows a finite epidemic threshold for γ > 3, with a Griffiths phase below it and above λ QMF c , related to the presence of localized eigenstates of the adjacency matrix, and their associated activation effects. In this picture a true zero epidemic threshold in the thermodynamic limit is only observed for γ < 3.
While the arguments presented in Refs. [14,15] are appealing and, apparently, leading to the conclusion that the threshold is finite in random PL networks with γ > 3, here we reconsider the problem, and provide analytical and numerical evidence pointing in the opposite direction, namely, a vanishing epidemic threshold for any small-world network with a degree distribution decaying slower than exponentially, in particular power-law networks with any γ. To confirm this prediction, we propose a numerical approach, based on the scaling analysis of the survival time of the infection process, which is able to provide very accurate estimates of the epidemic threshold even in the regime where the quasi-stationary state method is unreliable.
Our analytical approach is based on the consideration of dynamical fluctuations, as in [15], but not restricted to direct neighbors. The argument of [15] assumes that a zero epidemic threshold can only take place in clustered networks, when hubs are directly connected to each other and can reinfect each other within a time smaller that the characteristic healing time τ . However, as already pointed out in Ref. [26], a direct connection is not a necessary condition for the possibility of hub reinfection. Instead, we should properly consider the possibility of reinfection between two vertices i and j, separated by a topological distance d ij , possibly larger than 1. Indeed, the epidemic threshold predicted by the HMF theory is actually based on the local properties of the network alone, assuming that the local structure will replicate in a tree like fashion forever, preserving only the statistical properties of the network. Then, above λ HMF c , we expect that a perturbation originated in a node will be able to propagate as a supercritical branching process forever. Below this threshold, this process is not possible. Still, as we show below, the epidemics can sustain itself, due to perturbations which propagate up to distances of order ln(N ), where N is the network size.
Within this view, we replace the original SIS dynamics by a modified one running on a fully connected weighted graph where the weight of the connection between nodes i and j is the topological distance d ij between them in the original graph. The infective rate λ is now replaced by the rate at which the infected node i infects node j when the process is mediated by a chain of d ij − 1 intermediate nodes. Recovery rate δ is now replaced by the simultaneous recovery rate of infected node i and its neighborhood, that is, we consider node i as susceptible only when node i and all of its nearest neighbors in the original graph are susceptible. We further assume that paths connecting nodes are independent and made of nodes of degree 2. This is clearly not true in a real network because there are many paths connecting the same pair of nodes and intermediate nodes have, in general, degrees above 2. This implies that the infective rate between two nodes that we use in the modified SIS dynamics is smaller than the real one. Therefore, our modified SIS dynamics will provide an upper bound for the true epidemic threshold of the original SIS dynamics.
The original network plays two different roles in the modified SIS dynamics. The local properties of the network, controlled by the degree distribution, determine the lifetime of an infected node i depending on its degree and, thus, its recovery rateδ(k i , λ) = τ −1 (k i , λ). The global structure of the network, on the other hand, determines the matrix of distances between pairs of nodes, and thus, defines the infective rates between pairs of nodes i and j as a function of the distance d ij and the infective rate λ,λ(d ij , λ). Notice also that, since infectious events are mediated by intermediate nodes, dynamical correlations will be, in general, very weak. By considering also that our dynamics runs on a fully connected graph, we conclude that a mean field description is an excellent approximation to the dynamics. After these preliminaries, it is easy to see that the probability ρ i (t) of node i being infected at time t satisfies the following rate equation (1) equation which is obviously valid at a coarse-grained time scale, larger than in the original dynamics. The effective recovery rate can be seen to behave asδ(k i , λ) = τ −1 (k i , λ) ≈ e −a(λ)ki (see SI for numerical results and an analytical argument [10]), where a(λ) is a smooth growing function of λ. On the other hand, the effective infective rate can be related to the topological distance between vertices i and j asλ( , with b(λ) = ln(1 + 1/λ) (see SI for an analytical derivation and a numerical validation [10]). Eq. (1) can be applied to any network. We can, however, get deeper insights in the case of small-world random graphs, in which the the average internode topological distance takes the form [27] where κ = k 2 / k − 1 is the average branching factor of the network. Using the approximation Eq. (12) allows us to coarse grain Eq. (1) for degree classes. After defin- Notice that the use of Eq. (12) implies that the local propagation of the infection among directly connected nodes is neglected; only reinfections between distant (∼ ln N ) nodes are taken into account. By performing a linear stability analysis of this equation, we can see that the critical epidemic threshold of the modified SIS dynamics, λ c , is the solution of the transcendental equation (see SI for a detailed derivation [10]) In general, the maximum degree of the network, k max , is a growing function of N . If we further assume that the degree distribution decays slower than an exponential, Eq. (4) can be approximated as the integral near the upper bound k max , i.e.
When P (k) decays slower than an exponential, k max grows faster than ln N . Therefore, as the system size grows while keeping λ fixed, there is a point where the first term in the exponential becomes larger than the other two (negative) terms and, eventually, the right hand side of Eq. (4) becomes larger than 1. As a consequence, the epidemic threshold of the modified SIS dynamics starts decreasing as N increases, thus going to zero in the thermodynamic limit. Making the additional assumption that a(λ) ≈ aλ 2 for λ 1 (which is compatible with numerical simulations, see SI [10]), we conclude that the upper bound of the epidemic threshold decreases as 1/ √ k max , with additional logarithmic corrections to scaling. Interestingly, this scaling is similar to the one predicted by the QMF theory. However, in our case, the threshold marks the onset of a true endemic state where a finite fraction of all nodes of the system are active.
The case of non small-world networks can be considered along the same lines. Unfortunately, a general formula for the average topological distance as a function of nodes' degrees is not known. Nevertheless, the absence of long range connections in non small-world networks suggests that node degree is not as determinant as in the case of small-world ones. Thus, to get some understanding, we assume an internode distance independent of the degree and scaling as a power-law with system size, i.e. d = 1 + αN β . An analysis similar to the case of small-world networks (see SI [10]) concludes that non small-world networks have a vanishing epidemic threshold only if k max grows faster than N β . This result explains the finite epidemic threshold in the (3, 3)-flower model found in Ref. [15], even if the model generates a PL network with γ = 1+ln 6/ ln 2 ≈ 3.58. Indeed, this model generates a non small-world network with β = ln 3/ ln 6 whereas k max ∼ N ln 2/ ln 6 [24].
To check the accuracy of our theory, we propose a method to estimate the critical point of absorbing state phase transitions. The method is based on the analysis of individual realizations of the process starting with a single infected node. Each realization is characterized by its lifetime T and coverage C, where the latter is defined as the fraction of distinct nodes ever infected during the realization. In the thermodynamic limit, realizations can be of two types: finite or endemic. Finite realizations have a finite lifetime and, therefore, a vanishing coverage in the thermodynamic limit. Endemic realizations, on the other hand, have an infinite lifetime and their coverage is equal to 1. Below the epidemic threshold, all realizations are trivially finite. Above the threshold, there is a non null probability, P end (λ), that a realization that starts at a single node becomes endemic, making P end a good order parameter of the phase transition. Akin to the role of the average size of finite clusters in standard percolation [28], in our approach the role of susceptibility is played by the average lifetime of finite realizations T (λ), which diverges at λ c both from below and above. In finite systems, the major problem is to determine when a realization is endemic or not. One possibility is to declare a realization as endemic whenever its coverage reaches 1. However, from a computational point of view, this option is too costly. We therefore take advantage of the following fact: In an infinite size system, whenever the coverage of a realization reaches a finite fraction (even small), the probability of the realization being endemic is 1. Then, in finite systems, we declare a realization as endemic whenever its coverage reaches a predefined value (in our case C = 0.5, see SI for tests with other values [10]) and stop the realization at this point. Then, we can measure the average lifetime of finite realizations T (λ, N ) and the position of its peak, which we take as the estimate of the epidemic threshold for finite systems. Finally, we note that the method can be applied starting from any node of the network with identical results as far as the position of the threshold is concerned. Here, to minimize the fluctuations ofT (λ, N ) close to the critical point, we start our simulations always from the node with highest degree. Figure 1 shows the result of this program in random PL networks generated with the uncorrelated configuration model [29]. The average lifetimeT (λ, N ) behaves as an effective susceptibility and, thus, we estimate the epidemic threshold for a finite network as the position of its peak, λ max (N ). These estimates are shown in the bottom plots and compared with the upper bound given by a numerical solution of Eq. (4) and where τ (k, λ) is obtained from numerical simulations. As it can be clearly seen, the upper bound predicted by our theory is in very good agreement with numerical simulations, even for γ = 4, a network clearly "unclustered" according to [15].
Notice that, due to the approximation made in Eq. (12), our theory neglects the propagation of the epidemic mediated only by connected nodes, which is the approach taken in the HMF theory. Therefore, one should expect that the true upper bound for the real epidemic threshold is the minimum between the estimation given by Eq. (4) and λ HMF c . From this perspective, it is surprising that the epidemic threshold measured from simulations is higher than λ HMF c for small system sizes. Notice, however, that the HMF theory of the SIS dynamics completely neglects dynamical correlations. These correlations account for the fact that, whenever a node is infected, there is a high probability for the node that infected it to be still infected. Therefore, the number of neighbors available to an infected node to further propagate the epidemics is, in most cases, its degree minus 1. Consequently, a better upper bound for the local propagation of the dynamics is given by the HMF theory of the SIR model, that is λ SIR c = k / k(k − 1) . Bottom plots of Fig. 1 show the estimation of λ SIR c , which is always above the real threshold.
To sum up, the behavior of the epidemic threshold in networks depends on a delicate balance between their local and global properties. Both HMF and QMF theories are constructed by considering only the local dynamics of infections among nearest neighbors, and thus fail to provide a correct description. Here we have presented a theoretical approach to epidemics in networks, built upon previously sketched concepts, that takes into account the full network structure, and that consider reinfection events among nodes not directly connected, i.e. mediated by chains of other nodes. Our theoretical analysis, while based in some (reasonable) approximations, is well backed up by means of reliable numerical evidence. The main conclusion of both approaches is that the epidemic threshold in SIS model is effectively null in the thermodynamic limit in all random small-world networks with a degree distribution decaying slower than exponentially, putting in this sense an end to the recent controversy arisen around this important issue.

Definition of the SIS model
The Susceptible-Infected-Susceptible (SIS) model is defined as follows: Individuals can be in one of two states, either susceptible or infected. Susceptibles become infected by contact with infected individuals, at a rate equal to the number of infected contacts times a given spreading rate λ. Infected individuals on the other hand become spontaneously healthy again at a rate µ. Without loss of generality, the rate µ can be taken arbitrarily equal to unity, thus setting the characteristic time scale. We set µ = 1 throughout the paper and the Supplementary Information. The SIS model can be seen to exhibit, in the thermodynamic limit, an absorbing-state phase transition [9] at an epidemic threshold λ c , such that for λ < λ c the epidemics dies exponentially fast (healthy phase), while for λ > λ c it reaches a steady (endemic) state, lasting for arbitrarily large times, where a finite fraction of the total number of nodes is infected.

Numerical simulations
The SIS dynamics is simulated with a continuous time dynamics as follows: During the course of the simulation, we keep track of the number of infected nodes N I (t) and the number of active links E A (t), where an active link is defined as a link emanating from an infected node (notice that links connecting two infected nodes will appear twice in this list). At each step, with probability p r = N I (t)/[N I (t) + λE A (t)], a randomly chosen infected node is turned susceptible whereas, with probability 1 − p r , an active link is chosen at random and if one of the two nodes attached to the link is susceptible, then this node is turned infected. After this procedure, time is updated as t → t + 1/[N I (t) + λE A (t)] and the list of infected nodes and active links recomputed. An equivalent algorithm keeps a list of active links as those connecting one infected node and one susceptible. The advantage of this latter method is that infectious attempts always end up with a susceptible node being infected, which is not the case with the first method. In this work, all simulations are backed up independently with the two methods.

Estimation of the infective rateλ(dij, λ)
This rate is the inverse of the average time infected node i takes to infect node j when they are separated by a distance d ij in the original graph. We consider the process of transmission of the infection between the two nodes mediated by a one dimensional chain of length d ij . Consider a one dimensional chain of d + 2 nodes with the leftmost node always infected, as indicated in Fig. 2 T (d + 1) the average time the node at the rightmost position takes to get infected for the first time. LetT (d) the average time between two consecutive infectious events (after the first one) of the node at distance d. Because we are in a chain and the source of the infection is the leftmost node, the node at distance d + 1 can only get infected for the first time by its left neighbor. Once this node is infected, the probability that the target node gets the infection before its left neighbor recovers is simply given by the node at distance d + 1 can get infected right after its left node gets infected for the first time or after the second time, and so on. The probability that the node gets infected right after its left neighbor gets infected for the n−th time is On the other hand, the average time elapsed in this case T n (d + 1) is Combining these two results, we get the equation for the average infection time, T (d) = n Prob(n)T n (d), with the initial conditions T (1) =T (1) = λ −1 . In the limit of low infectious rate, that is, λ 1, we can ap-proximateT (d) ≈ T (d) and we get a closed recursive equation for T (d), whose solution is We then conclude that the infective rate is In the case of small-world random graphs, the average internode topological distance depends only on the degree of the nodes as [27] d k,k = 1 + ln N k kk ln κ . Inserting this expression into the effective infective rate we get We test numerically this relationship by keeping a single node of degree k always infected and computing the time it takes to infect for the first time any other node in the network. The rate λ k,k is obtained by inverting the value of this time for the first infectious event, averaged over all nodes of degree k . Equation (13) predicts that plotting [λ k,k /λ] ln κ/b(λ) vs k a linear behavior must be found, and this turns out to agree with the outcome of simulations in a network with γ = 3.5 (see Fig. 3).

Estimation of the recovery rate δ(k, λ)
This rate can be estimated as the inverse of the survival time of an infection starting at the center of a star of degree k. Unfortunately, the exact solution to this problem is rather involved (see [30] for an exact treatment). Here, we present an approximation based on the discretization of the process in time units of µ −1 = 1. Consider the following cycle: initially, the center of the star -the hub-is infected whereas leaf nodes are susceptible. The probability that a leaf node is infected when the hub recovers is Then, by the time the hub recovers, there are n infected leaf nodes with probability the probability that at least one of these n infected nodes infects the hub again before they recover is The average time to complete the cycle is 2. The probability that the outbreak goes through a sequence of m complete cycles and then dies is and the time elapsed (2m + 1). However, an outbreak can also die in the middle of the cycle, that is, when infected leaves recover before infecting the hub again. The probability that the outbreak goes through a sequence of m complete cycles and dies in the middle of the m + 1 cycle is 18) The average elapsed time is in this case (2m+2). Putting these pieces together, the effective recovery rate can be approximated as For low infectious rates p in 1, this can be approximated as As we have mentioned at the beginning of this section, the previous calculations provide only an approximation to the true recovery rate. This is so because we have considered the process as discretized in time whereas the real process takes place at continuous time.  Fig. 4 shows the average lifetime for fixed values of λ as a function of the degree k, where the exponential trend predicted by our calculations is clearly visible.

Derivation of Eq. (4)
Let us consider Eq. (3) in the main paper, namely It is obvious that the absorbing state ρ k (t) = 0 is a fixed point of the dynamics. Therefore, we conclude that an endemic state exists whenever the solution ρ k = 0 is dynamically unstable. Following this idea, we linearize Eq. (21) around ρ k = 0, i.e., This equation can be written in matrix form as where ln κ P (k ). (24) The stability of the absorbing state is then controlled by the maximum eigenvalue of matrix M, that is, the maximum Λ m solution of the eigenvalue problem M u = Λ u. In this way, Λ m = 0 defines the threshold between the absorbing and endemic phases. The eigenvalue problem can be rewritten as which gives us the explicit dependence of u k on k. Using this result, the eigenvalues satisfy the equation By setting Λ = 0 and recalling that τ (k, λ) =δ(k, λ) −1 , we recover Eq. (4) in the main paper.

Non small-world networks
Let us assume that the average distance is given by Plugging this expression in Eq. (1) in the main paper, we obtain a coarse description of the dynamics as By repeating the same analysis performed in the previous section, we conclude that the critical infection rate satisfies the equation  Assuming again that P (k) decays slower than an exponential, this equation can be approximated as From this equation it is easy to see that for any fixed value of λ, if k max grows faster than N β , there exists a size N such that the exponent in this equation starts growing with the system size and, therefore, the right hand side in this equation will eventually grow above 1.
The logical consequence is that, in this case, the epidemic threshold goes to zero as N goes to infinity.
Robustness with respect to the coverage threshold C To determine whether a given realization of the SIS process is endemic, we have used the condition that the coverage is larger than a fixed threshold value C = 0.5. To check that different assumptions do not qualitatively change the results, we performed some numerical tests. In Fig. 5, we consider the effect of changing C for an Erdös-Rényi graph of average degree k = 5 and different sizes N . The numerical estimate of the threshold rapidly converges to the expected value λ c = 1/5 for both values of C considered, while the height of the peak grows with an exponent independent of C. In Fig. 6, we perform the same analysis for a UCM graph with γ = 3.5 and k min = 3, obtaining similar results. As the system size N is increased, the estimated thresholds decrease and the peak heights increase in a perfectly analogous way. Both figures confirm that the behavior of the model is robust with respect to the arbitrary choice of the coverage  threshold C.

Further characterization of the epidemic transition
In this section we provide some additional insight into the transition marked by the peak of the average lifetime of finite realizationsT (λ, N ). In Fig. 7, we compare the curves already plotted in the top left of Fig. 1 of the main paper with the analogous curves computed for a star graph with k max leaves. It is clear that the occurrence of the peak in the full network is not due only to the star graph centered around its hub. The latter sustains alone the activity only for small λ. The transition occurs at higher values of λ, for which the lifetime is exceedingly larger.
In Fig. 7 we plot, as a function of N , the value of the average coverage of the whole network at the critical value λ max (N ) and at the critical value λ * max (N ) for the star graph centered around the hub of degree k max . It turns clearly out that at the critical point λ max the coverage assumes a finite value in the thermodynamical limit, while it vanishes for λ = λ * max . For λ = λ * max the hub and its neighbors are fully covered, yet the epidemics does not escape from the hub and its neighbors and it is thus localized. At the transition point λ max instead the epidemics leaves the hub and affects the whole population, leading to a truly endemic state.