Percolation and Epidemic Thresholds in Clustered Networks

We develop a theoretical approach to percolation in random clustered networks. We find that, although clustering in scale-free networks can strongly affect some percolation properties, such as the size and the resilience of the giant connected component, it cannot restore a finite percolation threshold. In turn, this implies the absence of an epidemic threshold in this class of networks, thus extending this result to a wide variety of real scale-free networks that show a high level of transitivity. Our findings are in good agreement with numerical simulations.

Perhaps one of the main reasons for the growing interest in complex networks is that many systems in the real world, either naturally evolved or artificially designed, are organized in a networked fashion [1,2]. This makes any theoretical approach potentially applicable to many different fields in the short term. As a germane example, percolation on networks is one such theoretical advance: it has helped to explain, for instance, the high resilience of scale-free (SF) networks against the removal of a fraction of their constituents, with important implications for communication systems like the Internet and other peer-to-peer networks [3].
In addition to its theoretical interest, percolation theory serves as a conceptual framework for treating more practical problems on networks, such as the dynamics of epidemic spreading [4]. Indeed, the susceptible-infected-removed (SIR) model of epidemic spreading can be mapped onto a bond percolation problem [5,6,7,8]. This is one of the simplest models in the literature [9,10], with three different states for the elements of the population: susceptible, infected, and removed. In its bare formulation, it is characterized by the time that an individual remains infected and the time that an infected individual takes to infect a susceptible neighbor, both random variables generated by Poisson processes with different constant rates. Since the infection uses the network as a template to spread, the process of propagation can be understood as a percolation problem over the original network where each edge is removed with probability $q_{\mathrm{inf}} = 1 - p_{\mathrm{inf}}$, where $p_{\mathrm{inf}}$ is the probability that an infected individual infects a susceptible neighbor before becoming removed. This mapping stands as an example of the importance of percolation theory beyond theoretical concerns.
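The mapping described above can be illustrated with a minimal Python sketch. It assumes that infection and removal times are exponentially distributed with rates `beta` and `mu` (both names are introduced here for illustration and are not in the text), in which case the probability that transmission happens before removal is `beta / (beta + mu)`. The helper `giant_component_fraction` is likewise hypothetical: it keeps each edge with probability `p_keep` and measures the largest surviving cluster with a union-find structure.

```python
import random


def transmissibility(beta, mu):
    """Probability that infection (rate beta) crosses an edge before removal (rate mu).

    Assumes both waiting times are exponential, so the result is beta / (beta + mu).
    """
    return beta / (beta + mu)


def giant_component_fraction(n, edges, p_keep, rng):
    """Bond percolation: keep each edge with probability p_keep.

    Returns the size of the largest connected cluster divided by n.
    Nodes are assumed to be labeled 0 .. n-1.
    """
    parent = list(range(n))
    size = [1] * n

    def find(x):
        # Path-halving find for the union-find forest.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        if rng.random() < p_keep:  # the edge survives (the infection can cross it)
            ru, rv = find(u), find(v)
            if ru != rv:
                if size[ru] < size[rv]:
                    ru, rv = rv, ru
                parent[rv] = ru
                size[ru] += size[rv]
    return max(size[find(i)] for i in range(n)) / n
```

In this picture, calling `giant_component_fraction(n, edges, transmissibility(beta, mu), random.Random(seed))` gives a rough estimate of the fraction of the population an outbreak can reach; edges are removed with probability $1 - p_{\mathrm{inf}}$, as in the text.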
Percolation properties of random directed and undirected networks with given degree distributions and two-point correlations have been extensively studied [11,12,13,14,15]. One of the most striking results, due to its important implications, is the absence of a percolation threshold in uncorrelated random SF networks [11,16]. In other words, in this type of network, one has to remove virtually all of the constituents before the network fragments into disconnected components. Translated into the epidemic context, this means that there is no epidemic threshold below which an epidemic cannot propagate. This result is particularly important because a large number of real networks have an SF degree distribution. It has also been generalized to random SF networks with two-point correlations, both for the SIR model and for the susceptible-infected-susceptible (SIS) model of epidemic spreading [14,17].
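The vanishing threshold can be made concrete with a short numerical sketch. It uses the standard criterion for uncorrelated random networks, $q_c = 1 - \langle k \rangle / \langle k(k-1) \rangle$, and assumes a pure power law $P(k) \propto k^{-\gamma}$ truncated between `k_min` and `k_max`; the function name and the cutoff parameters are illustrative choices, not part of the original analysis.

```python
def critical_removal_fraction(gamma, k_min, k_max):
    """q_c = 1 - <k> / <k(k-1)> for P(k) ~ k**(-gamma) on k_min..k_max.

    Valid for uncorrelated random networks; the truncation at k_max is an
    assumption used here to mimic a finite system.
    """
    ks = range(k_min, k_max + 1)
    weights = [k ** (-gamma) for k in ks]
    norm = sum(weights)
    mean_k = sum(k * w for k, w in zip(ks, weights)) / norm
    mean_k2 = sum(k * k * w for k, w in zip(ks, weights)) / norm
    return 1.0 - mean_k / (mean_k2 - mean_k)


# For gamma < 3 the second moment grows with the cutoff, so q_c creeps toward 1.
for cutoff in (10**2, 10**3, 10**4):
    print(cutoff, critical_removal_fraction(2.5, 2, cutoff))
```

As the cutoff grows, the fraction of edges that must be removed to fragment the network approaches one, which is the sense in which the threshold is absent in the thermodynamic limit.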
Nevertheless, almost all the analytical results obtained to date implicitly refer to networks without clustering, and little is known about its effects on the percolation properties of such networks, with the exception of Ref. [18], where an analytical solution for the percolation properties of the one-mode projection of random bipartite graphs was developed. See also [19]. This is because those analyses are based on the idea of a branching process. This approach works well when the network is locally tree-like and, thus, the clustering coefficient is very small. Real networks, however, have been shown to have a significant level of clustering that may change the percolation properties significantly. In this paper, we present analytical and simulation results for percolation in clustered networks. The analytical approximation becomes exact in the limit of weak clustering, and simulations are also provided for the case of strong clustering. We find that clustering makes networks more fragmented than their unclustered counterparts, but with giant components that have more tightly interconnected cores of high-degree vertices. We also find that clustering cannot restore the percolation and epidemic thresholds in SF networks.
To begin with, we follow Ref. [20] and define the multiplicity of an edge, $m_{ij}$, as the number of triangles in which the edge connecting vertices $i$ and $j$ participates. This quantity is the analog of the number of triangles attached to a node $i$, $T_i$, which is used to define the local clustering coefficient. At the coarse-grained level of degree classes, one can define the multiplicity matrix $m_{kk'}$ as the average multiplicity of the edges connecting the classes $k$ and $k'$. Then, the degree-dependent clustering coefficient $\bar{c}(k)$ (a property of vertices) and the multiplicity matrix $m_{kk'}$ (a property of edges) are related through the following identity, valid for any network,
$$\sum_{k'} m_{kk'} P(k,k') = k(k-1)\,\bar{c}(k)\,\frac{P(k)}{\langle k \rangle}, \tag{1}$$
where $P(k)$ is the degree distribution and $P(k,k')$ is the probability that an edge connects two vertices of degrees $k$ and $k'$. The identity follows from counting triangles in two ways: the multiplicities of the edges attached to a vertex $i$ add up to $2T_i$, and $T_i = \bar{c}(k)\,k(k-1)/2$ on average for vertices of degree $k$. The multiplicity matrix $m_{kk'}$, which varies in the interval $[0, \min(k,k')-1]$, gives a more detailed description of how triangles are shared among vertices of different degrees and, as we shall see, it contains the relevant information to analyze the percolation properties of clustered networks.
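The definitions of edge multiplicity and of the number of triangles attached to a node can be sketched in a few lines of Python. The function names below are illustrative, and the graph is assumed to be simple and undirected, given as a list of edges. The sketch also lets one check the counting argument that links the two quantities: the multiplicities of the edges attached to a vertex add up to twice its number of triangles.

```python
from itertools import combinations


def adjacency(edges):
    """Build a dict of neighbor sets from an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_multiplicity(adj, i, j):
    """m_ij: number of triangles containing edge (i, j), i.e. common neighbors."""
    return len(adj[i] & adj[j])


def triangles_at(adj, i):
    """T_i: number of connected pairs among the neighbors of i."""
    return sum(1 for a, b in combinations(adj[i], 2) if b in adj[a])


def local_clustering(adj, i):
    """c_i = 2 T_i / (k_i (k_i - 1)), taken as 0 for degree below 2."""
    k = len(adj[i])
    return 2 * triangles_at(adj, i) / (k * (k - 1)) if k > 1 else 0.0


# A triangle (0, 1, 2) with a pendant vertex 3 attached to vertex 2.
adj = adjacency([(0, 1), (0, 2), (1, 2), (2, 3)])
for i in adj:
    assert sum(edge_multiplicity(adj, i, j) for j in adj[i]) == 2 * triangles_at(adj, i)
```

Averaging `edge_multiplicity` over all edges that join degree classes $k$ and $k'$ would give an empirical estimate of $m_{kk'}$.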
An alternative way to quantify clustering is by using the edge clustering coefficient as defined in [21],
$$\bar{c}(k,k') = \frac{m_{kk'}}{\min(k,k')-1}. \tag{2}$$
As in the case of the local clustering coefficient, $\bar{c}(k,k')$ also has a probabilistic interpretation. It quantifies the likelihood that a pair of connected vertices have a common neighbor. If the network is random, we can assume that the probability that an edge connecting two vertices of degrees $k$ and $k'$ has multiplicity $m$ is
$$\phi(m|k,k') = \binom{\min(k,k')-1}{m}\,\bar{c}(k,k')^{m}\,\left[1-\bar{c}(k,k')\right]^{\min(k,k')-1-m}. \tag{3}$$
This probability, along with the multiplicity matrix, is crucial to computing the percolation properties of clustered random networks correctly. Although we start from a given vertex and follow all its edges, as in the non-clustered case, once we are placed at one of the neighbors we only follow those edges not pointing to the neighborhood of the source vertex, so that we avoid the edges responsible for clustering. It is worth noticing that, even in this scheme, we are neglecting the fact that higher-order loops may be present.
Let us start the analytical computations by defining the probability that a given vertex has $s$ reachable vertices (including itself), $G(s)$. For very heterogeneous networks it is more convenient to define this probability conditioned on the degree of the source vertex, $G(s|k)$, and then $G(s) = \sum_k P(k) G(s|k)$. Finally, we need to introduce an extra function, $g(s|k)$, which measures the probability that a vertex can reach $s$ other vertices given that it is connected to a vertex $v$, of degree $k$, and that it can visit neither $v$ nor its neighborhood (this idea was used in [22] to compute the number of second neighbors of a given vertex). This last condition guarantees that we do not overcount contributions due to triangles. The functions $G(s|k)$ and $g(s|k)$ are related through
$$G(s|k) = \sum_{s_1,\dots,s_k} g(s_1|k)\cdots g(s_k|k)\,\delta_{s,\,1+s_1+\cdots+s_k}. \tag{4}$$
We can find a recursion relation for $g(s|k)$ by taking into account that the branching process now has the constraint that, at each generation point, we can only use the free edges to continue the exploration.
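The multiplicity distribution can be made concrete with a small sketch. It assumes, as a modeling choice of this example, that each of the $\min(k,k')-1$ possible common neighbors of an edge closes a triangle independently with probability equal to the edge clustering coefficient, so that the multiplicity is binomial. The function name `multiplicity_pmf` is hypothetical. Under this assumption the mean multiplicity equals the edge clustering coefficient times $\min(k,k')-1$, which is how the multiplicity matrix enters.

```python
from math import comb


def multiplicity_pmf(m, k, k_prime, edge_clustering):
    """Probability that an edge joining degrees k and k_prime has multiplicity m.

    Assumes a binomial law over the min(k, k_prime) - 1 possible common
    neighbors, each closing a triangle with probability edge_clustering.
    """
    m_max = min(k, k_prime) - 1
    if not 0 <= m <= m_max:
        return 0.0
    return comb(m_max, m) * edge_clustering**m * (1 - edge_clustering) ** (m_max - m)


# Sanity checks for k = 5, k' = 7 and edge clustering 0.3.
probs = [multiplicity_pmf(m, 5, 7, 0.3) for m in range(5)]
mean_multiplicity = sum(m * p for m, p in enumerate(probs))
print(sum(probs), mean_multiplicity)
```

The printed mean plays the role of the matrix element $m_{kk'}$ for this pair of degree classes.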
In this case
$$g(s|k) = \sum_{k'} P(k'|k) \sum_{m} \phi(m|k,k') \sum_{s_1,\dots,s_{k'_{br}}} g(s_1|k')\cdots g(s_{k'_{br}}|k')\,\delta_{s,\,1+s_1+\cdots+s_{k'_{br}}}, \tag{5}$$
where $k'_{br} = k'-m-1$. To simplify this equation we make use of the so-called generating function formalism and transform $g(s|k)$ to the discrete Laplace space, $\hat{g}(z|k) \equiv \sum_s z^s g(s|k)$, where Eq. (5) becomes a closed equation for the function $\hat{g}(z|k)$,
$$\hat{g}(z|k) = z \sum_{k'} P(k'|k) \sum_{m} \phi(m|k,k') \left[\hat{g}(z|k')\right]^{k'-m-1}. \tag{6}$$
The percolation transition takes place when Eq. (6), evaluated at $z = 1$, admits as a stable solution $\hat{g}(z=1|k) = \xi(k) \le 1$, that is, there is a finite probability $1 - \xi(k)$ that the branching process extends up to infinity. To analyze the stability of Eq. (6) near the fixed point $\hat{g}(z=1|k) = 1$ we study a perturbative solution $\hat{g}(z=1|k) \approx 1 + \chi(k)\epsilon$ in the limit $\epsilon \to 0$. From Eq. (6), expanding to first order in $\epsilon$ and using that $\sum_m \phi(m|k,k') = 1$ and $m_{kk'} = \sum_m m\,\phi(m|k,k')$, we obtain
$$\chi(k) = \sum_{k'} (k'-1-m_{kk'})\,P(k'|k)\,\chi(k'). \tag{7}$$
The transition between the percolated and the fragmented phases is given by the properties of the matrix $(k'-1-m_{kk'})P(k'|k)$ and, in particular, by its maximum eigenvalue $\Lambda_m$. When $\Lambda_m > 1$ the network is in the percolated phase, in which a macroscopic fraction of the system becomes globally connected. In the opposite situation, the network is a set of small disconnected clusters.
The simplest case of a clustered network corresponds to $m_{kk'} = m_0$, with $m_0 \in [0, 1]$. In this situation, from Eq. (1) one obtains $\bar{c}(k) = c_0 (k-1)^{-1}$, where $c_0$ is a function of $m_0$ to be determined. Hence, small-degree nodes are highly clustered whereas high-degree ones are less clustered. This specific form of $\bar{c}(k)$ is particularly important since it represents the maximum level of clustering one can impose in a network without introducing, at the same time, degree-degree correlations. This will allow us to analyze the effect of triangles without any interference from two-point correlations. Hereafter, we will refer to levels of clustering below this threshold as weak transitivity. The probability $P(1,1) \equiv x$ is the smallest solution of the following quadratic equation (the derivation will be given in a forthcoming publication) where $\phi'$ is the average of $\phi(0|k,k')$ over the set of vertices of degrees larger than 1.
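The eigenvalue criterion can be explored numerically with a rough sketch. It assumes an uncorrelated network, $P(k'|k) = k'P(k')/\langle k \rangle$, and a constant multiplicity `m0` on edges whose two end vertices both have degree larger than one (edges touching a degree-one vertex cannot belong to a triangle). This ignores the special treatment of $P(1,1)$ mentioned in the text, and the function names and the plain power iteration are illustrative choices only.

```python
def branching_matrix(degrees, pk, m0):
    """Matrix (k' - 1 - m_kk') P(k'|k) for an uncorrelated network.

    degrees: list of degree values; pk: their probabilities (summing to 1).
    m_kk' is taken as m0 when both degrees exceed 1, and 0 otherwise.
    """
    mean_k = sum(k * p for k, p in zip(degrees, pk))
    matrix = []
    for k in degrees:
        row = []
        for k2, p2 in zip(degrees, pk):
            m = m0 if min(k, k2) > 1 else 0.0
            row.append((k2 - 1 - m) * k2 * p2 / mean_k)
        matrix.append(row)
    return matrix


def largest_eigenvalue(matrix, iterations=1000):
    """Power iteration with max-norm scaling; suited to non-negative matrices."""
    n = len(matrix)
    vec = [1.0] * n
    value = 0.0
    for _ in range(iterations):
        new = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]
        value = max(abs(x) for x in new)
        if value == 0.0:
            return 0.0
        vec = [x / value for x in new]
    return value


degrees, pk = [1, 2, 3], [0.5, 0.3, 0.2]
print(largest_eigenvalue(branching_matrix(degrees, pk, 0.0)))  # unclustered
print(largest_eigenvalue(branching_matrix(degrees, pk, 0.5)))  # clustered
```

With `m0 = 0` the matrix has identical rows and the eigenvalue reduces to $\langle k(k-1) \rangle / \langle k \rangle$, the usual result for unclustered random networks; a positive `m0` lowers it, pushing the network toward the fragmented phase.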
Then, the maximum eigenvalue of the matrix $(k'-1-m_{kk'})P(k'|k)$ can be computed analytically, and so can the percolation condition, Eq. (10). For very low clustering, we recover the well-known result for percolation in random networks. The immediate conclusion seems to be that clustering changes the position of the critical point. However, in the case of SF networks, the left-hand side of Eq. (10) diverges in the thermodynamic limit and, therefore, weak transitivity is not able to restore a finite percolation threshold in SF networks, and hence cannot restore a finite epidemic threshold either.
To check the accuracy of the present formalism, we generated clustered random networks using the algorithm introduced in Ref. [20]. We simulated networks of $10^5$ nodes with an exponential degree distribution and a clustering coefficient $\bar{c}(k) = c_0 (k-1)^{-1}$. In Fig. 1, we compare the relative size of the giant connected component (gcc) as a function of $c_0$ with the numerical solution of Eq. (6). As can be seen, the effect of clustering is to reduce the size of the giant connected component (in agreement with [18,19]). The effect is so strong that, in networks with a moderate average degree, it can completely fragment the network when $c_0$ exceeds a critical value. In other cases, the reduction in size can be more than 50%. For values of $c_0 \in [0, 0.5]$, the agreement between our formalism and the numerical simulations is excellent. Beyond this point, our approximation slightly overestimates the size of the gcc. This is mainly because, in this regime, links of multiplicity larger than 1 appear, which in turn induces the presence of some loops of order four.
We now turn our attention to the case of strong transitivity, which corresponds to functions $\bar{c}(k)$ decaying slower than $k^{-1}$. In this case, clustering and two-point degree correlations are intimately coupled [20]. A heuristic argument is as follows: if a vertex with a high degree also has a high clustering coefficient, many of its neighbors will be connected to each other, which induces an assortative behavior. In other words, to generate random networks with strong transitivity we need to introduce some mechanism generating assortativity. However, it is not possible to obtain a perfectly assortative pattern in SF networks for arbitrarily large degrees (see Ref. [23] for a detailed discussion) and, as a consequence, the maximum level of clustering is limited. The algorithm of Ref. [20] has a free parameter which allows one to control the assortativity of the resulting network, so that SF networks with high clustering can be generated. We quantify the level of clustering as $C = (1 - P(1))^{-1} \sum_k P(k)\,\bar{c}(k)$, so that $C$ is defined in the interval $[0, 1]$. In Fig. 2, we show the relative size of the giant component as a function of $C$. As in the case of weak transitivity, clustering reduces the size of the giant component. However, beyond a certain value of $C$, the size of the giant component stabilizes at a constant value which is independent of $C$. Therefore, SF networks with high levels of clustering have giant components which are smaller than their counterparts in networks without clustering. But what are the resilience properties of those giant components under random removal of edges? To answer this question, we have generated two SF networks with $\gamma = 2.5$, one with the maximum level of clustering ($C = 0.71$) and the other without clustering, and applied a random removal of edges to the corresponding giant components. The results are shown in Fig. 3 (top graph).
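For a concrete network, the clustering level $C$ amounts to the mean local clustering coefficient taken over vertices of degree two or more, provided there are no isolated vertices and degree-one vertices are assigned zero clustering. The following sketch computes it under those assumptions; the function name and the requirement that node labels be comparable (for example integers) are choices made for this illustration.

```python
def clustering_level(edges):
    """C = (1 - P(1))**-1 * sum_k P(k) c(k) for a simple undirected graph.

    Equivalent to averaging the local clustering coefficient over nodes of
    degree >= 2, assuming every node appears in at least one edge.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    total, count = 0.0, 0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # degree-one nodes are excluded by the 1 - P(1) factor
        # Count connected pairs of neighbors, each pair once.
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += 2 * links / (k * (k - 1))
        count += 1
    return total / count if count else 0.0


# Triangle (0, 1, 2) plus a pendant vertex 3: local values 1, 1 and 1/3.
print(clustering_level([(0, 1), (0, 2), (1, 2), (2, 3)]))
```

Dividing by the number of vertices with degree at least two, rather than by all vertices, is what keeps $C$ within $[0, 1]$ even when many vertices have degree one.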
The giant component of the clustered network turns out to be more resilient than the giant component of the unclustered one. Since SF networks without clustering do not have a percolation threshold, we conclude that clustering, even when high, cannot restore the percolation and epidemic thresholds in random SF networks.
However, the degree distributions of the giant connected components can be different, a fact that could explain the observed differences in the resilience properties. To check this point, we have randomized the gcc of the clustered network while keeping its degree distribution fixed (see the curve labeled Randomized in the top of Fig. 3). This network is more resilient than the clustered one for all levels of damage except for very high values, for which the gcc of the randomized network goes to zero faster due to finite size effects. This is illustrated in the inset of Fig. 3. The first arrow indicates the threshold computed with the formula $q_c = 1 - \langle k \rangle / \langle k(k-1) \rangle = 0.986$, whereas the second arrow indicates the threshold due to finite size effects for the clustered net, which is placed closer to 1. Therefore, clustered networks are less sensitive to finite size effects than equivalent random ones. This can be understood by analyzing the k-core decomposition of the networks (see [24] and references therein). The k-core is the maximal subgraph such that all its nodes have $k$ or more connections within the subgraph. In the bottom plot of Fig. 3, we show the relative size of the giant k-core for both networks. For small $k$, the randomized network has k-cores which are bigger than those of the clustered net, which explains why it is more resilient. However, for very large degrees, the clustered network has bigger k-cores, that is, there exists a small but finite core of vertices with very large degrees that are highly interconnected, which makes the network less prone to finite size effects. We also show the cumulative degree distribution $P_c(k)$, since it bounds the sizes of the k-cores, which, for the clustered net, decays as a function of $k$ with the same exponent.
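The k-core defined above can be obtained by repeatedly pruning vertices with fewer than $k$ remaining connections. The sketch below is a minimal illustration of that pruning on an adjacency dictionary; the function name is a choice made here, and it returns the whole k-core rather than only its largest connected piece (the giant k-core shown in Fig. 3), which would need an additional connected-component step.

```python
def k_core(adj, k):
    """Return the set of nodes in the k-core of an undirected graph.

    adj maps each node to the set of its neighbors. Nodes with fewer than k
    surviving neighbors are removed until no such node remains.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    alive = set(adj)
    stack = [v for v in alive if degree[v] < k]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue  # already pruned through an earlier stack entry
        alive.remove(v)
        for u in adj[v]:
            if u in alive:
                degree[u] -= 1
                if degree[u] < k:
                    stack.append(u)
    return alive


# Triangle (0, 1, 2) with a pendant vertex 3 attached to vertex 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(k_core(graph, 2))
```

Running the pruning for increasing $k$ and recording the surviving fraction of vertices gives a k-core profile of the kind compared in the text.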
In summary, we have introduced a theoretical framework to analyze the percolation properties of clustered networks. We have shown that, although clustering strongly affects the percolation properties and the sizes of the giant components, it cannot restore the percolation and epidemic thresholds in random SF networks, thus extending this important result to a wider class of networks, closer to real ones. It is also worth mentioning that these results can be applied to other epidemiological models, such as the SIS model.
We thank A. Vespignani and R. Pastor-Satorras for valuable comments. This work has been partially supported by DGES, Grant No. FIS2004-05923-CO2-02 and Generalitat de Catalunya Grant No. SGR00889. M. B. thanks the School of Informatics at Indiana University, where part of this work was developed.