Exploring complex networks by means of adaptive walkers

Finding efficient algorithms to explore large networks with the aim of recovering information about their structure is an open problem. Here, we investigate this challenge by proposing a model in which random walkers with previously assigned home nodes navigate through the network during a fixed amount of time. We consider that the exploration is successful if the walker gets the information gathered back home, otherwise, no data is retrieved. Consequently, at each time step, the walkers, with some probability, have the choice to either go backward approaching their home or go farther away. We show that there is an optimal solution to this problem in terms of the average information retrieved and the degree of the home nodes and design an adaptive strategy based on the behavior of the random walker. Finally, we compare different strategies that emerge from the model in the context of network reconstruction. Our results could be useful for the discovery of unknown connections in large scale networks.


I. INTRODUCTION
During the last decades, much scientific interest has been devoted to the characterization and modeling of many natural and artificial systems that exhibit so-called emergent behavior. These systems, referred to as complex systems, are suitably described through their networks of contacts, that is, in terms of nodes (representing the system's components) and edges (standing for their interactions), which allows one to capture their essential features in a simple and general representation. Complex networks [1][2][3][4] have therefore become an important, widely used framework for understanding both the dynamical and topological aspects of systems such as the brain [5], protein-protein interaction networks [6], the Internet, and the World Wide Web (WWW) [7].
In the meantime, it has also become clear that many of the mentioned networks, particularly those which are described by a power-law degree distribution P(k) ∼ k^(−γ) (scale-free networks [1][2][3][4]), are only partially known. Think, for instance, of online social networks like Facebook or Twitter, which are made up of millions of heterogeneous and nonidentical nodes. In such large networks, a complete map is hardly ever available and is difficult to obtain [8]. Therefore, providing efficient tools for their exploration has become a crucial challenge. In general, network features are discovered by means of algorithms based on search and traffic routing [9][10][11]. In many cases, the second of these can be performed by means of moving "agents," which explore the topological space and recover information. Nonetheless, investigating and characterizing the efficiency of different strategies [12][13][14] remains a key issue as far as the quality and quantity of information gathered are concerned.
On the other hand, it has also been shown that local topological metrics, like the degree of a node, greatly affect the dynamical properties of complex networks. This is the case of immunization algorithms, which are more effective the larger the degree of the vaccinated node is [15]. As a matter of fact, one of the best strategies is to immunize a neighbor of a randomly chosen node instead of the node itself. This is because a randomly chosen node has degree k with probability P(k), while a neighbor of a randomly chosen node has degree k with probability proportional to kP(k). Another striking example, closely related to the problem addressed here, in which the degree of the nodes determines the dynamical properties is the scaling law characterizing flow fluctuations in complex networks [16][17][18][19]. Indeed, the mean traffic ⟨f⟩ and its standard deviation σ can be related through the simple scaling form σ ∼ ⟨f⟩^α [16][17][18]. However, this relation, which was previously thought to be universal with α being between 1/2 and 1, is not satisfied for all values of k (i.e., the exponent is not universal and depends, among other factors, on the degree of the nodes [19]).
In this paper, we address the problem of network exploration from the point of view of a single node from which an agent is sent through the network to collect information, henceforth understood as the fraction of nodes visited when the walker gets back home. Our aim is to find an optimal strategy that maximizes both the number of visited nodes and the chance of returning to the starting point, independently of where the starting node is located. To this end, we consider an arbitrary (heterogeneous) network of N nodes and a single agent (explorer or walker) initially located on a given node (home node), and let it move during a time frame T, the walker's lifetime. Every time the agent comes back to the starting point, all the nodes it has visited until that moment are marked as visited and the total information gathered is updated. Obviously, it would also be possible to send several agents at once, but it has been demonstrated for several similar situations [20] that increasing the number of walkers (and reducing their lifetime proportionally) does not produce better results. Consequently, we focus on the performance of single agents.
The most important novelty of our proposal is that the agents are not Markovian random walkers, nor a modified version of random walks' dynamics in which additional rules (for instance, preferential or self-avoiding random walks [13,21]) are introduced. Indeed, we introduce a parameter q which governs how likely it is for a walker, at each time step, to go forward or backward (with respect to the walker's home). Thus, by changing the value of this parameter, the two probabilities can be tuned and hence different strategies are defined. In one limiting case, the walkers will tend to move back home, whereas in the other limiting setting, they will tend to move away from home. In between these two asymptotic behaviors, we recover a classical random walk, for which all directions are equally probable. We explore different strategies and their dependencies with both the degree of the home nodes and the walkers' lifetimes. Moreover, we show that it is possible to build up an adaptive algorithm whose efficiency in terms of the information gathered and the quality of the reconstructed network is, in general, the best.
The rest of the paper is organized as follows. Section II introduces the model, which is characterized in Secs. III and IV. Our proposal for an adaptive strategy is presented in Sec. V. In Sec. VI we present the application of the algorithms previously discussed to the reconstruction of the degree distribution. Finally, the last section (Sec. VII) is devoted to rounding off the paper.

II. BASELINE MODEL OF WALKERS
Let us first discuss a baseline model in which a given set of walkers explores the network starting from a home node. As previously discussed, for the results of the walkers' explorations to be collected, the walkers must go back home. Therefore, we introduce two probabilities when the walker is at a given node, provided it has tracked the information about the path followed from the home node to the current position. These two probabilities correspond to the forward (F) and backward (B) motion along the already tracked path and are given by Eqs. (1) and (2), respectively, where the label i indicates the node that the explorer is going to leave and k_i is its degree. These equations hold for every step whenever the agent is not at the starting node (the home h). While at home it can only go forward, thus at that position we have P_F^h = 1 and P_B^h = 0. Figure 1 shows an example of the motion of an agent.
From Eqs. (1) and (2), we recover the pure random walk (without any bias, i.e., all possible directions are equally probable) for q = 1. For very large values of the parameter q, no backward step is allowed. Consequently the explorers can get back to their starting node only by chance, through a different path, not being aware that they are coming back, but being able to recognize where they are (at home). Conversely, when q goes to 0, after the first move, no more steps forward are allowed. Therefore, only the first neighbors of the starting node can be explored. We also consider that the walker's lifetime is T steps, which represents the time allowed for the network exploration before the dynamics stops. We define the information gathered as the fraction of nodes marked as visited after T time steps: I = V /N, where V is the number of visited nodes and N is the size of the network. Moreover, if the agent is not at home at time T , the new nodes visited after its last return to the home node are not computed in V (i.e., we consider that only the information brought counts).
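The dynamics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the functional form of the step probabilities is an assumed stand-in (the ratio P_F/P_B = (k − 1)q² is chosen because it reproduces the unbiased walk at q = 1, forbids backward steps as q → ∞, forbids forward steps as q → 0, and gives P_F = P_B at q = 1/√(k − 1), as stated later in the text). The names `step_probabilities` and `explore`, and the choice to count the home node itself as visited, are hypothetical.

```python
import random


def step_probabilities(k, q):
    """Return (P_F, P_B) for a walker leaving a non-home node of degree k.

    ASSUMPTION: the ratio P_F / P_B = (k - 1) * q**2 is a stand-in chosen to
    match the properties stated in the text; it is not quoted from the paper.
    """
    forward_weight = (k - 1) * q * q
    total = forward_weight + 1.0
    return forward_weight / total, 1.0 / total


def explore(adj, home, q, lifetime, rng):
    """Simulate one walker and return the set of nodes reported back home.

    adj maps each node to a list of neighbours (connected simple graph assumed).
    Nodes seen after the last return home are not reported.
    """
    path = [home]                      # tracked path from home to current node
    reported, pending = {home}, set()  # pending = seen since the last return
    for _ in range(lifetime):
        node = path[-1]
        if node == home:
            path.append(rng.choice(adj[home]))   # at home: only forward moves
        else:
            previous = path[-2]
            forward = [n for n in adj[node] if n != previous]
            p_forward, _ = step_probabilities(len(adj[node]), q)
            if forward and rng.random() < p_forward:
                path.append(rng.choice(forward))
            else:
                path.pop()                       # retrace one tracked step
        if path[-1] == home:
            reported |= pending                  # information brought home
            pending.clear()
            path = [home]                        # home is recognized on arrival
        else:
            pending.add(path[-1])
    return reported
```

With such a sketch, the information gathered would be `len(explore(adj, home, q, T, rng)) / N`, and scanning q for fixed T gives a rough picture of how I depends on the bias.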
We first discuss the expected behavior of I at the two limiting values of q (very high or very small). On the one hand, for very low q values only the nearest neighbors are visited and hence I will be small independently of T. On the other hand, for very large values of q the walkers only return home by chance, so the search is also inefficient provided the exploration time is not very large (see next section). Then, if we fix the total number of steps, we can expect that the information collected will have a maximum as a function of q. Therefore, there should exist, for any given network, a precise value q*(T) such that, if we average over all the possible choices of the home node and over many realizations of the dynamical exploration, the mean information ⟨Ī(q*)⟩ is maximal. In other words, there is no other value q′ for which ⟨Ī(q′)⟩ > ⟨Ī(q*)⟩, where "⟨·⟩" stands for the mean performed over all the nodes in the network and the overbar for the average over many realizations.
The previous analysis indicates that the best efficiency in terms of the maximal recovery of information can only be obtained for two values of q*. In the next section, we explore the dependency of I on the network properties (as given by the degree of the home node) and walkers' lifetimes. Indeed, when this time is very long (T ≫ N) we should expect to recover most information by setting q* → ∞. However, even if this is the best choice on average, it might not be the case when the home of the walker is at a low-degree node. On the other hand, for shorter searching times, a value of q = q* < 1 gives almost the same performance for I, but this time the results are independent of the degree of the home node and I(q*) is a global maximum (the caveat is that q* cannot be known a priori).

III. CHARACTERIZING THE PERFORMANCE OF THE WALKERS
In this section we study the dependency between the information gathered by an agent and q, for different choices of the home node and for different values of the walkers' lifetimes T. Hereafter we will use as a benchmark a scale-free network of N = 10^4 nodes and mean degree ⟨k⟩ = 10 generated by the uncorrelated configuration model [22]. We note, however, that all the results reported are valid for any network with a power-law degree distribution provided that it does not have a tree-like topology. Actually, the only relevant difference in the case of a tree-like network is that we will observe a different behavior for large values of q. This is because leaves would make it very difficult for a walker to come back through a different path, making its performance very poor, even for very large values of T and for very large degrees of the home nodes.
In Fig. 2 the information I is plotted as a function of q for several home nodes and a searching duration of T = N = 10 000 steps. As clearly shown, starting from small values of the parameter q, I initially increases, but soon afterwards there is an abrupt decay, which gives way to a new increase as q grows further. For very large values of q, the information gathered saturates to an asymptotic value. Interestingly enough, as seen in the figure, the amount of information gathered both for very small values of q and for q ≫ 1, as well as the size of the abrupt decay, depends on the degree of the node from which the walker started the exploration. However, there exists a universal value q = q_p at which almost all curves corresponding to different degrees of the home node collapse (i.e., there is a local maximum which is roughly independent of the connectivity of the home node). Nevertheless, whether this point is also a global maximum of I(q) or just a local one depends on the degree of the initial node. Indeed, when the home node is highly connected, for this searching duration, an agent performs better for q → ∞, but if this is not the case, q_p gives the optimum efficiency.
In Fig. 3 we plot the same quantity as in Fig. 2 but averaged over all the possible home nodes (so that the dependency on the degree washes out) and considering different lifetimes T. The figure makes it clearer that at q = q_p the value of I is a global maximum unless T is many times larger than the network size N. This means that if we are interested in the information an agent may gather for a very long searching time, what we have to do is to set q ≫ 1. Otherwise, if we are interested in more realistic situations where there can be limitations on the duration of the exploration (for instance, due to energy constraints), the best choice would be to set q = q_p. The latter option has a caveat, however: The precise value of q_p depends in an unknown way on the topological features of the underlying network. Nevertheless, one can obtain useful insights into the problem by inspecting how the behavior of a walker changes when q varies.
Looking more carefully at the results plotted in Fig. 3, one can distinguish three regions that qualitatively correspond to three distinct behaviors of the walker. In the first one, for q < q_p, I increases monotonically as a function of q; in the second one, I experiences an abrupt decay; whereas the third region shows that I starts to increase again, until it saturates to a value that depends on T. It is easy to realize that the first increase corresponds to small enough values of q. In this region, the walker moves just a few hops away from home and consequently it takes only a few steps to get back home. The larger the value of q is, the longer the mean path covered by the walker will be. Since for very small values of q the exploration is local, the relevance of the home-node degree is very high (see Fig. 2). Then, by increasing q, we allow the walker to explore farther nodes, that is, to collect new information, and the initial differences due to the degree of the home node become progressively smaller. At q = q_p they have almost vanished.
In the second region, for q slightly larger than q_p, the walker often gets lost and its performance is, on average, less efficient. In other words, the explorer wastes an important fraction of the lifetime T gathering information that it will not be able to bring back home before the time is over. The precise value at which this starts to occur is slightly affected by the duration of the exploration, as shown in the inset of Fig. 3. This can be explained as a combination of two factors. On the one hand, increasing q means increasing the number of nodes visited, but also the risk of getting "lost." Indeed, if an agent is performing a long trip and is going to bring a lot of information back home, the loss is big when the searching time is suddenly over. On the other hand, the very first trips are those that provide the largest fraction of new information since the majority of nodes are being visited for the first time. Thus, getting lost after only a couple of returns causes a much worse loss than if the same happens after a larger number of round trips. Again it is a matter of balance, and the optimum value q_p is smaller when the lifetime is shorter. The second region ends at a value of q for which the previous balance is the worst possible one, thus giving rise to another increase, which marks the start of the third region. Here, for even larger values of q, it becomes quite frequent that, wandering across the network almost randomly, the explorer returns to its home node through a different path just by chance. This new behavior entails a new increase of I because this kind of random return starts to balance the inefficiency of the walkers that get lost. The likelihood of these events increases with q and is maximum when q → ∞, that is, when P_B = 0 at each time step.
The previous dependency of I on the walker's lifetime T defines two optimal values for q: I(q) takes its maximum value either at q* = q_p or at q* = ∞. However, we stress again that for q ≫ 1, the walker gets back home by chance (recall that for these values of q the backward probability P_B = 0). Consequently, the asymptotic values of I in the q = ∞ limit strongly depend on the degree of the home nodes (see Fig. 2). Therefore, setting q* = q_p could be a better choice even when T is large enough. To be able to take advantage of the agents' behavior at q_p, we need to characterize in more depth the transition that occurs for that value of the parameter. To this end, in the next section we focus on the behavior of some dynamical quantities which display a relevant change around q_p.

IV. EXPLORATION MECHANISMS AND ESTIMATION OF q_p
In Fig. 4 we plot the average maximum number of sequential steps backward (⟨S_B⟩) and forward (⟨S_F⟩) that a walker takes in a time lag T = 10 000 as a function of q. These two quantities, estimated by averaging over many realizations and over all the possible home nodes, give a useful picture of the transition between the first and the second regimes previously described. They initially increase together; then ⟨S_B⟩ starts increasing more slowly than ⟨S_F⟩, reaches a maximum, and starts decreasing, asymptotically going to zero. Notice that for small q the value of ⟨S_F⟩ is small. Consequently, ⟨S_B⟩ is bounded [even if P_B(k) ∼ 1 ∀k] since, when an agent is back at its home, no more steps backward can be taken. The value of q for which ⟨S_B⟩ and ⟨S_F⟩ take the maximum value before separating roughly corresponds to q_p. It is the point at which the walker goes as far as possible from its starting point while still being able to retrace its own steps. Increasing q a little further causes the number of steps forward to exceed that of steps backward, and the home node is not recovered any more, so that the searching efficiency rapidly decreases. This phenomenology helps us to find a heuristic definition for the peak. It is indeed possible to state that q_p is the precise value of q for which a walker is allowed to take enough steps forward to be able to visit a large region of the network, but at the same time is also allowed to take enough steps backward so as to return to its home not by chance.
Indeed, it is possible to translate this heuristic statement into a quantitative condition starting from one simple observation. There exists, for any k, a value of q such that P_F(k) = P_B(k), and from Eqs. (1) and (2) we know that this value is q(k) = 1/√(k − 1). If q = q(k_max), it is guaranteed that P_F ≤ P_B ∀k, so the mean path is short and the explorer will come back home very often. If q = q(k_min), the situation is the opposite, P_F ≥ P_B ∀k, so it is very difficult for the agent to recover its home. Therefore, the conclusion is that the peak lies between these two extremal values. A reasonable estimation could be obtained by imposing that P_F/P_B = 1 on average while an explorer walks around. At each time step, the probability that a walker is on a node of degree k is P_w(k) = k p(k)/⟨k⟩, where p(k) is the degree distribution of the considered network. Hence, this condition can be rewritten as Eq. (3), from which we obtain the estimator q* of Eq. (4). In Fig. 5 we have plotted q* against q_p for several networks of different sizes and different topologies (degree distributions), finding a very good agreement in all of the considered cases. It can be confirmed that the precise value of q_p only depends on the first and second moments of the degree distribution, while no explicit dependence on the network size can be observed, at least for finite N.
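As a rough numerical illustration of the balance condition above, the following Python sketch computes an estimate of q_p from a degree sequence. It rests on an assumption not stated explicitly in this text, namely that the ratio of step probabilities has the form P_F/P_B = (k − 1)q² (consistent with the stated q(k) = 1/√(k − 1)); under that assumption, averaging the ratio with weights k p(k)/⟨k⟩ and setting the result to 1 yields q*² = ⟨k⟩/(⟨k²⟩ − ⟨k⟩). The function names `q_balance` and `q_star` are hypothetical.

```python
from math import sqrt


def q_balance(k):
    """q at which P_F(k) = P_B(k), i.e. q(k) = 1 / sqrt(k - 1) (requires k > 1)."""
    return 1.0 / sqrt(k - 1)


def q_star(degrees):
    """Estimate of q_p from the first two moments of a degree sequence.

    ASSUMPTION: P_F / P_B = (k - 1) * q**2, so that the degree-weighted
    average of the ratio equals 1 when q**2 = <k> / (<k^2> - <k>).
    """
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k2 = sum(k * k for k in degrees) / n
    return sqrt(mean_k / (mean_k2 - mean_k))
```

For a homogeneous degree sequence the estimate reduces to q(k), and for a heterogeneous one it falls between q(k_max) and q(k_min), in line with the argument in the text.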
To complete this phenomenological picture, it is useful to look at another quantity closely related to what we have said in the previous paragraphs. In Fig. 6 we plot the mean time that a walker needs to come back to its home node (⟨T_R⟩) as a function of q. In this case we did not set a lifetime. We let N_w walkers wander through the network starting from a given node. Each time an agent recovers its home it is not allowed to leave it anymore and the duration of the trip is recorded. We wait until every walker has come back and calculate the average return time T_R^(i), where i ∈ [1, N] stands for the considered home node. Finally, we average over all the possible starting nodes. What we obtain is a curve that closely resembles that of the order parameter in a second-order phase transition, with the critical point located slightly above q_p. Furthermore, if we look at the dispersion of the values of T_R^(i), we recover a behavior quite similar to that of the susceptibility (i.e., a divergence at the critical point [23]). Actually, the divergence takes place very close to q_p, but even closer to the value of q for which the average number of consecutive backward steps is maximum (see the inset in Fig. 4). With a slightly larger value of q, the return time starts to increase rapidly, with a corresponding abrupt increment in the dispersion. This indicates that, even if the trip duration is usually short, sometimes, with a probability that increases with q, the agent needs a very long time to reach its home node. Thus, a lot of information is lost if the process is stopped before the explorer is able to complete its last journey. Again, this scenario corroborates the intuition that q_p is the maximum value of q able to guarantee that the walker will not get lost.

V. ADAPTIVE STRATEGY
For any given network we are now able to predict where the peak is located, given the first and the second moments of the degree distribution. However, we are interested in developing a searching strategy that can be useful when we have no information at all about the underlying topology. In this section, we set up an adaptive algorithm aimed at optimizing the performance of an agent exploring a heterogeneous network in a number of steps T that is equal to (or less than) the number of nodes N. The basic idea is simple. We have a walker and a value of q associated with it. We let it wander, and when it is at home again we evaluate the contribution of this last round trip to the information gathered until that moment and, if necessary, the value of q is modified. To build up such an algorithm, three main elements are needed. The first one is an appropriate quantitative way to evaluate the performance of the agents. The second one is a criterion to decide whether or not q should be modified. Finally, the adaptive rule applies whenever the choice is to change the value of q. This third element is an algorithm able to connect what the agent has learned about the network until its last return, the efficiency of its performance, and the current value of q in order to provide a new, more suitable, value for the parameter.
Let us start with the first element. Since the aim of the exploration is to collect the maximum amount of information in a fixed time frame, to be efficient means to visit as many new nodes as possible per unit of time (step). The final efficiency of a searching process can thus be defined as E = I/T. This definition can be expressed as a function of the number of round trips. If we indicate with t_r the time of the rth return of the explorer (0 < t_1 < t_2 < . . . < T), we have E(t_r) = V(t_r)/t_r, where V(t_r) stands for the number of visited nodes after t_r steps [V(t_r)/N = I(t_r)]. It is also possible to measure the efficiency of a single trip as e_r = [V(t_r) − V(t_{r−1})]/(t_r − t_{r−1}), but this is not a very useful procedure as e_r is very noisy. Therefore, to compare the performance at time t_r with that at time t_{r−1} it is better to consider the efficiency variation ΔE(t_r) = E(t_r) − E(t_{r−1}). Hence, a good criterion to decide whether a change of q is needed is ΔE(t_r) < 0.
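The efficiency bookkeeping just described can be illustrated with a minimal Python sketch. It assumes that the return times t_r and the cumulative counts V(t_r) have already been recorded; the function names `efficiencies` and `should_update_q` are hypothetical and not taken from the paper.

```python
def efficiencies(return_times, visited_counts):
    """E(t_r) = V(t_r) / t_r for each recorded return of the walker."""
    return [v / t for t, v in zip(return_times, visited_counts)]


def should_update_q(eff):
    """True when the latest efficiency variation E(t_r) - E(t_{r-1}) is negative."""
    return len(eff) >= 2 and eff[-1] - eff[-2] < 0
```

For instance, returns at t = 2, 6, 12 with V = 1, 3, 5 give efficiencies 0.5, 0.5, and about 0.42, so the criterion is met only at the third return.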
Notice that if we start with a small value of q, the number of steps forward and backward will be the same (see Fig. 4) and the explorer will pass through each visited node at least twice. Therefore, the first return time t_1 will be twice the number of steps ahead that the walker was allowed to take. The maximum number of different nodes that the agent may have visited during its trip is therefore equal to the number of steps it took forward. This happens whenever the walker does not cross each link more than twice (forward and backward). Thus, the efficiency has an upper bound, E ≤ 1/2, that can be easily reached for any small value of q when the explorer performs its first trip. In particular, for q = 0 we surely have E(t_1) = 1/2 since only one step forward is allowed and the agent will visit one node in two time steps. Consequently, we expect E(t_r) to start from a value very close to 1/2 and then necessarily decrease. Hence, changing q has the effect of decelerating the decay of E(t) or, at most, of making E(t) reach a stationary value.
When q is varied, we should also take special care not to let the agent get lost. For this reason, since the real value of q_p is unknown, we need to start from a very small value of q to ensure that q ≪ q_p. Then, we want to let q increase in a controlled way. Hence, we need to fix an upper bound for q based on the information the agent is recovering about the degree of the visited nodes. A first, very simple choice could be to use the estimator of q_p provided by Eq. (4). One can replace the probabilities P_w(i) = k_i p(k_i)/⟨k⟩ with the visit frequencies F_w(i) = n_i/t, where i is the node index, n_i the number of times the walker has visited that node, and t the elapsed time (number of steps). Thus we obtain the empirical estimator q*_e, where V_t is the number of visited nodes at time t. Obviously, when t → ∞ this empirical estimator is equal to the estimator (4). The problem, however, is that this is not an upper bound. Actually, the degree distribution the walker recovers during the first trips is very noisy and q*_e fluctuates a lot. In some cases, it takes values considerably smaller than the real q_p. This would prevent the explorer from increasing q, trapping it in the neighborhood of the starting node. Therefore, we need a quantity that satisfies the following requirements.
(1) It has to be less noisy than q*_e.
(2) It must be very unlikely to take values smaller than q_p.
(3) When evaluated over the whole network, its value has to be close to that of q_p.
(4) It has to be the same as q_p and q*_e when we consider a homogeneous network.
To satisfy the first requirement, we need to avoid to use the frequencies F i taking into account all the visited nodes with the same weight, regardless of how many times they have been visited. So we are looking for an appropriate function f ({k i }) of the degrees of the visited nodes, such that We propose the following expression that satisfies all the requirements: Notice that in general k 2 k 2 / k , where the equality holds in the case homogeneous networks. Therefore, we have With all the previous remarks, the adaptive algorithm can be formulated as follows (see Fig. 7).
(1) Set q ∼ 0.1 and let the agent perform its first round trip.
(3) Calculate the new value of the efficiency and check if ΔE(t_r) < 0. If it is not the case, let the agent explore again, until the condition ΔE(t_r) < 0 is satisfied.
(5) Check if q + dq < q_UB, where dq is a small positive quantity (in general dq = 0.01 is a good choice).
(6) If the condition (5) is satisfied, update the value of q adding dq: q → q + dq.
(7) If the condition (5) is not satisfied, but q < q_UB < q + dq, update the value of q so that q → q_UB.
(8) If q > q_UB then (8a) if q − dq < q_UB then q → q_UB, (8b) if q − dq > q_UB then q → q − dq.
FIG. 7 (caption, partial). Here the efficiency E takes the initial value E_0 = 1. The notation is simplified to make the diagram easier to read: E(t_r) is indicated as E_r and V(t_r) just as V.
Figure 8 shows results for the final efficiency E(T) and the information gathered I(T) for the three best strategies: the adaptive one, q → ∞, and q = q_p (although this is not really a strategy since we need to know the precise value of q_p). Both quantities confirm that, unless T is more than twice the network size N = 10 000, the best performance is obtained for q = q_p. Nevertheless, our adaptive strategy gives results that are very close to those obtained for q = q_p and are always better than those obtained for q ≫ 1 (at least for T ≤ 2N) both in terms of efficiency and in terms of the total amount of information recovered.
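Steps (5) to (8) of the adaptive rule listed above can be condensed into a single update function. The Python sketch below is only an illustration: it takes the current upper bound q_UB as a given input (how that bound is computed from the visited nodes is not reproduced here), it is meant to be called only when the efficiency criterion of step (3) is met, and the name `update_q` is hypothetical.

```python
def update_q(q, q_ub, dq=0.01):
    """Return the new value of q given the current upper bound q_ub.

    Mirrors steps (5)-(8): grow q by dq while staying below q_ub, snap to
    q_ub when within dq of it, and shrink by dq when q exceeds q_ub by more.
    """
    if q > q_ub:                                  # step (8)
        return q_ub if q - dq < q_ub else q - dq  # (8a) snap, (8b) decrease
    if q + dq < q_ub:                             # steps (5) and (6)
        return q + dq
    return q_ub                                   # step (7)
```

Starting from q = 0.1 with q_UB = 0.5 and dq = 0.01, repeated calls raise q in steps of 0.01 until it reaches the bound and then hold it there; if the bound later drops, q follows it downward.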
All these results are coherent with the description of the walkers' behavior given in the previous section. In particular, it is reasonable that when q ≫ 1 the efficiency initially increases with T since in this case the shorter the searching duration, the larger the probability that an agent gets lost. On the contrary, for q_p and the adaptive strategy, which precisely aims at capturing the behavior of the agents at q_p, the information is mainly collected by means of quite short round trips. Consequently, increasing the searching time reduces the efficiency because it increases the chance of visiting the same nodes many times. In any case, when T ≫ N and I ∼ 1, the problem of visiting already visited nodes becomes relevant also for the strategy q ≫ 1. Finally, it is worth stressing that while for q ≫ 1 the dispersion among the values E^(i) and I^(i) for different home nodes is very high, the same does not happen in the case of the other two strategies. This is a clear indication of the fact that the adaptive strategy recovers one of the most interesting features of the agents' behavior at q_p, namely, the homogeneity of the performance starting from different home nodes. We next discuss one potential application of the searching strategies previously discussed. This will also allow for a better distinction of which strategy is the best.

VI. RECOVERING THE DEGREE DISTRIBUTION
An important global descriptor of every network is its degree distribution P(k). However, this information is not always at hand. For instance, suppose you belong to a network of which you only know your local neighborhood (like an online social network or a city map). The problem is then to know what your position in the network is as far as the degree is concerned, or to make an exploration that allows you to gather information about the entire map. In other words, we want to study whether the sample of nodes visited by an agent is more or less representative of the global system, at least with regard to its P(k).
In Fig. 9 we plot the number of nodes of degree k, N(k), found in a typical realization of the different strategies, for two different values of T and for different choices of home nodes. As we expected, the usual random walker and the agent with q ≫ 1 perform very badly when the home node has a small degree (red curves). On the contrary, the performance of the adaptive protocol and that of the walker when q = q_p are almost unaffected by the walkers' lifetimes (at least for the considered values) and by the degree of the home node. Note that Figs. 9(a) and 9(b) represent the common situation in which a random walker starting at a poorly connected home node gets lost. Indeed, for such cases, the only information brought back is the degree of the node from which the walker started the exploration of the network. However, as can also be seen in the figure, when the home node has a relatively high degree, setting q ≫ 1 constitutes the best strategy for an accurate estimation of N(k). Nevertheless, as the walker "does not know" what the connectivity of its home node is in relation to the rest of the network, the last mentioned strategy seems to be, as a rule of thumb, a bad choice. Additionally, the figure also shows that, in general, what is difficult for an agent to recover are the most peripheral nodes of the network. Consequently, the nodes with a small degree are usually underrepresented while the heavy tail of the degree distribution is reconstructed with high accuracy.
To quantify the accuracy of the reconstructed networks, we calculate the Kullback-Leibler (KL) divergence or relative entropy [24], a nonsymmetric measure of the difference between two probability distributions. This is a standard method to evaluate how different an experimentally estimated distribution is from the real one. For the probability distributions P and Q of a discrete random variable, their KL divergence is defined to be

$$D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{k} P(k) \log \frac{P(k)}{Q(k)},$$

where P(k) is the real distribution and Q(k) the estimated one. Using this measure, we explore how the accuracy of the reconstruction depends on the searching strategy, the walkers' lifetimes, and the degree of the home node. In what follows, we report results for the mean values, averaged over many realizations, and for the deviations around the means. In Figs. 10(a) and 10(b), we plot D_KL as a function of T for four different strategies (including the standard random walk) and two different starting nodes (corresponding to maximum and minimum degree, respectively). As expected, D_KL → 0 when T → ∞ in all the considered cases. The q = q_p strategy and the adaptive protocol perform much better than the other two settings, with less dispersion and a much weaker dependence on the degree of the home node. Hence, these last two strategies are better suited if we aim at recovering P(k), especially when T is not too long. Moreover, even though both are good, the adaptive strategy is better than q = q_p, with a very small dispersion. Thus, although it is not possible to perform better than q = q_p in terms of nodes visited, the adaptive strategy does better in terms of the accuracy of Q(k), that is, when it comes to reconstructing P(k).
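The KL divergence between a real and an estimated degree distribution can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the toy degree counts are hypothetical, the natural logarithm is an assumption (the base only rescales the result), and returning infinity when Q(k) = 0 for some k with P(k) > 0 is the strict mathematical convention rather than a choice documented in the text.

```python
import math


def normalize(counts):
    """Turn degree counts N(k) into a probability distribution over k."""
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()}


def kl_divergence(p, q):
    """D_KL(P || Q) = sum_k P(k) * log(P(k) / Q(k)), natural logarithm.

    `p` is the real distribution and `q` the estimated one, both given
    as dicts mapping degree k to probability. Terms with P(k) = 0
    contribute nothing. If Q(k) = 0 where P(k) > 0 the divergence is
    infinite (assumed convention, see the lead-in).
    """
    total = 0.0
    for k, pk in p.items():
        if pk == 0.0:
            continue
        qk = q.get(k, 0.0)
        if qk == 0.0:
            return math.inf
        total += pk * math.log(pk / qk)
    return total


# Hypothetical degree counts: full network vs. a walker's sample.
p_real = normalize({1: 4, 2: 2, 4: 1})
q_est = normalize({1: 1, 2: 1, 4: 1})

d_kl = kl_divergence(p_real, q_est)
```

Because the measure is nonsymmetric, `kl_divergence(q_est, p_real)` generally gives a different value, so the real distribution should always be passed as the first argument.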
We have also analyzed the dependence of D_KL on the degree of the home nodes for fixed values of T. Figures 10(c) and 10(d) display D_KL as a function of k for all the strategies, in the case of a short lifetime (T = 2000). The differences among strategies are noteworthy there, whereas for the larger lifetime (T = 10 000) we verified that they persist only for quite small degrees of the home nodes. Finally, the adaptive strategy is in general the best option, with q = q_p being slightly better only for home nodes with degree k < ⟨k⟩.

VII. CONCLUSION
In this paper, we have presented a model for network search and exploration in which walkers evaluate at each time step whether to go farther from a home node or to return with the information retrieved up to that moment. These probabilities depend on a single parameter q, which has been shown to exhibit an optimal value, q = q_p < 1 (q = 1 corresponds to the Markovian random walk limit), for exploration times comparable to the system size. When the walkers are allowed to explore the network indefinitely or for long times, the optimal value turns out to be q = ∞. However, although the amount of information recovered when setting q = ∞ could be maximal, the results are highly dependent on the degree of the home node: the smaller the degree of the node assigned to the walker, the less information the walker can get back home. As a matter of fact, for most of the nodes (recall that in a scale-free network most of the nodes are poorly connected), q = ∞ is not the best strategy.
Capitalizing on the behavior of the walkers as a function of q, we have also proposed an alternative algorithm in which the agents are allowed to tune the value of the parameter q to optimize the information retrieved. Through numerical simulations, we have shown that this mechanism allows an exploration as efficient as that performed by setting q = q_p. Moreover, the adaptive scheme has the advantage that the value of q is changed dynamically, and therefore it overcomes the problem of fixing an a priori unknown optimal value q_p. We believe that this adaptive search protocol could be a valuable addition to the current literature, as it performs optimally with minimal (local) information about the network structure.
As a demonstration of the potential of the algorithms explored in this work, we have made use of the different searching strategies to address the problem of network discovery. As expected, the adaptive mechanism is the one whose performance, in terms of the quality and quantity of the information retrieved, is the best. Whether or not these kinds of strategies can be further developed and applied to the exploration of real networks is beyond the scope of the present paper, but we identify at least two scenarios in which they can be useful: the discovery of new connections in communication networks and the exploration of planar networks (e.g., city networks) using minimal local information. We therefore hope that our work will guide future research along these lines.

ACKNOWLEDGMENTS
This work has received financial support from Spanish MICINN (Grants No. FIS2009-13730 and No. FIS2011-25167), from Generalitat de Catalunya (2009SGR00838), from the FET-Open project DYNANETS (Grant No. 233847) funded by the European Commission, and from Comunidad de Aragón (FMI22/10). L.P. was supported by the Generalitat de Catalunya through the FI Program and also acknowledges the hospitality of ISI Torino, where part of this work was performed. We are also pleased to acknowledge the referee's comments, which have improved the paper.