A model for interevent times with long tails and multifractality in human communications: An application to financial trading

Social, technological and economic time series are divided by events which are usually assumed to be random, albeit with some hierarchical structure. It is well known that the interevent statistics observed in these contexts differ from the Poissonian profile: they are long-tailed, with resting and active periods interwoven. Understanding the mechanisms that generate such statistics has therefore become a central issue. The approach we present is taken from the Continuous Time Random Walk formalism and represents an analytical alternative to the models of non-trivial priority that have recently been proposed. Our analysis also goes one step further by looking at the multifractal structure of the interevent times of human decisions. We here analyze the inter-transaction time intervals of several financial markets. We observe that the empirical data display a subtle multifractal behavior. Our model explains this structure by taking the pausing-time density in the form of a superstatistics where the integral kernel quantifies the heterogeneous nature of the executed tasks. A stretched exponential kernel provides a multifractal profile valid over a certain limited range. A suggested heuristic analytical profile is capable of covering a broader region.


I. INTRODUCTION
The dynamics of many complex systems, not only in the natural sciences but in economic and social contexts as well, is usually presented in the form of time series. These series are frequently separated by random events which, in spite of their randomness, show some structure and apparently universal features [1,2,3,4,5,6]. During the last few years there have been endeavors to explain the sort of actions involved in interhuman communication [1,7]. According to this framework, decisions are taken based on a queuing process, and the resulting models are claimed to be valid for a wide range of phenomena such as correspondence among people, consecutive visits to a web portal, or even transactions and trading in financial markets [5].
The main conclusion of these studies is that, in order to reproduce the empirical observations as well as to account for the heterogeneous nature of outgoing tasks, the timing decision has to adopt a rule of non-trivial priority. Otherwise, the implementation of, for instance, the simple rule "first-in-first-out" leads to Poissonian timing between consecutive outgoing events, and this deviates from many empirical observations.
One convenient frame to approach these phenomena is provided by the Continuous Time Random Walk (CTRW). Within this frame one is basically concerned with the appropriate description of ψ(t), the so-called pausing-time density (PTD), which gives the probability of having a certain time interval t between two consecutive events. Many empirical PTD's present long-tailed profiles suggesting a self-similar hierarchy in the entire probability distribution. Following this indication some authors [5] claim that the slow decay of the PTD obeys a power-law ψ(t) ∼ t −δ whose exponent is almost universal in the sense that it seems to adopt only two different values δ = 1 and δ = 3/2 [5]. In the next section we will present a simple approach which gives a power-law reproducing these exponents.
Besides the PTD which doubtlessly provides maximal information on interevent statistics, the deep structure of the fractal hierarchy is perhaps more easily unveiled by looking at the q-moments of the interevent times instead of solely observing the PTD tails. One is thus able to answer questions such as whether the process is monofractal or multifractal and if there eventually exist different regimes depending on the value of q (the order of the q-moment). This information obtained from data can afterwards guide us to find out the main ingredients of a more refined theoretical model for human decision dynamics. This is certainly the chief motivation of this work.
Herein we propose an alternative framework to the existing ones -which are basically based on queuing processes- but which still considers the heterogeneous nature of the executed tasks. Within our approach it is possible to deal with analytical expressions, not only simulations, and we believe we provide good tools to describe the more subtle structure arising from the q-moments.
The approach we propose has its roots in physics and is reminiscent of the Mixture of Distributions Hypothesis in finance, which can be traced back to the 1970s [8], of the variational principle for energy dissipation distributions at different timescales in turbulence in the 1990s [9], and of superstatistics and nonextensive entropy [10]. In fact, the PTD ψ(t) was first introduced within the CTRW model, which was originally established by Montroll and Weiss [11,12,13].
Under this very general setting, the present development has been inspired by the work of Scher and Montroll [14] who in 1975 proposed the so-called "valley model" to describe the power-law relaxation of photocurrents created in amorphous (glassy) materials. We shall use the same idea but in a completely different context.
The paper is organized as follows. In Sect. II we present the fundamentals of Scher and Montroll's model and apply it to explain the emergence of long-tailed distributions in the pausing-time statistics. In Sect. III we address the question of the moments of the interevent times and obtain the conditions for the multifractal behavior of such moments. In Sect. IV we test multifractality on large financial data sets. Conclusions are drawn in Sect. V and some technical details are in the Appendix.

II. THE VALLEY MODEL AND THE PAUSING-TIME DENSITY
Scher and Montroll's "valley model" proposes a conditional PTD ψ(t|ε) as the starting distribution. This conditional density accounts for the probability that a given carrier is trapped during a time interval t within a potential well of depth ε. After this time interval has elapsed the carrier jumps to another potential valley. It is next assumed that the energy ε is a random variable described by a density ρ(ε) [14,15]. We thus have a "superstatistics" with the unconditional pausing-time density ψ(t) given by

ψ(t) = ∫_{-∞}^{∞} ψ(t|ε) ρ(ε) dε. (1)

The conditional PTD is assumed to be of the simple exponential (Poisson) form [14]

ψ(t|ε) = (1/τ(ε)) e^{-t/τ(ε)}. (2)

This choice is quite reasonable since, for a given ε, the emerging statistics is homogeneous because all occurrences have the same origin and consequently share an identical characteristic time scale τ(ε). Scher and Montroll also assume that the relationship between the random energy ε and the characteristic time of the distribution is given by the simple exponential form

τ(ε) = τ_0 e^{βε}, (3)

where τ_0 = τ(0) and β^{-1}, a fundamental constant of the model, is measured in units of energy. We should note that in Scher and Montroll's approach β^{-1} = K_B T is the thermal energy of the environment at temperature T (K_B being the Boltzmann constant).
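The two-step construction of Eqs. (1)-(3) is straightforward to simulate: draw a valley depth ε from ρ(ε), form τ(ε) = τ_0 e^{βε}, and then draw an exponential waiting time with that scale. A minimal sketch in Python (NumPy assumed; the function name and the numerical values are our own illustration, not taken from the paper):

```python
import numpy as np

def sample_interevent_times(rho_sampler, beta=1.0, tau0=1.0, n=100_000, seed=0):
    """Superstatistics draw: eps ~ rho(eps), then t ~ Exp with mean tau0*exp(beta*eps)."""
    rng = np.random.default_rng(seed)
    eps = rho_sampler(rng, n)            # valley depths
    tau = tau0 * np.exp(beta * eps)      # characteristic scale, Eq. (3)
    return rng.exponential(tau)          # Poissonian conditional PTD, Eq. (2)

# "first-in-first-out" limit: every valley has the same depth mu = 0.5,
# so rho(eps) = delta(eps - mu) and the sample mean should approach tau0*e^{beta*mu}
times = sample_interevent_times(lambda rng, n: np.full(n, 0.5))
print(times.mean())   # close to e^{0.5} ≈ 1.65
```

Replacing the delta-like sampler by a uniform or Laplace one reproduces the heterogeneous cases discussed next.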
We remark at this point that the valley model is consistent with the most basic properties of a queuing process recently addressed by Vazquez et al [5]. Indeed, in that process a set of incoming messages, or tasks, arrives at random. To these messages a certain priority labeled by ε is attached. The execution time of a given task with priority ε is described by the conditional density ψ(t|ε). In the most general setting ε is also a random variable characterized by a density ρ(ε). We are thus faced again with the "superstatistics" mentioned above since the timing of the outgoing tasks is governed by the unconditional PTD ψ(t) given by Eq. (1).
In the simplest case of a "first-in-first-out" queue the priority ε = µ has the same value for all tasks, hence ρ(ε) = δ(ε − µ) and the unconditional PTD reads

ψ(t) = (1/τ(µ)) e^{-t/τ(µ)}, (4)

which is a Poissonian density with a single characteristic time scale (the mean time between consecutive outgoing events) τ(µ) = τ_0 e^{βµ}. In terms of decision theory, the situation is comparable to that of having no priority protocol at all [5].
Another particular situation would be to assign priorities in a uniform random manner:

ρ(ε) = 1/(2∆) if |ε| ≤ ∆, and ρ(ε) = 0 otherwise.

In this case, and from Eqs. (2)-(3), we can easily see that

ψ(t) = (1/(2∆βt)) [e^{-t/τ_+} − e^{-t/τ_-}], (5)

where τ_± = τ_0 exp(±∆β) (see Fig. 1). The characteristic time scale τ_c of this model (i.e., the mean time between outgoing events, τ_c = ⟨t⟩) is readily obtained using Eq. (5) and reads

τ_c = (sinh ∆β/∆β) τ_0. (6)
Observe that, when τ_+ ≫ τ_-,

ψ(t) ≈ (1/(2∆β)) t^{-1} e^{-t/τ_+} (t ≫ τ_-).

This truncated power-law with exponent 1 is precisely one of the two universal classes suggested in Ref. [5] for queuing processes, which emerged when the queue had a fixed length. In our model, this simple power-law arises when the random variable ε is uniformly distributed. The second universal class proposed by Vazquez et al [5] is given by a power-law with exponent 3/2, which appears in their simulated series when the queue is supposed to have an arbitrary length. Let us show that within our approach we can obtain arbitrary power-laws of the form ψ(t) ∼ 1/t^δ (δ > 1). To this end we shall assume that the random variable ε follows the Laplace distribution

ρ(ε) = (1/(2σ)) e^{-|ε|/σ}, (7)

where σ = ⟨|ε|⟩ > 0. Plugging into Eq. (1), assuming that the conditional PTD is Poissonian (cf. Eq. (2)) and using the exponential form of τ(ε) given by Eq. (3), we get

ψ(t) = (1/(2στ_0)) ∫_{-∞}^{∞} e^{-|ε|/σ} e^{-βε} exp[−(t/τ_0) e^{-βε}] dε.

The integral appearing on the right-hand side, which runs over the entire real line, can be split into two integrals over the interval (0, ∞), one for each sign of ε. By simple transformations of variables each piece can be expressed in terms of the incomplete gamma functions Γ(α, z) and γ(α, z) [16]. We therefore obtain the exact PTD (see Fig. 1 below)

ψ(t) = (1/(2βσt)) [(τ_0/t)^{1/(βσ)} γ(1 + 1/(βσ), t/τ_0) + (t/τ_0)^{1/(βσ)} Γ(1 − 1/(βσ), t/τ_0)]. (8)

The characteristic time scale τ_c corresponding to this density is shown to be (recall that τ_c = ⟨t⟩, and see Eq. (16))

τ_c = τ_0/(1 − β²σ²).

This characteristic time will exist as long as βσ < 1, i.e., for the "high-temperature phase".
Thus the process has a completely different dynamics according to whether βσ < 1 or βσ > 1. In the former case the system has a finite average time between consecutive outgoing events, while in the latter case, i.e., in the "low-temperature phase", such a time ceases to exist, which means that trapping over the whole observational time span occurs with non-negligible probability. We can say that in this phase the event statistics is dominated by rare and extreme events [17,18].
We may therefore assert that when σ = β −1 the system undergoes a "phase transition".
Note that for the carrier currents of Scher and Montroll's model such a phase transition will occur when the energy fluctuations, σ = ⟨|ε|⟩, equal the environment's thermal energy β^{-1} = K_B T. Let us return to Eq. (8). For long times t/τ_0 ≫ 1 we can write [16]

γ(1 + 1/(βσ), t/τ_0) ≃ Γ(1 + 1/(βσ)), Γ(1 − 1/(βσ), t/τ_0) ≃ (t/τ_0)^{−1/(βσ)} e^{−t/τ_0}.

Hence, neglecting exponentially small terms, we get

ψ(t) ≃ [Γ(1 + 1/(βσ))/(2βσ)] τ_0^{1/(βσ)} t^{−(1+1/(βσ))}, (9)

which is a general power-law of the form 1/t^δ with exponent δ = 1 + 1/(βσ) greater than 1 (recall that β and σ are both positive). Therefore, the appearance in the PTD of long tails in the form of a non-truncated power-law is exclusively due to the assumption of a non-uniform distribution (in the present case the Laplace distribution, Eq. (7)) for the random variable ε. Note that in deriving the power-law (9) no other power-laws had to be imposed at intermediate stages, neither in the conditional PTD, which retains the Poissonian form, nor in the characteristic time scale τ(ε) given by Eq. (3). We also observe that, contrary to the uniform distribution where ε is limited by the fixed length ∆, there is now no upper or lower bound for the random variable ε. We finally remark that one of the cases analyzed in [5], a power-law with exponent 3/2, comes out of Eq. (9) when σ = 2/β.
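The exponent δ = 1 + 1/(βσ) is easy to verify numerically from Eq. (8): with the illustrative choice βσ = 2 the log-log slope of ψ(t) at large t approaches −3/2 (SciPy assumed; parameter values are our own):

```python
import numpy as np
from scipy.special import gamma, gammainc, gammaincc

beta, sigma, tau0 = 1.0, 2.0, 1.0   # beta*sigma = 2, so delta = 1 + 1/2 = 3/2
nu = 1.0 / (beta * sigma)

def psi(t):
    """Closed-form PTD of Eq. (8) for the Laplace kernel."""
    x = t / tau0
    return (x**(-nu) * gamma(1 + nu) * gammainc(1 + nu, x)
            + x**nu * gamma(1 - nu) * gammaincc(1 - nu, x)) / (2 * beta * sigma * t)

# local log-log slope deep in the tail, where Eq. (9) holds
t1, t2 = 1e3, 1e5
slope = (np.log(psi(t2)) - np.log(psi(t1))) / (np.log(t2) - np.log(t1))
print(slope)   # ≈ -(1 + nu) = -1.5
```

Varying βσ shifts the measured slope to −(1 + 1/(βσ)), as Eq. (9) predicts.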

III. PAUSING-TIME MOMENTS
The PTD ψ(t) provides maximal information about the timing between successive events.
There are, nonetheless, other quantities that can more easily unveil hidden but relevant features of the process, such as multifractality. This is precisely the case of the pausing-time moments. Within our framework, based on a superstatistics with some weight function ρ(ε), the q-moments are written as

⟨t^q⟩ = ∫_{−∞}^{∞} ⟨t^q|ε⟩ ρ(ε) dε, (10)

where ⟨t^q|ε⟩ are the conditional pausing-time moments defined by

⟨t^q|ε⟩ = ∫_0^∞ t^q ψ(t|ε) dt. (11)

From Eqs. (2) and (3) we get

⟨t^q|ε⟩ = Γ(1 + q) τ_0^q e^{qβε},

and the moments read

⟨t^q⟩ = Γ(1 + q) τ_0^q ∫_{−∞}^{∞} e^{qβε} ρ(ε) dε. (12)

We remark that the exponent q may be negative. However, for the Poissonian density (2), the conditional moment ⟨t^q|ε⟩, and hence ⟨t^q⟩, will exist only as long as q > −1. Moreover, for the moments to exist, ρ(ε) must decay faster than e^{−|q|β|ε|} as ε → ±∞.
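The conditional moments of the Poissonian kernel, ⟨t^q|ε⟩ = Γ(1 + q)τ(ε)^q, including the q > −1 restriction, can be checked numerically. A sketch (SciPy assumed; the value of τ is illustrative):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

def conditional_moment(q, tau):
    """Numerical <t^q | eps> for the Poisson kernel psi(t|eps) = e^{-t/tau}/tau."""
    return quad(lambda t: t**q * np.exp(-t / tau) / tau, 0, np.inf)[0]

tau = 2.3                                    # tau(eps) for some fixed eps (illustrative)
for q in (-0.5, 1.0, 2.7):                   # q > -1 is required for convergence
    print(q, conditional_moment(q, tau), gamma(1 + q) * tau**q)
```

For q ≤ −1 the integrand t^q ceases to be integrable at the origin, which is the origin of the q > −1 condition stated above.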
Before proceeding, let us briefly recall that the timing process will show multifractality if the q-moments behave as

⟨t^q⟩ ∼ L^{f(q)},

where f(q) is a nonlinear function of q and L is some suitable scale. When f(q) is linear the process is termed monofractal [19].
We will now explore the possible emergence of multifractality depending on the choice of the weight function ρ(ε). The simplest assumption is ρ(ε) = δ(ε − µ), that is, there is no heterogeneity and all the "valleys" have identical depth µ. In such a case we readily obtain the monofractal behavior

⟨t^q⟩ = Γ(1 + q)(τ_0 l)^q, (13)

where

l ≡ e^{βµ}. (14)
We have shown in the previous section that the uniform ρ(ε) and the Laplace ρ(ε) both result in long-tailed expressions for the PTD ψ(t). It is thus natural to ask ourselves whether those two long-tailed distributions (cf. Eqs. (5) and (8)) are also responsible for any multifractal behavior of the q-moments. Substituting them into Eq. (12) we obtain, for the uniform density,

⟨t^q⟩ = Γ(1 + q) τ_0^q sinh(q∆β)/(q∆β), (15)

and, for the Laplace density,

⟨t^q⟩ = Γ(1 + q) τ_0^q/(1 − (qβσ)²) (|q| < 1/(βσ)). (16)

In both cases ln[⟨t^q⟩/Γ(1 + q)] grows, apart from logarithmic corrections, linearly with q, i.e., the behavior is monofractal.
Therefore, none of the two weight functions that result in exact long-tailed PTD's produces a multifractal structure. To get multifractality we should go one step further in the formalism and consider, for instance, a stretched exponential:

ρ(ε) = [1/(2σΓ(1 + 1/α))] exp[−(|ε − µ|/σ)^α] (17)

(α > 0 and σ > 0). Substituting Eq. (17) into Eq. (12) we get, after a simple change of variables, the unconditional moment in the intermediate form

⟨t^q⟩ = Γ(1 + q)(τ_0 l)^q I(q), (18)

where l is defined in Eq. (14) and

I(q) = [1/(2Γ(1 + 1/α))] ∫_{−∞}^{∞} e^{−|y|^α + qβσy} dy. (19)

Looking at this integral we see at once that convergence is assured as long as α > 1. In other words, the unconditional moment ⟨t^q⟩ exists if α > 1 and also if q > −1 (see above).
We can integrate Eq. (19) by means of a power series. In effect,

I(q) = [1/(αΓ(1 + 1/α))] Σ_{n=0}^{∞} [(qβσ)^{2n}/(2n)!] Γ((2n + 1)/α), (20)

which provides an exact expression for the q-moment. However, due to the term Γ((2n + 1)/α) appearing in the numerator, the convergence of the series in Eq. (20) is quite slow and difficult to evaluate from a numerical point of view. Nevertheless, when α = 2 (i.e., for a Gaussian ε) the integral in Eq. (19) can be done exactly in closed form with the result

⟨t^q⟩ = Γ(1 + q)(τ_0 l)^q e^{(qβσ)²/4}, (21)

which clearly shows, for the Gaussian ε, the multifractal behavior of the q-moment: f(q) grows as q². When α ≠ 2 the integral (19) cannot be done exactly and, to elucidate any possible multifractal behavior of the q-moments, we have to resort to approximations. To this end we first define in Eq. (19) a new integration variable x through the change of scale y = (βσ)^{1/(α−1)} x, which turns I(q) into

I(q) = [(βσ)^{1/(α−1)}/(2Γ(1 + 1/α))] ∫_{−∞}^{∞} e^{−λ(|x|^α − qx)} dx, (22)

where

λ = (βσ)^{α/(α−1)}. (23)
Let us suppose that the fluctuations of the random variable ε, represented by σ, are larger than "the thermal energy of the environment" β^{−1}, that is, βσ > 1. In such a case the dimensionless parameter λ is large and we can safely use the saddle-point approximation [20] for the evaluation of I(q). This is done in the Appendix with the result

I(q) ≈ A(q) exp(b q^{α/(α−1)}), (24)

where A(q) is given in Eq. (A9) of the Appendix and

b = (α − 1)(βσ/α)^{α/(α−1)} > 0. (25)
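The accuracy of the leading saddle-point exponent b q^{α/(α−1)} can be gauged against a direct quadrature of I(q) in Eq. (19). A sketch (SciPy assumed; α = 3/2 and βσ = 3 are illustrative choices, for which the agreement is only moderate since λ = 27 and the prefactor A(q) is neglected):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

alpha, beta_sigma = 1.5, 3.0                                     # illustrative, beta*sigma > 1
b = (alpha - 1) * (beta_sigma / alpha)**(alpha / (alpha - 1))    # Eq. (25)

def ln_I(q):
    """ln I(q) by direct quadrature of Eq. (19)."""
    f = lambda y: np.exp(-np.abs(y)**alpha + q * beta_sigma * y)
    val, _ = quad(f, -50.0, 150.0)       # integrand is negligible outside this range
    return np.log(val / (2.0 * gamma(1.0 + 1.0 / alpha)))

q = 2.0
leading = b * q**(alpha / (alpha - 1))   # saddle-point prediction for ln I(q)
print(ln_I(q), leading)                  # close for beta*sigma large
```

Increasing βσ (and hence λ) brings the two numbers progressively closer, as expected from Laplace's method.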
Substituting Eq. (24) into Eq. (18) we get an approximate analytical expression for the q-moment of the interevent time intervals. We write such an expression in a form that clearly exhibits its multifractal (MF) character:

⟨t^q⟩ ≈ Γ(1 + q) A(q) τ_0^q l^q L^{q^{α/(α−1)}}, (26)

where, as before, we have introduced two different scales l and L. The former, l = e^{µβ}, given by Eq. (14), is related to dissipation because it depends on the dissipative term µ of the weight density (17). On the other hand, the scale providing the multifractal behavior, the latter given by

L = e^b,

is responsible for fluctuations, since b, defined in Eq. (25), depends on σ which, in turn, is related to the variance of ρ(ε). Note that all the q-moments considered above obey the normalization condition, i.e., they are equal to 1 for q = 0.
The dissipative and fluctuating scales l and L merge into a single scale when b = µβ.
This equality means that dissipation µ and fluctuation σ are linked by

µ = k σ^{α/(α−1)}, (27)

where k = (1 − 1/α)(β/α)^{1/(α−1)}. For α = 2 (the Gaussian case) this relation reads

µ = (β/4) σ², (28)

which is the analog of the usual fluctuation-dissipation relation. This leads us to look at Eq. (27) as the fractional version of the fluctuation-dissipation theorem suited to the present approach.
Let us finally observe that when the fractional fluctuation-dissipation relation holds, the monofractal and multifractal parts of the q-moment are both governed by the same scale, that is,

⟨t^q⟩ ≈ Γ(1 + q) A(q) τ_0^q L^{q + q^{α/(α−1)}}.

IV. FINANCIAL DATABASE: AN EMPIRICAL ANALYSIS
We shall now confront our analytical model with empirical data. We focus on moments and multifractality and leave for a future presentation extensive testing of the PTD's obtained in Sect. II and their comparison with previous studies [21,22,23,24,25,26,27,28,29]. We have decided to apply our approach to financial markets because finance is one of the fields where large amounts of data are easily available. In particular we collect tick-by-tick data of futures contracts on several indices and also on a single stock (see Table I). The assets chosen have a very diverse nature thus providing wide generality to our analysis.
Assuming, in the spirit of the monofractal behavior discussed in Sect. III, that for large q the empirical moments behave as ⟨t^q⟩_i = Γ(1 + q) τ_i^q, where ⟨t^q⟩_i is the empirical q-moment of the database labeled by i, we perform the linear regression

ln[⟨t^q⟩_i/Γ(1 + q)] = q ln τ_i,

and obtain that for 10 ≤ q ≤ 20 the estimate τ̂_i is independent of q, which proves the monofractal character of the q-moments when q is large (see Table II for the specific values of τ̂_i). Note that the estimation error is very small, around 0.03% in all markets.
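The regression through the origin used here can be sketched on synthetic monofractal moments; the fitted slope recovers ln τ. A toy example (NumPy assumed; the scale e^{6.144} echoes the Dow Jones value of Table II, while the noise level is arbitrary):

```python
import numpy as np

# synthetic monofractal moments: ln(<t^q>/Gamma(1+q)) = q ln(tau) plus small noise
rng = np.random.default_rng(1)
ln_tau_true = 6.144                       # e.g. the Dow Jones scale of Table II
qs = np.arange(10, 21, dtype=float)       # the fitting window 10 <= q <= 20
y = qs * ln_tau_true + rng.normal(0.0, 0.05, qs.size)

# least-squares regression through the origin, as in the text
ln_tau_hat = np.sum(qs * y) / np.sum(qs * qs)
print(ln_tau_hat)    # ≈ 6.144
```

With real data, the q-independence of the estimate over the window is what signals monofractality.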
Let us now support the monofractal findings for large q and also check the robustness of the estimate τ̂_i across all data sets. To this end we evaluate the ratio

φ_i(q) = (θ/τ̂_i)^q ⟨t^q⟩_i/Γ(1 + q),

where θ is an arbitrary parameter identical for all databases, while τ̂_i is the parameter obtained from the monofractal fit for 10 ≤ q ≤ 20 (see Table II). We have chosen the parameter θ to be the estimated τ for the Dow Jones Index, that is, we take θ = τ̂_DJI = exp(6.144) (in seconds). If the monofractal hypothesis holds, the φ_i(q) curves should collapse onto a single straight line (on a semi-logarithmic scale). We show this analysis in Fig. 3 and observe that all data sets merge into a single curve for q ≥ 10. Looking more carefully at Fig. 3 we also observe a deviation from the monofractal behavior for q < 10. Moreover, the differences among the data sets become neatly visible for q < 5. This can be checked in yet another way by plotting the quantity

[⟨t^q⟩_i/Γ(1 + q)]^{1/q}/τ̂_i.

If the data were monofractal, these points should be close to 1 independently of the value of q, but as can be observed in Fig. 4 this is only true for q > 10. Also observe that the points converge to 1 monotonically as the order of the moments increases. Thanks to this, the plot gives us some hints on possible multifractal candidates able to fit the data for the smallest orders of q. The description given by Eq. (26), where we assumed a stretched exponential for ρ(ε), appears to be a good candidate. Figure 5 exemplifies these abilities with the stock Telefonica and the futures contracts on the DAX. Table III shows the values of the estimated parameters, not only for the DAX and Telefonica but also for the rest of the data sets. Summarizing, we may say that the empirical q-moments of financial interevent times clearly show multifractal behavior for (approximately) q ≤ 3.5. In this regime the analytical model expressed by Eq. (26) agrees with the empirical data (see Table III).
On the other hand, for higher values of q the data undoubtedly show a clear tendency to monofractality as has been tested in Figs. 3 and 4.
The abilities of the model (32) for small q can be checked as well through the function

f_MF(q) = (1/q) ln[⟨t^q⟩/(Γ(1 + q)(τ_0 l)^q)], (33)

which, within the MF model, equals b q^{1/(α−1)} up to the small correction coming from A(q). We can plug into Eq. (33) the parameters estimated from each database and afterwards represent f_MF(q) as a function of b q^{1/(α−1)}. If we observe the merging of all databases onto a straight line with slope 1 and zero intercept, we can assert that the MF model with the stretched-exponential density ρ(ε) is a good candidate (cf. Eqs. (32) and (33)). This exercise is done in Fig. 6. The merging for small q demonstrates the abilities of the MF model, not only for the Telefonica stock and the futures on the DAX but also for the rest of the financial databases.
If we want, however, to gather all empirical facts in the nutshell of a single formula, we should generalize Eq. (32) so as to include the monofractal behavior when q becomes large. The requirements that such a Heuristic Multifractal (HMF) formula has to satisfy are: (i) it must obey the normalization condition (i.e., for q = 0 it should be equal to 1); (ii) for small values of q it must reproduce Eq. (32); while (iii) for larger values of q the HMF formula must tend to a monofractal form.
The predictions of formula (34) have been tested on the available data sets, with satisfactory results, by plotting the function (36) versus b_1 q^{1/(α−1)}. If the HMF formula holds, we should see all data sets merge along the curve 1 − exp(−x), where x = b_1 q^{1/(α−1)} (cf. Eqs. (34) and (36)). The model also allows us to study explicitly the dependence of the transformed q-moment on ln L = b. To do this we rescale the q values in each market i using the estimated parameters α_i and b_1^i obtained from market i, as shown in Table IV, f_HMF(q) then being the corresponding slope. Figures 8 and 9 show the satisfactory results that support the validity of the Heuristic Multifractal formula.
We finally mention that the problem of finding the explicit forms of the conditional PTD ψ(t|ε) and the distribution ρ(ε) which give the heuristic formula (34) according to relation (10) is still a challenge. We can otherwise check the soundness and self-consistency of our multifractal approach by looking at its sojourn probability (i.e., the decumulative probability)

Ψ(t) = ∫_t^∞ ψ(t′) dt′, (39)

and comparing it with empirical data. We take the sojourn probability instead of the pausing-time density because verification against empirical data is firmer. Recall that the MF model takes the Poisson density ψ(t|ε) provided by Eq. (2), while ρ(ε) obeys the stretched exponential given in Eq. (17). Substituting these densities into Eq. (39) yields

Ψ(t) = [1/(2σΓ(1 + 1/α))] ∫_{−∞}^{∞} exp[−(|ε − µ|/σ)^α − (t/τ_0) e^{−βε}] dε. (40)

This expression needs to be evaluated numerically; to do so we have taken the parameters of Telefonica given in Table III and slightly modified them to improve the fit with the empirical data. The solid line in Fig. 10 shows the resulting curve, compared there with the empirical sojourn probability of Telefonica.
The empirical analysis of the pausing-time density and the sojourn probability in financial data has been extensively pursued during the last few years [23,26,27,28,29]. Some recent papers argue that Ψ(t) can be properly described by the Tsallis q-exponential [28,29]

Ψ_q(t) = [1 + (q − 1) t/τ]^{−1/(q−1)}, q > 1,

or by the Weibull distribution [28,29]

Ψ_W(t) = exp(−a t^c).
These candidates are also represented in Fig. 10, and the quality of their fits is comparable to that of our MF model. A more thorough study of the differences and similarities between the MF model (and eventually the HMF) and these theoretical and empirical distributions is certainly necessary. However, we leave a more complete analysis of the sojourn probability for future work.

V. SUMMARY AND CONCLUSIONS
In this work we have extended the original CTRW formalism, within the frame of Scher and Montroll's valley model, to furnish an analytical treatment of the statistics of interevent times. The model developed has been tested on financial time series, although the analysis is applicable to the broader area of interhuman communications.
The approach presented consists in obtaining the PTD ψ(t) and the q-moments ⟨t^q⟩ of the interevent time intervals through a random variable ε described by a probability density ρ(ε). The nature of this hidden variable depends on the problem at hand. In the original work of Scher and Montroll, ε represented the depth of the potential well where carriers were trapped. In other contexts, such as queuing processes, ε may represent the priority assigned to an incoming task, and for financial markets we are exploring the possibility that ε is related to transaction volumes, market depth or the bid-ask spread [30,31,32].
Whatever the case, the overall approach assumes an expression for the conditional PTD ψ(t|ε) governing the timing of incoming events (charge carriers, messages, news, etc.). If these incoming events are supposed to arrive at random, the natural choice for the conditional PTD is the Poisson distribution given in Eq. (2). A second assumption is that, for a given ε, the mean time between consecutive events, τ(ε), depends on the hidden variable ε through the simple exponential form expressed by Eq. (3). Finally, in terms of the probability distribution of ε, the unconditional PTD and the q-moments (which both refer to executed tasks or outgoing events) are respectively given by Eqs. (1) and (10).
With these simple ingredients we have been able to obtain long-tailed PTD's and multifractal q-moments. Thus, for instance, for a Laplace density ρ(ε) we have

ψ(t) ∼ 1/t^{1+1/(βσ)},

which agrees with many empirical observations of diverse phenomena, from queuing theory [5] to finance [27].
Regarding moments, the choice of a stretched exponential as the probability density for ε leads to a multifractal behavior of the form

⟨t^q⟩ ∼ Γ(1 + q)(τ_0 l)^q L^{q^{α/(α−1)}},

where L is a conveniently chosen scale.
We have tested the multifractal behavior of intertransaction times on large financial sets of tick-by-tick data (see [33,34,35] for multifractal analyses in other financial settings).
The overall conclusion is that q-moments are multifractal for small values of q (q ≤ 5), while for larger orders t q becomes monofractal. A more refined but heuristic analytical formula has also been proposed which fits the whole range of empirical q-moments. Nevertheless, the problem of the explicit forms for both the conditional PTD ψ(t|ε) and the density ρ(ε) resulting in the heuristic expression is still a challenge.
Let us finish by noting that in several places throughout the paper we have highlighted thermodynamic similarities in our method. In fact, the multifractal approach we have developed herein admits a thermodynamic interpretation [19,36]. We will develop this analogy in a future work.

APPENDIX

We want to evaluate the integral (22), which we write in the form

I(q) = N ∫_{−∞}^{∞} e^{−λh(x)} dx, (A1)

where N = (βσ)^{1/(α−1)}/(2Γ(1 + 1/α)) and

h(x) = |x|^α − qx (A2)

(α > 1). For λ large we can employ the saddle-point approximation, or Laplace's method [20]. The minimum of h(x) is attained at

x_0 = (q/α)^{1/(α−1)} (A3)

when q > 0, and at

x_0 = −(|q|/α)^{1/(α−1)} (A4)

when −1 < q < 0.
On the other hand, recalling that x_0 has the same sign as q, we can write h(x_0) in the single form

h(x_0) = −(α − 1)(|q|/α)^{α/(α−1)},

valid for all q > −1, which yields the exponent appearing in Eq. (24).