A time consistent dynamic bargaining procedure in differential games with heterogeneous discounting

We study cooperative solutions for differential games where players consume a common property resource. Players are asymmetric, in the sense that they have different preferences and, in particular, different time preferences. We propose a new time-consistent dynamic bargaining procedure for this class of games. This solution concept, which is defined as the time-consistent dynamic bargaining (TCB) solution, extends the recursive Nash bargaining solution introduced in Sorger (J Econ Dyn Control 30:2637–2659, 2006) to a continuous time setting. The underlying idea is that, in case of disagreement, the threat is that players will play a noncooperative Markov Perfect Nash equilibrium just during a very small period of time, since new negotiations can take place at every future moment and, in particular, immediately later. Conditions for interior TCB solutions are derived. To illustrate the results, two common property resource games are analyzed in detail.


Introduction
Consider a differential game where players share the property of a resource. If cooperation is permitted, players could decide to coordinate their strategies in order to optimize their collective payoff. Pareto optimal solutions can be found by maximizing a weighted sum of the intertemporal utility functions. However, although it is typically assumed that all agents have the same rate of time preference, there is no reason to believe that players (consumers, firms or countries) have identical time preferences for utility streams. When they do not, the computation of optimal decision rules faces, as in hyperbolic discounting, a problem of time-inconsistency: what is optimal for the coalition or the society at time t will no longer be optimal at time s, for s > t.
If utilities are cardinal, the most natural way to find time-consistent policies when agents with heterogeneous discount functions can cooperate (or when a social planner aggregates their preferences) is to add the individuals' intertemporal utility functions and, as in hyperbolic discounting, look for time-consistent policies. In this approach there are two implicit fundamental assumptions: all players can cooperate at every instant of time t, and the different t-coalitions (coalitions at time t) lack precommitment power for future decision rules. As a result, the solution becomes partially cooperative (agents at the same time cooperate to achieve higher joint payoffs) but partially noncooperative (coalitions at different moments have different time preferences and do not cooperate among them). As in nonconstant discounting, time-consistent equilibria can be computed by finding subgame perfect equilibria in a noncooperative sequential game where agents are the different t-coalitions. However, the time-consistent "cooperative" solution obtained in this way (the t-cooperative equilibrium) can have some important drawbacks.
First, such time-consistent solutions are not Pareto optimal. As a result, time-consistent cooperative solutions can be inefficient for the group: joint payoffs can be higher if players act in a fully noncooperative way (see Marín-Solano 2015). A second problem is that, if payoffs are not transferable among players, it seems natural to assume that players will take into account what they obtain if they decide not to cooperate. Last but not least, if players have different instantaneous utility functions, time-consistent cooperative equilibria seem to be extremely cumbersome to compute (see e.g. de-Paz et al. 2013).
In this paper we propose to work within the framework of a dynamic Nash bargaining theory in a continuous time setting, where negotiations can take place at every instant of time. Our proposal starts from the recursive Nash bargaining solution introduced in Sorger (2006). This solution concept was proposed for dynamic games with heterogeneous players in a discrete time setting. According to this solution, knowing the decision rule at future periods s = t + 1, t + 2, …, agents look for a weighted Nash bargaining solution in which the status quo or threat point is given by the payoffs of the players if they do not cooperate just at period t. As a result, weights of players become time-varying and the corresponding solution is time-consistent. In our paper, we maximize, at every time, a Nash welfare function. In case of disagreement, agents can bargain again at any possible moment in the future and, in particular, immediately later. As a result, the corresponding solution becomes fully time-consistent (subgame perfect). Finally, we illustrate our theoretical findings by computing the time-consistent dynamic bargaining solution for two common property resource games. In particular, the second example suggests that the dynamic bargaining solution proposed in the paper has a clear advantage in analytical tractability over the t-cooperative equilibrium: for this model, linear decision rules can exist under dynamic bargaining, whereas they do not exist, in general, for the t-cooperative solution.
Related literature. Although there is a large number of papers in economics addressing bargaining theory in static and repeated games, the topic has received less attention in state-dependent common resource games with asymmetric players discounting the future at different rates. Sorger (2006) introduced the concept of recursive Nash bargaining. More recently, Flamini (2016) proposed an alternative non-cooperative approach to dynamic bargaining, not making use of a Nash welfare function, unlike Sorger (2006). Both papers address the problem in a discrete time setting. In contrast, we consider problems in continuous time, i.e. differential games. The search for time-consistent policies in cooperative differential games with asymmetric players, exhibiting different utilities and discount rates, was addressed in de-Paz et al. (2013), Ekeland et al. (2013) and Marín-Solano and Shevkoplyas (2011). The techniques involved are those of time inconsistent preferences. For a recent survey containing the main references on the topic, we refer to Yan and Yan (2019). Concerning Nash bargaining theory in differential games, the classical approach, consisting in maximizing a Nash welfare function at the initial time, presents several unsatisfactory aspects (Haurie 1976), and looks inappropriate in the search for time-consistent (subgame perfect) solutions. First, there is the dynamic-consistency problem related to the fact that weights of players, calculated at the initial time, do not guarantee that the solution is individually rational along the whole time horizon. This problem has been addressed in several papers (see e.g. Petrosyan and Yeung 2014 and references therein). In addition, there is another source of time-inconsistency: if players can renegotiate their weights at a future moment, the new ones will typically be different, as a consequence of the evolution of the state of the system and the asymmetric discount rates. As a result, they will change their decision rules (lack of subgame perfectness). In Castañer et al. (2020), an attempt to manage all these issues was made. As in the present paper, agents can bargain again at every moment in the future, but the threat (the trigger strategy) is that, in case of disagreement, players will not cooperate forever. The corresponding solution concept belongs to what some authors name the class of collusive equilibria (see e.g. Haurie et al. 2012). But the choice of such a threat point looks a bit extreme and unrealistic in many real life situations. In addition, it is not fully time-consistent. Instead, in the present paper, the threat is that, in case of disagreement, players will play a noncooperative Markov Perfect Nash equilibrium just during a time period of length ε, for ε arbitrarily small, with the knowledge that new negotiations can take place immediately later in order to achieve an agreement. In this sense, as we have mentioned above, our paper can be seen as the continuous time counterpart of Sorger (2006).
The paper is structured as follows. Sect. 2 describes the general model. Section 3 presents the main contribution of the paper. We introduce and characterize, in a continuous time setting, the time-consistent dynamic bargaining solution (TCB solution). A common property renewable resource model with log-utilities is analyzed in Sect. 4. Section 5 studies the joint management in a resource model with general isoelastic utilities and constant gross rate of return. Section 6 concludes the paper. Appendix A contains a brief review of the time-consistent cooperative equilibrium rule (t-CE). Proofs of the main results can be found in Appendix B.

Preliminaries
In this section we introduce the differential game model and fix notation. Let N be the number of players, and x ∈ X ⊂ R the state variable. For each player i ∈ {1, …, N}, let c_i ∈ U_i ⊂ R be her control (decision) variable, c = (c_1, …, c_N) the corresponding vector of decision rules, u_i(x, c_1, …, c_N) the instantaneous utility function, and ρ_i the discount rate. The intertemporal utility function of player i at time t is

$$J_i(x, t) = \int_t^{\infty} e^{-\rho_i (s-t)}\, u_i\big(x(s), c_1(s), \dots, c_N(s)\big)\, ds, \qquad (1)$$

subject to

$$\dot{x}(s) = g\big(x(s), c_1(s), \dots, c_N(s)\big), \qquad x(t) = x. \qquad (2)$$

In this paper, function u_i depends just on c_i, for i = 1, …, N. We will assume that u_i(c_i) is continuously differentiable, increasing and strictly concave. In addition, g(x, c) = f(x) − Σ_{k=1}^{N} c_k, with f a continuously differentiable and concave (possibly linear) production function, so (2) becomes

$$\dot{x}(s) = f\big(x(s)\big) - \sum_{k=1}^{N} c_k(s), \qquad x(t) = x.$$

These conditions, which are standard in economic models, facilitate the fulfilment of the conditions in Benveniste and Scheinkman (1979) for the concavity and differentiability of the value functions appearing in our problem, properties that are assumed in our derivations.

Next, let us consider an intertemporal decision problem with several agents in which players can coordinate their strategies in order to optimize their collective payoff. In a cooperative setting, we can aggregate preferences as

$$\sum_{i=1}^{N} \lambda_i \int_t^{\infty} e^{-\rho_i (s-t)}\, u_i\big(c_i(s)\big)\, ds,$$

with λ_i ≥ 0. If players have equal weights, we can take λ_1 = · · · = λ_N = 1. If there is a unique and constant discount rate of time preference for all agents, Pareto optimal solutions can be obtained by solving a standard optimal control problem. However, in the case of different discount rates, joint preferences become time inconsistent. In order to find time-consistent solutions, the problem can be solved as a noncooperative sequential game with a continuum of "players" (each "player" is the coalition at time t, which we call the t-coalition). Hence, we can follow the ideas in Karp (2007), Ekeland and Lazrak (2010) or Yong (2011), who suggested three different procedures to find subgame perfect equilibria for this sequential game.
A brief summary of this solution concept, that we call a t-cooperative equilibrium (t-CE), can be found in Appendix A.
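The time inconsistency of the aggregated preferences can be made concrete with a small numerical sketch. For a weighted sum of exponential discount functions, D(τ) = Σ_i λ_i e^{−ρ_i τ}, the instantaneous discount rate −D′(τ)/D(τ) is not constant: it starts at the weighted average of the ρ_i and declines towards the smallest ρ_i, which is the hallmark of nonconstant discounting. The discount rates and weights below are illustrative assumptions, not values taken from the paper.

```python
import math

def aggregate_discount_rate(tau, rhos, lambdas):
    """Instantaneous discount rate -D'(tau)/D(tau) of
    D(tau) = sum_i lambdas[i] * exp(-rhos[i] * tau)."""
    weights = [lam * math.exp(-rho * tau) for lam, rho in zip(lambdas, rhos)]
    return sum(w * rho for w, rho in zip(weights, rhos)) / sum(weights)

# Illustrative (assumed) values: two players with equal weights.
rhos = [0.05, 0.15]
lambdas = [1.0, 1.0]

# Rate applied by the coalition to utilities at increasing delays.
delays = (0.0, 10.0, 50.0, 200.0)
rates = [aggregate_discount_rate(tau, rhos, lambdas) for tau in delays]
```

Because the rate applied to a fixed calendar date changes as that date approaches, the plan that is optimal for the t-coalition is generally not optimal for a later s-coalition.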
The t-CE is the natural extension of the standard cooperative solution within our setting with asymmetric discounting, since it is constrained (to the future behavior of the agents) Pareto optimal. However, in a differential game with heterogeneous discounting, if utility functions are essentially different, the t-CE can become extremely difficult to compute (as illustrated, e.g., in de-Paz et al. (2013), for the case when marginal elasticities are different). In addition, it can be group inefficient, in the sense that joint payoffs can be lower in the t-cooperative equilibrium than in a noncooperative Markov Perfect Nash Equilibrium (MPNE) (Marín-Solano 2015). Finally, the t-CE has a property that looks rather unsatisfactory: if players have equal weights and equal preferences for consumption, represented by the same utility function, but they are asymmetric in their time-preferences, the t-CE assigns the same consumption to all players at every time, independently of who is more or less impatient.
In the paper, we propose a new time-consistent (and subgame perfect) solution in Markovian strategies. Here, we understand time-consistency and subgame perfectness according to the definition in noncooperative differential games (see e.g. pages 99-103 in Dockner et al. (2000), or pages 173-174 and 260 in Haurie et al. (2012)), linked to the issue of credibility of the announced equilibrium strategies. This is indeed the convention that has been adopted as the standard in the literature of time inconsistent preferences.

Definition 1
Let (x 0 , 0) denote a game played along [0, ∞), with initial state x 0 ∈ X , and let (x, t) be the corresponding subgame defined on the time interval [t, ∞) with initial state x(t) = x ∈ X .
It is important to stress here that, in the context of NTU cooperative differential games, a different idea of time-consistency (and agreeability) has been used (see e.g. Petrosyan and Yeung 2014; Yeung and Petrosyan 2015), by incorporating Pareto optimality and individual rationality. These concepts are related to the stability of the coalition, in the sense that all players are better off by playing in a cooperative way during the whole planning horizon, so there is no future moment at which it can be profitable for some of them to play in a fully noncooperative way and break the coalition. The t-CE is a time-consistent and subgame perfect equilibrium according to Definition 1. But it is not time-consistent, in general, in this other sense.

A dynamic bargaining procedure
As an alternative to the t-cooperative equilibrium rule, in this paper we propose to use a dynamic version of Nash bargaining theory. In Nash bargaining theory, payoffs are obtained as the maximizers of a Nash bargaining function (Nash 1953). These payoffs implicitly characterize the weights of players in the whole coalition. Munro (1979) proposed to use Nash bargaining theory with constant weights in a transboundary resource model with asymmetric players. In his model, the threat (status quo, disagreement or reference) point is the payoff associated with a noncooperative Nash equilibrium, and weights are bargained at the beginning of the game. In order to avoid the problems of dynamic inconsistency (implicit in any differential game, see Haurie 1976) and time inconsistency (due to the different discount rates) of the solution, he assumed full commitment of agents, which is unrealistic in most cases. As a way to overcome these problems, we propose to work within a Markovian formulation, where strategies are derived as the result of repeated negotiations that take place at every moment t. We assume that, at time t, players know the state of the system and take as given their future decision rule, c = φ(x(s), s), s > t, as a reaction to their current decisions. Then, as in the classical Nash bargaining solution, they compare what they get by cooperating at time t (or during the time interval [t, t + ε), with ε arbitrarily small) with what they receive otherwise (the status quo, threat point or reference point). Let us denote the threat point by (W_1(x, t), …, W_N(x, t)). When players at time t try to reach an agreement and to derive their corresponding actions, they choose their policy in the time interval [t, t + ε) as the maximizer of some "distance" between what they obtain in case of agreement and in case of disagreement.
We assume that this distance is measured according to the generalized Nash welfare function, i.e. the product over players of the gain of each player over her threat point raised to her bargaining power, with strictly positive bargaining powers η_1, …, η_N.
This solution concept has been characterized (for static games) with a set of three axioms: strong individual rationality, independence of utility calibration and independence of irrelevant alternatives. For η 1 = · · · = η N = 1 we recover the classical Nash bargaining solution satisfying symmetry.
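As a point of reference for the static case, the following sketch maximizes a generalized Nash welfare function for a hypothetical split-the-pie problem: player 1 receives a share s of a unit pie, player 2 receives 1 − s, and the welfare function is the product of gains over the threat point raised to the bargaining powers. The toy problem, the function names and the grid search are illustrative assumptions and are not part of the model studied in the paper.

```python
def nash_welfare(payoffs, threats, powers):
    """Generalized Nash welfare: product of (payoff - threat) ** power.
    Returns 0.0 when some player does not strictly gain over her threat
    point, so such allocations are never selected."""
    product = 1.0
    for payoff, threat, power in zip(payoffs, threats, powers):
        gain = payoff - threat
        if gain <= 0.0:
            return 0.0
        product *= gain ** power
    return product

def best_split(threats, powers, grid=10001):
    """Grid search over the share s of player 1 in a unit pie."""
    best_share, best_value = None, -1.0
    for k in range(grid):
        share = k / (grid - 1)
        value = nash_welfare((share, 1.0 - share), threats, powers)
        if value > best_value:
            best_share, best_value = share, value
    return best_share

symmetric = best_split(threats=(0.0, 0.0), powers=(1.0, 1.0))
asymmetric_power = best_split(threats=(0.0, 0.0), powers=(2.0, 1.0))
asymmetric_threat = best_split(threats=(0.2, 0.1), powers=(1.0, 1.0))
```

In this linear toy problem the maximizer has the closed form s = W_1 + η_1/(η_1 + η_2) · (1 − W_1 − W_2), so a larger bargaining power or a better threat point raises a player's share, and only relative bargaining powers matter.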
For a decision rule (c 1 , . . . , If players can precommit their behavior during a time period of length ε, take Then, at least in principle (as we will see later, this will be the case if bargaining powers are normalized so that Σ_{i=1}^{N} η_i = 1), we can expect that The underlying idea consists, in general, in maximizing the first order term 1 (x, φ, c̄, t) (or, in general, the lowest order term in ε), in such a way that c̄* = φ(x(t), t).
It remains to define the threat value function. If we assume full rationality of players, under no commitment, in case of non cooperation at time t, decision-makers can bargain again at time t + ε, for ε arbitrarily small. Hence, the threat (noncooperative behavior) ends after a time period of length ε. This is the natural extension to a continuous time setting of the recursive Nash bargaining solution introduced in Sorger (2006). By construction, the corresponding solution is dynamically consistent but typically not Pareto optimal for the t-coalition.
Next, we make precise in a rigorous mathematical way the ideas introduced above. First, we will define the time-consistent bargaining solution. Later on, we will propose a candidate for threat point in case of disagreement. Finally, we will describe interior time-consistent bargaining solutions.

The time-consistent dynamic bargaining (TCB) solution
Let us assume that, for s ∈ [t, t + ε), the decision rule in case of non cooperation is given by φ^{ε,nc}(x(s), s) and, for s ≥ t + ε, players follow φ^b(x(s), s). We denote by (W^ε_1, …, W^ε_N) the threat point corresponding to the threat decision rule φ^{ε,nc}(x, t). The actual threat point will be obtained in the limit ε → 0^+. In the following subsection we will make a proposal for an appropriate choice of φ^{ε,nc}(x, t). For the moment, we take it as given.
If φ^b(x, s) is the equilibrium (bargaining) decision rule for s ≥ t + ε, we can write the threat point for player i ∈ {1, …, N} as

$$W^{\varepsilon}_i(x, t) = \int_t^{t+\varepsilon} e^{-\rho_i (s-t)}\, u_i\big(\varphi^{\varepsilon,nc}_i(x(s), s)\big)\, ds + \int_{t+\varepsilon}^{\infty} e^{-\rho_i (s-t)}\, u_i\big(\varphi^{b}_i(x(s), s)\big)\, ds.$$

The second integral illustrates that, since players can bargain again at time t + ε, the threat of non cooperation ends at that moment, so we have φ , On the other hand, from (5), The objective is to maximize c , as given in (6), at order ε (or at the minimum order in ε). Note that denotes the value function. Therefore, The Nash welfare function becomes If we normalize bargaining powers so that η_1 + · · · + η_N = 1 (just relative bargaining powers are relevant), then Then, we define the time-consistent dynamic bargaining (TCB) solution as the maximizer of the term at order ε^{η_1+···+η_N} (i.e. the first order term if bargaining powers are normalized) in the Nash welfare function. More precisely, be the decision rule followed by player i, . We define the time-consistent dynamic bargaining solution (TCB) as where

A few comments on the existence of a solution to equation (7) follow. First, as in the classical Nash bargaining theory, we need a well-defined threat (reference) point. We will discuss this issue in Sect. 3.2. If the threat point φ^{0,nc} is well-defined, we have to find the fixed point φ^b satisfying (7). Note that, in the right-hand side of that equation, φ^b appears implicitly in V^b_i and φ^{0,nc}_i, as we will illustrate with the examples. Given the difficulty of finding general existence conditions within the class of differential games and problems with nonconstant discounting, it seems reasonable to analyze the problem for particular classes of games. In Sects. 4 and 5 we study in some detail two classes of games which have been widely used in the literature describing the common management of renewable and nonrenewable resources. Linear state and linear-quadratic games, with applications to environmental problems of climate policy, are also good candidates for our bargaining model.

The threat or reference point
As a proposal for a threat or reference point in case of disagreement, we assume that, under non cooperation at time t, the threat is that agents will play the noncooperative MPNE (see e.g. Dockner et al. 2000 or Haurie et al. 2012) in the noncooperative game played in the time interval s ∈ [t, t + ε). As we have seen, agents can achieve an agreement later on, at time t + ε. Next we formalize this idea.
Definition 3 Assume that, at time t, players take as given the future decision rule φ^b(x(s), s), for s ≥ t + ε. The threat point in case of disagreement during the time interval [t, t + ε), with x(t) = x, is given as follows: . . , N , is any possible admissible feedback law for player i for the problem with planning horizon [t, t + ε); and

As in Definition 2, in the definition above, agents take as given the future decision rule, derived through a dynamic bargaining procedure, at time s ≥ t + ε. The threat is not to cooperate during the time period [t, t + ε) and possibly to cooperate for s ≥ t + ε, by computing the MPNE for the noncooperative game with finite planning horizon [t, t + ε) and final functions Since we consider the situation in which there is no commitment at all and players can bargain again at any possible future moment τ > t, ε can be arbitrarily small. Hence, the threat point is given by computed as in Definition 3. As we have commented above, we need a well-defined threat (reference) point. In our problem, φ^{0,nc} is well defined if there exists a unique Nash equilibrium in the game where players do not cooperate in the time interval [t, t + ε), and they follow the TCB solution for s ≥ t + ε. Provided that φ^b is known, this is a differential game played in a finite horizon setting. Although working in a finite planning horizon can reduce the number of MPNE (as is the case of linear-quadratic differential games), existence and uniqueness are not guaranteed, in general. If there are several MPNE but one Pareto dominates the others, the most natural choice is to take the Pareto superior MPNE. Otherwise, unless in the limit when ε → 0^+ they converge to a unique solution, the choice of the threat point is not straightforward. In that case, as in the classical (static) Nash bargaining theory, we can take the status quo point as given, by selecting e.g. one of the MPNE.
The TCB solution obtained from Definitions 2 and 3 is, by construction, time-consistent and subgame perfect in the sense of Definition 1.

Interior TCB solutions
If interior solutions to equation (7) exist for all t, we can compute the first order conditions for j = 1, …, N, whose solution is given by In our problem with g(x, c) = f(x) − Σ_{k=1}^{N} c_k, in the limit ε → 0^+, strategies become stationary. Hence, the (interior) time-consistent dynamic bargaining solutions
Remark 1 If, in the intertemporal utility function (1), instantaneous utilities take the general form u_i(x, c), TCB solutions follow from (8) for these utility functions.

A common property renewable natural resource model with log-utilities
In this section we apply the results in the previous section to a common property resource game with heterogeneous agents coming from the literature of resource economics. The model considers the problem of joint exploitation of a renewable natural resource if players have logarithmic utilities. The intertemporal utility function of player i is given by

$$J_i(x, t) = \int_t^{\infty} e^{-\rho_i (s-t)}\, \mu_i \ln c_i(s)\, ds, \qquad (10)$$

with μ_i > 0, for i = 1, …, N. The stock of the resource evolves according to

$$\dot{x}(s) = x(s)\big(a - b \ln x(s)\big) - \sum_{i=1}^{N} c_i(s), \qquad x(t) = x, \qquad (11)$$

with a, b > 0. This model with a Gompertz recruitment function (and more general versions of it, including extensions to multiple species) was studied, for equal discount rates, in Clemhout and Wan (1985).

Noncooperative MPNE and the t-cooperative equilibrium
For μ i = 1, noncooperative MPNE and t-cooperative equilibria were already derived in Marín-Solano (2014). The extension is straightforward and we summarize the results.

Noncooperative MPNE
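If players do not cooperate, stationary linear strategies exist and are given by

$$c^{n}_i(x) = (\rho_i + b)\, x, \qquad (12)$$

for i = 1, …, N. The corresponding value functions are of logarithmic type,

$$W^{n}_i(x) = \frac{\mu_i}{\rho_i + b} \ln x + \beta^{n}_i, \qquad (13)$$

with β^n_i a constant independent of x.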
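For the reader's convenience, the linear MPNE strategy can be recovered from the Hamilton-Jacobi-Bellman equation of player i. The sketch below assumes the Gompertz dynamics ẋ = x(a − b ln x) − Σ_j c_j for the resource stock and a logarithmic guess W_i(x) = α_i ln x + β_i for the value function; it is a worked derivation under these assumptions, not a reproduction of the authors' proof.

```latex
\begin{align*}
\rho_i W_i(x) &= \max_{c_i}\Big\{ \mu_i \ln c_i
   + W_i'(x)\Big[ x\,(a - b\ln x) - c_i - \sum_{j\neq i} c_j(x) \Big] \Big\}, \\
\frac{\mu_i}{c_i} &= W_i'(x) \qquad \text{(first order condition)}, \\
W_i(x) = \alpha_i \ln x + \beta_i \;&\Longrightarrow\; c_i = \frac{\mu_i}{\alpha_i}\, x, \\
\text{coefficient of } \ln x:\quad
\rho_i \alpha_i &= \mu_i - b\,\alpha_i
\;\Longrightarrow\; \alpha_i = \frac{\mu_i}{\rho_i + b},
\qquad c_i = (\rho_i + b)\, x .
\end{align*}
```

Note that α_i does not depend on the linear strategies of the other players, which only enter the constant β_i.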

The t-cooperative equilibrium rule
In order to compare the results, we recall the solution provided by the t-cooperative equilibrium rule. Again, we restrict our attention to stationary linear strategies.
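Proposition 1 In Problem (10)-(11), if linear t-cooperative equilibria exist, equilibrium rules are given by

$$c^{tc}_i(x) = \frac{\lambda_i \mu_i}{\sum_{j=1}^{N} \lambda_j \mu_j / (\rho_j + b)}\, x, \qquad (14)$$

and the corresponding value functions are of logarithmic type, V^{tc}_i(x) = (μ_i/(ρ_i + b)) ln x + β^{tc}_i, with β^{tc}_i a constant independent of x.
Proof See Marín-Solano (2014).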
Next, we briefly compare these t-cooperative equilibria with the MPNE.
• In both cases, a unique steady state exists, so there is no multiplicity of equilibria. We will denote it as x^n_∞ in the MPNE, and x^{tc}_∞ in the t-cooperative equilibria.
• As in the standard case with equal and constant discount rates, harvest coefficients are higher in the pure noncooperative solution provided by the MPNE than in the problem with t-cooperation. Note that, from (12) and (14), c^n_i(x) − c^{tc}_i(x) > 0 if, and only if, Σ_{j=1}^{N} λ_j μ_j (ρ_i + b)/(ρ_j + b) − λ_i μ_i > 0. This condition is satisfied because the j = i term of the sum already equals λ_i μ_i and the remaining terms are nonnegative (and strictly positive whenever some other player has a positive weight).
• As a result, x^{tc}_∞ > x^n_∞, i.e., the resource is overexploited in the MPNE in comparison with the t-cooperative equilibria.
• However, the t-cooperative equilibria can be less efficient. The reason is that, if a high weight is placed on impatient agents, since c^{tc}_i(x)/c^{tc}_j(x) = (λ_i μ_i)/(λ_j μ_j), then the t-CE will prescribe, at every point in time, low relative consumption for very patient agents. However, overall utility is heavily dependent on the future consumption of agents with very low discounting, so constraining them to consume much less than impatient agents at every point in time can be inefficient. We refer to Marín-Solano (2015) for an example of this situation.
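The comparison above can be checked numerically with a short sketch. It assumes the linear rules c^n_i(x) = (ρ_i + b)x for the MPNE and c^{tc}_i(x) = λ_i μ_i x / Σ_j λ_j μ_j/(ρ_j + b) for the t-CE, which are the forms implied by the inequality in the second bullet, together with Gompertz dynamics ẋ = x(a − b ln x) − Σ_i c_i, for which a linear rule c_i = A_i x yields the steady state x_∞ = exp((a − Σ_i A_i)/b). The function names and the particular discount rates are illustrative choices.

```python
import math

def mpne_coefficients(rhos, b):
    """Harvest coefficients A_i = rho_i + b of the linear MPNE (assumed form)."""
    return [rho + b for rho in rhos]

def tce_coefficients(rhos, b, lambdas, mus):
    """Harvest coefficients of the linear t-cooperative equilibrium (assumed form)."""
    denom = sum(lam * mu / (rho + b) for lam, mu, rho in zip(lambdas, mus, rhos))
    return [lam * mu / denom for lam, mu in zip(lambdas, mus)]

def steady_state(coeffs, a, b):
    """Steady state of dx/ds = x*(a - b*ln x) - sum_i coeffs[i]*x."""
    return math.exp((a - sum(coeffs)) / b)

# Parameter values a and b as in the numerical section; rhos chosen in [0.01, 0.19].
a, b = 10.0, 2.18
rhos, lambdas, mus = [0.05, 0.15], [1.0, 1.0], [1.0, 1.0]

A_n = mpne_coefficients(rhos, b)
A_tc = tce_coefficients(rhos, b, lambdas, mus)
x_n, x_tc = steady_state(A_n, a, b), steady_state(A_tc, a, b)
```

With equal weights and equal μ_i the t-CE coefficients coincide across players, regardless of their discount rates, which is the property of the t-CE criticized in the Preliminaries.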

Time-consistent dynamic bargaining solution
The first step consists in computing the threat point in case of disagreement. We first compute the threat point W^ε according to Definition 3, and we take the limit ε → 0^+ later on.

Proposition 2 In Problem (10)-(11), if players follow, for s ≥ t + ε, strategies φ^b_i(x(s), s) = A^b_i(s) x(s), then the threat points according to Definition 3 are given by
The corresponding decision rules are φ^{ε,nc}
Proof See Appendix B.

Proposition 3 In Problem (10)-(11), if players follow, for s > t, strategies φ^b_i(x(s)) = A^b_i x(s), the threat point in the time-consistent dynamic bargaining solution is given by
Proof It follows by taking the limit ε → 0^+ in Proposition 2. Note that in this limit we can restrict our attention to stationary strategies.
Finally, for the derivation of the time-consistent dynamic bargaining solution, we substitute the expression of the threat point W 0 and threat strategy φ 0,nc (see Proposition 3) into (9).

Proposition 4 Within the class of linear strategies, time consistent dynamic bargaining solutions
Proof See Appendix B.
Remark 2 If players are symmetric, μ 1 = · · · = μ N and ρ 1 = · · · = ρ N , within the class of linear decision rules, the standard symmetric cooperative decision rule is also a time-consistent dynamic bargaining solution.
Remark 3 Note that, for all the solution concepts (noncooperative MPNE, t-cooperative equilibrium and time-consistent bargaining solution), although value functions and steady states depend on the value of the parameter a, decision rules are independent of a.
The equation system (16) is highly nonlinear. In the numerical illustrations we will restrict our attention to the case of just two players and values of η i (bargaining powers) simplifying the equations. For N = 2, if η 1 = 1 and η 2 = (ρ 1 + b)/(ρ 2 + b) (this specification of bargaining powers coincides, for b = 0, with that in Sorger (2006)), then we have to solve the equation system (see the Appendix)

Comparison of the results: numerical illustrations
We compare numerically the results obtained in the noncooperative case with those given by the t-cooperative equilibrium rule with equal weights (λ_i = 1) and by the time-consistent dynamic bargaining solution. Since, in the expression of the value function, the coefficient of ln x equals μ_i/(ρ_i + b) for all these solution concepts, in the analysis of efficiency levels (i.e. comparison of payoffs) it suffices to compute the values of the corresponding β_i. Value functions and decision rules are given by (12)-(13) (in the case of noncooperation), Proposition 1 (t-CE) and Proposition 4 (TCB solution). From (11), the steady state for linear strategies c_i = A_i x is given by x_∞ = exp((a − Σ_{i=1}^{N} A_i)/b). For the numerical illustrations, we take N = 2 (two players), a = 10, b = 2.18, μ_1 = μ_2 = 1, x_0 = 100, ρ_i ∈ [0.01, 0.19], η_1 = 1 and η_2 = (ρ_1 + b)/(ρ_2 + b). Note that, in this case, η_2 is very close to η_1. Table 1 presents the results for harvest coefficients A_i and steady states. Harvesting is clearly lower under cooperation, as expected. Joint harvesting, and hence the corresponding steady states, are quite similar in the two cooperation settings considered, with some qualitative differences among players depending on the discount rates. In the bargaining scenario, harvesting is higher if players are more impatient, whereas the t-CE makes no distinction among them. Table 2 illustrates efficiency by showing gains under cooperation over the noncooperative MPNE. Although gains are relatively low for players with higher discount rates, all the cooperative solutions Pareto dominate the payoffs given by the MPNE.
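The role of the constants β_i in the efficiency comparison can be illustrated with a small sketch. It assumes Gompertz dynamics ẋ = x(a − b ln x) − Σ_j c_j and stationary linear rules c_j = A_j x, for which y = ln x solves a linear equation and the payoff of player i is α_i ln x_0 + β_i with α_i = μ_i/(ρ_i + b) and β_i = (μ_i/ρ_i)[ln A_i + (a − Σ_j A_j)/(ρ_i + b)]. This closed form is derived here under the stated assumptions rather than quoted from the paper, so the code also checks it against a direct numerical integration of the discounted utility. The MPNE and t-CE coefficients are those assumed to have the forms (ρ_i + b) and λ_i μ_i / Σ_j λ_j μ_j/(ρ_j + b); the discount rates are illustrative.

```python
import math

def beta(i, coeffs, rhos, mus, a, b):
    """Constant term of player i's payoff under rules c_j = coeffs[j] * x."""
    return (mus[i] / rhos[i]) * (
        math.log(coeffs[i]) + (a - sum(coeffs)) / (rhos[i] + b)
    )

def payoff_closed_form(i, x0, coeffs, rhos, mus, a, b):
    """alpha_i * ln(x0) + beta_i with alpha_i = mu_i / (rho_i + b)."""
    return mus[i] / (rhos[i] + b) * math.log(x0) + beta(i, coeffs, rhos, mus, a, b)

def payoff_numeric(i, x0, coeffs, rhos, mus, a, b, horizon=600.0, steps=120_000):
    """Trapezoidal integration of exp(-rho_i*tau) * mu_i * ln(c_i(tau)),
    using the exact path of y = ln x, which solves dy/dtau = a - sum(A) - b*y."""
    y_inf = (a - sum(coeffs)) / b
    y0 = math.log(x0)
    h = horizon / steps
    acc = 0.0
    for k in range(steps + 1):
        tau = k * h
        y = y_inf + (y0 - y_inf) * math.exp(-b * tau)
        value = math.exp(-rhos[i] * tau) * mus[i] * (math.log(coeffs[i]) + y)
        acc += value * (0.5 if k in (0, steps) else 1.0)
    return acc * h

a, b, x0 = 10.0, 2.18, 100.0
rhos, mus, lambdas = [0.05, 0.15], [1.0, 1.0], [1.0, 1.0]  # rhos: illustrative
mpne = [rho + b for rho in rhos]
denom = sum(lam * mu / (rho + b) for lam, mu, rho in zip(lambdas, mus, rhos))
tce = [lam * mu / denom for lam, mu in zip(lambdas, mus)]

# Gain of each player from t-cooperation over the MPNE (difference of betas).
gains = [beta(i, tce, rhos, mus, a, b) - beta(i, mpne, rhos, mus, a, b)
         for i in range(2)]
```

For these values both gains are positive and the relative gain is smaller for the more impatient player, in line with the qualitative pattern described for Table 2; the TCB solution is not included because its coefficients require solving the nonlinear system of Proposition 4.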

A resource model with general isoelastic utilities
As a second example, we solve a model describing the joint management of a productive asset with constant gross rate of return if players have general isoelastic utilities with different marginal elasticities. The intertemporal utility function of player i is given by

$$J_i(x, t) = \int_t^{\infty} e^{-\rho_i (s-t)}\, u_i\big(c_i(s)\big)\, ds, \qquad (18)$$

where u_i is isoelastic with marginal elasticity σ_i > 0 (the logarithmic utility u_i(c_i) = ln c_i corresponds to σ_i = 1). The stock of the resource evolves according to

$$\dot{x}(s) = a\, x(s) - \sum_{i=1}^{N} c_i(s), \qquad x(t) = x, \qquad (19)$$

with a ≥ 0. For a = 0 we obtain an exhaustible resource model. This problem was studied in Castañer et al. (2020) for an alternative dynamic bargaining procedure, giving rise to the introduction of what they call the agreeable dynamic bargaining solution. In that solution, the trigger strategy is characterized by the threat point in which, in case of disagreement, players will not cooperate forever, i.e., new negotiations are not permitted in the future. This is a kind of precommitment and, in that sense, the corresponding solution is not fully time-consistent (and is unrealistic in many real life applications). After deriving conditions for TCB solutions in Sect. 5.1, we compare, in Sect. 5.2, for N = 2 and a = 0, the extraction rates obtained with those in Castañer et al. (2020).

Time consistent dynamic bargaining solution
In general, t-cooperative equilibria seem to be extremely cumbersome to compute for this simple problem. In fact, no linear decision rules exist unless all marginal elasticities coincide. In addition, as observed in de-Paz et al. (2013), the search for such nonlinear decision rules seems to be extremely difficult, not just analytically, but also numerically, given the high nonlinearity of the differential equation systems involved. In contrast, linear TCB solutions can exist. In order to calculate them, as in the example studied in Sect. 4, we first compute the threat point and then derive the TCB solution.

Proposition 5 In Problem (18)-(19) if, according to Definition 3, players follow, for
, then the candidates for threat points are given by
It is interesting to realize that, in the above proposition, players with log-utilities (i ∈ {J + 1, …, N}) apply as threat strategies those given by the noncooperative MPNE. On the contrary, this is not the case for players with marginal elasticities different from one. For i ∈ {1, …, J}, threat points are computed as the solutions to a highly nonlinear system of integral equations. Next we analyze what happens to the strategies derived in Proposition 5 in the limit when ε goes to zero.

Corollary 1 In Problem (18)-(19), if players follow, for s > t, strategies
Proof It follows by taking the limit ε → 0^+ in Proposition 5. Note that in this limit we can restrict our attention to stationary strategies.
Once we have computed the threat point we use (9) for the derivation of the time-consistent dynamic bargaining solution.
Proof It is similar to the proof of Proposition 4.

Remark 4
As in the previous example, in Problem (18)-(19), if σ_i = σ and ρ_i = ρ, for i = 1, …, N, the (standard) symmetric cooperative decision rule is also a time-consistent dynamic bargaining solution.

Numerical illustrations
As in the previous example, we illustrate numerically the results. We take a = 0, which corresponds to the case of a nonrenewable resource, and N = 2 (two players).
We compare the extraction rates obtained under the noncooperative MPNE, $(A^n_1, A^n_2)$; under the agreeable dynamic bargaining procedure, denoted by $(A^{db}_1, A^{db}_2)$; and under time-consistent dynamic bargaining, $(A^b_1, A^b_2)$.

Table 3 presents the results for $\rho_1 = 0.07$, $\rho_2 = 0.13$, $\sigma_1 = \sigma_2 = \sigma \in [0.6, 1.5]$, $(\eta_1, \eta_2) = (1, 1)$ and $(\eta_1, \eta_2) = (1, \rho_1/\rho_2)$. We observe that extraction rates are clearly lower under the two dynamic bargaining procedures than under the noncooperative MPNE. However, although the two dynamic bargaining procedures make use of clearly different threat points (and the mathematical derivation of the corresponding solutions is quite different), the results are very similar. Concerning this comparison, for equal bargaining powers ($\eta_1 = \eta_2$), under time-consistent bargaining, extraction rates are slightly higher for the player with the lower discount rate, $A^b_1 > A^{db}_1$, and slightly lower for the player with the higher discount rate, $A^b_2 < A^{db}_2$. Joint extractions are lowest under time-consistent bargaining. When $\sigma$ increases, extraction rates decrease for the three solution concepts.

Table 3 Equal marginal elasticities

Table 4 Different marginal elasticities

Table 4 uses the same parameter values, but now players exhibit different marginal elasticities: $\sigma_1 \in [0.9, 1.5]$ and $u_2(c_2) = \ln c_2$ (which corresponds to the limit $\sigma_2 = 1$). Players are asymmetric both in their discount rates and in their instantaneous utility functions. The results are qualitatively similar to those in Table 3, although the differences are smaller. In this case, holding $\sigma_2 = 1$ constant, an increase in $\sigma_1$ induces a decrease in the extraction rate of the more patient player and an increase in the extraction rate of the more impatient player (with the exception of the noncooperative behavior, in which it remains constant).
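The noncooperative benchmark in this comparison can be sketched numerically. The snippet below is a minimal illustration, not the paper's own computation: it assumes isoelastic utilities, dynamics $\dot{x} = a x - \sum_j c_j$, and linear strategies $c_i = A_i x$, and it uses the textbook consumption-to-wealth ratio $A_i = (\rho_i - (1 - \sigma_i) r_i)/\sigma_i$ with $r_i = a - A_j$ as each player's best response. The function name `mpne_linear_rates` is hypothetical, and neither positivity of the rates nor transversality conditions are checked.

```python
def mpne_linear_rates(rho1, rho2, sigma1, sigma2, a=0.0):
    """Candidate linear MPNE extraction rates (A1, A2) for two players.

    Assumption (not taken from the paper): player i's best response to the
    rival's linear rule c_j = A_j * x is A_i = (rho_i - (1 - sigma_i) * r_i) / sigma_i
    with r_i = a - A_j, i.e. the 2x2 linear system
        sigma_i * A_i + (sigma_i - 1) * A_j = rho_i - a * (1 - sigma_i).
    """
    m11, m12 = sigma1, sigma1 - 1.0
    m21, m22 = sigma2 - 1.0, sigma2
    r1 = rho1 - a * (1.0 - sigma1)
    r2 = rho2 - a * (1.0 - sigma2)
    det = m11 * m22 - m12 * m21
    if abs(det) < 1e-12:
        raise ValueError("singular system: no unique linear equilibrium candidate")
    # Cramer's rule for the 2x2 system
    a1 = (r1 * m22 - m12 * r2) / det
    a2 = (m11 * r2 - m21 * r1) / det
    return a1, a2


if __name__ == "__main__":
    # Discount rates from the numerical illustration; sigma values are sample points.
    for sigma in (0.6, 1.0, 1.5):
        rates = mpne_linear_rates(0.07, 0.13, sigma, sigma, a=0.0)
        print(sigma, rates)
```

Under these assumptions, a player with logarithmic utility ($\sigma_i = 1$) extracts at a rate equal to its own discount rate regardless of the rival's parameters, which agrees with the remark that the noncooperative rate of the log-utility player stays constant as $\sigma_1$ varies.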

Concluding remarks
In this paper we have introduced a time-consistent dynamic bargaining procedure in continuous time. This solution concept can be seen as the continuous time version of the recursive Nash bargaining solution introduced in Sorger (2006) for dynamic games in discrete time. In comparison with the time-consistent cooperative solution, it has clear advantages in terms of applicability to a wider class of problems, computability and, in addition, decision rules seem to be more realistic. These ideas have been applied to the study of two common property resource games.
We think that the dynamic bargaining procedure proposed in the paper can be valuable in the search of cooperative solutions in games with asymmetric players, not just for problems in which they exhibit unequal discount rates, but also if they discount the future at the same rate but have different preferences (i.e. utility functions). Linear in state games and linear quadratic games describing environmental problems related to pollution control are other fields where our results could be applied in a natural way.
Another topic that could be of interest is to study whether the so-called impatience problem arising in dynamic equilibrium models with heterogeneous households (the most patient household ends up owning all the capital in the long run; see Becker (1980) and Becker et al. (2013)) is also present if agents take their decisions according to the TCB solution. It is clear that this is not the case for the t-CE since, if players differ just in their time preferences, they will consume the same amount. This property derives from the fact that discount rates do not make differences among players in equation (25): a social planner, when choosing stationary equilibrium strategies, will aim to maximize the sum of present utilities plus the effects on the future joint welfare of all players. On the contrary, in the case of the TCB solution, from Definition 2 and the structure of the left-hand side term of equation (9), the consumption of players will depend on their discount rates. However, it also depends on the choice of the bargaining powers, so we cannot conclude, in general, which agent will be more impatient. For equal bargaining powers, intuition suggests that the impatience problem could be present; this is an interesting topic that could be explored in the future.

Appendix A
As in Ekeland and Lazrak (2010), we restrict our attention to stationary (note that the problem is autonomous) convergent Markovian strategies, i.e. strategies $c_i = \phi_i(x)$ for which there exist $x_\infty < \infty$ and a neighbourhood $U$ of $x_\infty$ such that, for every $x_0 \in U$, the solution to (3) along $c = \phi(x) = (\phi_1(x), \dots, \phi_N(x))$ converges to $x_\infty$. For stationary convergent strategies, the integral in (1) converges.
If $c^*(s) = \phi(x(s))$ is a continuously differentiable equilibrium rule for Problem (4) subject to (3), by denoting $x_t = x$, the corresponding value function for the whole coalition is given by (22). For stationary equilibrium strategies, the value functions $V_i$ (and $V$) are time-independent.
In de-Paz et al. (2013), a dynamic programming equation describing time-consistent equilibria for Problem (3)-(4) was derived by following a formal limiting procedure. First, these authors discretized equations (3)-(4), by following the classical Euler method, with periods of constant length $\varepsilon$, in the case of a finite planning horizon $T$. For this problem, a dynamic programming algorithm was derived. Next, time-consistent cooperative equilibria and the corresponding value function $V(x, t)$ were defined as the (formal) continuous time limit (assuming that such a limit exists and that $V(x, t)$ is sufficiently smooth) when $\varepsilon \to 0$ of the discrete time dynamic programming equation. Finally, the solution to Problem (3)-(4) was defined by taking the limit when $T \to \infty$.
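The Euler discretization step mentioned above can be illustrated on the state equation alone. The following is a minimal sketch under stated assumptions: the dynamics are taken to be $\dot{x} = (a - \sum_j A_j)x$, i.e. the linear-strategy case of the second example, and the extraction rates, horizon and step lengths are hypothetical illustrative values. It only shows that the discrete path with period length `eps` approaches the continuous-time path as `eps` shrinks; it does not implement the dynamic programming algorithm of de-Paz et al. (2013).

```python
import math


def euler_stock(x0, a, rates, horizon, eps):
    """Euler path of dx/dt = (a - sum(rates)) * x using periods of length eps."""
    steps = int(round(horizon / eps))
    growth = a - sum(rates)
    x = x0
    for _ in range(steps):
        x += eps * growth * x  # one explicit Euler step
    return x


def exact_stock(x0, a, rates, horizon):
    """Closed-form solution of the same linear ODE at time `horizon`."""
    return x0 * math.exp((a - sum(rates)) * horizon)


if __name__ == "__main__":
    rates = [0.07, 0.13]  # hypothetical linear extraction rates
    target = exact_stock(1.0, 0.0, rates, 10.0)
    for eps in (1.0, 0.1, 0.01):
        approx = euler_stock(1.0, 0.0, rates, 10.0, eps)
        print(eps, approx, abs(approx - target))
```

The absolute error printed in the last column shrinks with the period length, which is the behaviour the formal limit $\varepsilon \to 0$ relies on.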
Alternatively, one can follow the approach in Ekeland and Lazrak (2010) and Ekeland et al. (2013). For $\varepsilon > 0$ and $c = (c_1, \dots, c_N)$, $c_i \in U_i$, let If the t-coalition has the ability to precommit its behavior during the period $[t, t + \varepsilon)$, the valuation along the perturbed control path $c$ is given by i.e., Since the problem is autonomous, we can write $P(x, \phi, c)$.

Appendix B
Proof of Proposition 2: The threat point in case of disagreement is given as in Definition 3. Let us compute it. At time t players take as given the decision rule for $s \ge t + \varepsilon$. Let $\phi^b_i(x(s), s)$ be the corresponding future decision rule. Hence Since we are interested in the existence of linear strategies, we will assume that $\phi^b_i(x(s), s) = A^b_i(s)x(s)$, for $s \ge t + \varepsilon$. By integrating (28) we obtain By substituting (29) in (26) we can write By assuming linear strategies, Alternatively, by substituting in (30) we obtain so, by continuity, By substituting in (32) $(1 - e^{-b(s-t)})$.
By substituting the equation above into (33) and differentiating, we easily obtain $(V^b_i)'(x)$. The result follows by substituting the expressions for $u_i'(c^b_i)$, $\phi^{0,nc}_i(x)$ and $(V^b_i)'(x)$ into equation (9).

Proof of equations (17): Since , for all $i, l = 1, \dots, N$, if $\eta_i(\rho_i + b) = \eta_l(\rho_l + b)$, then and equation (16) becomes . Therefore, As a result, $\sum_{j=1}^{N} A^b_j/(\rho_j + b) = 1$. For $N = 2$, equations (17) follow.

Proof of Proposition 5: For the computation of the threat point, from Definition 3, at time t players take as given the decision rule for $s \ge t + \varepsilon$. Let Since we are interested in the existence of linear strategies, we will assume that $\phi^b_i(x(s), s) = A^b_i(s)x(s)$, for $s \ge t + \varepsilon$. By integrating (36) we obtain By substituting (37) in (34) we can easily derive Since, by continuity, $x_{t+\varepsilon} = x_t \exp\left(\int_t^{t+\varepsilon} \left(a - \sum_{j=1}^{N} A^{\varepsilon,nc}_j(z)\right) dz\right)$, by identifying $x_t = x$, Finally, from (40), and, from (40), the result follows.