Optimal estimation of two-qubit pure-state entanglement

We present optimal measuring strategies for the estimation of the entanglement of unknown two-qubit pure states and of the degree of mixing of unknown single-qubit mixed states, of which N identical copies are available. The most general measuring strategies are considered in both situations, to conclude in the first case that a local, although collective, measurement suffices to estimate entanglement, a non-local property, optimally.


I. INTRODUCTION
Plenty of work has been performed in recent years on optimal quantum measurements, i.e. on measurements which provide the maximum possible information about an unknown quantum mechanical pure [1][2][3][4][5] or mixed [6] state, of which N identical copies are available. These works are focussed mainly on the determination of the unknown state as a whole, and consequently any of its properties is also estimated, although maybe not in an optimal way.
On the other hand, recent developments in the field of quantum information theory have stressed the importance of the quantum correlations, or entanglement, displayed by some states of composite systems. In the simplest of such composite systems, the two-qubit case, all non-local properties of pure states depend on a single parameter. This non-local parameter is the only relevant quantity invariant under local unitary transformations on each qubit, and it plays a central role in the quantification and optimal manipulation of entanglement [7][8][9][10][11].
In this work we analyze and solve the problem of optimally estimating the entanglement of an unknown pure state of two qubits. This problem has been independently addressed also by Sancho and Huelga in a recent work [12], where only a restricted class of measuring strategies is considered. Here, on the contrary, we will consider most general quantum measurements on N identical copies of the state. Their quality will be assessed through the gain of information they provide about the non-local parameter of the state. After presenting and proving the solution we will conclude that the optimal measuring strategies so defined are not equivalent to the ones used to fully reconstruct the unknown state. As a matter of fact, all information about some relative phase of the unknown state turns out to be irreversibly erased as the entanglement is estimated.
Estimation of the degree of mixing of an unknown mixed state is a different but closely related topic that we shall also consider here. For the single-qubit case the amount of mixing is again specified by just one parameter, the modulus of the corresponding Bloch vector, whereas two more parameters, namely those fixing the direction of the Bloch vector, are required to specify the state completely. We shall show that in this case the optimal measuring strategy on any number N of qubits prepared in the same mixed state can be made compatible with the optimal estimation of the direction of its Bloch vector.
Finally, we will show that a possible way of optimally determining the entanglement of an unknown, two-qubit pure state consists precisely in estimating, also optimally, the degree of mixture of any of its two reduced density matrices. Therefore, it turns out in this simple bipartite case that the optimal estimation of a non-local parameter can be done through a local measurement.
The paper is structured as follows. Section II is devoted to background material. We introduce a convenient parameterization of two-qubit pure states and consider their isotropic distribution. We also review some basic aspects of parameter estimation and of quantum measurements. In Section III we pose the problem of entanglement estimation on firmer grounds and announce the main result of this paper: its optimal performance. Section IV, which is rather technical and could well be skipped in a first reading, is devoted to the computation of an effective density matrix ρ^(N)(b), an object which plays a central role in deriving the optimal strategy for estimating entanglement. In Section V the N = 1, 2, 3 cases are presented in more detail in order to illustrate the general case. Optimal estimation of the degree of mixing is discussed and solved in Section VI, and finally Section VII contains a discussion relating estimation of both entanglement and mixing, and some concluding remarks.

II. PRELIMINARIES
We will consider here a two-party scenario. Alice and Bob will share the N copies of a completely unknown two-qubit pure state |ψ⟩, and their aim will be to obtain as much information as possible about its entanglement.
The sense in which the state is unknown, the mechanisms for extracting information from the system and the scheme for evaluating the extracted information will be briefly reviewed in what follows.
A. Homogeneous distribution.
All that is initially known about the state of each pair of qubits is that it is pure. This corresponds to the unbiased distribution on the Hilbert space H_4 = H_2 ⊗ H_2 of two qubits, that is, to the only probability distribution invariant under arbitrary unitary transformations on H_4. It is convenient to express the unknown state |ψ⟩ ∈ H_2 ⊗ H_2, which depends on six parameters, in its Schmidt-like decomposition
$$ |\psi\rangle = \sqrt{\tfrac{1+b}{2}}\;|\hat a\rangle\otimes|\hat b\rangle + \sqrt{\tfrac{1-b}{2}}\;e^{i\alpha}\,|{-\hat a}\rangle\otimes|{-\hat b}\rangle , \qquad (1) $$
where the phase e^{iα}, which is usually absorbed by one of the kets it goes with, has been left explicit. The non-local parameter b ∈ [0, 1] characterizes the entanglement of |ψ⟩. Only for b = 1 is |ψ⟩ a product state |â⟩ ⊗ |b̂⟩, and thus unentangled. For b < 1 the state contains quantum correlations, b = 0 corresponding to a maximally entangled state. Recall that this parameter is the modulus of the Bloch vector of the reduced density matrix ρ_A on Alice's side,
$$ \rho_A \equiv \mathrm{tr}_B\,|\psi\rangle\langle\psi| = \tfrac{1}{2}\left(I + b\,\hat a\cdot\vec\sigma\right), \qquad (2) $$
and equivalently for ρ_B. The other four parameters correspond to the two directions â and b̂ of the Bloch vectors of ρ_A and ρ_B. Then, the unbiased distribution of pure states corresponds [13] to the isotropic distribution of â in S², b̂ in S², α in S¹ and the quadratic distribution of b in [0, 1],
$$ d\mu = 3b^2\,db\;\frac{d\hat a}{4\pi}\;\frac{d\hat b}{4\pi}\;\frac{d\alpha}{2\pi}. \qquad (3) $$

B. General measurements and information gain.
The parties are thus provided with N copies of a pure state |ψ⟩ as in Eq. (1), i.e. with the state |ψ⟩^{⊗N}, and our aim is to construct the most informative measurement on the collective, 2N-qubit system for the estimation of the parameter b. The optimality criterion to be used is based on the Kullback or mutual information K[f′, f] [14], a functional of two probability distributions f′ and f that is interpreted as the gain of information in replacing the latter distribution with the former one [15]. In our case, for instance, the prior, unbiased density function for the parameter b is given by (3), so we have f(b) = 3b². A generic measurement, allowing for the most general manipulation of the system, is represented by a resolution of the identity by means of a set of positive operators,
$$ \sum_k M^{(k)} = I, \qquad M^{(k)} \ge 0. \qquad (4) $$
After the above positive operator valued measurement (POVM) has been performed, giving the outcome k with probability tr(M^(k) ρ^{⊗N}), where ρ = |ψ⟩⟨ψ|, we compute the posterior density function for b, f(b|k), through the Bayes formula
$$ f(b|k) = \frac{p(k|b)\,f(b)}{p(k)}, \qquad (5) $$
where p(k) is given by
$$ p(k) = \int_0^1 db\; f(b)\,p(k|b), $$
and the conditional probability of getting outcome k when the state's non-local parameter has value b, p(k|b), will be given later. The gain of information resulting from obtaining the outcome k after the measurement is quantified by the Kullback information corresponding to the prior and posterior probability density functions,
$$ K[f_k, f] = \int_0^1 db\; f(b|k)\,\log_2\frac{f(b|k)}{f(b)}, \qquad f_k(b) \equiv f(b|k). \qquad (7) $$
This expression has to be averaged over all the possible outcomes of the measurement, so that the expected gain of information reads
$$ \bar K = \sum_k p(k)\,K[f_k, f], $$
and using (5) this expression can be written as
$$ \bar K = \sum_k \int_0^1 db\; f(b)\,p(k|b)\,\log_2 p(k|b) \;-\; \sum_k p(k)\,\log_2 p(k). \qquad (9) $$
Let us notice here that the value of K[f_k, f] in Eq. (7) would remain unchanged if we decided to characterize the entanglement of |ψ⟩ by another parameter b′ = h(b), where h is any bijective function of the original parameter b.
Consequently, the gain of information we compute for b also applies to any of the measures of entanglement so far proposed, such as the entanglement of formation [7] for the asymptotic regime, or the monotone [10] for the single-copy case.
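As an illustration of how the expected gain of information can be evaluated in practice, the following minimal Python sketch integrates the mutual-information form of the average Kullback gain numerically on a midpoint grid. The function name `expected_gain`, the grid size, and the two example measurements `flat` and `toy` are assumptions introduced here for illustration only; they do not appear in the paper.

```python
import numpy as np

def expected_gain(p_k_given_b, prior=lambda b: 3 * b**2, n_grid=200_000):
    """Average Kullback gain in bits for outcome probabilities p(k|b).

    p_k_given_b maps an array of b values to an array of shape (K, len(b)).
    The prior defaults to the density f(b) = 3 b^2 on [0, 1].
    """
    # Midpoint grid on [0, 1]; the mean over the grid approximates the integral.
    b = (np.arange(n_grid) + 0.5) / n_grid
    f = prior(b)
    p = np.asarray(p_k_given_b(b), dtype=float)        # shape (K, n_grid)
    p_k = (p * f).mean(axis=1)                         # p(k) = int db f(b) p(k|b)
    # Convention 0 * log 0 = 0 for both terms.
    log_p = np.log2(np.where(p > 0, p, 1.0))
    log_pk = np.log2(np.where(p_k > 0, p_k, 1.0))
    first = (f * p * log_p).mean(axis=1).sum()
    second = np.sum(p_k * log_pk)
    return first - second

# Hypothetical two-outcome measurements, used only to exercise the function.
flat = lambda b: np.array([0.5 * np.ones_like(b), 0.5 * np.ones_like(b)])
toy = lambda b: np.array([b, 1 - b])

print(expected_gain(flat))   # outcome independent of b: no information
print(expected_gain(toy))    # outcome correlated with b: positive gain
```

Because the gain depends on b only through the outcome probabilities and the prior measure, relabelling b by a bijective function leaves the result unchanged, in line with the remark above.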

III. OPTIMAL MEASUREMENTS FOR ENTANGLEMENT ESTIMATION
We are looking for a measurement of the form (4) such that the expected gain of information (9) is maximized. We will present and explain here and in Section V such optimal measurements, whereas their explicit construction is mainly contained in Section IV.

A. Local and global strategies
Before we proceed we comment on four classes of measurements Alice and Bob may consider in order to learn about b [12]:
• local measurements on only, say, Alice's side, i.e. on the N qubits supporting the local state ρ_A^{⊗N}, would be the most restrictive class of the hierarchy;
• uncorrelated bilocal measurements, i.e. each party measuring on their local N-qubit part independently, and
• classically correlated bilocal measurements, that is, with classical communication between Alice and Bob, are two intermediate types of strategies; finally,
• global measurements on the 2N qubits constitute the most general case.
Global measurements are in principle the most informative ones. But since the parameter b, which quantifies the entanglement of |ψ⟩, also completely quantifies the mixing of ρ_A (and ρ_B), it could well happen that local measurements, or bilocal ones on the two parties, optimal for the determination of the mixing, are as informative as the global ones with respect to entanglement. In fact, in reducing |ψ⟩⟨ψ| to ρ_A ⊗ ρ_B only the relative phase α is lost; the dependence on the directions â and b̂ and on the entanglement b is preserved. We have found the optimal global and local measurements of b. The results obtained following the two strategies are the same, as we will discuss in Section VII, so all the extractable information about the entanglement is preserved under the partial trace operation, and the four classes considered above turn out to be equivalent for entanglement estimation.
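The statement that the partial trace keeps b but discards the relative phase α can be checked numerically. The sketch below assumes the Schmidt-like form with weights (1 ± b)/2 for the two-qubit state; the helper names `ket_along`, `two_qubit_state`, `reduced_A` and `bloch_modulus` are hypothetical and introduced only for this illustration.

```python
import numpy as np

def ket_along(n):
    """Qubit state whose Bloch vector is the unit vector n."""
    theta = np.arccos(np.clip(n[2], -1.0, 1.0))
    phi = np.arctan2(n[1], n[0])
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

def two_qubit_state(b, a_hat, b_hat, alpha):
    """Schmidt-like state with Schmidt weights (1 +/- b)/2 (assumed form)."""
    cp, cm = np.sqrt((1 + b) / 2), np.sqrt((1 - b) / 2)
    plus = np.kron(ket_along(a_hat), ket_along(b_hat))
    minus = np.kron(ket_along(-a_hat), ket_along(-b_hat))
    return cp * plus + cm * np.exp(1j * alpha) * minus

def reduced_A(psi):
    """Alice's reduced density matrix, obtained by tracing out Bob."""
    m = psi.reshape(2, 2)            # rows: Alice's index, columns: Bob's index
    return m @ m.conj().T

def bloch_modulus(rho):
    """Modulus of the Bloch vector, from tr(rho^2) = (1 + |r|^2) / 2."""
    return np.sqrt(max(2 * np.trace(rho @ rho).real - 1, 0.0))

rng = np.random.default_rng(0)
a_hat = rng.normal(size=3)
a_hat /= np.linalg.norm(a_hat)
b_hat = rng.normal(size=3)
b_hat /= np.linalg.norm(b_hat)

rho_1 = reduced_A(two_qubit_state(0.3, a_hat, b_hat, alpha=0.0))
rho_2 = reduced_A(two_qubit_state(0.3, a_hat, b_hat, alpha=1.7))
print(bloch_modulus(rho_1), np.allclose(rho_1, rho_2))
```

Both reduced states coincide and have Bloch modulus b, so any information about α is absent from Alice's local description, while b is still available.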

B. Effective mixed state
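Notice that all the dependence on the measuring strategy (4) in Eq. (9) is contained in the probability p(k|b) of outcome k conditioned on the entanglement of the state being some given b,
$$ p(k|b) = \int \frac{d\hat a}{4\pi}\,\frac{d\hat b}{4\pi}\,\frac{d\alpha}{2\pi}\;\mathrm{tr}\!\left(M^{(k)}\,(|\psi\rangle\langle\psi|)^{\otimes N}\right), $$
where the integral over the rest of the parameters reflects the fact that we are only interested in the entanglement. This expression can also be written as
$$ p(k|b) = \mathrm{tr}\!\left(M^{(k)}\,\rho^{(N)}(b)\right), \qquad (13) $$
where the mixed state ρ^(N)(b) is
$$ \rho^{(N)}(b) \equiv \int \frac{d\hat a}{4\pi}\,\frac{d\hat b}{4\pi}\,\frac{d\alpha}{2\pi}\;(|\psi\rangle\langle\psi|)^{\otimes N}. \qquad (14) $$
Eq. (13) allows for an alternative interpretation of our problem: a 2N-qubit mixed state ρ^(N)(b) is drawn randomly with prior probability distribution f(b) = 3b² and we want to determine it by estimating b.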
We will compute p(k|b) in the basis that diagonalizes ρ^(N)(b), which will crucially turn out to be independent of b. Let us denote by λ_1(b), ..., λ_m(b) the positive eigenvalues of ρ^(N)(b), by n_1, ..., n_m their multiplicities, and by P_j the projector onto the eigenspace of λ_j. From the normalization of (14) the relation Σ_{j=1}^{m} n_j λ_j = 1 follows. The sum n ≡ Σ_j n_j of multiplicities of (non-vanishing) eigenvalues equals the dimension of the space which supports (|ψ⟩⟨ψ|)^{⊗N}. This is the symmetric subspace of H_4^{⊗N}, and thus [5]
$$ n = \binom{N+3}{3}. \qquad (15) $$
With this notation Eq. (13) reads
$$ p(k|b) = \sum_{j=1}^{m} \lambda_j(b)\,\mathrm{tr}\!\left(M^{(k)} P_j\right). \qquad (16) $$
By substituting this expression in (9) and using the inequality [16]
$$ (x_1 + x_2)\,\log\frac{x_1 + x_2}{y_1 + y_2} \;\le\; x_1 \log\frac{x_1}{y_1} + x_2 \log\frac{x_2}{y_2}, \qquad (17) $$
where x_i, y_i ≥ 0, along with the fact that the POVM is a resolution of the identity in the symmetric subspace, so that Σ_k tr(M^(k) P_j) = n_j, it follows that the average gain of information is bounded by
$$ \bar K \;\le\; \sum_{j=1}^{m} \int_0^1 db\; f(b)\,n_j\lambda_j(b)\,\log_2\!\big[n_j\lambda_j(b)\big] \;-\; \sum_{j=1}^{m} p_j \log_2 p_j, \qquad p_j \equiv \int_0^1 db\; f(b)\,n_j\lambda_j(b). \qquad (18) $$

C. Minimal most informative measuring strategy.
The bound (18) can be minimally saturated through a measurement with m outcomes where each M^(k) is the n_k-dimensional projector onto the subspace corresponding to the eigenvalue λ_k of ρ^(N)(b), so that p(k|b) = n_k λ_k(b). Therefore the construction of the optimal measurement can be readily performed once the spectral decomposition of the state (14) is known, and this is obtained for arbitrary N in the next Section. For a more detailed account of the N = 1, 2, 3 cases see Section V, where the gain of information up to N = 80 has also been computed explicitly.
Notice also that there are other ways measuring strategies can be evaluated and, consequently, there is not a unique notion of optimality. For instance, in [1][2][3][4][5][6] a guess for the unknown state is made depending on the outcome of the measurement, and then both guessed and unknown state are compared using the fidelity. It can be proved, following Ref. [16], that the optimal measurements presented here, the most informative ones, are also optimal if we decide, alternatively, for a fidelity-like figure of merit satisfying some very general conditions [19].

IV. COMPUTATION OF ρ^(N)
It has been shown that the spectrum of ρ^(N)(b) determines the maximal gain of information about b, whereas its eigenprojectors lead to the corresponding measuring strategy. Our next step will be the computation of the spectral decomposition of this effective mixed state.
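Let us rewrite the generic two-qubit pure state (1) as
$$ |\psi\rangle = U_A\otimes U_B\,|\psi(b)\rangle, \qquad |\psi(b)\rangle \equiv c_+\,|+\rangle_A|+\rangle_B + c_-\,|-\rangle_A|-\rangle_B, \qquad (19) $$
where c_± ≡ \sqrt{(1 ± b)/2}, the single-qubit pure states |+⟩_A and |−⟩_A (|+⟩_B and |−⟩_B) constitute an orthonormal basis in Alice's (Bob's) part, corresponding to some fixed direction in the Bloch sphere, U_A and U_B are unitary transformations in each single-qubit space, and |ψ(b)⟩ is a reference state.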
The state ρ^(N)(b) corresponds then to a Haar integral over the group SU(2) × SU(2), since it can be expressed as
$$ \rho^{(N)}(b) = \int_G dg\; D(g)^{\otimes N}\, M(b)^{\otimes N}\, D(g)^{\dagger\,\otimes N}, \qquad (20) $$
where the index g denotes the elements of the group G = SU(2) × SU(2), D(g) = U_A ⊗ U_B is a 1/2 ⊗ 1/2 irreducible representation (irrep) of this group and M(b) = |ψ(b)⟩⟨ψ(b)|.
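A well-known result in group representation theory following from Schur's lemma, the so-called orthogonality lemma, will be useful in the calculation of this integral. Consider a matrix A^{αβ}(B) given by
$$ A^{\alpha\beta}(B) = \int_G dg\; D^{\alpha}(g)\, B\, D^{\beta}(g)^{\dagger}, $$
where D^α and D^β are two unitary irreps of the group G. Then,
Lemma 1 (orthogonality lemma): A^{αβ}(B) = 0 if D^α and D^β are inequivalent, whereas A^{αα}(B) = (tr B / d_α) 1, with d_α the dimension of D^α.
The integral (20) is thus determined by the irreps of G contained in D(g)^{⊗N}, and our next task is to recognize them.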
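The state |ψ(b)⟩^{⊗N} can be expanded as
$$ |\psi(b)\rangle^{\otimes N} = c_+^{N}\,|{+}\cdots{+}\rangle_A|\cdot\rangle_B + c_+^{N-1}c_-\big(|{+}\cdots{+}{-}\rangle_A|\cdot\rangle_B + \cdots + |{-}{+}\cdots{+}\rangle_A|\cdot\rangle_B\big) + \cdots + c_-^{N}\,|{-}\cdots{-}\rangle_A|\cdot\rangle_B, \qquad (24) $$
where |·⟩_B means that we have exactly the same vector in the second subsystem. Notice that in the expression above all the elements of the product basis {|u_i⟩} of the local spaces H_2^{⊗N} appear. We now move from the local spin basis {|u_i⟩_A} to the coupled one {|v_i⟩_A} in Alice's N qubits, and we do the same in Bob's. The following lemma, which can be easily checked, will be useful here.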
Lemma 2: Let {|e_i⟩} and {|f_i⟩} be two orthonormal bases in C^l, related by an orthogonal transformation O, |e_i⟩ = Σ_j O_{ij} |f_j⟩. Then
$$ \sum_{i=1}^{l} |e_i\rangle\otimes|e_i\rangle = \sum_{i=1}^{l} |f_i\rangle\otimes|f_i\rangle. $$
Now, notice that the unitary transformation relating the local basis and the coupled one is real (since all the Clebsch-Gordan coefficients are real) and that there is a conservation rule for the total third spin component (i.e. the Clebsch-Gordan coefficients that couple two states with third components m_1 and m_2 to a coupled state with third component m are proportional to δ_{m,m_1+m_2}). Then Eq. (24) can be reexpressed, using the previous two facts and Lemma 2, in the coupled basis as
$$ |\psi(b)\rangle^{\otimes N} = c_+^{N}\,|v_1\rangle_A|\cdot\rangle_B + c_+^{N-1}c_-\big(|v_2\rangle_A|\cdot\rangle_B + |v_3\rangle_A|\cdot\rangle_B + \cdots + |v_{N+1}\rangle_A|\cdot\rangle_B\big) + \cdots + c_-^{N}\,|v_{2^N}\rangle_A|\cdot\rangle_B \qquad (26) $$
(see the examples in the next Section for more details). We note that it is the symmetry between the terms in A and in B that allows us to derive (26) from (24). Let us now have a closer look at Eq. (26). The term with coefficient c_+^N corresponds simply to the state with total spin j maximal in both Alice's and Bob's subsystems (i.e., j_A = j_B = N/2) and also maximal third spin component m, namely m_A = m_B = N/2. We can thus write, with the notation |j, m⟩ for the coupled states, |v_1⟩_A|·⟩_B = |N/2, N/2⟩_A |N/2, N/2⟩_B. Of the N kets multiplying c_+^{N−1} c_−, one, |v_2⟩ = |N/2, N/2 − 1⟩, again belongs to the previous N/2 ⊗ N/2 irrep; the remaining N − 1 kets, |v_3⟩, ..., |v_{N+1}⟩, have j_A = j_B = N/2 − 1, and thus belong to N − 1 different (but equivalent) (N/2 − 1) ⊗ (N/2 − 1) irreps of the group. But since only the linear combination |v_3⟩ + ··· + |v_{N+1}⟩ appears, the relevant irrep is just the symmetric combination of the latter N − 1 ones, which we will denote by {(N/2 − 1) ⊗ (N/2 − 1)}_sym, and which no longer decomposes as the product of two irreps of SU(2). The same applies to the (N/2 − 2) ⊗ (N/2 − 2) irreps, and so on.
Thus, the space which supports the initial state can be decomposed in terms of irreps of SU(2) × SU(2) as
$$ \left(\tfrac{N}{2}\otimes\tfrac{N}{2}\right) \oplus \left\{\left(\tfrac{N}{2}-1\right)\otimes\left(\tfrac{N}{2}-1\right)\right\}_{\rm sym} \oplus \cdots \oplus \left\{\tfrac{N \bmod 2}{2}\otimes\tfrac{N \bmod 2}{2}\right\}_{\rm sym}, $$
where N mod 2 is equal to one for odd N and equal to zero for even N. It can be checked that this result agrees dimensionally with formula (15). The decomposition shown above in terms of the relevant irreps of the group SU(2) × SU(2), together with the orthogonality lemma, can be used to solve the integral in (20). As we have argued, when plugging (26) into (20) the cross terms corresponding to inequivalent representations, such as |v_1⟩(⟨v_3| + ... + ⟨v_{N+1}|), vanish as we integrate, while the terms within the same representation, such as |v_1⟩⟨v_1|, lead to a contribution proportional to the identity in the subspace associated with the representation. So the state ρ^(N)(b) is equal to
$$ \rho^{(N)}(b) = \lambda_1(b)\,\mathbb{1}_{\frac{N}{2}\otimes\frac{N}{2}} + \lambda_2(b)\,\mathbb{1}_{\{(\frac{N}{2}-1)\otimes(\frac{N}{2}-1)\}_{\rm sym}} + \cdots. \qquad (28) $$
This is the spectral decomposition we are looking for, where {λ_j} are the entanglement-dependent eigenvalues of ρ^(N)(b), the traces of the identities giving the corresponding multiplicities {n_j}. It is important to notice that, as mentioned before, the eigenspaces are independent of b.
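The dimensional agreement mentioned above can be checked directly. The following short Python sketch assumes that the irrep labelled by total spin j contributes a subspace of dimension (2j + 1)², and compares the sum over j = N/2, N/2 − 1, ... with the dimension of the symmetric subspace of N four-level systems. The function names are hypothetical.

```python
from math import comb

def doubled_spins(N):
    """Values of 2j for j = N/2, N/2 - 1, ..., (N mod 2)/2."""
    return range(N, -1, -2)

def dim_from_irreps(N):
    """Sum of (2j + 1)^2 over the relevant irreps j (x) j."""
    return sum((two_j + 1) ** 2 for two_j in doubled_spins(N))

def dim_symmetric(N, d=4):
    """Dimension of the symmetric subspace of N copies of a d-level system."""
    return comb(N + d - 1, d - 1)

for N in (1, 2, 3, 10, 80):
    print(N, dim_from_irreps(N), dim_symmetric(N), len(doubled_spins(N)))
```

The last printed column is the number of irreps, i.e. the number of outcomes of the minimal measurement: N/2 + 1 for even N and (N + 1)/2 for odd N.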
The calculation of n_j λ_j can now be readily performed from Eq. (26) by computing the trace of the projection of |ψ(b)⟩^{⊗N} onto each relevant irrep. The determination of the spectrum of ρ^(N)(b) completes, as we have shown, the construction of the optimal measurement for the estimation of the entanglement. In the next section some examples are studied in order to clarify the implementation of the procedure.

V. EXAMPLES
In this section we will apply the procedure described above to obtain the optimal estimation of b when one, two and three identical copies of the initial state are at our disposal.

A. N = 1
The simplest case, N = 1, is now straightforward. The state written as in (19) belongs to the 1/2 ⊗ 1/2 irrep of SU(2) × SU(2). From (20) we have, using the orthogonality lemma as in (28),
$$ \rho^{(1)}(b) = \lambda_1(b)\,\mathbb{1}_{\frac12\otimes\frac12}. $$
The eigenvalue λ_1(b) = 1/4 is obtained by taking the trace in the expression above. The probability p(k|b) (see (13)) is independent of b, so that p(k) = p(k|b) and the average Kullback information (9) vanishes.
Consequently, no information whatsoever can be obtained about the entanglement of a completely unknown pure state if only one copy is at our disposal.

B. N = 2
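For the N = 2 case the initial state has the form, from (23) or (24),
$$ |\psi(b)\rangle^{\otimes 2} = c_+^2\,|{+}{+}\rangle_A|{+}{+}\rangle_B + c_+c_-\big(|{+}{-}\rangle_A|{+}{-}\rangle_B + |{-}{+}\rangle_A|{-}{+}\rangle_B\big) + c_-^2\,|{-}{-}\rangle_A|{-}{-}\rangle_B. $$
Now, using Lemma 2 and the conservation law mentioned before for the Clebsch-Gordan coefficients (cf. Eq. (26)), we can rewrite the state as
$$ |\psi(b)\rangle^{\otimes 2} = c_+^2\,|1,1\rangle_A|1,1\rangle_B + c_+c_-\big(|1,0\rangle_A|1,0\rangle_B + |0,0\rangle_A|0,0\rangle_B\big) + c_-^2\,|1,-1\rangle_A|1,-1\rangle_B, \qquad (31) $$
where for each party the coupled basis is related to the local one by means of an orthogonal transformation, as usual,
$$ |1,1\rangle = |{+}{+}\rangle, \quad |1,0\rangle = \tfrac{1}{\sqrt2}\big(|{+}{-}\rangle + |{-}{+}\rangle\big), \quad |1,-1\rangle = |{-}{-}\rangle, \quad |0,0\rangle = \tfrac{1}{\sqrt2}\big(|{+}{-}\rangle - |{-}{+}\rangle\big). $$
The state |ψ(b)⟩^{⊗2} in (31) is then supported in the 1 ⊗ 1 and the 0 ⊗ 0 irreps of SU(2) × SU(2), and the application of Lemma 1 gives
$$ \rho^{(2)}(b) = \lambda_1(b)\,\mathbb{1}_{1\otimes1} + \lambda_2(b)\,\mathbb{1}_{0\otimes0}. $$
We just need to pick up the contributions of (31) to each irrep, that is, the traces of the corresponding projections, to find that
$$ n_1\lambda_1(b) = c_+^4 + c_+^2c_-^2 + c_-^4 = \frac{3+b^2}{4}, \qquad n_2\lambda_2(b) = c_+^2c_-^2 = \frac{1-b^2}{4}, $$
with n_1 = 9 and n_2 = 1. The optimal measurement (see Eq. (18)) then consists of two projectors onto the 1 ⊗ 1 and 0 ⊗ 0 irreps of SU(2) × SU(2), with probabilities p(1|b) = n_1 λ_1(b) = (3 + b²)/4 and p(2|b) = n_2 λ_2(b) = (1 − b²)/4, and from them p(1) = 9/10 and p(2) = 1/10. Finally the gain of information can be computed using (9), which gives K̄ = 0.0375 bits.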
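The N = 2 numbers quoted above can be cross-checked with a few lines of numerical integration. The sketch below is only an illustration under stated assumptions: it takes c_±² = (1 ± b)/2 as the Schmidt weights, reads the outcome probabilities off the squared norms of the components in the 1 ⊗ 1 and 0 ⊗ 0 irreps, and approximates the integrals over b with a midpoint rule. The variable names are hypothetical.

```python
import numpy as np

n_grid = 400_000
b = (np.arange(n_grid) + 0.5) / n_grid        # midpoint grid on [0, 1]
f = 3 * b**2                                   # prior density for b
cp2, cm2 = (1 + b) / 2, (1 - b) / 2            # c_+^2 and c_-^2 (assumed weights)

p_given_b = np.array([
    cp2**2 + cp2 * cm2 + cm2**2,               # squared norm in the 1 (x) 1 irrep
    cp2 * cm2,                                 # squared norm in the 0 (x) 0 irrep
])
p = (f * p_given_b).mean(axis=1)               # p(k) = int db f(b) p(k|b)

# Average gain: int f p log2 p summed over outcomes, minus sum_k p(k) log2 p(k).
gain = (f * p_given_b * np.log2(p_given_b)).mean(axis=1).sum() - np.sum(p * np.log2(p))
print(p, gain)
```

The averaged probabilities should come out close to 9/10 and 1/10, and the gain close to the 0.0375 bits quoted in the text.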
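
C. N = 3
The last case we want to discuss is N = 3. Starting now from (26) we have
$$ |\psi(b)\rangle^{\otimes 3} = c_+^3\,|\tfrac32,\tfrac32\rangle_A|\cdot\rangle_B + c_+^2c_-\Big(|\tfrac32,\tfrac12\rangle_A|\cdot\rangle_B + \sum_{s=1,2}|\tfrac12,\tfrac12;s\rangle_A|\cdot\rangle_B\Big) + c_+c_-^2\Big(|\tfrac32,-\tfrac12\rangle_A|\cdot\rangle_B + \sum_{s=1,2}|\tfrac12,-\tfrac12;s\rangle_A|\cdot\rangle_B\Big) + c_-^3\,|\tfrac32,-\tfrac32\rangle_A|\cdot\rangle_B, \qquad (35) $$
where s = 1, 2 labels the two equivalent spin-1/2 multiplets of three qubits. We observe that only contributions to the 3/2 ⊗ 3/2 irrep and to two different 1/2 ⊗ 1/2 irreps of SU(2) × SU(2) appear. Notice, in addition, that since in this expansion the contributions to the two 1/2 ⊗ 1/2 irreps are equal, the relevant irrep is precisely a symmetric combination of the latter two, {1/2 ⊗ 1/2}_sym. The orthogonality lemma now gives
$$ \rho^{(3)}(b) = \lambda_1(b)\,\mathbb{1}_{\frac32\otimes\frac32} + \lambda_2(b)\,\mathbb{1}_{\{\frac12\otimes\frac12\}_{\rm sym}}. $$
Finally, by collecting the traces of the projections of (35) onto each irrep we obtain
$$ n_1\lambda_1(b) = c_+^6 + c_+^4c_-^2 + c_+^2c_-^4 + c_-^6 = \frac{1+b^2}{2}, \qquad n_2\lambda_2(b) = 2\big(c_+^4c_-^2 + c_+^2c_-^4\big) = \frac{1-b^2}{2}, $$
and thus the optimal measurement is composed of a 16-dimensional and a 4-dimensional projector onto the two irreps shown above, the corresponding probabilities being p(1|b) = (1 + b²)/2 and p(2|b) = (1 − b²)/2. From them p(1) = 4/5 and p(2) = 1/5, and the gain of information is 0.084 bits.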

D. N > 3
We have applied the same general procedure to obtain the gain of information up to N = 80, as reported in Table I and Figure 1. We observe a logarithmic asymptotic dependence of the gain of information on the number N of available copies of |ψ⟩, which reads bits of information on b.
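The general procedure can be sketched numerically for arbitrary N. The closed form used below is an assumption brought in for illustration and is not quoted from the text: it uses the standard multiplicity C(N, k) − C(N, k − 1) of the spin-(N/2 − k) representation in N qubits, and weights (c_+²)^i (c_−²)^{N−i} with c_±² = (1 ± b)/2. It reduces to the probabilities stated above for N = 2 and N = 3. The function names `outcome_probs` and `average_gain` are hypothetical.

```python
import numpy as np
from math import comb

def outcome_probs(N, b):
    """p(k|b) for k = 0..N//2, where k labels total spin j = N/2 - k.

    Assumed closed form: multiplicity of the spin-j irrep times the sum of
    the weights p^i q^(N-i) over the 2j + 1 values i = k, ..., N - k.
    """
    p, q = (1 + b) / 2, (1 - b) / 2
    rows = []
    for k in range(N // 2 + 1):
        mult = comb(N, k) - (comb(N, k - 1) if k > 0 else 0)
        weight = sum(p**i * q**(N - i) for i in range(k, N - k + 1))
        rows.append(float(mult) * weight)
    return np.array(rows)

def average_gain(N, n_grid=50_000):
    """Average gain of information in bits, with prior f(b) = 3 b^2."""
    b = (np.arange(n_grid) + 0.5) / n_grid     # midpoint grid on [0, 1]
    f = 3 * b**2
    P = outcome_probs(N, b)                    # shape (N//2 + 1, n_grid)
    log_P = np.log2(np.where(P > 0, P, 1.0))   # convention 0 log 0 = 0
    first = np.sum(np.mean(f * P * log_P, axis=1))
    pk = np.mean(f * P, axis=1)
    second = np.sum(pk * np.log2(np.where(pk > 0, pk, 1.0)))
    return first - second

for N in (1, 2, 3, 10, 20, 40, 80):
    print(N, average_gain(N))
```

The values for N = 2 and N = 3 can be compared with the 0.0375 and 0.084 bits quoted in the text, and the printed sequence gives a feeling for the slow, roughly logarithmic growth with N.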

VI. OPTIMAL ESTIMATION OF MIXING
So far we have considered the most general measurement involving the whole space (H 2 ⊗ H 2 ) ⊗N of N copies of a two-qubit pure state. Now we are going to study optimal local measurements for the estimation of its entanglement. Alice will perform a collective measurement over the N copies of the state ρ A in Eq. (2) at her disposal in order to estimate the parameter b. Consequently, we are also studying optimal strategies for estimating the degree of mixing of a single-qubit mixed state, when N copies are available.
In order to study the latter with more generality we will consider a generic prior distribution f(b) for the degree of mixing while keeping an isotropic distribution in the Bloch vector direction â of the unknown mixed state, with
$$ d\mu = f(b)\,db\;\frac{d\hat a}{4\pi}. $$
A general measurement on the local composite system supporting the state ρ_A^{⊗N} consists of a resolution of the identity in the corresponding Hilbert space H_2^{⊗N} by means of positive operators M^(k). The gain of information is as in (9), where now
$$ p(k|b) = \mathrm{tr}\!\left(M^{(k)}\,\rho^{(N)}(b)\right), $$
so that we need to compute the effective mixed state
$$ \rho^{(N)}(b) = \int_G dg\;\big(U_A\,\rho_A(b)\,U_A^{\dagger}\big)^{\otimes N}, \qquad (41) $$
where the integral is performed over the group G = SU(2) and a single copy of the mixed state has been expressed, as before, in terms of a reference state ρ_A(b) ≡ c_+² |+⟩⟨+| + c_−² |−⟩⟨−| and a unitary transformation U_A. The procedure to be followed is analogous to the previous one, the spectral decomposition of the state (41) allowing us to build the optimal measurement.
The density matrix ρ_A(b)^{⊗N} can be written, by using a straightforward modification of Lemma 2 and the mentioned properties of the Clebsch-Gordan coefficients, in terms of the coupled basis {|v_i⟩_A} as
$$ \rho_A(b)^{\otimes N} = c_+^{2N}\,|v_1\rangle\langle v_1| + c_+^{2(N-1)}c_-^{2}\big(|v_2\rangle\langle v_2| + \cdots + |v_{N+1}\rangle\langle v_{N+1}|\big) + \cdots + c_-^{2N}\,|v_{2^N}\rangle\langle v_{2^N}|. $$
Notice that the important role played before by the symmetry between the kets in A and B (cf. Eq. (26)) is now played by the symmetry between the terms in the bra and in the ket. However, we see that now there are no cross terms between inequivalent irreps of SU(2), and that equivalent irreps, such as the N − 1 copies of the (N/2 − 1) irrep, obtain equal but independent contributions. The space H_2^{⊗N}, decomposed in terms of irreps of SU(2), is (see also Refs. [6] and [17])
$$ H_2^{\otimes N} = \bigoplus_{j} d_j\,(j), \qquad j = \tfrac{N}{2}, \tfrac{N}{2}-1, \ldots, \tfrac{N \bmod 2}{2}, \qquad d_j = \binom{N}{N/2-j} - \binom{N}{N/2-j-1}. \qquad (43) $$
The spectral decomposition of ρ^(N) is determined by application of the orthogonality lemma. Since equivalent irreps always receive the same contributions in the decomposition (43), the corresponding eigenvalues are equal, so that (41) reads
$$ \rho^{(N)}(b) = \sum_{k} \lambda^{L}_{k}(b)\,\mathbb{1}_{k}, $$
where 1_k is the identity on the n^L_k-dimensional subspace that collects all the equivalent irreps with the same total spin. This is, of course, simply what remains from Eq. (28) when Bob's subsystem is traced out, and we have included the whole derivation only for completeness. Eqs. (16)-(18) still hold and therefore the optimal measurement for the degree of mixing b corresponds, for any isotropic distribution, to projections onto each of the subspaces associated with the eigenvalues {λ^L_k}. The gain of information is then given by the right hand side of Eq. (18). Notice that both the number of outcomes and the corresponding probabilities p(k|b) = n^L_k λ^L_k(b) are equal to the ones obtained before for entanglement estimation. In particular, it follows that there is no way to learn about the degree of mixture of an unknown mixed state if only one copy is available.
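For N = 2 the local measurement is simply a projection onto the triplet and singlet subspaces of Alice's two qubits, which can be written with the swap operator. The following sketch, with hypothetical helper names, builds two copies of a qubit state with Bloch vector b n̂ and checks that the outcome probabilities do not depend on the direction n̂ and coincide with the global N = 2 values (3 + b²)/4 and (1 − b²)/4.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]])
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

# Swap of two qubits and the projectors onto the spin-1 and spin-0 subspaces.
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)
P_TRIPLET = (np.eye(4) + SWAP) / 2
P_SINGLET = (np.eye(4) - SWAP) / 2

def qubit_state(r):
    """Density matrix (1 + r . sigma) / 2 for Bloch vector r."""
    return 0.5 * (np.eye(2) + r[0] * SX + r[1] * SY + r[2] * SZ)

def local_probs(b, direction):
    """Triplet and singlet probabilities for two copies with Bloch vector b * n."""
    n = np.asarray(direction, dtype=float)
    n = n / np.linalg.norm(n)
    rho2 = np.kron(qubit_state(b * n), qubit_state(b * n))
    return (np.trace(P_TRIPLET @ rho2).real, np.trace(P_SINGLET @ rho2).real)

print(local_probs(0.6, (1.0, 2.0, -0.5)))
print(local_probs(0.6, (0.0, 0.0, 1.0)))
```

The singlet probability equals (1 − tr ρ_A²)/2, so this two-outcome measurement is a direct probe of the purity of ρ_A, in agreement with the statement that the local and global probabilities coincide.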

VII. DISCUSSION AND CONCLUSIONS
We have presented in this work an optimal strategy for the estimation of the entanglement of two-qubit pure states, when N copies are available. Such an optimal measurement is also minimal, in the sense that it consists of the minimum number of outcomes, namely N/2 + 1 ((N + 1)/2) outcomes for the even (odd) N-copy case. Most of the corresponding projectors are of dimension greater than one, and of course any further decomposition of them can be used in principle to obtain, simultaneously, some additional information about other properties of the unknown state. Our optimal POVM is, however, not compatible with projecting onto states of the form |ψ_i⟩^{⊗N}, as optimal POVMs for state determination do [2][3][4][5], and such decompositions are thus less powerful for that purpose.
An interesting particular case is when the initial state is a product one, i.e. b = 1. It can be seen that in this situation we have only the outcome corresponding to the space of maximum spin, since n_1 λ_1(1) = 1. Therefore if an outcome k with k > 1 is obtained, we can be sure that the state is entangled.
In the previous Section we have also been concerned with the optimal estimation of the degree of mixing. Our optimal measurement, again minimal, can be used, for instance, to quantify the degree of purity of states created by a preparation device whose polarization direction we ignore. Our strategy is actually complementary to the one aiming at revealing optimally the direction of polarization of the state [1]. As a matter of fact, the optimal POVM we have obtained is just a coarse graining of the one obtained in [6] for optimal estimation of mixed states, which turns out to reach also the optimal standards of direction estimation obtained in [1]. Consequently, direction and modulus of the Bloch vector of an unknown mixed state can be optimally estimated simultaneously. Notice that this is not a frequent situation. If, instead, we wanted to estimate the x, y and z components of the Bloch vector independently, we would obtain incompatible optimal strategies (consider e.g. the N = 1 case, where an optimal measurement for the component of the Bloch vector along a direction n̂ consists of a two-outcome measurement projecting on that direction).
Finally, we can argue that bilocal measurements, either uncorrelated or classically correlated, do not imply any improvement on the simpler, local ones for entanglement estimation. Once we get an outcome from Alice's local measurement we can compute Bob's effective state, and it is clear from Eq. (28) that his outcome will be the same as Alice's, so that no extra information on b will be obtained. We have also seen that the optimal global measurement on |ψ⟩^{⊗N} is perfectly mimicked by a local one on ρ_A^{⊗N} (or ρ_B^{⊗N}), so that actually all four classes of measurements considered in Section III A are equivalent. In fact, with hindsight, one can understand this result: local measurements are performed on the reduced density matrix, which is obtained by a partial trace over the other subsystem. This operation erases the information contained in the parameters α and b̂ of Eq. (1). On the other hand the global measurement can be interpreted as being performed on the effective density matrix of Eq. (14), where the same parameters have been integrated over. This operation erases the information contained in them too.
It would be challenging to address the same question for bipartite mixed states, and for systems shared by more than two parties. Notice that in none of these cases would optimal estimation of the non-local parameters be possible by means of local (or even uncorrelated bilocal) measuring strategies. This is the case for mixed states because any given reduced density matrix ρ_A may correspond to infinitely many mixed states ρ, with different degrees of entanglement, so that not even in the limit N → ∞ can the entanglement of ρ be properly inferred from ρ_A^{⊗N}. The mere existence of hidden non-local parameters [18], that is, of entanglement parameters that are erased during the partial trace operation, also prevents uncorrelated local strategies from being optimal for estimation of pure-state tripartite entanglement. To conclude, two-qubit pure-state entanglement, a quantum non-local property, can be optimally estimated by means of local, but collective, measurements.

[19] The measurements presented in this work are also optimal with respect to a fidelity-guided scheme if the quality of the guesses is evaluated through any concave fidelity function F(b − b_k), where b is the unknown parameter and b_k is the guess made after outcome k, that reasonably takes its maximum for b_k = b, i.e. F((x + x′)/2) ≥ (F(x) + F(x′))/2 and F(0) ≥ F(x) for x ∈ [−1, 1].

FIG. 1. Average gain of information K̄ about b given N copies of the state |ψ⟩. The points represent the results obtained by the described optimal measurement, while the line shows the asymptotic behavior.