Universality of optimal measurements

We present optimal and minimal measurements on identical copies of an unknown state of a qubit when the quality of measuring strategies is quantified with the gain of information (Kullback information of probability distributions). We also show that the maximal gain of information occurs, among isotropic priors, when the state is known to be pure. Universality of optimal measurements follows from our results: using the fidelity or the gain of information, two different figures of merit, leads to exactly the same conclusions. We finally investigate the optimal capacity of $N$ copies of an unknown state as a quantum channel of information.

Consider an unknown state of a two-level quantum system described by the density matrix $\rho(\vec b)$, $\vec b$ being the Bloch vector, $b \equiv |\vec b| \le 1$. The preparation device provides $N$ identical copies of the system, so that the state at our disposal is $\rho(\vec b)^{\otimes N}$. In the past few years the optimal measuring strategy, i.e. the most successful at revealing the identity of the unknown state, has been obtained, first for pure states [1,2] and then for mixed states [3]. The minimal strategies among the optimal ones, i.e. those with the smallest number of outcomes, have also been constructed, both for pure states [4] and for mixed states [3]. In the processing of information contained in quantum states, knowing the most efficient read-out procedures, i.e. the optimal and least resource-consuming ones, is of course important.
In all these contributions the quality of the measuring strategy, characterized by a resolution of the identity in terms of positive operators $M_i \ge 0$,

$$\sum_i M_i = \mathbb{1}, \qquad (1)$$

has been quantified by the fidelity [5]. In other words, when outcome $i$ (related to $M_i$) happens one guesses the unknown state to be $\tilde\rho_i \equiv \rho(\vec p_i)$ and one quantifies the quality of the guess by

$$F(\rho, \tilde\rho_i) = \left(\mathrm{Tr}\sqrt{\rho^{1/2}\,\tilde\rho_i\,\rho^{1/2}}\right)^2. \qquad (2)$$

One can arrive at eq. (2) from several different starting points. One of them is based on a measure of distinguishability of the probability distributions associated with $\rho$ and $\rho'$ by performing general positive operator valued measurements (as in eq. (1)) on them [6] and minimizing,

$$F(\rho, \rho') = \min_{\{M_i\}} \left(\sum_i \sqrt{\mathrm{Tr}[\rho M_i]}\,\sqrt{\mathrm{Tr}[\rho' M_i]}\right)^2.$$

Another is based on the standard Hilbert space scalar product of the two pure states $|\psi\rangle$, $|\psi'\rangle$ which, belonging to $\mathbb{C}^2 \otimes \mathbb{C}^2$, lead to $\rho$ and $\rho'$ when reduced [7],

$$F(\rho, \rho') = \max |\langle\psi|\psi'\rangle|^2,$$

where maximization is performed over all such purifications. These equivalent definitions of the fidelity, plus further properties which characterize it, make it a unique quantification of the comparison of two general quantum states.

In references [1,2,4] the unknown state was known to be pure, $b = 1$, but no knowledge of the direction of the Bloch vector was assumed. In reference [3] the unknown state was a mixed state drawn stochastically from a known isotropic distribution $f(b)$, and although the best guess $\tilde\rho_i$ depended on $f(b)$, the optimal measuring strategy, that is the set $\{M_i\}$ of positive operators of the different outcomes, did not. For isotropic distributions, optimal measurements are thus independent of the distribution $f(b)$.
However, proposing an outcome-dependent guess and evaluating its quality through the fidelity is only one of the criteria that could have been used to define optimal measurements. A sound alternative, the one we shall investigate in this work and probably the most sensible choice in the context of quantum information theory, consists of quantifying the quality of measuring strategies through the gain of information about the unknown state. In fact, information theory already supplies a universally accepted, unambiguous scheme for this purpose, which we shall follow. It is based on Bayes' formula, which provides a conditional (outcome-dependent) posterior distribution $f_c(\vec b|i)$ from the (here isotropic) prior distribution $f(b)$, and on the Kullback, which quantifies the gain of information acquired when replacing $f(b)$ with $f_c(\vec b|i)$.
More specifically, if

$$P(i|\vec b) \equiv \mathrm{Tr}[\rho(\vec b)^{\otimes N} M_i]$$

is the probability of outcome $i$ when the unknown state is $\rho(\vec b)$ and

$$P_{ap}(i) \equiv \int d^3b\, f(b)\, P(i|\vec b) \qquad (5)$$

is the a priori probability of outcome $i$, then Bayes' formula states that the posterior distribution $f_c(\vec b|i)$, the one which collects our knowledge about the unknown state $\rho(\vec b)$ after measuring when the initial knowledge was given by $f(b)$, reads

$$f_c(\vec b|i) = \frac{P(i|\vec b)\, f(b)}{P_{ap}(i)}. \qquad (6)$$

The gain of information about $\rho(\vec b)$, $\Delta I$, is then given, in bits, by the Kullback of $f_c(\vec b|i)$ relative to $f(b)$ [10],

$$K[f_c(\vec b|i)/f(b)] \equiv \int d^3b\, f_c(\vec b|i) \log_2 \frac{f_c(\vec b|i)}{f(b)}. \qquad (7)$$

This expression, the only one satisfying a series of intuitively reasonable conditions [11], is well-defined for continuous distributions (it has no dependence on the measure in the space of quantum states) and its average over possible outcomes,

$$\sum_i P_{ap}(i)\, K[f_c(\vec b|i)/f(b)], \qquad (8)$$

is precisely the difference $H[f] - \sum_i P_{ap}(i)\, H[f_c(\vec b|i)]$ of the a priori and average a posteriori entropies $H$ of the corresponding probability distributions of states, as can be checked by considering eqs. (6)-(8) and that $\sum_i P(i|\vec b) = 1$. This quantification is therefore equivalent to the one already used in previous works on quantum state estimation with discrete distributions (see, e.g., ref. [9]).
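The Bayes-plus-Kullback scheme just described can be illustrated on a discrete toy model. The following is a minimal sketch, not the continuous calculation of the text: the three-state prior and the two-outcome likelihood table are hypothetical numbers chosen only for illustration. It checks that the outcome-averaged Kullback gain equals the drop from the a priori entropy to the average a posteriori entropy.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(q * math.log2(q) for q in dist if q > 0.0)

def posterior_table(prior, likelihood):
    """Return [(P_ap(i), posterior_i)] using Bayes' formula.

    prior[j]         : prior probability of hypothesis j
    likelihood[i][j] : probability of outcome i given hypothesis j
    """
    table = []
    for row in likelihood:
        p_ap = sum(p * f for p, f in zip(row, prior))  # a priori probability of outcome i
        if p_ap > 0.0:
            table.append((p_ap, [p * f / p_ap for p, f in zip(row, prior)]))
    return table

def average_kullback_gain(prior, likelihood):
    """Outcome-averaged Kullback information of posterior relative to prior, in bits."""
    gain = 0.0
    for p_ap, post in posterior_table(prior, likelihood):
        gain += p_ap * sum(q * math.log2(q / f) for q, f in zip(post, prior) if q > 0.0)
    return gain

def entropy_drop(prior, likelihood):
    """A priori entropy minus average a posteriori entropy, in bits."""
    avg_post = sum(p_ap * entropy(post) for p_ap, post in posterior_table(prior, likelihood))
    return entropy(prior) - avg_post

# Hypothetical toy data: three candidate states, two measurement outcomes.
toy_prior = [0.5, 0.3, 0.2]
toy_likelihood = [[0.9, 0.5, 0.1],
                  [0.1, 0.5, 0.9]]  # each column sums to 1

toy_gain = average_kullback_gain(toy_prior, toy_likelihood)
toy_drop = entropy_drop(toy_prior, toy_likelihood)
```

Both `toy_gain` and `toy_drop` come out equal up to rounding, which mirrors the identity between the averaged Kullback and the entropy difference; the condition that each column of the likelihood table sums to one plays the role of the completeness of the measurement.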
First, the question of which are the optimal measurements according to this information-theoretic criterion will be addressed. We will check explicitly for $N = 1$ and $N = 2$, and provide clues for any $N$, that optimal (and also minimal) measuring strategies are universal, i.e. independent of whether the fidelity or the increase of information is used for their quantification, and will compute the corresponding optimal gain of information $\Delta I$. Then we will move on to consider which is the isotropic prior $f(b)$ for which optimal measurements extract most information, so that it corresponds to the optimal (isotropic) quantum channel of information. After introducing a reversible compression procedure we conclude that, for isotropic distributions, the optimal amount of extractable information is, as $N \to \infty$, one bit per effective qubit.
In order to find an optimal measuring strategy, i.e. a set of operators $M_i$ as in eq. (1) maximizing the gain of information (eq. (8)), the following theorem and subsequent corollaries, valid for any number of copies $N$, will be very useful.
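Theorem: Suppose that the probability $P(i|\vec b) = \mathrm{Tr}[\rho(\vec b)^{\otimes N} M_i]$ of outcome $i$ can be written, for any $\vec b$, as the sum of two contributions of the form

$$P(i,k|\vec b) = \mathrm{Tr}[\rho(\vec b)^{\otimes N} M_{i,k}], \qquad k = 1, 2,$$

where the operators $M_{i,1}$, $M_{i,2}$ are also positive (and $M_{i,1} + M_{i,2}$ is not necessarily equal to $M_i$). Let us introduce corresponding prior probabilities $P_{ap}(i,k)$ and posterior distributions $f_c(\vec b|i,k)$ as in eqs. (5) and (6). Then,

$$P_{ap}(i)\, K[f_c(\vec b|i)/f(b)] \le \sum_{k=1,2} P_{ap}(i,k)\, K[f_c(\vec b|i,k)/f(b)].$$

Proof: It follows from the inequality

$$(x_1 + x_2) \log_2 \frac{x_1 + x_2}{y_1 + y_2} \le x_1 \log_2 \frac{x_1}{y_1} + x_2 \log_2 \frac{x_2}{y_2} \qquad \forall\, x_1, x_2, y_1, y_2 \ge 0,$$

applied with $x_k = P(i,k|\vec b)$ and $y_k = P_{ap}(i,k)$. ✷

Corollary 1: An optimal measuring strategy with rank-one operators always exists (cf. [12]).

Proof: Indeed, suppose $\sum_i M_i = \mathbb{1}$ corresponds to an optimal measurement. Then, if $M_i = \sum_k |i,k\rangle\langle i,k|$ is the spectral decomposition of $M_i$, it follows from the theorem that the rank-one POV measurement $\sum_{i,k} |i,k\rangle\langle i,k| = \mathbb{1}$ is also optimal. ✷

We can already consider the case $N = 1$, that is, when only one copy of the unknown state is available. One can convince oneself immediately that an optimal (and also minimal) measurement is just a standard von Neumann measurement. In fact, any will do because of the isotropy of $f(b)$. Suppose that we measure $\sigma_z$. Then, for $\vec b = (b \sin\theta \cos\varphi,\, b \sin\theta \sin\varphi,\, b \cos\theta)$, we have

$$P(\pm|\vec b) = \frac{1 \pm b\cos\theta}{2}, \qquad P_{ap}(\pm) = \frac{1}{2}, \qquad f_c(\vec b|\pm) = (1 \pm b\cos\theta)\, f(b), \qquad (12)$$

and the gain of information is

$$\Delta I^{(1)} = 4\pi \int_0^1 db\, b^2 f(b) \left[ \frac{(1+b)^2 \log_2(1+b) - (1-b)^2 \log_2(1-b)}{4b} - \frac{1}{2\ln 2} \right], \qquad (13)$$

where $f$ is normalized as $4\pi \int_0^1 db\, b^2 f(b) = 1$. The function in square brackets in eq. (13) is monotonically increasing in $b$, so that the distribution for which the absolute increase in knowledge is maximal is

$$f(b) = \frac{\delta(b-1)}{4\pi}, \qquad (14)$$

i.e. an isotropic distribution of pure states.

It is interesting to point out that if, instead of using the mean average fidelity $\bar F^{(1)}$ as in ref. [3], we had used the mean average increase in fidelity $\Delta F^{(1)}$ over the value obtained with the optimal guess $\tilde\rho_0 \equiv \rho(0)$ when no measurement is performed, then, with the integrals $I_\alpha$ of ref. [3] ($I_0 = 1$, $I_\alpha \ge 4 I_{\alpha+1}$), it is easily verified that the maximum value of $\Delta F^{(1)}$ also corresponds to the distribution eq. (14). Thus, for $N = 1$, quantifying with the fidelity or with the Kullback information leads to the same (for $N = 1$ somewhat obvious) optimal and minimal measuring strategy and to the same distribution which maximizes $\Delta I^{(1)}$ and $\Delta F^{(1)}$. Is this also true for $N = 2$?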
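The $N = 1$ statement can be checked numerically. The sketch below assumes a $\sigma_z$ measurement on a single copy with Bloch-vector length $b$ and a uniformly distributed direction, so that each posterior relative to the prior is $1 \pm b\cos\theta$; the closed form used for comparison is obtained by integrating $(1+bu)\log_2(1+bu)$ over $u = \cos\theta$ and is an assumption of this sketch rather than a formula quoted from the source. Function names are illustrative.

```python
import math

def gain_numeric(b, steps=20000):
    """Angular average of (1 + b*u) * log2(1 + b*u), u = cos(theta) uniform on [-1, 1].

    Midpoint rule; this is the Kullback gain in bits for Bloch length b.
    """
    du = 2.0 / steps
    total = 0.0
    for k in range(steps):
        x = 1.0 + b * (-1.0 + (k + 0.5) * du)
        if x > 0.0:
            total += x * math.log2(x) * du
    return 0.5 * total

def gain_closed_form(b):
    """Closed form of the same angular average (assumed, derived by direct integration)."""
    if b == 0.0:
        return 0.0

    def sq_log(x):
        # x^2 * log2(x), with the limit 0 at x = 0
        return x * x * math.log2(x) if x > 0.0 else 0.0

    return (sq_log(1.0 + b) - sq_log(1.0 - b)) / (4.0 * b) - 1.0 / (2.0 * math.log(2.0))

# Gain on a grid of Bloch lengths, from maximally mixed (b = 0) to pure (b = 1).
grid = [k / 50 for k in range(51)]
gains = [gain_closed_form(b) for b in grid]
pure_state_gain = gain_closed_form(1.0)
```

On this grid the gain grows with $b$ and reaches $1 - 1/(2\ln 2) \approx 0.279$ bits for pure states, in line with the claim that an isotropic distribution of pure states maximizes the gain for a single copy.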
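In order to answer this question we need to present a second corollary. Notice first that with the following notation (borrowed from [3]) for the composite Hilbert space of $N$ copies of the unknown state $\rho(\vec b)$,

$$\mathcal{H}^{(N)} \equiv \mathcal{H}_A \otimes \mathcal{H}_B \otimes \cdots,$$

for the corresponding local spin operators,

$$\vec S_A \equiv \tfrac{1}{2}\vec\sigma \otimes \mathbb{1} \otimes \cdots, \qquad \vec S_B \equiv \mathbb{1} \otimes \tfrac{1}{2}\vec\sigma \otimes \cdots, \qquad \ldots$$

and for the partial and total spin operators,

$$\vec S_{(\alpha)} \equiv \sum_{\beta \le \alpha} \vec S_\beta,$$

the following spin invariances hold [3]:

$$[\rho(\vec b)^{\otimes N}, S^2_{(\alpha)}] = 0 \qquad \forall \alpha, \qquad (24)$$

and the total Hilbert space can be written as a direct sum

$$\mathcal{H}^{(N)} = \bigoplus_{\{s_{(\alpha)}\}} E_{\{s_{(\alpha)}\}},$$

where $E_{\{s_{(\alpha)}\}}$ are the simultaneous eigenspaces of all the operators $S^2_{(\alpha)}$, $\forall \alpha \ne A$, with corresponding eigenvalues $\{s_\alpha(s_\alpha + 1)\}$, ordered with decreasing $\alpha$ (see [3] for more details). For instance, for $N = 2$ only $S^2_{(B)}$ (with eigenvalue label $s_{(B)}$) is relevant, i.e. $E_{\{s_{(\alpha)}\}} = E_{s_{(B)}}$, and the decomposition reads

$$\mathcal{H}^{(2)} = E_1 \oplus E_0,$$

where $E_1$ is the triplet or symmetric (under exchange of copies) subspace, with total spin $s \equiv s_{(B)} = 1$, whereas $E_0$ is the singlet or antisymmetric subspace, with total spin $s = 0$. Then,

Corollary 2: There always exists an optimal measuring strategy consisting only of rank-one operators of the form $|\{s_{(\alpha)}\}\rangle\langle\{s_{(\alpha)}\}|$, where the not necessarily normalized vector $|\{s_{(\alpha)}\}\rangle$ is an eigenvector of all partial and total spin operators, i.e.

$$S^2_{(\alpha)}\, |\{s_{(\alpha)}\}\rangle = s_\alpha(s_\alpha + 1)\, |\{s_{(\alpha)}\}\rangle \qquad \forall \alpha,$$

and thus it belongs to the subspace $E_{\{s_{(\alpha)}\}}$.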
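Proof: Let $\sum_i M_i = \mathbb{1}$ correspond to an optimal measurement with rank-one operators $M_i = |i\rangle\langle i|$ (where the $|i\rangle$ need be neither orthogonal nor normalized) and let $\Pi_{\{s_\alpha\}} = \Pi^2_{\{s_\alpha\}}$ be the projector onto the whole subspace $E_{\{s_\alpha\}}$. Then it follows from eq. (24) that

$$\mathrm{Tr}[\rho(\vec b)^{\otimes N} |i\rangle\langle i|] = \sum_{\{s_\alpha\}} \mathrm{Tr}[\rho(\vec b)^{\otimes N}\, \Pi_{\{s_\alpha\}} |i\rangle\langle i|\, \Pi_{\{s_\alpha\}}],$$

and the theorem guarantees that the measurement

$$\sum_i \sum_{\{s_\alpha\}} \Pi_{\{s_\alpha\}} |i\rangle\langle i|\, \Pi_{\{s_\alpha\}} = \mathbb{1} \qquad (28)$$

is also optimal. ✷

[Notice that exactly the same conclusion was also reached, for any $N$, when the fidelity was used as a criterion for optimality [3], this being indicative of the universality we are considering here.]

Thus, in order to find an optimal measuring strategy for $N = 2$ we can always choose the pure states on which the measurement projects to be symmetric or antisymmetric under the exchange of the two qubits. Let us next compute $\Delta I^{(2)}$ for the optimal strategy of ref. [3], that is, the one corresponding to a resolution of the identity of the form

$$\sum_{i=1}^{4} \frac{3}{4}\, |\hat n_i\rangle|\hat n_i\rangle \langle\hat n_i|\langle\hat n_i| + |\sigma\rangle\langle\sigma| = \mathbb{1}, \qquad (30)$$

where $|\sigma\rangle$ is the (normalized) singlet state, $\vec\sigma \cdot \hat n\, |\hat n\rangle = |\hat n\rangle$ ($\langle\hat n|\hat n\rangle = 1$) and the four unit vectors $\hat n_i$ point along the directions of the vertices of a regular tetrahedron.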
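The $N = 2$ strategy described above (four product states along tetrahedral directions plus the singlet) can be checked directly. The sketch below assumes a weight of $3/4$ on each product-state projector, which is the value needed for the five operators to sum to the identity; the orientation of the tetrahedron and the test Bloch vector $\vec b = (0, 0, 0.6)$ are arbitrary illustrative choices, and all function names are hypothetical.

```python
import cmath
import math

def spin_state(theta, phi):
    """Spin-1/2 state |n> with (sigma . n)|n> = |n>, n given by polar angles."""
    return [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]

def kron_vec(u, v):
    """Tensor product of two state vectors."""
    return [a * b for a in u for b in v]

def kron_mat(a, b):
    """Tensor product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def projector(v, weight=1.0):
    """weight * |v><v| as a nested list."""
    return [[weight * a * complex(b).conjugate() for b in v] for a in v]

def tetrahedral_povm():
    """Four weighted product-state projectors plus the singlet projector."""
    theta = math.acos(-1.0 / 3.0)  # polar angle of the three lower vertices
    angles = [(0.0, 0.0)] + [(theta, 2.0 * math.pi * k / 3.0) for k in range(3)]
    ops = []
    for t, p in angles:
        n = spin_state(t, p)
        ops.append(projector(kron_vec(n, n), 0.75))  # assumed weight 3/4
    r = 1.0 / math.sqrt(2.0)
    ops.append(projector([0.0, r, -r, 0.0]))  # singlet (|01> - |10>)/sqrt(2)
    return ops

def density_matrix(bx, by, bz):
    """Qubit state rho(b) = (1 + b . sigma)/2."""
    return [[(1 + bz) / 2, (bx - 1j * by) / 2],
            [(bx + 1j * by) / 2, (1 - bz) / 2]]

def outcome_probability(rho2, op):
    """Tr[rho2 op] for 4x4 matrices."""
    return sum(rho2[r][c] * op[c][r] for r in range(4) for c in range(4)).real

povm = tetrahedral_povm()
total = [[sum(op[r][c] for op in povm) for c in range(4)] for r in range(4)]

rho = density_matrix(0.0, 0.0, 0.6)
rho2 = kron_mat(rho, rho)
probs = [outcome_probability(rho2, op) for op in povm]
```

Here `total` reproduces the $4 \times 4$ identity, and for the test state the outcome probabilities follow $\frac{3}{16}(1 + \vec b \cdot \hat n_i)^2$ for the product outcomes and $(1 - b^2)/4$ for the singlet outcome.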
One readily obtains the outcome probabilities $P(i|\vec b) = \frac{3}{16}(1 + \vec b \cdot \hat n_i)^2$ and $P(\sigma|\vec b) = \frac{1 - b^2}{4}$, and from them $\Delta I^{(2)}$. Can we do better, i.e. is there another resolution of the identity which leads to a larger $\Delta I^{(2)}$? Let us prove that there is none. Because of corollary 2, the whole question boils down to whether symmetric entangled states could do better than the symmetric product states $|\hat n_i\rangle|\hat n_i\rangle$ used in eq. (30). Consider therefore a general symmetric state of Schmidt decomposition

$$|\psi\rangle = \sqrt{p}\, |\!\uparrow\uparrow\rangle + \sqrt{1-p}\, |\!\downarrow\downarrow\rangle,$$

where the isotropy of $f(b)$ has been taken into account in choosing the basis. One can readily obtain the average Kullback information $\Delta I^{(2)}_\psi$ corresponding to this state and integrate over $\varphi$. The result is a function of $p$ that we want to maximize. Only $k \log_2(k + \sqrt{k^2 - l^2})$ depends on $p$. The part $-l^2$ is maximized for $p = 0$ and $p = 1$. So is the other part, as one can easily see by neglecting the term $l^2$. Thus $\Delta I^{(2)}_\psi$ is maximized when $|\psi\rangle$ is a product state and the resolution of eq. (30) is indeed optimal.
As we did for $N = 1$, it is interesting to recall, with the help of ref. [3], the average increase in fidelity $\Delta F^{(2)}$ for $N = 2$. One can now check that both $\Delta I^{(2)}$ and $\Delta F^{(2)}$ are again maximized for the distribution eq. (14). For $\Delta I^{(2)}$ this follows by observing that the part in square brackets of eq. (33) is an increasing function of $b$ and that the other part, which depends on $I_1$, increases as $I_1$ goes towards zero.
We have thus checked for $N = 1$ and $N = 2$ that both the fidelity and the Kullback information lead to the same optimal measuring strategy and to the same pure-state distribution which maximizes their increases. We conjecture, while not foreseeing any feature which could jeopardize extending the proof to $N > 2$, that the universality of optimal measurements holds for any number $N$ of copies of the unknown state [13]. Corollary 2 makes this conjecture very plausible. The precise optimal strategy is in fact determined to a great extent by the isotropy of the prior distribution, the symmetries of the state $\rho(\vec b)^{\otimes N}$, which allow one to choose each positive operator $M_i$ to act only on one of the subspaces $E_{\{s_{(\alpha)}\}}$, and the fact that both the fidelity and the Kullback favour strategies with outcomes $i$ whose normalized probability of occurrence $\mathrm{Tr}[\rho(\vec b)^{\otimes N} M_i]/\mathrm{Tr}[M_i]$ spans the largest possible range as a function of the direction of $\vec b$.

Now, suppose we want to use the $N$ qubits as a quantum channel of classical information. Alice prepares $N$ copies of a given state $\rho(\vec b)$ (the classical information being encoded in the vector $\vec b$) and sends them to Bob, who will perform a collective measurement in order to recover as much information about $\vec b$ as possible. The previous results single out, when restricted to isotropic prior distributions, the use of pure states only ($b = 1$) as the optimal method to encode classical information. We can then easily compute the optimal capacity of this isotropic quantum channel for any $N$, to find that

$$\Delta I^{(N)} = \log_2(N+1) - \frac{N}{(N+1)\ln 2},$$

which for large $N$ gives $\frac{\log_2 N}{N}$ bits carried per qubit. Notice that this is a purely quantum channel, no additional flow of classical information being required at any stage. Its poor capacity can be exponentially enhanced without spoiling this fact if we take into account that a pure state $\phi^{\otimes N}$ belongs to the symmetric subspace $S^{(N)}$ of the whole Hilbert space $\mathcal{H}^{(N)}$.
Since the dimension of $S^{(N)}$ is $N + 1$, which corresponds to the dimension of a Hilbert space $\mathcal{H}^{(M)}$ of $M \equiv \log_2(N + 1)$ qubits, Alice can always compress, by means of a state-independent, unitary (and thus fully reversible) transformation, the state $\phi^{\otimes N}$ to fit in $M$ qubits, which will then be transferred to Bob. In this case the capacity increases up to $1 - O(1/\log N)$ bits per qubit, which is asymptotically the classical one (as expected, since for any two inequivalent states $\phi$ and $\phi'$, $\phi^{\otimes N}$ and $\phi'^{\otimes N}$ become orthogonal as $N \to \infty$), and which is consistent with the Levitin-Holevo bound [14] for the classical capacity of a quantum channel.
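The capacity figures quoted above can be illustrated numerically. The sketch below rests on an assumption not spelled out in the text: that every outcome of the optimal measurement on $\phi^{\otimes N}$ projects onto an $N$-fold product state, so that the posterior density relative to the uniform prior is $(N+1)x^N$ with $x = (1+\cos\theta)/2$ uniform on $[0,1]$. Under that assumption the Kullback gain integrates to $\log_2(N+1) - N/((N+1)\ln 2)$ bits. Function names are illustrative.

```python
import math

def capacity_closed_form(n):
    """Information gain in bits for n copies of an isotropically distributed pure state
    (assumed closed form, see the lead-in)."""
    return math.log2(n + 1) - n / ((n + 1) * math.log(2.0))

def capacity_numeric(n, steps=20000):
    """Midpoint-rule integral of g(x) * log2(g(x)) on [0, 1], with g(x) = (n+1) * x**n."""
    dx = 1.0 / steps
    total = 0.0
    for k in range(steps):
        g = (n + 1) * ((k + 0.5) * dx) ** n
        if g > 0.0:
            total += g * math.log2(g) * dx
    return total

def rate_uncompressed(n):
    """Bits per transmitted qubit when all n copies are sent."""
    return capacity_closed_form(n) / n

def rate_compressed(n):
    """Bits per qubit after compressing into log2(n+1) qubits (n+1 a power of two)."""
    return capacity_closed_form(n) / math.log2(n + 1)

# n + 1 chosen as powers of two so that log2(n+1) is an integer number of qubits.
sizes = [1, 3, 7, 15, 63, 1023]
compressed_rates = [rate_compressed(n) for n in sizes]
uncompressed_rates = [rate_uncompressed(n) for n in sizes]
```

For these sizes the compressed rate rises steadily towards one bit per qubit while remaining below it, whereas the uncompressed rate falls off roughly as $\log_2 N / N$ for large $N$.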
Summarizing, using the gain of information as a guide we have constructed optimal and minimal measurements on N = 1, 2 identical copies and have shown that for isotropic distributions the maximal gain of information is achieved for pure states. Also universality of optimal measurements has been proven, since these measurements exactly coincide with those obtained in previous work, where the fidelity was taken as figure of merit. We conjecture that also for N ≥ 3 the most informative measurements are the most faithful ones, and vice versa.