Energy Flow in the Cryptophyte PE545 Antenna is Directed by Bilin Pigment Conformation

Structure-based calculations are combined with quantitative modelling of spectra and energy transfer dynamics to detemine the energy transfer scheme of the PE545 principal light-harvesting antenna of the cryptomonad Rhodomonas CS24. We use a recently developed quantum-mechanics/molecular mechanics (QM/MM) method that allows us to account for pigment-protein interactions at atomic detail in site energies, transition dipole moments and electronic couplings. In addition, conformational flexibility of the pigment-protein complex is accounted for through molecular dynamics (MD) simulations. We find that conformational disorder largely smoothes the large energetic


Introduction
Photosynthetic antennae are special proteins characterized by carrying a large number of light-absorbing molecules (pigments), aimed at capturing sunlight and transferring this energy through a number of electronic energy transfer (EET) steps toward reaction centers, where the energy is subsequently stored through a number of chemical reactions. 1-2 This principle inspires the quest toward practical artificial photosynthetic devices, in which sunlight could be directly stored in the form of chemical fuels, thus lowering our dependency on fossil fuels. [3][4] Because lightharvesting is performed with very high quantum efficiencies (>90%) in natural biological systems, a detailed understanding of the mechanisms of photosynthetic EET has been of interest for a long time. A key objective in this area of research consists of developing integrated approaches able to describe the relation between structure and function of the light-harvesting antenna. [5][6] The fundamental properties of such systems can, to some degree, be understood based on Förster theory, which predicts incoherent EET dynamics according to the dipole-dipole interactions between pairs of pigments in the complex, as well as the spectral overlap between their optical line shapes. 7 However, a microscopic understanding of the role played by the environment is necessary in order to have a complete picture of how structure dictates EET dynamics. 8 In particular, for non-Förster energy transfer mechanisms it has become apparent that much more detailed insights into how molecules interact with complex environments are needed. [9][10] For instance, the protein environment actively modulates the dipole-dipole interactions, an effect that is described in Förster theory through a crude screening factor 1/n 2 , where n is the refractive index of the system. Recent developments of quantum chemical methods coupled to continuum or quantum mechanical/molecular mechanical (QM/MM) descriptions of the environment have shown that screening effects sensitively depend on the distance and mutual arrangement between the molecules, [11][12] as well as on the local protein dielectric environment that surrounds them. 13 Another key role played by the environment consists of the time-dependent modulation of the so-called site energies, i.e. the uncoupled electronic transition levels localized on each pigment. Such pigment-protein coupling, for instance, dictates line shapes, decoherence and energy transfer, and can be described through the frequency spectrum of the environment (spectral density). Equally important, though, is the magnitude of the pigment-protein coupling, which individually shifts each site energy in the complex thus defining the energy pathways in the system. A common strategy to unravel the site energies of a pigment-protein complex consists of a simultaneous fitting of several optical spectra, while adopting a set of electronic couplings calculated from the X-ray structure. 2,14 Such a strategy, however, becomes impractical when the number of pigments is very large, for instance, when dealing with the 96 chlorophyll a pigments in photosystem I (PSI). 15 A natural alternative consists in the calculation of site energies using quantum chemical approaches. In that case, though, errors can be substantial, so quantitative modeling of spectra and kinetics should be performed in order to elucidate a consensus model compatible with simulations and experiment.
Recently, Renger and co-workers have successfully applied a methodology that combines quantum chemistry and electrostatic calculations to unravel the site energies in a variety of chlorophyll-containing pigment-protein complexes. [15][16][17] Such a methodology derives energy shifts from the differential interaction of ground and excited state charge densities with the protein charge distribution, which is described by point charges, whereas protein polarization is tackled through a continuum dielectric model. This approach has proven to be quite robust. However, it neglects how the protein modulates the chromophore conformation, an effect that further tunes its electronic transition energy. Although this effect can be minor when dealing with chlorophylls, pigments characterized by a semi-rigid central porphyrin ring, it is expected to play a more significant role in bilins, characterized by a quasi-linear arrangement of four pyrrole rings. 18 Recently, other groups have computed the site energies in the Fenna-Matthews-Olson (FMO) complex by combining quantum chemical calculations with molecular dynamics (MD) simulations, thus allowing for conformational flexibility in the complex. [19][20] The results do not differ much from the previous set derived by Renger and co-workers, 16 thus apparently confirming that protein-induced distortions of the pigment conformation are not essential to describe cholorophyll site-energies. In this work we address the determination of site energies in a rather challenging system, the PE545 principal light-harvesting antenna of the cryptomonad Rhodomonas sp. strain CS24. 13,18,[21][22] The crystal structure of PE545, illustrated in Figure 1, is organized as an α 1 α 2 ββ dimer, and has been determined at 1.63 Å 23 and later at 0.97 Å resolution 21 . It contains 8 bilin chromophores, characterized by a linear tetrapyrrole structure covalently linked to the protein scaffold. In particular, each α chain (A and B) contains a 15,16-dihydrobiliverdin (DBV), whereas each β polypeptide chain (C or D) is linked to three phycoerythrobilins (PEB). The corresponding chromophores are  labeled DBV 19A , DBV 19B , PEB 158C , PEB 158D , PEB 50/61C , PEB 50/61D , PEB 82C and PEB 82D , where the subscript denotes the protein subunit and cysteine residue linked to the chromophore. The central PEB 50/61 pigments are linked to the protein by two cysteine residues. The overall PE545 structure (and the chromophores) displays a pseudosymmetry about the 2-fold axis relating the α 1 β and α 2 β monomers.
As mentioned above, bilins are linear tetrapyrroles that have a larger degree of conformational flexibility compared to chlorophylls. Thus, modulation of the site energies by the protein scaffold could be achieved by constraining the conformational space of the pigments. In addition, the DBVs have an extended conjugated π-system that red-shifts their spectral properties compared to PEBs, thus making them the obvious candidates for the energy trapping site in the complex. Previous theoretical studies suggested an energy ordering DBV 19 < PEB 82 < PEB 158 < PEB 50/61 in the complex 12,18 More recently, we modeled steady-state spectra and transient absorption of PE545 with modified Redfield theory. 22 The results suggested that quite significant energetic differences between pseudosymmetric pigments are needed in order to describe the spectroscopic data for the complex. However, several sets of site energies were found to be compatible with the spectra and kinetics. This translates into several unanswered questions, for instance, about the relative energetic ordering of the PEB 82C , DBV 19A and DBV 19B lowest energy chromophores.
In the present study we combine the MD-QM/MM methodology we recently developed 24 with spectral and kinetics modeling in order to determine a consensus model of the PE545 light-harvesting pathways. This methodology allows us to consistently incorporate environmental effects (electrostatic and polarization) at atomic detail not only in the evaluation of site energies, but also in the description of screening effects in the electronic couplings. From the calculated parameters, we model the spectra and excitation dynamics in PE545, and find a reasonable agreement with experiment, which can be further improved by minor adjustments of the calculated site energies. Our proposed model differs from the previous set of energies derived by fitting to experiments. 22 We find that the energetic differences between pseudosymmetric pairs, like PEB 50/61C /PEB 50/61D or PEB 82C /PEB 82D , are significantly smaller than previously suggested, in better accord with chemical intuition. In addition, we find that energy tuning in PE545 is mostly achieved by constraining the conformational space available to the bilin pigments in the protein scaffold, rather than by specific pigment-protein interactions. This paper is organized as follows. In Section 2 we describe the methods and computational details of the structure-based calculations as well as the modeling of spectra and kinetics. In Section 3.1, first we analyze the numerical robustness of our QM/MMpol methodology toward a number of issues. Then, in Section 3.2 we analyze the impact of conformational flexibility, as well as specific pigment-protein interactions, in tuning the energy levels of PE545. In Section 3.3 we model the spectra and kinetics of PE545 based on our calculated parameters and critically compare the results to experiments, with the aim of determining a consensus set of values for the PE545 site energies. Finally, we provide the conclusions and some future perspectives.

QM/MMpol model for energy transfer
In conventional QM/MM approaches, the environment is treated by assigning partial point charges to the atomic sites and the potential due to these point charges is introduced into the Hamiltonian of the QM system to give an effective Hamiltonian: (1) Where is the Hamiltonian of the isolated QM system (here the chromophores), and the operator introduces the coupling between the chromophores and the environment. The addition of to the Hamiltonian automatically leads to a modification of the chromophore wavefunction, which has now to be determined by solving the effective equation (1). This can be done using the same methods developed for isolated molecules; here in particular we shall focus on the standard Self-Consistent-Field (SCF) approach.
In our QM/MM approach, this standard MM description is improved by introducing chromophore-environment mutual polarization effects in terms of atomic polarizabilities which give rise to an induced charge distribution described by induced dipoles; as a result can be split into four terms The first term in Eq. (2), , describes the electrostatic interaction between the QM system and the atomic point charges in the MM region, whereas the second term, , represents the polarization interaction between the induced dipole moments and the electric field from the QM system. Finally, the MM term contains both the electrostatic self-energy of the MM charges ( ), and the polarization interaction ( ) between such charges and the induced dipoles. We note that the term enters in the effective Hamiltonian only as a constant energetic quantity, while the contribution depends on the QM wavefunction through the induced dipoles. In addition, we do not consider short-range dispersion and repulsion contributions in , since, as in most combined QM/MM methods, these are described by empirical potentials independent of the QM electronic degrees of freedom, thus not affecting our results.
The induced dipoles are dissected into a nuclear (n) and an electronic (e) component, namely where we have assumed a linear approximation, neglected any contribution of magnetic character related to the total electric field, and used an isotropic polarizability (α i ) for each selected point in the MM part of the system.
where N is the number of polarizable sites, which is determined uniquely by the positions of these sites and their polarizability values.
In this framework, the Fock operator which determines the SCF wavefunction of the solvated system becomes (4) where we have adopted the common expansion over a finite basis set, so all operators are given in a matrix form in such a basis, and P represents the one-electron density matrix in the same basis set. The first term in Eq. (4) corresponds to the gas phase Fock matrix whereas the solvent-induced one-and two-electron operators are given by: (5) where and indicate the electrostatic potential and electric field integrals.
This SCF-QM/MMPol approach can be further extended to a CIS formulation to calculate excited states and transition properties in a straightforward way once the proper Fock operator (4) has been defined. The CIS equations to be solved become (6) where i,j and a,b indicate occupied and virtual orbitals, respectively.
We note that in the CIS matrix A, the environment enters in two ways. First, the orbitals and the orbital energies have been modified by the environment through the addition of the terms reported in Eq. (5) to the ground-state Fock operator and, secondly, a new environment-induced response term C is introduced to account for the change in the induced dipoles due to the excitation process. As a result, the eigenvectors and the eigenvalues give now the transition density and transition energies for the molecule treated using QM in the presence of a polarizable MM environment. As reported in ref.
[ 24 ], the transition densities obtained from solving Eq. (6) can be further used to compute the electronic coupling which defines the EET rate so that both implicit and explicit (or screening) effects due to the solvent are automatically taken into account: all the details of such a formulation can be found in the cited reference.

Computational details
In our approach, we followed a two-step strategy. First, we performed ground-state classical MD simulations adopting the parm99SB 25-27 non-polarizable force field (FF) from the Amber package. Compared to polarizable FFs, parm99SB has been extensively tested and balanced in order to correctly describe the structural dynamics of solvated proteins. As a second step, we performed QM/MM calculations of the EET parameters, but in this case switching to an explicit polarizable FF to describe the MM region. This latter choice arises from the need to differentiate between nuclear and electronic polarization effects when describing electronic transitions and electronic couplings. 24 We note here that the aim of the MD simulation is to sample the ground-state ensemble of the system in order to estimate statistically averaged EET parameters corresponding to the ground-state configuration. The details of the MD simulation of PE545 in water can be found in our previous publication. 13 From the trajectory, a total of 141 snapshots (extracted every 50 ps during the last 7 ns) were considered for QM/MM calculations.
In the MMPol calculations, the protein residues, solvent molecules, and bilin chromophores not included in the QM region, were described using atom centered charges and isotropic polarizabilities. The corresponding polarizable FF was derived from DFT calculations using the scheme outlined in Ref. [ 28 ]. In brief, this strategy consists in cutting the protein into single-residues capped with COCH 3 and NHCH 3 groups. Then, the residues are subjected to separate DFT calculations of the FF parameters. Atomic isotropic polarizabilities were calculated at the DFT(B3LYP)/augcc-pVDZ level using the LoProp 29 approach as implemented in the Molcas code, 30 n X n ω whereas atomic point charges were obtained from DFT(B3LYP)/cc-pVTZ ESP calculations followed by RESP constrained fittings as implemented in the Gaussian09 31 and Amber9 32 programs, respectively. Charges and polarizabilities were derived corresponding to the initial crystal structure and then used in all QM/MM calculations.
The parameters for water, derived using the same strategy, were taken from Ref. [ 24 ].
In order to test the robustness of our approach we also used two alternative polarizable FFs, based on a completely different derivation of the parameters. First, we tested the Amber ff02 protein FF, 33 together with the water POL3 model, 34 which combines RESP charges with a modified version of Applequist polarization model.
Applequist distributed polarizability values are fitted to reproduce a set of molecular experimental polarizabilities by adopting a full interactive model. 35 However, the Amber ff02 energy model, as well as the water POL3 model, neglects 1-2 and 1-3 interactions. This translates into a well-known systematic underestimation of polarization terms in ff02 and POL3, but unfortunately 1-2 and 1-3 interactions cannot be simply switched on, because this would lead to a problem known as the "polarization catastrophe", originated by an unphysical mutual overpolarization of closely-spaced dipoles. [36][37] Thole introduced an alternative strategy to avoid this problem by replacing the point dipoles by smeared charge distributions. 38 We also tested this strategy by adopting the atomic polarizabilities derived by van Duijnen and Swart, 39 which were obtained by fitting molecular polarizabilities to experiment using Thole

Modeling of spectra and kinetics
In our previous paper 22

Validation and optimization of the QM/MMPol method
The QM/MM scheme we use allows a rich description of the effect of the protein and solvent environment on the PE545 light-harvesting properties. In particular, when associated with a MD simulation, it accounts for conformational flexibility of the pigment-protein complex, and is able to tackle the effect of pigment-protein interactions, including mutual polarization, on site energies, inter-pigment couplings, and transition dipole strengths. The efficiency and numerical accuracy of this methodology, however, rely on a number of technical issues. Here, we provide an analysis of such issues, most notably the sensitivity of the results on the polarizable force field used to describe the MM region of the complex, as well as the effects of the introduction of a cutoff radius to turn off MM polarization beyond a certain distance from the chromophores described at the QM level.

Polarization cutoff radius
As described in the Methods section, the MMPol method solves for mutual polarization between the QM and MM regions following either an iterative procedure or adopting a matrix inversion approach. In our test calculations on PE545, we found that the iterative method is significantly faster than the matrix inversion approach. In any case, the computational cost of the calculations depends dramatically on the number of MM polarizable sites. Thus, we have introduced a cutoff radius, R, in order to turn off MM polarization in regions located beyond a certain distance from the QM region (chromophores). This implies zeroing the polarizability of any MM site located farther than R from any QM atom, while atomic charges for all MM atoms are still accounted for. In Figure S1 in the Supporting Information, we show the increase in CPU time as a function of R corresponding to a full calculation of EET parameters for one structure of the PE545 complex, i.e. the calculation of the 8 bilin excited states and the 28 electronic couplings between such states. Roughly, passing from R=10 Å to 30 Å, the number of MM polarizable sites increases from ~2000 to ~20000. Such increase translates into a ~6-fold increase in CPU time. However, up to R=18 Å the variation is moderate (~50%), whereas at larger R values the increase is more pronounced.
Obviously, the choice of an optimal R value must balance a minimal CPU time with a satisfactory degree of convergence in the electronic transition energies, transition dipole strengths and electronic couplings calculated. In Figure S1, we also show the dependence of the errors in computed EET parameters with respect to the choice of R value. Here we use as reference the R=30 Å calculation, as negligible changes are already found between R=25 Å and 30 Å results (see Figure S1). Electronic couplings depend on R through the explicit environment-mediated term (see Ref.

Classical force field
A key issue in the application of QM/MM methods relies on the choice of the classical force field (FF) used to describe the MM region. In Figure 2, we show the comparison between the results obtained using three different FFs, as described in

Molecular basis for site energy tuning
In this section we analyze the molecular basis for the protein tuning of site energies.
Bilins are linear tetrapyrroles with a remarkable degree of conformational flexibility.   Table S1 in Supporting Information).
In Figure 3 we report the site energies obtained using either a continuum or an atomistic MMPol description of the protein-solvent environment. Our aim here is to assess the relative energetic ordering of the bilin sites. Given that CIS systematically overestimates transition energies due to the neglect of electron correlation, all DBV and PEB chromophores are shifted by a constant energy value, as indicated in Table S1  For the central PEB 50/61 pair, however, both PCM (∼600 cm -1 ) and MMPol methods (∼1500cm -1 ) predict a large energetic difference, so that the PEB 50/61C site has an energy level similar or even lower than the PEB 158 molecules. A similar energy difference is also obtained with MMPol for the PEB 82 pair (∼600cm -1 ). The energetic differences between PEB 50/61C / PEB 50/61D or PEB 82C / PEB 82D molecules could arise from the different sequence of the α polypeptide chains in the α 1 α 2 ββ dimer structure of the PE545 complex, which slightly breaks the symmetry of the system, but it could also be due to the neglect of protein flexibility or inaccuracies in the crystal structure. Apart from some differences in amino acid composition, the main difference between the α subunit A and B is that the A chain is 9 amino acids larger. This difference translates into an asymmetric arrangement of protein loops interacting with the PEB 50/61C / PEB 50/61D sites. 23 The enlarged segment of the A chain largely interacts with the PEB 50/61C molecule, as shown in Figure 4, whereas the PEB 50/61D site only slightly contacts the B subunit. This asymmetric arrangement translates into the PEB 50/61D site being considerably more  subunit B (pink), β subunit C (orange) and β subunit D (red). There is an asymmetric arrangement of protein loops interacting with the pigments because the PEB 50/61C site closely interacts with the A and C subunits, whereas the PEB 50/61D site is only loosely packed by the C and D chains, an arrangement significantly more exposed to the surrounding water solvent.
Despite the slightly asymmetric structure of PE545, when the full conformational flexibility of the pigment-protein complex is taken into account by averaging the energies over the MD trajectory, the energy difference between pseudosymmetric pigments is largely smoothed, in all cases being lower than 300 cm -1 , in better accord with chemical intuition. Thus, the large energetic asymmetries observed for the crystal structure arise from the fundamental approximation of using a single static structure to describe the conformation of the pigment-protein complex. In addition, the energetic ordering DBV 19 < PEB 82 < PEB 158 < PEB 50/61 does not change. In this case, the agreement between MD-averaged PCM and MMPol energies is quite striking, given that the former neglect specific pigment-protein interactions believed to play a major role in tuning the site energies of other photosynthetic complexes, such as FMO. 16 Again, this result indicates that the protein scaffold in PE545 mainly tunes the energy ladder in the complex by fixing the individual conformation of each PEB site, rather than by specific pigment-protein interactions. Of course, additional tuning is achieved by assembling two different bilin types (DBV and PEB) in the antenna.
In Figure 5, we show the MD-averaged conformations of the PEB and DBV sites in PE545 that dictate the energy ladder in the complex. For clarity, pseudosymmetric molecules are superimposed, so that slight differences in their conformations can be easily observed. This picture helps us illustrating the different conformational distributions sampled by each pigment. Note, however, that our MD-averaged site energies do not exactly correspond to these conformations, as they are averaged over 141 structures extracted from the MD trajectory. As expected, the conformation of pseudosymmetric sites is very similar in each case, thus explaining their similar site energies. Small differences are observed either in the orientation of the propionate groups linked to the central pyrrole rings, or in the ethylene group linked to the lateral pyrroles. Overall, the main difference between PEB and DBV conformations in terms of the linear tetrapyrrole arrangement arises from the τ 2 torsion, as illustrated in Figure 6.
The complete set of torsion angles averaged over the MD trajectory (including their standard deviations), or determined from the crystal structure, are reported in Table 1.
Inspection of these results indicates that relaxing the PE545 structure through the MD simulation induces only minor changes on the average values of the torsions compared to the crystal, with the remarkable exception of the PEB 50/61C site. In this case, the τ 2 value shifts by 30°, passing from a crystal value of 42° to 68°. In contrast, PEB 50/61D bilin maintains this torsion close the crystal value (47°). This effect could explain the lower energy of the PEB 50/61D electronic state compared to PEB 50/61C , given that a smaller torsion angle translates into a better conjugation along the π system. As discussed previously (see Figure 4), the asymmetric arrangement of protein loops interacting with the PEB 50/61C and PEB 50/61D sites most probably explains this different conformation.
Overall, our results indicate that biliprotein antennae seem to have a variety of mechanisms to evolve and shape their absorption profile, such as pigment composition, conformation and specific pigment-protein interactions. Our results indicate that, in contrast to chlorophyll-based antennae, pigment composition and conformation play the major role in defining the energy ladder in the PE545 complex.

Assessment of the model by comparison with spectra and kinetics
Using the site energies (Table S1), transition dipoles and center coordinates (Table S2), and couplings (Table S3)    The calculated and adjusted MD-MMPol site energies are compared with the energies extracted from the fit in our previous modeling 22 in Figure 9 and Table S4.
Comparing the calculated MMPol energies and the adjusted MMPol energies ( Figure   9b) we can find only two relatively small differences. First, the calculated DBV 19A and DBV 19B have different energies, whereas in the adjusted set this difference is eliminated in order to reproduce the red side of OD and FL shape. Second, the PEB 82C and PEB 82D pair (equal in energy in the calculation) should be more asymmetric in order to reproduce the positions, relative amplitudes, and shapes of the two absorption peaks (545 and 567 nm subbands). Thus, in the adjusted set the PEB 82D site is shifted to the red in order to obtain the right intensity of the 567 nm subband (together with contributions from DVB's). and PEB 50/61D sites, i.e. the PEB 50/61 dimer is more symmetric. In the spectra (obtained both with the calculated and adjusted energy sets) the corresponding exciton levels (i.e. levels k=7 and 8 in Figures S2 and 7) are more delocalized. In both models the lowest level is superradiant, whereas the higher one is only weakly allowed (see the two higher exciton components of the OD spectra in Figures S2 and 7). The differences between the original model and the MD-MMPol calculation are illustrated in Figure 9a. It is remarkable, that the quality of the fit obtained with the adjusted MMPol energies (as shown in Figures 7 and 8) is almost as good as obtained with the original model. 22 Moreover, the fit of the TA kinetics ( Figure 8) is even better (the original model gave a little bit too fast dynamics between 1 and 5 ps, see Figure 3 in Ref. [ 22 ]).
Bearing in mind that the adjusted energies are closer to the MMPol calculation, we conclude that the current (MD-MMPol+fit) energies (Table S4) are more reliable in order to explore the spectral properties and energy transfer dynamics in the PE545 complex.
Finally, we wish to discuss the energy transfer scheme in PE545 emerging from the present modeling. The corresponding time constants, pathways of energy equilibration, and the resulting kinetics of populations are given by Table S5, Figure 10,   The sites are labeled as in Table S5. Figure 11 shows the kinetics of the site populations upon 505 nm excitation.
Notice that in the new model the lower PEB 50/61D dimeric site is more populated initially. This is because the vibrational wing of the lowest superradiant level is even more intense at 505 nm than the higher (symmetry-forbidden) level with predominant contribution from PEB 50/61C . The equilibration dynamics after 0.5-1 ps is slightly slower, as expected from the analysis of the time constants. Due to this fact we obtain better agreement between the calculated and measured TA kinetics as shown in Figure   8.

Conclusions
In