Viral self-assembly as a thermodynamic process

The protein shells, or capsids, of all sphere-like viruses adopt icosahedral symmetry. In the present paper we propose a statistical thermodynamic model for viral self-assembly. We find that icosahedral symmetry is not expected for viral capsids constructed from structurally identical protein subunits and that this symmetry requires (at least) two internal "switching" configurations of the protein. Our results indicate that icosahedral symmetry is not a generic consequence of free energy minimization but requires optimization of internal structural parameters of the capsid proteins.

Spontaneous self-assembly of simple units into larger structures plays an important role in molecular biology and materials science. A striking example is the self-assembly of viruses [1]. As long ago as 1955, Fraenkel-Conrat and Williams [2] showed that an infectious rodlike virus, the tobacco mosaic virus, could be reversibly reconstituted in the laboratory from a two-component solution of its purified genome (RNA in this instance) and the protein that comprises its cylindrical capsid. Reversible self-assembly has been demonstrated as well for a number of spherelike plant viruses [3]. In all of these cases, the assembly proceeds spontaneously, without involving ''fuel consumption'' such as adenosine triphosphate hydrolysis.
As noted by Crick and Watson (CW) [4], the capsids of viruses are formed from a minimum number of gene products, given the small size of viral genomes. On this basis, CW argued that spherical viruses should have the symmetry of regular polyhedra (''platonic solids'') all of whose faces are identical perfect polygons in which all protein units sit in identical environments; the largest shell of this kind is an icosahedron consisting of 60 equivalent subunits. Subsequent capsid structure determinations confirmed the special role of icosahedral symmetry, but also indicated that larger numbers of protein subunits were involved.
Caspar and Klug (CK) [5] proposed a geometrical scheme for the general construction of icosahedral shells with an arbitrarily large number of subunits. Capsid proteins usually can be grouped into ''capsomers'' of either hexamer/pentamer units or trimer units. The number of proteins constituting a closed isometric surface equals 60 times a ''triangulation'' (T) number that adopts special integer values [6] such as 1, 3, 4, and 7 (see Fig. 1). Electron and x-ray diffraction studies have confirmed that the T-number classification applies to almost all spherelike viruses [7].
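The ''special integer values'' of the T number can be enumerated directly. The following is a minimal sketch that assumes the standard Caspar-Klug rule T = h^2 + hk + k^2 for non-negative integers h and k (the rule itself is not written out in the text above); the function names are illustrative only.

```python
def triangulation_numbers(t_max):
    """Return the sorted Caspar-Klug T numbers T = h^2 + h*k + k^2 up to t_max."""
    found = set()
    h = 0
    while h * h <= t_max:
        # Restricting to k <= h avoids visiting both (h, k) and (k, h).
        for k in range(h + 1):
            t = h * h + h * k + k * k
            if 0 < t <= t_max:
                found.add(t)
        h += 1
    return sorted(found)


def protein_count(t):
    """Number of proteins in a closed icosahedral shell: 60 times T."""
    return 60 * t


for t in triangulation_numbers(7):
    print(f"T = {t}: {protein_count(t)} proteins")
```

Under this rule the values up to 7 are 1, 3, 4, and 7, matching the examples quoted above.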
The success of the CK construction for a broad range of spherelike viruses indicates that the production of icosahedral symmetry might be a generic feature of the capsid free energy. Continuum elasticity theory supports this notion, at least for large capsids: The deformation energy cost incurred upon closing a hexagonal sheet on itself is minimized when the 12 fivefold sites are located as far as possible from each other, i.e., if the shell adopts icosahedral symmetry [8]. The CK construction is reproduced also in simple models for viral capsids based on the covering of a sphere with disks, provided icosahedral symmetry is imposed [9]. Icosahedral symmetry is, however, far from obligatory. In vitro self-assembly can produce not only icosahedral capsids, but also hexagonal sheets, rodlike aggregates resembling ''buckytubes,'' nonicosahedral spherelike capsids, and still more complex structures [3,10]. Closed, conelike structures of hexamers and pentamers, for example, are reported for the lentiviruses, such as HIV [11].
In this Letter, we propose a simple free energy for capsid self-assembly that can be used to study under what conditions self-assembly leads to structures with icosahedral symmetry. Our phenomenological Hamiltonian separates the free energy cost of a protein shell into an ''in-plane'' part describing deformations away from the ideal hexagonal packing structure, and an ''out-of-plane'' part arising from the difference between the preferred and the actual angle of neighboring capsomers. We computed a self-assembly phase diagram as a function of concentration and ''spontaneous curvature.'' Earlier accounts [12] of capsid self-assembly assume a particular capsid size and structure and focus instead on the kinetics of formation of intermediate and final structures. Additionally, a deterministic ''local rules'' theory has been developed [13] in which the nearest-neighbor interactions between individual proteins provide an intricate, coded template for specific capsid arrangements.
The model in its simplest form treats both hexameric and pentameric capsomers as disks with an adhesive edge that describes the intersubunit bonding. Capsid self-assembly takes place from a disk solution with a given total mole fraction $\phi$ and chemical potential $\mu$. Let $V(\theta)$ be the gain in energy when two disk edges are joined, where $\theta$ is the angle between the disk normals. We will assume that
$$V(\theta) = V_0 + \tfrac{1}{2}\kappa(\theta - \theta^*)^2. \qquad (1)$$
The $V_0$ term in Eq. (1) is the (negative) disk-disk adhesion energy [14], which is assumed to be the dominant energy scale. The energy $\kappa$ in the second term corresponds to the bending stiffness of a joint, while $\theta^*$ is the optimal angle of a joint; $\theta^*$ plays the role of the spontaneous curvature of the capsomer shell [15]. As first proposed by CK [5], spontaneous curvature is the natural thermodynamic control parameter for capsid size. The disk adhesion energy $V_0$ favors packing a maximum number of disks on the capsid surface.
Let $\rho_N$ (< 1) denote the fraction of the capsid area covered by N disks at their maximum packing density on the surface of a sphere. An upper limit for $\rho_N$ is the coverage $\rho_{\max} = \pi/(2\sqrt{3})$ of a flat, hexagonal sheet of disks. By curving a hexagonal sheet to cover a sphere, additional interstitial spaces are introduced in the packing structure (see Fig. 1). The loss of binding energy suffered through the introduction of these holes (the in-plane deformational energy) is included as a ''mean field'' term $N(\rho_{\max} - \rho_N)^2$, obtained by treating the layer of disks as a stretchable elastic sheet [16]. The resulting capsid Hamiltonian is
$$H_N = \tfrac{z}{2} N V_0 + B N (\rho_{\max} - \rho_N)^2 + \tfrac{\kappa}{2} \sum_{\langle i,j \rangle} (\theta_{i,j} - \theta^*)^2. \qquad (2)$$
Here z is the mean number of nearest neighbors per disk and B is a two-dimensional compressional modulus (times the disk area). As in Eq. (1), $\kappa$ is the bending stiffness of a joint and $\theta^*$ its optimal angle. The sum in the last term runs over all nearest-neighbor pairs of disks, with $\theta_{i,j}$ the angle between their normals.
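To make the coverage concrete, the sketch below compares the flat-sheet limit with the coverage of 12 disks on a sphere. It assumes that each disk is modeled as a spherical cap and that the covered area is measured on the sphere, which may differ in detail from the convention used for the figures; the 12 disks are placed at the vertices of an icosahedron, the known optimal arrangement for N = 12. The names `cap_coverage` and `RHO_MAX` are illustrative.

```python
import math

# Coverage of a flat hexagonal sheet of disks: pi / (2 * sqrt(3)), about 0.9069.
RHO_MAX = math.pi / (2.0 * math.sqrt(3.0))


def cap_coverage(n_disks, nearest_angle_rad):
    """Fraction of a sphere covered by n_disks equal, touching spherical caps.

    Assumption: each disk is a cap of angular radius nearest_angle_rad / 2,
    which covers (1 - cos(radius)) / 2 of the sphere's area.
    """
    radius = 0.5 * nearest_angle_rad
    return n_disks * (1.0 - math.cos(radius)) / 2.0


# Adjacent vertices of an icosahedron subtend the angle arccos(1/sqrt(5)),
# about 63.43 degrees, at the center of the sphere.
ICOSAHEDRON_ANGLE = math.acos(1.0 / math.sqrt(5.0))
rho_12 = cap_coverage(12, ICOSAHEDRON_ANGLE)

print(f"rho_max = {RHO_MAX:.4f}")
print(f"rho_12  = {rho_12:.4f}")
```

With these assumptions the N = 12 coverage comes out near 0.896, below the flat-sheet value, which illustrates the in-plane cost of closing the sheet.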
The problem of finding the coverage, $\rho_N$, of a sphere that is optimally (close-)packed by N circular disks is known in the mathematical literature as the Tammes problem [17]. Results are available for small N values either in exact or numerical form [18]. Figure 2(a) shows $\rho_N$ for N ranging from 10 to 75. Note that $\rho_N$ remains significantly below the asymptotic flat-sheet value $\rho_{\max}$, even for the largest available N values. The capsid energy $H_N$, as computed from Eq. (2), exhibits as a function of N a complex spectrum of minima (not shown) that depends on the spontaneous curvature $\theta^*$.
To construct the self-assembly phase diagram, we treat N as a statistical quantity with $\phi_N$ the mole fraction of N-disk capsids. Minimization of the solution free energy $F = \langle H_N \rangle - TS$, with $S = -k_B \sum_N \phi_N \ln \phi_N$ the mixing entropy, leads to a classical formula of self-assembly [19]: $\phi_N \propto \exp[\beta(N\mu - H_N)]$, with $\mu$ the disk chemical potential and $\beta = 1/k_B T$. The onset of capsid formation is then identified by the condition that half of the disks remain in solution while the other half are incorporated in capsids. Minimization of F leads to the condition $(N - 1)\mu = H_N - k_B T \ln N$, with N the number of capsomers of the dominant capsid structure at onset.
For icosahedral capsids one would expect N = 12 (T = 1), N = 32 (T = 3), N = 42 (T = 4), and N = 72 (T = 7), the first four structures in the T series. The T = 1 dodecahedron is indeed quite stable for higher spontaneous curvatures, appearing at very low disk concentrations. However, the capsid structure that appears next for decreasing $\theta^*$ is not the T = 3 (truncated) icosahedron at N = 32 but instead, at N = 24, a surprising octahedral, chiral structure that has the symmetry of an Archimedean solid known as the snub cube (see the second column in Fig. 1). The next structure encountered for decreasing $\theta^*$ is indeed N = 32 (although the disk configuration does not have exact icosahedral symmetry [20]), but the N = 42 (T = 4) structure is superseded by N = 48. Finally, N = 72 appears, corresponding to T = 7 (though again lacking perfect icosahedral symmetry). Note that this series of ''magic numbers'' coincides with the dominant maxima of $\rho_N$ [see arrows in Fig. 2(a)]. Which of these magic numbers is selected is determined by the spontaneous curvature.
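The self-assembly formula can be illustrated numerically. The sketch below evaluates relative weights proportional to exp[(N mu - H_N)/kT] for a few capsid sizes and normalizes them among those sizes only. The energies and the chemical potential are hypothetical values in units of kT, chosen purely for illustration and not taken from the model, and `capsid_weights` is an illustrative name.

```python
import math


def capsid_weights(energies, mu, kT=1.0):
    """Normalized weights proportional to exp[(N * mu - H_N) / kT].

    energies maps a capsid size N to its energy H_N, in the same units
    as mu and kT. The weights are normalized over the listed sizes only.
    """
    exponents = {n: (n * mu - h) / kT for n, h in energies.items()}
    # Subtract the largest exponent before exponentiating, for numerical stability.
    shift = max(exponents.values())
    raw = {n: math.exp(x - shift) for n, x in exponents.items()}
    total = sum(raw.values())
    return {n: w / total for n, w in raw.items()}


# Hypothetical energies (units of kT); illustration only, not model output.
toy_energies = {12: -130.0, 24: -270.0, 32: -355.0}
weights = capsid_weights(toy_energies, mu=-11.0)
dominant = max(weights, key=weights.get)

print(weights)
print("dominant size:", dominant)
```

Because the weight depends exponentially on N mu - H_N, small shifts in mu or in the energy spectrum can switch the dominant size, which is how the spontaneous curvature selects among the magic numbers.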
The essential aspect of our result is that, though N = 12, N = 32, and N = 72 do appear as possible capsid geometries, the N = 42 (T = 4) structure is absent from the spectrum of magic numbers. On the other hand, two completely nonicosahedral conformations (N = 24 and N = 48) are predicted to be present. An important test case in this respect is provided by the polyoma virus, which is exceptional in that all its capsomers have the same size (with five proteins per capsomer) [21], as assumed in our model. The native form of the polyoma virus is the N = 72 (T = 7) structure, but self-assembly of polyoma capsid proteins alone (i.e., without their genome) produces three dominant structures [22] (depending on pH and ionic strength): N = 12, N = 24, and N = 72. The N = 24 structure has the symmetry of a (left-handed) snub cube, consistent with our model for the case of identical capsomers. We conclude that the adoption of icosahedral symmetry is not a generic feature of the self-assembly of finite-sized closed shells constructed from (more than 12) identical capsomers.
The capsomers of typical viruses are, however, not all identical as assumed in the simple model. Detailed structure studies [23] show that the 12 pentameric capsomers can have a quite different internal structure, often involving a conformational switch [23,24]. To account for this structural difference, we generalized the model by allowing 12 of the circular disks to have a smaller diameter than the others, and assigning a ''switch energy'' cost E to each transformation of a protein from a hexameric to a pentameric configuration. The ratio of the small and large disk diameters was chosen to optimize the coverage. Recomputing the close-packed structures for this ''two-radius'' model, we obtained surprising results: the Hamiltonian energies of the N = 32 (T = 3) icosahedral structure and, in particular, the N = 42 (T = 4) icosahedral structure were now significantly lower than that of the N = 24 snub cube. This was due to an increase in sphere coverage: $\rho_{32}$ increases from 0.846 to 0.89 and $\rho_{42}$ from 0.83 to 0.90, which even exceeds $\rho_{12}$. The coverage increase for N = 72 (T = 7) was more modest, from 0.85 to 0.86. On the other hand, because for the N = 24 snub cube the disks are all in equivalent positions, changing the size of one of the disks only reduces the coverage. We conjecture that this virtual elimination of the deformational energy term through the introduction of two slightly different disk radii takes place only for N = 10T + 2, i.e., for the icosahedral-symmetry T structures.
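The relation N = 10T + 2 gives a quick way to check which of the magic numbers discussed above belong to the icosahedral T series. The following is a minimal sketch, again assuming the standard Caspar-Klug rule T = h^2 + hk + k^2; the function names are illustrative.

```python
def is_triangulation_number(t):
    """True if t = h*h + h*k + k*k for some integers h >= k >= 0 with t >= 1."""
    if t < 1:
        return False
    h = 0
    while h * h <= t:
        for k in range(h + 1):
            if h * h + h * k + k * k == t:
                return True
        h += 1
    return False


def icosahedral_t(n_capsomers):
    """Return T if n_capsomers equals 10*T + 2 for a valid T number, else None."""
    if n_capsomers < 12 or (n_capsomers - 2) % 10 != 0:
        return None
    t = (n_capsomers - 2) // 10
    return t if is_triangulation_number(t) else None


for n in (12, 24, 32, 42, 48, 72):
    print(n, icosahedral_t(n))
```

The check returns a T number for N = 12, 32, 42, and 72 and none for N = 24 and N = 48, in line with the classification of the latter two as nonicosahedral.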
The appearance of a conformational switching energy term of 60E in Eq. (2) leads to an interesting effect: For increasing values of E, spherelike capsids transform to rodlike capsids. This can be understood by comparing the total energy of M spherical capsids, each having a curvature equal to the optimal curvature $\theta^*$, with that of a single spherocylindrical capsid with the same mean curvature and the same total area. The last term in Eq. (2) favors a minimum value of the second moment of the curvature distribution. It then follows that the cylinder bending energy is larger than that of the spheres by an amount of order $M\kappa$, with $\kappa$ the bending stiffness. On the other hand, each spherelike capsid must accommodate 12 of the smaller disks (12 pentamers of five switched proteins each), at a total energy price of order 60ME. A sphere-to-cylinder transition is thus expected when $60E/\kappa$ is of order one. We conclude that, along with the spontaneous curvature, E should be a second important control parameter for capsid size.
Figure 3(a) shows the self-assembly diagram for the two-radius model as a function of $E/\kappa$ and $\theta^*$, giving the structures that appear first for increasing total disk mole fraction $\phi$ (only the T = 1 and T = 3 icosahedral structures and infinite tubes were included). We compared this self-assembly diagram with the phase behavior of a well-studied example of virus reconstitution, namely, the T = 3 plant virus cowpea chlorotic mottle virus (CCMV), which is characterized by well-defined pentameric and hexameric capsomers [23]. The equilibrium phase diagram of the capsid proteins (again without genomic material) [3,25] [see Fig. 3(b)] displays a number of structures: hollow single and multishell capsids, hexagonal sheets, and buckytubelike spherocylinders [26]. A fairly monodisperse T = 3 capsid phase is encountered in the pH range below pH 5.5, while at neutral pH (cross) the dominant population consists of protein dimers for low protein concentrations and buckytubes for higher protein concentrations. Comparing Figs. 3(a) and 3(b), it follows that the two-radius model can account for the CCMV phase diagram if the capsid spontaneous curvature increases with salinity (which is reasonable since the CCMV capsid proteins are strongly cationic) and if the switching energy E increases with pH. This last trend actually has been argued to be the case for CCMV proteins (due to titratability of terminal carboxyls) on the basis of structural studies [23].
In summary, we have proposed a simple model Hamiltonian for viral self-assembly by disklike capsomers. The preferred number of identical capsomers is characterized by a sequence of magic numbers that does not coincide with the Caspar-Klug sequence. We find that the appearance of icosahedral symmetry for smaller capsids is not an automatic consequence of free energy minimization (as it would be in the continuum limit) but instead requires optimization of a structural parameter (the ratio of the two disk radii). With that optimization, the two-radius model reproduces the preference for icosahedral symmetry as well as a sphere-to-rod transition that has been observed for a number of viruses. The structural optimization is presumably the result of some form of biological adaptation [22], but this lies beyond the scope of the present study.
We would like to thank Christopher Henley, John Johnson, Charles Knobler, David Nelson, and Alex McPherson for helpful discussions. One of us (D. R.) acknowledges support from the NSF through Grant No. CHE-0076384.