Evolution of a histone variant involved in compartmental regulation of NAD metabolism

NAD metabolism is essential for all forms of life. Compartmental regulation of NAD+ consumption, especially between the nucleus and the mitochondria, is required for energy homeostasis. However, how compartmental regulation evolved remains unclear. In the present study, we investigated the evolution of the macrodomain-containing histone variant macroH2A1.1, an integral chromatin component that limits nuclear NAD+ consumption by inhibiting poly(ADP-ribose) polymerase 1 in vertebrate cells. We found that macroH2A originated in premetazoan protists. The crystal structure of the macroH2A macrodomain from the protist Capsaspora owczarzaki allowed us to identify highly conserved principles of ligand binding and pinpoint key residue substitutions, selected for during the evolution of the vertebrate stem lineage. Metabolic characterization of the Capsaspora lifecycle suggested that the metabolic function of macroH2A was associated with nonproliferative stages. Taken together, we provide insight into the evolution of a chromatin element involved in compartmental NAD regulation, relevant for understanding its metabolism and potential therapeutic applications. MacroH2A histone variants originated before the split of fungi and animals. ADP-ribose binding is an ancestral feature of their macrodomains and is linked to the compartmental regulation of NAD metabolism. This function was selected for during the evolution of metazoans.

NAD metabolism plays an essential role in all domains of life 1,2. NAD functions as a redox cofactor and a signaling molecule. As a redox cofactor in catabolic reactions, it enables ATP production in mitochondria 3. As a signaling molecule, its oxidized form, NAD+, serves as a donor of ADP-ribose moieties for effector enzymes such as poly(ADP-ribose) polymerases (PARPs), sirtuins and the CD38 family of hydrolases 4. The biosynthetic enzymes and effectors involved in NAD metabolism are differentially distributed in the cell, making NAD metabolism highly compartmentalized 5,6. In particular, the balance between NAD+ consumption in the nucleus and its availability for cytosolic and mitochondrial redox reactions is essential for energy homeostasis 6,7. A major challenge for the field is to understand how NAD metabolism is regulated on the compartmental level, and to dissect the relevance of this regulation for health and disease 5.
PARP1 is the major NAD+-consuming enzyme in the nucleus, best known for its function as a sensor in the DNA-damage response 8. The inhibition of PARP1 results in increased global NAD+ levels in both cultured cells and mice, indicating that PARP1 also consumes NAD+ under basal physiological conditions 9–11. Conversely, nuclear NAD+ levels regulate PARP1 activity 12,13. A picture is emerging in which the nuclear NAD+ consumption by PARP1 in differentiated cells needs to be kept low to maintain NAD-dependent functions in other compartments 14. In particular, during myogenic differentiation this is achieved by the transcriptional downregulation of PARP1 expression 15 and the simultaneous upregulation of its endogenous inhibitor, the macrodomain-containing histone variant macroH2A1.1 (ref. 11).
Macrodomains are ancient globular protein modules that have emerged as key players in NAD+-dependent ADP-ribose signaling 16. They bind ADP-ribose as a free molecule, as an oligomer or, when covalently bound to proteins, as a post-translational modification 17–19, and in some cases they have hydrolytic activity 20,21. The macroH2A family of histone variants is the only structural chromatin component containing macrodomains. In vertebrates, two genes and one event of alternative splicing give rise to three macroH2A proteins that differ in their macrodomains 22. The capacity to bind ADP-ribose is limited to the macrodomain of the splice variant macroH2A1.1 (refs. 23,24). As a consequence, macroH2A1.1, but not macroH2A1.2 or macroH2A2, binds auto-ADP-ribosylated PARP1 (ref. 19) and contributes to its enrichment on specific chromatin regions 25. MacroH2A1.1 can inhibit PARP1 activity and thus interfere with PARP1-dependent processes 24,26. In differentiated muscle cells, macroH2A1.1 reduces nuclear NAD+ consumption by PARP1 and increases the availability of NAD+ in other compartments, thus indirectly promoting respiration and ATP production in mitochondria 11.
Hence, macroH2A1.1 is a chromatin component that takes part in the compartmentalized regulation of NAD metabolism in vertebrates. In the present study, we addressed the question of when macroH2A histone variants emerged on an evolutionary scale and whether their implication in metabolism was an ancient trait. For this, we performed a phylogenetic analysis to determine the order of events in the evolution of macroH2A histone variants and characterized the function of a macroH2A macrodomain from one of the most divergent macroH2A-containing species in comparison to vertebrates.

Results
MacroH2A first appeared in unicellular protists. The amino acids encoded by the mutually exclusive exon 5 determine the capacity of macroH2A1.1 to bind ADP-ribose, inhibit PARP1 and thus affect NAD metabolism (Fig. 1a). To understand when this role of macroH2A1.1 evolved, we first aimed to determine when the fusion between the histone fold and the metabolite-binding macrodomain occurred. For this, we analyzed genomic and transcriptomic sequencing data representing a wide diversity of eukaryotes and identified a MACROH2A gene in 330 of them (Supplementary Tables 1 and 2). Previously, we reported the presence of macroH2A in a unicellular filasterean, Capsaspora spp. (ref. 27). Together with animals and fungi, filastereans belong to the group of opisthokonts 28. In the present study, we report the presence of macroH2A sequences in protists that diverged earlier than opisthokonts, such as the haptist Choanocystis sp. and the breviate Pygsuia biforma (Fig. 1b and Supplementary Fig. 1). It is interesting that we did not find macroH2A in fungi or other unicellular opisthokonts. The presence of macroH2A became more prevalent with the emergence of animal multicellularity. Indeed, we found macroH2A-encoding genes in diverse species of most animal phyla and all vertebrates. On the other hand, and as previously reported 16,27, macroH2A was absent from several nonvertebrate species, such as Drosophila melanogaster and Caenorhabditis elegans, as well as tunicates (Fig. 1b), which is indicative of lineage-specific losses. The second macroH2A gene corresponding to human MACROH2A2 appeared after the whole-genome duplication in the last common ancestor of vertebrates, followed by the appearance of the alternative splicing variant of the ancestral MACROH2A1 in a common ancestor of jawed vertebrates (Fig. 1b).
All three vertebrate macrodomains show a substantial level of conservation that placed them closer to the highly conserved, replication-coupled histone H2B than the fast-evolving histone variant H2A.Bbd (Fig. 1c). Comparison of the amino acid sequences corresponding to exon 5 indicated that nonvertebrate macroH2A is most similar to macroH2A1.1 (Fig. 1d). Importantly, this included a high level of conservation of amino acids known to be required for ADP-ribose binding in human macroH2A1.1 such as Asp203 and Gly224 (refs. 17,23).
Taken together, these results allowed us to describe the evolutionary order of events that resulted in the three different macroH2A histone variants present in vertebrates (Fig. 1e). Importantly, we found that macroH2A is much older than previously reported and originated in premetazoan protists. It was exciting to find that the first macroH2A gene is similar to vertebrate macroH2A1.1. This suggests that the function of the macroH2A histone variant in nuclear NAD+ metabolism might be ancient. The high conservation of the macrodomains of macroH2A1.2 and macroH2A2 suggests that they have acquired new, and yet unknown, binding functions.
A protist macrodomain has higher ADP-ribose affinity. Given the origin of macroH2A before metazoans (animals), we sought to determine its potential ancestral metabolic implication by determining the biochemical properties of macroH2A in one of the protist organisms. As a model system we used Capsaspora owczarzaki, one of the closest unicellular relatives of animals 29. The macrodomain of C. owczarzaki interacted with ADP-ribose in a similar fashion to the murine macroH2A1.1 macrodomain, as suggested by nuclear magnetic resonance (NMR)-based binding spectra (Fig. 2a and Extended Data Fig. 1a,b). However, the ADP-ribose binding by the Capsaspora macrodomain was more than eightfold stronger than by the macrodomain of mouse macroH2A1.1, with equilibrium dissociation constants (Kd) of 1.3 µM and 11.3 µM, respectively (Fig. 2b and Extended Data Fig. 1c-e). The thermodynamic profiles indicated that the two macrodomains bind ADP-ribose through different binding modes. The Capsaspora macrodomain bound ADP-ribose through favorable enthalpic and entropic contributions, whereas the mouse macroH2A1.1 macrodomain bound it mainly in an enthalpy-driven manner, which was partially compensated by an unfavorable entropic contribution (Fig. 2c). Furthermore, the Capsaspora macroH2A macrodomain showed high selectivity for ADP-ribose binding, because its affinity toward ADP was ~50-fold lower, and no interaction was observed with related nucleotides (ATP, AMP, GDP) or ribose (Extended Data Fig. 1f).
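The size of this affinity difference can be made concrete with a short calculation. The following minimal sketch converts the two reported Kd values into a fold difference and into standard binding free energies using ΔG° = RT ln(Kd / 1 M). The temperature of 298.15 K is an assumption made only for illustration, because the measurement temperature is not stated in this section, and the function names are hypothetical.

```python
import math

R = 8.314462618      # gas constant, J mol^-1 K^-1
T_ASSUMED = 298.15   # K; assumed for illustration, not taken from the text


def fold_difference(kd_weaker_um, kd_stronger_um):
    """Ratio of two dissociation constants given in the same unit."""
    return kd_weaker_um / kd_stronger_um


def delta_g_kj_per_mol(kd_um, temperature_k=T_ASSUMED):
    """Standard binding free energy, dG = RT ln(Kd / 1 M), in kJ/mol."""
    kd_molar = kd_um * 1e-6
    return R * temperature_k * math.log(kd_molar) / 1000.0


# Kd values reported in the text (micromolar)
KD_CAPSASPORA_UM = 1.3
KD_MOUSE_UM = 11.3

fold = fold_difference(KD_MOUSE_UM, KD_CAPSASPORA_UM)
dg_capsaspora = delta_g_kj_per_mol(KD_CAPSASPORA_UM)
dg_mouse = delta_g_kj_per_mol(KD_MOUSE_UM)
ddg = dg_mouse - dg_capsaspora  # positive: mouse binding is weaker

print(f"fold difference in Kd: {fold:.2f}")
print(f"dG Capsaspora: {dg_capsaspora:.2f} kJ/mol")
print(f"dG mouse:      {dg_mouse:.2f} kJ/mol")
print(f"ddG:           {ddg:.2f} kJ/mol")
```

Under the assumed temperature, the difference between the two Kd values corresponds to a few kJ/mol of binding free energy, which is in the range of a single hydrogen bond or salt bridge.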
To characterize the Capsaspora macroH2A macrodomain and its interactions with ADP-ribose at the atomic level, we solved the structure of the protein in the presence and absence of ADP-ribose by protein X-ray crystallography (Fig. 2d,e and Supplementary Table 3). The unliganded macroH2A macrodomain crystallized in the space group P12₁1 and could be refined to 1.4-Å (0.14-nm) resolution (Protein Data Bank (PDB) accession no. 7NY6), whereas the ADP-ribose-bound protein crystallized in the space group P3₂21 and was refined to 2.0-Å resolution (PDB accession no. 7NY7). The obtained globular structures, with seven central β-sheets in the characteristic 1276354 order surrounded by α-helices (Extended Data Fig. 2a,b), showed high structural similarity to the previously described macrodomains 17,23,30. Although the C-terminal α-helix was not resolved in the ADP-ribose-bound form of C. owczarzaki, its root mean square deviation was only 0.5 Å to the human ADP-ribose-bound macroH2A1.1 macrodomain (PDB accession no. 3IID). The electron density of ADP-ribose clearly revealed that the ligand sits within the canonical binding pocket of the macroH2A macrodomain (Fig. 2f). The residues important for ligand binding included Asp203, which established an H bond with the adenine amino group, and Phe352, which stabilized the adenine ring via π-electron stacking of the aromatic rings. Furthermore, the amino group of the side chain of Asn316 established an H bond with the distal ribose of ADP-ribose (Extended Data Fig. 2c and Supplementary Table 4), which can explain the selectivity for ADP-ribose over ADP (Extended Data Fig. 1f). On ADP-ribose binding, the side chains of Gln225 and Asn316 move toward each other and establish an H bond, resulting in a conformation that encloses the central diphosphate moiety of ADP-ribose (Fig. 2g).
In summary, the capacity of the macroH2A macrodomain to bind ADP-ribose is conserved in the protist C. owczarzaki. Indeed, the Capsaspora macroH2A macrodomain bound ADP-ribose more strongly than its vertebrate counterpart. This raises the possibility that evolution has selected for decreased ADP-ribose affinity along the vertebrate stem lineage.
Two evolutionarily divergent residues close the binding pocket. Next, we determined the importance of the conservation of the amino acid sequence for ADP-ribose binding and for the structural integrity of the macrodomain. Capsaspora macroH2A and human macroH2A1.1 macrodomains share only 50% identity at the level of the amino acid sequence (Extended Data Fig. 3a). However, the multiple sequence alignment of >300 macroH2A macrodomain sequences delineated two well-conserved regions of the protein, one located toward the N terminus, overlapping with the region encoded by exon 5, and one toward the C terminus of the macrodomain fold (Fig. 3a). It is of interest that, with the exception of Ser275, all ADP-ribose-interacting residues were located in the conserved N- and C-terminal regions. Comparison of these two regions between Capsaspora macroH2A and human macroH2A1.1 indicated that most amino acids involved in ADP-ribose binding were invariant or structurally related, although the Capsaspora macroH2A macrodomain established a lower number of bonds with ADP-ribose (Fig. 3a, Extended Data Fig. 3b and Supplementary Table 4). Mapping the conservation rate on the crystal structure of the apo form of the Capsaspora macroH2A macrodomain indicated a high conservation of the inner part of the ADP-ribose-binding pocket (Fig. 3b), whereas the surface regions were more variable (Fig. 3c, Extended Data Fig. 3c and Supplementary Video 1). The only two residues involved in ADP-ribose binding and with a low degree of conservation were Gln225 and Asn316 in the Capsaspora macroH2A, which correspond to Glu225 and Arg315 in human macroH2A1.1. Both pairs of residues close the binding pocket on ADP-ribose binding. However, the polar uncharged side chains of Gln225 and Asn316 establish an H bond (Fig. 2g), whereas the corresponding residues Glu225 and Arg315 in human macroH2A1.1 establish a salt bridge to close the binding pocket on ADP-ribose binding 23.
Despite the slight difference in the orientation of these side chains in the Capsaspora apo form from the human apo form (Fig. 3d), both macrodomains adopt an almost identical conformation on ADP-ribose binding (Fig. 3e).
Taken together, the low deduced conservation at the level of primary sequence contrasts with the highly conserved three-dimensional structure of C. owczarzaki and human macrodomains. The residues forming the inner ADP-ribose-binding pocket are highly conserved between C. owczarzaki and vertebrates. Two nonconserved residues corresponding to Capsaspora Gln225 and Asn316 close the binding pocket of the macroH2A macrodomain after an ADP-ribose-induced fit.
Succeeding substitutions of Gln225 and Asn316 reduce affinity. We hypothesized that the specific substitution of amino acids Gln225 and Asn316 drove the evolution toward the decreased ADP-ribose-binding affinity from C. owczarzaki to vertebrates.
To test this, we reconstructed the protein phylogeny with a subset of 260 sequences and inferred the ancestral states of these two sites (Fig. 4a, Extended Data Fig. 4 and Supplementary Table 5). The results revealed that the replacement of the protist-characteristic residue at position 225 (Gln225Glu) occurred early during macroH2A evolution, most likely in the ancestor of bilaterians and cnidarians, and was subsequently maintained in most animals (Fig. 4a). On the other hand, the transition to arginine at position 316 (Asn316Arg) shows more variability: it is prevalent only in specific protostome lineages, but almost ubiquitous in hemichordates and chordates (Fig. 4a and Extended Data Fig. 5a-c).
To understand the physiological consequence of this evolutionary course, we individually introduced Gln225Glu and Asn316Arg mutations in the Capsaspora macrodomain and confirmed that the mutant proteins were folded (Extended Data Fig. 5d). It is interesting that the Gln225Glu substitution resulted in a sevenfold decrease of the affinity toward ADP-ribose compared with wild-type (WT) Capsaspora protein (Fig. 4b). Thus, the affinity and the thermodynamic properties of ADP-ribose binding by the Gln225Glu Capsaspora mutant were more similar to those of the mouse macroH2A1.1 macrodomain (Extended Data Fig. 5e,f). Strikingly, the …
Taken together, these results shed light on the evolutionary events that affected the affinity of macroH2A macrodomains toward ADP-ribose. The Gln225Glu exchange proved to be a determining factor for the decrease in ADP-ribose affinity over the course of evolution and preceded the epistatic change of Asn316Arg.
Dynamic regulation of Capsaspora metabolism. To understand how the protist macroH2A is related to metabolism, we used C. owczarzaki as a model organism. C. owczarzaki has a dynamic lifecycle composed of three stages: filopodial, aggregative and cystic (Fig. 5a). As a response to environmental cues, single cells from the filopodial stage transition to the aggregative stage, one of the simplest forms of multicellularity. In both the filopodial and aggregative stages, cells are proliferative. In less advantageous environments, C. owczarzaki can transition to a spore-like cystic stage, which represents a resistance form with much smaller, nonproliferative cells 29. Using previously generated, stage-specific RNA-sequencing (RNA-seq) data 29, we analyzed the expression levels of a curated list of >400 metabolic genes, including genes encoded by mitochondria (Supplementary Table 6). The large majority of these metabolic genes were differentially expressed in at least one of the three life stages. Differentially expressed genes could be grouped in two larger and two smaller clusters by unsupervised clustering (Fig. 5b and Extended Data Fig. 7a). The larger groups contained 103 and 192 genes that were specifically up- or downregulated, respectively, in the cystic stage compared with filopodial and aggregative stages. We refer to these groups as Cys high and Cys low. Of the metabolic genes, 25 and 29 formed the smaller clusters that were differentially expressed between the filopodial and aggregative stages, placing the cystic stage in an intermediate position. These results suggested that the three life stages differ in their metabolic activity, with the cystic stage having the most divergent metabolic state.
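The idea behind grouping genes by their stage-specific expression can be illustrated with a small sketch. It computes per-gene z-scores across the three life stages and labels each gene by the sign of its cystic-stage z-score. This thresholding rule is a deliberately simplified stand-in for the unsupervised clustering used in the study, not a reproduction of it; the gene names, expression values, threshold and the use of the population standard deviation are all hypothetical choices made for illustration.

```python
from statistics import mean, pstdev

STAGES = ("filopodial", "aggregative", "cystic")


def zscores(values):
    """Row-wise z-scores across stages (population standard deviation)."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def label_by_cystic(z, threshold=1.0):
    """Simplified rule: classify a gene by its cystic-stage z-score."""
    cys = z[STAGES.index("cystic")]
    if cys >= threshold:
        return "Cys high"
    if cys <= -threshold:
        return "Cys low"
    return "other"


# Hypothetical expression values, ordered as in STAGES
toy_expression = {
    "gene_A": [10.0, 12.0, 40.0],  # induced in cysts
    "gene_B": [55.0, 50.0, 5.0],   # repressed in cysts
    "gene_C": [20.0, 60.0, 40.0],  # differs between Filo and Agg
}

labels = {gene: label_by_cystic(zscores(values))
          for gene, values in toy_expression.items()}

for gene, label in labels.items():
    print(gene, label)
```

In this toy example, gene_C falls into neither cystic group because its cystic-stage expression lies between the two proliferative stages, which mirrors the intermediate position of the cystic stage in the two smaller clusters.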
It is interesting that almost one-fifth of the genes in the Cys high cluster were not related to anabolism or catabolism, but instead to other metabolic processes, including autophagy-related V-ATPase proton pumps (Fig. 5c and Extended Data Fig. 7b). Although the genes involved in catabolism had a relatively lower representation in the Cys high cluster than in the Cys low one (Fig. 5c), they seemed to be more directed toward efficient mitochondrial ATP generation. For instance, mitochondrial genes encoding components of the respiratory chain were exclusively found in Cys high (Fig. 5d). In terms of anabolism, the relative proportion of anabolism-related genes in the Cys high and Cys low gene clusters was the same (Fig. 5c), but they had a different metabolic implication. NAD and NADP biosynthesis pathways were strikingly overrepresented in the Cys high cluster (Fig. 5d). Intrigued by this observation, we examined the dynamic expression of an extended set of genes related to NAD+ metabolism, including effectors such as PARP1 and macroH2A. Unsupervised clustering distinguished four clusters related to their differential expression in the three life stages (Fig. 5e). Clusters 1 and 2 contained genes with increased expression in the two proliferative stages and decreased expression in the nonproliferative cystic stage (Fig. 5e). In line with the proliferative characteristics, they included most of the genes encoding PARP enzymes and other components of the DNA-repair machinery, which might allow cells to cope with replication-associated DNA damage. Clusters 3 and 4 contained genes more prominently expressed in aggregative and cystic stages. This included genes encoding components of biosynthetic NAD pathways such as the salvage enzyme nicotinamide phosphoribosyltransferase (NAMPT) and the de novo synthesis enzyme quinolinate phosphoribosyl transferase (QPRT) (Fig. 5e).
Fig. 4 | … were included as reference. The calculated Kd values are indicated. A representative graph of four independent experiments is shown. c, Pie chart showing that almost 25% of analyzed Mollusca have mollusk-specific alternative splicing (AS) of macroH2A. The scheme in the right panel shows the alternatively spliced exon 7, specifying the amino acid residues that this splicing affects. Residue 316 is outlined by an arrowhead. d, For the subset of Mollusca with AS of exon 7, the identity of the amino acid corresponding to 316 is plotted as a percentage of species for both alternative exons. In most cases, isoform 7.1 contains the protist-characteristic residue Asn316 and the isoform 7.2 has exclusively the vertebrate-characteristic Arg316. AA, amino acid.
Fig. 5 | … (25) and Agg high (29). Data represent a distribution of z-scores of expressed genes (n = 3 biologically independent samples). Box plot parameters are detailed in Statistical analysis. c, Metabolic genes were classified as anabolic, catabolic, context dependent, other and unknown. Pie charts indicate their proportion in Cys high and Cys low clusters identified in b. d, Column charts showing the relative contribution of different pathways to the anabolic (top) and catabolic (bottom) component of the group of genes in Cys high and Cys low. The total number of anabolic and catabolic genes shown in c was set to 100%. CoA, coenzyme A; FAD, flavin adenine dinucleotide; GSH, glutathione. e, Differentially expressed genes related to NAD+ metabolism group into four clusters. Genes were classified in eight groups and color coded as shown: macroH2A, other macrodomain proteins, sirtuins, Nudix proteins, PARPs, PARGs, DNA repair and NAD biosynthesis. f, Western blots of total Capsaspora cell extracts from the three life stages using C. owczarzaki-specific antibodies for macroH2A and PARP1, and histone H3 as a loading control. A representative western blot is shown (n = 3 independent assays). An uncropped blot image is available as Source data. g, Changes in the relative ratio of macroH2A (mH2A) and PARP1 as determined by mass spectrometry. The value in Filo has been set to 1.
It is noteworthy that the RNA expression patterns of PARP1 and macroH2A in cluster 2 were highly similar, suggesting coregulation at the transcriptional level (Fig. 5e). At the protein level, PARP1 was readily detected in both proliferative stages by western blotting, but its levels dropped in the cystic stage (Fig. 5f and Extended Data Fig. 7c). For macroH2A, we detected a doublet at the expected size, which collapsed into a single band in the cystic stage (Fig. 5f and Extended Data Fig. 7d). To confirm this change in ratio, we extracted the information on PARP1 and macroH2A from available shotgun mass spectrometry proteomic data 31. Although this approach is not quantitative at the absolute level, it showed that the relative ratio of macroH2A and PARP1 was the highest in the cystic stage (Fig. 5g), consistent with our western blotting results.
Taken together, gene expression data indicated that metabolism is dynamically regulated between the three life stages of the unicellular filasterean C. owczarzaki. It is possible that the cells in the cystic stage use salvaged and de novo synthesized NAD+ for life-sustaining oxidative phosphorylation and ATP production primarily fueled by a catabolic metabolism. The relative ratio of macroH2A and PARP1 was the highest in the cystic stage, suggesting that the macroH2A-dependent inhibition of PARP1's nuclear NAD+ consumption is most likely to occur in the nonproliferative stage. Together, these metabolic adaptations can explain how C. owczarzaki survives in adverse conditions.
Fig. 6 | Differential impact of a protist and a vertebrate macroH2A macrodomain on PARP1 activity and cell metabolism. a, Scheme showing how the inhibition of PARP1 by macroH2A1.1 (mH2A1.1) in the nucleus is connected with mitochondrial respiration through NAD+ metabolism in vertebrates. NAM, nicotinamide; NMN, nicotinamide mononucleotide; OXPHOS, oxidative phosphorylation. b, In vitro PARP1 auto-PARylation activity (act.) induced by nicked DNA measured by anti-PAR western blotting. Naphthol Blue staining shows the increasing amounts of macrodomains that were titrated into the reaction. A representative western blot is shown (n = 3 independent assays). c, Schematic overview of the constructs that have been introduced into macroH2A-deficient HepG2 cells (referred to as DKD cells). d, The exogenous expression of constructs illustrated in c shown by western blotting using anti-macroH2A1.1, anti-Capsaspora macroH2A and anti-GFP. Anti-histone H3 is included as a loading control. A representative western blot is shown (n = 3 independent assays). e, The expression of genes encoding key components and regulators of indicated metabolic pathways analyzed by RT-qPCR in the cell lines described in c and d. Data represent mean (n = 3) ± s.d. A two-tailed Student's t-test was used to make the indicated pairwise comparisons (*P < 0.05). Rel. expr., relative expression. f, Mitochondrial content was assessed by measuring the relative amount of unique sequences in mitochondrial and genomic DNA (mt/gDNA). Mitochondrial DNA content was normalized to nuclear (genomic) DNA using MT-ND2 and NDUFV1. Data represent mean (n = 3) ± s.d. A two-tailed Student's t-test was used to test for statistical differences. g, The oxygen consumption rate (OCR) of stable cell lines described in c and d, measured in routine culture and after subsequent addition of the ATPase inhibitor oligomycin (O), the uncoupling compound FCCP and the electron transport chain inhibitors rotenone and antimycin A (AA) (left). The bar plot shows the resulting ATP production (right). Data represent mean (n ≥ 4) ± s.d. Ordinary one-way ANOVA was used to make the indicated comparisons between different groups of samples: *P < 0.05, ****P < 0.0001. Uncropped blot images for b and d are available as Source data.
Affinity correlates with PARP1 inhibition and respiration. The inhibition of PARP1 by macroH2A1.1 in vertebrate cells reduces nuclear consumption of NAD+, thereby increasing NAD+ availability in the mitochondria necessary for respiration 11 (Fig. 6a). In accordance with its increased affinity toward ADP-ribose, the macrodomain of Capsaspora macroH2A had a greater inhibitory capacity toward PARP1 in vitro than the murine macroH2A1.1 macrodomain. This was abolished by mutations in the ADP-ribose-binding pocket, such as Gly224Glu and Gly314Glu (Fig. 6b).
Next, we sought to determine the impact of ADP-ribose binding by the Capsaspora macrodomain on metabolism in vivo. Currently, there are no available tools that would allow for the genetic manipulation of C. owczarzaki. Therefore, we used an orthogonal approach and introduced the Capsaspora protein into human HepG2 cells, stably depleted of all macroH2A isoforms 32. To avoid confounding influences caused by any differences in histone-fold or linker sequences, we fused the WT and mutant macrodomains of Capsaspora macroH2A to the histone-fold and linker region of mouse macroH2A1.1 (Fig. 6c). The expression levels of the WT chimeric protein reached approximately half the level of full-length mouse macroH2A1.1 and were in the range of the endogenous levels of macroH2A proteins in parental HepG2 cells (Fig. 6d and Extended Data Fig. 8a-c). We confirmed that the green fluorescent protein (GFP)-tagged chimeric and murine proteins were fully incorporated into chromatin (Extended Data Fig. 8d) and in contact with PARP1 (Extended Data Fig. 8e). Key metabolic genes were largely unaffected by the expression of the different macroH2A constructs (Fig. 6e). Furthermore, the mitochondrial content was similar across the four cell lines (Fig. 6f). However, when analyzing the oxygen consumption, we found that both the basal and the maximal mitochondrial respiration increased in the presence of mouse macroH2A1.1 (Fig. 6g). Strikingly, despite its lower expression level, this increase was even more pronounced in the case of the chimeric protein containing the Capsaspora WT macrodomain and translated into a significantly increased calculated ATP production (Fig. 6g). The Gly224Glu mutant macrodomain was inert, further substantiating the requirement for a functional and intact ADP-ribose pocket.
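The respiration parameters discussed above are typically derived from an oxygen consumption trace by simple differences between measurement phases. The sketch below shows one common convention for an oligomycin, FCCP and rotenone/antimycin A injection series: ATP-linked respiration is taken as the drop in OCR after oligomycin, and basal and maximal respiration are corrected for the residual, non-mitochondrial OCR. Whether the calculated ATP production in the study follows exactly this convention is an assumption; the function name and all numbers are hypothetical.

```python
from statistics import mean


def respiration_parameters(basal, post_oligomycin, post_fccp, post_rot_aa):
    """Derive respiration parameters from OCR readings of four phases.

    Each argument is a list of OCR readings (same arbitrary unit) taken
    in routine culture and after each successive injection.
    """
    non_mito = mean(post_rot_aa)  # residual OCR with the chain blocked
    return {
        "non_mitochondrial": non_mito,
        "basal": mean(basal) - non_mito,
        "atp_linked": mean(basal) - mean(post_oligomycin),
        "proton_leak": mean(post_oligomycin) - non_mito,
        "maximal": mean(post_fccp) - non_mito,
    }


# Hypothetical OCR readings for one cell line
toy = respiration_parameters(
    basal=[100.0, 102.0, 98.0],
    post_oligomycin=[40.0, 38.0, 42.0],
    post_fccp=[180.0, 178.0, 182.0],
    post_rot_aa=[20.0, 19.0, 21.0],
)

for name, value in toy.items():
    print(f"{name}: {value:.1f}")
```

Averaging all readings per phase is one of several conventions; some protocols instead use the minimum or the last reading of a phase, which would change the absolute values but not the logic of the subtraction.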
In sum, our results show that the increased ADP-ribose affinity of Capsaspora macroH2A translated into stronger PARP1 inhibitory capacity and a more pronounced impact on mitochondrial respiration, when compared with mouse macroH2A1.1.
Taken together, the results of the present study suggest that the capacity of macroH2A to bind ADP-ribose, inhibit PARP1 and dampen its nuclear NAD+ consumption is an ancient trait that was already present in protists. During evolution, a reduction in ADP-ribose affinity, mediated by changes in the residues that close the binding pocket on ligand binding, fine-tuned the stringency of this mechanism.

Discussion
The origin of macrodomain-containing histone variants. The macrodomain is the defining feature of all macroH2A histone variants. By focusing our evolutionary analysis on the amino acid sequence of macroH2A macrodomains, we were able to delineate the evolutionary history of this atypical histone variant. MacroH2A first appeared in protists ancestral to modern animals, filastereans and breviates, with the original gene resembling that of mac-roH2A1.1. The presence of the macroH2A gene was retained and further diversified in vertebrates, whereas it was sporadically lost in some invertebrates with accelerated evolution, such as Drosophila spp. It is of interest that the loss of macroH2A in these species correlates with a reduced number of PARP genes 16 . Gene duplication in a common ancestor of vertebrates resulted in the appearance of macroH2A2, an isoform deficient in ADP-ribose binding. Consecutively, the alternatively spliced exon encoding macroH2A1.2 appeared in a common ancestor of jawed vertebrates, adding to a second example of an NAD signaling-inert macroH2A histone variant that can be incorporated into chromatin. The presence of macroH2A-encoding genes in haptist and breviate suggest that the fusion of a macrodomain and a histone fold occurred before the split between fungi and animals, and thus much earlier than previously thought. Due to the scarcity of the data, we cannot fully rule out the possibility that the appearance of macroH2A in the haptist or breviate was caused by independent fusion events as an example of convergent evolution.
Macrodomains are present in all forms of life. Some viral and bacterial macrodomains bind ADP-ribose 33 , whereas the first extensively characterized ADP-ribose-binding macrodomain was archaeal 17 . Histone proteins, including the histone variant mac-roH2A, have a longer half-life than average cellular proteins [34][35][36][37] . It is intriguing to speculate that the fusion of a macrodomain to a histone provided cells with an abundant and strictly nuclear element with an increased half-life for the regulation of NAD-dependent reactions. The ability to influence NAD + -dependent reactions in a compartmentalized manner might have provided an advantage to eukaryotes, allowing them to adapt to changes in their environment by adopting different states, consistent with our analysis of Capsaspora life stages. In addition, early macroH2A might have had functions in ADP-ribose signaling.
Evolution reduced ADP-ribose-binding affinity. A major conclusion of the present study is that ADP-ribose binding is the most ancient trait of the histone variant macroH2A. The comparative analysis of C. owczarzaki and vertebrate macroH2A macrodomains provides us with a better understanding of how the ADP-ribose-and NAD metabolism-related functions of macroH2A were shaped through evolution. The Capsaspora macrodomain bound ADP-ribose with almost ten times higher affinity than its mouse counterpart and, consequently, was a much more potent PARP1 inhibitor. We were able to map this functional difference to the substitution of only two residues that close the binding pocket in the Capsaspora macrodomain, Gln225 and Asn316. The ancestral sequence reconstruction is consistent with the Gln225Glu replacement occurring as early as in the common ancestor of cnidarians and bilaterian metazoans, leading to a decreased ADP-ribose affinity, which was maintained in most animal groups. Similarly, the Asn316Arg replacement seems to have occurred early during macroH2A evolution, but was sporadically lost in several protostome groups. Asn316Arg is strongly represented among deuterostomes, although the physiological reason remains less clear. It is of interest that a similar course of evolution was recently reported for hemoglobin, where only two historical substitutions in the ancestral protein decrease oxygen affinity, while enabling tetramerization and cooperativity 38 .
NAD+ and macroH2A sustain nonproliferative life stages. The decreased stringency of macroH2A-dependent regulation was probably selected for along the vertebrate stem lineage. In vertebrates, macroH2A1.1 takes part in cross-compartmental regulation of NAD metabolism by inhibiting PARP1 activity in the nucleus. This function scales with the expression of the macroH2A1.1 isoform and was particularly prominent in nonproliferative myotubes where macroH2A1.1 is expressed at slightly higher levels than PARP1 (refs. 11,14 ). At present, it is unclear how the inhibitory effect of macroH2A1.1 is mediated at the molecular level. We speculate that the binding of the macrodomain to mono-ADP-ribosylated PARP1 or PARP1 modified with short-chain PARylation could interfere with conformational changes required for PARP1 activity 14 .
C. owczarzaki has three different life stages: two proliferative stages and a nonproliferative, spore-like, cystic stage, to which it transitions in unfavorable environmental conditions 29 . We found that the ratio between macroH2A and PARP1 was the highest in the cystic stage, suggesting that the putative macroH2A-dependent compartmental regulation of NAD metabolism might mostly operate in the nonproliferative stage of C. owczarzaki, similar to what was observed in myotubes. The cystic stage was further characterized by high expression of catabolic, mitochondrially encoded genes encoding components of the respiratory chain. Although anabolic pathways were overall downregulated in the cystic stage, genes involved in biosynthesis of NAD and precursors were upregulated. Curiously, several reports indicate the importance of ADP-ribose and NAD during sporulation and germination of bacterial spores 39,40 . Bacteria might rely on their NAD-based redox potential for germination; more specifically the accumulation of the reduced forms may have an important role in the initiation of germination 41 . It will be interesting to test whether a similar mechanism enables C. owczarzaki to re-enter the proliferative stages of its lifecycle. Taken together, our results indicate that C. owczarzaki uses a combination of both eukaryotic and bacterial mechanisms for survival in nutrient-poor environments. Our data indicate that NAD biosynthesis is channeled to life-sustaining catabolic reactions in the cystic stage. This coincides with a high macroH2A:PARP1 ratio, which has the potential to limit nuclear NAD+ consumption by PARP1. The experimental proof is pending the development of genetic tools.

The need for compartmental regulation of NAD metabolism.
NAD homeostasis is vital for optimal cell function and, by extension, for organismal health 1,2 . The NAD+ pools of independent compartments are connected through the shared NAMPT reaction of the salvage pathway and through transport of NAD+ and its precursors 5 , thus creating a requirement for communicating and regulating local needs. But why is the compartmental regulation of NAD levels essential? NAD+-dependent enzymes differ in their Michaelis-Menten constant (Km) as much as 100-fold, from 2 μM to 1,000 μM 7,42 . Thus, their activities are controlled by the local NAD+ concentrations, which differ between cellular compartments 13,43 . This is well illustrated by the example of nuclear NAD+-consuming enzymes PARP1, sirtuin 1 (SirT1) and PARP2, which have decreasing NAD+ affinity, respectively. They form a regulatory loop, whereby SirT1 can inhibit PARP1, whereas PARP2 regulates the activity of SirT1, depending on the nuclear NAD+ levels 7 . Furthermore, some of the nuclear NAD+-consuming enzymes, such as SirT6, have an even higher NAD+ affinity than PARP1. It is interesting that macroH2A1.1 has been shown to interact with ADP-ribosylated SirT7 in an ADP-ribose-binding pocket-dependent manner 44 . This raises the possibility that macroH2A1.1 may be involved in a more general regulation of nuclear NAD+ metabolism and ADP-ribose signaling.
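To make the Km argument concrete, the following minimal sketch evaluates the standard Michaelis-Menten relation v/Vmax = [S]/(Km + [S]) for the two extremes of the Km range quoted above. The NAD+ concentrations in the loop are hypothetical illustrative values, not measurements from this study, and the function name is our own.

```python
def fractional_velocity(nad_um: float, km_um: float) -> float:
    """Michaelis-Menten fractional velocity v/Vmax = [S] / (Km + [S])."""
    if nad_um < 0 or km_um <= 0:
        raise ValueError("concentration must be >= 0 and Km must be > 0")
    return nad_um / (km_um + nad_um)


# Hypothetical local NAD+ concentrations (µM), chosen only for illustration.
for nad in (10.0, 100.0, 1000.0):
    high_affinity = fractional_velocity(nad, km_um=2.0)     # Km = 2 µM
    low_affinity = fractional_velocity(nad, km_um=1000.0)   # Km = 1,000 µM
    print(f"[NAD+] = {nad:6.0f} µM   Km 2 µM: {high_affinity:.2f}   "
          f"Km 1,000 µM: {low_affinity:.2f}")
```

Under these assumptions, an enzyme with a low Km is close to saturation over the whole range, whereas the activity of an enzyme with a high Km tracks the local NAD+ concentration.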
Compartmental NAD+ regulation in the context of evolution. The fine-tuned regulation of NAD+ consumption and localization is particularly relevant during cell differentiation, when the requirements for NAD+ compartmentalization change 13 . It is conceivable that nonproliferative states require less NAD+ for nuclear functions, such as replication-associated DNA repair, and thus benefit from prioritizing NAD usage for life-sustaining functions, such as ATP production through mitochondrial respiration. The function of macroH2A1.1 as a nuclear NAD+ regulator was first demonstrated in differentiated myotubes 11 . However, macroH2A1.1 is expressed in a wide array of tissues 24 . In addition, its upregulation is observed during the differentiation of other tissues apart from muscle, such as skin and colon 45,46 . This suggests that its function is more widespread and generally related to cell differentiation and increased cellular plasticity in animals. In this regard, it is worth noting that C. owczarzaki, one of the closest relatives of animals with a complex lifecycle, shares several mechanisms of spatial cell differentiation with animals 31,47 .
Furthermore, the consolidation of macroH2A with decreased ADP-ribose affinity in vertebrates coincided with the diversification of NAD biosynthesis pathways, which provided additional elements of regulation 48 . The increased complexity of higher organisms requires intricate fine-tuning of cell processes. This is often achieved by increasing the number of proteins in a regulatory network to allow for efficient sensing of subtle changes in the environment, thus enabling a fast response to environmental cues. It has been suggested that changes at the periphery of metabolic networks, possibly encoded by nonessential genes, are more likely to endow the system with a high probability of gaining beneficial changes than changes in the rigid core of the pathway encoded by essential genes 49 . Our data suggest that the emergence of macroH2A in protists could be such a peripheral change in the network of NAD metabolism, and that it has been selected for during the evolution of metazoans. However, macroH2A is not essential for multicellular life itself because several animal species have lost macroH2A and macroH2A-deficient mice are viable 50 .
In the present study, we described the evolution of a histone variant that can act as an inhibitor of nuclear NAD+ consumption, adding a unique mechanism for compartmental metabolic regulation. A better understanding of these regulatory mechanisms will be informative for the ongoing development of therapies targeting NAD metabolism and signaling 1,2 . Future work will have to further elucidate how the metabolic function of macroH2A integrates with its other molecular functions, such as the regulation of higher-order chromatin architecture 32 , DNA repair 51 and transcription 52 .

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41594-021-00692-5.

Methods
Molecular data mining. MacroH2A sequences were collected from the GenBank database by Blast searches using human sequences as a query. For better representation of species, especially in the transition to the vertebrate lineage, de novo assemblies of transcriptomes for jawless fish (hagfish and lamprey; BioProject accession nos. PRJDB4902 and PRJNA292033, respectively) and the bowfin (BioProject accession no. PRJNA292033) were carried out using Trinity software, v.2.2.0 in the Galaxy web platform 53 . Briefly, paired-end Sequence Read Archive Fastq files were uploaded from the European Nucleotide Archive to the Galaxy platform and their quality analyzed using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). All left and right reads were concatenated in two separate files and used as input for Trinity with default parameters. Assembled sequences were used to create local nucleotide databases and Blast searches were performed as described above.
Overall, 467 sequences encompassing 327 metazoan (58 vertebrate and 269 nonvertebrate) and 3 nonmetazoan species were retrieved. Three macroH2A variants (macroH2A1.1, macroH2A1.2 and macroH2A2) were collected for all vertebrate species except for species displaying only macroH2A1.1. For all macroH2A proteins, only the globular part of the macrodomain (amino acids 182 to the end in human sequences) was used in the analyses, unless stated otherwise. Manual Blast searches were performed for curating the data of underrepresented species, specifically for filasterean and ichthyosporean species using available information 28,54-59 .
Sequence alignments. Multiple sequence alignments were performed using MAFFT v.7 (ref. 60 ) and Jalview v.2 (ref. 61 ), and edited for potential errors in BioEdit (v.7.2). Logo plots were generated based on the aligned sequences using WebLogo3 (ref. 62 ). A multiple sequence alignment of the macroH2A1.1-like macrodomain sequences from 305 species with 1 macroH2A1.1-like isoform was generated using the alignment tool PRALINE 63 . The conservation score indicates the conservation of amino acid biochemical properties on a scale from 1 to 6, and is represented by height and color of the bar.
For pairwise sequence homology analysis, we used Blast homology searches with macroH2A macrodomain sequences of representative organisms as queries. The resulting percentage homology is represented in a homology matrix visualized using Morpheus matrix analysis and visualization software (software.broadinstitute.org/morpheus).

Phylogenetic and evolutionary analyses.
If not stated otherwise, phylogenetic and molecular evolutionary analyses were conducted using MEGA X v.10.1.7 (ref. 64 ). MacroH2A phylogeny and ancestral sequence states were inferred by using the maximum likelihood method with the LG substitution model 65 , including gamma-distributed variation among sites. Positions with <95% site coverage were eliminated, so the analysis involved 260 amino acid sequences and a total of 181 positions in the final dataset. The reliability of the reconstructed topology was assessed by a nonparametric bootstrap method (1,000 replicates).
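The site-coverage filter described above can be sketched as follows. This is an illustrative reimplementation, not the MEGA X code: the function name, the toy alignment and the choice of `-` and `?` as missing-data symbols are assumptions made for the example.

```python
from typing import Dict


def filter_sites_by_coverage(
    alignment: Dict[str, str],
    min_coverage: float = 0.95,
    missing: str = "-?",
) -> Dict[str, str]:
    """Keep alignment columns in which at least `min_coverage` of the
    sequences carry a residue (that is, not a gap or missing-data symbol)."""
    seqs = list(alignment.values())
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    n = len(seqs)
    keep = [
        col
        for col in range(length)
        if sum(s[col] not in missing for s in seqs) / n >= min_coverage
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Toy alignment (hypothetical sequences): columns 2 and 4 each contain one gap.
toy = {"sp1": "MKV-", "sp2": "MKVA", "sp3": "M-VA", "sp4": "MKVA"}
print(filter_sites_by_coverage(toy))
```

With four sequences, a single gap lowers the coverage of a column to 75%, so only the two fully covered columns survive the default 95% threshold.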
Protein sequence divergence was estimated using uncorrected differences (P distances, partial deletion 95%), and the rates of evolution were estimated by correlating pairwise protein divergences between pairs of taxa with their corresponding divergence times as defined by the TimeTree database 66 .
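As a minimal sketch of the two quantities named above, the code below computes an uncorrected p-distance (the proportion of differing sites among compared sites) and the slope of an ordinary least-squares fit of divergence against divergence time. Using the regression slope as the rate estimate, skipping gapped positions pairwise, and all names and numbers in the example are assumptions for illustration; the study itself used MEGA X and TimeTree.

```python
from typing import Sequence


def p_distance(seq_a: str, seq_b: str, missing: str = "-?") -> float:
    """Uncorrected p-distance between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a in missing or b in missing:
            continue  # skip positions lacking a residue in either sequence
        compared += 1
        differences += a != b
    if compared == 0:
        raise ValueError("no comparable positions")
    return differences / compared


def least_squares_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Slope of the ordinary least-squares line of ys on xs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical divergence times (Myr) and pairwise p-distances.
times = [100.0, 200.0, 300.0]
distances = [0.1, 0.2, 0.3]
print(least_squares_slope(times, distances))  # divergence per Myr
```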

Metabolism-focused analysis of high-content data from C. owczarzaki.
Previously generated raw RNA-seq data 29 were realigned using STAR v.2.7.3a 67 . The function genomeGenerate was used to include the mitochondrial transcriptome data in the described assembly of the Capsaspora genome 68 , and reads were quantified using featureCounts software v.2.0.1 (ref. 69 ). Statistical analysis was performed using DESeq2 (ref. 70 ) with the likelihood ratio test, selecting genes with an adjusted P ≤ 0.01. Clustering of the data was performed using the DEGreport software package from Bioconductor.
Functional annotation was performed using eggNOG 5.0 using precomputed clusters and phylogenies 71 . The curated list of metabolic genes from Kyoto Encyclopedia of Genes and Genomes 72 was generated by retrieving Capsaspora genes using eggNOG 5.0 and GhostKOALA 73 . This was further complemented by using OrthoFinder 74 to identify Capsaspora orthologues of the human queries. The absence of orthologues was confirmed by Blast search. The resulting list was further manually curated, and genes involved in multiple pathways were assigned to their parent metabolic pathway and categorized as anabolic, catabolic, context dependent (anabolic or catabolic), other and unknown.
The relative abundance of proteins was extracted from previously generated proteomics data using raw measurements, and averaging three replicates per condition 31 .
For mammalian expression constructs, the full-length sequences of Capsaspora macroH2A, or mouse macroH2A1.1 or macroH2A1.2, were cloned into pLVX-Puro lentiviral backbone (Clontech) adding an N-terminal enhanced GFP tag. Mouse-C. owczarzaki chimeras were generated by sequential cloning of fragments. First, histone-fold and linker-domain sequences from mouse macroH2A1.1 were inserted into the backbone, followed by the insertion of either WT or mutant (Gly224Glu) Capsaspora macroH2A macrodomain. The pLVX-Puro with enhanced GFP alone was cloned and kindly provided by M. Gamble 75 . All sequences were verified by sequencing.
Protein production and purification. Rosetta (DE3) chemically competent Escherichia coli were transformed with bacterial expression vectors and grown in lysogeny broth medium supplemented with 34 µg ml−1 of chloramphenicol and 50 µg ml−1 of kanamycin at 37 °C overnight. The culture was used to inoculate 1 l of Terrific Broth medium (Sigma) and grown at 37 °C and 200 r.p.m. until reaching an absorbance at 600 nm of 0.4-0.6. The protein expression was then induced with 0.5 mM isopropyl-β-d-1-thiogalactopyranoside for 16 h at 20 °C. The next day, bacteria were pelleted by centrifugation at 10,000g for 15 min at 4 °C. The bacterial pellet was lysed in 50 mM Tris, 300 mM NaCl, 10 mM imidazole and 1 mM DTT, pH 7.4, supplemented with 1 mg ml−1 of lysozyme, 10 µg ml−1 of DNase I and protease inhibitors (Roche cOmplete, EDTA-free). The lysates were cleared by centrifugation at 30,000g for 45 min at 4 °C. Subsequently, the cleared lysates were incubated with Ni-NTA beads (QIAGEN) for 1 h at 4 °C and passed over a gravity flow column 3×. After washing the beads with 3 column volumes using 50 mM Tris, 1 M NaCl, 10 mM imidazole and 1 mM DTT, pH 7.4, the proteins of interest were eluted with 50 mM Tris, 300 mM NaCl, 300 mM imidazole and 1 mM DTT, pH 7.4. The eluted proteins were dialyzed overnight into a phosphate buffer (50 mM KH2PO4 and 1 mM DTT, pH 6.5), unless stated otherwise. The purified proteins were concentrated using a centrifugal concentrator with a 3-kDa molecular mass cutoff (Amicon), and then flash-frozen in liquid nitrogen and stored at −80 °C.

STD-NMR. Saturation transfer difference NMR (STD-NMR) experiments were performed as described elsewhere 76,77 . Briefly, 10 µM His-tagged Capsaspora macroH2A macrodomain, or murine macroH2A1.1 WT or mutant macrodomains, was dialyzed and prepared in a deuterated water buffer containing 16.2 mM Na2HPO4, 3.8 mM NaH2PO4, 1 mM tris(2-carboxyethyl)phosphine, pH 7.4 and 10 μM disuccinimidyl suberate. STD spectra of 1 mM ADP-ribose in the presence of 10 μM macroH2A macrodomains were obtained at 25 °C (298 K) on a Bruker Avance 600-MHz spectrometer equipped with a cryoprobe. A pseudo-two-dimensional version of the STD-NMR sequence was used for the interleaved acquisition of on-resonance and off-resonance spectra. The on-resonance frequency was set to 0.0 p.p.m. and the saturation time was 3 s. The STD effect (%) was quantified based on the following equation: $I_{\mathrm{STD}} = 100 \times (I_0 - I_{\mathrm{SAT}})/I_0$, where $I_{\mathrm{SAT}}$ and $I_0$ are the intensities of a given signal in the on-resonance and the off-resonance spectra, respectively.
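The STD quantification above amounts to a one-line calculation per signal. The sketch below applies the stated equation; the function name and the example intensities are hypothetical and serve only to illustrate the arithmetic.

```python
def std_effect_percent(i_off: float, i_on: float) -> float:
    """STD effect in percent: 100 * (I_0 - I_SAT) / I_0.

    i_off is the signal intensity in the off-resonance spectrum (I_0) and
    i_on the intensity of the same signal in the on-resonance spectrum (I_SAT).
    """
    if i_off == 0:
        raise ValueError("off-resonance intensity must be non-zero")
    return 100.0 * (i_off - i_on) / i_off


# Hypothetical intensities (arbitrary units) for one ADP-ribose proton signal.
print(std_effect_percent(i_off=1000.0, i_on=920.0))
```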
Isothermal titration calorimetry. Isothermal titration calorimetry (ITC) was performed as previously described 17 . Before the experiment, proteins were dialyzed overnight against 50 mM KH 2 PO 4 and 1 mM DTT, pH 6.5, at 4 °C. The dialyzed proteins were then centrifuged for 20 min at 20,000g at 4 °C, and the protein concentration was determined by absorbance measurements at 280-nm wavelength using calculated molar extinction coefficients. The nucleotides and ribose were prepared in the same buffer in the concentration range 1-1.5 mM. The concentration of ADP-ribose was additionally confirmed by absorbance measurements at 260 nm, using a molar extinction coefficient of 13,500 M −1 cm −1 . Assays were conducted on the PEAQ-ITC instrument (MicroCal) at 25 °C, and experimental data analysis was performed using MicroCal PEAQ-ITC Analysis Software.
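The absorbance-based concentration check follows the Beer-Lambert law, c = A/(ε·l). The sketch below uses the extinction coefficient quoted above; the 1-cm path length, the dilution factor and the absorbance reading are assumptions chosen for illustration.

```python
ADPR_EPSILON_260 = 13_500.0  # M^-1 cm^-1, molar extinction coefficient at 260 nm


def concentration_molar(
    absorbance: float,
    epsilon: float,
    path_cm: float = 1.0,   # assumed cuvette path length
    dilution: float = 1.0,  # fold dilution applied before the measurement
) -> float:
    """Beer-Lambert concentration of the undiluted stock in mol per litre."""
    if epsilon <= 0 or path_cm <= 0 or dilution <= 0:
        raise ValueError("epsilon, path length and dilution must be positive")
    return absorbance / (epsilon * path_cm) * dilution


# Hypothetical reading: A260 = 0.27 for a 50-fold diluted ADP-ribose stock.
stock_mm = concentration_molar(0.27, ADPR_EPSILON_260, dilution=50.0) * 1e3
print(f"{stock_mm:.2f} mM")
```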
Thermal shift assays. Fluorescence-based thermal shift assays were performed as previously described 18 . Briefly, 5 µM protein solutions supplemented with 8× SYPRO Orange were heated in 50 mM KH 2 PO 4 , pH 6.5 and 1 mM DTT from 5 °C to 95 °C at a ramp rate of 1%. The assays were conducted in MicroAmp Fast 96-well reaction plates sealed with MicroAmp Optical Adhesive Film (Applied Biosystems). The fluorescence measurements at 554 nm were normalized to the lowest value before the transition and the maximum fluorescence.
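One way to read the normalization described above is as a min-max scaling between the lowest reading before the unfolding transition and the fluorescence maximum. The sketch below implements that reading; taking the pre-transition baseline as the minimum of all readings up to the peak, and the example trace itself, are assumptions made for illustration.

```python
from typing import List, Sequence


def normalize_melt_curve(fluorescence: Sequence[float]) -> List[float]:
    """Scale a melt curve so that the lowest pre-peak reading maps to 0
    and the maximum fluorescence maps to 1."""
    if not fluorescence:
        raise ValueError("empty fluorescence trace")
    peak_index = max(range(len(fluorescence)), key=fluorescence.__getitem__)
    baseline = min(fluorescence[: peak_index + 1])
    span = fluorescence[peak_index] - baseline
    if span == 0:
        raise ValueError("trace has no transition before its maximum")
    return [(value - baseline) / span for value in fluorescence]


# Hypothetical raw readings across increasing temperature.
print(normalize_melt_curve([100.0, 90.0, 300.0, 500.0, 400.0]))
```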
Crystallization, data collection and processing. All crystallization experiments were conducted at the Crystallization Facility of the Max Planck Institute of Biochemistry (Martinsried, Germany). Before setting up the crystallization, the proteins were dialyzed overnight against 20 mM Bis-Tris, pH 7.0, at 4 °C. The dialyzed proteins were then centrifuged for 20 min at 20,000g and 4 °C, and the protein concentration was determined by absorbance measurements at 280 nm using calculated molar extinction coefficients.
Crystals of Capsaspora macroH2A macrodomain in apo-form were obtained in sitting drop vapor diffusion experiments performed at 20 °C by mixing 100 nl of 0.1 M Bis-Tris, pH 5.5 and 25% (w:v) poly(ethylene glycol) 3350 (PEG-3350) with 200 nl of a solution containing the protein at 27 mg ml−1. Crystals of Capsaspora macroH2A macrodomain in complex with ADP-ribose were obtained in sitting drop vapor diffusion experiments performed at 20 °C by mixing 100 nl of 0.2 M ammonium tartrate dibasic and 20% (w:v) PEG-3350 with 200 nl of a solution containing the protein at 21 mg ml−1 and 3.8 mM ADP-ribose. Crystals were cryoprotected by soaking in mother liquor supplemented with 30% ethylene glycol and flash cooled in liquid nitrogen. Diffraction data of proteins were collected on the Swiss Light Source or the in-house X-ray source of the Crystallization Facility of the Max Planck Institute of Biochemistry. All datasets were processed using XDS 78 .
The structures were solved by molecular replacement using the human macroH2A1.1 macrodomain (PDB accession no. 3IID (ref. 19 )) as a search model. Model building and real space refinement were performed in COOT 79,80 and the structures refined using PHENIX REFINE 81 . Model and restraints for ADP-ribose were prepared using Phenix.Elbow 82 . A summary of the data collection and refinement statistics is shown in Supplementary Table 3. The Ramachandran statistics for the final refined models were 97.86% favored and 2.14% allowed (apo, PDB accession no. 7NY6), and 96.55% favored and 3.45% allowed (ADP-ribose bound, PDB accession no. 7NY7). UCSF Chimera software 83 and the PyMOL Molecular Graphics System (Schrödinger, LLC) have been used for visualization.
The LigPlot diagram for the crystal structure of Capsaspora macroH2A macrodomain in complex with ADP-ribose was generated using the online platform LIGPLOT v.4.5.3 (ref. 84 ). The ConSurf bioinformatic tool 85 was used for the projection of evolutionary conservation scores. Briefly, the conservation of amino acid positions was calculated based on the phylogenetic relationships between sequences of 305 species with 1 macroH2A1.1-like isoform and using the Capsaspora apo-macroH2A macrodomain structure as a query. UCSF Chimera software was used for visualization 83 .
Antibodies. We generated specific antibodies against Capsaspora proteins by immunizing rabbits with purified His-tagged Capsaspora macroH2A macrodomain or carrier protein-coupled peptides of Capsaspora PARP1. Specifically, we used a mix of three different peptides corresponding to amino acids 103-114, 132-142 and 299-310 of Capsaspora PARP1 protein. Sera were collected from terminal bleeds after three to four rounds of inoculation. The obtained antibody sera were used at 1:150 dilutions. The animal procedures were carried out by the CID-CSIC Antibody Generation Service (Spain) and UVic-animal care facility (Canada).

Culture of human and Capsaspora cells.
Unless stated otherwise, the following conditions were used for cell culture. MacroH2A-depleted HepG2 (DKD) cells 32 and HEK293T (American Type Culture Collection (ATCC), catalog no. CRL-3216) were routinely cultured in Dulbecco's modified Eagle's medium (DMEM) containing 4.5 g l −1 of glucose (Gibco) supplemented with 10% v:v fetal bovine serum (Gibco), 2 mM l-glutamine (Gibco), 50 U ml −1 of penicillin (Gibco) and 50 mg ml −1 of streptomycin (Gibco). Cells were authenticated, incubated at 37 °C in 5% CO 2 , and periodically checked for Mycoplasma contamination. Cells were collected by scraping, washed with phosphate-buffered saline (PBS) and pelleted. C. owczarzaki was cultured, and cells from all three different stages were harvested as described 29 . Before analysis, Capsaspora cells were washed with PBS and flash-frozen in liquid nitrogen.
Gene transduction and establishment of stable cell lines. HEK293T cells were used as packaging cells to produce viral particles for lentiviral infections. Four million HEK293T cells were seeded in P10 plates and cultured to 60-70% confluency. At that point the cells were transfected with 10 μg of the lentiviral plasmid of interest and 3 μg of the pCMV-VSV-G plasmid and pCMV-dR8.91.
Plasmid DNA was mixed in 1× HBS solution (2× HBS: 272 mM NaCl, 2.8 mM Na 2 HPO 4 and 55 mM HEPES, pH 7) containing 125 mM CaCl 2 in a total volume of 800 μl. The mix was added on to the cell culture medium dropwise and left overnight. Transfection efficiency was controlled using a GFP expression vector. The supernatant containing viral particles produced by HEK293T cells was collected for 24 h at 48 h and 72 h after transfection, filtered using a 0.45-μm filter and supplemented with 8 μg ml −1 of polybrene (Sigma-Aldrich). The fresh viral supernatant was added to target cells at 60-70% confluency in 6-well plates that were centrifuged for 45 min at 1,200 r.p.m. and 37 °C, incubated at 37 °C for 45 min and then cultured overnight in fresh medium. The same process was repeated 24 h after the first infection. The transduced cells were selected with 2 μg ml −1 of puromycin. The necessary selection time was determined by using a negative control plasmid without resistance. This procedure was used to generate all HepG2 stable cell lines. The efficiency of the cell infection was validated by live cell fluorescence and flow cytometry.
Cell fractionation, immunoprecipitation and western blotting. For western blotting of total cell material, cell pellets were directly resuspended in Laemmli's sample buffer, sonicated using a Bioruptor Plus (Diagenode) and incubated at 95 °C for 10 min before loading the samples on polyacrylamide gels.
For the coimmunoprecipitation of PARP1, the lysis buffer was supplemented with poly(ADP-ribose) glycohydrolase (PARG) and PARP inhibitors (1 μM ADP-HPD from CalBioChem and 1 μM olaparib from SelleckChem, respectively). Insoluble material was removed by centrifugation, and lysates were precleared with Sepharose beads. At this step, 5% of the total lysate was kept as input material, and the rest of the lysate was incubated for 3 h with anti-GFP nanobodies coupled to magnetic beads (ChromoTek), which were previously blocked with 1% bovine serum albumin in lysis buffer. Precipitates were washed 3× with lysis buffer containing 1% Triton X-100. For SDS-PAGE and western blotting analysis, typically 1% input and 20% immunoprecipitated material were loaded.
For cell fractionations, nuclei were prepared as described above and the supernatant was kept as the cytosolic fraction. Nuclei were then incubated with high-salt buffer (20 mM HEPES, pH 7.9, 410 mM KCl, 1.5 mM MgCl 2 , 0.2 mM EDTA, 25% glycerol and 0.5% NP-40) for 30 min. Ultracentrifugation at 50,000g was used to separate the chromatin (pellet) and nucleosol (supernatant).
After the transfer of proteins to the nitrocellulose membrane (GE Healthcare), the membranes were blocked with 5% low-fat milk (Nestle) and incubated with primary antibodies overnight at 4 °C. The next day, membranes were washed with Tris-buffered saline and Tween 20 (TBST) and incubated with fluorophore-conjugated secondary antibodies for 1 h at 25 °C in the dark. After washing with TBST, the dried membranes were scanned with an Odyssey CLx Imager and analyzed using ImageStudioLite quantification software (LI-COR Biosciences).
RNA and DNA analysis. Total RNA from mammalian cells was isolated by the Maxwell RSC simplyRNA Cells Kit (Promega) using the Maxwell RSC Instrument (Promega) according to the manufacturer's instructions. Total RNA, 1 μg, was used for cDNA synthesis using the First Strand cDNA synthesis kit (Thermo Fisher Scientific) and oligo(dT) primers according to the manufacturer's instructions. Relative cDNA levels were quantified by reverse-transcription quantitative PCR (RT-qPCR) (LightCycler 480 II instrument, Roche). All samples were analyzed in technical triplicates. Values were normalized to two reference housekeeping genes (RPLP0 and GAPDH) and plotted relative to a reference sample set to 1. To measure mitochondrial and genomic DNA, we extracted the total DNA from all cell lines of interest. Briefly, cells were pelleted and the DNA isolation buffer (10 mM Tris-HCl, pH 8.5, 5 mM EDTA, 0.5% SDS, 200 mM NaCl and 0.1 mg ml−1 of proteinase K) was added directly to the pellets, and the samples were incubated overnight at 37 °C while shaking (Thermoblock). Proteinase K was inactivated by incubating the samples for 10 min at 99 °C. An equal volume of isopropanol was added to the lysates and left incubating for 20 min at 25 °C under constant shaking to precipitate the DNA. The precipitated DNA was pelleted by centrifuging at 10,000 r.p.m. for 10 min at 4 °C. The supernatant was removed and the precipitate washed with ice-cold 70% ethanol and centrifuged at 10,000 r.p.m. for 10 min at 4 °C. After removing the ethanol, the pellet was air-dried and resuspended in an appropriate volume of DNase-free water. The obtained DNA was used to perform qPCR with oligos of mitochondrial (MT-TL1, MT-ND2) and genomic DNA (ACTB, NCOA3). Results are presented as a mitochondrial:genomic DNA ratio. The sequences of all primers used are given in Supplementary Table 7.
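The text does not spell out how the mitochondrial:genomic DNA ratio was derived from the qPCR readout. A common approach, shown here purely as an assumption, is to average the Ct values of the amplicons within each class and convert the difference into a fold ratio with 2^ΔCt, which presumes 100% amplification efficiency. The function names and Ct values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def mito_to_genomic_ratio(ct_mito: Sequence[float], ct_genomic: Sequence[float]) -> float:
    """Mitochondrial:genomic DNA ratio as 2 ** (mean genomic Ct - mean mito Ct).

    Assumes perfect doubling per cycle; averaging Ct values corresponds to
    taking the geometric mean of the underlying template amounts.
    """
    return 2.0 ** (mean(ct_genomic) - mean(ct_mito))


def relative_to_reference(sample_ratio: float, reference_ratio: float) -> float:
    """Express a ratio relative to a reference sample set to 1."""
    return sample_ratio / reference_ratio


# Hypothetical Ct values: two mitochondrial and two genomic amplicons per sample.
control = mito_to_genomic_ratio(ct_mito=[18.0, 18.0], ct_genomic=[26.0, 26.0])
treated = mito_to_genomic_ratio(ct_mito=[17.0, 17.0], ct_genomic=[26.0, 26.0])
print(control, relative_to_reference(treated, control))
```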
Analysis of mitochondrial oxidative phosphorylation. Mitochondrial respiration was monitored with the XFe-96 Cell Bioanalyzer (Seahorse Biosciences). Optimal cell density and drug concentrations had been previously determined 11 . A standard MitoStress assay was performed. Briefly, 20,000 cells were plated in an XFe-96-well plate, and cells were kept for 6 h in DMEM-10% fetal bovine serum to allow the cells to attach. Then, the medium was changed to 10 mM glucose, 2 mM glutamine and 1 mM pyruvate XFe DMEM (5 mM HEPES, pH 7.4), and cells were incubated for 1 h at 37 °C without CO2. Three different modulators of mitochondrial respiration were sequentially injected. After determination of the basal oxygen consumption rate, 1.5 μM oligomycin, which inhibits ATP synthase, was injected to determine the amount of oxygen consumption dedicated to ATP production by mitochondria. To determine the maximal respiration rate or spare respiratory capacity, 1.5 μM carbonyl cyanide-4-(trifluoromethoxy)phenylhydrazone (FCCP) was injected to dissipate the H+ gradient across the inner mitochondrial membrane, and thus activate maximal respiration. Finally, 0.75 μM antimycin A and 0.75 μM rotenone were added to completely inhibit the mitochondrial respiration.
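The injection sequence above yields four oxygen consumption rate (OCR) plateaus, from which the usual respiration parameters follow by subtraction. The sketch below uses the conventional MitoStress definitions; the function name and the OCR values are hypothetical, and how replicate measurements within each phase are summarized is left to the reader.

```python
from typing import Dict


def mito_stress_parameters(
    baseline: float,          # OCR before any injection
    after_oligomycin: float,  # OCR after ATP synthase inhibition
    after_fccp: float,        # OCR after uncoupling
    after_rot_aa: float,      # OCR after rotenone and antimycin A
) -> Dict[str, float]:
    """Derive respiration parameters from the four OCR plateaus."""
    non_mitochondrial = after_rot_aa
    basal = baseline - non_mitochondrial
    maximal = after_fccp - non_mitochondrial
    return {
        "non_mitochondrial": non_mitochondrial,
        "basal_respiration": basal,
        "atp_linked_respiration": baseline - after_oligomycin,
        "proton_leak": after_oligomycin - non_mitochondrial,
        "maximal_respiration": maximal,
        "spare_respiratory_capacity": maximal - basal,
    }


# Hypothetical OCR plateaus (pmol O2 per min).
for name, value in mito_stress_parameters(100.0, 40.0, 200.0, 10.0).items():
    print(f"{name}: {value}")
```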
Statistical analysis and figure editing. In all bar plots, the height of the bar corresponds to the mean value and the error bars indicate the s.d. In all box plots, the box signifies the upper (75th) and lower (25th) quartiles, the median is represented by a horizontal line within the box and the mean by a rhombus within the box. The upper whisker extends from the upper hinge to the largest value no further than 1.5× the interquartile range (IQR) from the hinge (the IQR is the distance between the first and third quartiles). The lower whisker extends from the lower hinge to the smallest value no further than 1.5× the IQR from the hinge. The statistical test and comparison used to calculate P values, as well as the significance level, are reported in each figure and/or figure legend. If not indicated otherwise, a two-tailed Student's t-test was used to assess statistical significance. The number of technical replicates or independent cell culture experiments is indicated in the relevant figure legend(s). Figures were edited using Inkscape (inkscape.org).
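The box plot conventions described above can be reproduced with a few lines of standard-library Python. The sketch is illustrative only: the quartile interpolation method (`inclusive`) is an assumption, because the plotting software and its quantile definition are not stated, and the data are hypothetical.

```python
from statistics import mean, median, quantiles
from typing import Dict, Sequence


def box_plot_summary(values: Sequence[float]) -> Dict[str, float]:
    """Quartiles, median, mean and 1.5x IQR whisker ends for one group."""
    data = sorted(values)
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers end at the most extreme data points within 1.5x IQR of the hinges.
    upper_whisker = max(v for v in data if v <= q3 + 1.5 * iqr)
    lower_whisker = min(v for v in data if v >= q1 - 1.5 * iqr)
    return {
        "q1": q1,
        "median": median(data),
        "q3": q3,
        "mean": mean(data),
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
    }


# Hypothetical measurements with one value beyond the upper whisker.
print(box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30]))
```

In this example the value 30 lies more than 1.5× the IQR above the upper hinge, so the upper whisker stops at 9 and 30 would be drawn as an individual point.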
Reporting Summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The reported protein structures are deposited in the Protein Data Bank with PDB accession nos. 7NY6 (unliganded Capsaspora macroH2A macrodomain) and 7NY7 (ADP-ribose bound Capsaspora macroH2A macrodomain). Source data are provided with this paper.

Code availability
We have exclusively used publicly available packages for bioinformatic analysis and provide their references in Methods. If not stated otherwise, we have used default parameters. Specific scripts are available on request.

Fig. 2 | The apo and ADPR-bound structures of the Capsaspora macrodomain (accompanying Fig. 2). a, Topology diagram of apo-form of Capsaspora mH2A macrodomain. The topology diagram shows β-sheets in a conserved 1276354 order, surrounded by 6 α-helices. b, Topology diagram of Capsaspora mH2A macrodomain in complex with ADP-ribose (ADPR). The topology diagram shows β-sheets in a conserved 1276354 order, surrounded by 5 α-helices. The C-terminal α-helix observed in the apo-structure (a) was not detected in the ADPR-bound structure. This prevented the modelling of the C-terminal 13 amino acids, while the apo structure could be fully refined all the way to the C-terminus of the macrodomain. Whether this is the result of a conformational change upon ligand binding or an artefact from the crystal packing remains to be determined. However, we previously observed that ADP-ribose binding induces a conformational change in the human macroH2A1.1 macrodomain, specifically by a 30° rotation of the C-terminal α-helix away from the globular macrodomain fold (Timinszky et al., 2009, NSMB). c, The schematic representation of ADPR interaction with Capsaspora macroH2A macrodomain. All non-covalent interactions between ADPR and the macrodomain are summarized in the LigPlot diagram. The ADPR ligand is represented by a thick purple line and the amino acid residues of Capsaspora macroH2A macrodomain by thin orange lines. Hydrogen bonds are represented by the dashed lines between atoms involved, while the circles or semicircles with radiating lines represent atoms or residues involved in hydrophobic contacts between protein and ligand.