A proteomics approach to decipher the molecular nature of planarian stem cells
Departament de Genètica and Institute of Biomedicine (IBUB), Universitat de Barcelona, Av. Diagonal 645, 08028, Barcelona, Catalonia, Spain
Abstract
Background
In recent years, planaria have emerged as an important model system for research into stem cells and regeneration. Attention is focused on their unique stem cells, the neoblasts, which can differentiate into any cell type present in the adult organism. Sequencing of the
Results
We developed a proteomic strategy to identify neoblast-specific proteins. Here we describe the method and discuss the results in comparison to the genomic high-throughput analyses carried out in planaria and to proteomic studies using other stem cell systems. We also show functional data for some of the candidate genes selected in our proteomic approach.
Conclusions
We have developed an accurate and reliable mass spectrometry-based proteomics approach that complements previous genomic studies and supports a more accurate description of the molecular and cellular processes related to the neoblasts.
Background
As we move further into the post-genomic era, it becomes increasingly clear that DNA sequence data alone are insufficient to explain complex cellular and molecular processes. Although the enormous volume of data generated by genome sequencing projects, expressed sequence tags (ESTs), and cDNA analyses has improved our understanding of many processes, these data often fail to reflect the influence of posttranscriptional modifications and protein interactions, and they do not offer a true reflection of protein levels or activity. Consequently, the role of specific proteins is relatively difficult to determine with confidence on the basis of mRNA expression or genomic data alone.
Proteomic approaches offer a more realistic description of protein function and its influence on cell dynamics. Although comparative analysis of phenotypically different biological samples, such as in diseased versus healthy tissue
Planarians, an emerging model system for the investigation of stem cell and regenerative biology,
Figure 1
Neoblast depletion by irradiation and image of a neoblast shown by electron microscopy. Immunostaining with anti-phosphorylated histone H3 (αH3P), labelling mitotic neoblasts in 3-day head-regenerating organisms: A, control; B, 3 days after 75 Gy irradiation; and C, 14 days after 75 Gy irradiation. Whereas a high number of proliferating cells appear next to the blastema in control animals and some mitotic cells still remain 3 days after irradiation, no divisions are detected after 14 days, showing that neoblasts are completely eliminated by that time. D, Electron microscopy image of a neoblast cell. Cytoplasm (dim yellow) and nucleus (yellow) are highlighted for clarity. The red arrow indicates a chromatoid body. Scale bars: A-C = 0.5 mm, D = 3 μm.
Results
Establishment of the planarian proteomic approach
Different methods were tested to achieve a consistent and reproducible pattern on two-dimensional (2D) gels. To optimize sample preparation, proteins were extracted from dissociated cells or from whole animals. The yield from dissociated cells was insufficient to establish an efficient 2D procedure. Furthermore, the reproducibility of the 2D gel pattern was poor (data not shown). Prior to extraction from whole animals, a short treatment with 2% cysteine chloride in planarian water was used to eliminate mucus production, which is known to interfere with molecular techniques
Figure 2
Two-dimensional gels used for the selection of differential spots. The proteomic approach shown compares the protein profile of a sample containing neoblast cells with one in which these cells have been depleted by irradiation. Upper panels show a comparison between two silver-stained 2D gels of a whole proteome from wild type and irradiated animals. Spots not present in the proteome of irradiated planarians are shown and lettered in red. These spots were selected and analyzed by mass spectrometry. Bottom panels show DIGE comparison of irradiated and wild type planarian proteomes. Spots that increase or decrease in the irradiated planarian proteome are shown in red and blue, respectively. These spots were included in the mass spectrometry analyses.
Additional file 1
Details on Material and Methods. An extended description of the proteomics protocols applied to perform the analyses presented in this paper.
Table 1
Variables taken into account for the establishment of the planarian proteomic protocol using 2D gels.
Samples: whole planarian extracts; dissociated cell extracts; dissociated cell and sub-fractionated extracts.
Extraction buffers: SDS; urea/thiourea.
Sample processing (precipitation procedure): Amersham 2D clean-up kit; acetone; TCA-acetone.
Isoelectric focusing (1st dimension; Immobiline DryStrip gels, 24 cm): linear pH 4-7; linear pH 7-11; non-linear pH 3-11.
Other modifications: trypsin inhibitors; general protease inhibitors; sonication.
All the variables affecting protein sample production and 2D gel electrophoresis are listed in this table.
Proteomic data
In order to identify proteins specifically expressed in neoblasts, we compared 2D patterns of two samples: wild type (WT) versus irradiated animals (IA). This method has been extensively used to study the effects of neoblast depletion
Additional file 2
Image scans of all silver-stained 2D gel replicates. Image scans of different and independent silver-stained 2D gels used in the study. A to D and the respective zooms, for the regions delimited by red squares, I to L, come from 100 μg of loaded samples. E to H and the respective zooms M to P correspond to 500 μg loaded samples. A, C, E and G are control samples. B, D, F and H are irradiated samples. Although the staining and running conditions were not exactly equivalent, the spot pattern is reproducible across all the gels, which is most evident in the zoomed regions.
Table 2
Spot counts for the 2D gels.
Columns: Semi-Automatic Procedure (Irradiated; Wild type); Final Selected Spots.
100-SIL: 1182 ± 43.13; 901 ± 77.07; 26
500-SIL: 1931 ± 92.63; 1413 ± 81.31
500-DIGE: 2445; 58
Summarized data are shown for the 2D gel analyses. ImageMaster 2D software (from Amersham Biosciences) was used to analyze the scanned gels. SIL, silver staining; 100-SIL, 100 μg of total protein extract loaded on the gel; 500-SIL, 500 μg of total protein extract loaded on the gel; DIGE, difference in gel electrophoresis.
Computational analyses
MASCOT
Figure 3
Computational screening of protein candidates. Spectra fingerprints were analyzed by MASCOT, comparing the experimental peaks against those obtained
MASCOT predicted a total of 44,712 and 36,956 peptides for the forward and decoy databases, respectively, and these were mapped to 8,300 unique ORFs (URFs), corresponding to 23,376 and 26,741 unique peptide sequences. When the same peptide was mapped to two or more URFs, the highest score was retrieved. Figure
Figure 4
Selection of candidate peptides by decoy score threshold. Upper panels: histograms showing the distribution of the peptide scores (the maximum score was chosen when a peptide was mapped more than once to different open reading frames). Lower panels: scatter-plots comparing those peptide scores with the information content, in bits. Above a bit score of 2.5 (orange line), the peptide sequences can be considered of low complexity or repetitive. Decoy score threshold is depicted on all the panels as a vertical blue line, set at a score of 55 for our data.
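The selection summarized in Figure 4 can be illustrated with a short Python sketch. It is not the pipeline used in the study: the function names, the tuple layout and the toy peptide hits are hypothetical, and the information content is assumed to be log2(20) minus the Shannon entropy of the residue composition, so that repetitive peptides score high. Only the score threshold of 55 and the 2.5-bit cutoff are taken from the legend. Whether low-complexity peptides were actually discarded is not stated here; the sketch filters them merely to show how the two axes of the scatter-plots could be combined.

```python
import math
from collections import Counter

SCORE_THRESHOLD = 55.0   # decoy score threshold quoted in the Figure 4 legend
INFO_CUTOFF_BITS = 2.5   # above this value a peptide is treated as low complexity


def best_score_per_peptide(hits):
    """Keep the highest score for each peptide mapped to several ORFs.

    `hits` is an iterable of (peptide, orf_id, score) tuples (hypothetical layout).
    """
    best = {}
    for peptide, _orf_id, score in hits:
        if peptide not in best or score > best[peptide]:
            best[peptide] = score
    return best


def information_content(peptide):
    """Assumed definition: log2(20) minus the Shannon entropy of residue usage."""
    counts = Counter(peptide)
    total = len(peptide)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(20) - entropy


def select_candidates(hits):
    """Return peptides above the score threshold that are not low complexity."""
    best = best_score_per_peptide(hits)
    return sorted(
        peptide
        for peptide, score in best.items()
        if score > SCORE_THRESHOLD
        and information_content(peptide) <= INFO_CUTOFF_BITS
    )


# Toy data, invented for illustration only.
example_hits = [
    ("LVEAGKDSTR", "urf_1", 40.0),
    ("LVEAGKDSTR", "urf_2", 71.5),  # same peptide: the higher score is kept
    ("AAAAAAAAAA", "urf_3", 90.0),  # repetitive: removed by the bit cutoff
    ("MKTWQELYR", "urf_4", 30.0),   # below the score threshold
]
print(select_candidates(example_hits))
```

With these invented hits only the first peptide survives, because its best score exceeds the threshold and its ten distinct residues give a low information content under the assumed definition.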
The sequences of all the URFs for the forward database were uploaded into the BLAST2GO software suite
Figure 5
Functional distribution of the hits based on GO annotation. BLAST2GO multilevel ontology classification by molecular biological process over the candidate unique open reading frame sequences. Further details on the functional classes are provided in the Results section.
After GO assignment and the corresponding functional annotation of the sequences derived from our approach, enzyme codes were mapped by BLAST2GO where possible. These codes made it possible to retrieve the KEGG pathways in which each protein may act in planarian molecular biology. However, fewer than one third of the sequences had a homologous gene/protein BLAST hit (especially for the URFs dataset), and of those, many had a GO functional assignment. A fraction of the sequences with at least one GO hit was linked to an enzyme code, which relates them to a component of the KEGG pathways: 1,670 of 2,804 clusters, mapping to 118 pathways, and 131 of 5,528 clusters, mapping to 35 pathways, for MASCOT results on RefSeq and URFs, respectively. All 35 pathways for URFs were also found using the RefSeq dataset. The lower ratio for the URFs set can be explained by species-specific sequences, proteins or functions that are not yet annotated in the reference databases. In total, 297 RefSeq clustered sequences matched 171 enzyme codes for proteins distributed across the 118 pathways, and 16 URFs clustered sequences matched 9 enzyme codes for proteins distributed across the 35 pathways. An enzyme can appear in several pathways: because of the hierarchical structure of KEGG, a match can be found both in a general route, such as "Metabolic pathway", and in a more specific process, such as "Glycolysis/Gluconeogenesis". Among the pathways found, metabolic routes for sugars and lipids were expected, as energy is required for cellular processes, regeneration among them. Nevertheless, a few candidate sequences deserve further analysis, as they appear in pathways related to development and regeneration: "Selenoamino acid metabolism", "Retinol metabolism in animals", and "mTOR signaling pathway". Additional data, including figures of all those pathways with color-highlighted boxes for the proteins found, are available on the planarian proteomics web page.
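To make the difference between the two datasets concrete, the cluster counts quoted in the preceding paragraph can be turned into proportions. The short Python sketch below only restates those figures; the dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
# Cluster counts quoted in the text for the MASCOT results on each dataset:
# "subset" of "total" clusters, and the number of KEGG pathways they map to.
mapping = {
    "RefSeq": {"subset": 1670, "total": 2804, "pathways": 118},
    "URFs": {"subset": 131, "total": 5528, "pathways": 35},
}


def subset_fraction(entry):
    """Proportion of clusters in the reported subset."""
    return entry["subset"] / entry["total"]


for name, entry in mapping.items():
    print(f"{name}: {subset_fraction(entry):.1%} of clusters, "
          f"{entry['pathways']} pathways")
```

The proportions come out at roughly 59.6% for RefSeq and 2.4% for URFs, which is the "lower ratio for the URFs set" discussed in the paragraph.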
Gene profile
As depicted in Figure
Additional file 3
Comparing the results presented in this manuscript with previously published studies relating to stem cells. Comparison of candidate neoblast protein sequences presented in this paper with genes reported in other proteomic studies to be related to stem cells.
Functional studies
We performed functional analyses on some candidates from our lists to further assess the quality and accuracy of the approach used. Candidates were selected from the RefSeq and the URFs from the traces (see Table
Table 3
Summary of BLAST hits found for the analyzed candidate sequences
Columns: ACCESSION NUMBER; BLAST HOMOLOGY; E-VALUE
RefSeq candidate sequences
Rab-11B, member RAS oncogene family; 1e-79
Rab-39, Ras-related protein; 1e-23
Rac-1, ras-related; 3e-90
Hsp40 (DnaJ); 7e-18
Hsp60; 3e-103
Hsp70 (Mortalin-like protein); 0.0
Hunchback-like; 1e-50
PrkC (cAMP-dependent protein kinase); 2e-57
4e-38
URFs candidate sequences
Chaperonin containing TCP1 theta subunit; 1e-51
Splicing factor 3b subunit 1; 6e-109
TNF receptor associated factor; 3e-25
Similar to pol polyprotein; 2e-32
---
Lectin-like; 4e-28
BLAST homologies to both RefSeq and URFs candidate sequences are shown. Candidate sequences coming from MASCOT predictions over the RefSeq database were mapped onto the genome draft of
To assess the relationship between these genes and the neoblasts, we analyzed their expression patterns and RNAi phenotypes (Figure
Figure 6
Functional analyses of candidate genes from the RefSeq database. Expression profiles and RNAi phenotypes are shown for a set of selected genes. A, Rab-11B; B, Rab-39; C, Rac-1; D, Hsp40; E, Hsp60; F, Hsp70; G, Hunchback-like; H, PrkC; I,
Since neoblasts are known to be the only source of cells for homeostasis and regeneration, the relationship between the selected genes and the neoblasts was validated by RNAi experiments
In a second screen to validate candidate URFs from the traces, the expression of some of these genes was analyzed by comparing intact and irradiated organisms. Whole-mount in situ hybridization in intact adult organisms revealed parenchymal expression consistent with a neoblast distribution, whereas this expression pattern was not present in irradiated animals (Figure
Figure 7
Expression patterns of candidate genes from the
Discussion
The results of this study show that we have successfully developed a rapid and reliable method for 2D analysis of planarian protein samples (Figure
Proteomic studies can help to fill gaps in the annotation of the planarian genome. Despite the large number of entries already submitted, sequence databases such as NCBI
The use of ORF sequences in whole genomes without prior knowledge of where the genes, mainly the exons, are located presents a number of issues that can distort the measures used to discriminate between true and false peptide hits. These include the ratio of coding to non-coding sequences, which can be quite low (around 2% of coding regions for the human genome
Galindo et al.
Identification of proteins
Apart from the presence of metabolic proteins that indicate the high metabolic rate of neoblasts, several of the proteins detected in this analysis seem to be good candidates to be involved in neoblast-related functions, and thus in regeneration and tissue homeostasis. One of those,
An initial proteomic picture of the neoblasts
The genes identified in this study represent the first list of neoblast-related candidate genes identified using a proteomic approach in planarias (Table
Additional file 4
Table of peptide candidates. Listing of the sequence candidates obtained from the computational analysis of the raw proteomics data over the RefSeq and URF datasets (see the corresponding sheet on the spreadsheet file). Only those with a significant BLAST hit are shown (using BLASTP against NCBI-nr, min e-value = 0.001, min hsp length = 25). Genes described in detail in Table
Conclusions
We have developed a proteomic approach to characterize specific planarian stem-cell (neoblast) proteins. An accurate and reproducible method for protein purification, 2D gel electrophoresis and MS analysis was defined and an ORF database of species-specific genomic DNA was developed for peptide assignment of the retrieved MS spectra. Subsequent computational analyses yielded a list of annotated candidate proteins, some of which were functionally validated as neoblast-specific genes by RNAi and whole-mount in situ hybridization. Substantial overlap was observed between the candidate genes identified in our study and those reported from previous analyses of embryonic stem cells, thus validating the specificity of the approach. In addition, we detected novel sequence candidates and expression changes that merit further investigation in future studies to determine their role in stem-cell biology.
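As an illustration of how an ORF database might be derived from genomic DNA without prior gene annotation, the following Python sketch translates a sequence in all six reading frames and keeps the unique stop-to-stop stretches. It is a minimal sketch under stated assumptions and not the procedure used in this work: the standard genetic code, the stop-to-stop definition of an ORF, the `min_length` parameter and the example sequence are all assumptions introduced here.

```python
from itertools import product

# Standard genetic code (assumed), built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse-complement strand."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_orfs(dna, min_length=25):
    """Collect unique stop-to-stop peptide stretches from all six frames.

    `min_length` (in residues) is a hypothetical cutoff, not a value
    taken from the study.
    """
    dna = dna.upper()
    orfs = set()
    for strand in (dna, reverse_complement(dna)):
        for frame in range(3):
            for segment in translate(strand[frame:]).split("*"):
                if len(segment) >= min_length:
                    orfs.add(segment)
    return sorted(orfs)


# Invented example sequence; a real run would read genomic traces instead.
print(six_frame_orfs("ATGGCCAAGTTTTAAGGC", min_length=4))
```

Collecting the segments in a set is one simple way to keep each translated stretch only once; how redundancy was actually resolved for the URFs is not described in this section.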
Methods
Sequences
The genome of
Irradiation
Intact asexual planarians were irradiated at 75 Gy (1.66 Gy/minute) with a Gammacell 1000 [Atomic Energy of Canada Limited]
Sample preparation
Protein samples were obtained from whole animals using a lysis buffer and heating. See Additional File
Running 2D gels
First-dimension isoelectric focusing was performed on immobilized pH gradient strips (24 cm, pH 3-10) using an Ettan IPGphor system. Second-dimension SDS-PAGE was performed by laying the strips on 12.5% isocratic Laemmli gels (24 × 20 cm) cast in low-fluorescence glass plates on an Ettan DALT system. Details of the procedure are available in the Additional File
Sample analysis
Gel spots were extracted and digested before analysis by MS. Then, MASCOT software (Matrix Science, London, UK) was used to search those spectra on different databases. All spectra were processed by PRIDE Converter software
Gene Cloning
Gene identifiers and corresponding forward/reverse primers (including nested primers).
In situ hybridization
Digoxigenin-labeled RNA probes were prepared using an in vitro labeling kit (Roche). Whole-mount in situ hybridization was performed as described by Agata et al
RNA interference
Double-stranded RNAs (dsRNA) were produced by
Abbreviations
EST: expressed sequence tags; MS: mass spectrometry; CB: chromatoid bodies; 2D gel: two-dimensional gel; DIGE: difference in gel electrophoresis; cm: centimeters; Ip: Isoelectric point; MW: Molecular weight; WT: wild type; IA: irradiated animals; H3P: phosphorylated histone H3; ORF: open reading frame; URF: unique ORF; NCBI-nr: NCBI non-redundant (database); WUSTL: Washington University in Saint Louis; hsp: high-scoring segment pair (BLAST); GO: Gene Ontology; EC: Enzyme Code (KEGG); ES: embryonic stem cells; HSP/Hsp: heat shock protein; kDa: kilodalton; RNAi: RNA interference; CNS: central nervous system; Gy: grays; dsRNA: double-stranded RNA.
Authors' contributions
EFT, ES and JFA conceived of the study. EFT ran the 2D gels and counted the spots. JFA performed the computational analyses, compiled the sequence databases, processed the MASCOT results, and ran the GO functional and KEGG annotation. EFT ran the MASCOT searches and produced the initial BLAST annotation for RefSeq candidates. EFT and GRE performed the experimental validation of the selected protein candidates. All authors participated in its design and coordination, helped to draft the manuscript, and read and approved the final manuscript.
Acknowledgements
Genomic sequence data was produced by the Washington University Genome Sequencing Center in St. Louis, although trace sequences to generate the URFs database were downloaded from NCBI Trace server. We would like to thank Dr. Roger Florensa for his help in the protein sample preparation and setting up the 2D-gel running conditions, and Dr. Eliandre Oliveira and all members of the proteomic facility at the Parc Científic de Barcelona for their help in the proteomic work and analyses. We thank all members of the Saló group for advice and critical reading of the manuscript and Dr. Iain Patten for editorial advice. We are also grateful to the reviewers of the earlier version of the manuscript for their helpful comments. This work was supported by grants BFU-2005-00422 and BFU2008-01544/BMC from the Ministerio de Educación y Ciencia, Spain, and grant 2009SGR1018 from AGAUR (Generalitat de Catalunya, Spain). JFA started this project as a Juan de la Cierva post-doctoral fellow. E.F.T. and G.R.E. received an FPI fellowship from the Ministerio de Ciencia y Cultura.