MultiDataSet: an R package for encapsulating multiple data sets with application to omic data integration
| dc.contributor.author | Hernández Ferrer, Carles | |
| dc.contributor.author | Ruiz Arenas, Carlos | |
| dc.contributor.author | Beltran Gomila, Alba | |
| dc.contributor.author | González, Juan Ramón | |
| dc.date.accessioned | 2017-02-03T10:28:05Z | |
| dc.date.available | 2017-02-03T10:28:05Z | |
| dc.date.issued | 2017-01-17 | |
| dc.date.updated | 2017-02-01T19:00:47Z | |
| dc.description.abstract | BACKGROUND: Reduction in the cost of genomic assays has generated large amounts of biomedical-related data. As a result, current studies perform multiple experiments in the same subjects. While Bioconductor's methods and classes implemented in different packages manage individual experiments, there is no standard class to properly manage different omic datasets from the same subjects. In addition, most R/Bioconductor packages that have been designed to integrate and visualize biological data often use basic data structures with no clear general methods, such as subsetting or selecting samples. RESULTS: To cover this need, we have developed MultiDataSet, a new R class based on Bioconductor standards, designed to encapsulate multiple data sets. MultiDataSet deals with the usual difficulties of managing multiple and non-complete data sets while offering a simple and general way of subsetting features and selecting samples. We illustrate the use of MultiDataSet in three common situations: 1) performing integration analysis with third-party packages; 2) creating new methods and functions for omic data integration; 3) encapsulating new unimplemented data from any biological experiment. CONCLUSIONS: MultiDataSet is a suitable class for data integration within the R and Bioconductor framework. | |
| dc.format.extent | 7 p. | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.issn | 1471-2105 | |
| dc.identifier.pmid | 28095799 | |
| dc.identifier.uri | https://hdl.handle.net/2445/106469 | |
| dc.language.iso | eng | |
| dc.publisher | BioMed Central | |
| dc.relation.isformatof | Reproduction of the document published at: http://dx.doi.org/10.1186/s12859-016-1455-1 | |
| dc.relation.ispartof | BMC Bioinformatics, 2017, vol. 18, num. 36 | |
| dc.relation.projectID | info:eu-repo/grantAgreement/EC/FP7/308333/EU//HELIX | |
| dc.relation.uri | http://dx.doi.org/10.1186/s12859-016-1455-1 | |
| dc.rights | (c) Hernández Ferrer et al., 2017 | |
| dc.rights.accessRights | info:eu-repo/semantics/openAccess | |
| dc.source | Articles published in journals (ISGlobal) | |
| dc.subject.classification | Processament de dades | |
| dc.subject.classification | Dades massives | |
| dc.subject.other | Data processing | |
| dc.subject.other | Big data | |
| dc.title | MultiDataSet: an R package for encapsulating multiple data sets with application to omic data integration | |
| dc.type | info:eu-repo/semantics/article | |
| dc.type | info:eu-repo/semantics/publishedVersion | |
Files
Original bundle
- Name: hernandez-ferrer2017_2366.pdf
- Size: 614.76 KB
- Format: Adobe Portable Document Format