MultiDataSet: an R package for encapsulating multiple data sets with application to omic data integration

dc.contributor.author: Hernández Ferrer, Carles
dc.contributor.author: Ruiz Arenas, Carlos
dc.contributor.author: Beltran Gomila, Alba
dc.contributor.author: González, Juan Ramón
dc.date.accessioned: 2017-02-03T10:28:05Z
dc.date.available: 2017-02-03T10:28:05Z
dc.date.issued: 2017-01-17
dc.date.updated: 2017-02-01T19:00:47Z
dc.description.abstract: BACKGROUND: Reduction in the cost of genomic assays has generated large amounts of biomedical data. As a result, current studies perform multiple experiments on the same subjects. While Bioconductor's methods and classes, implemented in different packages, manage individual experiments, there is no standard class to properly manage different omic datasets from the same subjects. In addition, most R/Bioconductor packages designed to integrate and visualize biological data use basic data structures that lack clear general methods, such as subsetting or selecting samples. RESULTS: To cover this need, we have developed MultiDataSet, a new R class based on Bioconductor standards, designed to encapsulate multiple data sets. MultiDataSet deals with the usual difficulties of managing multiple and non-complete data sets while offering a simple and general way of subsetting features and selecting samples. We illustrate the use of MultiDataSet in three common situations: 1) performing integration analysis with third-party packages; 2) creating new methods and functions for omic data integration; 3) encapsulating new unimplemented data from any biological experiment. CONCLUSIONS: MultiDataSet is a suitable class for data integration under the R and Bioconductor framework.
dc.format.extent: 7 p.
dc.format.mimetype: application/pdf
dc.identifier.issn: 1471-2105
dc.identifier.pmid: 28095799
dc.identifier.uri: https://hdl.handle.net/2445/106469
dc.language.iso: eng
dc.publisher: BioMed Central
dc.relation.isformatof: Reproduction of the document published at: http://dx.doi.org/10.1186/s12859-016-1455-1
dc.relation.ispartof: BMC Bioinformatics, 2017, vol. 18, num. 36
dc.relation.projectID: info:eu-repo/grantAgreement/EC/FP7/308333/EU//HELIX
dc.relation.uri: http://dx.doi.org/10.1186/s12859-016-1455-1
dc.rights: (c) Hernández Ferrer et al., 2017
dc.rights.accessRights: info:eu-repo/semantics/openAccess
dc.source: Articles published in journals (ISGlobal)
dc.subject.classification: Processament de dades
dc.subject.classification: Dades massives
dc.subject.other: Data processing
dc.subject.other: Big data
dc.title: MultiDataSet: an R package for encapsulating multiple data sets with application to omic data integration
dc.type: info:eu-repo/semantics/article
dc.type: info:eu-repo/semantics/publishedVersion

Files

Original bundle

Name: hernandez-ferrer2017_2366.pdf
Size: 614.76 KB
Format: Adobe Portable Document Format