Please use this identifier to cite or link to this item: http://hdl.handle.net/2445/200060
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Escalera Guerrero, Sergio | -
dc.contributor.advisor | Ballester Bautista, Rubén | -
dc.contributor.author | Carpay, Otis | -
dc.date.accessioned | 2023-06-29T06:59:24Z | -
dc.date.available | 2023-06-29T06:59:24Z | -
dc.date.issued | 2023-01-15 | -
dc.identifier.uri | http://hdl.handle.net/2445/200060 | -
dc.description | Final projects of the Master's in Fundamentals of Data Science, Faculty of Mathematics, Universitat de Barcelona. Academic year: 2022-2023. Supervisors: Sergio Escalera Guerrero and Rubén Ballester Bautista | ca
dc.description.abstract | [en] The performance of a deep neural network (DNN) depends on its ability to generalize. This ability is often expressed as the difference between accuracy on a training set and on a test set, known as the generalization gap. Recent research has used topological data analysis to estimate this gap without a test set. Here, persistent homology measures are derived from a weighted graph of neuron activation correlations (the functional network graph). The resulting persistence diagram is vectorized by a number of statistical summaries and correlated with the generalization gap. However, the computational complexity of persistent homology calculations hinders its application to DNNs with a larger number of activations. Methods are needed to sample these activations without losing predictive power. This work assesses the effect of different sampling strategies on the resulting persistence diagrams and their summaries. These include (non-)stratified random sampling, three methods based on notions of neuron importance similar to those used in pruning, and one using $k$-means++. In line with previous research, some of these strategies provide models for predicting the generalization gap with high accuracy. The investigations provide insight and open up new lines of research into the structure of the functional network activation graph. | ca
dc.format.extent | 41 p. | -
dc.format.mimetype | application/pdf | -
dc.language.iso | eng | ca
dc.rights | cc-by-nc-nd (c) Otis Carpay, 2023 | -
dc.rights | code: GPL (c) Otis Carpay, 2023 | -
dc.rights.uri | http://www.gnu.org/licenses/gpl-3.0.ca.html | *
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ | *
dc.source | Màster Oficial - Fonaments de la Ciència de Dades | -
dc.subject.classification | Xarxes neuronals (Informàtica) | -
dc.subject.classification | Aprenentatge automàtic | -
dc.subject.classification | Processament de dades | -
dc.subject.classification | Treballs de fi de màster | -
dc.subject.classification | Topologia algebraica | ca
dc.subject.other | Neural networks (Computer science) | -
dc.subject.other | Machine learning | -
dc.subject.other | Data processing | -
dc.subject.other | Master's theses | -
dc.subject.other | Algebraic topology | en
dc.title | Sampling methods for activation correlation graphs to predict neural network generalization using topological data analysis | ca
dc.type | info:eu-repo/semantics/masterThesis | ca
dc.rights.accessRights | info:eu-repo/semantics/openAccess | ca
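
Illustrative sketch (not part of the record, and not the thesis code): the pipeline outlined in the abstract can be approximated in a few lines of standard-library Python. Everything here is an assumption made for illustration: the function names, the distance 1 - |correlation| used to weight the graph, the restriction to 0-dimensional persistent homology (computed with Kruskal's algorithm), and the choice of summaries. Only non-stratified random sampling is shown; the importance-based and $k$-means++ strategies are not.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def sample_neurons(activations, k, seed=0):
    """Non-stratified random sampling: choose k neuron indices without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(len(activations)), k))


def h0_deaths(activations):
    """Finite death times of the 0-dimensional persistence diagram.

    Assumption: the graph edge between neurons i and j is weighted by
    d(i, j) = 1 - |corr(i, j)|. Every component is born at 0, and each
    merge performed by Kruskal's algorithm kills exactly one component,
    so n neurons yield n - 1 finite deaths (one class never dies).
    """
    n = len(activations)
    edges = sorted(
        (1.0 - abs(pearson(activations[i], activations[j])), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(d)
    return deaths


def summarize(deaths):
    """Vectorize the diagram with a few simple statistics (an illustrative choice)."""
    if not deaths:
        return {"count": 0, "mean": 0.0, "max": 0.0}
    return {"count": len(deaths), "mean": sum(deaths) / len(deaths), "max": max(deaths)}


# Toy data: each row holds one neuron's activations over four inputs.
toy = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [1, -1, 1, -1]]
chosen = sample_neurons(toy, k=3, seed=0)
summary = summarize(h0_deaths([toy[i] for i in chosen]))
```

Higher-dimensional homology groups would require a dedicated persistent homology library, and the resulting summaries would then be regressed against the measured generalization gap across many trained networks.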
Appears in Collections:
Programari - Treballs de l'alumnat
Màster Oficial - Fonaments de la Ciència de Dades

Files in This Item:
File | Description | Size | Format
TFM_TDA_sampling-10.pdf | Thesis report | 744.47 kB | Adobe PDF
tfm-master.zip | Source code | 105.57 MB | zip


This item is licensed under a Creative Commons License.