Document type: Thesis
Version: Published version
Please always use this identifier to cite or link to this document: https://hdl.handle.net/2445/116205
Support Vector Machines for Survival Analysis: Methods and Variable Relevance = Màquines de Suport Vectorial per Anàlisi de la Supervivència: Mètodes i Rellevància de Variables
Abstract
[eng] Developing an efficacious malaria vaccine is complex because of characteristics of the disease that are directly related to the responsible parasite. Several aspects of the disease-vaccine interaction need to be taken into account to understand and improve the vaccine, and for that reason different types of data need to be analyzed. Current assay technology allows several proteins to be analyzed simultaneously from a small blood volume. The combination of the medium-throughput datasets produced by some assays and the small sample size of some malaria studies may hinder the use of classical statistical methods. When the number of observations is low and the number of variables is medium or high, support vector machine (SVM) models are a powerful tool for analyzing sparse data, i.e., data in which the number of predictors is larger than or approximately equal to the number of observations, especially when handling binary outcomes. However, biomedical research often involves the analysis of time-to-event outcomes. Several methods have been tested in the literature to deal with censored data within the SVM framework. Most of these methods are based on a support vector regression (SVR) approach, and results reported in the literature suggest no significant differences from the Cox proportional hazards model and kernel Cox regression. Another perspective is to use an SVM for binary classification; however, almost no work has been done on this approach: only SVM learning using privileged information and SVM with uncertain classes have been described. This PhD thesis aims to propose alternative methods and extensions to the existing ones in the binary classification framework, specifically by proposing a conditional survival approach for weighting censored observations, proposing a semi-supervised SVM with a local invariances perspective, and evaluating a weighted SVM model.
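The abstract does not spell out how the conditional survival weighting is defined, so the following is only an illustrative sketch of one common way to turn censored data into weighted binary examples, and not necessarily the method proposed in the thesis. It assumes a fixed time horizon and estimates, with a Kaplan-Meier curve, the conditional probability P(T > horizon | T > c) that a subject censored at time c is still event-free at the horizon. All function names, the horizon, and the toy data are hypothetical.

```python
from bisect import bisect_right


def kaplan_meier(times, events):
    """Kaplan-Meier estimate: returns (event_times, survival_values).

    times  : follow-up time of each subject
    events : 1 if the event was observed, 0 if the subject was censored
    """
    event_times, survival = [], []
    s = 1.0
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        n_events = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        s *= 1.0 - n_events / at_risk
        event_times.append(t)
        survival.append(s)
    return event_times, survival


def surv_at(t, event_times, survival):
    """Step-function lookup: S(t) is the estimate at the last event time <= t."""
    i = bisect_right(event_times, t)
    return 1.0 if i == 0 else survival[i - 1]


def weighted_binary_examples(times, events, horizon):
    """Return (subject index, label, weight) rows for a weighted classifier.

    Label +1 means the event occurred by the horizon, -1 means event-free.
    A subject censored before the horizon (status unknown) is split into
    two rows weighted by the conditional survival probability (assumption).
    """
    et, sv = kaplan_meier(times, events)
    s_h = surv_at(horizon, et, sv)
    rows = []
    for i, (t, e) in enumerate(zip(times, events)):
        if e == 1 and t <= horizon:
            rows.append((i, +1, 1.0))          # event observed in time
        elif t >= horizon:
            rows.append((i, -1, 1.0))          # known event-free at horizon
        else:
            s_c = surv_at(t, et, sv)
            p_free = s_h / s_c if s_c > 0 else 0.0   # P(T > horizon | T > t)
            rows.append((i, -1, p_free))
            rows.append((i, +1, 1.0 - p_free))
    return rows


# Toy data (made up): subject 1 is censored at t=3, before the horizon.
times = [2, 3, 4, 5, 8]
events = [1, 0, 1, 1, 0]
rows = weighted_binary_examples(times, events, horizon=6)
for row in rows:
    print(row)
```

Rows of this kind could then be passed to any SVM implementation that accepts per-observation weights; fully observed subjects keep weight 1, while each censored subject contributes to both classes in proportion to its estimated status.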
Another important aspect of biomedical research is identifying the relevance of the variables in a model, i.e., which variables are most strongly related to the response variable. In the SVM framework most of the existing work concerns linear kernels; however, the main advantage of SVM lies in the use of non-linear kernels. This PhD thesis aims to propose three approaches based on the Recursive Feature Elimination (RFE) algorithm to rank variables for non-linear SVM and for SVM for survival analysis. Moreover, the proposed algorithms focus on the interpretation and visualization of each of the RFE iterations, making it possible to identify relevant variables associated with the response variable and associations among predictor variables. After all proposed methods were evaluated in a simulation study under several scenarios, they were applied to a real dataset: the Mal067 data, whose analysis aims to identify immune responses correlated with protection from malaria that were elicited by the RTS,S malaria vaccine and by natural immunity. All SVM methods for survival analysis have been implemented in R, since no existing R packages or R functions for them were found.
Citation
SANZ RÓDENAS, Héctor. Support Vector Machines for Survival Analysis: Methods and Variable Relevance = Màquines de Suport Vectorial per Anàlisi de la Supervivència: Mètodes i Rellevància de Variables. [accessed: 30 November 2025]. [Available at: https://hdl.handle.net/2445/116205]