Document type
Thesis
Please always use this identifier to cite or link to this document: https://hdl.handle.net/2445/217107
Mastering the Triad of Data, Models and Tasks in Deep Learning for Image Understanding
Abstract
[eng] Deep learning's rapid growth brings vast application potential across diverse domains. Achieving optimal performance hinges on a critical interplay between three key elements: powerful model architectures, vast amounts of data, and a deep understanding of the target domain. Each element presents unique challenges. This thesis tackles these challenges to unlock the full potential of models, exploring solutions for data, models, and task understanding.
The first part of this thesis tackles the fundamental challenges associated with data used in deep learning. Acquiring large-scale data is a significant challenge, often limited by factors like annotation costs and label errors. Data within a dataset frequently exhibits significant diversity. We address these challenges with a multifaceted approach. We investigate the development of noise-robust sample-selection-based deep learning models to handle the presence of label errors. To leverage the large volumes of unlabeled data available, we explore contrastive self-supervised learning strategies. To address the heterogeneity within datasets, we propose a sample importance strategy to prioritize samples that present learning challenges. These solutions address the various data-related challenges that hinder deep learning models.
The second part of the thesis covers the critical role of understanding model behaviour. We use uncertainty quantification metrics to gain valuable insights into the capabilities of the models in making predictions. By understanding these metrics, we identify areas where the model's predictions might be less reliable. We extend our exploration by applying these uncertainty metrics across various tasks to improve the decision-making process of the models.
The final part of this thesis explores the importance of task understanding. We utilize the challenging domain of food recognition as a case study. Food recognition presents unique challenges due to the visual complexity of food images. We address the domain-specific challenges of fine-grained and multi-label classification by strategically designing and modifying deep learning models to improve their performance.
Our research during this thesis yielded significant advancements in several key areas of model development. We achieved state-of-the-art results on several benchmarks across various tasks, demonstrating the effectiveness of our proposed solutions. This highlights the potential of our work to contribute to the broader field of deep learning.
Citation
NAGARAJAN, Bhalaji. Mastering the Triad of Data, Models and Tasks in Deep Learning for Image Understanding. [accessed: 30 November 2025]. [Available at: https://hdl.handle.net/2445/217107]