DreamText: Harnessing text descriptions as an intermediate step for 3D reconstruction

dc.contributor.advisor: Radeva, Petia
dc.contributor.advisor: Rodrigues Sepúlveda Marques, Ricardo Jorge
dc.contributor.author: Puriy Puriy, Nazar
dc.date.accessioned: 2023-09-06T09:33:35Z
dc.date.available: 2023-09-06T09:33:35Z
dc.date.issued: 2023-06-13
dc.description: Bachelor's Thesis in Computer Engineering, Faculty of Mathematics, Universitat de Barcelona, Year: 2023, Advisors: Petia Radeva and Ricardo Jorge Rodrigues Sepúlveda Marques
dc.description.abstract: [en] 3D generation, a rapidly emerging domain within generative AI, has potential applications in architecture, product design, marketing, entertainment, and virtual reality. Improving 3D technologies is useful to society and poses interesting research challenges. In this dissertation, we introduce DreamText, an image-to-3D generative model that uses text descriptions as an intermediate step for 3D reconstruction. The method learns to describe the object in an image, capturing its essential details while disregarding contextual information such as lighting, point of view, or specific arrangement. The learned description is then used to generate novel views of the object, which in turn support a comprehensive and accurate 3D reconstruction. Our approach achieves high-quality results, surpassing current state-of-the-art methods such as RealFusion (CVPR 2023) [1] in several test cases. Concrete evidence of our results can be found at the following link. We also present FitFusion, which leverages the knowledge of a pretrained image generative model, Stable Diffusion, to train a Neural Radiance Field capable of generating 3D models when provided with image data during training. This idea stems from an analysis of an earlier model, Stable DreamFusion [2], combined with careful parameter tuning that yields improved results. The project involves extensive mathematical and experimental analysis of these state-of-the-art models.
dc.format.extent: 103 p.
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/2445/201714
dc.language.iso: eng
dc.rights: thesis report: cc-nc-nd (c) Nazar Puriy Puriy, 2023
dc.rights: code: Apache (c) Nazar Puriy Puriy, 2023
dc.rights.accessRights: info:eu-repo/semantics/openAccess
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.rights.uri: https://www.apache.org/licenses/LICENSE-2.0
dc.source: Bachelor's Theses (TFG) - Computer Engineering
dc.subject.classification: Three-dimensional visualization
dc.subject.classification: Artificial intelligence
dc.subject.classification: Software
dc.subject.classification: Bachelor's theses
dc.subject.classification: Machine learning
dc.subject.other: Three-dimensional display systems
dc.subject.other: Artificial intelligence
dc.subject.other: Computer software
dc.subject.other: Machine learning
dc.subject.other: Bachelor's theses
dc.title: DreamText: Harnessing text descriptions as an intermediate step for 3D reconstruction
dc.type: info:eu-repo/semantics/bachelorThesis

Files

Original bundle

Name: tfg_puriy_puriy_nazar.pdf
Size: 79.5 MB
Format: Adobe Portable Document Format
Description: Thesis report
Name: DreamText.zip
Size: 3.44 GB
Format: ZIP file
Description: Source code