Code deobfuscation by program synthesis-aided simplification of mixed boolean-arithmetic expressions

dc.contributor.advisor: Roca Cánovas, Raúl
dc.contributor.advisor: Benseny, Antoni
dc.contributor.advisor: Reyes De Los Mozos, Mario
dc.contributor.author: Gàmez-Montolio, Arnau
dc.date.accessioned: 2021-05-04T07:59:33Z
dc.date.available: 2021-05-04T07:59:33Z
dc.date.issued: 2020-06-21
dc.description: Bachelor's thesis in Mathematics, Faculty of Mathematics, Universitat de Barcelona, Year: 2020, Advisors: Raúl Roca Cánovas, Antoni Benseny and Mario Reyes de los Mozos [ca]
dc.description.abstract: [en] This project studies the theoretical background of Mixed Boolean-Arithmetic (MBA) expressions as well as their practical applicability within the field of code obfuscation, a technique used both by malware and by software protection to complicate the reverse engineering of (parts of) a program. An MBA expression is composed of integer arithmetic operators, e.g. $(+, -, *)$, and bitwise operators, e.g. $(\wedge, \vee, \oplus, \neg)$. MBA expressions can be leveraged to obfuscate the data-flow of code by iteratively applying rewrite rules and function identities that complicate (obfuscate) the initial expression while preserving its semantic behavior. This is possible because operators from these two domains do not interact well together: there are no rules (distributivity, factorization...) or general theory for dealing with their mixing. Current deobfuscation techniques for simplifying this type of data-flow obfuscation are limited because they are strongly tied to syntactic complexity. We explore novel program synthesis approaches to the simplification of MBA expressions that reason on the semantics of the obfuscated expressions instead of their syntax, and we discuss their applicability as well as their limits. We present our own tool, r2syntia, which integrates Syntia, an open source program synthesis tool, into the reverse engineering framework radare2 in order to retrieve the semantics of obfuscated code from its Input/Output behavior. Finally, we suggest improvements and potential areas for future work. [ca]
dc.format.extent: x p.
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/2445/176925
dc.language.iso: eng [ca]
dc.rights: cc-by-nc-nd (c) Arnau Gàmez i Montolio, 2020
dc.rights.accessRights: info:eu-repo/semantics/openAccess [ca]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/ [*]
dc.source: Treballs Finals de Grau (TFG) - Matemàtiques
dc.subject.classification: Matemàtica discreta [ca]
dc.subject.classification: Treballs de fi de grau
dc.subject.classification: Teoria de la computació [ca]
dc.subject.classification: Lògica matemàtica [ca]
dc.subject.classification: Àlgebra universal [ca]
dc.subject.other: Discrete mathematics [en]
dc.subject.other: Bachelor's theses
dc.subject.other: Theory of computation [en]
dc.subject.other: Mathematical logic [en]
dc.subject.other: Universal algebra [en]
dc.title: Code deobfuscation by program synthesis-aided simplification of mixed boolean-arithmetic expressions [ca]
dc.type: info:eu-repo/semantics/bachelorThesis [ca]
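
The abstract describes MBA rewrites that complicate an expression while preserving its semantics, and the idea of recovering those semantics from Input/Output behavior alone. The following Python sketch may help make that concrete. It is not taken from the thesis and is not how Syntia or r2syntia work internally: the 32-bit word size, the function names, and the tiny fixed candidate list are all assumptions chosen for illustration, with a brute-force comparison standing in for a real program synthesis search.

```python
import random

MASK = 0xFFFFFFFF  # assumption: 32-bit machine words


def obfuscated(x, y):
    # Well-known MBA identity: x + y == (x ^ y) + 2 * (x & y)
    return ((x ^ y) + 2 * (x & y)) & MASK


def obfuscated_twice(x, y):
    # Second rewrite applied on top: x ^ y == (x | y) - (x & y)
    return (((x | y) - (x & y)) + 2 * (x & y)) & MASK


def io_samples(f, n=1000, seed=0):
    """Treat f as a black box and record random input/output pairs."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        x, y = rng.getrandbits(32), rng.getrandbits(32)
        samples.append(((x, y), f(x, y)))
    return samples


# Hypothetical candidate set; a real synthesizer searches a grammar instead.
CANDIDATES = {
    "x + y": lambda x, y: (x + y) & MASK,
    "x - y": lambda x, y: (x - y) & MASK,
    "x ^ y": lambda x, y: x ^ y,
    "x & y": lambda x, y: x & y,
    "x | y": lambda x, y: x | y,
}


def simplify_by_io(f):
    """Return the names of candidates that match f on every sample."""
    samples = io_samples(f)
    return [
        name
        for name, cand in CANDIDATES.items()
        if all(cand(*args) == out for args, out in samples)
    ]


if __name__ == "__main__":
    print(simplify_by_io(obfuscated_twice))
```

Agreement on sampled inputs is only evidence of equivalence, not a proof, and the approach can only find expressions that are in the candidate set.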

Files

Original bundle

Name: codi_176925.zip
Size: 165.57 KB
Format: ZIP file
Description: Source code

Name: 176925.pdf
Size: 2.46 MB
Format: Adobe Portable Document Format
Description: Thesis report