Publication detail

Composite Data Type Recovery in a Retargetable Decompilation

Original title

Composite Data Type Recovery in a Retargetable Decompilation

Language

en

Original abstract

Retargetable decompilation is a reverse engineering technique that transforms a platform-dependent binary file into a high-level language (HLL) representation. Despite its complexity, several new decompilers and similar binary analysis utilities have been developed in recent years. They are not yet advanced enough to serve as standalone tools, but combined with traditional disassemblers, they allow a much faster understanding of the analysed machine code. To achieve the necessary quality, many advanced analyses must be performed on the input file. One of the toughest, but most rewarding, is data type reconstruction. It aims to assign a high-level type to each object (i.e. variables, function parameters, and return values), preferably the same type as in the original source code. This paper presents the composite data type analysis used by the retargetable decompiler developed within the Lissom project at FIT BUT. Its basic principles are based on an existing state-of-the-art approach. However, we design an entirely new pattern creation and recognition algorithm, which is both retargetable and suitable for operating on code in SSA form. Moreover, we devise new pattern aggregation rules that increase the quality of the recovered data types. The solution is tested on several real-world applications compiled for three different architectures.

BibTeX


@inproceedings{BUT111654,
  author="Peter {Matula} and Dušan {Kolář}",
  title="Composite Data Type Recovery in a Retargetable Decompilation",
  annote="Retargetable decompilation is a reverse engineering technique that
transforms a platform-dependent binary file into a high-level language (HLL)
representation. Despite its complexity, several new decompilers and similar
binary analysis utilities have been developed in recent years. They are not yet
advanced enough to serve as standalone tools, but combined with traditional
disassemblers, they allow a much faster understanding of the analysed machine
code. To achieve the necessary quality, many advanced analyses must be performed
on the input file. One of the toughest, but most rewarding, is data type
reconstruction. It aims to assign a high-level type to each object (i.e.
variables, function parameters, and return values), preferably the same type as
in the original source code.
This paper presents the composite data type analysis used by the retargetable
decompiler developed within the Lissom project at FIT BUT. Its basic principles
are based on an existing state-of-the-art approach. However, we design an
entirely new pattern creation and recognition algorithm, which is both
retargetable and suitable for operating on code in SSA form. Moreover, we devise
new pattern aggregation rules that increase the quality of the recovered data
types. The solution is tested on several real-world applications compiled for
three different architectures.",
  address="NOVPRESS s.r.o.",
  booktitle="Proceedings of the 9th Doctoral Workshop on Mathematical and Engineering Methods in Computer Science",
  chapter="111654",
  howpublished="print",
  institution="NOVPRESS s.r.o.",
  year="2014",
  month="october",
  pages="63--76",
  publisher="NOVPRESS s.r.o.",
  type="conference paper"
}