RMIT University
Function similarity using family context

journal contribution
posted on 2024-11-02, 18:00 authored by Paul Black, Iqbal Gondal, Peter Vamplew, Arun Lakhotia
Finding changed and similar functions between a pair of binaries is an important problem in malware attribution and in the identification of new malware capabilities. This paper presents a new technique for this problem called Function Similarity using Family Context (FSFC). FSFC trains a Support Vector Machine (SVM) model using pairs of similar functions from two program variants. This method improves upon previous research called Cross Version Contextual Function Similarity (CVCFS) by representing a function using features extracted not just from the function itself, but also from other functions with which it has a caller or callee relationship. We present the results of an initial experiment showing that the use of additional features from the context of a function significantly decreases the false positive rate, obviating the need for a separate pass for cleaning false positives. The more surprising finding is that the SVM model produced by FSFC can abstract function similarity features from one pair of program variants to find similar functions in an unrelated pair of program variants. If validated by a larger study, this new property leads to the possibility of creating generic similar function classifiers that can be packaged and distributed in reverse engineering tools such as IDA Pro and Ghidra.
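
The idea of describing a function by its own features together with those of its callers and callees can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two numeric features per function, the function names, the summing of neighbour features, the absolute-difference pair vector, and the helper names `invert`, `context_features` and `pair_vector`. In a pipeline of the kind the abstract describes, vectors like the one built at the end would be the input to an SVM classifier.

```python
from typing import Dict, List, Set

Features = List[float]


def invert(callees: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Build a callee -> callers map from a caller -> callees map."""
    callers: Dict[str, Set[str]] = {}
    for caller, targets in callees.items():
        for target in targets:
            callers.setdefault(target, set()).add(caller)
    return callers


def context_features(name: str,
                     own: Dict[str, Features],
                     callees: Dict[str, Set[str]]) -> Features:
    """Concatenate a function's own features with the summed
    features of its callees and of its callers (assumed aggregation)."""
    width = len(own[name])
    callers = invert(callees)

    def summed(names: Set[str]) -> Features:
        total = [0.0] * width
        for n in names:
            for i, value in enumerate(own[n]):
                total[i] += value
        return total

    return (own[name]
            + summed(callees.get(name, set()))
            + summed(callers.get(name, set())))


def pair_vector(a: Features, b: Features) -> Features:
    """Element-wise absolute difference of two context vectors
    (an assumed way to encode a candidate function pair)."""
    return [abs(x - y) for x, y in zip(a, b)]


# Hypothetical features: [instruction count, API call count].
own_v1 = {"main": [10.0, 2.0], "parse": [25.0, 1.0], "send": [8.0, 3.0]}
own_v2 = {"main": [11.0, 2.0], "parse": [27.0, 1.0], "send": [8.0, 3.0]}
# Hypothetical call graph, identical in both variants.
call_graph = {"main": {"parse", "send"}, "parse": {"send"}}

parse_v1 = context_features("parse", own_v1, call_graph)
parse_v2 = context_features("parse", own_v2, call_graph)
pair = pair_vector(parse_v1, parse_v2)
```

Because the context vector includes neighbour features, two functions that look alike in isolation but sit in different parts of the call graph produce a larger pair vector, which is one plausible reading of why context reduces false positives.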

Related Materials

  1. DOI - Is published in 10.3390/electronics9071163
  2. ISSN - Is published in 20799292

Journal

Electronics (Switzerland)

Volume

9

Number

1163

Issue

7

Start page

1

End page

20

Total pages

20

Publisher

MDPI AG

Place published

Switzerland

Language

English

Copyright

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Former Identifier

2006109733

Esploro creation date

2021-09-04