Learning features (signatures) of malware instead of manually specifying.

"maximally suspicious common subgraph”.


  • Inter-component call relations
  • Their semantic data


  • Syntactic: e.g. seq of instructions
  • Semantics: Control- or data-flow properties

Approximate signature matching: generating a new signature S assuming that A is an instance of family F. Then decide if A is actually an instance of F based on the similarity score between S and F’s existing signature.

Android components:

  1. Activity
  2. Service
  3. BroadcastReceiver
  4. ContentProvider

Communicate through Intents (Inter-Component Communication).

Each Intent has several attributes:

  1. target
  2. action
  3. data type

Implementation: Intent filter —> manifest.

Intuitively, an MSCS is the signature candidate that maximizes the number of metadata edges, where each edge is weighted by its suspiciousness.

ICC(V, X, Y): — what is metadata Y?