Learning features (signatures) of malware instead of manually specifying them.
"Maximally suspicious common subgraph" (MSCS).
Shared: inter-component call relations and their semantic data.
- Syntactic: e.g. sequence of instructions
- Semantic: control- or data-flow properties
Approximate signature matching: generate a new signature S under the assumption that app A is an instance of family F, then decide whether A actually is an instance of F based on the similarity score between S and F’s existing signature.
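A minimal sketch of the decision step, to make the idea concrete. The notes do not say how the similarity score is computed, so Jaccard overlap of edge sets is an assumption here, as are the names `similarity`, `matches_family`, the edge encoding, and the 0.5 threshold.

```python
from typing import FrozenSet, Tuple

# Hypothetical encoding: a signature is a set of labeled edges
# (source component, target component, metadata label).
Edge = Tuple[str, str, str]
Signature = FrozenSet[Edge]


def similarity(s_new: Signature, s_family: Signature) -> float:
    """Jaccard overlap of two edge sets (assumed metric, not from the source)."""
    if not s_new and not s_family:
        return 1.0
    return len(s_new & s_family) / len(s_new | s_family)


def matches_family(s_new: Signature, s_family: Signature,
                   threshold: float = 0.5) -> bool:
    """Accept A as an instance of F if S is close enough to F's signature."""
    return similarity(s_new, s_family) >= threshold


# F's existing signature (illustrative edges only).
family_sig: Signature = frozenset({
    ("Receiver", "Service", "sms"),
    ("Service", "Activity", "net"),
})

# S: signature generated for app A under the assumption that A is in F.
candidate_sig: Signature = frozenset({
    ("Receiver", "Service", "sms"),
    ("Service", "Activity", "net"),
    ("Activity", "Service", "other"),
})

score = similarity(candidate_sig, family_sig)
is_member = matches_family(candidate_sig, family_sig)
```

Here two of the three distinct edges are shared, so the score is 2/3 and A would be accepted under the assumed threshold.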
Components communicate through Intents (Inter-Component Communication, ICC).
Each Intent has several attributes:
- data type
Implementation: Intent filters are declared in the app manifest.
Intuitively, an MSCS is the signature candidate that maximizes the number of metadata edges, where each edge is weighted by its suspiciousness.
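A sketch of the objective only, not of the search for common subgraphs: each candidate is scored by the sum of its edges' suspiciousness weights, and the highest-scoring one is kept. The edge names, the weight values, and the helpers `suspiciousness_score` and `pick_mscs` are hypothetical; the notes do not say how the weights are obtained.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical encoding: a metadata edge is identified by a string label.
Candidate = FrozenSet[str]


def suspiciousness_score(candidate: Candidate,
                         weight: Dict[str, float]) -> float:
    """Sum of per-edge suspiciousness; unknown edges count as 0."""
    return sum(weight.get(edge, 0.0) for edge in candidate)


def pick_mscs(candidates: Iterable[Candidate],
              weight: Dict[str, float]) -> Candidate:
    """Return the candidate with the highest total suspiciousness."""
    return max(candidates, key=lambda c: suspiciousness_score(c, weight))


# Illustrative weights (assumed values).
weights = {"sends_sms": 0.9, "opens_ui": 0.1, "reads_clock": 0.2}

candidates = [
    frozenset({"sends_sms"}),                # one highly suspicious edge
    frozenset({"opens_ui", "reads_clock"}),  # two benign-looking edges
]

best = pick_mscs(candidates, weights)
```

With these weights the single-edge candidate wins, which illustrates why the edges are weighted rather than merely counted.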
ICC(V, X, Y): what is metadata Y?