Soot

Points-to Algorithms

  • CHA (Class Hierarchy Analysis): assumes that a reference variable can point to any object whose class is a subtype of the variable's declared type
  • geomPTA: https://github.com/Sable/soot/wiki/Using-Geometric-Encoding-based-Context-Sensitive-Points-to-Analysis-(geomPTA)
    • Open question: I have heard that it is buggy; not yet verified.
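
The CHA assumption noted above can be illustrated with a small, Soot-independent sketch. It resolves a virtual call by treating every subtype of the receiver's declared type as a possible runtime class. The class `ChaSketch`, its string-based hierarchy model, and the method names are all hypothetical, chosen only for illustration; this is not how Soot stores its hierarchy.

```java
import java.util.*;

public class ChaSketch {
    // Hypothetical toy model: class name -> direct superclass name (null for a root).
    private final Map<String, String> superOf = new HashMap<>();
    // Class name -> names of the methods declared (or overridden) in that class.
    private final Map<String, Set<String>> declared = new HashMap<>();

    public void addClass(String name, String superName, String... methods) {
        superOf.put(name, superName);
        declared.put(name, new HashSet<>(Arrays.asList(methods)));
    }

    // True if sub equals sup or transitively extends it.
    private boolean isSubtype(String sub, String sup) {
        for (String c = sub; c != null; c = superOf.get(c)) {
            if (c.equals(sup)) return true;
        }
        return false;
    }

    // The implementation a receiver of runtime class cls would dispatch to, or null.
    private String dispatch(String cls, String method) {
        for (String c = cls; c != null; c = superOf.get(c)) {
            if (declared.getOrDefault(c, Collections.emptySet()).contains(method)) {
                return c + "." + method;
            }
        }
        return null;
    }

    // CHA: every known subtype of the declared type is a possible receiver class.
    public Set<String> resolve(String declaredType, String method) {
        Set<String> targets = new TreeSet<>();
        for (String cls : superOf.keySet()) {
            if (isSubtype(cls, declaredType)) {
                String target = dispatch(cls, method);
                if (target != null) targets.add(target);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        ChaSketch cha = new ChaSketch();
        cha.addClass("Animal", null, "speak");
        cha.addClass("Dog", "Animal", "speak");   // overrides speak
        cha.addClass("Cat", "Animal");            // inherits speak
        cha.addClass("Robot", null, "speak");     // unrelated type, never a target
        System.out.println(cha.resolve("Animal", "speak")); // [Animal.speak, Dog.speak]
    }
}
```

The result is an over-approximation: even if the program only ever creates Cat objects, the call is still assumed to reach Dog.speak. Analyses such as SPARK narrow this set by tracking which objects can actually flow to the variable.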

Misc

  • Scene: the singleton that represents the whole analysis environment; it holds:
    • application classes (to be analyzed)
    • the main class (entry point)
    • points-to info
    • call-graphs
  • Points-to analysis: SPARK and Paddle
  • A Unit is a statement
    • Get values used in it: getUseBoxes
    • Get values defined in it: getDefBoxes
    • Get units jumping to it: getBoxesPointingToThis
    • Get units it is jumping to: getUnitBoxes
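  • A Box is a reference to a Unit or a Value
    • UnitBox: refers to a Unit; used by branching statements to hold their jump targets
    • ValueBox: refers to a Value used or defined in a Unit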
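  • Soot has four intermediate representations (IRs), each with different characteristics:
    • Baf: stack-based, close to bytecode
    • Jimple: typed 3-address code that resembles simplified Java; suitable for most analyses
    • Shimple: SSA form of Jimple, which simplifies some analyses
    • Grimp: like Jimple but with aggregated expressions, so more human-readable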
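  • Inter-procedural analysis: the “-w” switch enables whole-program mode, which adds these packs:
    • cg: call-graph construction
    • wjtp: whole-Jimple transformation pack
    • wjap: whole-Jimple annotation pack
    • The “-W” switch additionally enables whole-program optimizations:
      • wjop: whole-Jimple optimization pack
    • wjpp: whole-Jimple pre-processing pack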
  • List help for a pack: java soot.Main -ph PACK
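  • -app: application mode; all classes referred to by the argument classes are treated as application classes (packages on the default exclusion list, such as java.*, stay library classes)
  • -include-all: empties that default exclusion list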
  • -f J to produce a Jimple file
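  • The data-flow framework. Three decisions to make before implementing an analysis:
    • Backward or forward flow analysis?
    • Branching or not (do different successors of a statement receive different flow information)?
    • May or must analysis (merge by union or by intersection)?
    • Soot's data-flow framework handles any CFG that implements the interface soot.toolkits.graph.DirectedGraph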
  • Soot provides four implementations of flow sets: ArraySparseSet, ArrayPackedSet, ToppedSet and DavaFlowSet. Only the first three are described here.
    • ArraySparseSet is an unbounded flow set. The set is represented as an array of references
    • ArrayPackedSet is a bounded flow set. Requires that the programmer provides a FlowUniverse object
    • ToppedSet wraps another flow set (bounded or not) adding information regarding whether it is the top set (⊤) of the lattice.
  • CFG
    • BriefUnitGraph is the simplest: it has no edges for control flow caused by thrown exceptions.
    • ExceptionalUnitGraph includes edges from throw statements to their handler (a catch block, called a Trap in Soot), provided the trap is local to the method body.
    • TrapUnitGraph, like ExceptionalUnitGraph, takes into account exceptions that might be thrown.
  • The call graph can be queried for the edges coming into a method, the edges coming out of a method, and the edges coming out of a particular statement: edgesInto(method), edgesOutOf(method) and edgesOutOf(statement), respectively.
  • IR for Abstracting CFG: https://github.com/domainexpert/sootexamples/tree/master/intermediate_representation/src/dk/brics/soot/intermediate
  • Implement the analysis interface: https://github.com/Sable/soot/wiki/Implementing-an-intra-procedural-data-flow-analysis-in-Soot#3-implementing-the-analysis-interface
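
The three design questions listed for the data-flow framework above can be made concrete with live-variable analysis, which is backward, non-branching and a may analysis (merge by union). The sketch below is plain Java and deliberately does not use Soot; `LivenessSketch`, `Stmt` and the index-based CFG are hypothetical stand-ins. In Soot the same pieces would, as far as I understand the wiki page linked above, go into a subclass of BackwardFlowAnalysis through its flowThrough, merge, copy and newInitialFlow methods.

```java
import java.util.*;

public class LivenessSketch {
    /** Hypothetical toy statement: defines at most one variable and uses some. */
    static final class Stmt {
        final String def;           // variable written, or null if none
        final Set<String> uses;     // variables read
        final List<Integer> succs;  // indices of successor statements in the CFG

        Stmt(String def, List<String> uses, List<Integer> succs) {
            this.def = def;
            this.uses = new TreeSet<>(uses);
            this.succs = succs;
        }
    }

    /** Returns, for each statement index, the variables live just before it. */
    static List<Set<String>> liveBefore(List<Stmt> cfg) {
        int n = cfg.size();
        List<Set<String>> in = new ArrayList<>();
        for (int i = 0; i < n; i++) in.add(new TreeSet<>()); // initial flow: empty set

        Deque<Integer> worklist = new ArrayDeque<>();
        for (int i = n - 1; i >= 0; i--) worklist.add(i);

        while (!worklist.isEmpty()) {
            int i = worklist.poll();
            Stmt s = cfg.get(i);
            // Merge: may analysis, so take the union over all successors (backward).
            Set<String> live = new TreeSet<>();
            for (int succ : s.succs) live.addAll(in.get(succ));
            // Flow function: in = (out - def) + uses.
            if (s.def != null) live.remove(s.def);
            live.addAll(s.uses);
            if (!live.equals(in.get(i))) {
                in.set(i, live);
                // The result changed, so every predecessor must be revisited.
                for (int p = 0; p < n; p++) {
                    if (cfg.get(p).succs.contains(i)) worklist.add(p);
                }
            }
        }
        return in;
    }

    public static void main(String[] args) {
        List<Stmt> cfg = Arrays.asList(
            new Stmt("x", Collections.emptyList(), Arrays.asList(1)),    // 0: x = 1
            new Stmt("y", Arrays.asList("x"), Arrays.asList(2)),         // 1: y = x + 1
            new Stmt(null, Arrays.asList("y"), Arrays.asList(1, 3)),     // 2: if (y) goto 1
            new Stmt(null, Arrays.asList("x"), Collections.emptyList())  // 3: return x
        );
        System.out.println(liveBefore(cfg)); // [[], [x], [x, y], [x]]
    }
}
```

Switching the merge from union to intersection would turn this into a must analysis, and iterating over predecessors instead of successors would make it a forward one; the worklist loop itself stays the same.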

Guides:

[x] http://www.brics.dk/SootGuide/

[x] https://github.com/Sable/soot/wiki/Implementing-an-intra-procedural-data-flow-analysis-in-Soot

https://www.rasthofer.info/publications/paper/RV2013-AndroidTutorial.pdf

Using Soot to instrument a class file https://www.sable.mcgill.ca/soot/tutorial/profiler2/index.html

https://github.com/Sable/soot/wiki/Using-Soot-as-a-Program-Optimizer

https://stackoverflow.com/questions/44944837/how-to-create-a-control-flow-graph-with-soot

Projects:

https://github.com/secure-software-engineering/FlowDroid

https://github.com/Sable/soot

https://github.com/MIT-PAC/droidsafe-src

https://github.com/grievejia/CostInstrument

https://www.abartel.net/dexpler/

https://github.com/CalebFenton/simplify/

https://github.com/CalebFenton/dex-oracle

https://sourceforge.net/projects/proguard/

https://github.com/JesusFreke/smali

https://github.com/skylot/jadx

https://github.com/necst/aamo

https://github.com/pxb1988/dex2jar

https://github.com/secure-software-engineering/SuSi

https://sourceforge.net/p/dex2jar/wiki/DecryptStrings/

http://siis.cse.psu.edu/ded/

https://github.com/izgzhen/java2smali

https://github.com/SUPERAndroidAnalyzer/super

Blogs:

https://rednaga.io

https://www.evilsocket.net/2016/04/18/how-i-defeated-an-obfuscated-and-anti-tamper-apk-with-some-python-and-a-home-made-smali-emulator/

http://calebfenton.github.io

https://blog.datarepo.cn/2017/12/30/android-malware-datasets/

QA:

https://stackoverflow.com/questions/12703500/is-it-possible-to-use-the-soot-analyses-without-calling-soot-main-main

Meta:

Android (Investigative) Tools http://cecs.wright.edu/~pmateti/Courses/4440/Lectures/Tools/

https://www.csc2.ncsu.edu/faculty/xjiang4/alerts.html Mobile Security Alerts

Datasets:

https://github.com/secure-software-engineering/DroidBench

https://f-droid.org/en/

http://www.malgenomeproject.org

http://modroid.co.nf/research/ M0Droid

https://www.sec.cs.tu-bs.de/~danarp/drebin/

http://contagiodump.blogspot.com/2013/03/16800-clean-and-11960-malicious-files.html ContagioDump

https://github.com/ashishb/android-malware

http://kharon.gforge.inria.fr/dataset/

http://amd.arguslab.org

https://www.unb.ca/cic/datasets/android-adware.html

http://pralab.diee.unica.it/en/AndroidPRAGuardDataset

https://androzoo.uni.lu AndroZoo

https://koodous.com

Publications:

https://www.dropbox.com/sh/xqmyw7vf6quijum/AABUC53vhLirK814t-xsYX-wa?dl=0

https://ieeexplore.ieee.org/abstract/document/540302

Lecture Slides:

http://www.cs.northwestern.edu/~ychen/classes/cs450-f16/lectures/10.10_Static%20Analysis.pdf Taint Analysis

VASCO: can encode more problems than IFDS/IDE (for example, analyses whose flow functions are not distributive).