Week 6: Evalulating the HinDroid Result

NOTE: Due to the Wednesday holiday, this week we will be meeting Friday at 9AM PST.

Topics

This week’s assignments will guide you through the following topics:

  • Review SVMs and the kernal trick.
  • Implement the basic Hindroid Malware Classifier.
  • Assess different choices of kernel.

Reading

Please read the following:

  • Hindroid sections 3 and 4, paying particular attention to the table of results evaluating the different metapaths.
  • Review this tutorial. It is a succinct 5-min read that should help your revise SVM. Assume red-triangles are malwares and blue-circles are benign apps.
  • For more review on SVMs, this is really a great lecture and these are some great slides.

Tasks

Complete the following tasks:

  • Develop code that can calculate the commuting matrices formed by products of A, B, P. (Refer to week 05 for a hint on sparse matrices!)
  • Implement the Hindroid Malware classifier for various kernels on a dataset from DSMLP.Try training models with kernels: AA^T, ABA^T, APA^T, APBP^TA^T`. See implementation note below.
  • Evaluate the performance of the above classifiers on different Malware types. (How are you splitting train and test?)

Note on Implementation

Hindroid’s model is an SVM classifier with a custom kernel. Recall that SVMs only need the dot-product of the data-points for classification, in our case, this will be the commuting matrix Mp[i][j], where i and j are the indexes corresponding to the Apps.

You can use scikit’s SVM for your assignment. As explained above you are going to assume that you already have the pre-computed kernel matrix in the form of Mp. Hence, you will need to use custom kernels.

More specifically, you can directly use Gram-Matrix.

Class Notes

Notes

Weekly Questions

Answer the following questions on Canvas:

  • What properties do these different meta-paths capture in the data?
  • Why do all of the meta-paths in the Hindroid kernels start and end with the adjacency matrix A?
  • Use the test data to train a classifier with kernel ABA^T and compute the resulting f1-score.