Anomaly Detection At Multiple Scales

Introduction

The research project Anomaly Detection at Multiple Scales (ADAMS) was initiated to detect insider threats (ITs). Part of this project was the development of a multi-layered IT detection system known as PRODIGAL (PROactive Detection of Insider threats with Graph Analysis and Learning). This paper describes PRODIGAL and its findings to date.

Challenges:
· Threat scenarios are dynamic (i.e., ITs are always changing) because malicious insiders actively attempt to avoid being caught and because organizational environments evolve.
· IT instances consist of combinations of activities, each of which may be legitimate when performed in a different context.

Goals/Contributions

The paper aims to develop, integrate, and evaluate new approaches for detecting the weak signals characteristic of ITs, and to develop a visual anomaly detection language for specifying combinations of features and time periods used to detect anomalies that suggest instances of insider threat behavior.

Method/Approach/Architecture/Framework

Data was collected using a commercial tool called SureView, which was installed on user workstations to capture user actions. Scenarios of IT activity were developed by an independent expert red team (RT). Three kinds of starting points were identified: indicators (e.g., a URL), anomalies (e.g., topics in emails sent), and scenarios (combinations of indicators and anomalies). Multiple anomaly detection (AD) algorithms based on suspected scenarios of malicious insider behavior were developed and evaluated. To name a few:

  1. Relational Pseudo-Anomaly Detection (RPAD): constructs a classifier to distinguish the observed data instances from pseudo-anomalies.
  2. Gaussian Mixture Model (GMM): models the data as a mixture of Gaussians, which can be drawn as ellipses; the lowest-density points mark the most anomalous regions.
  3. Vector Space Models (VSM): handle sequential data that represents, e.g., a user’s behavior over time.
  4. STINGER: a platform for dynamic graph analysis.
  5. Interactive Graph Exploration (Apolo): an interactive graph visualization component that keeps the graph in an embedded database (SQLite), thus reducing memory needs while maintain
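
The pseudo-anomaly idea behind the first algorithm in the list can be illustrated with a small sketch. This is not the relational RPAD algorithm from the paper: it assumes plain numeric feature vectors, draws pseudo-anomalies uniformly over the bounding box of the observed data, and swaps in a simple k-nearest-neighbour vote as the classifier. The function name, the value of k, and the toy data are all hypothetical.

```python
import math
import random

def pseudo_anomaly_scores(observed, k=5, seed=0):
    """Score each observed point by the share of pseudo-anomalies
    among its k nearest neighbours (higher = more anomalous)."""
    rng = random.Random(seed)
    dims = range(len(observed[0]))
    lows = [min(p[d] for p in observed) for d in dims]
    highs = [max(p[d] for p in observed) for d in dims]
    # Assumption: pseudo-anomalies are uniform samples over the bounding box.
    pseudo = [tuple(rng.uniform(lows[d], highs[d]) for d in dims)
              for _ in observed]
    # Label 0 = observed instance, label 1 = pseudo-anomaly.
    labelled = [(p, 0) for p in observed] + [(p, 1) for p in pseudo]
    scores = []
    for i, point in enumerate(observed):
        # Distances to every other labelled point (the point itself is skipped).
        others = [(math.dist(point, q), label)
                  for j, (q, label) in enumerate(labelled) if j != i]
        others.sort(key=lambda pair: pair[0])
        scores.append(sum(label for _, label in others[:k]) / k)
    return scores

# Toy data: 50 similar users plus one user far from the rest.
rng = random.Random(1)
normal_users = [(rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for _ in range(50)]
data = normal_users + [(5.0, 5.0)]
scores = pseudo_anomaly_scores(data)
most_anomalous = max(range(len(data)), key=scores.__getitem__)
```

A point that sits in a region where pseudo-anomalies outnumber real observations looks more like a pseudo-anomaly than like observed data, so the classifier's vote doubles as an anomaly score.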

Anomaly Detection Language

An anomaly detection language was developed to express scenarios such as detecting users whose daily behavior over a recent month differs from their daily behavior over the previous six months.

Figure 2: Required arguments are in <angle brackets> and optional arguments are in [square brackets]. Symbols specify the algorithm types used.
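
The recent-month versus six-month comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's language or algorithms: each user is reduced to one hypothetical daily count (for example, files copied per day), the two windows are compared with a simple z-score, and the threshold of 3.0 is arbitrary.

```python
from statistics import mean, stdev

def window_deviation(daily_counts, recent_days=30, baseline_days=180):
    """Z-score of the recent window's daily mean against the
    baseline window that immediately precedes it."""
    recent = daily_counts[-recent_days:]
    baseline = daily_counts[-(recent_days + baseline_days):-recent_days]
    spread = stdev(baseline)
    if spread == 0:
        # Flat baseline: any change at all is an unbounded deviation.
        return 0.0 if mean(recent) == mean(baseline) else float("inf")
    return (mean(recent) - mean(baseline)) / spread

# Hypothetical histories: 180 baseline days followed by 30 recent days.
baseline = [9, 11] * 90
history = {
    "steady_user": baseline + [9, 11] * 15,
    "changed_user": baseline + [20] * 30,
}
flagged = [user for user, counts in history.items()
           if abs(window_deviation(counts)) > 3.0]
```

Here only the user whose recent daily mean moved well away from the baseline is flagged. A fuller specification in the language would also name the features and the algorithm to apply, as the figure caption indicates.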

Results and Discussion

This paper focuses on multiple AD models, aimed at reducing the number of missed anomalies. On a dataset containing 6 red team (RT) inserts, 4 AD algorithms were used, and the results flagged all 6 RT inserts with 99.5% effectiveness.

· An intellectual property (IP) theft scenario tries to find the leader of the group responsible for IP theft in an organization. When tested on one data sample, 1 user ranked first and a large portion of the data was ignored entirely. This is what was expected: the scenario narrowed the search to individuals on the days when they behaved like leaders of small groups exfiltrating IP.
· In RPAD, feature normalization resulted in very high performance on the test data, achieving an AUC of 0.979.

However, additional research is needed to turn this into a solution that real analysts can deploy and use.
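
For readers unfamiliar with the AUC figure reported for RPAD, the sketch below shows how such a value can be computed from anomaly scores and ground-truth labels. It uses the standard pairwise definition of AUC (the probability that a randomly chosen positive instance is scored above a randomly chosen negative one). The scores and labels are made-up toy values, not data from the paper.

```python
def auc(scores, labels):
    """Probability that a random positive outranks a random negative
    (ties count as half a win)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n)
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Toy example: label 1 marks a red-team insert, 0 marks benign activity.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
value = auc(scores, labels)  # 8 of the 9 positive/negative pairs are ordered correctly
```

An AUC of 1.0 means every insert is ranked above every benign instance, while 0.5 corresponds to random ordering.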

References:

Bader, D.A., et al. 2009. STINGER: Spatio-Temporal Interaction Networks and Graphs (STING) Extensible Representation. Technical Report, May 8, 2009.

Bettadapura, V., et al. 2013. Augmenting Bag-of-Words: Data-Driven Discovery of Temporal and Structural Information for Activity Recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

03 December 2019