Unlocking the Power of Big Data Analytics for Application Security and Security Operation

1 Unlocking the Power of Big Data Analytics for Application Security and Security Operation. Virginia Lee, Senior Security Architect, CISSP, CISA, CEH. 1 September 2018

2 What is Machine Learning? Learning from data; no explicit programming; discovering hidden patterns; data-driven decisions.

3 Example Applications of Machine Learning: credit card fraud detection; recommendations on scan results.

4 Categories of machine learning

5 Categories of Machine Learning Techniques. Supervised: task driven (classification/regression). Unsupervised: data driven (clustering). Reinforcement: the algorithm learns to react to an environment.

6 Categories of Machine Learning Techniques

7 Association Analysis. Goal: find rules that capture associations between items. Examples: recommend items based on purchase or browsing history; put related items that are often purchased together on sale; identify web pages accessed together.
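
To make the "find rules" goal concrete, here is a minimal sketch that mines pairwise rules from a handful of shopping baskets using the standard support and confidence measures. The basket contents, the `pair_rules` function name, and both thresholds are illustrative assumptions, not something taken from the talk.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    support    = share of transactions containing both items
    confidence = share of transactions with the antecedent that also
                 contain the consequent
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))          # ignore duplicates in a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


# Hypothetical purchase history (assumption for illustration only).
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "jam"],
    ["bread", "jam"],
    ["butter", "milk"],
]
rules = pair_rules(baskets)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

The same counting works for the "web pages accessed together" example if each session's page list is treated as a basket.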

8 What's the machine learning technique used in User Behavior Analysis?

9 Traditional approach to Behavior Profiling. Challenges with using common statistical techniques: Averages, standard deviation, chi-square, and other statistical techniques don't work on everyday datasets with varying amounts of variance and density. A single high-variance data point drags on averages and significantly impacts standard deviation. Data may have a high number of data points with the same frequency. Time series data sets cannot be retained indefinitely.
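
The point about a single extreme value dragging the average can be checked with a few lines of arithmetic. The sketch below uses made-up daily login counts (an assumption, not data from the talk) and compares mean and standard deviation against the median and the median absolute deviation, which barely move when one spike is added.

```python
import statistics


def summarize(values):
    """Return mean, population std dev, median, and median absolute deviation."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.pstdev(values),
        "median": median,
        "mad": mad,
    }


# Hypothetical logins per day for one user (illustrative values only).
logins_per_day = [3, 4, 5, 6, 7, 4, 5, 6, 5, 5]
with_spike = logins_per_day + [500]   # one high-variance data point

baseline = summarize(logins_per_day)
spiked = summarize(with_spike)

print("baseline:", baseline)
print("spiked:  ", spiked)
```

With the spike added, the mean moves from 5 to 50 and the standard deviation grows by two orders of magnitude, while the median and the median absolute deviation stay where they were.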

10 k-means Cluster Analysis: divides data into clusters; similar items are in the same cluster.
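
As a rough illustration of how k-means divides data into clusters, the sketch below implements the textbook Lloyd iteration in plain Python: assign each point to its nearest centroid, move each centroid to the mean of its members, and repeat until nothing changes. The sample points, the `kmeans` and `assign` names, and the choice of random initial centroids are assumptions for the example; this is not the algorithm of any particular product.

```python
import math
import random


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]


def kmeans(points, k, iterations=20, seed=0):
    """Textbook k-means (Lloyd's algorithm) on a list of coordinate tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)        # start from k distinct data points
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                # New centroid is the coordinate-wise mean of the members.
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[i])  # keep an empty cluster's centroid
        if updated == centroids:              # converged
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Two well-separated groups of hypothetical 2-D observations.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (10, 10), (10, 11), (11, 10), (11, 11)]
centroids, labels = kmeans(points, k=2)
print(sorted(centroids))
print(labels)
```

On this toy data the two centroids settle at the centers of the two groups, and every point in a group receives the same label.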

11 What Else in UBA Needs Machine Learning? Cohesiveness (figure labels: Low Risk, High Risk). Peer cohesiveness depends on the number of shared properties within the peer group. Each access privilege held by a user is compared across the members of each peer group to determine the number of users that hold the same access privilege. Access privileges are in text string format, so another cluster analysis algorithm is needed to derive the cohesiveness value.
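
The comparison described above can be sketched as a simple count: for each privilege a user holds, what share of the other peer group members hold the same privilege string? The averaging step that turns those shares into a single score, the function names, and the sample peer group are all assumptions made for illustration; the slide does not say how the product derives its cohesiveness value.

```python
def privilege_share(user, peer_group):
    """For each privilege held by `user`, the fraction of peers holding it too.

    `peer_group` maps a user name to a set of privilege strings.
    """
    peers = [privs for name, privs in peer_group.items() if name != user]
    if not peers:
        return {}
    return {
        privilege: sum(privilege in privs for privs in peers) / len(peers)
        for privilege in sorted(peer_group[user])
    }


def cohesiveness(user, peer_group):
    """Hypothetical score: mean share across the user's privileges (0.0 to 1.0)."""
    shares = privilege_share(user, peer_group)
    return sum(shares.values()) / len(shares) if shares else 0.0


# Hypothetical peer group; privilege names are plain text strings.
group = {
    "alice": {"vpn", "crm", "payroll-admin"},
    "bob": {"vpn", "crm"},
    "carol": {"vpn", "crm"},
    "dave": {"vpn"},
}

print(privilege_share("alice", group))
print(cohesiveness("alice", group), cohesiveness("bob", group))
```

In this toy group, alice's `payroll-admin` privilege is shared by no peer, which pulls her score below bob's; a privilege held by nobody else in the group is the kind of signal a reviewer would look at first.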

12 Activity Outlier Detection in UBA

13 Anomaly Detection: Cluster Analysis!

14 Leveraging Clustering Algorithm for Anomaly Detection. The k-means clustering algorithm is a popular algorithm used in anomaly detection. Micro Focus UBA devised its own proprietary k-means clustering algorithm to detect anomalies.
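
One common way to use cluster centroids for anomaly detection is to score each new observation by its distance to the nearest centroid and flag anything beyond a cutoff. The sketch below shows that generic idea only; the centroid values, the observations, the threshold, and the function names are assumptions, and the proprietary algorithm mentioned above is not public and is not reproduced here.

```python
import math


def anomaly_scores(points, centroids):
    """Score each point by its distance to the nearest cluster centroid."""
    return [min(math.dist(p, c) for c in centroids) for p in points]


def flag_anomalies(points, centroids, threshold):
    """Return the points whose score exceeds the (assumed) threshold."""
    scores = anomaly_scores(points, centroids)
    return [p for p, score in zip(points, scores) if score > threshold]


# Centroids assumed to come from an earlier clustering of normal activity.
centroids = [(1.5, 1.5), (10.5, 10.5)]
observations = [(1, 2), (11, 10), (6, 6), (2, 1)]

outliers = flag_anomalies(observations, centroids, threshold=3.0)
print(outliers)
```

The observation at (6, 6) sits far from both centroids and is the only one flagged; in practice the threshold would be tuned against historical data rather than fixed by hand.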

15 Host Profiler Visualizations

16 What's the machine learning technique used in Application Security?

17 Static analysis workflow, or: how I scan a single application. Diagram labels: Source; Static Code Analysis; Security Auditor; Critical, High, Medium, Low, Not an Issue. Finding relevant scan results is expensive and hard to scale because it requires security expertise and knowledge of the scanned application's context.

18 Identifying issues at scale can be painstaking. Diagram labels: Source (x6); Static Code Analysis (x6); Security Auditor; Critical, High, Medium, Low, Not an Issue.

19 Software Security Center (SSC) - Audit Assistant. Machine learning assisted identification of relevant scan results. Diagram labels: Potential Vulns.; Audit Assistant; Exploitable, Indeterminate, Not an Issue.

20 Seamless workflow integration with existing tools. Audit Workbench: Security Auditor's view. All product views are illustrations and might not represent actual product screens.

21 There are many types of findings which are not issues. Contextual awareness and expertise are required to validate findings.
Flow: Raw Scan Results; Possible Vulnerabilities; Audited Scan Results (Critical, High, Medium, Low, Not an Issue).
Not an Issue types: Not Exploitable; Not Reachable; Noise; Policy.
False Positive Causes: mitigations in place; code not reachable; scan configuration; organizational choice; not a real vulnerability.
Required to validate: Application Context; Organizational Preference; Security Expertise.

22 Train Audit Assistant based on your organizational preferences. Diagram labels: Training; Corrections; Anonymous Issue Metrics; Fortify scan analytics; Train Classifier; audited issues rated Critical, High, Medium, Low, Not an Issue; classifier output Critical, High, Medium, Low, False Positives.
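
The train-then-correct loop in this diagram is ordinary supervised learning: past audit decisions are the labels, and auditor corrections become additional training data. The toy sketch below uses a majority vote per feature combination to show the loop; the feature tuple, the label names, the sample findings, and the functions are all hypothetical, and this is not how Audit Assistant itself is implemented.

```python
from collections import Counter, defaultdict


def train(audited_findings):
    """Learn the most common auditor label for each feature combination."""
    votes = defaultdict(Counter)
    for features, label in audited_findings:
        votes[features][label] += 1
    return {features: counts.most_common(1)[0][0] for features, counts in votes.items()}


def predict(model, features):
    """Return the learned label, or 'Indeterminate' for unseen combinations."""
    return model.get(features, "Indeterminate")


# Hypothetical audited findings: (category, input handling) -> auditor decision.
history = [
    (("SQL Injection", "unsanitized"), "Exploitable"),
    (("SQL Injection", "unsanitized"), "Exploitable"),
    (("SQL Injection", "unsanitized"), "Not an Issue"),
    (("SQL Injection", "sanitized"), "Not an Issue"),
    (("Cross-Site Scripting", "sanitized"), "Not an Issue"),
]
model = train(history)

new_finding = ("Cross-Site Scripting", "unsanitized")
before = predict(model, new_finding)

# An auditor reviews the indeterminate finding; the correction is fed back in.
corrections = [(new_finding, "Exploitable")]
model = train(history + corrections)
after = predict(model, new_finding)

print(before, "->", after)
```

A real classifier would generalize across similar findings instead of matching exact feature tuples, but the feedback path is the same: corrections are appended to the training set and the model is retrained.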

23 Return value-added time to auditors and developers without sacrificing scan integrity. Figures shown (labels: Non-Issue Reduction, Accuracy, False Negative; values: 25%, 90%, 80%-98%, <1%). Results obtained are based on real world applications and scenarios. Results vary based on training and customization. They are not guarantees of future performance.

24 Makes your application security program more efficient. Reduces the number of issues that need manual examination. Identifies relevant issues earlier in the SDLC. Scales the application security program with existing resources. Maintains consistency in auditing and reporting. Diagram labels: Application Security Program; Applications across whole enterprise; Enterprise-wide software security assurance.

25 Adopting Audit Assistant: before Audit Assistant and after Audit Assistant.

26 Thank you

27 Categories of Machine Learning

28 Supervised vs. Unsupervised. Supervised approaches: the target (what the model is predicting) is provided; the data is labeled; classification and regression are supervised. Unsupervised approaches: the target is unknown or unavailable; the data is unlabeled; cluster analysis and association analysis are unsupervised.
