Machine Learning & Artificial Intelligence (AI)

Deep Learning of Behaviors

G. Cybenko (Dartmouth)

Objectives

  • A wide variety of security challenges involve reverse engineering the constituent elements of observed data.
    • Disentanglement
    • Data Fission
    • Cocktail Party-type problems
    • Analysis of Competing Hypotheses
    • “Multiple exposure” problem

Key Science Methods & Advances

  • Deep learning has had a huge impact on recognizing individual objects (e.g., AlexNet).
  • But there has not yet been a similar breakthrough on “disentanglement” problems.
  • Combining state-of-the-art recurrent deep neural network learning with deep reinforcement learning now makes radically new approaches to this problem set possible.

Results & Impact

  • Street scene video:  people walking in different directions, vehicles, animals, bicycles
  • Occlusions, formations, different types of kinematics
  • Computer network: emails, backups, file shares, remote computing, database accesses
  • Encrypted traffic, NATed, partially observable, attacks
  • Emails/contacts/browsing/social media: social, job related, hobbies, logistics, health
  • Multiplicity of disparate activities, novel activities, partially observable, out-of-band events

The Global Cyber-Vulnerability Report

V.S. Subrahmanian (Dartmouth) with M. Ovelgonne, T. Dumitras (UMD), B.A. Prakash (Virginia Tech)

Objectives

  • Use concrete data to quantify the vulnerability of a range of countries to cyber attack.
  • Develop detailed statistics on the average number of attacks per host (machine) and the percentage of hosts in a country that are attacked.
  • Provide policy guidance on the basis of objective cyber-vulnerability statistics.
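The two statistics named above (average attacks per host and percentage of hosts attacked) can be illustrated with a minimal sketch. The input layout and the names country_stats, hosts_by_country, and attack_reports are assumptions made for illustration only; they are not the report's actual data format or code.

```python
from collections import defaultdict

def country_stats(hosts_by_country, attack_reports):
    """Compute per-country vulnerability statistics.

    hosts_by_country: dict mapping country -> set of monitored host ids
                      (hypothetical input layout).
    attack_reports:   iterable of (country, host_id) pairs, one per attack.
    """
    attacks = defaultdict(int)         # total attack count per country
    attacked_hosts = defaultdict(set)  # distinct attacked hosts per country
    for country, host in attack_reports:
        attacks[country] += 1
        attacked_hosts[country].add(host)

    stats = {}
    for country, hosts in hosts_by_country.items():
        n = len(hosts)
        if n == 0:
            continue  # no monitored hosts, so the ratios are undefined
        stats[country] = {
            "attacks_per_host": attacks[country] / n,
            "pct_hosts_attacked": 100.0 * len(attacked_hosts[country]) / n,
        }
    return stats
```

For example, a country with 4 monitored hosts and 3 attack reports spread over 2 distinct hosts would score 0.75 attacks per host and 50% of hosts attacked.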

Key Science Methods & Advances

  • Developed a data set with over 4M hosts per year and over 20B malware and telemetry reports from selected Symantec anti-virus products over a 2-year period.
  • Developed detailed country-specific statistics on types of attacks per country (e.g. worms, trojans, spyware,…).
  • Summarized cybersecurity policies of 44 countries (incl. most major developed economies) and suggested improvements to their cyber policies.

Results & Impact

  • Of the 44 countries studied, India and S. Korea were the most cyber-vulnerable.
  • Denmark, Norway, Finland, Sweden were amongst the least vulnerable.
  • The US and UK ranked around 10th to 15th least cyber-vulnerable according to our metrics.

CCafe: Country Cyber-Attack Forecasting Engine

V.S. Subrahmanian (Dartmouth) with C. Kang (AT&T), N. Park (University of North Carolina), B.A. Prakash (Virginia Tech), E. Serra (Boise State)

Objectives

  • Can we use data-driven methods to predict the number of hosts (machines) in a host population H that will be attacked by a specific malware?
  • Objective:
    • Because our data was limited to countries, our host populations were the machines in a country.
    • Developed novel ensemble methods for highly accurate predictions.

Key Science Methods & Advances

  • Developed a data set covering 40 countries and the 50 most infectious malware, with telemetry and malware reports from Symantec.
  • Developed a novel bi-fixpoint algorithm to compute how hard each malware is to detect and how competent each country is at detecting malware.
  • Adapted the SIR model of disease spread to model the spread of malware within a host population (country).
  • Because traditional regression models did not work, we developed novel methods to merge: (i) country cyber-similarity measures, (ii) clustering methods, (iii) epidemic models, and (iv) regression methods.
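As background for the epidemic-model component, the textbook SIR model can be sketched as a discrete-time simulation over a host population. This is the standard disease-spread model, not the adapted variant used in CCafe, whose details are not given here. The function name simulate_sir and the parameters beta (daily infection rate) and gamma (daily clean-up rate) are illustrative assumptions.

```python
def simulate_sir(n_hosts, initial_infected, beta, gamma, days):
    """Discrete-time (one step per day) textbook SIR model.

    S = susceptible hosts, I = infected hosts, R = recovered (cleaned) hosts.
    beta and gamma are assumed to lie in [0, 1].
    Returns a list of (S, I, R) tuples for day 0 through day `days`.
    """
    s = float(n_hosts - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / n_hosts  # contacts between S and I
        new_recoveries = gamma * i               # infected hosts cleaned up
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history
```

The infected count on each day is the quantity a forecasting method would compare against the observed number of attacked hosts; the total S + I + R stays equal to the population size throughout.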

Results & Impact

  • Developed novel methods to predict the number of hosts likely to be attacked by a specific malware m in a country C.

  • ESM predicted attacks on a per-day basis, which is very hard to do because of massive daily fluctuations; predictions were often over 90% accurate (as measured by the Pearson correlation coefficient).

  • Proposed the ESM algorithm, whose predictions had very low root mean squared errors when compared with ground truth across many countries.

  • The CCafe system works on 40 countries (to date).
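The two evaluation measures mentioned above, the Pearson correlation coefficient and the root mean squared error, can be computed from a series of daily predictions and the matching ground truth as in the sketch below. The function names are illustrative, and no numbers from the study are reproduced.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) if either series is constant.
    return cov / math.sqrt(var_x * var_y)

def rmse(predicted, actual):
    """Root mean squared error between predictions and ground truth."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))
```

Pearson correlation captures whether the predicted daily counts rise and fall with the observed ones, while RMSE captures how far off the predicted counts are in absolute terms, so the two are complementary.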

AnonySense: Privacy-Aware People-Centric Sensing

David Kotz (Dartmouth) with Minho Shin, Cory Cornelius, Dan Peebles, Apu Kapadia, and Nikos Triandopoulos

Objectives

  • Enable crowd-sourced contributions to wide-area sensing applications,
  • while protecting anonymity of contributing users,
  • and authenticity of contributing users,
  • even if the infrastructure servers are not trusted.

Key Science Methods & Advances

  • Developed system for applications to submit tasks to crowd-sensing system, for mobile devices to accept tasks and submit reports,
  • protecting the anonymity of contributing devices using anonymous crypto and mix nodes,
  • while ensuring contributors are authentic using group encryption,
  • leveraging only a trusted authentication service.

Results & Impact

  • AnonySense is a general-purpose framework for anonymous opportunistic tasking, sensing, and reporting.
  • We demonstrate two applications in AnonySense.
  • We implemented AnonySense and showed that our approach is efficient.

Location Privacy for Mobile Crowd Sensing Through Population Mapping

David Kotz (Dartmouth) with Minho Shin, Cory Cornelius, Apu Kapadia, Nikos Triandopoulos

Objectives

  • Enable opportunistic sensing of mobile devices:
    • allow applications to “task” mobile devices to measure context in a target region;
    • but context reports include the time and location of the event, putting the privacy of users at increased risk.
  • Protect users' privacy against the system while reporting context.

Key Science Methods & Advances

  • Novel spatiotemporal blurring mechanism based on tessellation and clustering,
  • enables devices to perform local blurring of reports efficiently without an online anonymization server before the data are sent to the system;
  • can control the degree of certainty in location privacy and the quality of reports through a system parameter.
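The idea of local spatiotemporal blurring can be sketched as follows. This is a simplified stand-in, not the mechanism itself: a uniform grid replaces the tessellation, the clustering step is not reproduced, and the parameter k (minimum number of distinct users per occupied tile) is a hypothetical name for the system parameter that trades privacy against report quality. All function names are illustrative.

```python
import math
from collections import defaultdict

def tile_of(lat, lon, tile_deg):
    """Index of the grid tile (tile_deg degrees on a side) containing a point."""
    return (math.floor(lat / tile_deg), math.floor(lon / tile_deg))

def min_users_per_tile(traces, tile_deg):
    """Smallest number of distinct users seen in any occupied tile.

    traces: iterable of (user_id, lat, lon) from historical mobility data.
    """
    users = defaultdict(set)
    for user, lat, lon in traces:
        users[tile_of(lat, lon, tile_deg)].add(user)
    return min((len(u) for u in users.values()), default=0)

def choose_tile_size(traces, candidate_sizes, k):
    """Smallest candidate tile size whose occupied tiles each hold >= k users."""
    for size in sorted(candidate_sizes):
        if min_users_per_tile(traces, size) >= k:
            return size
    return None  # no candidate meets the privacy target

def blur_report(lat, lon, timestamp, tile_deg, time_bin_s):
    """Blur a report on the device: replace the exact point and time
    with a tile index and a time interval before sending it."""
    start = (timestamp // time_bin_s) * time_bin_s
    return {"tile": tile_of(lat, lon, tile_deg),
            "interval": (start, start + time_bin_s)}
```

Because the tile size is chosen ahead of time from historical traces, blur_report needs only the precomputed size and the device's own reading, which mirrors the point that blurring can run locally without an online anonymization server. A larger k yields coarser tiles, giving more privacy and lower report quality.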

Results & Impact

  • Novel spatiotemporal blurring mechanism based on tessellation and clustering.
  • Design of architecture to implement this solution.
  • Security analysis of this approach.
  • We evaluated our tessellation and clustering algorithm against real mobility traces.

Bastion: Bluetooth and Architectural Support for Trusted I/O on SGX

David Kotz (Dartmouth) with Travis Peters, Reshma Lal (Intel), Srikanth Varadarajan (Intel), Pradeep Pappachan (Intel)

Objectives

  • Secure sensitive I/O data in transit between a Bluetooth I/O device (such as a keyboard) and a trusted app on SGX-enabled platforms in the presence of adversaries that compromise all system software.

Key Science Methods & Advances

  • Novel approach to securing the I/O path between the Bluetooth controller and trusted applications in a Trusted Execution Environment (TEE);
  • Leverages Intel SGX technology;
  • Supports legacy Bluetooth devices; no change in hardware or firmware is needed there.

Results & Impact

  • First-ever solution for securing sensitive data along the entire path from trusted app to trusted device, without the need for any trusted system software in between.
  • Proof-of-concept implementation developed at Intel, on Intel hardware.