Ontology for Insider Threat Indicators

The ontology provides a mechanism for sharing and testing indicators of insider threat across multiple participants without compromising organization-sensitive data.

The study of insider threat presents some of the most complex challenges in information security. Even defining the insider threat has proven difficult, with interpretations and scope varying depending on the problem space. Organizations have begun to acknowledge the importance of detecting and preventing insider threats, but there is a surprising lack of standards within the insider threat domain to assist in the development, description, testing, and sharing of these techniques. For many organizations, establishing an insider threat program and beginning to look for potentially malicious insider activity is a new business activity.

In one data exfiltration case, for example, an insider transferred proprietary engineering plans from the victim organization's computer systems to his new employer.

The primary goal of this effort is to support the creation, sharing, and analysis of indicators of insider threat. Because insider data is sensitive, insider threat teams often work only with data from inside their own organizations. These records frequently include documented employee behaviors, intellectual property, employee activity on networks, and information on organizational proprietary networks and information technology (IT) architecture. A shared ontology will allow teams to share indicators of insider threat without disclosing their own sensitive data. The desired outcome is to facilitate information sharing on effective indicators of malicious insider activity across organizations, with an emphasis on extensibility, semi-automation, and the ability for community members to benefit from investigations and analysis performed by others.

This work involves gathering and analyzing data from public (e.g., media reports, court documents, and other publications) and nonpublic (e.g., law enforcement investigations, internal investigations from other organizations, interviews with victim organizations, and interviews with convicted insiders) sources. This data collection primarily focuses on gathering information about three entities: the organizations involved, the perpetrator of the malicious activity, and the details of the incident. Each case in the insider incident repository contains a natural language description of the technical and behavioral observables of the incident. These descriptions were used as the primary data source for the ontology.
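The case structure described above can be sketched as a simple record type. This is a minimal sketch; the class and field names below are assumptions for illustration, not the repository's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class InsiderIncidentCase:
    """Hypothetical record for one case in the insider incident repository,
    covering the three entities the data collection focuses on."""
    organizations: list[str] = field(default_factory=list)  # organizations involved
    perpetrator: str = ""        # actor who performed the malicious activity
    incident_details: str = ""   # details of the incident
    description: str = ""        # natural-language description of the technical
                                 # and behavioral observables of the incident
```

Under this sketch, the `description` field holds the natural language text that served as the primary data source for the ontology.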

The top level of the ontology is composed of five classes: Actor, Action, Asset, Event, and Information. The Actor class contains subclasses for representing people, organizations, and organizational components such as departments. The Action class contains subclasses that define the activities actors can perform. The Asset class provides subclasses that define the objects of actions. The Information class provides subclasses for modeling the information contained within some assets (examples include personally identifiable information, trade secrets, and classified information). The Event class provides support for multiple types of events of interest. Events are generally associated with one or more Actions.
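The class hierarchy above can be sketched in code. This is a minimal illustration of the five top-level classes and their relationships; the subclass names (`Person`, `Department`, `Copy`, and so on) are assumptions, not the ontology's actual vocabulary.

```python
# Five top-level classes of the ontology.
class Actor: ...          # people, organizations, organizational components
class Asset: ...          # the objects of actions
class Information: ...    # information contained within some assets

class Person(Actor): ...
class Organization(Actor): ...
class Department(Actor): ...          # illustrative organizational component

class LaptopComputer(Asset): ...      # illustrative asset subclass
class TradeSecret(Information): ...   # illustrative information subclass

class Action:
    """Something an actor can perform on an asset."""
    def __init__(self, actor: Actor, target: Asset):
        self.actor = actor
        self.target = target

class Copy(Action): ...   # illustrative action subclass

class Event:
    """An event of interest, generally associated with one or more Actions."""
    def __init__(self, actions: list[Action]):
        self.actions = actions
```

An incident observable such as "an employee copied files from a laptop" would then be modeled as an `Event` containing a `Copy` action whose actor is a `Person` and whose target is a `LaptopComputer`.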

The full framework of the ontology is meant to support detection of potential indicators of malicious insider activity, which are then triaged. An effective implementation of the framework depends on the indicators it contains, and not all satisfied indicators warrant an investigation. Evaluating specific instances of indicators requires expert analysis and investigation to remove false positives, assess the severity of each satisfied indicator, and perform set and temporal analysis across satisfied indicators. The framework can support a workflow-based analysis and incident escalation process. Specific implementations of the framework are expected to grow and change along with the organization, its insider threat program, and the larger insider threat community and domain.
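The triage step above can be sketched as follows. This is a hedged illustration of one possible escalation rule, not the framework's prescribed workflow; the `SatisfiedIndicator` fields, the 1-to-5 severity scale, and the threshold are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class SatisfiedIndicator:
    """Hypothetical record of one satisfied indicator (names are assumptions)."""
    indicator_id: str
    actor: str          # actor the indicator fired on
    severity: int       # assumed 1-5 scale; the framework does not mandate one
    timestamp: float

def triage(hits: list[SatisfiedIndicator], severity_threshold: int = 3) -> dict:
    """Group satisfied indicators by actor and flag actors whose indicators
    reach the severity threshold for analyst review. Expert analysis is still
    required afterward to remove false positives."""
    by_actor: dict[str, list[SatisfiedIndicator]] = {}
    for hit in hits:
        by_actor.setdefault(hit.actor, []).append(hit)
    return {
        actor: actor_hits
        for actor, actor_hits in by_actor.items()
        if max(h.severity for h in actor_hits) >= severity_threshold
    }
```

Grouping by actor is one simple form of the set analysis mentioned above; temporal analysis could be layered on by sorting each actor's hits by `timestamp` before review.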

This work was done by Daniel L. Costa, Matthew L. Collins, Samuel J. Perl, Michael J. Albrethsen, George J. Silowash, and Derrick L. Spooner of Carnegie Mellon University for the Defense Advanced Research Projects Agency. DARPA-0015