Basic Data Model Concepts

The Sift Security Information Model (SIM) is a shared semantic model for extracting value from data. Mapping data into this model gives you the full benefits of Sift Security, including correlation, detection, graph analytics and visualization, dashboards, and reporting. SIM is based on Splunk’s Common Information Model (CIM), extended to cover data sources that CIM does not support and to support Sift Security’s Graph Data Model.

This documentation describes how to incorporate new data sources. In a nutshell, we need to know what type of data we are ingesting. The primary goals are to provide context by identifying the type of data (e.g., authentication logs) and to map incoming fields to the correct field names (e.g., sourceip to src_ip).

Logstash is the primary tool for transforming existing logs. General Logstash documentation is available from Elastic, and we also share some helpful hints and best practices in Writing Logstash Configs. A GUI to simplify certain types of transformations will be available in a later version of Sift Security’s product.

The general process for integrating a new data source is:

  1. Determine to which categories the data belong.
  2. Add tags to the events for the selected categories.
  3. Rename incoming fields to match the attribute names for the selected categories.
  4. Ensure the appropriate result and action fields are set, depending on which category is chosen.

Determining Which Categories to Use

Generally, one mapping is used per data source. However, it is possible to pass data through multiple mappings. In that case, store the tags for all selected categories in a single space-delimited string. Each category is described in detail in Categorical Data Models.

Tags

Once you have selected a category, you must apply its tags to the records.
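
When a record belongs to more than one category, the tags are stored as one space-delimited string. The following is a minimal Python sketch of that step, not the product's implementation. The field name `tags` and the category names `authentication` and `network` are assumptions made for illustration; the real tag values are listed in Categorical Data Models.

```python
def build_tags(categories):
    """Join category tags into one space-delimited string, dropping blanks and duplicates."""
    seen = []
    for tag in categories:
        tag = tag.strip()
        if tag and tag not in seen:
            seen.append(tag)
    return " ".join(seen)


def tag_event(event, categories, field="tags"):
    """Return a copy of the event with the tag string stored under `field`.

    The field name "tags" is an assumption, not a documented attribute name.
    """
    tagged = dict(event)
    tagged[field] = build_tags(categories)
    return tagged


# Hypothetical record and category names, for illustration only.
event = {"sourceip": "10.0.0.5", "user": "alice"}
tagged = tag_event(event, ["authentication", "network"])
print(tagged["tags"])
```

The helper returns a copy rather than modifying the record in place, so the original record stays available if a later mapping needs it.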

Renaming Fields

Fields must be renamed in accordance with the naming scheme described for the selected category in Categorical Data Models.
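
As a rough illustration of the renaming step, the Python sketch below applies a lookup table to a record. Only the sourceip to src_ip pair comes from this documentation. The other entries, the `FIELD_MAP` name, and the `rename_fields` helper are hypothetical; in practice this transformation would normally be written as a Logstash config.

```python
# Mapping from raw field names to model attribute names.
# Only "sourceip" -> "src_ip" appears in the documentation; the rest are hypothetical.
FIELD_MAP = {
    "sourceip": "src_ip",
    "destip": "dest_ip",
    "username": "user",
}


def rename_fields(event, field_map):
    """Return a new record with keys renamed per field_map.

    Fields that are not in the map pass through unchanged.
    """
    renamed = {}
    for key, value in event.items():
        renamed[field_map.get(key, key)] = value
    return renamed


# Hypothetical raw record, for illustration only.
raw = {"sourceip": "10.0.0.5", "destip": "10.0.0.9", "bytes": 512}
mapped = rename_fields(raw, FIELD_MAP)
print(mapped)
```

Keeping the mapping in a single table makes it easy to compare against the attribute list for the selected category.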

Actions and Results

Some mappings write different edges for successes and failures, or for mutating and non-mutating actions. Which edges to write is determined by the values of certain fields. The specific fields depend on which mapping is chosen and are described in the section for that mapping. For more detail, see Setting action and status fields.
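
To make the idea concrete, here is a small Python sketch of choosing an edge from a field value. Everything specific in it is an assumption: the field name `result`, the accepted values, and the edge names `logged_in` and `failed_login` are placeholders, since the real names are defined per mapping.

```python
# Hypothetical value sets and edge names; the real ones depend on the mapping.
SUCCESS_VALUES = {"success"}
FAILURE_VALUES = {"failure"}


def select_edge(event, result_field="result"):
    """Pick an edge name from the record's result field.

    Returns None when the field is missing or holds an unrecognized value,
    so the caller can decide how to handle records that cannot be classified.
    """
    value = str(event.get(result_field, "")).strip().lower()
    if value in SUCCESS_VALUES:
        return "logged_in"
    if value in FAILURE_VALUES:
        return "failed_login"
    return None


print(select_edge({"user": "alice", "result": "Success"}))
print(select_edge({"user": "bob", "result": "failure"}))
print(select_edge({"user": "carol"}))
```

Normalizing case and whitespace before the comparison keeps small formatting differences between data sources from changing which edge is chosen.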