Security Information Exchange (SIE) User Guide

Published on: November 8, 2022

About Security Information Exchange (SIE)

This guide gives you a functional overview of the Security Information Exchange, or SIE, and it links to detailed usage instructions, tutorials, and reference information.

SIE gives you real-time access to data from our global sensor network. That data includes over half a million passive DNS (pDNS) observations per second, as well as other key security data points:

  • Raw and processed passive DNS data
  • Darknet/darkspace telescope data
  • SPAM sources and URLs
  • Phishing URLs and associated targeted brands
  • Connection attempts from malware-infected systems (as seen by a sinkhole)
  • Network traffic blocked by Intrusion Detection Systems (IDS) and firewall devices

We process this data into usable formats, stream it over real-time channels, and provide you with tools to access it according to your use case.

Each unique set of data in SIE is known as a channel and the data acquired from a specific channel can be customized. A channel in SIE may be the result of raw data analysis, or a subset of data from other channels.

Channels are available through the following mechanisms, which are explained in more detail below:

  • SIE Batch is a web interface and REST API with access to the last 12-18 hours of data from your subscribed feeds.
  • SIE Remote Access creates a tunnel from SIE to the analyst’s system, and supports a REST API (see AXAMD, below).
  • SIE Direct Connect is a leased blade server with pre-installed SIE tools that provides direct access to the SIE network.

The following table displays available data formats, bitrates, and approximate payload rates. Additional descriptive information about each feed is available in the Channel Information section.

SIE Data Formats

To acquire, prepare, and transport SIE data, Farsight created an adaptable container wire and file format for storing and transmitting blobs of data called Network Message (NMSG). At its core, NMSG encodes data in binary using Google Protocol Buffers Version 2 with pre-defined schemas, or carries it in a native packetized format like PCAP.

Other data formats, like JSON or XML, can also be encapsulated in NMSG for consistent transport across Farsight's Security Information Exchange (SIE) infrastructure and then acquired and analyzed by receiving systems. This document addresses what is needed to acquire and process NMSG message payloads.

Farsight uses NMSG to transmit data on the higher volume channels. Other channels might use newline-delimited JSON for delivering data, and a few channels deliver data in PCAP (Packet Capture) format.

Data Format    Information
NMSG           Farsight's Network Message (NMSG) Encapsulation format. See the SIE NMSG User Guide for details.
JSON           JavaScript Object Notation format
NDJSON         Newline delimited JSON
PCAP           Packet Capture format
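
For channels delivered as newline-delimited JSON, each line is one complete JSON document. The following is a minimal sketch of reading such data with the Python standard library. The field names in the sample (qname, count) are hypothetical and are used only for illustration; real channel payloads define their own fields.

```python
import json
from typing import Iterable, Iterator


def iter_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


# Hypothetical two-record sample; not taken from a real SIE channel.
sample = (
    '{"qname": "example.com.", "count": 3}\n'
    '{"qname": "example.org.", "count": 1}\n'
)
records = list(iter_ndjson(sample.splitlines()))
print(len(records), records[0]["qname"])
```

Because each record is parsed independently, the same function can be pointed at an open file object to process a large download one line at a time.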

SIE Access Methods

Data from SIE can be accessed and acquired using the following methods:

  • Direct Connect: Connect a system to the SIE network. This requires 1.) installing a server in a data center where Farsight has a point of presence, and 2.) ordering a network cross connect between your server and the SIE network. Alternatively, customers can lease a blade server from Farsight
  • SIE Remote Access (SRA): Remotely connect to the SIE network using an encrypted tunnel from your workstation or a server in your local data center
  • SIE Batch: Provides on-demand access for downloading data from SIE channels using a RESTful API or web-based interface. You select the channel and duration of time you are interested in, and then download the data for analysis. The duration of available data is dependent on the channel, but is typically the most recent 12-18 hours

For additional information about SIE access methods, please see the SIE Technical Overview document.

Direct Connect

SIE Direct Connect allows a customer to physically connect a server to the Farsight SIE network for maximum data throughput. This can be done in one of two ways:

  • Blade Server: Pre-configured blade servers co-located in one of Farsight's data centers that can be leased by customers for direct access to SIE channels
  • Customer Server: Customer (owned, managed, and operated) servers that can be installed in one of Farsight's data centers and physically connected to the SIE network with a network cross-connect

If a blade server is leased from Farsight, it will be pre-installed with the essential software components needed to acquire, process, compress, buffer, and transfer data from SIE channels to the customer's data center for additional analysis, enrichment, and storage.

The customer will be given an admin account, with root access, and the password for the root account. This allows the customer to modify the operating system for their specific needs. The creation of any additional accounts on the blade server is the responsibility of the customer. See the SIE Blade User Guide for additional information.

If a customer uses their own server, an order can be submitted for a cross-connect to the SIE switches hosted at select Equinix data centers (Ashburn DC3 and Palo Alto SV8). An FSI account manager can help guide cross-connect provisioning details, hosting, or colocation options.

SIE Remote Access (SRA)

SIE Remote Access (SRA) enables a customer to remotely connect to the Security Information Exchange (SIE) from anywhere on the Internet. SRA provides access to SIE channel data on the customer's local servers, allowing their analysis and processing systems to be located in their own data centers rather than physically co-located at a Farsight data center.

Due to the technical limitations of transporting high bitrate SIE channels across the Internet, the SRA access method is not available for all SIE channels. Please reference the SIE Channel Guide for channels that can be accessed using SRA.

SRA uses the Advanced Exchange Access (AXA) transport protocol, which enables SRA sessions to perform the following:

  • Select which SIE channel or channels to monitor and acquire data from
  • Define user-specified search or filtering criteria to match IP or DNS traffic
  • Control rate-limits and other AXA parameters

The streaming search and filtering capabilities of AXA enable SRA to access and acquire meaningful and relevant data from SIE while avoiding the costs of transporting enormous volumes of data across the Internet.

Note: For high volume channels accessed using SRA, it is expected that customers will specify a search or filter for IP addresses and DNS domain names or hostnames of interest. The SRA service will only collect and send data matching the specified criteria across the Internet to the customer.
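
To make the idea of matching "IP addresses and DNS domain names or hostnames of interest" concrete, here is a small local sketch of that kind of test. It is an illustration only: it does not use AXA watch syntax and does not reflect how the SRA service implements matching. The function names matches_ip and matches_domain are hypothetical.

```python
import ipaddress
from typing import Iterable, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def matches_ip(address: str, networks: Iterable[Network]) -> bool:
    """True if the address falls inside any of the given networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in networks)


def matches_domain(name: str, watched: Iterable[str]) -> bool:
    """True if name equals, or is a subdomain of, any watched domain."""
    candidate = name.rstrip(".").lower()
    for domain in watched:
        base = domain.rstrip(".").lower()
        if candidate == base or candidate.endswith("." + base):
            return True
    return False


networks = [ipaddress.ip_network("192.0.2.0/24")]
print(matches_ip("192.0.2.55", networks))
print(matches_domain("www.example.com.", ["example.com"]))
```

The domain check compares whole labels, so a name such as badexample.com does not match a filter for example.com.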

The SRA session to SIE is encrypted and streamed inside a Transport Layer Security (TLS) tunnel. Authentication and access control for the TLS tunnel is provided by TLS pre-shared keys (PSKs). A customer that chooses to access SIE using SRA must create a key pair (one (1) private key and one (1) public key) and send the public key to Farsight. Farsight will configure the customer's public key to access SIE using SRA, along with the list of SIE channels that the customer is subscribed to based on contract entitlement.

Look here for more information on TLS.

Customers have the option to use Farsight's open source tools, or they can write custom AXA applications using the C or Python APIs. Farsight's SRA tools are freely available.

Farsight has published source code examples that demonstrate how to access SIE using the SRA service and AXA protocol. The example code includes a "tunnel" application that replicates SIE channels on local sockets and creates loopback interfaces or files. This enables the use of any Network Message (NMSG) or Packet Capture (PCAP) based software that can observe or acquire data from an SIE channel using the Direct Connect access method.

SIE Batch

SIE Batch provides on-demand access for downloading data from SIE channels using a RESTful API or web-based interface. You select the channel and duration of time you are interested in, and then download the data for analysis. The duration of available data is dependent on the channel, but is typically the most recent 12-18 hours. SIE Batch allows you to acquire data from SIE channels using two (2) methods:

  • API: Allows you to write tools to programmatically download data from SIE channels for analysis
  • Interactively: Web-based interface to the API that enables you to select and download SIE channel data on-demand
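
Because only the most recent data is retained, a tool built on the API may want to check a requested time range before asking for a download. The sketch below shows one way to clamp a request to a retention window. It makes no API calls; the function name clamp_window and the 12-hour default are assumptions chosen for illustration, and the actual retention depends on the channel.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def clamp_window(
    start: datetime,
    end: datetime,
    now: datetime,
    retention: timedelta = timedelta(hours=12),  # assumed, channel dependent
) -> Tuple[datetime, datetime]:
    """Clamp [start, end] to the assumed retention window ending at now."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    oldest = now - retention
    clamped_start = max(start, oldest)
    clamped_end = min(end, now)
    if clamped_start >= clamped_end:
        raise ValueError("requested window lies outside the retained data")
    return clamped_start, clamped_end


now = datetime(2022, 11, 8, 12, 0, tzinfo=timezone.utc)
window = clamp_window(now - timedelta(hours=20), now - timedelta(hours=1), now)
print(window[0].isoformat(), window[1].isoformat())
```

In this example the request for the last 20 hours is shortened to start 12 hours before the reference time, while the end of the range is left unchanged.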

Configuring the SIE Network Interface

The sie-update Python script is required for configuring and connecting the SIE network interfaces on the customer's server to the SIE switch. This configuration update is for Direct Connect, not SIE Remote Access (SRA) or SIE Batch. The Python script configures the required virtual LAN (VLAN) interfaces and updates configuration files needed by libnmsg and nmsgtool. The MAC address of the SIE network interfaces on the customer's server must be provided to and provisioned in Farsight's system for sie-update to run properly.

A current version of the sie-update script is available as a Debian/Ubuntu package after installation of Farsight's package repository. Customers can run the following apt-get command on a system with the Debian/Ubuntu operating system:

$ apt-get install python-daemon sie-update

For other operating systems, customers can download the script and install it using the following commands:

$ wget -O "/usr/local/bin/sie-update" "https://raw.github.com/farsightsec/sie-update/master/sie-update"
$ chmod +x "/usr/local/bin/sie-update"

For optional "daemon" support, python-daemon must be installed.

$ easy_install python-daemon # requires python setuptools

For sie-update to run properly, the name of the SIE network interface must be provided on the command line. Systems that interact with SIE must have two (2) network interfaces, one to observe traffic from SIE channels and one that provides connectivity to other networks. It is recommended that sie-update be run in "daemon" mode using the --daemon (-d) flag; in this mode the script periodically checks for changes and automatically updates as necessary. For example, to use sie-update with eth1 as the SIE network interface, run:

$ sie-update -i eth1 -d

Multiple interfaces can be specified on the command line, for example -i eth1 -i eth3. This command must be run at system startup, for instance by adding the following line to the /etc/rc.local script:

$ sie-update -i eth1 -d

Note: Depending on how your environment is configured, you may need to specify the absolute path of the sie-update script.

By default, the sie-update script creates the nmsg alias files in the /etc directory, but this can be overridden by specifying the -e / --etcdir parameter to sie-update. Note: When compiling nmsg from source, --sysconfdir=/etc should be passed to the ./configure script so libnmsg searches the correct directory for alias files; otherwise the configuration files will by default be installed in the /usr/local/etc directory.

$ /usr/local/bin/sie-update -v -i eth1 -i eth3 -e /usr/local/etc
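
If you generate startup configuration from a script, the command line can be assembled programmatically. This is a minimal sketch that only builds the argument list from the flags shown in this section (-i, -e, and -d); the helper name sie_update_argv is hypothetical, and the sketch does not run sie-update.

```python
import shlex
from typing import List, Optional, Sequence


def sie_update_argv(
    interfaces: Sequence[str],
    daemon: bool = True,
    etcdir: Optional[str] = None,
    script: str = "/usr/local/bin/sie-update",
) -> List[str]:
    """Build an sie-update command line for one or more SIE interfaces."""
    if not interfaces:
        raise ValueError("at least one SIE network interface is required")
    argv = [script]
    for name in interfaces:
        argv += ["-i", name]  # one -i per SIE-facing interface
    if etcdir is not None:
        argv += ["-e", etcdir]  # override the alias file directory
    if daemon:
        argv.append("-d")  # keep running and re-check periodically
    return argv


argv = sie_update_argv(["eth1", "eth3"], etcdir="/usr/local/etc")
print(" ".join(shlex.quote(arg) for arg in argv))
```

The printed line can be placed in a startup script such as /etc/rc.local, using the absolute path to avoid depending on the PATH available at boot.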

Advanced Exchange Access Toolkit (AXA)

Farsight's Advanced Exchange Access Toolkit (AXA) enables customers to remotely and securely connect to the SRA (SIE Remote Access) service. The SRA service provides access to channels available from Farsight's Security Information Exchange (SIE). AXA is a Farsight-developed binary protocol used to transport real-time data available from SIE.

AXA uses a streaming API encrypted by TLS for transporting SIE data over the Internet. The AXA protocol uses two (2) streams that transport messages between a customer's client, such as sratool, and the SRA service. There is one (1) stream in each direction using a single TCP connection.

Some SIE channels may burst to an extremely high bitrate, potentially more than 500Mbps. AXA addresses high volume channels in two (2) ways, both built into the protocol: 1.) optional filtering and 2.) loss-tolerance.

One of the following filtering methods can be used to reduce the volume of data received from SRA.

  • Rate-Limiting: Define a user-specified number of packets-per-second (pps)
  • Watches: Define a user-specified search or filtering criteria that match IP or DNS traffic
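
The effect of a packets-per-second limit can be illustrated with a simple fixed-window counter. This sketch is not the AXA implementation and the class name PacketRateLimiter is hypothetical; it only shows how a limit caps delivery and how the excess can be counted rather than silently discarded.

```python
from typing import Optional


class PacketRateLimiter:
    """Allow at most pps packets in each one-second window."""

    def __init__(self, pps: int) -> None:
        if pps <= 0:
            raise ValueError("pps must be positive")
        self.pps = pps
        self._window: Optional[int] = None  # whole second of current window
        self._count = 0
        self.dropped = 0  # packets rejected because the limit was reached

    def allow(self, timestamp: float) -> bool:
        window = int(timestamp)
        if window != self._window:
            # A new one-second window starts; reset the counter.
            self._window = window
            self._count = 0
        if self._count < self.pps:
            self._count += 1
            return True
        self.dropped += 1
        return False


limiter = PacketRateLimiter(pps=2)
decisions = [limiter.allow(t) for t in (0.0, 0.1, 0.2, 1.0, 1.5, 1.9)]
print(decisions, limiter.dropped)
# prints: [True, True, False, True, True, False] 2
```

Timestamps are passed in explicitly so the behavior is deterministic and easy to test; a live consumer would supply the arrival time of each packet.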

Note: The AXA protocol is deliberately "lossy", which means data can potentially be lost. If a customer requests more data than the network can transport, data overruns will occur. To notify customers when this happens, loss markers are reliably transmitted within the AXA stream using the AXA accounting subsystem. Because of this, the AXA protocol must use a reliable stream protocol - which is why AXA connections use TLS over TCP.

Note: SIE data can potentially be lost before encapsulation into AXA protocol messages due to network congestion, CPU overload, lack of memory, or other system issues.

Farsight also provides a RESTful middleware layer in front of its AXA service. This service is called the AXA Middleware Daemon (AXAMD) and provides a RESTful capability that adds a streaming HTTP interface on top of the AXA toolkit. This enables web-application developers to interface with SIE using SRA. Farsight also published a command line tool and Python extension library called axamd_client. This toolkit is licensed under the Apache 2.0 license.

Note: "AXA" is an overloaded term and, depending on the context, may refer to the following:

  • Actual AXA wire protocol
  • C API
  • Suite of tools presented in this document
  • Set of SRA and/or RAD servers

In this document, where appropriate, context is provided to disambiguate these situations.

The AXA Toolkit

The Advanced Exchange Access (AXA) toolkit contains tools and a C library to bring Farsight's real-time data and services directly from the Farsight Security Information Exchange (SIE) to the customer's network.

You can find the AXA Toolkit here.

The axa-tools distribution contains the following:

  • sratunnel is a production command-line tool that streams SIE data to the local network
  • radtunnel is a production command-line tool that streams anomaly data to the local network
  • sratool is a testing, debugging, and instructional command-line tool used to connect to an SRA server, set watches, enable SIE channels, and stream data
  • radtool is a testing, debugging, and instructional command-line tool used to connect to a RAD server, set watches, enable anomaly detection modules, and stream data
  • libaxa is the C library that provides an API for the AXA protocol, which includes:
    • connection instantiation/teardown
    • message encapsulation/decapsulation
    • watch parsing/loading
    • trie storage and lookup
    • control packet rate-limits, sampling rates, window sizes, and
    • many other AXA-specific functions.
  • wdns is a low-level C library for dealing with wire-format DNS packets

libaxa is the middleware for the AXA protocol and includes capabilities to allow remote SIE data to appear on a local network socket.

For detailed usage of sratunnel, radtunnel, sratool, and radtool, please review the respective man pages included in the distribution.

The AXA Transport Layer

AXA offers three (3) encrypted transports for establishing sessions and tunneling data. One of the following identity / authentication methods is required to use AXA. While all three options provide equal security, Farsight strongly recommends using the APIKEY method due to its ease of setup and use.

  • APIKEY: Customer is identified by and authenticates using a Farsight provided alphanumeric "apikey". The session is encrypted using the TLS ECDHE-RSA-AES256-GCM-SHA384 cipher suite
  • TLS: Customer is identified by and authenticates using a customer created TLS keypair. The session is encrypted using the TLS ECDHE-RSA-AES256-GCM-SHA384 cipher suite
  • SSH: Customer is identified by and authenticates using a customer created SSH keypair

Prior to transporting data across a network, AXA compresses all NMSGs using the built-in zlib compression capability. IP packets are not compressed.
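
The benefit of compressing payloads before transport can be seen with a short zlib round trip. This sketch uses Python's standard zlib module on a made-up, highly repetitive payload; it is not the AXA code path, and real compression ratios depend on the data.

```python
import zlib

# Hypothetical repetitive payload standing in for a batch of messages.
payload = b'{"qname": "example.com.", "rrtype": "A"}\n' * 200

compressed = zlib.compress(payload)
restored = zlib.decompress(compressed)

# Compression is lossless: the restored bytes equal the original bytes.
print(len(payload), len(compressed), restored == payload)
```

Repetitive, text-like data such as this shrinks considerably, which is why compressing before sending across the Internet reduces transport cost.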

NMSG

To acquire, prepare, and transport SIE data, Farsight created an adaptable container wire and file format for storing and transmitting blobs of data called Network Message (NMSG). At its core, NMSG encodes data in binary using Google Protocol Buffers Version 2 with pre-defined schemas, or carries it in a native packetized format like PCAP.

Other data formats, like JSON or XML, can also be encapsulated in NMSG for consistent transport across Farsight's Security Information Exchange (SIE) infrastructure and then acquired and analyzed by receiving systems.

The adaptable NMSG container format allows for consistent or variable message types. NMSG container data may be streamed to a file or transmitted as UDP datagrams. NMSG containers can contain multiple NMSG messages or a fragment of a message too large to fit in a single container. The data in an NMSG container can also be compressed. Additional capabilities include sequencing and rate-limiting.
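
The idea of splitting a message that is too large for one container, and putting it back together on the receiving side, can be sketched as follows. This is a conceptual illustration only: the Fragment type, its fields, and the function names are hypothetical and do not represent the NMSG wire format or the libnmsg API.

```python
from typing import Iterable, List, NamedTuple


class Fragment(NamedTuple):
    index: int   # position of this fragment within the message
    total: int   # number of fragments that make up the message
    data: bytes


def fragment(payload: bytes, max_size: int) -> List[Fragment]:
    """Split payload into fragments of at most max_size bytes."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    chunks = [payload[i:i + max_size] for i in range(0, len(payload), max_size)]
    if not chunks:
        chunks = [b""]  # an empty payload still yields one fragment
    return [Fragment(i, len(chunks), chunk) for i, chunk in enumerate(chunks)]


def reassemble(fragments: Iterable[Fragment]) -> bytes:
    """Rebuild the payload, tolerating out-of-order arrival."""
    ordered = sorted(fragments, key=lambda f: f.index)
    if not ordered or [f.index for f in ordered] != list(range(ordered[0].total)):
        raise ValueError("missing or duplicate fragments")
    return b"".join(f.data for f in ordered)


frags = fragment(b"abcdefghij", max_size=4)
print(len(frags), reassemble(reversed(frags)))
# prints: 3 b'abcdefghij'
```

Carrying an index and a total with each fragment lets the receiver detect gaps, which matters on a datagram transport where delivery order is not guaranteed.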

More information is available in Farsight's Network Message, Volume 1: Introduction to NMSG blog article.

System Requirements

Farsight supports the Debian operating system (OS). For information about the currently supported Debian OS, please see Security Information Exchange (SIE) on Debian.

Installation instructions for Security Information Exchange (SIE) on CentOS / RHEL Linux and FreeBSD are available at the following links:

Note: Installation of SIE software packages from source code can be performed on other operating systems, but may require modifications to properly work.

Data from lower volume SIE channels can be acquired and processed with an Atom server or cloud instance with a 1GHz CPU, a hard disk for local storage, and 1GB or more of RAM.

A server that is configured by Farsight for trial or lease has the following operating system (OS) and hardware specifications.

Component           Specification
Operating System    Debian 9 ("stretch")
CPU                 One (1) x Intel Quad Core Xeon E3 3.40GHz
Memory              16GB RAM
Storage             Two (2) x 2TB 7200RPM drives (configured for RAID 1, 2TB available for customer use)
Internet Access     100Mbps connection to the Internet
SIE Network         Connection to SIE network

These hardware specifications are adequate for acquiring, processing, compressing, and buffering data from any SIE channel. However, there may not be enough memory (RAM) to perform intensive processing and analysis on data from SIE channels with the highest data volumes.

For the Direct Connect access method to SIE, the system must have two (2)network interfaces:

  1. SIE broadcast network: Typically a ten (10) gigabit link.
  2. Internet uplink: Typically a one (1) gigabit link that Farsight rate-limits to 100Mbps by default.
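
The difference between what a channel can deliver on the SIE interface and what the uplink can carry explains why data is processed, compressed, and buffered locally. The arithmetic below is a rough sketch: it assumes decimal megabits, ignores compression and protocol overhead, and the function name backlog_bytes is hypothetical.

```python
def backlog_bytes(channel_mbps: float, uplink_mbps: float, seconds: float) -> float:
    """Bytes that accumulate locally while a channel outpaces the uplink."""
    excess_mbps = max(0.0, channel_mbps - uplink_mbps)
    # Convert megabits per second to bytes per second, then scale by time.
    return excess_mbps * 1_000_000 / 8 * seconds


# A 500Mbps burst forwarded unfiltered over a 100Mbps uplink for one minute.
print(backlog_bytes(500, 100, 60) / 1e9)
# prints: 3.0
```

Under these assumptions a one-minute burst leaves about 3 GB waiting to be sent, which is why filtering or reducing the data before it crosses the uplink is usually necessary for the highest volume channels.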

Additional Information

Farsight has written several blog articles demonstrating ways to interact with SIE using several of the methodologies and tools described in this document. See the following list of blog articles for more information about effectively using SIE: