The Security Information Exchange (SIE), from Farsight Security® Inc. (now a part of DomainTools), is a highly scalable security information sharing platform. It can be thought of as “radar for the Internet”, a way for you to study what’s happening online. Farsight collects and redistributes more than 200,000 new raw observations per second from its global network of sensors. Farsight also applies unique proprietary methods to improve the usability of that data, sharing refined intelligence with SIE customers directly and via DNSDB, one of the world’s largest passive DNS databases.

SIE distributes several types of data useful to security professionals, including:

  • Raw and processed passive DNS data
  • Darknet/darkspace telescope data
  • Spam sources and URLs
  • Phishing URLs
  • Connections from malware-infected systems (as seen by a sinkhole)
  • Intrusion detection system (IDS) and firewall connection block data

SIE Batch is a new delivery method that provides a RESTful API for downloading data as needed, along with a web-based interface for defining and downloading data sets. With SIE Batch you select the data sets and time periods of interest, download that data, and have it available for your analysis. SIE Batch lets you access data in two ways:

  • Via the SIE Batch API: The API allows you to write programs to pull down data for processing automatically.
  • Interactively: There is a web-based interface that acts as a front end to the API and allows you to select and download sets of data on demand.

SIE Batch gives you access to the most recent data distributed via the SIE system. How much data is available depends on the channel you’re pulling data from, but is typically the most recent 12-18 hours.

SIE Batch is not intended for near-real-time access to data. It is primarily intended for periodic downloads, such as hourly updates. If your use case requires real-time or near-real-time access to data, you will likely get better results from SIE Remote Access (SRA).

Accessing SIE Data Interactively via SIE Batch

The SIE Batch system requires a subscription to the SIE data. When you set up the subscription you will receive an API key which will give you access to the system.

If you don’t have an active subscription, please contact the DomainTools sales team.

Once you are logged in to the web interface, you will see the SIE Batch dashboard. SIE data is returned in one of two formats: Newline Delimited JSON (ND-JSON) or NMSG. ND-JSON files have the suffix .ndjson, while NMSG files have the suffix .nmsg. Most channels return data in ND-JSON format; the highest-volume channels use NMSG because it is more compact. A single ND-JSON record, pretty-printed here for readability, looks like this:

{
  "time": "2020-01-13 17:53:00.097326040",
  "vname": "SIE",
  "mname": "newdomain",
  "source": "a1ba02cf",
  "message": {
    "domain": "clienttons.com.",
    "time_seen": "2020-01-13 16:16:04",
    "bailiwick": "ipv4-only.cname.clienttons.com.",
    "rrname": "jdkyqftipq6rwxq4s7ca-pw7etn-d8f0af301.ipv4-only.cname.clienttons.com.",
    "rrclass": "IN",
    "rrtype": "CNAME",
    "rdata": [
      "a248.b.akamai.net."
    ],
    "keys": [],
    "new_rr": []
  }
}

On this page you can select the channel and a date range for the data you want. The system confirms the channel, sets a default date range of records to download, and shows how much data the channel generates on average per hour. You can accept this date range or set your own; any range is valid as long as the data is still available on the system. Channel data expires from the system after 12 to 20 hours, depending on the data rate of each channel.

Clicking on Start will generate a data file for you to download.

Clicking on Download will download the file through the browser. For channels with high data rates, this can take some time. Clicking on Copy will copy the URL into your clipboard so it can be passed to a processing program or some other system that will copy and use the data.

If you want the data to be current as of when you start the download, set the end time to the current time minus about ten seconds. The generated URL will download the same set of data if it is used again later, as long as the data is still available on the system. The chfetch API call is equivalent to the Download button in the browser.
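
As a rough illustration of the "current time minus about ten seconds" guidance, the following Python sketch computes a UTC window that ends ten seconds before now. The timestamp layout is modeled on the sample records on this page and is an assumption, and the function name recent_window is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed timestamp layout, modeled on the "time_seen" field in the
# sample records; check it against the format the system expects.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def recent_window(hours=1, lag_seconds=10, now=None):
    """Return (start, end) strings for a UTC window ending lag_seconds before now."""
    now = now or datetime.now(timezone.utc)
    end = now - timedelta(seconds=lag_seconds)
    start = end - timedelta(hours=hours)
    return start.strftime(TIME_FORMAT), end.strftime(TIME_FORMAT)


if __name__ == "__main__":
    start, end = recent_window()
    print("start:", start)
    print("end:  ", end)
```

Passing an explicit now value makes the window reproducible, which helps when the same range needs to be requested again later.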

Note: if you download multiple batches of data with overlapping time periods, that data will not be de-duplicated by the system. Users should either not combine downloaded data sets with overlapping time ranges or do their own de-duplication during the merge.
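
One possible way to de-duplicate during a merge of ND-JSON downloads is sketched below in Python. It assumes that two records are duplicates only when every field is identical; the function name merge_unique is hypothetical, and the set of seen hashes is held in memory, so very large merges may need a different approach.

```python
import hashlib
import json


def merge_unique(*batches):
    """Yield each distinct record once across several iterables of ND-JSON lines."""
    seen = set()
    for batch in batches:
        for line in batch:
            line = line.strip()
            if not line:
                continue  # skip blank lines between records
            record = json.loads(line)
            # Serialize with sorted keys so field order does not affect the hash.
            canonical = json.dumps(record, sort_keys=True).encode("utf-8")
            key = hashlib.sha256(canonical).digest()
            if key not in seen:
                seen.add(key)
                yield record
```

Each argument can be an open file or any other iterable of lines, so two overlapping downloads can be merged by passing both file objects.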

Below that is a second section that gives you a series of commonly used downloads for the channels you’ve subscribed to, giving you quick access to the most recent data available for that range from when you start the download.

Click on the appropriate time segment for the channel you are interested in and the download will begin. If you hover over the download button, it will give you an estimate of the file size you’ll get if you select it.

Once you have the files downloaded, you can hand them off to an appropriate program you have created to evaluate and process the data in them.

Newline Delimited JSON (ND-JSON) Formatted files

ND-JSON files are text files that contain one JSON record per line. The specific fields vary by channel, but a record will look something like the samples below, which are from Channel 213, Newly Observed Domains (each record is wrapped across several lines here for display):


{"time":"2020-01-13 17:53:00.097143888","vname":"SIE","mname":"newdomain",
"source":"a1ba02cf","message":{"domain":"alibaba.com.","time_seen":"2020-01-13 17:51:47",
"bailiwick":"alibaba.com.","rrname":"fuz8fk.tdum.alibaba.com.",
"rrclass":"IN","rrtype":"CNAME","rdata":["tdumproxy.alibaba.com."],
"keys":[],"new_rr":[]}}

{"time":"2020-01-13 17:53:00.097326040","vname":"SIE","mname":"newdomain",
"source":"a1ba02cf","message":{"domain":"clienttons.com.","time_seen":"2020-01-13 16:16:04",
"bailiwick":"ipv4-only.cname.clienttons.com.",
"rrname":"jdkyqftipq6rwxq4s7ca-pw7etn-d8f0af301.ipv4-only.cname.clienttons.com.",
"rrclass":"IN","rrtype":"CNAME","rdata":["a248.b.akamai.net."],"keys":[],"new_rr":[]}}

{"time":"2020-01-13 17:53:00.097453117","vname":"SIE","mname":"newdomain",
"source":"a1ba02cf","message":{"domain":"yandex.ru.","time_seen":"2020-01-13 17:51:52",
"bailiwick":"yandex.ru.", "rrname":"203859815.verify.yandex.ru.",
"rrclass":"IN","rrtype":"CNAME","rdata":["an.yandex.ru."],"keys":[],"new_rr":[]}}

ND-JSON files can be viewed directly or used with any tool that supports the ND-JSON format.
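
For example, a downloaded ND-JSON file can be processed one line at a time with Python's standard json module. The sketch below tallies records by rrtype; it assumes the message.rrtype field shown in the Channel 213 sample, and the function name count_rrtypes is hypothetical.

```python
import json
from collections import Counter


def count_rrtypes(lines):
    """Count records per message.rrtype in an iterable of ND-JSON lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        # Other channels may not carry this field, so fall back to a marker.
        rrtype = record.get("message", {}).get("rrtype", "(none)")
        counts[rrtype] += 1
    return counts
```

Because the function only iterates over lines, an open file object can be passed directly and the whole file never has to be loaded into memory.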

NMSG Formatted files

NMSG is a binary format, so NMSG files can’t be viewed directly. Farsight has released tools that decode and display NMSG formatted content. The NMSG tools can be found on GitHub at https://github.com/farsightsec/nmsg.

To look at NMSG data, run nmsgtool, which formats an NMSG file as readable text. If you view a file downloaded from Channel 221 (NXDomains) with the command nmsgtool -r, you will see something like this:

[43] [2020-01-13 17:46:47.996798921] [2:6 SIE dnsnx] [a1ba02cf] [] []
qname: xvu.co.ls.
qclass: IN (1)
qtype: AAAA (28)
response_ip: 196.216.168.70
soa_rrname: co.ls.

[70] [2020-01-13 17:46:47.996805233] [2:6 SIE dnsnx] [a1ba02cf] [] []
qname: 246.25.155.49.in-addr.arpa.
qclass: IN (1)
qtype: PTR (12)
response_ip: 194.146.106.106
soa_rrname: 49.in-addr.arpa.

[68] [2020-01-13 17:46:47.996816452] [2:6 SIE dnsnx] [a1ba02cf] [] []
qname: vla1-3s19.yndx.net.yandex.net.
qclass: IN (1)
qtype: AAAA (28)
response_ip: 93.158.134.1
soa_rrname: yandex.net.

[64] [2020-01-13 17:46:47.996831866] [2:6 SIE dnsnx] [a1ba02cf] [] []
qname: USTPE2LJ6XDVZ1.jacobs.com.
qclass: IN (1)
qtype: SOA (6)
response_ip: 13.107.24.8
soa_rrname: jacobs.com.

[63] [2020-01-13 17:46:47.996873084] [2:6 SIE dnsnx] [a1ba02cf] [] []
qname: _ldap._tcp.pdc._msdcs.sg.com.
qclass: IN (1)
qtype: SRV (33)
response_ip: 207.204.40.129
soa_rrname: sg.com.
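
If the text output above needs to be consumed by a program, it can be split into records with a small parser. The Python sketch below is based only on the layout of the sample shown here (a bracketed header line followed by "key: value" lines); the function name parse_presentation is hypothetical, and the meaning of the first bracketed number is assumed to be the payload size.

```python
import re

# Header layout as it appears in the sample output:
# [size] [timestamp] [vid:msgtype vname mname] [source] [] []
HEADER = re.compile(
    r"^\[(\d+)\] \[([^\]]+)\] \[(\d+):(\d+) (\S+) (\S+)\] \[([0-9a-f]+)\]"
)


def parse_presentation(text):
    """Parse nmsgtool-style text output into a list of dictionaries."""
    records = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        match = HEADER.match(line)
        if match:
            current = {
                "size": int(match.group(1)),  # assumed to be the payload size
                "time": match.group(2),
                "vname": match.group(5),
                "mname": match.group(6),
                "source": match.group(7),
                "fields": {},
            }
            records.append(current)
        elif current is not None and ": " in line:
            key, value = line.split(": ", 1)
            # A repeated key overwrites the earlier value in this sketch.
            current["fields"][key] = value
    return records
```

This keeps only the last value when a field name repeats within a record, which is enough for the NXDomains sample but may not suit other message types.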

SIE Batch API

Pulling data from the SIE Batch API involves sending an HTTPS POST to the API endpoint. The parameters in the POST request are:

  • Your SIE Batch API Key
  • The channel you wish to pull data from
  • The starting date/time you want data returned from, in 24-hour format and the UTC time zone
  • The ending date/time you want data returned to, in 24-hour format and the UTC time zone

Use of the API is language independent, allowing access from any programming language, including Python, Java, Perl and C.
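
The four parameters listed above could be assembled into a POST request along the following lines. This is a minimal Python sketch under stated assumptions, not the documented call: the endpoint URL is a placeholder, the JSON body encoding and the field names (apikey, channel, start_time, end_time) are illustrative guesses, and both function names are hypothetical. Consult the API reference for the real values.

```python
import json
import urllib.request

# Placeholder only: the real endpoint URL is not given in this document.
API_URL = "https://sie-batch.example.invalid/chfetch"


def build_fetch_request(api_key, channel, start_time, end_time, url=API_URL):
    """Build (but do not send) an HTTPS POST carrying the four parameters."""
    # Field names and JSON encoding are assumptions made for illustration.
    payload = {
        "apikey": api_key,
        "channel": channel,
        "start_time": start_time,  # UTC, 24-hour format
        "end_time": end_time,      # UTC, 24-hour format
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def download(request, out_path, chunk_size=1 << 20):
    """Send the request and stream the response body to a file."""
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
```

Streaming the response in chunks avoids holding a large download in memory, which matters for channels with high data rates.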

Additional Information