
This article is the third in a multi-part blog series intended to introduce and acquaint the user with Farsight Security’s NMSG suite. This article explores some of the low-level implementation details of the NMSG protocol, including header composition and data encoding.
Before reading this article, it is recommended that you read Farsight’s Network Message, Volume 1: Introduction to NMSG and Farsight’s Network Message, Volume 2: Introduction to nmsgtool. This article covers NMSG (protocol) version 2 and nmsg (C library) version 0.9.1.
NMSG units begin with a small 10-octet header as depicted below:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      'N'      |      'M'      |      'S'      |      'G'      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Flags     |    Version    |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Length (cont)         |          Payload(s)
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ . . . . . . . . . . . . . . . .
The NMSG header always starts with the four-octet magic value 'N' 'M' 'S' 'G'. The Flags octet is next. Depending on whether the payload(s) are fragmented and/or compressed, it can carry one, both, or neither of the following flags:
NMSG_FLAG_ZLIB
NMSG_FLAG_FRAGMENT
The Version octet should be 2. The final header field, Length, is an unsigned four-octet integer in network byte order that holds the length in octets of the payload(s).
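The header layout described above can be sketched in a few lines of Python. This is a minimal illustration that follows the field order and byte order given in this article, not the nmsg library's own parser. The numeric flag values (0x01 and 0x02) and the function name parse_nmsg_header are assumptions made for the example.

```python
import struct

# Assumed bit values: the article names these flags but does not give their values.
NMSG_FLAG_ZLIB = 0x01
NMSG_FLAG_FRAGMENT = 0x02

NMSG_MAGIC = b"NMSG"
# 4s = magic, B = Flags, B = Version, I = Length; "!" selects network byte order.
HEADER = struct.Struct("!4sBBI")


def parse_nmsg_header(data: bytes) -> dict:
    """Parse the 10-octet header as laid out in the diagram (hypothetical helper)."""
    if len(data) < HEADER.size:
        raise ValueError("need at least 10 octets for an NMSG header")
    magic, flags, version, length = HEADER.unpack_from(data)
    if magic != NMSG_MAGIC:
        raise ValueError("bad magic value")
    return {
        "version": version,
        "compressed": bool(flags & NMSG_FLAG_ZLIB),
        "fragment": bool(flags & NMSG_FLAG_FRAGMENT),
        "length": length,
    }


# A hand-built header: fragment flag set, version 2, 300 octets of payload follow.
example = b"NMSG" + bytes([NMSG_FLAG_FRAGMENT, 2]) + (300).to_bytes(4, "big")
print(parse_nmsg_header(example))
```

A reader would then consume exactly the number of octets given by the length field before looking for the next header.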
NMSG payload(s) are encoded using Google Protocol Buffers. They are introduced in the next section.
Google Protocol Buffers (sometimes referred to as “protobufs”) are an efficient, language- and platform-neutral way to serialize arbitrary structured data. Protobufs are comparable to XML but smaller, faster, and more efficient. This makes them an ideal solution to encode the variably typed data that flows through our Security Information Exchange (SIE).
To use protobufs in a program (or library code such as nmsg), the programmer first needs to define what the source data looks like. Again using XML as the model, a protobuf definition is similar to an XML schema. This definition is written using a simple specification language and saved to a text file with a .proto extension. Once defined, this file is compiled using one of the protobuf compilers. This produces header and source files containing the API to serialize the data.
The nmsg library is written in C, so it uses the protobuf-c compiler to generate the API code for its protobuf serialization code.
If you want to learn more, Google maintains great documentation. The following protobuf-heavy sections will make more sense if you are familiar with the .proto specification language.
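To make "serialize" a little more concrete, the sketch below hand-encodes a single unsigned integer field the way the protobuf wire format does: a key built from the field number and wire type, followed by the value as a base-128 varint. In practice the generated API does this for you. The helper names here are hypothetical and exist only for illustration.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more octets follow
        else:
            out.append(low7)  # high bit clear: last octet
            return bytes(out)


def encode_uint_field(field_number: int, value: int) -> bytes:
    """Encode one varint-typed field: key (number << 3 | wire type 0), then value."""
    key = (field_number << 3) | 0
    return encode_varint(key) + encode_varint(value)


# Field number 1 holding the value 150 serializes to three octets.
print(encode_uint_field(1, 150).hex())  # prints "089601"
```

The compact key-plus-value layout is a large part of why protobuf messages tend to be much smaller than an equivalent XML document.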
After the header, the first protobuf-encoded message will be either of type Nmsg (which carries one or more NmsgPayload messages) or of type NmsgFragment (which carries one fragment of a larger message). Both are discussed below.
The .proto definition for Nmsg is shown below:

message Nmsg
{
    repeated NmsgPayload payloads = 1;
    repeated uint32 payload_crcs = 2;
    optional uint32 sequence = 3;
    optional uint64 sequence_id = 4;
}
The fields of Nmsg are payloads, payload_crcs, sequence, and sequence_id.
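As a rough illustration of how the Nmsg container is laid out on the wire, the sketch below walks the top-level fields of a serialized Nmsg message using only the field numbers from the definition above. It is not how the nmsg library does it (the library uses code generated by protobuf-c), and the function names are hypothetical. It assumes payload_crcs is not packed, which is the proto2 default for a repeated scalar declared without a packed option.

```python
def read_varint(buf: bytes, pos: int) -> tuple:
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def split_nmsg(buf: bytes) -> dict:
    """Split a serialized Nmsg message into its top-level fields."""
    out = {"payloads": [], "payload_crcs": [], "sequence": None, "sequence_id": None}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited
            size, pos = read_varint(buf, pos)
            value = buf[pos:pos + size]
            pos += size
        else:
            raise ValueError("wire type %d not handled in this sketch" % wire_type)
        if field == 1 and wire_type == 2:
            out["payloads"].append(value)  # still-serialized NmsgPayload bytes
        elif field == 2 and wire_type == 0:
            out["payload_crcs"].append(value)
        elif field == 3 and wire_type == 0:
            out["sequence"] = value
        elif field == 4 and wire_type == 0:
            out["sequence_id"] = value
        # Any other field is skipped.
    return out


# One 3-octet placeholder payload (not a real NmsgPayload) and sequence = 7.
sample = bytes.fromhex("0a03") + b"abc" + bytes.fromhex("1807")
print(split_nmsg(sample))
```

Each entry in the payloads list would itself be decoded as an NmsgPayload message in a second pass.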
If the NMSG_FLAG_FRAGMENT flag is set in the NMSG header, then the data part is an NmsgFragment protobuf message, as shown below:

message NmsgFragment
{
    required uint32 id = 1;
    required uint32 current = 2;
    required uint32 last = 3;
    required bytes fragment = 4;
    optional uint32 crc = 5;
}
The fields of NmsgFragment are id, current, last, fragment, and crc.
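The id, current, and last fields suggest a straightforward reassembly scheme, sketched below. It assumes that current is the zero-based position of a fragment and that last is the position of the final fragment; those semantics are an interpretation of the field names, not something stated in the definition. The Fragment and Reassembler classes are hypothetical, and the optional crc field is ignored.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Fragment:
    """Mirror of the NmsgFragment fields used here (crc omitted)."""
    id: int
    current: int
    last: int
    fragment: bytes


class Reassembler:
    """Collect fragments per id and return the joined bytes once all are present."""

    def __init__(self) -> None:
        self._pending: Dict[int, Dict[int, bytes]] = {}

    def add(self, frag: Fragment) -> Optional[bytes]:
        if frag.current > frag.last:
            raise ValueError("fragment index beyond last")
        parts = self._pending.setdefault(frag.id, {})
        parts[frag.current] = frag.fragment
        if len(parts) <= frag.last:
            return None  # still waiting for more fragments
        del self._pending[frag.id]
        return b"".join(parts[i] for i in range(frag.last + 1))


r = Reassembler()
r.add(Fragment(id=7, current=1, last=2, fragment=b"lo "))
r.add(Fragment(id=7, current=0, last=2, fragment=b"hel"))
print(r.add(Fragment(id=7, current=2, last=2, fragment=b"world")))
```

Keying the pending fragments by id lets fragments of different messages interleave, and indexing by current tolerates out-of-order arrival.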
The NmsgPayload messages contain payload data and are defined as follows:

message NmsgPayload
{
    required uint32 vid = 1;
    required uint32 msgtype = 2;
    required int64 time_sec = 3;
    required fixed32 time_nsec = 4;
    optional bytes payload = 5;
    optional uint32 source = 7;
    optional uint32 operator = 8;
    optional uint32 group = 9;
}
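The fields of NmsgPayload are:

vid: the vendor ID, for example base or SIE.
msgtype: the message type within a vid, for example dns, encode, or ipconn. Together, vid and msgtype identify the kind of payload.
time_sec, time_nsec
payload
source
operator (see nmsg.opalias)
group (see nmsg.gralias)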
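A consumer typically wants the two time fields as a single timestamp. The sketch below combines them, assuming time_sec counts seconds since the Unix epoch and time_nsec holds the nanosecond remainder; that reading of the field names is an assumption, and payload_timestamp is a hypothetical helper. Python's datetime only keeps microseconds, so the last three digits are dropped.

```python
from datetime import datetime, timezone


def payload_timestamp(time_sec: int, time_nsec: int) -> datetime:
    """Combine time_sec and time_nsec into a UTC datetime (microsecond precision)."""
    if not 0 <= time_nsec < 1_000_000_000:
        raise ValueError("time_nsec must be in [0, 1e9)")
    base = datetime.fromtimestamp(time_sec, tz=timezone.utc)
    # Truncate nanoseconds to microseconds, the finest unit datetime supports.
    return base.replace(microsecond=time_nsec // 1000)


print(payload_timestamp(0, 1500).isoformat())
```

Code that needs full nanosecond resolution would keep the two integers as they are rather than converting.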
Accompanying nmsg are the vendor base encoding modules. These provide protobuf serialization for a handful of common use cases. Currently included are the following modules:
dns: For encoding DNS RRs, RRsets, and question RRs.
dnsqr: For capturing DNS query/response state. This message type is used by Farsight’s Passive DNS sensors.
email: For describing email message metadata relating to unsolicited email messages (colloquially referred to as “spam”).
encode: For encapsulating data in other generic formats for transport across SIE. Supported are text, JSON, YAML, MsgPack, and XML.
http: For representing hits to HTTP sinkholes.
ipconn: For describing an IP connection, a five-tuple that includes the transport layer protocol.
linkpair: For representing links between web pages.
logline: For representing a single line from a log file (e.g., syslog).
ncap: For representing legacy NCAP data.
packet: For representing an IPv4 or IPv6 packet.
pkt: A legacy encoder for representing packet data, deprecated in favor of packet.
xml: For representing XML data.

Farsight maintains a separate package, sie-nmsg, that contains a group of message module plug-ins specifically designed for Farsight’s SIE. These plug-ins are:
delay: A legacy encoder used to generate a reduction of SIE Channel 202 containing transaction latencies.
dnsdedupe: For encoding de-duplicated and de-duplicated/verified Passive DNS traffic.
newdomain: For encoding Newly Observed Domains (NOD) traffic.
qr: A legacy encoder intended for use with an early version of the DNSDB lookup server.
reputation: For encoding Distributed Reputation Whiteboard data, an experimental service developed by Farsight to facilitate the real-time sharing of reputation data without a priori knowledge of data types.

The next article in the NMSG series will introduce the libnmsg C programming API.
Mike Schiffman is a Protocol Legerdemainist for Farsight Security, Inc.
Read the next part in this series: Farsight’s Network Message, Volume 4: The C Programming API