NMSG and JSON encoding
Abstract
This article introduces njt (an amalgamation of NMSG+JSON+tool), a new convenience tool for working with base:encode (JSON) NMSGs at the command line. With this tool, the user has a previously unavailable command-line interface to serialize arbitrary JSON as NMSG base:encode (JSON) protocol data units (PDUs) or to deserialize base:encode (JSON) NMSG PDUs to JSON.
To get the most from this article, it is recommended that you be comfortable with the material from the following Farsight Security Blog articles:
- Farsight’s Network Message, Volume 1: Introduction to NMSG
- Farsight’s Network Message, Volume 2: Introduction to nmsgtool
- Farsight’s Network Message, Volume 3: Headers and Encoding
Introduction
Farsight Security’s nmsgtool is the de facto tool for sending and receiving NMSGs. To encode and decode different message types, it employs a system of run-time loadable modules. These modules provide an extensible framework of protobuf encoders and decoders. While nmsgtool does have a base module for encoding and decoding base:encode messages, it does not include support for decoding the “last mile” of whatever data was encoded in the first place.
In fact, the base:encode module was provided as a stopgap for users who wish to encode new data but don’t want to deal with the heavy lifting of writing a new protobuf module.
Until now, if you wanted to get at the data inside a base:encode NMSG, you had to write your own program to decode it using the C API or the Python API. Choosing JSON, which we felt was the most popular and most widely used format, we wrote njt to provide our users with a simple, single interface for encoding JSON to NMSG and decoding NMSG data into JSON.
Use Cases
njt was developed as a convenience for nmsgtool users as well as Farsight DNSDB, SIE, and SRA customers. Some of the more common use cases that are readily available with njt include:
- Encode a pre-existing JSON file into an on-disk compressed NMSG file for convenient storage / subsequent transmission
- Decode on-disk NMSGs containing JSON-encoded data
- Decode live JSON-encoded SIE Channel data
- Encode DNSDB JSON-encoded query responses as NMSG and stream to a remote nmsgtool endpoint
Each will be detailed below.
JSON Record Delineation
There are several ways to separate multiple JSON objects in a stream. Some of the more common methods are described below (thanks to Robert Edmonds for taking the time to bikeshed this with me):
- A top-level JSON array that contains each object.
- “Newline-delimited JSON” (also known as “JSON Lines”). This format uses \n delimiters between objects and bans the use of literal \n characters in JSON objects. Newline-delimited JSON is a convenient and natural format for storing structured data that may be processed one record at a time. This is what the DNSDB API uses, as do certain SIE channels.
- “Newline delimited” JSON with \r\n delimiters between objects, allowing literal \n characters in JSON objects. This is what the Twitter streaming API uses.
- The “json-seq” MIME type, which uses the ASCII “record separator” character (0x1E) and line feed characters to separate JSON objects. This is standardized in RFC 7464.
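To make the differences between these framings concrete, here is a minimal Python sketch that parses the same two records under each of the four schemes. It uses only the standard json module; the function names are made up for this illustration and are not part of njt.

```python
import json


def parse_array(text):
    """A single top-level JSON array holding every object."""
    return json.loads(text)


def parse_json_lines(text):
    """Newline-delimited JSON: one object per line, no literal newline inside."""
    # Split on "\n" explicitly; str.splitlines() would also break on other
    # separator characters (including 0x1E).
    return [json.loads(line) for line in text.split("\n") if line.strip()]


def parse_crlf_delimited(text):
    """CRLF-delimited JSON: a bare newline may appear inside an object."""
    return [json.loads(chunk) for chunk in text.split("\r\n") if chunk.strip()]


def parse_json_seq(text):
    """RFC 7464 json-seq: record separator (0x1E) + JSON text + line feed."""
    return [json.loads(chunk) for chunk in text.split("\x1e") if chunk.strip()]


records = [{"count": 1}, {"count": 2}]

as_array = json.dumps(records)
as_lines = "".join(json.dumps(r) + "\n" for r in records)
# indent=1 puts bare newlines inside each object, which this framing allows.
as_crlf = "".join(json.dumps(r, indent=1) + "\r\n" for r in records)
as_seq = "".join("\x1e" + json.dumps(r) + "\n" for r in records)

for parse, text in [
    (parse_array, as_array),
    (parse_json_lines, as_lines),
    (parse_crlf_delimited, as_crlf),
    (parse_json_seq, as_seq),
]:
    print(parse.__name__, parse(text) == records)
```

Only the second framing can be processed with a plain line-by-line read, which is part of why it suits one-record-at-a-time tools.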
Currently, the only type of JSON njt expects and emits is newline-delimited JSON. Each record is expected to be terminated with a literal \n, and no \n characters can appear inside the JSON. If there is demand to process other forms of record separation, Farsight Security will add support in future releases.
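If you generate the input yourself, it helps to know that Python's json.dumps, called without an indent argument, always produces a single line: newlines inside string values are escaped rather than emitted literally. The sketch below relies on that behavior to build newline-delimited records; the helper name to_json_line is hypothetical.

```python
import json


def to_json_line(obj):
    """Serialize one object as a single newline-terminated record."""
    # Without indent=, json.dumps never emits a literal newline; a newline
    # inside a string value comes out as the two characters backslash + n.
    return json.dumps(obj, sort_keys=True) + "\n"


record = {"rrname": "example.com.", "note": "line one\nline two"}
line = to_json_line(record)

print(repr(line))
print(json.loads(line) == record)

# Pretty-printed JSON, by contrast, spans several lines and would not be
# valid newline-delimited input.
pretty = json.dumps(record, indent=4)
print(pretty.count("\n") > 0)
```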
njt Download
njt is an open-source Python tool available for download from Farsight Security’s GitHub repository.
Usage
Invoked with --help, njt dumps the following usage message:
$ ./scripts/njt --help
usage: njt [-h] (-e | -d) [-c COUNT] [-w OUT_FILE] [-p] [-z]
           [--setsource SETSOURCE] [-V] [-v] [--setoperator SETOPERATOR]
           [--setgroup SETGROUP]
           [in_file]

Serialize JSON as base:encode(JSON) NMSG PDUs or deserialize
base:encode(JSON) NMSG PDUs to JSON

positional arguments:
  in_file               input file, also accepts input from pipeline

optional arguments:
  -h, --help            show this help message and exit
  -e, --encode          encode JSON --> NMSG
  -d, --decode          decode NMSG --> JSON
  -c COUNT, --count COUNT
                        stop after count payloads
  -w OUT_FILE, --out_file OUT_FILE
                        write NMSG data to file
  -p, --prettyprint     sort and pretty print JSON output
  -z, --zlibout         compress NMSG output
  --setsource SETSOURCE
                        set payload source
  -V, --verbose         print debugging information
  -v, --version         show program's version number and exit
  --setoperator SETOPERATOR
                        set payload operator
  --setgroup SETGROUP   set payload group
njt must be invoked in one of two modes: either -e to encode JSON to NMSG or -d to decode NMSG to JSON. Following the everything-is-a-file paradigm, it accepts input in all of the standard ways:
- Positional file argument at command line:
$ njt -e test.jsonl
- Redirect input from a file:
$ njt -d < test.nmsg
- Output of a pipeline:
$ cat test.jsonl | njt -e
- Additionally, the encoder can read text typed directly at the terminal (press Enter after each line and Ctrl-D to end input):
$ njt -e
{"count": 1}
^D
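The two-mode interface and the optional input file map naturally onto Python's argparse. The following is a minimal sketch of that pattern, not njt's actual source: the program name njt-sketch and the open_input helper are invented here, and only the -e, -d, and in_file arguments from the usage message are modeled.

```python
import argparse
import sys


def build_parser():
    parser = argparse.ArgumentParser(prog="njt-sketch")
    # Exactly one of -e / -d must be given, mirroring "(-e | -d)" in the usage.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-e", "--encode", action="store_true",
                      help="encode JSON --> NMSG")
    mode.add_argument("-d", "--decode", action="store_true",
                      help="decode NMSG --> JSON")
    # Optional positional: when absent, input comes from stdin, which covers
    # redirection, pipelines, and text typed at the terminal.
    parser.add_argument("in_file", nargs="?", default=None,
                        help="input file, also accepts input from pipeline")
    return parser


def open_input(path):
    """Return a binary stream for the named file, or stdin when no file is given."""
    return sys.stdin.buffer if path is None else open(path, "rb")


args = build_parser().parse_args(["-e", "test.jsonl"])
print(args.encode, args.decode, args.in_file)
```

Treating a missing file argument as "read standard input" is what lets a single code path serve all four input styles listed above.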
Encode JSON file
The simplest and probably most common use case of njt is to encode a (previously created) JSON file into NMSG. Using the bundled 16-record test file test.jsonl, the invocation is as simple as:
$ njt -e test.jsonl
This creates the following NMSG file:
$ ls -l njt.out*
-rw-r--r-- 1 mschiffm mschiffm 4008 May 2 01:14 njt.out.1430529251.66.nmsg
We can verify the integrity of the NMSG file and count the number of payloads:
$ nmsgpcnt-test njt.out.1430529251.66.nmsg
containers: 1 payloads: 16
In addition to accepting input from a pipeline, njt also supports several nmsgtool-derived options to modify the output, such as setting the source ID and operator, choosing a file name, and compressing the output:
$ cat test.jsonl | njt -e -V --setsource 0xdeadbeef --setoperator FSI -w test.nmsg
wrote 168 byte payload
wrote 233 byte payload
wrote 196 byte payload
wrote 222 byte payload
wrote 164 byte payload
wrote 163 byte payload
wrote 223 byte payload
wrote 185 byte payload
wrote 214 byte payload
wrote 231 byte payload
wrote 222 byte payload
wrote 219 byte payload
wrote 220 byte payload
wrote 220 byte payload
wrote 219 byte payload
wrote 220 byte payload
Finished, wrote 3319 bytes in 16 payloads to test.nmsg
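As a quick sanity check, the per-payload sizes in the verbose output should add up to the total on the final line. This small Python sketch does that arithmetic on the output shown above, pasted in as a string; it is only an illustration and does not call njt itself.

```python
import re

# Verbose output copied from the njt -e -V run shown in this article.
verbose_output = """\
wrote 168 byte payload
wrote 233 byte payload
wrote 196 byte payload
wrote 222 byte payload
wrote 164 byte payload
wrote 163 byte payload
wrote 223 byte payload
wrote 185 byte payload
wrote 214 byte payload
wrote 231 byte payload
wrote 222 byte payload
wrote 219 byte payload
wrote 220 byte payload
wrote 220 byte payload
wrote 219 byte payload
wrote 220 byte payload
Finished, wrote 3319 bytes in 16 payloads to test.nmsg
"""

# Per-payload lines say "byte payload"; the summary says "bytes in N payloads",
# so the two patterns never match the same line.
payload_sizes = [int(n) for n in re.findall(r"wrote (\d+) byte payload", verbose_output)]
summary = re.search(r"wrote (\d+) bytes in (\d+) payloads", verbose_output)
total_bytes, total_payloads = int(summary.group(1)), int(summary.group(2))

print(sum(payload_sizes) == total_bytes)
print(len(payload_sizes) == total_payloads)
```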
With nmsgtool, we can verify that one of the NMSGs is what we expect:
$ nmsgtool -r test.nmsg -c 1
[173] [2015-05-05 18:57:36.232985973] [1:11 base encode] [deadbeef] [FSI] []
type: JSON
payload:
Decode NMSGs
Using the NMSG file created above, we can easily decode and pretty-print a single JSON record:
$ njt -d test.nmsg -p -c 1
{
    "bailiwick": "example.com.",
    "count": 2,
    "rdata": [
        "10 foo.example.ru."
    ],
    "rrname": "example.com.",
    "rrtype": "MX",
    "time_first": 1372708329,
    "time_last": 1372708329
}
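For readers post-processing decoded records in their own scripts, the "sort and pretty print" behavior of -p can be approximated with the standard json module. This is a sketch under assumptions: the four-space indent is a guess made for illustration, not something taken from njt's source, and the helper name pretty is hypothetical.

```python
import json


def pretty(record_line):
    """Re-serialize one JSON record with sorted keys and indentation."""
    return json.dumps(json.loads(record_line), sort_keys=True, indent=4)


# One record as it might arrive on a single line, keys in arbitrary order.
line = ('{"rrtype": "MX", "rrname": "example.com.", "count": 2, '
        '"rdata": ["10 foo.example.ru."], "bailiwick": "example.com.", '
        '"time_first": 1372708329, "time_last": 1372708329}')

print(pretty(line))
```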
Additionally, njt supports pipelining directly into jq, a powerful command-line tool offering rich functionality for processing JSON data. Using it with njt, we can slice and filter output to our liking. For example, to emit just the rrtypes, we can issue the following command:
$ njt -d test.nmsg | jq ".rrtype"
"MX"
"NS"
"NS"
"NS"
"A"
"A"
"NS"
"NS"
"NS"
"NS"
"NS"
"SOA"
"SOA"
"SOA"
"SOA"
"SOA"
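Because the decoder emits newline-delimited JSON, the same kind of filtering is easy to do in Python when jq is not at hand. The sketch below tallies rrtype values; the sample data is a stand-in that reproduces only the rrtype sequence from the jq listing above, and count_rrtypes is a hypothetical helper name.

```python
import json
from collections import Counter


def count_rrtypes(lines):
    """Tally the rrtype field across newline-delimited JSON records."""
    counts = Counter()
    for line in lines:
        if line.strip():
            counts[json.loads(line).get("rrtype")] += 1
    return counts


# Stand-in for decoder output: only the rrtype field is kept, in the order
# shown in the jq listing.
rrtypes = ["MX"] + ["NS"] * 3 + ["A"] * 2 + ["NS"] * 5 + ["SOA"] * 5
sample_lines = [json.dumps({"rrtype": t}) + "\n" for t in rrtypes]

for rrtype, n in count_rrtypes(sample_lines).most_common():
    print(rrtype, n)
```

In a real pipeline the function can be handed sys.stdin in place of sample_lines, with the script placed downstream of njt -d.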
Decode live SIE Channel data
Another useful option njt offers is the ability to decode live data from Farsight Security’s SIE. Any JSON-encoded feed can be decoded and emitted. For example, if you are a base-channel package subscriber, you can decode Channel 42 (anonymized IDS and firewall logs from ThreatSTOP):
$ nmsgtool -C ch42 -c 1 -w - | njt -d -p
{
    "Alert": {
        "AdditionalData": [
            {
                "content": "1",
                "meaning": "direction"
            },
            {
                "content": "1",
                "meaning": "anon"
            },
            {
                "content": "1",
                "meaning": "version"
            },
            {
                "content": "apr 20 20:34:14 ...
Encode DNSDB queries and Stream to a Remote Endpoint
Busier pipelines are also possible with njt. Using Farsight Security’s Python DNSDB query tool dnsdb_query.py and nmsgtool, DNSDB API customers can package up DNSDB query responses as NMSG and stream them to a remote endpoint.
First, set up an nmsgtool listener on an unused port (for this simulation we’ll use loopback, but in practice any unfiltered IP address will work):
$ nmsgtool -l 127.0.0.1/9430
Next, issue a DNSDB query, encode the returned JSON as NMSG, and use nmsgtool to write the payloads to the network:
$ dnsdb_query.py -r example.com -j | njt -e | nmsgtool -r - -s 127.0.0.1/9430
And nmsgtool emits:
[244] [2015-04-26 21:29:04.044215917] [1:11 base encode] [00000000] [] []
type: JSON
payload: ...
As above, decoding and filtering are available. To do this, replace the original nmsgtool listener invocation with something like:
$ nmsgtool --unbuffered -l 127.0.0.1/9430 -w - | njt -d | jq ".rrtype"
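Shell pipelines like the ones in this section can also be driven from a Python script, which is handy when the query or endpoint is computed at run time. The following is a generic sketch using the standard subprocess module; run_pipeline is a hypothetical helper, and the runnable demo chains two small Python commands so that it works without njt or nmsgtool installed.

```python
import subprocess
import sys


def run_pipeline(commands):
    """Run commands as a pipeline; return (final stdout bytes, exit codes)."""
    procs = []
    upstream = None
    for argv in commands:
        proc = subprocess.Popen(argv, stdin=upstream, stdout=subprocess.PIPE)
        if upstream is not None:
            # Close our copy so the earlier stage sees a broken pipe if this
            # stage exits early.
            upstream.close()
        upstream = proc.stdout
        procs.append(proc)
    output, _ = procs[-1].communicate()
    for proc in procs[:-1]:
        proc.wait()
    return output, [p.returncode for p in procs]


# Demo with stand-in stages: a producer that prints one JSON record and a
# consumer that upper-cases whatever it reads.
producer = [sys.executable, "-c", "print('{\"count\": 1}')"]
consumer = [sys.executable, "-c",
            "import sys; sys.stdout.write(sys.stdin.read().upper())"]
output, codes = run_pipeline([producer, consumer])
print(output.decode().strip(), codes)

# With the tools from this article installed, the stages would instead be the
# argument lists of the commands shown above, for example:
#   ["dnsdb_query.py", "-r", "example.com", "-j"]
#   ["njt", "-e"]
#   ["nmsgtool", "-r", "-", "-s", "127.0.0.1/9430"]
```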
Mike Schiffman is a Packet Esotericist for Farsight Security, Inc.