NMSG and JSON encoding

Abstract

This article introduces njt (an amalgamation of NMSG+JSON+tool), a new convenience tool for working with base:encode(JSON) NMSGs at the command line. It gives the user a previously unavailable command-line interface to serialize arbitrary JSON as NMSG base:encode(JSON) protocol data units (PDUs) or to deserialize base:encode(JSON) NMSG PDUs to JSON.

To get the most from this article, it is recommended that you be comfortable with the material from the following Farsight Security Blog articles:

Introduction

Farsight Security’s nmsgtool is the de facto tool for sending and receiving NMSGs. To encode and decode different message types, it relies on a system of run-time loadable modules, which provide an extensible framework of protobuf encoders and decoders. While nmsgtool does have a base module for encoding and decoding base:encode messages, it does not decode the “last mile”: the data that was wrapped inside the message in the first place.

In fact, the base:encode module was provided as a stopgap for users who wish to encode new data, but don’t want to deal with the heavy lifting of writing a new protobuf module.

Until now, if you wanted to get at the data inside a base:encode NMSG, you had to write your own program to decode it using the C API or the Python API. Choosing JSON, which we felt was the most popular and most widely used format, we wrote njt to give our users a single, simple interface for encoding JSON to NMSG and decoding NMSG data into JSON.

Use Cases

njt was developed as a convenience for nmsgtool users as well as Farsight DNSDB, SIE, and SRA customers. Some of the more common use cases that are readily available with njt include:

Encode a pre-existing JSON file into an on-disk compressed NMSG file for convenient storage / subsequent transmission
Decode on-disk NMSGs containing JSON-encoded data
Decode live JSON-encoded SIE Channel data
Encode DNSDB JSON-encoded query responses as NMSG and stream to a remote nmsgtool endpoint

Each will be detailed below.

JSON Record Delineation

There are several ways to separate multiple JSON objects in a stream. Some of the more common methods are described below (thanks to Robert Edmonds for taking the time to bikeshed this with me).

A top-level JSON array that contains each object.
“newline-delimited JSON” (also known as “JSON Lines”). This format uses \n delimiters between objects and bans the use of literal \n characters in JSON objects. Newline-delimited JSON is a convenient and natural format for storing structured data that may be processed one record at a time. This is what the DNSDB API uses, as do certain SIE channels.
“newline delimited” JSON, with \r\n delimiters between objects, allowing literal \n characters in JSON objects. This is what the Twitter streaming API uses.
The “json-seq” MIME type, which uses ASCII “record separator” 0x1E and line feed characters to separate JSON objects. This is standardized in RFC 7464.
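
To make the differences concrete, the following Python sketch frames the same two records in each of the four styles listed above. The sample records and the helper name parse_json_seq are illustrative assumptions; nothing here is taken from njt itself.

```python
import json

# Two hypothetical records used only for illustration.
records = [{"rrtype": "MX"}, {"rrtype": "NS"}]

# 1. A single top-level JSON array holding every object.
as_array = json.dumps(records)

# 2. Newline-delimited JSON: one object per line, terminated by \n.
as_json_lines = "".join(json.dumps(r) + "\n" for r in records)

# 3. \r\n-delimited JSON: objects separated by carriage return + line feed.
as_crlf = "".join(json.dumps(r) + "\r\n" for r in records)

# 4. json-seq (RFC 7464): each object is preceded by the ASCII record
#    separator (0x1E) and followed by a line feed.
as_json_seq = "".join("\x1e" + json.dumps(r) + "\n" for r in records)


def parse_json_seq(text):
    """Split on the record separator and decode each non-empty chunk."""
    return [json.loads(chunk) for chunk in text.split("\x1e") if chunk.strip()]
```

All four strings carry the same data; only the framing differs, which is why a reader has to know in advance which convention a stream follows.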

Currently, the only type of JSON njt expects and emits is newline-delimited JSON. Each record must be terminated with a literal \n, and no \n characters may appear inside the JSON. If there is demand for other forms of record separation, Farsight Security will add support in future releases.
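
As a rough illustration of what this constraint means in practice, here is a minimal Python sketch of a newline-delimited JSON writer and reader. The function names read_ndjson and write_ndjson and the sample data are assumptions made for this example; this is not njt's own code.

```python
import io
import json
from typing import Iterator, TextIO


def read_ndjson(stream: TextIO) -> Iterator[dict]:
    """Yield one decoded JSON value per non-empty line of the stream."""
    for line in stream:
        line = line.rstrip("\n")
        if line:
            yield json.loads(line)


def write_ndjson(records, stream: TextIO) -> None:
    """Write each record as single-line JSON terminated with a newline."""
    for record in records:
        # json.dumps escapes a newline inside a string as backslash + n,
        # so a serialized record never contains a literal newline.
        stream.write(json.dumps(record) + "\n")


# Hypothetical sample; the second record has a newline inside a string value.
sample = [{"rrname": "example.com.", "rrtype": "MX"}, {"note": "line one\nline two"}]
buffer = io.StringIO()
write_ndjson(sample, buffer)
text = buffer.getvalue()
round_tripped = list(read_ndjson(io.StringIO(text)))
```

Because Python's json module escapes embedded newlines, the two records occupy exactly two lines and survive the round trip unchanged.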

njt Download

njt is an open-source Python tool available for download from Farsight Security’s GitHub repository.

Usage

Invoked with --help, njt dumps the following usage message:

    $ ./scripts/njt --help
    usage: njt [-h] (-e | -d) [-c COUNT] [-w OUT_FILE] [-p] [-z]
               [--setsource SETSOURCE] [-V] [-v] [--setoperator SETOPERATOR]
               [--setgroup SETGROUP]
               [in_file]

    Serialize JSON as base:encode(JSON) NMSG PDUs or deserialize base:encode(JSON)
    NMSG PDUs to JSON

    positional arguments:
      in_file               input file, also accepts input from pipeline

    optional arguments:
      -h, --help            show this help message and exit
      -e, --encode          encode JSON --> NMSG
      -d, --decode          decode NMSG --> JSON
      -c COUNT, --count COUNT
                            stop after count payloads
      -w OUT_FILE, --out_file OUT_FILE
                            write NMSG data to file
      -p, --prettyprint     sort and pretty print JSON output
      -z, --zlibout         compress NMSG output
      --setsource SETSOURCE
                            set payload source
      -V, --verbose         print debugging information
      -v, --version         show program's version number and exit
      --setoperator SETOPERATOR
                            set payload operator
      --setgroup SETGROUP   set payload group

njt must be invoked in one of two modes: -e to encode JSON to NMSG, or -d to decode NMSG to JSON. In keeping with the standard everything-is-a-file paradigm, it accepts input in any of the following ways:

Positional file argument at command line: $ njt -e test.jsonl
Redirect input from a file: $ njt -d < test.nmsg
Output of a pipeline: $ cat test.jsonl | njt -e
Additionally, the encoder can read ASCII text typed at the command line (^M denotes Enter, ^D end of input): $ njt -e^M {"count": 1}^M^D
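
For readers curious how a tool can offer this kind of interface, the sketch below reproduces just the mode switches and the optional input file with Python's argparse. It is modeled on the usage message above, not on njt's source; the program name njt-sketch and the helper open_input are hypothetical.

```python
import argparse
import sys


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a required -e/-d choice and an optional input file."""
    parser = argparse.ArgumentParser(prog="njt-sketch")
    # Exactly one of -e or -d must be given, as in "(-e | -d)" in the usage line.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-e", "--encode", action="store_true", help="encode JSON --> NMSG")
    mode.add_argument("-d", "--decode", action="store_true", help="decode NMSG --> JSON")
    # nargs="?" makes the positional optional, so input can come from a pipe.
    parser.add_argument("in_file", nargs="?", help="input file; defaults to standard input")
    return parser


def open_input(path):
    """Return a binary stream for the named file, or stdin when no file is given."""
    if path is None:
        return sys.stdin.buffer
    return open(path, "rb")


# Equivalent to running: njt-sketch -e test.jsonl
args = build_parser().parse_args(["-e", "test.jsonl"])
```

Falling back to standard input when no file is named is what lets the same program serve the positional, redirected, piped, and typed-in cases with one code path.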

Encode JSON file

The simplest and probably most common use case of njt is to encode a previously created JSON file into NMSG. Using the bundled 16-record test file test.jsonl, the invocation is simply:

    $ njt -e test.jsonl

This creates the following NMSG file:

    $ ls -l njt.out*
    -rw-r--r-- 1 mschiffm mschiffm 4008 May  2 01:14 njt.out.1430529251.66.nmsg

We can verify the integrity of the NMSG file and count the number of payloads:

    $ nmsgpcnt-test njt.out.1430529251.66.nmsg
    containers: 1
    payloads:   16

In addition to accepting input from a pipeline, njt also supports several nmsgtool-derived options to modify the output like setting the source ID and operator, choosing a file name and compressing the output:

    $ cat test.jsonl | njt -e -V --setsource 0xdeadbeef --setoperator FSI -w test.nmsg
    wrote 168 byte payload
    wrote 233 byte payload
    wrote 196 byte payload
    wrote 222 byte payload
    wrote 164 byte payload
    wrote 163 byte payload
    wrote 223 byte payload
    wrote 185 byte payload
    wrote 214 byte payload
    wrote 231 byte payload
    wrote 222 byte payload
    wrote 219 byte payload
    wrote 220 byte payload
    wrote 220 byte payload
    wrote 219 byte payload
    wrote 220 byte payload
    Finished, wrote 3319 bytes in 16 payloads to test.nmsg
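
As a quick sanity check, the per-payload sizes in the verbose output should add up to the total on the final line. The Python sketch below parses the log text shown above and compares the two; the variable names are arbitrary.

```python
import re

# The verbose output from the run above, copied verbatim.
verbose_log = """\
wrote 168 byte payload
wrote 233 byte payload
wrote 196 byte payload
wrote 222 byte payload
wrote 164 byte payload
wrote 163 byte payload
wrote 223 byte payload
wrote 185 byte payload
wrote 214 byte payload
wrote 231 byte payload
wrote 222 byte payload
wrote 219 byte payload
wrote 220 byte payload
wrote 220 byte payload
wrote 219 byte payload
wrote 220 byte payload
Finished, wrote 3319 bytes in 16 payloads to test.nmsg
"""

# Collect the size reported on each "wrote N byte payload" line.
sizes = [int(n) for n in re.findall(r"^wrote (\d+) byte payload$", verbose_log, re.MULTILINE)]

# Pull the byte and payload totals from the summary line.
summary = re.search(r"wrote (\d+) bytes in (\d+) payloads", verbose_log)
total_bytes, total_payloads = int(summary.group(1)), int(summary.group(2))

print(sum(sizes) == total_bytes, len(sizes) == total_payloads)  # prints: True True
```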

With nmsgtool, we can verify one of the NMSGs is what we expect:

    $ nmsgtool -r test.nmsg -c 1
    [173] [2015-05-05 18:57:36.232985973] [1:11 base encode] [deadbeef] [FSI] [] 
    type: JSON
    payload:

Decode NMSGs

Using the NMSG file created above, we can easily decode and pretty-print a single JSON record:

    $ njt -d test.nmsg -p -c 1
    {
        "bailiwick": "example.com.",
        "count": 2,
        "rdata": [
            "10 foo.example.ru."
        ],
        "rrname": "example.com.",
        "rrtype": "MX",
        "time_first": 1372708329,
        "time_last": 1372708329
    }
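
The help text describes -p as "sort and pretty print JSON output". One plausible way to get output shaped like the record above is Python's json.dumps with sort_keys and indent; whether njt does exactly this is an assumption. The compact input line and its key order are made up for the example.

```python
import json

# A hypothetical compact record, with keys deliberately out of order.
compact = (
    '{"count": 2, "time_first": 1372708329, "time_last": 1372708329, '
    '"rrtype": "MX", "rrname": "example.com.", "bailiwick": "example.com.", '
    '"rdata": ["10 foo.example.ru."]}'
)


def pretty(line: str) -> str:
    """Re-serialize one JSON record with sorted keys and 4-space indentation."""
    return json.dumps(json.loads(line), sort_keys=True, indent=4)


formatted = pretty(compact)
print(formatted)
```

Sorting the keys makes records easier to compare by eye, since the same field always appears at the same position.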

Additionally, njt supports pipelining directly into jq, a powerful command-line tool offering rich functionality for processing JSON data. Using it with njt, we can slice and filter output to our liking. For example, to emit just the rrtypes, we can issue the following command:

    $ njt -d test.nmsg | jq ".rrtype"
    "MX"
    "NS"
    "NS"
    "NS"
    "A"
    "A"
    "NS"
    "NS"
    "NS"
    "NS"
    "NS"
    "SOA"
    "SOA"
    "SOA"
    "SOA"
    "SOA"

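If jq is not available, the same kind of field extraction is easy to sketch in Python. The example below pulls one field out of each newline-delimited record and tallies the values; the helper extract_field is hypothetical, and the input lines are rebuilt from the rrtype values printed above rather than read from test.nmsg.

```python
import json
from collections import Counter


def extract_field(lines, field):
    """Yield the value of one field from each non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line).get(field)


# The 16 rrtype values from the jq output above, in order.
rrtypes_seen = [
    "MX", "NS", "NS", "NS", "A", "A", "NS", "NS",
    "NS", "NS", "NS", "SOA", "SOA", "SOA", "SOA", "SOA",
]
# Stand-in for the decoder's output: one JSON object per line.
lines = [json.dumps({"rrtype": t}) for t in rrtypes_seen]

tally = Counter(extract_field(lines, "rrtype"))
```

For this data the tally comes to 8 NS, 5 SOA, 2 A, and 1 MX record, matching the 16 payloads counted earlier.
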
Decode live SIE Channel data

Another useful option njt offers is the ability to decode live data from Farsight Security’s SIE. Any JSON-encoded feed can be decoded and emitted. For example, if you are a base-channel package subscriber, you can decode Channel 42 (anonymized IDS and firewall logs from ThreatSTOP):

    $ nmsgtool -C ch42 -c 1 -w - | njt -d -p
    {
        "Alert": {
            "AdditionalData": [
                {
                    "content": "1",
                    "meaning": "direction"
                },
                {
                    "content": "1",
                    "meaning": "anon"
                },
                {
                    "content": "1",
                    "meaning": "version"
                },
                {
                    "content": "apr 20 20:34:14
                ...
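
Records like this one nest their interesting values fairly deeply. As a small illustration, the Python sketch below flattens the AdditionalData list into a meaning-to-content dictionary. It uses only the three entries visible in the truncated output above, and the helper name additional_data_to_dict is made up for this example.

```python
def additional_data_to_dict(record):
    """Map each AdditionalData entry's "meaning" to its "content"."""
    items = record.get("Alert", {}).get("AdditionalData", [])
    return {item["meaning"]: item["content"] for item in items}


# Only the fields shown in the (truncated) output above.
partial_record = {
    "Alert": {
        "AdditionalData": [
            {"content": "1", "meaning": "direction"},
            {"content": "1", "meaning": "anon"},
            {"content": "1", "meaning": "version"},
        ]
    }
}

flat = additional_data_to_dict(partial_record)
```

The .get() calls with defaults mean a record lacking an Alert or AdditionalData key yields an empty dictionary instead of raising an error.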

Encode DNSDB queries and Stream to a Remote Endpoint

Busier pipelines are also possible with njt. Using Farsight Security’s Python DNSDB query tool dnsdb_query.py together with nmsgtool, DNSDB API customers can package up DNSDB query responses as NMSG and stream them to a remote endpoint.

First, set up an nmsgtool listener on an unused port (for this simulation we’ll use loopback but in practice any unfiltered IP address will work):

    $ nmsgtool -l 127.0.0.1/9430

Next, issue a DNSDB query, encode the returned JSON as NMSG and use nmsgtool to write the payloads to the network:

    $ dnsdb_query.py -r example.com -j | njt -e | nmsgtool -r - -s 127.0.0.1/9430

And nmsgtool emits:

    [244] [2015-04-26 21:29:04.044215917] [1:11 base encode] [00000000] [] []
    type: JSON
    payload: 
    ...

As above, decoding and filtering are available. To do this, replace the original nmsgtool listener invocation with something like:

    $ nmsgtool --unbuffered -l 127.0.0.1/9430 -w - | njt -d | jq ".rrtype"

Mike Schiffman is a Packet Esotericist for Farsight Security, Inc.
