Farsight TXT Record

Google Protocol Buffer Deserialization The Hard Way

Written by: 
Published on: 
Apr 17, 2015
On This Page
Share:

Abstract

This article is a low-level tutorial explaining why and how to write your ownGoogle Protocol Buffer(protobuf) deserializer using the C programming language. Using the techniquesdiscussed here, you can write your own code to supplant or replace the codegenerated via protobuf-c orNanopb. This tutorial is specificto Farsight Security’s nmsg package. Ifyou write C code that deals with NMSGs, or directly with protobufs, thisarticle is for you.

Before reading this article, it is recommended that you read the FarsightSecurity Blog series onNMSG withspecific attention to Farsight’s Network Message, Volume 3: Headers and Encoding.

Counting NMSG Payloads

Farsight Security has a simple use case in which we need to count the number ofpayloads in on-wire NMSG containers. The environment this needs to be done isour real-timeSIE switch fabric, which, inaggregate, has extremely high bit rates.

Farsight solved this problem by rolling our own protobuf deserializer. We think the concept and overall process to be useful to our peers and customers.As such, we extracted the code from our production tool and wrapped it into astandalone program named

nmsgpcnt-fsi

. Also included for benchmarking aretwo reference implementations,

nmsgpcnt-npb

and

nmsgpcnt-ptc

which arewritten using popular open source C-based protobuf implementations.

All three tools accomplish the same thing. Each parses on-disk NMSG protobufsand counts the number of containers and total number of payloads. They differmainly in two areas, code complexity and speed of execution.

Why In The World Would You Do This?

The reason we wrote this low-level code when it would be much easier to use agenerated API is singular: speed of execution. As you’ll see,

nmsgpcnt-fsi

,is quite a bit faster than the other two benchmark programs. Our protagonist,

nmsgpcnt-fsi

should be faster — it is purpose-built to perform very specificoperations on protobufs. The protobuf code generators are designed to bemultipurpose and support a wide range of operations across many different datatypes. As such, for many operations on protobufs, there are calls to

malloc()

,

memcpy()

, and

free()

.

nmsgpcnt-fsi

has none of these.Because of this, the comparison between

nmsgpcnt-fsi

and the other twois definitely skewed in favor of

nmsgpcnt-fsi

.

nmsgpcnt-npb and nmsgpcnt-ptc

The first reference program ,

nmsgpcnt-npb

, is built using Nanopb, a C-basedprotobuf implementation with a “small code size”. Targeted for embedded systemsand 32-bit microcontrollers, Nanopb touts minimal requirements for RAM andcode space. While there are no

malloc()

/

free()

calls duringdeserialization, some speed has been sacrificed for code size. This will beclearly shown in our tests below.

The second reference program,

nmsgpcnt-ptc

, is built using protobuf-c, afeature-rich pure C protobuf implementation.

The source code complexity for both is relatively low.

Both programs are invoked identically, with a single file argument referring toa binary NMSG file. For benchmarking, we’ll use a large NMSG file containingjust over 2,000,000 payloads as captured using

nmsgtool

on Farsight Security’s Passive DNS feed onSIE Channel 202.

$ nmsgpcnt-npb
nmsgpcnt-npb parse an NMSG file and emit container/payload count
usage: ./nmsgpcnt-npb file.nmsg

$ ./nmsgpcnt-npb 202-2000000.nmsg
containers: 720
payloads: 2000481
$ ./nmsgpcnt-ptc 202-2000000.nmsg
containers: 720
payloads: 2000481

nmsgpcnt-fsi

nmsgpcnt-fsi

is the Farsight Security in-house built protobuf deserializer.It follows the same overall logic as the other two but the code is quite a bitlarger and more complex. This is because

nmsgpcnt-fsi

has no externaldependencies and carries with it exactly and only the code it needs to decodean NMSG protobuf and count each payload.

The workhorse in

nmsgpcnt

is the

decode_pb()

function. It is responsiblefor deserializing and parsing the NMSG container and processing

Nmsg

fields. The function is about as fast as it can be. It iterates over an NMSGcontainer and populates a passed-by-referenced structure. It does no memoryallocation or deallocation, no memory copies, and has just four possiblebranches.

Disclaimer: while

nmsgpcnt-fsi

is faster than its contemporaries, the sourcecode is more than twice the size and more tedious to read, write, andmaintain. Also, the very nature of dealing with code that operates at thebit-level practically invites mistakes and requires a lot of testing. Caveatemptor.

Benchmarking NMSG Payload Counting

Included with the

nmsgpcnt

distribution is a shell script

test.sh

that executes and times all threeprograms, displaying the results. You can see the crosswalk below:

$ ./test.sh 202-2000000.nmsg
Running nmsgpcnt-npb (nanopb) against 202-2000000.nmsg
containers: 720
payloads: 2000481
real 2.31
user 1.84
sys 0.47
Running nmsgpcnt-ptc (protoc-c) against 202-2000000.nmsg
containers: 720
payloads: 2000481
real 1.47
user 0.97
sys 0.49
Running nmsgpcnt-fsi (Farsight Security) against 202-2000000.nmsg
containers: 720
payloads: 2000481
real 0.59
user 0.15
sys 0.44

The

nmsgpcnt-fsi

program is more than twice as fast as the protobuf-cvariant, and almost five times faster than the Nanopb version. In production,this translates into faster processing of NMSG datagrams and lower packet loss.

nmsgpcnt-fsi source

The following is a discussion of the full

nmsgpcnt-fsi

C source codewith in-line annotations.

The full source code, including the code for the other benchmark programs, isavailable for download fromFarsight Security’s blog-code GitHub page here.

Preamble

The first section contains the source code license and the standard C headerfile include progression:

/*
* Copyright (c) 2015 by Farsight Security, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <inttypes.h>
#include <sys/stat.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdbool.h>

#include "./common.h"

Define Protobuf Symbolic Constants

The following symbolic constants are defined for convenience and code clarity.

/* protocol buffer constants */
#define PB_WIRETYPE_MASK 7 /* AND mask to get wire type bits */
#define PB_WT_VI 0 /* wire type: varint */
#define PB_WT_64 1 /* wire type: 64-bit */
#define PB_WT_LD 2 /* wire type: length delimited */
#define PB_WT_32 5 /* wire type: 32-bit */

/* protocol buffer Nmsg field numbers */
#define PB_NMSG_PAYLOAD 1 /* payload */
#define PB_NMSG_PAYLOAD_CRC 2 /* payload CRC */
#define PB_NMSG_SEQSRC 3 /* sequence source */
#define PB_NMSG_SEQID 4 /* sequence ID */

Define Varint Decoder

All of the magic performed below is done using C’s bitwise operators. If you’renot already, you’ll want toget familiar with them and withhow protobufs are encoded. You also might find interestingthis discussion ofvarints and accompanying C++ code from which the following was inspired.

The

CHNM_DCDE_VARINT()

macro decodes an unsigned varint. It has the followingarguments:

  • shifter: Used to shift decoded bits into their correct position.
  • buf: The buffer containing the stream of protobuf bytes.
  • val: An accumulator that will contain the decoded value.
  • idx: The index into buf.

With varint encoding, the lower 7 bits of each byte are used to store thetwo’s complement representation of the number in groups of 7 bits, leastsignificant group first. The most significant bit is used as a sentinel valueto let the decoder know if more bytes are coming.

The macro works by iterating over protobuf bytes in

buf

, decoding each byteand iteratively storing the result in

val

. It strips the most significantbits from each byte and accumulates them in

val

. It does this until

buf[idx] & 0x80

evaluates to false, which means it has decoded the lastbyte in the varint. To see why this is true, recall the following:

  • Most significant bit set: More bytes to come.
  • Most significant bit cleared: No more bytes to come.

Finally,

CHNM_DCDE_VARINT()

is declared as a macro in order to ensure thecode is placed inline and obviate the overhead a function call would require.

/* decode a varint and stick results in val */
#define CHNM_DCDE_VARINT(shifter, buf, val, idx) \
shifter = 0, val = 0; \
do \
{ \
idx++; \
val |= (buf[idx] & 0x7F) << shifter; \
shifter += 7; \
} \
while (buf[idx] & 0x80); \

Define NMSG Protobuf Data type

The following opaque structure is used to hold data about an NMSG container.For the use case in this article,

nmsgpcnt-fsi

is only concerned about thenumber of payloads, but it can be extended by adding additional members to

nmsg_pb_data_t

and adding more decoding logic to

decode_pb()

.

/* holds results of protobuf deserialization */
struct nmsg_pb_data
{
uint32_t n_payloads; /* number of payloads */
};
typedef struct nmsg_pb_data nmsg_pb_data_t;

Define The Decoder

The function that does the decoding is defined here. The workflow is herestraightforward.

  1. An outer loop walks the entire contents of the protobuf.
  2. The current byte is logically ANDed with the PB_WIRETYPE_MASK whichmasks off the protobuf wire type bits. This value is used to branch fromin the switch table.
  3. Depending on the wire type, the appropriate action is taken. Here,nmsgpcnt-fsi only cares about wire type

PB_WT_LD

  1. with a field number of

1

  1. . This heralds the start of an NMSG payload. The payload counter isincremented and the length of payload is decoded (which itself is a varintencoded value). The payload bytes are stepped over and control passes backto the top of the loop.

Sidebar: The switch table makes extending the decoder to understand other NMSGprotobuf fields quite easy. In fact this code is actually part of a larger,more extensible framework used inside Farsight for programs that need to getat other parts of the NMSG protobuf beyond an ordinal payload count.

int
decode_pb(const uint8_t *pb_data, uint32_t len, nmsg_pb_data_t *nmsg_data)
{
int idx;
uint8_t wire_type;
uint64_t i, j;
uint64_t shifter;

memset(nmsg_data, 0, sizeof (*nmsg_data));

i = 0;
idx = 0;
while (i < len)
{
/*
* Each key in the streamed message is a varint with the value
* (field_number << 3) | wire_type.
*
* To decode a protobuf byte, AND off the lower 3 bits
* (GPB_WIRETYPE_MASK) to get the wire type. From there, right
* shift by 3 to get the field number.
*
* For an NMSG protobuf, the field numbers are:
*
* message Nmsg
* {
* repeated NmsgPayload payloads = 1;
* repeated uint32 payload_crcs = 2;
* optional uint32 sequence = 3;
* optional uint64 sequence_id = 4;
* }
*
* And the protobuf wiretypes are:
*
* 0 Varint PB_WT_VI
* 1 64-bit PB_WT_64
* 2 length (payload) PB_WT_LD
* 5 32-bit PB_WT_32
*/
wire_type = pb_data[i] & PB_WIRETYPE_MASK;
switch (wire_type)
{
case PB_WT_VI:
CHNM_DCDE_VARINT(shifter, pb_data, j, i);
i++;
break;
case PB_WT_64:
/* next byte has the 64-bit value; skip both */
i += 9;
break;
case PB_WT_LD:
if ((pb_data[i] >> 3) == PB_NMSG_PAYLOAD)
{
/* field number 1 == payload */
nmsg_data->n_payloads++;
}
CHNM_DCDE_VARINT(shifter, pb_data, j, i);
/* skip over payload bytes */
i += j;
i++;
break;
case PB_WT_32:
/* next byte has the 32-bit value; skip both */
i += 5;
break;
default:
/* unknown wire type */
/* bad NMSG payload, possibly malfunctioning hardware? */
return -1;
}
}
return 1;
}

Main Entry Point

Here is the main entry point into the program. The only notable bit hereis the

load_container()

function (contained in

common.c

). It is responsiblefor the following:

  • Open the NMSG file specified at the command line.
  • Figure out how big it is.
  • Allocate enough memory to hold its contents.
  • Read contents into memory.
  • Return a buffer containing the NMSG container.

A successful call to

load_container()

needs to be balanced with asubsequent corresponding call to

free()

.

int main(int argc, char **argv)
{
ssize_t n;
uint32_t payloads, processed, containers;
uint8_t *buf = NULL;

if (argc != 2)
{
printf("%s parse an NMSG file and emit container/payload count\n",
argv[0]);
printf("usage: %s file.nmsg\n", argv[0]);
return EXIT_FAILURE;
}

buf = load_container(argv[1], &n);
if (buf == NULL)
{
return EXIT_FAILURE;
}

Loop over Containers

Next is the main event loop. An NMSG can have one or more containers (that eachcan have zero or more payloads). The program iterates over conntainers andserially processes each one.

/* loop through file, processing as many containers as we find */
for (containers = payloads = processed = 0; processed < n; )
{
uint32_t c_len;
uint8_t *p = NULL, version, flags;
nmsg_pb_data_t nmsg_data;

p = buf + processed;

Process NMSG Header

Before

nmsgpcnt-fsi

can decode a protobuf, it needs to first do some headerverification. If any of the following checks fail, the program breaks from theevent loop and exits.

  1. Confirm the NMSG magic number in the header.
  2. Confirm the NMSG protocol version (is 2).
  3. Confirm the container does not contain compressed payloads.
  4. Confirm the container does not contain fragmented payloads.
  5. Get the length of the container.

The NMSG header is now processed and

nmsgpcnt-fsi

is ready to decode thecontainer itself.

/* confirm NMSG header */
if (CHECK_MAGIC(p) != 0)
{
fprintf(stderr, "NMSG header verification failed\n");
break;
}

/* step over magic */
p += 4;

/* load version and flags */
LOAD_VERSION_AND_FLAGS(p, version, flags);

/* confirm NMSG protocol version */
if (version != 2U)
{
fprintf(stderr, "NMSG version check failed\n");
break;
}

/* we don't process compressed payloads */
if (flags & NMSG_FLAG_ZLIB)
{
fprintf(stderr, "NMSG container contains compressed payloads\n");
break;
}

/* we don't process fragmented payloads */
if (flags & NMSG_FLAG_FRAGMENT)
{
fprintf(stderr, "NMSG container contains fragmented payloads\n");
break;
}

/* step over version/flags */
p += 2;

/* get container length */
c_len = ntohl(*((uint32_t* )(p)));

/* step over length */
p += 4;

Decode the Protobuf

Next the decoder is run. If no errors are encountered, the container andpayload counters are incremented and control returns to the top of theevent loop.

/* deserialize the protobuf encoded container */
if (decode_pb(p, c_len, &nmsg_data) < 0)
{
fprintf(stderr, "error: can't parse NMSG payload\n");
break;
}
containers++;
processed += c_len + 10;
payloads += nmsg_data.n_payloads;
}

Wrap up and Shutdown

After all of the containers are exhausted the totals are emitted to

stdout

.Next, the memory held by

buf

is released and the program exits.

printf("containers:\t%d\n", containers);
printf("payloads:\t%d\n", payloads);

if (buf)
{
free(buf);
}
return 0;
}

Mike Schiffman is a Packet Esotericist for Farsight Security, Inc.