RECORD TYPE=NULL Records In DNSDB Mtbl Files

1. Introduction

Whenever you analyze a sample of DNS traffic, it’s routine to see a variety of different DNS record types — A, AAAA, CNAME, MX, NS, SOA, and TXT records, and even a variety of DNSSEC-related record types, too. If you pull a sample of DNS data and see some assortment of those record types, you’d find that totally unsurprising.

However, while working with some mtbl files [1] (used by DNSDB to store its data), I was a little surprised to notice a substantial number of record type=NULL resource records. This is a less common occurrence, and in fact, many folks may never have bumped into record type=NULL resource records before.

For those who may not be familiar with them, the very name of the NULL record type can be a bit confusing. [2] We’re referring to records that have their record type explicitly set to NULL (we’re NOT referring to malformed DNS records where the record type has somehow become “missing” or “undefined”). The NULL resource record type is defined in Section 3.3.10 of RFC 1035, written by Paul Mockapetris in November 1987 (Mockapetris is now a member of the Farsight Board of Directors).

That RFC states:

Anything at all may be in the RDATA field so long as it is 65535 octets or less.

NULL records cause no additional section processing. NULL RRs are not allowed in master files. NULLs are used as placeholders in some experimental extensions of the DNS.

We note that the NULL record type is also still listed in IANA’s list of DNS parameters — and as such, it remains a valid (if experimental) record type, notwithstanding Wikipedia’s claim that it is “obsolete.” [3]
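Since the type remains valid, you can also ask for NULL records explicitly. As a quick illustration (a minimal sketch; example.com is just a placeholder, and most names will return an empty answer because they publish no NULL records):

$ dig +noall +answer example.com NULL
$ # no output: example.com publishes no type=NULL records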

Now it is true that many experimental protocols get defined in one RFC or another, and once those experiments end, they are seldom seen “in the wild” thereafter.

That’s NOT the case for record type=NULL records. Let’s get down to some data-driven specifics.

2. Two Weeks of DNS Data

The mtbl data I was looking at consisted of two weeks of daily mtbl files, from 2017-01-10 through 2017-01-23.

Checking the size of those files, they looked like:

$ ls -l dns*.mtbl | awk '{print $9 " " $5}'
dns.20170110.D.mtbl 22414575496
dns.20170111.D.mtbl 20909343839
dns.20170112.D.mtbl 20234421949
dns.20170113.D.mtbl 19939262792
dns.20170114.D.mtbl 20408136124
dns.20170115.D.mtbl 19399639051
dns.20170116.D.mtbl 21220878109
dns.20170117.D.mtbl 19269509674
dns.20170118.D.mtbl 19380684532
dns.20170119.D.mtbl 21306888712
dns.20170120.D.mtbl 21658526457
dns.20170121.D.mtbl 20516367749
dns.20170122.D.mtbl 20581310329
dns.20170123.D.mtbl 23071888224

$ ls -l dns*.mtbl | awk '{s+=$5} END {print s}'
2.90311e+11 [octets]

These days, 290 gigabytes of aggregate raw data is a reasonably large collection, as these things go.

Because many specific resource record sets may appear in multiple daily files, and because it can be convenient to work with one file rather than fourteen, our initial thought was to merge those 14 daily files into one consolidated fortnightly file. While that’s an extra step, merging the files might substantially reduce the total number of observations we needed to process.

The process of merging mtbl files is straightforward, requiring but a single command. You’ll note that dnstable_merge reports its progress and performance as it runs:

$ dnstable_merge dns.20170110.D.mtbl dns.20170111.D.mtbl dns.20170112.D.mtbl dns.20170113.D.mtbl dns.20170114.D.mtbl dns.20170115.D.mtbl dns.20170116.D.mtbl dns.20170117.D.mtbl dns.20170118.D.mtbl dns.20170119.D.mtbl dns.20170120.D.mtbl dns.20170121.D.mtbl dns.20170122.D.mtbl dns.20170123.D.mtbl dns-two-week-rollup.mtbl
[...]
mtbl_merge: wrote 1,000,000 entries (3,587,737 merged) in 5.72 sec, 174,901 ent/sec, 627,500 merge/sec
[...]
mtbl_merge: wrote 5,255,164,682 entries (12,594,323,945 merged) in 17,999.20 sec, 291,966 ent/sec, 699,715 merge/sec

Running dnstable_merge did succeed in reducing the size of the combined output file from ~12.6 billion observations down to ~5.26 billion observations. From a disk space perspective, the savings were even greater: the new merged file required only 91,308,637,129 octets. Doing the math, that’s 91,308,637,129 / 2.90311E11 * 100 = 31.45% of the dataset’s original size in octets. That’s a nice saving in space.

However, “there is no free lunch.” Merging those daily files took just under five hours of run time (17999.2 seconds/3600 seconds per hour), even when that job was run on a lightly-loaded fast server with lots of cores, plenty of memory, and solid state disks.

Another consideration: having a single data file also “serializes” some operations, such as dumping the database and grepping for records of interest. For example, a simple naive/unoptimized sequential search for NULL records

$ dnstable_dump -r dns-two-week-rollup.mtbl | grep -F " IN NULL " > two-week-one-file-out.txt

took 1 hour and 49 minutes to run. That’s a little longer than would be ideal. Thus, unless you have limited storage capacity, you may not want to bother merging more-granular mtbl files — leaving them uncombined may make it easier to work with multiple chunks at the same time.

How long would it take to do the same sort of grep command against the individual daily files, running all the jobs in parallel (note the ampersand at the end of each of the following commands)?

$ dnstable_dump -r dns.20170110.D.mtbl | grep " IN NULL " > output-10.txt &
$ dnstable_dump -r dns.20170111.D.mtbl | grep " IN NULL " > output-11.txt &
$ dnstable_dump -r dns.20170112.D.mtbl | grep " IN NULL " > output-12.txt &
$ dnstable_dump -r dns.20170113.D.mtbl | grep " IN NULL " > output-13.txt &
$ dnstable_dump -r dns.20170114.D.mtbl | grep " IN NULL " > output-14.txt &
$ dnstable_dump -r dns.20170115.D.mtbl | grep " IN NULL " > output-15.txt &
$ dnstable_dump -r dns.20170116.D.mtbl | grep " IN NULL " > output-16.txt &
$ dnstable_dump -r dns.20170117.D.mtbl | grep " IN NULL " > output-17.txt &
$ dnstable_dump -r dns.20170118.D.mtbl | grep " IN NULL " > output-18.txt &
$ dnstable_dump -r dns.20170119.D.mtbl | grep " IN NULL " > output-19.txt &
$ dnstable_dump -r dns.20170120.D.mtbl | grep " IN NULL " > output-20.txt &
$ dnstable_dump -r dns.20170121.D.mtbl | grep " IN NULL " > output-21.txt &
$ dnstable_dump -r dns.20170122.D.mtbl | grep " IN NULL " > output-22.txt &
$ dnstable_dump -r dns.20170123.D.mtbl | grep " IN NULL " > output-23.txt &
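If you’d rather not type fourteen nearly-identical commands, the same work can be expressed as a compact loop. A minimal sketch, assuming a Bourne-style shell with seq available:

$ for d in $(seq -w 10 23); do
>   dnstable_dump -r dns.201701${d}.D.mtbl | grep " IN NULL " > output-${d}.txt &
> done
$ wait    # block until all fourteen background jobs finish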

When I ran this job, it started at Wed Jan 25 06:29:58 UTC 2017, and the last of the extracted files finished being written roughly half an hour later, at Wed Jan 25 06:58 UTC 2017.

While we could likely further reduce that time by using a higher-performance replacement for grep (such as ripgrep [4]), we’ve got the data we need for the rest of this analysis, so let’s not get ratholed/obsessed with data prep chores.

Total size of the final resulting output files was roughly 13 gigabytes:

$ ls -l output* | awk '{s+=$5} END {print s}'
1.3059e+10

That’s the data we’ll be working with going forward from here.

3. The record type=NULL Traffic We Saw

Methodological considerations aside, let’s remember that this article is substantively about record type=NULL resource records.

What effective 2nd-level domains [5] were most commonly represented in the record type=NULL traffic? We can find that information with the Unix commands:

$ cat output-*.txt | awk '{print $1}' | 2nd-level-dom > sum-all.txt
$ sort sum-all.txt | uniq -c | sort -nr > sum-counts.txt

“Decoding” those commands for those who may not routinely work with Unix:

'cat output-*.txt'

    Concatenate and display all the files that match 
    the file name pattern output-*.txt

'|' (pipe or "vertical bar")

    take the output from the preceding command and feed it
    into the next command

'awk '{print $1}''

    Output just the first column (delimited by whitespace)

'2nd-level-dom' [see Appendix I]

    This is a little Perl script that just prints the 
    effective 2nd-level-domain (e.g., sending the domain 
    name www.bbc.co.uk through this script would return 
    bbc.co.uk)

'>'

    write output from the preceding command to the 
    named file

Looking at the 2nd command, and just mentioning bits that 
haven't already been explained:

'sort'

    sorts the named file

'uniq -c'

    count the number of occurrences of each unique pattern

'sort -nr'

    sort the output in reverse numeric order (largest to 
    smallest)

Rearranging the output in sum-counts.txt, we saw four main “clumps” of effective 2nd-level domains (with the fourth clump being a catch-all/miscellaneous category):

    Count    Effective 2nd-level-domain
    -------- --------------------------
    10058051 1yf.de
     6801193 53r.de
     1647365 2yf.de

     1978010 dashnxdomain.net

      862625 na2.in
      862453 qv4.in
      861302 nf5.in
      860901 06x.in
      860719 mm4.in
      859004 g6h.in
      845414 z84.in
      843622 8uy.in
      830967 88j.in
      813120 bn3.in
      208943 7uu.in
      177937 jk0.in
      173278 ty7.in
      169905 09m.in
      166271 9fh.in
      164127 bv6.in
      164096 m30.in
      161542 nb3.in
      149647 u7l.in
      124231 n23.in
      113601 b5h.in
      106852 7vv.in
       81887 pt6.in
       81787 n52.in
       80207 vb0.in
       79663 76o.in
       79493 v3v.in
       79413 mq8.in
       79181 zz3.in
       79112 09j.in
       78902 vx3.in
       78853 wg7.in
       78839 nt1.in
       78691 zv3.in
       78557 77g.in
       78337 7b7.in
       77899 a71.in
       77882 m0x.in
       77808 m7q.in
       77713 po2.in
       77604 po0.in
       77508 hy7.in
       77493 y67.in
       77475 55m.in
       77449 g3m.in
       77349 11v.in
       76224 gg8.in
       73637 s3n.in
       72425 bb0.in
       68904 0x7.in
       68613 jt6.in
       66984 gt7.in

      180378 nwsome.site
       51925 it.cx
       24521 njpjs.com
        1554 ignorelist.com
         294 food4mum.ru
         290 get-schwifty.com
         192 nova.ws

At least some of that record type=NULL traffic appears to be DNS tunneling traffic.

For example, looking at the most commonly seen domains (1yf.de, 2yf.de, etc.), we can see records that look like:

test.s01.1yf.de. IN NULL 656D7330312E796F75722D66726565646F6D2E64653B55533B36362E39302E37332E34363B303B313232363B64656661756C742C766F6C756D652C6E6F727468616D65726963612C696E7465726163746976652C766F69702C6F70656E76706E2C707074702C736F636B73353B
test.s01.1yf.de. IN NULL 656D7330312E796F75722D66726565646F6D2E64653B55533B36362E39302E37332E34363B303B313236393B64656661756C742C766F6C756D652C6E6F727468616D65726963612C696E7465726163746976652C766F69702C6F70656E76706E2C707074702C736F636B73353B
test.s01.1yf.de. IN NULL 656D7330312E796F75722D66726565646F6D2E64653B55533B36362E39302E37332E34363B303B313237333B64656661756C742C766F6C756D652C6E6F727468616D65726963612C696E7465726163746976652C766F69702C6F70656E76706E2C707074702C736F636B73353B

test.s02.2yf.de. IN NULL 656D7330322E796F75722D66726565646F6D2E64653B44453B3139332E3136342E3133332E37323B303B32313532353B64656661756C742C6575726F70652C7032702C6F70656E76706E2C697076362C736F636B73352C707074702C667265652C696E7465726163746976653B
test.s03.2yf.de. IN NULL 656D7330332E796F75722D66726565646F6D2E64653B44453B38312E3136392E3135342E32393B303B31323938313B64656661756C742C766F6C756D652C6575726F70652C6F70656E76706E2C766F69702C707074702C736F636B73352C667265652C696E7465726163746976652C697076363B
test.s04.2yf.de. IN NULL 656D7330342E796F75722D66726565646F6D2E64653B44453B38312E3136392E3135342E32373B303B373134353B64656661756C742C766F6C756D652C6575726F70652C6F70656E76706E2C766F69702C707074702C736F636B73352C667265652C696E7465726163746976652C697076363B
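Those RDATA blobs are just hex-encoded ASCII, so you can decode one directly with the common xxd utility (any hex-to-ASCII decoder would do). Decoding the first record above:

$ echo '656D7330312E796F75722D66726565646F6D2E64653B55533B36362E39302E37332E34363B303B313232363B64656661756C742C766F6C756D652C6E6F727468616D65726963612C696E7465726163746976652C766F69702C6F70656E76706E2C707074702C736F636B73353B' | xxd -r -p; echo
ems01.your-freedom.de;US;66.90.73.46;0;1226;default,volume,northamerica,interactive,voip,openvpn,pptp,socks5;

Note the decoded payload references your-freedom.de, a useful hint about what is going on here, and one we’ll confirm below.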

Using dig +trace, we can see that 1yf.de relies upon the name servers dns{1,2,3}.resolution.de:

$ dig +trace 1yf.de
[...]
1yf.de.			86400	IN	NS	dns1.resolution.de.
1yf.de.			86400	IN	NS	dns2.resolution.de.
1yf.de.			86400	IN	NS	dns3.resolution.de.
;; Received 196 bytes from 194.146.107.6#53(194.146.107.6) in 35 ms
[...]

We can check DNSDB to see other domains that also use those same name servers:

$ dnsdb_query.py -n dns1.resolution.de | grep -v " SOA " | awk '{print $1}' | reverse-domain-names | sort -u

“Decoding” that command pipeline:

    'dnsdb_query.py -n dns1.resolution.de'

        Report domains known to use dns1.resolution.de in the
        domain's rdata ("right hand side" of the DNS resource
        records)
        
    '|' (vertical bar or "pipe" character)
    
        "Send the output from the preceding command into the 
        next command"
        
    'grep -v " SOA "'

        drop any SOA (Start Of Authority) records

    'awk '{print $1}''

        keep just the first column of that output
        
    'reverse-domain-names' [see Appendix II]
    
        this is a little script that rotates the labels in the 
        domain name 
                    
    'sort -u'
    
        sort the domain list and uniquify the results (remove duplicate records)
                

Output from that command pipeline looks like:

com.ntmoffline
com.ntmoffline.www
de.1yf
de.2yf
de.53r
de.blum-pfersdorff
de.familie-baeumle
de.familie-pfersdorff
de.familie-wachtel
de.pfersdorff-blum
de.resolution
de.resolution.dyndns
de.resolution.oklahoma
de.your-freedom
de.your-freedom.pptp
de.your-freedom.www
info.getmeconnected
info.grass-is-green
net.your-freedom
net.your-freedom.cgi
net.your-freedom.pptp
net.your-freedom.www

Those records all appear to be related to “https://your-freedom.net/” and its DNS tunneling option, as described on that site.

As is often the case for traffic tunneled via the public DNS, the record type=NULL traffic for 1yf.de appears to be obfuscated/encrypted. This post is not about cracking encrypted network traffic streams, but if that were a goal, one might:

  • Download a copy of the associated network client (Java versions of network clients often represent a convenient starting point for casual reverse engineering).
  • Attempt to disassemble the Java code with a Java disassembler or decompiler.
  • Browse the disassembled code and related files, looking for any hardcoded cryptographic keys. (A first rough pass is sketched below.)

If any hardcoded cryptographic keys could be identified in the disassembled source code, it might be possible for an adversary to decode previously captured and saved encrypted traffic [6] (unless a cryptographic protocol offering forward secrecy [7] was used, or keys were routinely rotated and earlier versions of the client aren’t available).

Discovery and exploitation of hardcoded cryptographic keys could obviously have grave potential consequences for those who may have previously used and relied on a flawed cryptographic product for protection against eavesdropping, to say nothing of those who might unwittingly use such a program in the future.

4. What About The Other record type=NULL Domains We Saw?

A. Recall that we saw 1,978,010 record type=NULL resource records associated with the 2nd-level-domain dashnxdomain.net.

Fortunately, figuring out what that domain’s about is pretty straightforward — see “Measuring DNSSEC” by Geoff Huston & George Michaelson.

B. Another set of domains consisted of {na2.in, qv4.in, nf5.in, 06x.in, mm4.in, g6h.in, etc.}. According to whois, at least some of these domains are owned by:

Hari Pada
Noapara
Kolkata
West Bengal
700125
IN
+91.9433300300

The web site at https://www.revolvy.com/main/index.php?s=Noapara,%20India reports that “Noapara Metro Station is the newest and largest station of the Kolkata Metro situated in the Noapara, India.” Thus, the domain whois address doesn’t provide us with much in the way of insight.

However, if we search for 91.9433300300 (the phone number that’s associated with these domains), we have more luck. 91.9433300300 is ALSO used by the domain tunnelguru.com — a terrific indicator of what these domains are being used for (yes, more DNS tunneling):

Domain Name: TUNNELGURU.COM
Registrant Name: Domain Store
Registrant Street: Orisaa    }
Registrant Street: Orisaa    }    <-- not very specific, this
Registrant City: Orisaa      }        is 1 of 29 Indian states
Registrant State/Province: Orissa
Registrant Postal Code: 440701
Registrant Country: IN
Registrant Phone: +91.9433300300
Registrant Email: [email protected]

Inaccurate or incomplete street addresses can be reported to ICANN via their WDPRS site: https://forms.icann.org/en/resources/compliance/complaints/whois/inaccuracy-form

In spite of the disappointing lack of detail in that whois data, the tunnelguru.com whois does at least give us a new email to potentially follow: [email protected].

Checking the domaintools.com reversewhois tool [8] for that email address returns a number of domain names, including (but not limited to):

dns-tunnel.com
dnsproxy.pro
icmp-tunnel.com
icmptunnel.com
slowdns.com
smartdnsproxy.pro
tunnel-guru.com
tunnel.guru
tunnelguru.asia
tunnelguru.com
tunnelguru.in
tunnelguru.net
tunnelguruvpn.com

Based on the above domain names, we’re quite comfortable tagging this full set of domains as likely being DNS tunnel-related, too.

C. We’re now down to our miscellaneous domains. We could see what DNSDB knows about any of these domains by saying, for example:

$ dnsdb_query.py -r \*.nwsome.site/NULL > nwsome.site.txt

However, for the purposes of this article, let’s see what may be known about these domains OUTSIDE of traditional passive DNS data:

    180378 nwsome.site

        [nothing relevant in Google/Bing]
        [domain whois cloaked]

     51925 it.cx

        [context for this domain: http://www.alexa.com/siteinfo/it.cx ]

     24521 njpjs.com

        [Web site states "This demo secretly makes
        requests to STUN servers that can log your
        request. These requests do not show up in
        developer consoles and cannot be blocked by
        browser plugins (AdBlock, Ghostery, etc.)."]
        [domain whois cloaked]

      1554 ignorelist.com

        [Web site states "ignorelist.com is being
        shared via Free DNS, a dynamic DNS domain
        sharing project where members can setup, and
        administrate their dns entries on their own
        remote internet connected systems in real
        time. To create a free subdomain from any
        shared domain, you can visit the shared
        domain list."]

       294 food4mum.ru

        [Russian-language site]
        [nominally focused on breastfeeding, but
        site may be down/unreachable]

       290 get-schwifty.com

        [cloaked domain whois]
        [parked domain]
        [cf. https://en.wikipedia.org/wiki/Rick_and_Morty]

       192 nova.ws

        [registered via regtime.net/webnames.ru]
        [reverse proxied via Cloudflare]
        [Russian-language site]
        [see http://www.alexa.com/siteinfo/nova.ws ]

5. Conclusion

You’ve now learned a bit about some record type=NULL domains, and how you can efficiently pull them out of DNSDB Export mtbl files. You’ve seen how they are being used for DNS tunneling, DNSSEC deployment research, and other purposes that may remain unknown.

If you need information about how to license DNSDB Export for your company, please contact Farsight Security, Inc., sales at https://www.farsightsecurity.com/order-services/

For more information about licensing access to DomainTools’ Reverse Whois tool and other products, see the DomainTools Reverse Whois site. [8]

Notes:

[1] For an excellent introduction to mtbl files, see the two part blog series previously written by Farsight Engineer Eric Ziegast: “Farsight’s Real-time DNSDB,” Parts One and Two, see https://www.farsightsecurity.com/blog/txt-record/realtime-dnsdb-20151028/ and https://www.farsightsecurity.com/2015/11/18/ziegast-realtime-dnsdb-2/.

[2] This is almost reminiscent of https://en.wikipedia.org/wiki/Who%27s_on_First%3F

[3] https://en.wikipedia.org/wiki/List_of_DNS_record_types

[4] https://github.com/BurntSushi/ripgrep

[5] If you’re not familiar with the notion of effective 2nd level domains, see https://publicsuffix.org/

[6] “Leaked NSA Doc Says It Can Collect And Keep Your Encrypted Data As Long As It Takes To Crack It,” http://www.forbes.com/sites/andygreenberg/2013/06/20/leaked-nsa-doc-says-it-can-collect-and-keep-your-encrypted-data-as-long-as-it-takes-to-crack-it/

[7] https://en.wikipedia.org/wiki/Forward_secrecy

[8] http://reversewhois.domaintools.com/

Appendix I: 2nd-level-dom Perl script

#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::SSL::PublicSuffix;

# Load a local copy of the Public Suffix List (https://publicsuffix.org/);
# adjust the path to point at your downloaded public_suffix_list.dat
my $pslfile = 'your_path_here/public_suffix_list.dat';
my $ps = IO::Socket::SSL::PublicSuffix->from_file($pslfile);

my $line;

foreach $line (<>) {
    chomp($line);
    $line =~ s/\.\z//;    # strip any trailing root dot from DNS owner names
    # public_suffix(..., 1) returns the public suffix plus one more label,
    # i.e., the effective 2nd-level domain (www.bbc.co.uk -> bbc.co.uk)
    my $root_domain = $ps->public_suffix($line, 1);
    printf("%s\n", $root_domain) if defined $root_domain;
}
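
Once the script has been made executable (and the PSL path at the top has been set), usage looks like:

$ echo 'www.bbc.co.uk' | 2nd-level-dom
bbc.co.uk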

Appendix II: reverse-domain-names Perl script

#!/usr/bin/perl
use strict;
use warnings;

# Read all the input domain names (one per line)
my @lines = <>;
chomp @lines;

# Sort the names, then reverse the labels in each one
# (e.g., www.ntmoffline.com -> com.ntmoffline.www)
@lines = map { join ".", reverse split /\./ } sort @lines;

print "$_\n" for @lines;

Joe St Sauver, Ph.D. is a Scientist with Farsight Security, Inc.