DNSDB Export User Guide
Introduction
DNSDB Export is a subscription service that allows a customer to create a local version of the DNSDB API system and tools. The customer can run a local version of the DNSDB API or search the database files directly. This allows a customer to create a secure, local environment to work with DNSDB data without the queries and results being visible, even in encrypted form, on public networks. The DNSDB Export data files are delivered to this local environment via a secure HTTPS-based transport.
This document will explain:
- The data sets and file names
- The provisioning process
- How to download the latest DNSDB Export files
The examples in this document use the filesystem directory `/srv/dnsdb-export` and the username `dnsdb-export-username`. These are example values and should be replaced with your site’s specific values, as provided to you by Farsight.
Audience
This document is intended for system administrators responsible for administration of a DNSDB Export service.
Hardware and Network Requirements
DNSDB Export is intended to be installed on real or virtual servers owned or controlled by the customer within a secure hosting facility owned and controlled by the customer or their service provider. These servers must be managed securely to guarantee that DNSDB data is only disclosed to authorized users.
Per the subscription contract with Farsight, the DNSDB data must be segregated from other data sets and cannot be merged into other databases or mingled with non-Farsight data. If the Farsight subscription is terminated then all DNSDB data must be deleted and cannot be retained.
The recommended OS for the servers is Debian 9. The following operating systems are currently supported:
- Debian 9 (Stretch) and 10 (Buster)
- Ubuntu 20 LTS (Focal Fossa)
- CentOS 7 and Red Hat 7
For complete hardware and network requirements to run a server that supports DNSDB Export, please see the document DNSDB Export Requirements.
Data sets and file names
DNSDB Export is a service that provides read-only `mtbl` and `tgdb` database files used by DNSDB. `mtbl` is a compressed, indexed binary file format for storing key-value entries. DNSDB’s `mtbl` files use an encoding for passive DNS data called `dnstable`.
See mtbl and dnstable for the open source C implementation libraries, and pymtbl and pydnstable for the open source Python libraries that wrap the C libraries.
The DNSDB files (`mtbl`) are partitioned into `dns` and `dnssec` subsets. The `dnssec` subset includes all DNS Resource Records of type `DS`, `RRSIG`, `NSEC`, `DNSKEY`, `NSEC3`, `NSEC3PARAM`, or `DLV`, while the `dns` subset contains DNS Resource Records of all other types. All `mtbl` files used by DNSDB Export start with the filename prefix `dns.` or `dnssec.`, depending on whether the MTBL file belongs to the `dns` or `dnssec` subset.
The Flex database files (`tgdb`) contain information to flexibly search dns records. Each tgdb file is generated by a proprietary process from `mtbl` files. There are separate `*.tgdb` files for rrnames and rdata.
Farsight does not currently include dnssec records in flexible search. The RRTypes that have their rdata indexed in Flex include `CNAME`, `HINFO`, `MX`, `NAPTR`, `NS`, `PTR`, `RP`, `SOA`, `SPF`, `SRV` and `TXT` records. `SOA` RData records do not have the numeric fields, and `CDS` and `CDNSKEY` records cannot be queried by type using the Flex backend server.
To summarize:
- files with the naming `dns.*.mtbl` host dns record data and are queried by the DNSDB APIs.
- files with the naming `dnssec.*.mtbl` host dnssec record data and are queried by the DNSDB APIs.
- files with the naming `*.tgdb` host records for the DNSDB Flex server and are queried by the Flex API.
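The partitioning rules above can be expressed as a small lookup. The following is only an illustrative sketch: the function names `mtbl_subset` and `flex_rdata_indexed` are hypothetical and not part of any Farsight tool, and the Flex list simply mirrors the types named in this section, which may not be exhaustive.

```python
# RRTypes that this guide assigns to the dnssec.*.mtbl subset.
DNSSEC_TYPES = frozenset(
    {"DS", "RRSIG", "NSEC", "DNSKEY", "NSEC3", "NSEC3PARAM", "DLV"}
)

# RRTypes whose rdata this guide lists as indexed in Flex.
FLEX_RDATA_TYPES = frozenset(
    {"CNAME", "HINFO", "MX", "NAPTR", "NS", "PTR", "RP", "SOA", "SPF", "SRV", "TXT"}
)


def mtbl_subset(rrtype):
    """Return "dnssec" or "dns": the file subset that would hold this RRType."""
    return "dnssec" if rrtype.upper() in DNSSEC_TYPES else "dns"


def flex_rdata_indexed(rrtype):
    """Return True if this guide lists the RRType as having rdata indexed in Flex."""
    return rrtype.upper() in FLEX_RDATA_TYPES
```

For example, `mtbl_subset("RRSIG")` points at the `dnssec.*.mtbl` files, while an `A` record lookup would be served from `dns.*.mtbl`.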
The DNSDB `mtbl` and `tgdb` files are further partitioned by time interval into ‘minute’, ‘decaminute’, ‘hour’, ‘day’, ‘month’, and ‘year’ sets. Each file has a filename suffix which indicates the time interval.
Each file name also includes a datestamp indicating which time interval it corresponds to. The format of the datestamp varies depending on which time interval is used.
Time Interval | Filename suffix | Time format | Time format example |
---|---|---|---|
minute | .m.mtbl | %Y%m%d.%H%M | 20130129.0212 |
decaminute | .X.mtbl | %Y%m%d.%H%M | 20130129.0220 |
hour | .H.mtbl | %Y%m%d.%H%M | 20130129.0200 |
day | .D.mtbl | %Y%m%d | 20130129 |
month | .M.mtbl | %Y%m | 201301 |
year | .Y.mtbl | %Y | 2013 |
Note: in the file name suffix, we use lower-case `m` for minute files and upper-case `M` for month files. The opposite case is used in the time format string (`%M` for minutes, `%m` for months).
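The naming scheme in the table can be decoded mechanically. The sketch below is a hypothetical helper, not part of the DNSDB Export tooling. It assumes file names of the form `<subset>.<datestamp>.<suffix letter>.mtbl`, as in the examples in this guide, and it does not attempt to handle `.tgdb` names.

```python
import re
from datetime import datetime

# Suffix letter -> (interval name, strptime format), copied from the table above.
INTERVALS = {
    "m": ("minute", "%Y%m%d.%H%M"),
    "X": ("decaminute", "%Y%m%d.%H%M"),
    "H": ("hour", "%Y%m%d.%H%M"),
    "D": ("day", "%Y%m%d"),
    "M": ("month", "%Y%m"),
    "Y": ("year", "%Y"),
}

# Assumed layout: <subset>.<datestamp>.<letter>.mtbl
NAME_RE = re.compile(r"^(dns|dnssec)\.([0-9.]+)\.([mXHDMY])\.mtbl$")


def parse_mtbl_name(filename):
    """Split an MTBL file name into (subset, interval name, start datetime)."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError("unrecognized MTBL file name: %r" % filename)
    subset, stamp, letter = match.groups()
    interval, fmt = INTERVALS[letter]
    # strptime raises ValueError if the datestamp does not fit the interval.
    return subset, interval, datetime.strptime(stamp, fmt)
```

Because the suffix letter selects the time format, a name whose datestamp does not match its suffix (for example a minute-style stamp on an `.M.mtbl` file) is rejected with `ValueError`.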
The underlying data records are created from deduplicated firstseen/lastseen/count observations of RRSET data seen in the SIE real-time stream in the last minute. Records are initially aggregated into minute files. Records are aggregated into larger time periods by adjusting firstseen/lastseen/count values shared between identical RRSETs. The higher-granularity files are aggregated into larger, lower-granularity files by a merging process. When merging files, a given RRSET keeps its oldest firstseen, takes the newest lastseen, and has its counts added together.
More granular files | Aggregates into file |
---|---|
most recent data | minute .m |
10 minute files | decaminute .X |
6 decaminute files | hour .H |
24 hour files | day .D |
28-31 daily files | month .M |
12 month files | year .Y |
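The merge rule described above (oldest firstseen, newest lastseen, summed counts) can be illustrated with a short sketch. This is not the implementation used by `mtbl_merge` or DNSDB; the `Observation` class and the function names are hypothetical, and the field names are borrowed from the JSON output shown later in this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    time_first: int  # epoch seconds of the earliest sighting
    time_last: int   # epoch seconds of the latest sighting
    count: int       # number of times the RRSET was observed


def merge(a, b):
    """Combine two observations of the same RRSET."""
    return Observation(
        time_first=min(a.time_first, b.time_first),
        time_last=max(a.time_last, b.time_last),
        count=a.count + b.count,
    )


def merge_all(entries):
    """Fold (key, Observation) pairs from several files into one dict per key."""
    merged = {}
    for key, obs in entries:
        merged[key] = merge(merged[key], obs) if key in merged else obs
    return merged
```

Since the rule only uses minimum, maximum, and addition, the order in which files are merged does not change the result.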
The time granularity of information available for any RRSET becomes more coarse as the time periods grow longer. For a popular, frequently observed RRSET combination, the firstseen will be close to the beginning of the time period and the lastseen close to the end, with a large count. A short-lived RRSET would have a low count and firstseen/lastseen close to the actual times the RRSET was first/last observed within the time period of the database file.
The daemon `dnstable-manager` downloads each file and verifies its digest. It tracks which files are present in a file set. For each collection of files, there is a separate, local fileset file. These have a “.fileset” extension and are atomically updated after each new file is ready. Here is a description of each file set file:
Collection | .fileset file name |
---|---|
dns .mtbl files | /srv/dnsdb-export/mtbl/dns.fileset |
dnssec .mtbl files | /srv/dnsdb-export/mtbl-dnssec/dnssec.fileset |
rrnames .tgdb files | /srv/dnsdb-export/tgdb/rrnames.fileset |
rdata .tgdb files | /srv/dnsdb-export/tgdb/rdata.fileset |
`dnstable-manager` also stores a file with a SHA-256 checksum for each .mtbl and .tgdb file. These files have a `.sha256` extension appended to the original filename.
Selecting Files to Download
You can adjust which data files are downloaded by appending parameters to the fileset URI in the `dnstable-manager` configuration file. This configuration file is in YAML format and is normally located at `/etc/dnstable-manager/dnstable-manager.yaml`. The configuration allows you, for example, to limit the time period you have available for your queries if you choose not to download the entire set. You can also restrict your downloads to certain aggregation levels; for instance, you can choose not to download minute files.
By default `dnstable-manager` fetches all files that are allowed in a user’s subscription for the whole term of the subscription period, as if one specified `time_letters=mXHDMY`. If any letters are removed, `dnstable-manager` stops downloading the related files. In addition, `dnstable-manager` will delete any local files not specified in `time_letters`. An unsuspecting administrator could run into issues here: for example, wanting only monthly or daily files and removing the “Y” would cause any local yearly files to be deleted. Whenever you change the time letters, first make a backup copy of the data, or at least a set of local hard links to the files, so that they are still available in case they are accidentally deleted.
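One way to create the set of hard links mentioned above is sketched here. The function name `hardlink_backup`, the backup directory, and the glob pattern are all hypothetical choices for illustration. Hard links only work when the backup directory is on the same filesystem as the data files.

```python
import os
from pathlib import Path


def hardlink_backup(src_dir, backup_dir, pattern="*.Y.mtbl"):
    """Hard-link every file in src_dir matching pattern into backup_dir.

    Returns the list of link paths. No file data is copied, so this is
    cheap, but both directories must live on the same filesystem.
    """
    src = Path(src_dir)
    dst = Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)
    linked = []
    for path in sorted(src.glob(pattern)):
        target = dst / path.name
        if not target.exists():
            os.link(path, target)  # second name for the same inode
        linked.append(target)
    return linked
```

A call such as `hardlink_backup("/srv/dnsdb-export/mtbl", "/srv/dnsdb-export/backup")` made before removing “Y” from `time_letters` would keep the yearly files reachable under the backup directory even if the originals are deleted.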
If you want to download smaller files without downloading larger files, consider downloading files manually instead of using `dnstable-manager`. Consult with Technical Support for more information.
Restricting by Time Period
Use the `time_letters` parameter to specify a set of time periods that you wish to receive.
Time Letter | Meaning |
---|---|
m | Minute |
X | Decaminute (10x minute files) |
H | Hour |
D | Day |
W | Week (unused) |
M | Month |
Q | Quarter (unused) |
Y | Year |
Example: if you do not want yearly files, append `time_letters=mXHDM` to your fileset URI, changing it to: https://export.dnsdb.info/dnsdb-export/mtbl/dns.fileset?time_letters=mXHDM
Restricting by Age
Use the `before` and `after` parameters if you want to limit the age or freshness of the data in the fileset. These parameters may be any of:
- anything parseable by Python dateutil.parser
- seconds since January 1, 1970 UTC
- negative offset from now in seconds
- relative delta. The format is `key: value`
  - keys: `years`, `months`, `weeks`, `days`, `hours`, `minutes`, `seconds`, `microseconds` (though `seconds` and `microseconds` are unlikely to be useful)
  - separator: this has to be a colon character followed by a whitespace (e.g. %20 for space)
  - value: a number value
  - Here is an example combining restrictions to download daily files for the past 35 days: `https://export.dnsdb.info/dnsdb-export/mtbl/dns.fileset?time_letters=D&after=days: 35`

Below are additional examples and their meanings:
Examples | Meaning |
---|---|
after=-86400 | Anything from the last day. |
after=years:%201 | Anything from the last year. |
before=2021-01-01 | Anything up to January 1, 2021. |
before=1451606400 | Anything up to January 1, 2016. |
before=-300 | Files delayed by five minutes. |
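The parameters described in this section can be assembled programmatically. The following sketch uses a hypothetical helper, `fileset_uri`, that validates the time letters against the table above and percent-encodes the space in relative-delta values, in the same way as the `years:%201` example. It only builds the string that would be placed in the `dnstable-manager` configuration file and does not contact the server.

```python
from urllib.parse import quote

BASE = "https://export.dnsdb.info/dnsdb-export/mtbl/dns.fileset"

# Letters from the "Time Letter" table, including the unused W and Q.
KNOWN_LETTERS = set("mXHDWMQY")


def fileset_uri(base, time_letters=None, before=None, after=None):
    """Return base with time_letters/before/after appended as a query string."""
    params = []
    if time_letters is not None:
        unknown = set(time_letters) - KNOWN_LETTERS
        if unknown:
            raise ValueError("unknown time letters: %s" % "".join(sorted(unknown)))
        params.append(("time_letters", time_letters))
    for name, value in (("before", before), ("after", after)):
        if value is not None:
            params.append((name, str(value)))
    if not params:
        return base
    # Keep ':' literal; the space after it becomes %20.
    query = "&".join("%s=%s" % (key, quote(val, safe=":")) for key, val in params)
    return base + "?" + query
```

For instance, `fileset_uri(BASE, time_letters="D", after="days: 35")` yields a URI ending in `?time_letters=D&after=days:%2035`.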
Packages
The following packages contain the tools used for working with DNSDB Export data files. Where the package names differ between Debian .deb files and Red Hat .rpm files due to naming conventions, both are shown.
Generally required:
- `dnstable-manager`: downloads local copies of DNSDB data files from Farsight.
- `dnsdb-api-server`: the dnstable lookup interface for DNSDB API version 1. It is a Python 2 WSGI webapp server and it reads MTBL files. It is only available on operating systems with full Python 2 support.
- `dnstli2`: the dnstable lookup interface for DNSDB API version 2. It is a Python 3 WSGI webapp server and it reads MTBL files.
- `flex-api-server`: serves the frontend of Flex API. It is a Python 3 WSGI webapp server. It connects to a `flexbackendd` daemon.
- `flex-backend-server`: provides the `flexbackendd` backend daemon and reads `.tgdb` flex data files.
- `mtbl-bin` (debian) or `mtbl` (rpm): command-line utilities such as `mtbl_info`, `mtbl_verify`, `mtbl_dump`, and `mtbl_merge` for inspecting and managing individual MTBL files.
- `dnstable-bin` (debian) or `dnstable` (rpm): command-line utilities such as `dnstable_dump` for decoding individual DNSDB Export MTBL files.
Developer Tools for C:
- `libdnstable-dev` (debian) or `dnstable-devel` (rpm): development package for libdnstable C applications.
- `libmtbl-dev` (debian) or `mtbl-devel` (rpm): development package for libmtbl C applications.
Developer Tools for Python:
- `python-dnstable` (debian) or `python-pydnstable` (rpm): development package for libdnstable Python applications.
- `python-mtbl` (debian) or `python-pymtbl` (rpm): development package for libmtbl Python applications.
Basic Usage
This section describes basic usage of the more commonly used tools.
mtbl utilities
`mtbl-bin` is the Debian package name, `mtbl` is the rpm package name.
This package provides low-level command-line tools like `mtbl_info`, `mtbl_verify`, `mtbl_dump`, and `mtbl_merge`. These utilities can be used to inspect, verify, dump, and merge MTBL files.
Example usage:
`mtbl_info`: display information about an MTBL file.
$ mtbl_info /srv/dnsdb-export/mtbl/dns.20170411.1300.m.mtbl
file name: /srv/dnsdb-export/mtbl/dns.20170411.1300.m.mtbl
file size: 96,616,169
index block offset: 96,091,431
index bytes: 524,226 (0.54%)
data block bytes 96,091,431 (99.46%)
data block size: 8,192
data block count 27,969
entry count: 5,475,567
key bytes: 255,823,906
value bytes: 44,771,729
compression algorithm: zlib
compactness: 32.14%
`mtbl_verify`: verify integrity of an MTBL file’s data and index blocks.
$ mtbl_verify /srv/dnsdb-export/mtbl/dns.20170411.1316.m.mtbl
/srv/dnsdb-export/mtbl/dns.20170411.1316.m.mtbl: OK
`mtbl_dump`: print key-value entries from an MTBL file.
$ mtbl_dump /srv/dnsdb-export/mtbl/dns.20170411.1316.m.mtbl | head -1
"\x00\x02ac\x03dns\x00\x02\x02ac\x00\x0f\x01a\x03dns\x04park\x02io\x00\x0f\x01b\
x03dns\x04park\x02io\x00" "\xcf\xad\xb3\xc7\x05\xcf\xad\xb3\xc7\x05\x00"
For additional details, see the `mtbl` manpages.
dnstable utilities
`dnstable-bin` is the Debian package name, `dnstable` is the rpm package name.
This package provides command-line tools like `dnstable_dump` and can be used to decode the DNSDB Export MTBL files.
Example usage (with some line breaks added for readability):
`dnstable_dump`: dump dnstable data file to text or JSON.
$ dnstable_dump -j -r /srv/dnsdb-export/mtbl/dns.201312.M.mtbl | head -1
{"bailiwick": ".", "rrname": ".", "time_last": 1388531097, "time_first": 1385842207,
"count": 63528993, "rrtype": 2, "rdata": ["a.root-servers.net.",
"b.root-servers.net.", "c.root-servers.net.", "d.root-servers.net.",
"e.root-servers.net.", "f.root-servers.net.", "g.root-servers.net.",
"h.root-servers.net.", "i.root-servers.net.", "j.root-servers.net.",
"k.root-servers.net.", "l.root-servers.net.", "m.root-servers.net."]}
`dnstable_lookup`: look up individual records in a dnstable data file or set of data files.
$ export DNSTABLE_FNAME=/srv/dnsdb-export/mtbl/dns.20210427.2000.H.mtbl
$ dnstable_lookup -J rrset fsi.io | head -1
{"count":7,"time_first":"2021-04-25T18:21:00Z","time_last":"2021-04-26T13:20:55Z",\
"rrname":"fsi.io.","rrtype":"A","bailiwick":"fsi.io.","rdata":["104.244.14.13"]}
For additional details, see the `dnstable` manpages.
Accessing the Data
There are a variety of ways to access DNSDB and Flex data depending on your needs:
Accessing the Data via Command Line
dnstable_lookup
You can use the `dnstable_lookup` tool found in the `dnstable-bin` package. See the above example.
dnsdbq and dnsdbflex
The `dnsdbq` and `dnsdbflex` tools give access to DNSDB and Flex data respectively.
Examples (with some line breaks added for readability):
$ echo "APIKEY=$APIKEY" | sudo tee --append /etc/dnsdb-query.conf
$ echo 'DNSDB_SERVER="https://localhost"' | sudo tee --append /etc/dnsdb-query.conf
$ dnsdbq -r fsi.io -l 1 -j -t A
{"count":10392,"time_first":1381265499,"time_last":1428418529,\
"rrname":"fsi.io.","rrtype":"A","bailiwick":"fsi.io.","rdata":["66.160.140.76"]}
$ dnsdbflex --r '^[ew].*\.fsi\.io\.' -l 1
{"rrname":"eu.fsi.io.","rrtype":"A"}
curl
The Curl library is used by `dnsdbq` and `dnsdbflex`. You can use `curl` directly, as shown in the following examples for all three APIs (with some line breaks added for readability):
$ curl -g -k -Ss -H "X-Api-Key: $DNSDB_API_KEY" -H "Accept: application/json" \
"https://$HOSTNAME/lookup/rrset/name/fsi.io?limit=1"
{"count":10392,"time_first":1381265499,"time_last":1428418529,\
"rrname":"fsi.io.","rrtype":"A","bailiwick":"fsi.io.","rdata":["66.160.140.76"]}
$ curl -g -k -Ss -H "X-Api-Key: $DNSDB_API_KEY" -H "Accept: application/x-ndjson" \
"https://$HOSTNAME/dnsdb/v2/lookup/rrset/name/fsi.io/A?limit=1"
{"cond":"begin"}
{"obj":{"count":10392,"time_first":1381265499,"time_last":1428418529,\
"rrname":"fsi.io.","rrtype":"A","bailiwick":"fsi.io.","rdata":["66.160.140.76"]}}
{"cond":"limited","msg":"Result limit reached"}
$ curl -g -k -Ss -H "X-Api-Key: $DNSDB_API_KEY" -H "Accept: application/x-ndjson" \
"https://$HOSTNAME/dnsdb/v2/regex/rrnames/^[ew].*\.fsi\.io\.?limit=1"
{"cond":"begin"}
{"obj":{"rrname":"eu.fsi.io.","rrtype":"A"}}
{"cond":"succeeded"}
Python
Here is how to install the `dnsdb_query.py` script and configure it to query the local API server. Its Python code is a good starting point for writing a Python-based integration.
$ sudo wget -q -O /usr/local/bin/dnsdb_query.py \
https://raw.github.com/dnsdb/dnsdb-query/master/dnsdb_query.py
$ sudo chmod 755 /usr/local/bin/dnsdb_query.py
$ echo "APIKEY=$APIKEY" | sudo tee --append /etc/dnsdb-query.conf
$ echo 'DNSDB_SERVER="https://localhost"' | sudo tee --append /etc/dnsdb-query.conf
$ /usr/local/bin/dnsdb_query.py -r fsi.io -l 1
;; bailiwick: fsi.io.
;; count: 1214
;; first seen: 2021-01-31 20:28:16 -0000
;; last seen: 2021-05-05 14:31:50 -0000
fsi.io. IN A 104.244.14.13
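For a Python integration that talks to the version 2 or Flex API, the newline-delimited JSON shown in the curl examples needs to be split into result objects and status lines. The sketch below is a minimal, assumed approach: the function name `parse_ndjson_response` is hypothetical, and it relies only on the `cond`, `obj`, and `msg` keys visible in the sample output above.

```python
import json


def parse_ndjson_response(lines):
    """Parse an application/x-ndjson response body, one JSON document per line.

    Returns (objects, last_cond, last_msg), where objects is the list of
    "obj" payloads and last_cond/last_msg come from the final line that
    carried a "cond" key (for example "succeeded" or "limited").
    """
    objects = []
    last_cond = None
    last_msg = None
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines
        record = json.loads(raw)
        if "obj" in record:
            objects.append(record["obj"])
        if "cond" in record:
            last_cond = record["cond"]
            last_msg = record.get("msg")
    return objects, last_cond, last_msg
```

A caller would typically treat a final condition of `limited` as a truncated result set and decide whether to repeat the query with a larger `limit`.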
Accessing the data using a GUI
DNSDB Scout for Export is a browser-based GUI for accessing and making queries to the DNSDB API. It can be installed for use on an Export installation. For more information on DNSDB Scout, please see the DNSDB Scout documentation.
Farsight also has integrations that work with a number of third party tools that might work for your environment. For a complete list of our third party integrations, please visit our Integrations page.
Limits
Data Synchronization
Data Synchronization is limited by the server to mitigate potential resource exhaustion. Because of this, it could take up to 10 days to retrieve a complete copy of the DNSDB database when copying files over the Internet.
For some customers, Farsight is able to ship an initial copy of the DNSDB database files using an encrypted drive to bootstrap a new installation of DNSDB Export.
Troubleshooting
- Disk full – disk full is one of the most common problems. Check for this problem with `df -h /srv` and by reviewing the error messages from `dnstable-manager`.
  - Running out of disk space for storing data is one of the most common problems customers see. We keep producing more data over time, so the system or storage administrator needs to make sure more space is made available or older filesets are deleted. To figure out how much space is available, run `cd /srv/dnsdb-export/` followed by `df .` if there is just one mount point, otherwise `df *`.
- Corrupted Files – Verify mtbl files using `mtbl_verify`.
  - If you find a corrupted file, you can correct the problem by deleting the corrupted file. This should cause `dnstable-manager` to download the file again.
- General Errors – Check the `dnstable-manager` logs and review listed errors.
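The disk space check from the first item can also be scripted, for example as part of a monitoring job. This is only a sketch: the function names are hypothetical and the 10% threshold is an arbitrary example value, not a Farsight recommendation.

```python
import shutil


def is_low(total, free, min_free_fraction=0.10):
    """Return True when the free share of the filesystem is below the threshold."""
    if total <= 0:
        return True  # treat an unreadable or empty filesystem as a problem
    return (free / total) < min_free_fraction


def free_space_report(path="/srv/dnsdb-export", min_free_fraction=0.10):
    """Summarize disk usage for the filesystem that holds path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return {
        "path": path,
        "total": usage.total,
        "free": usage.free,
        "low": is_low(usage.total, usage.free, min_free_fraction),
    }
```

If the mtbl and tgdb directories are on separate mount points, call `free_space_report` once per directory, which corresponds to the `df *` case above.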
Validating manually downloaded files with a checksum
If you choose not to use `dnstable-manager` to download the data files but are downloading them via some other mechanism, you may want to checksum the files to validate their correctness. You can compare the checksum that Farsight generated with one you generate (`dnstable-manager` does this automatically).
Here is a technique to download Farsight’s checksum and generate a checksum of a data file that you downloaded:
$ curl --silent --head -H "X-API-Key:$EXPORT_APIKEY" \
https://export.dnsdb.info/dnsdb-export/mtbl/dns..D.mtbl \
| grep -i ^Digest: | cut -f2- -d= | tr -d '\r'
The following `openssl` command will calculate the base64 encoded checksum of your already downloaded file:
$ openssl dgst -sha256 -binary dns..D.mtbl | openssl base64 -e
Compare the outputs of the two commands — they should be the same.
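If you prefer to do the local half of this comparison in Python, the standard library can produce the same base64-encoded SHA-256 value as the `openssl` pipeline above. The function name `sha256_base64` is a hypothetical choice. The file is read in chunks so that large data files do not need to fit in memory.

```python
import base64
import hashlib


def sha256_base64(path, chunk_size=1 << 20):
    """Return the base64-encoded SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")
```

The returned string can be compared directly with the value extracted from the `Digest` response header.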