
As part of a recent update to DNSDB, dnsdbq now offers a “-V summarize” verb (this is an implementation of the “estimation of result size” feature mentioned in an earlier blog article). Since we covered the new feature using DNSDB Scout in that initial article, we will focus only on dnsdbq, Farsight’s DNSDB command line client written in C, here.
To make use of the new dnsdbq -V summarize feature, begin by ensuring that you’re running the latest available version of dnsdbq. The dnsdbq manual page describes the -V option as:
-V verb
The verb to perform, i.e. the type of query, either "lookup" or
"summarize". The default is the "lookup" verb. As an option, you
can specify the "summarize" verb, which gives you an estimate of
result size. At-a-glance, it provides information on when a given
domain name, IP address or other DNS asset was first-seen and last-
seen by the global sensor network, as well as the total observation
count.
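As noted in the manual page, when you specify -V summarize in dnsdbq you will get JUST the summary: the first-seen and last-seen times, the total observation count, and the number of results summarized (num_results), with no detail records.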
It may help to consider an example. Let’s ask for RRname results for www.mit.edu/A/mit.edu, and limit that query to three results:
$ dnsdbq -r www.mit.edu/A/mit.edu -l 3
;; record times: 2010-06-24 06:02:21 .. 2013-04-01 16:52:02
;; count: 5215640; bailiwick: mit.edu.
www.mit.edu. A 18.9.22.169
;; record times: 2013-01-22 21:10:33 .. 2013-01-23 00:11:57
;; count: 452; bailiwick: mit.edu.
www.mit.edu. A 18.9.22.169
www.mit.edu. A 141.101.116.213
www.mit.edu. A 141.101.117.213
;; record times: 2013-01-22 17:51:20 .. 2013-01-22 17:53:43
;; count: 9; bailiwick: mit.edu.
www.mit.edu. A 128.103.63.138
Now let’s run that same query, this time including the -V summarize option:
$ dnsdbq -r www.mit.edu/A/mit.edu -l 3 -V summarize
;; record times: 2010-06-24 06:02:21 .. 2013-04-01 16:52:02
;; count: 5216101; num_results: 3
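Note that this output corresponds to our “full” results: the record times run from the earliest first-seen time (2010-06-24 06:02:21) to the latest last-seen time (2013-04-01 16:52:02) among the three results, the count is the sum of the three individual counts (5215640 + 452 + 9 = 5216101), and num_results is 3.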
We did NOT get “imputed” information for “all potential results” that DNSDB may know for that query, just the three we asked for.
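The correspondence between the three detail results and the summary can be checked mechanically. The following is a minimal sketch in Python, not how dnsdbq itself computes anything: the values are copied from the three-result lookup output above, and the summarize helper is a hypothetical name introduced only for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# (first seen, last seen, count) copied from the three-result lookup output above
results = [
    ("2010-06-24 06:02:21", "2013-04-01 16:52:02", 5215640),
    ("2013-01-22 21:10:33", "2013-01-23 00:11:57", 452),
    ("2013-01-22 17:51:20", "2013-01-22 17:53:43", 9),
]

def summarize(rows):
    """Hypothetical helper: earliest first-seen, latest last-seen, summed count."""
    first = min(datetime.strptime(r[0], FMT) for r in rows)
    last = max(datetime.strptime(r[1], FMT) for r in rows)
    total = sum(r[2] for r in rows)
    return first, last, total, len(rows)

first, last, total, n = summarize(results)
print(f";; record times: {first.strftime(FMT)} .. {last.strftime(FMT)}")
print(f";; count: {total}; num_results: {n}")
```

The two printed lines match the -V summarize output shown above for the same three-result query.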
The dnsdbq summarize verb works “just like a regular query,” EXCEPT that it returns only the summary lines rather than the detail records.
Let’s consider another example, a dnsdbq summarize query for *.uber.com returning up to a million results. We’ll begin by “manually” summing up the counts for up to a million results with jq and a tiny one-line awk script:
$ dnsdbq -r \*.uber.com -l 1000000 -j | jq -r '.count' | awk '{s+=$1}END{print s}'
2771990113
Now let’s see what we get from the actual dnsdbq summarize verb:
$ dnsdbq -r \*.uber.com -l 1000000 -V summarize
;; record times: 2010-06-24 10:38:39 .. 2019-08-29 21:33:59
;; zone times: 2010-04-24 16:12:21 .. 2018-03-22 16:02:25
;; count: 2771990200; num_results: 1000000
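The results for this example are interesting for a couple of reasons: the output includes a “zone times” line in addition to the “record times” line, and the summarized count (2771990200) is slightly higher than our manually summed count (2771990113).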
A dnsdbq -V summarize query “counts the same” as a regular query in terms of your quota usage
A common question, as you might expect, is “So if doing a dnsdbq -V summarize query counts the same as doing a regular dnsdbq query, why not just do a regular query?” The answer is that the summarize verb is a nice option when you ONLY care about things like aggregate counts and first/last seen times, because it spares you from retrieving all the detail records only to end up “throwing them away.”
dnsdbq includes num_results in its -V summarize output because it provides important context for the summary.
For example, if you’ve asked for 500,000 results but we only know about 400,000 results, we want to ensure you know that we weren’t able to give you a summary for the full 500,000 you requested.
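One way to put num_results to work is to compare it against the limit you requested. The sketch below assumes the summary text has the “;; count: N; num_results: M” shape shown in the sample output earlier in this article; parse_summary and limit_reached are hypothetical helper names, not part of dnsdbq.

```python
import re

def parse_summary(text):
    """Extract (count, num_results) from summarize-style output text."""
    m = re.search(r"count:\s*(\d+);\s*num_results:\s*(\d+)", text)
    if m is None:
        raise ValueError("no count/num_results line found")
    return int(m.group(1)), int(m.group(2))

def limit_reached(text, requested_limit):
    """True when the summary covers as many results as were requested,
    which means further results may exist beyond the limit."""
    _, num_results = parse_summary(text)
    return num_results >= requested_limit

# Sample text copied from the www.mit.edu summarize output in this article
output = """;; record times: 2010-06-24 06:02:21 .. 2013-04-01 16:52:02
;; count: 5216101; num_results: 3"""

print(parse_summary(output))          # (5216101, 3)
print(limit_reached(output, 3))       # True: the limit of 3 was filled
print(limit_reached(output, 500000))  # False: fewer results than requested
```

When limit_reached returns False, the summary covers fewer results than you asked for, which is the situation described in the 500,000 versus 400,000 example.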
When you use the dnsdbq -V summarize option, dnsdbq returns its summary based on the results you would otherwise have seen had you not specified the summarize verb. The summarize verb does NOT somehow magically review ALL the results that DNSDB potentially knows about a given query (as if the limit value didn’t matter).
To make this concrete, let’s pretend that DNSDB knows about 25 million unique combinations of (RRname, RRtype, Bailiwick, Rdata, and zone-file vs observed-by-a-sensor). Let’s also assume you use dnsdbq -V summarize and ask for the maximum number of results you can get from dnsdbq in a single query (e.g., one million results).
The first-seen, last-seen and count values that will be reported through dnsdbq -V summarize will be based on the one million displayable results you would otherwise have been shown in detail, NOT the full set of 25 million results.
This means that you do NOT know, and CANNOT know, how many total unique results for your query may still “lurk” undisclosed in the passive DNS database, nor what the sum of the counts for all those results might be. The summarize verb will just report on what you could otherwise have gotten in normal detail-record form.
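A scaled-down toy model may help illustrate the point. The sketch below is purely hypothetical: it uses 25 made-up records instead of 25 million, integer stand-ins for timestamps, and it assumes that the records falling within the limit are simply the first ones in the list (which records DNSDB actually returns under a limit is not something this sketch tries to model).

```python
from itertools import islice

def summarize(records, limit):
    """Toy summary over only the first `limit` (first_seen, last_seen, count) records."""
    kept = list(islice(records, limit))
    return {
        "first": min(r[0] for r in kept),
        "last": max(r[1] for r in kept),
        "count": sum(r[2] for r in kept),
        "num_results": len(kept),
    }

# Hypothetical store of 25 records with made-up integer "timestamps" and counts
store = [(1000 + i, 2000 + i, 10) for i in range(25)]

limited = summarize(store, limit=10)            # what a limited query would report
everything = summarize(store, limit=len(store)) # what the caller never gets to see

print(limited)
print(everything)
```

The limited summary reports a count of 100 across 10 results, while the full store would have yielded 250 across 25. Nothing in the limited summary itself reveals the size of that gap.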
The author would like to thank his colleague David Waitzman for his helpful comments on this article, and for all his work in adding new features in DNSDB API. Any errors remaining in this article are the responsibility of the author.
We hope that this introduction to the dnsdbq summarize verb has been helpful and instructive for you.
The Farsight Security Sales Team can be reached at [email protected].
Joe St Sauver Ph.D. is a Distinguished Scientist with Farsight Security®, Inc.