How Many "Parts" (or "Labels") Does A Domain Name Typically Have?
I. Introduction
Domain names can be thought of as a series of “labels” (or “parts” or “chunks”) separated by dots.
We’re all familiar with domain names that have three labels, such as www.example.com.
It’s also not unusual to find domain names that have just two labels (such as uoregon.edu) or an “extra” label (as in www.matse.illinois.edu).
But what does the distribution of label counts really look like? Are there domain names with more than four labels? More than ten? We decided to look at a day’s worth of data from DNSDB and find out.
II. The Dataset
We arbitrarily selected June 28th, 2017 for this study. The DNSDB Export MTBL file for that date, dns.20170628.D.mtbl, is a medium-sized file, at 31,626,779,104 bytes. That mtbl file is in a compact binary format, but can be exported into a more-convenient text format for simple analyses.
We processed that file with the command pipeline:
$ dnstable_dump -r /export/dnstable/mtbl/dns.20170628.D.mtbl | grep -v ";;" | grep -v "^$" | grep -v ".in.addr." | grep -v ".ip6.arpa." | awk '{print $1}' > rrnames.txt
That command string dumps the rrnames (domain names) from the specified mtbl file, removing any:
-- comment lines (";;")
-- blank lines ("^$")
-- IPv4 inverse address records (".in.addr."; each unescaped dot matches any single character, so this pattern matches ".in-addr.") and
-- IPv6 inverse address records (".ip6.arpa.").
We then saved the remaining resource record names into a temporary file, rrnames.txt, as a first step.
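To make the effect of each filter concrete, here is a minimal sketch that applies the same grep and awk stages to a few made-up input lines. The function name filter_rrnames and the sample records are hypothetical, and the sample only approximates the general shape of dnstable_dump text output (a name followed by record data); it is not copied from real output.

```shell
# Same filter stages as the pipeline above, wrapped in a function so they can
# be tried on sample input. filter_rrnames is a hypothetical name.
filter_rrnames() {
    grep -v ";;" | grep -v "^$" | grep -v ".in.addr." | grep -v ".ip6.arpa." | awk '{print $1}'
}

# Made-up sample lines (assumed layout: owner name first, then record data).
printf '%s\n' \
    ';; sample comment line' \
    'www.example.com. IN A 192.0.2.1' \
    '' \
    '1.2.0.192.in-addr.arpa. IN PTR www.example.com.' \
    '1.0.0.2.ip6.arpa. IN NS ns1.example.com.' \
    'example.com. IN NS ns1.example.com.' \
  | filter_rrnames
# prints:
# www.example.com.
# example.com.
```

Only the two forward names survive; the comment, the blank line, and both inverse-zone names are dropped.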
Once that job finished, we had a list of unsorted resource record names in rrnames.txt, including potentially many duplicate domain names. To eliminate duplicates, we sorted and uniq’d those names:
$ sort -u < rrnames.txt > uniq-rrnames.txt
We then counted the number of “dots” in each unique name, outputting one “dot count” per record. Conveniently, because each name ends in a trailing dot, a name with two labels will also have two dots, a name with three labels will have three dots, etc. We counted those dots by saying:
$ sed 's/[^\.]//g' < uniq-rrnames.txt | awk '{ print length }' > rrname-uniq-dot-count.txt
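As a quick check of the dot-counting idea, the sketch below runs it against the three example names from the introduction, written with their trailing dots. The function names count_dots and count_dots_awk are hypothetical; the awk variant is shown only for comparison and is not the command used for the study.

```shell
# Delete every character that is not a dot, then print the length of what
# remains: that length is the number of dots (and so the number of labels).
count_dots() {
    sed 's/[^.]//g' | awk '{ print length }'
}

# Alternative sketch: split each line on "." and count the separators.
count_dots_awk() {
    awk -F. '{ print NF - 1 }'
}

printf '%s\n' 'uoregon.edu.' 'www.example.com.' 'www.matse.illinois.edu.' | count_dots
# prints:
# 2
# 3
# 4
```

Both functions should agree on ordinary names such as these.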
We then tallied how often each dot count occurred and sorted those tallies in descending order by observed frequency. Generally, that ordering tracked the number of labels, but in a couple of cases we had to move rows around manually to get them into ascending order by label count.
$ sort rrname-uniq-dot-count.txt | uniq -c | sort -nr > rrname-uniq-dot-count-tops.txt
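If the goal is a table in ascending order by label count, one way to avoid manual reordering would be to sort the dot counts numerically before tallying them. The following is a minimal sketch on made-up input; the function name tally_by_label_count is hypothetical, and the final awk stage only strips the padding that uniq -c adds.

```shell
# Sort dot counts numerically, tally identical values, and print
# "frequency label_count" pairs in ascending order by label count.
tally_by_label_count() {
    sort -n | uniq -c | awk '{ print $1, $2 }'
}

# Made-up dot counts, one per line.
printf '%s\n' 3 2 3 4 3 2 | tally_by_label_count
# prints:
# 2 2
# 3 3
# 1 4
```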
III. The Distribution of Label Counts
The number of unique RRnames with <N> labels can be seen in Figure 1. When reviewing Figure 1, note that this graph has log-linear axes.
Figure 1.

Summarizing that graph, 99.98% of all unique RRnames seen have 10 or fewer labels, and 78.36% have just 1, 2 or 3 labels:
Table 1.
Count    # of Labels    % obs    Cum %

sn-5ualdn7l.gvt1.com_-.edgedl_-.release2_-.aj7czeerus1-_-.59.0.3071.115_58.0.3029.110_chrome_updater.exe.58af.un-6a4b.v3.url.zvelo.com.ecc-untangle.ecc-clinic[dot]org.
0.68.106.66.73_-.data_-.03ef32f2a1e6db82_-.r5sn-5ualdn7l.gvt1.com_-.edgedl_-.release2_-.aj7czeerus1-_-.59.0.3071.115_58.0.3029.110_chrome_updater.exe.58af.un-6a4b.v3.url.zvelo.com.ecc-untangle.ecc-clinic[dot]org.
paypal.com.us.webapps.mpp.home.signin.country.x.us.locale.x.en.us.mpp.account.selection.customer.personal.account.info.privacy.legal.contact.home.request.form-check.com.yandex[dot]ru.
Long names similar to these are sometimes used to confuse users and potentially lure them into visiting a malware-dropping or phishing site.
Other names looked like:
daewoo.daewoo.daewoo.daihatsu.daihatsu.daihatsu.daihatsu.daihatsu.daihatsu.bmw.suzuki.bmw.bmw.bmw.bmw.subaru.subaru.subaru.subaru.subaru.subaru.subaru.bmw.bmw.bmw.bmw.test.auto.testquelle[dot]de.
daewoo.daewoo.jeep.jeep.jeep.jeep.daewoo.rover.rover.rover.rover.nissan.nissan.nissan.daewoo.daewoo.daewoo.daewoo.daewoo.daewoo.daihatsu.suzuki.suzuki.suzuki.suzuki.subaru.test.staubsauger.testquelle[dot]de.
daewoo.daewoo.rover.rover.rover.rover.rover.rover.rover.rover.donkervoort.donkervoort.rover.rover.rover.rover.rover.rover.rover.rover.audi.audi.audi.audi.audi.lexus.lexus.donkervoort.was-tun-bei-motorschaden[dot]de.
These names may have been crafted in this format in a misguided attempt at improving search engine rankings.
V. Conclusion
You now know a bit more about how many labels typically make up a domain name on the Internet, and you may be able to see how you could use DNSDB Export datasets to explore DNS-related questions of your own. Please contact Farsight Sales at [email protected] or visit https://www.farsightsecurity.com/order-services/ for more information about obtaining access to DNSDB Export datasets.
Appendix 1. Raw Data
Frequency     # of Labels
1,535         1
96,557,627    2
85,837,909    3
22,353,662    4
16,056,670    5
3,094,765     6
6,560,952     7
1,783,962     8
193,365       9
279,332       10
7,994         11
4,039         12
6,590         13
2,378         14
2,248         15
1,426         16
859           17
679           18
555           19
616           20
500           21
520           22
419           23
336           24
393           25
552           26
342           27
214           28
779           29
6,997         30
191           31
169           32
162           33
154           34
168           35
151           36
109           37
108           38
88            39
91            40
72            41
63            42
57            43
57            44
50            45
44            46
44            47
43            48
42            49
40            50
38            51
38            52
38            53
39            54
36            55
35            56
37            57
32            58
33            59
29            60
24            61
15            62
10            63
1             64
2             67
1             70
1             71
1             72
1             79
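Percentage and cumulative-percentage columns like those named in the Table 1 header can be derived from frequency data of this kind. The sketch below is one possible way to do that, not necessarily how the original table was produced. It assumes input lines of the form "frequency label_count", already in ascending order by label count and without thousands separators; the function name add_percentages and the sample numbers are hypothetical.

```shell
# Read "frequency label_count" pairs, then print each pair followed by its
# share of the total and the running cumulative share, both as percentages.
add_percentages() {
    awk '{ count[NR] = $1; labels[NR] = $2; total += $1 }
         END {
             for (i = 1; i <= NR; i++) {
                 cum += count[i]
                 printf "%d %d %.2f %.2f\n", count[i], labels[i],
                        100 * count[i] / total, 100 * cum / total
             }
         }'
}

# Made-up frequencies that sum to 10, to keep the arithmetic easy to check.
printf '%s\n' '1 1' '5 2' '3 3' '1 4' | add_percentages
# prints:
# 1 1 10.00 10.00
# 5 2 50.00 60.00
# 3 3 30.00 90.00
# 1 4 10.00 100.00
```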
Joe St Sauver, Ph.D., is a Scientist for Farsight Security, Inc.