DNSDB Glob Reference Guide
Glob flexible searches
Globbing is an advanced form of wildcard search: more powerful than DNSDB’s Standard Search left-hand or right-hand wildcards, but less expressive than Farsight Compatible Regular Expressions (FCRE). Globs can be simpler to write than regular expressions, especially for API users who are not familiar with regular expression syntax.
In general, Farsight’s glob implementation follows standard Unix glob(7) semantics, but not what’s sometimes referred to as “extended globbing.”
Glob searches are evaluated against the DNS master file form of the hostnames (aka rrnames) and rdata values, which by design contains only printable ASCII characters. All non-printable characters, including octets outside the ASCII range, are converted to “\DDD” escape sequences, where “DDD” is a three-digit decimal number per RFC 1035. This escaping is only applicable to RData (RHS) queries.
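The escaping described above can be illustrated with a minimal Python sketch. The function name `to_master_form` is hypothetical, and the sketch covers only the rule stated here (non-printable octets become \DDD); it does not attempt the other escapes that a full RFC 1035 master file renderer would apply.

```python
def to_master_form(raw: bytes) -> str:
    """Render raw octets as printable ASCII, escaping everything else as \\DDD."""
    out = []
    for octet in raw:
        if 0x20 <= octet <= 0x7E:       # printable ASCII, space included
            out.append(chr(octet))
        else:                           # control characters and octets >= 0x7F
            out.append(f"\\{octet:03d}")
    return "".join(out)


print(to_master_form(b"a\tb"))          # a\009b
print(to_master_form("café".encode()))  # caf\195\169
```

A glob that needs to match one of these sequences has to match the backslash and the three digits as ordinary characters.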
Glob Syntax
A glob is a string of printable characters with the following characters given special meaning:
- * matches any zero or more characters.
- ? matches exactly one character.
- [ begins a character class. Any of the contained characters or ranges will match.
- ] ends a character class.
- \ escapes the next character (but not within a character class).
Any other character in a globbing pattern is matched exactly as written, except that matching is not case sensitive.
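As a rough illustration of these rules, the following Python sketch translates a glob that uses only *, ? and \ into an anchored, case-insensitive regular expression. The names `simple_glob_to_regex` and `glob_match` are hypothetical, character classes are deliberately left out, and this is an approximation for experimenting locally, not a description of how DNSDB evaluates globs.

```python
import re


def simple_glob_to_regex(glob: str) -> re.Pattern:
    """Translate a glob using only *, ? and \\ into a case-insensitive regex."""
    parts = []
    i = 0
    while i < len(glob):
        ch = glob[i]
        if ch == "*":
            parts.append(".*")                # zero or more characters
        elif ch == "?":
            parts.append(".")                 # exactly one character
        elif ch == "\\" and i + 1 < len(glob):
            i += 1
            parts.append(re.escape(glob[i]))  # escaped character is a literal
        else:
            parts.append(re.escape(ch))       # everything else matches itself
        i += 1
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def glob_match(glob: str, text: str) -> bool:
    # fullmatch models the implicit front and back anchors of a glob
    return simple_glob_to_regex(glob).fullmatch(text) is not None


print(glob_match("WWW.*.com.", "www.example.com."))  # True
print(glob_match("*.com", "example.com."))           # False: no trailing dot
print(glob_match(r"a\\009b", r"a\009b"))             # True: escaped backslash
```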
Character Class Syntax
A character class is a set of characters enclosed between an opening [ and a closing ]. A simple example is [m-z1-3], which matches the characters m through z and 1 through 3.
Within the character class, the following characters are handled specially:
- ! as the first character after the opening [ denotes a negated character class, i.e. a class which matches any character not listed in the remainder of the class.
- ] as the first character after the opening [ or [! encodes a literal ] as a member of the class. A ] after the first character after the opening [ or [! ends the character class.
- - as the first character after the opening [ or [!, or as the last character before the closing ], encodes a literal - as a member of the character class. Between two characters A and B, it encodes the range of characters between A and B, inclusive, as members of the character class. The character A must occur before B in ASCII encoding.
The sequences [. and [= are not allowed between the opening [ or [! and the closing ], to prevent confusion with unsupported POSIX collation sequences and collation classes.
If the sequence [: appears in a character class, it must be the beginning of one of the following POSIX character classes:
- [:alnum:] matches alphanumeric characters 0-9, A-Z, and a-z.
- [:alpha:] matches alphabetic characters A-Z and a-z.
- [:blank:] matches blank characters (space and tab).
  - Only printable characters occur in searchable strings and space is the only printable whitespace character, thus use of [:blank:] is equivalent to a space character.
  - Tabs in data appear as the escape sequence \009 and can be matched with \\009.
- [:cntrl:] matches control characters.
  - Only printable characters occur in searchable strings, so [:cntrl:] will not match any characters.
  - Control characters in data will appear as \DDD escape sequences. To match one of those, you will need to backslash-quote the backslash. Thus to match \004, use \\004.
- [:digit:] matches decimal digits 0-9.
- [:graph:] matches any printable character other than space.
  - Only printable characters occur in searchable strings, thus a character class containing [:graph:] is equivalent to [! ] (negated character class containing only a space).
- [:lower:] matches lower case alphabetic characters a-z.
  - Hostnames will be folded to lower case, thus use of [:lower:] is equivalent to [:alpha:].
- [:print:] matches any printable character.
  - Only printable characters occur in searchable strings, so [:print:] will match any character.
- [:punct:] matches punctuation characters (printable characters other than space and [:alnum:]).
- [:space:] matches any whitespace character.
  - The space character is the only printable whitespace character, thus use of [:space:] is equivalent to a space character.
- [:upper:] matches upper case alphabetic characters A-Z.
  - Since all of our data is indexed as lower-case, this is not useful as it is equivalent to [:lower:].
- [:xdigit:] matches hexadecimal digits 0-9, a-f, and A-F.
The above named character classes must appear inside an enclosing [ and ], e.g. [[:digit:][:punct:]] to match a digit or punctuation character. Without the enclosing brackets, [:digit:] will match the characters :, d, i, g, or t.
Neither the above character classes nor a character range may begin or end a character range. For example, the character class expressions [0-[:alpha:]] and [a-n-z] are invalid.
All other characters between the opening [ or [! and the closing ] are added to the character class, including the backslash \ character.
There is no way to express a character class containing only a single ! character, because a ! immediately after the opening [ always begins a negated class.
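The character class rules in this section can be sketched as a small Python parser. Everything here is illustrative: `parse_class` and `POSIX_CLASSES` are hypothetical names, the function expects one complete class such as [m-z1-3] rather than a whole glob, and members are folded to lower case on the assumption that the text being matched has been lower-cased as well.

```python
import string

PRINTABLE = {chr(c) for c in range(0x20, 0x7F)}  # space through ~
POSIX_CLASSES = {
    "alnum": set(string.ascii_letters + string.digits),
    "alpha": set(string.ascii_letters),
    "blank": {" "},             # tabs only ever appear escaped
    "cntrl": set(),             # control characters only ever appear escaped
    "digit": set(string.digits),
    "graph": PRINTABLE - {" "},
    "lower": set(string.ascii_lowercase),
    "print": set(PRINTABLE),
    "punct": set(string.punctuation),
    "space": {" "},
    "upper": set(string.ascii_uppercase),
    "xdigit": set(string.hexdigits),
}


def parse_class(cls: str):
    """Return (negated, members) for one complete class such as "[m-z1-3]"."""
    if len(cls) < 3 or cls[0] != "[" or cls[-1] != "]":
        raise ValueError("not a complete character class")
    end = len(cls) - 1                  # index of the closing ]
    i = 1
    negated = cls[i] == "!"
    if negated:
        i += 1
    first = i                           # a ] at this position is a literal
    if first >= end:
        raise ValueError("empty class; [!] cannot express a lone !")
    members = set()
    after_item = False                  # True right after a range or named class
    while i < end:
        ch = cls[i]
        nxt = cls[i + 1] if i + 1 < end else ""
        if ch == "-" and after_item and nxt:
            raise ValueError("a range or named class may not begin a range")
        after_item = False
        if ch == "]" and i != first:
            raise ValueError("unexpected ] inside class")
        if ch == "[" and nxt in (".", "="):
            raise ValueError("[. and [= are not supported")
        if ch == "[" and nxt == ":":
            close = cls.find(":]", i + 2, end)
            name = cls[i + 2:close] if close != -1 else ""
            if name not in POSIX_CLASSES:
                raise ValueError("[: must begin a known POSIX class")
            members |= POSIX_CLASSES[name]
            i = close + 2
            after_item = True
            continue
        if nxt == "-" and i + 2 < end:  # range: the '-' is not the last character
            hi = cls[i + 2]
            if hi == "[" and cls[i + 3] == ":":
                raise ValueError("a named class may not end a range")
            if ch > hi:
                raise ValueError("range start must sort before range end")
            members |= {chr(c) for c in range(ord(ch), ord(hi) + 1)}
            i += 3
            after_item = True
            continue
        members.add(ch)                 # any other character, including backslash
        i += 1
    return negated, frozenset(m.lower() for m in members)
```

For example, `parse_class("[!]]")` yields a negated class whose only member is ], while `parse_class("[a-n-z]")` and `parse_class("[0-[:alpha:]]")` raise `ValueError`.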
Important notes
- Glob searches are not case sensitive.
- Globbing patterns are “anchored” front and back by default. (This is a major difference from FCRE.)
- All hostnames (rrnames) in the DNS dataset end in a ., which must be accounted for in globs.
  - Therefore, a search for *.com will not match any hostnames. A glob that searches in rrnames must end in something that matches a ., so *.com. would match what was intended.
- All well-formed rdata we currently index in the DNS dataset ends in a . or a ", which should be accounted for in globs.
  - Therefore, a glob that searches in rdata should end in something that matches a . or a ".
- There must be at least two consecutive non-wildcard characters in the pattern. The implicit front and back anchors each count as a non-wildcard character.
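The last rule can be pre-checked before submitting a query. The sketch below is one simplified reading of it, not the service's actual validation: the function name `meets_literal_rule` is hypothetical, only * and ? are treated as wildcards, each implicit anchor counts as one non-wildcard character, and character classes are not modeled.

```python
def meets_literal_rule(glob: str) -> bool:
    """Return True if the glob has two consecutive non-wildcard characters."""
    if not glob:
        return False                 # assumption: an empty pattern is invalid
    run = 1                          # implicit front anchor
    i = 0
    while i < len(glob):
        if glob[i] in "*?":
            run = 0                  # a wildcard breaks the run
        else:
            if glob[i] == "\\" and i + 1 < len(glob):
                i += 1               # an escaped character is one literal
            run += 1
            if run >= 2:
                return True
        i += 1
    return run + 1 >= 2              # implicit back anchor extends the final run


print(meets_literal_rule("*smoke*"))  # True: "sm" are consecutive literals
print(meets_literal_rule("?."))       # True: "." plus the back anchor
print(meets_literal_rule("*a*"))      # False: "a" is isolated by wildcards
```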
Examples
To match hostnames with a label containing the word “smoke”:
- use glob *smoke* in an rrnames search
- Examples of results:
  - smokeping.pdf.ac.
  - smoke.tesla.ac.
To match hostnames with a label containing the word “cider” but not containing “hard”:
- use glob *cider* in an rrnames search, with an exclude filter of *hard*
- Examples of results:
  - ciderpress.ca.
  - colombus.citycider2018.eventbrite.ca.
To match hostnames with a label ending in “www” and a later label starting with “com”:
- use glob *www.*.com* in an rrnames search
- Examples of results:
  - www.example.com.
  - dev-www.subdomain.example.com.
  - www.example.com.cdn.net.
  - stage-www.dev.community.org.
To match hostnames starting with “www.” and ending in “.com.”:
- use glob www.*.com. in an rrnames search
- Examples of results:
  - www.example.com.
  - www.subdomain.example.com.
To match hostnames starting with “www.” and ending with “.com.” with no other dots in between:
- this cannot be done in a general way using globs; use a regular expression instead.
To match hostnames starting with “www” optionally preceded by a “dev-” or “stage-” prefix in a .net or .edu domain:
- this cannot be done in a general way using globs; use a regular expression instead.
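To show what such regular expressions might look like, here is a sketch written in Python's `re` syntax. It is only an illustration of the idea: FCRE syntax may differ in its details, so the patterns should be checked against the FCRE documentation before use, and the variable names are hypothetical.

```python
import re

# www.<exactly one label>.com. : [^.]+ forbids any further dots in between
single_label = re.compile(r"^www\.[^.]+\.com\.$", re.IGNORECASE)

# optional dev- or stage- prefix, then www., ending in .net. or .edu.
prefixed = re.compile(r"^(dev-|stage-)?www\.(.*\.)?(net|edu)\.$", re.IGNORECASE)

print(bool(single_label.match("www.example.com.")))            # True
print(bool(single_label.match("www.subdomain.example.com.")))  # False
print(bool(prefixed.match("stage-www.example.net.")))          # True
print(bool(prefixed.match("test-www.example.net.")))           # False
```

The negated class and the optional group are what a glob cannot express here: * always matches dots too, and globs have no alternation or optional elements.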
To match TXT records encoding an SPF policy with a ~all default:
- use glob "v=spf1 * ~all" in an rdata search
- Examples of results:
  - "v=spf1 a mx ~all"
  - "v=spf1 a 10.2.0.0/16 ~all"
To match single character domain names (which are really two character domain names when you add the implicit trailing ‘.’):
- use glob ?. in an rrnames or rdata search
- Examples of results:
  - a.
  - 0.
To match “bri” followed by exactly three characters of any kind, followed by “morning”, followed by anything (or nothing); each question mark matches exactly one character:
- use glob bri???morning* in an rrnames search
- Examples of results:
  - brightmorning.com.
  - brightmorningtoday.com.
To match “ns” followed by any single digit followed by anything (or nothing) and ending in “.net.”:
- use glob ns[0-9]*.net. in an rrnames search
- Examples of results:
  - ns0.fsi.net.
  - ns0abc.fsi.net.
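The examples in this section can be tried out locally with Python's standard `fnmatch` module, which implements a similar anchored glob syntax. This is only an approximation: `fnmatch` does not support backslash escapes or POSIX named classes, and the helper name `approx_glob_match` and the name hardcider.example.ca. are hypothetical, added purely to show the exclude filter.

```python
from fnmatch import fnmatchcase


def approx_glob_match(glob: str, name: str) -> bool:
    # Lower-case both sides to model case-insensitive matching; fnmatchcase
    # already requires the whole string to match, like the implicit anchors.
    return fnmatchcase(name.lower(), glob.lower())


print(approx_glob_match("*www.*.com*", "stage-www.dev.community.org."))  # True
print(approx_glob_match("www.*.com.", "www.subdomain.example.com."))     # True
print(approx_glob_match("*.com", "example.com."))                        # False
print(approx_glob_match("ns[0-9]*.net.", "ns0abc.fsi.net."))             # True
print(approx_glob_match('"v=spf1 * ~all"', '"v=spf1 a mx ~all"'))        # True

# Glob plus exclude filter, as in the "cider" example
names = ["ciderpress.ca.", "hardcider.example.ca."]
kept = [n for n in names
        if approx_glob_match("*cider*", n) and not approx_glob_match("*hard*", n)]
print(kept)  # ['ciderpress.ca.']
```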
Additional Information
- Farsight Flexible Search Reference Guide has technical details on Flexible Search.
- The Flex API extensions to the RESTful DNSDB APIv2 are documented in Flex API protocol.
- See https://www.domaintools.com/resources/user-guides/ for more information on the DNSDB API specifications.
- Introducing DNSDB 2.0
- What’s a Regular Expression?
- What is Globbing?