# DNS - Domain Name System - RFC 1035
# Pattern attributes: great slow fast
# Protocol groups: networking ietf_internet_standard
# Wiki: http://www.protocolinfo.org/wiki/DNS

# Thanks to Sebastien Bechet <s.bechet AT av7.net> for TLD detection
# improvements

# While RFC 2181 says "Occasionally it is assumed that the Domain Name
# System serves only the purpose of mapping Internet host names to data,
# and mapping Internet addresses to host names.  This is not correct, the
# DNS is a general (if somewhat limited) hierarchical database, and can
# store almost any kind of data, for almost any purpose.", we will assume
# just that, because that represents the vast majority of DNS traffic.

# The packet starts with a 2-byte random ID number and 2 bytes of flags that
# aren't easy to match on.

# The first thing that is matchable is QDCOUNT, the number of queries.
# Despite the fact that you can apparently ask for up to 65535 things at a
# time, usually you ask for only one, and I doubt you ever ask for zero.
# Let's allow up to two, just in case (even though I can't find any
# situation that generates more than one).

# Next come the ANCOUNT, NSCOUNT, and ARCOUNT fields, each of which could be
# zero or some smallish number. They are not matchable except by length
# (up to 6 bytes).
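# A minimal Python sketch of the header layout described above, assuming the
# standard 12-byte RFC 1035 header in network byte order. The function name
# parse_dns_header and the sample bytes are made up for illustration; they are
# not part of this pattern file.

```python
import struct


def parse_dns_header(packet: bytes) -> dict:
    """Unpack the six 16-bit fields of the fixed DNS header."""
    if len(packet) < 12:
        raise ValueError("a DNS header needs at least 12 bytes")
    ident, flags, qdcount, ancount, nscount, arcount = struct.unpack(
        "!6H", packet[:12]
    )
    return {
        "id": ident,          # random per-query ID, not useful for matching
        "flags": flags,       # QR/opcode/RD/... bits, hard to match on
        "qdcount": qdcount,   # number of questions, almost always 1
        "ancount": ancount,
        "nscount": nscount,
        "arcount": arcount,
    }


# Hypothetical header of a one-question query with recursion desired.
sample = bytes.fromhex("1a2b" "0100" "0001" "0000" "0000" "0000")
header = parse_dns_header(sample)
```

# For a typical query the three trailing counts are all zero, so once null
# bytes are stripped the QDCOUNT is left as a single 0x01 byte.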

# The next matchable thing is the queried name. The first byte indicates the
# length of the first label of the name, which is limited to 63
# (0x3F == '?'). The next byte has to be a letter (for domain names) or a
# digit (for reverse lookups). Then there can be any combination of
# letters, digits, hyphens, and 0x01-0x3F length markers.
# Then we check for the presence of a top-level domain at some later point.
# This is indicated by a 0x02-0x06 length byte and at least two letters,
# followed by no more than four more letters.
# Note that this will miss the very few queries that are for a TLD alone,
# e.g. "host museum".

# http://www.icann.org/tlds   http://www.iana.org/cctld/cctld-whois.htm
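# To make the length-marker layout concrete, here is a small Python sketch of
# how a dotted name is laid out on the wire, assuming plain ASCII labels and
# no name compression. encode_qname is a hypothetical helper, not something
# this file uses.

```python
def encode_qname(name: str) -> bytes:
    """Encode a dotted name as length-prefixed labels (RFC 1035, section 3.1)."""
    wire = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"label length out of range: {label!r}")
        wire.append(len(raw))  # length marker, 0x01-0x3F
        wire += raw
    wire.append(0)  # the zero-length root label ends the name
    return bytes(wire)


wire = encode_qname("www.example.com")
print(wire)  # b'\x03www\x07example\x03com\x00'
```

# Here the TLD "com" is preceded by 0x03, which falls inside the 0x02-0x06
# range the pattern looks for. A bare TLD such as "museum" has its length
# byte as the very first byte of the name.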

# Next is the QTYPE field, which has valid values 1-16 (although this
# could probably be restricted further, since many are rare) and \x1c for
# IPv6 AAAA records (and maybe more?).  It should follow immediately after
# the TLD (and some stripped-out nulls).

# Next is QCLASS, which has valid values 1-4 and 255, except that 2 is never
# used. I'm not sure whether 3 and 4 are used, so I'll include them.
# 1 = Internet, 255 = any.
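# A hedged Python sketch of the same QTYPE/QCLASS check on an unstripped
# packet, assuming the four bytes that follow the name's terminating zero.
# The set names and question_tail_ok are illustrative only.

```python
import struct

# Values accepted by the description above.
ALLOWED_QTYPES = set(range(1, 17)) | {0x1C}   # 1-16, plus 28 (AAAA)
ALLOWED_QCLASSES = {1, 3, 4, 255}             # 2 is left out on purpose


def question_tail_ok(tail: bytes) -> bool:
    """Check the 2-byte QTYPE and 2-byte QCLASS that follow the query name."""
    if len(tail) < 4:
        return False
    qtype, qclass = struct.unpack("!HH", tail[:4])
    return qtype in ALLOWED_QTYPES and qclass in ALLOWED_QCLASSES


a_query_ok = question_tail_ok(b"\x00\x01\x00\x01")      # QTYPE A, class IN
aaaa_query_ok = question_tail_ok(b"\x00\x1c\x00\x01")   # QTYPE AAAA, class IN
class2_ok = question_tail_ok(b"\x00\x01\x00\x02")       # class 2 is rejected
```

# Both fields are 16 bits wide with a zero high byte for these values, so
# after null stripping each one shrinks to a single byte.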

# If we wanted to match queries and responses separately, there could be
# more specifics after this for the responses.

# Here's a sane way of doing it:

# This way assumes that TLDs are any alphabetic string 2-6 characters long.
# If TLDs are added, this is a good fallback.
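# The following Python sketch is a reconstruction of the fallback idea from
# the comments in this file, not the pattern itself. It assumes the matcher
# sees the payload with null bytes removed and letters lowercased; FALLBACK
# and looks_like_dns are hypothetical names.

```python
import re

FALLBACK = re.compile(
    rb"^.?.?.?.?"                  # ID and flags, minus any stripped nulls
    rb"[\x01\x02]"                 # QDCOUNT of one or two
    rb".?.?.?.?.?.?"               # ANCOUNT, NSCOUNT, ARCOUNT (up to 6 bytes)
    rb"[\x01-\x3f][a-z0-9]"        # first length marker, then letter or digit
    rb"[\x01-\x3fa-z]*"            # letters, digits, hyphens, length markers
    rb"[\x02-\x06][a-z][a-z][a-z]?[a-z]?[a-z]?[a-z]?"  # TLD of 2-6 letters
    rb"[\x01-\x10\x1c]"            # QTYPE
    rb"[\x01\x03\x04\xff]",        # QCLASS
    re.DOTALL,
)


def looks_like_dns(payload: bytes) -> bool:
    """Apply the fallback pattern after stripping nulls and lowercasing."""
    normalized = payload.replace(b"\x00", b"").lower()
    return FALLBACK.match(normalized) is not None


# Hypothetical A query for www.example.com.
query = (
    bytes.fromhex("1a2b" "0100" "0001" "0000" "0000" "0000")
    + b"\x03www\x07example\x03com\x00"
    + b"\x00\x01\x00\x01"
)
is_dns = looks_like_dns(query)
is_http_dns = looks_like_dns(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
```

# The range 0x01-0x3F already covers the digits and the hyphen, so the middle
# character class only needs to add a-z.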

# If you have more processing power than I do, you can substitute this for
# the [a-z][a-z][a-z]?[a-z]?[a-z]?[a-z]?