# DNS - Domain Name System - RFC 1035
# Pattern attributes: great slow fast
# Protocol groups: networking ietf_internet_standard
# Wiki: http://www.protocolinfo.org/wiki/DNS

# Thanks to Sebastien Bechet <s.bechet AT av7.net> for TLD detection
# improvements

# While RFC 2181 says "Occasionally it is assumed that the Domain Name
# System serves only the purpose of mapping Internet host names to data,
# and mapping Internet addresses to host names. This is not correct, the
# DNS is a general (if somewhat limited) hierarchical database, and can
# store almost any kind of data, for almost any purpose.", we will assume
# just that, because that represents the vast majority of DNS traffic.

# The packet starts with a 2-byte random ID number and 2 bytes of flags that
# aren't easy to match on.

# The first thing that is matchable is QDCOUNT, the number of queries.
# Despite the fact that you can apparently ask for up to 65535
# things at a time, usually you only ask for one and I doubt you ever ask for
# zero. Let's allow up to two, just in case (even though I can't find any
# situation that generates more than one).

# Next come the ANCOUNT, NSCOUNT, and ARCOUNT fields, which could be null
# or some smallish number, not matchable except by length (up to 6).

# The next matchable thing is the query address. The first byte indicates the
# length of the first part of the address, which is limited to 63 (0x3F == '?').
# The next byte has to be a letter (for domain names) or number (for reverse lookups).
# Then there can be any combination of
# letters, digits, hyphens, and 0x01-0x3F length markers.
# Then we check for the presence of a top-level domain at some later point.
# This is indicated by a 0x02-0x06 and at least two letters, followed by no
# more than four more letters.
# Note that this will miss the very few queries that are for a TLD alone,
# e.g. "host museum" (195.7.77.17)
#
# http://www.icann.org/tlds http://www.iana.org/cctld/cctld-whois.htm

# Next is the QTYPE field, which has valid values 1-16 (although this
# could probably be restricted further since many are rare) and \x1c for
# IPv6 (and maybe more?). It should follow immediately after the TLD
# (and some stripped-out nulls).

# Next is QCLASS, which has valid values 1-4 and 255, except 2 is never used.
# I'm not sure if 3 and 4 are used, so I'll include them. 1=Internet 255=any

# If we wanted to match queries and responses separately, there could be
# more specifics after this for the responses.

dns
# here's a sane way of doing it
^.?.?.?.?[\x01\x02].?.?.?.?.?.?[\x01-?][a-z0-9][\x01-?a-z]*[\x02-\x06][a-z][a-z][fglmoprstuvz]?[aeop]?(um)?[\x01-\x10\x1c][\x01\x03\x04\xFF]

# This way assumes that TLDs are any alpha string 2-6 characters long.
# If TLDs are added, this is a good fallback.
#^.?.?.?.?[\x01\x02].?.?.?.?.?.?[\x01-?][a-z0-9][\x01-?a-z]*[\x02-\x06][a-z][a-z][a-z]?[a-z]?[a-z]?[a-z]?[\x01-\x10][\x01\x03\x04\xFF]

# If you have more processing power than me, you can substitute this for
# the [a-z][a-z][a-z]?[a-z]?[a-z]?[a-z]?
#(aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org|pro|ac|ad|ae|af|ag|ai|al|am|an|ao|aq|ar|as|at|au|aw|az|ba|bb|bd|be|bf|bg|bh|bi|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cc|cd|cf|cg|ch|ci|ck|cl|cm|cn|co|cr|cu|cv|cx|cy|cz|de|dj|dk|dm|do|dz|ec|ee|eg|eh|er|es|et|fi|fj|fk|fm|fo|fr|ga|gd|ge|gf|gg|gh|gi|gl|gm|gn|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|im|in|io|iq|ir|is|it|je|jm|jo|jp|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly|ma|mc|md|mg|mh|mk|ml|mm|mn|mo|mp|mq|mr|ms|mt|mu|mv|mw|mx|my|mz|na|nc|ne|nf|ng|ni|nl|no|np|nr|nu|nz|om|pa|pe|pf|pg|ph|pk|pl|pm|pn|pr|ps|pt|pw|py|qa|re|ro|ru|rw|sa|sb|sc|sd|se|sg|sh|si|sj|sk|sl|sm|sn|so|sr|st|sv|sy|sz|tc|td|tf|tg|th|tj|tk|tm|tn|to|tp|tr|tt|tv|tw|tz|ua|ug|uk|um|us|uy|uz|va|vc|ve|vg|vi|vn|vu|wf|ws|ye|yt|yu|za|zm|zw)
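
# A rough way to sanity-check the active pattern offline: the Python sketch
# below builds a DNS query by hand and runs the regex over it. It is only an
# approximation. It assumes the matcher sees the payload with null bytes
# stripped (as the QTYPE note above implies) and lowercased, and it uses
# Python's re engine rather than the one that actually loads this file.
# build_query, preprocess and looks_like_dns are hypothetical helper names.

```python
import re
import struct

# The active pattern from this file, copied verbatim as a bytes regex.
# re.DOTALL makes "." match every byte, including 0x0a.
DNS_PATTERN = re.compile(
    rb"^.?.?.?.?[\x01\x02].?.?.?.?.?.?[\x01-?][a-z0-9][\x01-?a-z]*"
    rb"[\x02-\x06][a-z][a-z][fglmoprstuvz]?[aeop]?(um)?"
    rb"[\x01-\x10\x1c][\x01\x03\x04\xFF]",
    re.DOTALL,
)


def build_query(name, qtype=1, qclass=1, txid=0x1234):
    """Build a minimal single-question DNS query (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, AN/NS/ARCOUNT=0.
    header = struct.pack(">HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    labels = name.encode("ascii").split(b".")
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    return header + qname + struct.pack(">HH", qtype, qclass)


def preprocess(payload):
    """Assumed preprocessing: drop null bytes and lowercase ASCII letters."""
    return payload.replace(b"\x00", b"").lower()


def looks_like_dns(payload):
    """Return True if the preprocessed payload matches the pattern."""
    return DNS_PATTERN.search(preprocess(payload)) is not None


if __name__ == "__main__":
    for name, qtype in [("www.example.com", 1), ("www.example.org", 28), ("museum", 1)]:
        print(name, qtype, looks_like_dns(build_query(name, qtype)))
```

# In this sketch the A and AAAA queries match, while the bare "museum" query
# does not, which is the TLD-only miss described in the comments above.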