README revision 17680
@(#) $Header: README,v 1.49 96/07/15 18:28:23 leres Exp $ (LBL)

TCPDUMP 3.2.1
Lawrence Berkeley National Laboratory
Network Research Group
tcpdump@ee.lbl.gov
ftp://ftp.ee.lbl.gov/tcpdump.tar.Z

This directory contains source code for tcpdump, a tool for network
monitoring and data acquisition.  The original distribution is
available via anonymous ftp to ftp.ee.lbl.gov, in tcpdump.tar.Z.

Tcpdump now uses libpcap, a system-independent interface for user-level
packet capture.  Before building tcpdump, you must first retrieve and
build libpcap, also from LBL, in:

	ftp://ftp.ee.lbl.gov/libpcap.tar.Z

Once libpcap is built (either install it or make sure it's in
../libpcap), you can build tcpdump using the procedure in the INSTALL
file.

The program is loosely based on SMI's "etherfind" although none
of the etherfind code remains.  It was originally written by Van
Jacobson as part of an ongoing research project to investigate and
improve tcp and internet gateway performance.  The parts of the
program originally taken from Sun's etherfind were later re-written
by Steven McCanne of LBL.  To ensure that there would be no vestige
of proprietary code in tcpdump, Steve wrote these pieces from the
specification given by the manual entry, with no access to the
source of tcpdump or etherfind.

Over the past few years, tcpdump has been steadily improved
by the excellent contributions from the Internet community
(just browse through the CHANGES file).  We are grateful for
all the input.

Richard Stevens gives an excellent treatment of the Internet
protocols in his book ``TCP/IP Illustrated, Volume 1''.
If you want to learn more about tcpdump and how to interpret
its output, pick up this book.

Some tools for viewing and analyzing tcpdump trace files are available
from the Internet Traffic Archive:

	http://town.hall.org/Archives/pub/ITA/

Problems, bugs, questions, desirable enhancements, source code
contributions, etc., should be sent to the email address
"tcpdump@ee.lbl.gov".

 - Steve McCanne
   Craig Leres
   Van Jacobson
-------------------------------------
This directory also contains some short awk programs intended as
examples of ways to reduce tcpdump data when you're tracking
particular network problems:

send-ack.awk
	Simplifies the tcpdump trace for an ftp (or other unidirectional
	tcp transfer).  Since we assume that one host only sends and
	the other only acks, all address information is left off and
	we just note if the packet is a "send" or an "ack".

	There is one output line per line of the original trace.
	Field 1 is the packet time in decimal seconds, relative
	to the start of the conversation.  Field 2 is the delta-time
	from the previous packet.  Field 3 is the packet type/direction:
	"send" means data going from the sender to the receiver, "ack"
	means an ack going from the receiver to the sender.  A
	preceding "*" indicates that the data is a retransmission.
	A preceding "-" indicates a hole in the sequence space
	(i.e., missing packet(s)); a "#" means an odd-size (not max
	seg size) packet.  Field 4 has the packet flags
	(same format as raw trace).  Field 5 is the sequence
	number (start seq. num for sender, next expected seq number
	for acks).  The number in parens following an ack is
	the delta-time from the first send of the packet to the
	ack.  A number in parens following a send is the
	delta-time from the first send of the packet to the
	current send (on duplicate packets only).  Duplicate
	sends or acks have a number in square brackets showing
	the number of duplicates so far.

	Here is a short sample from near the start of an ftp:
		3.00    0.20   send . 512
		3.20    0.20    ack . 1024  (0.20)
		3.20    0.00   send P 1024
		3.40    0.20    ack . 1536  (0.20)
		3.80    0.40 * send . 0  (3.80) [2]
		3.82    0.02 *  ack . 1536  (0.62) [2]
	Three seconds into the conversation, bytes 512 through 1023
	were sent.  200ms later they were acked.  Shortly thereafter
	bytes 1024-1535 were sent and again acked after 200ms.
	Then, for no apparent reason, 0-511 is retransmitted, 3.8
	seconds after its initial send (the round trip time for this
	ftp was 1sec, +-500ms).  Since the receiver is expecting
	1536, 1536 is re-acked when 0 arrives.
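
	The bookkeeping described above can be sketched in a few lines
	of Python.  This is not the code in send-ack.awk, only an
	illustration under stated assumptions: the trace has already
	been reduced to (time, kind, seq) tuples in time order, every
	data packet carries exactly packetsize bytes, and the flags
	field and the "-" and "#" markers are left out.  The name
	annotate is made up for this example.

```python
def annotate(events, packetsize):
    """Return one summary line per (time, kind, seq) event.

    kind is "send" (seq = first byte sent) or "ack" (seq = next byte
    expected).  Assumes fixed-size packets; flags are not modelled.
    """
    first_send = {}   # seq -> time the packet was first sent
    send_count = {}   # seq -> number of times it was sent
    ack_count = {}    # seq -> number of times it was acked
    lines = []
    start = prev = None
    for time, kind, seq in events:
        if start is None:
            start = prev = time
        rel, delta = time - start, time - prev
        prev = time
        if kind == "send":
            first_send.setdefault(seq, time)
            count = send_count[seq] = send_count.get(seq, 0) + 1
            # Elapsed time is shown on duplicate sends only.
            elapsed = time - first_send[seq] if count > 1 else None
        else:
            count = ack_count[seq] = ack_count.get(seq, 0) + 1
            # An ack for N answers the packet that started at N - packetsize.
            sent = first_send.get(seq - packetsize)
            elapsed = time - sent if sent is not None else None
        parts = [f"{rel:.2f}", f"{delta:.2f}"]
        if count > 1:
            parts.append("*")            # retransmission or duplicate ack
        parts += [kind, str(seq)]
        if elapsed is not None:
            parts.append(f"({elapsed:.2f})")
        if count > 1:
            parts.append(f"[{count}]")
        lines.append(" ".join(parts))
    return lines


# The events behind the sample above, plus an assumed first send of 0.
events = [
    (0.00, "send", 0), (3.00, "send", 512), (3.20, "ack", 1024),
    (3.20, "send", 1024), (3.40, "ack", 1536), (3.80, "send", 0),
    (3.82, "ack", 1536),
]
lines = annotate(events, 512)
print("\n".join(lines))
```

	The last two lines printed are "3.80 0.40 * send 0 (3.80) [2]"
	and "3.82 0.02 * ack 1536 (0.62) [2]", which agree with the
	sample apart from the missing flags field and column spacing.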

packetdat.awk
	Computes chunk summary data for an ftp (or similar
	unidirectional tcp transfer). [A "chunk" refers to
	a chunk of the sequence space -- essentially the packet
	sequence number divided by the max segment size.]

	A summary line is printed showing the number of chunks,
	the number of packets it took to send that many chunks
	(if there are no lost or duplicated packets, the number
	of packets should equal the number of chunks) and the
	number of acks.

	Following the summary line is one line of information
	per chunk.  The line contains eight fields:
	   1 - the chunk number
	   2 - the start sequence number for this chunk
	   3 - time of first send
	   4 - time of last send
	   5 - time of first ack
	   6 - time of last ack
	   7 - number of times chunk was sent
	   8 - number of times chunk was acked
	(All times are in decimal seconds, relative to the start
	of the conversation.)

	As an example, here is the first part of the output for
	an ftp trace:

	# 134 chunks.  536 packets sent.  508 acks.
	1       1       0.00    5.80    0.20    0.20    4       1
	2       513     0.28    6.20    0.40    0.40    4       1
	3       1025    1.16    6.32    1.20    1.20    4       1
	4       1561    1.86    15.00   2.00    2.00    6       1
	5       2049    2.16    15.44   2.20    2.20    5       1
	6       2585    2.64    16.44   2.80    2.80    5       1
	7       3073    3.00    16.66   3.20    3.20    4       1
	8       3609    3.20    17.24   3.40    5.82    4       11
	9       4097    6.02    6.58    6.20    6.80    2       5

	This says that 134 chunks were transferred (about 70K
	since the average packet size was 512 bytes).  It took
	536 packets to transfer the data (i.e., on the average
	each chunk was transmitted four times).  Looking at,
	say, chunk 4, we see it represents the 512 bytes of
	sequence space from 1537 to 2048.  It was first sent
	1.86 seconds into the conversation.  It was last
	sent 15 seconds into the conversation and was sent
	a total of 6 times (i.e., it was retransmitted every
	2 seconds on the average).  It was acked once, 140ms
	after it first arrived.
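
	The chunk bookkeeping above can be sketched in Python as
	follows.  This is not the code in packetdat.awk; it assumes
	the trace has been reduced to (time, kind, seq) tuples in
	time order, that sequence numbers start at first_seq (1 in
	the example output above), and that an ack carries the next
	expected sequence number.  The name summarize_chunks and the
	chunk-number formula are this example's own reading of the
	description.

```python
def summarize_chunks(events, packetsize, first_seq=1):
    """Group (time, kind, seq) events into chunks of packetsize bytes.

    Returns (summary_line, rows) with one eight-field row per chunk:
    number, start seq, first/last send, first/last ack, sends, acks.
    """
    chunks = {}
    packets = acks = 0
    for time, kind, seq in events:
        if kind == "send":
            packets += 1
            number = (seq - first_seq) // packetsize + 1
        else:
            acks += 1
            # An ack for byte N covers data up to byte N - 1.
            number = (seq - 1 - first_seq) // packetsize + 1
        if number < 1:
            continue                     # ack that covers no data yet
        c = chunks.setdefault(number, {"start": None, "sends": [], "acks": []})
        if kind == "send":
            if c["start"] is None:
                c["start"] = seq         # start seq of the first send seen
            c["sends"].append(time)
        else:
            c["acks"].append(time)
    rows = []
    for number in sorted(chunks):
        c = chunks[number]
        s, a = c["sends"], c["acks"]
        rows.append((number, c["start"],
                     s[0] if s else None, s[-1] if s else None,
                     a[0] if a else None, a[-1] if a else None,
                     len(s), len(a)))
    summary = f"# {len(chunks)} chunks.  {packets} packets sent.  {acks} acks."
    return summary, rows


# A made-up five-packet trace: chunk 1 is sent twice, chunk 2 once.
events = [
    (0.00, "send", 1), (0.20, "ack", 513), (0.28, "send", 513),
    (0.40, "ack", 1025), (5.80, "send", 1),
]
summary, rows = summarize_chunks(events, 512)
print(summary)
for row in rows:
    print(*row, sep="\t")
```

	Checking the formula against the table above: a packet that
	starts at sequence number 1561 falls in chunk
	(1561 - 1) // 512 + 1 = 4, as listed.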

stime.awk
atime.awk
	Output one line per send or ack, respectively, in the form
		<time> <seq. number>
	where <time> is the time in seconds since the start of the
	transfer and <seq. number> is the sequence number being sent
	or acked.  I typically plot this data looking for suspicious
	patterns.
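
	For illustration, the same reduction in Python, assuming the
	trace is already a time-ordered list of (time, kind, seq)
	tuples.  The name time_seq is hypothetical and the awk
	scripts may differ in detail.

```python
def time_seq(events, kind):
    """Return "<time> <seq>" lines for events of the given kind.

    kind is "send" (as stime.awk is described) or "ack" (atime.awk).
    Times are relative to the first event in the trace.
    """
    start = None
    out = []
    for time, event_kind, seq in events:
        if start is None:
            start = time
        if event_kind == kind:
            out.append(f"{time - start:.2f} {seq}")
    return out


# A made-up trace whose clock starts at 10.0 seconds.
trace = [(10.0, "send", 0), (10.2, "ack", 512), (10.5, "send", 512)]
print("\n".join(time_seq(trace, "send")))
print("\n".join(time_seq(trace, "ack")))
```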


The problem I was looking at was the bulk-data-transfer
throughput of medium delay network paths (1-6 sec. round trip
time) under typical DARPA Internet conditions.  The trace of the
ftp transfer of a large file was used as the raw data source.
The method was:

  - On a local host (but not the Sun running tcpdump), connect to
    the remote ftp.

  - On the monitor Sun, start the trace going.  E.g.,
      tcpdump host local-host and remote-host and port ftp-data >tracefile

  - On local, do either a get or put of a large file (~500KB),
    preferably to the null device (to minimize effects like
    closing the receive window while waiting for a disk write).

  - When transfer is finished, stop tcpdump.  Use awk to make up
    two files of summary data (maxsize is the maximum packet size,
    tracedata is the file of tcpdump tracedata):
      awk -f send-ack.awk packetsize=maxsize tracedata >sa
      awk -f packetdat.awk packetsize=maxsize tracedata >pd

  - While the summary data files are printing, take a look at
    how the transfer behaved:
      awk -f stime.awk tracedata | xgraph
    (90% of what you learn seems to happen in this step).

  - Do all of the above steps several times, both directions,
    at different times of day, with different protocol
    implementations on the other end.

  - Using one of the Unix data analysis packages (in my case,
    S and Gary Perlman's Unix|Stat), spend a few months staring
    at the data.

  - Change something in the local protocol implementation and
    redo the steps above.

  - Once a week, tell your funding agent that you're discovering
    wonderful things and you'll write up that research report
    "real soon now".