<!doctype linuxdoc system>

<article>

<title>SS Utility: Quick Intro
<author>Alexey Kuznetsov, <tt/kuznet@ms2.inr.ac.ru/
<date>some_negative_number, 20 Sep 2001
<abstract>
<tt/ss/ is yet another utility to investigate sockets.
Functionally it is NOT better than <tt/netstat/ combined
with some perl/awk scripts, and though it is surely faster,
that is not enough to make it much better. :-)
So, stop reading this now and do not waste your time.
Well, certainly, it offers some functionality which current
netstat is still not able to provide, but surely will soon.
</abstract>

<sect>Why?

<p> The <tt>/proc</tt> interface is inadequate, unfortunately.
When the number of sockets is large enough, <tt/netstat/ or even
plain <tt>cat /proc/net/tcp</tt> causes nothing but pain and curses.
In linux-2.4 the disease became worse: even if the number
of sockets is small, reading <tt>/proc/net/tcp</tt> is quite slow.

This utility presents a new approach, which is supposed to scale
well. I am not going to describe technical details here and
will concentrate on the description of the command.
The only important thing to say is that it is a good idea
to load the module <tt/tcp_diag/, which can be found in the directory
<tt/Modules/ of <tt/iproute2/. If you do not do this, <tt/ss/
will still work, but it falls back to <tt>/proc</tt> and becomes slow
like <tt/netstat/, though still a bit faster (see section "Some numbers").

<sect>Old news

<p>
In the simplest form <tt/ss/ is equivalent to netstat
with some small deviations.

<itemize>
<item><tt/ss -t -a/ dumps all TCP sockets
<item><tt/ss -u -a/ dumps all UDP sockets
<item><tt/ss -w -a/ dumps all RAW sockets
<item><tt/ss -x -a/ dumps all UNIX sockets
</itemize>

<p>
Option <tt/-o/ shows TCP timers state.
Option <tt/-e/ shows some extended information.
Etc. etc. etc. It seems that all the options of netstat related to sockets
are supported, though not AX.25 and other oddities. :-)
If someone wants to, they can add support for decnet and ipx.
Some rudimentary support for them is already present in iproute2 libutils,
and I will be glad to see these new members.

<p>
However, standard functionality is a bit different:

<p>
The first: without option <tt/-a/, sockets in states
<tt/TIME-WAIT/ and <tt/SYN-RECV/ are skipped too.
It is a more reasonable default, I think.

<p>
The second: the format of UNIX sockets is different. It coincides
with that of tcp/udp. Though the standard kernel still does not allow
seeing write/read queues and the peer address of connected UNIX sockets,
a patch doing this exists.

<p>
The third: the default is to dump only TCP sockets, rather than all of the types.

<p>
The next: by default it does not resolve numeric host addresses (like <tt/ip/)!
Resolving is enabled with option <tt/-r/. Service names, usually stored
in local files, are resolved by default. Also, if the service database
does not contain a reference to a port, <tt/ss/ queries the system
<tt/rpcbind/. RPC services are prefixed with <tt/rpc./
Resolution of services may be suppressed with option <tt/-n/.

<p>
It does not accept "long" options (I dislike them, sorry).
So, the address family is given with a family identifier following
option <tt/-f/, to be aligned with iproute2 conventions.
Mostly, this is to allow the option parser to parse
addresses correctly, but as a side effect it really limits dumping
to sockets supporting the given family. Option <tt/-A/ followed
by a list of socket tables to dump is also supported.
Logically, the id of a socket table is different from the _address_ family,
which is another point of incompatibility. So, the id is one of
<tt/all/, <tt/tcp/, <tt/udp/,
<tt/raw/, <tt/inet/, <tt/unix/, <tt/packet/, <tt/netlink/. See?
Well, <tt/inet/ is just an abbreviation for <tt/tcp|udp|raw/
and it is not difficult to guess that <tt/packet/ allows
looking at packet sockets. Actually, there are also some other abbreviations,
e.g. <tt/unix_dgram/ selects only datagram UNIX sockets.

<p>
The next: well, I still do not know. :-)

<sect>Time to talk about new functionality.

<p>It is built-in filtering of socket lists.

<sect1> Filtering by state.

<p>
<tt/ss/ allows filtering by socket state, using the keywords
<tt/state/ and <tt/exclude/, each followed by a state
identifier.

<p>
State identifiers are the standard TCP state names (not listed here;
they are useless for you if you do not already know them)
or abbreviations:

<itemize>
<item><tt/all/        - for all the states
<item><tt/bucket/     - for TCP minisockets (<tt/TIME-WAIT|SYN-RECV/)
<item><tt/big/	      - all except for minisockets
<item><tt/connected/  - not closed and not listening
<item><tt/synchronized/ - connected and not <tt/SYN-SENT/
</itemize>

<p>
   E.g. to dump all tcp sockets except <tt/SYN-RECV/:

<tscreen><verb>
   ss exclude SYN-RECV
</verb></tscreen>

<p>
   If neither <tt/state/ nor <tt/exclude/ directives
   are present, the state filter defaults to <tt/all/ with option <tt/-a/,
   or otherwise to <tt/all/ excluding listening, syn-recv, time-wait
   and closed sockets.

<sect1> Filtering by addresses and ports.

<p>
The option list may contain an address/port filter.
It is a boolean expression which consists of the boolean operations
<tt/or/, <tt/and/, <tt/not/ and predicates.
Actually, all the flavors of names for boolean operations are eaten:
<tt/&amp/, <tt/&amp&amp/, <tt/|/, <tt/||/, <tt/!/, but do not forget
about the special meaning given to these symbols by unix shells, and escape
them correctly when used from the command line.

<p>
Predicates may be of the following kinds:

<itemize>
<item>A. Address/port match, where the address is checked against a mask
      and the port is either a wildcard or exact. It is one of:

<tscreen><verb>
	dst prefix:port
	src prefix:port
	src unix:STRING
	src link:protocol:ifindex
	src nl:channel:pid
</verb></tscreen>

      Both prefix and port may be absent or replaced with <tt/*/,
      which means wildcard. UNIX sockets use a more powerful scheme,
      matching socket names against shell wildcards. Also, the prefixes
      unix: and link: may be omitted if the address family is evident
      from context (with option <tt/-x/ or with <tt/-f unix/
      or with the <tt/unix/ keyword).

<p>
      E.g.

<tscreen><verb>
	dst 10.0.0.1
	dst 10.0.0.1:
	dst 10.0.0.1/32:
	dst 10.0.0.1:*
</verb></tscreen>
   are equivalent and mean sockets connected to
	                 any port on host 10.0.0.1

<tscreen><verb>
	dst 10.0.0.0/24:22
</verb></tscreen>
   sockets connected to port 22 on network
                          10.0.0.0...255.

<p>
      Note that the port is separated from the address by a colon, which creates
      trouble with IPv6 addresses. Generally, we interpret the last
      colon as splitting off the port. To allow giving IPv6 addresses,
      a trick like the one used in IPv6 HTTP URLs may be used:

<tscreen><verb>
      dst [::1]
</verb></tscreen>
       are sockets connected to ::1 on any port

<p>
      Another way is <tt>dst ::1/128</tt>. The slash helps to understand that
      the colon is part of the IPv6 address.

<p>
      Now we can add another alias for <tt/dst 10.0.0.1/:
      <tt/dst [10.0.0.1]/. :-)

<p>   The address may be a DNS name. In this case all the addresses are looked
      up (in all the address families, if it is not limited by option <tt/-f/
      or a special address prefix <tt/inet:/, <tt/inet6:/) and the resulting
      expression is <tt/or/ over all of them.

<item>   B. Port expressions:
<tscreen><verb>
      dport &gt= :1024
      dport != :22
      sport &lt :32000
</verb></tscreen>
      etc.

      All the relations: <tt/&lt/, <tt/&gt/, <tt/=/, <tt/&gt=/, <tt/&lt=/, <tt/==/,
      <tt/!=/, <tt/eq/, <tt/ge/, <tt/lt/, <tt/ne/...
      Use the variant which you like more, but do not forget to escape special
      characters when typing them on the command line. :-)

      Note that a port number syntactically coincides with case A!
      You may even add an IP address, but it will not participate
      in the comparison, except for <tt/==/ and <tt/!=/, which are equivalent
      to the corresponding predicates of type A. E.g.
<p>
<tt/dst 10.0.0.1:22/
    is equivalent to  <tt/dport eq 10.0.0.1:22/
      and
      <tt/not dst 10.0.0.1:22/     is equivalent to
 <tt/dport neq 10.0.0.1:22/

<item>C. Keyword <tt/autobound/. It matches sockets bound automatically
      on the local system.

</itemize>
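
<p>
The rule for splitting the port off an address can be sketched in a few
lines of Python. This is only an illustration of the rule as stated in this
section, not the parser used by <tt/ss/. The function name
<tt/split_pattern/ is hypothetical, and restricting the colon search to the
part after the slash is an assumption made to match the
<tt>::1/128</tt> example.

<tscreen><verb>
```python
def split_pattern(pattern):
    """Split 'prefix:port' into (prefix, port); '*' stands for a wildcard."""
    if pattern.startswith("["):
        # URL-style brackets: everything up to ']' is the address.
        close = pattern.index("]")
        prefix, rest = pattern[1:close], pattern[close + 1:]
        port = rest[1:] if rest.startswith(":") else ""
    else:
        # Assumption: only a colon after the optional '/masklen' splits a port.
        slash = pattern.find("/")
        colon = pattern.rfind(":", slash if slash >= 0 else 0)
        if colon >= 0:
            prefix, port = pattern[:colon], pattern[colon + 1:]
        else:
            prefix, port = pattern, ""
    # An absent prefix or port means wildcard.
    return (prefix or "*", port or "*")


if __name__ == "__main__":
    for text in ["10.0.0.1", "10.0.0.1:", "10.0.0.0/24:22",
                 "[::1]", "::1/128", ":1024", "::1"]:
        print(text, "->", split_pattern(text))
```
</verb></tscreen>

Note that a bare <tt/::1/ is still split at its last colon, which is exactly
the trouble that the brackets and the slash are meant to avoid.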

<sect> Examples

<p>
<itemize>
<item>1. List all the tcp sockets in state <tt/FIN-WAIT-1/ for our apache
   to network 193.233.7/24 and look at their timers:

<tscreen><verb>
   ss -o state fin-wait-1 \( sport = :http or sport = :https \) \
                          dst 193.233.7/24
</verb></tscreen>

   Oops, I forgot to say that a missing logical operation is
   equivalent to <tt/and/.

<item> 2. Well, now look at the rest...

<tscreen><verb>
   ss -o excl fin-wait-1
   ss state fin-wait-1 \( sport neq :http and sport neq :https \) \
                       or not dst 193.233.7/24
</verb></tscreen>

   Note that we have to make _two_ calls of ss to do this.
   The state match is always anded with the address/port match.
   The reason for this is purely technical: ss quickly skips
   non-matching states before parsing addresses, and I consider the
   ability to quickly skip gobs of time-wait and syn-recv sockets
   more important than logical generality.

<item> 3. So, let's look at all our sockets using autobound ports:

<tscreen><verb>
   ss -a -A all autobound
</verb></tscreen>

<item> 4. And finally find all the local processes connected
   to local X servers:

<tscreen><verb>
   ss -xp dst "/tmp/.X11-unix/*"
</verb></tscreen>

   Pardon, this does not work with the current kernel; patching is required.
   But we can still look at the server side:

<tscreen><verb>
   ss -x src "/tmp/.X11-unix/*"
</verb></tscreen>

</itemize>
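
<p>
That the two calls of example 2 together select exactly the sockets which
example 1 skips can be checked mechanically. The Python sketch below models
the three filters as plain predicates and compares them over a few sample
sockets. The sample values and all function names are hypothetical, ports
80 and 443 stand for <tt/http/ and <tt/https/, and the network is written
as 193.233.7.0/24 because the <tt/ipaddress/ module wants all four octets.

<tscreen><verb>
```python
import ipaddress
from itertools import product

NET = ipaddress.ip_network("193.233.7.0/24")
WEB_PORTS = (80, 443)  # http and https


def example1(state, sport, dst):
    """state fin-wait-1 ( sport = :http or sport = :https ) dst NET"""
    return (state == "fin-wait-1"
            and sport in WEB_PORTS
            and ipaddress.ip_address(dst) in NET)


def rest_call1(state, sport, dst):
    """excl fin-wait-1"""
    return state != "fin-wait-1"


def rest_call2(state, sport, dst):
    """state fin-wait-1 ( sport neq :http and sport neq :https ) or not dst NET

    The state match is anded with the whole address/port expression.
    """
    return (state == "fin-wait-1"
            and ((sport != 80 and sport != 443)
                 or ipaddress.ip_address(dst) not in NET))


# Hypothetical sample sockets covering every branch of the filters.
SAMPLES = list(product(
    ["fin-wait-1", "established", "time-wait"],
    [80, 443, 22],
    ["193.233.7.65", "10.0.0.1"],
))


def covers_the_rest(samples):
    """True if the two calls select exactly what example 1 does not."""
    return all(
        (rest_call1(*s) or rest_call2(*s)) == (not example1(*s))
        for s in samples
    )


if __name__ == "__main__":
    print(covers_the_rest(SAMPLES))
```
</verb></tscreen>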

<sect> Returning to ground: real manual

<p>
<sect1> Command arguments

<p> The general format of arguments to <tt/ss/ is:

<tscreen><verb>
       ss [ OPTIONS ] [ STATE-FILTER ] [ ADDRESS-FILTER ]
</verb></tscreen>

<sect2><tt/OPTIONS/
<p> <tt/OPTIONS/ is a list of single-letter options, using common unix
conventions.

<itemize>
<item><tt/-h/  - show help page
<item><tt/-?/  - the same, of course
<item><tt/-v/, <tt/-V/  - print version of <tt/ss/ and exit
<item><tt/-s/  - print summary statistics. This option does not parse
socket lists; it obtains the summary from various other sources. It is useful
when the number of sockets is so huge that parsing <tt>/proc/net/tcp</tt>
is painful.
<item><tt/-D FILE/  - do not display anything, just dump raw information
about TCP sockets to <tt/FILE/ after applying filters. If <tt/FILE/ is <tt/-/,
<tt/stdout/ is used.
<item><tt/-F FILE/  - read continuation of filter from <tt/FILE/.
Each line of <tt/FILE/ is interpreted like a single command line option.
If <tt/FILE/ is <tt/-/, <tt/stdin/ is used.
<item><tt/-r/  - try to resolve numeric address/ports
<item><tt/-n/  - do not try to resolve ports
<item><tt/-o/  - show some optional information, e.g. TCP timers
<item><tt/-i/  - show some information specific to TCP (RTO, congestion
window, slow start threshold etc.)
<item><tt/-e/  - show even more optional information
<item><tt/-m/  - show extended information on memory used by the socket.
It is available only with <tt/tcp_diag/ enabled.
<item><tt/-p/  - show list of processes owning the socket
<item><tt/-f FAMILY/ - default address family used for parsing addresses.
                 Also this option limits listing to sockets supporting
                 the given address family. Currently the following families
                 are supported: <tt/unix/, <tt/inet/, <tt/inet6/, <tt/link/,
                 <tt/netlink/.
<item><tt/-4/ - alias for <tt/-f inet/
<item><tt/-6/ - alias for <tt/-f inet6/
<item><tt/-0/ - alias for <tt/-f link/
<item><tt/-A LIST-OF-TABLES/ - list of socket tables to dump, separated
                 by commas. The following identifiers are understood:
                 <tt/all/, <tt/inet/, <tt/tcp/, <tt/udp/, <tt/raw/,
                 <tt/unix/, <tt/packet/, <tt/netlink/, <tt/unix_dgram/,
                 <tt/unix_stream/, <tt/packet_raw/, <tt/packet_dgram/.
<item><tt/-x/ - alias for <tt/-A unix/
<item><tt/-t/ - alias for <tt/-A tcp/
<item><tt/-u/ - alias for <tt/-A udp/
<item><tt/-w/ - alias for <tt/-A raw/
<item><tt/-a/ - show sockets of all the states. By default sockets
                in states <tt/LISTEN/, <tt/TIME-WAIT/, <tt/SYN-RECV/
                and <tt/CLOSE/ are skipped.
<item><tt/-l/ - show only sockets in state <tt/LISTEN/
</itemize>

<sect2><tt/STATE-FILTER/

<p><tt/STATE-FILTER/ allows constructing an arbitrary set of
states to match. Its syntax is a sequence of keywords <tt/state/
and <tt/exclude/, each followed by a state identifier.
Available identifiers are:

<p>
<itemize>
<item> All standard TCP states: <tt/established/, <tt/syn-sent/,
<tt/syn-recv/, <tt/fin-wait-1/, <tt/fin-wait-2/, <tt/time-wait/,
<tt/closed/, <tt/close-wait/, <tt/last-ack/, <tt/listen/ and <tt/closing/.

<item><tt/all/ - for all the states
<item><tt/connected/ - all the states except for <tt/listen/ and <tt/closed/
<item><tt/synchronized/ - all the <tt/connected/ states except for
<tt/syn-sent/
<item><tt/bucket/ - states which are maintained as minisockets, i.e.
<tt/time-wait/ and <tt/syn-recv/.
<item><tt/big/ - opposite to <tt/bucket/
</itemize>
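
<p>
How a sequence of <tt/state/ and <tt/exclude/ keywords builds a set of
states can be sketched in Python as follows. This is an illustration, not
the code of <tt/ss/. It assumes that a leading <tt/state/ starts from the
empty set and a leading <tt/exclude/ starts from all the states, which
agrees with the <tt/ss exclude SYN-RECV/ example but is not spelled out in
this manual. The names <tt/expand/ and <tt/state_filter/ are hypothetical.

<tscreen><verb>
```python
TCP_STATES = frozenset([
    "established", "syn-sent", "syn-recv", "fin-wait-1", "fin-wait-2",
    "time-wait", "closed", "close-wait", "last-ack", "listen", "closing",
])
BUCKET = frozenset(["time-wait", "syn-recv"])
CONNECTED = TCP_STATES - {"listen", "closed"}
ALIASES = {
    "all": TCP_STATES,
    "bucket": BUCKET,
    "big": TCP_STATES - BUCKET,
    "connected": CONNECTED,
    "synchronized": CONNECTED - {"syn-sent"},
}
# Default without -a: everything except listening, syn-recv, time-wait, closed.
DEFAULT = TCP_STATES - {"listen", "syn-recv", "time-wait", "closed"}


def expand(name):
    """Turn a state name or abbreviation into a set of state names."""
    name = name.lower()
    if name in ALIASES:
        return ALIASES[name]
    if name in TCP_STATES:
        return frozenset([name])
    raise ValueError("unknown state identifier: %r" % name)


def state_filter(words, show_all=False):
    """Fold ['state', X, 'exclude', Y, ...] into a set of state names."""
    if len(words) % 2:
        raise ValueError("each keyword needs a state identifier")
    selected = None
    for keyword, name in zip(words[0::2], words[1::2]):
        if keyword == "state":
            # Assumption: the first 'state' starts from the empty set.
            base = frozenset() if selected is None else selected
            selected = base | expand(name)
        elif keyword == "exclude":
            # Assumption: the first 'exclude' starts from all the states.
            base = TCP_STATES if selected is None else selected
            selected = base - expand(name)
        else:
            raise ValueError("expected 'state' or 'exclude': %r" % keyword)
    if selected is None:
        return TCP_STATES if show_all else DEFAULT
    return selected


if __name__ == "__main__":
    print(sorted(state_filter(["exclude", "SYN-RECV"])))
```
</verb></tscreen>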

<sect2><tt/ADDRESS-FILTER/

<p><tt/ADDRESS-FILTER/ is a boolean expression with operations <tt/and/, <tt/or/
and <tt/not/, which can be abbreviated in C style, e.g. as <tt/&amp/,
<tt/&amp&amp/.

<p>
Predicates check socket addresses, both local and remote.
There are the following kinds of predicates:

<itemize>
<item> <tt/dst ADDRESS_PATTERN/ - matches remote address and port
<item> <tt/src ADDRESS_PATTERN/ - matches local address and port
<item> <tt/dport RELOP PORT/    - compares remote port to a number
<item> <tt/sport RELOP PORT/    - compares local port to a number
<item> <tt/autobound/           - checks that socket is bound to an ephemeral
                                  port
</itemize>

<p><tt/RELOP/ is one of <tt/&lt=/, <tt/&gt=/, <tt/==/ etc.
To make this more convenient for use in a unix shell, alphabetic
FORTRAN-like notations <tt/le/, <tt/gt/ etc. are accepted as well.
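
<p>
The correspondence between the symbolic and the alphabetic spellings can be
written down as a small table. The Python sketch below lists only spellings
that appear in this document and evaluates a port predicate for a numeric
port. It is an illustration, not the filter compiler of <tt/ss/; the names
<tt/RELOPS/ and <tt/port_matches/ are hypothetical, and service names such
as <tt/:http/ are not handled.

<tscreen><verb>
```python
import operator

# Spellings taken from this document, grouped by meaning.
RELOPS = {
    "==": operator.eq, "=": operator.eq, "eq": operator.eq,
    "!=": operator.ne, "ne": operator.ne, "neq": operator.ne,
    ">=": operator.ge, "ge": operator.ge,
    "<=": operator.le, "le": operator.le,
    ">": operator.gt, "gt": operator.gt,
    "<": operator.lt, "lt": operator.lt,
}


def port_matches(socket_port, relop, port):
    """Evaluate 'dport RELOP PORT' or 'sport RELOP PORT'.

    PORT is written like ':1024'; only numeric ports are handled here.
    """
    return RELOPS[relop](socket_port, int(port.lstrip(":")))


if __name__ == "__main__":
    print(port_matches(8080, "ge", ":1024"))
```
</verb></tscreen>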

<p>The format and semantics of <tt/ADDRESS_PATTERN/ depend on the address
family.

<itemize>
<item><tt/inet/ - <tt/ADDRESS_PATTERN/ consists of an IP prefix, optionally
followed by a colon and port. If the prefix or port part is absent or replaced
with <tt/*/, this means wildcard match.
<item><tt/inet6/ - The same as <tt/inet/, only the prefix refers to an IPv6
address. Unlike <tt/inet/, the colon becomes ambiguous, so <tt/ss/ allows
using the scheme used in URLs, where the address is surrounded with
<tt/[/ ... <tt/]/.
<item><tt/unix/ - <tt/ADDRESS_PATTERN/ is a shell-style wildcard.
<item><tt/packet/ - the format looks like <tt/inet/, only the interface index
takes the place of the port and the link layer protocol id that of the address.
<item><tt/netlink/ - the format looks like <tt/inet/, only the socket pid
takes the place of the port and the netlink channel that of the address.
</itemize>

<p><tt/PORT/ is syntactically an <tt/ADDRESS_PATTERN/ with a wildcard
address part. Certainly, it is undefined for UNIX sockets.

<sect1> Environment variables

<p>
<tt/ss/ allows changing the source of information using various
environment variables:

<p>
<itemize>
<item> <tt/PROC_SLABINFO/  to override <tt>/proc/slabinfo</tt>
<item> <tt/PROC_NET_TCP/  to override <tt>/proc/net/tcp</tt>
<item> <tt/PROC_NET_UDP/  to override <tt>/proc/net/udp</tt>
<item> etc.
</itemize>

<p>
Variable <tt/PROC_ROOT/ allows changing the root of the whole <tt>/proc/</tt>
hierarchy.

<p>
Variable <tt/TCPDIAG_FILE/ tells <tt/ss/ to open a file instead of
requesting the kernel to dump information about TCP sockets.

<p> These variables are used mainly to investigate bug reports,
when dumps of files usually found in <tt>/proc/</tt> are received
by e-mail.
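
<p>
One plausible way to combine these variables is sketched below in Python.
It assumes that a file-specific variable such as <tt/PROC_NET_TCP/ takes
precedence over <tt/PROC_ROOT/; this manual does not state the precedence,
so treat it as an assumption. The function name <tt/proc_path/ and the
example paths are hypothetical.

<tscreen><verb>
```python
import os


def proc_path(env_name, relative, environ=None):
    """Pick the file to read for one /proc entry.

    env_name -- file-specific variable, e.g. 'PROC_NET_TCP'
    relative -- path below the proc root, e.g. 'net/tcp'
    """
    env = os.environ if environ is None else environ
    # Assumption: the file-specific override wins over PROC_ROOT.
    if env.get(env_name):
        return env[env_name]
    root = env.get("PROC_ROOT") or "/proc"
    return root.rstrip("/") + "/" + relative


if __name__ == "__main__":
    # Hypothetical directory holding files received with a bug report.
    report = {"PROC_ROOT": "/tmp/report/proc"}
    print(proc_path("PROC_NET_TCP", "net/tcp", report))
```
</verb></tscreen>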

<sect1> Output format

<p>Six columns. The first is <tt/Netid/; it denotes socket type and
transport protocol when they are ambiguous: <tt/tcp/, <tt/udp/, <tt/raw/,
<tt/u_str/ (an abbreviation for <tt/unix_stream/), <tt/u_dgr/ for UNIX
datagram sockets, <tt/nl/ for netlink, <tt/p_raw/ and <tt/p_dgr/ for
raw and datagram packet sockets. This column is optional; it is
hidden if the filter selects a unique netid.

<p>
The second column is <tt/State/. The socket state is displayed here.
The names are the standard TCP names, except for <tt/UNCONN/, which
cannot happen for TCP but is normal for unconnected sockets
of other types. Again, this column can be hidden.

<p>
Then come two columns (<tt/Recv-Q/ and <tt/Send-Q/) showing the amount of data
queued for receive and transmit.

<p>
And the last two columns display the local address and port of the socket
and its peer address, if the socket is connected.

<p>
If options <tt/-o/, <tt/-e/ or <tt/-p/ were given, the extra information is
displayed not in fixed positions but as space-separated pairs
<tt/option:value/. If the value is not a single number, it is presented
as a list of values, enclosed in <tt/(/ ... <tt/)/ and separated by
commas. E.g.

<tscreen><verb>
   timer:(keepalive,111min,0)
</verb></tscreen>
is the typical format for a TCP timer (option <tt/-o/).

<tscreen><verb>
   users:((X,113,3))
</verb></tscreen>
is typical for a list of users (option <tt/-p/).
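
<p>
A script that consumes this output has to cope with the nested lists.
The Python sketch below parses the <tt/option:value/ pairs exactly as they
are described here, using the two samples above as input. It is an
illustration under that assumption, the function names are hypothetical,
and it does not try to handle values containing spaces.

<tscreen><verb>
```python
def _parse(text, pos):
    """Parse one value starting at pos; return (value, next position)."""
    if pos < len(text) and text[pos] == "(":
        items = []
        pos += 1
        while True:
            item, pos = _parse(text, pos)
            items.append(item)
            if pos >= len(text):
                raise ValueError("unbalanced parentheses in %r" % text)
            if text[pos] == ",":
                pos += 1
            elif text[pos] == ")":
                return items, pos + 1
            else:
                raise ValueError("unexpected character in %r" % text)
    # A bare token runs up to the next ',' or ')'.
    end = pos
    while end < len(text) and text[end] not in ",)":
        end += 1
    return text[pos:end], end


def parse_value(text):
    """Parse a bare token or a parenthesised, comma-separated list."""
    value, rest = _parse(text, 0)
    if rest != len(text):
        raise ValueError("trailing characters in %r" % text)
    return value


def parse_options(text):
    """Turn 'name:value name:value ...' into a dictionary."""
    options = {}
    for pair in text.split():
        name, _, value = pair.partition(":")
        options[name] = parse_value(value)
    return options


if __name__ == "__main__":
    print(parse_options("timer:(keepalive,111min,0) users:((X,113,3))"))
```
</verb></tscreen>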

<sect>Some numbers

<p>
Well, let us use <tt/pidentd/ and the tool <tt/ibench/ to measure
its performance. It is 30 requests per second here. Nothing to test,
it is too slow. OK, let us patch pidentd with the patch from directory
Patches. After this it handles about 4300 requests per second
and becomes a handy tool to pollute socket tables with lots of timewait
buckets.

<p>
So, each test starts by polluting the tables with 30000 sockets,
then does a full dump of the table piped to wc, measuring
timings with time:

<p>Results:

<itemize>
<item> <tt/netstat -at/ - 15.6 seconds
<item> <tt/ss -atr/, but without <tt/tcp_diag/     - 5.4 seconds
<item> <tt/ss -atr/ with <tt/tcp_diag/     - 0.47 seconds
</itemize>

No comments. Though one comment is necessary: most of the time
without <tt/tcp_diag/ is wasted inside the kernel with completely
blocked networking. More than 10 seconds, yes. <tt/tcp_diag/
does the same work in 100 milliseconds of system time.

</article>