#
a95cd6e4 |
|
07-Dec-2023 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
siftr: refactor batch log processing Refactoring to perform the batch processing of log messaged in two phases. First cycling through a limited number of collected packets, and only thereafter freeing the processed packets. This prevents any chance of calling free while in a critical / spinlocked section. Reviewed By: tuexen Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D42949
|
#
fdafd315 |
|
24-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Automated cleanup of cdefs and other formatting Apply the following automated changes to try to eliminate no-longer-needed sys/cdefs.h includes as well as now-empty blank lines in a row. Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/ Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/ Remove /\n+#if.*\n#endif.*\n+/ Remove /^#if.*\n#endif.*\n/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/ Sponsored by: Netflix
|
#
d79a9edb |
|
23-Nov-2023 |
Mitchell Horne <mhorne@FreeBSD.org> |
alq, siftr: add panic/debugger checks to shutdown hooks Don't try to gracefully terminate the pkt_manager thread if the scheduler is not running. We should not attempt to shutdown ald if RB_NOSYNC is set, and must not if the scheduler is stopped (the function calls wakeup()). Reviewed by: markj MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D42340
|
#
fafb03ab |
|
25-Jul-2023 |
Cheng Cui <cc@FreeBSD.org> |
siftr: flush pkt_nodes to the log file in batch Reviewed by: rscheff, tuexen Differential Revision: https://reviews.freebsd.org/D41175
|
#
685dc743 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
2176c9ab |
|
01-Jul-2023 |
Michael Tuexen <tuexen@FreeBSD.org> |
dtrace: improve siftr probe Improve consistency of the field names with tcpsinfo_t: * Use mss instead of max_seg_size. * Use lport and rport instead of tcp_localport and tcp_foreignport. Use t_flags instead of flags to improve consistency with t_flags2. Add laddr and raddr, since the addresses were missing when compared to the output of siftr. Reviewed by: cc Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D40834
|
#
cd9da8d0 |
|
01-Jul-2023 |
Michael Tuexen <tuexen@FreeBSD.org> |
siftr: unbreak dtrace support This patch adds back some fields needed by the siftr probe, which were removed in https://cgit.freebsd.org/src/commit/?id=aa61cff4249c92689d7a1f15db49d65d082184cb With this fix, you can run dtrace scripts again when the siftr module is loaded. And the siftr probe works again. Reviewed by: rscheff Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D40826
|
#
dc2d26df |
|
30-Jun-2023 |
Michael Tuexen <tuexen@FreeBSD.org> |
siftr: provide dtrace with the correct pointer to data This fixes a bug which was introduced in the commit https://svnweb.freebsd.org/changeset/base/282276 Reviewed by: cc, rscheff MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D40806
|
#
7a52b570 |
|
30-May-2023 |
Cheng Cui <cc@FreeBSD.org> |
siftr: bring back the siftr_pkts_per_log feature Summary: this missing feature is introduced by commit aa61cff4249c Test Plan: verified in Emulab.net Reviewers: rscheff, tuexen Approved by: tuexen (mentor) Subscribers: imp, melifaro, glebius Differential Revision: https://reviews.freebsd.org/D40336
|
#
b71f2784 |
|
29-May-2023 |
Cheng Cui <cc@FreeBSD.org> |
siftr: convert this tval.tv_sec to type intmax_t to print across platforms Reviewers: rscheff, tuexen Approved by: tuexen (mentor) Subscribers: imp, melifaro, glebius Differential Revision: https://reviews.freebsd.org/D40323
|
#
78914cd6 |
|
29-May-2023 |
Cheng Cui <cc@FreeBSD.org> |
siftr: sync-up man page with recent code changes, and cleanup code Reviewers: rscheff, tuexen Approved by: tuexen (mentor) Subscribers: imp, melifaro, glebius Differential Revision: https://reviews.freebsd.org/D40322
|
#
d29a9a61 |
|
29-May-2023 |
Cheng Cui <cc@FreeBSD.org> |
siftr: fix a build error for powerpc and arm platforms Summary: This is introduced by commit aa61cff4249c. Reviewers: rscheff, tuexen Approved by: tuexen (mentor) Subscribers: imp, melifaro, glebius Differential Revision: https://reviews.freebsd.org/D40320
|
#
aa61cff4 |
|
27-May-2023 |
Cheng Cui <cc@FreeBSD.org> |
siftr: three changes that improve performance Summary: (1) use inp_flowid or a new packet hash for a flow identification (2) cache constant connection info into struct flow_hash_node (3) use compressed notation for IPv6 address representation Reviewers: rscheff, tuexen Approved by: tuexen (mentor) Subscribers: imp, melifaro, glebius Differential Revision: https://reviews.freebsd.org/D40302
|
#
4d846d26 |
|
10-May-2023 |
Warner Losh <imp@FreeBSD.org> |
spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix
|
#
60167184 |
|
26-Apr-2023 |
Cheng Cui <cc@FreeBSD.org> |
siftr: remove barely used hash generation per record Reviewers: rscheff, tuexen Approved by: rscheff, tuexen Subscribers: imp, melifaro, glebius Differential Revision: https://reviews.freebsd.org/D39835
|
#
d090464e |
|
25-Apr-2023 |
Cheng Cui <cc@FreeBSD.org> |
Change the unit of srtt and rto to usec, inspired by these in struct "tcp_info". Therefore, no need hz and tcp_rtt_scale in the headline of the log. Update the man page as well. Summary: Simplify srtt and rto values in siftr log. Test Plan: Tested in Emulab testbed: cc@s1:~ % sudo sysctl net.inet.siftr net.inet.siftr.port_filter: 0 net.inet.siftr.genhashes: 0 net.inet.siftr.ppl: 1 net.inet.siftr.logfile: /var/log/siftr.log net.inet.siftr.enabled: 0 cc@s1:~ % sudo sysctl net.inet.siftr.port_filter=5001 net.inet.siftr.port_filter: 0 -> 5001 cc@s1:~ % sudo sysctl net.inet.siftr.enabled=1 net.inet.siftr.enabled: 0 -> 1 cc@s1:~ % cc@s1:~ % iperf -c r1 -n 1M ------------------------------------------------------------ Client connecting to r1, TCP port 5001 TCP window size: 32.0 KByte (default) ------------------------------------------------------------ [ 1] local 10.1.1.2 port 33817 connected with 10.1.1.3 port 5001 [ ID] Interval Transfer Bandwidth [ 1] 0.00-0.91 sec 1.00 MBytes 9.22 Mbits/sec cc@s1:~ % sudo sysctl net.inet.siftr.enabled=0 net.inet.siftr.enabled: 1 -> 0 cc@s1:~ % ll /var/log/siftr.log -rw-r--r-- 1 root wheel 91K Apr 25 09:38 /var/log/siftr.log cc@s1:~ % cat /var/log/siftr.log enable_time_secs=1682437111 enable_time_usecs=121115 siftrver=1.3.0 sysname=FreeBSD sysver=1400088 ipmode=4 o,0x00000000,1682437125.907343,10.1.1.2,33817,10.1.1.3,5001,1073725440,1073725440,2,0,0,0,0,2,536,0,1,672,1000000,32768,0,65536,0,0,0,0,0 i,0x00000000,1682437126.106759,10.1.1.2,33817,10.1.1.3,5001,1073725440,1073725440,2,0,0,0,0,2,536,0,1,672,1000000,32768,0,65536,0,1,0,0,0 o,0x00000000,1682437126.106767,10.1.1.2,33817,10.1.1.3,5001,1073725440,14480,2,65535,65700,9,9,4,1460,201000,1,16778209,803000,33580,0,65700,0,0,0,0,0 o,0x00000000,1682437126.107141,10.1.1.2,33817,10.1.1.3,5001,1073725440,14480,2,65535,65700,9,9,4,1460,201000,1,16778208,803000,33580,60,65700,0,0,0,0,0 ... i,0x00000000,1682437127.016754,10.1.1.2,33817,10.1.1.3,5001,1073725440,606109,1030,748544,66048,9,9,9,1460,100812,1,1008,303000,475948,0,65700,0,0,0,0,0 o,0x00000000,1682437127.016759,10.1.1.2,33817,10.1.1.3,5001,1073725440,606109,1030,748544,66048,9,9,10,1460,100812,1,1011,303000,475948,0,65700,0,0,0,0,0 disable_time_secs=1682437131 disable_time_usecs=767582 num_inbound_tcp_pkts=371 num_outbound_tcp_pkts=186 total_tcp_pkts=557 num_inbound_skipped_pkts_malloc=0 num_outbound_skipped_pkts_malloc=0 num_inbound_skipped_pkts_tcpcb=0 num_outbound_skipped_pkts_tcpcb=0 num_inbound_skipped_pkts_inpcb=0 num_outbound_skipped_pkts_inpcb=0 total_skipped_tcp_pkts=0 flow_list=10.1.1.2;33817-10.1.1.3;5001, Reviewers: rscheff, tuexen Approved by: rscheff, tuexen Subscribers: imp, melifaro, glebius Differential Revision: https://reviews.freebsd.org/D39803
|
#
1f782fcc |
|
24-Apr-2023 |
Cheng Cui <cc@FreeBSD.org> |
Remove unused fields in siftr_stats. Thus, update the man page as well. Summary: Remove unused fields in siftr_stats. Thus, update the man page as well. Test Plan: Tested in Emulab testbed. Reviewers: rscheff, tuexen Approved by: rscheff, tuexen Subscribers: imp, melifaro, glebius Differential Revision: https://reviews.freebsd.org/D39776
|
#
caf32b26 |
|
14-Feb-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
pfil: add pfil_mem_{in,out}() and retire pfil_run_hooks() The 0b70e3e78b0 changed the original design of a single entry point into pfil(9) chains providing separate functions for the filtering points that always provide mbufs and know the direction of a flow. The motivation was to reduce branching. The logical continuation would be to do the same for the filtering points that always provide a memory pointer and retire the single entry point. o Hooks now provide two functions: one for mbufs and optional for memory pointers. o pfil_hook_args() has a new member and pfil_add_hook() has a requirement to zero out uninitialized data. Bump PFIL_VERSION. o As it was before, a hook function for a memory pointer may realloc into an mbuf. Such mbuf would be returned via a pointer that must be provided in argument. o The only hook that supports memory pointers is ipfw:default-link. It is rewritten to provide two functions. o All remaining uses of pfil_run_hooks() are converted to pfil_mem_in(). o Transparent union of pfil_packet_t and tricks to fix pointer alignment are retired. Internal pfil_realloc() reduces down to m_devget() and thus is retired, too. Reviewed by: mjg, ocochard Differential revision: https://reviews.freebsd.org/D37977
|
#
9c655838 |
|
06-Oct-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
siftr: apply filter early on Quickly check TCP port filter, before investing into expensive operations. No functional change. Obtained from: guest-ccui Reviewed By: #transport, tuexen, guest-ccui Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D36842
|
#
53af6903 |
|
06-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: remove INP_TIMEWAIT flag Mechanically cleanup INP_TIMEWAIT from the kernel sources. After 0d7445193ab, this commit shall not cause any functional changes. Note: this flag was very often checked together with INP_DROPPED. If we modify in_pcblookup*() not to return INP_DROPPED pcbs, we will be able to remove most of this checks and turn them to assertions. Some of them can be turned into assertions right now, but that should be carefully done on a case by case basis. Differential revision: https://reviews.freebsd.org/D36400
|
#
c198adf3 |
|
12-Sep-2022 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
siftr: spell PFIL_PASS correctly. Sponsored by: NetApp Sponsored by: Klara Inc. Differential Revision: https://reviews.freebsd.org/D36539
|
#
1241e8e7 |
|
07-Apr-2022 |
Tom Jones <thj@FreeBSD.org> |
siftr: expose t_flags2 in siftr output Replace the old snd_bwnd field which was kept for compatibility with the t_flags2 field from the tcpcb. This exposes in siftr logs interesting things such as ECN, PLPMTUD, Accurate ECN and if first bytes are complete. Reviewed by: rscheff (transport), chengc_netapp.com, debdrup (manpages) Sponsored by: NetApp, Inc. Sponsored by: Klara, Inc. X-NetApp-PR: #73 Differential Revision: https://reviews.freebsd.org/D34672
|
#
34d8ffff |
|
03-Nov-2021 |
Allan Jude <allanjude@FreeBSD.org> |
SIFTR: Fix compilation with -DSIFTR_IPV6 A few pieces of the SIFTR code that are behind #ifdef SIFTR_IPV6 have not been updated as APIs have changed, etc. Reported by: Alexander Sideropoulos <Alexander.Sideropoulos@netapp.com> Reviewed by: rscheff, lstewart Sponsored by: NetApp Sponsored by: Klara Inc. Differential Revision: https://reviews.freebsd.org/D32698
|
#
662c1305 |
|
01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
net: clean up empty lines in .c and .h files
|
#
7029da5c |
|
26-Feb-2020 |
Pawel Biernacki <kaktus@FreeBSD.org> |
Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718
|
#
481be5de |
|
12-Feb-2020 |
Randall Stewart <rrs@FreeBSD.org> |
White space cleanup -- remove trailing tab's or spaces from any line. Sponsored by: Netflix Inc.
|
#
746c7ae5 |
|
07-Oct-2019 |
Michael Tuexen <tuexen@FreeBSD.org> |
In r343587 a simple port filter as sysctl tunable was added to siftr. The new sysctl was not added to the siftr.4 man page at the time. This updates the man page, and removes one left over trailing whitespace. Submitted by: Richard Scheffenegger Reviewed by: bcr@ MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D21619
|
#
54739273 |
|
01-Feb-2019 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Repair siftr(4): PFIL_IN and PFIL_OUT are defines of some value, relying on them having particular values can break things.
|
#
b252313f |
|
31-Jan-2019 |
Gleb Smirnoff <glebius@FreeBSD.org> |
New pfil(9) KPI together with newborn pfil API and control utility. The KPI have been reviewed and cleansed of features that were planned back 20 years ago and never implemented. The pfil(9) internals have been made opaque to protocols with only returned types and function declarations exposed. The KPI is made more strict, but at the same time more extensible, as kernel uses same command structures that userland ioctl uses. In nutshell [KA]PI is about declaring filtering points, declaring filters and linking and unlinking them together. New [KA]PI makes it possible to reconfigure pfil(9) configuration: change order of hooks, rehook filter from one filtering point to a different one, disconnect a hook on output leaving it on input only, prepend/append a filter to existing list of filters. Now it possible for a single packet filter to provide multiple rulesets that may be linked to different points. Think of per-interface ACLs in Cisco or Juniper. None of existing packet filters yet support that, however limited usage is already possible, e.g. default ruleset can be moved to single interface, as soon as interface would pride their filtering points. Another future feature is possiblity to create pfil heads, that provide not an mbuf pointer but just a memory pointer with length. That would allow filtering at very early stages of a packet lifecycle, e.g. when packet has just been received by a NIC and no mbuf was yet allocated. Differential Revision: https://reviews.freebsd.org/D18951
|
#
435a8c15 |
|
30-Jan-2019 |
Brooks Davis <brooks@FreeBSD.org> |
Add a simple port filter to SIFTR. SIFTR does not allow any kind of filtering, but captures every packet processed by the TCP stack. Often, only a specific session or service is of interest, and doing the filtering in post-processing of the log adds to the overhead of SIFTR. This adds a new sysctl net.inet.siftr.port_filter. When set to zero, all packets get captured as previously. If set to any other value, only packets where either the source or the destination ports match, are captured in the log file. Submitted by: Richard Scheffenegger Reviewed by: Cheng Cui Differential Revision: https://reviews.freebsd.org/D18897
|
#
c53d6b90 |
|
18-Jan-2019 |
Brooks Davis <brooks@FreeBSD.org> |
Make SIFTR work again after r342125 (D18443). Correct a logic error. Only disable when already enabled or enable when disabled. Submitted by: Richard Scheffenegger Reviewed by: Cheng Cui Obtained from: Cheng Cui MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D18885
|
#
855acb84 |
|
15-Dec-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Fix bugs in plugable CC algorithm and siftr sysctls. Use the sysctl_handle_int() handler to write out the old value and read the new value into a temporary variable. Use the temporary variable for any checks of values rather than using the CAST_PTR_INT() macro on req->newptr. The prior usage read directly from userspace memory if the sysctl() was called correctly. This is unsafe and doesn't work at all on some architectures (at least i386.) In some cases, the code could also be tricked into reading from kernel memory and leaking limited information about the contents or crashing the system. This was true for CDG, newreno, and siftr on all platforms and true for i386 in all cases. The impact of this bug is largest in VIMAGE jails which have been configured to allow writing to these sysctls. Per discussion with the security officer, we will not be issuing an advisory for this issue as root access and a non-default config are required to be impacted. Reviewed by: markj, bz Discussed with: gordon (security officer) MFC after: 3 days Security: kernel information leak, local DoS (both require root) Differential Revision: https://reviews.freebsd.org/D18443
|
#
384a5c3c |
|
01-Oct-2018 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Add INP_INFO_WUNLOCK_ASSERT() macro and use it instead of INP_INFO_UNLOCK_ASSERT() in TCP-related code. For encapsulated traffic it is possible, that the code is running in net_epoch_preempt section, and INP_INFO_UNLOCK_ASSERT() is very strict assertion for such case. PR: 231428 Reviewed by: mmacy, tuexen Approved by: re (kib) Differential Revision: https://reviews.freebsd.org/D17335
|
#
2bf95012 |
|
05-Jul-2018 |
Andrew Turner <andrew@FreeBSD.org> |
Create a new macro for static DPCPU data. On arm64 (and possible other architectures) we are unable to use static DPCPU data in kernel modules. This is because the compiler will generate PC-relative accesses, however the runtime-linker expects to be able to relocate these. In preparation to fix this create two macros depending on if the data is global or static. Reviewed by: bz, emaste, markj Sponsored by: ABT Systems Ltd Differential Revision: https://reviews.freebsd.org/D16140
|
#
f6960e20 |
|
18-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
netinet silence warnings
|
#
fe267a55 |
|
27-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys: general adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. No functional change intended.
|
#
47cedcbd |
|
11-Mar-2016 |
John Baldwin <jhb@FreeBSD.org> |
Use SI_SUB_LAST instead of SI_SUB_SMP as the "catch-all" subsystem. Reviewed by: kib Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D5515
|
#
bee68e94 |
|
30-Apr-2015 |
George V. Neville-Neil <gnn@FreeBSD.org> |
Move the SIFTR DTrace probe out of the writing thread context and directly into the place where the data is collected.
|
#
981ad3ec |
|
29-Apr-2015 |
George V. Neville-Neil <gnn@FreeBSD.org> |
Brief demo script showing the various values that can be read via the new SIFTR statically defined tracepoint (SDT). Differential Revision: https://reviews.freebsd.org/D2387 Reviewed by: bz, markj
|
#
efca1668 |
|
24-Mar-2015 |
Lawrence Stewart <lstewart@FreeBSD.org> |
The addition of flowid and flowtype in r280233 and r280237 respectively forgot to extend the IPv6 packet node format string, which causes a build failure when SIFTR is compiled with IPv6 support. Reported by: Lars Eggert
|
#
d0a8b2a5 |
|
18-Mar-2015 |
Hiren Panchasara <hiren@FreeBSD.org> |
Add connection flow type to siftr(4). Suggested by: adrian Sponsored by: Limelight Networks
|
#
a025fd14 |
|
18-Mar-2015 |
Hiren Panchasara <hiren@FreeBSD.org> |
Add connection flowid to siftr(4). Reviewed by: lstewart MFC after: 1 week Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D2089
|
#
cfa6009e |
|
12-Nov-2014 |
Gleb Smirnoff <glebius@FreeBSD.org> |
In preparation of merging projects/sendfile, transform bare access to sb_cc member of struct sockbuf to a couple of inline functions: sbavail() and sbused() Right now they are equal, but once notion of "not ready socket buffer data", will be checked in, they are going to be different. Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
0e1152fc |
|
27-Oct-2014 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
The SYSCTL data pointers can come from userspace and must not be directly accessed. Although this will work on some platforms, it can throw an exception if the pointer is invalid and then panic the kernel. Add a missing SYSCTL_IN() of "SCTP_BASE_STATS" structure. MFC after: 3 days Sponsored by: Mellanox Technologies
|
#
c3322cb9 |
|
28-Oct-2013 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Include necessary headers that now are available due to pollution via if_var.h. Sponsored by: Netflix Sponsored by: Nginx, Inc.
|
#
1e0e83d7 |
|
06-Mar-2013 |
Lawrence Stewart <lstewart@FreeBSD.org> |
The hashmask returned by hashinit() is a valid index in the returned hash array. Fix a siftr(4) potential memory leak and INVARIANTS triggered kernel panic in hashdestroy() by ensuring the last array index in the flow counter hash table is flushed of entries. MFC after: 3 days
|
#
8f134647 |
|
22-Oct-2012 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Switch the entire IPv4 stack to keep the IP packet header in network byte order. Any host byte order processing is done in local variables and host byte order values are never[1] written to a packet. After this change a packet processed by the stack isn't modified at all[2] except for TTL. After this change a network stack hacker doesn't need to scratch his head trying to figure out what is the byte order at the given place in the stack. [1] One exception still remains. The raw sockets convert host byte order before pass a packet to an application. Probably this would remain for ages for compatibility. [2] The ip_input() still subtructs header len from ip->ip_len, but this is planned to be fixed soon. Reviewed by: luigi, Maxim Dounin <mdounin mdounin.ru> Tested by: ray, Olivier Cochard-Labbe <olivier cochard.me>
|
#
fa046d87 |
|
30-May-2011 |
Robert Watson <rwatson@FreeBSD.org> |
Decompose the current single inpcbinfo lock into two locks: - The existing ipi_lock continues to protect the global inpcb list and inpcb counter. This lock is now relegated to a small number of allocation and free operations, and occasional operations that walk all connections (including, awkwardly, certain UDP multicast receive operations -- something to revisit). - A new ipi_hash_lock protects the two inpcbinfo hash tables for looking up connections and bound sockets, manipulated using new INP_HASH_*() macros. This lock, combined with inpcb locks, protects the 4-tuple address space. Unlike the current ipi_lock, ipi_hash_lock follows the individual inpcb connection locks, so may be acquired while manipulating a connection on which a lock is already held, avoiding the need to acquire the inpcbinfo lock preemptively when a binding change might later be required. As a result, however, lookup operations necessarily go through a reference acquire while holding the lookup lock, later acquiring an inpcb lock -- if required. A new function in_pcblookup() looks up connections, and accepts flags indicating how to return the inpcb. Due to lock order changes, callers no longer need acquire locks before performing a lookup: the lookup routine will acquire the ipi_hash_lock as needed. In the future, it will also be able to use alternative lookup and locking strategies transparently to callers, such as pcbgroup lookup. New lookup flags are, supplementing the existing INPLOOKUP_WILDCARD flag: INPLOOKUP_RLOCKPCB - Acquire a read lock on the returned inpcb INPLOOKUP_WLOCKPCB - Acquire a write lock on the returned inpcb Callers must pass exactly one of these flags (for the time being). Some notes: - All protocols are updated to work within the new regime; especially, TCP, UDPv4, and UDPv6. pcbinfo ipi_lock acquisitions are largely eliminated, and global hash lock hold times are dramatically reduced compared to previous locking. - The TCP syncache still relies on the pcbinfo lock, something that we may want to revisit. - Support for reverting to the FreeBSD 7.x locking strategy in TCP input is no longer available -- hash lookup locks are now held only very briefly during inpcb lookup, rather than for potentially extended periods. However, the pcbinfo ipi_lock will still be acquired if a connection state might change such that a connection is added or removed. - Raw IP sockets continue to use the pcbinfo ipi_lock for protection, due to maintaining their own hash tables. - The interface in6_pcblookup_hash_locked() is maintained, which allows callers to acquire hash locks and perform one or more lookups atomically with 4-tuple allocation: this is required only for TCPv6, as there is no in6_pcbconnect_setup(), which there should be. - UDPv6 locking remains significantly more conservative than UDPv4 locking, which relates to source address selection. This needs attention, as it likely significantly reduces parallelism in this code for multithreaded socket use (such as in BIND). - In the UDPv4 and UDPv6 multicast cases, we need to revisit locking somewhat, as they relied on ipi_lock to stablise 4-tuple matches, which is no longer sufficient. A second check once the inpcb lock is held should do the trick, keeping the general case from requiring the inpcb lock for every inpcb visited. - This work reminds us that we need to revisit locking of the v4/v6 flags, which may be accessed lock-free both before and after this change. - Right now, a single lock name is used for the pcbhash lock -- this is undesirable, and probably another argument is required to take care of this (or a char array name field in the pcbinfo?). This is not an MFC candidate for 8.x due to its impact on lookup and locking semantics. It's possible some of these issues could be worked around with compatibility wrappers, if necessary. Reviewed by: bz Sponsored by: Juniper Networks, Inc.
|
#
6bed196c |
|
13-Apr-2011 |
Sergey Kandaurov <pluknet@FreeBSD.org> |
Staticize malloc types. Approved by: lstewart MFC after: 1 week
|
#
891b8ed4 |
|
12-Apr-2011 |
Lawrence Stewart <lstewart@FreeBSD.org> |
Use the full and proper company name for Swinburne University of Technology throughout the source tree. Requested by: Grenville Armitage, Director of CAIA at Swinburne University of Technology MFC after: 3 days
|
#
3e288e62 |
|
22-Nov-2010 |
Dimitry Andric <dim@FreeBSD.org> |
After some off-list discussion, revert a number of changes to the DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various people working on the affected files. A better long-term solution is still being considered. This reversal may give some modules empty set_pcpu or set_vnet sections, but these are harmless. Changes reverted: ------------------------------------------------------------------------ r215318 | dim | 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) | 4 lines Instead of unconditionally emitting .globl's for the __start_set_xxx and __stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu sections are actually defined. ------------------------------------------------------------------------ r215317 | dim | 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) | 3 lines Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree. ------------------------------------------------------------------------ r215316 | dim | 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) | 2 lines Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.
|
#
92ea5581 |
|
20-Nov-2010 |
Lawrence Stewart <lstewart@FreeBSD.org> |
Fix a minor code redundancy nit. MFC after: 3 days
|
#
052aec12 |
|
20-Nov-2010 |
Lawrence Stewart <lstewart@FreeBSD.org> |
When enabling or disabling SIFTR with a VIMAGE kernel, ensure we add or remove the SIFTR pfil(9) hook functions to or from all network stacks. This patch allows packets inbound or outbound from a vnet to be "seen" by SIFTR. Additional work is required to allow SIFTR to actually generate log messages for all vnet related packets because the siftr_findinpcb() function does not yet search for inpcbs across all vnets. This issue will be fixed separately. Reported and tested by: David Hayes <dahayes at swin edu au> MFC after: 3 days
|
#
31c6a003 |
|
14-Nov-2010 |
Dimitry Andric <dim@FreeBSD.org> |
Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout the tree.
|
#
619ad9eb |
|
11-Nov-2010 |
Lawrence Stewart <lstewart@FreeBSD.org> |
Standardise all Swinburne related copyright/licence statements throughout the tree in preparation for another large code import. Swinburne University is the legal entity that owns copyright and the 2-clause BSD licence is acceptable.
|
#
67f285a2 |
|
11-Nov-2010 |
Lawrence Stewart <lstewart@FreeBSD.org> |
The university does not require that its CRICOS number be included in source code. Remove all references from the tree. MFC after: 3 days
|
#
a7d5f7eb |
|
19-Oct-2010 |
Jamie Gritton <jamie@FreeBSD.org> |
A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
|
#
d4d3e218 |
|
25-Sep-2010 |
Lawrence Stewart <lstewart@FreeBSD.org> |
Log the number of segments currently in the reassembly queue. Sponsored by: FreeBSD Foundation
|
#
1c18314d |
|
16-Sep-2010 |
Andre Oppermann <andre@FreeBSD.org> |
Remove the TCP inflight bandwidth limiter as announced in r211315 to give way for the pluggable congestion control framework. It is the task of the congestion control algorithm to set the congestion window and amount of inflight data without external interference. In 'struct tcpcb' the variables previously used by the inflight limiter are renamed to spares to keep the ABI intact and to have some more space for future extensions. In 'struct tcp_info' the variable 'tcpi_snd_bwnd' is not removed to preserve the ABI. It is always set to 0. In siftr.c in 'struct pkt_node' the variable 'snd_bwnd' is not removed to preserve the ABI. It is always set to 0. These unused variable in the various structures may be reused in the future or garbage collected before the next release or at some other point when an ABI change happens anyway for other reasons. No MFC is planned. The inflight bandwidth limiter stays disabled by default in the other branches but remains available.
|
#
79848522 |
|
17-Jul-2010 |
Lawrence Stewart <lstewart@FreeBSD.org> |
- Move common code from the hook functions that fills in a packet node struct to a separate inline function. This further reduces duplicate code that didn't have a good reason to stay as it was. - Reorder the malloc of a pkt_node struct in the hook functions such that it only occurs if we managed to find a usable tcpcb associated with the packet. - Make the inp_locally_locked variable's type consistent with the prototype of siftr_siftdata(). Sponsored by: FreeBSD Foundation
|
#
adc5f010 |
|
13-Jul-2010 |
Lawrence Stewart <lstewart@FreeBSD.org> |
The SIFTR DPCPU statistics struct was not being zeroed between enable/disable cycles so the values would accumulate rather than reset for each cycle. Sponsored by: FreeBSD Foundation
|
#
985147de |
|
13-Jul-2010 |
Lawrence Stewart <lstewart@FreeBSD.org> |
Catch up with the rename of DPCPU_SUM to DPCPU_VARSUM in r209978. Sponsored by: FreeBSD Foundation
|
#
a5548bf6 |
|
03-Jul-2010 |
Lawrence Stewart <lstewart@FreeBSD.org> |
Import the Statistical Information For TCP Research (SIFTR) kernel module into FreeBSD. SIFTR logs a range of statistics on active TCP connections to a log file, providing the ability to make highly granular measurements of TCP connection state. The tool is aimed at system administrators, developers and researchers alike. Please take it for a spin and test it out - the man page should have all the information required to get you going. Many thanks go to the Cisco University Research Program Fund at Community Foundation Silicon Valley and the FreeBSD Foundation. Their support of our work at the Centre for Advanced Internet Architectures, Swinburne University of Technology is greatly appreciated. Sponsored by: Cisco URP, FreeBSD Foundation Reviewed by: dwmalone, gnn, rpaulo Tested by: Many on freebsd-current@ and elsewhere over the years MFC after: 1 month
|