History log of /linux-master/net/sctp/ulpqueue.c
Revision Date Author Comments
# b66aeddb 19-Oct-2022 Alexey Kodanev <aleksei.kodanev@bell-sw.com>

sctp: remove unnecessary NULL check in sctp_ulpq_tail_event()

After commit 013b96ec6461 ("sctp: Pass sk_buff_head explicitly to
sctp_ulpq_tail_event().") there is one more unneeded check of
skb_list for NULL.

Detected using the static analysis tool - Svace.
Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/20221019180735.161388-2-aleksei.kodanev@bell-sw.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 6fdfdef7 19-Oct-2022 Alexey Kodanev <aleksei.kodanev@bell-sw.com>

sctp: remove unnecessary NULL check in sctp_association_init()

'&asoc->ulpq' passed to sctp_ulpq_init() as the first argument,
then sctp_qlpq_init() initializes it and eventually returns the
address of the struct member back. Therefore, in this case, the
return pointer cannot be NULL.

Moreover, it seems sctp_ulpq_init() has always been used only in
sctp_association_init(), so there's really no need to return ulpq
anymore.

Detected using the static analysis tool - Svace.
Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/20221019180735.161388-1-aleksei.kodanev@bell-sw.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 4890b686 09-Jun-2022 Eric Dumazet <edumazet@google.com>

net: keep sk->sk_forward_alloc as small as possible

Currently, tcp_memory_allocated can hit tcp_mem[] limits quite fast.

Each TCP socket can forward allocate up to 2 MB of memory, even after
flow became less active.

10,000 sockets can have reserved 20 GB of memory,
and we have no shrinker in place to reclaim that.

Instead of trying to reclaim the extra allocations in some places,
just keep sk->sk_forward_alloc values as small as possible.

This should not impact performance too much now we have per-cpu
reserves: Changes to tcp_memory_allocated should not be too frequent.

For sockets not using SO_RESERVE_MEM:
- idle sockets (no packets in tx/rx queues) have zero forward alloc.
- non idle sockets have a forward alloc smaller than one page.

Note:

- Removal of SK_RECLAIM_CHUNK and SK_RECLAIM_THRESHOLD
is left to MPTCP maintainers as a follow up.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# b8d7a7c6 22-Aug-2020 Randy Dunlap <rdunlap@infradead.org>

net: sctp: ulpqueue.c: delete duplicated word

Drop the repeated word "an".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: linux-sctp@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 4e7696d9 08-Dec-2019 Xin Long <lucien.xin@gmail.com>

sctp: get netns from asoc and ep base

Commit 312434617cb1 ("sctp: cache netns in sctp_ep_common") set netns
in asoc and ep base since they're created, and it will never change.
It's a better way to get netns from asoc and ep base, comparing to
calling sock_net().

This patch is to replace them.

v1->v2:
- no change.

Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 47505b8b 23-May-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 104

Based on 1 normalized pattern(s):

this sctp implementation is free software you can redistribute it
and or modify it under the terms of the gnu general public license
as published by the free software foundation either version 2 or at
your option any later version this sctp implementation is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with gnu cc see the file copying if not see
http www gnu org licenses

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 42 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190523091649.683323110@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 9dde27de 15-Apr-2019 Xin Long <lucien.xin@gmail.com>

sctp: implement memory accounting on rx path

sk_forward_alloc's updating is also done on rx path, but to be consistent
we change to use sk_mem_charge() in sctp_skb_set_owner_r().

In sctp_eat_data(), it's not enough to check sctp_memory_pressure only,
which doesn't work for mem_cgroup_sockets_enabled, so we change to use
sk_under_memory_pressure().

When it's under memory pressure, sk_mem_reclaim() and sk_rmem_schedule()
should be called on both RENEGE or CHUNK DELIVERY path exit the memory
pressure status as soon as possible.

Note that sk_rmem_schedule() is using datalen to make things easy there.

Reported-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Matteo Croce <mcroce@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 013b96ec 11-Apr-2019 David Miller <davem@davemloft.net>

sctp: Pass sk_buff_head explicitly to sctp_ulpq_tail_event().

Now the SKB list implementation assumption can be removed.

And now that we know that the list head is always non-NULL
we can remove the code blocks dealing with that as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 178ca044 11-Apr-2019 David Miller <davem@davemloft.net>

sctp: Make sctp_enqueue_event tak an skb list.

Pass this, instead of an event. Then everything trickles down and we
always have events a non-empty list.

Then we needs a list creating stub to place into .enqueue_event for sctp_stream_interleave_1.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 925b9374 11-Apr-2019 David Miller <davem@davemloft.net>

sctp: Always pass skbs on a list to sctp_ulpq_tail_event().

This way we can simplify the logic and remove assumptions
about the implementation of skb lists.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 0eff1052 11-Apr-2019 David Miller <davem@davemloft.net>

sctp: Remove superfluous test in sctp_ulpq_reasm_drain().

Inside the loop, we always start with event non-NULL.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a1e3a059 18-Nov-2018 Xin Long <lucien.xin@gmail.com>

sctp: add subscribe per asoc

The member subscribe should be per asoc, so that sockopt SCTP_EVENT
in the next patch can subscribe a event from one asoc only.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 2cc0eeb6 18-Nov-2018 Xin Long <lucien.xin@gmail.com>

sctp: define subscribe in sctp_sock as __u16

The member subscribe in sctp_sock is used to indicate to which of
the events it is subscribed, more like a group of flags. So it's
better to be defined as __u16 (2 bytpes), instead of struct
sctp_event_subscribe (13 bytes).

Note that sctp_event_subscribe is an UAPI struct, used on sockopt
calls, and thus it will not be removed. This patch only changes
the internal storage of the flags.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 1181d629 07-Aug-2018 David S. Miller <davem@davemloft.net>

sctp: Use skb_queue_is_first().

Instead of direct skb_queue_head pointer accesses.

Signed-off-by: David S. Miller <davem@davemloft.net>


# 5c468674 17-Dec-2017 Xin Long <lucien.xin@gmail.com>

sctp: fix the issue that a __u16 variable may overflow in sctp_ulpq_renege

Now when reneging events in sctp_ulpq_renege(), the variable freed
could be increased by a __u16 value twice while freed is of __u16
type. It means freed may overflow at the second addition.

This patch is to fix it by using __u32 type for 'freed', while at
it, also to remove 'if (chunk)' check, as all renege commands are
generated in sctp_eat_data and it can't be NULL.

Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 13228238 08-Dec-2017 Xin Long <lucien.xin@gmail.com>

sctp: add support for the process of unordered idata

Unordered idata process is more complicated than unordered data:

- It has to add mid into sctp_stream_out to save the next mid value,
which is separated from ordered idata's.

- To support pd for unordered idata, another mid and pd_mode need to
be added to save the message id and pd state in sctp_stream_in.

- To make unordered idata reasm easier, it adds a new event queue
to save frags for idata.

The patch mostly adds the samilar reasm functions for unordered idata
as ordered idata's, and also adjusts some other codes on assign_mid,
abort_pd and ulpevent_data for idata.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 65f5e357 08-Dec-2017 Xin Long <lucien.xin@gmail.com>

sctp: implement abort_pd for sctp_stream_interleave

abort_pd is added as a member of sctp_stream_interleave, used to abort
partial delivery for data or idata, called in sctp_cmd_assoc_failed.

Since stream interleave allows to do partial delivery for each stream
at the same time, sctp_intl_abort_pd for idata would be very different
from the old function sctp_ulpq_abort_pd for data.

Note that sctp_ulpevent_make_pdapi will support per stream in this
patch by adding pdapi_stream and pdapi_seq in sctp_pdapi_event, as
described in section 6.1.7 of RFC6458.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 94014e8d 08-Dec-2017 Xin Long <lucien.xin@gmail.com>

sctp: implement renege_events for sctp_stream_interleave

renege_events is added as a member of sctp_stream_interleave, used to
renege some old data or idata in reasm or lobby queue properly to free
some memory for the new data when there's memory stress.

It defines sctp_renege_events for idata, and leaves sctp_ulpq_renege
as it is for data.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# bd4d627d 08-Dec-2017 Xin Long <lucien.xin@gmail.com>

sctp: implement ulpevent_data for sctp_stream_interleave

ulpevent_data is added as a member of sctp_stream_interleave, used to
do the most process in ulpq, including to convert data or idata chunk
to event, reasm them in reasm queue and put them in lobby queue in
right order, and deliver them up to user sk rx queue.

This procedure is described in section 2.2.3 of RFC8260.

It adds most functions for idata here to do the similar process as
the old functions for data. But since the details are very different
between them, the old functions can not be reused for idata.

event->ssn and event->ppid settings are moved to ulpevent_data from
sctp_ulpevent_make_rcvmsg, so that sctp_ulpevent_make_rcvmsg could
work for both data and idata.

Note that mid is added in sctp_ulpevent for idata, __packed has to
be used for defining sctp_ulpevent, or it would exceeds the skb cb
that saves a sctp_ulpevent variable for ulp layer process.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 7906b00f 08-Sep-2017 Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>

sctp: fix missing wake ups in some situations

Commit fb586f25300f ("sctp: delay calls to sk_data_ready() as much as
possible") minimized the number of wake ups that are triggered in case
the association receives a packet with multiple data chunks on it and/or
when io_events are enabled and then commit 0970f5b36659 ("sctp: signal
sk_data_ready earlier on data chunks reception") moved the wake up to as
soon as possible. It thus relies on the state machine running later to
clean the flag that the event was already generated.

The issue is that there are 2 call paths that calls
sctp_ulpq_tail_event() outside of the state machine, causing the flag to
linger and possibly omitting a needed wake up in the sequence.

One of the call paths is when enabling SCTP_SENDER_DRY_EVENTS via
setsockopt(SCTP_EVENTS), as noticed by Harald Welte. The other is when
partial reliability triggers removal of chunks from the send queue when
the application calls sendmsg().

This commit fixes it by not setting the flag in case the socket is not
owned by the user, as it won't be cleaned later. This works for
user-initiated calls and also for rx path processing.

Fixes: fb586f25300f ("sctp: delay calls to sk_data_ready() as much as possible")
Reported-by: Harald Welte <laforge@gnumonks.org>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 9f8d3147 29-Jun-2017 Xin Long <lucien.xin@gmail.com>

sctp: remove the typedef sctp_data_chunk_t

This patch is to remove the typedef sctp_data_chunk_t, and replace
with struct sctp_data_chunk in the places where it's using this
typedef.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# cee360ab 31-May-2017 Xin Long <lucien.xin@gmail.com>

sctp: define the member stream as an object instead of pointer in asoc

As Marcelo's suggestion, stream is a fixed size member of asoc and would
not grow with more streams. To avoid an allocation for it, this patch is
to define it as an object instead of pointer and update the places using
it, also create sctp_stream_update() called in sctp_assoc_update() to
migrate the stream info from one stream to another.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a8386317 06-Jan-2017 Xin Long <lucien.xin@gmail.com>

sctp: prepare asoc stream for stream reconf

sctp stream reconf, described in RFC 6525, needs a structure to
save per stream information in assoc, like stream state.

In the future, sctp stream scheduler also needs it to save some
stream scheduler params and queues.

This patchset is to prepare the stream array in assoc for stream
reconf. It defines sctp_stream that includes stream arrays inside
to replace ssnmap.

Note that we use different structures for IN and OUT streams, as
the members in per OUT stream will get more and more different
from per IN stream.

v1->v2:
- put these patches into a smaller group.
v2->v3:
- define sctp_stream to contain stream arrays, and create stream.c
to put stream-related functions.
- merge 3 patches into 1, as new sctp_stream has the same name
with before.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# e8bc8f9a 16-Sep-2016 Christophe Jaillet <christophe.jaillet@wanadoo.fr>

sctp: Remove some redundant code

In commit 311b21774f13 ("sctp: simplify sk_receive_queue locking"), a call
to 'skb_queue_splice_tail_init()' has been made explicit. Previously it was
hidden in 'sctp_skb_list_tail()'

Now, the code around it looks redundant. The '_init()' part of
'skb_queue_splice_tail_init()' should already do the same.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a0fc6843 30-Jul-2016 Xin Long <lucien.xin@gmail.com>

sctp: allow delivering notifications after receiving SHUTDOWN

Prior to this patch, once sctp received SHUTDOWN or shutdown with RD,
sk->sk_shutdown would be set with RCV_SHUTDOWN, and all events would
be dropped in sctp_ulpq_tail_event(). It would cause:

1. some notifications couldn't be received by users. like
SCTP_SHUTDOWN_COMP generated by sctp_sf_do_4_C().

2. sctp would also never trigger sk_data_ready when the association
was closed, making it harder to identify the end of the association
by calling recvmsg() and getting an EOF. It was not convenient for
kernel users.

The check here should be stopping delivering DATA chunks after receiving
SHUTDOWN, and stopping delivering ANY chunks after sctp_close().

So this patch is to allow notifications to enqueue into receive queue
even if sk->sk_shutdown is set to RCV_SHUTDOWN in sctp_ulpq_tail_event,
but if sk->sk_shutdown == RCV_SHUTDOWN | SEND_SHUTDOWN, it drops all
events.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 0970f5b3 29-Apr-2016 Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>

sctp: signal sk_data_ready earlier on data chunks reception

Dave Miller pointed out that fb586f25300f ("sctp: delay calls to
sk_data_ready() as much as possible") may insert latency specially if
the receiving application is running on another CPU and that it would be
better if we signalled as early as possible.

This patch thus basically inverts the logic on fb586f25300f and signals
it as early as possible, similar to what we had before.

Fixes: fb586f25300f ("sctp: delay calls to sk_data_ready() as much as possible")
Reported-by: Dave Miller <davem@davemloft.net>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 311b2177 13-Apr-2016 Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>

sctp: simplify sk_receive_queue locking

SCTP already serializes access to rcvbuf through its sock lock:
sctp_recvmsg takes it right in the start and release at the end, while
rx path will also take the lock before doing any socket processing. On
sctp_rcv() it will check if there is an user using the socket and, if
there is, it will queue incoming packets to the backlog. The backlog
processing will do the same. Even timers will do such check and
re-schedule if an user is using the socket.

Simplifying this will allow us to remove sctp_skb_list_tail and get ride
of some expensive lockings. The lists that it is used on are also
mangled with functions like __skb_queue_tail and __skb_unlink in the
same context, like on sctp_ulpq_tail_event() and sctp_clear_pd().
sctp_close() will also purge those while using only the sock lock.

Therefore the lockings performed by sctp_skb_list_tail() are not
necessary. This patch removes this function and replaces its calls with
just skb_queue_splice_tail_init() instead.

The biggest gain is at sctp_ulpq_tail_event(), because the events always
contain a list, even if it's queueing a single skb and this was
triggering expensive calls to spin_lock_irqsave/_irqrestore for every
data chunk received.

As SCTP will deliver each data chunk on a corresponding recvmsg, the
more effective the change will be.
Before this patch, with chunks with 30 bytes:
netperf -t SCTP_STREAM -H 192.168.1.2 -cC -l 60 -- -m 30 -S 400000
400000 -s 400000 400000
on a 10Gbit link with 1500 MTU:

SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB

425984 425984 30 60.00 137.45 7.34 7.36 52.504 52.608

With it:

SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB

425984 425984 30 60.00 179.10 7.97 6.70 43.740 36.788

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# fb586f25 08-Apr-2016 Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>

sctp: delay calls to sk_data_ready() as much as possible

Currently processing of multiple chunks in a single SCTP packet leads to
multiple calls to sk_data_ready, causing multiple wake up signals which
are costy and doesn't make it wake up any faster.

With this patch it will note that the wake up is pending and will do it
before leaving the state machine interpreter, latest place possible to
do it realiably and cleanly.

Note that sk_data_ready events are not dependent on asocs, unlike waking
up writers.

v2: series re-checked
v3: use local vars to cleanup the code, suggested by Jakub Sitnicki
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 2c8c56e1 11-Nov-2014 Eric Dumazet <edumazet@google.com>

net: introduce SO_INCOMING_CPU

Alternative to RPS/RFS is to use hardware support for multiple
queues.

Then split a set of million of sockets into worker threads, each
one using epoll() to manage events on its own socket pool.

Ideally, we want one thread per RX/TX queue/cpu, but we have no way to
know after accept() or connect() on which queue/cpu a socket is managed.

We normally use one cpu per RX queue (IRQ smp_affinity being properly
set), so remembering on socket structure which cpu delivered last packet
is enough to solve the problem.

After accept(), connect(), or even file descriptor passing around
processes, applications can use :

int cpu;
socklen_t len = sizeof(cpu);

getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len);

And use this information to put the socket into the right silo
for optimal performance, as all networking stack should run
on the appropriate cpu, without need to send IPI (RPS/RFS).

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 8465a5fc 17-Apr-2014 Neil Horman <nhorman@tuxdriver.com>

sctp: add support for busy polling to sctp protocol

The busy polling socket option adds support for sockets to busy wait on data
arriving on the napi queue from which they have most recently received a frame.
Currently only tcp and udp support this feature, but theres no reason sctp can't
do so as well. Add it in so appliations can take advantage of it

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Daniel Borkmann <dborkman@redhat.com>
CC: netdev@vger.kernel.org
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 676d2369 11-Apr-2014 David S. Miller <davem@davemloft.net>

net: Fix use after free by removing length arg from sk_data_ready callbacks.

Several spots in the kernel perform a sequence like:

skb_queue_tail(&sk->s_receive_queue, skb);
sk->sk_data_ready(sk, skb->len);

But at the moment we place the SKB onto the socket receive queue it
can be consumed and freed up. So this skb->len access is potentially
to freed up memory.

Furthermore, the skb->len can be modified by the consumer so it is
possible that the value isn't accurate.

And finally, no actual implementation of this callback actually uses
the length argument. And since nobody actually cared about it's
value, lots of call sites pass arbitrary values in such as '0' and
even '1'.

So just remove the length argument from the callback, that way there
is no confusion whatsoever and all of these use-after-free cases get
fixed as a side effect.

Based upon a patch by Eric Dumazet and his suggestion to audit this
issue tree-wide.

Signed-off-by: David S. Miller <davem@davemloft.net>


# 8d72651d 22-Dec-2013 wangweidong <wangweidong1@huawei.com>

sctp: fix checkpatch errors with open brace '{' and trailing statements

fix checkpatch errors below:
ERROR: that open brace { should be on the previous line
ERROR: open brace '{' following function declarations go on the next line
ERROR: trailing statements should be on next line

Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 26ac8e5f 22-Dec-2013 wangweidong <wangweidong1@huawei.com>

sctp: fix checkpatch errors with (foo*)|foo * bar|foo* bar

fix checkpatch errors below:
ERROR: "(foo*)" should be "(foo *)"
ERROR: "foo * bar" should be "foo *bar"
ERROR: "foo* bar" should be "foo *bar"

Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# cb3f837b 22-Dec-2013 wangweidong <wangweidong1@huawei.com>

sctp: fix checkpatch errors with space required or prohibited

fix checkpatch errors while the space is required or prohibited
to the "=,()++..."

Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 4b2f13a2 06-Dec-2013 Jeff Kirsher <jeffrey.t.kirsher@intel.com>

sctp: Fix FSF address in file headers

Several files refer to an old address for the Free Software Foundation
in the file header comment. Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.

CC: Vlad Yasevich <vyasevich@gmail.com>
CC: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 477143e3 06-Aug-2013 Daniel Borkmann <daniel@iogearbox.net>

net: sctp: trivial: update bug report in header comment

With the restructuring of the lksctp.org site, we only allow bug
reports through the SCTP mailing list linux-sctp@vger.kernel.org,
not via SF, as SF is only used for web hosting and nothing more.
While at it, also remove the obvious statement that bugs will be
fixed and incooperated into the kernel.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 91705c61 23-Jul-2013 Daniel Borkmann <daniel@iogearbox.net>

net: sctp: trivial: update mailing list address

The SCTP mailing list address to send patches or questions
to is linux-sctp@vger.kernel.org and not
lksctp-developers@lists.sourceforge.net anymore. Therefore,
update all occurences.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# c1db7a26 16-Apr-2013 Daniel Borkmann <daniel@iogearbox.net>

net: sctp: sctp_ulpq: remove 'malloced' struct member

The structure sctp_ulpq is embedded into sctp_association and never
separately allocated, also ulpq->malloced is always 0, so that
kfree() is never called. Therefore, remove this code.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# d003b41b 27-Feb-2013 Lee A. Roberts <lee.roberts@hp.com>

sctp: fix association hangs due to partial delivery errors

In sctp_ulpq_tail_data(), use return values 0,1 to indicate whether
a complete event (with MSG_EOR set) was delivered. A return value
of -ENOMEM continues to indicate an out-of-memory condition was
encountered.

In sctp_ulpq_retrieve_partial() and sctp_ulpq_retrieve_first(),
correct message reassembly logic for SCTP partial delivery.
Change logic to ensure that as much data as possible is sent
with the initial partial delivery and that following partial
deliveries contain all available data.

In sctp_ulpq_partial_delivery(), attempt partial delivery only
if the data on the head of the reassembly queue is at or before
the cumulative TSN ACK point.

In sctp_ulpq_renege(), use the modified return values from
sctp_ulpq_tail_data() to choose whether to attempt partial
delivery or to attempt to drain the reassembly queue as a
means to reduce memory pressure. Remove call to
sctp_tsnmap_mark(), as this is handled correctly in call to
sctp_ulpq_tail_data().

Signed-off-by: Lee A. Roberts <lee.roberts@hp.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>


# 95ac7b85 27-Feb-2013 Lee A. Roberts <lee.roberts@hp.com>

sctp: fix association hangs due to errors when reneging events from the ordering queue

In sctp_ulpq_renege_list(), events being reneged from the
ordering queue may correspond to multiple TSNs. Identify
all affected packets; sum freed space and renege from the
tsnmap.

Signed-off-by: Lee A. Roberts <lee.roberts@hp.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>


# e67f85ec 27-Feb-2013 Lee A. Roberts <lee.roberts@hp.com>

sctp: fix association hangs due to reneging packets below the cumulative TSN ACK point

In sctp_ulpq_renege_list(), do not renege packets below the
cumulative TSN ACK point.

Signed-off-by: Lee A. Roberts <lee.roberts@hp.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>


# b26ddd81 29-Oct-2012 Neil Horman <nhorman@tuxdriver.com>

sctp: Clean up type-punning in sctp_cmd_t union

Lots of points in the sctp_cmd_interpreter function treat the sctp_cmd_t arg as
a void pointer, even though they are written as various other types. Theres no
need for this as doing so just leads to possible type-punning issues that could
cause crashes, and if we remain type-consistent we can actually just remove the
void * member of the union entirely.

Change Notes:

v2)
* Dropped chunk that modified SCTP_NULL to create a marker pattern
should anyone try to use a SCTP_NULL() assigned sctp_arg_t, Assigning
to .zero provides the same effect and should be faster, per Vlad Y.

v3)
* Reverted part of V2, opting to use memset instead of .zero, so that
the entire union is initalized thus avoiding the i164 speculative load
problems previously encountered, per Dave M.. Also rewrote
SCTP_[NO]FORCE so as to use common infrastructure a little more

Signed-off-by: Neil Horman <nhorman@tuxdriver.com
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: linux-sctp@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>


# b01a2407 06-Aug-2012 Eric W. Biederman <ebiederm@xmission.com>

sctp: Make the mib per network namespace

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 4244854d 29-Jun-2012 Neil Horman <nhorman@tuxdriver.com>

sctp: be more restrictive in transport selection on bundled sacks

It was noticed recently that when we send data on a transport, its possible that
we might bundle a sack that arrived on a different transport. While this isn't
a major problem, it does go against the SHOULD requirement in section 6.4 of RFC
2960:

An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK,
etc.) to the same destination transport address from which it
received the DATA or control chunk to which it is replying. This
rule should also be followed if the endpoint is bundling DATA chunks
together with the reply chunk.

This patch seeks to correct that. It restricts the bundling of sack operations
to only those transports which have moved the ctsn of the association forward
since the last sack. By doing this we guarantee that we only bundle outbound
saks on a transport that has received a chunk since the last sack. This brings
us into stricter compliance with the RFC.

Vlad had initially suggested that we strictly allow only sack bundling on the
transport that last moved the ctsn forward. While this makes sense, I was
concerned that doing so prevented us from bundling in the case where we had
received chunks that moved the ctsn on multiple transports. In those cases, the
RFC allows us to select any of the transports having received chunks to bundle
the sack on. so I've modified the approach to allow for that, by adding a state
variable to each transport that tracks weather it has moved the ctsn since the
last sack. This I think keeps our behavior (and performance), close enough to
our current profile that I think we can do this without a sysctl knob to
enable/disable it.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Vlad Yaseivch <vyasevich@gmail.com>
CC: David S. Miller <davem@davemloft.net>
CC: linux-sctp@vger.kernel.org
Reported-by: Michele Baldessari <michele@redhat.com>
Reported-by: sorin serban <sserban@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 25985edc 30-Mar-2011 Lucas De Marchi <lucas.demarchi@profusion.mobi>

Fix common misspellings

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>


# efea2c6b 04-Mar-2011 Hagen Paul Pfeifer <hagen@jauu.net>

sctp: several declared/set but unused fixes

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 3fa21e07 18-May-2010 Joe Perches <joe@perches.com>

net: Remove unnecessary returns from void function()s

This patch removes from net/ (but not any netfilter files)
all the unnecessary return; statements that precede the
last closing brace of void functions.

It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.

Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 5a0e3ad6 24-Mar-2010 Tejun Heo <tj@kernel.org>

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>


# 43f59c89 21-Sep-2008 David S. Miller <davem@davemloft.net>

net: Remove __skb_insert() calls outside of skbuff internals.

This minor cleanup simplifies later changes which will convert
struct sk_buff and friends over to using struct list_head.

Signed-off-by: David S. Miller <davem@davemloft.net>


# c068be54 15-Jan-2008 Vlad Yasevich <vladislav.yasevich@hp.com>

[SCTP]: Correctly reap SSNs when processing FORWARD_TSN chunk

When we recieve a FORWARD_TSN chunk, we need to reap
all the queued fast-forwarded chunks from the ordering queue
However, if we don't have them queued, we need to see if
the next expected one is there as well. If it is, start
deliver from that point instead of waiting for the next
chunk to arrive.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>


# 01f2d384 11-Jan-2008 Vlad Yasevich <vladislav.yasevich@hp.com>

[SCTP]: Kill silly inlines in ulpqueue.c

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>


# 60c778b2 11-Jan-2008 Vlad Yasevich <vladislav.yasevich@hp.com>

[SCTP]: Stop claiming that this is a "reference implementation"

I was notified by Randy Stewart that lksctp claims to be
"the reference implementation". First of all, "the
refrence implementation" was the original implementation
of SCTP in usersapce written ty Randy and a few others.
Second, after looking at the definiton of 'reference implementation',
we don't really meet the requirements.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>


# 3ab224be 31-Dec-2007 Hideo Aoki <haoki@redhat.com>

[NET] CORE: Introducing new memory accounting interface.

This patch introduces new memory accounting functions for each network
protocol. Most of them are renamed from memory accounting functions
for stream protocols. At the same time, some stream memory accounting
functions are removed since other functions do same thing.

Renaming:
sk_stream_free_skb() -> sk_wmem_free_skb()
__sk_stream_mem_reclaim() -> __sk_mem_reclaim()
sk_stream_mem_reclaim() -> sk_mem_reclaim()
sk_stream_mem_schedule -> __sk_mem_schedule()
sk_stream_pages() -> sk_mem_pages()
sk_stream_rmem_schedule() -> sk_rmem_schedule()
sk_stream_wmem_schedule() -> sk_wmem_schedule()
sk_charge_skb() -> sk_mem_charge()

Removeing
sk_stream_rfree(): consolidates into sock_rfree()
sk_stream_set_owner_r(): consolidates into skb_set_owner_r()
sk_stream_mem_schedule()

The following functions are added.
sk_has_account(): check if the protocol supports accounting
sk_mem_uncharge(): do the opposite of sk_mem_charge()

In addition, to achieve consolidation, updating sk_wmem_queued is
removed from sk_mem_charge().

Next, to consolidate memory accounting functions, this patch adds
memory accounting calls to network core functions. Moreover, present
memory accounting call is renamed to new accounting call.

Finally we replace present memory accounting calls with new interface
in TCP and SCTP.

Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Hideo Aoki <haoki@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# ef5d4cf2 16-Dec-2007 Vlad Yasevich <vladislav.yasevich@hp.com>

[SCTP]: Flush fragment queue when exiting partial delivery.

At the end of partial delivery, we may have complete messages
sitting on the fragment queue. These messages are stuck there
until a new fragment arrives. This can comletely stall a
given association. When clearing partial delivery state, flush
any complete messages from the fragment queue and send them on
their way up.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# cd3ae8e6 09-Nov-2007 Vlad Yasevich <vladislav.yasevich@hp.com>

SCTP: Fix PR-SCTP to deliver all the accumulated ordered chunks

There is a small bug when we process a FWD-TSN. We'll deliver
anything upto the current next expected SSN. However, if the
next expected is already in the queue, it will take another
chunk to trigger its delivery. The fix is to simply check
the current queued SSN is the next expected one.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>


# 16d14ef9 23-Oct-2007 Pavel Emelyanov <xemul@openvz.org>

[SCTP]: Consolidate sctp_ulpq_renege_xxx functions

Both are equal, except for the list to be traversed.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 4d93df0a 15-Aug-2007 Neil Horman <nhorman@tuxdriver.com>

[SCTP]: Rewrite of sctp buffer management code

This patch introduces autotuning to the sctp buffer management code
similar to the TCP. The buffer space can be grown if the advertised
receive window still has room. This might happen if small message
sizes are used, which is common in telecom environmens.
New tunables are introduced that provide limits to buffer growth
and memory pressure is entered if to much buffer spaces is used.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# ea2dfb37 13-Jul-2007 Vlad Yasevich <vladislav.yasevich@hp.com>

SCTP: properly clean up fragment and ordering queues during FWD-TSN.

When we recieve a FWD-TSN (meaning the peer has abandoned the data),
we need to clean up any partially received messages that may be
hanging out on the re-assembly or re-ordering queues. This is
a MUST requirement that was not properly done before.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com.>


# 3ff50b79 20-Apr-2007 Stephen Hemminger <shemminger@linux-foundation.org>

[NET]: cleanup extra semicolons

Spring cleaning time...

There seems to be a lot of places in the network code that have
extra bogus semicolons after conditionals. Most commonly is a
bogus semicolon after: switch() { }

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# d49d91d7 23-Mar-2007 Vlad Yasevich <vladislav.yasevich@hp.com>

[SCTP]: Implement SCTP_PARTIAL_DELIVERY_POINT option.

This option induces partial delivery to run as soon
as the specified amount of data has been accumulated on
the association. However, we give preference to fully
reassembled messages over PD messages. In any case,
window and buffer is freed up.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@.hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# b6e1331f 20-Apr-2007 Vlad Yasevich <vladislav.yasevich@hp.com>

[SCTP]: Implement SCTP_FRAGMENT_INTERLEAVE socket option

This option was introduced in draft-ietf-tsvwg-sctpsocket-13. It
prevents head-of-line blocking in the case of one-to-many endpoint.
Applications enabling this option really must enable SCTP_SNDRCV event
so that they would know where the data belongs. Based on an
earlier patch by Ivan Skytte Jørgensen.

Additionally, this functionality now permits multiple associations
on the same endpoint to enter Partial Delivery. Applications should
be extra careful, when using this functionality, to track EOR indicators.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# d0cf0d99 18-Apr-2007 Vlad Yasevich <vladislav.yasevich@hp.com>

[SCTP]: Do not interleave non-fragments when in partial delivery

The way partial delivery is currently implemnted, it is possible to
intereleave a message (either from another steram, or unordered) that
is not part of partial delivery process. The only way to this is for
a message to not be a fragment and be 'in order' or unorderd for a
given stream. This will result in bypassing the reassembly/ordering
queues where things live duing partial delivery, and the
message will be delivered to the socket in the middle of partial delivery.

This is a two-fold problem, in that:
1. the app now must check the stream-id and flags which it may not
be doing.
2. this clearing partial delivery state from the association and results
in ulp hanging.

This patch is a band-aid over a much bigger problem in that we
don't do stream interleave.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 0b58a811 19-Mar-2007 Vlad Yasevich <vladislav.yasevich@hp.com>

[SCTP]: Clean up stale data during association restart

During association restart we may have stale data sitting
on the ULP queue waiting for ordering or reassembly. This
data may cause severe problems if not cleaned up. In particular
stale data pending ordering may cause problems with receive
window exhaustion if our peer has decided to restart the
association.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# d808ad9a 09-Feb-2007 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[NET] SCTP: Fix whitespace errors.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 331c4ee7 09-Oct-2006 Vlad Yasevich <vladislav.yasevich@hp.com>

[SCTP]: Fix receive buffer accounting.

When doing receiver buffer accounting, we always used skb->truesize.
This is problematic when processing bundled DATA chunks because for
every DATA chunk that could be small part of one large skb, we would
charge the size of the entire skb. The new approach is to store the
size of the DATA chunk we are accounting for in the sctp_ulpevent
structure and use that stored value for accounting.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 672e7cca 05-May-2006 Vladislav Yasevich <vladsilav.yasevich@hp.com>

[SCTP]: Prevent possible infinite recursion with multiple bundled DATA.

There is a rare situation that causes lksctp to go into infinite recursion
and crash the system. The trigger is a packet that contains at least the
first two DATA fragments of a message bundled together. The recursion is
triggered when the user data buffer is smaller that the full data message.
The problem is that we clone the skb for every fragment in the message.
When reassembling the full message, we try to link skbs from the "first
fragment" clone using the frag_list. However, since the frag_list is shared
between two clones in this rare situation, we end up setting the frag_list
pointer of the second fragment to point to itself. This causes
sctp_skb_pull() to potentially recurse indefinitely.

Proposed solution is to make a copy of the skb when attempting to link
things using frag_list.

Signed-off-by: Vladislav Yasevich <vladsilav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# dd0fc66f 07-Oct-2005 Al Viro <viro@ftp.linux.org.uk>

[PATCH] gfp flags annotations - part 1

- added typedef unsigned int __nocast gfp_t;

- replaced __nocast uses for gfp flags with gfp_t - it gives exactly
the same warnings as far as sparse is concerned, doesn't change
generated code (from gcc point of view we replaced unsigned int with
typedef) and documents what's going on far better.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 8728b834 09-Aug-2005 David S. Miller <davem@davemloft.net>

[NET]: Kill skb->list

Remove the "list" member of struct sk_buff, as it is entirely
redundant. All SKB list removal callers know which list the
SKB is on, so storing this in sk_buff does nothing other than
taking up some space.

Two tricky bits were SCTP, which I took care of, and two ATM
drivers which Francois Romieu <romieu@fr.zoreil.com> fixed
up.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>


# 3182cd84 11-Jul-2005 Alexey Dobriyan <adobriyan@gmail.com>

[SCTP]: __nocast annotations

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 1da177e4 16-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org>

Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!