Cross Reference: /freebsd-9.3-release/sys/i386/include/atomic.h

History log of /freebsd-9.3-release/sys/i386/include/atomic.h
Revision	Date	Author	Comments (<<< Hide modified files) (Show modified files >>>)
# 267654	19-Jun-2014	gjb	Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle. Approved by: re (implicit) Sponsored by: The FreeBSD Foundation /freebsd-9.3-release
# 262807	05-Mar-2014	dumbbell	MFC r254610, r254611, r254617, r254620: Implement atomic_swap() and atomic_testandset(). Implement atomic_cmpset_64() and atomic_swap_64() for i386. The following revisions were merged at the same time, to avoid build breakage: o r256118 o r256119 o r256131
# 254169	09-Aug-2013	marius	MFC: r241374 Add an unified macro to deny ability from the compiler to reorder instruction loads/stores at its will. The macro __compiler_membar() is currently supported for both gcc and clang, but kernel compilation will fail otherwise. Reviewed by: bde, kib Discussed with: dim, theraven
# 237161	16-Jun-2012	kib	MFC r236456: Use plain store for atomic_store_rel on x86, instead of implicitly locked xchg instruction. IA32 memory model guarantees that store has release semantic, since stores cannot pass loads or stores.
# 225736	22-Sep-2011	kensmith	Copy head to stable/9 as part of 9.0-RELEASE release cycle. Approved by: re (implicit)
# 220404	06-Apr-2011	jkim	Implement atomic_load_acq_64(9) and atomic_store_rel_64(9) for i386. These functions are implemented with CMPXCHG8B instruction where it is available, i. e., all Pentium-class and later processors. Note this instruction is also used for atomic_store_rel_64() because a simple XCHG-like instruction for 64-bit memory access does not exist, unfortunately. If the processor lacks the instruction, i. e., 80486-class CPUs, two 32-bit load/store are performed with interrupt temporarily disabled, assuming it does not support SMP. Although this assumption may be little naive, it is true in reality. This implementation is inspired by Linux.
# 216524	18-Dec-2010	kib	Inform a compiler which asm statements in the x86 implementation of atomics change eflags. Reviewed by: jhb MFC after: 2 weeks
# 208332	20-May-2010	phk	Rename an argument from "exp" to "expect" since the former makes FlexeLint uneasy, in case anybody think it might be exp(3) in libm. This also makes it consistent with other archs.
# 197910	09-Oct-2009	attilio	atomic_cmpset_barr_* was added in order to cope with compilers willing to specify their own version of atomic_cmpset_* which could have been different than the membar version. Right now, however, FreeBSD is bound mostly to GCC-like compilers and it is desired to add new support and compat shim mostly when there is a real necessity, in order to avoid too much compatibility bloats. In this optic, bring back atomic_cmpset_{acq, rel}_* to be the same as atomic_cmpset_* and unwind the atomic_cmpset_barr_* introduction. Requested by: jhb Reviewed by: jhb Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
# 197824	06-Oct-2009	attilio	- All the functions in atomic.h needs to be in "physical" form (like not defined through macros or similar) in order to be later compiled in the kernel and offer this way the support for modules (and compatibility among the UP case and SMP case). Fix this for the newly introduced atomic_cmpset_barr_* cases by defining and specifying a template. Note that the new DEFINE_CMPSET_GEN() template save more typing on amd64 than the current code. [1] - Fix the style for memory barriers on amd64. [1] Reported by: Paul B. Mahol <onemda at gmail dot com>
# 197803	06-Oct-2009	attilio	Per their definition, atomic instructions used in conjuction with memory barriers should also ensure that the compiler doesn't reorder paths where they are used. GCC, however, does that aggressively, even in presence of volatile operands. The most reliable way GCC offers for avoid instructions reordering is clobbering "memory" even if that is theoretically an heavy-weight operation, flushing the content of all the registers and forcing reload of them (We could rely, however, on gcc DTRT by just understanding the purpose as this is a well-known pattern for many modern operating-systems). Not all our memory barriers, right now, clobber memory for GCC-like compilers. The most notable cases are IA32 and amd64 where the memory barrier are treacted the same as normal atomic instructions. Fix this by offering the possibility to implement atomic instructions with memory barriers separately from the normal version and implement the GCC-like specific one using memory clobbering. Thanks to Chris Lattner (@apple) for his discussion on llvm specifics. Reported by: jhb Reviewed by: jhb Tested by: rdivacky, Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
# 185720	06-Dec-2008	kib	Restore memory clobber, to cause mb on the compiler level too. Use more sane formatting of the assembler. Pointed out by: bde
# 185651	05-Dec-2008	kib	Unconditionally use locked addition of zero to tip of the stack for memory barriers on i386. It works as a serialization instruction on all IA32 CPUs. Alternative solution of using {s,l,}fence requires run-time checking of the presense of the corresponding SSE or SSE2 extensions, and possible boot-time patching of the kernel text. Suggested by: many
# 185162	22-Nov-2008	kmacy	- bump __FreeBSD version to reflect added buf_ring, memory barriers, and ifnet functions - add memory barriers to <machine/atomic.h> - update drivers to only conditionally define their own - add lockless producer / consumer ring buffer - remove ring buffer implementation from cxgb and update its callers - add if_transmit(struct ifnet ifp, struct mbuf m) to ifnet to allow drivers to efficiently manage multiple hardware queues (i.e. not serialize all packets through one ifq) - expose if_qflush to allow drivers to flush any driver managed queues This work was supported by Bitgravity Inc. and Chelsio Inc.
# 177276	16-Mar-2008	pjd	Implement atomic_fetchadd_long() for all architectures and document it. Reviewed by: attilio, jhb, jeff, kris (as a part of the uidinfo_waitfree.patch)
# 165636	29-Dec-2006	bde	Fix oops in previous commit.
# 165635	29-Dec-2006	bde	Fixed some style bugs (mainly assorted errors in comments, and inconsistent spelling of `result').
# 165633	29-Dec-2006	bde	Fixed some style bugs (whitespace only).
# 165630	29-Dec-2006	bde	Try harder to garbage-collect the "LOCORE" (really asm) version of MPLOCKED. The cleaning in rev.1.25 was supposed to have been undone by rev.1.26, but 1.26 could never have actually affected asm files since atomic.h is full of C declarations so including it in asm files would just give syntax errors. The asm MPLOCKED is even less needed than when misplaced definitions of it were first removed, and is now unused in any asm file in the src tree except in anachronismns in sys/i386/i386/support.s.
# 165572	27-Dec-2006	bde	Avoid an instruction in atomic_cmpset_{int_long)() in most cases. These functions are used a lot for mutexes, so this reduces the text size of an average kernel by about 0.75%. This wasn't intended to be a significant optimization, but it somehow increased the maximum number of packets per second that can be transmitted by my bge hardware from 320000 to 460000 (this benchmark is CPU-bound and remarkably sensitive to changes in the text section). Details: we would prefer to leave the result of the cmpxchg in %al, but cannot tell gcc that it is there, so we have to convert it to an integer register. We converted to %al, then to %[re]ax, but the latter step is usually wasted since gcc usually only wants the condition code and can recover it from %al just as easily as from %[re]ax. Let gcc promote %al in the few cases where this is needed. Nearby style fixes; - let gcc manage the load of `res', and don't abuse `res' for a copy of `exp' - don't echo `res's name in comments - consistently spell the condition code as 'e' after comparison for equality - don't hard-code %al anywhere except in constraints - for the version that doesn't use cmpxchg, there is no requirement to use %al anywhere, so don't hard-code it in the constraints either. Style non-fix: - for the versions that use cmpxchg, keep using "a" (was %[re]ax, now %al) for the main output operand, although this is not required. The input and output operands that use the "a" constraint are now decoupled, and this makes things clearer except for the reason that the output register is hard-coded. It is now just a hack to tell gcc that the input "a" has been clobbered without increasing the number of operands.
# 157212	28-Mar-2006	des	Use wrapper macros for atomic pointer operations in order to perform the correct casts. This should probably be merged to other architectures.
# 150627	27-Sep-2005	jhb	Add a new atomic_fetchadd() primitive that atomically adds a value to a variable and returns the previous value of the variable. Tested on: i386, alpha, sparc64, arm (cognet) Reviewed by: arch@ Submitted by: cognet (arm) MFC after: 1 week
# 150182	15-Sep-2005	jhb	Stop using the '+' constraint modifier with inline assembly. The '+' constraint is actually only allowed for register operands. Instead, use separate input and output memory constraints. Education from: alc Reviewed by: alc Tested on: i386, alpha MFC after: 1 week
# 148067	15-Jul-2005	jhb	Convert the atomic_ptr() operations over to operating on uintptr_t variables rather than void * variables. This makes it easier and simpler to get asm constraints and volatile keywords correct. MFC after: 3 days Tested on: i386, alpha, sparc64 Compiled on: ia64, powerpc, amd64 Kernel toolchain busted on: arm
# 147855	09-Jul-2005	jhb	Some cleanups and tweaks to some of the atomic.h files in preparation for further changes and fixes in the future: - Use aliases via macros rather than duplicated inlines wherever possible. - Move all the aliases to the bottom of these files and the inline functions to the top. - Add various comments. - On alpha, drop atomic_{load_acq,store_rel}_{8,char,16,short}(). - On i386 and amd64, don't duplicate the extern declarations for functions in the two non-inline cases (KLD_MODULE and compiler doesn't do inlines), instead, consolidate those two cases. - Some whitespace fixes. Approved by: re (scottl)
# 143063	02-Mar-2005	joerg	netchild's mega-patch to isolate compiler dependencies into a central place. This moves the dependency on GCC's and other compiler's features into the central sys/cdefs.h file, while the individual source files can then refer to #ifdef __COMPILER_FEATURE_FOO where they by now used to refer to #if __GNUC__ > 3.1415 && __BARC__ <= 42. By now, GCC and ICC (the Intel compiler) have been actively tested on IA32 platforms by netchild. Extension to other compilers is supposed to be possible, of course. Submitted by: netchild Reviewed by: various developers on arch@, some time ago
# 137784	16-Nov-2004	jhb	Initiate deorbit burn sequence for 80386 support in FreeBSD: Remove 80386 (I386_CPU) support from the kernel.
# 137623	12-Nov-2004	jhb	Spell _KERNEL correctly so that UP kernels are actually optimized again. Submitted by: pjd
# 137622	12-Nov-2004	jhb	- Use the SMP style ops for atomic_load/store() in userland so that libraries and binaries will work on both UP and SMP machines. - Remove unnecessary gcc memory barrier from the UP atomic_store() op. Submitted by: bde
# 137591	11-Nov-2004	jhb	- Place the gcc memory barrier hint in the right place in the 80386 version of atomic_store_rel(). - Use the 80386 versions of atomic_load_acq() and atomic_store_rel() that do not use serializing instructions on all UP kernels since a UP machine does need to synchronize with other CPUs. This trims lots of cycles from spin locks on UP kernels among other things. Benchmarked by: rwatson
# 126891	12-Mar-2004	trhodes	These are changes to allow to use the Intel C/C++ compiler (lang/icc) to build the kernel. It doesn't affect the operation if gcc. Most of the changes are just adding __INTEL_COMPILER to #ifdef's, as icc v8 may define __GNUC__ some parts may look strange but are necessary. Additional changes: - in_cksum.[ch]: * use a generic C version instead of the assembly version in the !gcc case (ASM code breaks with the optimizations icc does) -> no bad checksums with an icc compiled kernel Help from: andre, grehan, das Stolen from: alpha version via ppc version The entire checksum code should IMHO be replaced with the DragonFly version (because it isn't guaranteed future revisions of gcc will include similar optimizations) as in: ---snip--- Revision Changes Path 1.12 +1 -0 src/sys/conf/files.i386 1.4 +142 -558 src/sys/i386/i386/in_cksum.c 1.5 +33 -69 src/sys/i386/include/in_cksum.h 1.5 +2 -0 src/sys/netinet/igmp.c 1.6 +0 -1 src/sys/netinet/in.h 1.6 +2 -0 src/sys/netinet/ip_icmp.c 1.4 +3 -4 src/contrib/ipfilter/ip_compat.h 1.3 +1 -2 src/sbin/natd/icmp.c 1.4 +0 -1 src/sbin/natd/natd.c 1.48 +1 -0 src/sys/conf/files 1.2 +0 -1 src/sys/conf/files.amd64 1.13 +0 -1 src/sys/conf/files.i386 1.5 +0 -1 src/sys/conf/files.pc98 1.7 +1 -1 src/sys/contrib/ipfilter/netinet/fil.c 1.10 +2 -3 src/sys/contrib/ipfilter/netinet/ip_compat.h 1.10 +1 -1 src/sys/contrib/ipfilter/netinet/ip_fil.c 1.7 +1 -1 src/sys/dev/netif/txp/if_txp.c 1.7 +1 -1 src/sys/net/ip_mroute/ip_mroute.c 1.7 +1 -2 src/sys/net/ipfw/ip_fw2.c 1.6 +1 -2 src/sys/netinet/igmp.c 1.4 +158 -116 src/sys/netinet/in_cksum.c 1.6 +1 -1 src/sys/netinet/ip_gre.c 1.7 +1 -2 src/sys/netinet/ip_icmp.c 1.10 +1 -1 src/sys/netinet/ip_input.c 1.10 +1 -2 src/sys/netinet/ip_output.c 1.13 +1 -2 src/sys/netinet/tcp_input.c 1.9 +1 -2 src/sys/netinet/tcp_output.c 1.10 +1 -1 src/sys/netinet/tcp_subr.c 1.10 +1 -1 src/sys/netinet/tcp_syncache.c 1.9 +1 -2 src/sys/netinet/udp_usrreq.c 1.5 +1 -2 src/sys/netinet6/ipsec.c 1.5 +1 -2 src/sys/netproto/ipsec/ipsec.c 1.5 +1 -1 src/sys/netproto/ipsec/ipsec_input.c 1.4 +1 -2 src/sys/netproto/ipsec/ipsec_output.c and finally remove sys/i386/i386 in_cksum.c sys/i386/include in_cksum.h ---snip--- - endian.h: * DTRT in C++ mode - quad.h: * we don't use gcc v1 anymore, remove support for it Suggested by: bde (long ago) - assym.h: * avoid zero-length arrays (remove dependency on a gcc specific feature) This change changes the contents of the object file, but as it's only used to generate some values for a header, and the generator knows how to handle this, there's no impact in the gcc case. Explained by: bde Submitted by: Marius Strobl <marius@alchemy.franken.de> - aicasm.c: * minor change to teach it about the way icc spells "-nostdinc" Not approved by: gibbs (no reply to my mail) - bump __FreeBSD_version (lang/icc needs to know about the changes) Incarnations of this patch survive gcc compiles since a loooong time, I use it on my desktop. An icc compiled kernel works since Nov. 2003 (exceptions: snd_* if used as modules), it survives a build of the entire ports collection with icc. Parts of this commit contains suggestions or submissions from Marius Strobl <marius@alchemy.franken.de>. Reviewed by: -arch Submitted by: netchild
# 122827	17-Nov-2003	bde	Fixed pedantic syntax errors. Many macros didn't permit a semicolon after their invocation in the !KLD_MODULE case, but a semicolon is provided after all invocations and is required in the KLD_MODULE case.
# 122826	17-Nov-2003	bde	Avoid a warning for compiling with `gcc -Wbad-function cast'. (This is the warning that points to the bug in `(char *)malloc(...)' where malloc() is implicitly declared as returning int. We do similar things here, but they work because u_int is the same as uintptr_t on i386's.)
# 105117	14-Oct-2002	pirzyk	Add a knob to turn on and off the CMPXCHG instruction on > i386 IA32 systems. This is most beneficial for vmware client os installs. Reviewed by: jmallet, iedowse, tlambert2@mindspring.com MFC After: never, -STABLE does not currently use this instruction
# 100327	18-Jul-2002	markm	Beautify. This has the side effect of improving portability and making lint work cleaner. Inspired to do this by: jhb
# 100251	17-Jul-2002	markm	Clean up the syntax WRT semicolons at the end of function-like-macros, and protect GCCisms from non-GNU compilers and lint.
# 91469	28-Feb-2002	bmilekic	Make MPLOCKED work again in asm files and stringify it explicitly where necessary. Reviewed by: jake
# 90515	11-Feb-2002	bde	Garbage-collect the "LOCORE" version of MPLOCKED.
# 88117	18-Dec-2001	jhb	Allow the ATOMIC_ASM() macro to pass in the constraints on the V parameter since the char versions need to use either ax, bx, cx, or dx. Submitted by: Peter Jeremy (mostly) Recommended by: bde
# 86303	12-Nov-2001	jhb	Use newer constraints for atomic_cmpset(). Requested by: bde
# 86301	12-Nov-2001	jhb	Use newer constraints for inline assembly for an operand that is both an input and an output by using the '+' modifier rather than listing the operand in both the input and output sections. Reviwed by: bde
# 84679	08-Oct-2001	jhb	Allow atomic ops to be somewhat safely used in userland. We always use lock prefixes in the userland case so that the binaries will work on both SMP and UP systems.
# 72358	11-Feb-2001	markm	RIP <machine/lock.h>. Some things needed bits of <i386/include/lock.h> - cy.c now has its own (only) copy of the COM_(UN)LOCK() macros, and IMASK_(UN)LOCK() has been moved to <i386/include/apic.h> (AKA <machine/apic.h>). Reviewed by: jhb
# 71141	17-Jan-2001	jhb	- Sort of lie and say that %eax is an output only and not an input for the non-386 atomic_load_acq(). %eax is an input since its value is used in the cmpxchg instruction, but we don't care what value it is, so setting it to a specific value is just wasteful. Thus, it is being used without being initialized as the warning stated, but it is ok for it to be used because its value isn't important. Thus, we are only sort of lying when we say it is an output only operand. - Add "cc" to the clobber list for atomic_load_acq() since the cmpxchgl changes ZF.
# 71085	15-Jan-2001	jhb	- Fix atomic_load_* and atomic_store_* to generate functions for atomic.c that modules can call. - Remove the old gcc <= 2.8 versions of the atomic ops. - Resort the order of some things in the file so that there is only one #ifdef for KLD_MODULE, and so that all WANT_FUNCTIONS stuff is moved to the bottom of the file. - Remove ATOMIC_ACQ_REL() and just use explicit macros instead.
# 71023	14-Jan-2001	jhb	Fix the atomic_load_acq() and atomic_store_rel() functions to properly implement memory fences for the 486+. The 386 still uses versions w/o memory fences as all operations on the 386 are not program ordered. The 386 versions are not MP safe.
# 67740	27-Oct-2000	jhb	The x86 atomic operations are already locked, so they do not need an additional locked instruction to guarantee a write barrier for the acquire variants. Approved by: dfr Pointy hat to: jhb
# 67587	25-Oct-2000	jhb	- Add atomic_cmpset_{acq_,rel_,}_long - Add in atomic operations for 8-bit, 16-bit, and 32-bit integers
# 67351	20-Oct-2000	jhb	- Expand the set of atomic operations to optionally include memory barriers in most of the atomic operations. Now for these operations, you can use the normal atomic operation, you can use the operation with a read barrier, or you can use the operation with a write barrier. The function names follow the same semantics used in the ia64 instruction set. An atomic operation with a read barrier has the extra suffix 'acq', due to it having "acquire" semantics. An atomic operation with a write barrier has the extra suffix 'rel'. These suffixes are inserted between the name of the operation to perform and the typename. For example, the atomic_add_int() function now has 3 variants: - atomic_add_int() - this is the same as the previous function - atomic_add_acq_int() - this function combines the add operation with a read memory barrier - atomic_add_rel_int() - this function combines the add operation with a write memory barrier - Add 'ptr' to the list of types that we can perform atomic operations on. This allows one to do atomic operations on uintptr_t's. This is useful in the mutex code, for example, because the actual mutex lock is a pointer. - Add two new operations for doing loads and stores with memory barriers. The new load operations use a read barrier before the load, and the new store operations use a write barrier after the load. For example, atomic_load_acq_int() will atomically load an integer as well as enforcing a read barrier.
# 66695	05-Oct-2000	jhb	Add atomic_readandclear_int and atomic_readandclear_long.
# 65514	06-Sep-2000	phk	Introduce atomic_cmpset_int() and atomic_cmpset_long() from SMPng a few hours earlier than the rest. The next DEVFS commit needs these functions. Alpha versions by: dfr i386 versions by: jakeb Approved by: SMPng
# 60301	09-May-2000	obrien	Note the syntax for specifying an operand's size is documented in src/contrib/gcc/config/i386/i386.md.
# 60300	09-May-2000	obrien	When using _asm{} in GCC, one must specify the operand's size if one specifies the instruction's operation size. GCC will default to 32-bit operands reguardless of the prototype (ie, formal parameters' type) of an inline function.
# 51938	04-Oct-1999	peter	Use the rev 1.1.2.1 code from RELENG_3 for atomic operations rather than the non-atomic C macros.
# 51937	04-Oct-1999	peter	Typo: s/__GNUC_MINOR_/__GNUC_MINOR__/ (__GNUC_MINOR__ on egcs in -current is "91" and is going to be "95" soon)
# 51917	03-Oct-1999	eivind	Allow compilation with older versions of GCC, in order to make it possible to bootstrap and work with -current from older versions of FreeBSD.
# 50477	27-Aug-1999	peter	$Id$ -> $FreeBSD$
# 49999	18-Aug-1999	alc	Create callable (non-inline) versions of the atomic_OP_TYPE functions that are linked into the kernel. The KLD compilation options are changed to call these functions, rather than in-lining the atomic operations. This approach makes atomic operations from KLDs significantly faster on UP systems (though somewhat slower on SMP systems). PR: i386/13111 Submitted by: peter.jeremy@alcatel.com.au
# 49043	23-Jul-1999	alc	atomic.h: Change "void " to "volatile TYPE ", improving type safety and eliminating some warnings (e.g., mp_machdep.c rev 1.106). cpufunc.h: Eliminate setbits. As defined, it's not precisely correct; and it's redundant. (Use atomic_set_int instead.) ipl_funcs.c: Use atomic_set_int instead of setbits. systm.h: Include atomic.h. Reviewed by: bde
# 48797	13-Jul-1999	alc	Commit the correct patch, i.e., the one that actually corresponds to the rev 1.2 log entry.
# 48796	13-Jul-1999	alc	Changed the implementation of the primitives to guarantee atomicity with respect to interrupts on UP systems. (The upgrade from gcc 2.7.x to egcs 1.1.2 produced at least one non-atomic code sequence in swap_pager_getpages.) In addition, the primitives are now SMP-safe, but only on SMPs. (For portability between SMPs and UPs, modules are compiled with the SMP-safe versions.) Submitted by: dillon and myself Reviewed by: bde
# 38517	24-Aug-1998	dfr	Change various syscalls to use size_t arguments instead of u_int. Add some overflow checks to read/write (from bde). Change all modifications to vm_page::flags, vm_page::busy, vm_object::flags and vm_object::paging_in_progress to use operations which are not interruptable. Reviewed by: Bruce Evans <bde@zeta.org.au>