Lines Matching refs:and

5  * Common Development and Distribution License (the "License").
11 * and limitations under the License.
14 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
48 * ! of copy and flags. Set up error handling accordingly.
49 * ! The transition point depends on whether the src and
54 * ! For FP version, %l6 holds previous error handling and
57 * ! So either %l6 or %o4 is reserved and not available for
106 * restore error handler and exit.
115 * restore error handler and exit.
119 * ! method: line up src and dst as best possible, then
206 * We've tried to restore fp state from the stack and failed. To
216 * saving a register save and restore. Also, less elaborate setup
218 * For longer copies, especially unaligned ones (where the src and
225 * moved whether the FP registers need to be saved, and some other
227 * 400 clocks. Since each non-repeated/predicted tst and branch costs
229 * longer copies and only benefit a small portion of medium sized
244 * is more data and that data is not in cache, failing to prefetch
247 * The exact tradeoff is strongly load and application dependent, with
258 * hw_copy_limit_1 = src and dst are byte aligned but not halfword aligned
259 * hw_copy_limit_2 = src and dst are halfword aligned but not word aligned
260 * hw_copy_limit_4 = src and dst are word aligned but not longword aligned
261 * hw_copy_limit_8 = src and dst are longword aligned
263 * To say that src and dst are word aligned means that after
265 * both the src and dst will be on word boundaries so that
266 * word loads and stores may be used.
277 * If hw_copy_limit_? is set to a value between 1 and VIS_COPY_THRESHOLD (256)
280 * It is provided to allow for disabling FPBLK copies and to allow
287 * saves an alignment test, memory reference, and enabling test
291 * non-predicted tst and branch costs around 10 clocks.
292 * If src and dst are randomly selected addresses,
297 * But, tests on running kernels show that src and dst to copy code
298 * are typically not on random alignments. Structure copies and
309 * We subdivide the non-FPBLK case further into CHKSIZE bytes and less
311 * align src and dst. We try to minimize special case tests in
316 * src and dst alignment and provide special cases for each of
318 * to decide between short and medium size was chosen to be 39
320 * shift and 4 times 8 bytes for the first long word unrolling.
331 * and nops which are not executed in the code. This
336 * instruction and the unrolled loops, then the alignment needs
341 * a non-predicted tst and branch takes 10 clocks, this savings
353 * three iterations later and shows a measured improvement
362 * Notes on preserving existing fp state and on membars.
366 * preserve - the rest of the kernel does not use fp and, anyway, fp
369 * - userland has fp state and is interrupted (device interrupt
370 * or trap) and within the interrupt/trap handling we use
375 * userland or in kernel copy) and the tl0 component of the handling
377 * - a user process with fp state incurs a copy-on-write fault and
381 * using our stack is ideal (and since fp copy cannot be leaf optimized
394 * ourselves and it is our cpu which will take any trap.
404 * and reboot the system (or restart the service with Greenline/Contracts).
408 * the event and the trap PC may not be the PC of the faulting access.
414 * is no need to repeat this), and we must force delivery of deferred
420 * Since the copy operations may preserve and later restore floating
425 * To make sure that floating point state is always saved and restored
431 * use. Bit 2 (TRAMP_FLAG) indicates that the call was to bcopy, and a
471 * Entry points bcopy, copyin_noerr, and copyout_noerr use this flag.
472 * kcopy, copyout, xcopyout, copyin, and xcopyin do not set this flag.
490 * floating-point register save area and 2 64-bit temp locations.
524 * Copy functions use either quadrants 1 and 3 or 2 and 4.
526 * FZEROQ1Q3: Zero quadrants 1 and 3, ie %f0 - %f15 and %f32 - %f47
527 * FZEROQ2Q4: Zero quadrants 2 and 4, ie %f16 - %f31 and %f48 - %f63
569 * Macros to save and restore quadrants 1 and 3 or 2 and 4 to/from the stack.
570 * Used to save and restore in-use fp registers when we want to use FP
571 * and find fp already in use and copy size still large enough to justify
572 * the additional overhead of this save and restore.
583 * original data, and a membar #Sync after restore lets the block loads
587 * and before using the BLD_*_FROMSTACK macro.
593 and tmp1, -VIS_BLOCKSIZE, tmp1 /* block align */ ;\
602 and tmp1, -VIS_BLOCKSIZE, tmp1 /* block align */ ;\
611 and tmp1, -VIS_BLOCKSIZE, tmp1 /* block align */ ;\
620 and tmp1, -VIS_BLOCKSIZE, tmp1 /* block align */ ;\
628 * FP_NOMIGRATE and FP_ALLOWMIGRATE. Prevent migration (or, stronger,
630 * switch) before commencing a FP copy, and reallow it on completion or
782 * Errno value is in %g1. bcopy_more uses fp quadrants 1 and 3.
790 and %l6, TRAMP_FLAG, %l0 ! copy trampoline flag to %l0
811 ! and bcopy. kcopy will *always* set a t_lofault handler
813 ! and *not* to invoke any existing error handler. As far as
880 * Assumes double word alignment and a count >= 256.
1089 ! Now long word aligned and have at least 32 bytes to move
1139 ! Now word aligned and have at least 36 bytes to move
1186 ! Now half word aligned and have at least 38 bytes to move
1216 * profiling and dtrace of the portions of the copy code that uses
1237 ! kcopy and bcopy use the same code path. If TRAMP_FLAG is set
1238 ! and the saved lofault was zero, we won't reset lofault on
1465 subcc %o0, %o1, %o3 ! difference of from and to address
1472 2: cmp %o2, %o3 ! cmp size and abs(from - to)
1475 cmp %o0, %o1 ! compare from and to addresses
1512 * has already disabled kernel preemption and has checked
1642 * Transfer data to and from user space -
1647 * Note that copyin(9F) and copyout(9F) are part of the
1652 * So there's two extremely similar routines - xcopyin() and xcopyout()
1658 * There are also stub routines for xcopyout_little and xcopyin_little,
1669 * The only difference between copy{in,out} and
1689 * data copying algorithm and the default limits.
1987 ! Now long word aligned and have at least 32 bytes to move
2045 ! Now word aligned and have at least 36 bytes to move
2098 ! Now half word aligned and have at least 38 bytes to move
2152 * profiling and dtrace of the portions of the copy code that uses
2773 ! Now long word aligned and have at least 32 bytes to move
2828 ! Now word aligned and have at least 36 bytes to move
2880 ! Now half word aligned and have at least 38 bytes to move
2931 * profiling and dtrace of the portions of the copy code that uses
3527 * and returns 1. Otherwise 0 is returned indicating success.
3528 * Caller is responsible for ensuring use_hw_bzero is true and that
3554 ! ... and must be 256 bytes or more
3559 ! ... and length must be a multiple of VIS_BLOCKSIZE
3579 and %l1, -VIS_BLOCKSIZE, %l1
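
The block-align matches above (593, 602, 611, 620 and 3579) all AND an address with -VIS_BLOCKSIZE, and the matches at 3554 and 3559 state two length requirements: 256 bytes or more, and a multiple of VIS_BLOCKSIZE. A minimal C sketch of the same arithmetic follows. It rests on assumptions: the listing never gives the value of VIS_BLOCKSIZE, so 64 is assumed here, and the function names block_align_down and length_ok_for_block_clear are hypothetical and do not appear in the source file.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte blocks. The listing does not state this value. */
#define VIS_BLOCKSIZE 64u

/*
 * Round an address down to the nearest block boundary.
 * For a power-of-two block size, ~(size - 1) is the same bit pattern
 * as -size in two's complement, so this mirrors
 * "and tmp1, -VIS_BLOCKSIZE, tmp1" from the listing.
 * (Hypothetical helper name.)
 */
static uintptr_t
block_align_down(uintptr_t addr)
{
	return (addr & ~((uintptr_t)VIS_BLOCKSIZE - 1));
}

/*
 * Check the two length conditions quoted at 3554 and 3559:
 * at least 256 bytes, and a multiple of VIS_BLOCKSIZE.
 * Any other preconditions of the real routine are not modeled here.
 * (Hypothetical helper name.)
 */
static int
length_ok_for_block_clear(size_t len)
{
	return (len >= 256 && (len & (VIS_BLOCKSIZE - 1)) == 0);
}
```

Under the 64-byte assumption, block_align_down(0x1234) yields 0x1200, and a length of 300 fails the check because it is not a multiple of 64.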