Searched hist:176082 (Results 1 - 2 of 2) sorted by relevance

/freebsd-10.2-release/lib/msun/src/
s_expm1.c	diff 176082 Thu Feb 07 09:42:19 MST 2008 bde

Use a better method of scaling by 2**k. Instead of adding to the exponent bits of the reduced result, construct 2**k (hopefully in parallel with the construction of the reduced result) and multiply by it. This tends to be much faster if the construction of 2**k is actually in parallel, and might be faster even with no parallelism since adjustment of the exponent requires a read-modify-write at an unfortunate time for pipelines.

In some cases involving exp2 on amd64 (A64), this change saves about 40 cycles or 30%. I think it is inherently only about 12 cycles faster in these cases and the rest of the speedup is from partly-accidentally avoiding compiler pessimizations (the construction of 2**k is now manually scheduled for good results, and -O2 doesn't always mess this up). In most cases on amd64 (A64) and i386 (A64) the speedup is about 20 cycles. The worst case that I found is expf on ia64 where this change is a pessimization of about 10 cycles or 5%. The manual scheduling for plain exp[f] is harder and not as tuned.

Details specific to expm1:
- the saving is closer to 12 cycles than to 40 for expm1* on i386 (A64). For some reason it is much larger for negative args.
- also convert to __FBSDID().

s_expm1f.c	diff 176082 Thu Feb 07 09:42:19 MST 2008 bde

Same commit message as for s_expm1.c above.
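
The two scaling methods contrasted in the commit message can be sketched roughly as follows. This is only an illustration, not the code from s_expm1.c: the function names `scale_add_exponent` and `scale_multiply` are hypothetical, the sketch assumes IEEE 754 binary64 doubles, and it is valid only when both the input and the scaled result are normal numbers (no overflow, underflow, or subnormal handling).

```c
#include <stdint.h>
#include <string.h>

/*
 * Older style (hypothetical sketch): add k directly to the exponent
 * field of y. The result y must already exist before its bits can be
 * read, modified, and written back.
 */
static double scale_add_exponent(double y, int k)
{
    uint64_t bits;

    memcpy(&bits, &y, sizeof bits);
    /* Unsigned arithmetic wraps, so a negative k subtracts from the exponent. */
    bits += (uint64_t)(int64_t)k << 52;
    memcpy(&y, &bits, sizeof y);
    return y;
}

/*
 * Newer style (hypothetical sketch): build 2**k from k alone, then
 * multiply. Constructing twopk does not depend on y, so it can overlap
 * with the computation of y. Assumes -1022 <= k <= 1023.
 */
static double scale_multiply(double y, int k)
{
    uint64_t bits = (uint64_t)(0x3ff + k) << 52; /* biased exponent, zero mantissa */
    double twopk;

    memcpy(&twopk, &bits, sizeof twopk);
    return y * twopk;
}
```

For in-range arguments both functions return the same value, for example 12.0 for `y = 1.5` and `k = 3`. The difference the commit message describes is in scheduling: the multiply version depends on `y` only at the final multiplication.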
