#
8dc67def |
|
04-Aug-2022 |
Augustin Cavalier <waddlesplash@gmail.com> |
kernel/scheduler: Always reschedule after enqueueing if the runqueue was empty. Normally the heap's priority will suffice to check if we need to reschedule. However, in cases where the CPU in question is currently in the middle of rescheduling already, and it is about to change its priority to "idle", we can race with it and not notice that we need to send it an ICI. Previously this meant we would just lose some performance, but after recent fixes to reschedule only as necessary, this race led to hangs. Now we report whether the runqueue we added the thread to was empty in ThreadData::Enqueue(), and if it was, we then always trigger a scheduler invocation on the target CPU. In my testing, this fixes #17847; and at least in my unscientific benchmarks, improves compile performance by as much as 10% (I saw ~55s -> ~50s in some tests.)
|
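The "report whether the runqueue was empty" idea described in the entry above can be sketched roughly as follows. This is a minimal illustration, not the Haiku code: `Thread`, `RunQueue`, and `NeedsReschedule` are hypothetical stand-ins, and the real `ThreadData::Enqueue()` works on kernel data structures that are not reproduced here.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical, simplified stand-in for a schedulable thread.
struct Thread {
	int priority;
};

// Orders threads so that the highest priority is at the top of the heap.
struct ThreadLess {
	bool operator()(const Thread& a, const Thread& b) const
	{
		return a.priority < b.priority;
	}
};

// Hypothetical run queue; only the "was it empty?" report matters here.
class RunQueue {
public:
	// Adds the thread and reports whether the queue was empty beforehand.
	bool Enqueue(const Thread& thread)
	{
		bool wasEmpty = fQueue.empty();
		fQueue.push(thread);
		return wasEmpty;
	}

private:
	std::priority_queue<Thread, std::vector<Thread>, ThreadLess> fQueue;
};

// Decides whether the target CPU should be asked to reschedule.
// The observed CPU priority may be stale if that CPU is in the middle of
// rescheduling, so an empty run queue forces a reschedule regardless of it.
inline bool
NeedsReschedule(bool wasRunQueueEmpty, int enqueuedPriority,
	int observedCPUPriority)
{
	assert(enqueuedPriority >= 0);
	if (wasRunQueueEmpty)
		return true;
	return enqueuedPriority > observedCPUPriority;
}
```

In this sketch the priority comparison is kept as the normal path, and the empty-queue report only adds an extra unconditional trigger, which matches the behaviour the commit message describes.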
#
1bba129c |
|
08-Apr-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Let ThreadData::ShouldRebalance() choose the actual core Currently, ThreadData::ShouldRebalance() (and mode specific functions it calls) only decides whether to migrate thread to another core or not. However, in most cases it actually needs to find the best candidate for new core so it could as well return that information.
|
#
f116370e |
|
30-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Relax penalty cancellation requirements Priority penalties were made more strict in order to prevent a situation in which two or more high priority threads use up all available CPU time in such a manner that they do not receive a penalty but starve low priority threads. However, a significant change to thread priorities has been made since, and now the priority of all non real time threads varies in a range from 1 to static priority minus penalty. This means that the scheduler is able to prevent thread starvation without any complex penalty policies.
|
#
6155ab7b |
|
30-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Provide more stable core load statistics Originally, core load was a sum of estimated loads of all currently running or ready threads on a given core. Such a value changes very rapidly, preventing the thread migration logic from making any reasonable decisions. This patch changes the way core load is computed to make it more stable, thus improving the quality of decisions made by the thread migration logic. Currently core load is a sum of estimated loads of all threads that have been ready during the last load measurement interval and haven't been migrated or killed.
|
#
931ce674 |
|
26-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Fix core unassignment
|
#
7adce94d |
|
26-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Check team user time timers before entering scheduler User timers may cause another thread to become ready, in which case we would like this to happen before scheduler_reschedule() chooses the next thread to be executed.
|
#
d01fa1ff |
|
20-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Always update core load at thread reenqueue
|
#
59b9b52a |
|
19-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: system_time() may be unreliable On multisocket systems as well as under virtual machines, logical CPUs may use separate TSCs. We could attempt to synchronize them, which would probably solve the problems on multisocket systems. Unfortunately, when running under a hypervisor there is still a chance that the TSCs will get out of sync again (e.g. cpufreq enabled on the host when there is no invariant TSC). As long as we use RDTSC as our main time source, the scheduler must accept the fact that time may go backwards (which isn't really a serious problem).
|
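The entry above does not say how the scheduler copes with time going backwards. One common way to tolerate it, shown here purely as an assumed sketch, is to clamp negative intervals to zero whenever an elapsed time is computed. The `Timestamp` alias and the `ElapsedSince()` helper are hypothetical names introduced for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical alias for a signed microsecond timestamp.
using Timestamp = int64_t;

// Returns the time elapsed between two readings of the clock.
// If the clock appears to have gone backwards (for example because the two
// readings came from CPUs with unsynchronized TSCs), the interval is
// treated as zero instead of producing a negative duration.
inline Timestamp
ElapsedSince(Timestamp earlier, Timestamp now)
{
	return now > earlier ? now - earlier : 0;
}
```

Clamping loses a small amount of accounting accuracy, but it keeps derived values such as used time or time slept from ever becoming negative.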
#
955e4cff |
|
17-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Weaken time slept assertion
|
#
3dce49af |
|
16-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Cache quantum length
|
#
f978518a |
|
16-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Cache ThreadData::IsCPUBound() result
|
#
b7d404c2 |
|
16-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Add ThreadData::{GetPriority, IsIdle, IsRealTime}()
|
#
093c2202 |
|
16-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Improve latencies
|
#
7f212f45 |
|
14-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Update used time when thread yields or sleeps
|
#
082d3c10 |
|
08-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Increase thread penalty at fork
|
#
a2634874 |
|
08-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Estimate the load thread is able to produce Previous implementation based on the actual load of each core and the share each thread has in that load turned out to be very problematic when balancing load on very heavily loaded systems (i.e. more threads consuming all available CPU time than there are logical CPUs). The new approach is to estimate how much load a thread would produce if it had all CPU time only for itself. Summing such load estimations of each thread assigned to a given core, we get a rank that contains much more information than just the simple actual core load.
|
#
772331c7 |
|
07-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Introduce strong and weak priority penalties
|
#
d36098e0 |
|
07-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Keep track of the number of the ready threads
|
#
9c465cc8 |
|
07-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Improve recognition of CPU bound threads
|
#
c2a02dee |
|
06-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
kernel: Relax cpu_ent::interrupt_time locking The value isn't accessed by the other CPUs and all writes and reads are done with interrupts disabled.
|
#
8235bbc9 |
|
05-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Improve thread creation performance
|
#
cb66faef |
|
04-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Work around GCC2 limitations in function inlining GCC2 won't inline a function if it is used before its definition.
|
#
e4ea6372 |
|
03-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Disable load tracking when not needed
|
#
2d52abbd |
|
30-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Inherit penalty and core from creator thread
|
#
26592750 |
|
30-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Protect per CPU run queue with its own lock
|
#
4c25fcab |
|
28-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Fix double release of run queue lock
|
#
335c6055 |
|
26-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Remove CPUEntry::IncreaseActiveTime()
|
#
96dcc73b |
|
26-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Add scheduler profiler A bit hackish implementation of a profiler for the scheduler. SCHEDULER_ENTER_FUNCTION at the beginning of each function isn't nice and usage of __PRETTY_FUNCTION__ isn't any better (both gcc and clang support it though), but it was quick to implement and doesn't lose information on inlined functions. It's just a tool, not an integral part of the kernel anyway.
|
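The general technique mentioned in the entry above, a macro placed at the top of each function that uses `__PRETTY_FUNCTION__` as the key, can be sketched as below. This is an assumed user-space illustration only: `PROFILE_ENTER_FUNCTION`, `ScopedFunctionTimer`, `ProfileTable()`, and `AddOne()` are hypothetical names, and a kernel profiler would not use `std::map` or `std::chrono`. It relies on `__PRETTY_FUNCTION__`, so it assumes gcc or clang.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

// Per-function counters collected by the profiler sketch.
struct FunctionStats {
	long calls = 0;
	long long totalNanos = 0;
};

// Global table keyed by the function's pretty-printed signature.
inline std::map<std::string, FunctionStats>&
ProfileTable()
{
	static std::map<std::string, FunctionStats> table;
	return table;
}

// Records the time spent between construction and destruction.
class ScopedFunctionTimer {
public:
	explicit ScopedFunctionTimer(const char* name)
		:
		fName(name),
		fStart(std::chrono::steady_clock::now())
	{
	}

	~ScopedFunctionTimer()
	{
		auto elapsed = std::chrono::steady_clock::now() - fStart;
		FunctionStats& stats = ProfileTable()[fName];
		stats.calls++;
		stats.totalNanos += std::chrono::duration_cast<
			std::chrono::nanoseconds>(elapsed).count();
	}

private:
	const char* fName;
	std::chrono::steady_clock::time_point fStart;
};

// Hypothetical macro: must be the first statement of a profiled function.
#define PROFILE_ENTER_FUNCTION() \
	ScopedFunctionTimer _profileTimer(__PRETTY_FUNCTION__)

// Example of an instrumented function.
inline int
AddOne(int value)
{
	PROFILE_ENTER_FUNCTION();
	return value + 1;
}
```

Because the macro is expanded inside the function body, the recorded name stays correct even when the compiler inlines the function into its caller, which is the property the commit message points out.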
#
ebe5420f |
|
26-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: No need for gQuantumLengths to be global
|
#
cf4984f6 |
|
23-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Use precomputed time slice lengths
|
#
ede552ab |
|
23-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Keep thread effective priority cached
|
#
b24ea642 |
|
23-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Encapsulate ThreadData fields
|
#
a08b40d4 |
|
23-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Encapsulate CPUEntry fields
|
#
e1e7235c |
|
23-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Encapsulate CoreEntry fields
|
#
60e198f2 |
|
22-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Encapsulate PackageEntry fields Apart from the refactoring this commit takes the opportunity and removes unnecessary read locks when choosing a package and a core from idle lists. The data structures are accessed in a thread safe way and it does not really matter whether the obtained data becomes outdated just when we release the lock or during our search for the appropriate package/core.
|
#
c08ed2db |
|
19-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Try to keep thread on the same logical CPU Some SMT implementations (e.g. recent AMD microarchitectures) have a separate L1d cache for each SMT thread (which AMD decides to call "cores"). This means that we shouldn't move threads to another logical processor too often even if it belongs to the same core. We aren't very strict about this as it would complicate load balancing, but we try to reduce unnecessary migrations.
|
#
ad6b9a1d |
|
19-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Use sequential locks instead of atomic 64 bit access
|
#
d287274d |
|
05-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Code refactoring
|
#
1bba129c56656a5c140fc8d1202ae1cac761d49b |
|
08-Apr-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Let ThreadData::ShouldRebalance() choose the actual core Currently, ThreadData::ShouldRebalance() (and mode specific functions it calls) only decides whether to migrate thread to another core or not. However, in most cases it actually needs to find the best candidate for new core so it could as well return that information.
|
#
f116370edda18472a248387a7256e2b4e528c666 |
|
30-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Relax penalty cancellation requirements Priority penalties were made more strict in order to prevent a situation in which two or more high priority threads use up all available CPU time in such a manner that they do not receive a penalty but starve low priority threads. However, a significant change to thread priorities has been made since, and now the priority of all non real time threads varies in a range from 1 to static priority minus penalty. This means that the scheduler is able to prevent thread starvation without any complex penalty policies.
|
#
6155ab7b25c5f0b32bb014bab5916f4594dbb685 |
|
30-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Provide more stable core load statistics Originally, core load was a sum of estimated loads of all currently running or ready threads on a given core. Such a value changes very rapidly, preventing the thread migration logic from making any reasonable decisions. This patch changes the way core load is computed to make it more stable, thus improving the quality of decisions made by the thread migration logic. Currently core load is a sum of estimated loads of all threads that have been ready during the last load measurement interval and haven't been migrated or killed.
|
#
931ce674a9dfdc8f4a05bdb54fc21b207018f7a2 |
|
26-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Fix core unassignment
|
#
7adce94d45a69dd1a8dece6d3324b9a0859e90b4 |
|
26-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Check team user time timers before entering scheduler User timers may cause another thread to become ready, in which case we would like this to happen before scheduler_reschedule() chooses the next thread to be executed.
|
#
d01fa1ffe3ba17bf507570cb6c380c16542ea481 |
|
20-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Always update core load at thread reenqueue
|
#
59b9b52aafcba9847df3791c525fac03c3caec3e |
|
19-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: system_time() may be unreliable On multisocket systems as well as under virtual machines, logical CPUs may use separate TSCs. We could attempt to synchronize them, which would probably solve the problems on multisocket systems. Unfortunately, when running under a hypervisor there is still a chance that the TSCs will get out of sync again (e.g. cpufreq enabled on the host when there is no invariant TSC). As long as we use RDTSC as our main time source, the scheduler must accept the fact that time may go backwards (which isn't really a serious problem).
|
#
955e4cff9455aec37f8f9cf2761e79b3cbebff05 |
|
17-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Weaken time slept assertion
|
#
3dce49af0ecc67f243fdd37fbcfa3321f7d2047b |
|
16-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Cache quantum length
|
#
f978518a52465f8b3cbf8ea1fd27541a5d54af11 |
|
16-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Cache ThreadData::IsCPUBound() result
|
#
b7d404c2df546acbe91df76939005bab666dd71b |
|
16-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Add ThreadData::{GetPriority, IsIdle, IsRealTime}()
|
#
093c2202675b2ef2c9a76dec558fe6ed4a5e6f17 |
|
16-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Improve latencies
|
#
7f212f45c3d669962ca86d14a3b6d80246b1a486 |
|
14-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Update used time when thread yields or sleeps
|
#
082d3c1015c3610e190fda886ea207c122181dc3 |
|
08-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Increase thread penalty at fork
|
#
a2634874ed5e33a36fe83c272614e2042fafde1d |
|
08-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Estimate the load thread is able to produce Previous implementation based on the actual load of each core and the share each thread has in that load turned out to be very problematic when balancing load on very heavily loaded systems (i.e. more threads consuming all available CPU time than there are logical CPUs). The new approach is to estimate how much load a thread would produce if it had all CPU time only for itself. Summing such load estimations of each thread assigned to a given core, we get a rank that contains much more information than just the simple actual core load.
|
#
772331c7cdd486b7283ea138621693df88a9327b |
|
07-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Introduce strong and weak priority penalties
|
#
d36098e0430bdec4c5202673c3a8bff776dd03db |
|
07-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Keep track of the number of the ready threads
|
#
9c465cc83bbd40732475db43bd870221b99bdbb7 |
|
07-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Improve recognition of CPU bound threads
|
#
c2a02dee65184026ea953726a9ab1bac1c0a4617 |
|
06-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
kernel: Relax cpu_ent::interrupt_time locking The value isn't accessed by the other CPUs and all writes and reads are done with interrupts disabled.
|
#
8235bbc9965b083b294b366ea5438d2ff274dbf7 |
|
05-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Improve thread creation performance
|
#
cb66faef24f64af40a51f23300ff546d975535b3 |
|
04-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Work around GCC2 limitations in function inlining GCC2 won't inline a function if it is used before its definition.
|
#
e4ea637227d7cf9a53bc89317990b8a22a76780a |
|
03-Jan-2014 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Disable load tracking when not needed
|
#
2d52abbd5d279c622982b8cae1f38d51bf73c2d2 |
|
30-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Inherit penalty and core from creator thread
|
#
265927509dc56e82b12cd68750ae1e96601fd558 |
|
30-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Protect per CPU run queue with its own lock
|
#
4c25fcab386a4a5e22c869865a9b5f25350c24dd |
|
28-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Fix double release of run queue lock
|
#
335c60552c275dc13e1ca4fac0af2cd11be7f4aa |
|
26-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Remove CPUEntry::IncreaseActiveTime()
|
#
96dcc73b39cc68a59c276a35690f8af1886214ef |
|
26-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Add scheduler profiler A bit hackish implementation of a profiler for the scheduler. SCHEDULER_ENTER_FUNCTION at the beginning of each function isn't nice and usage of __PRETTY_FUNCTION__ isn't any better (both gcc and clang support it though), but it was quick to implement and doesn't lose information on inlined functions. It's just a tool, not an integral part of the kernel anyway.
|
#
ebe5420f845cae655b92303a8b664320307248fa |
|
26-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: No need for gQuantumLengths to be global
|
#
cf4984f64588ef80b96d573c0931c3517585c162 |
|
23-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Use precomputed time slice lengths
|
#
ede552ab25e23e2aa64b6953c4ef848699266881 |
|
23-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Keep thread effective priority cached
|
#
b24ea642d759ad6e6b30007cb112b3cdfad35204 |
|
23-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Encapsulate ThreadData fields
|
#
a08b40d4087b35c586959dc7da44035171d4cf15 |
|
23-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Encapsulate CPUEntry fields
|
#
e1e7235c60d942d4fd58ac7caedf4a9715efcc7a |
|
23-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Encapsulate CoreEntry fields
|
#
60e198f2cbf2e26b584370c0d32c37cb3dce556c |
|
22-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Encapsulate PackageEntry fields Apart from the refactoring this commit takes the opportunity and removes unnecessary read locks when choosing a package and a core from idle lists. The data structures are accessed in a thread safe way and it does not really matter whether the obtained data becomes outdated just when we release the lock or during our search for the appropriate package/core.
|
#
c08ed2db65267bea18a3ba424f98fffde9da6c25 |
|
19-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Try to keep thread on the same logical CPU Some SMT implementations (e.g. recent AMD microarchitectures) have a separate L1d cache for each SMT thread (which AMD decides to call "cores"). This means that we shouldn't move threads to another logical processor too often even if it belongs to the same core. We aren't very strict about this as it would complicate load balancing, but we try to reduce unnecessary migrations.
|
#
ad6b9a1df8ccdb1093c4b122764f8692d6f7ca2c |
|
19-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Use sequential locks instead of atomic 64 bit access
|
#
d287274dcec634da4973a1b92c97dd14d7c5ecd0 |
|
05-Dec-2013 |
Pawel Dziepak <pdziepak@quarnos.org> |
scheduler: Code refactoring
|