#
fad87dbd |
|
22-Feb-2024 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: use correct function name for resetting TCE tables

The PAPR spec spells the function name as "ibm,reset-pe-dma-windows" but in practice firmware uses the singular form: "ibm,reset-pe-dma-window" in the device tree. Since we have the wrong spelling in the RTAS function table, reverse lookups (token -> name) fail and warn:

    unexpected failed lookup for token 86
    WARNING: CPU: 1 PID: 545 at arch/powerpc/kernel/rtas.c:659 __do_enter_rtas_trace+0x2a4/0x2b4
    CPU: 1 PID: 545 Comm: systemd-udevd Not tainted 6.8.0-rc4 #30
    Hardware name: IBM,9105-22A POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NL1060_028) hv:phyp pSeries
    NIP [c0000000000417f0] __do_enter_rtas_trace+0x2a4/0x2b4
    LR [c0000000000417ec] __do_enter_rtas_trace+0x2a0/0x2b4
    Call Trace:
      __do_enter_rtas_trace+0x2a0/0x2b4 (unreliable)
      rtas_call+0x1f8/0x3e0
      enable_ddw.constprop.0+0x4d0/0xc84
      dma_iommu_dma_supported+0xe8/0x24c
      dma_set_mask+0x5c/0xd8
      mlx5_pci_init.constprop.0+0xf0/0x46c [mlx5_core]
      probe_one+0xfc/0x32c [mlx5_core]
      local_pci_probe+0x68/0x12c
      pci_call_probe+0x68/0x1ec
      pci_device_probe+0xbc/0x1a8
      really_probe+0x104/0x570
      __driver_probe_device+0xb8/0x224
      driver_probe_device+0x54/0x130
      __driver_attach+0x158/0x2b0
      bus_for_each_dev+0xa8/0x120
      driver_attach+0x34/0x48
      bus_add_driver+0x174/0x304
      driver_register+0x8c/0x1c4
      __pci_register_driver+0x68/0x7c
      mlx5_init+0xb8/0x118 [mlx5_core]
      do_one_initcall+0x60/0x388
      do_init_module+0x7c/0x2a4
      init_module_from_file+0xb4/0x108
      idempotent_init_module+0x184/0x34c
      sys_finit_module+0x90/0x114

And oopses are possible when lockdep is enabled or the RTAS tracepoints are active, since those paths dereference the result of the lookup.

Use the correct spelling to match firmware's behavior, adjusting the related constants to match.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: 8252b88294d2 ("powerpc/rtas: improve function information lookups")
Reported-by: Gaurav Batra <gbatra@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240222-rtas-fix-ibm-reset-pe-dma-window-v1-1-7aaf235ac63c@linux.ibm.com
|
#
e3681107 |
|
12-Dec-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: Warn if per-function lock isn't held If the function descriptor has a populated lock member, then callers are required to hold it across calls. Now that the firmware activation sequence is appropriately guarded, we can warn when the requirement isn't satisfied. __do_enter_rtas_trace() gets reorganized a bit as a result of performing the function descriptor lookup unconditionally now. Reviewed-by: "Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-8-e9eafd0c8c6c@linux.ibm.com
|
#
dc7637c4 |
|
12-Dec-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: Serialize firmware activation sequences Use rtas_ibm_activate_firmware_lock to prevent interleaving call sequences of the ibm,activate-firmware RTAS function, which typically requires multiple calls to complete the update. While the spec does not specifically prohibit interleaved sequences, there's almost certainly no advantage to allowing them. Reviewed-by: "Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-7-e9eafd0c8c6c@linux.ibm.com
|
#
adf7a019 |
|
12-Dec-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: Facilitate high-level call sequences

On RTAS platforms there is a general restriction that the OS must not enter RTAS on more than one CPU at a time. This low-level serialization requirement is satisfied by holding a spin lock (rtas_lock) across most RTAS function invocations.

However, some pseries RTAS functions require multiple successive calls to complete a logical operation. Beginning a new call sequence for such a function may disrupt any other sequences of that function already in progress. Safe and reliable use of these functions effectively requires higher-level serialization beyond what is already done at the level of RTAS entry and exit.

Where a sequence-based RTAS function is invoked only through sys_rtas(), with no in-kernel users, there is no issue as far as the kernel is concerned. User space is responsible for appropriately serializing its call sequences. (Whether user space code actually takes measures to prevent sequence interleaving is another matter.) Examples of such functions currently include ibm,platform-dump and ibm,get-vpd.

But where a sequence-based RTAS function has both user space and in-kernel users, there is a hazard. Even if the in-kernel call sites of such a function serialize their sequences correctly, a user of sys_rtas() can invoke the same function at any time, potentially disrupting a sequence in progress.

So in order to prevent disruption of kernel-based RTAS call sequences, they must serialize not only with themselves but also with sys_rtas() users, somehow. Preferably without adding more function-specific hacks to sys_rtas(). This is a prerequisite for adding an in-kernel call sequence of ibm,get-vpd, which is in a change to follow.

Note that it has never been feasible for the kernel to prevent sys_rtas()-based sequences from being disrupted because control returns to user space on every call. sys_rtas()-based users of these functions have always been, and continue to be, responsible for coordinating their call sequences with other users, even those which may invoke the RTAS functions through less direct means than sys_rtas(). This is an unavoidable consequence of exposing sequence-based RTAS functions through sys_rtas().

* Add an optional mutex member to struct rtas_function.
* Statically define a mutex for each RTAS function with known call sequence serialization requirements, and assign its address to the .lock member of the corresponding function table entry, along with justifying commentary.
* In sys_rtas(), if the table entry for the RTAS function being called has a populated lock member, acquire it before taking rtas_lock and entering RTAS.
* Kernel-based RTAS call sequences are expected to access the appropriate mutex explicitly by name. For example, a user of the ibm,activate-firmware RTAS function would do:

      int token = rtas_function_token(RTAS_FN_IBM_ACTIVATE_FIRMWARE);
      int fwrc;

      mutex_lock(&rtas_ibm_activate_firmware_lock);

      do {
              fwrc = rtas_call(token, 0, 1, NULL);
      } while (rtas_busy_delay(fwrc));

      mutex_unlock(&rtas_ibm_activate_firmware_lock);

There should be no perceivable change introduced here except that concurrent callers of the same RTAS function via sys_rtas() may block on a mutex instead of spinning on rtas_lock.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-6-e9eafd0c8c6c@linux.ibm.com
|
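The locking order described in the commit above (per-function mutex first, then the low-level entry lock) can be sketched in user space. This is only an illustration under stated assumptions: `struct fn_desc`, `entry_lock`, `call_from_user()`, `run_sequence()` and `enter_firmware()` are hypothetical stand-ins for `struct rtas_function`, `rtas_lock`, the `sys_rtas()` path, an in-kernel call sequence and RTAS entry, and pthread mutexes replace the kernel's mutex and spinlock primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct rtas_function. */
struct fn_desc {
    const char *name;
    pthread_mutex_t *lock;  /* NULL when no sequence serialization is needed */
};

/* One statically defined mutex per sequence-based function. */
static pthread_mutex_t activate_firmware_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for rtas_lock: only one caller may be "in firmware" at a time. */
static pthread_mutex_t entry_lock = PTHREAD_MUTEX_INITIALIZER;

static struct fn_desc fn_table[] = {
    { "get-time-of-day", NULL },
    { "ibm,activate-firmware", &activate_firmware_lock },
};

static int calls_made;

/* Pretend firmware entry; callers must hold entry_lock. */
static int enter_firmware(const struct fn_desc *fn)
{
    (void)fn;
    calls_made++;
    return 0;
}

/* Syscall-style path: a single call, per-function lock taken first if present. */
int call_from_user(struct fn_desc *fn)
{
    int rc;

    if (fn->lock)
        pthread_mutex_lock(fn->lock);
    pthread_mutex_lock(&entry_lock);
    rc = enter_firmware(fn);
    pthread_mutex_unlock(&entry_lock);
    if (fn->lock)
        pthread_mutex_unlock(fn->lock);
    return rc;
}

/* In-kernel-style path: hold the per-function lock across the whole sequence. */
int run_sequence(struct fn_desc *fn, int steps)
{
    int rc = 0;

    assert(fn->lock != NULL);  /* only meaningful for sequence-based functions */
    pthread_mutex_lock(fn->lock);
    for (int i = 0; i < steps && rc == 0; i++) {
        pthread_mutex_lock(&entry_lock);
        rc = enter_firmware(fn);
        pthread_mutex_unlock(&entry_lock);
    }
    pthread_mutex_unlock(fn->lock);
    return rc;
}
```

Because `call_from_user()` takes the same per-function mutex, a syscall-style caller cannot slip a call into the middle of `run_sequence()`; it blocks until the sequence finishes, which matches the behavior change the commit message describes.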
#
e7582edb |
|
12-Dec-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: Move token validation from block_rtas_call() to sys_rtas()

The rtas system call handler sys_rtas() delegates certain input validation steps to a helper function: block_rtas_call(). One of these steps ensures that the user-supplied token value maps to a known RTAS function. This is done by performing a "reverse" token-to-function lookup via rtas_token_to_function_untrusted() to obtain an rtas_function object.

In changes to come, sys_rtas() itself will need the function descriptor for the token. To prepare:

* Move the lookup and validation up into sys_rtas() and pass the resulting rtas_function pointer to block_rtas_call(), which is otherwise unconcerned with the token value.
* Change block_rtas_call() to report the RTAS function name instead of the token value on validation failures, since it can now rely on having a valid function descriptor.

One behavior change is that sys_rtas() now silently errors out when passed a bad token, before calling block_rtas_call(). So we will no longer log "RTAS call blocked - exploit attempt?" on invalid tokens. This is consistent with how sys_rtas() currently handles other "metadata" (nargs and nret), while block_rtas_call() is primarily concerned with validating the arguments to be passed to specific RTAS functions.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-5-e9eafd0c8c6c@linux.ibm.com
|
#
669acc7e |
|
12-Dec-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: Fall back to linear search on failed token->function lookup

Enabling any of the powerpc:rtas_* tracepoints at boot is likely to result in an oops on RTAS platforms. For example, booting a QEMU pseries model with 'trace_event=powerpc:rtas_input' in the command line leads to:

    BUG: Kernel NULL pointer dereference on read at 0x00000008
    Oops: Kernel access of bad area, sig: 7 [#1]
    NIP [c00000000004231c] do_enter_rtas+0x1bc/0x460
    LR [c00000000004231c] do_enter_rtas+0x1bc/0x460
    Call Trace:
      do_enter_rtas+0x1bc/0x460 (unreliable)
      rtas_call+0x22c/0x4a0
      rtas_get_boot_time+0x80/0x14c
      read_persistent_clock64+0x124/0x150
      read_persistent_wall_and_boot_offset+0x28/0x58
      timekeeping_init+0x70/0x348
      start_kernel+0xa0c/0xc1c
      start_here_common+0x1c/0x20

(This is preceded by a warning for the failed lookup in rtas_token_to_function().)

This happens when __do_enter_rtas_trace() attempts a token to function descriptor lookup before the xarray containing the mappings has been set up. Fall back to linear scan of the table if rtas_token_to_function_xarray is empty.

Fixes: 24098f580e2b ("powerpc/rtas: add tracepoints around RTAS entry")
Reviewed-by: "Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org>
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-3-e9eafd0c8c6c@linux.ibm.com
|
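The fallback described in the commit above can be sketched as follows. This is a simplified user-space illustration, not the kernel code: a direct-indexed array stands in for the xarray, and `fn_table`, `token_index`, `index_init()`, `token_to_function()` and all token values are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TOKEN 64  /* hypothetical upper bound on token values */

struct fn_desc {
    const char *name;
    int token;
};

/* Token values here are invented for the example. */
static struct fn_desc fn_table[] = {
    { "get-time-of-day", 3 },
    { "ibm,get-vpd", 41 },
    { "set-indicator", 11 },
};
#define FN_COUNT (sizeof(fn_table) / sizeof(fn_table[0]))

/* Stand-in for the token -> function xarray; empty until index_init() runs. */
static struct fn_desc *token_index[MAX_TOKEN];
static bool index_ready;

void index_init(void)
{
    for (size_t i = 0; i < FN_COUNT; i++) {
        assert(fn_table[i].token >= 0 && fn_table[i].token < MAX_TOKEN);
        token_index[fn_table[i].token] = &fn_table[i];
    }
    index_ready = true;
}

struct fn_desc *token_to_function(int token)
{
    if (token < 0 || token >= MAX_TOKEN)
        return NULL;

    if (index_ready)
        return token_index[token];

    /* Index not built yet (early boot): scan the table instead. */
    for (size_t i = 0; i < FN_COUNT; i++)
        if (fn_table[i].token == token)
            return &fn_table[i];
    return NULL;
}
```

The point of the sketch is that callers get the same answer before and after `index_init()`; only the cost of the lookup differs.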
#
c500c6e7 |
|
12-Dec-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: Add for_each_rtas_function() iterator Add a convenience macro for iterating over every element of the internal function table and convert the one site that can use it. An additional user of the macro is anticipated in changes to follow. Reviewed-by: "Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-2-e9eafd0c8c6c@linux.ibm.com
|
#
01e346ff |
|
12-Dec-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: Avoid warning on invalid token argument to sys_rtas()

rtas_token_to_function() WARNs when passed an invalid token; it's meant to catch bugs in kernel-based users of RTAS functions. However, user space controls the token value passed to rtas_token_to_function() by block_rtas_call(), so user space with sufficient privilege to use sys_rtas() can trigger the warnings at will:

    unexpected failed lookup for token 2048
    WARNING: CPU: 20 PID: 2247 at arch/powerpc/kernel/rtas.c:556 rtas_token_to_function+0xfc/0x110
    ...
    NIP rtas_token_to_function+0xfc/0x110
    LR rtas_token_to_function+0xf8/0x110
    Call Trace:
      rtas_token_to_function+0xf8/0x110 (unreliable)
      sys_rtas+0x188/0x880
      system_call_exception+0x268/0x530
      system_call_common+0x160/0x2c4

It's desirable to continue warning on bogus tokens in rtas_token_to_function(). Currently it is used to look up RTAS function descriptors when tracing, where we know there has to have been a successful descriptor lookup by different means already, and it would be a serious inconsistency for the reverse lookup to fail.

So instead of weakening rtas_token_to_function()'s contract by removing the warnings, introduce rtas_token_to_function_untrusted(), which has no opinion on failed lookups. Convert block_rtas_call() and rtas_token_to_function() to use it.

Fixes: 8252b88294d2 ("powerpc/rtas: improve function information lookups")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231212-papr-sys_rtas-vs-lockdown-v6-1-e9eafd0c8c6c@linux.ibm.com
|
#
19773eda |
|
06-Nov-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: Remove trailing space Use scripts/cleanfile to remove instances of trailing space in the core RTAS code and header. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231106-rtas-trivial-v1-6-61847655c51f@linux.ibm.com
|
#
1d8faf1f |
|
06-Nov-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: Remove unused rtas_service_present() rtas_service_present() has no more users. rtas_function_implemented() is now the appropriate API for determining whether a given RTAS function is available to call. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231106-rtas-trivial-v1-4-61847655c51f@linux.ibm.com
|
#
e160bf64 |
|
18-Aug-2023 |
Mahesh Salgaonkar <mahesh@linux.ibm.com> |
powerpc/rtas: export rtas_error_rc() for reuse. Also, #define descriptive names for common rtas return codes and use them instead of numeric values. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/169235811556.193557.1023625262204809514.stgit@jupiter
|
#
b949ee68 |
|
08-Jun-2023 |
Hari Bathini <hbathini@linux.ibm.com> |
powerpc/fadump: invoke ibm,os-term with rtas_call_unlocked() Invoke ibm,os-term call with rtas_call_unlocked(), without using the RTAS spinlock, to avoid deadlock in the unlikely event of a machine crash while making an RTAS call. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230609071404.425529-1-hbathini@linux.ibm.com
|
#
af8bc682 |
|
06-Mar-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: lockdep annotations

Add lockdep annotations for the following properties that must hold:

* Any error log retrieval must be atomically coupled with the prior RTAS call, without a window for another RTAS call to occur before the error log can be retrieved.
* All users of the core rtas_args parameter block must hold rtas_lock.

Move the definitions of rtas_lock and rtas_args up in the file so that __do_enter_rtas_trace() can refer to them.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230220-rtas-queue-for-6-4-v1-6-010e4416f13f@linux.ibm.com
|
#
32740fce |
|
06-Mar-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: fix miswording in rtas_function kerneldoc The 'filter' member is a pointer, not a bool; fix the wording accordingly. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230220-rtas-queue-for-6-4-v1-4-010e4416f13f@linux.ibm.com
|
#
1792e46e |
|
06-Mar-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: rtas_call_unlocked() kerneldoc Add documentation for rtas_call_unlocked(), including details on how it differs from rtas_call(). Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230220-rtas-queue-for-6-4-v1-3-010e4416f13f@linux.ibm.com
|
#
271208ee |
|
06-Mar-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: use memmove for potentially overlapping buffer copy

Using memcpy() isn't safe when buf is identical to rtas_err_buf, which can happen during boot before slab is up. Full context which may not be obvious from the diff:

    if (altbuf) {
            buf = altbuf;
    } else {
            buf = rtas_err_buf;
            if (slab_is_available())
                    buf = kmalloc(RTAS_ERROR_LOG_MAX, GFP_ATOMIC);
    }
    if (buf)
            memcpy(buf, rtas_err_buf, RTAS_ERROR_LOG_MAX);

This was found by inspection and I'm not aware of it causing problems in practice. It appears to have been introduced by commit 033ef338b6e0 ("powerpc: Merge rtas.c into arch/powerpc/kernel"); the old ppc64 version of this code did not have this problem.

Use memmove() instead.

Fixes: 033ef338b6e0 ("powerpc: Merge rtas.c into arch/powerpc/kernel")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230220-rtas-queue-for-6-4-v1-2-010e4416f13f@linux.ibm.com
|
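A user-space rendering of the buffer selection from the commit above shows why the copy must tolerate identical source and destination. The names are hypothetical stand-ins: `err_buf` for `rtas_err_buf`, `ERROR_LOG_MAX` for `RTAS_ERROR_LOG_MAX`, `malloc()` for `kmalloc()`, and the `allocator_ready` flag for `slab_is_available()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ERROR_LOG_MAX 64  /* hypothetical size for the sketch */

static char err_buf[ERROR_LOG_MAX];  /* stand-in for rtas_err_buf */

/*
 * Return a buffer holding a copy of err_buf. With no alternate buffer and
 * no allocator available, the result is err_buf itself, so the copy below
 * has identical source and destination.
 */
char *fetch_error_log(char *altbuf, int allocator_ready)
{
    char *buf;

    assert(ERROR_LOG_MAX <= sizeof(err_buf));

    if (altbuf) {
        buf = altbuf;
    } else {
        buf = err_buf;
        if (allocator_ready)
            buf = malloc(ERROR_LOG_MAX);  /* may return NULL */
    }

    /* memmove() is defined for overlapping regions; memcpy() is not. */
    if (buf)
        memmove(buf, err_buf, ERROR_LOG_MAX);
    return buf;
}
```

The C standard leaves `memcpy()` undefined when the regions overlap, even in the degenerate case where both pointers are equal, which is the early-boot case here.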
#
08273c9f |
|
09-Feb-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: arch-wide function token lookup conversions

With the tokens for all implemented RTAS functions now available via rtas_function_token(), which is optimal and safe for arbitrary contexts, there is no need to use rtas_token() or cache its result.

Most conversions are trivial, but a few are worth describing in more detail:

* Error injection token comparisons for lockdown purposes are consolidated into a simple predicate: token_is_restricted_errinjct().
* A couple of special cases in block_rtas_call() do not use rtas_token() but perform string comparisons against names in the function table. These are converted to compare against token values instead, which is logically equivalent but less expensive.
* The lookup for the ibm,os-term token can be deferred until needed, instead of caching it at boot to avoid device tree traversal during panic.
* Since rtas_function_token() accesses a read-only data structure without taking any locks, xmon's lookup of set-indicator can be performed as needed instead of cached at startup.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-20-26929c8cce78@linux.ibm.com
|
#
716bfc97 |
|
09-Feb-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: introduce rtas_function_token() API Users of rtas_token() supply a string argument that can't be validated at build time. A typo or misspelling has to be caught by inspection or by observing wrong behavior at runtime. Since the core RTAS code now has consolidated the names of all possible RTAS functions and mapped them to their tokens, token lookup can be implemented using symbolic constants to index a static array. So introduce rtas_function_token(), a replacement API which does that, along with a rtas_service_present()-equivalent helper, rtas_function_implemented(). Callers supply an opaque predefined function handle which is used internally to index the function table. Typos or other inappropriate arguments yield build errors, and the function handle is a type that can't be easily confused with RTAS tokens or other integer types. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-19-26929c8cce78@linux.ibm.com
|
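The idea of an opaque function handle that indexes a static table, as described in the commit above, can be sketched like this. All identifiers (`fn_handle_t`, `FN_GET_TIME_OF_DAY`, `function_token()`, `TOKEN_UNKNOWN`) and the token values are assumptions made for the example; they are not the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical indices into the function table. */
enum fn_index {
    FNIDX_GET_TIME_OF_DAY,
    FNIDX_IBM_GET_VPD,
    FNIDX_COUNT,
};

/* Wrapping the index in a struct keeps handles from mixing with plain ints. */
typedef struct {
    enum fn_index index;
} fn_handle_t;

#define FN_HANDLE(idx)     ((fn_handle_t){ .index = (idx) })
#define FN_GET_TIME_OF_DAY FN_HANDLE(FNIDX_GET_TIME_OF_DAY)
#define FN_IBM_GET_VPD     FN_HANDLE(FNIDX_IBM_GET_VPD)

#define TOKEN_UNKNOWN (-1)  /* hypothetical "firmware does not implement this" */

/* In the real code the tokens come from the device tree; these are invented. */
static int tokens[FNIDX_COUNT] = {
    [FNIDX_GET_TIME_OF_DAY] = 3,
    [FNIDX_IBM_GET_VPD] = TOKEN_UNKNOWN,
};

int function_token(fn_handle_t h)
{
    assert(h.index < FNIDX_COUNT);
    return tokens[h.index];
}

int function_implemented(fn_handle_t h)
{
    return function_token(h) != TOKEN_UNKNOWN;
}
```

With this shape, a misspelled handle such as `FN_GET_TIME_OF_DAYY` is an undeclared identifier, and passing a bare integer where `fn_handle_t` is expected is a type error, so both mistakes surface at build time instead of at runtime.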
#
43033bc6 |
|
09-Feb-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/pseries: add RTAS work area allocator

Various pseries-specific RTAS functions take a temporary "work area" parameter - a buffer in memory accessible to RTAS. Typically such functions are passed the statically allocated rtas_data_buf buffer as the argument. This buffer is protected by a global spinlock. So users of rtas_data_buf cannot perform sleeping operations while accessing the buffer.

Most RTAS functions that have a work area parameter can return a status (-2/990x) that indicates that the caller should retry. Before retrying, the caller may need to reschedule or sleep (see rtas_busy_delay() for details). This combination of factors leads to uncomfortable constructions like this:

    do {
            spin_lock(&rtas_data_buf_lock);
            rc = rtas_call(token, __pa(rtas_data_buf), ...);
            if (rc == 0) {
                    /* parse or copy out rtas_data_buf contents */
            }
            spin_unlock(&rtas_data_buf_lock);
    } while (rtas_busy_delay(rc));

Another unfortunately common way of handling this is for callers to blithely ignore the possibility of a -2/990x status and hope for the best.

If users were allowed to perform blocking operations while owning a work area, the programming model would become less tedious and error-prone. Users could schedule away, sleep, or perform other blocking operations without having to release and re-acquire resources.

We could continue to use a single work area buffer, and convert rtas_data_buf_lock to a mutex. But that would impose an unnecessarily coarse serialization on all users. As awkward as the current design is, it prevents longer running operations that need to repeatedly use rtas_data_buf from blocking the progress of others.

There are more considerations. One is that while 4KB is fine for all current in-kernel uses, some RTAS calls can take much smaller buffers, and some (VPD, platform dumps) would likely benefit from larger ones. Another is that at least one RTAS function (ibm,get-vpd) has *two* work area parameters. And finally, we should expect the number of work area users in the kernel to increase over time as we introduce lockdown-compatible ABIs to replace less safe use cases based on sys_rtas/librtas.

So a special-purpose allocator for RTAS work area buffers seems worth trying.

Properties:

* The backing memory for the allocator is reserved early in boot in order to satisfy RTAS addressing requirements, and then managed with genalloc.
* Allocations can block, but they never fail (mempool-like).
* Prioritizes first-come, first-serve fairness over throughput.
* Early boot allocations before the allocator has been initialized are served via an internal static buffer.

Intended to replace rtas_data_buf. New code that needs RTAS work area buffers should prefer this API.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-12-26929c8cce78@linux.ibm.com
|
#
24098f58 |
|
09-Feb-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: add tracepoints around RTAS entry Decompose the RTAS entry C code into tracing and non-tracing variants, calling the just-added tracepoints in the tracing-enabled path. Skip tracing in contexts known to be unsafe (real mode, CPU offline). Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-11-26929c8cce78@linux.ibm.com
|
#
77f85f69 |
|
09-Feb-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: strengthen do_enter_rtas() type safety, drop inline Make do_enter_rtas() take a pointer to struct rtas_args and do the __pa() conversion in one place instead of leaving it to callers. This also makes it possible to introduce enter/exit tracepoints that access the rtas_args struct fields. There's no apparent reason to force inlining of do_enter_rtas() either, and it seems to bloat the code a bit. Let the compiler decide. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-9-26929c8cce78@linux.ibm.com
|
#
8252b882 |
|
09-Feb-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: improve function information lookups

The core RTAS support code and its clients perform two types of lookup for RTAS firmware function information.

First, mapping a known function name to a token. The typical use case invokes rtas_token() to retrieve the token value to pass to rtas_call(). rtas_token() relies on of_get_property(), which performs a linear search of the /rtas node's property list under a lock with IRQs disabled.

Second, and less common: given a token value, looking up some information about the function. The primary example is the sys_rtas filter path, which linearly scans a small table to match the token to a rtas_filter struct. Another use case to come is RTAS entry/exit tracepoints, which will require efficient lookup of function names from token values. Currently there is no general API for this.

We need something much like the existing rtas_filters table, but more general and organized to facilitate efficient lookups.

Introduce:

* A new rtas_function type, aggregating function name, token, and filter. Other function characteristics could be added in the future.
* An array of rtas_function, where each element corresponds to a known RTAS function. All information in the table is static save the token values, which are derived from the device tree at boot. The array is sorted by function name to allow binary search.
* A named constant for each known RTAS function, used to index the function array. These also will be used in a client-facing API to be added later.
* An xarray that maps valid tokens to rtas_function objects.

Fold the existing rtas_filter table into the new rtas_function array, with the appropriate adjustments to block_rtas_call(). Remove now-redundant fields from struct rtas_filter. Preserve the function of the CONFIG_CPU_BIG_ENDIAN guard in the current filter table by introducing a per-function flag that is set for the function entries related to pseries LPAR migration. These have never had working users via sys_rtas on ppc64le; see commit de0f7349a0dd ("powerpc/rtas: prevent suspend-related sys_rtas use on LE").

Convert rtas_token() to use a lockless binary search on the function table. Fall back to the old behavior for lookups against names that are not known to be RTAS functions, but issue a warning. rtas_token() is for function names; it is not a general facility for accessing arbitrary properties of the /rtas node. All known misuses of rtas_token() have been converted to more appropriate of_ APIs in preceding changes.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-8-26929c8cce78@linux.ibm.com
|
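The name-sorted table with binary search described in the commit above might look roughly like this in portable C. It is a sketch under assumptions: `fn_table`, `token_for_name()` and the token values are invented, and the standard library's `bsearch()` stands in for whatever search helper the kernel uses.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct fn_desc {
    const char *name;
    int token;  /* derived from the device tree at boot in the real code */
};

/* Must stay sorted by name in strcmp() order for the binary search. */
static struct fn_desc fn_table[] = {
    { "get-time-of-day", 3 },
    { "ibm,activate-firmware", 27 },
    { "ibm,get-vpd", 41 },
    { "set-indicator", 11 },
};
#define FN_COUNT (sizeof(fn_table) / sizeof(fn_table[0]))

/* Returns 1 when every name is strictly greater than its predecessor. */
int table_is_sorted(void)
{
    for (size_t i = 1; i < FN_COUNT; i++)
        if (strcmp(fn_table[i - 1].name, fn_table[i].name) >= 0)
            return 0;
    return 1;
}

/* bsearch() comparator: key is the name string, elem is a table entry. */
static int cmp_name(const void *key, const void *elem)
{
    const struct fn_desc *fn = elem;

    return strcmp(key, fn->name);
}

/* Returns the token for a function name, or -1 if the name is unknown. */
int token_for_name(const char *name)
{
    const struct fn_desc *fn;

    assert(table_is_sorted());  /* debug-only sanity check */
    fn = bsearch(name, fn_table, FN_COUNT, sizeof(fn_table[0]), cmp_name);
    return fn ? fn->token : -1;
}
```

Since the table is read-only after boot, a lookup like this needs no lock, which is the property the commit message relies on when it calls the search lockless.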
#
836b5b9f |
|
09-Feb-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: ensure 4KB alignment for rtas_data_buf

Some RTAS functions that have work area parameters impose alignment requirements on the work area passed to them by the OS. Examples include:

- ibm,configure-connector
- ibm,update-nodes
- ibm,update-properties

4KB is the greatest alignment required by PAPR for such buffers. rtas_data_buf used to have a __page_aligned attribute in the arch/ppc64 days, but that was changed to __cacheline_aligned for unknown reasons by commit 033ef338b6e0 ("powerpc: Merge rtas.c into arch/powerpc/kernel"). That works out to 128-byte alignment on ppc64, which isn't right.

This was found by inspection and I'm not aware of any real problems caused by this. Either current RTAS implementations don't enforce the alignment constraints, or rtas_data_buf is always being placed at a 4KB boundary by accident (or both, perhaps).

Use __aligned(SZ_4K) to ensure the rtas_data_buf has alignment appropriate for all users.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: 033ef338b6e0 ("powerpc: Merge rtas.c into arch/powerpc/kernel")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-6-26929c8cce78@linux.ibm.com
|
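Outside the kernel, the same alignment guarantee can be expressed with C11 `_Alignas`. The sketch below is an analogue of `__aligned(SZ_4K)`, not the kernel declaration; `work_buf`, `WORK_BUF_SIZE`, `is_4k_aligned()` and `get_work_buf()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define WORK_BUF_SIZE 4096

/* C11 _Alignas plays the role of the kernel's __aligned(SZ_4K) here. */
static _Alignas(4096) char work_buf[WORK_BUF_SIZE];

/* True when the address is a multiple of 4096. */
int is_4k_aligned(const void *p)
{
    return ((uintptr_t)p & 4095u) == 0;
}

char *get_work_buf(void)
{
    /* Guaranteed by the declaration, not by accident of link order. */
    assert(is_4k_aligned(work_buf));
    return work_buf;
}
```

By contrast, a 128-byte-aligned buffer satisfies the 4KB requirement only when the linker happens to place it on a 4KB boundary, which is the accidental case the commit message mentions.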
#
09d1ea72 |
|
09-Feb-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: handle extended delays safely in early boot

Some code that runs early in boot calls RTAS functions that can return -2 or 990x statuses, which mean the caller should retry. An example is pSeries_cmo_feature_init(), which invokes ibm,get-system-parameter but treats these benign statuses as errors instead of retrying.

pSeries_cmo_feature_init() and similar code should be made to retry until they succeed or receive a real error, using the usual pattern:

    do {
            rc = rtas_call(token, etc...);
    } while (rtas_busy_delay(rc));

But rtas_busy_delay() will perform a timed sleep on any 990x status. This isn't safe so early in boot, before the CPU scheduler and timer subsystem have initialized.

The -2 RTAS status is much more likely to occur during single-threaded boot than 990x in practice, at least on PowerVM. This is because -2 usually means that RTAS made progress but exhausted its self-imposed timeslice, while 990x is associated with concurrent requests from the OS causing internal contention. Regardless, according to the language in PAPR, the OS should be prepared to handle either type of status at any time.

Add a fallback path to rtas_busy_delay() to handle this as safely as possible, performing a small delay on 990x. Include a counter to detect retry loops that aren't making progress and bail out. Add __ref to rtas_busy_delay() since it now conditionally calls an __init function.

This was found by inspection and I'm not aware of any real failures. However, the implementation of rtas_busy_delay() before commit 38f7b7067dae ("powerpc/rtas: rtas_busy_delay() improvements") was not susceptible to this problem, so let's treat this as a regression.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: 38f7b7067dae ("powerpc/rtas: rtas_busy_delay() improvements")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-1-26929c8cce78@linux.ibm.com
|
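The early-boot fallback described in the commit above (retry on -2, short delay on 990x, bail out when a retry loop stops making progress) can be sketched as below. This is an assumption-laden illustration rather than the kernel's implementation: `busy_delay_early()`, `short_delay()`, the status macros and the `EARLY_RETRY_LIMIT` threshold are hypothetical, and the delay is a stub that only counts invocations.

```c
#include <assert.h>
#include <stdbool.h>

#define STATUS_BUSY          (-2)
#define STATUS_EXT_DELAY_MIN 9900
#define STATUS_EXT_DELAY_MAX 9905
#define EARLY_RETRY_LIMIT    1000  /* hypothetical bail-out threshold */

static unsigned int successive_ext_delays;
static unsigned int delays_done;

/* Stand-in for a busy-wait delay that needs no scheduler or timers. */
static void short_delay(void)
{
    delays_done++;
}

/* Returns true if the caller should retry the firmware call. */
bool busy_delay_early(int status)
{
    bool retry;

    if (status >= STATUS_EXT_DELAY_MIN && status <= STATUS_EXT_DELAY_MAX) {
        /* Cannot sleep this early: wait briefly and ask for a retry. */
        short_delay();
        retry = true;
        if (++successive_ext_delays > EARLY_RETRY_LIMIT) {
            /* Apparently stuck; give up instead of looping forever. */
            retry = false;
            successive_ext_delays = 0;
        }
    } else if (status == STATUS_BUSY) {
        /* Firmware made progress; retry immediately. */
        retry = true;
        successive_ext_delays = 0;
    } else {
        /* Success or a real error: the retry loop ends. */
        retry = false;
        successive_ext_delays = 0;
    }

    assert(successive_ext_delays <= EARLY_RETRY_LIMIT);
    return retry;
}
```

A caller would use it in the same `do { ... } while (busy_delay_early(rc));` shape as the pattern quoted in the commit message; the counter resets on any status other than an extended delay, so only an unbroken run of 990x results triggers the bail-out.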
#
12fd6665 |
|
24-Jan-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: upgrade internal arch spinlocks At the time commit f97bb36f705d ("powerpc/rtas: Turn rtas lock into a raw spinlock") was written, the spinlock lockup detection code called __delay(), which will not make progress if the timebase is not advancing. Since the interprocessor timebase synchronization sequence for chrp, cell, and some now-unsupported Power models can temporarily freeze the timebase through an RTAS function (freeze-time-base), the lock that serializes most RTAS calls was converted to arch_spinlock_t to prevent kernel hangs in the lockup detection code. However, commit bc88c10d7e69 ("locking/spinlock/debug: Remove spinlock lockup detection code") removed that inconvenient property from the lock debug code several years ago. So now it should be safe to reintroduce generic locks into the RTAS support code, primarily to increase lockdep coverage. Making rtas_lock a spinlock_t would violate lock type nesting rules because it can be acquired while holding raw locks, e.g. pci_lock and irq_desc->lock. So convert it to raw_spinlock_t. There's no apparent reason not to upgrade timebase_lock as well. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230124140448.45938-5-nathanl@linux.ibm.com
|
#
599af491 |
|
24-Jan-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: remove lock and args fields from global rtas struct Only code internal to the RTAS subsystem needs access to the central lock and parameter block. Remove these from the globally visible 'rtas' struct and make them file-static in rtas.c. Some changed lines in rtas_call() lack appropriate spacing around operators and cause checkpatch errors; fix these as well. Suggested-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Laurent Dufour <laurent.dufour@fr.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230124140448.45938-4-nathanl@linux.ibm.com
|
#
9bce6243 |
|
24-Jan-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: make all exports GPL The first symbol exports of RTAS functions and data came with the (now removed) scanlog driver in 2003: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=f92e361842d5251e50562b09664082dcbd0548bb At the time this was applied, EXPORT_SYMBOL_GPL() was very new, and the exports of rtas_call() etc have remained non-GPL. As new APIs have been added to the RTAS subsystem, their symbol exports have followed the convention set by existing code. However, the historical evidence is that RTAS function exports have been added over time only to satisfy the needs of in-kernel users, and these clients must have fairly intimate knowledge of how the APIs work to use them safely. No out of tree users are known, and future ones seem unlikely. Arguably the default for RTAS symbols should have become EXPORT_SYMBOL_GPL once it was available. Let's make it so now, and exceptions can be evaluated as needed. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Laurent Dufour <laurent.dufour@fr.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230124140448.45938-3-nathanl@linux.ibm.com
|
#
0d7e812f |
|
26-Jan-2023 |
Michael Ellerman <mpe@ellerman.id.au> |
powerpc/rtas: Drop unused export symbols Some RTAS symbols are never used by modular code, drop their exports. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com> Link: https://lore.kernel.org/r/20230127111231.84294-1-mpe@ellerman.id.au
|
#
5ff92e2f |
|
24-Jan-2023 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: unexport 'rtas' symbol No modular code needs access to the 'rtas' struct, so remove the symbol export. Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230124140448.45938-2-nathanl@linux.ibm.com
|
#
98c738c8 |
|
18-Nov-2022 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: mandate RTAS syscall filtering CONFIG_PPC_RTAS_FILTER has been optional but default-enabled since its introduction. It's been enabled in enterprise distro kernels for a while without causing ABI breakage that wasn't easily fixed, and it prevents harmful abuses of the rtas syscall. Let's make it unconditional. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221118150751.469393-10-nathanl@linux.ibm.com
|
#
f975b655 |
|
18-Nov-2022 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: define pr_fmt and convert printk call sites Set pr_fmt to "rtas: " and convert the handful of printk() uses in rtas.c, adjusting the messages to remove now-redundant "RTAS" strings. Note that rtas_restart(), rtas_power_off(), and rtas_halt() all currently use printk() without specifying a log level. These have been changed to use pr_emerg(), which matches the behavior of rtas_os_term(). Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221118150751.469393-9-nathanl@linux.ibm.com
|
#
9581f8a0 |
|
18-Nov-2022 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: clean up includes rtas.c used to host complex code related to pseries-specific guest migration and suspend, which used atomics, completions, hcalls, and CPU hotplug APIs. That's all been deleted or moved, so remove the include directives that have been rendered unnecessary. Sort the remainder (with linux/ before asm/) to impose some order on where future additions go. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221118150751.469393-8-nathanl@linux.ibm.com
|
#
c67a0e41 |
|
18-Nov-2022 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: clean up rtas_error_log_max initialization The code in rtas_get_error_log_max() doesn't cause problems in practice, but there are no measures to ensure that the lazy initialization of the static rtas_error_log_max variable is atomic, and it's not worth adding them. Initialize the static rtas_error_log_max variable at boot when we're single-threaded instead of lazily on first use. Use the more appropriate of_property_read_u32() API instead of rtas_token() to consult the "rtas-error-log-max" property, which is not the name of an RTAS function. Convert use of printk() to pr_warn() and distinguish the possible error cases. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221118150751.469393-7-nathanl@linux.ibm.com
|
#
6c606e57 |
|
18-Nov-2022 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: avoid scheduling in rtas_os_term() It's unsafe to use rtas_busy_delay() to handle a busy status from the ibm,os-term RTAS function in rtas_os_term(): Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b BUG: sleeping function called from invalid context at arch/powerpc/kernel/rtas.c:618 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0 preempt_count: 2, expected: 0 CPU: 7 PID: 1 Comm: swapper/0 Tainted: G D 6.0.0-rc5-02182-gf8553a572277-dirty #9 Call Trace: [c000000007b8f000] [c000000001337110] dump_stack_lvl+0xb4/0x110 (unreliable) [c000000007b8f040] [c0000000002440e4] __might_resched+0x394/0x3c0 [c000000007b8f0e0] [c00000000004f680] rtas_busy_delay+0x120/0x1b0 [c000000007b8f100] [c000000000052d04] rtas_os_term+0xb8/0xf4 [c000000007b8f180] [c0000000001150fc] pseries_panic+0x50/0x68 [c000000007b8f1f0] [c000000000036354] ppc_panic_platform_handler+0x34/0x50 [c000000007b8f210] [c0000000002303c4] notifier_call_chain+0xd4/0x1c0 [c000000007b8f2b0] [c0000000002306cc] atomic_notifier_call_chain+0xac/0x1c0 [c000000007b8f2f0] [c0000000001d62b8] panic+0x228/0x4d0 [c000000007b8f390] [c0000000001e573c] do_exit+0x140c/0x1420 [c000000007b8f480] [c0000000001e586c] make_task_dead+0xdc/0x200 Use rtas_busy_delay_time() instead, which signals without side effects whether to attempt the ibm,os-term RTAS call again. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221118150751.469393-5-nathanl@linux.ibm.com
|
#
ed2213bf |
|
18-Nov-2022 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: avoid device tree lookups in rtas_os_term() rtas_os_term() is called during panic. Its behavior depends on a couple of conditions in the /rtas node of the device tree, the traversal of which entails locking and local IRQ state changes. If the kernel panics while devtree_lock is held, rtas_os_term() as currently written could hang. Instead of discovering the relevant characteristics at panic time, cache them in file-static variables at boot. Note the lookup for "ibm,extended-os-term" is converted to of_property_read_bool() since it is a boolean property, not an RTAS function token. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> [mpe: Incorporate suggested change from Nick] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221118150751.469393-4-nathanl@linux.ibm.com
|
#
336e2554 |
|
18-Nov-2022 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: document rtas_call() rtas_call() has a complex calling convention, non-standard return values, and many users. Add kernel-doc for it and remove the less structured commentary from rtas.h. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221118150751.469393-2-nathanl@linux.ibm.com
|
#
b8f3e488 |
|
26-Sep-2022 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: block error injection when locked down The error injection facility on pseries VMs allows corruption of arbitrary guest memory, potentially enabling a sufficiently privileged user to disable lockdown or perform other modifications of the running kernel via the rtas syscall. Block the PAPR error injection facility from being opened or called when locked down. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Acked-by: Paul Moore <paul@paul-moore.com> (LSM) Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926131643.146502-3-nathanl@linux.ibm.com
|
#
f88aabad |
|
07-Sep-2022 |
Nathan Lynch <nathanl@linux.ibm.com> |
Revert "powerpc/rtas: Implement reentrant rtas call" At the time this was submitted by Leonardo, I confirmed -- or thought I had confirmed -- with PowerVM partition firmware development that the following RTAS functions: - ibm,get-xive - ibm,int-off - ibm,int-on - ibm,set-xive were safe to call on multiple CPUs simultaneously, not only with respect to themselves as indicated by PAPR, but with arbitrary other RTAS calls: https://lore.kernel.org/linuxppc-dev/875zcy2v8o.fsf@linux.ibm.com/ Recent discussion with firmware development makes it clear that this is not true, and that the code in commit b664db8e3f97 ("powerpc/rtas: Implement reentrant rtas call") is unsafe, likely explaining several strange bugs we've seen in internal testing involving DLPAR and LPM. These scenarios use ibm,configure-connector, whose internal state can be corrupted by the concurrent use of the "reentrant" functions, leading to symptoms like endless busy statuses from RTAS. Fixes: b664db8e3f97 ("powerpc/rtas: Implement reentrant rtas call") Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Laurent Dufour <laurent.dufour@fr.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220907220111.223267-1-nathanl@linux.ibm.com
|
#
7bc08056 |
|
14-Jun-2022 |
Andrew Donnellan <ajd@linux.ibm.com> |
powerpc/rtas: Allow ibm,platform-dump RTAS call with null buffer address Add a special case to block_rtas_call() to allow the ibm,platform-dump RTAS call through the RTAS filter if the buffer address is 0. According to PAPR, ibm,platform-dump is called with a null buffer address to notify the platform firmware that processing of a particular dump is finished. Without this, on a pseries machine with CONFIG_PPC_RTAS_FILTER enabled, an application such as rtas_errd that is attempting to retrieve a dump will encounter an error at the end of the retrieval process. Fixes: bd59380c5ba4 ("powerpc/rtas: Restrict RTAS requests from userspace") Cc: stable@vger.kernel.org Reported-by: Sathvika Vasireddy <sathvika@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220614134952.156010-1-ajd@linux.ibm.com
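The special case described in this entry can be illustrated with a small user-space sketch. It is not the kernel's block_rtas_call(): the names should_block_call(), struct user_region and in_user_region() are hypothetical, and the only rules modelled are "a buffer must lie inside the region set aside for user space" and "ibm,platform-dump with a null buffer address is let through".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of the memory area user space may pass to RTAS. */
struct user_region {
	uint32_t base;
	uint32_t size;
};

/* True if [addr, addr + len) lies entirely inside the region. */
static bool in_user_region(const struct user_region *r, uint32_t addr,
			   uint32_t len)
{
	uint64_t end = (uint64_t)addr + len;

	return addr >= r->base && end <= (uint64_t)r->base + r->size;
}

/*
 * Illustrative filter decision (assumed shape, not the kernel source):
 * returns true when the call should be blocked.
 */
static bool should_block_call(const char *fn, const struct user_region *r,
			      uint32_t buf, uint32_t len)
{
	assert(fn && r);

	/* A null buffer tells firmware that processing of the dump is finished. */
	if (strcmp(fn, "ibm,platform-dump") == 0 && buf == 0)
		return false;

	return !in_user_region(r, buf, len);
}
```

With a region of base 0x1000 and size 0x1000, a null-buffer ibm,platform-dump call passes, while the same null address on any other function name is rejected because it falls outside the region.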
|
#
743cdb7b |
|
19-May-2022 |
Paul Mackerras <paulus@ozlabs.org> |
powerpc/kasan: Mark more real-mode code as not to be instrumented This marks more files and functions that can possibly be called in real mode as not to be instrumented by KASAN. Most were found by inspection, except for get_pseries_errorlog() which was reported as causing a crash in testing. Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/YoX1kZPnmUX4RZEK@cleo
|
#
804c0a16 |
|
08-Mar-2022 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/rtas: enture rtas_call is called with MMU enabled rtas_call must not be called with the MMU disabled because in case of rtas error, log_error is called which requires MMU enabled. Add a test and warning for this. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220308135047.478297-14-npiggin@gmail.com
|
#
c5a65e0a |
|
08-Mar-2022 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/rtas: Call enter_rtas with MSR[EE] disabled Disable MSR[EE] in C code rather than asm. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220308135047.478297-5-npiggin@gmail.com
|
#
b6b1c3ce |
|
03-May-2022 |
Laurent Dufour <ldufour@linux.ibm.com> |
powerpc/rtas: Keep MSR[RI] set when calling RTAS RTAS runs in real mode (MSR[DR] and MSR[IR] unset) and in 32-bit big endian mode (MSR[SF,LE] unset). The change in MSR is done in enter_rtas() in a relatively complex way, since the MSR value could be hardcoded. Furthermore, a panic has been reported when hitting the watchdog interrupt while running in RTAS, this leads to the following stack trace: watchdog: CPU 24 Hard LOCKUP watchdog: CPU 24 TB:997512652051031, last heartbeat TB:997504470175378 (15980ms ago) ... Supported: No, Unreleased kernel CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G E X 5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c NIP: 000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000 REGS: c00000000fc33d60 TRAP: 0100 Tainted: G E X (5.14.21-150400.71.1.bz196362_2-default) MSR: 8000000002981000 <SF,VEC,VSX,ME> CR: 48800002 XER: 20040020 CFAR: 000000000000011c IRQMASK: 1 GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010 GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000 GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034 GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008 GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40 GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000 NIP [000000001fb41050] 0x1fb41050 LR [000000001fb4104c] 0x1fb4104c Call Trace: Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX Oops: Unrecoverable System Reset, sig: 6 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries ... 
Supported: No, Unreleased kernel CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G E X 5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c NIP: 000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000 REGS: c00000000fc33d60 TRAP: 0100 Tainted: G E X (5.14.21-150400.71.1.bz196362_2-default) MSR: 8000000002981000 <SF,VEC,VSX,ME> CR: 48800002 XER: 20040020 CFAR: 000000000000011c IRQMASK: 1 GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010 GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000 GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034 GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008 GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40 GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000 NIP [000000001fb41050] 0x1fb41050 LR [000000001fb4104c] 0x1fb4104c Call Trace: Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX ---[ end trace 3ddec07f638c34a2 ]--- This happens because MSR[RI] is unset when entering RTAS but there is no valid reason to not set it here. RTAS is expected to be called with MSR[RI] as specified in PAPR+ section "7.2.1 Machine State": R1–7.2.1–9. If called with MSR[RI] equal to 1, then RTAS must protect its own critical regions from recursion by setting the MSR[RI] bit to 0 when in the critical regions. Fix this by reworking the way the MSR is computed before calling RTAS. Now a hardcoded value meaning real mode, 32-bit big-endian mode and Recoverable Interrupt is loaded. In the case MSR[S] is set, it will remain set while entering RTAS as only urfid can unset it (thanks Fabiano). 
In addition, a check is added in do_enter_rtas() to detect calls made with MSR[RI] unset, as we are forcing it on later. This patch has been tested on the following machines: Power KVM guest: P8 S822L (host Ubuntu kernel 5.11.0-49-generic). PowerVM LPAR: P8 9119-MME (FW860.A1), P9 9008-22L (FW950.00), P10 9080-HEX (FW1010.00). Suggested-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220504101244.12107-1-ldufour@linux.ibm.com
|
#
e6f6390a |
|
08-Mar-2022 |
Christophe Leroy <christophe.leroy@csgroup.eu> |
powerpc: Add missing headers Don't inherit headers "by chance" from asm/prom.h, asm/mpc52xx.h, asm/pci.h, etc. Include the needed headers, and remove asm/prom.h where it was needed exclusively for pulling in necessary headers. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/be8bdc934d152a7d8ee8d1a840d5596e2f7d85e0.1646767214.git.christophe.leroy@csgroup.eu
|
#
7c5ed82b |
|
04-Feb-2022 |
Sourabh Jain <sourabhjain@linux.ibm.com> |
powerpc: Set crashkernel offset to mid of RMA region On large config LPARs (having 192 or more cores), Linux fails to boot due to insufficient memory in the first memblock. This is due to the memory reservation for the crash kernel, which starts at a 128MB offset in the first memblock. This reservation doesn't leave enough space in the first memblock to accommodate other essential system resources. The crash kernel start address was set to a 128MB offset by default to ensure that the crash kernel gets some memory below the RMA region, which used to be 256MB in size. But given that the RMA region size can be 512MB or more, setting the crash kernel offset to the middle of the RMA will leave enough space for the kernel to allocate memory for other system resources. Since the above crash kernel offset change is only applicable to the LPAR platform, the LPAR feature detection is moved before the crash kernel reservation. The rest of the LPAR-specific initialization will still be done during pseries_probe_fw_features as usual. This patch depends on changes to paca allocation for the boot CPU. It expects the boot CPU to discover 1T segment support, which is introduced by the patch posted here: https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-January/239175.html Reported-by: Abdul haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220204085601.107257-1-sourabhjain@linux.ibm.com
|
#
dd5cde45 |
|
16-Nov-2021 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: rtas_busy_delay_time() kernel-doc Provide API documentation for rtas_busy_delay_time(), explaining why we return the same value for 9900 and -2. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211117060259.957178-3-nathanl@linux.ibm.com
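The status-to-delay mapping that this kernel-doc covers can be modelled in a few lines of user-space C. This is a minimal sketch, not the kernel source: busy_delay_time_ms() is a hypothetical name, and the power-of-ten scaling for 990n is assumed from the 9900 (1ms) and 9901 (10ms) values quoted elsewhere in this history.

```c
#include <assert.h>

/* Status values as named in the commit messages in this history. */
#define RTAS_BUSY               (-2)
#define RTAS_EXTENDED_DELAY_MIN 9900
#define RTAS_EXTENDED_DELAY_MAX 9905

/*
 * Illustrative model: return a suggested delay in milliseconds for a busy
 * or extended delay status, or 0 for any other status.
 */
static unsigned int busy_delay_time_ms(int status)
{
	unsigned int ms = 0;
	int order;

	/* 10^9 still fits in a 32-bit unsigned int; guard the assumption. */
	assert(RTAS_EXTENDED_DELAY_MAX - RTAS_EXTENDED_DELAY_MIN <= 9);

	if (status == RTAS_BUSY) {
		/* Same value as 9900: the smallest delay the API can express. */
		ms = 1;
	} else if (status >= RTAS_EXTENDED_DELAY_MIN &&
		   status <= RTAS_EXTENDED_DELAY_MAX) {
		/* 990n suggests roughly 10^n milliseconds. */
		ms = 1;
		for (order = status - RTAS_EXTENDED_DELAY_MIN; order > 0; order--)
			ms *= 10;
	}

	return ms;
}
```

In this model both -2 and 9900 yield 1, which is the overlap the kernel-doc explains; a caller that needs to tell them apart has to inspect the status itself.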
|
#
38f7b706 |
|
16-Nov-2021 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: rtas_busy_delay() improvements Generally RTAS cannot block, and in PAPR it is required to return control to the OS within a few tens of microseconds. In order to support operations which may take longer to complete, many RTAS primitives can return intermediate -2 ("busy") or 990x ("extended delay") values, which indicate that the OS should reattempt the same call with the same arguments at some point in the future. Current versions of PAPR are less than clear about this, but the intended meanings of these values in more detail are: RTAS_BUSY (-2): RTAS has suspended a potentially long-running operation in order to meet its latency obligation and give the OS the opportunity to perform other work. RTAS can resume making progress as soon as the OS reattempts the call. RTAS_EXTENDED_DELAY_{MIN...MAX} (9900-9905): RTAS must wait for an external event to occur or for internal contention to resolve before it can complete the requested operation. The value encodes a non-binding hint as to roughly how long the OS should wait before calling again, but the OS is allowed to reattempt the call sooner or even immediately. Linux of course must take its own CPU scheduling obligations into account when handling these statuses; e.g. a task which receives an RTAS_BUSY status should check whether to reschedule before it attempts the RTAS call again to avoid starving other tasks. rtas_busy_delay() is a helper function that "consumes" a busy or extended delay status. Common usage: int rc; do { rc = rtas_call(rtas_token("some-function"), ...); } while (rtas_busy_delay(rc)); /* convert rc to Linux error value, etc */ If rc is a busy or extended delay status, the caller can rely on rtas_busy_delay() to perform an appropriate sleep or reschedule and return nonzero. Other statuses are handled normally by the caller. 
The current implementation of rtas_busy_delay() both oversleeps and overuses the CPU: * It performs msleep() for all 990x and even when no delay is suggested (-2), but this is understood to actually sleep for two jiffies minimum in practice (20ms with HZ=100). 9900 (1ms) and 9901 (10ms) appear to be the most common extended delay statuses, and the oversleeping measurably lengthens DLPAR operations, which perform many RTAS calls. * It does not sleep on 990x unless need_resched() is true, causing code like the loop above to needlessly retry, wasting CPU time. Alter the logic to align better with the intended meanings: * When passed RTAS_BUSY, perform cond_resched() and return without sleeping. The caller should reattempt immediately * Always sleep when passed an extended delay status, using usleep_range() for precise shorter sleeps. Limit the sleep time to one second even though there are higher architected values. Change rtas_busy_delay()'s return type to bool to better reflect its usage, and add kernel-doc. rtas_busy_delay_time() is unchanged, even though it "incorrectly" returns 1 for RTAS_BUSY. There are users of that API with open-coded delay loops in sensitive contexts that will have to be taken on an individual basis. Brief results for addition and removal of 5GB memory on a small P9 PowerVM partition follow. Load was generated with stress-ng --cpu N. For add, elapsed time is greatly reduced without significant change in the number of RTAS calls or time spent on CPU. For remove, elapsed time is modestly reduced, with significant reductions in RTAS calls and time spent on CPU. 
With no competing workload (- before, + after): Performance counter stats for 'bash -c echo "memory add count 20" > /sys/kernel/dlpar' (10 runs): - 1,935 probe:rtas_call # 0.003 M/sec ( +- 0.22% ) - 609.99 msec task-clock # 0.183 CPUs utilized ( +- 0.19% ) + 1,956 probe:rtas_call # 0.003 M/sec ( +- 0.17% ) + 618.56 msec task-clock # 0.278 CPUs utilized ( +- 0.11% ) - 3.3322 +- 0.0670 seconds time elapsed ( +- 2.01% ) + 2.2222 +- 0.0416 seconds time elapsed ( +- 1.87% ) Performance counter stats for 'bash -c echo "memory remove count 20" > /sys/kernel/dlpar' (10 runs): - 6,224 probe:rtas_call # 0.008 M/sec ( +- 2.57% ) - 750.36 msec task-clock # 0.190 CPUs utilized ( +- 2.01% ) + 843 probe:rtas_call # 0.003 M/sec ( +- 0.12% ) + 250.66 msec task-clock # 0.068 CPUs utilized ( +- 0.17% ) - 3.9394 +- 0.0890 seconds time elapsed ( +- 2.26% ) + 3.678 +- 0.113 seconds time elapsed ( +- 3.07% ) With all CPUs 100% busy (- before, + after): Performance counter stats for 'bash -c echo "memory add count 20" > /sys/kernel/dlpar' (10 runs): - 2,979 probe:rtas_call # 0.003 M/sec ( +- 0.12% ) - 1,096.62 msec task-clock # 0.105 CPUs utilized ( +- 0.10% ) + 2,981 probe:rtas_call # 0.003 M/sec ( +- 0.22% ) + 1,095.26 msec task-clock # 0.154 CPUs utilized ( +- 0.21% ) - 10.476 +- 0.104 seconds time elapsed ( +- 1.00% ) + 7.1124 +- 0.0865 seconds time elapsed ( +- 1.22% ) Performance counter stats for 'bash -c echo "memory remove count 20" > /sys/kernel/dlpar' (10 runs): - 2,702 probe:rtas_call # 0.004 M/sec ( +- 4.00% ) - 722.71 msec task-clock # 0.067 CPUs utilized ( +- 2.41% ) + 1,246 probe:rtas_call # 0.003 M/sec ( +- 0.25% ) + 487.73 msec task-clock # 0.049 CPUs utilized ( +- 0.20% ) - 10.829 +- 0.163 seconds time elapsed ( +- 1.51% ) + 9.9887 +- 0.0866 seconds time elapsed ( +- 0.87% ) Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211117060259.957178-2-nathanl@linux.ibm.com
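The revised policy described in this entry (retry at once on RTAS_BUSY, always sleep on an extended delay, cap the sleep at one second) can be sketched in user-space C. Everything here is an illustrative assumption rather than the kernel implementation: busy_delay_plan(), fake_rtas_call() and run_until_done() are hypothetical names, the scripted statuses are invented, and the sketch only records how long a caller would sleep instead of sleeping.

```c
#include <assert.h>
#include <stdbool.h>

#define RTAS_BUSY               (-2)
#define RTAS_EXTENDED_DELAY_MIN 9900
#define RTAS_EXTENDED_DELAY_MAX 9905
#define MAX_SLEEP_MS            1000u	/* cap sleeps at one second */

/*
 * Hypothetical helper: returns true if the caller should retry, and stores
 * the sleep (in ms) to perform first. 0 means "check for reschedule only".
 */
static bool busy_delay_plan(int status, unsigned int *sleep_ms)
{
	unsigned int ms = 1;
	int order;

	assert(sleep_ms);
	*sleep_ms = 0;

	if (status == RTAS_BUSY)
		return true;	/* reattempt immediately, no sleep */

	if (status < RTAS_EXTENDED_DELAY_MIN || status > RTAS_EXTENDED_DELAY_MAX)
		return false;	/* final status: the caller handles it */

	for (order = status - RTAS_EXTENDED_DELAY_MIN; order > 0; order--)
		ms *= 10;
	*sleep_ms = ms > MAX_SLEEP_MS ? MAX_SLEEP_MS : ms;
	return true;
}

/* Hypothetical stand-in for a firmware call: busy, 9901, busy, success. */
static int fake_rtas_call(int *calls)
{
	static const int script[] = { RTAS_BUSY, 9901, RTAS_BUSY, 0 };
	int n = (*calls)++;

	return script[n < 3 ? n : 3];
}

/* Retry loop in the shape quoted above; returns the total planned sleep. */
static unsigned int run_until_done(int *calls, int *final_rc)
{
	unsigned int total = 0, ms;
	int rc;

	for (;;) {
		rc = fake_rtas_call(calls);
		if (!busy_delay_plan(rc, &ms))
			break;
		total += ms;	/* a real caller would sleep or reschedule here */
	}

	*final_rc = rc;
	return total;
}
```

Separating the decision from the sleep keeps the policy testable without timing dependencies; in the scripted run the loop issues four calls and plans a single 10ms sleep.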
|
#
53cadf7d |
|
16-Nov-2021 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: kernel-doc fixes Fix the following issues reported by kernel-doc: $ scripts/kernel-doc -v -none arch/powerpc/kernel/rtas.c arch/powerpc/kernel/rtas.c:810: info: Scanning doc for function rtas_activate_firmware arch/powerpc/kernel/rtas.c:818: warning: contents before sections arch/powerpc/kernel/rtas.c:841: info: Scanning doc for function rtas_call_reentrant arch/powerpc/kernel/rtas.c:893: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Find a specific pseries error log in an RTAS extended event log. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211116215806.928235-1-nathanl@linux.ibm.com
|
#
c0891ac1 |
|
02-Aug-2021 |
Alexey Dobriyan <adobriyan@gmail.com> |
isystem: ship and use stdarg.h Ship minimal stdarg.h (1 type, 4 macros) as <linux/stdarg.h>. stdarg.h is the only userspace header commonly used in the kernel. GPL 2 version of <stdarg.h> can be extracted from http://archive.debian.org/debian/pool/main/g/gcc-4.2/gcc-4.2_4.2.4.orig.tar.gz Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
#
59dc5bfc |
|
17-Jun-2021 |
Nicholas Piggin <npiggin@gmail.com> |
powerpc/64s: avoid reloading (H)SRR registers if they are still valid When an interrupt is taken, the SRR registers are set to return to where it left off. Unless they are modified in the meantime, or the return address or MSR are modified, there is no need to reload these registers when returning from interrupt. Introduce per-CPU flags that track the validity of SRR and HSRR registers. These are cleared when returning from interrupt, when using the registers for something else (e.g., OPAL calls), when adjusting the return address or MSR of a context, and when context switching (which changes the return address and MSR). This improves the performance of interrupt returns. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fold in fixup patch from Nick] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210617155116.2167984-5-npiggin@gmail.com
|
#
e5d56763 |
|
08-Apr-2021 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: rename RTAS_RMOBUF_MAX to RTAS_USER_REGION_SIZE RTAS_RMOBUF_MAX doesn't actually describe a "maximum" value in any sense. It represents the size of an area of memory set aside for user space to use as work areas for certain RTAS calls. Rename it to RTAS_USER_REGION_SIZE. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210408140630.205502-6-nathanl@linux.ibm.com
|
#
0649cdc8 |
|
08-Apr-2021 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: move syscall filter setup into separate function Reduce conditionally compiled sections within rtas_initialize() by moving the filter table initialization into its own function already guarded by CONFIG_PPC_RTAS_FILTER. No behavior change intended. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210408140630.205502-5-nathanl@linux.ibm.com
|
#
0ab1c929 |
|
08-Apr-2021 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: remove ibm_suspend_me_token There's not a compelling reason to cache the value of the token for the ibm,suspend-me function. Just look it up when needed in the RTAS syscall's special case for it. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210408140630.205502-4-nathanl@linux.ibm.com
|
#
f10881a4 |
|
08-Dec-2020 |
Tyrel Datwyler <tyreld@linux.ibm.com> |
powerpc/rtas: Fix typo of ibm,open-errinjct in RTAS filter Commit bd59380c5ba4 ("powerpc/rtas: Restrict RTAS requests from userspace") introduced the following error when invoking the errinjct userspace tool: [root@ltcalpine2-lp5 librtas]# errinjct open [327884.071171] sys_rtas: RTAS call blocked - exploit attempt? [327884.071186] sys_rtas: token=0x26, nargs=0 (called by errinjct) errinjct: Could not open RTAS error injection facility errinjct: librtas: open: Unexpected I/O error The entry for ibm,open-errinjct in rtas_filter array has a typo where the "j" is omitted in the rtas call name. After fixing this typo the errinjct tool functions again as expected. [root@ltcalpine2-lp5 linux]# errinjct open RTAS error injection facility open, token = 1 Fixes: bd59380c5ba4 ("powerpc/rtas: Restrict RTAS requests from userspace") Cc: stable@vger.kernel.org Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201208195434.8289-1-tyreld@linux.ibm.com
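Why a one-letter typo blocks the call can be shown with a name-based allow-list. This is a simplified sketch under stated assumptions: the two tables and name_allowed() are hypothetical, and the real filter table carries more per-entry data than a bare name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical allow-list containing the misspelled entry. */
static const char *const filter_with_typo[] = {
	"ibm,close-errinjct",
	"ibm,errinjct",
	"ibm,open-errinct",	/* "j" omitted, as described in the commit */
};

/* The same list with the spelling corrected. */
static const char *const filter_fixed[] = {
	"ibm,close-errinjct",
	"ibm,errinjct",
	"ibm,open-errinjct",
};

/* Exact-match lookup: a call is allowed only if its name is in the table. */
static bool name_allowed(const char *const *table, size_t n, const char *name)
{
	size_t i;

	assert(table && name);

	for (i = 0; i < n; i++) {
		if (strcmp(table[i], name) == 0)
			return true;
	}
	return false;
}
```

Because the match is exact, the misspelled entry never matches the name firmware advertises, so the request from errinjct is treated as not permitted until the spelling is corrected.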
|
#
1b248817 |
|
07-Dec-2020 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: remove unused rtas_suspend_last_cpu() rtas_suspend_last_cpu() is now unused, remove it and __rtas_suspend_last_cpu() which also becomes unused. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201207215200.1785968-24-nathanl@linux.ibm.com
|
#
395b2c09 |
|
07-Dec-2020 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: remove rtas_suspend_cpu() rtas_suspend_cpu() no longer has users; remove it and __rtas_suspend_cpu() which now becomes unused as well. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201207215200.1785968-22-nathanl@linux.ibm.com
|
#
5f6665e4 |
|
07-Dec-2020 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: remove rtas_ibm_suspend_me_unsafe() rtas_ibm_suspend_me_unsafe() is now unused; remove it and rtas_percpu_suspend_me() which becomes unused as a result. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201207215200.1785968-17-nathanl@linux.ibm.com
|
#
4d756894 |
|
07-Dec-2020 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: dispatch partition migration requests to pseries sys_rtas() cannot call ibm,suspend-me directly in the same way it handles other inputs. Instead it must dispatch the request to code that can first perform the H_JOIN sequence before any call to ibm,suspend-me can succeed. Over time kernel/rtas.c has accreted a fair amount of platform-specific code to implement this. Since a different, more robust implementation of the suspend sequence is now in the pseries platform code, we want to dispatch the request there. Note that invoking ibm,suspend-me via the RTAS syscall is all but deprecated; this change preserves ABI compatibility for old programs while providing to them the benefit of the new partition suspend implementation. This is a behavior change in that the kernel performs the device tree update and firmware activation before returning, but experimentation indicates this is tolerated fine by legacy user space. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201207215200.1785968-16-nathanl@linux.ibm.com
|
#
5f485a66 |
|
07-Dec-2020 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: add rtas_activate_firmware() Provide a documented wrapper function for the ibm,activate-firmware service, which must be called after a partition migration or hibernation. If the function is absent or the call fails, the OS will continue to run normally with the current firmware, so there is no need to perform any recovery. Just log it and continue. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201207215200.1785968-6-nathanl@linux.ibm.com
|
#
701ba683 |
|
07-Dec-2020 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: add rtas_ibm_suspend_me() Now that the name is available, provide a simple wrapper for ibm,suspend-me which returns both a Linux errno and optionally the actual RTAS status to the caller. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201207215200.1785968-5-nathanl@linux.ibm.com
|
#
7049b288 |
|
07-Dec-2020 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: rtas_ibm_suspend_me -> rtas_ibm_suspend_me_unsafe The pseries partition suspend sequence requires that all active CPUs call H_JOIN, which suspends all but one of them with interrupts disabled. The "chosen" CPU is then to call ibm,suspend-me to complete the suspend. Upon returning from ibm,suspend-me, the chosen CPU is to use H_PROD to wake the joined CPUs. Using on_each_cpu() for this, as rtas_ibm_suspend_me() does to implement partition migration, is susceptible to deadlock with other users of on_each_cpu() and with users of stop_machine APIs. The callback passed to on_each_cpu() is not allowed to synchronize with other CPUs in the way it is used here. Complicating the fix is the fact that rtas_ibm_suspend_me() also occupies the function name that should be used to provide a more conventional wrapper for ibm,suspend-me. Rename rtas_ibm_suspend_me() to rtas_ibm_suspend_me_unsafe() to free up the name and indicate that it should not gain users. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201207215200.1785968-4-nathanl@linux.ibm.com
|
#
de0f7349 |
|
07-Dec-2020 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: prevent suspend-related sys_rtas use on LE While drmgr has had work in some areas to make its RTAS syscall interactions endian-neutral, its code for performing partition migration via the syscall has never worked on LE. While it is able to complete ibm,suspend-me successfully, it crashes when attempting the subsequent ibm,update-nodes call. drmgr is the only known (or plausible) user of ibm,suspend-me, ibm,update-nodes, and ibm,update-properties, so allow them only in big-endian configurations. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201207215200.1785968-2-nathanl@linux.ibm.com
|
#
bd59380c |
|
19-Aug-2020 |
Andrew Donnellan <ajd@linux.ibm.com> |
powerpc/rtas: Restrict RTAS requests from userspace A number of userspace utilities depend on making calls to RTAS to retrieve information and update various things. The existing API through which we expose RTAS to userspace exposes more RTAS functionality than we actually need, through the sys_rtas syscall, which allows root (or anyone with CAP_SYS_ADMIN) to make any RTAS call they want with arbitrary arguments. Many RTAS calls take the address of a buffer as an argument, and it's up to the caller to specify the physical address of the buffer as an argument. We allocate a buffer (the "RMO buffer") in the Real Memory Area that RTAS can access, and then expose the physical address and size of this buffer in /proc/powerpc/rtas/rmo_buffer. Userspace is expected to read this address, poke at the buffer using /dev/mem, and pass an address in the RMO buffer to the RTAS call. However, there's nothing stopping the caller from specifying whatever address they want in the RTAS call, and it's easy to construct a series of RTAS calls that can overwrite arbitrary bytes (even without /dev/mem access). Additionally, there are some RTAS calls that do potentially dangerous things and for which there are no legitimate userspace use cases. In the past, this would not have been a particularly big deal as it was assumed that root could modify all system state freely, but with Secure Boot and lockdown we need to care about this. We can't fundamentally change the ABI at this point, however we can address this by implementing a filter that checks RTAS calls against a list of permitted calls and forces the caller to use addresses within the RMO buffer. The list is based off the list of calls that are used by the librtas userspace library, and has been tested with a number of existing userspace RTAS utilities. For compatibility with any applications we are not aware of that require other calls, the filter can be turned off at build time. 
Cc: stable@vger.kernel.org Reported-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200820044512.7543-1-ajd@linux.ibm.com
|
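The filter described in the entry above can be illustrated with a minimal user-space sketch. It is not the kernel's implementation: the helper names (`call_is_permitted`, `in_rmo_buf`, `filter_allows`) are hypothetical, and the three-entry list is only a stand-in built from call names mentioned elsewhere in this log. The idea shown is the two checks the commit message names: the call must be on a permitted list, and any buffer it references must lie entirely inside the RMO buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-in for the permitted-call list (assumption, not the real table). */
static const char *const permitted_calls[] = {
	"ibm,suspend-me",
	"ibm,update-nodes",
	"ibm,update-properties",
};

/* Hypothetical helper: is the named RTAS function on the permitted list? */
static bool call_is_permitted(const char *name)
{
	size_t n = sizeof(permitted_calls) / sizeof(permitted_calls[0]);

	for (size_t i = 0; i < n; i++)
		if (strcmp(name, permitted_calls[i]) == 0)
			return true;
	return false;
}

/* Hypothetical helper: does [addr, addr + len) lie inside [base, base + size)? */
static bool in_rmo_buf(uint64_t base, uint64_t size, uint64_t addr, uint64_t len)
{
	uint64_t off;

	if (addr < base)
		return false;
	off = addr - base;
	if (off > size)
		return false;
	/* Compare against the remaining space so that addr + len cannot overflow. */
	return len <= size - off;
}

/* A call passes only if it is permitted and its buffer stays inside the RMO buffer. */
static bool filter_allows(const char *name, uint64_t rmo_base, uint64_t rmo_size,
			  uint64_t buf_addr, uint64_t buf_len)
{
	return call_is_permitted(name) &&
	       in_rmo_buf(rmo_base, rmo_size, buf_addr, buf_len);
}
```

The range check is written as an offset comparison rather than `addr + len <= base + size` so that a caller-supplied address near the top of the address space cannot wrap around and pass.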
#
ec2fc2a9 |
|
11-Jun-2020 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: don't online CPUs for partition suspend Partition suspension, used for hibernation and migration, requires that the OS place all but one of the LPAR's processor threads into one of two states prior to calling the ibm,suspend-me RTAS function: * the architected offline state (via RTAS stop-self); or * the H_JOIN hcall, which does not return until the partition resumes execution Using H_CEDE as the offline mode, introduced by commit 3aa565f53c39 ("powerpc/pseries: Add hooks to put the CPU into an appropriate offline state"), means that any threads which are offline from Linux's point of view must be moved to one of those two states before a partition suspension can proceed. This was eventually addressed in commit 120496ac2d2d ("powerpc: Bring all threads online prior to migration/hibernation"), which added code to temporarily bring up any offline processor threads so they can call H_JOIN. Conceptually this is fine, but the implementation has had multiple races with cpu hotplug operations initiated from user space[1][2][3], the error handling is fragile, and it generates user-visible cpu hotplug events which is a lot of noise for a platform feature that's supposed to minimize disruption to workloads. With commit 3aa565f53c39 ("powerpc/pseries: Add hooks to put the CPU into an appropriate offline state") reverted, this code becomes unnecessary, so remove it. Since any offline CPUs now are truly offline from the platform's point of view, it is no longer necessary to bring up CPUs only to have them call H_JOIN and then go offline again upon resuming. Only active threads are required to call H_JOIN; stopped threads can be left alone. 
[1] commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and serialization during LPM") [2] commit 9fb603050ffd ("powerpc/rtas: retry when cpu offline races with suspend/migration") [3] commit dfd718a2ed1f ("powerpc/rtas: Fix a potential race between CPU-Offline & Migration") Fixes: 120496ac2d2d ("powerpc: Bring all threads online prior to migration/hibernation") Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200612051238.1007764-3-nathanl@linux.ibm.com
|
#
b664db8e |
|
18-May-2020 |
Leonardo Bras <leobras.c@gmail.com> |
powerpc/rtas: Implement reentrant rtas call Implement rtas_call_reentrant() for reentrant rtas-calls: "ibm,int-on", "ibm,int-off", "ibm,get-xive" and "ibm,set-xive". On LoPAPR Version 1.1 (March 24, 2016), from 7.3.10.1 to 7.3.10.4, items 2 and 3 say: 2 - For the PowerPC External Interrupt option: The * call must be reentrant to the number of processors on the platform. 3 - For the PowerPC External Interrupt option: The * argument call buffer for each simultaneous call must be physically unique. So, these rtas-calls can be called in a lockless way, if using a different buffer for each cpu doing such rtas call. For this, it was suggested to add the buffer (struct rtas_args) in the PACA struct, so each cpu can have its own buffer. The PACA struct received a pointer to rtas buffer, which is allocated in the memory range available to rtas 32-bit. Reentrant rtas calls are useful to avoid deadlocks in crashing, where rtas-calls are needed, but some other thread crashed holding the rtas.lock. This is a backtrace of a deadlock from a kdump testing environment: #0 arch_spin_lock #1 lock_rtas () #2 rtas_call (token=8204, nargs=1, nret=1, outputs=0x0) #3 ics_rtas_mask_real_irq (hw_irq=4100) #4 machine_kexec_mask_interrupts #5 default_machine_crash_shutdown #6 machine_crash_shutdown #7 __crash_kexec #8 crash_kexec #9 oops_end Signed-off-by: Leonardo Bras <leobras.c@gmail.com> [mpe: Move under #ifdef PSERIES to avoid build breakage] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200518234245.200672-3-leobras.c@gmail.com
|
#
10e4850d |
|
02-Aug-2019 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: allow rescheduling while changing cpu states rtas_cpu_state_change_mask() potentially operates on scores of cpus, so explicitly allow rescheduling in the loop body. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802192926.19277-3-nathanl@linux.ibm.com
|
#
a6717c01 |
|
02-Aug-2019 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: use device model APIs and serialization during LPM The LPAR migration implementation and userspace-initiated cpu hotplug can interleave their executions like so: 1. Set cpu 7 offline via sysfs. 2. Begin a partition migration, whose implementation requires the OS to ensure all present cpus are online; cpu 7 is onlined: rtas_ibm_suspend_me -> rtas_online_cpus_mask -> cpu_up This sets cpu 7 online in all respects except for the cpu's corresponding struct device; dev->offline remains true. 3. Set cpu 7 online via sysfs. _cpu_up() determines that cpu 7 is already online and returns success. The driver core (device_online) sets dev->offline = false. 4. The migration completes and restores cpu 7 to offline state: rtas_ibm_suspend_me -> rtas_offline_cpus_mask -> cpu_down This leaves cpu7 in a state where the driver core considers the cpu device online, but in all other respects it is offline and unused. Attempts to online the cpu via sysfs appear to succeed but the driver core actually does not pass the request to the lower-level cpuhp support code. This makes the cpu unusable until the cpu device is manually set offline and then online again via sysfs. Instead of directly calling cpu_up/cpu_down, the migration code should use the higher-level device core APIs to maintain consistent state and serialize operations. Fixes: 120496ac2d2d ("powerpc: Bring all threads online prior to migration/hibernation") Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190802192926.19277-2-nathanl@linux.ibm.com
|
#
ae2e953f |
|
18-Jul-2019 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: Unexport rtas_online_cpus_mask, rtas_offline_cpus_mask These aren't used by modular code, nor should they be. Fixes: 120496ac2d2d ("powerpc: Bring all threads online prior to migration/hibernation") Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190718162214.5694-1-nathanl@linux.ibm.com
|
#
9fb60305 |
|
21-Jun-2019 |
Nathan Lynch <nathanl@linux.ibm.com> |
powerpc/rtas: retry when cpu offline races with suspend/migration The protocol for suspending or migrating an LPAR requires all present processor threads to enter H_JOIN. So if we have threads offline, we have to temporarily bring them up. This can race with administrator actions such as SMT state changes. As of dfd718a2ed1f ("powerpc/rtas: Fix a potential race between CPU-Offline & Migration"), rtas_ibm_suspend_me() accounts for this, but errors out with -EBUSY for what almost certainly is a transient condition in any reasonable scenario. Callers of rtas_ibm_suspend_me() already retry when -EAGAIN is returned, and it is typical during a migration for that to happen repeatedly for several minutes polling the H_VASI_STATE hcall result before proceeding to the next stage. So return -EAGAIN instead of -EBUSY when this race is encountered. Additionally: logging this event is still appropriate but use pr_info instead of pr_err; and remove use of unlikely() while here as this is not a hot path at all. Fixes: dfd718a2ed1f ("powerpc/rtas: Fix a potential race between CPU-Offline & Migration") Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
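The reason for preferring -EAGAIN in the entry above is that callers already loop on it. A minimal sketch of such a caller follows, assuming a hypothetical `suspend_with_retry()` wrapper and a stand-in `fake_suspend()` that reports a transient race a few times before succeeding; neither name exists in the kernel, and a real caller would also poll and sleep between attempts.

```c
#include <errno.h>

/* Hypothetical signature for an operation that may report a transient failure. */
typedef int (*suspend_fn)(void *ctx);

/*
 * Retry while the operation reports -EAGAIN, up to max_attempts calls.
 * Any other result (success or a hard error such as -EBUSY) ends the loop.
 */
static int suspend_with_retry(suspend_fn fn, void *ctx, int max_attempts)
{
	int rc = -EAGAIN;

	for (int i = 0; i < max_attempts && rc == -EAGAIN; i++)
		rc = fn(ctx);
	return rc;
}

/* Stand-in for the suspend call: fails transiently *ctx times, then succeeds. */
static int fake_suspend(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EAGAIN;
	}
	return 0;
}
```

With -EBUSY the same loop would give up on the first race; with -EAGAIN the transient condition is absorbed by the retry logic the caller already has.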
#
caa75932 |
|
13-Jun-2019 |
Nadav Amit <namit@vmware.com> |
smp: Remove smp_call_function() and on_each_cpu() return values The return value is fixed. Remove it and amend the callers. [ tglx: Fixup arm/bL_switcher and powerpc/rtas ] Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20190613064813.8102-2-namit@vmware.com
|
#
2874c5fd |
|
27-May-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
0ba9e6ed |
|
12-Mar-2019 |
Mike Rapoport <rppt@kernel.org> |
memblock: drop memblock_alloc_base() The memblock_alloc_base() function tries to allocate a memory up to the limit specified by its max_addr parameter and panics if the allocation fails. Replace its usage with memblock_phys_alloc_range() and make the callers check the return value and panic in case of error. Link: http://lkml.kernel.org/r/1548057848-15136-10-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Christoph Hellwig <hch@lst.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dennis Zhou <dennis@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Guo Ren <ren_guo@c-sky.com> [c-sky] Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Juergen Gross <jgross@suse.com> [Xen] Cc: Mark Salter <msalter@redhat.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Paul Burton <paul.burton@mips.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: Rob Herring <robh@kernel.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
dfd718a2 |
|
01-Oct-2018 |
Gautham R. Shenoy <ego@linux.vnet.ibm.com> |
powerpc/rtas: Fix a potential race between CPU-Offline & Migration Live Partition Migrations require all the present CPUs to execute the H_JOIN call, and hence rtas_ibm_suspend_me() onlines any offline CPUs before initiating the migration for this purpose. The commit 85a88cabad57 ("powerpc/pseries: Disable CPU hotplug across migrations") disables any CPU-hotplug operations once all the offline CPUs are brought online to prevent any further state change. Once the CPU-Hotplug operation is disabled, the code assumes that all the CPUs are online. However, there is a minor window in rtas_ibm_suspend_me() between onlining the offline CPUs and disabling CPU-Hotplug when a concurrent CPU-offline operation initiated by userspace can succeed, thereby nullifying the aforementioned assumption. In this unlikely case these offlined CPUs will not call H_JOIN, resulting in a system hang. Fix this by verifying that all the present CPUs are actually online after CPU-Hotplug has been disabled, failing which we restore the state of the offline CPUs in rtas_ibm_suspend_me() and return -EBUSY. Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
65b9fdad |
|
09-Oct-2018 |
Michael Bringmann <mwb@linux.vnet.ibm.com> |
powerpc/pseries/mobility: Extend start/stop topology update scope The powerpc mobility code may receive RTAS requests to perform PRRN (Platform Resource Reassignment Notification) topology changes at any time, including during LPAR migration operations. In some configurations where the affinity of CPUs or memory is being changed on that platform, the PRRN requests may apply or refer to outdated information prior to the complete update of the device-tree. This patch changes the duration for which topology updates are suppressed during LPAR migrations from just the rtas_ibm_suspend_me() / 'ibm,suspend-me' call(s) to cover the entire migration_store() operation to allow all changes to the device-tree to be applied prior to accepting and applying any PRRN requests. For tracking purposes, pr_info notices are added to the functions start_topology_update() and stop_topology_update() of 'numa.c'. Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
85a88cab |
|
17-Sep-2018 |
Nathan Fontenot <nfont@linux.vnet.ibm.com> |
powerpc/pseries: Disable CPU hotplug across migrations When performing partition migrations all present CPUs must be online as all present CPUs must make the H_JOIN call as part of the migration process. Once all present CPUs make the H_JOIN call, one CPU is returned to make the rtas call to perform the migration to the destination system. During testing of migration and changing the SMT state we have found instances where CPUs are offlined, as part of the SMT state change, before they make the H_JOIN call. This results in a hung system where every CPU is either in H_JOIN or offline. To prevent this, this patch disables CPU hotplug during the migration process. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
ac851744 |
|
19-Jun-2018 |
Paul Burton <paulburton@kernel.org> |
powerpc: Remove -Wattribute-alias pragmas With SYSCALL_DEFINEx() disabling -Wattribute-alias generically, there's no need to duplicate that for PowerPC syscalls. This reverts commit 415520373975 ("powerpc: fix build failure by disabling attribute-alias warning in pci_32") and commit 2479bfc9bc60 ("powerpc: Fix build by disabling attribute-alias warning for SYSCALL_DEFINEx"). Signed-off-by: Paul Burton <paul.burton@mips.com> Acked-by: Christophe Leroy <christophe.leroy@c-s.fr> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
#
2479bfc9 |
|
29-May-2018 |
Christophe Leroy <christophe.leroy@c-s.fr> |
powerpc: Fix build by disabling attribute-alias warning for SYSCALL_DEFINEx GCC 8.1 emits warnings such as the following. As arch/powerpc code is built with -Werror, this breaks the build with GCC 8.1. In file included from arch/powerpc/kernel/pci_64.c:23: ./include/linux/syscalls.h:233:18: error: 'sys_pciconfig_iobase' alias between functions of incompatible types 'long int(long int, long unsigned int, long unsigned int)' and 'long int(long int, long int, long int)' [-Werror=attribute-alias] asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \ ^~~ ./include/linux/syscalls.h:222:2: note: in expansion of macro '__SYSCALL_DEFINEx' __SYSCALL_DEFINEx(x, sname, __VA_ARGS__) This patch inhibits those warnings. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Trim change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
4c392e65 |
|
02-May-2018 |
Al Viro <viro@zeniv.linux.org.uk> |
powerpc/syscalls: switch rtas(2) to SYSCALL_DEFINE Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [mpe: Update sys_ni.c for s/ppc_rtas/sys_rtas/] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
58788a9b |
|
17-Oct-2017 |
Will Deacon <will@kernel.org> |
locking/arch, powerpc/rtas: Use arch_spin_lock() instead of arch_spin_lock_flags() arch_spin_lock_flags() is an internal part of the spinlock implementation and is no longer available when SMP=n and DEBUG_SPINLOCK=y, so the PPC RTAS code fails to compile in this configuration: arch/powerpc/kernel/rtas.c: In function 'lock_rtas': >> arch/powerpc/kernel/rtas.c:81:2: error: implicit declaration of function 'arch_spin_lock_flags' [-Werror=implicit-function-declaration] arch_spin_lock_flags(&rtas.lock, flags); ^~~~~~~~~~~~~~~~~~~~ Since there's no good reason to use arch_spin_lock_flags() here (the code in question already calls local_irq_save(flags)), switch it over to arch_spin_lock and get things building again. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1508327469-20231-1-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
0ee931c4 |
|
13-Sep-2017 |
Michal Hocko <mhocko@suse.com> |
mm: treewide: remove GFP_TEMPORARY allocation flag GFP_TEMPORARY was introduced by commit e12ba74d8ff3 ("Group short-lived and reclaimable kernel allocations") along with __GFP_RECLAIMABLE. Its primary motivation was to allow users to tell that an allocation is short lived and so the allocator can try to place such allocations close together and prevent long term fragmentation. As much as this sounds like a reasonable semantic it becomes much less clear when to use the highlevel GFP_TEMPORARY allocation flag. How long is temporary? Can the context holding that memory sleep? Can it take locks? It seems there is no good answer for those questions. The current implementation of GFP_TEMPORARY is basically GFP_KERNEL | __GFP_RECLAIMABLE which in itself is tricky because basically none of the existing callers provide a way to reclaim the allocated memory. So this is rather misleading and hard to evaluate for any benefits. I have checked some random users and none of them has added the flag with a specific justification. I suspect most of them just copied from other existing users and others just thought it might be a good idea to use without any measuring. This suggests that GFP_TEMPORARY just motivates for cargo cult usage without any reasoning. I believe that our gfp flags are quite complex already and especially those with highlevel semantic should be clearly defined to prevent from confusion and abuse. Therefore I propose dropping GFP_TEMPORARY and replace all existing users to simply use GFP_KERNEL. Please note that SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and so they will be placed properly for memory fragmentation prevention. I can see reasons we might want some gfp flag to reflect short-term allocations but I propose starting from a clear semantic definition and only then add users with proper justification. 
This has been brought up before LSF this year by Matthew [1] and it turned out that GFP_TEMPORARY really doesn't have a clear semantic. It seems to be a heuristic without any measured advantage for most (if not all) its current users. The follow up discussion has revealed that opinions on what might be temporary allocation differ a lot between developers. So rather than trying to tweak existing users into a semantic which they haven't expected I propose to simply remove the flag and start from scratch if we really need a semantic for short term allocations. [1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org [akpm@linux-foundation.org: fix typo] [akpm@linux-foundation.org: coding-style fixes] [sfr@canb.auug.org.au: drm/i915: fix up] Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Neil Brown <neilb@suse.de> Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
8b257783 |
|
23-Jan-2017 |
Gavin Shan <gwshan@linux.vnet.ibm.com> |
powerpc/kernel: Fix unbalanced refcount on RTAS device node The RTAS device-tree node's refcount is increased by one in the call to of_find_node_by_name(), but it is not decreased by one in the error path. This leads to an unbalanced refcount on the RTAS device-tree node. Fix this by decreasing the RTAS device-tree node's refcount in the error path. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
de6d2d1b |
|
23-Jan-2017 |
Gavin Shan <gwshan@linux.vnet.ibm.com> |
powerpc/kernel: Use of_property_read_u32() in rtas_initialize() This uses of_property_read_u32() in rtas_initialize() so that we needn't explicitly care about the CPU's endianness. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
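Device tree property cells are stored big-endian, which is why reading them through a helper such as of_property_read_u32() removes the need for explicit byte swapping in the caller. The portable sketch below shows the kind of conversion involved; `read_be32()` is a hypothetical user-space stand-in, not a kernel function, and the sample value is made up.

```c
#include <stdint.h>

/*
 * Assemble a 32-bit value from four big-endian bytes.
 * The result is the same on little-endian and big-endian hosts because
 * the bytes are combined arithmetically rather than reinterpreted in place.
 */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) |
	       ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |
	        (uint32_t)p[3];
}
```

Casting the raw property pointer to `uint32_t *` and dereferencing it would only give the right answer on a big-endian CPU, which is the pitfall the commit avoids.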
#
dbecd509 |
|
23-Jan-2017 |
Gavin Shan <gwshan@linux.vnet.ibm.com> |
powerpc/kernel: Remove nested if statements in rtas_initialize() This removes the unnecessary nested if statements in function rtas_initialize(), to simplify the code. No functional changes introduced. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
7c0f6ba6 |
|
24-Dec-2016 |
Linus Torvalds <torvalds@linux-foundation.org> |
Replace <asm/uaccess.h> with <linux/uaccess.h> globally This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
95ec77c0 |
|
11-Jul-2016 |
Daniel Axtens <dja@axtens.net> |
powerpc: Make ppc_md.{halt, restart} __noreturn powernv marks its halt and restart calls as __noreturn. However, ppc_md does not have this annotation. Add the annotation to ppc_md, and then to every halt/restart function that is missing it. Additionally, I have verified that all of these functions do not return. Occasionally I have added a spin loop to be sure. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
484cc1ed |
|
04-Jul-2016 |
Benjamin Herrenschmidt <benh@kernel.crashing.org> |
powerpc/rtas: Don't test for machine type in rtas_initialize() The test is unnecessary, the FW_FEATURE_LPAR is sufficient as there exists no other LPAR type that has RTAS. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
a9862c74 |
|
18-Mar-2016 |
Andrew Donnellan <andrew.donnellan@au1.ibm.com> |
powerpc/rtas: Fix array overrun in ppc_rtas() syscall If ppc_rtas() is called with args.nargs == 16 and args.nret == 0, args.rets is set to point to &args.args[16], which is beyond the end of the args.args array. This results in a minor read overrun of the array when we check the first return code (which, per PAPR, is a required output of all RTAS calls) to see if there's been a hardware error. Change the nargs/nret check to ensure nargs is <= 15, allowing room for the status code. Users shouldn't be calling with nret == 0, but there's no real harm if they do, so we don't stop them. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
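The bound described in the entry above can be written out as a small standalone check. This is a sketch under the assumption, taken from the commit message, that the argument array has 16 slots; `RTAS_ARG_SLOTS` and `rtas_arg_counts_ok()` are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed from the commit message: the args array holds 16 entries. */
#define RTAS_ARG_SLOTS 16u

/*
 * Return true when the caller-supplied counts leave room for the status word.
 * The return values start at &args[nargs], so nargs must be at most 15 for
 * the first return slot to be inside the array, even when nret is 0.
 */
static bool rtas_arg_counts_ok(uint32_t nargs, uint32_t nret)
{
	if (nargs >= RTAS_ARG_SLOTS)
		return false;
	if (nret > RTAS_ARG_SLOTS)
		return false;
	if (nargs + nret > RTAS_ARG_SLOTS)
		return false;
	return true;
}
```

The first test is the one the fix adds: with nargs == 16 and nret == 0 the sum check alone passes, yet reading the status word would touch the element just past the end of the array.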
#
cd5cdeb6 |
|
24-Nov-2015 |
Michael Ellerman <mpe@ellerman.id.au> |
powerpc/rtas: Make enter_rtas() private There are no longer any users of enter_rtas() outside of rtas.c, so make it "private", by moving the declaration inside rtas.c. Hopefully this will encourage people to use one of the wrappers which takes the sharp edges off the RTAS calling sequence. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
4456f452 |
|
24-Nov-2015 |
Michael Ellerman <mpe@ellerman.id.au> |
powerpc/rtas: Use rtas_call_unlocked() in call_rtas_display_status() Although call_rtas_display_status() does actually want to use the regular RTAS locking, it doesn't want the extra logic that is in rtas_call(), so currently it open codes the logic. Instead we can use rtas_call_unlocked(), after taking the RTAS lock. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
209eb4e5 |
|
16-Dec-2015 |
Michael Ellerman <mpe@ellerman.id.au> |
powerpc/rtas: Add rtas_call_unlocked() Most users of RTAS (Run-Time Abstraction Services) use rtas_call(), which deals with locking as well as endian handling. However we have two users outside of rtas.c that can't use rtas_call() because they have different locking requirements. The hotplug CPU code can't take the RTAS lock because the CPU would go offline with the lock held and no other CPUs would be able to call RTAS until the CPU came back online. The xmon code doesn't want to take the lock because it would risk deadlocking when we are trying to recover from a crash. Both sites required multiple patches when we added little endian support, proving that programmers can't do endian right. Although that ship has sailed, we can still clean the code up by providing an unlocked version of rtas_call() which avoids the need to open code the logic elsewhere. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
8832317f |
|
16-Oct-2015 |
Vasant Hegde <hegdevasant@linux.vnet.ibm.com> |
powerpc/rtas: Validate rtas.entry before calling enter_rtas() Currently we do not validate rtas.entry before calling enter_rtas(). This leads to a kernel oops when user space calls rtas system call on a powernv platform (see below). This patch adds code to validate rtas.entry before making enter_rtas() call. Oops: Exception in kernel mode, sig: 4 [#1] SMP NR_CPUS=1024 NUMA PowerNV task: c000000004294b80 ti: c0000007e1a78000 task.ti: c0000007e1a78000 NIP: 0000000000000000 LR: 0000000000009c14 CTR: c000000000423140 REGS: c0000007e1a7b920 TRAP: 0e40 Not tainted (3.18.17-340.el7_1.pkvm3_1_0.2400.1.ppc64le) MSR: 1000000000081000 <HV,ME> CR: 00000000 XER: 00000000 CFAR: c000000000009c0c SOFTE: 0 NIP [0000000000000000] (null) LR [0000000000009c14] 0x9c14 Call Trace: [c0000007e1a7bba0] [c00000000041a7f4] avc_has_perm_noaudit+0x54/0x110 (unreliable) [c0000007e1a7bd80] [c00000000002ddc0] ppc_rtas+0x150/0x2d0 [c0000007e1a7be30] [c000000000009358] syscall_exit+0x0/0x98 Cc: stable@vger.kernel.org # v3.2+ Fixes: 55190f88789a ("powerpc: Add skeleton PowerNV platform") Reported-by: NAGESWARA R. SASTRY <nasastry@in.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [mpe: Reword change log, trim oops, and add stable + fixes] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
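The guard described in the entry above can be sketched in isolation. This is a minimal user-space model, not the kernel code: `struct rtas_state`, `fake_enter_rtas()` and `guarded_rtas_call()` are hypothetical names, and the choice of `-EINVAL` as the error value is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's RTAS state; only the field
 * discussed in the commit message is modelled. */
struct rtas_state {
    unsigned long entry;   /* firmware entry point, 0 when RTAS is absent */
};

/* Hypothetical stand-in for enter_rtas(); it only counts invocations. */
static int enter_count;

static void fake_enter_rtas(void)
{
    enter_count++;
}

/* Refuse to jump into firmware when no entry point was discovered.
 * Returns 0 after the (simulated) firmware call, -EINVAL otherwise. */
static int guarded_rtas_call(const struct rtas_state *rtas)
{
    if (rtas == NULL || rtas->entry == 0)
        return -EINVAL;   /* assumed error code for "no RTAS here" */

    fake_enter_rtas();
    return 0;
}
```

On a platform without RTAS the entry address stays zero, so the check fails early instead of branching to address 0, which is what the NIP of 0000000000000000 in the oops shows.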
#
1c2cb594 |
|
16-Jul-2015 |
Thomas Huth <thuth@redhat.com> |
powerpc/rtas: Introduce rtas_get_sensor_fast() for IRQ handlers The EPOW interrupt handler uses rtas_get_sensor(), which in turn uses rtas_busy_delay() to wait for RTAS to become ready if necessary. But rtas_busy_delay() is annotated with might_sleep() and thus may not be used by interrupt handlers like the EPOW handler! This leads to the following BUG when CONFIG_DEBUG_ATOMIC_SLEEP is enabled: BUG: sleeping function called from invalid context at arch/powerpc/kernel/rtas.c:496 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.2.0-rc2-thuth #6 Call Trace: [c00000007ffe7b90] [c000000000807670] dump_stack+0xa0/0xdc (unreliable) [c00000007ffe7bc0] [c0000000000e1f14] ___might_sleep+0x134/0x180 [c00000007ffe7c20] [c00000000002aec0] rtas_busy_delay+0x30/0xd0 [c00000007ffe7c50] [c00000000002bde4] rtas_get_sensor+0x74/0xe0 [c00000007ffe7ce0] [c000000000083264] ras_epow_interrupt+0x44/0x450 [c00000007ffe7d90] [c000000000120260] handle_irq_event_percpu+0xa0/0x300 [c00000007ffe7e70] [c000000000120524] handle_irq_event+0x64/0xc0 [c00000007ffe7eb0] [c000000000124dbc] handle_fasteoi_irq+0xec/0x260 [c00000007ffe7ef0] [c00000000011f4f0] generic_handle_irq+0x50/0x80 [c00000007ffe7f20] [c000000000010f3c] __do_irq+0x8c/0x200 [c00000007ffe7f90] [c0000000000236cc] call_do_irq+0x14/0x24 [c00000007e6f39e0] [c000000000011144] do_IRQ+0x94/0x110 [c00000007e6f3a30] [c000000000002594] hardware_interrupt_common+0x114/0x180 Fix this issue by introducing a new rtas_get_sensor_fast() function that does not use rtas_busy_delay() - and thus can only be used for sensors that do not cause a BUSY condition - known as "fast" sensors. The EPOW sensor is defined to be "fast" in sPAPR - mpe. Fixes: 587f83e8dd50 ("powerpc/pseries: Use rtas_get_sensor in RAS code") Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
9ef03193 |
|
22-Jul-2015 |
Thomas Huth <thuth@redhat.com> |
powerpc/rtas: Replace magic values with defines rtas.h already has some nice #defines for RTAS return status codes - let's use them instead of hard-coded "magic" values! Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
f691fa10 |
|
29-Mar-2015 |
Michael Ellerman <mpe@ellerman.id.au> |
powerpc: Replace mem_init_done with slab_is_available() We have a powerpc specific global called mem_init_done which is "set on boot once kmalloc can be called". But that's not *quite* true. We set it at the bottom of mem_init(), and rely on the fact that mm_init() calls kmem_cache_init() immediately after that, and nothing is running in parallel. So replace it with the generic and 100% correct slab_is_available(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
c03e7374 |
|
27-Mar-2015 |
Tyrel Datwyler <tyreld@linux.vnet.ibm.com> |
powerpc/pseries: Simplify check for suspendability during suspend/migration During suspend/migration operation we must wait for the VASI state reported by the hypervisor to become Suspending prior to making the ibm,suspend-me RTAS call. Calling routines to rtas_ibm_suspend_me() pass a vasi_state variable that exposes the VASI state to the caller. This is unnecessary as the caller only really cares about the following three conditions: if there is an error we should bail out, success indicating we have suspended and woken back up so proceed to device tree update, or we are not suspendable yet so try calling rtas_ibm_suspend_me again shortly. This patch removes the extraneous vasi_state variable and simply uses the return code to communicate how to proceed. We either succeed, fail, or get -EAGAIN in which case we sleep for a second before trying to call rtas_ibm_suspend_me again. The behaviour of ppc_rtas() remains the same, but migrate_store() now returns the propagated error code on failure. Previously -1 was returned from migrate_store() in the failure case which equates to -EPERM and was clearly wrong. Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
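The three-way return convention from the entry above (success, hard failure, or -EAGAIN meaning "not suspendable yet") can be sketched as a caller-side loop. Everything here is a hypothetical user-space model: `struct fake_partition`, `fake_suspend_me()` and `migrate()` are invented for illustration, and the sleep is simulated by a counter rather than a real delay.

```c
#include <errno.h>

/* Hypothetical model of a partition that becomes suspendable after a
 * few polls of the hypervisor state. */
struct fake_partition {
    int polls_until_ready; /* remaining "not suspendable yet" answers */
    int seconds_slept;     /* total simulated sleep time */
    int fail;              /* non-zero: report a hard error */
};

/* Stand-in for rtas_ibm_suspend_me(): returns 0 on success, -EAGAIN
 * while the partition is not yet suspendable, or another negative
 * errno on failure. */
static int fake_suspend_me(struct fake_partition *p)
{
    if (p->fail)
        return -EIO;
    if (p->polls_until_ready > 0) {
        p->polls_until_ready--;
        return -EAGAIN;
    }
    return 0;
}

/* Caller-side loop: retry only on -EAGAIN and propagate every other
 * return code unchanged, instead of collapsing failures to -1. */
static int migrate(struct fake_partition *p)
{
    int rc;

    do {
        rc = fake_suspend_me(p);
        if (rc == -EAGAIN)
            p->seconds_slept += 1; /* the real caller would sleep here */
    } while (rc == -EAGAIN);

    return rc;
}
```

Propagating the real error code matters because a bare -1 is numerically the same as -EPERM, which misreports the cause to user space.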
#
3df76a9d |
|
20-Jan-2015 |
Cyril Bur <cyrilbur@gmail.com> |
powerpc/pseries: Fix endian problems with LE migration RTAS events require arguments be passed in big endian while hypercalls have their arguments passed in registers and the values should therefore be in CPU endian. The "ibm,suspend-me" 'RTAS' call makes a sequence of hypercalls to set up one true RTAS call. This means that "ibm,suspend-me" is handled specially in the ppc_rtas() syscall. The ppc_rtas() syscall has its arguments in big endian and can therefore pass these arguments directly to the RTAS call. "ibm,suspend-me" is handled specially from within ppc_rtas() (by calling rtas_ibm_suspend_me()) which has left an endian bug on little endian systems due to the requirement of hypercalls. The return value from rtas_ibm_suspend_me() gets returned in cpu endian, and is left unconverted, also a bug on little endian systems. rtas_ibm_suspend_me() does not actually make use of the rtas_args that it is passed. This patch removes the convoluted use of the rtas_args struct to pass params to rtas_ibm_suspend_me() in favour of passing what it needs as actual arguments. This patch also ensures the two callers of rtas_ibm_suspend_me() pass function parameters in cpu endian and in the case of ppc_rtas(), converts the return value. migrate_store() (the other caller of rtas_ibm_suspend_me()) is from a sysfs file which deals with everything in cpu endian so this function only underwent cleanup. This patch has been tested with KVM both LE and BE and on PowerVM both LE and BE. Under QEMU/KVM the migration happens without touching these code paths. For PowerVM there is no obvious regression on BE and the LE code path now provides the correct parameters to the hypervisor. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
14ed7409 |
|
17-Sep-2014 |
Anton Blanchard <anton@samba.org> |
powerpc: Remove some old bootmem related comments Now bootmem is gone from powerpc we can remove comments mentioning it. Signed-off-by: Anton Blanchard <anton@samba.org> Tested-by: Emil Medve <Emilian.Medve@Freescale.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
#
9d0c4dfe |
|
01-Apr-2014 |
Rob Herring <robh@kernel.org> |
of/fdt: update of_get_flat_dt_prop in prep for libfdt Make of_get_flat_dt_prop arguments compatible with libfdt fdt_getprop call in preparation to convert FDT code to use libfdt. Make the return value const and the property length ptr type an int. Signed-off-by: Rob Herring <robh@kernel.org> Tested-by: Michal Simek <michal.simek@xilinx.com> Tested-by: Grant Likely <grant.likely@linaro.org> Tested-by: Stephen Chivers <schivers@csc.com>
|
#
a08a53ea |
|
04-Apr-2014 |
Greg Kurz <groug@kaod.org> |
powerpc/le: Enable RTAS events support The current kernel code assumes big endian and parses RTAS events all wrong. The most visible effect is that we cannot honor EPOW events, meaning, for example, we cannot shut down a guest properly from the hypervisor. This new patch is largely inspired by Nathan's work: we get rid of all the bit fields in the RTAS event structures (even the unused ones, for consistency). We also introduce endian safe accessors for the fields used by the kernel (trivial rtas_error_type() accessor added for consistency). Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
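The idea in the entry above, replacing bit fields with accessors that work on whole bytes, can be sketched as follows. The header layout (severity in the top three bits of the second byte, disposition in the next two, a big-endian 32-bit length) is an assumption made for illustration, and `struct error_log_header` and the `elog_*` helpers are hypothetical names rather than the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical event header stored as raw bytes, so the meaning of
 * each field does not depend on compiler bit-field ordering or on
 * host byte order. */
struct error_log_header {
    uint8_t byte0;                  /* version */
    uint8_t byte1;                  /* assumed: severity:3 disposition:2 extended:1 reserved:2 */
    uint8_t byte2;
    uint8_t byte3;
    uint8_t extended_log_length[4]; /* big-endian length of the extended log */
};

/* Severity: top three bits of byte1. */
static unsigned int elog_severity(const struct error_log_header *h)
{
    return (h->byte1 & 0xE0u) >> 5;
}

/* Disposition: the two bits below the severity. */
static unsigned int elog_disposition(const struct error_log_header *h)
{
    return (h->byte1 & 0x18u) >> 3;
}

/* Assemble the big-endian length byte by byte; correct on any host. */
static uint32_t elog_extended_length(const struct error_log_header *h)
{
    return ((uint32_t)h->extended_log_length[0] << 24) |
           ((uint32_t)h->extended_log_length[1] << 16) |
           ((uint32_t)h->extended_log_length[2] << 8) |
            (uint32_t)h->extended_log_length[3];
}
```

Masks and shifts on single bytes give the same answer on big and little endian builds, whereas C bit fields are laid out in an implementation-defined order.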
#
599d2870 |
|
19-Mar-2014 |
Greg Kurz <groug@kaod.org> |
powerpc/le: Big endian arguments for ppc_rtas() The ppc_rtas() syscall allows userspace to interact directly with RTAS. For the moment, it assumes everything is big endian and returns either EINVAL or EFAULT when called in a little endian environment. As suggested by Benjamin, to avoid bugs when userspace wants to pass a non-32-bit value to RTAS, it is far better to stick with a simple rationale: ppc_rtas() should be called with a big endian rtas_args structure. With this patch, it is now up to userspace to forge big endian arguments, as expected by RTAS. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
27128264 |
|
06-Aug-2013 |
Anton Blanchard <anton@samba.org> |
powerpc: Make RTAS calls endian safe RTAS expects arguments in the call buffer to be big endian so we need to byteswap on little endian builds Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
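As a rough illustration of the byteswapping described above, the sketch below fills a call buffer with big-endian words on any host. It is a portable user-space model: `struct call_buffer`, `to_be32()`, `from_be32()` and `fill_call_buffer()` are hypothetical names, and the 16-slot argument array is an assumption. The kernel itself uses its own byte-order helpers.

```c
#include <stdint.h>
#include <string.h>

#define CALL_BUFFER_SLOTS 16   /* assumed size of the argument array */

/* Hypothetical call buffer; every word is stored big-endian. */
struct call_buffer {
    uint32_t token;
    uint32_t nargs;
    uint32_t nret;
    uint32_t args[CALL_BUFFER_SLOTS];
};

/* Produce a word whose in-memory bytes are big-endian, regardless of
 * the byte order of the host running this code. */
static uint32_t to_be32(uint32_t v)
{
    uint8_t b[4] = { (uint8_t)(v >> 24), (uint8_t)(v >> 16),
                     (uint8_t)(v >> 8), (uint8_t)v };
    uint32_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

/* Inverse of to_be32(): read big-endian bytes back into a host value. */
static uint32_t from_be32(uint32_t v)
{
    uint8_t b[4];

    memcpy(b, &v, sizeof(b));
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Convert host-order inputs while filling the buffer; return slots
 * are cleared. Returns 0 on success, -1 if the counts do not fit. */
static int fill_call_buffer(struct call_buffer *buf, uint32_t token,
                            uint32_t nargs, uint32_t nret,
                            const uint32_t *cpu_args)
{
    uint32_t i;

    if (nargs > CALL_BUFFER_SLOTS || nret > CALL_BUFFER_SLOTS - nargs)
        return -1;

    buf->token = to_be32(token);
    buf->nargs = to_be32(nargs);
    buf->nret = to_be32(nret);
    for (i = 0; i < nargs; i++)
        buf->args[i] = to_be32(cpu_args[i]);
    for (i = 0; i < nret; i++)
        buf->args[nargs + i] = 0;
    return 0;
}
```

On a big endian build both conversions are no-ops, so the same source serves both byte orders.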
#
08bc1dc5 |
|
06-Aug-2013 |
Anton Blanchard <anton@samba.org> |
powerpc: Make RTAS device tree accesses endian safe Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
061d19f2 |
|
24-Jun-2013 |
Paul Gortmaker <paul.gortmaker@windriver.com> |
powerpc: Delete __cpuinit usage from all users The __cpuinit type of throwaway sections might have made sense some time ago when RAM was more constrained, but now the savings do not offset the cost and complications. For example, the fix in commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time") is a good example of the nasty type of bugs that can be created with improper use of the various __init prefixes. After a discussion on LKML[1] it was decided that cpuinit should go the way of devinit and be phased out. Once all the users are gone, we can then finally remove the macros themselves from linux/init.h. This removes all the powerpc uses of the __cpuinit macros. There are no __CPUINIT users in assembly files in powerpc. [1] https://lkml.org/lkml/2013/5/20/589 Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Josh Boyer <jwboyer@gmail.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
120496ac |
|
06-May-2013 |
Robert Jennings <rcj@linux.vnet.ibm.com> |
powerpc: Bring all threads online prior to migration/hibernation This patch brings online all threads which are present but not online prior to migration/hibernation. After migration/hibernation those threads are taken back offline. During migration/hibernation all online CPUs must call H_JOIN, this is required by the hypervisor. Without this patch, threads that are offline (H_CEDE'd) will not be woken to make the H_JOIN call and the OS will be deadlocked (all threads either JOIN'd or CEDE'd). Cc: <stable@kernel.org> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
f459d63e |
|
02-Oct-2012 |
Nathan Fontenot <nfont@linux.vnet.ibm.com> |
powerpc+of: Remove the pSeries_reconfig.h file Remove the pSeries_reconfig.h header file. At this point there is only one definition in the file, pSeries_coalesce_init(), which can be moved to rtas.h. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Acked-by: Rob Herring <rob.herring@calxeda.com> Acked-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
ae3a197e |
|
28-Mar-2012 |
David Howells <dhowells@redhat.com> |
Disintegrate asm/system.h for PowerPC Disintegrate asm/system.h for PowerPC. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> cc: linuxppc-dev@lists.ozlabs.org
|
#
6431f208 |
|
21-Mar-2012 |
Anton Blanchard <anton@samba.org> |
powerpc: Make function that parses RTAS error logs global The IO event interrupt code has a function that finds specific sections in an RTAS error log. We want to use it in the EPOW code so make it global. Rename things to make it less cryptic: find_xelog_section() -> get_pseries_errorlog() struct pseries_elog_section -> struct pseries_errorlog Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
444080d1 |
|
10-Jan-2012 |
Brian King <brking@linux.vnet.ibm.com> |
powerpc/pseries: Fix partition migration hang in stop_topology_update This fixes a hang that was observed during live partition migration. Since stop_topology_update must not be called from an interrupt context, call it earlier in the migration process. The hang observed can be seen below: WARNING: at kernel/timer.c:1011 Modules linked in: ip6t_LOG xt_tcpudp xt_pkttype ipt_LOG xt_limit ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6table_filter ip6_tables x_tables ipv6 fuse loop ibmveth sg ext3 jbd mbcache raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid10 raid1 raid0 scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc dm_round_robin dm_multipath scsi_dh sd_mod crc_t10dif ibmvfc scsi_transport_fc scsi_tgt scsi_mod dm_snapshot dm_mod NIP: c0000000000c52d8 LR: c00000000004be28 CTR: 0000000000000000 REGS: c00000005ffd77d0 TRAP: 0700 Not tainted (3.2.0-git-00001-g07d106d) MSR: 8000000000021032 <ME,CE,IR,DR> CR: 48000084 XER: 00000001 CFAR: c00000000004be20 TASK = c00000005ec78860[0] 'swapper/3' THREAD: c00000005ec98000 CPU: 3 GPR00: 0000000000000001 c00000005ffd7a50 c000000000fbbc98 c000000000ec8340 GPR04: 00000000282a0020 0000000000000000 0000000000004000 0000000000000101 GPR08: 0000000000000012 c00000005ffd4000 0000000000000020 c000000000f3ba88 GPR12: 0000000000000000 c000000007f40900 0000000000000001 0000000000000004 GPR16: 0000000000000001 0000000000000000 0000000000000000 c000000001022310 GPR20: 0000000000000001 0000000000000000 0000000000200200 c000000001029e14 GPR24: 0000000000000000 0000000000000001 0000000000000040 c00000003f74bc80 GPR28: c00000003f74bc84 c000000000f38038 c000000000f16b58 c000000000ec8340 NIP [c0000000000c52d8] .del_timer_sync+0x28/0x60 LR [c00000000004be28] .stop_topology_update+0x20/0x38 Call Trace: [c00000005ffd7a50] 
[c00000005ec78860] 0xc00000005ec78860 (unreliable) [c00000005ffd7ad0] [c00000000004be28] .stop_topology_update+0x20/0x38 [c00000005ffd7b40] [c000000000028378] .__rtas_suspend_last_cpu+0x58/0x260 [c00000005ffd7bf0] [c0000000000fa230] .generic_smp_call_function_interrupt+0x160/0x358 [c00000005ffd7cf0] [c000000000036ec8] .smp_ipi_demux+0x88/0x100 [c00000005ffd7d80] [c00000000005c154] .icp_hv_ipi_action+0x5c/0x80 [c00000005ffd7e00] [c00000000012a088] .handle_irq_event_percpu+0x100/0x318 [c00000005ffd7f00] [c00000000012e774] .handle_percpu_irq+0x84/0xd0 [c00000005ffd7f90] [c000000000022ba8] .call_handle_irq+0x1c/0x2c [c00000005ec9ba20] [c00000000001157c] .do_IRQ+0x22c/0x2a8 [c00000005ec9bae0] [c0000000000054bc] hardware_interrupt_entry+0x18/0x1c Exception: 501 at .cpu_idle+0x194/0x2f8 LR = .cpu_idle+0x194/0x2f8 [c00000005ec9bdd0] [c000000000017e58] .cpu_idle+0x188/0x2f8 (unreliable) [c00000005ec9be90] [c00000000067ec18] .start_secondary+0x3e4/0x524 [c00000005ec9bf90] [c0000000000093e8] .start_secondary_prolog+0x10/0x14 Instruction dump: ebe1fff8 4e800020 fbe1fff8 7c0802a6 f8010010 7c7f1b78 f821ff81 78290464 80090014 5400019e 7c0000d0 78000fe0 <0b000000> 4800000c 7c210b78 7c421378 Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
4b16f8e2 |
|
22-Jul-2011 |
Paul Gortmaker <paul.gortmaker@windriver.com> |
powerpc: various straight conversions from module.h --> export.h All these files were including module.h just for the basic EXPORT_SYMBOL infrastructure. We can shift them off to the export.h header which is a way smaller footprint and thus realize some compile time gains. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
#
60063497 |
|
26-Jul-2011 |
Arun Sharma <asharma@fb.com> |
atomic: use <linux/atomic.h> This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <asharma@fb.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
c5f41752 |
|
25-Jul-2011 |
Amerigo Wang <amwang@redhat.com> |
notifiers: sys: move reboot notifiers into reboot.h It is not necessary to share the same notifier.h. This patch already moves register_reboot_notifier() and unregister_reboot_notifier() from kernel/notifier.c to kernel/sys.c. [amwang@redhat.com: make allyesconfig succeed on ppc64] Signed-off-by: WANG Cong <amwang@redhat.com> Cc: David Miller <davem@davemloft.net> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Greg KH <greg@kroah.com> Signed-off-by: WANG Cong <amwang@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
9ee820fa |
|
04-May-2011 |
Brian King <brking@linux.vnet.ibm.com> |
powerpc/pseries: Add page coalescing support Adds support for page coalescing, which is a feature on IBM Power servers which allows for coalescing identical pages between logical partitions. Hint text pages as coalesce candidates, since they are the most likely pages to be able to be coalesced between partitions. This patch also exports some page coalescing statistics available from firmware via lparcfg. [BenH: Moved a couple of things around to fix compile problems] Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
eca590f4 |
|
06-Apr-2011 |
Anton Blanchard <anton@samba.org> |
powerpc/rtas: Only sleep in rtas_busy_delay if we have useful work to do RTAS returns extended error codes as a hint of how long the OS might want to wait before retrying a call. If we have nothing else useful to do we may as well call back straight away. This was found when testing the new dynamic dma window feature. Firmware split the zeroing of the TCE table into 32k chunks but returned 9901 (which is a suggested wait of 10ms). All up this took about 10 minutes to complete since msleep is jiffies based and will round 10ms up to 20ms. With the patch below we take 3 seconds to complete the same test. The hint firmware is returning in the RTAS call should definitely be decreased, but even if we slept 1ms each iteration this would take 32s. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
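The delay hint and the "only sleep when something else can run" decision from the entry above can be sketched like this. The 9901 -> 10ms mapping comes from the commit message; the rest of the table (9900 through 9905 meaning 10^n milliseconds, and -2 meaning plain busy) is my understanding of the RTAS status codes and should be treated as an assumption. `delay_hint_ms()` and `sleep_ms_for()` are hypothetical names, and `other_work_pending` stands in for whatever scheduler check the kernel really uses.

```c
/* Status values assumed for this sketch. */
#define RTAS_BUSY               (-2)
#define RTAS_EXTENDED_DELAY_MIN 9900
#define RTAS_EXTENDED_DELAY_MAX 9905

/* Translate a status code into the suggested wait in milliseconds.
 * Returns 0 when the status carries no delay hint. */
static unsigned int delay_hint_ms(int status)
{
    unsigned int ms = 0;

    if (status == RTAS_BUSY) {
        ms = 1;
    } else if (status >= RTAS_EXTENDED_DELAY_MIN &&
               status <= RTAS_EXTENDED_DELAY_MAX) {
        int order = status - RTAS_EXTENDED_DELAY_MIN;

        /* 9900 -> 1ms, 9901 -> 10ms, ..., 9905 -> 100000ms */
        for (ms = 1; order > 0; order--)
            ms *= 10;
    }
    return ms;
}

/* Decide how long to actually sleep: honour the hint only when some
 * other task could use the CPU, otherwise retry straight away. */
static unsigned int sleep_ms_for(int status, int other_work_pending)
{
    unsigned int ms = delay_hint_ms(status);

    return other_work_pending ? ms : 0;
}
```

With roughly 32,000 chunked calls, a 20ms sleep per call adds up to about ten minutes, which matches the figures quoted in the commit message.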
#
3b7a27db |
|
30-Nov-2010 |
Jesse Larrew <jlarrew@linux.vnet.ibm.com> |
powerpc: Disable VPHN polling during a suspend operation Tie the polling mechanism into the ibm,suspend-me rtas call to stop/restart polling before/after a suspend, hibernate, migrate, or checkpoint restart operation. This ensures that the system has a chance to disable the polling if the partition is migrated to a system that does not support VPHN (and vice versa). Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
d8862be1 |
|
10-Sep-2010 |
Nathan Fontenot <nfont@austin.ibm.com> |
powerpc/pseries: Export rtas_ibm_suspend_me() Export the rtas_ibm_suspend_me() routine. This is needed to perform partition migration in the kernel. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
cd3db0c4 |
|
06-Jul-2010 |
Benjamin Herrenschmidt <benh@kernel.crashing.org> |
memblock: Remove rmo_size, bury it in arch/powerpc where it belongs The RMA (RMO is a misnomer) is a concept specific to ppc64 (in fact server ppc64 though I hijack it on embedded ppc64 for similar purposes) and represents the area of memory that can be accessed in real mode (aka with MMU off), or on embedded, from the exception vectors (which is bolted in the TLB) which pretty much boils down to the same thing. We take that out of the generic MEMBLOCK data structure and move it into arch/powerpc where it belongs, renaming it to "RMA" while at it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
95f72d1e |
|
11-Jul-2010 |
Yinghai Lu <yinghai@kernel.org> |
lmb: rename to memblock via following scripts FILES=$(find * -type f | grep -vE 'oprofile|[^K]config') sed -i \ -e 's/lmb/memblock/g' \ -e 's/LMB/MEMBLOCK/g' \ $FILES for N in $(find . -name lmb.[ch]); do M=$(echo $N | sed 's/lmb/memblock/g') mv $N $M done and remove some wrong change like lmbench and dlmb etc. also move memblock.c from lib/ to mm/ Suggested-by: Ingo Molnar <mingo@elte.hu> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
8fe93f8d |
|
06-Jul-2010 |
Brian King <brking@linux.vnet.ibm.com> |
powerpc/pseries: Migration code reorganization / hibernation prep Partition hibernation will use some of the same code as is currently used for Live Partition Migration. This function further abstracts this code such that code outside of rtas.c can utilize it. It also changes the error field in the suspend me data structure to be an atomic type, since it is set and checked on different cpus without any barriers or locking. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
e9bbc8cd |
|
17-Feb-2010 |
Anton Blanchard <anton@samba.org> |
powerpc/pseries: Call ibm,os-term if the ibm,extended-os-term is present We have had issues in the past with ibm,os-term initiating shutdown of a partition. This is confusing to the user, especially if panic_timeout is non zero. The temporary fix was to avoid calling ibm,os-term if a panic_timeout was set and since we set it on every boot we basically never call ibm,os-term. An extended version of ibm,os-term has since been implemented which gives us the behaviour we want: "When the platform supports extended ibm,os-term behavior, the return to the RTAS will always occur unless there is a kernel assisted dump active as initiated by an ibm,configure-kernel-dump call." This patch checks for the ibm,extended-os-term property and calls ibm,os-term if it exists. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
5a0e3ad6 |
|
24-Mar-2010 |
Tejun Heo <tj@kernel.org> |
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities to include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the following. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and tries to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. 
This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually widely available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build tests were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
|
#
0199c4e6 |
|
02-Dec-2009 |
Thomas Gleixner <tglx@linutronix.de> |
locking: Convert __raw_spin* functions to arch_spin* Name space cleanup. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: linux-arch@vger.kernel.org
|
#
edc35bd7 |
|
02-Dec-2009 |
Thomas Gleixner <tglx@linutronix.de> |
locking: Rename __RAW_SPIN_LOCK_UNLOCKED to __ARCH_SPIN_LOCK_UNLOCKED Further name space cleanup. No functional change Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: linux-arch@vger.kernel.org
|
#
445c8951 |
|
02-Dec-2009 |
Thomas Gleixner <tglx@linutronix.de> |
locking: Convert raw_spinlock to arch_spinlock The raw_spin* namespace was taken by lockdep for the architecture specific implementations. raw_spin_* would be the ideal name space for the spinlocks which are not converted to sleeping locks in preempt-rt. Linus suggested to convert the raw_ to arch_ locks and clean up the name space instead of using an artificial name like core_spin, atomic_spin or whatever. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: linux-arch@vger.kernel.org
|
#
46db2f86 |
|
27-Aug-2009 |
Brian King <brking@linux.vnet.ibm.com> |
powerpc/pseries: Fix to handle slb resize across migration The SLB can change sizes across a live migration, which was not being handled, resulting in possible machine crashes during migration if migrating to a machine which has a smaller max SLB size than the source machine. Fix this by first reducing the SLB size to the minimum possible value, which is 32, prior to migration. Then during the device tree update which occurs after migration, we make the call to ensure the SLB gets updated. Also add the slb_size to the lparcfg output so that the migration tools can check to make sure the kernel has this capability before allowing migration in scenarios where the SLB size will change. BenH: Fixed #include <asm/mmu-hash64.h> -> <asm/mmu.h> to avoid breaking ppc32 build Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
c4007a2f |
|
16-Jun-2009 |
Benjamin Herrenschmidt <benh@kernel.crashing.org> |
powerpc: Use one common impl. of RTAS timebase sync and use raw spinlock Several platforms use their own copy of what is essentially the same code, using RTAS to synchronize the timebases when bringing up new CPUs. This moves it all into a single common implementation and additionally turns the spinlock into a raw spinlock since the former can rely on the timebase not being frozen when spinlock debugging is enabled, and finally masks interrupts while the timebase is disabled. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
f97bb36f |
|
16-Jun-2009 |
Benjamin Herrenschmidt <benh@kernel.crashing.org> |
powerpc/rtas: Turn rtas lock into a raw spinlock RTAS currently uses a normal spinlock. However it can be called from contexts where this is not necessarily a good idea. For example, it can be called while syncing timebases, with the core timebase being frozen. Unfortunately, that will deadlock in case of lock contention when spinlock debugging is enabled as the spin lock debugging code will try to use __delay() which ... relies on the timebase being enabled. Also RTAS can be used in some low level IRQ handling code path so it may as well be a raw spinlock for -rt sake. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
f52862f4 |
|
16-Feb-2009 |
Brian King <brking@linux.vnet.ibm.com> |
powerpc/pseries: Fix partition migration hang under load While testing partition migration with heavy CPU load using shared processors, it was observed that sometimes the migration would never complete and would appear to hang. Currently, the migration code assumes that if H_SUCCESS is returned from the H_JOIN then the migration is complete and the processor is waking up on the target system. If there was an outstanding PROD to the processor when the H_JOIN is called, however, it will return H_SUCCESS on the source system, causing the migration to hang, or in some scenarios cause the kernel to crash on the complete call waking the caller of rtas_percpu_suspend_me. Fix this by calling H_JOIN multiple times if necessary during the migration. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
#
edc72ac4 |
|
11-Dec-2008 |
Nathan Lynch <ntl@pobox.com> |
powerpc/pseries: Check for GIQ indicator before calling set-indicator Since "Factor out cpu joining/unjoining the GIQ" (b4963255ad5a426f04a0bb15c4315fa4bb40cde9) the WARN_ON in xics_set_cpu_giq() is being triggered during boot on JS20 because the GIQ indicator is not available on that platform. While the warning is harmless and the system runs normally, it's nicer to check for the existence of the indicator before trying to manipulate it. Implement rtas_indicator_present(), which searches the /rtas/rtas-indicators property for the given indicator token, and use this function in xics_set_cpu_giq(). Also use a WARN statement in xics_set_cpu_giq to get better information on failure. Signed-off-by: Nathan Lynch <ntl@pobox.com> Acked-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
b79998fc |
|
30-Jul-2008 |
Nathan Fontenot <nfont@austin.ibm.com> |
powerpc: Zero fill the return values of rtas argument buffer The kernel copy of the rtas args struct contains the return value(s) for the specified rtas call. These are copied back to user space with the assumption that every value has been set by the rtas call, which turns out to be not always true. Thus userspace can see random values and think the call failed when in fact it succeeded, but for some reason didn't set one of the return values. This fixes the problem by zeroing out the return value fields of the rtas args struct before processing the rtas call. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
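The fix can be illustrated with a small userspace sketch. It assumes a simplified argument buffer in which the return slots directly follow the input arguments; `struct sketch_rtas_args`, `SKETCH_MAX_ARGS` and `sketch_zero_rets()` are hypothetical names chosen for this example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_ARGS 16

/* Simplified stand-in for the RTAS argument buffer. */
struct sketch_rtas_args {
    uint32_t token;
    uint32_t nargs;                 /* number of input arguments */
    uint32_t nret;                  /* number of return values */
    uint32_t args[SKETCH_MAX_ARGS]; /* inputs first, return slots after */
};

/*
 * Zero the return slots before the call is made, so a call that leaves
 * a slot unset cannot leak stale data back to the caller.
 * Returns 0 on success, -1 if the counts do not fit in the buffer.
 */
int sketch_zero_rets(struct sketch_rtas_args *a)
{
    assert(a != NULL);

    if (a->nargs > SKETCH_MAX_ARGS || a->nret > SKETCH_MAX_ARGS ||
        a->nargs + a->nret > SKETCH_MAX_ARGS)
        return -1;

    memset(&a->args[a->nargs], 0, a->nret * sizeof(a->args[0]));
    return 0;
}
```

Only the return slots are cleared; the input arguments and any unused tail of the buffer are left untouched.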
|
#
15c8b6c1 |
|
09-May-2008 |
Jens Axboe <jens.axboe@oracle.com> |
on_each_cpu(): kill unused 'retry' parameter It's not even passed on to smp_call_function() anymore, since that was removed. So kill it. Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
#
1c21a293 |
|
07-May-2008 |
Michael Ellerman <michael@ellerman.id.au> |
[POWERPC] Fix sparse warnings in arch/powerpc/kernel

Make a few things static in lparcfg.c
Make init and exit routines static in rtas_flash.c
Make things static in rtas_pci.c
Make some functions static in rtas.c
Make fops static in rtas-proc.c
Remove unneeded extern for do_gtod in smp.c
Make clocksource_init() static in time.c
Make last_tick_len and ticklen_to_xs static in time.c
Move the declaration of the pvr per-cpu into smp.h
Make kexec_smp_down() and kexec_stack static in machine_kexec_64.c
Don't return void in arch_teardown_msi_irqs() in msi.c
Move declaration of GregorianDay() into asm/time.h

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
950e4da3 |
|
26-Feb-2008 |
Matthew Wilcox <willy@infradead.org> |
arch: Remove unnecessary inclusions of asm/semaphore.h None of these files use any of the functionality promised by asm/semaphore.h. It's possible that they rely on it dragging in some unrelated header file, but I can't build all these files, so we'll have to fix any build failures as they come up. Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
|
#
e48b1b45 |
|
28-Mar-2008 |
Harvey Harrison <harvey.harrison@gmail.com> |
[POWERPC] Replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
d9b2b2a2 |
|
13-Feb-2008 |
David S. Miller <davem@davemloft.net> |
[LIB]: Make PowerPC LMB code generic so sparc64 can use it too. Signed-off-by: David S. Miller <davem@davemloft.net>
|
#
8f515061 |
|
02-Dec-2007 |
Paul Mackerras <paulus@samba.org> |
Revert "[POWERPC] Fix RTAS os-term usage on kernel panic" This reverts commit a2b51812a4dc5db09ab4d4638d4d8ed456e2457e. It turns out that this change caused some machines to fail to come back up when being rebooted, and generated an error in the hypervisor error log on some machines. The platform architecture (PAPR) is a little unclear on exactly when the RTAS ibm,os-term function should be called. Until that is clarified I'm reverting this commit. Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
a2b51812 |
|
19-Nov-2007 |
Linas Vepstas <linas@austin.ibm.com> |
[POWERPC] Fix RTAS os-term usage on kernel panic The rtas_os_term() routine was being called at the wrong time. The actual rtas call "os-term" will not ever return, and so calling it from the panic notifier is too early. Instead, call it from the machine_reset() call. This splits the rtas_os_term() routine into two: one part to capture the kernel panic message, invoked during the panic notifier, and another part that is invoked during machine_reset(). Prior to this patch, the os-term call was never being made, because panic_timeout was always non-zero. Calling os-term helps keep the hypervisor happy! We have to keep the hypervisor happy to avoid service, dump and error reporting problems. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
8f5c7579 |
|
13-Nov-2007 |
Nathan Lynch <ntl@pobox.com> |
[POWERPC] Fix multiple bugs in rtas_ibm_suspend_me code

There are several issues with the rtas_ibm_suspend_me code, which enables platform-assisted suspension of an LPAR as covered in PAPR 2.2.

1.) rtas_ibm_suspend_me uses on_each_cpu() to invoke rtas_percpu_suspend_me on all cpus via IPI:

    if (on_each_cpu(rtas_percpu_suspend_me, &data, 1, 0))
    ...

'data' is on the calling task's stack, but rtas_ibm_suspend_me takes no measures to ensure that all instances of rtas_percpu_suspend_me are finished accessing 'data' before returning. This can result in the IPI'd cpus accessing random stack data and getting stuck in H_JOIN. This is addressed by using an atomic count of workers and a completion on the stack.

2.) rtas_percpu_suspend_me is needlessly calling H_JOIN in a loop. The only event that can cause a cpu to return from H_JOIN is an H_PROD from another cpu or a NMI/system reset. Each cpu need call H_JOIN only once per suspend operation. Remove the loop and the now unnecessary 'waiting' state variable.

3.) H_JOIN must be called with MSR[EE] off, but lazy interrupt disabling may cause the caller of rtas_ibm_suspend_me to call H_JOIN with it on; the local_irq_disable() in on_each_cpu() is not sufficient. Fix this by explicitly saving the MSR and clearing the EE bit before calling H_JOIN.

4.) H_PROD is being called with the Linux logical cpu number as the parameter, not the platform interrupt server value. (It's also being called for all possible cpus, which is harmless, but unnecessary.) This is fixed by calling H_PROD for each online cpu using get_hard_smp_processor_id(cpu) for the argument.

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
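Item 4 above, prodding only online cpus and passing the hardware id rather than the logical cpu number, can be sketched as follows. This is a userspace illustration only: `struct sketch_cpu` and `sketch_prod_targets()` are hypothetical, and the array lookup stands in for get_hard_smp_processor_id(cpu).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cpu record: array index = Linux logical cpu number. */
struct sketch_cpu {
    bool online;
    int hw_id; /* platform interrupt server value */
};

/*
 * Fill `out` with the hardware ids that should receive a prod and
 * return how many were written. Offline cpus are skipped, and the
 * logical index itself is never used as the prod argument.
 */
size_t sketch_prod_targets(const struct sketch_cpu *cpus, size_t ncpus, int *out)
{
    size_t n = 0;

    assert(cpus != NULL && out != NULL);

    for (size_t cpu = 0; cpu < ncpus; cpu++) {
        if (!cpus[cpu].online)
            continue;
        out[n++] = cpus[cpu].hw_id;
    }
    return n;
}
```

With three cpus whose hardware ids are 8, 12 and 16 and the middle one offline, the targets come out as 8 and 16, not the logical numbers 0 and 2.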
|
#
8c8dc322 |
|
23-Apr-2007 |
Stephen Rothwell <sfr@canb.auug.org.au> |
[POWERPC] Remove old interface find_path_device Replaced by of_find_node_by_path. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
e2eb6392 |
|
03-Apr-2007 |
Stephen Rothwell <sfr@canb.auug.org.au> |
[POWERPC] Rename get_property to of_get_property: arch/powerpc Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
c5a69d57 |
|
17-Feb-2007 |
Tobias Klauser <tklauser@distanz.ch> |
Storage class should be before const qualifier The C99 specification states in section 6.11.5: The placement of a storage-class specifier other than at the beginning of the declaration specifiers in a declaration is an obsolescent feature. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
#
f2d6d2d8 |
|
06-Dec-2006 |
Nathan Lynch <ntl@pobox.com> |
[POWERPC] Add rtas_service_present() helper To test for the existence of an RTAS function, we typically do: foo_token = rtas_token("foo"); if (foo_token == RTAS_UNKNOWN_SERVICE) return; Add a rtas_service_present method, which provides a more conventional boolean interface for testing the existence of an RTAS method. Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
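The boolean wrapper can be sketched in userspace C. The service table and its token numbers below are made up for illustration, and `sketch_token()` / `sketch_service_present()` are hypothetical stand-ins for the kernel's token lookup and the new helper; only the "unknown service" sentinel pattern is taken from the commit text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_UNKNOWN_SERVICE (-1)

struct sketch_service {
    const char *name;
    int token;
};

/* Illustrative table; real tokens come from the firmware device tree. */
static const struct sketch_service sketch_services[] = {
    { "set-indicator", 11 },
    { "get-term-char", 12 },
};

/* Name -> token lookup, returning the sentinel when the name is absent. */
int sketch_token(const char *name)
{
    assert(name != NULL);

    for (size_t i = 0; i < sizeof(sketch_services) / sizeof(sketch_services[0]); i++) {
        if (strcmp(sketch_services[i].name, name) == 0)
            return sketch_services[i].token;
    }
    return SKETCH_UNKNOWN_SERVICE;
}

/* Boolean convenience wrapper around the sentinel comparison. */
bool sketch_service_present(const char *name)
{
    return sketch_token(name) != SKETCH_UNKNOWN_SERVICE;
}
```

Callers that only need a yes/no answer can then write `if (!sketch_service_present("foo")) return;` instead of keeping a token variable around.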
|
#
0332c2d4 |
|
04-Dec-2006 |
Michael Ellerman <michael@ellerman.id.au> |
[POWERPC] Move rtas_stop_self() into platforms/pseries/hotplug-cpu.c As the first step in consolidating the pseries hotplug cpu code, create platforms/pseries/hotplug-cpu.c and move rtas_stop_self() into it. Do the rtas token initialisation in a new initcall, rather than rtas_initialize(). Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
088df4d2 |
|
16-Nov-2006 |
Linas Vepstas <linas@austin.ibm.com> |
[POWERPC] Wrap cpu_die() with CONFIG_HOTPLUG_CPU Per email discussion, it appears that rtas_stop_self() and pSeries_mach_cpu_die() should not be compiled if CONFIG_HOTPLUG_CPU is not defined. This patch adds #ifdefs around these bits of code. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
39ed2fe6 |
|
21-Aug-2006 |
Olaf Hering <olaf@aepfle.de> |
[POWERPC] reboot when panic_timout is set Only call into RTAS when booted with panic=0 because the RTAS call does not return. The system has to be rebooted via the HMC or via the management console right now. This is cumbersome and not what the default panic=180 is supposed to do. Signed-off-by: Olaf Hering <olh@suse.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
9a2ded55 |
|
16-Aug-2006 |
Michael Neuling <mikey@neuling.org> |
[POWERPC] powerpc: Make RTAS console init generic The rtas console doesn't have to be Cell specific. If we get both RTAS tokens, we should just enable the console then and there. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
81b73dd9 |
|
27-Jul-2006 |
Haren Myneni <haren@us.ibm.com> |
[POWERPC] Fix might-sleep warning on removing cpus

Noticing the following might_sleep warning (dump_stack()) during kdump testing when CONFIG_DEBUG_SPINLOCK_SLEEP is enabled. All secondary CPUs will be calling rtas_set_indicator with interrupts disabled to remove them from global interrupt queue.

BUG: sleeping function called from invalid context at arch/powerpc/kernel/rtas.c:463
in_atomic():1, irqs_disabled():1
Call Trace:
[C00000000FFFB970] [C000000000010234] .show_stack+0x68/0x1b0 (unreliable)
[C00000000FFFBA10] [C000000000059354] .__might_sleep+0xd8/0xf4
[C00000000FFFBA90] [C00000000001D1BC] .rtas_busy_delay+0x20/0x5c
[C00000000FFFBB20] [C00000000001D8A8] .rtas_set_indicator+0x6c/0xcc
[C00000000FFFBBC0] [C000000000048BF4] .xics_teardown_cpu+0x118/0x134
[C00000000FFFBC40] [C00000000004539C] .pseries_kexec_cpu_down_xics+0x74/0x8c
[C00000000FFFBCC0] [C00000000002DF08] .crash_ipi_callback+0x15c/0x188
[C00000000FFFBD50] [C0000000000296EC] .smp_message_recv+0x84/0xdc
[C00000000FFFBDC0] [C000000000048E08] .xics_ipi_dispatch+0xf0/0x130
[C00000000FFFBE50] [C00000000009EF10] .handle_IRQ_event+0x7c/0xf8
[C00000000FFFBF00] [C0000000000A0A14] .handle_percpu_irq+0x90/0x10c
[C00000000FFFBF90] [C00000000002659C] .call_handle_irq+0x1c/0x2c
[C00000000058B9C0] [C00000000000CA10] .do_IRQ+0xf4/0x1a4
[C00000000058BA50] [C0000000000044EC] hardware_interrupt_entry+0xc/0x10
--- Exception: 501 at .plpar_hcall_norets+0x14/0x1c
    LR = .pseries_dedicated_idle_sleep+0x190/0x1d4
[C00000000058BD40] [C00000000058BDE0] 0xc00000000058bde0 (unreliable)
[C00000000058BDF0] [C00000000001270C] .cpu_idle+0x10c/0x1e0
[C00000000058BE70] [C000000000009274] .rest_init+0x44/0x5c

To fix this issue, rtas_set_indicator_fast() is added so that will not wait for RTAS 'busy' delay and this new function is used for kdump (in xics_teardown_cpu()) and for CPU hotplug (xics_migrate_irqs_away() and xics_setup_cpu()).

Note that the platform architecture spec says that set-indicator on the indicator we're using here is not permitted to return the busy or extended busy status codes.

Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
b9377ffc |
|
18-Jul-2006 |
Anton Blanchard <anton@samba.org> |
[POWERPC] clean up pseries hcall interfaces

Our pseries hcall interfaces are out of control:

    plpar_hcall_norets
    plpar_hcall
    plpar_hcall_8arg_2ret
    plpar_hcall_4out
    plpar_hcall_7arg_7ret
    plpar_hcall_9arg_9ret

Create 3 interfaces to cover all cases:

    plpar_hcall_norets: 7 arguments no returns
    plpar_hcall:        6 arguments 4 returns
    plpar_hcall9:       9 arguments 9 returns

There are only 2 cases in the kernel that need plpar_hcall9, hopefully we can keep it that way. Pass in a buffer to stash return parameters so we avoid the &dummy1, &dummy2 madness.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
a7f67bdf |
|
11-Jul-2006 |
Jeremy Kerr <jk@ozlabs.org> |
[POWERPC] Constify & voidify get_property() Now that get_property() returns a void *, there's no need to cast its return value. Also, treat the return value as const, so we can constify get_property later. powerpc core changes. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
cc46bb98 |
|
23-Jun-2006 |
Michael Ellerman <michael@ellerman.id.au> |
[POWERPC] Add udbg support for RTAS console Add udbg hooks for the RTAS console, based on the RTAS put-term-char and get-term-char calls. Along with my previous patches, this should enable debugging as soon as early_init_dt_scan_rtas() is called. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
458148c0 |
|
23-Jun-2006 |
Michael Ellerman <michael@ellerman.id.au> |
[POWERPC] Setup RTAS values earlier, to enable rtas_call() earlier Although RTAS is instantiated when we enter the kernel, we can't actually call into it until we know its entry point address. Currently we grab that in rtas_initialize(), however that's quite late in the boot sequence. To enable rtas_call() earlier, we can grab the RTAS entry etc. values while we're scanning the flattened device tree. There's existing code to retrieve the values from /chosen, however we don't store them there anymore, so remove that code. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
ab3ab74d |
|
23-Jun-2006 |
Michael Ellerman <michael@ellerman.id.au> |
[POWERPC] Move RTAS exports next to their declarations Move RTAS exports next to their declarations. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
24da3dd5 |
|
23-Jun-2006 |
Michael Ellerman <michael@ellerman.id.au> |
[POWERPC] Make rtas_call() safe if RTAS hasn't been initialised Currently it's unsafe to call rtas_call() prior to rtas_initialize(). This is because the rtas.entry value hasn't been setup and so we don't know where to enter, but we just try anyway. We can't do anything intelligent without rtas.entry, so if it's not set, just return. Code that calls rtas_call() early needs to be aware that the call might fail. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
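The guard can be sketched in userspace C. Everything here is an assumption made for illustration: `struct sketch_rtas`, `sketch_fake_enter()` (a stand-in for actually entering firmware) and the choice of -1 as the error value are hypothetical, not taken from the kernel source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the RTAS state: just the entry point. */
struct sketch_rtas {
    unsigned long entry; /* 0 until the entry point is known */
};

typedef int (*sketch_enter_fn)(unsigned long entry, int token);

static int sketch_enter_calls; /* how often "firmware" was entered */

/* Stand-in for entering firmware; always reports success. */
int sketch_fake_enter(unsigned long entry, int token)
{
    (void)entry;
    (void)token;
    sketch_enter_calls++;
    return 0;
}

/*
 * Refuse to enter firmware when the entry point has not been set up.
 * Returns -1 in that case (an assumed error value), otherwise whatever
 * the enter function returns.
 */
int sketch_rtas_call(const struct sketch_rtas *rtas, int token, sketch_enter_fn enter)
{
    assert(enter != NULL);

    if (rtas == NULL || rtas->entry == 0)
        return -1;
    return enter(rtas->entry, token);
}
```

As the commit notes, early callers have to treat that error return as a normal outcome rather than assuming the call reached firmware.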
|
#
7932f0b8 |
|
15-Jun-2006 |
John Rose <johnrose@austin.ibm.com> |
[POWERPC] RTAS delay, fix module build breaks Export both new RTAS delay functions, and change the scanlog module to use the new delay functions. Signed-off-by: John Rose <johnrose@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
368a6ba5 |
|
12-Jun-2006 |
Dave C Boutcher <boutcher@cs.umn.edu> |
[POWERPC] check firmware state before suspending Currently the kernel blindly halts all the processors and calls the ibm,suspend-me rtas call. If the firmware is not in the correct state, we then re-start all the processors and return. It is much smarter to first check the firmware state, and only if it is waiting, call the ibm,suspend-me call. Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
507279db |
|
05-Jun-2006 |
John Rose <johnrose@austin.ibm.com> |
[PATCH] powerpc: reorg RTAS delay code This patch attempts to handle RTAS "busy" return codes in a more simple and consistent manner. Typical callers of RTAS shouldn't have to manage wait times and delay calls. This patch also changes the kernel to use msleep() rather than udelay() when a runtime delay is necessary. This will avoid CPU soft lockups for extended delay conditions. Signed-off-by: John Rose <johnrose@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
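One way to centralise the busy handling is a helper that maps a status code to a wait time, so callers never compute delays themselves. The sketch below is a userspace illustration under stated assumptions: the status values (-2 for busy, 9900 to 9905 for extended delay meaning 10^(status - 9900) milliseconds) reflect my understanding of the RTAS convention and should be checked against the platform specification, and `sketch_busy_delay_ms()` is a hypothetical name.

```c
#include <assert.h>

/* Assumed status codes; verify against the platform specification. */
#define SKETCH_RTAS_BUSY     (-2)
#define SKETCH_EXT_DELAY_MIN 9900
#define SKETCH_EXT_DELAY_MAX 9905

static_assert(SKETCH_EXT_DELAY_MIN < SKETCH_EXT_DELAY_MAX,
              "extended delay range must be non-empty");

/*
 * Return how many milliseconds to wait before retrying, or 0 if the
 * status is not a busy indication and no retry is needed.
 */
unsigned int sketch_busy_delay_ms(int status)
{
    unsigned int ms = 0;

    if (status == SKETCH_RTAS_BUSY) {
        ms = 1;
    } else if (status >= SKETCH_EXT_DELAY_MIN && status <= SKETCH_EXT_DELAY_MAX) {
        /* Each step above the minimum multiplies the wait by ten. */
        ms = 1;
        for (int order = status - SKETCH_EXT_DELAY_MIN; order > 0; order--)
            ms *= 10;
    }
    return ms;
}
```

A retry loop would then sleep for the returned time (the commit uses msleep() in the kernel) and call again while the result is non-zero, rather than spinning with udelay().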
|
#
706c8c93 |
|
30-Mar-2006 |
Segher Boessenkool <segher@kernel.crashing.org> |
[PATCH] powerpc/pseries: Change H_StudlyCaps to H_SHOUTING_CAPS Also cleans up some nearby whitespace problems. Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
0e551954 |
|
28-Mar-2006 |
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> |
[PATCH] for_each_possible_cpu: powerpc for_each_cpu() actually iterates across all possible CPUs. We've had mistakes in the past where people were using for_each_cpu() where they should have been iterating across only online or present CPUs. This is inefficient and possibly buggy. We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the future. This patch replaces for_each_cpu with for_each_possible_cpu. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
e8222502 |
|
28-Mar-2006 |
Benjamin Herrenschmidt <benh@kernel.crashing.org> |
[PATCH] powerpc: Kill _machine and hard-coded platform numbers This removes statically assigned platform numbers and reworks the powerpc platform probe code to use a better mechanism. With this, board support files can simply declare a new machine type with a macro, and implement a probe() function that uses the flattened device-tree to detect if they apply for a given machine. We now have a machine_is() macro that replaces the comparisons of _machine with the various PLATFORM_* constants. This commit also changes various drivers to use the new macro instead of looking at _machine. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
a7f31841 |
|
22-Mar-2006 |
Arnd Bergmann <abergman@de.ibm.com> |
[PATCH] powerpc: declare arch syscalls in <asm/syscalls.h>

powerpc currently declares some of its own system calls in <asm/unistd.h>, but not all of them. That place also contains remainders of the now almost unused kernel syscall hack.

- Add a new <asm/syscalls.h> with clean declarations
- Include that file from every source that implements one of these
- Get rid of old declarations in <asm/unistd.h>

This patch is required as a base for implementing system calls from an SPU, but also makes sense as a general cleanup.

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
b4fd884a |
|
03-Feb-2006 |
Dave C Boutcher <boutcher@cs.umn.edu> |
[PATCH] powerpc: remove useless call to touch_softlockup_watchdog It turns out that we can't stop the watchdog from triggering here. If we touch the timer (which just uses the current jiffies value) before we enable interrupts, it does nothing because jiffies are not mass-updated until after we enable interrupts. If we touch the timer after we enable interrupts, it's too late because the softlockup watchdog will already have triggered. The touch_softlockup_watchdog call removed below does nothing. Signed-off-by: Dave Boutcher <sleddog@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
82a4df74 |
|
03-Feb-2006 |
Dave C Boutcher <boutcher@cs.umn.edu> |
[PATCH] powerpc: prod all processors after ibm,suspend-me We need to prod everyone here since this is the only CPU that is guaranteed to be running after the ibm,suspend-me RTAS call returns. Signed-off-by: Dave Boutcher <sleddog@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
c4cb8ecc |
|
03-Feb-2006 |
Dave C Boutcher <boutcher@cs.umn.edu> |
[PATCH] powerpc: return correct rtas status from ibm,suspend-me Correctly return the status from the RTAS call. rtas_call expects to return the status as a return value. Signed-off-by: Dave Boutcher <sleddog@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
31a7f67e |
|
30-Jan-2006 |
Michael Ellerman <michael@ellerman.id.au> |
[PATCH] powerpc: Fix !SMP build of rtas.c arch/powerpc/kernel/rtas.c is getting hvcall.h via spinlock.h, but when we're building for UP we don't include spinlock.h. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
91dc182c |
|
13-Jan-2006 |
Dave C Boutcher <sleddog@us.ibm.com> |
[PATCH] powerpc: special-case ibm,suspend-me RTAS call Handle the ibm,suspend-me RTAS call specially. It needs to be wrapped in a set of synchronization hypervisor calls (H_Join). When the H_Join calls are made on all CPUs, the intent is that only one will return with H_Continue, meaning that he is the "last man standing". That CPU then issues the ibm,suspend-me call. What is interesting, of course, is that the CPU running when the rtas syscall is made, may NOT be the CPU that ultimately executes the ibm,suspend-me rtas call. Signed-off-by: Dave Boutcher <sleddog@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
a9415644 |
|
11-Jan-2006 |
Randy Dunlap <rdunlap@infradead.org> |
[PATCH] capable/capability.h (arch/) arch: Use <linux/capability.h> where capable() is used. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
296167ae |
|
10-Jan-2006 |
Michael Ellerman <michael@ellerman.id.au> |
[PATCH] powerpc: Make early debugging configurable via Kconfig This patch adds Kconfig entries to control the early debugging options, currently in setup_64.c. Doing this via Kconfig rather than #defines means you can have one source tree, which is buildable for multiple platforms - and you can enable the correct early debug option for each platform via .config. I made udbg_early_init() a static inline because otherwise GCC is too daft to optimise it away when debugging is off. Now that we have udbg_init_rtas() we can make call_rtas_display_status* static. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
943ffb58 |
|
09-Jan-2006 |
Adrian Bunk <bunk@stusta.de> |
spelling: s/retreive/retrieve/ Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
#
799d6046 |
|
09-Nov-2005 |
Paul Mackerras <paulus@samba.org> |
[PATCH] powerpc: merge code values for identifying platforms This patch merges platform codes. systemcfg->platform is no longer used, systemcfg use in general is deprecated as much as possible (and renamed _systemcfg before it gets completely moved elsewhere in a future patch), and _machine is now used on ppc64 as well as on ppc32. Platform codes aren't gone yet but we are getting a step closer. A bunch of asm code in head[_64].S is also turned into C code. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
21fe3301 |
|
06-Nov-2005 |
Benjamin Herrenschmidt <benh@kernel.crashing.org> |
[PATCH] ppc: fix a bunch of warnings Building a PowerMac kernel with ARCH=powerpc causes a bunch of warnings, this fixes some of them Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
2249ca9d |
|
06-Nov-2005 |
Paul Mackerras <paulus@samba.org> |
powerpc: Various UP build fixes Mostly this involves adding #include <asm/smp.h>, since that defines things like boot_cpuid[_phys] and [gs]et_hard_smp_processor_id, which are SMP-related but still needed on UP. This incorporates fixes posted by Olof Johansson and Heikki Lindholm. Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
f4fcbbe9 |
|
02-Nov-2005 |
Paul Mackerras <paulus@samba.org> |
powerpc: Merge remaining RTAS code This moves rtas-proc.c and rtas_flash.c into arch/powerpc/kernel, since cell wants them as well as pseries (and chrp can use rtas-proc.c too, at least in principle). rtas_fw.c is gone, with its bits moved into rtas_flash.c and rtas.c. Signed-off-by: Paul Mackerras <paulus@samba.org>
|
#
033ef338 |
|
26-Oct-2005 |
Paul Mackerras <paulus@samba.org> |
powerpc: Merge rtas.c into arch/powerpc/kernel This splits arch/ppc64/kernel/rtas.c into arch/powerpc/kernel/rtas.c, which contains generic RTAS functions useful on any CHRP platform, and arch/powerpc/platforms/pseries/rtas-fw.[ch], which contain some pSeries-specific firmware flashing bits. The parts of rtas.c that are to do with pSeries-specific error logging are protected by a new CONFIG_RTAS_ERROR_LOGGING symbol. The inclusion of rtas.o is controlled by the CONFIG_PPC_RTAS symbol, and the relevant platforms select that. Signed-off-by: Paul Mackerras <paulus@samba.org>
|