#
4f74fb30 |
|
07-Aug-2023 |
Mitchell Levy <levymitchell0@gmail.com> |
hv_balloon: Update the balloon driver to use the SBRM API This patch is intended as a proof-of-concept for the new SBRM machinery[1]. For some brief background, the idea behind SBRM is using the __cleanup__ attribute to automatically unlock locks (or otherwise release resources) when they go out of scope, similar to C++ style RAII. This promises some benefits such as making code simpler (particularly where you have lots of goto fail; type constructs) as well as reducing the surface area for certain kinds of bugs. The changes in this patch should not result in any difference in how the code actually runs (i.e., it's purely an exercise in this new syntax sugar). In one instance SBRM was not appropriate, so I left that part alone, but all other locking/unlocking is handled automatically in this patch. [1] https://lore.kernel.org/all/20230626125726.GU4253@hirez.programming.kicks-ass.net/ Suggested-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: "Mitchell Levy (Microsoft)" <levymitchell0@gmail.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230807-sbrm-hyperv-v2-1-9d2ac15305bd@gmail.com
|
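The scope-based cleanup idea described in the commit above can be sketched in user-space C. This is only an illustration, assuming GCC or Clang (both support the `cleanup` variable attribute); `struct demo_lock`, `DEMO_GUARD` and `demo_update` are hypothetical names invented here and are not the kernel's SBRM helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernel lock; not the real spinlock/mutex API. */
struct demo_lock {
    bool held;
};

static void demo_lock_acquire(struct demo_lock *l)
{
    assert(!l->held);
    l->held = true;
}

static void demo_lock_release(struct demo_lock *l)
{
    assert(l->held);
    l->held = false;
}

/* Cleanup handler: the compiler passes a pointer to the guarded variable. */
static void demo_guard_exit(struct demo_lock **guard)
{
    if (*guard)
        demo_lock_release(*guard);
}

/* Declare a scope-bound guard; the release runs when 'name' leaves scope. */
#define DEMO_GUARD(name, lockp)                                               \
    struct demo_lock *name __attribute__((cleanup(demo_guard_exit))) = (lockp); \
    demo_lock_acquire(name)

/* Returns -1 on the early-exit path and 0 otherwise; no "goto unlock" label. */
static int demo_update(struct demo_lock *l, int *counter, int delta)
{
    DEMO_GUARD(guard, l);

    if (delta < 0)
        return -1;      /* lock is released automatically here */

    *counter += delta;
    return 0;           /* ...and here */
}
```

Calling `demo_update()` with a negative `delta` takes the early return, and the lock is still released on the way out, which is the class of bug the commit message says this style helps avoid.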
#
55e544e1 |
|
20-Jun-2023 |
Nischala Yelchuri <niyelchu@linux.microsoft.com> |
x86/hyperv: Improve code for referencing hyperv_pcpu_input_arg Several places in code for Hyper-V reference the per-CPU variable hyperv_pcpu_input_arg. Older code uses a multi-line sequence to reference the variable, and usually includes a cast. Newer code does a much simpler direct assignment. The latter is preferable as the complexity of the older code is unnecessary. Update older code to use the simpler direct assignment. Signed-off-by: Nischala Yelchuri <niyelchu@linux.microsoft.com> Link: https://lore.kernel.org/r/1687286438-9421-1-git-send-email-niyelchu@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
96ec2939 |
|
05-Jan-2023 |
Dawei Li <set_pte_at@outlook.com> |
Drivers: hv: Make remove callback of hyperv driver void returned Since commit fc7a6209d571 ("bus: Make remove callback return void") forces bus_type::remove be void-returned, it doesn't make much sense for any bus based driver implementing remove callback to return non-void to its caller. As such, change the remove function for Hyper-V VMBus based drivers to return void. Signed-off-by: Dawei Li <set_pte_at@outlook.com> Link: https://lore.kernel.org/r/TYCP286MB2323A93C55526E4DF239D3ACCAFA9@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
6dfb0771 |
|
02-Feb-2023 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
HV: hv_balloon: fix memory leak with using debugfs_lookup() When calling debugfs_lookup() the result must have dput() called on it, otherwise the memory will leak over time. To make things simpler, just call debugfs_lookup_and_remove() instead which handles all of the logic at once. Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Fixes: d180e0a1be6c ("Drivers: hv: Create debugfs file with hyper-v balloon usage information") Cc: stable <stable@kernel.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20230202140918.2289522-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
dc60f2db |
|
30-Sep-2022 |
Shradha Gupta <shradhagupta@linux.microsoft.com> |
hv_balloon: Add support for configurable order free page reporting Newer versions of Hyper-V allow reporting unused guest pages in chunks smaller than 2 Mbytes. Using smaller chunks allows reporting more unused guest pages, but with increased overhead in finding the small chunks. To make this tradeoff configurable, use the existing page_reporting_order module parameter to control the reporting order. Drop and refine checks that restricted the minimum page reporting order to 2Mbytes size pages. Add appropriate checks to make sure the underlying Hyper-V versions support cold discard hints of any order (and not just starting from 9) Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1664517699-1085-3-git-send-email-shradhagupta@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
99632e3d |
|
19-Oct-2022 |
Jilin Yuan <yuanjilin@cdjrlc.com> |
Drivers: hv: fix repeated words in comments Delete the redundant word 'of'. Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20221019125604.52999-1-yuanjilin@cdjrlc.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
d180e0a1 |
|
11-Jul-2022 |
Alexander Atanasov <alexander.atanasov@virtuozzo.com> |
Drivers: hv: Create debugfs file with hyper-v balloon usage information Allow the guest to know how much it is ballooned by the host. It is useful when debugging out of memory conditions. When host gets back memory from the guest it is accounted as used memory in the guest but the guest has no way to know how much it is actually ballooned. Expose current state, flags and max possible memory to the guest. While at it - fix a 10+ years old typo. Signed-off-by: Alexander Atanasov <alexander.atanasov@virtuozzo.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220711181825.52318-1-alexander.atanasov@virtuozzo.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
d27423bf |
|
15-May-2022 |
Shradha Gupta <shradhagupta@linux.microsoft.com> |
hv_balloon: Fix balloon_probe() and balloon_remove() error handling Add missing cleanup in balloon_probe() if the call to balloon_connect_vsp() fails. Also correctly handle cleanup in balloon_remove() when dm_state is DM_INIT_ERROR because balloon_resume() failed. Signed-off-by: Shradha Gupta <shradhagupta@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220516045058.GA7933@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
be580279 |
|
24-Mar-2022 |
Boqun Feng <boqun.feng@gmail.com> |
Drivers: hv: balloon: Disable balloon and hot-add accordingly Currently there are known potential issues for balloon and hot-add on ARM64: * Unballoon requests from Hyper-V should only unballoon ranges that are guest page size aligned, otherwise guests cannot handle them because it's impossible to partially free a page. This is a problem when guest page size > 4096 bytes. * Memory hot-add requests from Hyper-V should provide the NUMA node id of the added ranges or ARM64 should have a functional memory_add_physaddr_to_nid(), otherwise the node id is missing for add_memory(). These issues require discussions on design and implementation. In the meanwhile, post_status() is working and essential to guest monitoring. Therefore instead of disabling the entire hv_balloon driver, the ballooning (when page size > 4096 bytes) and hot-add are disabled accordingly for now. Once the issues are fixed, they can be re-enabled in these cases. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220325023212.1570049-3-boqun.feng@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
b3d6dd09 |
|
24-Mar-2022 |
Boqun Feng <boqun.feng@gmail.com> |
Drivers: hv: balloon: Support status report for larger page sizes DM_STATUS_REPORT expects the numbers of pages in the unit of 4k pages (HV_HYP_PAGE) instead of guest pages, so to make it work when guest page sizes are larger than 4k, convert the numbers of guest pages into the numbers of HV_HYP_PAGEs. Note that the numbers of guest pages are still used for tracing because tracing is internal to the guest kernel. Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220325023212.1570049-2-boqun.feng@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
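The unit conversion described in the commit above (guest pages to 4k hypervisor pages) can be illustrated with a small sketch. It assumes only what the message states, namely that a hypervisor page is 4 KiB; `DEMO_HV_HYP_PAGE_SHIFT` and `demo_guest_pages_to_hv_pages()` are hypothetical names and not the driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* 4 KiB hypervisor page, as stated in the commit message (2^12 bytes). */
#define DEMO_HV_HYP_PAGE_SHIFT 12

/*
 * Convert a count of guest pages into a count of 4 KiB hypervisor pages.
 * guest_page_shift is log2 of the guest page size (12 for 4k, 16 for 64k).
 */
static uint64_t demo_guest_pages_to_hv_pages(uint64_t guest_pages,
                                             unsigned int guest_page_shift)
{
    /* A guest page is assumed never to be smaller than a hypervisor page. */
    assert(guest_page_shift >= DEMO_HV_HYP_PAGE_SHIFT);
    return guest_pages << (guest_page_shift - DEMO_HV_HYP_PAGE_SHIFT);
}
```

With a 64k guest page size each guest page counts as 16 hypervisor pages, while with a 4k guest page size the count is unchanged.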
#
1d728672 |
|
22-Feb-2022 |
Anssi Hannula <anssi.hannula@bitwise.fi> |
hv_balloon: rate-limit "Unhandled message" warning For a couple of times I have encountered a situation where hv_balloon: Unhandled message: type: 12447 is being flooded over 1 million times per second with various values, filling the log and consuming cycles, making debugging difficult. Add rate limiting to the message. Most other Hyper-V drivers already have similar rate limiting in their message callbacks. The cause of the floods in my case was probably fixed by 96d9d1fa5cd5 ("Drivers: hv: balloon: account for vmbus packet header in max_pkt_size"). Fixes: 9aa8b50b2b3d ("Drivers: hv: Add Hyper-V balloon driver") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220222141400.98160-1-anssi.hannula@bitwise.fi Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
96d9d1fa |
|
19-Jan-2022 |
Yanming Liu <yanminglr@gmail.com> |
Drivers: hv: balloon: account for vmbus packet header in max_pkt_size Commit adae1e931acd ("Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer") introduced a notion of maximum packet size in vmbus channel and used that size to initialize a buffer holding all incoming packet along with their vmbus packet header. hv_balloon uses the default maximum packet size VMBUS_DEFAULT_MAX_PKT_SIZE which matches its maximum message size, however vmbus_open expects this size to also include vmbus packet header. This leads to 4096 bytes dm_unballoon_request messages being truncated to 4080 bytes. When the driver tries to read next packet it starts from a wrong read_index, receives garbage and prints a lot of "Unhandled message: type: <garbage>" in dmesg. Allocate the buffer with HV_HYP_PAGE_SIZE more bytes to make room for the header. Fixes: adae1e931acd ("Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer") Suggested-by: Michael Kelley (LINUX) <mikelley@microsoft.com> Suggested-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Signed-off-by: Yanming Liu <yanminglr@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20220119202052.3006981-1-yanminglr@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
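The truncation arithmetic in the commit above can be checked with a short sketch. The 16-byte header size used here is inferred from the figures in the message (4096 - 4080) rather than taken from the VMBus headers, and `DEMO_PKT_HEADER_SIZE` and `demo_payload_room()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Header size implied by the commit message: 4096 - 4080 = 16 bytes. */
#define DEMO_PKT_HEADER_SIZE 16

/*
 * Bytes left for the message payload when a receive buffer of
 * max_pkt_size bytes must also hold the packet header.
 */
static size_t demo_payload_room(size_t max_pkt_size)
{
    assert(max_pkt_size >= DEMO_PKT_HEADER_SIZE);
    return max_pkt_size - DEMO_PKT_HEADER_SIZE;
}
```

A 4096-byte buffer leaves room for only 4080 payload bytes, so a 4096-byte message is cut short; growing the buffer by one more 4096-byte page, as the fix does, leaves enough room for the full message.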
#
8a7eb2d4 |
|
01-Nov-2021 |
Boqun Feng <boqun.feng@gmail.com> |
Drivers: hv: balloon: Use VMBUS_RING_SIZE() wrapper for dm_ring_size Baihua reported an error when booting an ARM64 guest with PAGE_SIZE=64k and BALLOON is enabled: hv_vmbus: registering driver hv_balloon hv_vmbus: probe failed for device 1eccfd72-4b41-45ef-b73a-4a6e44c12924 (-22) The cause of this is that the ringbuffer size for hv_balloon is not adjusted with VMBUS_RING_SIZE(), which makes the size not large enough for ringbuffers on guest with PAGE_SIZE=64k. Therefore use VMBUS_RING_SIZE() to calculate the ringbuffer size. Note that the old size (20 * 1024) counts a 4k header in the total size, while VMBUS_RING_SIZE() expects the parameter as the payload size, so use 16 * 1024. Cc: <stable@vger.kernel.org> # 5.15.x Reported-by: Baihua Lu <baihua.lu@microsoft.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20211101150026.736124-1-boqun.feng@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
03b30cc3 |
|
29-Apr-2021 |
Jiapeng Chong <jiapeng.chong@linux.alibaba.com> |
hv_balloon: Remove redundant assignment to region_start Variable region_start is set to pg_start but this value is never read as it is overwritten later on, hence it is a redundant assignment and can be removed. Cleans up the following clang-analyzer warning: drivers/hv/hv_balloon.c:1013:3: warning: Value stored to 'region_start' is never read [clang-analyzer-deadcode.DeadStores]. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Link: https://lore.kernel.org/r/1619691681-86256-1-git-send-email-jiapeng.chong@linux.alibaba.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
6dc2a774 |
|
23-Mar-2021 |
Sunil Muthuswamy <sunilmut@microsoft.com> |
x86/Hyper-V: Support for free page reporting Linux has support for free page reporting now (36e66c554b5c) for virtualized environment. On Hyper-V when virtually backed VMs are configured, Hyper-V will advertise cold memory discard capability, when supported. This patch adds the support to hook into the free page reporting infrastructure and leverage the Hyper-V cold memory discard hint hypercall to report/free these pages back to the host. Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Tested-by: Matheus Castello <matheus@castello.eng.br> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/SN4PR2101MB0880121FA4E2FEC67F35C1DCC0649@SN4PR2101MB0880.namprd21.prod.outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
26011267 |
|
25-Feb-2021 |
David Hildenbrand <david@redhat.com> |
mm/memory_hotplug: MEMHP_MERGE_RESOURCE -> MHP_MERGE_RESOURCE Let's make "MEMHP_MERGE_RESOURCE" consistent with "MHP_NONE", "mhp_t" and "mhp_flags". As discussed recently [1], "mhp" is our internal acronym for memory hotplug now. [1] https://lore.kernel.org/linux-mm/c37de2d0-28a1-4f7d-f944-cfd7d81c334d@redhat.com/ Link: https://lkml.kernel.org/r/20210126115829.10909-1-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Wei Liu <wei.liu@kernel.org> Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
d1df458c |
|
02-Dec-2020 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
hv_balloon: do adjust_managed_page_count() when ballooning/un-ballooning Unlike virtio_balloon/virtio_mem/xen balloon drivers, Hyper-V balloon driver does not adjust managed pages count when ballooning/un-ballooning and this leads to incorrect stats being reported, e.g. unexpected 'free' output. Note, the calculation in post_status() seems to remain correct: ballooned out pages are never 'available' and we manually add dm->num_pages_ballooned to 'committed'. Suggested-by: David Hildenbrand <david@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201202161245.2406143-3-vkuznets@redhat.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
7f3f227b |
|
02-Dec-2020 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
hv_balloon: simplify math in alloc_balloon_pages() 'alloc_unit' in alloc_balloon_pages() is either '512' for 2M allocations or '1' for 4k allocations. So 1 << get_order(alloc_unit << PAGE_SHIFT) equals to 'alloc_unit' and the for loop basically sets all them offline. Simplify the math to improve the readability. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201202161245.2406143-2-vkuznets@redhat.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
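The identity the commit above relies on, `1 << get_order(alloc_unit << PAGE_SHIFT) == alloc_unit` for the two values `alloc_unit` can take, can be verified with a user-space sketch. `demo_get_order()` is a hypothetical reimplementation written for this illustration (smallest order whose page count covers the size, with an assumed 4k page), not the kernel's `get_order()`.

```c
#include <assert.h>

/* Assumed 4 KiB page size for this illustration. */
#define DEMO_PAGE_SHIFT 12

/* Smallest order such that (1 << order) pages cover 'size' bytes. */
static unsigned int demo_get_order(unsigned long size)
{
    unsigned long pages =
        (size + (1UL << DEMO_PAGE_SHIFT) - 1) >> DEMO_PAGE_SHIFT;
    unsigned int order = 0;

    while ((1UL << order) < pages)
        order++;
    return order;
}

/* The expression the commit message says can be replaced by alloc_unit. */
static unsigned long demo_pages_in_unit(unsigned long alloc_unit)
{
    assert(alloc_unit > 0);
    return 1UL << demo_get_order(alloc_unit << DEMO_PAGE_SHIFT);
}
```

For 512 and 1 the expression returns its argument unchanged. The shortcut depends on `alloc_unit` being a power of two: in this sketch a value of 3 would round up to 4.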
#
2c3bd2a5 |
|
08-Oct-2020 |
Olaf Hering <olaf@aepfle.de> |
hv_balloon: disable warning when floor reached It is not an error if the host requests to balloon down, but the VM refuses to do so. Without this change a warning is logged in dmesg every five minutes. Fixes: b3bb97b8a49f3 ("Drivers: hv: balloon: Add logging for dynamic memory operations") Signed-off-by: Olaf Hering <olaf@aepfle.de> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20201008071216.16554-1-olaf@aepfle.de Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
#
2c76e7f6 |
|
15-Oct-2020 |
David Hildenbrand <david@redhat.com> |
hv_balloon: try to merge system ram resources Let's try to merge system ram resources we add, to minimize the number of resources in /proc/iomem. We don't care about the boundaries of individual chunks we added. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Wei Liu <wei.liu@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jason Wang <jasowang@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Julien Grall <julien@xen.org> Cc: Kees Cook <keescook@chromium.org> Cc: Len Brown <lenb@kernel.org> Cc: Leonardo Bras <leobras.c@gmail.com> Cc: Libor Pechacek <lpechacek@suse.cz> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Link: https://lkml.kernel.org/r/20200911103459.10306-9-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
b6117199 |
|
15-Oct-2020 |
David Hildenbrand <david@redhat.com> |
mm/memory_hotplug: prepare passing flags to add_memory() and friends We soon want to pass flags, e.g., to mark added System RAM resources mergeable. Prepare for that. This patch is based on a similar patch by Oscar Salvador: https://lkml.kernel.org/r/20190625075227.15193-3-osalvador@suse.de Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Juergen Gross <jgross@suse.com> # Xen related part Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Acked-by: Wei Liu <wei.liu@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Libor Pechacek <lpechacek@suse.cz> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Leonardo Bras <leobras.c@gmail.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Julien Grall <julien@xen.org> Cc: Kees Cook <keescook@chromium.org> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richardw.yang@linux.intel.com> Link: https://lkml.kernel.org/r/20200911103459.10306-5-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
bc58ebd5 |
|
06-Apr-2020 |
David Hildenbrand <david@redhat.com> |
hv_balloon: don't check for memhp_auto_online manually We get the MEM_ONLINE notifier call if memory is added right from the kernel via add_memory() or later from user space. Let's get rid of the "ha_waiting" flag - the wait event has an inbuilt mechanism (->done) for that. Initialize the wait event only once and reinitialize before adding memory. Unconditionally call complete() and wait_for_completion_timeout(). If there are no waiters, complete() will only increment ->done - which will be reset by reinit_completion(). If complete() has already been called, wait_for_completion_timeout() will not wait. There is still the chance for a small race between concurrent reinit_completion() and complete(). If complete() wins, we would not wait - which is tolerable (and the race exists in current code as well). Note: We only wait for "some" memory to get onlined, which seems to be good enough for now. [akpm@linux-foundation.org: register_memory_notifier() after init_completion(), per David] Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Baoquan He <bhe@redhat.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Yumei Huang <yuhuang@redhat.com> Link: http://lkml.kernel.org/r/20200317104942.11178-6-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
d33c240d |
|
25-Jan-2020 |
Tianyu Lan <Tianyu.Lan@microsoft.com> |
hv_balloon: Balloon up according to request page number Current code has assumption that balloon request memory size aligns with 2MB. But actually Hyper-V doesn't guarantee such alignment. When balloon driver receives non-aligned balloon request, it produces warning and balloon up more memory than requested in order to keep 2MB alignment. Remove the warning and balloon up memory according to actual requested memory size. Fixes: f6712238471a ("hv: hv_balloon: avoid memory leak on alloc_error of 2MB memory block") Cc: stable@vger.kernel.org Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
|
#
12cc1c73 |
|
30-Nov-2019 |
Souptick Joarder <jrdr.linux@gmail.com> |
mm/memory_hotplug.c: remove __online_page_set_limits() __online_page_set_limits() is a dummy function - remove it and all callers. Link: http://lkml.kernel.org/r/8e1bc9d3b492f6bde16e95ebc1dee11d6aefabd7.1567889743.git.jrdr.linux@gmail.com Link: http://lkml.kernel.org/r/854db2cf8145d9635249c95584d9a91fd774a229.1567889743.git.jrdr.linux@gmail.com Link: http://lkml.kernel.org/r/9afe6c5a18158f3884a6b302ac2c772f3da49ccc.1567889743.git.jrdr.linux@gmail.com Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Juergen Gross <jgross@suse.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
30a9c246 |
|
30-Nov-2019 |
David Hildenbrand <david@redhat.com> |
hv_balloon: use generic_online_page() Let's use the generic onlining function - which will now also take care of calling kernel_map_pages(). Link: http://lkml.kernel.org/r/20190909114830.662-3-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Sasha Levin <sashal@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Oscar Salvador <osalvador@suse.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Qian Cai <cai@lca.pw> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
25bd2b2f |
|
20-Nov-2019 |
Dexuan Cui <decui@microsoft.com> |
hv_balloon: Add the support of hibernation When hibernation is enabled, we must ignore the balloon up/down and hot-add requests from the host, if any. Signed-off-by: Dexuan Cui <decui@microsoft.com> Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
|
#
2af5e7b7 |
|
16-Aug-2019 |
Himadri Pandya <himadrispandya@gmail.com> |
Drivers: hv: balloon: Remove dependencies on guest page size Hyper-V assumes page size to be 4K. This might not be the case for ARM64 architecture. Hence use hyper-v specific page size and page shift definitions to avoid conflicts between different host and guest page sizes on ARM64. Also, remove some old and incorrect comments and redefine ballooning granularities to handle larger page sizes correctly. Signed-off-by: Himadri Pandya <himadri18.07@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
|
#
221f6df0 |
|
14-Jun-2019 |
Dexuan Cui <decui@microsoft.com> |
hv_balloon: Reorganize the probe function Move the code that negotiates with the host to a new function balloon_connect_vsp() and improve the error handling. This makes the code more readable and paves the way for the support of hibernation in future. Makes no real logic change here. Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
|
#
1fed17df |
|
14-Jun-2019 |
Dexuan Cui <decui@microsoft.com> |
hv_balloon: Use a static page for the balloon_up send buffer It's unnecessary to dynamically allocate the buffer. Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
|
#
43aa3132 |
|
29-May-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 280 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose good title or non infringement see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 9 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190529141900.459653302@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
fae42c4d |
|
05-Mar-2019 |
David Hildenbrand <david@redhat.com> |
hv_balloon: mark inflated pages PG_offline Mark inflated and never onlined pages PG_offline, to tell the world that the content is stale and should not be dumped. Link: http://lkml.kernel.org/r/20181119101616.8901-6-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Pankaj gupta <pagupta@redhat.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Kairui Song <kasong@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christian Hansen <chansen3@cisco.com> Cc: Dave Young <dyoung@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Juergen Gross <jgross@suse.com> Cc: Julien Freche <jfreche@vmware.com> Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Len Brown <len.brown@intel.com> Cc: Lianbo Jiang <lijiang@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Miles Chen <miles.chen@mediatek.com> Cc: Nadav Amit <namit@vmware.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
a9cd410a |
|
05-Mar-2019 |
Arun KS <arunks@codeaurora.org> |
mm/page_alloc.c: memory hotplug: free pages as higher order When freeing pages are done with higher order, time spent on coalescing pages by buddy allocator can be reduced. With section size of 256MB, hot add latency of a single section shows improvement from 50-60 ms to less than 1 ms, hence improving the hot add latency by 60 times. Modify external providers of online callback to align with the change. [arunks@codeaurora.org: v11] Link: http://lkml.kernel.org/r/1547792588-18032-1-git-send-email-arunks@codeaurora.org [akpm@linux-foundation.org: remove unused local, per Arun] [akpm@linux-foundation.org: avoid return of void-returning __free_pages_core(), per Oscar] [akpm@linux-foundation.org: fix it for mm-convert-totalram_pages-and-totalhigh_pages-variables-to-atomic.patch] [arunks@codeaurora.org: v8] Link: http://lkml.kernel.org/r/1547032395-24582-1-git-send-email-arunks@codeaurora.org [arunks@codeaurora.org: v9] Link: http://lkml.kernel.org/r/1547098543-26452-1-git-send-email-arunks@codeaurora.org Link: http://lkml.kernel.org/r/1538727006-5727-1-git-send-email-arunks@codeaurora.org Signed-off-by: Arun KS <arunks@codeaurora.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mathieu Malaterre <malat@debian.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Aaron Lu <aaron.lu@intel.com> Cc: Srivatsa Vaddagiri <vatsa@codeaurora.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
da8ced36 |
|
04-Jan-2019 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
hv_balloon: avoid touching uninitialized struct page during tail onlining Hyper-V memory hotplug protocol has 2M granularity and in Linux x86 we use 128M. To deal with it we implement partial section onlining by registering custom page onlining callback (hv_online_page()). Later, when more memory arrives we try to online the 'tail' (see hv_bring_pgs_online()). It was found that in some cases this 'tail' onlining causes issues: BUG: Bad page state in process kworker/0:2 pfn:109e3a page:ffffe08344278e80 count:0 mapcount:1 mapping:0000000000000000 index:0x0 flags: 0xfffff80000000() raw: 000fffff80000000 dead000000000100 dead000000000200 0000000000000000 raw: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 page dumped because: nonzero mapcount ... Workqueue: events hot_add_req [hv_balloon] Call Trace: dump_stack+0x5c/0x80 bad_page.cold.112+0x7f/0xb2 free_pcppages_bulk+0x4b8/0x690 free_unref_page+0x54/0x70 hv_page_online_one+0x5c/0x80 [hv_balloon] hot_add_req.cold.24+0x182/0x835 [hv_balloon] ... Turns out that we now have deferred struct page initialization for memory hotplug so e.g. memory_block_action() in drivers/base/memory.c does pages_correctly_probed() check and in that check it avoids inspecting struct pages and checks sections instead. But in Hyper-V balloon driver we do PageReserved(pfn_to_page()) check and this is now wrong. Switch to checking online_section_nr() instead. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: stable@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
|
#
ca79b0c2 |
|
28-Dec-2018 |
Arun KS <arunks@codeaurora.org> |
mm: convert totalram_pages and totalhigh_pages variables to atomic totalram_pages and totalhigh_pages are made static inline functions. The main motivation was that managed_page_count_lock handling was complicating things. It was discussed at length here, https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seems better to remove the lock and convert the variables to atomic, with prevention of potential store-to-read tearing as a bonus. [akpm@linux-foundation.org: coding style fixes] Link: http://lkml.kernel.org/r/1542090790-21750-4-git-send-email-arunks@codeaurora.org Signed-off-by: Arun KS <arunks@codeaurora.org> Suggested-by: Michal Hocko <mhocko@suse.com> Suggested-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
3d6357de |
|
28-Dec-2018 |
Arun KS <arunks@codeaurora.org> |
mm: reference totalram_pages and managed_pages once per function Patch series "mm: convert totalram_pages, totalhigh_pages and managed pages to atomic", v5. This series converts totalram_pages, totalhigh_pages and zone->managed_pages to atomic variables. totalram_pages, zone->managed_pages and totalhigh_pages updates are protected by managed_page_count_lock, but readers never care about it. Convert these variables to atomic to avoid readers potentially seeing a store tear. The main motivation was that managed_page_count_lock handling was complicating things. It was discussed at length here, https://lore.kernel.org/patchwork/patch/995739/#1181785 It seems better to remove the lock and convert the variables to atomic. With the change, preventing potential store-to-read tearing comes as a bonus. This patch (of 4): This is in preparation for a later patch which converts totalram_pages and zone->managed_pages to atomic variables. Please note that re-reading the value might lead to a different value and as such it could lead to unexpected behavior. There are no known bugs as a result of the current code but it is better to prevent them in principle. Link: http://lkml.kernel.org/r/1542090790-21750-2-git-send-email-arunks@codeaurora.org Signed-off-by: Arun KS <arunks@codeaurora.org> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
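The "read once per function" rule can be illustrated with a small userspace sketch using C11 atomics. This is not the kernel code: `total_pages_sketch` and `within_limits` are hypothetical names, and the kernel uses its own atomic types rather than `<stdatomic.h>`.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a counter such as totalram_pages. */
static atomic_long total_pages_sketch;

/*
 * Load the counter once and reuse the local copy. Loading it twice
 * (once per comparison) could observe two different values if a
 * writer updates the counter in between.
 */
static bool within_limits(long low, long high)
{
    long total = atomic_load(&total_pages_sketch);

    return total >= low && total <= high;
}
```

Both comparisons see the same snapshot, so the result is consistent even if the counter changes concurrently.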
|
#
1c87dc89 |
|
02-Oct-2018 |
Lance Roy <ldr709@gmail.com> |
hv_balloon: Replace spin_is_locked() with lockdep lockdep_assert_held() is better suited to checking locking requirements, since it won't get confused when someone else holds the lock. This is also a step towards possibly removing spin_is_locked(). Signed-off-by: Lance Roy <ldr709@gmail.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
af0a5646 |
|
05-Jun-2018 |
Arjan van de Ven <arjan@linux.intel.com> |
use the new async probing feature for the hyperv drivers Recent kernels support asynchronous probing; most hyperv drivers can be probed async easily so set the required flag for this. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
cf21be91 |
|
04-Mar-2018 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
hv_balloon: trace post_status Hyper-V balloon driver makes non-trivial calculations to convert Linux's representation of free/used memory to what Hyper-V host expects to see. Add a tracepoint to see what's being sent and where the data comes from. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
bba072d1 |
|
04-Mar-2018 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
hv_balloon: fix bugs in num_pages_onlined accounting Our num_pages_onlined accounting is buggy: 1) In case we're offlining a memory block which was present at boot (e.g. when there was no hotplug at all) we subtract 32k from 0 and as num_pages_onlined is unsigned get a very big positive number. 2) Commit 6df8d9aaf3af ("Drivers: hv: balloon: Correctly update onlined page count") made num_pages_onlined counter accurate on onlining but totally incorrect on offlining for partly populated regions: no matter how many pages were onlined and what was actually added to num_pages_onlined counter we always subtract the full region (32k) so again, num_pages_onlined can wrap around zero. By onlining/offlining the same partly populated region multiple times we can make the situation worse. Solve these issues by doing accurate accounting on offlining: walk HAS list, check for covered range and gaps. Fixes: 6df8d9aaf3af ("Drivers: hv: balloon: Correctly update onlined page count") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
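The "covered range and gaps" accounting can be sketched in userspace C as follows. This is an illustration under assumptions, not the driver's code: `struct range_sketch`, `overlap` and `pages_to_subtract` are hypothetical names, ranges are half-open PFN intervals, and the gaps are assumed not to overlap each other.

```c
#include <stddef.h>

/* Hypothetical half-open PFN range [start_pfn, end_pfn). */
struct range_sketch {
    unsigned long start_pfn;
    unsigned long end_pfn;
};

/* Number of PFNs shared by [a_start, a_end) and [b_start, b_end). */
static unsigned long overlap(unsigned long a_start, unsigned long a_end,
                             unsigned long b_start, unsigned long b_end)
{
    unsigned long s = a_start > b_start ? a_start : b_start;
    unsigned long e = a_end < b_end ? a_end : b_end;

    return e > s ? e - s : 0;
}

/*
 * Pages to subtract when a memory block goes offline: only the part
 * of the block inside the covered range, minus any gaps in that part.
 * A block outside the covered range (e.g. boot memory) yields 0, so
 * the unsigned counter cannot wrap below zero.
 */
static unsigned long pages_to_subtract(struct range_sketch block,
                                       struct range_sketch covered,
                                       const struct range_sketch *gaps,
                                       size_t n_gaps)
{
    unsigned long s = block.start_pfn > covered.start_pfn ?
                      block.start_pfn : covered.start_pfn;
    unsigned long e = block.end_pfn < covered.end_pfn ?
                      block.end_pfn : covered.end_pfn;
    unsigned long count;
    size_t i;

    if (e <= s)
        return 0;

    count = e - s;
    for (i = 0; i < n_gaps; i++)
        count -= overlap(s, e, gaps[i].start_pfn, gaps[i].end_pfn);

    return count;
}
```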
|
#
4f098af5 |
|
04-Mar-2018 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
hv_balloon: simplify hv_online_page()/hv_page_online_one() Instead of doing pfn_to_page() and continuously casting page to unsigned long just cache the pfn of the page with page_to_pfn(). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
223e1e4d |
|
04-Mar-2018 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
hv_balloon: fix printk loglevel We have a mix of different ideas of which loglevel should be used. Unify on the following: - pr_info() for normal operation - pr_warn() for 'strange' host behavior - pr_err() for all errors. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
c548f395 |
|
06-Aug-2017 |
Alex Ng <alexng@messages.microsoft.com> |
Drivers: hv: balloon: Initialize last_post_time on startup When left uninitialized, this sometimes fails the following check in post_status(): if (!time_after(now, (last_post_time + HZ))) { return; } This causes unnecessary delays in reporting memory pressure to host after booting up. Signed-off-by: Alex Ng <alexng@messages.microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
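Why a zero `last_post_time` can fail the check is easier to see with a userspace sketch of a wraparound-safe comparison. The names `time_after_sketch` and `should_post` are hypothetical, the tick values are made up for illustration, and the signed cast relies on the two's-complement behavior of common compilers.

```c
#include <limits.h>

/* Sketch of a wraparound-safe "a is after b" test on tick counters. */
static int time_after_sketch(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* Mirrors the quoted check: post only if more than hz ticks passed. */
static int should_post(unsigned long now, unsigned long last_post_time,
                       unsigned long hz)
{
    return time_after_sketch(now, last_post_time + hz);
}
```

If the tick counter is close to wrapping (for example `ULONG_MAX - 1000`) while `last_post_time` is still 0, then `last_post_time + hz` looks like a moment in the future and `should_post()` returns 0 until the counter wraps. Starting `last_post_time` from the current tick value bounds that delay to `hz` ticks.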
|
#
7b6e54b5 |
|
06-Aug-2017 |
Alex Ng <alexng@messages.microsoft.com> |
Drivers: hv: balloon: Show the max dynamic memory assigned Previously we were only showing max number of pages. We should make it more clear that this value is the max amount of dynamic memory that the Hyper-V host is willing to assign to this guest. Signed-off-by: Alex Ng <alexng@messages.microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
6df8d9aa |
|
06-Aug-2017 |
Alex Ng <alexng@messages.microsoft.com> |
Drivers: hv: balloon: Correctly update onlined page count Previously, num_pages_onlined was updated using the value from the memory online notifier. This is incorrect because the notifier assumes that all hot-added pages are online, even though we only online the amount that's backed by the host. We should update num_pages_onlined only when the balloon driver marks a page as online. Signed-off-by: Alex Ng <alexng@messages.microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
8b1f91fb |
|
04-Mar-2017 |
Stephen Hemminger <stephen@networkplumber.org> |
vmbus: remove useless return's No need for empty return at end of void function Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
ad6d4125 |
|
28-Jan-2017 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: balloon: add a fall through comment to hv_memory_notifier() Coverity scan gives a warning when there is fall through in a switch without a comment. This fall through is intentional as ol_waitevent needs to be completed to unblock hv_mem_hot_add() allowing it to process next requests regardless of whether we were able to online this block. Reported-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
85000960 |
|
06-Nov-2016 |
Alex Ng <alexng@messages.microsoft.com> |
Drivers: hv: balloon: Fix info request to show max page count Balloon driver was only printing the size of the info blob and not the actual content. This fixes it so that the info blob (max page count as configured in Hyper-V) is printed out. Signed-off-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
b3bb97b8 |
|
06-Nov-2016 |
Alex Ng <alexng@messages.microsoft.com> |
Drivers: hv: balloon: Add logging for dynamic memory operations Added logging to help troubleshoot common ballooning, hot add, and versioning issues. Signed-off-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
8ba8c0a1 |
|
06-Nov-2016 |
Alex Ng <alexng@messages.microsoft.com> |
Drivers: hv: balloon: Disable hot add when CONFIG_MEMORY_HOTPLUG is not set If the guest does not support memory hotplugging, it should respond to the host with zero pages added and successful result code. This signals to the host that hotplugging is not supported and the host will avoid sending future hot-add requests. Signed-off-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
b605c2d9 |
|
24-Aug-2016 |
Alex Ng <alexng@messages.microsoft.com> |
Drivers: hv: balloon: Use available memory value in pressure report Reports for available memory should use the si_mem_available() value. The previous freeram value does not include available page cache memory. Signed-off-by: Alex Ng <alexng@messages.microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
eece30b9 |
|
24-Aug-2016 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: balloon: replace ha_region_mutex with spinlock lockdep reports possible circular locking dependency when udev is used for memory onlining: systemd-udevd/3996 is trying to acquire lock: ((memory_chain).rwsem){++++.+}, at: [<ffffffff810d137e>] __blocking_notifier_call_chain+0x4e/0xc0 but task is already holding lock: (&dm_device.ha_region_mutex){+.+.+.}, at: [<ffffffffa015382e>] hv_memory_notifier+0x5e/0xc0 [hv_balloon] ... which is probably a false positive because we take and release ha_region_mutex from memory notifier chain depending on the arg. No real deadlocks were reported so far (though I'm not really sure about preemptible kernels...) but we don't really need to hold the mutex for so long. We use it to protect ha_region_list (and its members) and the num_pages_onlined counter. None of these operations require us to sleep and nothing is slow, switch to using spinlock with interrupts disabled. While on it, replace list_for_each -> list_for_each_entry as we actually need entries in all these cases, drop meaningless list_empty() checks. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
a132c54c |
|
24-Aug-2016 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: balloon: don't wait for ol_waitevent when memhp_auto_online is enabled With the recently introduced in-kernel memory onlining (MEMORY_HOTPLUG_DEFAULT_ONLINE) there is no point in waiting for pages to come online in the driver and we can get rid of the waiting. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
cb7a5724 |
|
24-Aug-2016 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: balloon: account for gaps in hot add regions I'm observing the following hot add requests from the WS2012 host: hot_add_req: start_pfn = 0x108200 count = 330752 hot_add_req: start_pfn = 0x158e00 count = 193536 hot_add_req: start_pfn = 0x188400 count = 239616 As the host doesn't specify hot add regions we're trying to create a 128Mb-aligned region covering the first request, we create the 0x108000 - 0x160000 region and we add 0x108000 - 0x158e00 memory. The second request passes the pfn_covered() check, we enlarge the region to 0x108000 - 0x190000 and add 0x158e00 - 0x188200 memory. The problem emerges with the third request as it starts at 0x188400 so there is a 0x200 gap which is not covered. As the end of our region is 0x190000 now it again passes the pfn_covered() check where we just adjust the covered_end_pfn and make it 0x188400 instead of 0x188200 which means that we'll try to online 0x188200-0x188400 pages but these pages were never assigned to us and we crash. We can't react to such requests by creating new hot add regions as it may happen that the whole suggested range falls into the previously identified 128Mb-aligned area so we'll end up adding nothing or create intersecting regions and our current logic doesn't allow that. Instead, create a list of such 'gaps' and check for them in the page online callback. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
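The region and gap arithmetic in this message can be checked with a short userspace sketch. It assumes 4 KiB pages, so a 128 MB chunk is 0x8000 PFNs; the macro and function names are hypothetical and do not come from the driver.

```c
/* 128 MB expressed in 4 KiB page frames (assumption: 4 KiB pages). */
#define HA_CHUNK_PFNS_SKETCH 0x8000UL

/* Round a PFN down to the start of its 128 MB chunk. */
static unsigned long align_down_pfn(unsigned long pfn)
{
    return pfn & ~(HA_CHUNK_PFNS_SKETCH - 1);
}

/* Round a PFN up to the next 128 MB chunk boundary. */
static unsigned long align_up_pfn(unsigned long pfn)
{
    return (pfn + HA_CHUNK_PFNS_SKETCH - 1) & ~(HA_CHUNK_PFNS_SKETCH - 1);
}

/* PFNs between the end of what is covered and the next request. */
static unsigned long gap_pfns(unsigned long covered_end_pfn,
                              unsigned long next_start_pfn)
{
    return next_start_pfn > covered_end_pfn ?
           next_start_pfn - covered_end_pfn : 0;
}
```

With the requests quoted above, the first request ends at 0x108200 + 330752 = 0x158e00, which gives the 0x108000 - 0x160000 region. The second ends at 0x158e00 + 193536 = 0x188200 and extends the region to 0x190000. The third starts at 0x188400, so `gap_pfns(0x188200, 0x188400)` is 0x200.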
|
#
7cf3b79e |
|
24-Aug-2016 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: balloon: keep track of where ha_region starts Windows 2012 (non-R2) does not specify hot add region in hot add requests and the logic in hot_add_req() is trying to find a 128Mb-aligned region covering the request. It may also happen that host's requests are not 128Mb aligned and the created ha_region will start before the first specified PFN. We can't online these non-present pages but we don't remember the real start of the region. This is a regression introduced by the commit 5abbbb75d733 ("Drivers: hv: hv_balloon: don't lose memory when onlining order is not natural"). While the idea of keeping the 'moving window' was wrong (as there is no guarantee that hot add requests come ordered) we should still keep track of covered_start_pfn. This is not a revert, the logic is different. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
d19a55d6 |
|
30-Apr-2016 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: balloon: reset host_specified_ha_region We set host_specified_ha_region = true on certain request but this is a global state which stays 'true' forever. We need to reset it when we receive a request where ha_region is not specified. I did not see any real issues, the bug was found by code inspection. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
77c0c973 |
|
30-Apr-2016 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: balloon: don't crash when memory is added in non-sorted order When we iterate through all HA regions in handle_pg_range() we have an assumption that all these regions are sorted in the list and the 'start_pfn >= has->end_pfn' check is enough to find the proper region. Unfortunately it's not the case with WS2016 where host can hot-add regions in a different order. We end up modifying the wrong HA region and crashing later on pages online. Modify the check to make sure we found the region we were searching for while iterating. Fix the same check in pfn_covered() as well. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
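The difference between the two checks can be sketched in userspace C. This is a simplified illustration, not the driver's list walk: `struct ha_region_sketch` and both lookup functions are hypothetical, and an array stands in for the HA region list.

```c
#include <stddef.h>

/* Hypothetical half-open region [start_pfn, end_pfn). */
struct ha_region_sketch {
    unsigned long start_pfn;
    unsigned long end_pfn;
};

/* Old-style lookup: only skips regions that end at or before pfn,
 * which is sufficient only when regions are sorted by address. */
static int find_region_assuming_sorted(const struct ha_region_sketch *r,
                                       size_t n, unsigned long pfn)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (pfn >= r[i].end_pfn)
            continue;
        return (int)i;
    }
    return -1;
}

/* Order-independent lookup: pfn must lie inside the region. */
static int find_region(const struct ha_region_sketch *r, size_t n,
                       unsigned long pfn)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (pfn < r[i].start_pfn || pfn >= r[i].end_pfn)
            continue;
        return (int)i;
    }
    return -1;
}
```

With regions stored as `{0x200000, 0x208000}` followed by `{0x100000, 0x108000}`, a lookup for PFN 0x100100 picks the first (wrong) region under the sorted assumption, while the range check finds the second one.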
|
#
b6ddeae1 |
|
01-Aug-2015 |
Alex Ng <alexng@microsoft.com> |
Drivers: hv: balloon: Enable dynamic memory protocol negotiation with Windows 10 hosts Support Win10 protocol for Dynamic Memory. This patch allows guests on Win10 hosts to hot-add memory even when dynamic memory is not enabled on the guest. Signed-off-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
4e4bd36f |
|
29-May-2015 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: balloon: check if ha_region_mutex was acquired in MEM_CANCEL_ONLINE case Memory notifiers are being executed in a sequential order and when one of them fails returning something different from NOTIFY_OK the remainder of the notification chain is not being executed. When a memory block is being onlined in online_pages() we do memory_notify(MEM_GOING_ONLINE, ) and if one of the notifiers in the chain fails we end up doing memory_notify(MEM_CANCEL_ONLINE, ) so it is possible for a notifier to see MEM_CANCEL_ONLINE without seeing the corresponding MEM_GOING_ONLINE event. E.g. when CONFIG_KASAN is enabled the kasan_mem_notifier() is being used to prevent memory hotplug, it returns NOTIFY_BAD for all MEM_GOING_ONLINE events. As kasan_mem_notifier() comes before the hv_memory_notifier() in the notification chain we don't see the MEM_GOING_ONLINE event and we do not take the ha_region_mutex. We, however, see the MEM_CANCEL_ONLINE event and unconditionally try to release the lock, the following is observed: [ 110.850927] ===================================== [ 110.850927] [ BUG: bad unlock balance detected! ] [ 110.850927] 4.1.0-rc3_bugxxxxxxx_test_xxxx #595 Not tainted [ 110.850927] ------------------------------------- [ 110.850927] systemd-udevd/920 is trying to release lock (&dm_device.ha_region_mutex) at: [ 110.850927] [<ffffffff81acda0e>] mutex_unlock+0xe/0x10 [ 110.850927] but there are no more locks to release! At the same time we can have the ha_region_mutex taken when we get the MEM_CANCEL_ONLINE event in case one of the memory notifiers after the hv_memory_notifier() in the notification chain failed so we need to add the mutex_is_locked() check. In case of MEM_ONLINE we are always supposed to have the mutex locked. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
797f88c9 |
|
31-Mar-2015 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: hv_balloon: correctly handle num_pages>INT_MAX case balloon_wrk.num_pages is __u32 and it comes from host in struct dm_balloon where it is also __u32. We, however, use 'int' in balloon_up() and in case we happen to receive a num_pages>INT_MAX request we'll end up allocating zero pages as the 'num_pages < alloc_unit' check in alloc_balloon_pages() will pass. Change num_pages type to unsigned int. In real life ballooning requests come with num_pages in the [512, 32768] range so this is more of a future-proofing/cleanup. Reported-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
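A userspace sketch of the type problem follows. The function names are hypothetical, and converting an out-of-range `uint32_t` to `int` is implementation-defined in C; the sketch assumes the usual behavior of GCC and Clang, where the value wraps to a negative number.

```c
#include <stdint.h>

/* Pre-fix shape: the host's 32-bit count is stored in an int. For
 * values above INT_MAX the int is negative on common compilers, so
 * the "request is smaller than one allocation unit" check passes. */
static int too_small_with_int(uint32_t num_pages_from_host, int alloc_unit)
{
    int num_pages = (int)num_pages_from_host;

    return num_pages < alloc_unit;
}

/* Post-fix shape: keep the count unsigned so large requests compare
 * correctly against the allocation unit. */
static int too_small_with_uint(uint32_t num_pages_from_host,
                               unsigned int alloc_unit)
{
    unsigned int num_pages = num_pages_from_host;

    return num_pages < alloc_unit;
}
```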
|
#
ba0c4441 |
|
31-Mar-2015 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: hv_balloon: correctly handle val.freeram<num_pages case 'Drivers: hv: hv_balloon: refuse to balloon below the floor' fix does not correctly handle the case when val.freeram < num_pages as val.freeram is __kernel_ulong_t and the 'val.freeram - num_pages' value will be a huge positive value instead of being negative. Usually the host doesn't ask us to balloon more than val.freeram but in case a memory hog is started after we post the last pressure report we can get into trouble. Suggested-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
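The comparison problem can be reproduced with a short userspace sketch. The function names are hypothetical and the guarded form is one way to write the check, not necessarily the exact expression used in the driver.

```c
/* Naive floor check: the subtraction wraps to a huge value when
 * freeram < num_pages, so the request is NOT refused. */
static int refuse_naive(unsigned long freeram, unsigned long num_pages,
                        unsigned long floor_pages)
{
    return freeram - num_pages < floor_pages;
}

/* Guarded floor check: test for freeram < num_pages first, so the
 * subtraction is only evaluated when it cannot wrap. */
static int refuse_guarded(unsigned long freeram, unsigned long num_pages,
                          unsigned long floor_pages)
{
    return freeram < num_pages || freeram - num_pages < floor_pages;
}
```

For `freeram = 100`, `num_pages = 200` and a floor of 50 pages, the naive check lets the request through while the guarded check refuses it.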
|
#
0a1a86ac |
|
27-Mar-2015 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: hv_balloon: survive ballooning request with num_pages=0 ... and simplify alloc_balloon_pages() interface by removing redundant alloc_error from it. If we happen to enter balloon_up() with balloon_wrk.num_pages = 0 we will enter infinite 'while (!done)' loop as alloc_balloon_pages() will be always returning 0 and not setting alloc_error. We will also be sending a meaningless message to the host on every iteration. The 'alloc_unit == 1 && alloc_error -> num_ballooned == 0' change and alloc_error elimination requires a special comment. We do alloc_balloon_pages() with 2 different alloc_unit values and there are 4 different alloc_balloon_pages() results, let's check them all. alloc_unit = 512: 1) num_ballooned = 0, alloc_error = 0: we do 'alloc_unit=1' and retry pre- and post-patch. 2) num_ballooned > 0, alloc_error = 0: we check 'num_ballooned == num_pages' and act accordingly, pre- and post-patch. 3) num_ballooned > 0, alloc_error > 0: we report this chunk and remain within the loop, no changes here. 4) num_ballooned = 0, alloc_error > 0: we do 'alloc_unit=1' and retry pre- and post-patch. alloc_unit = 1: 1) num_ballooned = 0, alloc_error = 0: this can happen in two cases: when we passed 'num_pages=0' to alloc_balloon_pages() or when there was no space in bl_resp to place a single response. The second option is not possible as bl_resp is of PAGE_SIZE size and single response 'union dm_mem_page_range' is 8 bytes, but the first one is (in theory, I think that Hyper-V host never places such requests). Pre-patch code loops forever, post-patch code sends a reply with more_pages = 0 and finishes. 2) num_ballooned > 0, alloc_error = 0: we ran out of space in bl_resp, we report partial success and remain within the loop, no changes pre- and post-patch. 3) num_ballooned > 0, alloc_error > 0: pre-patch code finishes, post-patch code does one more try and if there is no progress (we finish with 'num_ballooned = 0') we finish. So we try a bit harder with this patch. 
4) num_ballooned = 0, alloc_error > 0: both pre- and post-patch code enter 'more_pages = 0' branch and finish. So this patch has two real effects: 1) We reply with an empty response to 'num_pages=0' request. 2) We try a bit harder on alloc_unit=1 allocations (and reply with an empty tail reply in case we fail). An empty reply should be supported by host as we were able to send it even with pre-patch code when we were not able to allocate a single page. Suggested-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
7fb0e1a6 |
|
27-Mar-2015 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: hv_balloon: eliminate jumps in piecewiese linear floor function Commit 79208c57da53 ("Drivers: hv: hv_balloon: Make adjustments in computing the floor") was inaccurate as it introduced a jump in our piecewise linear 'floor' function: At 2048MB we have: Left limit: 104 + 2048/8 = 360 Right limit: 256 + 2048/16 = 384 (so the right value is 232) We now have to make an adjustment at 8192 boundary: 232 + 8192/16 = 744 512 + 8192/32 = 768 (so the right value is 488) Suggested-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
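The corrected constants can be verified with a small sketch of the three pieces named in this message. All values are in MB, `floor_mb_sketch` is a hypothetical name, and any pieces the driver applies at smaller memory sizes are not described here and are left out.

```c
/* Piecewise linear floor, in MB, using only the constants quoted in
 * the commit message (104, 232, 488). Not the driver's function. */
static unsigned long floor_mb_sketch(unsigned long total_mb)
{
    if (total_mb < 2048)
        return 104 + total_mb / 8;
    if (total_mb < 8192)
        return 232 + total_mb / 16;
    return 488 + total_mb / 32;
}
```

At 2048 MB both neighboring pieces give 360 (104 + 256 and 232 + 128), and at 8192 MB both give 744 (232 + 512 and 488 + 256), so the function no longer jumps at either boundary.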
|
#
d6cbd2c3 |
|
27-Mar-2015 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: hv_balloon: do not online pages in offline blocks Currently we add memory in 128Mb blocks but the request from host can be aligned differently. In such case we add a partially backed block and when this block goes online we skip onlining pages which are not backed (hv_online_page() callback serves this purpose). When we receive the next request for the same hot add region we online pages which were not backed before with hv_bring_pgs_online(). However, we don't check if the block in question was onlined and online this tail unconditionally. This is bad as we avoid all online_pages() logic: these pages are not accounted, we don't send notifications (and hv_balloon is not the only receiver of them),... And, first of all, nobody asked us to online these pages. Solve the issue by checking if the last previously backed page was onlined and onlining the tail only in case it was. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
5abbbb75 |
|
18-Mar-2015 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: hv_balloon: don't lose memory when onlining order is not natural Memory blocks can be onlined in random order. When this order is not natural some memory pages are not onlined because of the redundant check in hv_online_page(). Here is a real world scenario: 1) Host tries to hot-add the following (process_hot_add): pg_start=rg_start=0x48000, pfn_cnt=111616, rg_size=262144 2) This results in adding 4 memory blocks: [ 109.057866] init_memory_mapping: [mem 0x48000000-0x4fffffff] [ 114.102698] init_memory_mapping: [mem 0x50000000-0x57ffffff] [ 119.168039] init_memory_mapping: [mem 0x58000000-0x5fffffff] [ 124.233053] init_memory_mapping: [mem 0x60000000-0x67ffffff] The last one is incomplete but we have special has->covered_end_pfn counter to avoid onlining non-backed frames and hv_bring_pgs_online() function to bring them online later on. 3) Now we have 4 offline memory blocks: /sys/devices/system/memory/memory9-12 $ for f in /sys/devices/system/memory/memory*/state; do echo $f `cat $f`; done | grep -v onlin /sys/devices/system/memory/memory10/state offline /sys/devices/system/memory/memory11/state offline /sys/devices/system/memory/memory12/state offline /sys/devices/system/memory/memory9/state offline 4) We bring them online in non-natural order: $grep MemTotal /proc/meminfo MemTotal: 966348 kB $echo online > /sys/devices/system/memory/memory12/state && grep MemTotal /proc/meminfo MemTotal: 1019596 kB $echo online > /sys/devices/system/memory/memory11/state && grep MemTotal /proc/meminfo MemTotal: 1150668 kB $echo online > /sys/devices/system/memory/memory9/state && grep MemTotal /proc/meminfo MemTotal: 1150668 kB As you can see memory9 block gives us zero additional memory. We can also observe a huge discrepancy between host- and guest-reported memory sizes. The root cause of the issue is the redundant pg >= covered_start_pfn check (and covered_start_pfn advancing) in hv_online_page(). 
When an upper memory block is being onlined before the lower one (memory12 and memory11 in the above case) we advance the covered_start_pfn pointer and all memory9 pages do not pass the check. If the assumption that host always gives us requests in sequential order and pg_start always equals rg_start when the first request for the new HA region is received (that's the case in my testing) is correct then we can get rid of covered_start_pfn and pg >= start_pfn check in hv_online_page() is sufficient. The current char-next branch is broken and this patch fixes the bug. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
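The block numbers and MemTotal deltas in this scenario follow from simple arithmetic, sketched below. The sketch assumes 4 KiB pages and 128 MB memory blocks (0x8000 PFNs per block); the names are hypothetical.

```c
/* PFNs per 128 MB memory block (assumption: 4 KiB pages). */
#define BLOCK_PFNS_SKETCH 0x8000UL

/* Index of the memory block (memoryN in sysfs) containing pfn. */
static unsigned long block_of_pfn(unsigned long pfn)
{
    return pfn / BLOCK_PFNS_SKETCH;
}

/* Number of host-backed PFNs of [pg_start, pg_start + pfn_cnt) that
 * fall inside the given memory block. */
static unsigned long backed_pfns_in_block(unsigned long block,
                                          unsigned long pg_start,
                                          unsigned long pfn_cnt)
{
    unsigned long b_start = block * BLOCK_PFNS_SKETCH;
    unsigned long b_end = b_start + BLOCK_PFNS_SKETCH;
    unsigned long r_end = pg_start + pfn_cnt;
    unsigned long s = b_start > pg_start ? b_start : pg_start;
    unsigned long e = b_end < r_end ? b_end : r_end;

    return e > s ? e - s : 0;
}
```

For pg_start=0x48000 and pfn_cnt=111616 the request spans blocks 9 through 12. Block 12 holds 13312 backed pages, i.e. 53248 kB, which matches the MemTotal step from 966348 kB to 1019596 kB; a full block such as memory11 adds 32768 pages, i.e. 131072 kB.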
|
#
f3f6eb80 |
|
18-Mar-2015 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: hv_balloon: keep locks balanced on add_memory() failure When add_memory() fails the following BUG is observed: [ 743.646107] hv_balloon: hot_add memory failed error is -17 [ 743.679973] [ 743.680930] ===================================== [ 743.680930] [ BUG: bad unlock balance detected! ] [ 743.680930] 3.19.0-rc5_bug1131426+ #552 Not tainted [ 743.680930] ------------------------------------- [ 743.680930] kworker/0:2/255 is trying to release lock (&dm_device.ha_region_mutex) at: [ 743.680930] [<ffffffff81aae5fe>] mutex_unlock+0xe/0x10 [ 743.680930] but there are no more locks to release! This happens as we don't acquire ha_region_mutex and hot_add_req() expects us to as it does unconditional mutex_unlock(). Acquire the lock on the error path. The current char-next branch is broken and this patch fixes the bug. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
530d15b9 |
|
28-Feb-2015 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: hv_balloon: refuse to balloon below the floor When host asks us to balloon up we need to be sure we're not committing suicide by overballooning. Use already existent 'floor' metric as our lowest possible value for free ram. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
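The floor check in 530d15b9 amounts to clamping the host's request. A minimal sketch, assuming page counts as plain integers; `clamp_balloon_request()` and its parameters are hypothetical names, not the driver's own:

```c
#include <stdint.h>

/*
 * Return how many pages may be ballooned out without letting free
 * memory drop below the floor. All values are page counts.
 */
static uint64_t clamp_balloon_request(uint64_t free_pages,
                                      uint64_t floor_pages,
                                      uint64_t requested)
{
    if (free_pages <= floor_pages)
        return 0; /* already at or below the floor: refuse */

    uint64_t headroom = free_pages - floor_pages;
    return requested < headroom ? requested : headroom;
}
```

Comparing before subtracting avoids unsigned underflow when free memory is already below the floor.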
#
549fd280 |
|
28-Feb-2015 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: hv_balloon: report offline pages as being used When hot-added memory pages are not brought online or when some memory blocks are sent offline the subsequent ballooning process kills the guest with OOM killer. This happens as we report these pages as neither used nor free, and apparently the host algorithm considers them as being unused. Keep track of all online/offline operations and report all currently offline pages as being used so the host won't try to balloon them out. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
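The accounting idea in 549fd280 can be illustrated with a small helper that counts hot-added but not yet onlined pages as used. This is a sketch under assumptions: `committed_pages()` and its parameters are hypothetical, and the real pressure report may include further terms.

```c
#include <stdint.h>

/*
 * Pages reported to the host as committed: pages in use, pages already
 * ballooned out, and hot-added pages that are currently offline.
 */
static uint64_t committed_pages(uint64_t in_use, uint64_t ballooned,
                                uint64_t added, uint64_t onlined)
{
    /* Guard against underflow if the counters are momentarily out of step. */
    uint64_t offline = added > onlined ? added - onlined : 0;

    return in_use + ballooned + offline;
}
```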
#
b05d8d9e |
|
28-Feb-2015 |
Vitaly Kuznetsov <vkuznets@redhat.com> |
Drivers: hv: hv_balloon: eliminate the trylock path in acquire/release_region_mutex When many memory regions are being added and automatically onlined the following lockup is sometimes observed: INFO: task udevd:1872 blocked for more than 120 seconds. ... Call Trace: [<ffffffff816ec0bc>] schedule_timeout+0x22c/0x350 [<ffffffff816eb98f>] wait_for_common+0x10f/0x160 [<ffffffff81067650>] ? default_wake_function+0x0/0x20 [<ffffffff816eb9fd>] wait_for_completion+0x1d/0x20 [<ffffffff8144cb9c>] hv_memory_notifier+0xdc/0x120 [<ffffffff816f298c>] notifier_call_chain+0x4c/0x70 ... When several memory blocks are going online simultaneously we get several hv_memory_notifier() callers trying to acquire the ha_region_mutex. When this mutex is being held by hot_add_req() all these competing acquire_region_mutex() do mutex_trylock, fail, and queue themselves into wait_for_completion(..). However when we do complete() from release_region_mutex() only one of them wakes up. This could be solved by changing complete() -> complete_all(), but memory onlining can be delayed as well; in that case we can still get several hv_memory_notifier() runners at the same time trying to grab the mutex. Only one of them will succeed and the others will hang forever as complete() is not being called. We don't see this issue often because we have a 5sec onlining timeout in hv_mem_hot_add() and usually all udev events arrive in this time frame. Get rid of the trylock path; waiting on the mutex is supposed to provide the required serialization. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
b057b3ad |
|
27-Feb-2015 |
Nicholas Mc Guire <der.herr@hofr.at> |
hv: hv_balloon: match var type to return type of wait_for_completion return type of wait_for_completion_timeout is unsigned long not int, this patch changes the type of t from int to unsigned long. Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
ab3de22b |
|
10-Jan-2015 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: hv_balloon: Don't post pressure status from interrupt context We currently release memory (balloon down) in the interrupt context and we also post memory status while releasing memory. Rather than posting the status in the interrupt context, wakeup the status posting thread to post the status. This will address the inconsistent lock state that Sitsofe Wheeler <sitsofe@gmail.com> reported: http://lkml.iu.edu/hypermail/linux/kernel/1411.1/00075.html Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reported-by: Sitsofe Wheeler <sitsofe@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
22f88475 |
|
10-Jan-2015 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: hv_balloon: Fix a locking bug in the balloon driver We support memory hot-add in the Hyper-V balloon driver by hot adding an appropriately sized and aligned region and controlling the on-lining of pages within that region based on the pages that the host wants us to online. We do this because the granularity and alignment requirements in Linux are different from what Windows expects. The state to manage the onlining of pages needs to be correctly protected. Fix this bug. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
79208c57 |
|
10-Jan-2015 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: hv_balloon: Make adjustments in computing the floor Make adjustments in computing the balloon floor. The current computation of the balloon floor was not appropriate for virtual machines with more than 10 GB of assigned memory - we would get into situations where the host would aggressively balloon down the guest and leave the guest in an unusable state. This patch fixes the issue by raising the floor. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
f6712238 |
|
24-Nov-2014 |
Dexuan Cui <decui@microsoft.com> |
hv: hv_balloon: avoid memory leak on alloc_error of 2MB memory block If num_ballooned is not 0, we shouldn't neglect the already-partially-allocated 2MB memory block(s). Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
ae339336 |
|
23-Apr-2014 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: balloon: Ensure pressure reports are posted regularly The current code posts periodic memory pressure status from a dedicated thread. Under some conditions, especially when we are releasing a lot of memory into the guest, we may not send timely pressure reports back to the host. Fix this issue by reporting pressure in all contexts that can be active in this driver. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
5dba4c56 |
|
13-Feb-2014 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: Ballon: Make pressure posting thread sleep interruptibly The non-interruptible sleep of the memory pressure posting thread results in higher reported load average. Make this sleep interruptible. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
cfc25993 |
|
29-May-2013 |
Olaf Hering <olaf@aepfle.de> |
Drivers: hv: remove HV_DRV_VERSION Remove HV_DRV_VERSION, it has no meaning for upstream drivers. Initially it was supposed to show the "Linux Integration Services" version, now it is not in sync anymore with the out-of-tree drivers available from the MSFT website. The only place where a version string is still required is the KVP command "IntegrationServicesVersion" which is handled by tools/hv/hv_kvp_daemon.c. To satisfy such KVP request from the host pass the current string to the daemon during KVP userland registration. Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
20138d6c |
|
17-Jul-2013 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: balloon: Initialize the transaction ID just before sending the packet Each message sent from the guest carries with it a transaction ID. Assign the transaction ID just before putting the message on the VMBUS. This would help in debugging on the host side. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
c5e2254f |
|
14-Jul-2013 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: balloon: Do not post pressure status if interrupted When we are posting pressure status, we may get interrupted and handle the un-balloon operation. In this case just don't post the status as we know the pressure status is stale. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: Stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
ed07ec93 |
|
14-Jul-2013 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: balloon: Fix a bug in the hot-add code As we hot-add 128 MB chunks of memory, we wait to ensure that the memory is onlined before attempting to hot-add the next chunk. If the udev rule for memory hot-add is not executed within the allowed time, we would rollback the state and abort further hot-add. Since the hot-add has succeeded and the only failure is that the memory is not onlined within the allowed time, we should not be rolling back the state. Fix this bug. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: Stable <stable@vger.kernel.org> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
7f4f2302 |
|
18-Mar-2013 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: Notify the host of permanent hot-add failures If memory hot-add fails with the error -EEXIST, then this is a permanent failure. Notify the host of this information, so the host will not attempt hot-add again. If the failure were a transient failure, host will attempt a hot-add after some delay. In this version of the patch, I have added some additional comments to clarify how the host treats different failure conditions. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
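The distinction drawn in 7f4f2302 between permanent and transient hot-add failures can be sketched as a small classifier. The enum and `classify_hot_add_result()` are hypothetical names used only for illustration; only the special treatment of -EEXIST comes from the commit message.

```c
#include <errno.h>

enum hot_add_verdict {
    HOT_ADD_OK,          /* hot-add succeeded */
    HOT_ADD_RETRY_LATER, /* transient failure: host may retry after a delay */
    HOT_ADD_GIVE_UP      /* permanent failure: tell the host not to retry */
};

/* Map a kernel-style negative errno from the hot-add path to a verdict. */
static enum hot_add_verdict classify_hot_add_result(int err)
{
    if (err == 0)
        return HOT_ADD_OK;
    if (err == -EEXIST)
        return HOT_ADD_GIVE_UP;
    return HOT_ADD_RETRY_LATER;
}
```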
#
f766dc1e |
|
18-Mar-2013 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: balloon: Support 2M page allocations for ballooning On Hyper-V it will be very efficient to use 2M allocations in the guest as this makes the ballooning protocol with the host that much more efficient. Hyper-V uses page ranges (start pfn : number of pages) to specify memory being moved around and with 2M pages this encoding can be very efficient. However, when memory is returned to the guest, the host does not guarantee any granularity. To deal with this issue, split the page soon after a successful 2M allocation so that this memory can potentially be freed as 4K pages. If 2M allocations fail, we revert to 4K allocations. In this version of the patch, based on the feedback from Michal Hocko <mhocko@suse.cz>, I have added some additional commentary to the patch description. Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
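The efficiency argument in f766dc1e follows from the (start pfn : number of pages) encoding. A rough sketch, assuming 4 KiB base pages and the worst case where separate allocations are never physically contiguous; `struct page_range`, `describe_alloc()` and `ranges_needed()` are hypothetical names:

```c
#include <stdint.h>

#define PAGES_PER_2M_ALLOC 512u /* 2 MiB / 4 KiB */

/* One entry of the (start pfn : number of pages) encoding. */
struct page_range {
    uint64_t start_pfn;
    uint64_t num_pages;
};

/* Describe a single allocation of pages_per_alloc contiguous pages. */
static struct page_range describe_alloc(uint64_t start_pfn,
                                        uint64_t pages_per_alloc)
{
    struct page_range r = { start_pfn, pages_per_alloc };
    return r;
}

/* Ranges needed to report total_pages, one range per allocation. */
static uint64_t ranges_needed(uint64_t total_pages, uint64_t pages_per_alloc)
{
    return (total_pages + pages_per_alloc - 1) / pages_per_alloc;
}
```

Under these assumptions, ballooning 1024 pages takes 2 ranges with 2M allocations and up to 1024 ranges with 4K allocations.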
#
647965a2 |
|
29-Mar-2013 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: balloon: Permit Linux to specify hot-add alignment requirements Some Windows hosts permit the guest to specify memory hot-add alignment requirements (if any). Linux currently requires a 128MB alignment on memory segments that can be hot-added. Specify this alignment requirement to the host. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
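The 128MB alignment requirement in 647965a2 can be expressed in page frame numbers. A minimal sketch, assuming 4 KiB pages; the macro and helper names are hypothetical:

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096ull
/* Pages in one 128 MiB hot-add chunk: 32768 with 4 KiB pages. */
#define HA_CHUNK_PAGES ((128ull << 20) / PAGE_SIZE_BYTES)

/* Round a pfn down to the start of its 128 MiB chunk. */
static uint64_t ha_align_down(uint64_t pfn)
{
    return pfn - (pfn % HA_CHUNK_PAGES);
}

/* Round a pfn up to the next 128 MiB boundary (no-op if already aligned). */
static uint64_t ha_align_up(uint64_t pfn)
{
    return ha_align_down(pfn + HA_CHUNK_PAGES - 1);
}
```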
#
a6025a2a |
|
20-Mar-2013 |
Wei Yongjun <yongjun_wei@trendmicro.com.cn> |
Drivers: hv: balloon: make local functions static local functions that could be static. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
1cac8cd4 |
|
15-Mar-2013 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: balloon: Implement hot-add functionality Implement the memory hot-add functionality. With this, Linux guests can fully participate in the Dynamic Memory protocol implemented in the Windows hosts. In this version of the patch, based on Olaf Hering's feedback, I have gotten rid of the module level dependency on MEMORY_HOTPLUG. Instead the code within the driver that depends on MEMORY_HOTPLUG has the appropriate compilation switches. This would allow this driver to support pure ballooning in cases where the kernel does not support memory hotplug. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
0cf40a3e |
|
15-Mar-2013 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: balloon: Make the balloon driver not unloadable The balloon driver is stateful. For instance, it needs to keep track of pages that have been ballooned out to properly post pressure reports. This state cannot be re-constructed if the driver were to be unloaded and subsequently loaded. Furthermore, as we support memory hot-add as part of this driver, this driver becomes even more stateful and this state cannot be re-created. Make the balloon driver not unloadable to deal with this issue. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
c51af826 |
|
15-Mar-2013 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: balloon: Execute hot-add code in a separate context Execute the hot-add operation in a separate work context. This allows us to decouple the pressure reporting activity from the "hot-add" activity. Testing has shown that this makes the guest more responsive to hot add requests. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
6571b2da |
|
15-Mar-2013 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: balloon: Execute balloon inflation in a separate context Execute the balloon inflation operation in a separate work context. This allows us to decouple the pressure reporting activity from the ballooning activity. Testing has shown that this decoupling makes the guest more responsive. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
7a64b864 |
|
15-Mar-2013 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: balloon: Do not request completion notification There is no need to request completion notification; get rid of it. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
1c7db96f |
|
08-Feb-2013 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: balloon: Prevent the host from ballooning the guest too low Based on the amount of memory being managed set a floor on how low the guest can be ballooned. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
e500d158 |
|
08-Feb-2013 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: balloon: Add a parameter to delay pressure reporting Delay reporting memory pressure by a specified amount of time. This addresses the problem where the host may take memory balancing decisions based on incorrect memory pressure data that will be posted as soon as the balloon driver is loaded. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
0731572b |
|
25-Jan-2013 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: balloon: Make adjustments to the pressure report The host expects that the pressure report includes the pressure due to the pages that have been ballooned. Make necessary adjustments to reflect that. Also, include the free memory information in the pressure report. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
d13984e5 |
|
23-Jan-2013 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: Use consolidated GUID definitions Use the consolidated GUID definitions in the util and balloon drivers. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
33080c1c |
|
11-Dec-2012 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: balloon: Fix a memory leak The send buffer was being leaked; fix it. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Reported-by: Jason Wang <jasowang@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
6427a0d7 |
|
06-Dec-2012 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: balloon: Fix a bug in the definition of struct dm_info_msg There is a bug in the definition of struct dm_info_msg. This patch fixes the definition of this structure and makes the corresponding adjustments. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
10d498b1 |
|
28-Nov-2012 |
Wei Yongjun <yongjun_wei@trendmicro.com.cn> |
hv: hv_balloon: remove duplicated include from hv_balloon.c Remove duplicated include. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
989623c7 |
|
21-Nov-2012 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
hv: hv_balloon: mark a function static This resolves the following sparse warning: drivers/hv/hv_balloon.c:548:6: sparse: symbol 'free_balloon_pages' was not declared. Should it be static? Reported-by: Xie ChanglongX <changlongx.xie@intel.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
9aa8b50b |
|
14-Nov-2012 |
K. Y. Srinivasan <kys@microsoft.com> |
Drivers: hv: Add Hyper-V balloon driver Add the basic balloon driver. Windows hosts dynamically manage the guest memory allocation via a combination of memory hot add and ballooning. Memory hot add is used to grow the guest memory up to the maximum memory that can be allocated to the guest. Ballooning is used to both shrink as well as expand up to the max memory. Supporting hot add needs additional support from the host. We will support hot add when this support is available. For now, by setting the VM startup memory to the VM max memory, we can use ballooning alone to dynamically manage memory allocation amongst competing guests on a given host. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|