Searched hist:592 (Results 51 - 75 of 377) sorted by relevance
/linux-master/drivers/mtd/ubi/ | ||
H A D | upd.c | diff 9cd9a21c Wed Feb 22 09:15:21 MST 2017 Sebastian Siewior <bigeasy@linutronix.de> ubi/upd: Always flush after prepared for an update In commit 6afaf8a484cb ("UBI: flush wl before clearing update marker") I managed to trigger and fix a similar bug. Now here is another version of which I assumed it wouldn't matter back then but it turns out UBI has a check for it and will error out like this: |ubi0 warning: validate_vid_hdr: inconsistent used_ebs |ubi0 error: validate_vid_hdr: inconsistent VID header at PEB 592 All you need to trigger this is? "ubiupdatevol /dev/ubi0_0 file" + a powercut in the middle of the operation. ubi_start_update() sets the update-marker and puts all EBs on the erase list. After that userland can proceed to write new data while the old EB aren't erased completely. A powercut at this point is usually not that much of a tragedy. UBI won't give read access to the static volume because it has the update marker. It will most likely set the corrupted flag because it misses some EBs. So we are all good. Unless the size of the image that has been written differs from the old image in the magnitude of at least one EB. In that case UBI will find two different values for `used_ebs' and refuse to attach the image with the error message mentioned above. So in order not to get in the situation, the patch will ensure that we wait until everything is removed before it tries to write any data. The alternative would be to detect such a case and remove all EBs at the attached time after we processed the volume-table and see the update-marker set. The patch looks bigger and I doubt it is worth it since usually the write() will wait from time to time for a new EB since usually there not that many spare EB that can be used. Cc: stable@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at> |
/linux-master/arch/mips/include/asm/ | ||
H A D | time.h | diff 592e527f Sun Apr 05 16:31:22 MDT 2009 Ralf Baechle <ralf@linux-mips.org> MIPS: Fix build error if CONFIG_CEVT_R4K is undefined. Introduced by 99aa5029937ee926e3b249369e208d7013cd381b. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> |
/linux-master/drivers/input/mouse/ | ||
H A D | logips2pp.c | diff 592c352b Wed Mar 22 16:27:53 MDT 2017 Dmitry Torokhov <dmitry.torokhov@gmail.com> Input: logips2pp - clean up code - switch to using BIT() macros - use u8 instead of unsigned char for byte data - use input_set_capability() instead of manipulating capabilities bits directly - use sign_extend32() when extracting wheel data. - do not abuse -1 as error code, propagate errors from various calls. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> |
/linux-master/include/net/ | ||
H A D | bpf_sk_storage.h | diff 592a3498 Thu Sep 24 18:04:02 MDT 2020 Martin KaFai Lau <kafai@fb.com> bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON This patch changes the bpf_sk_storage_*() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. A micro benchmark has been done on a "cgroup_skb/egress" bpf program which does a bpf_sk_storage_get(). It was driven by netperf doing a 4096 connected UDP_STREAM test with 64bytes packet. The stats from "kernel.bpf_stats_enabled" shows no meaningful difference. The sk_storage_get_btf_proto, sk_storage_delete_btf_proto, btf_sk_storage_get_proto, and btf_sk_storage_delete_proto are no longer needed, so they are removed. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20200925000402.3856307-1-kafai@fb.com |
/linux-master/drivers/ps3/ | ||
H A D | ps3-vuart.c | diff b17b3df1 Tue Jan 13 12:59:41 MST 2009 Stephen Rothwell <sfr@canb.auug.org.au> powerpc/ps3: The lv1_ routines have u64 parameters We just fix up the reference parameters as the others are dealt with by arithmetic promotion rules and don't cause warnings. This removes warnings like this: arch/powerpc/platforms/ps3/interrupt.c:327: warning: passing argument 1 of 'lv1_construct_event_receive_port' from incompatible pointer type Also, these: drivers/ps3/ps3-vuart.c:462: warning: passing argument 4 of 'ps3_vuart_raw_read' from incompatible pointer type drivers/ps3/ps3-vuart.c:592: warning: passing argument 4 of 'ps3_vuart_raw_read' from incompatible pointer type Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> |
/linux-master/drivers/staging/vt6656/ | ||
H A D | usbpipe.h | diff 592ccfeb Fri Apr 16 21:07:42 MDT 2010 Andres More <more.andres@gmail.com> Staging: vt6656: Removed IN definition Code cleanup, removed empty IN definition used to denote input parameters. Signed-off-by: Andres More <more.andres@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> |
H A D | wcmd.h | diff 592ccfeb Fri Apr 16 21:07:42 MDT 2010 Andres More <more.andres@gmail.com> Staging: vt6656: Removed IN definition Code cleanup, removed empty IN definition used to denote input parameters. Signed-off-by: Andres More <more.andres@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> |
H A D | rf.h | diff 592ccfeb Fri Apr 16 21:07:42 MDT 2010 Andres More <more.andres@gmail.com> Staging: vt6656: Removed IN definition Code cleanup, removed empty IN definition used to denote input parameters. Signed-off-by: Andres More <more.andres@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> |
H A D | key.h | diff 592ccfeb Fri Apr 16 21:07:42 MDT 2010 Andres More <more.andres@gmail.com> Staging: vt6656: Removed IN definition Code cleanup, removed empty IN definition used to denote input parameters. Signed-off-by: Andres More <more.andres@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> |
H A D | channel.c | diff 592ccfeb Fri Apr 16 21:07:42 MDT 2010 Andres More <more.andres@gmail.com> Staging: vt6656: Removed IN definition Code cleanup, removed empty IN definition used to denote input parameters. Signed-off-by: Andres More <more.andres@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> |
/linux-master/net/ax25/ | ||
H A D | ax25_iface.c | diff 9ac0be9d Tue Aug 14 18:24:05 MDT 2007 Alexey Dobriyan <adobriyan@gmail.com> [AX25]: don't free pointers to statically allocated data commit 8d5cf596d10d740b69b5f4bbdb54b85abf75810d started to add statically allocated ax25_protocol's to list. However kfree() was still in place waiting for unsuspecting ones on module removal. Steps to reproduce: modprobe netrom rmmod netrom P.S.: code would benefit greatly from list_add/list_del usage kernel BUG at mm/slab.c:592! invalid opcode: 0000 [1] PREEMPT SMP CPU 0 Modules linked in: netrom ax25 af_packet usbcore rtc_cmos rtc_core rtc_lib Pid: 4477, comm: rmmod Not tainted 2.6.23-rc3-bloat #2 RIP: 0010:[<ffffffff802ac646>] [<ffffffff802ac646>] kfree+0x1c6/0x260 RSP: 0000:ffff810079a05e48 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff81000000c000 RDX: ffff81007e552458 RSI: 0000000000000000 RDI: 000000000000805d RBP: ffff810079a05e88 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8805d080 R13: ffffffff8805d080 R14: 0000000000000000 R15: 0000000000000282 FS: 00002b73fc98aae0(0000) GS:ffffffff805dc000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 000000000053f3b8 CR3: 0000000079ff2000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process rmmod (pid: 4477, threadinfo ffff810079a04000, task ffff8100775aa480) Stack: ffff810079a05e68 0000000000000246 ffffffff8804eca0 0000000000000000 ffffffff8805d080 00000000000000cf 0000000000000000 0000000000000880 ffff810079a05eb8 ffffffff8803ec90 ffff810079a05eb8 0000000000000000 Call Trace: [<ffffffff8803ec90>] :ax25:ax25_protocol_release+0xa0/0xb0 [<ffffffff88056ecb>] :netrom:nr_exit+0x6b/0xf0 [<ffffffff80268bf0>] sys_delete_module+0x170/0x1f0 [<ffffffff8025da35>] trace_hardirqs_on+0xd5/0x170 [<ffffffff804835aa>] trace_hardirqs_on_thunk+0x35/0x37 [<ffffffff8020c13e>] system_call+0x7e/0x83 Code: 0f 0b eb fe 66 66 90 66 66 90 48 8b 52 10 48 8b 02 25 00 40 RIP [<ffffffff802ac646>] kfree+0x1c6/0x260 RSP <ffff810079a05e48> Kernel panic - not syncing: Fatal exception Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
/linux-master/drivers/gpu/drm/arm/ | ||
H A D | malidp_regs.h | diff 592d8c8c Mon Jan 23 06:46:41 MST 2017 Mihail Atanassov <mihail.atanassov@arm.com> drm: mali-dp: Check hw version matches device-tree Refuse to bind if the device-tree compatible string lists a different hardware version. Reviewed-by: Brian Starkey <brian.starkey@arm.com> Signed-off-by: Mihail Atanassov <mihail.atanassov@arm.com> Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com> |
/linux-master/tools/lib/api/fs/ | ||
H A D | tracing_path.h | 592d5a6b Wed Sep 02 01:56:34 MDT 2015 Jiri Olsa <jolsa@kernel.org> tools lib api fs: Move tracing_path interface into api/fs/tracing_path.c Moving tracing_path interface into api/fs/tracing_path.c out of util.c. It seems generic enough to be used by others, and I couldn't think of better place. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Reviewed-by: Matt Fleming <matt.fleming@intel.com> Reviewed-by: Raphael Beamonte <raphael.beamonte@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1441180605-24737-5-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
/linux-master/Documentation/devicetree/bindings/dma/ | ||
H A D | qcom,gpi.yaml | diff 7495a5bb Wed Apr 13 12:42:35 MDT 2022 Vinod Koul <vkoul@kernel.org> dt-bindings: dmaengine: qcom: gpi: Add minItems for interrupts Add the minItems for interrupts property as well. In the absence of this, we get warning if interrupts are less than 13 arch/arm64/boot/dts/qcom/qrb5165-rb5.dtb: dma-controller@800000: interrupts: [[0, 588, 4], [0, 589, 4], [0, 590, 4], [0, 591, 4], [0, 592, 4], [0, 593, 4], [0, 594, 4], [0, 595, 4], [0, 596, 4], [0, 597, 4]] is too short Signed-off-by: Vinod Koul <vkoul@kernel.org> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20220414064235.1182195-1-vkoul@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org> |
/linux-master/fs/erofs/ | ||
H A D | namei.c | diff 592e7cd0 Mon Jul 13 07:09:44 MDT 2020 Alexander A. Klimov <grandmaster@al2klimov.de> erofs: Replace HTTP links with HTTPS ones Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`: If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Reviewed-by: Gao Xiang <hsiangkao@redhat.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Link: https://lore.kernel.org/r/20200713130944.34419-1-grandmaster@al2klimov.de Signed-off-by: Gao Xiang <hsiangkao@redhat.com> |
H A D | compress.h | diff 592e7cd0 Mon Jul 13 07:09:44 MDT 2020 Alexander A. Klimov <grandmaster@al2klimov.de> erofs: Replace HTTP links with HTTPS ones Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`: If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Reviewed-by: Gao Xiang <hsiangkao@redhat.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Link: https://lore.kernel.org/r/20200713130944.34419-1-grandmaster@al2klimov.de Signed-off-by: Gao Xiang <hsiangkao@redhat.com> |
/linux-master/include/linux/ | ||
H A D | find.h | diff fdae96a3 Mon Sep 19 15:05:58 MDT 2022 Yury Norov <yury.norov@gmail.com> lib/find: optimize for_each() macros Moving an iterator of the macros inside conditional part of for-loop helps to generate a better code. It had been first implemented in commit 7baac8b91f9871ba ("cpumask: make for_each_cpu_mask a bit smaller"). Now that cpumask for-loops are the aliases to bitmap loops, it's worth to optimize them the same way. Bloat-o-meter says: add/remove: 8/12 grow/shrink: 147/592 up/down: 4876/-24416 (-19540) Signed-off-by: Yury Norov <yury.norov@gmail.com> |
/linux-master/drivers/pnp/ | ||
H A D | interface.c | diff e7d2860b Mon Dec 14 19:01:06 MST 2009 André Goddard Rosa <andre.goddard@gmail.com> tree-wide: convert open calls to remove spaces to skip_spaces() lib function Makes use of skip_spaces() defined in lib/string.c for removing leading spaces from strings all over the tree. It decreases lib.a code size by 47 bytes and reuses the function tree-wide: text data bss dec hex filename 64688 584 592 65864 10148 (TOTALS-BEFORE) 64641 584 592 65817 10119 (TOTALS-AFTER) Also, while at it, if we see (*str && isspace(*str)), we can be sure to remove the first condition (*str) as the second one (isspace(*str)) also evaluates to 0 whenever *str == 0, making it redundant. In other words, "a char equals zero is never a space". Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below, and found occurrences of this pattern on 3 more files: drivers/leds/led-class.c drivers/leds/ledtrig-timer.c drivers/video/output.c @@ expression str; @@ ( // ignore skip_spaces cases while (*str && isspace(*str)) { \(str++;\|++str;\) } | - *str && isspace(*str) ) Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Cc: Julia Lawall <julia@diku.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Neil Brown <neilb@suse.de> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: David Howells <dhowells@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff e7d2860b Mon Dec 14 19:01:06 MST 2009 André Goddard Rosa <andre.goddard@gmail.com> tree-wide: convert open calls to remove spaces to skip_spaces() lib function Makes use of skip_spaces() defined in lib/string.c for removing leading spaces from strings all over the tree. It decreases lib.a code size by 47 bytes and reuses the function tree-wide: text data bss dec hex filename 64688 584 592 65864 10148 (TOTALS-BEFORE) 64641 584 592 65817 10119 (TOTALS-AFTER) Also, while at it, if we see (*str && isspace(*str)), we can be sure to remove the first condition (*str) as the second one (isspace(*str)) also evaluates to 0 whenever *str == 0, making it redundant. In other words, "a char equals zero is never a space". Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below, and found occurrences of this pattern on 3 more files: drivers/leds/led-class.c drivers/leds/ledtrig-timer.c drivers/video/output.c @@ expression str; @@ ( // ignore skip_spaces cases while (*str && isspace(*str)) { \(str++;\|++str;\) } | - *str && isspace(*str) ) Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Cc: Julia Lawall <julia@diku.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Neil Brown <neilb@suse.de> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: David Howells <dhowells@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
/linux-master/drivers/video/backlight/ | ||
H A D | lcd.c | diff e7d2860b Mon Dec 14 19:01:06 MST 2009 André Goddard Rosa <andre.goddard@gmail.com> tree-wide: convert open calls to remove spaces to skip_spaces() lib function Makes use of skip_spaces() defined in lib/string.c for removing leading spaces from strings all over the tree. It decreases lib.a code size by 47 bytes and reuses the function tree-wide: text data bss dec hex filename 64688 584 592 65864 10148 (TOTALS-BEFORE) 64641 584 592 65817 10119 (TOTALS-AFTER) Also, while at it, if we see (*str && isspace(*str)), we can be sure to remove the first condition (*str) as the second one (isspace(*str)) also evaluates to 0 whenever *str == 0, making it redundant. In other words, "a char equals zero is never a space". Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below, and found occurrences of this pattern on 3 more files: drivers/leds/led-class.c drivers/leds/ledtrig-timer.c drivers/video/output.c @@ expression str; @@ ( // ignore skip_spaces cases while (*str && isspace(*str)) { \(str++;\|++str;\) } | - *str && isspace(*str) ) Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Cc: Julia Lawall <julia@diku.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Neil Brown <neilb@suse.de> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: David Howells <dhowells@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff e7d2860b Mon Dec 14 19:01:06 MST 2009 André Goddard Rosa <andre.goddard@gmail.com> tree-wide: convert open calls to remove spaces to skip_spaces() lib function Makes use of skip_spaces() defined in lib/string.c for removing leading spaces from strings all over the tree. It decreases lib.a code size by 47 bytes and reuses the function tree-wide: text data bss dec hex filename 64688 584 592 65864 10148 (TOTALS-BEFORE) 64641 584 592 65817 10119 (TOTALS-AFTER) Also, while at it, if we see (*str && isspace(*str)), we can be sure to remove the first condition (*str) as the second one (isspace(*str)) also evaluates to 0 whenever *str == 0, making it redundant. In other words, "a char equals zero is never a space". Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below, and found occurrences of this pattern on 3 more files: drivers/leds/led-class.c drivers/leds/ledtrig-timer.c drivers/video/output.c @@ expression str; @@ ( // ignore skip_spaces cases while (*str && isspace(*str)) { \(str++;\|++str;\) } | - *str && isspace(*str) ) Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Cc: Julia Lawall <julia@diku.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Neil Brown <neilb@suse.de> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: David Howells <dhowells@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
/linux-master/mm/kfence/ | ||
H A D | kfence_test.c | diff 3cb1c962 Tue Mar 22 15:48:22 MDT 2022 Peng Liu <liupeng256@huawei.com> kfence: test: try to avoid test_gfpzero trigger rcu_stall When CONFIG_KFENCE_NUM_OBJECTS is set to a big number, kfence kunit-test-case test_gfpzero will eat up nearly all the CPU's resources and rcu_stall is reported as the following log which is cut from a physical server. rcu: INFO: rcu_sched self-detected stall on CPU rcu: 68-....: (14422 ticks this GP) idle=6ce/1/0x4000000000000002 softirq=592/592 fqs=7500 (t=15004 jiffies g=10677 q=20019) Task dump for CPU 68: task:kunit_try_catch state:R running task stack: 0 pid: 9728 ppid: 2 flags:0x0000020a Call trace: dump_backtrace+0x0/0x1e4 show_stack+0x20/0x2c sched_show_task+0x148/0x170 ... rcu_sched_clock_irq+0x70/0x180 update_process_times+0x68/0xb0 tick_sched_handle+0x38/0x74 ... gic_handle_irq+0x78/0x2c0 el1_irq+0xb8/0x140 kfree+0xd8/0x53c test_alloc+0x264/0x310 [kfence_test] test_gfpzero+0xf4/0x840 [kfence_test] kunit_try_run_case+0x48/0x20c kunit_generic_run_threadfn_adapter+0x28/0x34 kthread+0x108/0x13c ret_from_fork+0x10/0x18 To avoid rcu_stall and unacceptable latency, a schedule point is added to test_gfpzero. Link: https://lkml.kernel.org/r/20220309083753.1561921-4-liupeng256@huawei.com Signed-off-by: Peng Liu <liupeng256@huawei.com> Reviewed-by: Marco Elver <elver@google.com> Tested-by: Brendan Higgins <brendanhiggins@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Wang Kefeng <wangkefeng.wang@huawei.com> Cc: Daniel Latypov <dlatypov@google.com> Cc: David Gow <davidgow@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff 3cb1c962 Tue Mar 22 15:48:22 MDT 2022 Peng Liu <liupeng256@huawei.com> kfence: test: try to avoid test_gfpzero trigger rcu_stall When CONFIG_KFENCE_NUM_OBJECTS is set to a big number, kfence kunit-test-case test_gfpzero will eat up nearly all the CPU's resources and rcu_stall is reported as the following log which is cut from a physical server. rcu: INFO: rcu_sched self-detected stall on CPU rcu: 68-....: (14422 ticks this GP) idle=6ce/1/0x4000000000000002 softirq=592/592 fqs=7500 (t=15004 jiffies g=10677 q=20019) Task dump for CPU 68: task:kunit_try_catch state:R running task stack: 0 pid: 9728 ppid: 2 flags:0x0000020a Call trace: dump_backtrace+0x0/0x1e4 show_stack+0x20/0x2c sched_show_task+0x148/0x170 ... rcu_sched_clock_irq+0x70/0x180 update_process_times+0x68/0xb0 tick_sched_handle+0x38/0x74 ... gic_handle_irq+0x78/0x2c0 el1_irq+0xb8/0x140 kfree+0xd8/0x53c test_alloc+0x264/0x310 [kfence_test] test_gfpzero+0xf4/0x840 [kfence_test] kunit_try_run_case+0x48/0x20c kunit_generic_run_threadfn_adapter+0x28/0x34 kthread+0x108/0x13c ret_from_fork+0x10/0x18 To avoid rcu_stall and unacceptable latency, a schedule point is added to test_gfpzero. Link: https://lkml.kernel.org/r/20220309083753.1561921-4-liupeng256@huawei.com Signed-off-by: Peng Liu <liupeng256@huawei.com> Reviewed-by: Marco Elver <elver@google.com> Tested-by: Brendan Higgins <brendanhiggins@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Wang Kefeng <wangkefeng.wang@huawei.com> Cc: Daniel Latypov <dlatypov@google.com> Cc: David Gow <davidgow@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
/linux-master/arch/x86/kernel/cpu/mtrr/ | ||
H A D | if.c | diff e7d2860b Mon Dec 14 19:01:06 MST 2009 André Goddard Rosa <andre.goddard@gmail.com> tree-wide: convert open calls to remove spaces to skip_spaces() lib function Makes use of skip_spaces() defined in lib/string.c for removing leading spaces from strings all over the tree. It decreases lib.a code size by 47 bytes and reuses the function tree-wide: text data bss dec hex filename 64688 584 592 65864 10148 (TOTALS-BEFORE) 64641 584 592 65817 10119 (TOTALS-AFTER) Also, while at it, if we see (*str && isspace(*str)), we can be sure to remove the first condition (*str) as the second one (isspace(*str)) also evaluates to 0 whenever *str == 0, making it redundant. In other words, "a char equals zero is never a space". Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below, and found occurrences of this pattern on 3 more files: drivers/leds/led-class.c drivers/leds/ledtrig-timer.c drivers/video/output.c @@ expression str; @@ ( // ignore skip_spaces cases while (*str && isspace(*str)) { \(str++;\|++str;\) } | - *str && isspace(*str) ) Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Cc: Julia Lawall <julia@diku.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Neil Brown <neilb@suse.de> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: David Howells <dhowells@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff e7d2860b Mon Dec 14 19:01:06 MST 2009 André Goddard Rosa <andre.goddard@gmail.com> tree-wide: convert open calls to remove spaces to skip_spaces() lib function Makes use of skip_spaces() defined in lib/string.c for removing leading spaces from strings all over the tree. It decreases lib.a code size by 47 bytes and reuses the function tree-wide: text data bss dec hex filename 64688 584 592 65864 10148 (TOTALS-BEFORE) 64641 584 592 65817 10119 (TOTALS-AFTER) Also, while at it, if we see (*str && isspace(*str)), we can be sure to remove the first condition (*str) as the second one (isspace(*str)) also evaluates to 0 whenever *str == 0, making it redundant. In other words, "a char equals zero is never a space". Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below, and found occurrences of this pattern on 3 more files: drivers/leds/led-class.c drivers/leds/ledtrig-timer.c drivers/video/output.c @@ expression str; @@ ( // ignore skip_spaces cases while (*str && isspace(*str)) { \(str++;\|++str;\) } | - *str && isspace(*str) ) Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Cc: Julia Lawall <julia@diku.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Neil Brown <neilb@suse.de> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: David Howells <dhowells@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
/linux-master/arch/arm64/boot/dts/hisilicon/ | ||
H A D | hi3660.dtsi | diff 9a9760de Wed Dec 13 07:21:06 MST 2017 Valentin Schneider <valentin.schneider@arm.com> arm64: dts: hisilicon: Add hi3660 cpu capacity-dmips-mhz information The following dt entries are added: cpus [0-3] (Cortex A53): - capacity-dmips-mhz = <592>; cpus [4-7] (Cortex A73): - capacity-dmips-mhz = <1024>; Those values were obtained by running dhrystone 2.1 on a HiKey960 with the following procedure: - Offline all CPUs but CPU0 (A53) - Set CPU0 frequency to maximum - Run Dhrystone 2.1 for 20 seconds - Offline all CPUs but CPU4 (A73) - set CPU4 frequency to maximum - Run Dhrystone 2.1 for 20 seconds The results are as follows: A53: 129633887 loops A73: 287034147 loops By scaling those values so that the A73s use 1024, we end up with 462 for the A53s. However, they have different maximum frequencies: 1.844GHz for A53s and 2.362GHz for A73s. Thus, we can scale the A53 value to truly represent dmips per MHz, and we end up with 592. The impact of this change can be verified on HiKey960: $ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq 1844000 1844000 1844000 1844000 2362000 2362000 2362000 2362000 $ cat /sys/devices/system/cpu/cpu*/cpu_capacity 462 462 462 462 1024 1024 1024 1024 Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Wei Xu <xuwei5@hisilicon.com> diff 9a9760de Wed Dec 13 07:21:06 MST 2017 Valentin Schneider <valentin.schneider@arm.com> arm64: dts: hisilicon: Add hi3660 cpu capacity-dmips-mhz information The following dt entries are added: cpus [0-3] (Cortex A53): - capacity-dmips-mhz = <592>; cpus [4-7] (Cortex A73): - capacity-dmips-mhz = <1024>; Those values were obtained by running dhrystone 2.1 on a HiKey960 with the following procedure: - Offline all CPUs but CPU0 (A53) - Set CPU0 frequency to maximum - Run Dhrystone 2.1 for 20 seconds - Offline all CPUs but CPU4 (A73) - set CPU4 frequency to maximum - Run Dhrystone 2.1 for 20 seconds The results are as follows: A53: 129633887 loops A73: 287034147 loops By scaling those values so that the A73s use 1024, we end up with 462 for the A53s. However, they have different maximum frequencies: 1.844GHz for A53s and 2.362GHz for A73s. Thus, we can scale the A53 value to truly represent dmips per MHz, and we end up with 592. The impact of this change can be verified on HiKey960: $ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq 1844000 1844000 1844000 1844000 2362000 2362000 2362000 2362000 $ cat /sys/devices/system/cpu/cpu*/cpu_capacity 462 462 462 462 1024 1024 1024 1024 Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Wei Xu <xuwei5@hisilicon.com> |
/linux-master/scripts/dtc/include-prefixes/arm64/hisilicon/ | ||
H A D | hi3660.dtsi | diff 9a9760de Wed Dec 13 07:21:06 MST 2017 Valentin Schneider <valentin.schneider@arm.com> arm64: dts: hisilicon: Add hi3660 cpu capacity-dmips-mhz information The following dt entries are added: cpus [0-3] (Cortex A53): - capacity-dmips-mhz = <592>; cpus [4-7] (Cortex A73): - capacity-dmips-mhz = <1024>; Those values were obtained by running dhrystone 2.1 on a HiKey960 with the following procedure: - Offline all CPUs but CPU0 (A53) - Set CPU0 frequency to maximum - Run Dhrystone 2.1 for 20 seconds - Offline all CPUs but CPU4 (A73) - set CPU4 frequency to maximum - Run Dhrystone 2.1 for 20 seconds The results are as follows: A53: 129633887 loops A73: 287034147 loops By scaling those values so that the A73s use 1024, we end up with 462 for the A53s. However, they have different maximum frequencies: 1.844GHz for A53s and 2.362GHz for A73s. Thus, we can scale the A53 value to truly represent dmips per MHz, and we end up with 592. The impact of this change can be verified on HiKey960: $ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq 1844000 1844000 1844000 1844000 2362000 2362000 2362000 2362000 $ cat /sys/devices/system/cpu/cpu*/cpu_capacity 462 462 462 462 1024 1024 1024 1024 Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Wei Xu <xuwei5@hisilicon.com> diff 9a9760de Wed Dec 13 07:21:06 MST 2017 Valentin Schneider <valentin.schneider@arm.com> arm64: dts: hisilicon: Add hi3660 cpu capacity-dmips-mhz information The following dt entries are added: cpus [0-3] (Cortex A53): - capacity-dmips-mhz = <592>; cpus [4-7] (Cortex A73): - capacity-dmips-mhz = <1024>; Those values were obtained by running dhrystone 2.1 on a HiKey960 with the following procedure: - Offline all CPUs but CPU0 (A53) - Set CPU0 frequency to maximum - Run Dhrystone 2.1 for 20 seconds - Offline all CPUs but CPU4 (A73) - set CPU4 frequency to maximum - Run Dhrystone 2.1 for 20 seconds The results are as follows: A53: 129633887 loops A73: 287034147 loops By scaling those values so that the A73s use 1024, we end up with 462 for the A53s. However, they have different maximum frequencies: 1.844GHz for A53s and 2.362GHz for A73s. Thus, we can scale the A53 value to truly represent dmips per MHz, and we end up with 592. The impact of this change can be verified on HiKey960: $ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq 1844000 1844000 1844000 1844000 2362000 2362000 2362000 2362000 $ cat /sys/devices/system/cpu/cpu*/cpu_capacity 462 462 462 462 1024 1024 1024 1024 Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Wei Xu <xuwei5@hisilicon.com> |
/linux-master/arch/xtensa/kernel/ | ||
H A D | vmlinux.lds.S | diff 17290231 Sat May 24 11:48:28 MDT 2014 Max Filippov <jcmvbkbc@gmail.com> xtensa: add fixup for double exception raised in window overflow There are two FIXMEs in the double exception handler 'for the extremely unlikely case'. This case gets hit by gcc during kernel build once in a few hours, resulting in an unrecoverable exception condition. Provide missing fixup routine to handle this case. Double exception literals now need 8 more bytes, add them to the linker script. Also replace bbsi instructions with bbsi.l as we're branching depending on 8th and 7th LSB-based bits of exception address. This may be tested by adding the explicit DTLB invalidation to window overflow handlers, like the following: --- a/arch/xtensa/kernel/vectors.S +++ b/arch/xtensa/kernel/vectors.S @@ -592,6 +592,14 @@ ENDPROC(_WindowUnderflow4) ENTRY_ALIGN64(_WindowOverflow8) s32e a0, a9, -16 + bbsi.l a9, 31, 1f + rsr a0, ccount + bbsi.l a0, 4, 1f + pdtlb a0, a9 + idtlb a0 + movi a0, 9 + idtlb a0 +1: l32e a0, a1, -12 s32e a2, a9, -8 s32e a1, a9, -12 Cc: stable@vger.kernel.org Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> diff 17290231 Sat May 24 11:48:28 MDT 2014 Max Filippov <jcmvbkbc@gmail.com> xtensa: add fixup for double exception raised in window overflow There are two FIXMEs in the double exception handler 'for the extremely unlikely case'. This case gets hit by gcc during kernel build once in a few hours, resulting in an unrecoverable exception condition. Provide missing fixup routine to handle this case. Double exception literals now need 8 more bytes, add them to the linker script. Also replace bbsi instructions with bbsi.l as we're branching depending on 8th and 7th LSB-based bits of exception address. This may be tested by adding the explicit DTLB invalidation to window overflow handlers, like the following: --- a/arch/xtensa/kernel/vectors.S +++ b/arch/xtensa/kernel/vectors.S @@ -592,6 +592,14 @@ ENDPROC(_WindowUnderflow4) ENTRY_ALIGN64(_WindowOverflow8) s32e a0, a9, -16 + bbsi.l a9, 31, 1f + rsr a0, ccount + bbsi.l a0, 4, 1f + pdtlb a0, a9 + idtlb a0 + movi a0, 9 + idtlb a0 +1: l32e a0, a1, -12 s32e a2, a9, -8 s32e a1, a9, -12 Cc: stable@vger.kernel.org Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> |
/linux-master/drivers/block/zram/ | ||
H A D | Kconfig | diff ebaf9ab5 Tue Jul 26 16:22:45 MDT 2016 Sergey Senozhatsky <sergey.senozhatsky@gmail.com> zram: switch to crypto compress API We don't have an idle zstreams list anymore and our write path now works absolutely differently, preventing preemption during compression. This removes possibilities of read paths preempting writes at wrong places (which could badly affect the performance of both paths) and at the same time opens the door for a move from custom LZO/LZ4 compression backends implementation to a more generic one, using crypto compress API. Joonsoo Kim [1] attempted to do this a while ago, but faced with the need of introducing a new crypto API interface. The root cause was the fact that crypto API compression algorithms require a compression stream structure (in zram terminology) for both compression and decompression ops, while in reality only several of compression algorithms really need it. This resulted in a concept of context-less crypto API compression backends [2]. Both write and read paths, though, would have been executed with the preemption enabled, which in the worst case could have resulted in a decreased worst-case performance, e.g. consider the following case: CPU0 zram_write() spin_lock() take the last idle stream spin_unlock() << preempted >> zram_read() spin_lock() no idle streams spin_unlock() schedule() resuming zram_write compression() but it took me some time to realize that, and it took even longer to evolve zram and to make it ready for crypto API. The key turned out to be -- drop the idle streams list entirely. Without the idle streams list we are free to use compression algorithms that require compression stream for decompression (read), because streams are now placed in per-cpu data and each write path has to disable preemption for compression op, almost completely eliminating the aforementioned case (technically, we still have a small chance, because write path has a fast and a slow paths and the slow path is executed with the preemption enabled; but the frequency of failed fast path is too low). TEST ==== - 4 CPUs, x86_64 system - 3G zram, lzo - fio tests: read, randread, write, randwrite, rw, randrw test script [3] command: ZRAM_SIZE=3G LOG_SUFFIX=XXXX FIO_LOOPS=5 ./zram-fio-test.sh BASE PATCHED jobs1 READ: 2527.2MB/s 2482.7MB/s READ: 2102.7MB/s 2045.0MB/s WRITE: 1284.3MB/s 1324.3MB/s WRITE: 1080.7MB/s 1101.9MB/s READ: 430125KB/s 437498KB/s WRITE: 430538KB/s 437919KB/s READ: 399593KB/s 403987KB/s WRITE: 399910KB/s 404308KB/s jobs2 READ: 8133.5MB/s 7854.8MB/s READ: 7086.6MB/s 6912.8MB/s WRITE: 3177.2MB/s 3298.3MB/s WRITE: 2810.2MB/s 2871.4MB/s READ: 1017.6MB/s 1023.4MB/s WRITE: 1018.2MB/s 1023.1MB/s READ: 977836KB/s 984205KB/s WRITE: 979435KB/s 985814KB/s jobs3 READ: 13557MB/s 13391MB/s READ: 11876MB/s 11752MB/s WRITE: 4641.5MB/s 4682.1MB/s WRITE: 4164.9MB/s 4179.3MB/s READ: 1453.8MB/s 1455.1MB/s WRITE: 1455.1MB/s 1458.2MB/s READ: 1387.7MB/s 1395.7MB/s WRITE: 1386.1MB/s 1394.9MB/s jobs4 READ: 20271MB/s 20078MB/s READ: 18033MB/s 17928MB/s WRITE: 6176.8MB/s 6180.5MB/s WRITE: 5686.3MB/s 5705.3MB/s READ: 2009.4MB/s 2006.7MB/s WRITE: 2007.5MB/s 2004.9MB/s READ: 1929.7MB/s 1935.6MB/s WRITE: 1926.8MB/s 1932.6MB/s jobs5 READ: 18823MB/s 19024MB/s READ: 18968MB/s 19071MB/s WRITE: 6191.6MB/s 6372.1MB/s WRITE: 5818.7MB/s 5787.1MB/s READ: 2011.7MB/s 1981.3MB/s WRITE: 2011.4MB/s 1980.1MB/s READ: 1949.3MB/s 1935.7MB/s WRITE: 1940.4MB/s 1926.1MB/s jobs6 READ: 21870MB/s 21715MB/s READ: 19957MB/s 19879MB/s WRITE: 6528.4MB/s 6537.6MB/s WRITE: 6098.9MB/s 6073.6MB/s READ: 2048.6MB/s 2049.9MB/s WRITE: 2041.7MB/s 2042.9MB/s READ: 2013.4MB/s 1990.4MB/s WRITE: 2009.4MB/s 1986.5MB/s jobs7 READ: 21359MB/s 21124MB/s READ: 19746MB/s 19293MB/s WRITE: 6660.4MB/s 6518.8MB/s WRITE: 6211.6MB/s 6193.1MB/s READ: 2089.7MB/s 2080.6MB/s WRITE: 2085.8MB/s 2076.5MB/s READ: 2041.2MB/s 2052.5MB/s WRITE: 2037.5MB/s 2048.8MB/s jobs8 READ: 20477MB/s 19974MB/s READ: 18922MB/s 18576MB/s WRITE: 6851.9MB/s 6788.3MB/s WRITE: 6407.7MB/s 6347.5MB/s READ: 2134.8MB/s 2136.1MB/s WRITE: 2132.8MB/s 2134.4MB/s READ: 2074.2MB/s 2069.6MB/s WRITE: 2087.3MB/s 2082.4MB/s jobs9 READ: 19797MB/s 19994MB/s READ: 18806MB/s 18581MB/s WRITE: 6878.7MB/s 6822.7MB/s WRITE: 6456.8MB/s 6447.2MB/s READ: 2141.1MB/s 2154.7MB/s WRITE: 2144.4MB/s 2157.3MB/s READ: 2084.1MB/s 2085.1MB/s WRITE: 2091.5MB/s 2092.5MB/s jobs10 READ: 19794MB/s 19784MB/s READ: 18794MB/s 18745MB/s WRITE: 6984.4MB/s 6676.3MB/s WRITE: 6532.3MB/s 6342.7MB/s READ: 2150.6MB/s 2155.4MB/s WRITE: 2156.8MB/s 2161.5MB/s READ: 2106.4MB/s 2095.6MB/s WRITE: 2109.7MB/s 2098.4MB/s BASE PATCHED jobs1 perfstat stalled-cycles-frontend 102,480,595,419 ( 41.53%) 114,508,864,804 ( 46.92%) stalled-cycles-backend 51,941,417,832 ( 21.05%) 46,836,112,388 ( 19.19%) instructions 283,612,054,215 ( 1.15) 283,918,134,959 ( 1.16) branches 56,372,560,385 ( 724.923) 56,449,814,753 ( 733.766) branch-misses 374,826,000 ( 0.66%) 326,935,859 ( 0.58%) jobs2 perfstat stalled-cycles-frontend 155,142,745,777 ( 40.99%) 164,170,979,198 ( 43.82%) stalled-cycles-backend 70,813,866,387 ( 18.71%) 66,456,858,165 ( 17.74%) instructions 463,436,648,173 ( 1.22) 464,221,890,191 ( 1.24) branches 91,088,733,902 ( 760.088) 91,278,144,546 ( 769.133) branch-misses 504,460,363 ( 0.55%) 394,033,842 ( 0.43%) jobs3 perfstat stalled-cycles-frontend 201,300,397,212 ( 39.84%) 223,969,902,257 ( 44.44%) stalled-cycles-backend 87,712,593,974 ( 17.36%) 81,618,888,712 ( 16.19%) instructions 642,869,545,023 ( 1.27) 644,677,354,132 ( 1.28) branches 125,724,560,594 ( 690.682) 126,133,159,521 ( 694.542) branch-misses 527,941,798 ( 0.42%) 444,782,220 ( 0.35%) jobs4 perfstat stalled-cycles-frontend 246,701,197,429 ( 38.12%) 280,076,030,886 ( 43.29%) stalled-cycles-backend 119,050,341,112 ( 18.40%) 110,955,641,671 ( 17.15%) instructions 822,716,962,127 ( 1.27) 825,536,969,320 ( 1.28) branches 160,590,028,545 ( 688.614) 161,152,996,915 ( 691.068) branch-misses 650,295,287 ( 0.40%) 550,229,113 ( 0.34%) jobs5 perfstat stalled-cycles-frontend 298,958,462,516 ( 38.30%) 344,852,200,358 ( 44.16%) stalled-cycles-backend 137,558,742,122 ( 17.62%) 129,465,067,102 ( 16.58%) instructions 1,005,714,688,752 ( 1.29) 1,007,657,999,432 ( 1.29) branches 195,988,773,962 ( 697.730) 196,446,873,984 ( 700.319) branch-misses 695,818,940 ( 0.36%) 624,823,263 ( 0.32%) jobs6 perfstat stalled-cycles-frontend 334,497,602,856 ( 36.71%) 387,590,419,779 ( 42.38%) stalled-cycles-backend 163,539,365,335 ( 17.95%) 152,640,193,639 ( 16.69%) instructions 1,184,738,177,851 ( 1.30) 1,187,396,281,677 ( 1.30) branches 230,592,915,640 ( 702.902) 231,253,802,882 ( 702.356) branch-misses 747,934,786 ( 0.32%) 643,902,424 ( 0.28%) jobs7 perfstat stalled-cycles-frontend 396,724,684,187 ( 37.71%) 460,705,858,952 ( 43.84%) stalled-cycles-backend 188,096,616,496 ( 17.88%) 175,785,787,036 ( 16.73%) instructions 1,364,041,136,608 ( 1.30) 1,366,689,075,112 ( 1.30) branches 265,253,096,936 ( 700.078) 265,890,524,883 ( 702.839) branch-misses 784,991,589 ( 0.30%) 729,196,689 ( 0.27%) jobs8 perfstat stalled-cycles-frontend 440,248,299,870 ( 36.92%) 509,554,793,816 ( 42.46%) stalled-cycles-backend 222,575,930,616 ( 18.67%) 213,401,248,432 ( 17.78%) instructions 1,542,262,045,114 ( 1.29) 1,545,233,932,257 ( 1.29) branches 299,775,178,439 ( 697.666) 300,528,458,505 ( 694.769) branch-misses 847,496,084 ( 0.28%) 748,794,308 ( 0.25%) jobs9 perfstat stalled-cycles-frontend 506,269,882,480 ( 37.86%) 592,798,032,820 ( 44.43%) stalled-cycles-backend 253,192,498,861 ( 18.93%) 233,727,666,185 ( 17.52%) instructions 1,721,985,080,913 ( 1.29) 1,724,666,236,005 ( 1.29) branches 334,517,360,255 ( 694.134) 335,199,758,164 ( 697.131) branch-misses 873,496,730 ( 0.26%) 815,379,236 ( 0.24%) jobs10 perfstat stalled-cycles-frontend 549,063,363,749 ( 37.18%) 651,302,376,662 ( 43.61%) stalled-cycles-backend 281,680,986,810 ( 19.07%) 277,005,235,582 ( 18.55%) instructions 1,901,859,271,180 ( 1.29) 1,906,311,064,230 ( 1.28) branches 369,398,536,153 ( 694.004) 370,527,696,358 ( 688.409) branch-misses 967,929,335 ( 0.26%) 890,125,056 ( 0.24%) BASE PATCHED seconds elapsed 79.421641008 78.735285546 seconds elapsed 61.471246133 60.869085949 seconds elapsed 62.317058173 62.224188495 seconds elapsed 60.030739363 60.081102518 seconds elapsed 74.070398362 74.317582865 seconds elapsed 84.985953007 85.414364176 seconds elapsed 97.724553255 98.173311344 seconds elapsed 109.488066758 110.268399318 seconds elapsed 122.768189405 122.967164498 seconds elapsed 135.130035105 136.934770801 On my other system (8 x86_64 CPUs, short version of test results): BASE PATCHED seconds elapsed 19.518065994 19.806320662 seconds elapsed 15.172772749 15.594718291 seconds elapsed 13.820925970 13.821708564 seconds elapsed 13.293097816 14.585206405 seconds elapsed 16.207284118 16.064431606 seconds elapsed 17.958376158 17.771825767 seconds elapsed 19.478009164 19.602961508 seconds elapsed 21.347152811 21.352318709 seconds elapsed 24.478121126 24.171088735 seconds elapsed 26.865057442 26.767327618 So performance-wise the numbers are quite similar. Also update zcomp interface to be more aligned with the crypto API. [1] http://marc.info/?l=linux-kernel&m=144480832108927&w=2 [2] http://marc.info/?l=linux-kernel&m=145379613507518&w=2 [3] https://github.com/sergey-senozhatsky/zram-perf-test Link: http://lkml.kernel.org/r/20160531122017.2878-3-sergey.senozhatsky@gmail.com Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Suggested-by: Minchan Kim <minchan@kernel.org> Suggested-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> diff ebaf9ab5 Tue Jul 26 16:22:45 MDT 2016 Sergey Senozhatsky <sergey.senozhatsky@gmail.com> zram: switch to crypto compress API We don't have an idle zstreams list anymore and our write path now works absolutely differently, preventing preemption during compression. This removes possibilities of read paths preempting writes at wrong places (which could badly affect the performance of both paths) and at the same time opens the door for a move from custom LZO/LZ4 compression backends implementation to a more generic one, using crypto compress API. Joonsoo Kim [1] attempted to do this a while ago, but faced with the need of introducing a new crypto API interface. The root cause was the fact that crypto API compression algorithms require a compression stream structure (in zram terminology) for both compression and decompression ops, while in reality only several of compression algorithms really need it. This resulted in a concept of context-less crypto API compression backends [2]. Both write and read paths, though, would have been executed with the preemption enabled, which in the worst case could have resulted in a decreased worst-case performance, e.g. consider the following case: CPU0 zram_write() spin_lock() take the last idle stream spin_unlock() << preempted >> zram_read() spin_lock() no idle streams spin_unlock() schedule() resuming zram_write compression() but it took me some time to realize that, and it took even longer to evolve zram and to make it ready for crypto API. The key turned out to be -- drop the idle streams list entirely. Without the idle streams list we are free to use compression algorithms that require compression stream for decompression (read), because streams are now placed in per-cpu data and each write path has to disable preemption for compression op, almost completely eliminating the aforementioned case (technically, we still have a small chance, because write path has a fast and a slow paths and the slow path is executed with the preemption enabled; but the frequency of failed fast path is too low). TEST ==== - 4 CPUs, x86_64 system - 3G zram, lzo - fio tests: read, randread, write, randwrite, rw, randrw test script [3] command: ZRAM_SIZE=3G LOG_SUFFIX=XXXX FIO_LOOPS=5 ./zram-fio-test.sh BASE PATCHED jobs1 READ: 2527.2MB/s 2482.7MB/s READ: 2102.7MB/s 2045.0MB/s WRITE: 1284.3MB/s 1324.3MB/s WRITE: 1080.7MB/s 1101.9MB/s READ: 430125KB/s 437498KB/s WRITE: 430538KB/s 437919KB/s READ: 399593KB/s 403987KB/s WRITE: 399910KB/s 404308KB/s jobs2 READ: 8133.5MB/s 7854.8MB/s READ: 7086.6MB/s 6912.8MB/s WRITE: 3177.2MB/s 3298.3MB/s WRITE: 2810.2MB/s 2871.4MB/s READ: 1017.6MB/s 1023.4MB/s WRITE: 1018.2MB/s 1023.1MB/s READ: 977836KB/s 984205KB/s WRITE: 979435KB/s 985814KB/s jobs3 READ: 13557MB/s 13391MB/s READ: 11876MB/s 11752MB/s WRITE: 4641.5MB/s 4682.1MB/s WRITE: 4164.9MB/s 4179.3MB/s READ: 1453.8MB/s 1455.1MB/s WRITE: 1455.1MB/s 1458.2MB/s READ: 1387.7MB/s 1395.7MB/s WRITE: 1386.1MB/s 1394.9MB/s jobs4 READ: 20271MB/s 20078MB/s READ: 18033MB/s 17928MB/s WRITE: 6176.8MB/s 6180.5MB/s WRITE: 5686.3MB/s 5705.3MB/s READ: 2009.4MB/s 2006.7MB/s WRITE: 2007.5MB/s 2004.9MB/s READ: 1929.7MB/s 1935.6MB/s WRITE: 1926.8MB/s 1932.6MB/s jobs5 READ: 18823MB/s 19024MB/s READ: 18968MB/s 19071MB/s WRITE: 6191.6MB/s 6372.1MB/s WRITE: 5818.7MB/s 5787.1MB/s READ: 2011.7MB/s 1981.3MB/s WRITE: 2011.4MB/s 1980.1MB/s READ: 1949.3MB/s 1935.7MB/s WRITE: 1940.4MB/s 1926.1MB/s jobs6 READ: 21870MB/s 21715MB/s READ: 19957MB/s 19879MB/s WRITE: 6528.4MB/s 6537.6MB/s WRITE: 6098.9MB/s 6073.6MB/s READ: 2048.6MB/s 2049.9MB/s WRITE: 2041.7MB/s 2042.9MB/s READ: 2013.4MB/s 1990.4MB/s WRITE: 2009.4MB/s 1986.5MB/s jobs7 READ: 21359MB/s 21124MB/s READ: 19746MB/s 19293MB/s WRITE: 6660.4MB/s 6518.8MB/s WRITE: 6211.6MB/s 6193.1MB/s READ: 2089.7MB/s 2080.6MB/s WRITE: 2085.8MB/s 2076.5MB/s READ: 2041.2MB/s 2052.5MB/s WRITE: 2037.5MB/s 2048.8MB/s jobs8 READ: 20477MB/s 19974MB/s READ: 18922MB/s 18576MB/s WRITE: 6851.9MB/s 6788.3MB/s WRITE: 6407.7MB/s 6347.5MB/s READ: 2134.8MB/s 2136.1MB/s WRITE: 2132.8MB/s 2134.4MB/s READ: 2074.2MB/s 2069.6MB/s WRITE: 2087.3MB/s 2082.4MB/s jobs9 READ: 19797MB/s 19994MB/s READ: 18806MB/s 18581MB/s WRITE: 6878.7MB/s 6822.7MB/s WRITE: 6456.8MB/s 6447.2MB/s READ: 2141.1MB/s 2154.7MB/s WRITE: 2144.4MB/s 2157.3MB/s READ: 2084.1MB/s 2085.1MB/s WRITE: 2091.5MB/s 2092.5MB/s jobs10 READ: 19794MB/s 19784MB/s READ: 18794MB/s 18745MB/s WRITE: 6984.4MB/s 6676.3MB/s WRITE: 6532.3MB/s 6342.7MB/s READ: 2150.6MB/s 2155.4MB/s WRITE: 2156.8MB/s 2161.5MB/s READ: 2106.4MB/s 2095.6MB/s WRITE: 2109.7MB/s 2098.4MB/s BASE PATCHED jobs1 perfstat stalled-cycles-frontend 102,480,595,419 ( 41.53%) 114,508,864,804 ( 46.92%) stalled-cycles-backend 51,941,417,832 ( 21.05%) 46,836,112,388 ( 19.19%) instructions 283,612,054,215 ( 1.15) 283,918,134,959 ( 1.16) branches 56,372,560,385 ( 724.923) 56,449,814,753 ( 733.766) branch-misses 374,826,000 ( 0.66%) 326,935,859 ( 0.58%) jobs2 perfstat stalled-cycles-frontend 155,142,745,777 ( 40.99%) 164,170,979,198 ( 43.82%) stalled-cycles-backend 70,813,866,387 ( 18.71%) 66,456,858,165 ( 17.74%) instructions 463,436,648,173 ( 1.22) 464,221,890,191 ( 1.24) branches 91,088,733,902 ( 760.088) 91,278,144,546 ( 769.133) branch-misses 504,460,363 ( 0.55%) 394,033,842 ( 0.43%) jobs3 perfstat stalled-cycles-frontend 201,300,397,212 ( 39.84%) 223,969,902,257 ( 44.44%) stalled-cycles-backend 87,712,593,974 ( 17.36%) 81,618,888,712 ( 16.19%) instructions 642,869,545,023 ( 1.27) 644,677,354,132 ( 1.28) branches 125,724,560,594 ( 690.682) 126,133,159,521 ( 694.542) branch-misses 527,941,798 ( 0.42%) 444,782,220 ( 0.35%) jobs4 perfstat stalled-cycles-frontend 246,701,197,429 ( 38.12%) 280,076,030,886 ( 43.29%) stalled-cycles-backend 119,050,341,112 ( 18.40%) 110,955,641,671 ( 17.15%) instructions 822,716,962,127 ( 1.27) 825,536,969,320 ( 1.28) branches 160,590,028,545 ( 688.614) 161,152,996,915 ( 691.068) branch-misses 650,295,287 ( 0.40%) 550,229,113 ( 0.34%) jobs5 perfstat stalled-cycles-frontend 298,958,462,516 ( 38.30%) 344,852,200,358 ( 44.16%) stalled-cycles-backend 137,558,742,122 ( 17.62%) 129,465,067,102 ( 16.58%) instructions 1,005,714,688,752 ( 1.29) 1,007,657,999,432 ( 1.29) branches 195,988,773,962 ( 697.730) 196,446,873,984 ( 700.319) branch-misses 695,818,940 ( 0.36%) 624,823,263 ( 0.32%) jobs6 perfstat stalled-cycles-frontend 334,497,602,856 ( 36.71%) 387,590,419,779 ( 42.38%) stalled-cycles-backend 163,539,365,335 ( 17.95%) 152,640,193,639 ( 16.69%) instructions 1,184,738,177,851 ( 1.30) 1,187,396,281,677 ( 1.30) branches 230,592,915,640 ( 702.902) 231,253,802,882 ( 702.356) branch-misses 747,934,786 ( 0.32%) 643,902,424 ( 0.28%) jobs7 perfstat stalled-cycles-frontend 396,724,684,187 ( 37.71%) 460,705,858,952 ( 43.84%) stalled-cycles-backend 188,096,616,496 ( 17.88%) 175,785,787,036 ( 16.73%) instructions 1,364,041,136,608 ( 1.30) 1,366,689,075,112 ( 1.30) branches 265,253,096,936 ( 700.078) 265,890,524,883 ( 702.839) branch-misses 784,991,589 ( 0.30%) 729,196,689 ( 0.27%) jobs8 perfstat stalled-cycles-frontend 440,248,299,870 ( 36.92%) 509,554,793,816 ( 42.46%) stalled-cycles-backend 222,575,930,616 ( 18.67%) 213,401,248,432 ( 17.78%) instructions 1,542,262,045,114 ( 1.29) 1,545,233,932,257 ( 1.29) branches 299,775,178,439 ( 697.666) 300,528,458,505 ( 694.769) branch-misses 847,496,084 ( 0.28%) 748,794,308 ( 0.25%) jobs9 perfstat stalled-cycles-frontend 506,269,882,480 ( 37.86%) 592,798,032,820 ( 44.43%) stalled-cycles-backend 253,192,498,861 ( 18.93%) 233,727,666,185 ( 17.52%) instructions 1,721,985,080,913 ( 1.29) 1,724,666,236,005 ( 1.29) branches 334,517,360,255 ( 694.134) 335,199,758,164 ( 697.131) branch-misses 873,496,730 ( 0.26%) 815,379,236 ( 0.24%) jobs10 perfstat stalled-cycles-frontend 549,063,363,749 ( 37.18%) 651,302,376,662 ( 43.61%) stalled-cycles-backend 281,680,986,810 ( 19.07%) 277,005,235,582 ( 18.55%) instructions 1,901,859,271,180 ( 1.29) 1,906,311,064,230 ( 1.28) branches 369,398,536,153 ( 694.004) 370,527,696,358 ( 688.409) branch-misses 967,929,335 ( 0.26%) 890,125,056 ( 0.24%) BASE PATCHED seconds elapsed 79.421641008 78.735285546 seconds elapsed 61.471246133 60.869085949 seconds elapsed 62.317058173 62.224188495 seconds elapsed 60.030739363 60.081102518 seconds elapsed 74.070398362 74.317582865 seconds elapsed 84.985953007 85.414364176 seconds elapsed 97.724553255 98.173311344 seconds elapsed 109.488066758 110.268399318 seconds elapsed 122.768189405 122.967164498 seconds elapsed 135.130035105 136.934770801 On my other system (8 x86_64 CPUs, short version of test results): BASE PATCHED seconds elapsed 19.518065994 19.806320662 seconds elapsed 15.172772749 15.594718291 seconds elapsed 13.820925970 13.821708564 seconds elapsed 13.293097816 14.585206405 seconds elapsed 16.207284118 16.064431606 seconds elapsed 17.958376158 17.771825767 seconds elapsed 19.478009164 19.602961508 seconds elapsed 21.347152811 21.352318709 seconds elapsed 24.478121126 24.171088735 seconds elapsed 26.865057442 26.767327618 So performance-wise the numbers are quite similar. Also update zcomp interface to be more aligned with the crypto API. [1] http://marc.info/?l=linux-kernel&m=144480832108927&w=2 [2] http://marc.info/?l=linux-kernel&m=145379613507518&w=2 [3] https://github.com/sergey-senozhatsky/zram-perf-test Link: http://lkml.kernel.org/r/20160531122017.2878-3-sergey.senozhatsky@gmail.com Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Suggested-by: Minchan Kim <minchan@kernel.org> Suggested-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
Completed in 690 milliseconds