#
e4865c60 |
|
03-Dec-2023 |
Zhao Liu <zhao1.liu@intel.com> |
drm/i915: Use kmap_local_page() in i915_cmd_parser.c The use of kmap_atomic() is being deprecated in favor of kmap_local_page()[1], and this patch converts the call from kmap_atomic() to kmap_local_page(). The main difference between atomic and local mappings is that local mappings doesn't disable page faults or preemption (the preemption is disabled for !PREEMPT_RT case, otherwise it only disables migration). With kmap_local_page(), we can avoid the often unwanted side effect of unnecessary page faults and preemption disables. There're 2 reasons why function copy_batch() doesn't need to disable pagefaults and preemption for mapping: 1. The flush operation is safe. In i915_cmd_parser.c, copy_batch() calls drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush. Since CLFLUSHOPT is global on x86 and WBINVD is called on each cpu in drm_clflush_virt_range(), the flush operation is global. 2. Any context switch caused by preemption or page faults (page fault may cause sleep) doesn't affect the validity of local mapping. Therefore, copy_batch() is a function where the use of kmap_local_page() in place of kmap_atomic() is correctly suited. Convert the calls of kmap_atomic() / kunmap_atomic() to kmap_local_page() / kunmap_local(). [1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@intel.com Suggested-by: Dave Hansen <dave.hansen@intel.com> Suggested-by: Ira Weiny <ira.weiny@intel.com> Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231203132947.2328805-9-zhao1.liu@linux.intel.com
|
#
8e4ee5e8 |
|
30-Nov-2022 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Wrap all access to i915_vma.node.start|size We already wrap i915_vma.node.start for use with the GGTT, as there we can perform additional sanity checks that the node belongs to the GGTT and fits within the 32b registers. In the next couple of patches, we will introduce guard pages around the objects _inside_ the drm_mm_node allocation. That is we will offset the vma->pages so that the first page is at drm_mm_node.start + vma->guard (not 0 as is currently the case). All users must then not use i915_vma.node.start directly, but compute the guard offset, thus all users are converted to use a i915_vma_offset() wrapper. The notable exceptions are the selftests that are testing exact behaviour of i915_vma_pin/i915_vma_insert. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221130235805.221010-3-andi.shyti@linux.intel.com
|
#
e9b67ec2 |
|
03-Mar-2022 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: include linux/highmem.h and linux/swap.h where needed Include linux/highmem.h and linux/swap.h explicitly where needed so we can drop the linux/i2c.h include from i915_drv.h where it pulled in the dependencies implicitly. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220303181931.1661767-5-jani.nikula@intel.com
|
#
5f2ec909 |
|
10-Feb-2022 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: don't include drm_cache.h in i915_drv.h Include it only in files that use it. Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/14edab4a193ea3f73f387a88e3836c8555401871.1644507885.git.jani.nikula@intel.com
|
#
ce2fce25 |
|
27-Jan-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915: Only include i915_reg.h from .c files Several of our i915 header files, have been including i915_reg.h. This means that any change to i915_reg.h will trigger a full rebuild of pretty much every file of the driver, even those that don't have any kind of register access. Let's delete the i915_reg.h include from all headers and add an explicit include from the .c files that truly need the register definitions; those that need a definition of i915_reg_t for a function definition can get it from i915_reg_defs.h instead. We also remove two non-register #define's (VLV_DISPLAY_BASE and GEN12_SFC_DONE_MAX) into i915_reg_defs.h to allow us to drop the i915_reg.h include from a couple of headers. There's probably a lot more header dependency optimization possible, but the changes here roughly cut the number of files compiled after 'touch i915_reg.h' in half --- a good first step. Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-7-matthew.d.roper@intel.com
|
#
0d6419e9 |
|
27-Jan-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915: Move GT registers to their own header file This is a huge, chaotic mass of registers copied over as-is without any real cleanup. We'll come back and organize these better, align on consistent coding style, remove dead code, etc. in separate patches later that will be easier to review. v2: - Add missing include in intel_pxp_irq.c v3: - Correct a few indentation errors (Lucas) - Minor conflict resolution Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-6-matthew.d.roper@intel.com
|
#
e71a7412 |
|
27-Jan-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915: Parameterize MI_PREDICATE registers The various MI_PREDICATE registers have per-engine instances. Today we only utilize the RCS0 instance of each, but that will likely change in the future; switch to parameterized register definitions to make these easier to work with going forward. Of special note is MI_PREDICATE_RESULT_2; we only use it in one place in the driver today in HSW-specific code. It turns out that the bspec (page 94) lists two different offsets for this register on HSW; one is in the standard location shared by all other platforms (base + 0x3bc) and the other is an unusual location (0x2214). We're using the second, non-standard offset in i915 today; that offset doesn't exist on any other platforms (and it's not even 100% clear that it's correct for HSW) so I've renamed the current non-standard definition to HSW_MI_PREDICATE_RESULT_2; the new cross-platform parameterized macro (which is still unused at the moment) uses the standard offset. Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-5-matthew.d.roper@intel.com
|
#
202b1f4c |
|
10-Jan-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/gt: Move engine registers to their own header Let's continue breaking up and cleaning up the massive i915_reg.h file by moving all registers that are defined in relation to an engine base to their own header. There are probably a bunch of other "engine registers" that we haven't moved yet (especially those that belong to the render engine in the 0x2??? range), but this is a relatively straightforward first step. Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220111051600.3429104-8-matthew.d.roper@intel.com
|
#
e9f9bcd5 |
|
10-Jan-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915: Use parameterized GPR register definitions everywhere Since we have an engine-parameterized macro GEN8_RING_CS_GPR, let's use that in place of the HSW_CS_GPR and BCS_GPR register definitions. Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220111051600.3429104-2-matthew.d.roper@intel.com
|
#
23d639d7 |
|
07-Jan-2022 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: split out i915_cmd_parser.h from i915_drv.h We already have the i915_cmd_parser.c file. Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1a02b8788266f4f2fd4de12808b55c4a66179e98.1641561552.git.jani.nikula@intel.com
|
#
15eb083b |
|
20-Jul-2021 |
Jason Ekstrand <jason@jlekstrand.net> |
drm/i915: Correct the docs for intel_engine_cmd_parser In 93b713304188 ("drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser""), the parameters to intel_engine_cmd_parser() were altered without updating the docs, causing Fi.CI.DOCS to start failing. Fixes: 93b713304188 ("drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser"") Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210720182108.2761496-1-jason@jlekstrand.net Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Added 'Fixes:' tag and corrected the hash for the ancestor]
|
#
0c6609bb |
|
14-Jul-2021 |
Jason Ekstrand <jason@jlekstrand.net> |
Revert "drm/i915: Skip over MI_NOOP when parsing" This reverts a6c5e2aea704 ("drm/i915: Skip over MI_NOOP when parsing"). It complicates the batch parsing code a bit and increases indentation for no reason other than fast-skipping a command that userspace uses only rarely. Sure, there may be IGT tests that fill batches with NOOPs but that's not a case we should optimize for in the kernel. We should optimize for code clarity instead. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-6-jason@jlekstrand.net
|
#
93b71330 |
|
14-Jul-2021 |
Jason Ekstrand <jason@jlekstrand.net> |
drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser" This reverts 686c7c35abc2 ("drm/i915/gem: Asynchronous cmdparser"). The justification for this commit in the git history was a vague comment about getting it out from under the struct_mutex. While this may improve perf for some workloads on Gen7 platforms where we rely on the command parser for features such as indirect rendering, no numbers were provided to prove such an improvement. It claims to closed two gitlab/bugzilla issues but with no explanation whatsoever as to why or what bug it's fixing. Meanwhile, by moving command parsing off to an async callback, it leaves us with a problem of what to do on error. When things were synchronous, EXECBUFFER2 would fail with an error code if parsing failed. When moving it to async, we needed another way to handle that error and the solution employed was to set an error on the dma_fence and then trust that said error gets propagated to the client eventually. Moving back to synchronous will help us untangle the fence error propagation mess. This also reverts most of 0edbb9ba1bfe ("drm/i915: Move cmd parser pinning to execbuffer") which is a refactor of some of our allocation paths for asynchronous parsing. Now that everything is synchronous, we don't need it. v2 (Daniel Vetter): - Add stabel Cc and Fixes tag Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: <stable@vger.kernel.org> # v5.6+ Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-2-jason@jlekstrand.net
|
#
6e0b6528 |
|
20-Jul-2021 |
Jason Ekstrand <jason@jlekstrand.net> |
drm/i915: Correct the docs for intel_engine_cmd_parser In 93b713304188 ("drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser""), the parameters to intel_engine_cmd_parser() were altered without updating the docs, causing Fi.CI.DOCS to start failing. Fixes: c9d9fdbc108a ("drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser"") Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210720182108.2761496-1-jason@jlekstrand.net Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Added 'Fixes:' tag and corrected the hash for the ancestor] (cherry picked from commit 15eb083bdb561bb4862cd04cd0523e55483e877e) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Updated Fixes tag to match fixes branch]
|
#
c9d9fdbc |
|
14-Jul-2021 |
Jason Ekstrand <jason@jlekstrand.net> |
drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser" This reverts 686c7c35abc2 ("drm/i915/gem: Asynchronous cmdparser"). The justification for this commit in the git history was a vague comment about getting it out from under the struct_mutex. While this may improve perf for some workloads on Gen7 platforms where we rely on the command parser for features such as indirect rendering, no numbers were provided to prove such an improvement. It claims to closed two gitlab/bugzilla issues but with no explanation whatsoever as to why or what bug it's fixing. Meanwhile, by moving command parsing off to an async callback, it leaves us with a problem of what to do on error. When things were synchronous, EXECBUFFER2 would fail with an error code if parsing failed. When moving it to async, we needed another way to handle that error and the solution employed was to set an error on the dma_fence and then trust that said error gets propagated to the client eventually. Moving back to synchronous will help us untangle the fence error propagation mess. This also reverts most of 0edbb9ba1bfe ("drm/i915: Move cmd parser pinning to execbuffer") which is a refactor of some of our allocation paths for asynchronous parsing. Now that everything is synchronous, we don't need it. v2 (Daniel Vetter): - Add stabel Cc and Fixes tag Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: <stable@vger.kernel.org> # v5.6+ Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-2-jason@jlekstrand.net (cherry picked from commit 93b713304188844b8514074dc13ffd56d12235d3) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
651e7d48 |
|
05-Jun-2021 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915: replace IS_GEN and friends with GRAPHICS_VER This was done by the following semantic patch: @@ expression i915; @@ - INTEL_GEN(i915) + GRAPHICS_VER(i915) @@ expression i915; expression E; @@ - INTEL_GEN(i915) >= E + GRAPHICS_VER(i915) >= E @@ expression dev_priv; expression E; @@ - !IS_GEN(dev_priv, E) + GRAPHICS_VER(dev_priv) != E @@ expression dev_priv; expression E; @@ - IS_GEN(dev_priv, E) + GRAPHICS_VER(dev_priv) == E @@ expression dev_priv; expression from, until; @@ - IS_GEN_RANGE(dev_priv, from, until) + IS_GRAPHICS_VER(dev_priv, from, until) @def@ expression E; identifier id =~ "^gen$"; @@ - id = GRAPHICS_VER(E) + ver = GRAPHICS_VER(E) @@ identifier def.id; @@ - id + ver It also takes care of renaming the variable we assign to GRAPHICS_VER() so to use "ver" rather than "gen". Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210606045050.103862-2-lucas.demarchi@intel.com
|
#
f1f7f553 |
|
21-Apr-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Fix docbook descriptions for i915_cmd_parser Fixes the following htmldocs warnings: drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Excess function parameter 'trampoline' description in 'intel_engine_cmd_parser' drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Function parameter or member 'jump_whitelist' not described in 'intel_engine_cmd_parser' drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Function parameter or member 'shadow_map' not described in 'intel_engine_cmd_parser' drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Function parameter or member 'batch_map' not described in 'intel_engine_cmd_parser' drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Excess function parameter 'trampoline' description in 'intel_engine_cmd_parser' Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210421120353.544518-1-maarten.lankhorst@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
0edbb9ba |
|
23-Mar-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Move cmd parser pinning to execbuffer We need to get rid of allocations in the cmd parser, because it needs to be called from a signaling context, first move all pinning to execbuf, where we already hold all locks. Allocate jump_whitelist in the execbuffer, and add annotations around intel_engine_cmd_parser(), to ensure we only call the command parser without allocating any memory, or taking any locks we're not supposed to. Because i915_gem_object_get_page() may also allocate memory, add a path to i915_gem_object_get_sg() that prevents memory allocations, and walk the sg list manually. It should be similarly fast. This has the added benefit of being able to catch all memory allocation errors before the point of no return, and return -ENOMEM safely to the execbuf submitter. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-4-maarten.lankhorst@linux.intel.com
|
#
a829f033 |
|
02-Mar-2021 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Wedge the GPU if command parser setup fails Commit 311a50e76a33 ("drm/i915: Add support for mandatory cmdparsing") introduced mandatory command parsing but setup failures were not translated into wedging the GPU which was probably the intent. Possible errors come in two categories. Either the sanity check on internal tables has failed, which should be caught in CI unless an affected platform would be missed in testing; or memory allocation failure happened during driver load, which should be extremely unlikely but for correctness should still be handled. v2: * Tidy coding style. (Chris) [airlied: cherry-picked to avoid rc1 base] Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 311a50e76a33 ("drm/i915: Add support for mandatory cmdparsing") Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris.p.wilson@intel.com> Reviewed-by: Chris Wilson <chris.p.wilson@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210302114213.1102223-1-tvrtko.ursulin@linux.intel.com (cherry picked from commit 5a1a659762d35a6dc51047c9127c011303c77b7f) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
|
#
8f47c8c3 |
|
19-Jan-2021 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915/pool: constrain pool objects by mapping type In a few places we always end up mapping the pool object with the FORCE constraint(to prevent hitting -EBUSY) which will destroy the cached mapping if it has a different type. As a simple first step, make the mapping type part of the pool interface, where the behaviour is to only give out pool objects which match the requested mapping type. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20210119133106.66294-4-matthew.auld@intel.com
|
#
eeb52ee6 |
|
24-Dec-2020 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915: clear the shadow batch The shadow batch is an internal object, which doesn't have any page clearing, and since the batch_len can be smaller than the object, we should take care to clear it. Testcase: igt/gen9_exec_parse/shadow-peek Fixes: 4f7af1948abc ("drm/i915: Support ro ppgtt mapped cmdparser shadow buffers") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20201224151358.401345-1-matthew.auld@intel.com Cc: stable@vger.kernel.org
|
#
45233ab2 |
|
16-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Move gen8 CS emitters into gen8_engine_cs.h Reduce the pollution of intel_engine.h by moving gen8_emit_pipe_control and friends to gen8_engine_cs.h Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201216135452.6063-1-chris@chris-wilson.co.uk
|
#
75353bcd |
|
24-Dec-2020 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915: clear the shadow batch The shadow batch is an internal object, which doesn't have any page clearing, and since the batch_len can be smaller than the object, we should take care to clear it. Testcase: igt/gen9_exec_parse/shadow-peek Fixes: 4f7af1948abc ("drm/i915: Support ro ppgtt mapped cmdparser shadow buffers") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20201224151358.401345-1-matthew.auld@intel.com Cc: stable@vger.kernel.org (cherry picked from commit eeb52ee6c4a429ec301faf1dc48988744960786e) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
a6c5e2ae |
|
01-Oct-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Skip over MI_NOOP when parsing Though less likely in practice, igt uses MI_NOOP frequently to pad out its batch buffers. The lookup and valiation of so many MI_NOOP command descriptions is noticeable, though the side-effect of poisoning the last-validated-command cache is more likely to impact upon real CS. Testcase: igt/gen9_exec_parse/bb-large Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201001102632.18789-1-chris@chris-wilson.co.uk
|
#
c60b93cd |
|
28-Sep-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Avoid mixing integer types during batch copies Be consistent and use unsigned long throughout the chunk copies to avoid the inherent clumsiness of mixing integer types of different widths and signs. Failing to take acount of a wider unsigned type when using min_t can lead to treating it as a negative, only for it flip back to a large unsigned value after passing a boundary check. Fixes: ed13033f0287 ("drm/i915/cmdparser: Only cache the dst vmap") Testcase: igt/gen9_exec_parse/bb-large Reported-by: "Candelaria, Jared" <jared.candelaria@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: "Candelaria, Jared" <jared.candelaria@intel.com> Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com> Cc: <stable@vger.kernel.org> # v4.9+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@chris-wilson.co.uk (cherry picked from commit b7eeb2b4132ccf1a7d38f434cde7043913d1ed3c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
b7eeb2b4 |
|
28-Sep-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Avoid mixing integer types during batch copies Be consistent and use unsigned long throughout the chunk copies to avoid the inherent clumsiness of mixing integer types of different widths and signs. Failing to take acount of a wider unsigned type when using min_t can lead to treating it as a negative, only for it flip back to a large unsigned value after passing a boundary check. Fixes: ed13033f0287 ("drm/i915/cmdparser: Only cache the dst vmap") Testcase: igt/gen9_exec_parse/bb-large Reported-by: "Candelaria, Jared" <jared.candelaria@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: "Candelaria, Jared" <jared.candelaria@intel.com> Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com> Cc: <stable@vger.kernel.org> # v4.9+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@chris-wilson.co.uk
|
#
e5f10d63 |
|
17-Aug-2020 |
Mika Kuoppala <mika.kuoppala@linux.intel.com> |
drm/i915: Fix cmd parser desc matching with masks Our variety of defined gpu commands have the actual command id field and possibly length and flags applied. We did start to apply the mask during initialization of the cmd descriptors but forgot to also apply it on comparisons. Fix comparisons in order to properly deny access with associated commands. v2: fix lri with correct mask (Chris) References: 926abff21a8f ("drm/i915/cmdparser: Ignore Length operands during command matching") Reported-by: Nicolai Stange <nstange@suse.de> Cc: stable@vger.kernel.org # v5.4+ Cc: Miroslav Benes <mbenes@suse.cz> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Chris Wilson <chris.p.wilson@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200817195926.12671-1-mika.kuoppala@linux.intel.com (cherry picked from commit 3b4efa148da36f158cce3f662e831af2834b8e0f) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
3b4efa14 |
|
17-Aug-2020 |
Mika Kuoppala <mika.kuoppala@linux.intel.com> |
drm/i915: Fix cmd parser desc matching with masks Our variety of defined gpu commands have the actual command id field and possibly length and flags applied. We did start to apply the mask during initialization of the cmd descriptors but forgot to also apply it on comparisons. Fix comparisons in order to properly deny access with associated commands. v2: fix lri with correct mask (Chris) References: 926abff21a8f ("drm/i915/cmdparser: Ignore Length operands during command matching") Reported-by: Nicolai Stange <nstange@suse.de> Cc: stable@vger.kernel.org # v5.4+ Cc: Miroslav Benes <mbenes@suse.cz> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Chris Wilson <chris.p.wilson@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200817195926.12671-1-mika.kuoppala@linux.intel.com
|
#
273500ae |
|
01-Jun-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Whitelist context-local timestamp in the gen9 cmdparser Allow batch buffers to read their own _local_ cumulative HW runtime of their logical context. Fixes: 0f2f39758341 ("drm/i915: Add gen9 BCS cmdparsing") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.4+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200601161942.30854-1-chris@chris-wilson.co.uk (cherry picked from commit f9496520df11de00fbafc3cbd693b9570d600ab3) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
#
f9496520 |
|
01-Jun-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Whitelist context-local timestamp in the gen9 cmdparser Allow batch buffers to read their own _local_ cumulative HW runtime of their logical context. Fixes: 0f2f39758341 ("drm/i915: Add gen9 BCS cmdparsing") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.4+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200601161942.30854-1-chris@chris-wilson.co.uk
|
#
0c4336b9 |
|
30-Jan-2020 |
Wambui Karuga <wambui.karugax@gmail.com> |
drm/i915/cmd_parser: conversion to struct drm_device logging macros. Manually convert printk based drm logging macros to the struct drm_device based logging macros in i915/i915_cmd_parser.c. This also involves extracting the drm_i915_private device from various intel types for use in the macros. Instances of the DRM_DEBUG macro are not converted due to the lack of a similar struct drm_device based logging macro. References: https://lists.freedesktop.org/archives/dri-devel/2020-January/253381.html Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200131093416.28431-4-wambui.karugax@gmail.com
|
#
f8b74877 |
|
13-Dec-2019 |
Maya Rashish <coypu@sdf.org> |
Correct function name in comment Signed-off-by: Maya Rashish <coypu@sdf.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191213102630.GA24082@SDF.ORG
|
#
686c7c35 |
|
11-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gem: Asynchronous cmdparser Execute the cmdparser asynchronously as part of the submission pipeline. Using our dma-fences, we can schedule execution after an asynchronous piece of work, so we move the cmdparser out from under the struct_mutex inside execbuf as run it as part of the submission pipeline. The same security rules apply, we copy the user batch before validation and userspace cannot touch the validation shadow. The only caveat is that we will do request construction before we complete cmdparsing and so we cannot know the outcome of the validation step until later -- so the execbuf ioctl does not report -EINVAL directly, but we must cancel execution of the request and flag the error on the out-fence. Closes: https://gitlab.freedesktop.org/drm/intel/issues/611 Closes: https://gitlab.freedesktop.org/drm/intel/issues/412 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191211230858.599030-2-chris@chris-wilson.co.uk
|
#
32d94048 |
|
11-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gem: Prepare gen7 cmdparser for async execution The gen7 cmdparser is primarily a promotion-based system to allow access to additional registers beyond the HW validation, and allows fallback to normal execution of the user batch buffer if valid and requires chaining. In the next patch, we will do the cmdparser validation in the pipeline asynchronously and so at the point of request construction we will not know if we want to execute the privileged and validated batch, or the original user batch. The solution employed here is to execute both batches, one with raised privileges and one as normal. This is because the gen7 MI_BATCH_BUFFER_START command cannot change privilege level within a batch and must strictly use the current privilege level (or undefined behaviour kills the GPU). So in order to execute the original batch, we need a second non-priviledged batch buffer chain from the ring, i.e. we need to emit two batches for each user batch. Inside the two batches we determine which one should actually execute, we provide a conditional trampoline to call the original batch. Implementation-wise, we create a single buffer and write the shadow and the trampoline inside it at different offsets; and bind the buffer into both the kernel GGTT for the privileged execution of the shadow and into the user ppGTT for the non-privileged execution of the trampoline and original batch. One buffer, two batches and two vma. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191211230858.599030-1-chris@chris-wilson.co.uk
|
#
6aacb5a3 |
|
11-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Align start for memcpy_from_wc The movntqda requires 16-byte alignment for the source pointer. Avoid falling back to clflush if the source pointer is misaligned by doing the doing a small uncached memcpy to fixup the alignments. v2: Turn the unaligned copy into a genuine helper Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191211110437.4082687-5-chris@chris-wilson.co.uk
|
#
37d1151c |
|
11-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Simplify error escape from cmdparser We need to flush the destination buffer, even on error, to maintain consistent cache state. Thereby removing the jump on error past the clear, and reducing the loop-escape mechanism to a mere break. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191211110437.4082687-3-chris@chris-wilson.co.uk
|
#
755bf8a8 |
|
11-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Remove redundant parameters from intel_engine_cmd_parser Declutter the calling interface by reducing the parameters to the i915_vma and associated offsets. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191211110437.4082687-2-chris@chris-wilson.co.uk
|
#
8f1ada25 |
|
11-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Fix cmdparser drm.debug The cmdparser rejection debug is not for driver development, but for the user, for which we use a plain DRM_DEBUG(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191211110437.4082687-1-chris@chris-wilson.co.uk
|
#
05975cd9 |
|
04-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Remove vestigal i915_gem_context locals from cmdparser The use GEM context itself was removed in commit cd30a5031704 ("drm/i915/gem: Excise the per-batch whitelist from the context"), but the locals were left in place as an oversight. Remove the parameters and clean up. References: cd30a5031704 ("drm/i915/gem: Excise the per-batch whitelist from the context") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191204232616.94397-1-chris@chris-wilson.co.uk
|
#
cd30a503 |
|
28-Nov-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gem: Excise the per-batch whitelist from the context One does not lightly add a new hidden struct_mutex dependency deep within the execbuf bowels! The immediate suspicion in seeing the whitelist cached on the context, is that it is intended to be preserved between batches, as the kernel is quite adept at caching small allocations itself. But no, it's sole purpose is to serialise command submission in order to save a kmalloc on a slow, slow path! By removing the whitelist dependency from the context, our freedom to chop the big struct_mutex is greatly augmented. v2: s/set_bit/__set_bit/ as the whitelist shall never be accessed concurrently. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191128113424.3885958-1-chris@chris-wilson.co.uk
|
#
ea0b163b |
|
11-Nov-2019 |
Ben Hutchings <ben@decadent.org.uk> |
drm/i915/cmdparser: Fix jump whitelist clearing When a jump_whitelist bitmap is reused, it needs to be cleared. Currently this is done with memset() and the size calculation assumes bitmaps are made of 32-bit words, not longs. So on 64-bit architectures, only the first half of the bitmap is cleared. If some whitelist bits are carried over between successive batches submitted on the same context, this will presumably allow embedding the rogue instructions that we're trying to reject. Use bitmap_zero() instead, which gets the calculation right. Fixes: f8c08d8faee5 ("drm/i915/cmdparser: Add support for backward jumps") Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
|
#
926abff2 |
|
20-Sep-2018 |
Jon Bloomfield <jon.bloomfield@intel.com> |
drm/i915/cmdparser: Ignore Length operands during command matching Some of the gen instruction macros (e.g. MI_DISPLAY_FLIP) have the length directly encoded in them. Since these are used directly in the tables, the Length becomes part of the comparison used for matching during parsing. Thus, if the cmd being parsed has a different length to that in the table, it is not matched and the cmd is accepted via the default variable length path. Fix by masking out everything except the Opcode in the cmd tables Cc: Tony Luck <tony.luck@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
|
#
f8c08d8f |
|
20-Sep-2018 |
Jon Bloomfield <jon.bloomfield@intel.com> |
drm/i915/cmdparser: Add support for backward jumps To keep things manageable, the pre-gen9 cmdparser does not attempt to track any form of nested BB_START's. This did not prevent usermode from using nested starts, or even chained batches because the cmdparser is not strictly enforced pre gen9. Instead, the existence of a nested BB_START would cause the batch to be emitted in insecure mode, and any privileged capabilities would not be available. For Gen9, the cmdparser becomes mandatory (for BCS at least), and so not providing any form of nested BB_START support becomes overly restrictive. Any such batch will simply not run. We make heavy use of backward jumps in igt, and it is much easier to add support for this restricted subset of nested jumps, than to rewrite the whole of our test suite to avoid them. Add the required logic to support limited backward jumps, to instructions that have already been validated by the parser. Note that it's not sufficient to simply approve any BB_START that jumps backwards in the buffer because this would allow an attacker to embed a rogue instruction sequence within the operand words of a harmless instruction (say LRI) and jump to that. We introduce a bit array to track every instr offset successfully validated, and test the target of BB_START against this. If the target offset hits, it is re-written to the same offset in the shadow buffer and the BB_START cmd is allowed. Note: This patch deliberately ignores checkpatch issues in the cmdtables, in order to match the style of the surrounding code. We'll correct the entire file in one go in a later patch. v2: set dispatch secure late (Mika) v3: rebase (Mika) v4: Clear whitelist on each parse Minor review updates (Chris) v5: Correct backward jump batching v6: fix compilation error due to struct eb shuffle (Mika) Cc: Tony Luck <tony.luck@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
|
#
0546a29c |
|
27-Sep-2018 |
Jon Bloomfield <jon.bloomfield@intel.com> |
drm/i915/cmdparser: Use explicit goto for error paths In the next patch we will be adding a second valid termination condition which will require a small amount of refactoring to share logic with the BB_END case. Refactor all error conditions to jump to a dedicated exit path, with 'break' reserved only for a successful parse. Cc: Tony Luck <tony.luck@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
|
#
0f2f3975 |
|
23-Apr-2018 |
Jon Bloomfield <jon.bloomfield@intel.com> |
drm/i915: Add gen9 BCS cmdparsing For gen9 we enable cmdparsing on the BCS ring, specifically to catch inadvertent accesses to sensitive registers Unlike gen7/hsw, we use the parser only to block certain registers. We can rely on h/w to block restricted commands, so the command tables only provide enough info to allow the parser to delineate each command, and identify commands that access registers. Note: This patch deliberately ignores checkpatch issues in favour of matching the style of the surrounding code. We'll correct the entire file in one go in a later patch. v3: rebase (Mika) v4: Add RING_TIMESTAMP registers to whitelist (Jon) Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
|
#
311a50e7 |
|
01-Aug-2018 |
Jon Bloomfield <jon.bloomfield@intel.com> |
drm/i915: Add support for mandatory cmdparsing The existing cmdparser for gen7 can be bypassed by specifying batch_len=0 in the execbuf call. This is safe because bypassing simply reduces the cmd-set available. In a later patch we will introduce cmdparsing for gen9, as a security measure, which must be strictly enforced since without it we are vulnerable to DoS attacks. Introduce the concept of 'required' cmd parsing that cannot be bypassed by submitting zero-length bb's. v2: rebase (Mika) v2: rebase (Mika) v3: fix conflict on engine flags (Mika) Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
|
#
66d8aba1 |
|
08-Jun-2018 |
Jon Bloomfield <jon.bloomfield@intel.com> |
drm/i915: Remove Master tables from cmdparser The previous patch has killed support for secure batches on gen6+, and hence the cmdparsers master tables are now dead code. Remove them. Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tyler Hicks <tyhicks@canonical.com> Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
|
#
0a2f661b |
|
20-Apr-2018 |
Jon Bloomfield <jon.bloomfield@intel.com> |
drm/i915: Rename gen7 cmdparser tables We're about to introduce some new tables for later gens, and the current naming for the gen7 tables will no longer make sense. v2: rebase Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
|
#
9c9082b9 |
|
08-Aug-2019 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: extract i915_memcpy.h from i915_drv.h It used to be handy that we only had a couple of headers, but over time i915_drv.h has become unwieldy. Extract declarations to a separate header file corresponding to the implementation module, clarifying the modularity of the driver. Ensure the new header is self-contained, and do so with minimal further includes, using forward declarations as needed. Include the new header only where needed, and sort the modified include directives while at it and as needed. No functional changes. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/f2b887002150acdf218385ea846f7aa617aa5f15.1565271681.git.jani.nikula@intel.com
|
#
750e76b4 |
|
06-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Move the [class][inst] lookup for engines onto the GT To maintain a fast lookup from a GT centric irq handler, we want the engine lookup tables on the intel_gt. To avoid having multiple copies of the same multi-dimension lookup table, move the generic user engine lookup into an rbtree (for fast and flexible indexing). v2: Split uabi_instance cf uabi_class v3: Set uabi_class/uabi_instance after collating all engines to provide a stable uabi across parallel unordered construction. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2 Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-2-chris@chris-wilson.co.uk
|
#
6951e589 |
|
28-May-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Move GEM object domain management from struct_mutex to local Use the per-object local lock to control the cache domain of the individual GEM objects, not struct_mutex. This is a huge leap forward for us in terms of object-level synchronisation; execbuffers are coordinated using the ww_mutex and pread/pwrite is finally fully serialised again. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-10-chris@chris-wilson.co.uk
|
#
f0e4a063 |
|
28-May-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Move GEM domain management to its own file Continuing the decluttering of i915_gem.c, that of the read/write domains, perhaps the biggest of GEM's follies? Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-7-chris@chris-wilson.co.uk
|
#
112ed2d3 |
|
24-Apr-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Move GraphicsTechnology files under gt/ Start partitioning off the code that talks to the hardware (GT) from the uapi layers and move the device facing code under gt/ One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to subdivide that header and body further (and split out the submission code from the ringbuffer and logical context handling). This patch aims to be simple motion so git can fixup inflight patches with little mess. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
|
#
8a68d464 |
|
05-Mar-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Store the BIT(engine->id) as the engine's mask In the next patch, we are introducing a broad virtual engine to encompass multiple physical engines, losing the 1:1 nature of BIT(engine->id). To reflect the broader set of engines implied by the virtual instance, lets store the full bitmask. v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/) v3: Tvrtko voted for moah churn so teach everyone to not mention ring and use $class$instance throughout. v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and VCS[0-4] in later gen. We opt to keep the code consistent and use 0-index naming throughout. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk
|
#
cf819eff |
|
12-Dec-2018 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915: replace IS_GEN<N> with IS_GEN(..., N) Define IS_GEN() similarly to our IS_GEN_RANGE(). but use gen instead of gen_mask to do the comparison. Now callers can pass then gen as a parameter, so we don't require one macro for each gen. The following spatch was used to convert the users of these macros: @@ expression e; @@ ( - IS_GEN2(e) + IS_GEN(e, 2) | - IS_GEN3(e) + IS_GEN(e, 3) | - IS_GEN4(e) + IS_GEN(e, 4) | - IS_GEN5(e) + IS_GEN(e, 5) | - IS_GEN6(e) + IS_GEN(e, 6) | - IS_GEN7(e) + IS_GEN(e, 7) | - IS_GEN8(e) + IS_GEN(e, 8) | - IS_GEN9(e) + IS_GEN(e, 9) | - IS_GEN10(e) + IS_GEN(e, 10) | - IS_GEN11(e) + IS_GEN(e, 11) ) v2: use IS_GEN rather than GT_GEN and compare to info.gen rather than using the bitmask Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-2-lucas.demarchi@intel.com
|
#
b3ad99ed |
|
05-Feb-2018 |
Michal Srb <msrb@suse.com> |
drm/i915/cmdparser: Do not check past the cmd length. The command MEDIA_VFE_STATE checks bits at offset +2 dwords. However, it is possible to have MEDIA_VFE_STATE command with length = 0 + LENGTH_BIAS = 2. In that case check_cmd will read bits from the following command, or even past the end of the buffer. If the offset ends up outside of the command length, reject the command. Fixes: 351e3db2b363 ("drm/i915: Implement command buffer parsing logic") Signed-off-by: Michal Srb <msrb@suse.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180205151745.29292-1-msrb@suse.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180205160438.3267-2-chris@chris-wilson.co.uk (cherry picked from commit 3aec7f871c65eb5f76b4125fda432593c834a6f2) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
b18224e9 |
|
05-Feb-2018 |
Michal Srb <msrb@suse.com> |
drm/i915/cmdparser: Check reg_table_count before derefencing. The find_reg function was assuming that there is always at least one table in reg_tables. It is not always true. In case of VCS or VECS, the reg_tables is NULL and reg_table_count is 0, implying that no register-accessing commands are allowed. However, the command tables include commands such as MI_STORE_REGISTER_MEM. When trying to check such command, the find_reg would dereference NULL pointer. Now it will just return NULL meaning that the register was not found and the command will be rejected. Fixes: 76ff480ec963 ("drm/i915/cmdparser: Use binary search for faster register lookup") Signed-off-by: Michal Srb <msrb@suse.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180205142916.27092-2-msrb@suse.com Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180205160438.3267-1-chris@chris-wilson.co.uk register lookup") (cherry picked from commit 2f265fad9756a40c09e3f4dcc62d5d7fa73a9fb2) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
3aec7f87 |
|
05-Feb-2018 |
Michal Srb <msrb@suse.com> |
drm/i915/cmdparser: Do not check past the cmd length. The command MEDIA_VFE_STATE checks bits at offset +2 dwords. However, it is possible to have MEDIA_VFE_STATE command with length = 0 + LENGTH_BIAS = 2. In that case check_cmd will read bits from the following command, or even past the end of the buffer. If the offset ends up outside of the command length, reject the command. Fixes: 351e3db2b363 ("drm/i915: Implement command buffer parsing logic") Signed-off-by: Michal Srb <msrb@suse.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180205151745.29292-1-msrb@suse.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180205160438.3267-2-chris@chris-wilson.co.uk
|
#
2f265fad |
|
05-Feb-2018 |
Michal Srb <msrb@suse.com> |
drm/i915/cmdparser: Check reg_table_count before derefencing. The find_reg function was assuming that there is always at least one table in reg_tables. It is not always true. In case of VCS or VECS, the reg_tables is NULL and reg_table_count is 0, implying that no register-accessing commands are allowed. However, the command tables include commands such as MI_STORE_REGISTER_MEM. When trying to check such command, the find_reg would dereference NULL pointer. Now it will just return NULL meaning that the register was not found and the command will be rejected. Fixes: 76ff480ec963 ("drm/i915/cmdparser: Use binary search for faster register lookup") Signed-off-by: Michal Srb <msrb@suse.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180205142916.27092-2-msrb@suse.com Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180205160438.3267-1-chris@chris-wilson.co.uk register lookup")
|
#
439e2ee4 |
|
29-Nov-2017 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Move engine->needs_cmd_parser to engine->flags Will be adding a new per-engine flags shortly so it makes sense to consolidate. v2: Keep the original code flow in intel_engine_cleanup_cmd_parser. (Joonas Lahtinen) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171129082409.18189-1-tvrtko.ursulin@linux.intel.com
|
#
0ffba1fc |
|
07-Nov-2017 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Silence smatch for cmdparser drivers/gpu/drm/i915/i915_cmd_parser.c:808:23: error: not an lvalue drivers/gpu/drm/i915/i915_cmd_parser.c:811:23: error: not an lvalue drivers/gpu/drm/i915/i915_cmd_parser.c:814:23: error: not an lvalue drivers/gpu/drm/i915/i915_cmd_parser.c:808:23: error: not an lvalue drivers/gpu/drm/i915/i915_cmd_parser.c:811:23: error: not an lvalue drivers/gpu/drm/i915/i915_cmd_parser.c:814:23: error: not an lvalue drivers/gpu/drm/i915/i915_cmd_parser.c:808:23: error: not an lvalue drivers/gpu/drm/i915/i915_cmd_parser.c:811:23: error: not an lvalue drivers/gpu/drm/i915/i915_cmd_parser.c:814:23: error: not an lvalue drivers/gpu/drm/i915/i915_cmd_parser.c:808:23: error: not an lvalue drivers/gpu/drm/i915/i915_cmd_parser.c:811:23: error: not an lvalue drivers/gpu/drm/i915/i915_cmd_parser.c:814:23: error: not an lvalue If we move the shift into each case not only do we kill the warning from smatch, but we shrink the code slightly: text data bss dec hex filename 1267906 20587 3168 1291661 13b58d before 1267890 20587 3168 1291645 13b57d after Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171107154055.19460-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
|
#
3b24e7e8 |
|
28-Aug-2017 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Recreate vmapping even when the object is pinned Sometimes we know we are the only user of the bo, but since we take a protective pin_pages early on, an attempt to change the vmap on the object is denied because it is busy. i915_gem_object_pin_map() cannot tell from our single pin_count if the operation is safe. Instead we must pass that information down from the caller in the manner of I915_MAP_OVERRIDE. This issue has existed from the introduction of the mapping, but was never noticed as the only place where this conflict might happen is for cached kernel buffers (such as allocated by i915_gem_batch_pool_get()). Until recently there was only a single user (the cmdparser) so no conflicts ever occurred. However, we now use it to allocate batches for different operations (using MAP_WC on !llc for writes) in addition to the existing shadow batch (using MAP_WB for reads). We could either keep both mappings cached, or use a different write mechanism if we detect a MAP_WB already exists (i.e. clflush afterwards), but as we haven't seen this issue in the wild (it requires hitting the GPU reloc path in addition to the cmdparser) for simplicity just allow the mappings to be recreated. v2: Include the i915_MAP_OVERRIDE bit in the enum so the compiler knows about all the valid values. Fixes: 7dd4f6729f92 ("drm/i915: Async GPU relocation processing") Testcase: igt/gem_lut_handle # byt, completely by accident Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170828104631.8606-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit a575c6761757232ea2c7dc9f370640754b90cc69) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
a575c676 |
|
28-Aug-2017 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Recreate vmapping even when the object is pinned Sometimes we know we are the only user of the bo, but since we take a protective pin_pages early on, an attempt to change the vmap on the object is denied because it is busy. i915_gem_object_pin_map() cannot tell from our single pin_count if the operation is safe. Instead we must pass that information down from the caller in the manner of I915_MAP_OVERRIDE. This issue has existed from the introduction of the mapping, but was never noticed as the only place where this conflict might happen is for cached kernel buffers (such as allocated by i915_gem_batch_pool_get()). Until recently there was only a single user (the cmdparser) so no conflicts ever occurred. However, we now use it to allocate batches for different operations (using MAP_WC on !llc for writes) in addition to the existing shadow batch (using MAP_WB for reads). We could either keep both mappings cached, or use a different write mechanism if we detect a MAP_WB already exists (i.e. clflush afterwards), but as we haven't seen this issue in the wild (it requires hitting the GPU reloc path in addition to the cmdparser) for simplicity just allow the mappings to be recreated. v2: Include the i915_MAP_OVERRIDE bit in the enum so the compiler knows about all the valid values. Fixes: 7dd4f6729f92 ("drm/i915: Async GPU relocation processing") Testcase: igt/gem_lut_handle # byt, completely by accident Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170828104631.8606-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
#
0ce81788 |
|
17-May-2017 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Redefine ptr_pack_bits() and friends Rebrand the current (pointer | bits) pack/unpack utility macros as explicit bit twiddling for PAGE_SIZE so that we can use the more flexible underlying macros for different bits. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-4-chris@chris-wilson.co.uk
|
#
1d39f281 |
|
11-Apr-2017 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Rename intel_engine_cs.exec_id to uabi_id We want to refer to the index of the engine consistently throughout the userspace ABI. We already have such an index through the execbuffer engine specifier, that needs to be able to refer to each engine specifically, so rename it the index to uabi_id to reflect its generality beyond execbuf. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170411124306.15448-1-chris@chris-wilson.co.uk
|
#
504ae402 |
|
10-Mar-2017 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/cmdparser: Limit clflush to active cachelines We only need to clflush those cachelines that we have validated to be read by the GPU. Userspace typically fills the batch length in correctly, the exceptions tend to be explicit tests within igt. v2: Use ptr_mask_bits() to make Mika happy v3: cmd is not advanced on MI_BBE, so make sure to include an extra dword in the clflush. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170310115518.13832-1-chris@chris-wilson.co.uk
|
#
c4d3ae68 |
|
06-Jan-2017 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Consolidate checks for memcpy-from-wc support In order to silence sparse: ../drivers/gpu/drm/i915/i915_gpu_error.c:200:39: warning: Using plain integer as NULL pointer add a helper to check whether we have sse4.1 and that the desired alignment is valid for acceleration. v2: Explain the macros and split the two use cases between i915_has_memcpy_from_wc() and i915_can_memcpy_from_wc(). Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170106152013.24684-1-chris@chris-wilson.co.uk
|
#
41736a8e |
|
23-Nov-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Use the precomputed value for whether to enable command parsing As i915.enable_cmd_parser is an unsafe option, make it read-only at runtime. Now that it is constant, we can use the value determined during initialisation as to whether we need the cmdparser at execbuffer time. v2: Remove the inline for its single user, it is clear enough (and shorter) without! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161124125851.6615-1-chris@chris-wilson.co.uk
|
#
007873b3 |
|
23-Nov-2016 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915: kick out cmd_parser specific structs from i915_drv.h No sense in keeping the cmd_descriptor and cmd_table structs in i915_drv.h, now that they are no longer referenced externally. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1479942147-9837-1-git-send-email-matthew.auld@intel.com
|
#
e3f51ece |
|
14-Nov-2016 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915: cleanup use of INSTR_CLIENT_MASK Doing cmd_header >> 29 to extract our 3-bit client value where we know cmd_header is a u32 shouldn't then also require the use of a mask. So remove the redundant operation and get rid of INSTR_CLIENT_MASK now that there are no longer any users. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1479163174-29686-1-git-send-email-matthew.auld@intel.com
|
#
10ff401d |
|
07-Nov-2016 |
Robert Bragg <robert@sixbynine.org> |
drm/i915: don't whitelist oacontrol in cmd parser Being able to program OACONTROL from a non-privileged batch buffer is not sufficient to be able to configure the OA unit. This was originally allowed to help enable Mesa to expose OA counters via the INTEL_performance_query extension, but the current implementation based on programming OACONTROL via a batch buffer isn't able to report useable data without a more complete OA unit configuration. Mesa handles the possibility that writes to OACONTROL may not be allowed and so only advertises the extension after explicitly testing that a write to OACONTROL succeeds. Based on this; removing OACONTROL from the whitelist should be ok for userspace. Removing this simplifies adding a new kernel api for configuring the OA unit without needing to consider the possibility that userspace might trample on OACONTROL state which we'd like to start managing within the kernel instead. In particular running any Mesa based GL application currently results in clearing OACONTROL when initializing which would disable the capturing of metrics. v2: This bumps the command parser version from 8 to 9, as the change is visible to userspace. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Sourab Gupta <sourab.gupta@intel.com>Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161108125148.25007-1-robert@sixbynine.org
|
#
9bbeaedb |
|
07-Nov-2016 |
Robert Bragg <robert@sixbynine.org> |
drm/i915: return EACCES for check_cmd() failures check_cmd() is checking whether a command adheres to certain restrictions that ensure it's safe to execute within a privileged batch buffer. Returning false implies a privilege problem, not that the command is invalid. The distinction makes the difference between allowing the buffer to be executed as an unprivileged batch buffer or returning an EINVAL error to userspace without executing anything. In a case where userspace may want to test whether it can successfully write to a register that needs privileges the distinction may be important and an EINVAL error may be considered fatal. In particular this is currently true for Mesa, which includes a test for whether OACONTROL can be written too, but Mesa treats any error when flushing a batch buffer as fatal, calling exit(1). As it is currently Mesa can gracefully handle a failure to write to OACONTROL if the command parser is disabled, but if we were to remove OACONTROL from the parser's whitelist then the returned EINVAL would break Mesa applications as they attempt an OACONTROL write. This bumps the command parser version from 7 to 8, as the change is visible to userspace. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Sourab Gupta <sourab.gupta@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161107194957.3385-4-robert@sixbynine.org
|
#
a941795a |
|
07-Nov-2016 |
Robert Bragg <robert@sixbynine.org> |
drm/i915: rename OACONTROL GEN7_OACONTROL OACONTROL changes quite a bit for gen8, with some bits split out into a per-context OACTXCONTROL register. Rename now before adding more gen7 OA registers Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Sourab Gupta <sourab.gupta@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161107194957.3385-3-robert@sixbynine.org
|
#
a4f5ea64 |
|
28-Oct-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Refactor object page API The plan is to make obtaining the backing storage for the object avoid struct_mutex (i.e. use its own locking). The first step is to update the API so that normal users only call pin/unpin whilst working on the backing storage. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-12-chris@chris-wilson.co.uk
|
#
3b3f1650 |
|
13-Oct-2016 |
Akash Goel <akash.goel@intel.com> |
drm/i915: Allocate intel_engine_cs structure only for the enabled engines With the possibility of addition of many more number of rings in future, the drm_i915_private structure could bloat as an array, of type intel_engine_cs, is embedded inside it. struct intel_engine_cs engine[I915_NUM_ENGINES]; Though this is still fine as generally there is only a single instance of drm_i915_private structure used, but not all of the possible rings would be enabled or active on most of the platforms. Some memory can be saved by allocating intel_engine_cs structure only for the enabled/active engines. Currently the engine/ring ID is kept static and dev_priv->engine[] is simply indexed using the enums defined in intel_engine_id. To save memory and continue using the static engine/ring IDs, 'engine' is defined as an array of pointers. struct intel_engine_cs *engine[I915_NUM_ENGINES]; dev_priv->engine[engine_ID] will be NULL for disabled engine instances. There is a text size reduction of 928 bytes, from 1028200 to 1027272, for i915.o file (but for i915.ko file text size remain same as 1193131 bytes). v2: - Remove the engine iterator field added in drm_i915_private structure, instead pass a local iterator variable to the for_each_engine** macros. (Chris) - Do away with intel_engine_initialized() and instead directly use the NULL pointer check on engine pointer. (Chris) v3: - Remove for_each_engine_id() macro, as the updated macro for_each_engine() can be used in place of it. (Chris) - Protect the access to Render engine Fault register with a NULL check, as engine specific init is done later in Driver load sequence. v4: - Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris) - Kill the superfluous init_engine_lists(). v5: - Cleanup the intel_engines_init() & intel_engines_setup(), with respect to allocation of intel_engine_cs structure. (Chris) v6: - Rebase. v7: - Optimize the for_each_engine_masked() macro. (Chris) - Change the type of 'iter' local variable to enum intel_engine_id. (Chris) - Rebase. v8: Rebase. v9: Rebase. v10: - For index calculation use engine ID instead of pointer based arithmetic in intel_engine_sync_index() as engine pointers are not contiguous now (Chris) - For appropriateness, rename local enum variable 'iter' to 'id'. (Joonas) - Use for_each_engine macro for cleanup in intel_engines_init() and remove check for NULL engine pointer in cleanup() routines. (Joonas) v11: Rebase. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Akash Goel <akash.goel@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1476378888-7372-1-git-send-email-akash.goel@intel.com
|
#
911f4869 |
|
15-Sep-2016 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: use NULL for NULL pointers Fix sparse warning: drivers/gpu/drm/i915/i915_cmd_parser.c:987:72: warning: Using plain integer as NULL pointer Fixes: 52a42cec4b70 ("drm/i915/cmdparser: Accelerate copies from WC memory") Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1473946137-1931-4-git-send-email-jani.nikula@intel.com
|
#
52a42cec |
|
18-Aug-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/cmdparser: Accelerate copies from WC memory If we need to use clflush to prepare our batch for reads from memory, we can bypass the cache instead by using non-temporal copies. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-39-chris@chris-wilson.co.uk
|
#
76ff480e |
|
18-Aug-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/cmdparser: Use binary search for faster register lookup A significant proportion of the cmdparsing time for some batches is the cost to find the register in the mmiotable. We ensure that those tables are in ascending order such that we could do a binary search if it was ever merited. It is. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-38-chris@chris-wilson.co.uk
|
#
ea884f09 |
|
18-Aug-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/cmdparser: Check for SKIP descriptors first If the command descriptor says to skip it, ignore checking for anyother other conflict. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-37-chris@chris-wilson.co.uk
|
#
efdfd91f |
|
18-Aug-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/cmdparser: Compare against the previous command descriptor On the blitter (and in test code), we see long sequences of repeated commands, e.g. XY_PIXEL_BLT, XY_SCANLINE_BLT, or XY_SRC_COPY. For these, we can skip the hashtable lookup by remembering the previous command descriptor and doing a straightforward compare of the command header. The corollary is that we need to do one extra comparison before lookup up new commands. v2: Less magic mask (ok, it is still magic, but now you cannot see!) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-36-chris@chris-wilson.co.uk
|
#
d6a4ead7 |
|
18-Aug-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/cmdparser: Improve hash function The existing code's hashfunction is very suboptimal (most 3D commands use the same bucket degrading the hash to a long list). The code even acknowledge that the issue was known and the fix simple: /* * If we attempt to generate a perfect hash, we should be able to look at bits * 31:29 of a command from a batch buffer and use the full mask for that * client. The existing INSTR_CLIENT_MASK/SHIFT defines can be used for this. */ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-35-chris@chris-wilson.co.uk
|
#
ed13033f |
|
18-Aug-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/cmdparser: Only cache the dst vmap For simplicity, we want to continue using a contiguous mapping of the command buffer, but we can reduce the number of vmappings we hold by switching over to a page-by-page copy from the user batch buffer to the shadow. The cost for saving one linear mapping is about 5% in trivial workloads - which is more or less the overhead in calling kmap_atomic(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-34-chris@chris-wilson.co.uk
|
#
0b537272 |
|
18-Aug-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/cmdparser: Use cached vmappings The single largest factor in the overhead of parsing the commands is the setup of the virtual mapping to provide a continuous block for the batch buffer. If we keep those vmappings around (against the better judgement of mm/vmalloc.c, which we offset by handwaving and looking suggestively at the shrinker) we can dramatically improve the performance of the parser for small batches (such as media workloads). Furthermore, we can use the prepare shmem read/write functions to determine how best we need to clflush the range (rather than every page of the object). The impact of caching both src/dst vmaps is +80% on ivb and +140% on byt for the throughput on small batches. (Caching just the dst vmap and iterating over the src, doing a page by page copy is roughly 5% slower on both platforms. That may be an acceptable trade-off to eliminate one cached vmapping, and we may be able to reduce the per-page copying overhead further.) For *this* simple test case, the cmdparser is now within a factor of 2 of ideal performance. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-33-chris@chris-wilson.co.uk
|
#
068715b9 |
|
18-Aug-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/cmdparser: Add the TIMESTAMP register for the other engines Since I have been using the BCS_TIMESTAMP to measure latency of execution upon the blitter ring, allow regular userspace to also read from that register. They are already allowed RCS_TIMESTAMP! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-32-chris@chris-wilson.co.uk
|
#
7756e454 |
|
18-Aug-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/cmdparser: Make initialisation failure non-fatal If the developer adds a register in the wrong order, we BUG during boot. That makes development and testing very difficult. Let's be a bit more friendly and disable the command parser with a big warning if the tables are invalid. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-31-chris@chris-wilson.co.uk
|
#
43394c7d |
|
18-Aug-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Extract i915_gem_obj_prepare_shmem_write() This is a companion to i915_gem_obj_prepare_shmem_read() that prepares the backing storage for direct writes. It first serialises with the GPU, pins the backing storage and then indicates what clfushes are required in order for the writes to be coherent. Whilst here, fix support for ancient CPUs without clflush for which we cannot do the GTT+clflush tricks. v2: Add i915_gem_obj_finish_shmem_access() for symmetry Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-8-chris@chris-wilson.co.uk
|
#
33a051a5 |
|
27-Jul-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/cmdparser: Remove stray intel_engine_cs *ring When we refer to intel_engine_cs, we want to use engine so as not to confuse ourselves about ringbuffers. v2: Rename all the functions as well, as well as a few more stray comments. v3: Split the really long error message strings Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-6-git-send-email-chris@chris-wilson.co.uk Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469606850-28659-1-git-send-email-chris@chris-wilson.co.uk
|
#
14bb2c11 |
|
03-Jun-2016 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Fix a buch of kerneldoc warnings Just a bunch of stale kerneldocs generating warnings when building the docs. Mostly function parameters so not very useful but still. v2: Tidy. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1464958937-23344-1-git-send-email-tvrtko.ursulin@linux.intel.com
|
#
c033666a |
|
06-May-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Store a i915 backpointer from engine, and use it text data bss dec hex filename 6309351 3578714 696320 10584385 a18141 vmlinux 6308391 3578714 696320 10583425 a17d81 vmlinux Almost 1KiB of code reduction. v2: More s/INTEL_INFO()->gen/INTEL_GEN()/ and IS_GENx() conversions text data bss dec hex filename 6304579 3578778 696320 10579677 a16edd vmlinux 6303427 3578778 696320 10578525 a16a5d vmlinux Now over 1KiB! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1462545621-30125-3-git-send-email-chris@chris-wilson.co.uk
|
#
6761d0a1 |
|
06-May-2016 |
Kenneth Graunke <kenneth@whitecape.org> |
drm/i915: Allow MI_LOAD_REGISTER_REG between whitelisted registers. Allowing register copies where the source and destination are both whitelisted should be safe, and is useful. For example, Mesa uses this to load the command streamer math registers with data from the pipeline statistics counters. v2: Reject writes to OACONTROL (and reads as well :( Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1462521014-13595-1-git-send-email-chris@chris-wilson.co.uk
|
#
1ca3712c |
|
04-May-2016 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Report command parser version 0 if disabled If the command parser is not active, then it is appropriate to report it as operating at version 0 as no higher mode is supported. This greatly simplifies userspace querying for the command parser as we then do not need to second guess when it will be active (a mixture of module parameters and generational support, which may change over time). v2: s/comand/command/ misspelling in comment Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1462368336-21230-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
#
6cf0716c |
|
07-Mar-2016 |
Jordan Justen <jordan.l.justen@intel.com> |
drm/i915: Bump command parser version for new whitelisted registers Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1457335830-30923-6-git-send-email-jordan.l.justen@intel.com
|
#
1b85066b |
|
07-Mar-2016 |
Jordan Justen <jordan.l.justen@intel.com> |
drm/i915: Add Haswell CS GPR registers to whitelist This is needed for the Mesa Vulkan driver on Haswell. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1457335830-30923-5-git-send-email-jordan.l.justen@intel.com
|
#
99c5aeca |
|
07-Mar-2016 |
Jordan Justen <jordan.l.justen@intel.com> |
drm/i915: Move Haswell registers to separate whitelist table Now that we can whitelist registers only on Haswell, move HSW_SCRATCH1 and HSW_ROW_CHICKEN3 into a separate Haswell only table. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1457335830-30923-4-git-send-email-jordan.l.justen@intel.com
|
#
361b027b |
|
07-Mar-2016 |
Jordan Justen <jordan.l.justen@intel.com> |
drm/i915: Use an array of register tables in command parser For Haswell, we will want another table of registers while retaining the large common table of whitelisted registers shared by all gen7 devices. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> [danvet: Pipe patch through sed -e 's/\<ring\>/engine/g' to make it apply.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
a6573e1f |
|
07-Mar-2016 |
Jordan Justen <jordan.l.justen@intel.com> |
drm/i915: Add TIMESTAMP to register whitelist This is needed for the Mesa Vulkan driver on Haswell. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1457335830-30923-2-git-send-email-jordan.l.justen@intel.com
|
#
0bc40be8 |
|
16-Mar-2016 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Rename intel_engine_cs function parameters @@ identifier func; @@ func(..., struct intel_engine_cs * - ring + engine , ...) { <... - ring + engine ...> } @@ identifier func; type T; @@ T func(..., struct intel_engine_cs * - ring + engine , ...); Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
#
f0f59a00 |
|
18-Nov-2015 |
Ville Syrjälä <ville.syrjala@linux.intel.com> |
drm/i915: Type safe register read/write Make I915_READ and I915_WRITE more type safe by wrapping the register offset in a struct. This should eliminate most of the fumbles we've had with misplaced parens. This only takes care of normal mmio registers. We could extend the idea to other register types and define each with its own struct. That way you wouldn't be able to accidentally pass the wrong thing to a specific register access function. The gpio_reg setup is probably the ugliest thing left. But I figure I'd just leave it for now, and wait for some divine inspiration to strike before making it nice. As for the generated code, it's actually a bit better sometimes. Eg. looking at i915_irq_handler(), we can see the following change: lea 0x70024(%rdx,%rax,1),%r9d mov $0x1,%edx - movslq %r9d,%r9 - mov %r9,%rsi - mov %r9,-0x58(%rbp) - callq *0xd8(%rbx) + mov %r9d,%esi + mov %r9d,-0x48(%rbp) callq *0xd8(%rbx) So previously gcc thought the register offset might be signed and decided to sign extend it, just in case. The rest appears to be mostly just minor shuffling of instructions. v2: i915_mmio_reg_{offset,equal,valid}() helpers added s/_REG/_MMIO/ in the register defines mo more switch statements left to worry about ring_emit stuff got sorted in a prep patch cmd parser, lrc context and w/a batch buildup also in prep patch vgpu stuff cleaned up and moved to a prep patch all other unrelated changes split out v3: Rebased due to BXT DSI/BLC, MOCS, etc. v4: Rebased due to churn, s/i915_mmio_reg_t/i915_reg_t/ Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1447853606-2751-1-git-send-email-ville.syrjala@linux.intel.com
|
#
e597ef40 |
|
06-Nov-2015 |
Ville Syrjälä <ville.syrjala@linux.intel.com> |
drm/i915: Make the cmd parser 64bit regs explicit Add defines for the upper halves of the registers used by the cmd parser. Getting rid of the arithmetic with the register offset will help in making registers type safe. v2: s/_HI/_UDW/ (Chris) Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1446839080-18732-1-git-send-email-ville.syrjala@linux.intel.com
|
#
7b9748cb |
|
02-Oct-2015 |
Jordan Justen <jordan.l.justen@intel.com> |
drm/i915: Add GEN7_GPGPU_DISPATCHDIMX/Y/Z to the register whitelist This is required to support glDispatchComputeIndirect for gen7. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
614f4ad7 |
|
01-Sep-2015 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Fix cmdparser STORE/LOAD command descriptors Fixes regression from commit f1afe24f0e736b9d7f2275e2b1504af3fe612f2a Author: Arun Siluvery <arun.siluvery@linux.intel.com> Date: Tue Aug 4 16:22:20 2015 +0100 drm/i915: Change SRM, LRM instructions to use correct length which forgot to account for the length bias when declaring the fixed length. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91844 Reported-by: Andreas Reis <andreas.reis@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Gordon <david.s.gordon@intel.com> Cc: Arun Siluvery <arun.siluvery@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
2bbe6bbb |
|
15-Jun-2015 |
Francisco Jerez <currojerez@riseup.net> |
drm/i915: Bump command parser version number. This was forgotten in commit d351f6d94893f3ba98b1b20c5ef44c35fc1da124 Author: Francisco Jerez <currojerez@riseup.net> Date: Fri May 29 16:44:15 2015 +0300 drm/i915: Add SCRATCH1 and ROW_CHICKEN3 to the register whitelist. Signed-off-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
f1afe24f |
|
04-Aug-2015 |
Arun Siluvery <arun.siluvery@linux.intel.com> |
drm/i915: Change SRM, LRM instructions to use correct length MI_STORE_REGISTER_MEM, MI_LOAD_REGISTER_MEM instructions are not really variable length instructions unlike MI_LOAD_REGISTER_IMM where it expects (reg, addr) pairs so use fixed length for these instructions. v2: rebase Cc: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [danvet: Appease checkpatch as Mika spotted in i915_reg.h - it seems terminally unhappy about i915_cmd_parser.c so that would be a separate patch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
8453580c |
|
29-Jul-2015 |
Hanno Böck <hanno@hboeck.de> |
drm/i915: Fix command parser table validator As we may like to use a bisection search on the tables in future, we need them to be ordered. For convenience we expect the compiled tables to be order and check on initialisation. However, the validator used the wrong iterators failed to spot the misordered MI tables and instead walked off into the unknown (as spotted by kasan). Signed-off-by: Hanno Boeck <hanno@hboeck.de> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Again hand-assemble patch ...] Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
|
#
9f58582c |
|
29-Jul-2015 |
Hanno Böck <hanno@hboeck.de> |
drm/i915: Properly sort MI coomand table In the future, we may want to speed up command/register searching using a bisection and so we require them to be in ascending order respectively by command value or register address. However, this was not true for one pair in the MI table; make it so. Signed-off-by: Hanno Boeck <hanno@hboeck.de> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Hand-assemble patch from raw patch from Hanno and commit message from Chris.] Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
|
#
9e000847 |
|
03-Jul-2015 |
Arun Siluvery <arun.siluvery@linux.intel.com> |
drm/i915: Update WaFlushCoherentL3CacheLinesAtContextSwitch In this WA we need to set GEN8_L3SQCREG4[21:21] and reset it after PIPE_CONTROL instruction but there is a slight complication as this is applied in WA batch where the values are only initialized once. Dave identified an issue with the current implementation where the register value is read once at the beginning and it is reused; this patch corrects this by saving the register value to memory, update register with the bit of our interest and restore it back with original value. This implementation uses MI_LOAD_REGISTER_MEM which is currently only used by command parser and was using a default length of 0. This is now updated with correct length and moved to appropriate place. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
41d232b7 |
|
29-May-2015 |
Francisco Jerez <currojerez@riseup.net> |
drm/i915: Add SCRATCH1 and ROW_CHICKEN3 to the register whitelist. Only bit 27 of SCRATCH1 and bit 6 of ROW_CHICKEN3 are allowed to be set because of security-sensitive bits we don't want userspace to mess with. On HSW hardware the whitelisted bits control whether atomic read-modify-write operations are performed on L3 or on GTI, and when set to L3 (which can be 10x-30x better performing than on GTI, depending on the application) require great care to avoid a system hang, so we currently program them to be handled on GTI by default. Beignet can immediately start taking advantage of this change to enable L3 atomics. Mesa should eventually switch to L3 atomics too, but a number of non-trivial changes are still required so it will continue using GTI atomics for now. Signed-off-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
c1091b25 |
|
29-May-2015 |
Francisco Jerez <currojerez@riseup.net> |
drm/i915: Extend the parser to check register writes against a mask/value pair. In some cases it might be unnecessary or dangerous to give userspace the right to write arbitrary values to some register, even though it might be desirable to give it control of some of its bits. This patch extends the register whitelist entries to contain a mask/value pair in addition to the register offset. For registers with non-zero mask, any LRM writes and LRI writes where the bits of the immediate given by the mask don't match the specified value will be rejected. This will be used in my next patch to grant userspace partial write access to some sensitive registers. Signed-off-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
8a389cac |
|
29-May-2015 |
Francisco Jerez <currojerez@riseup.net> |
drm/i915: Fix command parser to validate multiple register access with the same command. Until now the software command checker assumed that commands could read or write at most a single register per packet. This is not necessarily the case, MI_LOAD_REGISTER_IMM expects a variable-length list of offset/value pairs and writes them in sequence. The previous code would only check whether the first entry was valid, effectively allowing userspace to write unrestricted registers of the MMIO space by sending a multi-register write with a legal first register, with potential security implications on Gen6 and 7 hardware. Fix it by extending the drm_i915_cmd_descriptor table to represent multi-register access and making validate_cmd() iterate for all register offsets present in the command packet. Signed-off-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
d351f6d9 |
|
29-May-2015 |
Francisco Jerez <currojerez@riseup.net> |
drm/i915: Add SCRATCH1 and ROW_CHICKEN3 to the register whitelist. Only bit 27 of SCRATCH1 and bit 6 of ROW_CHICKEN3 are allowed to be set because of security-sensitive bits we don't want userspace to mess with. On HSW hardware the whitelisted bits control whether atomic read-modify-write operations are performed on L3 or on GTI, and when set to L3 (which can be 10x-30x better performing than on GTI, depending on the application) require great care to avoid a system hang, so we currently program them to be handled on GTI by default. Beignet can immediately start taking advantage of this change to enable L3 atomics. Mesa should eventually switch to L3 atomics too, but a number of non-trivial changes are still required so it will continue using GTI atomics for now. Signed-off-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
4e86f725 |
|
29-May-2015 |
Francisco Jerez <currojerez@riseup.net> |
drm/i915: Extend the parser to check register writes against a mask/value pair. In some cases it might be unnecessary or dangerous to give userspace the right to write arbitrary values to some register, even though it might be desirable to give it control of some of its bits. This patch extends the register whitelist entries to contain a mask/value pair in addition to the register offset. For registers with non-zero mask, any LRM writes and LRI writes where the bits of the immediate given by the mask don't match the specified value will be rejected. This will be used in my next patch to grant userspace partial write access to some sensitive registers. Signed-off-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
6a65c5b9 |
|
29-May-2015 |
Francisco Jerez <currojerez@riseup.net> |
drm/i915: Fix command parser to validate multiple register access with the same command. Until now the software command checker assumed that commands could read or write at most a single register per packet. This is not necessarily the case, MI_LOAD_REGISTER_IMM expects a variable-length list of offset/value pairs and writes them in sequence. The previous code would only check whether the first entry was valid, effectively allowing userspace to write unrestricted registers of the MMIO space by sending a multi-register write with a legal first register, with potential security implications on Gen6 and 7 hardware. Fix it by extending the drm_i915_cmd_descriptor table to represent multi-register access and making validate_cmd() iterate for all register offsets present in the command packet. Signed-off-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
de4e783a |
|
07-Apr-2015 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Tidy batch pool logic Move the madvise logic out of the execbuffer main path into the relatively rare allocation path, making the execbuffer manipulation less fragile. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
72c5ba95 |
|
13-Mar-2015 |
Mika Kuoppala <mika.kuoppala@linux.intel.com> |
drm/i915: Fix vmap_batch page iterator overrun vmap_batch() calculates amount of needed pages for the mapping we are going to create. And it uses this page count as an argument for the for_each_sg_pages() macro. The macro takes the number of sg list entities as an argument, not the page count. So we ended up iterating through all the pages on the mapped object, corrupting memory past the smaller pages[] array. Fix this by bailing out when we have enough pages. This regression has been introduced in commit 17cabf571e50677d980e9ab2a43c5f11213003ae Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Jan 14 11:20:57 2015 +0000 drm/i915: Trim the command parser allocations Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
17cabf57 |
|
14-Jan-2015 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Trim the command parser allocations Currently, the command parser tries to create a secondary batch exactly as large as the original, and vmap both. This is open to abuse by userspace using extremely large batch objects, but only executing very short batches. For example, this would be if userspace were to implement a command submission ringbuffer. However, we only need to allocate pages for just the contents of the command sequence in the batch - all relocations copied to the secondary batch will reference the original batch and so there can be no access to the secondary batch outside of the explicit execution region. Testcase: igt/gem_exec_big #ivb,byt,hsw Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88308 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
c61200c2 |
|
11-Dec-2014 |
Jordan Justen <jordan.l.justen@intel.com> |
drm/i915: Add GPGPU_THREADS_DISPATCHED to the register whitelist This will allow us to read the number of dispatched compute threads for GL_ARB_pipeline_statistics_query. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
71745376 |
|
11-Dec-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Tidy up execbuffer command parsing code Move it to a separate function since the main do_execbuffer function already has so much going on. v2: - Move pin/unpin calls inside i915_parse_cmds() (Chris W, v4 7/7 feedback) Issue: VIZ-4719 Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
b9ffd80e |
|
11-Dec-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Use batch length instead of object size in command parser Previously we couldn't trust the user-supplied batch length because it came directly from userspace (i.e. untrusted code). It would have affected what commands software parsed without regard to what hardware would actually execute, leaving a potential hole. With the parser now copying the user supplied batch buffer and writing MI_NOP commands to any space after the copied region, we can safely use the batch length input. This should be a performance win as the actual batch length is frequently much smaller than the allocated object size. v2: Fix handling of non-zero batch_start_offset Issue: VIZ-4719 Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
78a42377 |
|
11-Dec-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Use batch pools with the command parser This patch sets up all of the tracking and copying necessary to use batch pools with the command parser and dispatches the copied (shadow) batch to the hardware. After this patch, the parser is in 'enabling' mode. Note that performance takes a hit from the copy in some cases and will likely need some work. At a rough pass, the memcpy appears to be the bottleneck. Without having done a deeper analysis, two ideas that come to mind are: 1) Copy sections of the batch at a time, as they are reached by parsing. Might improve cache locality. 2) Copy only up to the userspace-supplied batch length and memset the rest of the buffer. Reduces the number of reads. v2: - Remove setting the capacity of the pool - One global pool instead of per-ring pools - Replace batch_obj with shadow_batch_obj and hook into eb->vmas - Memset any space in the shadow batch beyond what gets copied - Rebased on execlist prep refactoring v3: - Rebase on chained batch handling - Squash in setting the secure dispatch flag - Add a note about the interaction w/secure dispatch pinning - Check for request->batch_obj == NULL in i915_gem_free_request v4: - Fix read domains for shadow_batch_obj - Remove the set_to_gtt_domain call from i915_parse_cmds - ggtt_pin/unpin in the parser block to simplify error handling - Check USES_FULL_PPGTT before setting DISPATCH_SECURE flag - Remove i915_gem_batch_pool_put calls v5: - Move 'pending_read_domains |= I915_GEM_DOMAIN_COMMAND' after the parser (danvet, from v4 0/7 feedback) Issue: VIZ-4719 Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
86ef630d |
|
21-Nov-2014 |
Michael H. Nguyen <michael.h.nguyen@intel.com> |
drm/i915: Add MI_SET_APPID cmd to cmd parser tables Was missing. Issue: VIZ-4701 Signed-off-by: Michael H. Nguyen <michael.h.nguyen@intel.com> Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
bfc882b4 |
|
19-Nov-2014 |
Daniel Vetter <daniel.vetter@ffwll.ch> |
drm/i915: Flatten engine init control flow Now that sanity prevails and we have the clean split between software init and starting the engines we can drop all the "have we allocate this struct already?" nonsense. Execlist code could benefit quite a bit more still, but that's for another patch. Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
|
#
f1f55cc0 |
|
07-Nov-2014 |
Neil Roberts <neil@linux.intel.com> |
drm/i915: Add the predicate source registers to the register whitelist The predicate source registers are needed to implement conditional rendering without stalling. The two source registers are used to load the previous values of the PS_DEPTH_COUNT register saved from PIPE_CONTROL commands. These can then be compared and used to set the predicate enable bit via the MI_PREDICATE command. The command parser version number is increased to 2 to make it easier to detect the new functionality in user space. Signed-off-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
42c7156a |
|
16-Oct-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Abort command parsing for chained batches libva uses chained batch buffers in a way that the command parser can't generally handle. Fortunately, libva doesn't need to write registers from batch buffers in the way that mesa does, so this patch causes the driver to fall back to non-secure dispatch if the parser detects a chained batch buffer. Note: The 2nd hunk to munge the error code of the parser looks a bit superflous. At least until we have the batch copy code ready and can run the cmd parser in granting mode. But it isn't since we still need to let existing libva buffers pass (though not with elevated privs ofc!). Testcase: igt/gem_exec_parse/chained-batch Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> [danvet: Add note - this confused me in review and Brad clarified things (after a few mails ...).] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
32197aab |
|
20-Oct-2014 |
Masanari Iida <standby24x7@gmail.com> |
gpu:drm: Fix typo in Documentation/DocBook/drm.xml This patch fix spelling typos found in drm.xml. It is because the file is generated from comments in source codes, I have to fix the typos within source files. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
22cb99af |
|
22-Sep-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Don't leak command parser tables on suspend/resume Ring init and cleanup are not balanced because we re-init the rings on resume without having cleaned them up on suspend. This leads to the driver leaking the parser's hash tables with a kmemleak signature such as this: unreferenced object 0xffff880405960980 (size 32): comm "systemd-udevd", pid 516, jiffies 4294896961 (age 10202.044s) hex dump (first 32 bytes): d0 85 46 c0 ff ff ff ff 00 00 00 00 00 00 00 00 ..F............. 98 60 28 04 04 88 ff ff 00 00 00 00 00 00 00 00 .`(............. backtrace: [<ffffffff81816f9e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff811fa678>] kmem_cache_alloc_trace+0x168/0x2f0 [<ffffffffc03e20a5>] i915_cmd_parser_init_ring+0x2a5/0x3e0 [i915] [<ffffffffc04088a2>] intel_init_ring_buffer+0x202/0x470 [i915] [<ffffffffc040c998>] intel_init_vebox_ring_buffer+0x1e8/0x2b0 [i915] [<ffffffffc03eff59>] i915_gem_init_hw+0x2f9/0x3a0 [i915] [<ffffffffc03f0057>] i915_gem_init+0x57/0x1d0 [i915] [<ffffffffc045e26a>] i915_driver_load+0xc0a/0x10e0 [i915] [<ffffffffc02e0d5d>] drm_dev_register+0xad/0x100 [drm] [<ffffffffc02e3b9f>] drm_get_pci_dev+0x8f/0x200 [drm] [<ffffffffc03c934b>] i915_pci_probe+0x3b/0x60 [i915] [<ffffffff81436725>] local_pci_probe+0x45/0xa0 [<ffffffff81437a69>] pci_device_probe+0xd9/0x130 [<ffffffff81524f4d>] driver_probe_device+0x12d/0x3e0 [<ffffffff815252d3>] __driver_attach+0x93/0xa0 [<ffffffff81522e1b>] bus_for_each_dev+0x6b/0xb0 This patch extends the current convention of checking whether a resource is already allocated before allocating it during ring init. Longer term it might make sense to only init the rings once. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83794 Tested-by: Kari Suvanto <kari.tj.suvanto@gmail.com> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@vger.kernel.org Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
00caf019 |
|
18-Sep-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Log a message when rejecting LRM to OACONTROL The other paths in the command parser that reject a batch all log a message indicating the reason. We simply missed this one. Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
9beb0ccb |
|
18-Sep-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Re-enable the command parser when using PPGTT In commit commit 896ab1a5d54269b463a24194c2e4a369103b46d8 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Aug 6 15:04:51 2014 +0200 drm/i915: Fix up checks for aliasing ppgtt it looks like we accidentally inverted the check that the command parser should only run when the driver enables some form of PPGTT. Testcase: igt/gem_exec_parse Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> [danvet: Also drop the comment right above, all production vlv now have hw ppgtt enabled.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
896ab1a5 |
|
06-Aug-2014 |
Daniel Vetter <daniel.vetter@ffwll.ch> |
drm/i915: Fix up checks for aliasing ppgtt A subsequent patch will no longer initialize the aliasing ppgtt if we have full ppgtt enabled, since we simply don't need that any more. Unfortunately a few places check for the aliasing ppgtt instead of checking for ppgtt in general. Fix them up. One special case are the gtt offset and size macros, which have some code to remap the aliasing ppgtt to the global gtt. The aliasing ppgtt is _not_ a logical address space, so passing that in as the vm is plain and simple a bug. So just WARN about it and carry on - we have a gracefully fall-through anyway if we can't find the vma. Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
c9224faa |
|
17-Jun-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Add some L3 registers to the parser whitelist Beignet needs these in order to program the L3 cache config for OpenCL workloads, particularly when using SLM. Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
a4872ba6 |
|
22-May-2014 |
Oscar Mateo <oscar.mateo@intel.com> |
drm/i915: s/intel_ring_buffer/intel_engine_cs In the upcoming patches we plan to break the correlation between engine command streamers (a.k.a. rings) and ringbuffers, so it makes sense to refactor the code and make the change obvious. No functional changes. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
44e895a8 |
|
10-May-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Use hash tables for the command parser For clients that submit large batch buffers the command parser has a substantial impact on performance. On my HSW ULT system performance drops as much as ~20% on some tests. Most of the time is spent in the command lookup code. Converting that from the current naive search to a hash table lookup reduces the performance drop to ~10%. The choice of value for I915_CMD_HASH_ORDER allows all commands currently used in the parser tables to hash to their own bucket (except for one collision on the render ring). The tradeoff is that it wastes memory. Because the opcodes for the commands in the tables are not particularly well distributed, reducing the order still leaves many buckets empty. The increased collisions don't seem to have a huge impact on the performance gain, but for now anyhow, the parser trades memory for performance. NB: Ville noticed that the error paths through the ring init code will leak memory. I've not addressed that here. We can do a follow up pass to handle all of the leaks. v2: improved comment describing selection of hash key mask (Damien) replace a BUG_ON() with an error return (Tvrtko, Ville) commit message improvements Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
4b6eab59 |
|
28-Apr-2014 |
Jan Moskyto Matejka <mq@suse.cz> |
Revert "drm/i915: fix build warning on 32-bit (v2)" This reverts commit 60f2b4af1258c05e6b037af866be81abc24438f7. The same warning has been fixed in e5081a538a565284fec5f30a937d98e460d5e780 and these two commits got merged in 74e99a84de2d0980320612db8015ba606af42114 which caused another warning. Simply, the reverted commit casted the pointer difference to unsigned long and the other commit changed the output type from long to ptrdiff_t. The other commit fixes the original warning the better way so I'm reverting this commit now. Signed-off-by: Jan Moskyto Matejka <mq@suse.cz> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
122b2505 |
|
25-Apr-2014 |
Daniel Vetter <daniel.vetter@ffwll.ch> |
drm/i915: Integrate cmd parser kerneldoc Ville noticed that we have this nice kerneldoc but it's not integrated anywhere. Fix this asap! Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
113a0476 |
|
08-Apr-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Add more registers to the whitelist for mesa These are additional registers needed for performance monitoring and ARB_draw_indirect extensions in mesa. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76719 Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [danvet: Squash in fixup from Brad requested by Ken.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
86a25121 |
|
02-Apr-2014 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: fix command parser debug print format mismatches Drop the cast from the pointer diff to fix: drivers/gpu/drm/i915/i915_cmd_parser.c:405:4: warning: format '%td' expects argument of type 'ptrdiff_t', but argument 5 has type 'long unsigned int' [-Wformat] While at it, use %u for u32. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jani Nikula <jani.nikula@intel.com> [danvet: After conflict resolution only the "While at it, ..." part was left standing ...] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
6e66ea13 |
|
28-Mar-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Track OACONTROL register enable/disable during parsing There is some thought that the data from the performance counters enabled via OACONTROL should only be available to the process that enabled counting. To limit snooping, require that any batch buffer which sets OACONTROL to a non-zero value also sets it back to 0 before the end of the batch. This requires limiting OACONTROL writes to happen via MI_LOAD_REGISTER_IMM so that we can access the value being written. This should be in line with the expected use case for writing OACONTROL. v2: Drop an unnecessary '? true : false' Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
b651000b |
|
27-Mar-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Refactor cmd parser checks into a function This brings the code a little more in line with kernel coding style. Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
300233ee |
|
27-Mar-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: BUG_ON() when cmd/reg tables are not sorted As suggested during review, this makes it much more obvious when the tables are not sorted. Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
77fec556 |
|
31-Mar-2014 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: drop the typedef for drm_i915_private_t There are no longer users of drm_i915_private_t. Drop the typedef. Good riddance. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Chris Wilson <chris@chris-wislon.co.uk> [danvet: Add the hunk in i915_cmd_parser.c here which had to be relocated to the how this was merged.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
180b813c |
|
25-Mar-2014 |
Kenneth Graunke <kenneth@whitecape.org> |
drm/i915: Add OACONTROL to the command parser register whitelist. Mesa needs to be able to write OACONTROL in order to expose the Observability Architecture's performance counters via OpenGL. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [danvet: Add comment that this is just a temporary work-around and that we need to check more things before we can allow OACONTROL writes for real everywhere.] [danvet 2: Squash in fixup to avoid a DRM_ERROR due to unsorted reg list, spotted by Jani.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
d728c8ef |
|
18-Feb-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Add a CMD_PARSER_VERSION getparam So userspace can query the kernel for command parser support. v2: Add i915_cmd_parser_get_version(), history log, and kerneldoc OTC-Tracker: AXIA-4631 Change-Id: I58af650db9f6753c2dcac9c54ab432fd31db302f Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
114d4f70 |
|
18-Feb-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Reject commands that would store to global HWS page PIPE_CONTROL and MI_FLUSH_DW have bits that would write to the hardware status page. The driver stores request tracking info there, so don't let userspace overwrite it. v2: trailing comma fix, rebased Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
d4d48035 |
|
18-Feb-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Enable PPGTT command parser checks Various commands that access memory have a bit to determine whether the graphics address specified in the command should use the GGTT or PPGTT for translation. These checks ensure that the bit indicates PPGTT translation. Most of these checks use the existing bit-checking infrastructure. The PIPE_CONTROL and MI_FLUSH_DW commands, however, are multi-function commands. The GGTT/PPGTT bit is only relevant for certain uses of the command. As such, this change also extends the bit-checking code to include a "condition" mask and offset. If the condition mask is non-zero then the parser only performs the bit check when the bits specified by the condition mask/offset are also non-zero. NOTE: At this point in the series PPGTT must be enabled for the parser to work correctly. If it's not enabled, userspace will not be setting the PPGTT bits the way the parser requires. VLV is the only platform where this is a problem, so at this point, we disable parsing for VLV. v2: whitespace and trailing commas fixes, rebased OTC-Tracker: AXIA-4631 Change-Id: I3f4c76b6734f1956ec47e698230f97d0998ff92b Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> [danvet: Drop the unecessary cast Jani spotted.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
b18b396b |
|
18-Feb-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Reject commands that explicitly generate interrupts The driver leaves most interrupts masked during normal operation, so there would have to be additional work to enable userspace to safely request/receive an interrupt. v2: trailing commas, rebased Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
f0a346bd |
|
18-Feb-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Enable register whitelist checks MI_STORE_REGISTER_MEM, MI_LOAD_REGISTER_MEM, and MI_LOAD_REGISTER_IMM commands allow userspace access to registers. Only certain registers should be allowed for such access, so enable checking for those commands. Each ring gets its own register whitelist. MI_LOAD_REGISTER_REG on HSW also allows register access but is currently unused by userspace components. Leave it rejected. PIPE_CONTROL and MEDIA_VFE_STATE allow register access based on certain bits being set. Reject those as well. v2: trailing commas, rebased OTC-Tracker: AXIA-4631 Change-Id: Ie614a2f0eb2e5917de809e5a17957175d24cc44f Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
220375aa |
|
18-Feb-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Add register whitelist for DRM master These are used to implement scanline waits in the X server. v2: Use #defines instead of magic numbers Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
5947de9b |
|
18-Feb-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Add register whitelists for mesa These registers are currently used by mesa for blitting, transform feedback extensions, and performance monitoring extensions. v2: REG64 macro Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
17c1eb15 |
|
18-Feb-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Allow some privileged commands from master The Intel DDX uses these to implement scanline waits in the X server. Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
9c640d1d |
|
18-Feb-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Reject privileged commands The spec defines most of these commands as privileged. A few others, like the semaphore mbox command and some display commands, are also reserved for the driver's use. Subsequent patches relax some of these restrictions. v2: Rebased Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
3a6fa984 |
|
18-Feb-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Initial command parser table definitions Add command tables defining irregular length commands for each ring. This requires a few new command opcode definitions. v2: Whitespace adjustment in command definitions, sparse fix for !F OTC-Tracker: AXIA-4631 Change-Id: I064bceb457e15f46928058352afe76d918c58ef5 Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
60f2b4af |
|
27-Mar-2014 |
Dave Airlie <airlied@redhat.com> |
drm/i915: fix build warning on 32-bit (v2) /ssd/git/drm-next/drivers/gpu/drm/i915/i915_cmd_parser.c: In function ‘i915_parse_cmds’: /ssd/git/drm-next/drivers/gpu/drm/i915/i915_cmd_parser.c:405:4: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 5 has type ‘int’ [-Wformat=] DRM_DEBUG_DRIVER("CMD: Command length exceeds batch length: 0x%08X length=%d batchlen=%ld\n", ^ Signed-off-by: Dave Airlie <airlied@redhat.com>
|
#
e5081a53 |
|
18-Mar-2014 |
Damien Lespiau <damien.lespiau@intel.com> |
drm/i915: Use the correct format string modifier for ptrdiff_t When compiling on 32bits, I have the following warning: drivers/gpu/drm/i915/i915_cmd_parser.c:405:4: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 7 has type ‘int’ [-Wformat=] DRM_DEBUG_DRIVER("CMD: Command length exceeds batch length: 0x%08X length=%d batchlen=%ld\n", The ptrdiff_t type has its own modifier: 't'. Cc: Brad Volkin <bradley.d.volkin@intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
351e3db2 |
|
18-Feb-2014 |
Brad Volkin <bradley.d.volkin@intel.com> |
drm/i915: Implement command buffer parsing logic The command parser scans batch buffers submitted via execbuffer ioctls before the driver submits them to hardware. At a high level, it looks for several things: 1) Commands which are explicitly defined as privileged or which should only be used by the kernel driver. The parser generally rejects such commands, with the provision that it may allow some from the drm master process. 2) Commands which access registers. To support correct/enhanced userspace functionality, particularly certain OpenGL extensions, the parser provides a whitelist of registers which userspace may safely access (for both normal and drm master processes). 3) Commands which access privileged memory (i.e. GGTT, HWS page, etc). The parser always rejects such commands. See the overview comment in the source for more details. This patch only implements the logic. Subsequent patches will build the tables that drive the parser. v2: Don't set the secure bit if the parser succeeds Fail harder during init Makefile cleanup Kerneldoc cleanup Clarify module param description Convert ints to bools in a few places Move client/subclient defs to i915_reg.h Remove the bits_count field OTC-Tracker: AXIA-4631 Change-Id: I50b98c71c6655893291c78a2d1b8954577b37a30 Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> [danvet: Appease checkpatch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|