History log of /linux-master/drivers/gpu/drm/i915/i915_cmd_parser.c
Revision Date Author Comments
# e4865c60 03-Dec-2023 Zhao Liu <zhao1.liu@intel.com>

drm/i915: Use kmap_local_page() in i915_cmd_parser.c

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().

The main difference between atomic and local mappings is that local
mappings doesn't disable page faults or preemption (the preemption is
disabled for !PREEMPT_RT case, otherwise it only disables migration).

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessary page faults and preemption disables.

There're 2 reasons why function copy_batch() doesn't need to disable
pagefaults and preemption for mapping:

1. The flush operation is safe. In i915_cmd_parser.c, copy_batch() calls
drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush.
Since CLFLUSHOPT is global on x86 and WBINVD is called on each cpu
in drm_clflush_virt_range(), the flush operation is global.

2. Any context switch caused by preemption or page faults (page fault
may cause sleep) doesn't affect the validity of local mapping.

Therefore, copy_batch() is a function where the use of
kmap_local_page() in place of kmap_atomic() is correctly suited.

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@intel.com

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231203132947.2328805-9-zhao1.liu@linux.intel.com


# 8e4ee5e8 30-Nov-2022 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Wrap all access to i915_vma.node.start|size

We already wrap i915_vma.node.start for use with the GGTT, as there we
can perform additional sanity checks that the node belongs to the GGTT
and fits within the 32b registers. In the next couple of patches, we
will introduce guard pages around the objects _inside_ the drm_mm_node
allocation. That is we will offset the vma->pages so that the first page
is at drm_mm_node.start + vma->guard (not 0 as is currently the case).
All users must then not use i915_vma.node.start directly, but compute
the guard offset, thus all users are converted to use a
i915_vma_offset() wrapper.

The notable exceptions are the selftests that are testing exact
behaviour of i915_vma_pin/i915_vma_insert.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221130235805.221010-3-andi.shyti@linux.intel.com


# e9b67ec2 03-Mar-2022 Jani Nikula <jani.nikula@intel.com>

drm/i915: include linux/highmem.h and linux/swap.h where needed

Include linux/highmem.h and linux/swap.h explicitly where needed so we
can drop the linux/i2c.h include from i915_drv.h where it pulled in the
dependencies implicitly.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220303181931.1661767-5-jani.nikula@intel.com


# 5f2ec909 10-Feb-2022 Jani Nikula <jani.nikula@intel.com>

drm/i915: don't include drm_cache.h in i915_drv.h

Include it only in files that use it.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/14edab4a193ea3f73f387a88e3836c8555401871.1644507885.git.jani.nikula@intel.com


# ce2fce25 27-Jan-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915: Only include i915_reg.h from .c files

Several of our i915 header files, have been including i915_reg.h. This
means that any change to i915_reg.h will trigger a full rebuild of
pretty much every file of the driver, even those that don't have any
kind of register access. Let's delete the i915_reg.h include from all
headers and add an explicit include from the .c files that truly
need the register definitions; those that need a definition of
i915_reg_t for a function definition can get it from i915_reg_defs.h
instead.

We also remove two non-register #define's (VLV_DISPLAY_BASE and
GEN12_SFC_DONE_MAX) into i915_reg_defs.h to allow us to drop the
i915_reg.h include from a couple of headers.

There's probably a lot more header dependency optimization possible, but
the changes here roughly cut the number of files compiled after 'touch
i915_reg.h' in half --- a good first step.

Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-7-matthew.d.roper@intel.com


# 0d6419e9 27-Jan-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915: Move GT registers to their own header file

This is a huge, chaotic mass of registers copied over as-is without any
real cleanup. We'll come back and organize these better, align on
consistent coding style, remove dead code, etc. in separate patches
later that will be easier to review.

v2:
- Add missing include in intel_pxp_irq.c
v3:
- Correct a few indentation errors (Lucas)
- Minor conflict resolution

Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-6-matthew.d.roper@intel.com


# e71a7412 27-Jan-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915: Parameterize MI_PREDICATE registers

The various MI_PREDICATE registers have per-engine instances. Today we
only utilize the RCS0 instance of each, but that will likely change in
the future; switch to parameterized register definitions to make these
easier to work with going forward.

Of special note is MI_PREDICATE_RESULT_2; we only use it in one place in
the driver today in HSW-specific code. It turns out that the bspec
(page 94) lists two different offsets for this register on HSW; one is
in the standard location shared by all other platforms (base + 0x3bc)
and the other is an unusual location (0x2214). We're using the second,
non-standard offset in i915 today; that offset doesn't exist on any
other platforms (and it's not even 100% clear that it's correct for HSW)
so I've renamed the current non-standard definition to
HSW_MI_PREDICATE_RESULT_2; the new cross-platform parameterized macro
(which is still unused at the moment) uses the standard offset.

Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-5-matthew.d.roper@intel.com


# 202b1f4c 10-Jan-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/gt: Move engine registers to their own header

Let's continue breaking up and cleaning up the massive i915_reg.h file
by moving all registers that are defined in relation to an engine base
to their own header.

There are probably a bunch of other "engine registers" that we haven't
moved yet (especially those that belong to the render engine in the
0x2??? range), but this is a relatively straightforward first step.

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220111051600.3429104-8-matthew.d.roper@intel.com


# e9f9bcd5 10-Jan-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915: Use parameterized GPR register definitions everywhere

Since we have an engine-parameterized macro GEN8_RING_CS_GPR, let's use
that in place of the HSW_CS_GPR and BCS_GPR register definitions.

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220111051600.3429104-2-matthew.d.roper@intel.com


# 23d639d7 07-Jan-2022 Jani Nikula <jani.nikula@intel.com>

drm/i915: split out i915_cmd_parser.h from i915_drv.h

We already have the i915_cmd_parser.c file.

Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1a02b8788266f4f2fd4de12808b55c4a66179e98.1641561552.git.jani.nikula@intel.com


# 15eb083b 20-Jul-2021 Jason Ekstrand <jason@jlekstrand.net>

drm/i915: Correct the docs for intel_engine_cmd_parser

In 93b713304188 ("drm/i915: Revert "drm/i915/gem: Asynchronous
cmdparser""), the parameters to intel_engine_cmd_parser() were altered
without updating the docs, causing Fi.CI.DOCS to start failing.

Fixes: 93b713304188 ("drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser"")
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210720182108.2761496-1-jason@jlekstrand.net
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Added 'Fixes:' tag and corrected the hash for the ancestor]


# 0c6609bb 14-Jul-2021 Jason Ekstrand <jason@jlekstrand.net>

Revert "drm/i915: Skip over MI_NOOP when parsing"

This reverts a6c5e2aea704 ("drm/i915: Skip over MI_NOOP when parsing").
It complicates the batch parsing code a bit and increases indentation
for no reason other than fast-skipping a command that userspace uses
only rarely. Sure, there may be IGT tests that fill batches with NOOPs
but that's not a case we should optimize for in the kernel. We should
optimize for code clarity instead.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-6-jason@jlekstrand.net


# 93b71330 14-Jul-2021 Jason Ekstrand <jason@jlekstrand.net>

drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser"

This reverts 686c7c35abc2 ("drm/i915/gem: Asynchronous cmdparser"). The
justification for this commit in the git history was a vague comment
about getting it out from under the struct_mutex. While this may
improve perf for some workloads on Gen7 platforms where we rely on the
command parser for features such as indirect rendering, no numbers were
provided to prove such an improvement. It claims to closed two
gitlab/bugzilla issues but with no explanation whatsoever as to why or
what bug it's fixing.

Meanwhile, by moving command parsing off to an async callback, it leaves
us with a problem of what to do on error. When things were synchronous,
EXECBUFFER2 would fail with an error code if parsing failed. When
moving it to async, we needed another way to handle that error and the
solution employed was to set an error on the dma_fence and then trust
that said error gets propagated to the client eventually. Moving back
to synchronous will help us untangle the fence error propagation mess.

This also reverts most of 0edbb9ba1bfe ("drm/i915: Move cmd parser
pinning to execbuffer") which is a refactor of some of our allocation
paths for asynchronous parsing. Now that everything is synchronous, we
don't need it.

v2 (Daniel Vetter):
- Add stabel Cc and Fixes tag

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: <stable@vger.kernel.org> # v5.6+
Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-2-jason@jlekstrand.net


# 6e0b6528 20-Jul-2021 Jason Ekstrand <jason@jlekstrand.net>

drm/i915: Correct the docs for intel_engine_cmd_parser

In 93b713304188 ("drm/i915: Revert "drm/i915/gem: Asynchronous
cmdparser""), the parameters to intel_engine_cmd_parser() were altered
without updating the docs, causing Fi.CI.DOCS to start failing.

Fixes: c9d9fdbc108a ("drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser"")
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210720182108.2761496-1-jason@jlekstrand.net
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Added 'Fixes:' tag and corrected the hash for the ancestor]
(cherry picked from commit 15eb083bdb561bb4862cd04cd0523e55483e877e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Updated Fixes tag to match fixes branch]


# c9d9fdbc 14-Jul-2021 Jason Ekstrand <jason@jlekstrand.net>

drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser"

This reverts 686c7c35abc2 ("drm/i915/gem: Asynchronous cmdparser"). The
justification for this commit in the git history was a vague comment
about getting it out from under the struct_mutex. While this may
improve perf for some workloads on Gen7 platforms where we rely on the
command parser for features such as indirect rendering, no numbers were
provided to prove such an improvement. It claims to closed two
gitlab/bugzilla issues but with no explanation whatsoever as to why or
what bug it's fixing.

Meanwhile, by moving command parsing off to an async callback, it leaves
us with a problem of what to do on error. When things were synchronous,
EXECBUFFER2 would fail with an error code if parsing failed. When
moving it to async, we needed another way to handle that error and the
solution employed was to set an error on the dma_fence and then trust
that said error gets propagated to the client eventually. Moving back
to synchronous will help us untangle the fence error propagation mess.

This also reverts most of 0edbb9ba1bfe ("drm/i915: Move cmd parser
pinning to execbuffer") which is a refactor of some of our allocation
paths for asynchronous parsing. Now that everything is synchronous, we
don't need it.

v2 (Daniel Vetter):
- Add stabel Cc and Fixes tag

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: <stable@vger.kernel.org> # v5.6+
Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-2-jason@jlekstrand.net
(cherry picked from commit 93b713304188844b8514074dc13ffd56d12235d3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 651e7d48 05-Jun-2021 Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915: replace IS_GEN and friends with GRAPHICS_VER

This was done by the following semantic patch:

@@ expression i915; @@
- INTEL_GEN(i915)
+ GRAPHICS_VER(i915)

@@ expression i915; expression E; @@
- INTEL_GEN(i915) >= E
+ GRAPHICS_VER(i915) >= E

@@ expression dev_priv; expression E; @@
- !IS_GEN(dev_priv, E)
+ GRAPHICS_VER(dev_priv) != E

@@ expression dev_priv; expression E; @@
- IS_GEN(dev_priv, E)
+ GRAPHICS_VER(dev_priv) == E

@@
expression dev_priv;
expression from, until;
@@
- IS_GEN_RANGE(dev_priv, from, until)
+ IS_GRAPHICS_VER(dev_priv, from, until)

@def@
expression E;
identifier id =~ "^gen$";
@@
- id = GRAPHICS_VER(E)
+ ver = GRAPHICS_VER(E)

@@
identifier def.id;
@@
- id
+ ver

It also takes care of renaming the variable we assign to GRAPHICS_VER()
so to use "ver" rather than "gen".

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210606045050.103862-2-lucas.demarchi@intel.com


# f1f7f553 21-Apr-2021 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Fix docbook descriptions for i915_cmd_parser

Fixes the following htmldocs warnings:
drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Excess function parameter 'trampoline' description in 'intel_engine_cmd_parser'
drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Function parameter or member 'jump_whitelist' not described in 'intel_engine_cmd_parser'
drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Function parameter or member 'shadow_map' not described in 'intel_engine_cmd_parser'
drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Function parameter or member 'batch_map' not described in 'intel_engine_cmd_parser'
drivers/gpu/drm/i915/i915_cmd_parser.c:1420: warning: Excess function parameter 'trampoline' description in 'intel_engine_cmd_parser'

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210421120353.544518-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 0edbb9ba 23-Mar-2021 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Move cmd parser pinning to execbuffer

We need to get rid of allocations in the cmd parser, because it needs
to be called from a signaling context, first move all pinning to
execbuf, where we already hold all locks.

Allocate jump_whitelist in the execbuffer, and add annotations around
intel_engine_cmd_parser(), to ensure we only call the command parser
without allocating any memory, or taking any locks we're not supposed to.

Because i915_gem_object_get_page() may also allocate memory, add a
path to i915_gem_object_get_sg() that prevents memory allocations,
and walk the sg list manually. It should be similarly fast.

This has the added benefit of being able to catch all memory allocation
errors before the point of no return, and return -ENOMEM safely to the
execbuf submitter.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-4-maarten.lankhorst@linux.intel.com


# a829f033 02-Mar-2021 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Wedge the GPU if command parser setup fails

Commit 311a50e76a33 ("drm/i915: Add support for mandatory cmdparsing")
introduced mandatory command parsing but setup failures were not
translated into wedging the GPU which was probably the intent.

Possible errors come in two categories. Either the sanity check on
internal tables has failed, which should be caught in CI unless an
affected platform would be missed in testing; or memory allocation failure
happened during driver load, which should be extremely unlikely but for
correctness should still be handled.

v2:
* Tidy coding style. (Chris)

[airlied: cherry-picked to avoid rc1 base]
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 311a50e76a33 ("drm/i915: Add support for mandatory cmdparsing")
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210302114213.1102223-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 5a1a659762d35a6dc51047c9127c011303c77b7f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 8f47c8c3 19-Jan-2021 Matthew Auld <matthew.auld@intel.com>

drm/i915/pool: constrain pool objects by mapping type

In a few places we always end up mapping the pool object with the FORCE
constraint(to prevent hitting -EBUSY) which will destroy the cached
mapping if it has a different type. As a simple first step, make the
mapping type part of the pool interface, where the behaviour is to only
give out pool objects which match the requested mapping type.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119133106.66294-4-matthew.auld@intel.com


# eeb52ee6 24-Dec-2020 Matthew Auld <matthew.auld@intel.com>

drm/i915: clear the shadow batch

The shadow batch is an internal object, which doesn't have any page
clearing, and since the batch_len can be smaller than the object, we
should take care to clear it.

Testcase: igt/gen9_exec_parse/shadow-peek
Fixes: 4f7af1948abc ("drm/i915: Support ro ppgtt mapped cmdparser shadow buffers")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224151358.401345-1-matthew.auld@intel.com
Cc: stable@vger.kernel.org


# 45233ab2 16-Dec-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Move gen8 CS emitters into gen8_engine_cs.h

Reduce the pollution of intel_engine.h by moving gen8_emit_pipe_control
and friends to gen8_engine_cs.h

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201216135452.6063-1-chris@chris-wilson.co.uk


# 75353bcd 24-Dec-2020 Matthew Auld <matthew.auld@intel.com>

drm/i915: clear the shadow batch

The shadow batch is an internal object, which doesn't have any page
clearing, and since the batch_len can be smaller than the object, we
should take care to clear it.

Testcase: igt/gen9_exec_parse/shadow-peek
Fixes: 4f7af1948abc ("drm/i915: Support ro ppgtt mapped cmdparser shadow buffers")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224151358.401345-1-matthew.auld@intel.com
Cc: stable@vger.kernel.org
(cherry picked from commit eeb52ee6c4a429ec301faf1dc48988744960786e)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# a6c5e2ae 01-Oct-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Skip over MI_NOOP when parsing

Though less likely in practice, igt uses MI_NOOP frequently to pad out
its batch buffers. The lookup and valiation of so many MI_NOOP command
descriptions is noticeable, though the side-effect of poisoning the
last-validated-command cache is more likely to impact upon real CS.

Testcase: igt/gen9_exec_parse/bb-large
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201001102632.18789-1-chris@chris-wilson.co.uk


# c60b93cd 28-Sep-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Avoid mixing integer types during batch copies

Be consistent and use unsigned long throughout the chunk copies to
avoid the inherent clumsiness of mixing integer types of different
widths and signs. Failing to take acount of a wider unsigned type when
using min_t can lead to treating it as a negative, only for it flip back
to a large unsigned value after passing a boundary check.

Fixes: ed13033f0287 ("drm/i915/cmdparser: Only cache the dst vmap")
Testcase: igt/gen9_exec_parse/bb-large
Reported-by: "Candelaria, Jared" <jared.candelaria@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: "Candelaria, Jared" <jared.candelaria@intel.com>
Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com>
Cc: <stable@vger.kernel.org> # v4.9+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@chris-wilson.co.uk
(cherry picked from commit b7eeb2b4132ccf1a7d38f434cde7043913d1ed3c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# b7eeb2b4 28-Sep-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Avoid mixing integer types during batch copies

Be consistent and use unsigned long throughout the chunk copies to
avoid the inherent clumsiness of mixing integer types of different
widths and signs. Failing to take acount of a wider unsigned type when
using min_t can lead to treating it as a negative, only for it flip back
to a large unsigned value after passing a boundary check.

Fixes: ed13033f0287 ("drm/i915/cmdparser: Only cache the dst vmap")
Testcase: igt/gen9_exec_parse/bb-large
Reported-by: "Candelaria, Jared" <jared.candelaria@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: "Candelaria, Jared" <jared.candelaria@intel.com>
Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com>
Cc: <stable@vger.kernel.org> # v4.9+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@chris-wilson.co.uk


# e5f10d63 17-Aug-2020 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915: Fix cmd parser desc matching with masks

Our variety of defined gpu commands have the actual
command id field and possibly length and flags applied.

We did start to apply the mask during initialization of
the cmd descriptors but forgot to also apply it on comparisons.

Fix comparisons in order to properly deny access with
associated commands.

v2: fix lri with correct mask (Chris)

References: 926abff21a8f ("drm/i915/cmdparser: Ignore Length operands during command matching")
Reported-by: Nicolai Stange <nstange@suse.de>
Cc: stable@vger.kernel.org # v5.4+
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200817195926.12671-1-mika.kuoppala@linux.intel.com
(cherry picked from commit 3b4efa148da36f158cce3f662e831af2834b8e0f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 3b4efa14 17-Aug-2020 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915: Fix cmd parser desc matching with masks

Our variety of defined gpu commands have the actual
command id field and possibly length and flags applied.

We did start to apply the mask during initialization of
the cmd descriptors but forgot to also apply it on comparisons.

Fix comparisons in order to properly deny access with
associated commands.

v2: fix lri with correct mask (Chris)

References: 926abff21a8f ("drm/i915/cmdparser: Ignore Length operands during command matching")
Reported-by: Nicolai Stange <nstange@suse.de>
Cc: stable@vger.kernel.org # v5.4+
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200817195926.12671-1-mika.kuoppala@linux.intel.com


# 273500ae 01-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Whitelist context-local timestamp in the gen9 cmdparser

Allow batch buffers to read their own _local_ cumulative HW runtime of
their logical context.

Fixes: 0f2f39758341 ("drm/i915: Add gen9 BCS cmdparsing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601161942.30854-1-chris@chris-wilson.co.uk
(cherry picked from commit f9496520df11de00fbafc3cbd693b9570d600ab3)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# f9496520 01-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Whitelist context-local timestamp in the gen9 cmdparser

Allow batch buffers to read their own _local_ cumulative HW runtime of
their logical context.

Fixes: 0f2f39758341 ("drm/i915: Add gen9 BCS cmdparsing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601161942.30854-1-chris@chris-wilson.co.uk


# 0c4336b9 30-Jan-2020 Wambui Karuga <wambui.karugax@gmail.com>

drm/i915/cmd_parser: conversion to struct drm_device logging macros.

Manually convert printk based drm logging macros to the struct
drm_device based logging macros in i915/i915_cmd_parser.c.
This also involves extracting the drm_i915_private device from various
intel types for use in the macros.

Instances of the DRM_DEBUG macro are not converted due to the lack of a
similar struct drm_device based logging macro.

References: https://lists.freedesktop.org/archives/dri-devel/2020-January/253381.html
Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200131093416.28431-4-wambui.karugax@gmail.com


# f8b74877 13-Dec-2019 Maya Rashish <coypu@sdf.org>

Correct function name in comment

Signed-off-by: Maya Rashish <coypu@sdf.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191213102630.GA24082@SDF.ORG


# 686c7c35 11-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Asynchronous cmdparser

Execute the cmdparser asynchronously as part of the submission pipeline.
Using our dma-fences, we can schedule execution after an asynchronous
piece of work, so we move the cmdparser out from under the struct_mutex
inside execbuf as run it as part of the submission pipeline. The same
security rules apply, we copy the user batch before validation and
userspace cannot touch the validation shadow. The only caveat is that we
will do request construction before we complete cmdparsing and so we
cannot know the outcome of the validation step until later -- so the
execbuf ioctl does not report -EINVAL directly, but we must cancel
execution of the request and flag the error on the out-fence.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/611
Closes: https://gitlab.freedesktop.org/drm/intel/issues/412
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191211230858.599030-2-chris@chris-wilson.co.uk


# 32d94048 11-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Prepare gen7 cmdparser for async execution

The gen7 cmdparser is primarily a promotion-based system to allow access
to additional registers beyond the HW validation, and allows fallback to
normal execution of the user batch buffer if valid and requires
chaining. In the next patch, we will do the cmdparser validation in the
pipeline asynchronously and so at the point of request construction we
will not know if we want to execute the privileged and validated batch,
or the original user batch. The solution employed here is to execute
both batches, one with raised privileges and one as normal. This is
because the gen7 MI_BATCH_BUFFER_START command cannot change privilege
level within a batch and must strictly use the current privilege level
(or undefined behaviour kills the GPU). So in order to execute the
original batch, we need a second non-priviledged batch buffer chain from
the ring, i.e. we need to emit two batches for each user batch. Inside
the two batches we determine which one should actually execute, we
provide a conditional trampoline to call the original batch.

Implementation-wise, we create a single buffer and write the shadow and
the trampoline inside it at different offsets; and bind the buffer into
both the kernel GGTT for the privileged execution of the shadow and into
the user ppGTT for the non-privileged execution of the trampoline and
original batch. One buffer, two batches and two vma.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191211230858.599030-1-chris@chris-wilson.co.uk


# 6aacb5a3 11-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Align start for memcpy_from_wc

The movntqda requires 16-byte alignment for the source pointer. Avoid
falling back to clflush if the source pointer is misaligned by doing the
doing a small uncached memcpy to fixup the alignments.

v2: Turn the unaligned copy into a genuine helper

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191211110437.4082687-5-chris@chris-wilson.co.uk


# 37d1151c 11-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Simplify error escape from cmdparser

We need to flush the destination buffer, even on error, to maintain
consistent cache state. Thereby removing the jump on error past the
clear, and reducing the loop-escape mechanism to a mere break.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191211110437.4082687-3-chris@chris-wilson.co.uk


# 755bf8a8 11-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove redundant parameters from intel_engine_cmd_parser

Declutter the calling interface by reducing the parameters to the
i915_vma and associated offsets.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191211110437.4082687-2-chris@chris-wilson.co.uk


# 8f1ada25 11-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Fix cmdparser drm.debug

The cmdparser rejection debug is not for driver development, but for the
user, for which we use a plain DRM_DEBUG().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191211110437.4082687-1-chris@chris-wilson.co.uk


# 05975cd9 04-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove vestigal i915_gem_context locals from cmdparser

The use GEM context itself was removed in commit cd30a5031704
("drm/i915/gem: Excise the per-batch whitelist from the context"), but
the locals were left in place as an oversight. Remove the parameters and
clean up.

References: cd30a5031704 ("drm/i915/gem: Excise the per-batch whitelist from the context")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191204232616.94397-1-chris@chris-wilson.co.uk


# cd30a503 28-Nov-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gem: Excise the per-batch whitelist from the context

One does not lightly add a new hidden struct_mutex dependency deep within
the execbuf bowels! The immediate suspicion in seeing the whitelist
cached on the context, is that it is intended to be preserved between
batches, as the kernel is quite adept at caching small allocations
itself. But no, it's sole purpose is to serialise command submission in
order to save a kmalloc on a slow, slow path!

By removing the whitelist dependency from the context, our freedom to
chop the big struct_mutex is greatly augmented.

v2: s/set_bit/__set_bit/ as the whitelist shall never be accessed
concurrently.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191128113424.3885958-1-chris@chris-wilson.co.uk


# ea0b163b 11-Nov-2019 Ben Hutchings <ben@decadent.org.uk>

drm/i915/cmdparser: Fix jump whitelist clearing

When a jump_whitelist bitmap is reused, it needs to be cleared.
Currently this is done with memset() and the size calculation assumes
bitmaps are made of 32-bit words, not longs. So on 64-bit
architectures, only the first half of the bitmap is cleared.

If some whitelist bits are carried over between successive batches
submitted on the same context, this will presumably allow embedding
the rogue instructions that we're trying to reject.

Use bitmap_zero() instead, which gets the calculation right.

Fixes: f8c08d8faee5 ("drm/i915/cmdparser: Add support for backward jumps")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>


# 926abff2 20-Sep-2018 Jon Bloomfield <jon.bloomfield@intel.com>

drm/i915/cmdparser: Ignore Length operands during command matching

Some of the gen instruction macros (e.g. MI_DISPLAY_FLIP) have the
length directly encoded in them. Since these are used directly in
the tables, the Length becomes part of the comparison used for
matching during parsing. Thus, if the cmd being parsed has a
different length to that in the table, it is not matched and the
cmd is accepted via the default variable length path.

Fix by masking out everything except the Opcode in the cmd tables

Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>


# f8c08d8f 20-Sep-2018 Jon Bloomfield <jon.bloomfield@intel.com>

drm/i915/cmdparser: Add support for backward jumps

To keep things manageable, the pre-gen9 cmdparser does not
attempt to track any form of nested BB_START's. This did not
prevent usermode from using nested starts, or even chained
batches because the cmdparser is not strictly enforced pre gen9.

Instead, the existence of a nested BB_START would cause the batch
to be emitted in insecure mode, and any privileged capabilities
would not be available.

For Gen9, the cmdparser becomes mandatory (for BCS at least), and
so not providing any form of nested BB_START support becomes
overly restrictive. Any such batch will simply not run.

We make heavy use of backward jumps in igt, and it is much easier
to add support for this restricted subset of nested jumps, than to
rewrite the whole of our test suite to avoid them.

Add the required logic to support limited backward jumps, to
instructions that have already been validated by the parser.

Note that it's not sufficient to simply approve any BB_START
that jumps backwards in the buffer because this would allow an
attacker to embed a rogue instruction sequence within the
operand words of a harmless instruction (say LRI) and jump to
that.

We introduce a bit array to track every instr offset successfully
validated, and test the target of BB_START against this. If the
target offset hits, it is re-written to the same offset in the
shadow buffer and the BB_START cmd is allowed.

Note: This patch deliberately ignores checkpatch issues in the
cmdtables, in order to match the style of the surrounding code.
We'll correct the entire file in one go in a later patch.

v2: set dispatch secure late (Mika)
v3: rebase (Mika)
v4: Clear whitelist on each parse
Minor review updates (Chris)
v5: Correct backward jump batching
v6: fix compilation error due to struct eb shuffle (Mika)

Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>


# 0546a29c 27-Sep-2018 Jon Bloomfield <jon.bloomfield@intel.com>

drm/i915/cmdparser: Use explicit goto for error paths

In the next patch we will be adding a second valid
termination condition which will require a small
amount of refactoring to share logic with the BB_END
case.

Refactor all error conditions to jump to a dedicated
exit path, with 'break' reserved only for a successful
parse.

Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>


# 0f2f3975 23-Apr-2018 Jon Bloomfield <jon.bloomfield@intel.com>

drm/i915: Add gen9 BCS cmdparsing

For gen9 we enable cmdparsing on the BCS ring, specifically
to catch inadvertent accesses to sensitive registers

Unlike gen7/hsw, we use the parser only to block certain
registers. We can rely on h/w to block restricted commands,
so the command tables only provide enough info to allow the
parser to delineate each command, and identify commands that
access registers.

Note: This patch deliberately ignores checkpatch issues in
favour of matching the style of the surrounding code. We'll
correct the entire file in one go in a later patch.

v3: rebase (Mika)
v4: Add RING_TIMESTAMP registers to whitelist (Jon)

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>


# 311a50e7 01-Aug-2018 Jon Bloomfield <jon.bloomfield@intel.com>

drm/i915: Add support for mandatory cmdparsing

The existing cmdparser for gen7 can be bypassed by specifying
batch_len=0 in the execbuf call. This is safe because bypassing
simply reduces the cmd-set available.

In a later patch we will introduce cmdparsing for gen9, as a
security measure, which must be strictly enforced since without
it we are vulnerable to DoS attacks.

Introduce the concept of 'required' cmd parsing that cannot be
bypassed by submitting zero-length bb's.

v2: rebase (Mika)
v2: rebase (Mika)
v3: fix conflict on engine flags (Mika)

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>


# 66d8aba1 08-Jun-2018 Jon Bloomfield <jon.bloomfield@intel.com>

drm/i915: Remove Master tables from cmdparser

The previous patch has killed support for secure batches
on gen6+, and hence the cmdparsers master tables are
now dead code. Remove them.

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>


# 0a2f661b 20-Apr-2018 Jon Bloomfield <jon.bloomfield@intel.com>

drm/i915: Rename gen7 cmdparser tables

We're about to introduce some new tables for later gens, and the
current naming for the gen7 tables will no longer make sense.

v2: rebase

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>


# 9c9082b9 08-Aug-2019 Jani Nikula <jani.nikula@intel.com>

drm/i915: extract i915_memcpy.h from i915_drv.h

It used to be handy that we only had a couple of headers, but over time
i915_drv.h has become unwieldy. Extract declarations to a separate
header file corresponding to the implementation module, clarifying the
modularity of the driver.

Ensure the new header is self-contained, and do so with minimal further
includes, using forward declarations as needed. Include the new header
only where needed, and sort the modified include directives while at it
and as needed.

No functional changes.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/f2b887002150acdf218385ea846f7aa617aa5f15.1565271681.git.jani.nikula@intel.com


# 750e76b4 06-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Move the [class][inst] lookup for engines onto the GT

To maintain a fast lookup from a GT centric irq handler, we want the
engine lookup tables on the intel_gt. To avoid having multiple copies of
the same multi-dimension lookup table, move the generic user engine
lookup into an rbtree (for fast and flexible indexing).

v2: Split uabi_instance cf uabi_class
v3: Set uabi_class/uabi_instance after collating all engines to provide a
stable uabi across parallel unordered construction.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2
Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-2-chris@chris-wilson.co.uk


# 6951e589 28-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move GEM object domain management from struct_mutex to local

Use the per-object local lock to control the cache domain of the
individual GEM objects, not struct_mutex. This is a huge leap forward
for us in terms of object-level synchronisation; execbuffers are
coordinated using the ww_mutex and pread/pwrite is finally fully
serialised again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-10-chris@chris-wilson.co.uk


# f0e4a063 28-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move GEM domain management to its own file

Continuing the decluttering of i915_gem.c, that of the read/write
domains, perhaps the biggest of GEM's follies?

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-7-chris@chris-wilson.co.uk


# 112ed2d3 24-Apr-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move GraphicsTechnology files under gt/

Start partitioning off the code that talks to the hardware (GT) from the
uapi layers and move the device facing code under gt/

One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
subdivide that header and body further (and split out the submission
code from the ringbuffer and logical context handling). This patch aims
to be simple motion so git can fixup inflight patches with little mess.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk


# 8a68d464 05-Mar-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Store the BIT(engine->id) as the engine's mask

In the next patch, we are introducing a broad virtual engine to encompass
multiple physical engines, losing the 1:1 nature of BIT(engine->id). To
reflect the broader set of engines implied by the virtual instance, lets
store the full bitmask.

v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/)
v3: Tvrtko voted for moah churn so teach everyone to not mention ring
and use $class$instance throughout.
v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and
VCS[0-4] in later gen. We opt to keep the code consistent and use
0-index naming throughout.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk


# cf819eff 12-Dec-2018 Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915: replace IS_GEN<N> with IS_GEN(..., N)

Define IS_GEN() similarly to our IS_GEN_RANGE(). but use gen instead of
gen_mask to do the comparison. Now callers can pass then gen as a parameter,
so we don't require one macro for each gen.

The following spatch was used to convert the users of these macros:

@@
expression e;
@@
(
- IS_GEN2(e)
+ IS_GEN(e, 2)
|
- IS_GEN3(e)
+ IS_GEN(e, 3)
|
- IS_GEN4(e)
+ IS_GEN(e, 4)
|
- IS_GEN5(e)
+ IS_GEN(e, 5)
|
- IS_GEN6(e)
+ IS_GEN(e, 6)
|
- IS_GEN7(e)
+ IS_GEN(e, 7)
|
- IS_GEN8(e)
+ IS_GEN(e, 8)
|
- IS_GEN9(e)
+ IS_GEN(e, 9)
|
- IS_GEN10(e)
+ IS_GEN(e, 10)
|
- IS_GEN11(e)
+ IS_GEN(e, 11)
)

v2: use IS_GEN rather than GT_GEN and compare to info.gen rather than
using the bitmask

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-2-lucas.demarchi@intel.com


# b3ad99ed 05-Feb-2018 Michal Srb <msrb@suse.com>

drm/i915/cmdparser: Do not check past the cmd length.

The command MEDIA_VFE_STATE checks bits at offset +2 dwords. However, it is
possible to have MEDIA_VFE_STATE command with length = 0 + LENGTH_BIAS = 2.
In that case check_cmd will read bits from the following command, or even past
the end of the buffer.

If the offset ends up outside of the command length, reject the command.

Fixes: 351e3db2b363 ("drm/i915: Implement command buffer parsing logic")
Signed-off-by: Michal Srb <msrb@suse.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205151745.29292-1-msrb@suse.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205160438.3267-2-chris@chris-wilson.co.uk
(cherry picked from commit 3aec7f871c65eb5f76b4125fda432593c834a6f2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# b18224e9 05-Feb-2018 Michal Srb <msrb@suse.com>

drm/i915/cmdparser: Check reg_table_count before derefencing.

The find_reg function was assuming that there is always at least one table in
reg_tables. It is not always true.

In case of VCS or VECS, the reg_tables is NULL and reg_table_count is 0,
implying that no register-accessing commands are allowed. However, the command
tables include commands such as MI_STORE_REGISTER_MEM. When trying to check
such command, the find_reg would dereference NULL pointer.

Now it will just return NULL meaning that the register was not found and the
command will be rejected.

Fixes: 76ff480ec963 ("drm/i915/cmdparser: Use binary search for faster register lookup")
Signed-off-by: Michal Srb <msrb@suse.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205142916.27092-2-msrb@suse.com
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205160438.3267-1-chris@chris-wilson.co.uk
register lookup")
(cherry picked from commit 2f265fad9756a40c09e3f4dcc62d5d7fa73a9fb2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 3aec7f87 05-Feb-2018 Michal Srb <msrb@suse.com>

drm/i915/cmdparser: Do not check past the cmd length.

The command MEDIA_VFE_STATE checks bits at offset +2 dwords. However, it is
possible to have MEDIA_VFE_STATE command with length = 0 + LENGTH_BIAS = 2.
In that case check_cmd will read bits from the following command, or even past
the end of the buffer.

If the offset ends up outside of the command length, reject the command.

Fixes: 351e3db2b363 ("drm/i915: Implement command buffer parsing logic")
Signed-off-by: Michal Srb <msrb@suse.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205151745.29292-1-msrb@suse.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205160438.3267-2-chris@chris-wilson.co.uk


# 2f265fad 05-Feb-2018 Michal Srb <msrb@suse.com>

drm/i915/cmdparser: Check reg_table_count before derefencing.

The find_reg function was assuming that there is always at least one table in
reg_tables. It is not always true.

In case of VCS or VECS, the reg_tables is NULL and reg_table_count is 0,
implying that no register-accessing commands are allowed. However, the command
tables include commands such as MI_STORE_REGISTER_MEM. When trying to check
such command, the find_reg would dereference NULL pointer.

Now it will just return NULL meaning that the register was not found and the
command will be rejected.

Fixes: 76ff480ec963 ("drm/i915/cmdparser: Use binary search for faster register lookup")
Signed-off-by: Michal Srb <msrb@suse.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205142916.27092-2-msrb@suse.com
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205160438.3267-1-chris@chris-wilson.co.uk
register lookup")


# 439e2ee4 29-Nov-2017 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Move engine->needs_cmd_parser to engine->flags

Will be adding a new per-engine flags shortly so it makes sense
to consolidate.

v2: Keep the original code flow in intel_engine_cleanup_cmd_parser.
(Joonas Lahtinen)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171129082409.18189-1-tvrtko.ursulin@linux.intel.com


# 0ffba1fc 07-Nov-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Silence smatch for cmdparser

drivers/gpu/drm/i915/i915_cmd_parser.c:808:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:811:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:814:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:808:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:811:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:814:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:808:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:811:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:814:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:808:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:811:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:814:23: error: not an lvalue

If we move the shift into each case not only do we kill the warning from
smatch, but we shrink the code slightly:

text data bss dec hex filename
1267906 20587 3168 1291661 13b58d before
1267890 20587 3168 1291645 13b57d after

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171107154055.19460-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>


# 3b24e7e8 28-Aug-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Recreate vmapping even when the object is pinned

Sometimes we know we are the only user of the bo, but since we take a
protective pin_pages early on, an attempt to change the vmap on the
object is denied because it is busy. i915_gem_object_pin_map() cannot
tell from our single pin_count if the operation is safe. Instead we must
pass that information down from the caller in the manner of
I915_MAP_OVERRIDE.

This issue has existed from the introduction of the mapping, but was
never noticed as the only place where this conflict might happen is for
cached kernel buffers (such as allocated by i915_gem_batch_pool_get()).
Until recently there was only a single user (the cmdparser) so no
conflicts ever occurred. However, we now use it to allocate batches for
different operations (using MAP_WC on !llc for writes) in addition to the
existing shadow batch (using MAP_WB for reads).

We could either keep both mappings cached, or use a different write
mechanism if we detect a MAP_WB already exists (i.e. clflush
afterwards), but as we haven't seen this issue in the wild (it requires
hitting the GPU reloc path in addition to the cmdparser) for simplicity
just allow the mappings to be recreated.

v2: Include the i915_MAP_OVERRIDE bit in the enum so the compiler knows
about all the valid values.

Fixes: 7dd4f6729f92 ("drm/i915: Async GPU relocation processing")
Testcase: igt/gem_lut_handle # byt, completely by accident
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170828104631.8606-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit a575c6761757232ea2c7dc9f370640754b90cc69)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# a575c676 28-Aug-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Recreate vmapping even when the object is pinned

Sometimes we know we are the only user of the bo, but since we take a
protective pin_pages early on, an attempt to change the vmap on the
object is denied because it is busy. i915_gem_object_pin_map() cannot
tell from our single pin_count if the operation is safe. Instead we must
pass that information down from the caller in the manner of
I915_MAP_OVERRIDE.

This issue has existed from the introduction of the mapping, but was
never noticed as the only place where this conflict might happen is for
cached kernel buffers (such as allocated by i915_gem_batch_pool_get()).
Until recently there was only a single user (the cmdparser) so no
conflicts ever occurred. However, we now use it to allocate batches for
different operations (using MAP_WC on !llc for writes) in addition to the
existing shadow batch (using MAP_WB for reads).

We could either keep both mappings cached, or use a different write
mechanism if we detect a MAP_WB already exists (i.e. clflush
afterwards), but as we haven't seen this issue in the wild (it requires
hitting the GPU reloc path in addition to the cmdparser) for simplicity
just allow the mappings to be recreated.

v2: Include the i915_MAP_OVERRIDE bit in the enum so the compiler knows
about all the valid values.

Fixes: 7dd4f6729f92 ("drm/i915: Async GPU relocation processing")
Testcase: igt/gem_lut_handle # byt, completely by accident
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170828104631.8606-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 0ce81788 17-May-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Redefine ptr_pack_bits() and friends

Rebrand the current (pointer | bits) pack/unpack utility macros as
explicit bit twiddling for PAGE_SIZE so that we can use the more
flexible underlying macros for different bits.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-4-chris@chris-wilson.co.uk


# 1d39f281 11-Apr-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rename intel_engine_cs.exec_id to uabi_id

We want to refer to the index of the engine consistently throughout the
userspace ABI. We already have such an index through the execbuffer
engine specifier, that needs to be able to refer to each engine
specifically, so rename it the index to uabi_id to reflect its
generality beyond execbuf.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170411124306.15448-1-chris@chris-wilson.co.uk


# 504ae402 10-Mar-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/cmdparser: Limit clflush to active cachelines

We only need to clflush those cachelines that we have validated to be
read by the GPU. Userspace typically fills the batch length in
correctly, the exceptions tend to be explicit tests within igt.

v2: Use ptr_mask_bits() to make Mika happy
v3: cmd is not advanced on MI_BBE, so make sure to include an extra
dword in the clflush.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170310115518.13832-1-chris@chris-wilson.co.uk


# c4d3ae68 06-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Consolidate checks for memcpy-from-wc support

In order to silence sparse:

../drivers/gpu/drm/i915/i915_gpu_error.c:200:39: warning: Using plain integer as NULL pointer

add a helper to check whether we have sse4.1 and that the desired
alignment is valid for acceleration.

v2: Explain the macros and split the two use cases between
i915_has_memcpy_from_wc() and i915_can_memcpy_from_wc().

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170106152013.24684-1-chris@chris-wilson.co.uk


# 41736a8e 23-Nov-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use the precomputed value for whether to enable command parsing

As i915.enable_cmd_parser is an unsafe option, make it read-only at
runtime. Now that it is constant, we can use the value determined during
initialisation as to whether we need the cmdparser at execbuffer time.

v2: Remove the inline for its single user, it is clear enough (and
shorter) without!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161124125851.6615-1-chris@chris-wilson.co.uk


# 007873b3 23-Nov-2016 Matthew Auld <matthew.auld@intel.com>

drm/i915: kick out cmd_parser specific structs from i915_drv.h

No sense in keeping the cmd_descriptor and cmd_table structs in
i915_drv.h, now that they are no longer referenced externally.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1479942147-9837-1-git-send-email-matthew.auld@intel.com


# e3f51ece 14-Nov-2016 Matthew Auld <matthew.auld@intel.com>

drm/i915: cleanup use of INSTR_CLIENT_MASK

Doing cmd_header >> 29 to extract our 3-bit client value where we know
cmd_header is a u32 shouldn't then also require the use of a mask. So
remove the redundant operation and get rid of INSTR_CLIENT_MASK now that
there are no longer any users.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1479163174-29686-1-git-send-email-matthew.auld@intel.com


# 10ff401d 07-Nov-2016 Robert Bragg <robert@sixbynine.org>

drm/i915: don't whitelist oacontrol in cmd parser

Being able to program OACONTROL from a non-privileged batch buffer is
not sufficient to be able to configure the OA unit. This was originally
allowed to help enable Mesa to expose OA counters via the
INTEL_performance_query extension, but the current implementation based
on programming OACONTROL via a batch buffer isn't able to report useable
data without a more complete OA unit configuration. Mesa handles the
possibility that writes to OACONTROL may not be allowed and so only
advertises the extension after explicitly testing that a write to
OACONTROL succeeds. Based on this; removing OACONTROL from the whitelist
should be ok for userspace.

Removing this simplifies adding a new kernel api for configuring the OA
unit without needing to consider the possibility that userspace might
trample on OACONTROL state which we'd like to start managing within
the kernel instead. In particular running any Mesa based GL application
currently results in clearing OACONTROL when initializing which would
disable the capturing of metrics.

v2:
This bumps the command parser version from 8 to 9, as the change is
visible to userspace.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Sourab Gupta <sourab.gupta@intel.com>Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161108125148.25007-1-robert@sixbynine.org


# 9bbeaedb 07-Nov-2016 Robert Bragg <robert@sixbynine.org>

drm/i915: return EACCES for check_cmd() failures

check_cmd() is checking whether a command adheres to certain
restrictions that ensure it's safe to execute within a privileged batch
buffer. Returning false implies a privilege problem, not that the
command is invalid.

The distinction makes the difference between allowing the buffer to be
executed as an unprivileged batch buffer or returning an EINVAL error to
userspace without executing anything.

In a case where userspace may want to test whether it can successfully
write to a register that needs privileges the distinction may be
important and an EINVAL error may be considered fatal.

In particular this is currently true for Mesa, which includes a test for
whether OACONTROL can be written too, but Mesa treats any error when
flushing a batch buffer as fatal, calling exit(1).

As it is currently Mesa can gracefully handle a failure to write to
OACONTROL if the command parser is disabled, but if we were to remove
OACONTROL from the parser's whitelist then the returned EINVAL would
break Mesa applications as they attempt an OACONTROL write.

This bumps the command parser version from 7 to 8, as the change is
visible to userspace.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161107194957.3385-4-robert@sixbynine.org


# a941795a 07-Nov-2016 Robert Bragg <robert@sixbynine.org>

drm/i915: rename OACONTROL GEN7_OACONTROL

OACONTROL changes quite a bit for gen8, with some bits split out into a
per-context OACTXCONTROL register. Rename now before adding more gen7 OA
registers

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161107194957.3385-3-robert@sixbynine.org


# a4f5ea64 28-Oct-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Refactor object page API

The plan is to make obtaining the backing storage for the object avoid
struct_mutex (i.e. use its own locking). The first step is to update the
API so that normal users only call pin/unpin whilst working on the
backing storage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-12-chris@chris-wilson.co.uk


# 3b3f1650 13-Oct-2016 Akash Goel <akash.goel@intel.com>

drm/i915: Allocate intel_engine_cs structure only for the enabled engines

With the possibility of addition of many more number of rings in future,
the drm_i915_private structure could bloat as an array, of type
intel_engine_cs, is embedded inside it.
struct intel_engine_cs engine[I915_NUM_ENGINES];
Though this is still fine as generally there is only a single instance of
drm_i915_private structure used, but not all of the possible rings would be
enabled or active on most of the platforms. Some memory can be saved by
allocating intel_engine_cs structure only for the enabled/active engines.
Currently the engine/ring ID is kept static and dev_priv->engine[] is simply
indexed using the enums defined in intel_engine_id.
To save memory and continue using the static engine/ring IDs, 'engine' is
defined as an array of pointers.
struct intel_engine_cs *engine[I915_NUM_ENGINES];
dev_priv->engine[engine_ID] will be NULL for disabled engine instances.

There is a text size reduction of 928 bytes, from 1028200 to 1027272, for
i915.o file (but for i915.ko file text size remain same as 1193131 bytes).

v2:
- Remove the engine iterator field added in drm_i915_private structure,
instead pass a local iterator variable to the for_each_engine**
macros. (Chris)
- Do away with intel_engine_initialized() and instead directly use the
NULL pointer check on engine pointer. (Chris)

v3:
- Remove for_each_engine_id() macro, as the updated macro for_each_engine()
can be used in place of it. (Chris)
- Protect the access to Render engine Fault register with a NULL check, as
engine specific init is done later in Driver load sequence.

v4:
- Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris)
- Kill the superfluous init_engine_lists().

v5:
- Cleanup the intel_engines_init() & intel_engines_setup(), with respect to
allocation of intel_engine_cs structure. (Chris)

v6:
- Rebase.

v7:
- Optimize the for_each_engine_masked() macro. (Chris)
- Change the type of 'iter' local variable to enum intel_engine_id. (Chris)
- Rebase.

v8: Rebase.

v9: Rebase.

v10:
- For index calculation use engine ID instead of pointer based arithmetic in
intel_engine_sync_index() as engine pointers are not contiguous now (Chris)
- For appropriateness, rename local enum variable 'iter' to 'id'. (Joonas)
- Use for_each_engine macro for cleanup in intel_engines_init() and remove
check for NULL engine pointer in cleanup() routines. (Joonas)

v11: Rebase.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476378888-7372-1-git-send-email-akash.goel@intel.com


# 911f4869 15-Sep-2016 Jani Nikula <jani.nikula@intel.com>

drm/i915: use NULL for NULL pointers

Fix sparse warning:

drivers/gpu/drm/i915/i915_cmd_parser.c:987:72: warning: Using plain
integer as NULL pointer

Fixes: 52a42cec4b70 ("drm/i915/cmdparser: Accelerate copies from WC memory")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1473946137-1931-4-git-send-email-jani.nikula@intel.com


# 52a42cec 18-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/cmdparser: Accelerate copies from WC memory

If we need to use clflush to prepare our batch for reads from memory, we
can bypass the cache instead by using non-temporal copies.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-39-chris@chris-wilson.co.uk


# 76ff480e 18-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/cmdparser: Use binary search for faster register lookup

A significant proportion of the cmdparsing time for some batches is the
cost to find the register in the mmiotable. We ensure that those tables
are in ascending order such that we could do a binary search if it was
ever merited. It is.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-38-chris@chris-wilson.co.uk


# ea884f09 18-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/cmdparser: Check for SKIP descriptors first

If the command descriptor says to skip it, ignore checking for anyother
other conflict.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-37-chris@chris-wilson.co.uk


# efdfd91f 18-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/cmdparser: Compare against the previous command descriptor

On the blitter (and in test code), we see long sequences of repeated
commands, e.g. XY_PIXEL_BLT, XY_SCANLINE_BLT, or XY_SRC_COPY. For these,
we can skip the hashtable lookup by remembering the previous command
descriptor and doing a straightforward compare of the command header.
The corollary is that we need to do one extra comparison before lookup
up new commands.

v2: Less magic mask (ok, it is still magic, but now you cannot see!)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-36-chris@chris-wilson.co.uk


# d6a4ead7 18-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/cmdparser: Improve hash function

The existing code's hashfunction is very suboptimal (most 3D commands
use the same bucket degrading the hash to a long list). The code even
acknowledge that the issue was known and the fix simple:

/*
* If we attempt to generate a perfect hash, we should be able to look at bits
* 31:29 of a command from a batch buffer and use the full mask for that
* client. The existing INSTR_CLIENT_MASK/SHIFT defines can be used for this.
*/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-35-chris@chris-wilson.co.uk


# ed13033f 18-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/cmdparser: Only cache the dst vmap

For simplicity, we want to continue using a contiguous mapping of the
command buffer, but we can reduce the number of vmappings we hold by
switching over to a page-by-page copy from the user batch buffer to the
shadow. The cost for saving one linear mapping is about 5% in trivial
workloads - which is more or less the overhead in calling kmap_atomic().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-34-chris@chris-wilson.co.uk


# 0b537272 18-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/cmdparser: Use cached vmappings

The single largest factor in the overhead of parsing the commands is the
setup of the virtual mapping to provide a continuous block for the batch
buffer. If we keep those vmappings around (against the better judgement
of mm/vmalloc.c, which we offset by handwaving and looking suggestively
at the shrinker) we can dramatically improve the performance of the
parser for small batches (such as media workloads). Furthermore, we can
use the prepare shmem read/write functions to determine how best we
need to clflush the range (rather than every page of the object).

The impact of caching both src/dst vmaps is +80% on ivb and +140% on byt
for the throughput on small batches. (Caching just the dst vmap and
iterating over the src, doing a page by page copy is roughly 5% slower
on both platforms. That may be an acceptable trade-off to eliminate one
cached vmapping, and we may be able to reduce the per-page copying overhead
further.) For *this* simple test case, the cmdparser is now within a
factor of 2 of ideal performance.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-33-chris@chris-wilson.co.uk


# 068715b9 18-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/cmdparser: Add the TIMESTAMP register for the other engines

Since I have been using the BCS_TIMESTAMP to measure latency of
execution upon the blitter ring, allow regular userspace to also read
from that register. They are already allowed RCS_TIMESTAMP!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-32-chris@chris-wilson.co.uk


# 7756e454 18-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/cmdparser: Make initialisation failure non-fatal

If the developer adds a register in the wrong order, we BUG during boot.
That makes development and testing very difficult. Let's be a bit more
friendly and disable the command parser with a big warning if the tables
are invalid.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-31-chris@chris-wilson.co.uk


# 43394c7d 18-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Extract i915_gem_obj_prepare_shmem_write()

This is a companion to i915_gem_obj_prepare_shmem_read() that prepares
the backing storage for direct writes. It first serialises with the GPU,
pins the backing storage and then indicates what clfushes are required in
order for the writes to be coherent.

Whilst here, fix support for ancient CPUs without clflush for which we
cannot do the GTT+clflush tricks.

v2: Add i915_gem_obj_finish_shmem_access() for symmetry

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-8-chris@chris-wilson.co.uk


# 33a051a5 27-Jul-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/cmdparser: Remove stray intel_engine_cs *ring

When we refer to intel_engine_cs, we want to use engine so as not to
confuse ourselves about ringbuffers.

v2: Rename all the functions as well, as well as a few more stray comments.
v3: Split the really long error message strings

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-6-git-send-email-chris@chris-wilson.co.uk
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469606850-28659-1-git-send-email-chris@chris-wilson.co.uk


# 14bb2c11 03-Jun-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Fix a buch of kerneldoc warnings

Just a bunch of stale kerneldocs generating warnings when
building the docs. Mostly function parameters so not very
useful but still.

v2: Tidy.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1464958937-23344-1-git-send-email-tvrtko.ursulin@linux.intel.com


# c033666a 06-May-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Store a i915 backpointer from engine, and use it

text data bss dec hex filename
6309351 3578714 696320 10584385 a18141 vmlinux
6308391 3578714 696320 10583425 a17d81 vmlinux

Almost 1KiB of code reduction.

v2: More s/INTEL_INFO()->gen/INTEL_GEN()/ and IS_GENx() conversions

text data bss dec hex filename
6304579 3578778 696320 10579677 a16edd vmlinux
6303427 3578778 696320 10578525 a16a5d vmlinux

Now over 1KiB!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462545621-30125-3-git-send-email-chris@chris-wilson.co.uk


# 6761d0a1 06-May-2016 Kenneth Graunke <kenneth@whitecape.org>

drm/i915: Allow MI_LOAD_REGISTER_REG between whitelisted registers.

Allowing register copies where the source and destination are both
whitelisted should be safe, and is useful. For example, Mesa uses
this to load the command streamer math registers with data from the
pipeline statistics counters.

v2: Reject writes to OACONTROL (and reads as well :(

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1462521014-13595-1-git-send-email-chris@chris-wilson.co.uk


# 1ca3712c 04-May-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Report command parser version 0 if disabled

If the command parser is not active, then it is appropriate to report it
as operating at version 0 as no higher mode is supported. This greatly
simplifies userspace querying for the command parser as we then do not
need to second guess when it will be active (a mixture of module
parameters and generational support, which may change over time).

v2: s/comand/command/ misspelling in comment

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1462368336-21230-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 6cf0716c 07-Mar-2016 Jordan Justen <jordan.l.justen@intel.com>

drm/i915: Bump command parser version for new whitelisted registers

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1457335830-30923-6-git-send-email-jordan.l.justen@intel.com


# 1b85066b 07-Mar-2016 Jordan Justen <jordan.l.justen@intel.com>

drm/i915: Add Haswell CS GPR registers to whitelist

This is needed for the Mesa Vulkan driver on Haswell.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1457335830-30923-5-git-send-email-jordan.l.justen@intel.com


# 99c5aeca 07-Mar-2016 Jordan Justen <jordan.l.justen@intel.com>

drm/i915: Move Haswell registers to separate whitelist table

Now that we can whitelist registers only on Haswell, move HSW_SCRATCH1
and HSW_ROW_CHICKEN3 into a separate Haswell only table.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1457335830-30923-4-git-send-email-jordan.l.justen@intel.com


# 361b027b 07-Mar-2016 Jordan Justen <jordan.l.justen@intel.com>

drm/i915: Use an array of register tables in command parser

For Haswell, we will want another table of registers while retaining
the large common table of whitelisted registers shared by all gen7
devices.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
[danvet: Pipe patch through sed -e 's/\<ring\>/engine/g' to make it
apply.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a6573e1f 07-Mar-2016 Jordan Justen <jordan.l.justen@intel.com>

drm/i915: Add TIMESTAMP to register whitelist

This is needed for the Mesa Vulkan driver on Haswell.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1457335830-30923-2-git-send-email-jordan.l.justen@intel.com


# 0bc40be8 16-Mar-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Rename intel_engine_cs function parameters

@@
identifier func;
@@
func(..., struct intel_engine_cs *
- ring
+ engine
, ...)
{
<...
- ring
+ engine
...>
}
@@
identifier func;
type T;
@@
T func(..., struct intel_engine_cs *
- ring
+ engine
, ...);

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# f0f59a00 18-Nov-2015 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Type safe register read/write

Make I915_READ and I915_WRITE more type safe by wrapping the register
offset in a struct. This should eliminate most of the fumbles we've had
with misplaced parens.

This only takes care of normal mmio registers. We could extend the idea
to other register types and define each with its own struct. That way
you wouldn't be able to accidentally pass the wrong thing to a specific
register access function.

The gpio_reg setup is probably the ugliest thing left. But I figure I'd
just leave it for now, and wait for some divine inspiration to strike
before making it nice.

As for the generated code, it's actually a bit better sometimes. Eg.
looking at i915_irq_handler(), we can see the following change:
lea 0x70024(%rdx,%rax,1),%r9d
mov $0x1,%edx
- movslq %r9d,%r9
- mov %r9,%rsi
- mov %r9,-0x58(%rbp)
- callq *0xd8(%rbx)
+ mov %r9d,%esi
+ mov %r9d,-0x48(%rbp)
callq *0xd8(%rbx)

So previously gcc thought the register offset might be signed and
decided to sign extend it, just in case. The rest appears to be
mostly just minor shuffling of instructions.

v2: i915_mmio_reg_{offset,equal,valid}() helpers added
s/_REG/_MMIO/ in the register defines
mo more switch statements left to worry about
ring_emit stuff got sorted in a prep patch
cmd parser, lrc context and w/a batch buildup also in prep patch
vgpu stuff cleaned up and moved to a prep patch
all other unrelated changes split out
v3: Rebased due to BXT DSI/BLC, MOCS, etc.
v4: Rebased due to churn, s/i915_mmio_reg_t/i915_reg_t/

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1447853606-2751-1-git-send-email-ville.syrjala@linux.intel.com


# e597ef40 06-Nov-2015 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Make the cmd parser 64bit regs explicit

Add defines for the upper halves of the registers used by the cmd
parser. Getting rid of the arithmetic with the register offset
will help in making registers type safe.

v2: s/_HI/_UDW/ (Chris)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1446839080-18732-1-git-send-email-ville.syrjala@linux.intel.com


# 7b9748cb 02-Oct-2015 Jordan Justen <jordan.l.justen@intel.com>

drm/i915: Add GEN7_GPGPU_DISPATCHDIMX/Y/Z to the register whitelist

This is required to support glDispatchComputeIndirect for gen7.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 614f4ad7 01-Sep-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Fix cmdparser STORE/LOAD command descriptors

Fixes regression from
commit f1afe24f0e736b9d7f2275e2b1504af3fe612f2a
Author: Arun Siluvery <arun.siluvery@linux.intel.com>
Date: Tue Aug 4 16:22:20 2015 +0100

drm/i915: Change SRM, LRM instructions to use correct length

which forgot to account for the length bias when declaring the fixed
length.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91844
Reported-by: Andreas Reis <andreas.reis@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 2bbe6bbb 15-Jun-2015 Francisco Jerez <currojerez@riseup.net>

drm/i915: Bump command parser version number.

This was forgotten in

commit d351f6d94893f3ba98b1b20c5ef44c35fc1da124
Author: Francisco Jerez <currojerez@riseup.net>
Date: Fri May 29 16:44:15 2015 +0300

drm/i915: Add SCRATCH1 and ROW_CHICKEN3 to the register whitelist.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# f1afe24f 04-Aug-2015 Arun Siluvery <arun.siluvery@linux.intel.com>

drm/i915: Change SRM, LRM instructions to use correct length

MI_STORE_REGISTER_MEM, MI_LOAD_REGISTER_MEM instructions are not really
variable length instructions unlike MI_LOAD_REGISTER_IMM where it expects
(reg, addr) pairs so use fixed length for these instructions.

v2: rebase

Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: Appease checkpatch as Mika spotted in i915_reg.h - it seems
terminally unhappy about i915_cmd_parser.c so that would be a separate
patch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 8453580c 29-Jul-2015 Hanno Böck <hanno@hboeck.de>

drm/i915: Fix command parser table validator

As we may like to use a bisection search on the tables in future, we
need them to be ordered. For convenience we expect the compiled tables
to be order and check on initialisation. However, the validator used the
wrong iterators failed to spot the misordered MI tables and instead
walked off into the unknown (as spotted by kasan).

Signed-off-by: Hanno Boeck <hanno@hboeck.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Again hand-assemble patch ...]
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>


# 9f58582c 29-Jul-2015 Hanno Böck <hanno@hboeck.de>

drm/i915: Properly sort MI coomand table

In the future, we may want to speed up command/register searching using
a bisection and so we require them to be in ascending order respectively
by command value or register address. However, this was not true for one
pair in the MI table; make it so.

Signed-off-by: Hanno Boeck <hanno@hboeck.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Hand-assemble patch from raw patch from Hanno and commit message from Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>


# 9e000847 03-Jul-2015 Arun Siluvery <arun.siluvery@linux.intel.com>

drm/i915: Update WaFlushCoherentL3CacheLinesAtContextSwitch

In this WA we need to set GEN8_L3SQCREG4[21:21] and reset it after PIPE_CONTROL
instruction but there is a slight complication as this is applied in WA batch
where the values are only initialized once.
Dave identified an issue with the current implementation where the register value
is read once at the beginning and it is reused; this patch corrects this by saving
the register value to memory, update register with the bit of our interest and
restore it back with original value.

This implementation uses MI_LOAD_REGISTER_MEM which is currently only used
by command parser and was using a default length of 0. This is now updated
with correct length and moved to appropriate place.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 41d232b7 29-May-2015 Francisco Jerez <currojerez@riseup.net>

drm/i915: Add SCRATCH1 and ROW_CHICKEN3 to the register whitelist.

Only bit 27 of SCRATCH1 and bit 6 of ROW_CHICKEN3 are allowed to be
set because of security-sensitive bits we don't want userspace to mess
with. On HSW hardware the whitelisted bits control whether atomic
read-modify-write operations are performed on L3 or on GTI, and when
set to L3 (which can be 10x-30x better performing than on GTI,
depending on the application) require great care to avoid a system
hang, so we currently program them to be handled on GTI by default.

Beignet can immediately start taking advantage of this change to
enable L3 atomics. Mesa should eventually switch to L3 atomics too,
but a number of non-trivial changes are still required so it will
continue using GTI atomics for now.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# c1091b25 29-May-2015 Francisco Jerez <currojerez@riseup.net>

drm/i915: Extend the parser to check register writes against a mask/value pair.

In some cases it might be unnecessary or dangerous to give userspace
the right to write arbitrary values to some register, even though it
might be desirable to give it control of some of its bits. This patch
extends the register whitelist entries to contain a mask/value pair in
addition to the register offset. For registers with non-zero mask,
any LRM writes and LRI writes where the bits of the immediate given by
the mask don't match the specified value will be rejected.

This will be used in my next patch to grant userspace partial write
access to some sensitive registers.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 8a389cac 29-May-2015 Francisco Jerez <currojerez@riseup.net>

drm/i915: Fix command parser to validate multiple register access with the same command.

Until now the software command checker assumed that commands could
read or write at most a single register per packet. This is not
necessarily the case, MI_LOAD_REGISTER_IMM expects a variable-length
list of offset/value pairs and writes them in sequence. The previous
code would only check whether the first entry was valid, effectively
allowing userspace to write unrestricted registers of the MMIO space
by sending a multi-register write with a legal first register, with
potential security implications on Gen6 and 7 hardware.

Fix it by extending the drm_i915_cmd_descriptor table to represent
multi-register access and making validate_cmd() iterate for all
register offsets present in the command packet.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# d351f6d9 29-May-2015 Francisco Jerez <currojerez@riseup.net>

drm/i915: Add SCRATCH1 and ROW_CHICKEN3 to the register whitelist.

Only bit 27 of SCRATCH1 and bit 6 of ROW_CHICKEN3 are allowed to be
set because of security-sensitive bits we don't want userspace to mess
with. On HSW hardware the whitelisted bits control whether atomic
read-modify-write operations are performed on L3 or on GTI, and when
set to L3 (which can be 10x-30x better performing than on GTI,
depending on the application) require great care to avoid a system
hang, so we currently program them to be handled on GTI by default.

Beignet can immediately start taking advantage of this change to
enable L3 atomics. Mesa should eventually switch to L3 atomics too,
but a number of non-trivial changes are still required so it will
continue using GTI atomics for now.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 4e86f725 29-May-2015 Francisco Jerez <currojerez@riseup.net>

drm/i915: Extend the parser to check register writes against a mask/value pair.

In some cases it might be unnecessary or dangerous to give userspace
the right to write arbitrary values to some register, even though it
might be desirable to give it control of some of its bits. This patch
extends the register whitelist entries to contain a mask/value pair in
addition to the register offset. For registers with non-zero mask,
any LRM writes and LRI writes where the bits of the immediate given by
the mask don't match the specified value will be rejected.

This will be used in my next patch to grant userspace partial write
access to some sensitive registers.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 6a65c5b9 29-May-2015 Francisco Jerez <currojerez@riseup.net>

drm/i915: Fix command parser to validate multiple register access with the same command.

Until now the software command checker assumed that commands could
read or write at most a single register per packet. This is not
necessarily the case, MI_LOAD_REGISTER_IMM expects a variable-length
list of offset/value pairs and writes them in sequence. The previous
code would only check whether the first entry was valid, effectively
allowing userspace to write unrestricted registers of the MMIO space
by sending a multi-register write with a legal first register, with
potential security implications on Gen6 and 7 hardware.

Fix it by extending the drm_i915_cmd_descriptor table to represent
multi-register access and making validate_cmd() iterate for all
register offsets present in the command packet.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# de4e783a 07-Apr-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Tidy batch pool logic

Move the madvise logic out of the execbuffer main path into the
relatively rare allocation path, making the execbuffer manipulation less
fragile.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 72c5ba95 13-Mar-2015 Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915: Fix vmap_batch page iterator overrun

vmap_batch() calculates amount of needed pages for the mapping
we are going to create. And it uses this page count as an
argument for the for_each_sg_pages() macro. The macro takes the number
of sg list entities as an argument, not the page count. So we ended
up iterating through all the pages on the mapped object, corrupting
memory past the smaller pages[] array.

Fix this by bailing out when we have enough pages.

This regression has been introduced in

commit 17cabf571e50677d980e9ab2a43c5f11213003ae
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Jan 14 11:20:57 2015 +0000

drm/i915: Trim the command parser allocations

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 17cabf57 14-Jan-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Trim the command parser allocations

Currently, the command parser tries to create a secondary batch exactly
as large as the original, and vmap both. This is open to abuse by
userspace using extremely large batch objects, but only executing very
short batches. For example, this would be if userspace were to implement
a command submission ringbuffer. However, we only need to allocate pages
for just the contents of the command sequence in the batch - all
relocations copied to the secondary batch will reference the original
batch and so there can be no access to the secondary batch outside of
the explicit execution region.

Testcase: igt/gem_exec_big #ivb,byt,hsw
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88308
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c61200c2 11-Dec-2014 Jordan Justen <jordan.l.justen@intel.com>

drm/i915: Add GPGPU_THREADS_DISPATCHED to the register whitelist

This will allow us to read the number of dispatched compute threads
for GL_ARB_pipeline_statistics_query.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 71745376 11-Dec-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Tidy up execbuffer command parsing code

Move it to a separate function since the main do_execbuffer function
already has so much going on.

v2:
- Move pin/unpin calls inside i915_parse_cmds() (Chris W, v4 7/7
feedback)

Issue: VIZ-4719
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b9ffd80e 11-Dec-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Use batch length instead of object size in command parser

Previously we couldn't trust the user-supplied batch length because
it came directly from userspace (i.e. untrusted code). It would have
affected what commands software parsed without regard to what hardware
would actually execute, leaving a potential hole.

With the parser now copying the user supplied batch buffer and writing
MI_NOP commands to any space after the copied region, we can safely use
the batch length input. This should be a performance win as the actual
batch length is frequently much smaller than the allocated object size.

v2: Fix handling of non-zero batch_start_offset

Issue: VIZ-4719
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 78a42377 11-Dec-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Use batch pools with the command parser

This patch sets up all of the tracking and copying necessary to
use batch pools with the command parser and dispatches the copied
(shadow) batch to the hardware.

After this patch, the parser is in 'enabling' mode.

Note that performance takes a hit from the copy in some cases
and will likely need some work. At a rough pass, the memcpy
appears to be the bottleneck. Without having done a deeper
analysis, two ideas that come to mind are:
1) Copy sections of the batch at a time, as they are reached
by parsing. Might improve cache locality.
2) Copy only up to the userspace-supplied batch length and
memset the rest of the buffer. Reduces the number of reads.

v2:
- Remove setting the capacity of the pool
- One global pool instead of per-ring pools
- Replace batch_obj with shadow_batch_obj and hook into eb->vmas
- Memset any space in the shadow batch beyond what gets copied
- Rebased on execlist prep refactoring

v3:
- Rebase on chained batch handling
- Squash in setting the secure dispatch flag
- Add a note about the interaction w/secure dispatch pinning
- Check for request->batch_obj == NULL in i915_gem_free_request

v4:
- Fix read domains for shadow_batch_obj
- Remove the set_to_gtt_domain call from i915_parse_cmds
- ggtt_pin/unpin in the parser block to simplify error handling
- Check USES_FULL_PPGTT before setting DISPATCH_SECURE flag
- Remove i915_gem_batch_pool_put calls

v5:
- Move 'pending_read_domains |= I915_GEM_DOMAIN_COMMAND' after
the parser (danvet, from v4 0/7 feedback)

Issue: VIZ-4719
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 86ef630d 21-Nov-2014 Michael H. Nguyen <michael.h.nguyen@intel.com>

drm/i915: Add MI_SET_APPID cmd to cmd parser tables

Was missing.

Issue: VIZ-4701
Signed-off-by: Michael H. Nguyen <michael.h.nguyen@intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# bfc882b4 19-Nov-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Flatten engine init control flow

Now that sanity prevails and we have the clean split between software
init and starting the engines we can drop all the "have we allocate
this struct already?" nonsense.

Execlist code could benefit quite a bit more still, but that's for
another patch.

Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>


# f1f55cc0 07-Nov-2014 Neil Roberts <neil@linux.intel.com>

drm/i915: Add the predicate source registers to the register whitelist

The predicate source registers are needed to implement conditional
rendering without stalling. The two source registers are used to load
the previous values of the PS_DEPTH_COUNT register saved from
PIPE_CONTROL commands. These can then be compared and used to set the
predicate enable bit via the MI_PREDICATE command.

The command parser version number is increased to 2 to make it easier
to detect the new functionality in user space.

Signed-off-by: Neil Roberts <neil@linux.intel.com>
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 42c7156a 16-Oct-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Abort command parsing for chained batches

libva uses chained batch buffers in a way that the command parser
can't generally handle. Fortunately, libva doesn't need to write
registers from batch buffers in the way that mesa does, so this
patch causes the driver to fall back to non-secure dispatch if
the parser detects a chained batch buffer.

Note: The 2nd hunk to munge the error code of the parser looks a bit
superflous. At least until we have the batch copy code ready and can
run the cmd parser in granting mode. But it isn't since we still need
to let existing libva buffers pass (though not with elevated privs
ofc!).

Testcase: igt/gem_exec_parse/chained-batch
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
[danvet: Add note - this confused me in review and Brad clarified
things (after a few mails ...).]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 32197aab 20-Oct-2014 Masanari Iida <standby24x7@gmail.com>

gpu:drm: Fix typo in Documentation/DocBook/drm.xml

This patch fix spelling typos found in drm.xml.
It is because the file is generated from comments in
source codes, I have to fix the typos within source files.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 22cb99af 22-Sep-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Don't leak command parser tables on suspend/resume

Ring init and cleanup are not balanced because we re-init the rings on
resume without having cleaned them up on suspend. This leads to the
driver leaking the parser's hash tables with a kmemleak signature such
as this:

unreferenced object 0xffff880405960980 (size 32):
comm "systemd-udevd", pid 516, jiffies 4294896961 (age 10202.044s)
hex dump (first 32 bytes):
d0 85 46 c0 ff ff ff ff 00 00 00 00 00 00 00 00 ..F.............
98 60 28 04 04 88 ff ff 00 00 00 00 00 00 00 00 .`(.............
backtrace:
[<ffffffff81816f9e>] kmemleak_alloc+0x4e/0xb0
[<ffffffff811fa678>] kmem_cache_alloc_trace+0x168/0x2f0
[<ffffffffc03e20a5>] i915_cmd_parser_init_ring+0x2a5/0x3e0 [i915]
[<ffffffffc04088a2>] intel_init_ring_buffer+0x202/0x470 [i915]
[<ffffffffc040c998>] intel_init_vebox_ring_buffer+0x1e8/0x2b0 [i915]
[<ffffffffc03eff59>] i915_gem_init_hw+0x2f9/0x3a0 [i915]
[<ffffffffc03f0057>] i915_gem_init+0x57/0x1d0 [i915]
[<ffffffffc045e26a>] i915_driver_load+0xc0a/0x10e0 [i915]
[<ffffffffc02e0d5d>] drm_dev_register+0xad/0x100 [drm]
[<ffffffffc02e3b9f>] drm_get_pci_dev+0x8f/0x200 [drm]
[<ffffffffc03c934b>] i915_pci_probe+0x3b/0x60 [i915]
[<ffffffff81436725>] local_pci_probe+0x45/0xa0
[<ffffffff81437a69>] pci_device_probe+0xd9/0x130
[<ffffffff81524f4d>] driver_probe_device+0x12d/0x3e0
[<ffffffff815252d3>] __driver_attach+0x93/0xa0
[<ffffffff81522e1b>] bus_for_each_dev+0x6b/0xb0

This patch extends the current convention of checking whether a
resource is already allocated before allocating it during ring init.
Longer term it might make sense to only init the rings once.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83794
Tested-by: Kari Suvanto <kari.tj.suvanto@gmail.com>
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 00caf019 18-Sep-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Log a message when rejecting LRM to OACONTROL

The other paths in the command parser that reject a batch all
log a message indicating the reason. We simply missed this one.

Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 9beb0ccb 18-Sep-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Re-enable the command parser when using PPGTT

In commit

commit 896ab1a5d54269b463a24194c2e4a369103b46d8
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Wed Aug 6 15:04:51 2014 +0200

drm/i915: Fix up checks for aliasing ppgtt

it looks like we accidentally inverted the check that the command
parser should only run when the driver enables some form of PPGTT.

Testcase: igt/gem_exec_parse
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
[danvet: Also drop the comment right above, all production vlv now
have hw ppgtt enabled.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 896ab1a5 06-Aug-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Fix up checks for aliasing ppgtt

A subsequent patch will no longer initialize the aliasing ppgtt if we
have full ppgtt enabled, since we simply don't need that any more.

Unfortunately a few places check for the aliasing ppgtt instead of
checking for ppgtt in general. Fix them up.

One special case are the gtt offset and size macros, which have some
code to remap the aliasing ppgtt to the global gtt. The aliasing ppgtt
is _not_ a logical address space, so passing that in as the vm is
plain and simple a bug. So just WARN about it and carry on - we have a
gracefully fall-through anyway if we can't find the vma.

Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# c9224faa 17-Jun-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Add some L3 registers to the parser whitelist

Beignet needs these in order to program the L3 cache config for
OpenCL workloads, particularly when using SLM.

Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a4872ba6 22-May-2014 Oscar Mateo <oscar.mateo@intel.com>

drm/i915: s/intel_ring_buffer/intel_engine_cs

In the upcoming patches we plan to break the correlation between
engine command streamers (a.k.a. rings) and ringbuffers, so it
makes sense to refactor the code and make the change obvious.

No functional changes.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 44e895a8 10-May-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Use hash tables for the command parser

For clients that submit large batch buffers the command parser has
a substantial impact on performance. On my HSW ULT system performance
drops as much as ~20% on some tests. Most of the time is spent in the
command lookup code. Converting that from the current naive search to
a hash table lookup reduces the performance drop to ~10%.

The choice of value for I915_CMD_HASH_ORDER allows all commands
currently used in the parser tables to hash to their own bucket (except
for one collision on the render ring). The tradeoff is that it wastes
memory. Because the opcodes for the commands in the tables are not
particularly well distributed, reducing the order still leaves many
buckets empty. The increased collisions don't seem to have a huge
impact on the performance gain, but for now anyhow, the parser trades
memory for performance.

NB: Ville noticed that the error paths through the ring init code
will leak memory. I've not addressed that here. We can do a follow
up pass to handle all of the leaks.

v2: improved comment describing selection of hash key mask (Damien)
replace a BUG_ON() with an error return (Tvrtko, Ville)
commit message improvements

Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 4b6eab59 28-Apr-2014 Jan Moskyto Matejka <mq@suse.cz>

Revert "drm/i915: fix build warning on 32-bit (v2)"

This reverts commit 60f2b4af1258c05e6b037af866be81abc24438f7.

The same warning has been fixed in e5081a538a565284fec5f30a937d98e460d5e780 and
these two commits got merged in 74e99a84de2d0980320612db8015ba606af42114 which
caused another warning. Simply, the reverted commit casted the pointer
difference to unsigned long and the other commit changed the output type from
long to ptrdiff_t.

The other commit fixes the original warning the better way so I'm reverting
this commit now.

Signed-off-by: Jan Moskyto Matejka <mq@suse.cz>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 122b2505 25-Apr-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Integrate cmd parser kerneldoc

Ville noticed that we have this nice kerneldoc but it's not integrated
anywhere. Fix this asap!

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 113a0476 08-Apr-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Add more registers to the whitelist for mesa

These are additional registers needed for performance monitoring and
ARB_draw_indirect extensions in mesa.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76719
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
[danvet: Squash in fixup from Brad requested by Ken.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 86a25121 02-Apr-2014 Jani Nikula <jani.nikula@intel.com>

drm/i915: fix command parser debug print format mismatches

Drop the cast from the pointer diff to fix:

drivers/gpu/drm/i915/i915_cmd_parser.c:405:4: warning: format '%td' expects
argument of type 'ptrdiff_t', but argument 5 has type 'long unsigned int'
[-Wformat]

While at it, use %u for u32.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
[danvet: After conflict resolution only the "While at it, ..." part
was left standing ...]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 6e66ea13 28-Mar-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Track OACONTROL register enable/disable during parsing

There is some thought that the data from the performance counters enabled
via OACONTROL should only be available to the process that enabled counting.
To limit snooping, require that any batch buffer which sets OACONTROL to a
non-zero value also sets it back to 0 before the end of the batch.

This requires limiting OACONTROL writes to happen via MI_LOAD_REGISTER_IMM
so that we can access the value being written. This should be in line with
the expected use case for writing OACONTROL.

v2: Drop an unnecessary '? true : false'

Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b651000b 27-Mar-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Refactor cmd parser checks into a function

This brings the code a little more in line with kernel coding style.

Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 300233ee 27-Mar-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: BUG_ON() when cmd/reg tables are not sorted

As suggested during review, this makes it much more obvious
when the tables are not sorted.

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 77fec556 31-Mar-2014 Jani Nikula <jani.nikula@intel.com>

drm/i915: drop the typedef for drm_i915_private_t

There are no longer users of drm_i915_private_t. Drop the typedef. Good
riddance.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Chris Wilson <chris@chris-wislon.co.uk>
[danvet: Add the hunk in i915_cmd_parser.c here which had to be
relocated to the how this was merged.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 180b813c 25-Mar-2014 Kenneth Graunke <kenneth@whitecape.org>

drm/i915: Add OACONTROL to the command parser register whitelist.

Mesa needs to be able to write OACONTROL in order to expose the
Observability Architecture's performance counters via OpenGL.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
[danvet: Add comment that this is just a temporary work-around and
that we need to check more things before we can allow OACONTROL writes
for real everywhere.]
[danvet 2: Squash in fixup to avoid a DRM_ERROR due to unsorted reg
list, spotted by Jani.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d728c8ef 18-Feb-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Add a CMD_PARSER_VERSION getparam

So userspace can query the kernel for command parser support.

v2: Add i915_cmd_parser_get_version(), history log, and kerneldoc

OTC-Tracker: AXIA-4631
Change-Id: I58af650db9f6753c2dcac9c54ab432fd31db302f
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 114d4f70 18-Feb-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Reject commands that would store to global HWS page

PIPE_CONTROL and MI_FLUSH_DW have bits that would write to the
hardware status page. The driver stores request tracking info
there, so don't let userspace overwrite it.

v2: trailing comma fix, rebased

Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d4d48035 18-Feb-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Enable PPGTT command parser checks

Various commands that access memory have a bit to determine whether
the graphics address specified in the command should use the GGTT or
PPGTT for translation. These checks ensure that the bit indicates
PPGTT translation.

Most of these checks use the existing bit-checking infrastructure.
The PIPE_CONTROL and MI_FLUSH_DW commands, however, are multi-function
commands. The GGTT/PPGTT bit is only relevant for certain uses of the
command. As such, this change also extends the bit-checking code to
include a "condition" mask and offset. If the condition mask is non-zero
then the parser only performs the bit check when the bits specified by
the condition mask/offset are also non-zero.

NOTE: At this point in the series PPGTT must be enabled for the parser
to work correctly. If it's not enabled, userspace will not be setting
the PPGTT bits the way the parser requires. VLV is the only platform
where this is a problem, so at this point, we disable parsing for VLV.

v2: whitespace and trailing commas fixes, rebased

OTC-Tracker: AXIA-4631
Change-Id: I3f4c76b6734f1956ec47e698230f97d0998ff92b
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Drop the unecessary cast Jani spotted.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b18b396b 18-Feb-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Reject commands that explicitly generate interrupts

The driver leaves most interrupts masked during normal operation,
so there would have to be additional work to enable userspace to
safely request/receive an interrupt.

v2: trailing commas, rebased

Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# f0a346bd 18-Feb-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Enable register whitelist checks

MI_STORE_REGISTER_MEM, MI_LOAD_REGISTER_MEM, and MI_LOAD_REGISTER_IMM
commands allow userspace access to registers. Only certain registers
should be allowed for such access, so enable checking for those commands.
Each ring gets its own register whitelist.

MI_LOAD_REGISTER_REG on HSW also allows register access but is currently
unused by userspace components. Leave it rejected.

PIPE_CONTROL and MEDIA_VFE_STATE allow register access based on certain
bits being set. Reject those as well.

v2: trailing commas, rebased

OTC-Tracker: AXIA-4631
Change-Id: Ie614a2f0eb2e5917de809e5a17957175d24cc44f
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 220375aa 18-Feb-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Add register whitelist for DRM master

These are used to implement scanline waits in the X server.

v2: Use #defines instead of magic numbers

Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 5947de9b 18-Feb-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Add register whitelists for mesa

These registers are currently used by mesa for blitting,
transform feedback extensions, and performance monitoring
extensions.

v2: REG64 macro

Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 17c1eb15 18-Feb-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Allow some privileged commands from master

The Intel DDX uses these to implement scanline waits in the X server.

Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 9c640d1d 18-Feb-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Reject privileged commands

The spec defines most of these commands as privileged. A few others,
like the semaphore mbox command and some display commands, are also
reserved for the driver's use. Subsequent patches relax some of
these restrictions.

v2: Rebased

Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 3a6fa984 18-Feb-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Initial command parser table definitions

Add command tables defining irregular length commands for each ring.
This requires a few new command opcode definitions.

v2: Whitespace adjustment in command definitions, sparse fix for !F

OTC-Tracker: AXIA-4631
Change-Id: I064bceb457e15f46928058352afe76d918c58ef5
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 60f2b4af 27-Mar-2014 Dave Airlie <airlied@redhat.com>

drm/i915: fix build warning on 32-bit (v2)

/ssd/git/drm-next/drivers/gpu/drm/i915/i915_cmd_parser.c: In function ‘i915_parse_cmds’:
/ssd/git/drm-next/drivers/gpu/drm/i915/i915_cmd_parser.c:405:4: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 5 has type ‘int’ [-Wformat=]
DRM_DEBUG_DRIVER("CMD: Command length exceeds batch length: 0x%08X length=%d batchlen=%ld\n",
^

Signed-off-by: Dave Airlie <airlied@redhat.com>


# e5081a53 18-Mar-2014 Damien Lespiau <damien.lespiau@intel.com>

drm/i915: Use the correct format string modifier for ptrdiff_t

When compiling on 32bits, I have the following warning:

drivers/gpu/drm/i915/i915_cmd_parser.c:405:4: warning: format ‘%ld’
expects argument of type ‘long int’, but argument 7 has type ‘int’
[-Wformat=]
DRM_DEBUG_DRIVER("CMD: Command length exceeds batch length: 0x%08X
length=%d batchlen=%ld\n",

The ptrdiff_t type has its own modifier: 't'.

Cc: Brad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 351e3db2 18-Feb-2014 Brad Volkin <bradley.d.volkin@intel.com>

drm/i915: Implement command buffer parsing logic

The command parser scans batch buffers submitted via execbuffer ioctls before
the driver submits them to hardware. At a high level, it looks for several
things:

1) Commands which are explicitly defined as privileged or which should only be
used by the kernel driver. The parser generally rejects such commands, with
the provision that it may allow some from the drm master process.
2) Commands which access registers. To support correct/enhanced userspace
functionality, particularly certain OpenGL extensions, the parser provides a
whitelist of registers which userspace may safely access (for both normal and
drm master processes).
3) Commands which access privileged memory (i.e. GGTT, HWS page, etc). The
parser always rejects such commands.

See the overview comment in the source for more details.

This patch only implements the logic. Subsequent patches will build the tables
that drive the parser.

v2: Don't set the secure bit if the parser succeeds
Fail harder during init
Makefile cleanup
Kerneldoc cleanup
Clarify module param description
Convert ints to bools in a few places
Move client/subclient defs to i915_reg.h
Remove the bits_count field

OTC-Tracker: AXIA-4631
Change-Id: I50b98c71c6655893291c78a2d1b8954577b37a30
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Appease checkpatch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>