History log of /linux-master/drivers/gpu/drm/i915/i915_trace.h
Revision Date Author Comments
# d3f23ab9 20-Jul-2023 Andrzej Hajda <andrzej.hajda@intel.com>

drm/i915: use direct alias for i915 in requests

i915_request contains direct alias to i915, there is no point to go via
rq->engine->i915.

v2: added missing rq.i915 initialization in measure_breadcrumb_dw.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230720113002.1541572-1-andrzej.hajda@intel.com


# 801543b2 09-Nov-2022 Jani Nikula <jani.nikula@intel.com>

drm/i915: stop including i915_irq.h from i915_trace.h

Turns out many of the files that need i915_reg.h get it implicitly via
{display/intel_de.h, gt/intel_context.h} -> i915_trace.h -> i915_irq.h
-> i915_reg.h. Since i915_trace.h doesn't actually need i915_irq.h,
makes sense to drop it, but that requires adding quite a few new
includes all over the place.

Prefer including i915_reg.h where needed instead of adding another
implicit include, because eventually we'll want to split up i915_reg.h
and only include the specific registers at each place.

Also some places actually needed i915_irq.h too.

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/6e78a2e0ac1bffaf5af3b5ccc21dff05e6518cef.1668008071.git.jani.nikula@intel.com


# e55427b4 11-Oct-2022 Andi Shyti <andi.shyti@linux.intel.com>

drm/i915/trace: Remove unused frequency trace

Commit 3e7abf814193 ("drm/i915: Extract GT render power state management")
removes the "trace_intel_gpu_freq_change()" trace points but
their definition was left without users. Remove it.

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221011135940.367048-1-andi.shyti@linux.intel.com


# fd2b94a5 08-Dec-2021 Jani Nikula <jani.nikula@intel.com>

drm/i915/trace: split out display trace to a separate file

Add display/intel_display_trace.[ch] for defining display
tracepoints. The main goal is to reduce cross-includes between gem and
display. It would be possible split up tracing even further, but that
would lead to more boilerplate.

We end up having to include intel_crtc.h in a few places because it was
pulled in implicitly via intel_de.h -> i915_trace.h -> intel_crtc.h, and
that's no longer the case.

There should be no changes to tracepoints.

v3:
- Rebase

v2:
- Define TRACE_INCLUDE_PATH relative to define_trace.h (Chris)
- Remove useless comments (Ville)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/7862ad764fbd0748d903c76bc632d3d277874e5b.1638961423.git.jani.nikula@intel.com


# 4bb71337 08-Dec-2021 Jani Nikula <jani.nikula@intel.com>

drm/i915/trace: clean up boilerplate organization

Follow the style that seems to be prevalent in kernel for undef and
define of TRACE_SYSTEM, TRACE_INCLUDE_PATH, and TRACE_INCLUDE_FILE.

There should be no changes to tracepoints.

v2: Keep TRACE_INCLUDE_PATH relative to define_trace.h (Chris)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/0d37790ee70fb60be6f6a73d8bde2013510a7ad8.1638961423.git.jani.nikula@intel.com


# 004f80f9 24-Nov-2021 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915/fbc: Track FBC usage per-plane

In the future we may have multiple planes on the same pipe
capable of using FBC. Prepare for that by tracking FBC usage
per-plane rather than per-crtc.

v2: s/intel_get_crtc_for_pipe/intel_crtc_for_pipe/

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211124113652.22090-9-ville.syrjala@linux.intel.com
Reviewed-by: Mika Kahola <mika.kahola@intel.com>


# 7794b6de 01-Dec-2021 Jani Nikula <jani.nikula@intel.com>

drm/i915/crtc: rename intel_get_crtc_for_pipe() to intel_crtc_for_pipe()

The "get" in the name implies reference counting, remove it. This also
makes the function conform to naming style.

Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/6105d0ff44efac3c999af6382e4b0729e251f1e1.1638366969.git.jani.nikula@intel.com


# 2bbc6fca 20-Oct-2021 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Use vblank workers for gamma updates

The pipe gamma registers are single buffered so they should only
be updated during the vblank to avoid screen tearing. In fact they
really should only be updated between start of vblank and frame
start because that is the only time the pipe is guaranteed to be
empty. Already at frame start the pipe begins to fill up with
data for the next frame.

Unfortunately frame start happens ~1 scanline after the start
of vblank which in practice doesn't always leave us enough time to
finish the gamma update in time (gamma LUTs can be several KiB of
data we have to bash into the registers). However we must try our
best and so we'll add a vblank work for each pipe from where we
can do the gamma update. Additionally we could consider pushing
frame start forward to the max of ~4 scanlines after start of
vblank. But not sure that's exactly a validated configuration.
As it stands the ~100 first pixels tend to make it through with
the old gamma values.

Even though the vblank worker is running on a high prority thread
we still have to contend with C-states. If the CPU happens be in
a deep C-state when the vblank interrupt arrives even the irq
handler gets delayed massively (I've observed dozens of scanlines
worth of latency). To avoid that problem we'll use the qos mechanism
to keep the CPU awake while the vblank work is scheduled.

With all this hooked up we can finally enjoy near atomic gamma
updates. It even works across several pipes from the same atomic
commit which previously was a total fail because we did the
gamma updates for each pipe serially after waiting for all
pipes to have latched the double buffered registers.

In the future the DSB should take over this responsibility
which will hopefully avoid some of these issues.

Kudos to Lyude for finishing the actual vblank workers.
Works like the proverbial train toilet.

v2: Add missing intel_atomic_state fwd declaration
v3: Clean up properly when not scheduling the worker
v4: Clean up the rest and add tracepoints
v5: s/intel_wait_for_vblank_works/intel_wait_for_vblank_workers/ (Jani,Uma)

CC: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211020223339.669-4-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>


# 8ac80733 18-Oct-2021 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Split update_plane() into update_noarm() + update_arm()

The amount of plane registers we have to write has been steadily
increasing, putting more pressure on the vblank evasion mechanism
and forcing us to increase its time budget. Let's try to take some
of the pressure off by splitting plane updates into two parts:
1) write all non-self arming plane registers, ie. the registers
where the write actually does nothing until a separate arming
register is also written which will cause the hardware to latch
the new register values at the next start of vblank
2) write all self arming plane registers, ie. registers which always
just latch at the next start of vblank, and registers which also
arm other registers to do so

Here we just provide the mechanism, but don't actually implement
the split on any platform yet. so everything stays now in the _arm()
hooks. Subsequently we can move a whole bunch of stuff into the
_noarm() part, especially in more modern platforms where the number
of registers we have to write is also the greatest. On older
platforms this is less beneficial probably, but no real reason
to deviate from a common behaviour.

And let's sprinkle some TODOs around the areas that will need
adapting.

Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018115030.3547-5-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>


# 64512a66 26-Oct-2021 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Revert 'guc_id' from i915_request tracepoint

Avoid adding backend specific data to the tracepoints outside of
the LOW_LEVEL_TRACEPOINTS kernel config protection. These bits of
information are bound to change depending on the selected submission
method per platform and are not necessarily possible to maintain in
the future.

Fixes: dbf9da8d55ef ("drm/i915/guc: Add trace point for GuC submit")
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211027093255.66489-1-joonas.lahtinen@linux.intel.com


# ab0f0c79 26-Oct-2021 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Revert 'guc_id' from i915_request tracepoint

Avoid adding backend specific data to the tracepoints outside of
the LOW_LEVEL_TRACEPOINTS kernel config protection. These bits of
information are bound to change depending on the selected submission
method per platform and are not necessarily possible to maintain in
the future.

Fixes: dbf9da8d55ef ("drm/i915/guc: Add trace point for GuC submit")
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211027093255.66489-1-joonas.lahtinen@linux.intel.com
(cherry picked from commit 64512a66b67e6546e2db15192b3603cd6d58b75c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# af5bc9f2 09-Sep-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915/guc: Drop guc_active move everything into guc_state

Now that we have locking hierarchy of sched_engine->lock ->
ce->guc_state everything from guc_active can be moved into guc_state and
protected the guc_state.lock.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-23-matthew.brost@intel.com


# 3cb3e343 09-Sep-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915/guc: Move fields protected by guc->contexts_lock into sub structure

To make ownership of locking clear move fields (guc_id, guc_id_ref,
guc_id_link) to sub structure guc_id in intel_context.

Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-22-matthew.brost@intel.com


# 9798b172 09-Sep-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915/guc: Move GuC priority fields in context under guc_active

Move GuC management fields in context under guc_active struct as this is
where the lock that protects theses fields lives. Also only set guc_prio
field once during context init.

v2:
(Daniele)
- set CONTEXT_SET_INIT

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-21-matthew.brost@intel.com


# 0f797650 09-Sep-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915/guc: Rework and simplify locking

Rework and simplify the locking with GuC subission. Drop
sched_state_no_lock and move all fields under the guc_state.sched_state
and protect all these fields with guc_state.lock . This requires
changing the locking hierarchy from guc_state.lock -> sched_engine.lock
to sched_engine.lock -> guc_state.lock.

v2:
(Daniele)
- Don't check fields outside of lock during sched disable, check less
fields within lock as some of the outside are no longer needed

Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-18-matthew.brost@intel.com


# 9ec8795e 02-Sep-2021 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Drop __rcu from gem_context->vm

It's been invariant since

commit ccbc1b97948ab671335e950271e39766729736c3
Author: Jason Ekstrand <jason@jlekstrand.net>
Date: Thu Jul 8 10:48:30 2021 -0500

drm/i915/gem: Don't allow changing the VM on running contexts (v4)

this just completes the deed. I've tried to split out prep work for
more careful review as much as possible, this is what's left:

- get_ppgtt gets simplified since we don't need to grab a temporary
reference - we can rely on the temporary reference for the gem_ctx
while we inspect the vm. The new vm_id still needs a full
i915_vm_open ofc. This also removes the final caller of context_get_vm_rcu

- A pile of selftests can now just look at ctx->vm instead of
rcu_dereference_protected( , true) or similar things.

- All callers of i915_gem_context_vm also disappear.

- I've changed the hugepage selftest to set scrub_64K without any
locking, because when we inspect that setting we're also not taking
any locks either. It works because it's a selftests that's careful
(single threaded gives you nice ordering) and not a live driver
where races can happen from anywhere.

These can only be split up further if we have some intermediate state
with a bunch more rcu_dereference_protected(ctx->vm, true), just to
shut up lockdep and sparse.

The conversion to __rcu happened in

commit a4e7ccdac38ec8335d9e4e2656c1a041c77feae1
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Oct 4 14:40:09 2019 +0100

drm/i915: Move context management under GEM

Note that we're not breaking the actual bugfix in there: The real
bugfix is pushing the i915_vm_relase onto a separate worker, to avoid
locking inversion issues. The rcu conversion was just thrown in for
entertainment value on top (no vm lookup isn't even close to anything
that's a hotpath where removing the single spinlock can be measured).

v2: Rebase over the change to move the i915_vm_put() into
i915_gem_context_release().

v3: Trivial conflict against repainted shed.

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-9-daniel.vetter@ffwll.ch


# 9a4aa3a2 26-Oct-2021 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915: Revert 'guc_id' from i915_request tracepoint

Avoid adding backend specific data to the tracepoints outside of
the LOW_LEVEL_TRACEPOINTS kernel config protection. These bits of
information are bound to change depending on the selected submission
method per platform and are not necessarily possible to maintain in
the future.

Fixes: dbf9da8d55ef ("drm/i915/guc: Add trace point for GuC submit")
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211027093255.66489-1-joonas.lahtinen@linux.intel.com
(cherry picked from commit 64512a66b67e6546e2db15192b3603cd6d58b75c)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# ee242ca7 26-Jul-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915/guc: Implement GuC priority management

Implement a simple static mapping algorithm of the i915 priority levels
(int, -1k to 1k exposed to user) to the 4 GuC levels. Mapping is as
follows:

i915 level < 0 -> GuC low level (3)
i915 level == 0 -> GuC normal level (2)
i915 level < INT_MAX -> GuC high level (1)
i915 level == INT_MAX -> GuC highest level (0)

We believe this mapping should cover the UMD use cases (3 distinct user
levels + 1 kernel level).

In addition to static mapping, a simple counter system is attached to
each context tracking the number of requests inflight on the context at
each level. This is needed as the GuC levels are per context while in
the i915 levels are per request.

v2:
(Daniele)
- Add BUILD_BUG_ON to enforce ordering of priority levels
- Add missing lockdep to guc_prio_fini
- Check for return before setting context registered flag
- Map DISPLAY priority or higher to highest guc prio
- Update comment for guc_prio

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-33-matthew.brost@intel.com


# ae8ac10d 26-Jul-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915/guc: Implement banned contexts for GuC submission

When using GuC submission, if a context gets banned disable scheduling
and mark all inflight requests as complete.

Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-25-matthew.brost@intel.com


# 1e0fd2b5 26-Jul-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915/guc: Handle context reset notification

GuC will issue a reset on detecting an engine hang and will notify
the driver via a G2H message. The driver will service the notification
by resetting the guilty context to a simple state or banning it
completely.

v2:
(John Harrison)
- Move msg[0] lookup after length check
v3:
(John Harrison)
- s/drm_dbg/drm_err

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-13-matthew.brost@intel.com


# e03b5906 21-Jul-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915: Add intel_context tracing

Add intel_context tracing. These trace points are particular helpful
when debugging the GuC firmware and can be enabled via
CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS kernel config option.

Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-19-matthew.brost@intel.com


# dbf9da8d 21-Jul-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915/guc: Add trace point for GuC submit

Add trace point for GuC submit. Extended existing request trace points
to include submit fence value,, guc_id, and ring tail value.

v2: Fix white space alignment in i915_request_add trace point
v3: Delete dep_from , dep_to (Tvrtko)

Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-18-matthew.brost@intel.com


# 1a86ac79 13-Apr-2021 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Add frontbuffer tracking tracepoints

Add some tracpoints for frontbuffer tracking so we can
try to figure out what's going on.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210414022309.30898-2-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>


# 7c53e628 27-Apr-2021 Jani Nikula <jani.nikula@intel.com>

drm/i915/display: move crtc and dpll declarations where they belong

The definitions are in the crtc and dpll files; move the declarations to
the corresponding headers.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210427120315.12342-1-jani.nikula@intel.com


# 5a833995 02-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Drop i915_request.i915 backpointer

We infrequently use the direct i915 backpointer from the i915_request,
so do we really need to waste the space in the struct for it? 8 bytes
from the most frequently allocated struct vs an 3 bytes and pointer
chasing in using rq->engine->i915?

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200602220953.21178-1-chris@chris-wilson.co.uk


# 0543fbf4 28-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/trace: i915_request.prio is a signed value

Don't confuse the poor developer by writing a negative value as a very
large positive, as the flow of requests is already complex enough.

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200128151647.3820659-1-chris@chris-wilson.co.uk


# d54151c5 13-Dec-2019 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915/fbc: Add fbc tracepoints

Add tracepoints which let us know when fbc activates/deactivates/nukes.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191213133453.22152-5-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>


# f7f1538c 13-Dec-2019 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Rename pipe update tracepoints

All the other display related tracepoints use intel_ instead
if i915_ as the prefix. Do the same for the pipe update
tracepoints so I don't always have to spend time looking for
them.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191213133453.22152-6-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# a4e7ccda 04-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move context management under GEM

Keep track of the GEM contexts underneath i915->gem.contexts and assign
them their own lock for the purposes of list management.

v2: Focus on lock tracking; ctx->vm is protected by ctx->mutex
v3: Correct split with removal of logical HW ID

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-15-chris@chris-wilson.co.uk


# 2935ed53 04-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove logical HW ID

With the introduction of ctx->engines[] we allow multiple logical
contexts to be used on the same engine (e.g. with virtual engines).
According to bspec, aach logical context requires a unique tag in order
for context-switching to occur correctly between them. [Simple
experiments show that it is not so easy to trick the HW into performing
a lite-restore with matching logical IDs, though my memory from early
Broadwell experiments do suggest that it should be generating
lite-restores.]

We only need to keep a unique tag for the active lifetime of the
context, and for as long as we need to identify that context. The HW
uses the tag to determine if it should use a lite-restore (why not the
LRCA?) and passes the tag back for various status identifies. The only
status we need to track is for OA, so when using perf, we assign the
specific context a unique tag.

v2: Calculate required number of tags to fill ELSP.

Fixes: 976b55f0e1db ("drm/i915: Allow a context to define its set of engines")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111895
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-14-chris@chris-wilson.co.uk


# 1d455f8d 06-Aug-2019 Jani Nikula <jani.nikula@intel.com>

drm/i915: rename intel_drv.h to display/intel_display_types.h

Everything about the file is about display, and mostly about types
related to display. Move under display/ as intel_display_types.h to
reflect the facts.

There's still plenty to clean up, but start off with moving the file
where it logically belongs and naming according to contents.

v2: fix the include guard name in the renamed file

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190806113933.11799-1-jani.nikula@intel.com


# 750e76b4 06-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Move the [class][inst] lookup for engines onto the GT

To maintain a fast lookup from a GT centric irq handler, we want the
engine lookup tables on the intel_gt. To avoid having multiple copies of
the same multi-dimension lookup table, move the generic user engine
lookup into an rbtree (for fast and flexible indexing).

v2: Split uabi_instance cf uabi_class
v3: Set uabi_class/uabi_instance after collating all engines to provide a
stable uabi across parallel unordered construction.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2
Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-2-chris@chris-wilson.co.uk


# 7d3cd662 19-Jun-2019 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Fix various tracepoints for gen2

Gen2 doesn't have a frame counter and apparently we no longer provide
a fake .get_vblank_counter() hook for it. That means all tracepoints
calling that hook will oops. Update the tracepoints to use
intel_crtc_get_vblank_counter() which will gracefully fall back to
using the software counter. This is actually a better approach since
we now get (hopefully accurate) frame numbers in the traces.

This also gets rid of the raw driver->get_vblank_counter() calls, which
we need to do in order to switch to the per-crtc vblank vfuncs.

v2: Deal with new tracepoints
v3: Use a distinct variable name for the internal crtc iterator (Chris)

Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Fixes: 967dd4841787 ("drm: remove drm_vblank_no_hw_counter assignment from driver code")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190619170842.20579-2-ville.syrjala@linux.intel.com
(cherry picked from commit 4c888e7bd26f58deb27c2e6ddc90000b89ee9393)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 86c9640b 10-Jul-2019 Steven Rostedt (VMware) <rostedt@goodmis.org>

drm/i915: Copy name string into ring buffer for intel_update/disable_plane tracepoints

Currently the intel_update_plane and intel_disable_plane tracepoints record
the address of plane->name in the ring buffer, and then when reading the
ring buffer uses %s to get the name. The issue with this, is that those two
events can be minutes, hours or even days apart. It is very dangerous to
dereference a string pointer without knowing if it still exists or not.

The proper way to handle this is to use the __string() macro in the
tracepoint which will save the string into the ring buffer at the time of
recording. Then there's no worries if the original string still exists in
memory when the ring buffer is read.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
[vsyrjala: Rebase on top of drm-tip]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190710171230.7471-1-ville.syrjala@linux.intel.com


# 4c888e7b 19-Jun-2019 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Fix various tracepoints for gen2

Gen2 doesn't have a frame counter and apparently we no longer provide
a fake .get_vblank_counter() hook for it. That means all tracepoints
calling that hook will oops. Update the tracepoints to use
intel_crtc_get_vblank_counter() which will gracefully fall back to
using the software counter. This is actually a better approach since
we now get (hopefully accurate) frame numbers in the traces.

This also gets rid of the raw driver->get_vblank_counter() calls, which
we need to do in order to switch to the per-crtc vblank vfuncs.

v2: Deal with new tracepoints
v3: Use a distinct variable name for the internal crtc iterator (Chris)

Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Fixes: 967dd4841787 ("drm: remove drm_vblank_no_hw_counter assignment from driver code")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190619170842.20579-2-ville.syrjala@linux.intel.com


# 2f530945 18-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Stop passing I915_WAIT_LOCKED to i915_request_wait()

Since commit eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on
struct_mutex"), the I915_WAIT_LOCKED flags passed to i915_request_wait()
has been defunct. Now go ahead and remove it from all callers.

References: eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190618074153.16055-3-chris@chris-wilson.co.uk


# e568ac38 11-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pull kref into i915_address_space

Make the kref common to both derived structs (i915_ggtt and i915_ppgtt)
so that we can safely reference count an abstract ctx->vm address space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190611091238.15808-1-chris@chris-wilson.co.uk


# 440e2b3d 29-Apr-2019 Jani Nikula <jani.nikula@intel.com>

drm/i915: extract i915_irq.h from intel_drv.h and i915_drv.h

It used to be handy that we only had a couple of headers, but over time
intel_drv.h has become unwieldy. Extract declarations to a separate
header file corresponding to the implementation module, clarifying the
modularity of the driver.

Ensure the new header is self-contained, and do so with minimal further
includes, using forward declarations as needed. Include the new header
only where needed, and sort the modified include directives while at it
and as needed.

No functional changes.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/64e46278dc8dccc9c548ef453cb2ceece5367bb2.1556540890.git.jani.nikula@intel.com


# 112ed2d3 24-Apr-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move GraphicsTechnology files under gt/

Start partitioning off the code that talks to the hardware (GT) from the
uapi layers and move the device facing code under gt/

One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
subdivide that header and body further (and split out the submission
code from the ringbuffer and logical context handling). This patch aims
to be simple motion so git can fixup inflight patches with little mess.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk


# b300fde8 26-Feb-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove i915_request.global_seqno

Having weaned the interrupt handling off using a single global execution
queue, we no longer need to emit a global_seqno. Note that we still have
a few assumptions about execution order along engine timelines, but this
removes the most obvious artefact!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190226094922.31617-3-chris@chris-wilson.co.uk


# 0b2599a4 06-Feb-2019 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Add pipe enable/disable tracepoints

Add tracepoints for pipe enable/disable. We'll include the
frame/scanline counters for all pipes in these tracepoints to
help in diagnosing underruns and whatnot when enabling/disabling
pipes in parallel with plane updates/flips on another pipe.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190206204910.13965-2-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 5cee6c45 06-Feb-2019 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Add pipe crc tracepoint

Add a tracepoint for pipe crc. Makes life much simpler when staring at
traces when hunting for fifo underruns and other issues which cause
corrupted frames. We'll add the tracepoint before filtering out any
potentially bogus crcs during modeset (should actually verify if that
filtering is even correct anymore...)

v2: s/crcs[5]/*crcs/ in the function argument because something
in the macros wants to do sizeof(crcs) and gcc likes to
warn us it's not an actual array so the size may not be
as expected. The silly bugger even does that for 'crcs[]'
causing us to lose any helpful syntactic hint that we
are in fact dealing with an array (kbuild test robot)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190206204910.13965-1-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 3df0bd19 29-Jan-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove the intel_engine_notify tracepoint

The global seqno is defunct and so we have no meaningful indicator of
forward progress for an engine. You need to listen to the request
signaling tracepoints instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-1-chris@chris-wilson.co.uk


# fcd70cd3 17-Jan-2019 Daniel Vetter <daniel.vetter@ffwll.ch>

drm: Split out drm_probe_helper.h

Having the probe helper stuff (which pretty much everyone needs) in
the drm_crtc_helper.h file (which atomic drivers should never need) is
confusing. Split them out.

To make sure I actually achieved the goal here I went through all
drivers. And indeed, all atomic drivers are now free of
drm_crtc_helper.h includes.

v2: Make it compile. There was so much compile fail on arm drivers
that I figured I'll better not include any of the acks on v1.

v3: Massive rebase because i915 has lost a lot of drmP.h includes, but
not all: Through drm_crtc_helper.h > drm_modeset_helper.h -> drmP.h
there was still one, which this patch largely removes. Which means
rolling out lots more includes all over.

This will also conflict with ongoing drmP.h cleanup by others I
expect.

v3: Rebase on top of atomic bochs.

v4: Review from Laurent for bridge/rcar/omap/shmob/core bits:
- (re)move some of the added includes, use the better include files in
other places (all suggested from Laurent adopted unchanged).
- sort alphabetically

v5: Actually try to sort them, and while at it, sort all the ones I
touch.

v6: Rebase onto i915 changes.

v7: Rebase once more.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Acked-by: CK Hu <ck.hu@mediatek.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: virtualization@lists.linux-foundation.org
Cc: etnaviv@lists.freedesktop.org
Cc: linux-samsung-soc@vger.kernel.org
Cc: intel-gfx@lists.freedesktop.org
Cc: linux-mediatek@lists.infradead.org
Cc: linux-amlogic@lists.infradead.org
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: spice-devel@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
Cc: linux-renesas-soc@vger.kernel.org
Cc: linux-rockchip@lists.infradead.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Cc: linux-tegra@vger.kernel.org
Cc: xen-devel@lists.xen.org
Link: https://patchwork.freedesktop.org/patch/msgid/20190117210334.13234-1-daniel.vetter@ffwll.ch


# 2f80d7bd 08-Jan-2019 Jani Nikula <jani.nikula@intel.com>

drm/i915: drop all drmP.h includes

Needs just a few additional includes here and there.

Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190108082709.3748-1-jani.nikula@intel.com


# 6faf5916 28-Dec-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove HW semaphores for gen7 inter-engine synchronisation

The writing is on the wall for the existence of a single execution queue
along each engine, and as a consequence we will not be able to track
dependencies along the HW queue itself, i.e. we will not be able to use
HW semaphores on gen7 as they use a global set of registers (and unlike
gen8+ we can not effectively target memory to keep per-context seqno and
dependencies).

On the positive side, when we implement request reordering for gen7 we
also can not presume a simple execution queue and would also require
removing the current semaphore generation code. So this bring us another
step closer to request reordering for ringbuffer submission!

The negative side is that using interrupts to drive inter-engine
synchronisation is much slower (4us -> 15us to do a nop on each of the 3
engines on ivb). This is much better than it was at the time of introducing
the HW semaphores and equally important userspace weaned itself off
intermixing dependent BLT/RENDER operations (the prime culprit was glyph
rendering in UXA). So while we regress the microbenchmarks, it should not
impact the user.

References: https://bugs.freedesktop.org/show_bug.cgi?id=108888
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181228140736.32606-2-chris@chris-wilson.co.uk


# b3ee09a4 10-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/ringbuffer: Fix context restore upon reset

The discovery with trying to enable full-ppgtt was that we were
completely failing to the load both the mm and context following the
reset. Although we were performing mmio to set the PP_DIR (per-process
GTT) and CCID (context), these were taking no effect (the assumption was
that this would trigger reload of the context and restore the page
tables). It was not until we performed the LRI + MI_SET_CONTEXT in a
following context switch would anything occur.

Since we are then required to reset the context image and PP_DIR using
CS commands, we place those commands into every batch. The hardware
should recognise the no-ops and eliminate the expensive context loads,
but we still have to pay the cost of using cross-powerwell register
writes. In practice, this has no effect on actual context switch times,
and only adds a few hundred nanoseconds to no-op switches. We can improve
the latter by eliminating the w/a around known no-op switches, but there
is an ulterior motive to keeping them.

Always emitting the context switch at the beginning of the request (and
relying on HW to skip unneeded switches) does have one key advantage.
Should we implement request reordering on Haswell, we will not know in
advance what the previous executing context was on the GPU and so we
would not be able to elide the MI_SET_CONTEXT commands ourselves and
always have to emit them. Having our hand forced now actually prepares
us for later.

Now since that context and mm follow the request, we no longer (and not
for a long time since requests took over!) require a trace point to tell
when we write the switch into the ring, since it is always. (This is
even more important when you remember that simply writing into the ring
bears no relation to the current mm.)

v2: Sandybridge has to agree to use LRI as well.

Testcase: igt/drv_selftests/live_hangcheck
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180611110845.31890-1-chris@chris-wilson.co.uk


# 82ad6443 05-Jun-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gtt: Rename i915_hw_ppgtt base member

In the near future, I want to subclass gen6_hw_ppgtt as it contains a
few specialised members and I wish to add more. To avoid the ugliness of
using ppgtt->base.base, rename the i915_hw_ppgtt base member
(i915_address_space) as vm, which is our common shorthand for an
i915_address_space local.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180605153758.18422-1-chris@chris-wilson.co.uk


# 57d7116c 05-Jun-2018 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915/trace: Context field needs to be 64-bit wide

Underlaying field is u64 so the tracepoint needs to be as well.

v2:
* Re-order binary packet for 64-bit alignment. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180605134124.25672-1-tvrtko.ursulin@linux.intel.com


# f24e74a7 25-May-2018 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915/trace: Remove engine out of the context sandwich

In the string tracepoint representation we ended up with the engine
sandwiched between context hardware id and context fence id.

Move the two pieces of context data together for redability.

Binary records are left as is, that is both fields remaing under the
existing name and ordering.

v2:
* Do not consolidate the printk format, just reorder. (Lionel)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180525082642.18246-2-tvrtko.ursulin@linux.intel.com


# 2956e970 25-May-2018 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915/trace: Describe engines as class:instance pairs

Instead of using the engine->id, use uabi_class:instance pairs in trace-
points including engine info.

This will be more readable, more future proof and more stable for
userspace consumption.

v2:
* Use u16 for class and instance. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: svetlana.kukanova@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180525082642.18246-1-tvrtko.ursulin@linux.intel.com


# 4e0d64db 17-May-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move request->ctx aside

In the next patch, we want to store the intel_context pointer inside
i915_request, as it is frequently access via a convoluted dance when
submitting the request to hw. Having two context pointers inside
i915_request leads to confusion so first rename the existing
i915_gem_context pointer to i915_request.gem_context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180517212633.24934-1-chris@chris-wilson.co.uk


# a33f084c 08-May-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove unused i915_flip tracepoints

The i915_flip* tracepoints are no longer in use since the removal of CS
flip in commit 8b5d27b911d7 ("drm/i915: Remove intel_flip_work
infrastructure")

References: 8b5d27b911d7 ("drm/i915: Remove intel_flip_work infrastructure")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180508151552.31024-1-chris@chris-wilson.co.uk


# f2742e47 03-May-2018 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Include priority and completed status in request in/out tracepoints

It is useful to see the priority as requests are coming in and completed
status as requests are coming out of the GPU.

To achieve this in a more readable way we need to abandon the common
request_hw tracepoint class.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180504115643.22437-1-tvrtko.ursulin@linux.intel.com


# e61e0f51 21-Feb-2018 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rename drm_i915_gem_request to i915_request

We want to de-emphasize the link between the request (dependency,
execution and fence tracking) from GEM and so rename the struct from
drm_i915_gem_request to i915_request. That is we may implement the GEM
user interface on top of requests, but they are an abstraction for
tracking execution rather than an implementation detail of GEM. (Since
they are not tied to HW, we keep the i915 prefix as opposed to intel.)

In short, the spatch:
@@

@@
- struct drm_i915_gem_request
+ struct i915_request

A corollary to contracting the type name, we also harmonise on using
'rq' shorthand for local variables where space if of the essence and
repetition makes 'request' unwieldy. For globals and struct members,
'request' is still much preferred for its clarity.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180221095636.6649-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 151a99ec 18-Dec-2017 Lionel Landwerlin <lionel.g.landwerlin@intel.com>

drm/i915/trace: add hw_id to gem requests trace points

When monitoring the GPU with i915 perf, reports are tagged with a hw
id. Gem context creation tracepoints already have a hw_id field,
unfortunately you only get this correlation between a process id and a
hw context id once when the context is created. It doesn't help if you
started monitoring after the process was initialized or if the drm fd
was transfered from one process to another.

This change adds the hw_id field to gem requests, so that correlation
can also be done on submission.

v2: Place hw_id at the end of the tracepoint to not disrupt too much
existing tools (Chris)

v3: Reorder hw_id field again (Chris)

v4: Add missing hw_id to i915_gem_request_wait_begin tracepoint (Chris)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171218151959.14073-3-lionel.g.landwerlin@intel.com


# 3c2d0671 18-Dec-2017 Lionel Landwerlin <lionel.g.landwerlin@intel.com>

drm/i915: reorder field in gem_request tracepoints

Let's make the order of the fields of the tracepoints involving gem
request match across i915. This makes userspace processing of
tracepoint a bit easier.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171218151959.14073-2-lionel.g.landwerlin@intel.com


# b2441318 01-Nov-2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.

For non */uapi/* files that summary was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139

and resulted in the first patch in this series.

If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930

and resulted in the second patch in this series.

- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:

SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1

and that resulted in the third patch in this series.

- when the two scanners agreed on the detected license(s), that became
the concluded license(s).

- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.

- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).

- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.

- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 65921223 03-Oct-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove defunct trace points

trace_i915_gem_evict_everything and trace_i915_gem_ring_flush stopped
being used when their parent functions were removed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003125055.11370-2-chris@chris-wilson.co.uk


# 6c1fa341 03-Oct-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Fix some tracepoints to capture full 64b

The tracepoints need some tlc, in particular we've neglected to update
them for the 64b era.

v2: Prefix hexadecimal output with 0x.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003125055.11370-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# 4e6d7719 01-Sep-2017 Thierry Reding <treding@nvidia.com>

drm/i915: Use correct path to trace include

The header comment in include/trace/define_trace.h specifies that the
TRACE_INCLUDE_PATH needs to be relative to the define_trace.h header
rather than the trace file including it. Most instances get that wrong
and work around it by adding the $(src) directory to the include path.

While this works, it is preferable to refer to the correct path to the
trace file in the first place and avoid any workaround.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20170901144954.19620-4-thierry.reding@gmail.com


# 034263a3 01-Sep-2017 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Fix enum pipe vs. enum transcoder for the PCH transcoder

Use enum pipe for PCH transcoders also in the FIFO underrun code.

Fixes the following new sparse warnings:
intel_fifo_underrun.c:340:49: warning: mixing different enum types
intel_fifo_underrun.c:340:49: int enum pipe versus
intel_fifo_underrun.c:340:49: int enum transcoder
intel_fifo_underrun.c:344:49: warning: mixing different enum types
intel_fifo_underrun.c:344:49: int enum pipe versus
intel_fifo_underrun.c:344:49: int enum transcoder
intel_fifo_underrun.c:397:57: warning: mixing different enum types
intel_fifo_underrun.c:397:57: int enum pipe versus
intel_fifo_underrun.c:397:57: int enum transcoder
intel_fifo_underrun.c:398:17: warning: mixing different enum types
intel_fifo_underrun.c:398:17: int enum pipe versus
intel_fifo_underrun.c:398:17: int enum transcoder

Cc: Matthias Kaehlcke <mka@chromium.org>
Fixes: a21960339c8c ("drm/i915: Consistently use enum pipe for PCH transcoders")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170901143123.7590-3-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 41c32e5da3ff3922490341a988b2a3ae46d0b6a8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 41c32e5d 01-Sep-2017 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Fix enum pipe vs. enum transcoder for the PCH transcoder

Use enum pipe for PCH transcoders also in the FIFO underrun code.

Fixes the following new sparse warnings:
intel_fifo_underrun.c:340:49: warning: mixing different enum types
intel_fifo_underrun.c:340:49: int enum pipe versus
intel_fifo_underrun.c:340:49: int enum transcoder
intel_fifo_underrun.c:344:49: warning: mixing different enum types
intel_fifo_underrun.c:344:49: int enum pipe versus
intel_fifo_underrun.c:344:49: int enum transcoder
intel_fifo_underrun.c:397:57: warning: mixing different enum types
intel_fifo_underrun.c:397:57: int enum pipe versus
intel_fifo_underrun.c:397:57: int enum transcoder
intel_fifo_underrun.c:398:17: warning: mixing different enum types
intel_fifo_underrun.c:398:17: int enum pipe versus
intel_fifo_underrun.c:398:17: int enum transcoder

Cc: Matthias Kaehlcke <mka@chromium.org>
Fixes: a21960339c8c ("drm/i915: Consistently use enum pipe for PCH transcoders")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170901143123.7590-3-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# e93329a5 21-Apr-2017 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Add g4x watermark tracepoint

Add a tracepoint for watermark programming on g4x, similar to what we
have on vlv/chv. Should help in debugging watermark programming sequence
issues.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170421181432.15216-15-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>


# 60367132 16-Mar-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Avoid use-after-free of ctx in request tracepoints

trace_i915_gem_request_out may be used after the request is completed,
and so the request may have been retired on another thread, invalidating
the rq->ctx. Avoid dereferencing rq->ctx in the tracepoint by switching
to the fence context id instead, updating all tracepoints to match.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170316204235.27786-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# 53a7915c 02-Mar-2017 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Add FIFO underrun tracepoints

Add tracepoints for display FIFO underruns. Makes it more convenient to
correlate the underruns with other display tracepoints.

v2: s/i915/intel/ in the tracepoint name

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170302171508.1666-19-ville.syrjala@linux.intel.com


# 1489bba8 02-Mar-2017 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Add cxsr toggle tracepoint

Add a tracepoint for observing changes in the cxsr state. The tracepoint
will dump out the frame and scanline counters for each pipe so that the
information can be compared with eg. plane update tracepoints.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170302171508.1666-18-ville.syrjala@linux.intel.com


# c137d660 02-Mar-2017 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Add VLV/CHV watermark/FIFO programming tracepoints

Add tracepoints for observing the WM/FIFO programming on VLV/CHV. When
compared with the plane and pipe update tracepoints this can be used
to verify that everything is performed in the right sequence.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170302171508.1666-17-ville.syrjala@linux.intel.com


# 72259536 02-Mar-2017 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Add plane update/disable tracepoints

Add tracepoints for plane programming. The tracepoints will dump
the frame and scanline counters, so this can be used to verify eg. that
the plane gets reprogrammed at the right time with respect to watermark
programming (if we have appropriate tracepoints for that as well).

v2: Rebase due to legacy cursor changes

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170302171508.1666-16-ville.syrjala@linux.intel.com


# 208b84a3 22-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove change_domain tracepoint

The change_domain tracepoint has been inaccurate for a few years - it
doesn't fully capture the domains, especially with userspace bypassing
them. It is defunct, misleading and time to be removed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-1-chris@chris-wilson.co.uk


# 99c181a0 21-Feb-2017 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915/tracepoints: Add hw_id to context tracepoints

It is useful to provide this info to match the one provided
in the request tracepoints.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170221091350.14605-1-tvrtko.ursulin@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# d7d96833 21-Feb-2017 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915/tracepoints: Add backend level request in and out tracepoints

Two new tracepoints placed at the call sites where requests are
actually passed to the GPU enable userspace to track engine
utilisation.

These tracepoints are only enabled when the
DRM_I915_LOW_LEVEL_TRACEPOINTS Kconfig option is enabled.

v2: Fix compilation with !CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS.

v3: Name global seqno consistently across tracepoints.

v4: Remove port info from request out tracepoint. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# dffabc8f 21-Feb-2017 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915/tracepoints: Rename i915_gem_request_notify

i915_gem_ring_notify is more appropriate since we do not have
the request information at this point, but it is simply a
signal from the engine that some request has been completed.

v2:
* Always trace and log if there were any waiters.
* Rename to intel_engine_notify. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# 354d036f 21-Feb-2017 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915/tracepoints: Add request submit and execute tracepoints

These new tracepoints are emitted once the request is ready to
be submitted to the GPU and once the request is about to
be submitted to the GPU, respectively.

Former condition triggers as soon as all the fences and
dependencies have been resolved, and the latter once the
backend is about to submit it to the GPU.

New tracepoint are enabled via the new
DRM_I915_LOW_LEVEL_TRACEPOINTS Kconfig option which is disabled
by default to alleviate the performance impact concerns.

v2: Move execute tracepoint to __i915_gem_request_submit.
(Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# 90aa412d 21-Feb-2017 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915/tracepoints: Remove unused i915_gem_request_complete

Tracepoint is not used and won't be suitable for its replacement.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# 93692502 21-Feb-2017 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915/tracepoints: Tidy i915_gem_request_wait_begin

Provide the same information as the other request event classes.

v2: Pass in flags so we can properly report the blocking status.
(Chris Wilson)

v3: Log hex with 0x prefix for clarity.

v4: Derive blocking status from flags. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# 1cce8922 21-Feb-2017 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915/tracepoints: Adjust i915_gem_ring_dispatch

Rename it to i915_gem_request_queue and fix the logged info
equivalent to the i915_gem_request even class. Also moved it
a bit further apart from the i915_gem_request_add tracepoint
since they otherwise provide similar information too close in
time.

v2: Remove sw fence singalling. We will rely on the soon to
come GuC scheduling backend to enable that. (Chris Wilson)

v3: Log hex with 0x prefix for clarity.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# e235b530 21-Feb-2017 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915/tracepoints: Tidy request event class

At the moment only the global seqno is logged which is not set
until the request is ready for submission.

Add the per-contex seqno and the context hardware id which are
both interesting data points.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# 3dc523ea 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove defunct GTT tracepoints

The tracepoints are now entirely synonymous with binding and unbinding the
VMA (and the tracepoints there).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-18-chris@chris-wilson.co.uk


# dd19674b 15-Feb-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove bitmap tracking for used-ptes

We only operate on known extents (both for alloc/clear) and so we can use
both the knowledge of the bind/unbind range along with the knowledge of
the existing pagetable to avoid having to allocate temporary and
auxiliary bitmaps.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99295
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-12-chris@chris-wilson.co.uk


# 625d988a 11-Jan-2017 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Extract reserving space in the GTT to a helper

Extract drm_mm_reserve_node + calling i915_gem_evict_for_node into its
own routine so that it can be shared rather than duplicated.

v2: Kerneldoc

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: igvt-g-dev@lists.01.org
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-2-chris@chris-wilson.co.uk


# 172ae5b4 05-Dec-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Fix i915_gem_evict_for_vma (soft-pinning)

Soft-pinning depends upon being able to check for availabilty of an
interval and evict overlapping object from a drm_mm range manager very
quickly. Currently it uses a linear list, and so performance is dire and
not suitable as a general replacement. Worse, the current code will oops
if it tries to evict an active buffer.

It also helps if the routine reports the correct error codes as expected
by its callers and emits a tracepoint upon use.

For posterity since the wrong patch was pushed (i.e. that missed these
key points and had known bugs), this is the changelog that should have
been on commit 506a8e87d8d2 ("drm/i915: Add soft-pinning API for
execbuffer"):

Userspace can pass in an offset that it presumes the object is located
at. The kernel will then do its utmost to fit the object into that
location. The assumption is that userspace is handling its own object
locations (for example along with full-ppgtt) and that the kernel will
rarely have to make space for the user's requests.

This extends the DRM_IOCTL_I915_GEM_EXECBUFFER2 to do the following:
* if the user supplies a virtual address via the execobject->offset
*and* sets the EXEC_OBJECT_PINNED flag in execobject->flags, then
that object is placed at that offset in the address space selected
by the context specifier in execbuffer.
* the location must be aligned to the GTT page size, 4096 bytes
* as the object is placed exactly as specified, it may be used by this
execbuffer call without relocations pointing to it

It may fail to do so if:
* EINVAL is returned if the object does not have a 4096 byte aligned
address
* the object conflicts with another pinned object (either pinned by
hardware in that address space, e.g. scanouts in the aliasing ppgtt)
or within the same batch.
EBUSY is returned if the location is pinned by hardware
EINVAL is returned if the location is already in use by the batch
* EINVAL is returned if the object conflicts with its own alignment (as meets
the hardware requirements) or if the placement of the object does not fit
within the address space

All other execbuffer errors apply.

Presence of this execbuf extension may be queried by passing
I915_PARAM_HAS_EXEC_SOFTPIN to DRM_IOCTL_I915_GETPARAM and checking for
a reported value of 1 (or greater).

v2: Combine the hole/adjusted-hole ENOSPC checks
v3: More color, more splitting, more blurb.

Fixes: 506a8e87d8d2 ("drm/i915: Add soft-pinning API for execbuffer")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161205142941.21965-2-chris@chris-wilson.co.uk


# c6385c94 28-Nov-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Fix tracepoint compilation

drivers/gpu/drm/i915/./i915_trace.h: In function ‘trace_event_raw_event_i915_gem_evict’:
drivers/gpu/drm/i915/./i915_trace.h:409:24: error: ‘struct i915_address_space’ has no member named ‘dev’
__entry->dev = vm->dev->primary->index;

A couple of macros missed in the s/vm->dev/vm->i915/ conversion.

Fixes: 49d73912cbfc ("drm/i915: Convert vm->dev backpointer to vm->i915")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161129124205.19351-1-chris@chris-wilson.co.uk
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>


# 65e4760e 28-Oct-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Introduce a global_seqno for each request

Though we will have multiple timelines, we still have a single timeline
of execution. This we can use to provide an execution and retirement order
of requests. This keeps tracking execution of requests simple, and vital
for preserving a single waiter (i.e. so that we can order the waiters so
that only the earliest to wakeup need be woken). To accomplish this we
distinguish the seqno used to order requests per-context (external) and
that used internally for execution.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-26-chris@chris-wilson.co.uk


# f54d1867 25-Oct-2016 Chris Wilson <chris@chris-wilson.co.uk>

dma-buf: Rename struct fence to dma_fence

I plan to usurp the short name of struct fence for a core kernel struct,
and so I need to rename the specialised fence/timeline for DMA
operations to make room.

A consensus was reached in
https://lists.freedesktop.org/archives/dri-devel/2016-July/113083.html
that making clear this fence applies to DMA operations was a good thing.
Since then the patch has grown a bit as usage increases, so hopefully it
remains a good thing!

(v2...: rebase, rerun spatch)
v3: Compile on msm, spotted a manual fixup that I broke.
v4: Try again for msm, sorry Daniel

coccinelle script:
@@

@@
- struct fence
+ struct dma_fence
@@

@@
- struct fence_ops
+ struct dma_fence_ops
@@

@@
- struct fence_cb
+ struct dma_fence_cb
@@

@@
- struct fence_array
+ struct dma_fence_array
@@

@@
- enum fence_flag_bits
+ enum dma_fence_flag_bits
@@

@@
(
- fence_init
+ dma_fence_init
|
- fence_release
+ dma_fence_release
|
- fence_free
+ dma_fence_free
|
- fence_get
+ dma_fence_get
|
- fence_get_rcu
+ dma_fence_get_rcu
|
- fence_put
+ dma_fence_put
|
- fence_signal
+ dma_fence_signal
|
- fence_signal_locked
+ dma_fence_signal_locked
|
- fence_default_wait
+ dma_fence_default_wait
|
- fence_add_callback
+ dma_fence_add_callback
|
- fence_remove_callback
+ dma_fence_remove_callback
|
- fence_enable_sw_signaling
+ dma_fence_enable_sw_signaling
|
- fence_is_signaled_locked
+ dma_fence_is_signaled_locked
|
- fence_is_signaled
+ dma_fence_is_signaled
|
- fence_is_later
+ dma_fence_is_later
|
- fence_later
+ dma_fence_later
|
- fence_wait_timeout
+ dma_fence_wait_timeout
|
- fence_wait_any_timeout
+ dma_fence_wait_any_timeout
|
- fence_wait
+ dma_fence_wait
|
- fence_context_alloc
+ dma_fence_context_alloc
|
- fence_array_create
+ dma_fence_array_create
|
- to_fence_array
+ to_dma_fence_array
|
- fence_is_array
+ dma_fence_is_array
|
- trace_fence_emit
+ trace_dma_fence_emit
|
- FENCE_TRACE
+ DMA_FENCE_TRACE
|
- FENCE_WARN
+ DMA_FENCE_WARN
|
- FENCE_ERR
+ DMA_FENCE_ERR
)
(
...
)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161025120045.28839-1-chris@chris-wilson.co.uk


# e522ac23 04-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove surplus drm_device parameter to i915_gem_evict_something()

Eviction is VM local, so we can ignore the significance of the
drm_device in the caller, and leave it to i915_gem_evict_something() to
manage itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-2-git-send-email-chris@chris-wilson.co.uk


# 8e637178 02-Aug-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Simplify request_alloc by returning the allocated request

If is simpler and leads to more readable code through the callstack if
the allocation returns the allocated struct through the return value.

The importance of this is that it no longer looks like we accidentally
allocate requests as side-effect of calling certain functions.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-19-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-9-git-send-email-chris@chris-wilson.co.uk


# 04769652 20-Jul-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Derive GEM requests from dma-fence

dma-buf provides a generic fence class for interoperation between
drivers. Internally we use the request structure as a fence, and so with
only a little bit of interfacing we can rebase those requests on top of
dma-buf fences. This will allow us, in the future, to pass those fences
back to userspace or between drivers.

v2: The fence_context needs to be globally unique, not just unique to
this device.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1469002875-2335-4-git-send-email-chris@chris-wilson.co.uk


# 91c8a326 05-Jul-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Convert dev_priv->dev backpointers to dev_priv->drm

Since drm_i915_private is now a subclass of drm_device we do not need to
chase the drm_i915_private->dev backpointer and can instead simply
access drm_i915_private->drm directly.

text data bss dec hex filename
1068757 4565 416 1073738 10624a drivers/gpu/drm/i915/i915.ko
1066949 4565 416 1071930 105b3a drivers/gpu/drm/i915/i915.ko

Created by the coccinelle script:
@@
struct drm_i915_private *d;
identifier i;
@@
(
- d->dev->i
+ d->drm.i
|
- d->dev
+ &d->drm
)

and for good measure the dev_priv->dev backpointer was removed entirely.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-4-git-send-email-chris@chris-wilson.co.uk


# c81d4613 01-Jul-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Convert trace-irq to the breadcrumb waiter

If we convert the tracing over from direct use of ring->irq_get() and
over to the breadcrumb infrastructure, we only have a single user of the
ring->irq_get and so we will be able to simplify the driver routines
(eliminating the redundant validation and irq refcounting).

Process context is preferred over softirq (or even hardirq) for a couple
of reasons:

- we already utilize process context to have fast wakeup of a single
client (i.e. the client waiting for the GPU inspects the seqno for
itself following an interrupt to avoid the overhead of a context
switch before it returns to userspace)

- engine->irq_seqno() is not suitable for use from an softirq/hardirq
context as we may require long waits (100-250us) to ensure the seqno
write is posted before we read it from the CPU

A signaling framework is a requirement for enabling dma-fences.

v2: Move to a signaling framework based upon the waiter.
v3: Track the first-signal to avoid having to walk the rbtree everytime.
v4: Mark the signaler thread as RT priority to reduce latency in the
indirect wakeups.
v5: Make failure to allocate the thread fatal.
v6: Rename kthreads to i915/signal:%u

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-16-git-send-email-chris@chris-wilson.co.uk


# 1b7744e7 01-Jul-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use HWS for seqno tracking everywhere

By using the same address for storing the HWS on every platform, we can
remove the platform specific vfuncs and reduce the get-seqno routine to
a single read of a cached memory location.

v2: Fix semaphore_passed() to look at the signaling engine (not the
waiter's)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-8-git-send-email-chris@chris-wilson.co.uk


# e2efd130 24-May-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rename struct intel_context

Our goal is to rename the anonymous per-engine struct beneath the
current intel_context. However, after a lively debate resolving around
the confusion between intel_context_engine and intel_engine_context, the
realisation is that the two structs target different users. The outer
struct is API / user facing, and so carries the higher level GEM
information. The inner struct is hw facing. Thus we want to name the
inner struct intel_context and the outer one i915_gem_context. As the
first step, we need to rename the current struct:

s/struct intel_context/struct i915_gem_context/

which fits much better with its constructors already conveying the
i915_gem_context prefix!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464098023-3294-1-git-send-email-chris@chris-wilson.co.uk


# c033666a 06-May-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Store a i915 backpointer from engine, and use it

text data bss dec hex filename
6309351 3578714 696320 10584385 a18141 vmlinux
6308391 3578714 696320 10583425 a17d81 vmlinux

Almost 1KiB of code reduction.

v2: More s/INTEL_INFO()->gen/INTEL_GEN()/ and IS_GENx() conversions

text data bss dec hex filename
6304579 3578778 696320 10579677 a16edd vmlinux
6303427 3578778 696320 10578525 a16a5d vmlinux

Now over 1KiB!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462545621-30125-3-git-send-email-chris@chris-wilson.co.uk


# c04e0f3b 09-Apr-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Separate out the seqno-barrier from engine->get_seqno

In order to simplify future patches, extract the
lazy_coherency optimisation our of the engine->get_seqno() vfunc into
its own callback.

v2: Rename the barrier to engine->irq_seqno_barrier to try and better
reflect that the barrier is only required after the user interrupt before
reading the seqno (to ensure that the seqno update lands in time as we
do not have strict seqno-irq ordering on all platforms).

Reviewed-by: Dave Gordon <david.s.gordon@intel.com> [#v2]

v3: Comments for hangcheck paranoia. Mika wanted to keep the extra
barrier inside the hangcheck, just in case. I can argue that it doesn't
provide a barrier against anything, but the side-effects of applying the
barrier may prevent a false declaration of a hung GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-2-git-send-email-chris@chris-wilson.co.uk


# 666796da 16-Mar-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: More intel_engine_cs renaming

Some trivial ones, first pass done with Coccinelle:

@@
@@
(
- I915_NUM_RINGS
+ I915_NUM_ENGINES
|
- intel_ring_flag
+ intel_engine_flag
|
- for_each_ring
+ for_each_engine
|
- i915_gem_request_get_ring
+ i915_gem_request_get_engine
|
- intel_ring_idle
+ intel_engine_idle
|
- i915_gem_reset_ring_status
+ i915_gem_reset_engine_status
|
- i915_gem_reset_ring_cleanup
+ i915_gem_reset_engine_cleanup
|
- init_ring_lists
+ init_engine_lists
)

But that didn't fully work so I cleaned it up with:

for f in *.[hc]; do sed -i -e s/I915_NUM_RINGS/I915_NUM_ENGINES/ $f; done
for f in *.[hc]; do sed -i -e s/i915_gem_request_get_ring/i915_gem_request_get_engine/ $f; done
for f in *.[hc]; do sed -i -e s/intel_ring_flag/intel_engine_flag/ $f; done
for f in *.[hc]; do sed -i -e s/intel_ring_idle/intel_engine_idle/ $f; done
for f in *.[hc]; do sed -i -e s/init_ring_lists/init_engine_lists/ $f; done
for f in *.[hc]; do sed -i -e s/i915_gem_reset_ring_cleanup/i915_gem_reset_engine_cleanup/ $f; done
for f in *.[hc]; do sed -i -e s/i915_gem_reset_ring_status/i915_gem_reset_engine_status/ $f; done

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# 4a570db5 16-Mar-2016 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Rename intel_engine_cs struct members

below and a couple manual fixups.

@@
identifier I, J;
@@
struct I {
...
- struct intel_engine_cs *J;
+ struct intel_engine_cs *engine;
...
}
@@
identifier I, J;
@@
struct I {
...
- struct intel_engine_cs J;
+ struct intel_engine_cs engine;
...
}
@@
struct drm_i915_private *d;
@@
(
- d->ring
+ d->engine
)
@@
struct i915_execbuffer_params *p;
@@
(
- p->ring
+ p->engine
)
@@
struct intel_ringbuffer *r;
@@
(
- r->ring
+ r->engine
)
@@
struct drm_i915_gem_request *req;
@@
(
- req->ring
+ req->engine
)

v2: Script missed the tracepoint code - fixed up by hand.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>


# 596c5923 26-Feb-2016 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Reduce the pointer dance of i915_is_ggtt()

The multiple levels of indirect do nothing but hinder the compiler and
the pointer chasing turns to be quite painful but painless to fix.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456484600-11477-1-git-send-email-tvrtko.ursulin@linux.intel.com


# f0f59a00 18-Nov-2015 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Type safe register read/write

Make I915_READ and I915_WRITE more type safe by wrapping the register
offset in a struct. This should eliminate most of the fumbles we've had
with misplaced parens.

This only takes care of normal mmio registers. We could extend the idea
to other register types and define each with its own struct. That way
you wouldn't be able to accidentally pass the wrong thing to a specific
register access function.

The gpio_reg setup is probably the ugliest thing left. But I figure I'd
just leave it for now, and wait for some divine inspiration to strike
before making it nice.

As for the generated code, it's actually a bit better sometimes. Eg.
looking at i915_irq_handler(), we can see the following change:
lea 0x70024(%rdx,%rax,1),%r9d
mov $0x1,%edx
- movslq %r9d,%r9
- mov %r9,%rsi
- mov %r9,-0x58(%rbp)
- callq *0xd8(%rbx)
+ mov %r9d,%esi
+ mov %r9d,-0x48(%rbp)
callq *0xd8(%rbx)

So previously gcc thought the register offset might be signed and
decided to sign extend it, just in case. The rest appears to be
mostly just minor shuffling of instructions.

v2: i915_mmio_reg_{offset,equal,valid}() helpers added
s/_REG/_MMIO/ in the register defines
mo more switch statements left to worry about
ring_emit stuff got sorted in a prep patch
cmd parser, lrc context and w/a batch buildup also in prep patch
vgpu stuff cleaned up and moved to a prep patch
all other unrelated changes split out
v3: Rebased due to BXT DSI/BLC, MOCS, etc.
v4: Rebased due to churn, s/i915_mmio_reg_t/i915_reg_t/

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1447853606-2751-1-git-send-email-ville.syrjala@linux.intel.com


# 3abafa53 30-Sep-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Add a tracepoint for the shrinker

Often it is very useful to know why we suddenly purge vast tracts of
memory and surprisingly up until now we didn't even have a tracepoint
for when we shrink our memory.

Note that there are slab_start/end tracepoints already, but those
don't cover the internal recursion when we directly call into our
shrinker code. Hence a separate tracepoint seems justified. Also note
that we don't really need a separate tracepoint for the actual amount
of pages freed since we already have an unbind tracpoint for that.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add a note that there's also slab_start/end and why they're
insufficient.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d637ce3f 17-Sep-2015 Jesse Barnes <jbarnes@virtuousgeek.org>

drm/i915: cleanup pipe_update trace functions with new crtc debug info v3

Use the new debug info in the intel_crtc struct in these functions
rather than passing them as args.

v2: move min/max assignment back above first trace call (Ville)
use scanline from crtc->debug rather than fetching a new one (Ville)
v3: fix up trace_i915_pipe_update_end, needs end scanline (Ville)

Requested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 762d9936 30-Jul-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: implement alloc/free for 4lvl

PML4 has no special attributes, and there will always be a PML4.
So simply initialize it at creation, and destroy it at the end.

The code for 4lvl is able to call into the existing 3lvl page table code
to handle all of the lower levels.

v2: Return something at the end of gen8_alloc_va_range_4lvl to keep the
compiler happy. And define ret only in one place.
Updated gen8_ppgtt_unmap_pages and gen8_ppgtt_free to handle 4lvl.
v3: Use i915_dma_unmap_single instead of pci API. Fix a
couple of incorrect checks when unmapping pdp and pd pages (Akash).
v4: Call __pdp_fini also for 32b PPGTT. Clean up alloc_pdp param list.
v5: Prevent (harmless) out of range access in gen8_for_each_pml4e.
v6: Simplify alloc_vma_range_4lvl and gen8_ppgtt_init_common error
paths. (Akash)
v7: Rebase, s/gen8_ppgtt_free_*/gen8_ppgtt_cleanup_*/.
v8: Change location of pml4_init/fini. It will make next patches
cleaner.
v9: Rebase after Mika's ppgtt cleanup / scratch merge patch series, while
trying to reuse as much as possible for pdp alloc. pml4_init/fini
replaced by setup/cleanup_px macros.
v10: Rebase after Mika's merged ppgtt cleanup patch series.
v11: Rebase after final merged version of Mika's ppgtt/scratch
patches.
v12: Fix pdpe start value in trace (Akash)
v13: Define all 4lvl functions in this patch directly, instead of
previous patches, add i915_page_directory_pointer_entry_alloc here,
use test_bit to detect when pdp is already allocated (Akash).
v14: Move pdp allocation into a new gen8_ppgtt_alloc_page_dirpointers
funtion, as we do for pds and pts; move pd and pdp setup functions to
this patch (Akash).
v15: Added kfree(pdp) from previous patch to this (Akash).

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 4c06ec8d 29-Jul-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915/gen8: Add dynamic page trace events

The dynamic page allocation patch series added it for GEN6, this patch
adds them for GEN8.

v2: Consolidate pagetable/page_directory events
v3: Multiple rebases.
v4: Rebase after s/page_tables/page_table/.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rebase after gen8_map_pagetable_range removal.
v7: Use generic page name (px) in DECLARE_EVENT_CLASS (Akash)
v8: Defer define of i915_page_directory_pointer_entry_alloc (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 9ea4feec 05-May-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Store device pointer in contexts for late tracepoint usafe

[ 1572.417121] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 1572.421010] IP: [<ffffffffa00b2514>] ftrace_raw_event_i915_context+0x5d/0x70 [i915]
[ 1572.424970] PGD 1766a3067 PUD 1767a2067 PMD 0
[ 1572.428892] Oops: 0000 [#1] SMP
[ 1572.432787] Modules linked in: ipv6 dm_mod iTCO_wdt iTCO_vendor_support snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd soundcore serio_raw pcspkr lpc_ich i2c_i801 mfd_core battery ac acpi_cpufreq i915 button video drm_kms_helper drm
[ 1572.441720] CPU: 2 PID: 18853 Comm: kworker/u8:0 Not tainted 4.0.0_kcloud_3f0360_20150429+ #588
[ 1572.446298] Workqueue: i915 i915_gem_retire_work_handler [i915]
[ 1572.450876] task: ffff880002f428f0 ti: ffff880035724000 task.ti: ffff880035724000
[ 1572.455557] RIP: 0010:[<ffffffffa00b2514>] [<ffffffffa00b2514>] ftrace_raw_event_i915_context+0x5d/0x70 [i915]
[ 1572.460423] RSP: 0018:ffff880035727ce8 EFLAGS: 00010286
[ 1572.465262] RAX: ffff880073f1643c RBX: ffff880002da9058 RCX: ffff880073e5db40
[ 1572.470179] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880035727ce8
[ 1572.475107] RBP: ffff88007bb11a00 R08: 0000000000000000 R09: 0000000000000000
[ 1572.480034] R10: 0000000000362200 R11: 0000000000000008 R12: 0000000000000000
[ 1572.484952] R13: ffff880035727d78 R14: ffff880002dc1c98 R15: ffff880002dc1dc8
[ 1572.489886] FS: 0000000000000000(0000) GS:ffff88017fd00000(0000) knlGS:0000000000000000
[ 1572.494883] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1572.499859] CR2: 0000000000000000 CR3: 000000017572a000 CR4: 00000000001006e0
[ 1572.504842] Stack:
[ 1572.509834] ffff88017b0090c0 ffff880073f16438 ffff880002da9058 ffff880073f1643c
[ 1572.514904] 0000000000000246 ffff880100000000 ffff88007bb11a00 ffff880002ddeb10
[ 1572.519985] ffff8801759f79c0 ffffffffa0092ff0 0000000000000000 ffff88007bb11a00
[ 1572.525049] Call Trace:
[ 1572.530093] [<ffffffffa0092ff0>] ? i915_gem_context_free+0xa8/0xc1 [i915]
[ 1572.535227] [<ffffffffa009b969>] ? i915_gem_request_free+0x4e/0x50 [i915]
[ 1572.540347] [<ffffffffa00b5533>] ? intel_execlists_retire_requests+0x14c/0x159 [i915]
[ 1572.545500] [<ffffffffa009d9ea>] ? i915_gem_retire_requests+0x9d/0xeb [i915]
[ 1572.550664] [<ffffffffa009dd8c>] ? i915_gem_retire_work_handler+0x4c/0x61 [i915]
[ 1572.555825] [<ffffffff8104ca7f>] ? process_one_work+0x1b2/0x31d
[ 1572.560951] [<ffffffff8104d278>] ? worker_thread+0x24d/0x339
[ 1572.566033] [<ffffffff8104d02b>] ? cancel_delayed_work_sync+0xa/0xa
[ 1572.571140] [<ffffffff81050b25>] ? kthread+0xce/0xd6
[ 1572.576191] [<ffffffff81050a57>] ? kthread_create_on_node+0x162/0x162
[ 1572.581228] [<ffffffff8179b3c8>] ? ret_from_fork+0x58/0x90
[ 1572.586259] [<ffffffff81050a57>] ? kthread_create_on_node+0x162/0x162
[ 1572.591318] Code: de 48 89 e7 e8 09 4d 00 e1 48 85 c0 74 27 48 89 68 10 48 8b 55 38 48 89 e7 48 89 50 18 48 8b 55 10 48 8b 12 48 8b 12 48 8b 52 38 <8b> 12 89 50 08 e8 95 4d 00 e1 48 83 c4 30 5b 5d 41 5c c3 41 55
[ 1572.596981] RIP [<ffffffffa00b2514>] ftrace_raw_event_i915_context+0x5d/0x70 [i915]
[ 1572.602464] RSP <ffff880035727ce8>
[ 1572.607911] CR2: 0000000000000000

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90112#c23
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 599d924c 29-May-2015 John Harrison <John.C.Harrison@Intel.com>

drm/i915: Update ring->sync_to() to take a request structure

Updated the ring->sync_to() implementations to take a request instead of a ring.
Also updated the tracer to include the request id.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
[danvet: Rebase since I didn't merge the patch which added ->uniq.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a84c3ae1 29-May-2015 John Harrison <John.C.Harrison@Intel.com>

drm/i915: Update ring->flush() to take a requests structure

Updated the various ring->flush() functions to take a request instead of a ring.
Also updated the tracer to include the request id.

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
[danvet: Rebase since I didn't merge the addition of req->uniq.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d7b9ca2f 07-Apr-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove request->uniq

We already assign a unique identifier to every request: seqno. That
someone felt like adding a second one without even mentioning why and
tweaking ABI smells very fishy.

Fixes regression from
commit b3a38998f042b862f5ba4d7f2268f3a8dfb4883a
Author: Nick Hoath <nicholas.hoath@intel.com>
Date: Thu Feb 19 16:30:47 2015 +0000

drm/i915: Fix a use after free, and unbalanced refcounting

v2: Rebase

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Nick Hoath <nicholas.hoath@intel.com>
Cc: Thomas Daniel <thomas.daniel@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Jani Nikula <jani.nikula@intel.com>
[danvet: Fixup because different merge order.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 595e1eeb 07-Apr-2015 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove vestigal DRI1 ring quiescing code

After the removal of DRI1, all access to the rings are through requests
and so we can always be sure that there is a request to wait upon to
free up available space. The fallback code only existed so that we could
quiesce the GPU following unmediated access by DRI1.

v2: Rebase

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# ec565b3c 07-Apr-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: Remove _entry from PPGTT page structures

Lets try to keep this consistent:

Page Directory Pointer (PDP).
Page Directory (PD), also known as page directory pointer entries.
Page Table (PT), also known as page directory entries.

s/struct i915_page_table_entry/struct i915_page_table/
s/struct i915_page_directory_entry/struct i915_page_directory/
s/struct i915_page_directory_pointer_entry/struct
i915_page_directory_pointer/

Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 77cb2fea 01-Apr-2015 Steven Rostedt <rostedt@goodmis.org>

tracing/drm: Remove unused TRACE_SYSTEM_STRING define

The tracing infrastructure is adding a macro TRACE_SYSTEM_STRING, and
hit the following build failure:

In file included from include/trace/define_trace.h:90:0,
from drivers/gpu/drm/.//radeon/radeon_trace.h:209,
from drivers/gpu/drm/.//radeon/radeon_trace_points.c:9:
>> include/trace/ftrace.h:28:0: warning: "TRACE_SYSTEM_STRING" redefined
#define TRACE_SYSTEM_STRING __app(TRACE_SYSTEM_VAR,__trace_system_name)

Seems that the DRM folks have added their own use to the
TRACE_SYSTEM_STRING, with:

#define TRACE_SYSTEM_STRING __stringify(TRACE_SYSTEM)

Although, I can not find its use anywhere. I could simply use another
name, but if this macro is not being used, it should be removed.

Link: http://lkml.kernel.org/r/20150402123736.01eda052@gandalf.local.home

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>


# 72744cb1 24-Mar-2015 Michel Thierry <michel.thierry@intel.com>

drm/i915: Add dynamic page trace events

Traces for page directories and tables allocation and map.

v2: Removed references to teardown.
v3: bitmap_scnprintf has been deprecated.
v4: Replace bitmap_scnprintf with scnprintf correctly, and get right
range lengths. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 33938714 22-Jan-2015 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915/trace: Fix offsets for 64b

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# bcfcc8ba 05-Dec-2014 John Harrison <John.C.Harrison@Intel.com>

drm/i915: Additional request structure tracing

Added the request structure's 'uniq' identifier to the trace information. Also
renamed the '_complete' trace event to '_notify' as it actually happens in the
IRQ 'notify_ring()' function. The intention is to add a new '_complete' trace
event which occurs when a request structure is actually marked as complete.
However, at the moment the completion status is re-tested every time the query
is made so there isn't a completion event as such.

v2: New patch added to series.

v3: Rebased to remove completion caching as that is apparently contentious.

Change-Id: Ic9bcde67d175c6c03b96217cdcb6e4cc4aa45d67
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 581c26e8 24-Nov-2014 John Harrison <John.C.Harrison@Intel.com>

drm/i915: Convert 'trace_irq' to use requests rather than seqnos

Updated the trace_irq code to use requests instead of seqnos. This includes
reference counting the request object to ensure it sticks around when required.
Note that getting access to the reference counting functions means moving the
inline i915_trace_irq_get() function from intel_ringbuffer.h to i915_drv.h.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
[danvet: Resolve conflict due to shuffled merge order.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 74328ee5 24-Nov-2014 John Harrison <John.C.Harrison@Intel.com>

drm/i915: Convert trace functions from seqno to request

All the code above is now using requests not seqnos so it is possible to convert
the trace functions across. Note that rather than get into problematic reference
counting issues, the trace code only saves the seqno and ring values from the
request structure not the structure pointer itself.

For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 198c974d 10-Nov-2014 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915: Add tracepoints to track a vm during its lifetime

- ppgtt init/release: these tracepoints are useful for observing the
creation and destruction of Full PPGTTs.

- ctx create/free: we can use the ctx_free trace in combination with the
ppgtt_release one to be sure that the ppgtt doesn't stay alive for too
long after the ctx is destroyed. ctx_create is there for simmetry

- switch_mm: important point in the lifetime of the vm

v4: add DOC information
v5: pull the DOC in drm.tmpl
v6: clean ppgtt init/release traces + add ctx create/free and switch_mm
tracepoints (Chris)
v7: drop execlist_submit_context tracepoint

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# a4872ba6 22-May-2014 Oscar Mateo <oscar.mateo@intel.com>

drm/i915: s/intel_ring_buffer/intel_engine_cs

In the upcoming patches we plan to break the correlation between
engine command streamers (a.k.a. rings) and ringbuffers, so it
makes sense to refactor the code and make the change obvious.

No functional changes.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 25ef284a 29-Apr-2014 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Add pipe update trace points

Add trace points for observing the atomic pipe update mechanism.

v2: Rebased due to earlier changes
v3: Pass intel_crtc instead of drm_crtc (Daniel)
v4: Pass frame counter from the caller to evaded/end since
the caller now always has that ready

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Sourab Gupta <sourabgupta@gmail.com>
Reviewed-by: Akash Goel <akash.goels@gmail.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 9297ebf2 18-Mar-2014 Steven Rostedt <rostedt@goodmis.org>

drm/i915: Do not dereference pointers from ring buffer in evict event

The TP_printk() should never dereference any pointers, because the ring
buffer can be read at some unknown time in the future. If a device no
longer exists, it can cause a kernel oops. This also makes this
event useless when saving the ring buffer in userspaces tools such as
perf and trace-cmd.

The i915_gem_evict_vm dereferences the vm pointer which may also not
exist when the ring buffer is read sometime in the future.

Link: http://lkml.kernel.org/r/1395095198-20034-3-git-send-email-artagnon@gmail.com
Reported-by: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: stable@vger.kernel.org # 3.13+
Fixes: bcccff847d1f "drm/i915: trace vm eviction instead of everything"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[danvet: Try to make it actually compile]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 1ec9e26d 14-Feb-2014 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: Consolidate binding parameters into flags

Anything more than just one bool parameter is just a pain to read,
symbolic constants are much better.

Split out from Chris' vma-binding rework patch.

v2: Undo the behaviour change in object_pin that Chris spotted.

v3: Split out misplaced hunk to handle set_cache_level errors,
spotted by Jani.

v4: Keep the current over-zealous binding logic in the execbuffer code
working with a quick hack while the overall binding code gets shuffled
around.

v5: Reorder the PIN_ flags for more natural patch splitup.

v6: Pull out the PIN_GLOBAL split-up again.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# b52b89da 25-Sep-2013 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Add a tracepoint for using a semaphore

So that we can find the callers who introduce a ring stall. A single
ring stall is not too unwelcome, the right issue becomes when they start
to interlock and prevent any concurrent work. That, however, is a little
tricker to detect with a mere tracepoint!

v2: Rebrand it as a ring event, rather than an object event.
v3: Include the seqno in the tracepoint for posterity or something.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 814e9b57 23-Sep-2013 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move the conditional seqno query into the tracepoint

We only wish to know the value of seqno when emitting the tracepoint, so
move the query from a parameter to the macro to inside the conditional
macro body so that the query is only evaluated when required.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# bcccff84 24-Sep-2013 Ben Widawsky <benjamin.widawsky@intel.com>

drm/i915: trace vm eviction instead of everything

Tracing vm eviction is really the event we care about. For the cases we
evict everything, we still will get the trace.

v2: Add the drm device to the trace since we might not be the only
device in the system. (Chris)

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 07fe0b12 31-Jul-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: plumb VM into bind/unbind code

As alluded to in several patches, and it will be reiterated later... A
VMA is an abstraction for a GEM BO bound into an address space.
Therefore it stands to reason, that the existing bind, and unbind are
the ones which will be the most impacted. This patch implements this,
and updates all callers which weren't already updated in the series
(because it was too messy).

This patch represents the bulk of an earlier, larger patch. I've pulled
out a bunch of things by the request of Daniel. The history is preserved
for posterity with the email convention of ">" One big change from the
original patch aside from a bunch of cropping is I've created an
i915_vma_unbind() function. That is because we always have the VMA
anyway, and doing an extra lookup is useful. There is a caveat, we
retain an i915_gem_object_ggtt_unbind, for the global cases which might
not talk in VMAs.

> drm/i915: plumb VM into object operations
>
> This patch was formerly known as:
> "drm/i915: Create VMAs (part 3) - plumbing"
>
> This patch adds a VM argument, bind/unbind, and the object
> offset/size/color getters/setters. It preserves the old ggtt helper
> functions because things still need, and will continue to need them.
>
> Some code will still need to be ported over after this.
>
> v2: Fix purge to pick an object and unbind all vmas
> This was doable because of the global bound list change.
>
> v3: With the commit to actually pin/unpin pages in place, there is no
> longer a need to check if unbind succeeded before calling put_pages().
> Make put_pages only BUG() after checking pin count.
>
> v4: Rebased on top of the new hangcheck work by Mika
> plumbed eb_destroy also
> Many checkpatch related fixes
>
> v5: Very large rebase
>
> v6:
> Change BUG_ON to WARN_ON (Daniel)
> Rename vm to ggtt in preallocate stolen, since it is always ggtt when
> dealing with stolen memory. (Daniel)
> list_for_each will short-circuit already (Daniel)
> remove superflous space (Daniel)
> Use per object list of vmas (Daniel)
> Make obj_bound_any() use obj_bound for each vm (Ben)
> s/bind_to_gtt/bind_to_vm/ (Ben)
>
> Fixed up the inactive shrinker. As Daniel noticed the code could
> potentially count the same object multiple times. While it's not
> possible in the current case, since 1 object can only ever be bound into
> 1 address space thus far - we may as well try to get something more
> future proof in place now. With a prep patch before this to switch over
> to using the bound list + inactive check, we're now able to carry that
> forward for every address space an object is bound into.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Rebase on top of the loss of "drm/i915: Cleanup more of VMA
in destroy".]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# ed71f1b4 19-Jul-2013 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Convert the register access tracepoint to be conditional

The TRACE_EVENT_CONDITION is supposed to generate more efficient code
than if (cond) trace(), which is what we are currently using inside the
register access functions.

v2: Rebase onto uncore

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# f343c5f6 05-Jul-2013 Ben Widawsky <ben@bwidawsk.net>

drm/i915: Getter/setter for object attributes

Soon we want to gut a lot of our existing assumptions how many address
spaces an object can live in, and in doing so, embed the drm_mm_node in
the object (and later the VMA).

It's possible in the future we'll want to add more getter/setter
methods, but for now this is enough to enable the VMAs.

v2: Reworked commit message (Ben)
Added comments to the main functions (Ben)
sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/*.[ch]
(Daniel)

v3: Rebased on new reserve_node patch
Changed DRM_DEBUG_KMS to actually work (will need fixing later)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# d7d4eedd 16-Oct-2012 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers

With the introduction of per-process GTT space, the hardware designers
thought it wise to also limit the ability to write to MMIO space to only
a "secure" batch buffer. The ability to rewrite registers is the only
way to program the hardware to perform certain operations like scanline
waits (required for tear-free windowed updates). So we either have a
choice of adding an interface to perform those synchronized updates
inside the kernel, or we permit certain processes the ability to write
to the "safe" registers from within its command stream. This patch
exposes the ability to submit a SECURE batch buffer to
DRM_ROOT_ONLY|DRM_MASTER processes.

v2: Haswell split up bit8 into a ppgtt bit (still bit8) and a security
bit (bit 13, accidentally not set). Also add a comment explaining why
secure batches need a global gtt binding.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
[danvet: added hsw fixup.]
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# be2cde9a 30-Aug-2012 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: add a tracepoint for gpu frequency changes

We've had and still have too many issues where the gpu turbo doesn't
quite to what it's supposed to do (or what we want it to do).

Adding a tracepoint to track when the desired gpu frequency changes
should help a lot in characterizing and understanding problematic
workloads.

Also, this should be fairly interesting for power tuning (and
especially noticing when the gpu is stuck in high frequencies, as has
happened in the past) and hence for integration into powertop and
similar tools.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Paul Menzel <paulepanter@users.sourceforge.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 6c085a72 20-Aug-2012 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Track unbound pages

When dealing with a working set larger than the GATT, or even the
mappable aperture when touching through the GTT, we end up with evicting
objects only to rebind them at a new offset again later. Moving an
object into and out of the GTT requires clflushing the pages, thus
causing a double-clflush penalty for rebinding.

To avoid having to clflush on rebinding, we can track the pages as they
are evicted from the GTT and only relinquish those pages on memory
pressure.

As usual, if it were not for the handling of out-of-memory condition and
having to manually shrink our own bo caches, it would be a net reduction
of code. Alas.

Note: The patch also contains a few changes to the last-hope
evict_everything logic in i916_gem_execbuffer.c - we no longer try to
only evict the purgeable stuff in a first try (since that's superflous
and only helps in OOM corner-cases, not fragmented-gtt trashing
situations).

Also, the extraction of the get_pages retry loop from bind_to_gtt (and
other callsites) to get_pages should imo have been a separate patch.

v2: Ditch the newly added put_pages (for unbound objects only) in
i915_gem_reset. A quick irc discussion hasn't revealed any important
reason for this, so if we need this, I'd like to have a git blame'able
explanation for it.

v3: Undo the s/drm_malloc_ab/kmalloc/ in get_pages that Chris noticed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Split out code movements and rant a bit in the commit message
with a few Notes. Done v2]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# f3fd3768 24-May-2012 Ben Widawsky <ben@bwidawsk.net>

drm/i915: improve i915_wait_request_begin trace

The trace events adds whether or not the wait was blocking. Blocking in
this case means to hold struct_mutex (ie. no new work can be submitted
during the wait). The information is inherently racy.

The blocking information is racy since mutex_is_locked doesn't check
that the current thread holds the lock. The only other option would be
to pass the boolean information of whether or not the class was blocking
down through the stack which is less desirable.

v2: Don't do a trace event per loop. (Chris)
Only get blocking/non-blocking info (Chris)

v3: updated comment in code as well as commit msg (Daniel)
Add "(NB)" to trace information to remind us in 6 months (Ben)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 0206e353 16-Aug-2011 Akshay Joshi <me@akshayjoshi.com>

Drivers: i915: Fix all space related issues.

Various issues involved with the space character were generating
warnings in the checkpatch.pl file. This patch removes most of those
warnings.

Signed-off-by: Akshay Joshi <me@akshayjoshi.com>
Signed-off-by: Keith Packard <keithp@keithp.com>


# db53a302 03-Feb-2011 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Refine tracepoints

A lot of minor tweaks to fix the tracepoints, improve the outputting for
ftrace, and to generally make the tracepoints useful again. It is a start
and enough to begin identifying performance issues and gaps in our
coverage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 60de2ba5 12-Nov-2010 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Kill the get_fence tracepoint

As the tracepoint is now decoupled from when the actual register is
assigned and was never complemented by detailing when the object lost
its fence, it has outlived its limited usefulness. Profiling the actual
stalls is a far more profitable venture anyway.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 05394f39 08-Nov-2010 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use drm_i915_gem_object as the preferred type

A glorified s/obj_priv/obj/ with a net reduction of over a 100 lines and
many characters!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# ba4f01a3 08-Nov-2010 Yuanhan Liu <yuanhan.liu@linux.intel.com>

drm/i915: trace down all the register write and read

Add two tracepoints at I915_WRITE/READ for tracing down all the
register write and read.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# ec57d260 30-Sep-2010 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915: add mappable to gem_object_bind tracepoint

This way we can make some more educated guesses as to why exactly
we can't use 2G apertures to their full potential ;)

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# e5510fac 01-Jul-2010 Jesse Barnes <jbarnes@virtuousgeek.org>

drm/i915: add tracepoints for flip requests & completions

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# f41275e8 24-May-2010 Li Zefan <lizf@cn.fujitsu.com>

drm/i915: Convert more trace events to DEFINE_EVENT

Convert i915_gem_object_clflush to DEFINE_EVENT, and save ~0.5K:

text data bss dec hex filename
13204 2732 12 15948 3e4c i915_trace_points.o.orig
12668 2732 12 15412 3c34 i915_trace_points.o

No change in functionality.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Eric Anholt <eric@anholt.net>


# a7c54278 03-May-2010 Peter Clifton <pcjc2@cam.ac.uk>

drm/i915: Fix out of tree builds

Fixes up include paths for i915_trace.h by setting additional CFLAGS
for i915_trace_points.c to include the $src directory. The required
TRACE_INCLUDE_PATH is then "."

Signed-off-by: Peter Clifton <pcjc2@cam.ac.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>


# 903cf20c 11-Mar-2010 Li Zefan <lizf@cn.fujitsu.com>

drm/i915: Convert some trace events to DEFINE_TRACE

Use DECLARE_EVENT_CLASS to remove duplicate code:

text data bss dec hex filename
14655 2732 15 17402 43fa i915_trace_points.o.orig
11625 2732 10 14367 381f i915_trace_points.o

8 events are converted:

i915_gem_object: i915_gem_object_{unbind, destroy}
i915_gem_request: i915_gem_request_{complete, retire, wait_begin, wait_end}
i915_ring: i915_ring_{wait_begin, wait_end}

No functional change.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Eric Anholt <eric@anholt.net>


# 9d34e5db 23-Sep-2009 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Enable irq to trace batch buffer completion.

If we trigger a tracepoint for batch buffer submission, it is a reasonable
assumption that we wish to also trace the batch buffer completion. So in
order to capture the completion events, we need to enable irqs... However,
we cannot rely on the completion event to disable the irq later, so we
defer the irq disable to the retire request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 4f49be54 23-Sep-2009 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Record device minor rather than pointer in TRACE_EVENT

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


# 1c5d22f7 25-Aug-2009 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Add tracepoints

By adding tracepoint equivalents for WATCH_BUF/EXEC we are able to monitor
the lifetimes of objects, requests and significant events. These events can
then be probed using the tracing frameworks, such as systemtap and, in
particular, perf.

For example to record the stack trace for every GPU stall during a run, use

$ perf record -e i915:i915_gem_request_wait_begin -c 1 -g

And

$ perf report

to view the results.

[Updated to fix compilation issues caused.]
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Ben Gamari <bgamari@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>