History log of /linux-master/drivers/gpu/drm/ttm/ttm_tt.c
Revision Date Author Comments
# 71ce0463 05-Jan-2024 Zack Rusin <zack.rusin@broadcom.com>

drm/ttm: Make sure the mapped tt pages are decrypted when needed

Some drivers require the mapped tt pages to be decrypted. In an ideal
world this would have been handled by the dma layer, but the TTM page
fault handling would have to be rewritten to able to do that.

A side-effect of the TTM page fault handling is using a dma allocation
per order (via ttm_pool_alloc_page) which makes it impossible to just
trivially use dma_mmap_attrs. As a result ttm has to be very careful
about trying to make its pgprot for the mapped tt pages match what
the dma layer thinks it is. At the ttm layer it's possible to
deduce the requirement to have tt pages decrypted by checking
whether coherent dma allocations have been requested and the system
is running with confidential computing technologies.

This approach isn't ideal but keeping TTM matching DMAs expectations
for the page properties is in general fragile, unfortunately proper
fix would require a rewrite of TTM's page fault handling.

Fixes vmwgfx with SEV enabled.

v2: Explicitly include cc_platform.h
v3: Use CC_ATTR_GUEST_MEM_ENCRYPT instead of CC_ATTR_MEM_ENCRYPT to
limit the scope to guests and log when memory decryption is enabled.

Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Fixes: 3bf3710e3718 ("drm/ttm: Add a generic TTM memcpy move for page-based iomem")
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org> # v5.14+
Link: https://patchwork.freedesktop.org/patch/msgid/20230926040359.3040017-1-zack@kde.org


# e6f7c641 29-Nov-2023 Karolina Stolarek <karolina.stolarek@intel.com>

drm/ttm/tests: Add tests for ttm_tt

Test initialization, creation and destruction of ttm_tt instances.
Export ttm_tt_destroy and ttm_tt_create symbols for testing purposes.

Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Amaranath Somalapuram <Amaranath.Somalapuram@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4459cb89c666bfa377753ae18d0c8917131638e8.1701257386.git.karolina.stolarek@intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>


# 1d741590 25-Apr-2023 Mukul Joshi <mukul.joshi@amd.com>

drm/ttm: Helper function to get TTM mem limit

Add a helper function to get TTM memory limit. This is
needed by KFD to set its own internal memory limits.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2ce977df 30-May-2023 Ma Jun <Jun.Ma2@amd.com>

drm/ttm: Remove redundant code in ttm_tt_init_fields

Remove redundant assignment code for ttm->caching as it's overwritten
just a few lines later.

v2:
- Update the commit message.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230531053051.3453509-1-Jun.Ma2@amd.com


# a3185f91 09-May-2022 Christian König <christian.koenig@amd.com>

drm/ttm: merge ttm_bo_api.h and ttm_bo_driver.h v2

Merge and cleanup the two headers into a single description of the
object API. Also move all the documentation to the implementation and
drop unnecessary includes from the header.

No functional change.

v2: minimal checkpatch.pl cleanup

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221125102137.1801-4-christian.koenig@amd.com


# 8f97344a 21-Apr-2022 Yang Wang <KevinYang.Wang@amd.com>

drm/ttm: use kvcalloc() instead of kvmalloc_array() in ttm_tt v2

simplify programming with existing functions.

v2 (chk): minimal coding style cleanup

Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220421123442.1834102-1-KevinYang.Wang@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>


# e36764ec 01-Apr-2022 Ramalingam C <ramalingam.c@intel.com>

drm/ttm: Add a parameter to add extra pages into ttm_tt

Add a parameter called "extra_pages" for ttm_tt_init, to indicate that
driver needs extra pages in ttm_tt.

v2:
Used imperative wording [Thomas and Christian]

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
cc: Christian Koenig <christian.koenig@amd.com>
cc: Hellstrom Thomas <thomas.hellstrom@intel.com>
Reviewed-by: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220401123751.27771-8-ramalingam.c@intel.com


# 7938f421 04-Feb-2022 Lucas De Marchi <lucas.demarchi@intel.com>

dma-buf-map: Rename to iosys-map

Rename struct dma_buf_map to struct iosys_map and corresponding APIs.
Over time dma-buf-map grew up to more functionality than the one used by
dma-buf: in fact it's just a shim layer to abstract system memory, that
can be accessed via regular load and store, from IO memory that needs to
be acessed via arch helpers.

The idea is to extend this API so it can fulfill other needs, internal
to a single driver. Example: in the i915 driver it's desired to share
the implementation for integrated graphics, which uses mostly system
memory, with discrete graphics, which may need to access IO memory.

The conversion was mostly done with the following semantic patch:

@r1@
@@
- struct dma_buf_map
+ struct iosys_map

@r2@
@@
(
- DMA_BUF_MAP_INIT_VADDR
+ IOSYS_MAP_INIT_VADDR
|
- dma_buf_map_set_vaddr
+ iosys_map_set_vaddr
|
- dma_buf_map_set_vaddr_iomem
+ iosys_map_set_vaddr_iomem
|
- dma_buf_map_is_equal
+ iosys_map_is_equal
|
- dma_buf_map_is_null
+ iosys_map_is_null
|
- dma_buf_map_is_set
+ iosys_map_is_set
|
- dma_buf_map_clear
+ iosys_map_clear
|
- dma_buf_map_memcpy_to
+ iosys_map_memcpy_to
|
- dma_buf_map_incr
+ iosys_map_incr
)

@@
@@
- #include <linux/dma-buf-map.h>
+ #include <linux/iosys-map.h>

Then some files had their includes adjusted and some comments were
update to remove mentions to dma-buf-map.

Since this is not specific to dma-buf anymore, move the documentation to
the "Bus-Independent Device Accesses" section.

v2:
- Squash patches

v3:
- Fix wrong removal of dma-buf.h from MAINTAINERS
- Move documentation from dma-buf.rst to device-io.rst

v4:
- Change documentation title and level

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20220204170541.829227-1-lucas.demarchi@intel.com


# 8581fd40 02-Dec-2021 Jakub Kicinski <kuba@kernel.org>

treewide: Add missing includes masked by cgroup -> bpf dependency

cgroup.h (therefore swap.h, therefore half of the universe)
includes bpf.h which in turn includes module.h and slab.h.
Since we're about to get rid of that dependency we need
to clean things up.

v2: drop the cpu.h include from cacheinfo.h, it's not necessary
and it makes riscv sensitive to ordering of include files.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Krzysztof Wilczyński <kw@linux.com>
Acked-by: Peter Chen <peter.chen@kernel.org>
Acked-by: SeongJae Park <sj@kernel.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/all/20211120035253.72074-1-kuba@kernel.org/ # v1
Link: https://lore.kernel.org/all/20211120165528.197359-1-kuba@kernel.org/ # cacheinfo discussion
Link: https://lore.kernel.org/bpf/20211202203400.1208663-1-kuba@kernel.org


# 49e7f76f 29-Sep-2021 Matthew Auld <matthew.auld@intel.com>

drm/ttm: add TTM_TT_FLAG_EXTERNAL_MAPPABLE

In commit:

commit 667a50db0477d47fdff01c666f5ee1ce26b5264c
Author: Thomas Hellstrom <thellstrom@vmware.com>
Date: Fri Jan 3 11:17:18 2014 +0100

drm/ttm: Refuse to fault (prime-) imported pages

we introduced the restriction that imported pages should not be directly
mappable through TTM(this also extends to userptr). In the next patch we
want to introduce a shmem_tt backend, which should follow all the
existing rules with TTM_PAGE_FLAG_EXTERNAL, since it will need to handle
swapping itself, but with the above mapping restriction lifted.

v2(Christian):
- Don't OR together EXTERNAL and EXTERNAL_MAPPABLE in the definition
of EXTERNAL_MAPPABLE, just leave it the caller to handle this
correctly, otherwise we might encounter subtle issues.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210929132629.353541-3-matthew.auld@intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>


# 43d46f0b 29-Sep-2021 Matthew Auld <matthew.auld@intel.com>

drm/ttm: s/FLAG_SG/FLAG_EXTERNAL/

It covers more than just ttm_bo_type_sg usage, like with say dma-buf,
since one other user is userptr in amdgpu, and in the future we might
have some more. Hence EXTERNAL is likely a more suitable name.

v2(Christian):
- Rename these to TTM_TT_FLAGS_*
- Fix up all the holes in the flag values

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210929132629.353541-1-matthew.auld@intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>


# 21856e1e 26-Sep-2021 Matthew Auld <matthew.auld@intel.com>

drm/ttm: move ttm_tt_{add, clear}_mapping into amdgpu

Now that setting page->index shouldn't be needed anymore, we are just
left with setting page->mapping, and here it looks like amdgpu is the
only user, where pointing the page->mapping at the dev_mapping is used
to verify that the pages do indeed belong to the device, if userspace
later tries to touch them.

v2(Christian):
- Drop the functions altogether and just inline modifying
the page->mapping

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210927114114.152310-3-matthew.auld@intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>


# 635138f7 26-Sep-2021 Matthew Auld <matthew.auld@intel.com>

drm/ttm: stop setting page->index for the ttm_tt

In commit:

commit 58aa6622d32af7d2c08d45085f44c54554a16ed7
Author: Thomas Hellstrom <thellstrom@vmware.com>
Date: Fri Jan 3 11:47:23 2014 +0100

drm/ttm: Correctly set page mapping and -index members

we started setting the page->mapping and page->index to point to the
virtual address space, if the pages were faulted with TTM. Apparently
this was needed for core-mm to able to reverse lookup the virtual
address given the struct page, and potentially unmap it from the page
tables. However as pointed out by Thomas, since we are now using
PFN_MAP, instead of say PFN_MIXED, this should no longer be the case.

There was also apparently some usecase in vmwgfx which needed this for
dirty tracking, but that also doesn't appear to be the case anymore, as
pointed out by Thomas.

We still need keep the page->mapping for now, since that is still needed
for different reasons, but we try to address that in the next patch.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210927114114.152310-2-matthew.auld@intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>


# d5f45d1e 28-Jul-2021 Christian König <ckoenig.leichtzumerken@gmail.com>

drm/ttm: remove ttm_tt_destroy_common v2

Move the functionality into ttm_tt_fini and ttm_bo_tt_destroy instead.

We don't need this any more since we removed the unbind from the destroy
code paths in the drivers.

Also add a warning to ttm_tt_fini() if we try to fini a still populated TT
object.

v2: instead of reverting the patch move the functionality to different
places.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210728130552.2074-5-christian.koenig@amd.com


# 80cbd880 12-Aug-2021 Jason Ekstrand <jason@jlekstrand.net>

drm/ttm: Include pagemap.h from ttm_tt.h

It's needed for pgprot_t which is used in the header.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210812203443.1725307-2-jason@jlekstrand.net
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>


# 3bf3710e 02-Jun-2021 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/ttm: Add a generic TTM memcpy move for page-based iomem

The internal ttm_bo_util memcpy uses ioremap functionality, and while it
probably might be possible to use it for copying in- and out of
sglist represented io memory, using io_mem_reserve() / io_mem_free()
callbacks, that would cause problems with fault().
Instead, implement a method mapping page-by-page using kmap_local()
semantics. As an additional benefit we then avoid the occasional global
TLB flushes of ioremap() and consuming ioremap space, elimination of a
critical point of failure and with a slight change of semantics we could
also push the memcpy out async for testing and async driver development
purposes.

A special linear iomem iterator is introduced internally to mimic the
old ioremap behaviour for code-paths that can't immediately be ported
over. This adds to the code size and should be considered a temporary
solution.

Looking at the code we have a lot of checks for iomap tagged pointers.
Ideally we should extend the core memremap functions to also accept
uncached memory and kmap_local functionality. Then we could strip a
lot of code.

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210602083818.241793-4-thomas.hellstrom@linux.intel.com


# c3d670fc 02-Jun-2021 Lee Jones <lee.jones@linaro.org>

drm/ttm/ttm_tt: Demote non-conformant kernel-doc header

Fixes the following W=1 kernel build warning(s):

drivers/gpu/drm/ttm/ttm_tt.c:398: warning: Function parameter or member 'num_pages' not described in 'ttm_tt_mgr_init'
drivers/gpu/drm/ttm/ttm_tt.c:398: warning: Function parameter or member 'num_dma32_pages' not described in 'ttm_tt_mgr_init'

Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602143300.2330146-20-lee.jones@linaro.org


# 13ea9aa1 22-Apr-2021 Shiwu Zhang <shiwu.zhang@amd.com>

drm/ttm: fix error handling if no BO can be swapped out v4

In case that all pre-allocated BOs are busy, just continue to populate
BOs since likely half of system memory in total is still free.

v4 (chk): fix code moved to VMWGFX as well

Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210422115757.3946-1-christian.koenig@amd.com


# 2b173d7f 14-Apr-2021 Felix Kuehling <Felix.Kuehling@amd.com>

drm/ttm: Don't count pages in SG BOs against pages_limit

Pages in SG BOs were not allocated by TTM. So don't count them against
TTM's pages limit.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210414064804.29356-9-Felix.Kuehling@amd.com


# b057f37b 09-Apr-2021 Christian König <christian.koenig@amd.com>

drm/ttm: re-add debugfs tt_shrink file

That got lost when we moved back to a static limit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210409130113.1459-2-christian.koenig@amd.com


# a28e10ed 09-Apr-2021 Christian König <christian.koenig@amd.com>

drm/ttm: fix return value check

The function returns the number of swapped pages here. Only abort when we get
a negative error code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210409130113.1459-1-christian.koenig@amd.com


# 2a269ba8 22-Apr-2021 Shiwu Zhang <shiwu.zhang@amd.com>

drm/ttm: fix error handling if no BO can be swapped out v4

In case that all pre-allocated BOs are busy, just continue to populate
BOs since likely half of system memory in total is still free.

v4 (chk): fix code moved to VMWGFX as well

Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210422115757.3946-1-christian.koenig@amd.com


# a4394b6d 14-Apr-2021 Felix Kuehling <Felix.Kuehling@amd.com>

drm/ttm: Don't count pages in SG BOs against pages_limit

Pages in SG BOs were not allocated by TTM. So don't count them against
TTM's pages limit.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210414064804.29356-9-Felix.Kuehling@amd.com


# 8a9d24f2 09-Apr-2021 Christian König <christian.koenig@amd.com>

drm/ttm: fix return value check

The function returns the number of swapped pages here. Only abort when we get
a negative error code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210409130113.1459-1-christian.koenig@amd.com


# 680dcede 24-Mar-2021 Christian König <christian.koenig@amd.com>

drm/ttm: switch back to static allocation limits for now

The shrinker based approach still has some flaws. Especially that we need
temporary pages to free up the pages allocated to the driver is problematic
in a shrinker.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210324134845.2338-1-christian.koenig@amd.com


# ebd59851 06-Oct-2020 Christian König <christian.koenig@amd.com>

drm/ttm: move swapout logic around v3

Move the iteration of the global lru into the new function
ttm_global_swapout() and use that instead in drivers.

v2: consistently return int
v3: fix build fail

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/424008/


# d4bd7776 16-Nov-2020 Christian König <christian.koenig@amd.com>

drm/ttm: rework ttm_tt page limit v4

TTM implements a rather extensive accounting of allocated memory.

There are two reasons for this:
1. It tries to block userspace allocating a huge number of very small
BOs without accounting for the kmalloced memory.

2. Make sure we don't over allocate and run into an OOM situation
during swapout while trying to handle the memory shortage.

This is only partially a good idea. First of all it is perfectly
valid for an application to use all of system memory, limiting it to
50% is not really acceptable.

What we need to take care of is that the application is held
accountable for the memory it allocated. This is what control
mechanisms like memcg and the normal Linux page accounting already do.

Making sure that we don't run into an OOM situation while trying to
cope with a memory shortage is still a good idea, but this is also
not very well implemented since it means another opportunity of
recursion from the driver back into TTM.

So start to rework all of this by implementing a shrinker callback which
allows for TT object to be swapped out if necessary.

v2: Switch from limit to shrinker callback.
v3: fix gfp mask handling, use atomic for swapable_pages, add debugfs
v4: drop the extra gfp_mask checks

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210208133226.36955-1-christian.koenig@amd.com


# 8af8a109 01-Oct-2020 Christian König <christian.koenig@amd.com>

drm/ttm: device naming cleanup

Rename ttm_bo_device to ttm_device.
Rename ttm_bo_driver to ttm_device_funcs.
Rename ttm_bo_global to ttm_global.

Move global and device related functions to ttm_device.[ch].

No functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/415222/


# e11bfb99 09-Dec-2020 Christian König <christian.koenig@amd.com>

drm/ttm: cleanup BO size handling v3

Based on an idea from Dave, but cleaned up a bit.

We had multiple fields for essentially the same thing.

Now bo->base.size is the original size of the BO in
arbitrary units, usually bytes.

bo->mem.num_pages is the size in number of pages in the
resource domain of bo->mem.mem_type.

v2: use the GEM object size instead of the BO size
v3: fix printks in some places

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com> (v1)
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/406831/


# 18f7608a 23-Nov-2020 Christian König <christian.koenig@amd.com>

drm/ttm: nuke ttm_dma_tt_init

Not used any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/403837/


# 78616b88 16-Nov-2020 Lee Jones <lee.jones@linaro.org>

drm/ttm/ttm_tt: Demote kernel-doc header format abuses

Fixes the following W=1 kernel build warning(s):

drivers/gpu/drm/ttm/ttm_tt.c:45: warning: Function parameter or member 'bo' not described in 'ttm_tt_create'
drivers/gpu/drm/ttm/ttm_tt.c:45: warning: Function parameter or member 'zero_alloc' not described in 'ttm_tt_create'
drivers/gpu/drm/ttm/ttm_tt.c:83: warning: Function parameter or member 'ttm' not described in 'ttm_tt_alloc_page_directory'

Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116174112.1833368-33-lee.jones@linaro.org


# 586052b0 02-Nov-2020 Christian König <christian.koenig@amd.com>

drm/ttm: rework no_retry handling v2

During eviction we do want to trigger the OOM killer.

Only while doing new allocations we should try to avoid that and
return -ENOMEM to the application.

v2: rename the flag to gfp_retry_mayfail.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/398685/


# 256dd44b 24-Oct-2020 Christian König <christian.koenig@amd.com>

drm/ttm: nuke old page allocator

Not used any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com>
Tested-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/397087/?series=83051&rev=1


# ee5d2a8e 24-Oct-2020 Christian König <christian.koenig@amd.com>

drm/ttm: wire up the new pool as default one v2

Provide the necessary parameters by all drivers and use the new pool alloc
when no driver specific function is provided.

v2: fix the GEM VRAM helpers

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com>
Tested-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/397081/?series=83051&rev=1


# e34b8fee 21-Oct-2020 Christian König <christian.koenig@amd.com>

drm/ttm: merge ttm_dma_tt back into ttm_tt

It makes no difference to kmalloc if the structure
is 48 or 64 bytes in size.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/396950/


# 05f8d250 19-Oct-2020 Christian König <christian.koenig@amd.com>

drm/ttm: move swapin out of page alloc backend

This is not related to allocating the backing store in any way.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/396947/


# d1cb1f25 19-Oct-2020 Christian König <christian.koenig@amd.com>

drm/ttm: nuke ttm_tt_set_(un)populated again

Neither page allocation backend nor the driver should mess with that.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com>
Link: https://patchwork.freedesktop.org/patch/396948/


# ce65b874 30-Sep-2020 Christian König <christian.koenig@amd.com>

drm/ttm: nuke caching placement flags

Changing the caching on the fly never really worked
flawlessly.

So stop this completely and just let drivers specific the
desired caching in the tt or bus object.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/394256/


# 1b4ea4c5 30-Sep-2020 Christian König <christian.koenig@amd.com>

drm/ttm: set the tt caching state at creation time

All drivers can determine the tt caching state at creation time,
no need to do this on the fly during every validation.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/394253/


# ab861424 17-Sep-2020 Christian König <christian.koenig@amd.com>

drm/ttm: remove persistent_swap_storage

Not used any more. Cleanup the code as well while at it.

Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/391079/?series=81804&rev=1
Reviewed-by: Dave Airlie <airlied@redhat.com>


# db9c1734 18-Sep-2020 Christian König <christian.koenig@amd.com>

drm/ttm: stop dangerous caching attribute change

When we swapout/in a BO we try to change the caching
attributes of the pages before/after doing the copy.

On x86 this is done by calling set_pages_uc(),
set_memory_wc() or set_pages_wb() for not highmem pages
to update the linear mapping of the page.

On all other platforms we do exactly nothing.

Now on x86 this is unnecessary because copy_highpage() will
either create a temporary mapping of the page which is wb
anyway and destroyed immediately again or use the linear
mapping with the correct caching attributes.

So stop this nonsense and just keep the caching as it is and
return an error when a driver tries to change the caching of
an already populated TT object.

This is much more defensive since changing caching
attributes is platform and driver specific and usually
doesn't work after the page was initially allocated.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/391293/


# 7626168f 16-Sep-2020 Dave Airlie <airlied@redhat.com>

drm/ttm: flip tt destroy ordering.

Call the driver first and have it call the common code cleanup.

This is useful later to fix unbind.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-3-airlied@gmail.com


# 9e9a153b 14-Sep-2020 Dave Airlie <airlied@redhat.com>

drm/ttm: move ttm binding/unbinding out of ttm_tt paths.

Move these up to the bo level, moving ttm_tt to just being
backing store. Next step is to move the bound flag out.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915024007.67163-6-airlied@gmail.com


# 2040ec97 14-Sep-2020 Dave Airlie <airlied@redhat.com>

drm/ttm: split populate out from binding.

Drivers have to call populate themselves now before binding.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915024007.67163-5-airlied@gmail.com


# 395a73f8 14-Sep-2020 Dave Airlie <airlied@redhat.com>

drm/ttm: tt destroy move null check to outer function.

This just makes things easier later.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915024007.67163-4-airlied@gmail.com


# 7eec9151 14-Sep-2020 Dave Airlie <airlied@redhat.com>

drm/ttm/tt: add wrappers to set tt state.

This adds 2 getters and 4 setters, however unbound and populated
are currently the same thing, this will change, it also drops
a BUG_ON that seems not that useful.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915024007.67163-2-airlied@gmail.com


# 04e89ff3 07-Sep-2020 Dave Airlie <airlied@redhat.com>

drm/ttm: drop the tt backend function paths.

These are now driver side.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200907204630.1406528-14-airlied@gmail.com


# 86008a75 07-Sep-2020 Dave Airlie <airlied@redhat.com>

drm/ttm: add optional bind/unbind via driver.

I want to remove the backend funcs

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200907204630.1406528-3-airlied@gmail.com


# 0a667b50 24-Aug-2020 Dave Airlie <airlied@redhat.com>

drm/ttm: remove bdev from ttm_tt

I want to split this structure up and use it differently,
step one remove bdev pointer from it and pass it explicitly.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200826014428.828392-4-airlied@gmail.com


# 0b062865 19-Aug-2020 Christian König <christian.koenig@amd.com>

drm/ttm: fix broken merge between drm-next and drm-misc-next

drm-next reverted the changes to ttm_tt_create() to do the
NULL check inside the function, but drm-misc-next adds new
users of this approach.

Re-apply the NULL check change inside the function to fix this.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/386628/


# 62975d27 11-Aug-2020 Christian König <ckoenig.leichtzumerken@gmail.com>

drm/ttm: revert "drm/ttm: make TT creation purely optional v3"

This reverts commit 2ddef17678bc2ea1d20517dd2b4ed4aa967ffa8b.

As it turned out VMWGFX needs a much wider audit to fix this.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200811092400.188124-1-christian.koenig@amd.com


# 2966141a 03-Aug-2020 Dave Airlie <airlied@redhat.com>

drm/ttm: rename ttm_mem_reg to ttm_resource.

This name better reflects what the object does. I didn't rename
all the pointers it seemed too messy.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-60-airlied@gmail.com


# 5de5b6ec 27-Jul-2020 Dave Airlie <airlied@redhat.com>

drm/ttm/nouveau: don't call tt destroy callback on alloc failure.

This is confusing, and from my reading of all the drivers only
nouveau got this right.

Just make the API act under driver control of it's own allocation
failing, and don't call destroy, if the page table fails to
create there is nothing to cleanup here.

(I'm willing to believe I've missed something here, so please
review deeply).

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728041736.20689-1-airlied@gmail.com


# 08bb88cf 27-Jul-2020 Dave Airlie <airlied@redhat.com>

drm/ttm: make ttm_tt unbind function return void.

The return value just led to BUG_ON, I think if a driver wants
to BUG_ON here it can do it itself. (don't BUG_ON).

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728040003.20398-1-airlied@gmail.com


# 2ddef176 24-Jun-2020 Christian König <christian.koenig@amd.com>

drm/ttm: make TT creation purely optional v3

We only need the page array when the BO is about to be accessed.

So not only populate, but also create it on demand.

v2: move NULL check into ttm_tt_create()
v3: fix the occurrence in ttm_bo_kmap_ttm as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/373182/


# 2869e82e 03-Nov-2019 Christian König <christian.koenig@amd.com>

drm/ttm: ttm_tt_init_fields() can be static

Fixes: 75a57669cbc8 ("drm/ttm: add ttm_sg_tt_init")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.kernel.org/patch/10263323/


# 52791eee 11-Aug-2019 Christian König <christian.koenig@amd.com>

dma-buf: rename reservation_object to dma_resv

Be more consistent with the naming of the other DMA-buf objects.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/323401/


# e532a135 05-Aug-2019 Gerd Hoffmann <kraxel@redhat.com>

drm/ttm: switch ttm core from bo->resv to bo->base.resv

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190805140119.7337-11-kraxel@redhat.com


# df36b2fb 31-Jul-2018 Huang Rui <ray.huang@amd.com>

drm/ttm: clean up non-x86 definitions on ttm_tt

All non-x86 definitions are moved to ttm_set_memory header, so remove it from
ttm_tt.c.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1297bf2e 02-May-2018 Dirk Hohndel <dirk@hohndel.org>

Add SPDX idenitifier and clarify license

This is dual licensed under GPL-2.0 or MIT.

Signed-off-by: Dirk Hohndel (VMware) <dirk@hohndel.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 536bbeba 28-Feb-2018 Christian König <christian.koenig@amd.com>

drm/ttm: move initializing ttm->sg into ttm_tt_init_fields

Better to set this with all other fields as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# dde5da23 22-Feb-2018 Christian König <christian.koenig@amd.com>

drm/ttm: add bo as parameter to the ttm_tt_create callback

Instead of calculating the size in bytes just to recalculate the number
of pages from it pass the BO directly to the function.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 75a57669 23-Feb-2018 Christian König <christian.koenig@amd.com>

drm/ttm: add ttm_sg_tt_init

This allows drivers to only allocate dma addresses, but not a page
array.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 81f5ec02 22-Feb-2018 Christian König <christian.koenig@amd.com>

drm/ttm: move ttm_tt defines into ttm_tt.h

Let's stop mangling everything in a single header and create one header
per object instead.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 45a9d154 22-Feb-2018 Christian König <christian.koenig@amd.com>

drm/ttm: cleanup ttm_tt_create

Cleanup ttm_tt_create a bit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 97b7e1b8 22-Feb-2018 Christian König <christian.koenig@amd.com>

drm/ttm: move ttm_tt_create into ttm_tt.c v2

Rename ttm_bo_add_ttm to ttm_tt_create and move it into ttm_tt.c.

v2: separate the cleanup.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 231cdafc 21-Feb-2018 Christian König <christian.koenig@amd.com>

drm/ttm: drop ttm->dummy_read_page

Only used by the AGP backend and there it can be easily accessed using
ttm->bdev->glob.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3231a769 21-Feb-2018 Christian König <christian.koenig@amd.com>

drm/ttm: drop ttm->glob

The pointer is available as ttm->bdev->glob as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e44fcf71 21-Feb-2018 Christian König <christian.koenig@amd.com>

drm/ttm: add default implementations for ttm_tt_(un)populate

Use ttm_pool_populate/ttm_pool_unpopulate if the driver doesn't provide
a function.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ec929370 01-Feb-2018 Christian König <christian.koenig@amd.com>

drm/ttm: set page mapping during allocation

To aid debugging set the page mapping during allocation instead of
during VM faults.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 25893a14 01-Feb-2018 Christian König <christian.koenig@amd.com>

drm/ttm: add ttm_tt_populate wrapper

Stop calling the driver callback directly.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ddde985c 25-Jan-2018 Tom St Denis <tom.stdenis@amd.com>

drm/ttm: Fix coding style in ttm_tt_swapout()

Add missing {} braces.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5b4262d7 25-Jan-2018 Tom St Denis <tom.stdenis@amd.com>

drm/ttm: Change ttm_tt page allocations to return errors

Explicitly return errors in ttm_tt_alloc_page_directory() and
ttm_dma_tt_alloc_page_directory() instead of relying on
further logic to detect errors.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# cb5f1a52 22-Dec-2017 Andrey Grodzovsky <andrey.grodzovsky@amd.com>

drm/ttm: Allow page allocations w/o triggering OOM..

This to allow drivers to choose to avoid OOM invocation and handle
page allocation failures instead.

v2:
Remove extra new lines.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 993baf15 21-Dec-2017 Roger He <Hongbo.He@amd.com>

drm/ttm: use an operation ctx for ttm_tt_bind

forward the operation context to ttm_tt_bind as well,
and the ultimate goal is swapout enablement for reserved BOs.

v2: use common term rather than amd specific

Reviewed-by: Thomas Hellström <thellstrom@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chuming Zhou <david1.zhou@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d0cef9fa 21-Dec-2017 Roger He <Hongbo.He@amd.com>

drm/ttm: use an operation ctx for ttm_tt_populate in ttm_bo_driver (v2)

forward the operation context to ttm_tt_populate as well,
and the ultimate goal is swapout enablement for reserved BOs.

v2: squash in fix for vboxvideo

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2098105e 17-May-2017 Michal Hocko <mhocko@kernel.org>

drm: drop drm_[cm]alloc* helpers

Now that drm_[cm]alloc* helpers are simple one line wrappers around
kvmalloc_array and drm_free_large is just kvfree alias we can drop
them and replace by their native forms.

This shouldn't introduce any functional change.

Changes since v1
- fix typo in drivers/gpu//drm/etnaviv/etnaviv_gem.c - noticed by 0day
build robot

Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Michal Hocko <mhocko@suse.com>drm: drop drm_[cm]alloc* helpers
[danvet: Fixup vgem which grew another user very recently.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517122312.GK18247@dhcp22.suse.cz


# ed3ba079 08-May-2017 Laura Abbott <labbott@redhat.com>

drm: use set_memory.h header

set_memory_* functions have moved to set_memory.h. Switch to this
explicitly.

[akpm@linux-foundation.org: track drivers/gpu/drm/i915/i915_gem_gtt.c linux-next changes]
Link: http://lkml.kernel.org/r/1488920133-27229-8-git-send-email-labbott@redhat.com
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# af1f85dd 16-Sep-2016 Alexandre Courbot <acourbot@nvidia.com>

drm/ttm: remove cpu_address member from ttm_tt

Patch 3d50d4dcb0 exposed the CPU address of DMA-allocated pages as
returned by dma_alloc_coherent because Nouveau on Tegra needed it.

This is not required anymore - as there were no other users for it,
remove it and save some memory for everyone.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2ff2bf1e 20-Jul-2016 Christian König <christian.koenig@amd.com>

drm/ttm: partial revert "cleanup ttm_tt_(unbind|destroy)" v3

We still need to unbind explicitly during a move.

This partial reverts commit ff20caa0bcbfef9f7686f8d1868a3b990921afd6.

v2: remove unnecessary check and unused variable
v3: fix typo in commit message

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4279cb14 06-Jun-2016 Christian König <christian.koenig@amd.com>

drm/ttm: remove NULL checks when calling ttm_tt_destroy

The function is a no-op with a NULL pointer.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 089f16c5 06-Jun-2016 Christian König <christian.koenig@amd.com>

drm/ttm: cleanup ttm_tt_(unbind|destroy)

ttm_tt_destroy should be the only one unbinding the object.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 93c76a3d 04-Dec-2015 Al Viro <viro@zeniv.linux.org.uk>

file_inode(f)->i_mapping is f->f_mapping

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 09cbfeaf 01-Apr-2016 Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros

PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.

This promise never materialized. And unlikely will.

We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.

Let's stop pretending that pages in page cache are special. They are
not.

The changes are pretty straight-forward:

- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

- page_cache_get() -> get_page();

- page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.

The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.

There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.

virtual patch

@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT

@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE

@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK

@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)

@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)

@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 55579cfe 31-Jul-2015 Viresh Kumar <viresh.kumar@linaro.org>

drivers: gpu: Drop unlikely before IS_ERR(_OR_NULL)

IS_ERR(_OR_NULL) already contain an 'unlikely' compiler flag and there
is no need to do that again from its callers. Drop it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 3d50d4dc 04-Aug-2014 Alexandre Courbot <acourbot@nvidia.com>

drm/ttm: expose CPU address of DMA-allocated pages

Pages allocated using the DMA API have a coherent memory mapping. Make
this mapping visible to drivers so they can decide to use it instead of
creating their own redundant one.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>


# 1b76af5c 05-Feb-2014 Thomas Hellstrom <thellstrom@vmware.com>

drm/ttm: Don't clear page metadata of imported sg pages

These page pointers shouldn't be visible to TTM in the first place, but
until we fix that up, don't clear the page metadata because that
will upset the exporter.

Reported-and-tested-by: Cristoph Haag <haagch.christoph@googleemail.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>


# 58aa6622 03-Jan-2014 Thomas Hellstrom <thellstrom@vmware.com>

drm/ttm: Correctly set page mapping and -index members

Needed for some vm operations; most notably unmap_mapping_range() with
even_cows = 0.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>


# 182b17c8 16-Sep-2013 Ben Skeggs <bskeggs@redhat.com>

drm/ttm: fix the tt_populated check in ttm_tt_destroy()

After a vmalloc failure in ttm_dma_tt_alloc_page_directory(),
ttm_dma_tt_init() will call ttm_tt_destroy() to cleanup, and end up
inside the driver's unpopulate() hook when populate() has never yet
been called.

On nouveau, the first issue to be hit because of this is that
dma_address[] may be a NULL pointer. After working around this,
ttm_pool_unpopulate() may potentially hit the same issue with
the pages[] array.

It seems to make more sense to avoid calling unpopulate on already
unpopulated TTMs than to add checks to all the implementations.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: stable@vger.kernel.org
Cc: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 496ad9aa 23-Jan-2013 Al Viro <viro@zeniv.linux.org.uk>

new helper: file_inode(file)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 55aa914e 08-Nov-2012 Akinobu Mita <akinobu.mita@gmail.com>

drm/ttm: remove unneeded preempt_disable/enable

It is unnecessary to disable preemption explicitly while calling
copy_highpage(). Because copy_highpage() will do it again through
kmap_atomic/kunmap_atomic.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 760285e7 02-Oct-2012 David Howells <dhowells@redhat.com>

UAPI: (Scripted) Convert #include "..." to #include <path/...> in drivers/gpu/

Convert #include "..." to #include <path/...> in drivers/gpu/.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>


# 259a290e 25-Sep-2012 Akinobu Mita <akinobu.mita@gmail.com>

gpu/drm/ttm: use copy_highpage

Use copy_highpage() to copy from one page to another.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 1c9c20f6 25-Nov-2011 Cong Wang <amwang@redhat.com>

drm: remove the second argument of k[un]map_atomic()

Signed-off-by: Cong Wang <amwang@redhat.com>


# 25d0479a 16-Mar-2012 Joe Perches <joe@perches.com>

drm/ttm: Use pr_fmt and pr_<level>

Use the more current logging style.

Add pr_fmt and remove the TTM_PFX uses.
Coalesce formats and align arguments.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# dea7e0ac 03-Jan-2012 Jerome Glisse <jglisse@redhat.com>

ttm: fix agp since ttm tt rework

ttm tt rework modified the way we allocate and populate the
ttm_tt structure, the AGP side was missing some bit to properly
work. Fix those and fix radeon and nouveau AGP support.

Tested on radeon only so far.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 8e7e7052 09-Nov-2011 Jerome Glisse <jglisse@redhat.com>

drm/ttm: isolate dma data from ttm_tt V4

Move dma data to a superset ttm_dma_tt structure which herit
from ttm_tt. This allow driver that don't use dma functionalities
to not have to waste memory for it.

V2 Rebase on top of no memory account changes (where/when is my
delorean when i need it ?)
V3 Make sure page list is initialized empty
V4 typo/syntax fixes

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>


# 2334b75f 03-Nov-2011 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

drm/ttm: provide dma aware ttm page pool code V9

In TTM world the pages for the graphic drivers are kept in three different
pools: write combined, uncached, and cached (write-back). When the pages
are used by the graphic driver the graphic adapter via its built in MMU
(or AGP) programs these pages in. The programming requires the virtual address
(from the graphic adapter perspective) and the physical address (either System RAM
or the memory on the card) which is obtained using the pci_map_* calls (which does the
virtual to physical - or bus address translation). During the graphic application's
"life" those pages can be shuffled around, swapped out to disk, moved from the
VRAM to System RAM or vice-versa. This all works with the existing TTM pool code
- except when we want to use the software IOTLB (SWIOTLB) code to "map" the physical
addresses to the graphic adapter MMU. We end up programming the bounce buffer's
physical address instead of the TTM pool memory's and get a non-worky driver.
There are two solutions:
1) using the DMA API to allocate pages that are screened by the DMA API, or
2) using the pci_sync_* calls to copy the pages from the bounce-buffer and back.

This patch fixes the issue by allocating pages using the DMA API. The second
is a viable option - but it has performance drawbacks and potential correctness
issues - think of the write cache page being bounced (SWIOTLB->TTM), the
WC is set on the TTM page and the copy from SWIOTLB not making it to the TTM
page until the page has been recycled in the pool (and used by another application).

The bounce buffer does not get activated often - only in cases where we have
a 32-bit capable card and we want to use a page that is allocated above the
4GB limit. The bounce buffer offers the solution of copying the contents
of that 4GB page to an location below 4GB and then back when the operation has been
completed (or vice-versa). This is done by using the 'pci_sync_*' calls.
Note: If you look carefully enough in the existing TTM page pool code you will
notice the GFP_DMA32 flag is used - which should guarantee that the provided page
is under 4GB. It certainly is the case, except this gets ignored in two cases:
- If user specifies 'swiotlb=force' which bounces _every_ page.
- If user is using a Xen's PV Linux guest (which uses the SWIOTLB and the
underlaying PFN's aren't necessarily under 4GB).

To not have this extra copying done the other option is to allocate the pages
using the DMA API so that there is not need to map the page and perform the
expensive 'pci_sync_*' calls.

This DMA API capable TTM pool requires for this the 'struct device' to
properly call the DMA API. It also has to track the virtual and bus address of
the page being handed out in case it ends up being swapped out or de-allocated -
to make sure it is de-allocated using the proper's 'struct device'.

Implementation wise the code keeps two lists: one that is attached to the
'struct device' (via the dev->dma_pools list) and a global one to be used when
the 'struct device' is unavailable (think shrinker code). The global list can
iterate over all of the 'struct device' and its associated dma_pool. The list
in dev->dma_pools can only iterate the device's dma_pool.
/[struct device_pool]\
/---------------------------------------------------| dev |
/ +-------| dma_pool |
/-----+------\ / \--------------------/
|struct device| /-->[struct dma_pool for WC]</ /[struct device_pool]\
| dma_pools +----+ /-| dev |
| ... | \--->[struct dma_pool for uncached]<-/--| dma_pool |
\-----+------/ / \--------------------/
\----------------------------------------------/
[Two pools associated with the device (WC and UC), and the parallel list
containing the 'struct dev' and 'struct dma_pool' entries]

The maximum amount of dma pools a device can have is six: write-combined,
uncached, and cached; then there are the DMA32 variants which are:
write-combined dma32, uncached dma32, and cached dma32.

Currently this code only gets activated when any variant of the SWIOTLB IOMMU
code is running (Intel without VT-d, AMD without GART, IBM Calgary and Xen PV
with PCI devices).

Tested-by: Michel Dänzer <michel@daenzer.net>
[v1: Using swiotlb_nr_tbl instead of swiotlb_enabled]
[v2: Major overhaul - added 'inuse_list' to seperate used from inuse and reorder
the order of lists to get better performance.]
[v3: Added comments/and some logic based on review, Added Jerome tag]
[v4: rebase on top of ttm_tt & ttm_backend merge]
[v5: rebase on top of ttm memory accounting overhaul]
[v6: New rebase on top of more memory accouting changes]
[v7: well rebase on top of no memory accounting changes]
[v8: make sure pages list is initialized empty]
[v9: calll ttm_mem_global_free_page in unpopulate for accurate accountg]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Acked-by: Thomas Hellstrom <thellstrom@vmware.com>


# b1e5f172 02-Nov-2011 Jerome Glisse <jglisse@redhat.com>

drm/ttm: introduce callback for ttm_tt populate & unpopulate V4

Move the page allocation and freeing to driver callback and
provide ttm code helper function for those.

Most intrusive change, is the fact that we now only fully
populate an object this simplify some of code designed around
the page fault design.

V2 Rebase on top of memory accounting overhaul
V3 New rebase on top of more memory accouting changes
V4 Rebase on top of no memory account changes (where/when is my
delorean when i need it ?)

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>


# 649bf3ca 01-Nov-2011 Jerome Glisse <jglisse@redhat.com>

drm/ttm: merge ttm_backend and ttm_tt V5

ttm_backend will only exist with a ttm_tt, and ttm_tt
will only be of interest when bound to a backend. Merge them
to avoid code and data duplication.

V2 Rebase on top of memory accounting overhaul
V3 Rebase on top of more memory accounting changes
V4 Rebase on top of no memory account changes (where/when is my
delorean when i need it ?)
V5 make sure ttm is unbound before destroying, change commit
message on suggestion from Tormod Volden

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>


# 822c4d9a 10-Nov-2011 Jerome Glisse <jglisse@redhat.com>

drm/ttm: page allocation use page array instead of list

Use the ttm_tt pages array for pages allocations, move the list
unwinding into the page allocation functions.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>


# f9517e63 01-Nov-2011 Jerome Glisse <jglisse@redhat.com>

drm/ttm: test for dma_address array allocation failure

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>


# 5e265680 02-Nov-2011 Jerome Glisse <jglisse@redhat.com>

drm/ttm: use ttm put pages function to properly restore cache attribute

On failure we need to make sure the page we free has wb cache
attribute. Do this pas call the proper ttm page helper function.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>


# 667b7a27 01-Nov-2011 Jerome Glisse <jglisse@redhat.com>

drm/ttm: remove split btw highmen and lowmem page

Split btw highmem and lowmem page was rendered useless by the
pool code. Remove it. Note further cleanup would change the
ttm page allocation helper to actualy take an array instead
of relying on list this could drasticly reduce the number of
function call in the common case of allocation whole buffer.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>


# 3316497b 01-Nov-2011 Jerome Glisse <jglisse@redhat.com>

drm/ttm: remove userspace backed ttm object support

This was never use in none of the driver, properly using userspace
page for bo would need more code (vma interaction mostly). Removing
this dead code in preparation of ttm_tt & backend merge.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>


# 2d1a8a48 30-Aug-2011 Paul Gortmaker <paul.gortmaker@windriver.com>

gpu: Add export.h as required to drivers/gpu files.

They need this to get all the EXPORT_SYMBOL variants and THIS_MODULE

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>


# 3142b651 27-Jun-2011 Hugh Dickins <hughd@google.com>

drm/ttm: use shmem_read_mapping_page

Soon tmpfs will stop supporting ->readpage and read_mapping_page(): once
"tmpfs: add shmem_read_mapping_page_gfp" has been applied, this patch can
be applied to ease the transition.

ttm_tt_swapin() and ttm_tt_swapout() use shmem_read_mapping_page() in
place of read_mapping_page(), since their swap_space has been created with
shmem_file_setup().

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 5df23979 03-Apr-2011 Jan Engelhardt <jengelh@medozas.de>

drm: fix "persistant" typo

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# a2c06ee2 22-Feb-2011 Dave Airlie <airlied@redhat.com>

Revert "ttm: Include the 'struct dev' when using the DMA API."

This reverts commit 5a893fc28f0393adb7c885a871b8c59e623fd528.

This causes a use after free in the ttm free alloc pages path,
when it tries to get the be after the be has been destroyed.

Signed-off-by: Dave Airlie <airlied@redhat.com>


# 5a893fc2 22-Feb-2011 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

ttm: Include the 'struct dev' when using the DMA API.

This makes the accounting when using 'debug_dma_dump_mappings()'
and CONFIG_DMA_API_DEBUG=y be assigned to the correct device
instead of 'fallback'.

No functional change - just cosmetic.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>


# 27e8b237 02-Dec-2010 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

ttm: Expand (*populate) to support an array of DMA addresses.

We pass in the array of ttm pages to be populated in the GART/MM
of the card (or AGP). Patch titled: "ttm: Utilize the DMA API for
pages that have TTM_PAGE_FLAG_DMA32 set." uses the DMA API to make
those pages have a proper DMA addresses (in the situation where
page_to_phys or virt_to_phys do not give use the DMA (bus) address).

Since we are using the DMA API on those pages, we should pass in the
DMA address to this function so it can save it in its proper fields
(later patches use it).

[v2: Added reviewed-by tag]

Reviewed-by: Thomas Hellstrom <thellstrom@shipmail.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Ian Campbell <ian.campbell@citrix.com>


# f9820a46 29-Nov-2010 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

ttm: Introduce a placeholder for DMA (bus) addresses.

This is right now limited to only non-pool constructs.

[v2: Fixed indentation issues, add review-by tag]

Reviewed-by: Thomas Hellstrom <thomas@shipmail.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Ian Campbell <ian.campbell@citrix.com>


# 7dcebb52 29-Oct-2010 Thomas Hellstrom <thellstrom@vmware.com>

drm/ttm: remove failed ttm binding error printout

The driver (for example vmwgfx) may want to silently deal with the
error itself.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 1403b1a3 31-Mar-2010 Pauli Nieminen <suokkos@gmail.com>

drm/ttm: add pool wc/uc page allocator V3

On AGP system we might allocate/free routinely uncached or wc memory,
changing page from cached (wb) to uc or wc is very expensive and involves
a lot of flushing. To improve performance this allocator use a pool
of uc,wc pages.

Pools are protected with spinlocks to allow multiple threads to allocate pages
simultanously. Expensive operations are done outside of spinlock to maximize
concurrency.

Pools are linked lists of pages that were recently freed. mm shrink callback
allows kernel to claim back pages when they are required for something else.

Fixes:
* set_pages_array_wb handles highmem pages so we don't have to remove them
from pool.
* Add count parameter to ttm_put_pages to avoid looping in free code.
* Change looping from _safe to normal in pool fill error path.
* Initialize sum variable and make the loop prettier in get_num_unused_pages.

* Moved pages_freed reseting inside the loop in ttm_page_pool_free.
* Add warning comment about spinlock context in ttm_page_pool_free.

Based on Jerome Glisse's and Dave Airlie's pool allocator.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Pauli Nieminen <suokkos@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 5a0e3ad6 24-Mar-2010 Tejun Heo <tj@kernel.org>

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>


# 72e942dd 08-Mar-2010 Dave Airlie <airlied@redhat.com>

drm/ttm: use drm calloc large and free large

Now that the drm core can do this, lets just use it, split the code out
so TTM doesn't have to drag all of drmP.h in.

Signed-off-by: Dave Airlie <airlied@redhat.com>


# 290e5505 19-Feb-2010 Maarten Maathuis <madman2003@gmail.com>

drm/ttm: handle OOM in ttm_tt_swapout

- Without this change I get a general protection fault.
- Also use PTR_ERR where applicable.

Signed-off-by: Maarten Maathuis <madman2003@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Acked-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>


# f0e2f38b 19-Feb-2010 Francisco Jerez <currojerez@riseup.net>

drm/ttm: fix caching problem on non-PAT systems.

http://bugzilla.kernel.org/show_bug.cgi?id=15328

This fixes a serious regression on AGP/non-PAT systems, where
pages were ending up in the wrong state and slowing down the
whole system.

[airlied: taken this from the bug as the other option is to revert
the change which caused it].

Tested-by: John W. Linville (in bug).
Signed-off-by: Dave Airlie <airlied@redhat.com>


# db78e27d 12-Jan-2010 Francisco Jerez <currojerez@riseup.net>

drm/ttm: Avoid conflicting reserve_memtype during ttm_tt_set_page_caching.

Fixes errors like:
> reserve_ram_pages_type failed 0x15b7a000-0x15b7b000, track 0x8, req 0x10
when a BO is moved between WC and UC areas.

Reported-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Acked-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 4bfd75cb 06-Dec-2009 Thomas Hellstrom <thellstrom@vmware.com>

drm/ttm: Export symbols needed for the vmwgfx driver.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# df67bed9 29-Oct-2009 Dave Airlie <airlied@redhat.com>

drm/radeon/kms: fix coherency issues on AGP cards.

When we are evicting from VRAM->RAM we allocate the ttm object,
but we don't set the caching policy on it before blitting into it.
This means on AGP we end up blitting into cached pages, and
the CPU later flushes out on top of them. This was mostly seen as
font corruption.

The other question is why we don't evict VRAM->GTT in a lot of cases,
this would save us some cache transitions since a lot of objects
that are evicted from VRAM will probably end up being pulled back in
a few operations later, and evicting them to system memory involves
2 unnecessary cache transitions.

Signed-off-by: Dave Airlie <airlied@redhat.com>


# c9c97b8c 26-Aug-2009 Dave Airlie <airlied@redhat.com>

drm/ttm: consolidate cache flushing code in one place.

This merges the TTM and drm cache flushing into one file in the
drm core.

Signed-off-by: Dave Airlie <airlied@redhat.com>


# a987fcaa 18-Aug-2009 Thomas Hellstrom <thellstrom@vmware.com>

ttm: Make parts of a struct ttm_bo_device global.

Common resources, like memory accounting and swap lists should be
global and not per device. Introduce a struct ttm_bo_global to
accomodate this, and register it with sysfs. Add a small sysfs interface
to return the number of active buffer objects.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@linux.ie>


# 5fd9cbad 17-Aug-2009 Thomas Hellstrom <thellstrom@vmware.com>

drm/ttm: Memory accounting rework.

Use inclusive zones to simplify accounting and its sysfs representation.
Use DMA32 accounting where applicable.

Add a sysfs interface to make the heuristically determined limits
readable and configurable.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@linux.ie>


# b42db2b1 29-Jul-2009 Dave Airlie <airlied@redhat.com>

drm/ttm: fix highuser vs dma32 confusion.

DMA32 and highmem are sort of exclusive.

Noticed by AndrewR on #radeon.

Signed-off-by: Dave Airlie <airlied@redhat.com>


# f121ecfe 24-Jul-2009 Thomas Hellstrom <thellstrom@vmware.com>

drm/ttm: powerpc: Fix Highmem cache flushing.

Temporarily maps highmem pages while flushing to get a valid virtual
address to flush.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# ad49f501 10-Jul-2009 Dave Airlie <airlied@linux.ie>

drm/ttm/radeon: add dma32 support.

This add support for using dma32 memory on gpus that really need it.

Currently IGPs are left without DMA32 but we might need to change
that unless we can fix rs690.

Signed-off-by: Dave Airlie <airlied@redhat.com>


# 8b169b5f 24-Jun-2009 Huang Weiyi <weiyi.huang@gmail.com>

drm: remove unused #include <linux/version.h>'s

Remove unused #include <linux/version.h>('s) in
drivers/gpu/drm/ttm/ttm_bo_util.c
drivers/gpu/drm/ttm/ttm_bo_vm.c
drivers/gpu/drm/ttm/ttm_tt.c

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 46f4b3ea 15-Jun-2009 Michel Dänzer <daenzer@vmware.com>

drm/ttm: Add some powerpc cache flush code.

Optimise the powerpc flushing path for TTM.

Signed-off-by: Dave Airlie <airlied@redhat.com>


# ba4e7d97 10-Jun-2009 Thomas Hellstrom <thellstrom@vmware.com>

drm: Add the TTM GPU memory manager subsystem.

TTM is a GPU memory manager subsystem designed for use with GPU
devices with various memory types (On-card VRAM, AGP,
PCI apertures etc.). It's essentially a helper library that assists
the DRM driver in creating and managing persistent buffer objects.

TTM manages placement of data and CPU map setup and teardown on
data movement. It can also optionally manage synchronization of
data on a per-buffer-object level.

TTM takes care to provide an always valid virtual user-space address
to a buffer object which makes user-space sub-allocation of
big buffer objects feasible.

TTM uses a fine-grained per buffer-object locking scheme, taking
care to release all relevant locks when waiting for the GPU.
Although this implies some locking overhead, it's probably a big
win for devices with multiple command submission mechanisms, since
the lock contention will be minimal.

TTM can be used with whatever user-space interface the driver
chooses, including GEM. It's used by the upcoming Radeon KMS DRM driver
and is also the GPU memory management core of various new experimental
DRM drivers.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>