History log of /linux-master/drivers/gpu/drm/radeon/radeon_gart.c
Revision Date Author Comments
# 8eb94c9b 14-Jul-2023 Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>

drm/radeon: Fix style issues in radeon _encoders.c & _gart.c

Conform to Linux kernel coding style.

Fixes the following & other checks in radeon_encoders.c & radeon_gart.c:

WARNING: Missing a blank line after declarations
WARNING: Block comments use * on subsequent lines
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
WARNING: braces {} are not necessary for single statement blocks

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a6ce1e1a 12-May-2021 Christian König <christian.koenig@amd.com>

drm/radeon: use the dummy page for GART if needed

Imported BOs don't have a pagelist any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes: 0575ff3d33cd ("drm/radeon: stop using pages with drm_prime_sg_to_page_addr_arrays v2")


# 0c8df343 12-May-2021 Christian König <christian.koenig@amd.com>

drm/radeon: use the dummy page for GART if needed

Imported BOs don't have a pagelist any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes: 0575ff3d33cd ("drm/radeon: stop using pages with drm_prime_sg_to_page_addr_arrays v2")
CC: stable@vger.kernel.org # 5.12


# 4c0d0bcb 26-Jul-2020 Christophe JAILLET <christophe.jaillet@wanadoo.fr>

drm/radeon: switch from 'pci_' to 'dma_' API

The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'radeon_gart_table_ram_alloc()' GFP_KERNEL
can be used because its callers already use this flag.

Both 'r100_pci_gart_init()' (r100.c) and 'rs400_gart_init()' (rs400.c)
call 'radeon_gart_init()'.
This function uses 'vmalloc'.

@@
@@
- PCI_DMA_BIDIRECTIONAL
+ DMA_BIDIRECTIONAL

@@
@@
- PCI_DMA_TODEVICE
+ DMA_TO_DEVICE

@@
@@
- PCI_DMA_FROMDEVICE
+ DMA_FROM_DEVICE

@@
@@
- PCI_DMA_NONE
+ DMA_NONE

@@
expression e1, e2, e3;
@@
- pci_alloc_consistent(e1, e2, e3)
+ dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
- pci_zalloc_consistent(e1, e2, e3)
+ dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
- pci_free_consistent(e1, e2, e3, e4)
+ dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
- pci_map_single(e1, e2, e3, e4)
+ dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
- pci_unmap_single(e1, e2, e3, e4)
+ dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
- pci_map_page(e1, e2, e3, e4, e5)
+ dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
- pci_unmap_page(e1, e2, e3, e4)
+ dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
- pci_map_sg(e1, e2, e3, e4)
+ dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
- pci_unmap_sg(e1, e2, e3, e4)
+ dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
- pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+ dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
- pci_dma_sync_single_for_device(e1, e2, e3, e4)
+ dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
- pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+ dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
- pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+ dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
- pci_dma_mapping_error(e1, e2)
+ dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
- pci_set_dma_mask(e1, e2)
+ dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
- pci_set_consistent_dma_mask(e1, e2)
+ dma_set_coherent_mask(&e1->dev, e2)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f5cd8555 26-Jul-2020 Christophe JAILLET <christophe.jaillet@wanadoo.fr>

drm/radeon: avoid a useless memset

Avoid a memset after a call to 'dma_alloc_coherent()'.
This is useless since
commit 518a2f1925c3 ("dma-mapping: zero memory returned from dma_alloc_*")

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2ef79416 03-Dec-2019 Thomas Zimmermann <tzimmermann@suse.de>

drm/radeon: Don't include <drm/drm_pci.h>

Including <drm/drm_pci.h> is unnecessary in most cases. Replace
these instances.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191203100406.9674-9-tzimmermann@suse.de


# f9183127 08-Jun-2019 Sam Ravnborg <sam@ravnborg.org>

drm/radeon: drop use of drmP.h (1/2)

Drop use of drmP.h in all .c files named radeon*c.
To ease review a little drmP.h removal was divided in two commits.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190608080241.4958-7-sam@ravnborg.org


# fad953ce 12-Jun-2018 Kees Cook <keescook@chromium.org>

treewide: Use array_size() in vzalloc()

The vzalloc() function has no 2-factor argument form, so multiplication
factors need to be wrapped in array_size(). This patch replaces cases of:

vzalloc(a * b)

with:
vzalloc(array_size(a, b))

as well as handling cases of:

vzalloc(a * b * c)

with:

vzalloc(array3_size(a, b, c))

This does, however, attempt to ignore constant size factors like:

vzalloc(4 * 1024)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
vzalloc(
- (sizeof(TYPE)) * E
+ sizeof(TYPE) * E
, ...)
|
vzalloc(
- (sizeof(THING)) * E
+ sizeof(THING) * E
, ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
vzalloc(
- sizeof(u8) * (COUNT)
+ COUNT
, ...)
|
vzalloc(
- sizeof(__u8) * (COUNT)
+ COUNT
, ...)
|
vzalloc(
- sizeof(char) * (COUNT)
+ COUNT
, ...)
|
vzalloc(
- sizeof(unsigned char) * (COUNT)
+ COUNT
, ...)
|
vzalloc(
- sizeof(u8) * COUNT
+ COUNT
, ...)
|
vzalloc(
- sizeof(__u8) * COUNT
+ COUNT
, ...)
|
vzalloc(
- sizeof(char) * COUNT
+ COUNT
, ...)
|
vzalloc(
- sizeof(unsigned char) * COUNT
+ COUNT
, ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
vzalloc(
- sizeof(TYPE) * (COUNT_ID)
+ array_size(COUNT_ID, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * COUNT_ID
+ array_size(COUNT_ID, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * (COUNT_CONST)
+ array_size(COUNT_CONST, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * COUNT_CONST
+ array_size(COUNT_CONST, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(THING) * (COUNT_ID)
+ array_size(COUNT_ID, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * COUNT_ID
+ array_size(COUNT_ID, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * (COUNT_CONST)
+ array_size(COUNT_CONST, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * COUNT_CONST
+ array_size(COUNT_CONST, sizeof(THING))
, ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

vzalloc(
- SIZE * COUNT
+ array_size(COUNT, SIZE)
, ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
vzalloc(
- sizeof(TYPE) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(THING) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
vzalloc(
- sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
vzalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
vzalloc(
- sizeof(THING1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
vzalloc(
- sizeof(THING1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
vzalloc(
- sizeof(TYPE1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
|
vzalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
vzalloc(
- (COUNT) * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- COUNT * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- COUNT * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- (COUNT) * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- COUNT * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- (COUNT) * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- (COUNT) * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- COUNT * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
)

// Any remaining multi-factor products, first at least 3-factor products
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
vzalloc(C1 * C2 * C3, ...)
|
vzalloc(
- E1 * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
)

// And then all remaining 2 factors products when they're not all constants.
@@
expression E1, E2;
constant C1, C2;
@@

(
vzalloc(C1 * C2, ...)
|
vzalloc(
- E1 * E2
+ array_size(E1, E2)
, ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>


# 42bc47b3 12-Jun-2018 Kees Cook <keescook@chromium.org>

treewide: Use array_size() in vmalloc()

The vmalloc() function has no 2-factor argument form, so multiplication
factors need to be wrapped in array_size(). This patch replaces cases of:

vmalloc(a * b)

with:
vmalloc(array_size(a, b))

as well as handling cases of:

vmalloc(a * b * c)

with:

vmalloc(array3_size(a, b, c))

This does, however, attempt to ignore constant size factors like:

vmalloc(4 * 1024)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
vmalloc(
- (sizeof(TYPE)) * E
+ sizeof(TYPE) * E
, ...)
|
vmalloc(
- (sizeof(THING)) * E
+ sizeof(THING) * E
, ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
vmalloc(
- sizeof(u8) * (COUNT)
+ COUNT
, ...)
|
vmalloc(
- sizeof(__u8) * (COUNT)
+ COUNT
, ...)
|
vmalloc(
- sizeof(char) * (COUNT)
+ COUNT
, ...)
|
vmalloc(
- sizeof(unsigned char) * (COUNT)
+ COUNT
, ...)
|
vmalloc(
- sizeof(u8) * COUNT
+ COUNT
, ...)
|
vmalloc(
- sizeof(__u8) * COUNT
+ COUNT
, ...)
|
vmalloc(
- sizeof(char) * COUNT
+ COUNT
, ...)
|
vmalloc(
- sizeof(unsigned char) * COUNT
+ COUNT
, ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
vmalloc(
- sizeof(TYPE) * (COUNT_ID)
+ array_size(COUNT_ID, sizeof(TYPE))
, ...)
|
vmalloc(
- sizeof(TYPE) * COUNT_ID
+ array_size(COUNT_ID, sizeof(TYPE))
, ...)
|
vmalloc(
- sizeof(TYPE) * (COUNT_CONST)
+ array_size(COUNT_CONST, sizeof(TYPE))
, ...)
|
vmalloc(
- sizeof(TYPE) * COUNT_CONST
+ array_size(COUNT_CONST, sizeof(TYPE))
, ...)
|
vmalloc(
- sizeof(THING) * (COUNT_ID)
+ array_size(COUNT_ID, sizeof(THING))
, ...)
|
vmalloc(
- sizeof(THING) * COUNT_ID
+ array_size(COUNT_ID, sizeof(THING))
, ...)
|
vmalloc(
- sizeof(THING) * (COUNT_CONST)
+ array_size(COUNT_CONST, sizeof(THING))
, ...)
|
vmalloc(
- sizeof(THING) * COUNT_CONST
+ array_size(COUNT_CONST, sizeof(THING))
, ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

vmalloc(
- SIZE * COUNT
+ array_size(COUNT, SIZE)
, ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
vmalloc(
- sizeof(TYPE) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
vmalloc(
- sizeof(TYPE) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
vmalloc(
- sizeof(TYPE) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
vmalloc(
- sizeof(TYPE) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
vmalloc(
- sizeof(THING) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
vmalloc(
- sizeof(THING) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
vmalloc(
- sizeof(THING) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
vmalloc(
- sizeof(THING) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
vmalloc(
- sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
vmalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
vmalloc(
- sizeof(THING1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
vmalloc(
- sizeof(THING1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
vmalloc(
- sizeof(TYPE1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
|
vmalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
vmalloc(
- (COUNT) * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vmalloc(
- COUNT * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vmalloc(
- COUNT * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vmalloc(
- (COUNT) * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vmalloc(
- COUNT * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vmalloc(
- (COUNT) * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vmalloc(
- (COUNT) * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vmalloc(
- COUNT * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
)

// Any remaining multi-factor products, first at least 3-factor products
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
vmalloc(C1 * C2 * C3, ...)
|
vmalloc(
- E1 * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
)

// And then all remaining 2 factors products when they're not all constants.
@@
expression E1, E2;
constant C1, C2;
@@

(
vmalloc(C1 * C2, ...)
|
vmalloc(
- E1 * E2
+ array_size(E1, E2)
, ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>


# ed3ba079 08-May-2017 Laura Abbott <labbott@redhat.com>

drm: use set_memory.h header

set_memory_* functions have moved to set_memory.h. Switch to this
explicitly.

[akpm@linux-foundation.org: track drivers/gpu/drm/i915/i915_gem_gtt.c linux-next changes]
Link: http://lkml.kernel.org/r/1488920133-27229-8-git-send-email-labbott@redhat.com
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 233709d2 02-Jul-2015 Michel Dänzer <michel.daenzer@amd.com>

drm/radeon: Don't flush the GART TLB if rdev->gart.ptr == NULL

This can be the case when the GPU is powered off, e.g. via vgaswitcheroo
or runpm. When the GPU is powered up again, radeon_gart_table_vram_pin
flushes the TLB after setting rdev->gart.ptr to non-NULL.

Fixes panic on powering off R7xx GPUs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61529
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org


# 16653dba 21-Jan-2015 Michel Dänzer <michel.daenzer@amd.com>

drm/radeon: Remove rdev->gart.pages_addr array

radeon_vm_map_gart can use rdev->gart.pages_entry instead.

Also move the masking of the page address to radeon_vm_map_gart from its
callers.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5636d2f8 22-Jan-2015 Michel Dänzer <michel.daenzer@amd.com>

drm/radeon: Restore GART table contents after pinning it in VRAM v3

The GART table BO has to be moved out of VRAM for suspend/resume. Any
updates to the GART table during that time were silently dropped without
this change. This caused GPU lockups on resume in some cases, see the bug
reports referenced below.

This might also make GPU reset more robust in some cases, as we no longer
rely on the GART table in VRAM being preserved across the GPU
lockup/reset.

v2: Add logic to radeon_gart_table_vram_pin directly instead of
reinstating radeon_gart_restore
v3: Move code after assignment of rdev->gart.table_addr so that the GART
TLB flush can work as intended, add code comment explaining why we're
doing this

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85204
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86267
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# cb658906 21-Jan-2015 Michel Dänzer <michel.daenzer@amd.com>

drm/radeon: Split off gart_get_page_entry ASIC hook from set_page_entry

get_page_entry calculates the GART page table entry, which is just written
to the GART page table by set_page_entry.

This is a prerequisite for the following fix.

Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 831b6966 18-Sep-2014 Maarten Lankhorst <maarten.lankhorst@canonical.com>

drm/radeon: export reservation_object from dmabuf to ttm

Adds an extra argument to radeon_bo_create, which is only used in radeon_prime.c.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 02376d82 17-Jul-2014 Michel Dänzer <michel.daenzer@amd.com>

drm/radeon: Allow write-combined CPU mappings of BOs in GTT (v2)

v2: fix rebase onto drm-fixes

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 77497f27 17-Jul-2014 Michel Dänzer <michel.daenzer@amd.com>

drm/radeon: Pass GART page flags to radeon_gart_set_page() explicitly

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a3eb06db 09-Jul-2014 Michel Dänzer <michel.daenzer@amd.com>

drm/radeon: Remove radeon_gart_restore()

Doesn't seem necessary, the GART table memory should be persistent.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2280ab57 20-Feb-2014 Christian König <christian.koenig@amd.com>

drm/radeon: separate gart and vm functions

Both are complex enough on their own.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>


# 31dd8d93 23-Jan-2014 Christian König <christian.koenig@amd.com>

drm/radeon: add missing trace point

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 593b2635 23-Jan-2014 Christian König <christian.koenig@amd.com>

drm/radeon: fix VMID use tracking

Otherwise we allocate a new VMID on nearly every submit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 9c57a6bd 25-Nov-2013 Christian König <christian.koenig@amd.com>

drm/radeon: add radeon_vm_bo_update trace point

Also rename the function to better reflect what it is doing.

agd5f: fix argument size warning

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 84d597b7 25-Nov-2013 Christian König <christian.koenig@amd.com>

drm/radeon: add VMID allocation trace point

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4cc948b9 13-Nov-2013 Alex Deucher <alexander.deucher@amd.com>

drm/radeon/vm: don't attempt to update ptes if ib allocation fails

If we fail to allocate an indirect buffer (ib) when updating
the ptes, return an error instead of trying to use the ib.
Avoids a null pointer dereference.

Bug:
https://bugzilla.kernel.org/show_bug.cgi?id=58621

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org


# 1654b817 11-Nov-2013 Christian König <christian.koenig@amd.com>

drm/radeon: allow semaphore emission to fail

To workaround bugs and/or certain limits it's sometimes
useful to fall back to waiting on fences.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org


# db96bd25 29-Oct-2013 Christian König <christian.koenig@amd.com>

drm/radeon: clear the page directory using the DMA

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5b2906ec 29-Oct-2013 Christian König <christian.koenig@amd.com>

drm/radeon: initially clear page tables

Clear page tables after allocating them in case
we don't completely fill them later.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 24c16439 30-Oct-2013 Christian König <christian.koenig@amd.com>

drm/radeon: drop CP page table updates & cleanup v2

The DMA ring seems to be stable now.

v2: remove pt_ring_index as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3744b248 05-Aug-2013 Christian König <christian.koenig@amd.com>

drm/radeon: remove unnecessary unpin

We don't pin the BO on allocation, so don't unpin it on free.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3e3e53f8 18-Jul-2013 Alex Deucher <alexander.deucher@amd.com>

drm/radeon/vm: only align the pt base to 32k

fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=67016

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org


# 1c01103c 12-Jul-2013 Alex Deucher <alexander.deucher@amd.com>

drm/radeon: align VM PTBs (Page Table Blocks) to 32K

Covers requirements of all current asics.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org


# 6c4f978b 12-Jul-2013 Alex Deucher <alexander.deucher@amd.com>

drm/radeon: allow selection of alignment in the sub-allocator

There are cases where we need more than 4k alignment. No
functional change with this commit.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org


# 3813f5ca 05-Jun-2013 Jerome Glisse <jglisse@redhat.com>

drm/radeon: do not try to uselessly update virtual memory pagetable

If a buffer is never bound to a virtual memory pagetable than don't try
to unbind it. Only drawback is that we don't update the pagetable when
unbinding the ib pool buffer which is fine because it only happens at
suspend or module unload/shutdown.

Fixes spurious messages about buffers without VM mappings. E.g.:
radeon 0000:01:00.0: bo ffff88020afac400 don't has a mapping in vm ffff88021ca2b900

Cc: stable@kernel.org
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 43f1214a 01-Feb-2013 Alex Deucher <alexander.deucher@amd.com>

drm/radeon: use IBs for VM page table updates v2

For very large page table updates, we can exceed the
size of the ring. To avoid this, use an IB to perform
the page table update.

v2(ck): cleanup the IB infrastructure and the use it instead
of filling the struct ourself.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>


# 6ed9ccb4 28-Nov-2012 Maarten Lankhorst <m.b.lankhorst@gmail.com>

drm/radeon: allow move_notify to be called without reservation

The few places that care should have those checks instead.
This allows destruction of bo backed memory without a reservation.
It's required for being able to rework the delayed destroy path,
as it is no longer guaranteed to hold a reservation before unlocking.

However any previous wait is still guaranteed to complete, and it's
one of the last things to be done before the buffer object is freed.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 0a46fb5f 12-Oct-2012 Maarten Lankhorst <maarten.lankhorst@canonical.com>

drm/radeon: Use ttm_bo_is_reserved

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 59240ee3 23-Oct-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: use vzalloc for gart pages

When allocating more than 2GB of GART the array of pages
gets to big for kzalloc, use vzalloc instead.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 08eda32b 22-Oct-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: fix header size estimation in VM code

Only NI uses 3dw headers, SI uses 4dw headers.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 204a393c 22-Oct-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: remove set_page check from VM code

It's better to handle this in the chipset specific code.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1518d7fb 16-Oct-2012 Alex Deucher <alexander.deucher@amd.com>

drm/radeon: fix sparse warning

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 13e55c38 09-Oct-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: separate pt alloc from lru add

Make it possible to allocate a persistent page table.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d72d43cf 09-Oct-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: don't add the IB pool to all VMs v2

We want to use VMs without the IB pool in the future.

v2: also remove it from radeon_vm_finish.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 90a51a32 09-Oct-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: allocate page tables on demand v4

Based on Dmitries work, but splitting the code into page
directory and page table handling makes it far more
readable and (hopefully) more reliable.

Allocations of page tables are made from the SA on demand,
that should still work fine since all page tables are of
the same size.

Also using the fact that allocations from the SA are mostly
continuously (except for end of buffer wraps and under very
high memory pressure) to group updates send to the chipset
specific code into larger chunks.

v3: mostly a rewrite of Dmitries previous patch.
v4: fix some typos and coding style

Signed-off-by: Dmitry Cherkasov <Dmitrii.Cherkasov@amd.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 760285e7 02-Oct-2012 David Howells <dhowells@redhat.com>

UAPI: (Scripted) Convert #include "..." to #include <path/...> in drivers/gpu/

Convert #include "..." to #include <path/...> in drivers/gpu/.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>


# fa87e62d 17-Sep-2012 Dmitry Cherkasov <dcherkassov@gmail.com>

drm/radeon: add 2-level VM pagetables support v9

PDE/PTE update code uses CP ring for memory writes.
All page table entries are preallocated for now in alloc_pt().

It is made as whole because it's hard to divide it to several patches
that compile and doesn't break anything being applied separately.

Tested on cayman card.

v2: rebased on top of "refactor set_page chipset interface v3",
code cleanups

v3: switched offsets calc macros to inline funcs where possible,
remove pd_addr from radeon_vm, switched RADEON_BLOCK_SIZE define,
to 9 (and PTE_COUNT to 1 << BLOCK_SIZE)

v4 (ck): move "incr" documentation to previous patch, cleanup and
document RADEON_VM_* constants, change commit message to
our usual format, simplify patch allot by removing
everything current not necessary, disable SI workaround.

v5: (agd5f): Fix typo in tables_size calculation in
radeon_vm_alloc_pt(). Second line should have been
'+=' rather than '='.

v6: fix npdes calculation. In scenario when pfns to be mapped overlap
two PDE spans:

+-----------+-------------+
| PDE span | PDE span |
+-----------+----+--------+
| |
+---------+
| pfns |
+---------+

the following npdes calculation gives incorrect result:

npdes = (nptes >> RADEON_VM_BLOCK_SIZE) + 1;

For the case above picture it should give npdes = 2, but gives one.

This patch corrects it by rounding last pfn up to 512 border,
first - down to 512 border and then subtracting and dividing by 512.

v7: Make npde calculation clearer, fix ndw calculation.

v8: (agd5f): reserve enough for 2 full VM PTs, add some
additional comments.

v9: fix typo in npde calculation

Signed-off-by: Dmitry Cherkasov <Dmitrii.Cherkasov@amd.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# dce34bfd 17-Sep-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: refactor set_page chipset interface v5

Cleanup the interface in preparation for hierarchical page tables.

v2: add incr parameter to set_page for simple scattered PTs uptates
added PDE-specific flags to r600_flags and radeon_drm.h
removed superfluous value masking with 0xffffffff

v3: removed superfluous bo_va->valid checking
changed R600_PTE_VALID to R600_ENTRY_VALID to handle PDE too

v4 (ck): fix indention style, rework and fix typos in commit message,
add documentation for incr parameter, also use incr
parameter for system pages

v5 (agd5f): use upper_32_bits() and minor white space fixes

Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dmitry Cherkassov <Dmitrii.Cherkasov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e971bd5e 11-Sep-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: rework the VM code a bit more (v2)

Roughly based on how nouveau is handling it. Instead of
adding the bo_va when the address is set add the bo_va
when the handle is opened, but set the address to zero
until userspace tells us where to place it.

This fixes another bunch of problems with glamor.

v2: agd5f: fix build after dropping patch 7/8.

Signed-off-by: Christian König <deathsimple@vodafone.de>


# 421ca7ab 11-Sep-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: move and rename radeon_bo_va function

It doesn't really belong into the object functions,
also rename it to avoid collisions with struct radeon_bo_va.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>


# ca19f21e 11-Sep-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: move IB pool to 1MB offset

Even GPUs can have a null pointer dereference, so move
the IB pool to another offset to catch those.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>


# 96a5844f 11-Sep-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: fix VA overlap check

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>


# a36e70b2 11-Sep-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: fix VA range check

The end offset is exclusive not inclusive.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>


# 2a6f1abb 11-Aug-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: make page table updates async v2

Currently doing the update with the CP.

v2: Rebased on Jeromes bugfix. Make validity comparison
more human readable.

Signed-off-by: Christian König <deathsimple@vodafone.de>


# 089a786e 11-Aug-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: Move looping over the PTEs into chip code

Makes it easier to move it into the rings.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>


# ddf03f5c 09-Aug-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: rework VM page table handling

Removing the need to wait for anything.

Still not ideal, since we need to free pt on va remove.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>


# ee60e29f 09-Aug-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: rework VMID handling

Move binding onto the ring, simplifying handling a bit.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>


# 9b40e5d8 07-Aug-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: make VM flushs a ring operation

Move flushing the VMs as function into the rings.
First step to make VM operations async.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>


# d66a7626 06-Aug-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: remove vm_unbind

It actually isn't very useful.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>


# 05b07147 06-Aug-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: move VM funcs into asic structure

So it looks more like the rest of the driver.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>


# f59abbf2 13-Aug-2012 Dmitrii Cherkasov <DCherkasov@luxsoft.com>

drm/radeon: fix typo in function header comment

Signed-off-by: Dmitrii Cherkasov <DCherkasov@luxsoft.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e43b5ec0 05-Aug-2012 Jerome Glisse <jglisse@redhat.com>

drm/radeon: fence virtual address and free it once idle v4

Virtual address need to be fenced to know when we can safely remove it.
This patch also properly clear the pagetable. Previously it was
serouisly broken.

Kernel 3.5/3.4 need a similar patch but adapted for difference in mutex locking.

v2: For to update pagetable when unbinding bo (don't bailout if
bo_va->valid is true).
v3: Add kernel 3.5/3.4 comment.
v4: Fix compilation warnings.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>


# 09db8644 17-Jul-2012 Alex Deucher <alexander.deucher@amd.com>

drm/radeon: document VM functions in radeon_gart.c (v3)

Document the VM functions in radeon_gart.c

v2: adjust per Christian's suggestions
v3: adjust to Christians's latest changes

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>


# 03eec93b 17-Jul-2012 Alex Deucher <alexander.deucher@amd.com>

drm/radeon: document non-VM functions in radeon_gart.c (v2)

Document the non-VM functions in radeon_gart.c

v2: adjust per Christian's suggestions

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>


# c6105f24 05-Jul-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: remove vm_manager start/suspend

Just restore the page table instead. Addressing three
problem with this change:

1. Calling vm_manager_suspend in the suspend path is
problematic cause it wants to wait for the VM use
to end, which in case of a lockup never happens.

2. In case of a locked up memory controller
unbinding the VM seems to make it even more
unstable, creating an unrecoverable lockup
in the end.

3. If we want to backup/restore the leftover ring
content we must not unbind VMs in between.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>


# 35e56bd0 25-Jun-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: add error handling to radeon_vm_unbind_locked

Waiting for a fence can fail for different reasons,
the most common is a deadlock.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>


# c21b328e 28-Jun-2012 Alex Deucher <alexander.deucher@amd.com>

drm/radeon: fix VM page table setup on SI

Cayman and trinity allow for variable sized VM page
tables, but SI requires that all page tables be the
same size. The current code assumes variablely sized
VM page tables so SI may end up with part of each page
table overlapping with other memory which could end
up being interpreted by the VM hw as garbage.

Change the code to better accomodate SI. Allocate enough
space for at least 2 full page tables and always set
last_pfn to max_pfn on SI so each VM is backed by a full
page table. This limits us to only 2 VMs active at any
given time on SI. This will be rectified and the code can
be reunified once we move to two level page tables.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 36ff39c4 09-May-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: replace cs_mutex with vm_mutex v3

Try to remove or replace the cs_mutex with a
vm_mutex where it is still needed.

v2: fix locking order
v3: rebased on drm-next

Signed-off-by: Christian König <deathsimple@vodafone.de>


# bb409155 03-Jun-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: fix vm deadlocks on cayman

Locking mutex in different orders just screams for
deadlocks, and some testing showed that it is actually
quite easy to trigger them.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 40f5cf99 10-May-2012 Alex Deucher <alexander.deucher@amd.com>

drm/radeon: add PRIME support (v2)

This adds prime->fd and fd->prime support to radeon.
It passes the sg object to ttm and then populates
the gart entries using it.

Compile tested only.

v2: stub kmap + use new helpers + add reimporting

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# c507f7ef 09-May-2012 Jerome Glisse <jglisse@redhat.com>

drm/radeon: rip out the ib pool

It isn't necessary any more and the suballocator seems to perform
even better.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 557017a0 09-May-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: define new SA interface v3

Define the interface without modifying the allocation
algorithm in any way.

v2: rebase on top of fence new uint64 patch
v3: add ring to debugfs output

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 2e0d9910 09-May-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: make sa bo a stand alone object

Allocating and freeing it seperately.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# dd8bea21 09-May-2012 Christian König <deathsimple@vodafone.de>

drm/radeon: use inline functions to calc sa_bo addr

Instead of hacking the calculation multiple times.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 04bd27ae 26-Feb-2012 Jesper Juhl <jj@chaosbits.net>

radeon: remove redundant ';' from radeon_vm_bo_update_pte()

return statement needs just one semi-colon

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>


# 108b0d34 29-Feb-2012 Sebastian Biemueller <sebastian.biemueller@amd.com>

drm/radeon/kms/vm: fix possible bug in radeon_vm_bo_rmv()

The bo is removed from the list at the top of
radeon_vm_bo_rmv(), but then the list is used
in radeon_vm_bo_update_pte() to look up the vm.
remove the bo_list entry at the end of the
function instead.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jerome Glisse <j.glisse@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# a7eef882 09-Jan-2012 Dan Carpenter <dan.carpenter@oracle.com>

drm/radeon: double lock typo in radeon_vm_bo_rmv()

The second lock should be an unlock or it causes a deadlock.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 55ba70c4 09-Jan-2012 Dan Carpenter <dan.carpenter@oracle.com>

drm/radeon: use after free in radeon_vm_bo_add()

"bo_va" is dereferenced in the error message.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 67e915e4 06-Jan-2012 Alex Deucher <alexander.deucher@amd.com>

drm/radeon/kms: check if vm is supported in VA ioctl

Add a VM manager enabled field and use it to check if
vm is enabled.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: jglisse@redhat.com
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 721604a1 05-Jan-2012 Jerome Glisse <jglisse@redhat.com>

drm/radeon: GPU virtual memory support v22

Virtual address space are per drm client (opener of /dev/drm).
Client are in charge of virtual address space, they need to
map bo into it by calling DRM_RADEON_GEM_VA ioctl.

First 16M of virtual address space is reserved by the kernel.

Once using 2 level page table we should be able to have a small
vram memory footprint for each pt (there would be one pt for all
gart, one for all vram and then one first level for each virtual
address space).

Plan include using the sub allocator for a common vm page table
area and using memcpy to copy vm page table in & out. Or use
a gart object and copy things in & out using dma.

v2: agd5f fixes:
- Add vram base offset for vram pages. The GPU physical address of a
vram page is FB_OFFSET + page offset. FB_OFFSET is 0 on discrete
cards and the physical bus address of the stolen memory on
integrated chips.
- VM_CONTEXT1_PROTECTION_FAULT_DEFAULT_ADDR covers all vmid's >= 1

v3: agd5f:
- integrate with the semaphore/multi-ring stuff

v4:
- rebase on top ttm dma & multi-ring stuff
- userspace is now in charge of the address space
- no more specific cs vm ioctl, instead cs ioctl has a new
chunk

v5:
- properly handle mem == NULL case from move_notify callback
- fix the vm cleanup path

v6:
- fix update of page table to only happen on valid mem placement

v7:
- add tlb flush for each vm context
- add flags to define mapping property (readable, writeable, snooped)
- make ring id implicit from ib->fence->ring, up to each asic callback
to then do ring specific scheduling if vm ib scheduling function

v8:
- add query for ib limit and kernel reserved virtual space
- rename vm->size to max_pfn (maximum number of page)
- update gem_va ioctl to also allow unmap operation
- bump kernel version to allow userspace to query for vm support

v9:
- rebuild page table only when bind and incrementaly depending
on bo referenced by cs and that have been moved
- allow virtual address space to grow
- use sa allocator for vram page table
- return invalid when querying vm limit on non cayman GPU
- dump vm fault register on lockup

v10: agd5f:
- Move the vm schedule_ib callback to a standalone function, remove
the callback and use the existing ib_execute callback for VM IBs.

v11:
- rebase on top of lastest Linus

v12: agd5f:
- remove spurious backslash
- set IB vm_id to 0 in radeon_ib_get()

v13: agd5f:
- fix handling of RADEON_CHUNK_ID_FLAGS

v14:
- fix va destruction
- fix suspend resume
- forbid bo to have several different va in same vm

v15:
- rebase

v16:
- cleanup left over of vm init/fini

v17: agd5f:
- cs checker

v18: agd5f:
- reworks the CS ioctl to better support multiple rings and
VM. Rather than adding a new chunk id for VM, just re-use the
IB chunk id and add a new flags for VM mode. Also define additional
dwords for the flags chunk id to define the what ring we want to use
(gfx, compute, uvd, etc.) and the priority.

v19:
- fix cs fini in weird case of no ib
- semi working flush fix for ni
- rebase on top of sa allocator changes

v20: agd5f:
- further CS ioctl cleanups from Christian's comments

v21: agd5f:
- integrate CS checker improvements

v22: agd5f:
- final cleanups for release, only allow VM CS on cayman

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# c52494f6 17-Oct-2011 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

drm/radeon/kms: enable the ttm dma pool if swiotlb is on V4

With the exception that we do not handle the AGP case. We only
deal with PCIe cards such as ATI ES1000 or HD3200 that have been
detected to only do DMA up to 32-bits.

V2 force dma32 if we fail to set bigger dma mask
V3 Rebase on top of no memory account changes (where/when is my
delorean when i need it ?)
V4 add debugfs entry is swiotlb is active not only if we are
on dma 32bits only gpu

CC: Dave Airlie <airlied@redhat.com>
CC: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>


# c9a1be96 03-Nov-2011 Jerome Glisse <jglisse@redhat.com>

drm/radeon/kms: consolidate GART code, fix segfault after GPU lockup V2

After GPU lockup VRAM gart table is unpinned and thus its pointer
becomes unvalid. This patch move the unpin code to a common helper
function and set pointer to NULL so that page update code can check
if it should update GPU page table or not. That way bo still bound
to GART can be unbound (pci_unmap_page for all there page) properly
while there is no need to update the GPU page table.

V2 move the test for null gart out of the loop, small optimization

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# fcf4de5a 31-Aug-2011 Tormod Volden <debian.tormod@gmail.com>

drm/radeon: Print gart initialization details on all chipsets

This was previously done for r300 only. Use %016llX instead of %08X for
printing the table address.

Also fix typos in gart warning messages.

Signed-off-by: Tormod Volden <debian.tormod@gmail.com>
Reviewed-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 03a80665 08-May-2011 Dave Airlie <airlied@redhat.com>

drm/radeon/nouveau: fix build regression on alpha due to Xen changes.

The Xen changes were using DMA_ERROR_CODE which isn't defined on a few
platforms, however we reverted the Xen patch that caused use to try and
use this code path earlier in 2.6.39 cycle, so for now lets just force
the code to never take this path and allow it to build again on alpha.

The proper long term answer is probably to store if the dma_addr has
been assigned to alongside the dma_addr in the higher level code,
though I think Thomas wanted to rewrite most of this anyways properly.

Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 92656d70 12-Apr-2011 Alex Deucher <alexdeucher@gmail.com>

drm/radeon/kms: clean up gart dummy page handling

As per Konrad's original patch, the dummy page used
by the gart code and allocated in radeon_gart_init()
was not freed properly in radeon_gart_fini().

At the same time r6xx and newer allocated and freed the
dummy page on their own. So to do Konrad's patch one
better, just remove the allocation and freeing of the
dummy page in the r6xx, 7xx, evergreen, and ni code and
allocate and free in the gart_init/fini() functions for
all asics.

Cc: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 441921d5 18-Feb-2011 Daniel Vetter <daniel.vetter@ffwll.ch>

drm/radeon: embed struct drm_gem_object

Unconditionally initialize the drm gem object - it's not
worth the trouble not to for the few kernel objects.

This patch only changes the place of the drm gem object,
access is still done via pointers.

v2: Uncoditionally align the size in radeon_bo_create. At
least the r600/evergreen blit code didn't to this, angering
the paranoid gem code.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# c39d3516 02-Dec-2010 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

radeon/ttm/PCIe: Use dma_addr if TTM has set it.

If the TTM layer has used the DMA API to setup pages that are
TTM_PAGE_FLAG_DMA32 (look at patch titled: "ttm: Utilize the dma_addr_t
array for pages that are to in DMA32 pool."), lets use it
when programming the GART in the PCIe type cards.

This patch skips doing the pci_map_page (and pci_unmap_page) if
there is a DMA addresses passed in for that page. If the dma_address
is zero (or DMA_ERROR_CODE), then we continue on with our old
behaviour.

[v2: Fixed an indentation problem, added reviewed-by tag]
[v3: Added Acked-by Jerome]

Acked-by: Jerome Glisse <j.glisse@gmail.com>
Reviewed-by: Thomas Hellstrom <thomas@shipmail.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Ian Campbell <ian.campbell@citrix.com>


# 268b2510 17-Nov-2010 Alex Deucher <alexdeucher@gmail.com>

drm/radeon/kms: fix alignment when allocating buffers

We were previously dropping alignment requests on the floor
when allocating buffers so we always ended up page aligned.
Certain tiling modes on 6xx+ require larger alignment which
wasn't happening before.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: Jerome Glisse <j.glisse@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 90aca4d2 09-Mar-2010 Jerome Glisse <jglisse@redhat.com>

drm/radeon/kms: simplify & improve GPU reset V2

This simplify and improve GPU reset for R1XX-R6XX hw, it's
not 100% reliable here are result:
- R1XX/R2XX works bunch of time in a row, sometimes it
seems it can work indifinitly
- R3XX/R3XX the most unreliable one, sometimes you will be
able to reset few times, sometimes not even once
- R5XX more reliable than previous hw, seems to work most
of the times but once in a while it fails for no obvious
reasons (same status than previous reset just no same
happy ending)
- R6XX/R7XX are lot more reliable with this patch, still
it seems that it can fail after a bunch (reset every
2sec for 3hour bring down the GPU & computer)

This have been tested on various hw, for some odd reasons
i wasn't able to lockup RS480/RS690 (while they use to
love locking up).

Note that on R1XX-R5XX the cursor will disapear after
lockup haven't checked why, switch to console and back
to X will restore cursor.

Next step is to record the bogus command that leaded to
the lockup.

V2 Fix r6xx resume path to avoid reinitializing blit
module, use the gpu_lockup boolean to avoid entering
inifinite waiting loop on fence while reiniting the GPU

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 82568565 04-Feb-2010 Dave Airlie <airlied@redhat.com>

drm/radeon/kms: set gart pages to invalid on unbind and point to dummy page

this uses a new entrypoint to invalidate gart entries instead of using 0.
Changed to rather than pointing to 0 address point empty entry to dummy
page. This might help to avoid hard lockup if for some wrong
reasons GPU try to access unmapped GART entry.

I'm not 100% sure this is going to work, we probably need to allocate
a dummy page and point all the GTT entries at it similiar to what AGP does.
but we can test this first I suppose.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 4c788679 20-Nov-2009 Jerome Glisse <jglisse@redhat.com>

drm/radeon/kms: Rework radeon object handling

The locking & protection of radeon object was somewhat messy.
This patch completely rework it to now use ttm reserve as a
protection for the radeon object structure member. It also
shrink down the various radeon object structure by removing
field which were redondant with the ttm information. Last it
converts few simple functions to inline which should with
performances.

airlied: rebase on top of r600 and other changes.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# a77f1718 13-Oct-2009 Matt Turner <mattst88@gmail.com>

drm/radeon/kms: use RADEON_GPU_PAGE_SIZE instead of 4096

Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# 4aac0473 14-Sep-2009 Jerome Glisse <jglisse@redhat.com>

drm/radeon/kms: clear confusion in GART init/deinit path

GART static one time initialization was mixed up with GART
enabling/disabling which could happen several time for instance
during suspend/resume cycles. This patch splits all GART
handling into 4 differents function. gart_init is for one
time initialization, gart_deinit is called upon module unload
to free resources allocated by gart_init, gart_enable enable
the GART and is intented to be call after first initialization
and at each resume cycle or reset cycle. Finaly gart_disable
stop the GART and is intended to be call at suspend time or
when unloading the module.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# ed10f95d 29-Jun-2009 Dave Airlie <airlied@redhat.com>

drm/radeon/kms: fix some GART table entry bugs.

1. rv370 can accept 40-bit addresses - also at 24-bit shift not 4 bits
2. rs480 table can be in 40-bit space. - 4 bit shift for top 8 bits
3. rs480 table entries can be in 40-bit space.

Signed-off-by: Dave Airlie <airlied@redhat.com>


# 771fe6b9 05-Jun-2009 Jerome Glisse <jglisse@redhat.com>

drm/radeon: introduce kernel modesetting for radeon hardware

Add kernel modesetting support to radeon driver, use the ttm memory
manager to manage memory and DRM/GEM to provide userspace API.
In order to avoid backward compatibility issue and to allow clean
design and code the radeon kernel modesetting use different code path
than old radeon/drm driver.

When kernel modesetting is enabled the IOCTL of radeon/drm
driver are considered as invalid and an error message is printed
in the log and they return failure.

KMS enabled userspace will use new API to talk with the radeon/drm
driver. The new API provide functions to create/destroy/share/mmap
buffer object which are then managed by the kernel memory manager
(here TTM). In order to submit command to the GPU the userspace
provide a buffer holding the command stream, along this buffer
userspace have to provide a list of buffer object used by the
command stream. The kernel radeon driver will then place buffer
in GPU accessible memory and will update command stream to reflect
the position of the different buffers.

The kernel will also perform security check on command stream
provided by the user, we want to catch and forbid any illegal use
of the GPU such as DMA into random system memory or into memory
not owned by the process supplying the command stream. This part
of the code is still incomplete and this why we propose that patch
as a staging driver addition, future security might forbid current
experimental userspace to run.

This code support the following hardware : R1XX,R2XX,R3XX,R4XX,R5XX
(radeon up to X1950). Works is underway to provide support for R6XX,
R7XX and newer hardware (radeon from HD2XXX to HD4XXX).

Authors:
Jerome Glisse <jglisse@redhat.com>
Dave Airlie <airlied@redhat.com>
Alex Deucher <alexdeucher@gmail.com>

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>