User:Greasemonkey/Intel GenX

GMAs tend to be advertised as at least two functions, so you may wish to check Function 1 as well. HDs tend to be advertised as just one.
 
Either way, there appears to be no difference between using BAR0 of Function 0 and BAR0 of Function 1, and only Function 0 gives a window into stolen memory, so you might be able to get away with just using Function 0.
 
With that said, make sure you check the IDs to ensure that they match hardware that you've actually tested. Different GPUs have different bugs and thus require different workarounds - even when they're within the same family.
Here's the information, assuming you are accessing everything through Function 0:
* uint64_t @ PCI 0x10: Location of MMIO registers. (XXX: for devices with a Function 1, this appears to do more than just MMIO registers - Function 1 should be able to provide a space with "just" MMIO registers.)
* uint64_t @ PCI 0x18: Location of stolen memory.
 
The 64-bit pointers have the lower 4 bits set to 0x4, so remember to mask those bits out before you use the address, and remember to put them back before writing the value back.
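
The masking described above can be sketched roughly as follows. The helper names (genx_bar_raw, genx_bar_addr, genx_bar_flags, genx_bar_pack) are hypothetical and exist only for illustration; reading and writing the two 32-bit halves in PCI config space is assumed to happen elsewhere.

```c
#include <stdint.h>

/* Hypothetical helpers; names are illustrative only. */
#define GENX_BAR_FLAG_MASK 0xFULL /* low 4 bits of a memory BAR are flags, not address */

/* Join the low and high 32-bit config-space reads into one raw 64-bit BAR value. */
static uint64_t genx_bar_raw(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Address part: the raw value with the flag bits cleared. */
static uint64_t genx_bar_addr(uint64_t raw)
{
    return raw & ~GENX_BAR_FLAG_MASK;
}

/* Flag part: keep this around so it can be restored on write-back. */
static uint64_t genx_bar_flags(uint64_t raw)
{
    return raw & GENX_BAR_FLAG_MASK;
}

/* Recombine an address with the saved flag bits before writing the BAR back. */
static uint64_t genx_bar_pack(uint64_t addr, uint64_t flags)
{
    return (addr & ~GENX_BAR_FLAG_MASK) | (flags & GENX_BAR_FLAG_MASK);
}
```

For example, a raw value of 0xF0000004 would give an address of 0xF0000000 and flags of 0x4.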
</pre>
 
With the above config, your framebuffer should be located at the start of stolen memory, and be 2048 32bpp BGRX pixels wide internally.
 
===Setting monitor timings===
It's possible to start the ring buffer and then advance the tail when you have a new command or batch of commands.
 
RING_BUFFER_START is an address relative to the start of the stolen memory space.
 
RING_BUFFER_HEAD and RING_BUFFER_TAIL need to be given byte offsets, so if you add, say, two DWords, you'd add 8 to RING_BUFFER_TAIL.
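
As a minimal sketch of that byte-offset arithmetic, the hypothetical helper below computes the next tail value after queueing a number of DWords. The name genx_rb_next_tail and the wrap back to offset 0 at the end of the ring are assumptions of this sketch, not something taken from the hardware documentation.

```c
#include <stdint.h>

/* Hypothetical helper: new RING_BUFFER_TAIL byte offset after queueing
 * `dwords` 32-bit command words. `ring_bytes` is the ring size in bytes.
 * Wrapping to 0 at the end of the ring is an assumption of this sketch. */
static uint32_t genx_rb_next_tail(uint32_t tail_bytes, uint32_t dwords,
                                  uint32_t ring_bytes)
{
    return (tail_bytes + dwords * 4u) % ring_bytes;
}
```

With a 4096-byte ring, adding two DWords at tail 0 gives 8, and adding two DWords at tail 4088 wraps back to 0.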
);
genx_rb_push(((screen_height)<<16)|((screen_width)<<2)); // height in scanlines, width in bytes
genx_rb_push(screen_stolen_memory_offset);
genx_rb_push(0x00330066); // XXRRGGBB - HTML colour #330066
</pre>
genx_rb_push((0<<16) | (0)); // Y1:X1 top-left
genx_rb_push(((screen_height)<<16)|((screen_width))); // Y2:X2 bottom-right
genx_rb_push(screen_stolen_memory_offset);
genx_rb_push(0x00330066); // XXRRGGBB - HTML colour #330066
</pre>
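
The two fill examples pack their dimensions differently: one takes a height in scanlines and a width in bytes, the other takes Y:X pixel coordinates. A rough sketch of both packings, using hypothetical helper names and assuming 32bpp (4 bytes per pixel), might look like this:

```c
#include <stdint.h>

/* Hypothetical helpers; names are illustrative only. */

/* Height in scanlines in the upper 16 bits, width in BYTES in the lower 16.
 * Assumes 32bpp, so the pixel width is multiplied by 4 (shifted left by 2). */
static uint32_t genx_pack_height_widthbytes(uint32_t height, uint32_t width_px)
{
    return (height << 16) | ((width_px << 2) & 0xFFFFu);
}

/* Y in the upper 16 bits, X in the lower 16, both in pixels. */
static uint32_t genx_pack_yx(uint32_t y, uint32_t x)
{
    return (y << 16) | (x & 0xFFFFu);
}
```

Mixing the two up is an easy mistake to make, since both words have the same layout and differ only in the unit of the lower half.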
Note: if you want clipping, you'll need to run an XY_SETUP_CLIP_BLT command first, and then set the "clipping enable" flag.
 
==Making the GTT behave==
Tested on:
* GM45 1366x768 CQ60-210TU
 
===References for GTT page format===
* G45: Vol1a, pg214
 
===Finding GTTADR===
'''Gen4 only (NOT Gen4.5!)''': 32-bit address "GTTADR" at PCI B0:D2:F0:0x1C. ''(TODO: confirm)''
 
'''Gen4.5 and above''': 64-bit address "GTTMMADR" at PCI B0:D2:F0:0x18, then add 2MB (0x200000). ''(TODO: confirm the "above" bit)''
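
The two cases can be sketched as one hypothetical helper. It takes the raw BAR values as parameters rather than reading PCI config space itself, so the config offsets (which are still marked as unconfirmed above) stay out of the code. The name genx_gtt_base is illustrative, and clearing the low 4 flag bits of the 32-bit GTTADR is an assumption carried over from how the 64-bit BARs behave.

```c
#include <stdbool.h>
#include <stdint.h>

#define GENX_GTT_OFFSET_IN_GTTMMADR 0x200000ULL /* 2MB, as described in the text */

/* Hypothetical helper: physical address of the GTT.
 *  - Gen4 (not Gen4.5): the 32-bit GTTADR value is used directly.
 *  - Gen4.5 and above:  GTTMMADR plus 2MB.
 * Both inputs are raw BAR values; the low 4 flag bits are cleared here. */
static uint64_t genx_gtt_base(bool gen45_or_later, uint32_t gttadr_raw,
                              uint64_t gttmmadr_raw)
{
    if (gen45_or_later)
        return (gttmmadr_raw & ~0xFULL) + GENX_GTT_OFFSET_IN_GTTMMADR;
    return (uint64_t)(gttadr_raw & ~0xFu);
}
```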
 
===Allocating space for the GTT===
====Early Gen4====
Allocate a block of memory in the stolen memory space. 512KB is the largest you can use for the GTT, and allows for a 512MB virtual addressing space (131072 four-byte entries, each mapping one 4KB page). Ensure that the block of memory is aligned to its size.
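
The size arithmetic and the alignment requirement can be checked with a small sketch like the one below. The helper names are hypothetical, and the 4-byte entry size and 4KB page size are taken from the identity-paging example further down.

```c
#include <stdint.h>

#define GENX_PAGE_BYTES 4096u /* one GTT entry maps one 4KB page */
#define GENX_PTE_BYTES  4u    /* each GTT entry is one 32-bit word */

/* Hypothetical helper: virtual address space covered by a GTT of `gtt_bytes`. */
static uint64_t genx_gtt_span_bytes(uint32_t gtt_bytes)
{
    return (uint64_t)(gtt_bytes / GENX_PTE_BYTES) * GENX_PAGE_BYTES;
}

/* Hypothetical helper: nonzero if `offset` is aligned to `size`.
 * `size` must be a power of two. */
static int genx_is_size_aligned(uint64_t offset, uint64_t size)
{
    return (offset & (size - 1)) == 0;
}
```

A 512KB table gives 512KB / 4 = 131072 entries, and 131072 * 4KB = 512MB.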
 
Once you have it in place, ensure that the graphics pipeline is flushed (if you don't know what this is, it probably already is flushed), then:
 
<pre>
genx_reg32[GFX_FLSH_CNTL] = 0;
genx_reg32[PGTBL_CTL] = 1 | gtt_offset; // GTT: 512KB, enabled
genx_reg32[PGTBL_CTL2] = 0; // disables the PPGTT
genx_reg32[GFX_FLSH_CNTL] = 0;
</pre>
 
We will get to modifying it pretty soon.
 
* Paging type 0 is for stolen memory.
* Paging type 3 is for main CPU memory. The GPU will snoop the cache for you.
 
Note that the actual screen is rendered using physical, unmapped "stolen" memory addresses.
 
Also note that direct access to the stolen memory via GMADR also uses the physical unmapped addresses.
 
====Gen4 "Bearlake-C" (G35?) and onwards====
 
Don't allocate it. The chip allocates it for you. Just leave bits 31:12 of PGTBL_CTL intact when you modify it. Once you have identity paging in place, set the lowest bit (bit 0) to enable the GTT.
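
A minimal sketch of that read-modify-write, assuming bit 0 is the enable bit as in the early Gen4 example. The helper name genx_pgtbl_ctl_enable is hypothetical; the caller is expected to read PGTBL_CTL, pass the value through this function, and write the result back.

```c
#include <stdint.h>

#define GENX_PGTBL_ADDR_MASK 0xFFFFF000u /* bits 31:12, set up by the chip */
#define GENX_PGTBL_ENABLE    0x00000001u /* bit 0 */

/* Hypothetical helper: take the current PGTBL_CTL value and return it with
 * the enable bit set. Nothing else is changed, so bits 31:12 stay intact. */
static uint32_t genx_pgtbl_ctl_enable(uint32_t current)
{
    return current | GENX_PGTBL_ENABLE;
}
```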
 
Of course, if you are paranoid, you can always allocate some memory anyway.
 
Paging types above are as per pre-Gen6. Gen6 has different paging types, apparently.
 
===Identity paging===
 
In this example, genx_gtt32 points to the GTT at the GTTADR address found earlier.
 
The GPU will handle all the caching issues for you if you use GTTADR. To add to this, in Gen6 this is the only way to access the GTT, so instead of learning the older method of writing via system RAM and then flushing the GPU's cache, you should just use this instead.
 
<pre>
for(i = 0; i < 512*256; i++)
	genx_gtt32[i] = (((i)<<12) | (3<<1) | (1<<0));
</pre>
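
The entry format used in that loop can be pulled out into a helper, sketched below for pre-Gen6 parts only. The name genx_gtt_pte and the constant names are hypothetical, and treating the paging type as a 2-bit field starting at bit 1 is an assumption based on the (3<<1) in the loop.

```c
#include <stdint.h>

#define GENX_PTE_VALID       0x1u /* bit 0 */
#define GENX_PTE_TYPE_STOLEN 0u   /* paging type 0: stolen memory */
#define GENX_PTE_TYPE_SYSTEM 3u   /* paging type 3: main CPU memory, snooped */

/* Hypothetical helper: build one 32-bit GTT entry for a 4KB page.
 * `phys_addr` is the page's physical address; its low 12 bits are dropped. */
static uint32_t genx_gtt_pte(uint32_t phys_addr, uint32_t type)
{
    return (phys_addr & 0xFFFFF000u) | ((type & 0x3u) << 1) | GENX_PTE_VALID;
}
```

The identity-mapping loop is then equivalent to writing genx_gtt_pte(i << 12, GENX_PTE_TYPE_SYSTEM) into entry i.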
 
===Memory-to-GPU blit (and vice versa)===
 
This is for a 32bpp blit.
 
* blk_(width|height) denotes the size of the blit to perform.
* (src|dest)_gtt are addresses that the GPU will feed through the Global GTT or PPGTT.
* (src|dest)_pitch are the image pitches in DWords.
 
You must ensure that the GTT has the correct paging type for the given GPU for each page that this will need to use. For Gen4, use 0 for stolen memory, and 3 for system memory.
 
<pre>
// SRC_COPY_BLT
genx_rb_push((2<<29) | (0x43<<22)
| (3<<20) // a:rgb mask
| 0x04
);
genx_rb_push(0
| (0<<30) // reverse X direction
| (3<<24) // bit depth
| (0xCC<<16) // raster op
| (dest_pitch*4) // dest pitch in bytes
);
genx_rb_push((blk_height<<16) | (blk_width*4)); // dest dims
genx_rb_push(dest_gtt);
genx_rb_push(0
| (src_pitch*4) // src pitch in bytes
);
genx_rb_push(src_gtt); // src addr
genx_rb_punch();
</pre>
 
XY_SRC_COPY_BLT and whatnot should also work just fine, including with clipping and the like.
 
==See Also==
* [https://01.org/linuxgraphics/documentation/driver-documentation-prms Official documentation from Intel]
* [http://www.x.org/docs/intel/ Official documentation on X.org] - covers up to 2012, but also has Vol_1b_G45_core.pdf which is missing from the official Intel PRM list
* Chipset datasheets:
** [http://www.intel.com/Assets/PDF/datasheet/320122.pdf Mobile Intel 4 Series Datasheet] - covers the mobile version of Gen4.5.
* [http://forums.entechtaiwan.com/index.php?topic=2578.0 1366x768 LCD timings] - thread with several different sets of timings
* [http://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units] on Wikipedia - useful for finding out PCI device IDs and exactly what generation you're using.