
Durango GPU detailed



wraggster
February 4th, 2013, 23:58
Last week we published a poll, and you chose to learn more about the Durango GPU. Wishes come true. We have split the article into three pages; don’t forget to read the whole thing. [Note: There are two more pages with lots of details at the link.]

A better view of Durango’s GPU capabilities and performance.

Durango brings the enhanced capabilities of a modern Direct3D 11 GPU to the console space. The Durango GPU is a departure from previous console generations both in raw performance and in structure.



http://i.imgur.com/48Oi3j2.jpg

Quote:


The following table describes expected performance of the Durango GPU. Bear in mind that the table is based only on hardware specifications, not on actual hardware running actual code. For many reasons, theoretical peak performance can be difficult or impossible to achieve with real-world processing loads.




http://i.imgur.com/8BmZs7f.png

Quote:


Virtual Addressing

All GPU memory accesses on Durango use virtual addresses, and therefore pass through a translation table before being resolved to physical addresses. This layer of indirection solves the problem of resource memory fragmentation in hardware—a single resource can now occupy several noncontiguous pages of physical memory without penalty.

Virtual addresses can target pages in main RAM or ESRAM, or can be unmapped. Shader reads and writes to unmapped pages return well-defined results, including optional error codes, rather than crashing the GPU. This facility is important for support of tiled resources, which are only partially resident in physical memory.
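
The translation step described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not how the Durango hardware or driver works: the 4 KB page size, the `PageTable` class, the `shader_read` helper and the `(value, ok)` return convention are all made up for illustration. It shows one resource whose virtual pages land in scattered physical pages across main RAM and ESRAM, and a read of an unmapped page returning a defined default instead of faulting.

```python
from enum import Enum

PAGE_SIZE = 4096  # assumed page size, for illustration only


class Pool(Enum):
    MAIN_RAM = "main RAM"
    ESRAM = "ESRAM"


class PageTable:
    """Toy translation table: virtual page number -> (pool, physical page)."""

    def __init__(self):
        self.entries = {}

    def map_page(self, virtual_page, pool, physical_page):
        self.entries[virtual_page] = (pool, physical_page)

    def translate(self, virtual_address):
        """Return (pool, physical_address), or None if the page is unmapped."""
        virtual_page, offset = divmod(virtual_address, PAGE_SIZE)
        entry = self.entries.get(virtual_page)
        if entry is None:
            return None
        pool, physical_page = entry
        return pool, physical_page * PAGE_SIZE + offset


def shader_read(table, memory, virtual_address, default=0):
    """Hypothetical read: unmapped pages yield (default, False), not a fault."""
    target = table.translate(virtual_address)
    if target is None:
        return default, False
    return memory.get(target, default), True


table = PageTable()
# One resource spanning three virtual pages, backed by noncontiguous pages.
table.map_page(0, Pool.MAIN_RAM, 17)
table.map_page(1, Pool.ESRAM, 3)
# Virtual page 2 stays unmapped, like a partially resident tiled resource.

# Toy backing store keyed by (pool, physical_address).
memory = {(Pool.ESRAM, 3 * PAGE_SIZE + 8): 42}

print(shader_read(table, memory, PAGE_SIZE + 8))   # mapped page in ESRAM
print(shader_read(table, memory, 2 * PAGE_SIZE))   # unmapped page
```

The second value in each result plays the role of the optional error code mentioned in the quote: the caller can check it, or ignore it and use the default.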

ESRAM

Durango has no video memory (VRAM) in the traditional sense, but the GPU does contain 32 MB of fast embedded SRAM (ESRAM). ESRAM on Durango is free from many of the restrictions that affect EDRAM on Xbox 360. Durango supports the following scenarios:

Texturing from ESRAM
Rendering to surfaces in main RAM
Read back from render targets without performing a resolve (in certain cases)



The difference in throughput between ESRAM and main RAM is moderate: 102.4 GB/sec versus 68 GB/sec. The advantages of ESRAM are lower latency and lack of contention from other memory clients—for instance the CPU, I/O, and display output. Low latency is particularly important for sustaining peak performance of the color blocks (CBs) and depth blocks (DBs).
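
To put the two bandwidth figures in perspective, here is a rough back-of-the-envelope calculation. It is a sketch only: it assumes decimal gigabytes (10^9 bytes) and a hypothetical 1920x1080 render target at 4 bytes per pixel, neither of which comes from the quoted text, and it uses theoretical peak rates that real workloads may not reach.

```python
ESRAM_GB_PER_SEC = 102.4     # peak figure quoted above
MAIN_RAM_GB_PER_SEC = 68.0   # peak figure quoted above
BYTES_PER_GB = 1e9           # assumption: decimal gigabytes


def transfer_time_ms(num_bytes, gb_per_sec):
    """Ideal time in milliseconds to move num_bytes at the given peak rate."""
    return num_bytes / (gb_per_sec * BYTES_PER_GB) * 1000.0


# Hypothetical surface: 1920x1080 pixels, 4 bytes per pixel.
surface_bytes = 1920 * 1080 * 4

ratio = ESRAM_GB_PER_SEC / MAIN_RAM_GB_PER_SEC
esram_ms = transfer_time_ms(surface_bytes, ESRAM_GB_PER_SEC)
main_ms = transfer_time_ms(surface_bytes, MAIN_RAM_GB_PER_SEC)

print(f"ESRAM / main RAM throughput ratio: {ratio:.2f}")
print(f"One pass over the surface via ESRAM:    {esram_ms:.3f} ms")
print(f"One pass over the surface via main RAM: {main_ms:.3f} ms")
```

The ratio works out to roughly 1.5x, which is why the quote calls the throughput difference moderate and points to latency and contention as the bigger advantages.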

Local Shared Memory and Global Shared Memory

Each shader core of the Durango GPU contains a 64-KB buffer of local shared memory (LSM). The LSM supplies scratch space for compute shader threadgroups. The LSM is also used implicitly for various purposes. The shader compiler can choose to allocate temporary arrays there, spill data from registers, or cache data that arrives from external memory. The LSM facilitates passing data from one pipeline stage to another (interpolants, patch control points, tessellation factors, stream out, etc.). In some cases, this usage implies that successive pipeline stages are restricted to run on the same SC.
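
As a rough illustration of what a 64-KB scratch budget means for compute shaders, the sketch below counts how many threadgroups' shared allocations could fit in one LSM at once. It assumes the whole 64 KB is available to threadgroup scratch, which the quote itself says is not always true (the compiler and pipeline also use the LSM implicitly), and it ignores every other limit on how many groups a shader core can run. The function name is hypothetical.

```python
LSM_BYTES = 64 * 1024  # per shader core, figure quoted above


def threadgroups_fitting_lsm(shared_bytes_per_group, lsm_bytes=LSM_BYTES):
    """Upper bound on threadgroups whose scratch space fits in one LSM."""
    if shared_bytes_per_group <= 0:
        raise ValueError("shared_bytes_per_group must be positive")
    return lsm_bytes // shared_bytes_per_group


# Hypothetical per-group scratch sizes, in KB.
for kb in (4, 16, 48):
    groups = threadgroups_fitting_lsm(kb * 1024)
    print(f"{kb:2d} KB per threadgroup -> at most {groups} group(s) per LSM")
```

Read this as an upper bound only: a shader that asks for more scratch space per group leaves room for fewer groups side by side on the same shader core.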

The GPU also contains a single 64-KB buffer of global shared memory (GSM). The GSM contains temporary data referenced by an entire draw call. It is also used implicitly to enforce synchronization barriers, and to properly order accesses to Direct3D 11 append and consume buffers. The GSM is capable of acting as a destination for shader export, so the driver can choose to locate small render targets there for efficiency.

Cache

Durango has a two-stage caching system, depicted below.

http://www.neogaf.com/forum/showthread.php?t=511579