(GPU) AMD Radeon VII
(Profile updated as of 16/12/2021.)
I made a GPU profile thing for one of my favourite GPUs. Well, it was my favourite GPU until I took it out of my PC because I broke it. Yes, I actually broke it. But anyway, here is Vega 20 with its pretty HBM2 memory and buckets of raw memory bandwidth.
(Picture 1) Vega 20 graphics processor (centre die) surrounded by its quartet of 2nd generation High-Bandwidth Memory stacks, each linked to the GPU with a 1024-bit interface and carrying four 1GB memory dies layered on top of a logic die. The full package has 16GB of video memory close to the processor and a terabyte per second of raw memory bandwidth, the most on any video card as of the date this profile was updated.
(Picture 2) Architectural block diagram for Radeon VII's Vega 20 GPU. It has four disabled Compute Units, shown here.
(Picture 3) Infrared silicon die shot of the Vega 20 graphics processor. I have annotated the disabled parts of the silicon (4x Compute Units). There is no way to tell exactly which CUs are disabled, but it will likely be one from each Shader Engine. The above is for example purposes only; each Vega 20 will likely vary depending on defect binning.
Graphics Card Information
Graphics Card: AMD Radeon VII
Graphics Card Manufacturer: Advanced Micro Devices
Graphics Card Release Date: February 7, 2019
Graphics Card MSRP: $699 USD
Graphics Processor Codename: "Vega 20"
Graphics Processor Manufacturer: Advanced Micro Devices
Graphics Processor Implementation: Cut die
Graphics Interface: PCI-E 16x Gen3 *
Architecture: Graphics Core Next 5th Generation (GCN5)
Lithography Process: TSMC 7nm (N7) FinFET
Approximate die size: 331mm²
Sasha's GPU die Size Rating: mid-sized
Approximate Transistor Count: 13,200 Million
Approximate Transistor Density: 39.8 Million / Square Millimetre
Double-speed FP16 Shading: Yes (Rapid Packed Math)
Asynchronous Compute Capability: Full
DirectX Hardware Support: DX12.1 (FL 12_1)
Dedicated DXR Acceleration on chip: No
Variable-rate Shading: No
Adv. Geometry shading: No (Preliminary) *
Adv. Geometry shading (Programmable/DX12 Mesh Shaders): No
AI/ML Acceleration: No
Advanced Memory Management: Yes (HBCC)
Integer and Float Shader Co-execution: No
Tile-based Renderer: No (Preliminary) *
GPU Computing Resources
GPU Substructures: 4 Shader Engines
Graphics Cores: 60 Compute Units (64 Full Chip)
Graphics Cores per Substructure: 15
Total Stream Processors (ALU/Shaders): 3840 (4096 Full Chip)
Stream Processors per Graphics Core: 64
Graphics Core SIMD Structure: 4 x 16
Total Special Execution Units: 60 Scalar Units (64 Full Chip), 960 Load/Store Units (1024 Full Chip), 60 branch Units (64 Full Chip)
Special Execution Units per Graphics Core: 1 Scalar, 1 Branch, 16 Load/Store Units
Total Texturing Units: 240 (256 Full Chip)
Texturing Units per Graphics Core: 4
Pixel Pipelines (ROPs): 64 (16 x Render Backend with 4 Pixels per clock)
Level 2 shared on-chip cache: 4096 KB
Geometry/Tessellation Processors: 4
Raster Engines: 4
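The totals in this section all come from the same multiplication: the per-Compute-Unit figure times the number of Compute Units. Here is a minimal sketch of that arithmetic, using only the per-CU counts and CU counts listed in this profile. The names (PER_CU, totals) are made up for illustration and do not come from any AMD tool.

```python
# Per-Compute-Unit figures as listed in this profile.
ACTIVE_CUS = 60       # Radeon VII (cut die)
FULL_CHIP_CUS = 64    # full Vega 20 silicon

PER_CU = {
    "stream_processors": 64,   # 4 SIMDs x 16 lanes
    "scalar_units": 1,
    "branch_units": 1,
    "load_store_units": 16,
    "texture_units": 4,
}


def totals(compute_units: int) -> dict:
    """Multiply each per-CU count by the number of Compute Units."""
    return {name: count * compute_units for name, count in PER_CU.items()}


radeon_vii = totals(ACTIVE_CUS)
full_vega_20 = totals(FULL_CHIP_CUS)

for name in PER_CU:
    print(f"{name}: {radeon_vii[name]} ({full_vega_20[name]} full chip)")
```

The ROP count does not scale with Compute Units; it comes from the 16 Render Backends at 4 pixels per clock each (16 x 4 = 64).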
GPU Memory Subsystem
Graphics Memory Type: HBM2
Graphics Memory Standard Capacity: 16384 MB
Graphics Memory Composition: 4 x 4-high stacks (4x 1024 MB DRAM dies each stack)
Graphics Memory Access Granularity: 1024-bit (128 bytes)
Graphics Memory Standard Clock Speed / Data Rate: 1000 MHz / 2000 MHz
Graphics Memory Full Interface Width: 4096-bit (512 bytes per clock)
Graphics Memory Peak Memory Bandwidth: 1024 GB/s
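For anyone wanting to check the capacity and bandwidth figures, here is a rough sketch of the arithmetic. It assumes the 2000 MHz data rate means 2000 megabits per second on each of the 4096 interface pins, and that 1 GB/s is 1000 MB/s. The variable names are only illustrative.

```python
STACKS = 4
DIES_PER_STACK = 4
DIE_CAPACITY_MB = 1024
BITS_PER_STACK = 1024
DATA_RATE_MBPS_PER_PIN = 2000   # 1000 MHz memory clock, double data rate

# Capacity: every DRAM die in every stack contributes 1024 MB.
capacity_mb = STACKS * DIES_PER_STACK * DIE_CAPACITY_MB

# Interface width: each stack has its own 1024-bit link to the GPU.
bus_width_bits = STACKS * BITS_PER_STACK

# Bandwidth: bytes moved per transfer times transfers per second.
bytes_per_transfer = bus_width_bits // 8
peak_bandwidth_gb_s = bytes_per_transfer * DATA_RATE_MBPS_PER_PIN / 1000

print(f"Capacity: {capacity_mb} MB")
print(f"Interface width: {bus_width_bits}-bit ({bytes_per_transfer} bytes)")
print(f"Peak bandwidth: {peak_bandwidth_gb_s:.0f} GB/s")
```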
GPU Frequency and Peak performance
Graphics Engine Clock: 1800 MHz *
GPU Computing Power FP16: 27,648,000 Million operations per second (FMA)
GPU Computing Power FP32: 13,824,000 Million operations per second (FMA)
GPU Computing Power FP64: 3,456,000 Million operations per second (FMA) *
GPU Texturing Rate INT8: 432,000 Million Texels per second
GPU Texturing Rate FP16: 216,000 Million Texels per second
GPU Pixel Rate: 115,200 Million Pixels per second
GPU Primitive Rate: 7,200 Million triangles per second
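The compute, pixel and primitive figures above are theoretical peaks derived from unit counts and the 1800 MHz clock. The sketch below shows one way to reproduce them. The assumptions are mine: an FMA counts as two operations, FP16 runs at double the FP32 rate (Rapid Packed Math), FP64 runs at the Radeon VII's 1:4 rate, each ROP writes one pixel per clock, and each geometry processor handles one triangle per clock.

```python
CLOCK_MHZ = 1800            # AMD-spec maximum boost clock
STREAM_PROCESSORS = 3840
ROPS = 64
GEOMETRY_PROCESSORS = 4

# Results are in millions of operations (or pixels, or triangles) per second,
# because the clock is expressed in MHz.
fp32_mops = STREAM_PROCESSORS * 2 * CLOCK_MHZ   # FMA = multiply + add = 2 ops
fp16_mops = fp32_mops * 2                       # packed FP16: two values per lane
fp64_mops = fp32_mops // 4                      # throttled to 1:4 on Radeon VII

pixel_rate = ROPS * CLOCK_MHZ                   # assumes 1 pixel per ROP per clock
primitive_rate = GEOMETRY_PROCESSORS * CLOCK_MHZ  # assumes 1 triangle per clock each

print(f"FP16: {fp16_mops:,} Million operations per second")
print(f"FP32: {fp32_mops:,} Million operations per second")
print(f"FP64: {fp64_mops:,} Million operations per second")
print(f"Pixel rate: {pixel_rate:,} Million Pixels per second")
print(f"Primitive rate: {primitive_rate:,} Million triangles per second")
```

Real workloads rarely hit these numbers; they are upper bounds that are mostly useful for comparing cards on paper.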
GPU Thermal and Power
Standard Cooling Solution: Triple-fan open-air with Vapor Chamber Heatsink
Typical Board Power: 300 W
Maximum Board Power: 360 W
Maximum Allowed Junction Temperature (TJ Max): 105°C
Graphics Card description
Radeon VII uses AMD's Graphics Core Next "5th Generation" (GCN5) Architecture; you can read what I wrote about it in the RX Vega 64 profile here.
Radeon VII launched in February 2019 to the surprise of many people. This Graphics Card uses the "Vega 20" silicon, an originally compute-oriented die-shrink of the "Vega 10" silicon found in the RX Vega 64 and 56 cards. The primary difference is that "Vega 20" has twice the memory interface width, courtesy of two additional 4-high stacks of High Bandwidth Memory 2nd Generation (HBM2). The bus width is now 4096-bit and the memory data rate is 2 Gbps per pin, giving this card a staggering 1TB/s of raw memory bandwidth and 16GB of video memory. AMD originally stated that "Vega 20" was to address HPC and AI / Machine Learning only, so the release of the Radeon VII as a gaming product was unexpected. The other major change to the silicon (aside from the move to TSMC's 7nm manufacturing) is the ability to process Double-Precision floats (FP64) at 1:2 the speed of FP32, which is essential for Scientific Computing. This is throttled to 1:4 on the Radeon VII, however; only the more expensive Radeon Instinct cards with this chip have full-speed FP64.
This card is essentially a shot at the high-end that says "Radeon is still in the game, we can still compete here" for 2019. And indeed, Radeon VII's performance is very high, competing with the RTX 2080, NVIDIA's latest at the time and also priced at $699, as you can read below. It doesn't take the crown from the mighty RTX 2080 Ti ($999+), but it didn't really need to.
Sasha's note (27-08-2019): Since the launch of the RX 5700 XT, the Radeon VII has entered End of Life, as that card provides similar performance at much lower power consumption on account of its more advanced RDNA ("Navi 10") silicon. When it was introduced, however, the Radeon VII did hold the crown for the most Video Memory on a consumer graphics card, along with high FP64 performance.
Graphics Card approximate 3D Performance
Sasha's gaming performance rating (2019): Great for 4K high settings 60 Hz, or 1440p maximum settings and high refresh
Radeon VII is a high-end Graphics Card with performance on the same level as the GTX 1080 Ti and the RTX 2080. As with most GCN-based GPUs, this card's performance can vary significantly depending on the game or workload. It is not unusual to see the Radeon VII trailing the RTX 2080 (or even the 1080 Ti) in some games but beating the 2080 in others, even coming close to 2080 Ti performance. Overall, it slots in just under the RTX 2080 (less than 10% behind) and offers about the same, or slightly more, performance in video games than NVIDIA's popular GTX 1080 Ti.
This card is arguably better suited to 4K gaming with high-resolution textures than the RTX 2080 on account of its huge memory capacity, especially going forward. A popular argument against the Radeon VII is "it will be obsolete before it can use all 16GB of VRAM". This might be true, but it doesn't have to use all 16GB, just more than 8GB, to be at an advantage over the RTX 2080. The 16GB capacity follows from the architecture: HBM2 stacks are not currently built with less than 4x 1024 MB per stack, and having fewer stacks would have reduced the GPU's total memory bandwidth significantly.
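To put numbers on the stack-count trade-off, here is a small sketch that scales capacity and bandwidth with the number of HBM2 stacks. The 2-stack and 3-stack configurations are hypothetical. They assume the same 4-high stacks of 1 GB dies and the same 2 Gbps per-pin data rate as the Radeon VII, and the function name is made up for illustration.

```python
DIES_PER_STACK = 4
DIE_CAPACITY_GB = 1
BITS_PER_STACK = 1024
DATA_RATE_GBPS_PER_PIN = 2


def memory_config(stacks: int) -> tuple:
    """Return (capacity in GB, peak bandwidth in GB/s) for a given stack count."""
    capacity_gb = stacks * DIES_PER_STACK * DIE_CAPACITY_GB
    bandwidth_gb_s = stacks * BITS_PER_STACK * DATA_RATE_GBPS_PER_PIN / 8
    return capacity_gb, bandwidth_gb_s


for stacks in (2, 3, 4):
    capacity, bandwidth = memory_config(stacks)
    print(f"{stacks} stacks: {capacity} GB, {bandwidth:.0f} GB/s")
```

Under these assumptions, dropping to two stacks to reach 8 GB would also halve the bandwidth to 512 GB/s, which is why capacity and bandwidth cannot be chosen independently here.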
Sasha's note (27-08-2019): The Radeon VII performs similarly to the Radeon RX 5700 XT, based on the "Navi 10" silicon, but uses more power. On average, the Radeon VII is slightly ahead (<10%).
Adv. Geometry shading (Primitive/Mesh shaders):
Vega-based GPUs were advertised as being able to utilise a new "Fast Path" geometry system at the shader-level, but in my understanding, it is not fully functional / implemented / beneficial on Vega graphics processors.
GPU Computing Power FP64:
Vega 20 silicon supports 1:2 FP64, however it is throttled to 1:4 on Radeon VII.
Graphics Interface:
Vega 20 silicon supports PCI-E 16x Gen4, but as far as I'm aware the Radeon VII's PCB only supports Gen3.
Graphics Engine Clock:
Vega-based cards utilise a dynamic boosting algorithm, similar in functionality to NVIDIA's "GPU Boost" feature. The stated clock speed is AMD's specified maximum boost clock. Actual gaming frequency will vary a bit based on many factors such as temperature and power limits. In my observation, Radeon VII runs at about 1750 MHz in games.
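Since the peak figures in this profile are quoted at 1800 MHz, it can help to see how far a typical gaming clock moves them. This is a rough sketch assuming throughput scales linearly with clock speed, which ignores memory and power effects. The 1750 MHz value is the observation from the note above, not an AMD specification.

```python
SPEC_BOOST_MHZ = 1800
OBSERVED_MHZ = 1750         # rough in-game observation, not an AMD figure
STREAM_PROCESSORS = 3840


def fp32_mops(clock_mhz: int) -> int:
    """Peak FP32 rate in millions of operations per second (FMA = 2 ops)."""
    return STREAM_PROCESSORS * 2 * clock_mhz


scale = OBSERVED_MHZ / SPEC_BOOST_MHZ

print(f"FP32 at spec boost: {fp32_mops(SPEC_BOOST_MHZ):,} Million ops/s")
print(f"FP32 at observed clock: {fp32_mops(OBSERVED_MHZ):,} Million ops/s")
print(f"Observed clock is {scale * 100:.1f}% of the spec boost clock")
```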
Tile-based Renderer:
Vega-based GPUs feature what AMD calls a "Draw-Stream Binning Rasteriser". This is, in effect, a type of tile-based renderer; however, to my knowledge it is not fully enabled in all games on the desktop Vega-based cards. In Vega, this feature is primarily used to reduce memory bandwidth requirements and power consumption, rather than to improve performance significantly. As a result, it is more useful on the Vega-based integrated graphics processors found on the "Raven Ridge" silicon.
This bit is for my personal opinion on this Graphics card / Graphics processor
Sasha's Awesomeness Rating: Awesome