NVIDIA GeForce RTX 50 “Blackwell” Flagship Reportedly Features GB202 GPU With Up To 192 SM Units & 512-bit Bus

Rumors about NVIDIA's next-gen GeForce RTX 50 "Blackwell" GPUs have started to roll out from reliable leakers such as Kopite7kimi.

NVIDIA GeForce RTX 50 "Blackwell" Flagship Reportedly Expands Upon RTX 40 Series With Increased SM Count, Wider Bus Interface, Increased Cache & More

Rumors about NVIDIA's GeForce RTX 50 "Blackwell" GPUs began a few months ago, once the last of the GeForce RTX 40 GPUs had made their way to the market. Labeled as "Ada-Next", these next-gen chips will be the basis of NVIDIA's brand-new gaming lineup, which targets a 2025 release according to the official roadmap, though rumors also suggest that the launch may happen earlier.

Starting with the details, Kopite7kimi posted on X about two configurations of Blackwell GPUs. The first is the HPC/AI-oriented chip known as GB100, which has recently been reported to use the TSMC 3nm process node and to target a late 2024 launch (with an announcement during GTC 2024).

The GB100 GPU is expected to be the first HPC chip from NVIDIA to utilize an MCM design and will be based on 8 GPCs, each of which includes 10 TPCs, with each TPC carrying 2 SMs, for a total of 160 SM units on the fully enabled die. The top die will also feature an 8192-bit wide bus interface which will support the latest HBM standards such as HBM3e.

Ampere and Hopper feature different FP32/FP64 core count arrangements, but if NVIDIA were to keep 128 FP32 cores per SM for Blackwell, it would end up with a possible 20,480 FP32 cores on a fully enabled die. The following is how the NVIDIA HPC parts compare against Blackwell GB100:

  • A100 (Ampere) - 8 GPCs / 64 TPCs / 128 SMs / 64 Cores Per SM / 8,192 Cores / 5120-bit
  • H100 (Hopper) - 8 GPCs / 72 TPCs / 144 SMs / 128 Cores Per SM / 18,432 Cores / 5120-bit
  • B100 (Blackwell) - 8 GPCs / 80 TPCs / 160 SMs / 128 Cores Per SM / 20,480 Cores / 8192-bit
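As a rough check on the list above, the totals follow from multiplying down the GPC, TPC, and SM hierarchy. The sketch below assumes 2 SMs per TPC for all three parts and uses the per-GPC TPC counts implied by the list (8, 9, and 10). The `GpuConfig` class and its field names are illustrative only, and the Blackwell row remains a rumor rather than a confirmed spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuConfig:
    """Illustrative container for a GPU's compute hierarchy (names are hypothetical)."""
    name: str
    gpcs: int
    tpcs_per_gpc: int
    sms_per_tpc: int
    fp32_per_sm: int

    @property
    def tpcs(self) -> int:
        return self.gpcs * self.tpcs_per_gpc

    @property
    def sms(self) -> int:
        return self.tpcs * self.sms_per_tpc

    @property
    def fp32_cores(self) -> int:
        return self.sms * self.fp32_per_sm


# Figures taken from the comparison list; 2 SMs per TPC is assumed throughout.
HPC_PARTS = [
    GpuConfig("A100 (Ampere)", gpcs=8, tpcs_per_gpc=8, sms_per_tpc=2, fp32_per_sm=64),
    GpuConfig("H100 (Hopper)", gpcs=8, tpcs_per_gpc=9, sms_per_tpc=2, fp32_per_sm=128),
    GpuConfig("B100 (Blackwell, rumored)", gpcs=8, tpcs_per_gpc=10, sms_per_tpc=2, fp32_per_sm=128),
]

for gpu in HPC_PARTS:
    print(f"{gpu.name}: {gpu.tpcs} TPCs / {gpu.sms} SMs / {gpu.fp32_cores:,} cores")
```

Running this reproduces the TPC, SM, and core totals in the list, including the 160 SMs and 20,480 FP32 cores for the rumored full Blackwell die.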

Moving back to the gaming part, the GB202 GPU is rumored to feature a vastly different GPU config than we have seen in previous gaming/HPC launches. The chip is expected to house 12 GPCs with 8 TPCs each, which would total 96 TPCs, or 192 SMs, on the full die. Once again, if NVIDIA uses the same 128 FP32 cores per SM, you get up to 24,576 cores, which would mark a 33% uplift in core count over the full AD102 GPU. Of course, we have yet to see a gaming GPU with the full AD102 die, so NVIDIA is likely to launch its next-gen GeForce RTX 50 gaming lineup with a cut-down GB202 die too, with a higher-end variant making its way to the market when GPU yields improve or if there's a need to tackle the competition.
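The 33% figure can be checked with a few lines of arithmetic. This is a minimal sketch, assuming 2 SMs per TPC and 128 FP32 cores per SM for both chips. The full AD102 layout (12 GPCs with 6 TPCs each) is not stated in this article and is supplied here for comparison. The `fp32_cores` helper is a made-up name for illustration.

```python
def fp32_cores(gpcs: int, tpcs_per_gpc: int, sms_per_tpc: int = 2, fp32_per_sm: int = 128) -> int:
    """Multiply down the GPC -> TPC -> SM -> core hierarchy."""
    return gpcs * tpcs_per_gpc * sms_per_tpc * fp32_per_sm


# Rumored full GB202: 12 GPCs x 8 TPCs.
gb202_cores = fp32_cores(gpcs=12, tpcs_per_gpc=8)

# Full AD102 for comparison (assumption not taken from the article): 12 GPCs x 6 TPCs.
ad102_cores = fp32_cores(gpcs=12, tpcs_per_gpc=6)

uplift_pct = (gb202_cores / ad102_cores - 1) * 100

print(f"GB202 (rumored): {gb202_cores:,} cores")
print(f"AD102 (full):    {ad102_cores:,} cores")
print(f"Uplift:          {uplift_pct:.1f}%")
```

Under these assumptions the result is 24,576 versus 18,432 cores, or about a one-third increase, which matches the rumored uplift.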

A mockup of NVIDIA's GB202 GPU block diagram based on existing rumors.

NVIDIA has moved away from adding just traditional cores to its GPUs and now includes various types of cores and accelerators for AI, Tensor, neural processing, and ray tracing operations, so by the time NVIDIA introduces Blackwell, the existing Ada Lovelace configuration may very well be an outdated design.

Kopite7kimi also reiterates that the NVIDIA GB202 "Blackwell" GPU for GeForce RTX 50 GPUs is going to get a much wider 512-bit bus interface, a 33% increase over the 384-bit wide bus interface featured on existing flagship chips.

There are also some rumors coming in from Chiphell Forums which suggest NVIDIA's GeForce RTX 50 "Blackwell" flagship features a 50% increase in core count, a 52% uplift in memory bandwidth, a 78% increase in cache size, and a 15% increase in core frequency, all resulting in a 70% uplift in overall GPU performance. It is still a bit too early to tell what final specs NVIDIA will settle on for its GeForce RTX 50 flagship graphics card. The company is known to work on several SKUs before deciding which one actually makes it to the market, and since we are a year away from launch, it would be unwise to call anything final this early. But based on these reports, a GeForce RTX 50 GPU would feature:

  • 24,576 CUDA Cores (GB202 GPU)
  • 32 Gbps Memory Speeds (GDDR7)
  • ~3000 MHz Peak GPU Clock Speeds
  • 128 MB L2 Cache (For GPU)

Samsung and SK Hynix are already reported to have started sampling next-gen GDDR7 DRAM modules to NVIDIA for its next-gen GPU lineup. The new modules are expected to feature up to 32 Gbps pin speeds, delivering up to 2 TB/s of bandwidth across a 512-bit bus interface. That would mark a huge increase in GDDR bandwidth and roughly a 2x increase over the current fastest GeForce RTX GPU, the RTX 4090.
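The bandwidth figure follows from pin speed multiplied by bus width, divided by 8 to convert bits to bytes. The sketch below works through that arithmetic. The RTX 4090 values (21 Gbps GDDR6X on a 384-bit bus) are not given in this article and are included only as the comparison point. The `bandwidth_gb_s` helper is a made-up name for illustration.

```python
def bandwidth_gb_s(pin_speed_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s: per-pin rate (Gbit/s) x bus width (bits) / 8."""
    return pin_speed_gbps * bus_width_bits / 8


# Rumored GB202 flagship: 32 Gbps GDDR7 on a 512-bit bus.
rumored = bandwidth_gb_s(pin_speed_gbps=32, bus_width_bits=512)

# RTX 4090 for comparison (values are not from the article): 21 Gbps on a 384-bit bus.
rtx_4090 = bandwidth_gb_s(pin_speed_gbps=21, bus_width_bits=384)

bus_increase_pct = (512 / 384 - 1) * 100

print(f"Rumored RTX 50 flagship: {rumored:.0f} GB/s")
print(f"RTX 4090:                {rtx_4090:.0f} GB/s")
print(f"Bandwidth ratio:         {rumored / rtx_4090:.2f}x")
print(f"Bus width increase:      {bus_increase_pct:.1f}%")
```

With these inputs the rumored configuration lands at 2,048 GB/s against 1,008 GB/s for the RTX 4090, so the "2x" claim holds as a round figure, and the wider bus alone accounts for about a third of the gain.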

All of this is interesting stuff for sure, but we have to remember that these are rumors, and we will have to wait and see how much ends up being true by the time the next-gen GeForce RTX series launches.

NVIDIA GeForce GPU SKUs:

| Generation | Pascal | Turing | Ampere | Ada Lovelace | Blackwell |
|---|---|---|---|---|---|
| Process Node | TSMC 16nm | TSMC 12nm | Samsung 8nm | TSMC 5nm | TBD |
| Launch Year | 2016 | 2018 | 2020 | 2022 | 2025 |
| Ultra-Enthusiast SKU | GP102 | TU102 | GA102 | AD102 | GB202 |
| Enthusiast SKU | GP104 | TU104 | GA102 | AD103 | GB203 |
| High-End SKU | GP104 | TU106 | GA104 | AD104 | GB205 |
| Mainstream SKU | GP106 | TU106 | GA106 | AD106 | GB206 |
| Entry-Level SKU | GP107 | TU116/117 | GA107 | AD107 | GB207 |
Written by Hassan Mujtaba
