NVIDIA GeForce 7900 GTX, 7900 GT and 7600 GT Preview

by Ryszard Sommefeldt on 9 March 2006, 14:05

Tags: Nvidia Geforce 7900 GTX, NVIDIA (NASDAQ:NVDA)


G71 is NVIDIA's new high-end graphics product. While your author was convinced the new high-end GPU would be wider than G70, at least in its pixel shading hardware, the reality is different (and somewhat humbling!). G71 has the same VP/FP/ROP configuration as G70, placing 8 vertex processors (VP), 24 fragment processors (FP) and 16 raster output units (ROP) on a 196mm² 90nm die.

G71 is significantly smaller than ATI's R520 and, as we've mentioned, it isn't simply a shrunken G70. At 278M transistors compared to its parent ASIC's 302M, NVIDIA lose the transistors via the shrink to 90nm (new libraries) and a slight repipelining of the vertex and fragment hardware, trimming the fat from both.

Around 24 million transistors might seem like a fair chunk to give up (it's nearly a full GeForce 2, for crying out loud!), especially while improving performance further in some areas, but given up they are, the rapidly switching silicon devices trotting off to Render Heaven and leaving NVIDIA with a smaller, cheaper chip to make. And that's without a shrink to 90!

A good friend of mine pointed out recently that, at 196mm², G71 is smaller than ATI's groundbreaking R300. Silicon trivia at best, but that's not bad for a chip that, at GeForce 7900 GTX clocks, offers over 10 times the fragment processing power of Radeon 9700 Pro.

So given an 8/24/16 setup, let's sum up a formal spec before looking at the theoretical rates of the two new G71-based SKUs launching along with the chip itself.

NVIDIA G71 GPU Properties

| Property | Value |
| --- | --- |
| Process and Fabricator | 90nm @ TSMC, 90GT w/ low-k |
| Die Size | 13.5 x 14.5mm (196mm²) |
| Transistor Count | 278 million |
| DirectX Shader Model | Shader Model 3.0 |
| Basic Configuration (VP/FP/ROP) | 8/24/16 |
| Vertex Shader Info | VS3.0; 5D FP32, co-issue MADD, branch, tex |
| Fragment Processor Info | PS3.0; 4D FP32, dual and co-issue MADD/MADD, branch, tex |
| ROP Info | 4x FX8/FX16 MSAA (2 subsamples/cycle); 2x Z-only rate, 2x colour-only rate; FP/FX blender (inc. FP16) |
| Texture processing | 24 FP32 address units, 24 samplers; Bi/Tri/Aniso (128-tap), FP16 filtering |
| Memory Interface | 256-bit, 4 memory channels |
| Display output | 2x dual-link DVI TMDS transmitters, NVIDIA PureVideo |
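
For anyone who likes to check the silicon arithmetic, here is a minimal sketch using only figures quoted in this article (die dimensions, and the G71 and G70 transistor counts). The variable names are purely illustrative, and the density figure is a simple average over the whole die rather than anything NVIDIA has published.

```python
# Figures quoted in the article; the variable names are illustrative only.
DIE_WIDTH_MM = 13.5
DIE_HEIGHT_MM = 14.5
G71_TRANSISTORS = 278e6
G70_TRANSISTORS = 302e6

# Die area from the quoted dimensions (the article rounds this to 196mm²).
die_area_mm2 = DIE_WIDTH_MM * DIE_HEIGHT_MM

# Transistors dropped in the move from G70 to G71.
transistors_saved = G70_TRANSISTORS - G71_TRANSISTORS

# Average transistor density across the whole die (a crude whole-die average).
density_per_mm2 = G71_TRANSISTORS / die_area_mm2

print(f"Die area: {die_area_mm2:.2f} mm^2")
print(f"Transistors saved vs G70: {transistors_saved / 1e6:.0f} million")
print(f"Average density: {density_per_mm2 / 1e6:.2f} million per mm^2")
```

The area works out to 195.75mm², which matches the rounded 196mm² figure, and the difference between the two chips comes to 24 million transistors.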

Versus G70, the functional changes are subtle. Better FP16 blending performance, double colour rate, SLI AA resolve via the inter-GPU link, two dual-link TMDS transmitters and the move to 90nm are the main differences.

Your author was expecting a wider chip in terms of fragment processing, and indeed NVIDIA did work on an 8-quad design that never taped out. Despite that, the clock rates afforded to G71 by TSMC's 90GT process mean that even with only 24 fragment units, its dual-MADD design makes it just as much a fragment processing beast as ATI's R580 in X1900 XTX configuration, at least in terms of that MADD instruction.
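
The MADD comparison can be sketched in a few lines. The G71 numbers (24 fragment processors, two MADDs each per clock, 650MHz in GTX form) come from this article. The R580 numbers are assumptions that do not appear on this page: 48 pixel shader ALUs, each issuing one MADD per clock, at the X1900 XTX's 650MHz core clock. The function name is hypothetical.

```python
def madd_rate_g(units, madds_per_unit_per_clock, clock_mhz):
    """Peak MADD issue rate in billions of instructions per second."""
    return units * madds_per_unit_per_clock * clock_mhz / 1000.0

# G71 at GeForce 7900 GTX clocks: figures taken from the article.
g71_gtx = madd_rate_g(units=24, madds_per_unit_per_clock=2, clock_mhz=650)

# R580 in X1900 XTX form: ASSUMED figures, not stated in the article.
r580_xtx = madd_rate_g(units=48, madds_per_unit_per_clock=1, clock_mhz=650)

print(f"G71 (7900 GTX):  {g71_gtx:.1f}G MADD/sec")
print(f"R580 (X1900 XTX): {r580_xtx:.1f}G MADD/sec")
```

Under those assumptions the two chips land on the same peak MADD rate, which is all the comparison claims. It says nothing about the other instructions each architecture can issue alongside.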

Theoretical Performance

Theoretical Rates for GeForce 7900 GTX and GT

| | NVIDIA GeForce 7900 GTX | NVIDIA GeForce 7900 GT |
| --- | --- | --- |
| Core Clock | 650MHz (700MHz VS) | 450MHz (470MHz VS) |
| Memory Clock | 800MHz | 660MHz |
| Pixel fillrate | 10.4G pixels/sec | 7.20G pixels/sec |
| Texture sampling rate | 62.4G samples/sec | 43.2G samples/sec |
| Z-only fillrate | 20.8G samples/sec | 14.4G samples/sec |
| Vertex transform rate | 1.40G tris/sec | 0.94G tris/sec |
| VP MADD issue rate | 5.60G instr/sec | 5.00G instr/sec |
| FP MADD issue rate | 31.2G instr/sec | 21.6G instr/sec |
| Memory bandwidth | 56.32 GB/sec | 42.24 GB/sec |
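
Most of the rates above follow from the 8/24/16 configuration and the clocks. The sketch below shows one way to derive them. The formulas are inferred from the table rather than taken from NVIDIA: the factor of 4 in the texture rate, the divide-by-4 in the vertex rate and the double-data-rate memory multiplier are all assumptions, as are the names `Sku` and `theoretical_rates`. Two published entries do not fall out of these formulas: the GT's VP MADD rate (the sketch gives 3.76G against the listed 5.00G) and the GTX's memory bandwidth (51.2 GB/sec against the listed 56.32 GB/sec).

```python
from dataclasses import dataclass

# G71 configuration from the spec table in this article.
VPS, FPS, ROPS = 8, 24, 16
SAMPLERS = 24
BUS_BITS = 256


@dataclass
class Sku:
    name: str
    core_mhz: int
    vs_mhz: int   # vertex shader clock
    mem_mhz: int


def theoretical_rates(sku):
    """Peak rates in G units/sec (GB/sec for bandwidth), using ASSUMED formulas."""
    return {
        "pixel_fill": ROPS * sku.core_mhz / 1000.0,
        # Assumed: 4 samples per sampler per clock, inferred from the table.
        "texture_sampling": SAMPLERS * 4 * sku.core_mhz / 1000.0,
        # The spec table lists a 2x Z-only rate.
        "z_only_fill": 2 * ROPS * sku.core_mhz / 1000.0,
        # Assumed: one triangle per vertex processor every 4 clocks.
        "vertex_transform": VPS * sku.vs_mhz / 4 / 1000.0,
        "vp_madd": VPS * sku.vs_mhz / 1000.0,
        # Dual MADD per fragment processor per clock.
        "fp_madd": FPS * 2 * sku.core_mhz / 1000.0,
        # Assumed: double-data-rate memory, so two transfers per clock.
        "bandwidth_gb": (BUS_BITS // 8) * sku.mem_mhz * 2 / 1000.0,
    }


gtx = Sku("GeForce 7900 GTX", core_mhz=650, vs_mhz=700, mem_mhz=800)
gt = Sku("GeForce 7900 GT", core_mhz=450, vs_mhz=470, mem_mhz=660)

for sku in (gtx, gt):
    print(sku.name)
    for rate_name, value in theoretical_rates(sku).items():
        print(f"  {rate_name}: {value:.2f}")
```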

NVIDIA don't disable any part of the chip to create the GT. In terms of configuration it's simply a clock-reduced GTX with half the memory. That indicates yields of the G71 GPU, shipping in A02 silicon form, are excellent.

Let's have a look at G73 before we look at the reference boards for each chip supplied by NVIDIA for evaluation recently.