Review: ATI Radeon X1950 XTX 512MiB

by Steve Kerrison on 23 August 2006, 08:16

Tags: ATI Radeon X1950 XTX, ATi Technologies (NYSE:AMD)

Quick Link: HEXUS.net/qagmk

The technical bit

Let's get one important naming issue out of the way first. 'R580+', as a specific name for the ASIC itself, doesn't exist. Look at the GPU of an X1950 XTX and you'll see plain old R580 laser-etched onto it. What we have, however, is a new revision of the silicon - A31 - that brings a number of tweaks. The name 'R580+' has been used internally (and occasionally externally) by ATI as an easy way to distinguish this new product from its X1900 brethren.

Here's R580's spec, as taken from our Radeon X1900 XT and XTX review.

ATI R580 GPU Properties
Process and Fabricator: 90nm @ TSMC, 90GT w/ low-k
Die Size: 18.5mm x 18.5mm
Transistor Count: 384 million
DirectX Shader Model: Shader Model 3.0
Basic Configuration (VP/FP/ROP): 8/48/16
Vertex Shader Info: VS3.0 (no texld)
    5D FP32, co-issue MADD, branch
    single cycle trig functions
Fragment Processor Info: PS3.0
    4D FP32, dual-issue ADD+MADD, branch
    single cycle trig functions
ROP Info: 6x Int8/FP16/FX16/Int10 MSAA (2 subsamples/cycle)
    2x Z-only rate
    FP/FX16 blender
Texture processing: 16 FP32 address units, 4 samplers
    Bilinear filter for integer samples
Memory Interface: 256-bit, 512-bit ring bus
Display output: 2x dual-link DVI TMDS transmitters
    ATI Avivo


With the new spin comes work on the R580's ring-bus memory controller, aimed at improving its operation with GDDR4. The fourth version of GDDR RAM was always a design consideration in the memory controller, but now that it has made it to mass production, ATI has taken the opportunity to ensure the core works optimally with it.

What's GDDR4 all about, then? Put simply, it's a higher-bandwidth, lower-power technology than GDDR3. Clocked at 1GHz, the X1950 XTX's 512MiB of GDDR4 can provide up to 64GB/s of memory bandwidth. It sends (and receives) more data per cycle than GDDR3, allowing the memory core to be half-clocked. As such, the nominal core voltage for GDDR4 is lower than its predecessor's, at 1.5V (though this is bumped up to 1.9V in gaming and overclocking scenarios). The result is lower power consumption per clock compared to the older memory.
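
For the curious, the headline bandwidth figure falls out of simple arithmetic. The sketch below is ours rather than anything from ATI: it assumes two data transfers per memory clock (double data rate) and the 256-bit memory interface from the spec table, and the function name is purely illustrative. It gives 64 billion bytes per second, which is roughly 59.6 when counted in binary GiB.

```python
def peak_bandwidth_bytes_per_s(clock_hz, bus_width_bits, transfers_per_clock=2):
    """Theoretical peak memory bandwidth in bytes per second.

    transfers_per_clock=2 is an assumption (double data rate signalling).
    """
    bits_per_second = clock_hz * transfers_per_clock * bus_width_bits
    return bits_per_second // 8  # 8 bits per byte


# 1GHz memory clock on a 256-bit interface, as on the X1950 XTX
bandwidth = peak_bandwidth_bytes_per_s(1_000_000_000, 256)

print(f"{bandwidth / 10**9:.1f} GB/s (decimal)")
print(f"{bandwidth / 2**30:.1f} GiB/s (binary)")
```

This is a theoretical ceiling only; real-world throughput depends on access patterns and the latency issues discussed further down.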

Also reducing power consumption is a technique known as Data Bus Inversion (DBI). Holding a logic-0 in memory demands more power than holding a logic-1, so ideally you'd store more 1s than 0s. With the help of DBI this is possible much of the time. When any 8-bit data segment is received by memory, if it contains more than four 0s, the data is inverted and a DBI flag is set to '1'. So, an 8-bit field with five 0s and three 1s becomes a field with five 1s and three 0s, lowering power consumption. Upon fetching that data back, if the DBI flag is set then the data is inverted once again, transforming it back to its original form.
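
To make the DBI rule concrete, here's a minimal software sketch of it as described above. It is an illustration only: the real thing happens in hardware, and the function names `dbi_encode` and `dbi_decode` are our own invention rather than anything from the GDDR4 specification.

```python
def dbi_encode(byte):
    """Apply the DBI rule to one 8-bit value.

    Returns (stored_byte, dbi_flag). The byte is inverted, and the flag
    set to 1, only when it contains more than four 0 bits.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    zeros = 8 - bin(byte).count("1")
    if zeros > 4:
        return byte ^ 0xFF, 1  # flip all eight bits
    return byte, 0


def dbi_decode(stored_byte, dbi_flag):
    """Undo the inversion when the DBI flag is set."""
    return stored_byte ^ 0xFF if dbi_flag else stored_byte


# Five 0s and three 1s: gets inverted to five 1s and three 0s
stored, flag = dbi_encode(0b00010101)
print(f"{stored:08b} flag={flag}")         # 11101010 flag=1
print(f"{dbi_decode(stored, flag):08b}")   # 00010101
```

Note that a byte with exactly four 0s is left alone, since inverting it would gain nothing.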

GDDR4 chips use half as many address pins as GDDR3. As such, addressing the memory takes two clocks rather than one, doubling the addressing latency. The RAM also waits for a good clock strobe before it will put anything onto the data pins during a read operation, so a number of clock cycles can pass before a clean strobe arrives, further increasing latency. Of course, it would be fair to call it swings and roundabouts, because increased clock speeds will (although not always initially) negate the increased latency.

Currently Samsung is mass-producing GDDR4 parts, with Hynix ramping up its own production as we speak. For now, you'll find Samsung's BC09 GDDR4 chips on the X1950 XTX in the same 512MiB quantity as we're accustomed to on such cards now.

Core clock

The X1950 XTX's core clock remains the same as the X1900 XTX's, at 650MHz. However, the new silicon spin does seem to reduce power consumption slightly too, although we have no hard proof for you on that one just yet.

With the same clock speed then, it's down to the new memory and a tweaked memory controller to give the X1950 XTX the edge. We can expect scenarios where memory bandwidth is gobbled up to see a gain with the new product. This includes high resolutions and where anti-aliasing & anisotropic filtering are cranked up, or to put it another way, the kinds of scenarios you'd subject a card like this to.

Before we see if more girth really works, shall we take a look at the card?