R680 - what's it all about
The Radeon HD 3870 X2 - codenamed R680 - is literally two Radeon HD 3870s on a single PCB, CrossFired. For the architectural background, please head on back over to our look at that card's underpinnings which, in turn, are based on the Radeon HD 2900 XT's.
ATI has traditionally left it to its partners, most notably Sapphire, to custom-design dual-GPU-on-a-single-PCB cards. This time, ATI's R680 design is a reference model that will ship directly to AIBs.
The technical details do reveal that ATI has tweaked clock frequencies and a bunch of minor features, when compared to the single-GPU Radeon HD 3870.
The X2 has the same 55nm manufacturing process; PCI-Express 2.0 connectivity*; integrated UVD video-processing engine; DX10.1-compliant specification; and power-saving PowerPlay improvements that the HD 3870 offers over and above what's present on the Radeon HD 2900 XT.
The core and shader speeds have been increased from 775MHz to 825MHz, offering greater pixel-pushing power. Memory speed, though, has been reduced from an effective 2.25GHz to 1.8GHz (ignore the error in the AMD-supplied table), so ATI gives and takes in equal measure.
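As a rough check on what the lower memory clock costs, peak memory bandwidth can be estimated as the bus width in bytes multiplied by the effective memory clock. The sketch below assumes the 256-bit per-GPU interface and the 1,802MHz and 2,250MHz effective clocks listed in the specification table further down; the function name is ours, purely for illustration.

```python
def peak_bandwidth_gbs(bus_width_bits: int, effective_clock_mhz: float) -> float:
    """Peak memory bandwidth in GB/s, taking 1 GB as 10**9 bytes."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_clock_mhz * 1e6 / 1e9


# Figures taken from the specification table in this article.
hd3870 = peak_bandwidth_gbs(256, 2250)      # single HD 3870, GDDR4
x2_per_gpu = peak_bandwidth_gbs(256, 1802)  # each GPU on the X2, GDDR3
x2_card = 2 * x2_per_gpu                    # both GPUs added together

print(f"HD 3870:           {hd3870:.3f} GB/s")
print(f"HD 3870 X2 (GPU):  {x2_per_gpu:.3f} GB/s")
print(f"HD 3870 X2 (card): {x2_card:.3f} GB/s")
print(f"Per-GPU ratio:     {x2_per_gpu / hd3870:.2f}")
```

The card-level total looks healthy, but each GPU works from its own frame buffer, so the per-GPU figure, roughly 20 per cent down on a lone HD 3870, is arguably the more telling one.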
The Radeon HD 3870 X2's lower memory speed is likely to be a cost-cutting measure, as 1,800MHz-rated GDDR3 is intrinsically cheaper than 2,250MHz GDDR4. Each of the X2's GPUs has its own 512MiB frame buffer and both GPUs talk to one another via a PCIe 1.1 card-mounted bridge. That's less than ideal considering the PCIe 2.0 external interface to the motherboard.
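To put the bridge mismatch into numbers, here is a minimal sketch of per-direction PCIe bandwidth. PCIe 1.x signals at 2.5GT/s per lane and PCIe 2.0 at 5GT/s, both with 8b/10b encoding. The x16 link width on both sides is our assumption, as the lane count isn't stated above, and the function name is illustrative only.

```python
def pcie_bandwidth_gbs(rate_gtps: float, lanes: int) -> float:
    """Per-direction bandwidth in GB/s for a PCIe 1.x/2.0 link.

    8b/10b encoding means only 8 of every 10 bits on the wire are payload.
    """
    payload_bits_per_second = rate_gtps * 1e9 * 8 / 10
    return payload_bits_per_second / 8 * lanes / 1e9


LANES = 16  # assumed link width; not confirmed by the article

bridge = pcie_bandwidth_gbs(2.5, LANES)  # PCIe 1.1 on-card bridge
slot = pcie_bandwidth_gbs(5.0, LANES)    # PCIe 2.0 link to the motherboard

print(f"PCIe 1.1 x{LANES}: {bridge:.1f} GB/s per direction")
print(f"PCIe 2.0 x{LANES}: {slot:.1f} GB/s per direction")
```

Under that assumption, the GPU-to-GPU path offers half the per-direction bandwidth of the slot the card plugs into.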
Everything else can be thought of as double, so 640 stream processors compared to 320 and 32 texture units compared to the HD 3870's 16, etc: you get the picture.
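The doubling, plus the clock bump, can be turned into on-paper throughput with a couple of lines of arithmetic. This sketch counts a MADD as two floating-point operations per ALU per clock, which is the usual convention, and treats each texture unit as delivering one bilinear texel per clock; the function names are ours.

```python
def peak_gflops(alus: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Peak GFLOP/s: ALUs x FLOPs per ALU per clock x clock (MADD = 2 FLOPs)."""
    return alus * flops_per_clock * clock_mhz / 1000


def peak_gtexels(texture_units: int, clock_mhz: float) -> float:
    """Peak bilinear GTexel/s, assuming one texel per unit per clock."""
    return texture_units * clock_mhz / 1000


x2_gflops = peak_gflops(640, 825)
hd3870_gflops = peak_gflops(320, 775)
x2_gtexels = peak_gtexels(32, 825)
hd3870_gtexels = peak_gtexels(16, 775)

print(f"HD 3870 X2: {x2_gflops:.0f} GFLOP/s, {x2_gtexels:.1f} GTexel/s")
print(f"HD 3870:    {hd3870_gflops:.0f} GFLOP/s, {hd3870_gtexels:.1f} GTexel/s")
print(f"On-paper ratio: {x2_gflops / hd3870_gflops:.2f}x")
```

Thanks to the higher core clock, the on-paper ratio comes out slightly above 2x; whether real games get anywhere near that is another matter.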
The inherent problem that faces any single card that's based on multiple GPUs is one of scaling. The dual-GPU X2 can potentially double the HD 3870's output, rendering via AFR, when evaluated in the best-case scenario. More likely, the speedup will be between 1.5x and 1.7x that of a single card. In some cases, the software hooks may cause the dual-GPU X2 to perform slower than a single HD 3870 - it's all dependent upon just how good the Catalyst Control Center software is at harnessing the power of, well, two.
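To make those scaling factors concrete, the sketch below applies them to a single-card frame rate. The 40fps baseline is a made-up figure for illustration, not a benchmark result, and the function is a simple multiplier rather than a model of how AFR actually schedules frames.

```python
def scaled_fps(single_gpu_fps: float, speedup: float) -> float:
    """Estimated dual-GPU frame rate for a given speedup over one GPU."""
    return single_gpu_fps * speedup


BASELINE_FPS = 40.0  # hypothetical single HD 3870 result, for illustration only

# Speedup factors quoted in the text: likely range and best case.
scenarios = {
    "likely, lower bound": 1.5,
    "likely, upper bound": 1.7,
    "best case (perfect AFR)": 2.0,
}

estimates = {name: scaled_fps(BASELINE_FPS, s) for name, s in scenarios.items()}

for name, fps in estimates.items():
    print(f"{name:<24} {fps:.0f} fps")
```

With that hypothetical baseline, the likely range works out to 60-68fps against a theoretical 80fps, which is the gap the driver team has to close.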
Think of the Radeon HD 3870 X2 as a CrossFired HD 3870 setup, really.
Trotting out the old table that compares more than just ATI cards now:
Graphics cards | AMD Radeon HD 3870 X2 1024 | AMD Radeon HD 3870 512 | AMD Radeon HD 2900 XT 512 | NVIDIA GeForce 8800 GT 512 | NVIDIA GeForce 8800 GTS 512 | NVIDIA GeForce 8800 GTX 768
---|---|---|---|---|---|---
PCIe | PCIe 2.0 | PCIe 2.0 | PCIe 1.x | PCIe 2.0 | PCIe 2.0 | PCIe 1.x
GPU clock | 825MHz | 775MHz | 743MHz | 600MHz | 650MHz | 576MHz
Shader clock | 825MHz | 775MHz | 743MHz | 1500MHz | 1625MHz | 1350MHz
Memory clock (effective) | 1802MHz | 2250MHz | 1656MHz | 1800MHz | 1940MHz | 1800MHz
Memory interface, size, and implementation | 2x 256-bit, 512MiB, GDDR3 | 256-bit, 512MiB, GDDR4 | 512-bit, 512MiB, GDDR3 | 256-bit, 512MiB, GDDR3 | 256-bit, 512MiB, GDDR3 | 384-bit, 768MiB, GDDR3
Memory bandwidth | 115.328GB/sec (card) | 72.0GB/sec | 105.984GB/sec | 57.60GB/sec | 62.1GB/sec | 86.40GB/sec
Manufacturing process | TSMC, 55nm | TSMC, 55nm | TSMC, 80nm | TSMC, 65nm | TSMC, 65nm | TSMC, 90nm
Transistor count | 1300M | 666M | 700M+ | 754M | 754M | 681M
Die size | 2x 192mm² | 192mm² | 408mm² | 296mm² | 296mm² | 484mm²
DirectX Shader Model | 4.1 | 4.1 | 4.0 | 4.0 | 4.0 | 4.0
Vertex, fragment, geometry shading (shared) | 640 FP32 scalar ALUs, MADD dual-issue (unified) | 320 FP32 scalar ALUs, MADD dual-issue (unified) | 320 FP32 scalar ALUs, MADD dual-issue (unified) | 112 FP32 scalar ALUs, MADD+MUL dual-issue (unified) | 128 FP32 scalar ALUs, MADD+MUL dual-issue (unified) | 128 FP32 scalar ALUs, MADD+MUL dual-issue (unified)
Peak GFLOP/s | 1,056 | 496 | 475.52 | 504 | 624 | 518.4
Data sampling and filtering | 16ppc address and 32ppc bilinear INT8 filtering, max 16xAF | 16ppc address and 32ppc bilinear INT8 filtering, max 16xAF | 16ppc address and 32ppc bilinear INT8 filtering, max 16xAF | 56ppc address and 56ppc bilinear INT8 filtering, max 16xAF | 64ppc address and 64ppc bilinear INT8 filtering, max 16xAF | 32ppc address and 64ppc bilinear INT8 filtering, max 16xAF
Peak GTexel/s (bilinear) | 26.4 | 12.4 | 11.888 | 16.8 | 20.8 | 18.4
ROPs | 16, 8Z or 8C samples/clk, 2clk FP16 blend 8xMSAA, 24xCFAA | 16, 8Z or 8C samples/clk, 2clk FP16 blend 8xMSAA, 24xCFAA | 16, 8Z or 8C samples/clk, 2clk FP16 blend 8xMSAA, 24xCFAA | 16, 8Z or 8C samples/clk, 2clk FP16 blend 8xMSAA, 16xCSAA | 16, 8Z or 8C samples/clk, 2clk FP16 blend 8xMSAA, 16xCSAA | 24, 8Z or 8C samples/clk, 2clk FP16 blend 8xMSAA, 16xCSAA
Outputs | 2 x dual-link DVI (HDMI) w/HDCP, mini-DIN (VIVO) | 2 x dual-link DVI (HDMI) w/HDCP, mini-DIN (VIVO) | 2 x dual-link DVI (HDMI) w/HDCP, mini-DIN (VIVO) | 2 x dual-link DVI w/HDCP, mini-DIN | 2 x dual-link DVI w/HDCP, mini-DIN | 2 x dual-link DVI w/HDCP (discrete ASIC), mini-DIN
Hardware-assisted video-decoding engine | AMD UVD - full H.264 and VC-1 decode | AMD UVD - full H.264 and VC-1 decode | run via GPU's shaders | NVIDIA's VP2 - full H.264 decode and partial VC-1 decode | NVIDIA's VP2 - full H.264 decode and partial VC-1 decode | run via GPU's shaders
Reference cooler | dual-slot | dual-slot | dual-slot | single-slot | dual-slot | dual-slot
Retail price (default-clocked model) | £279 | £149 | £150 | £165 | £199 | £235
On paper, the Radeon HD 3870 X2 1024MiB should have the beating of all the single-GPU cards listed above. Overall performance, though, is a function of underlying design excellence coupled with efficient developer-relations support. NVIDIA's erstwhile champ, GeForce 8800 GTX 768MiB, may not have the on-paper clout to deal with ATI's latest and greatest, but what it lacks in shading and texturing horsepower may be compensated for by the inherent efficiency of a single-GPU architecture and excellent dev-rel support that gets the most out of the design. We'll put this conjecture to the test during our benchmarking analysis.
Partners' projected pricing for the X2 is around £279, including VAT, which puts it some way above an NVIDIA GeForce 8800 GTS 512 and GTX but significantly below an Ultra, which currently retails from £350 upwards.
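For a crude on-paper value comparison, the table's peak GFLOP/s figures can be divided by the listed retail prices. The sketch below does just that; it says nothing about delivered frame rates, and the dictionary simply restates numbers from the table above.

```python
# (peak GFLOP/s, retail price in GBP), both taken from the table above.
cards = {
    "Radeon HD 3870 X2": (1056.0, 279),
    "Radeon HD 3870": (496.0, 149),
    "Radeon HD 2900 XT": (475.52, 150),
    "GeForce 8800 GT": (504.0, 165),
    "GeForce 8800 GTS 512": (624.0, 199),
    "GeForce 8800 GTX": (518.4, 235),
}

# Peak GFLOP/s bought per pound spent.
value = {name: gflops / price for name, (gflops, price) in cards.items()}

# Best on-paper value first.
ranked = sorted(value, key=value.get, reverse=True)

for name in ranked:
    print(f"{name:<22} {value[name]:.2f} GFLOP/s per pound")
```

By this narrow measure the X2 tops the list and the 8800 GTX trails it, though peak shader throughput per pound is only as useful as the drivers and games that can exploit it.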
Heading on over for a visual look now.