Item: BFG's NVIDIA GeForce GTX 295 - the champ is back!
Author: Parm Mann

Table time

So what exactly is the GeForce GTX 295? It's essentially two GeForce GTX 200-series GPUs sitting side by side in a single-card solution, connected via internal SLI. Oddly enough, though, the GPUs aren't replicas of either the GeForce GTX 260 or the GeForce GTX 280. Instead, we have a hybrid GPU built on a 55nm process with a few useful upgrades.

Let's take a look at the table to see how both teams' range-topping GTX 200 and HD 4000-series cards compare.

Graphics cards	NVIDIA GeForce GTX 295	NVIDIA GeForce GTX 280 1,024MB	NVIDIA GeForce GTX 260 896MB (new)	AMD Radeon HD 4870 X2 2,048MB	AMD Radeon HD 4870 512MB	AMD Radeon HD 4850 X2 2,048MB	AMD Radeon HD 4850 512MB
PCIe	PCIe 2.0
GPU(s) clock	576MHz	602MHz	576MHz	750MHz	750MHz	625MHz	625MHz
Shader clock	1,242MHz	1,296MHz	1,242MHz	750MHz	750MHz	625MHz	625MHz
Memory clock (effective)	1,998MHz	2,214MHz	1,998MHz	3,600MHz	3,600MHz	1,986MHz	1,986MHz
Memory interface and size	448-bit (per GPU), 1,792MB, GDDR3	512-bit, 1,024MB, GDDR3	448-bit, 896MB, GDDR3	512-bit (2x 256-bit), 2,048MB, GDDR5	256-bit, 512MB, GDDR5	512-bit (2x 256-bit), 2,048MB, GDDR3	256-bit, 512MB, GDDR3
Memory bandwidth	223.8GB/sec	141.7GB/sec	111.9GB/sec	230GB/sec	115GB/sec	127GB/sec	63.5GB/sec
Manufacturing process	TSMC, 55nm	TSMC, 65nm	TSMC, 65nm	TSMC, 55nm	TSMC, 55nm	TSMC, 55nm	TSMC, 55nm
Transistor count	TBC	1,408M	1,408M	1,930M	965M	1,930M	965M
Die size	TBC	576mm²	576mm²	520mm² (2 x 260mm²)	260mm²	520mm² (2 x 260mm²)	260mm²
Double-precision support	Yes	Yes	Yes	Yes	Yes	Yes	Yes
DirectX/ Shader Model	DX10, 4.0	DX10, 4.0	DX10, 4.0	DX10.1, 4.1	DX10.1, 4.1	DX10.1, 4.1	DX10.1, 4.1
Vertex, fragment, geometry shading (shared)	480 FP32 scalar ALUs, MADD dual-issue + MUL (unified)	240 FP32 scalar ALUs, MADD dual-issue + MUL (unified)	216 FP32 scalar ALUs, MADD dual-issue + MUL (unified)	1,600 FP32 scalar ALUs, MADD dual-issue (unified)	800 FP32 scalar ALUs, MADD dual-issue (unified)	1,600 FP32 scalar ALUs, MADD dual-issue (unified)	800 FP32 scalar ALUs, MADD dual-issue (unified)
Peak GFLOPS	1,788	933	805	2,400	1,200	2,000	1,000
Data sampling and filtering	160ppc address and 160ppc bilinear INT8/80ppc FP16 filtering, max 16xAF	80ppc address and 80ppc bilinear INT8/40ppc FP16 filtering, max 16xAF	72ppc address and 72ppc bilinear INT8/36ppc FP16 filtering, max 16xAF	80ppc address and 80ppc bilinear INT8/40ppc FP16 filtering, max 16xAF	40ppc address and 40ppc bilinear INT8/20ppc FP16 filtering, max 16xAF	80ppc address and 80ppc bilinear INT8/40ppc FP16 filtering, max 16xAF	40ppc address and 40ppc bilinear INT8/ 20ppc FP16 filtering, max 16xAF
Peak fillrate Gpixels/s	32.256	19.264	16.128	24	12	20	10
Peak Gtexel/s (bilinear)	92.2	48.16	41.472	60	30	50	25
Peak Gtexel/s (FP16, bilinear)	46.1	24.09	20.736	30	15	25	12.5
ROPs	56	32	28	32	16	32	16
Peak TDP (claimed, W)	289	236	182	289	160	-	110
Power connectors (default clocked)	8-pin + 6-pin	8-pin + 6-pin	6-pin + 6-pin	8-pin + 6-pin	6-pin + 6-pin	8-pin + 6-pin	6-pin
Multi-GPU	SLI - quad	SLI - three-board	SLI - three-board	CrossFire - two-board	CrossFire - four-board	CrossFire - two-board	CrossFire - four-board
Outputs	2 x dual-link DVI w/HDCP, 1 x HDMI	2 x dual-link DVI w/HDCP, native HDMI 5.1 (via S/PDIF)		2 x dual-link DVI w/HDCP, HDMI 7.1 (native, on GPU)		4 x dual-link DVI w/HDCP, HDMI 7.1 (native, on GPU)	2 x dual-link DVI w/HDCP, HDMI 7.1 (native, on GPU)
Hardware-assisted video-decoding engine	NVIDIA's PureVideo HD - full H.264 decode and partial VC-1 decode, plus dual-stream decode			AMD UVD 2 - full H.264 and VC-1 decode, plus dual-stream decode
Reference cooler	dual-slot	dual-slot	dual-slot	dual-slot	dual-slot	dual-slot	single-slot

The transition to half-node 55nm clearly has its benefits, including a maximum power draw of 289W - a feat we believe to be unachievable with two 65nm GeForce GTX 260s.

Armed with GTX 260-matching frequencies - that's 576MHz core, 1,242MHz shader and 1,998MHz memory - you'd be forgiven for believing nothing has changed. However, NVIDIA has raised the shader count to 240 per GPU, matching the GeForce GTX 280 and giving the card 480 in total.
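The peak GFLOPS row in the table can be reproduced from the shader count and shader clock. The Python sketch below is a minimal illustration, assuming three floating-point operations per ALU per clock for the NVIDIA parts (a MADD counted as two, plus the extra MUL) and two for the AMD parts (MADD only). The function name and the per-clock factors are our own illustrative assumptions, not a vendor-published formula.

```python
def peak_gflops(alus, shader_clock_mhz, flops_per_alu_per_clock):
    """Theoretical peak throughput in GFLOPS (assumed model, see lead-in)."""
    return alus * shader_clock_mhz * flops_per_alu_per_clock / 1000.0

# NVIDIA: MADD (2 flops) + MUL (1 flop) per ALU per clock -> factor of 3 (assumption)
gtx295_gflops = peak_gflops(480, 1242, 3)
gtx280_gflops = peak_gflops(240, 1296, 3)

# AMD: MADD only -> factor of 2 (assumption)
hd4870x2_gflops = peak_gflops(1600, 750, 2)

print(round(gtx295_gflops), round(gtx280_gflops), round(hd4870x2_gflops))
```

These are paper figures only; real-world throughput depends on how often the additional MUL can actually be issued.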

Each GPU has access to its own 896MB of GDDR3 memory via a 448-bit interface, giving the card a substantial 223.8GB/s of combined memory bandwidth - though, of course, each GPU can lay claim to only half.
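For readers who want to check the bandwidth arithmetic, here is a short Python sketch. It assumes bandwidth is simply the bus width in bytes multiplied by the effective memory clock, with 1GB/s taken as 1,000MB/s; the function name is hypothetical.

```python
def memory_bandwidth_gbs(bus_width_bits, effective_clock_mhz):
    """Theoretical memory bandwidth in GB/s for one GPU (assumed model)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_clock_mhz / 1000.0

# GeForce GTX 295: 448-bit interface at 1,998MHz effective, per GPU
per_gpu_gbs = memory_bandwidth_gbs(448, 1998)
card_total_gbs = 2 * per_gpu_gbs

print(f"per GPU: {per_gpu_gbs:.1f}GB/s, card total: {card_total_gbs:.1f}GB/s")
```

The per-GPU result matches the GeForce GTX 260's figure in the table, as you'd expect given the identical interface and memory clock.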

Furthermore, the doubling-up process has the usual effect of taking numbers through the roof. ROP count rises to 56 and the 160 total texture filtering units help push the card's massive bilinear-filtering capacity to 92.2Gtexels/s.
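The fillrate and texel-rate figures follow from the unit counts and the core clock. The Python sketch below assumes one pixel per ROP per clock, one bilinear INT8 texel per texture unit per clock, and half that rate for FP16, as the table's per-clock figures suggest. The function names are our own.

```python
CORE_CLOCK_MHZ = 576  # GeForce GTX 295 core clock

def peak_rate_giga(units, clock_mhz, per_unit_per_clock=1.0):
    """Theoretical peak rate in giga-operations per second (assumed model)."""
    return units * clock_mhz * per_unit_per_clock / 1000.0

pixel_fillrate = peak_rate_giga(56, CORE_CLOCK_MHZ)        # 56 ROPs
bilinear_rate = peak_rate_giga(160, CORE_CLOCK_MHZ)        # 160 texture units, INT8
fp16_rate = peak_rate_giga(160, CORE_CLOCK_MHZ, 0.5)       # FP16 at half rate

print(f"{pixel_fillrate:.3f}Gpixels/s, {bilinear_rate:.1f}Gtexels/s, "
      f"{fp16_rate:.1f}Gtexels/s (FP16)")
```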

As is the case with most dual-GPU single-card solutions, the GeForce GTX 295 fits into a single PCIe 2.0 x16 slot, bringing SLI performance to boards with just the one such slot. Should a second PCIe 2.0 x16 slot be available, the hardened enthusiast will be happy to hear that a second GeForce GTX 295 can be partnered up to provide a return to Quad SLI - a feature last seen during the GeForce 9800 GX2's days.

With an estimated MSRP of £400, however, it's asking a lot to consider even a single card, let alone two.

Despite the 55nm shrink, not a whole lot else has changed. The card's compute power has been beefed up - and we'd expect nothing less given NVIDIA's desire for GPGPU domination - but its memory subsystem sticks to GDDR3. It's seemingly subdued, but the die shrink could leave headroom for NVIDIA's partners to introduce pre-overclocked models further down the road.

Of course, one can't ignore the obvious possibility of 55nm versions of the GeForce GTX 280 and GeForce GTX 260, but we'll wait for official confirmation from NVIDIA before we begin to conjecture.
