Recently I took a look at Intel's quickest 0.18u processor and we commented that Intel were moving to 0.13u soon with the new Northwood processors. In the time that's passed Intel have come up with the goods as promised and here we are today with a closer look at the new chip.
At the time of the 2GHz 'Willamette' review the 'Northwood' had just become available in 2.0GHz and 2.2GHz forms, and the 2.2 sat proudly at the top of the tree in Intel's grand scheme of things. The Northwood still holds that top spot today, with the 2.4GHz processor on the 0.13u process now Intel's most powerful consumer offering.
What's changed since then is that the new 0.13 micron processors have also become available at slower speeds in volume, specifically 1.6 and 1.8, and it's the 1.8GHz processor that I'll be looking at today.
A couple of sites are pushing the 1.6A as the processor of choice from Intel due to its price, low multiplier (remember, multipliers are fixed on Intel consumer CPUs) and ability to overclock VERY well.
We're big fans of the 1.8 around these parts, however, with myself and a few good friends getting used to running and overclocking these new processors on a variety of motherboards.
We took a close look at the underlying architecture for the Pentium 4 in the 2GHz Willamette review so if you want to brush up on your core knowledge then take a look here, specifically the section on Netburst. You could also do a lot worse than hit somewhere like Anandtech if you want to make sure you have all bases covered.
So what's changed other than the move to the 0.13 micron die process with these new CPUs? The most obvious change is the move to 512KB of L2 cache, which brings with it a few key benefits. We saw in the previous review that with the long pipeline in the Pentium 4, being able to keep it busy for as long as possible is where the speed gains are to be found.
If you think of the extra cache memory as a bigger buffer between the CPU and main memory then you can see why it helps to keep the later stages of the pipe busy where the 256KB version would maybe have some difficulty.
Intel have speculated that when running existing code, only around 33% of the execution units are in use at any one time with the other two thirds of the CPU sitting idle. That's a lot of untapped power so the extra cache helps keep the pipe busier and in use more often.
Also, as always with a die shrink, the core voltage needed for proper operation drops, in this case from 1.75V to a max Vcore of 1.5V (more on this later).
Onto the performance.
As usual when we take a look at CPU performance we'll be running Sandra, POVRay, 3DMark and also for a decent test of CPU speed, Quake3. We don't usually look at Q3 in our CPU reviews but we'll have a quick look today.
We'll check out overclocked performance too on the same graphs to keep things nice and compact and save you reading gargantuan amounts of repeated information.
As always, a look at the test platform.
- DFI NB70-SC Intel i845-D Socket 478 Motherboard
- Intel 1.8GHz 'A' Northwood Processor (512KB, 1.5V)
- 1 x 256MB Crucial PC2100 CAS2.5 DDR module (CAS2 selected)
- Gainward Ti550 GeForce3 Ti500 64MB
- Adaptec 39160 PCI SCSI Dual Channel U160 controller
- 2 x 73GB Seagate Cheetah U160 10,000rpm SCSI disks
- Plextor 12/10/32S SCSI CDRW
- Pioneer 6x Slot-load SCSI DVD
- Creative Labs Soundblaster Audigy Player (SB0090 version)
- Windows XP Professional Build 2600.xpclient.010817-1148
- DetonatorXP 22.40 NVIDIA drivers
- POVRay v3.1g.msvc.unofficial-win32 dated 28 August 2001
- 3DMark 2001 Professional
- SiSoftware Sandra v2002.1.8.59
So, first up, SiSoftware Sandra, specifically the CPU Arithmetic and CPU Multimedia Benchmarks.
CPU Arithmetic Benchmark @ 1.8GHz (18 x 100)
CPU Multimedia Benchmark @ 1.8GHz (18 x 100)
Nothing incredibly exciting here except to note that on the i845D the Northwood runs as expected. On to the overclocked numbers. Remembering that the DFI NB70 has no facility for increasing the core voltage on the CPU makes the overclock we obtained all the more exciting.
At default voltage, the CPU managed to make it into Windows at 2358MHz at 131MHz FSB. However, the maximum stable speed that would run all our benchmarks was a slightly lower 128MHz FSB, which gave us 2304MHz. 2.3GHz, or 500MHz over stock speed, without a voltage increase is very impressive. Here are the Sandra results at that speed.
CPU Arithmetic Benchmark @ 2.3GHz (18 x 128)
CPU Multimedia Benchmark @ 2.3GHz (18 x 128)
Here we can see the processor forge a decent lead over all the reference processors in both Sandra benchmarks. The XP2000 and 2.2GHz Northwood are noticeably absent from the list, and in the multimedia benchmark the XP2000 would take the lead back with its incredibly strong FPU, but the little 1.8 does very well when overclocked and performance is very nice indeed.
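For anyone wanting to check the overclocking arithmetic, the clock speeds quoted above are just the fixed 18x multiplier times the FSB. Here's a minimal Python sketch of that sum; the function and variable names are ours, purely for illustration.

```python
# Core clock = fixed multiplier x front-side bus (FSB) clock.
# The multiplier and FSB values come from the review; names are illustrative.
MULTIPLIER = 18  # fixed multiplier of the 1.8A


def core_clock_mhz(fsb_mhz, multiplier=MULTIPLIER):
    """Return the CPU core clock in MHz for a given FSB clock in MHz."""
    return multiplier * fsb_mhz


stock = core_clock_mhz(100)   # default FSB
boot = core_clock_mhz(131)    # highest FSB that reached Windows
stable = core_clock_mhz(128)  # highest FSB that ran every benchmark

gain_mhz = stable - stock
gain_pct = 100 * gain_mhz / stock

print(stock, boot, stable)        # 1800 2358 2304
print(gain_mhz, round(gain_pct))  # 504 28
```

So the stable overclock works out at roughly 28% over stock clock speed.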
As we move on to the POVRay tests it will be interesting to see if the extra 256KB of L2 cache on the Northwoods has any bearing on the render time since it doesn't affect Sandra's results.
First off, the results for the P3 and P4 binaries when running at 1.8GHz.
POVRay P3 and P4 Binaries @ 1.8GHz (18 x 100)
The execution time for the render doesn't get any faster on the Northwood compared to a 1.8GHz Willamette processor. The POVRay working set would appear to fit nicely into the smaller cache of the Willamette CPUs so the increase in cache size provides no gain. This is to be expected when the application being run is happy in the smaller cache size.
As we increase speed to 2.3GHz we should see an impressive speedup, and the quickest CPU results we've ever seen at Hexus should be provided by the P4 binary and 2.3GHz combination.
POVRay P3 and P4 Binaries @ 2.3GHz (18 x 128)
As we increase the clock speed by 500MHz, the P3 binary result at 2.3GHz approaches the result for the P4 binary at 1.8GHz. The fact that it doesn't quite beat the 1.8GHz P4 binary result shows you the impressive results you get from optimising for the processor. A 500MHz increase in clock speed is still not enough to catch the optimised version. The P4 binary result at 2.3GHz is simply the quickest we've yet seen at Hexus and beats every Athlon XP result we've seen on the P3 binary (remember the AXP can't run the P4 binary at all).
Finally, we'll check out 3DMark and Quake3 performance on the CPU. Quake3 loves running on the Pentium 4 processor so we should see some impressive results. A few people have noticed that our Q3 scores are a fair bit lower than other scores you might see around the web and that's down to 2 things. Firstly, I always set all the rendering options within Quake3 to their maximum, most punishing values so that the system has the most work to do. Secondly the demo we run isn't the Demo001 demo that most other sites use. We use the Four demo that's in the latest point release for our testing.
First off, 3DMark. It's a fair test of an entire system, with some tests relying more on CPU speed for their performance, some relying on memory bandwidth for more frames per second and some relying on pure GPU performance. Remember that my 3DMark results are generated by running the test 3 times and discarding the upper and lower score, leaving the middle one for our published results.
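The three-run rule described above is simply taking the median. A quick Python sketch of the idea follows; the function name is ours and the scores in the example are made-up placeholders, not results from this review.

```python
from statistics import median


def published_score(runs):
    """Return the middle score of exactly three benchmark runs.

    Discarding the highest and lowest of three runs leaves the median.
    """
    if len(runs) != 3:
        raise ValueError("expected exactly three runs")
    return median(runs)


# Placeholder numbers for illustration only (NOT scores from this review).
example_runs = [1010, 990, 1000]
print(published_score(example_runs))  # 1000
```

Using the middle score means a single unusually good or bad run can't skew the published figure.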
Pairing the 1.8 with a DX8 class GPU shows an impressive turn of speed. 7500+ in Windows XP at default CPU and GPU clocks is quite impressive and shows that we'll have good speed out of the box at 1024x768, the most common gaming resolution for most. The CPU and FSB clock boosts give us another 700 points effectively for free. The system would loop 3DMark 24/7 at the overclocked speed despite the lack of a voltage increase.
Remember that memory clocks at 128MHz FSB were 128MHz due to the memory module being unable to run at FSB x 1.33 because we couldn't increase memory voltage. Given a suitable stick of memory, the score would have increased another 2 or 3 hundred points, simply due to the increased memory bandwidth available.
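To put some rough numbers on the memory bandwidth point above, here's a small Python sketch of the theoretical peak figures. It assumes a standard single-channel DDR setup (64-bit bus, two transfers per clock) and treats the 1.33 ratio as 4/3; those are our assumptions and the names are illustrative. These are theoretical peaks, not measured results.

```python
# Theoretical peak DDR bandwidth = memory clock x 2 transfers/clock x 8 bytes.
# Assumptions (ours): 64-bit single-channel DDR, and "1.33" taken as 4/3.
BUS_WIDTH_BYTES = 8       # 64-bit data bus
TRANSFERS_PER_CLOCK = 2   # double data rate


def memory_clock_mhz(fsb_mhz, ratio):
    """Memory clock in MHz for a given FSB clock and memory:FSB ratio."""
    return fsb_mhz * ratio


def peak_bandwidth_mb_s(mem_clock_mhz):
    """Theoretical peak bandwidth in MB/s for a DDR module at this clock."""
    return mem_clock_mhz * TRANSFERS_PER_CLOCK * BUS_WIDTH_BYTES


sync_clock = memory_clock_mhz(128, 1)       # what we had to run: 128MHz
fast_clock = memory_clock_mhz(128, 4 / 3)   # what FSB x 1.33 would give

print(peak_bandwidth_mb_s(sync_clock))          # 2048
print(round(peak_bandwidth_mb_s(fast_clock)))   # 2731
```

For comparison, PC2100 at its rated 133MHz works out at around 2133MB/s on the same formula, so running the memory at 128MHz leaves it slightly below its rated speed.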
Finally we can take a quick look at Quake3 on the CPU. We don't usually look at Quake3 performance in our CPU reviews but the 2.3GHz pumps out some impressive figures that are nice to look at.
Looking at the graph, the results work their way down through the following resolutions: 1600x1200, 1280x1024, 1024x768. We can see that even at 1600x1200 the P4 runs Q3 incredibly well, even at 1.8GHz. Sub 100fps just doesn't happen on the P4 anymore, even at high res, given a decent graphics card.
The extra 500MHz gives us more than 20% more speed at 1024x768, and free speed is something we can't argue with. The high multiplier and relatively low FSB increase keep the rest of the system relatively unstressed, which is a bonus. 24/7 running at 128MHz FSB on an i845D board is easy and the free speed is excellent. Memory voltage adjustment is really needed for optimum performance, however. Running the memory at sub-PC2100 speeds (effectively underclocking it) isn't the best thing to do. Remember to source a decent stick of memory for running at 1.33 x FSB on the i845D!
It's a shame that the Hexus CPU tests don't really illustrate the performance increase that is possible due to the extra cache size. We'll endeavour to change that in the future with some Linpack analysis and apps that illustrate the change. Look for that when we take a look at our next CPU.
As for the performance we can witness from our tests, it's obvious that the free performance from the overclock is impressive. What endears me to overclocking the P4 is that the high multiplier means you don't overstress the rest of the system. Combine that with the new Northwood's propensity for overclocking without a voltage change and the result is a lot of fun.
Coming from tweaking AMD systems for the past year, the P4 is a breath of fresh air and enjoyable again.
The processor overclocked very well. It's worth talking about CPU voltage here to clear up a few things people have noticed when moving to Intel from AMD systems. Intel's VRM spec for the P4 (the VRM governs CPU voltage) states that the CPU is booted with a Vmax voltage. This is the maximum voltage to be fed to the CPU to boot it properly and run at the designed speed.
On the majority of P4 motherboards out there, setting a CPU voltage in the BIOS is actually providing the Vmax voltage for the VRM and won't be the constant running voltage of the CPU. Consequently, looking in the BIOS hardware monitoring or using a program like MBM will show a lower voltage than the Vmax value you set. It's worth noting that this is correct behaviour and that your system isn't faulty (like some have suggested).
On AMD systems, the voltage you set in the BIOS or via jumpers is usually the monitored voltage you get. This is down to slight changes in VRM specs between AMD and Intel systems and is just another side effect that tweakers will have to get used to when moving back and forth between systems.
This relates to this review because the monitored voltage throughout the testing was 1.44V. This is below the Vmax value for the CPU and is a perfectly valid voltage for running the Northwood CPUs. The VRM constantly adjusts running voltage as needed, up to Vmax, to keep the CPU running properly, maybe increasing voltage slightly under high load or reducing it if the CPU is in one of its idle states.
Overall, performance was very acceptable and it ran all of our tests without hassle or slowdown and it currently resides in my system as I type this. Not quite the top x86 performer that money can buy but not slow either and the overclocking was excellent.
The move to 0.13 micron, along with all the benefits including reduced cost, reduced power consumption, reduced heat and the overclocking possibilities, makes the Northwood a very attractive proposition at the moment. The cheaper CPUs are excellent overclockers and the cost per unit is quite low for an Intel CPU, although the 2.2 is still very costly.
Overall performance was high, cost is low for an Intel processor in the midrange and it was a pleasure testing it. It'll stay in my machine for quite a while I think, displacing AMD for the time being.