NVIDIA's PhysX being hobbled on CPUs?

by Pete Mason on 9 July 2010, 08:51

Tags: NVIDIA (NASDAQ:NVDA)

NVIDIA is always keen to show off how much faster PhysX is when run on its GeForce GPUs than on CPUs. However, a report out this week suggests that the physics software may be purposely hamstrung by outdated and inefficient code.

Living in the past

While PhysX can be run on CPUs, NVIDIA claims a two- to four-fold performance increase when using one of its GPUs. This is why hardware physics acceleration using a graphics card allows a greater number of more complex objects to appear in games. However, a detailed report from Real World Technologies has shown that when the physics code is run on a CPU, it relies on the x87 instruction set. While these instructions are supported by modern processors, chip manufacturers have been discouraging developers from using them for almost a decade in favour of SSE2, which is much more efficient.

According to the author, the newer CPU-based instructions could perform the same physics calculations at least twice as fast. This, combined with the fact that PhysX middleware tends to be single-threaded on CPUs – despite being massively multi-threaded when run on graphics cards – means that the performance delta wouldn’t be nearly as wide if the code were to be recompiled. While it may seem like a colossal task to rewrite the code to use SSE2, the report suggests that, with a bit of diligence, it could be updated in “about a day or two”. This means that updated, multithreaded code running on a modern quad- or hexa-core processor could, in theory, come close to matching physics acceleration by a GPU.
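To put rough numbers on that argument, the short Python sketch below combines the report's "at least twice as fast" figure for SSE2 with an idealised multi-core gain and sets the result against NVIDIA's claimed two- to four-fold GPU advantage. It is back-of-the-envelope arithmetic, not a benchmark: the use of Amdahl's law, the function names and the 90 per cent parallel fraction are all assumptions made for illustration and do not come from the report.

```python
def amdahl_speedup(cores: int, parallel_fraction: float) -> float:
    """Speedup predicted by Amdahl's law for a workload where
    `parallel_fraction` of the time can be spread across `cores`."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def estimated_cpu_speedup(sse2_gain: float, cores: int,
                          parallel_fraction: float) -> float:
    """Combined gain over the single-threaded x87 baseline, assuming the
    SSE2 gain and the threading gain simply multiply (an idealisation)."""
    return sse2_gain * amdahl_speedup(cores, parallel_fraction)


GPU_CLAIM = (2.0, 4.0)   # NVIDIA's claimed GPU advantage, as quoted in the article
SSE2_GAIN = 2.0          # "at least twice as fast", per the report
PARALLEL_FRACTION = 0.9  # hypothetical value, NOT taken from the report

if __name__ == "__main__":
    for cores in (1, 4, 6):
        gain = estimated_cpu_speedup(SSE2_GAIN, cores, PARALLEL_FRACTION)
        print(f"{cores} core(s): ~{gain:.1f}x over the x87 baseline "
              f"(claimed GPU advantage: {GPU_CLAIM[0]:.0f}x to {GPU_CLAIM[1]:.0f}x)")
```

How close a real CPU would get depends heavily on how much of the physics workload actually parallelises and on memory bandwidth, neither of which this sketch models.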

If it ain’t broke...

So why would NVIDIA continue to use outdated instructions? There don't seem to be any valid practical reasons, such as maintaining backward compatibility or needing increased precision, so the report draws the only valid conclusion: “PhysX uses x87 because Ageia and now Nvidia want it that way”.

“In the case of PhysX on the CPU, there are no significant extra costs (and frankly supporting SSE is easier than x87 anyway). For Nvidia, decreasing the baseline CPU performance by using x87 instructions and a single thread makes GPUs look better.”

Unfortunately, this just isn’t surprising. PhysX is a major value-add for the GeForce range of cards and a lot of that value would be removed if a powerful processor could do the job just as well. Given the choice of maintaining the performance gap by leaving the code as it is, or recompiling the code and losing the advantage, it becomes clear which the graphics card manufacturer will favour. While this is a real shame for consumers, it makes perfect commercial sense.

For those who like their numbers or who want to get into the detailed analysis, the report is definitely worth a read. Just try not to get frustrated when you realise that your multi-core processor could probably rip through those physics calculations without breaking a sweat.



HEXUS Forums :: 16 Comments

I read this report yesterday and while it does seem fairly typical nVidia behaviour, calling it ‘deliberately hobbled’ is a bit of a stretch - ‘deliberately not optimised’ is more accurate, and a subtle but important difference!

Still don't like it at all - especially when they do optimise the CPU PhysX for consoles. But it's a clear chance for Havok or OpenCL to provide better performance.
I agree with Kalniel, although you could probably argue it was hobbled to a certain extent by the choice to deliberately use the slower instruction set of the two that were available when they developed it (given SSE has been around for about a decade).

As for not making it run on multiple cores, well, that's definitely the “not optimising it” argument rather than “hobbling it”.
What's the history of the code behind Ageia physics? When did they first start developing the code? Likely before the PPU was architected, and before GPGPU would have allowed it to run on a graphics card. So maybe the original codebase is old enough that x87 was used, and then a poor development plan (or a cunning one) led to the CPU version falling behind technologically. But still, SSE2 came around in 2001… long time ago!

Also you can have your compiler spit out SSE2 code for fp arithmetic - surely they're not writing in assembly?
Does PhysX noticeably improve any games? As someone who normally goes the ATI route, I have always ignored PhysX, and that has never seemed to be a problem.
Steve wrote: “What's the history of the code behind Ageia physics? When did they first start developing the code?”

Well Ageia as a company wasn't even set up until 2002 - which is after SSE2 was introduced. When Nvidia bought them out in 2008, they then converted the API into CUDA to run on their cards.