FLOPS/$

student_

Joined: 24 Sep 05
Posts: 34
Credit: 4,753,449
RAC: 1,341
Message 21696 - Posted: 3 Aug 2006, 4:05:20 UTC

I'm looking for the best CPU in terms of FLOPS (FLoating point Operations Per Second) per dollar (or FLOPS per unit of currency generally), and haven't been able to find clear CPU specifications that use those metrics. AMD seems to be moving towards an economic interpretation of CPU worth in a recent marketing campaign that emphasizes "performance/price", but I haven't been able to find a specific definition of what "performance" actually means. I'm looking for something like MIPS, Dhrystone, Whetstone, and ideally FLOPS benchmarks on a given processor, not how fast it compresses audio files or how many frames per second it can show in DOOM 3. With this I would like a comparison of FLOPS/$ -- what seems to be the most valid way of putting a useful unit price on CPUs.

With Intel's July 27th release of the Core 2 Duo chips, I'd like to invest in the best computer in terms of FLOPS/$ that I feasibly can. Since the primary use of this computer will be to run Rosetta (and perhaps down the road other similarly computationally intense programs), I'm really only looking to optimize my credits/$, which I would assume translates most directly from FLOPS/$.

Would this be better done by buying a P4 HT, Pentium D, AMD single core Athlon, AMD X2, the new Conroe, Woodcrest, or what? I'd like to place an order on August 10th, give or take a day.

Beyond my personal purchase and even BOINC enthusiasm in general, demanding a valid metric of computational capacity unit pricing like FLOPS/$ seems like it would clear up a lot of confusion and make how people interpret CPU value more manufacturer-independent.
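The metric asked for above can be sketched in a few lines. This is only an illustration: it assumes theoretical peak FLOPS = cores x clock x floating-point operations per cycle, and the CPU names, per-cycle figures and prices are made-up placeholders, not benchmarks or real quotes. Real Rosetta throughput may differ a lot from theoretical peak.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    name: str               # placeholder label, not a real product
    cores: int
    clock_ghz: float
    flops_per_cycle: float  # per core; an assumed figure
    price_usd: float        # assumed price

    def peak_gflops(self) -> float:
        # Theoretical peak: cores * GHz * FLOPs per cycle per core.
        return self.cores * self.clock_ghz * self.flops_per_cycle

    def gflops_per_dollar(self) -> float:
        return self.peak_gflops() / self.price_usd


def rank_by_value(cpus):
    """Return the CPUs sorted from best to worst peak GFLOPS per dollar."""
    return sorted(cpus, key=lambda c: c.gflops_per_dollar(), reverse=True)


# Hypothetical candidates, purely to show the calculation.
candidates = [
    Cpu("cpu_a", cores=2, clock_ghz=2.4, flops_per_cycle=4.0, price_usd=320.0),
    Cpu("cpu_b", cores=2, clock_ghz=3.0, flops_per_cycle=2.0, price_usd=180.0),
    Cpu("cpu_c", cores=1, clock_ghz=2.2, flops_per_cycle=2.0, price_usd=90.0),
]

for cpu in rank_by_value(candidates):
    print(f"{cpu.name}: {cpu.peak_gflops():.1f} GFLOPS peak, "
          f"{cpu.gflops_per_dollar():.4f} GFLOPS/$")
```

Swapping in measured Whetstone or Rosetta results for `peak_gflops` would give a more honest ranking than theoretical peak.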
XS_Vietnam_Soldiers

Joined: 11 Jan 06
Posts: 240
Credit: 2,880,653
RAC: 0
Message 21703 - Posted: 3 Aug 2006, 6:21:23 UTC

I won't throw a bunch of numbers at you, but I will tell you that the best value for your money also happens to be the best system you can buy today:
Get yourself a Conroe Core 2 Duo E6600 and one of the good boards for it. Asus, Gigabyte and a couple of others make boards for Conroe.
There are pitfalls. Come over to the forum at www.xtremesystems.org and read up on the Conroe boards before making decisions.
Once you have decided, I can tell you they are monster crunchers, and if you're smart you can get a CPU and board for under $600.00. Add in another $200.00 for good RAM and you're set to go.
Don't forget a quality PSU of at least 480 watts, and up to 700 watts if you're adding a big gaming video card.
Good Luck!
Movieman
dcdc

Joined: 3 Nov 05
Posts: 1831
Credit: 119,627,225
RAC: 11,586
Message 21706 - Posted: 3 Aug 2006, 7:31:35 UTC

For initial outlay, the dual-core Pentium Ds overclocked to near 4GHz would probably be good for the money, but the electricity consumption would be massive - I'd say stay away from the P4s for that reason alone. The most efficient cores are the new Intel ones, but if you can't afford those then I'd go for an A64.

HTH
Danny
MikeMarsUK

Joined: 15 Jan 06
Posts: 121
Credit: 2,637,872
RAC: 0
Message 21707 - Posted: 3 Aug 2006, 7:33:35 UTC

I'd agree with Movieman - building your own system on a low end Conroe and overclocking it to the max is an excellent idea. The longer you can wait the better, since there will be increasingly more experience and 'how-to' guides on overclocking it turning up on the internet.

But also note that a dirt cheap base unit from someone like Dell can be amazingly good in terms of FLOPS/$. I recently bought an Intel 820 box with a gig of RAM for £270. That's far cheaper than buying the components myself, but it lacks the 'fun factor' you get with an overclockable system.

Alexander W. Janssen

Joined: 31 May 06
Posts: 33
Credit: 97,311
RAC: 0
Message 21708 - Posted: 3 Aug 2006, 7:51:31 UTC - in response to Message 21706.  
Last modified: 3 Aug 2006, 7:52:02 UTC

for initial outlay the dual core pentium D's overclocked to near 4GHz would probably be good for the money, but the electricty consumtion would be massive


A Pentium D 940 @ 3.2 GHz sucks 199.7 Watt under load; only the Athlon 64 FX-62, flying at 2.8 GHz, sucks more juice: 249.6 Watt.

The Core 2 Duos are the most energy-efficient CPUs at the moment:
CPU                     Clock      Watt under load
Core 2 Extreme X6800    2.93 GHz   171.4
Core 2 Duo E6700        2.66 GHz   164.4
Core 2 Duo E6600        2.4 GHz    155


Source: c't magazine, issue 16/2006, page 116f.

HTH, Alex.

P.S.: Contact me via email for more details; I won't post all the stuff here.
"I am tired of all this sort of thing called science here... We have spent
millions in that sort of thing for the last few years, and it is time it
should be stopped."
-- Simon Cameron, U.S. Senator, on the Smithsonian Institute, 1901.
Mats Petersson

Joined: 29 Sep 05
Posts: 225
Credit: 951,788
RAC: 0
Message 21719 - Posted: 3 Aug 2006, 10:57:34 UTC - in response to Message 21708.  

Where did you get the power figure of the Athlon FX 62? I've got one, and I can assure you that I've also got a stock heatsink that is NOT designed for a 250W processor... In fact, with a 400W power-supply, I would suspect that a 250W CPU would kill the PSU... Both cores are running at 2785.724 MHz according to Linux....

--
Mats

Alexander W. Janssen

Joined: 31 May 06
Posts: 33
Credit: 97,311
RAC: 0
Message 21720 - Posted: 3 Aug 2006, 11:20:34 UTC - in response to Message 21719.  
Last modified: 3 Aug 2006, 11:21:07 UTC

Pentium D 940 @ 3,2GHz sucks 199.7 Watt under load; just the Athlon 64 FX 62 flying at 2.8 GHz sucks more juice: 249.6 Watt.
Where did you get the power figure of the Athlon FX 62? I've got one, and I can assure you that I've also got a stock heatsink that is NOT designed for a 250W processor...
Dear Mats,
I got all those figures from the table in that magazine. But I think you're right; I did a quick search on the 'net and found that the FX-62 consumes only about 125 Watt - at least all the sources I found at a quick peek were around that number.
Well, it might be an error in the table then, I guess. c't is a renowned and professional magazine, but everybody can make a mistake.
Sorry for spreading confusion!
Alex.
Feet1st

Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 21731 - Posted: 3 Aug 2006, 14:23:40 UTC

So there are two issues to consider. The initial cost of the hardware, and the ongoing cost of the electricity (including cooling if you are air conditioning the room the PC is in).

The Cell processors are going to kick butt on both counts. They dynamically shut down portions of the chip that aren't being used. I couldn't find the link, but I think I read that they can do 2 floating point operations on each of 8 SPEs per clock cycle. Here is an architecture diagram. ...now we just need to see one in a commercially available product that one can write code for.
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
Mats Petersson

Joined: 29 Sep 05
Posts: 225
Credit: 951,788
RAC: 0
Message 21733 - Posted: 3 Aug 2006, 14:28:55 UTC - in response to Message 21731.  

Shutting down units that aren't being used is great - but for something like Rosetta, I doubt very much that it will do any good, as Rosetta is pretty good at using every single unit in the current generation of x86 processors, and assuming they haven't got any new types of units that are useless for Rosetta, I'd expect the "watt per decoy" to be similar. There is a possibility that the differences in processor design give some benefits, but from what I've seen so far, high performance processors tend to be more or less equal on power/calculation performance ratios (with some improvement from smaller geometry: 90nm is better than 130nm, etc.).

ARM is a great processor for low power, but its performance on high-end calculations is pretty abysmal... Great if you're running a mobile phone or MP3 player that has a limited amount of calculation need, but not quite so good for the Rosetta project...

--
Mats
Feet1st

Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 21737 - Posted: 3 Aug 2006, 14:43:52 UTC

I'm thinking Cell will win on power with 8 slave floating point processors for a single CPU. It should crank similar to 8 CPUs, but only use the power of 2... assuming you can code or compile in a way to take full advantage of the hardware. So, the point wasn't that there would be much idle time :) but power was apparently one of the design points for the Cell processors. ...hope they find a way to integrate them into BlueGene!
Mats Petersson

Joined: 29 Sep 05
Posts: 225
Credit: 951,788
RAC: 0
Message 21738 - Posted: 3 Aug 2006, 14:53:01 UTC - in response to Message 21737.  

And my point is that if you want to do x MFLOPS, you need, give or take some small percentage, y Watt, and that's not going to change dramatically with a different architecture.

CMOS technology only uses power when levels are changing, so "idle" parts of the processor are already not using any power.

Obviously, there are ways to reduce the power consumption - the obvious one is to use fewer transistors, but that usually leads to less performance [unless it's using fewer transistors to do exactly the same work, which is where you need CLEVER people involved!], so performance/Watt remains equal. Another way is to run at a lower voltage, which helps quite a bit (p is proportional to f * v^2, where p = power, f = frequency, v = voltage, so halve the voltage and the processor uses a quarter of the power) - unfortunately, lower voltage also, normally, means lower speed, so again, the overall performance is reduced [although not necessarily linearly with the drop in power consumption].
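To make the voltage relation concrete, here is a minimal sketch of the scaling rule quoted above. It assumes the simplified model that dynamic power is proportional to f * v^2 and deliberately ignores static (leakage) power; the function names are just for illustration.

```python
def relative_dynamic_power(freq_scale: float, voltage_scale: float) -> float:
    """Dynamic power relative to baseline, assuming p is proportional to f * v^2.

    freq_scale and voltage_scale are ratios to the baseline (1.0 = unchanged).
    Leakage power is ignored in this simplified model.
    """
    return freq_scale * voltage_scale ** 2


def relative_perf_per_watt(freq_scale: float, voltage_scale: float) -> float:
    """Performance per watt relative to baseline, assuming performance
    scales linearly with clock frequency (an optimistic assumption)."""
    return freq_scale / relative_dynamic_power(freq_scale, voltage_scale)


# Halving the voltage at the same clock: a quarter of the power.
print(relative_dynamic_power(1.0, 0.5))

# If the lower voltage also forces the clock down to half, power drops
# to one eighth, but so does half of the performance.
print(relative_dynamic_power(0.5, 0.5))
print(relative_perf_per_watt(0.5, 0.5))
```

Under this model the perf/Watt gain depends only on the voltage ratio, which is why lower-voltage parts look attractive even when they clock lower.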

As with the other discussion, I don't wish to sound like a besserwisser, but if it was EASY to reduce the overall power consumption and retain the performance, it would have been done a long time ago. I've worked over ten years in the chip-design/manufacturing industry - although not with chip-design itself - so I've had some exposure to how things work...

--
Mats

Alexander W. Janssen

Joined: 31 May 06
Posts: 33
Credit: 97,311
RAC: 0
Message 21741 - Posted: 3 Aug 2006, 15:31:30 UTC - in response to Message 21738.  

And my point is that if you want to do x MFLOPS, you need, give or take some small percentage, y Watt, and that's not going to change dramatically with a different architecture.

Well, let's consider the Cell accelerator board which Mercury will sell next spring [2]. The webpage says it delivers up to 179 single-precision GFLOPS peak.
According to [1], double-precision FLOPS are 14 times slower, so let's assume that the board does about 12.8 double-precision GFLOPS.
An Opteron 244 has about 3.14 double-precision GFLOPS [3], so a quad-Opteron system (about 12.6 GFLOPS) would roughly match that accelerator board.
The Opteron 244 gobbles 84 Watt, x4 => 336 Watt.
The Cell board sucks around 210 Watt.

Assuming that all specs are correct and that the Cell scales as well as the quad-Opteron system, the accelerator board needs 37.5% less energy; what is not included is the energy needed by the host system, which provides the PCIe slot.

But speaking of FLOPS/Watt: Cell wins. The whole thing might become interesting if the price drops (they announced that board with a price tag of 8k USD) and if you could insert a bunch of them into a PCIe backplane.

Don't want to be a smartass either, but I'd call a power difference of ~35% quite something.
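The arithmetic above can be checked with a short sketch. It uses only the figures quoted in this post (179 SP GFLOPS, a 14x double-precision penalty, 3.14 DP GFLOPS and 84 Watt per Opteron 244, 210 Watt for the board) and, like the post, leaves out the host system's power draw; the variable names are just for illustration.

```python
def gflops_per_watt(gflops: float, watts: float) -> float:
    """Floating-point throughput per watt of power draw."""
    return gflops / watts


# Figures as quoted in the post (not independently verified).
cell_dp_gflops = 179.0 / 14.0    # single-precision peak / DP penalty
cell_watts = 210.0               # accelerator board only, host excluded

opteron_dp_gflops = 4 * 3.14     # four Opteron 244 CPUs
opteron_watts = 4 * 84.0

power_saving = 1.0 - cell_watts / opteron_watts

print(f"Cell board:   {cell_dp_gflops:.2f} GFLOPS, "
      f"{gflops_per_watt(cell_dp_gflops, cell_watts):.4f} GFLOPS/W")
print(f"Quad Opteron: {opteron_dp_gflops:.2f} GFLOPS, "
      f"{gflops_per_watt(opteron_dp_gflops, opteron_watts):.4f} GFLOPS/W")
print(f"Power saving of the board: {power_saving:.1%}")
```

Adding an estimate for the host system to `cell_watts` would shrink the gap, which is the caveat raised in the post.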

Alex.

[1] http://www.hpcwire.com/hpc/671376.html
[2] http://www.mc.com/products/view/index.cfm?id=106&type=boards
[3] http://www.opteronics.com/pdf/39497A_HPC_WhitePaper_2xCli.pdf
Mats Petersson

Joined: 29 Sep 05
Posts: 225
Credit: 951,788
RAC: 0
Message 21742 - Posted: 3 Aug 2006, 16:13:48 UTC - in response to Message 21741.  

Don't want to be a smartass either, but i'd call a power-difference of ~35% quite something.
Alex.


Good point - however, you're comparing fairly old existing technology with future technology.

Current AMD processors are dual core, and I can certainly say that my dual-core FX-62 is faster than my quad-CPU 840 machine, and that's with double the number of actual CPUs... A dual-core Opteron 280 runs at the same clock speed as a single-core 250, and the spec says the 280 draws 10W more than the 250. Theoretically, you should get double the FPU performance, but even assuming we only get 1.3x it's still an improvement in performance/W... And the 250 is almost the same power as the 244 in your example... I don't have a document to show the FPU performance directly, and it would of course depend a lot on the actual benchmark used (memory intensive benchmarks would compete over the same memory bus - something that obviously the Cell processor would also have to contend with).

Next-generation AMD processors will be quad-core, and thus have twice the processing power again - they will be out around the same time as the Cell board is predicted...

--
Mats
XS_Vietnam_Soldiers

Joined: 11 Jan 06
Posts: 240
Credit: 2,880,653
RAC: 0
Message 21744 - Posted: 3 Aug 2006, 16:16:02 UTC

I don't want you guys to take this the wrong way, but the original poster's question was essentially about bang for the buck and what's buyable now.
It's all well and good what will be available next month or next year, but it doesn't answer his question.
Thanks for your time,
Movieman
AMD_is_logical

Joined: 20 Dec 05
Posts: 299
Credit: 31,460,681
RAC: 0
Message 21748 - Posted: 3 Aug 2006, 16:57:32 UTC

I suggest waiting until the new credit system is in place. Then you will be able to see how fast various CPUs really are when running Rosetta. Certain new CPUs get fantastic benchmarks with optimized BOINC clients, but are not very impressive at running Rosetta (which uses legacy instructions for compatibility, and a low optimization level to help in debugging).

Definitely go dual core, and get at least 1GB of memory.

Note that it is possible to set up a dedicated cruncher that has only a motherboard, CPU+HS+fan, memory, and power supply.

You didn't mention power usage. There are certain tradeoffs between up-front cost and power usage. For instance, an efficient power supply (such as Seasonic) can reduce power usage but will cost more up front. The 35W AMD dual core 3800+ processor (not available in the US yet) costs more than the more power hungry versions. And so on.
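The up-front versus running-cost tradeoff can be sketched as a break-even calculation. The dollar, watt and electricity-price figures below are placeholders chosen only to show the arithmetic, not real prices for any of the parts mentioned.

```python
def energy_cost(watts: float, hours: float, price_per_kwh: float) -> float:
    """Electricity cost of running a constant load for a number of hours."""
    return watts / 1000.0 * hours * price_per_kwh


def break_even_hours(extra_upfront: float, watts_saved: float,
                     price_per_kwh: float) -> float:
    """Hours of crunching before a pricier but more efficient part
    has paid for itself through lower electricity use."""
    if watts_saved <= 0 or price_per_kwh <= 0:
        raise ValueError("watts_saved and price_per_kwh must be positive")
    return extra_upfront / (watts_saved / 1000.0 * price_per_kwh)


# Hypothetical example: a part that costs 30 more and saves 30 W,
# with electricity at 0.10 per kWh (all assumed values).
hours = break_even_hours(extra_upfront=30.0, watts_saved=30.0, price_per_kwh=0.10)
print(f"Break-even after about {hours:.0f} hours "
      f"({hours / 24 / 365:.1f} years of 24/7 crunching)")
```

For a machine that crunches around the clock, a break-even of a year or so may well be worth it; for a box that runs a few hours a day, probably not.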
student_

Joined: 24 Sep 05
Posts: 34
Credit: 4,753,449
RAC: 1,341
Message 21785 - Posted: 4 Aug 2006, 0:03:24 UTC - in response to Message 21748.  

While it is certainly a consideration, the FLOPS/watt metric is not what I am trying to optimize: it's FLOPS/$. If I were buying a large number of computers for a farm of 20+ machines, then I would probably try to find the optimal FLOPS/(watt*$) computer, but since I am only buying one computer and I minimize my energy usage in many other ways, I think it's alright to optimize FLOPS/$ even if FLOPS/watt may suffer.

I'm leaning most towards an Intel Conroe E6600 with >= 512MB DDR2 RAM, but with a small (probably spare) hard-drive and cheap (again probably spare) video card. I'll probably buy a ready-to-go box.
XS_Vietnam_Soldiers

Joined: 11 Jan 06
Posts: 240
Credit: 2,880,653
RAC: 0
Message 21786 - Posted: 4 Aug 2006, 0:29:34 UTC - in response to Message 21785.  

Always build it yourself. Then you know what's in it, how it's assembled, etc.
Nothing like your own work to make a system stable.
There's no substitute for care and taking your time in a build, no matter how much you pay.
Movieman
Astro

Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 21787 - Posted: 4 Aug 2006, 0:56:53 UTC - in response to Message 21785.  

I'm leaning most towards an Intel Conroe E6600 with >= 512MB DDR2 RAM, but with a small (probably spare) hard-drive and cheap (again probably spare) video card. I'll probably buy a ready-to-go box.

You'll want at least 512 MB of RAM per core.
senatoralex85

Joined: 27 Sep 05
Posts: 66
Credit: 169,644
RAC: 0
Message 22155 - Posted: 9 Aug 2006, 23:42:22 UTC

I am running a P4 2.26 GHz PC with XP (my computer is not hidden, for those who want to see) and have 128 MB of PC800 RDRAM. Does anyone have a rough estimate of how much performance I can gain by doubling the memory? How well does PC800 RDRAM compare to DDR2 memory chips? PC800 RAM is expensive relative to DDR2. I have repeatedly looked on the internet for a website that compares the two types of memory "head to head" but could not find any sources. Thanks in advance for your help!
BennyRop

Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 22157 - Posted: 10 Aug 2006, 0:51:57 UTC
Last modified: 10 Aug 2006, 0:54:05 UTC

SenatorAlex: Rambus RAM was expensive when it came out - and never gained enough market share to drop in price and become common. While there may be a difference in memory bandwidth between PC800 Rambus RAM and PC3200 DDR, it wouldn't be enough to justify buying a new motherboard and new RAM while keeping the old CPU.

As for 128 MB to 256 MB to 512 MB - with all the machines that I work on, if they only have 128 MB of RAM and have WinXP, they spend way too much time swapping memory out to disk. (It'd be painful to work on if I didn't start something and then go and work on something else.)
Systems with 256 MB are now swapping out to disk all the time if the customer has gone with NAV 2005/6, NIS 2005/6, McAfee, etc., where the antivirus and firewall end up causing the system to eat up more than 256 MB of RAM. If you can get to 512 MB, then the system boots faster, responds faster, and doesn't require such patience to use.

Personally, I'd go with a new system to get away from the P4 CPU architecture and go with a Core Duo or Core 2 Duo if you're an Intel fan - or a Socket 939 Athlon 64 X2 if you're an AMD fan.