Message boards : Number crunching : Is this for real???
John Hunt Send message Joined: 18 Sep 05 Posts: 446 Credit: 200,755 RAC: 0 |
I've stumbled into this thread................ I am in no way a techie the way some of you guys seem to be. I started running BOINC about 18 months ago. If it is true that 'cheating' is going on, then is it worth any of us actually continuing? |
MikeMarsUK Send message Joined: 15 Jan 06 Posts: 121 Credit: 2,637,872 RAC: 0 |
If you're contributing because you want to help medical science by improving protein crunching algorithms, and so forth, then of course it's worth continuing. If your reason for getting involved in distributed computing projects is for credit, then I would rethink, because credits are meaningless for various reasons, cheating being only one of them. That applies to all projects, not just this one. To the earlier poster who suggested that benchmark-based credit was deliberately designed to allow cheating, I would disagree - far more likely to be naivety than anything else. Crunch for projects which excite your interest, don't crunch for numbers. Aglarond: I don't understand your comment about crunching 24/7 means cheating is likely? Mad credit per day, yes, I understand that, but I don't follow the second half of your argument. |
Aglarond Send message Joined: 29 Jan 06 Posts: 26 Credit: 446,212 RAC: 0 |
To John: I've stumbled into this thread................ Yes, it is worth it. Rosetta is about science that can help many people. The cheating we are all talking about doesn't have any impact on the science. Credits are here only for people who care about credits :) People in this thread are just trying to say that credits should be counted in another way. To Mike: Aglarond: I don't understand your comment about crunching 24/7 means cheating is likely? Mad credit per day, yes, I understand that, but I don't follow the second half of your argument. I wanted to say that if someone is crunching 12 hours for Rosetta and 12 hours for Einstein, then I see a reason for using the optimized BOINC (5.5.0). However, if he is crunching 24 hours for Rosetta, there is no need to use the optimized BOINC. Why would someone spend time installing a special BOINC client on a computer that is crunching only for Rosetta? There are some legitimate reasons, but it is very likely that it was for higher credit claims. |
Bob Guy Send message Joined: 7 Oct 05 Posts: 39 Credit: 24,895 RAC: 0 |
I'm currently using BoincStudio (v0.5.5) because I use the optimized app for Einstein. It is only enabled for Einstein. I will be very happy to discontinue its use when Einstein changes to FLOP counting. There is no reason that Rosetta shouldn't also implement FLOP counting. Except for one - allowing credit exploitation draws a certain number of participants that enables a project to get its WUs crunched sooner rather than later. I participate in this project because I think the science is useful and valid - not because of any silly notion that more credits make a participant somehow better or more important. I can still be offended by the credit exploitation though. |
John Hunt Send message Joined: 18 Sep 05 Posts: 446 Credit: 200,755 RAC: 0 |
I've stumbled into this thread................ In that case; I'm staying on board! If the cheating was in any way affecting the science I would seriously think of giving it all up. |
MikeMarsUK Send message Joined: 15 Jan 06 Posts: 121 Credit: 2,637,872 RAC: 0 |
To Mike: Ah I see, that makes sense. |
Jose Send message Joined: 28 Mar 06 Posts: 820 Credit: 48,297 RAC: 0 |
I am not angry at you, but for reasons that will become clearer in the near future (and that I cannot reveal for now) let me challenge some of your statements and some of the statements made by others, so I can get the input of those who have contributed to this thread. (Note: Given the sensitive nature of the issue, I am more than pleasantly surprised at the civility displayed here.) So with that in mind, and in the spirit of civility that has permeated this discussion so far, let me raise some questions so I can get some much-needed answers. 1- Why is running Rosetta 24 hours a day an indicator of a possible cheat? Could it be that a 24/7 Rosetta system is doing that because it was built to run only Rosetta 24/7? Would it surprise you that there are men and women who have stacks/piles/close to mountains of computer components of all kinds stored just in case those pieces can be salvaged to build a "rig" to be tested for speed or whatever, and that there is a community of said people who will even trade pieces or donate them to a fellow community member? Would it surprise you to know that many teams have people like that among them, and that certain teams have been formed almost exclusively of people like that? 2- Why so much emphasis on the CPU? Can the type of motherboard be a factor in performance? The graphics card? The BIOS? The case the computer is in? The power supply? 3- Are all CPUs of the same "brand and speed" the same? Are we taking into consideration that cooling a CPU improves its performance? For example, add additional fans to the case and the performance of a CPU will be improved. If adding 3 additional fans (in my case, a $10.00 investment) helped me improve my computer's CPU performance, can you imagine what all the other cooling options (commercially available and custom built) can do?
(Example: some people even use liquid nitrogen to cool their systems.) So we may be talking about people who have one-of-a-kind, specially designed systems built for maximum performance and crunching power, and designed specifically to run Rosetta. So here the issue becomes: how can one verify that? In the case where the people who own the computers are identifiable and their teams known, verification is easy. Believe me, that type of cruncher will document his systems, for he is probably proud of what he/she achieved and has that configuration and its performance registered in one or all of the benchmark verification databases that exist. That verification cannot be done with anonymous computers. 4- Another issue we need to address, and on which I would like your input: the BOINC Manager allows the user to check his benchmarks on demand. There are several things that even a non-techie user can do to improve his benchmarks, as simple as turning the computer off, unplugging it from the power supply, disk cleaning, disk defragmenting, closing most programs and setting the tasks related to BOINC to high priority; run the benchmarks again and there is a 99.9% chance they will be better than the first ones. So update those better benchmarks and by default there will be better scores. Notice, I have not mentioned the range of readily available software that allows improvement/better management of CPU usage. Nor have I mentioned even more detailed tinkering with a computer's registry that can result in better performance benchmarks. (This possibility was pointed out to me by my 16-year-old niece.) So I ask: is this type of action "cheating" or routine computer management/maintenance? 5- How do we modify a credit system to make it less dependent on the benchmarks? 6- Let's get even more radical: should we even give credits? I will stop for now. I have other questions to ask and some other comments. But this message is getting way too long.
As I said before: if I am raising these issues it is because I value your opinions on them, and more importantly, I need your input. If you want to contact me in private please use this addy: joseantonio@choicecabledotnet. TIA Jose "This and no other is the root from which a Tyrant springs; when he first appears he is a protector." (Plato) |
MikeMarsUK Send message Joined: 15 Jan 06 Posts: 121 Credit: 2,637,872 RAC: 0 |
... Not because of running 24/7, but using a client designed to handle multi-projects, and then running only against Rosetta. He's asking about the motivation of using that client.
My CPU is overclocked by 41%, the memory is DDR500 and overclocked too, leading to a 44% increase in actual model performance over my original configuration with the same CPU. These figures refer to the processing speed of CPDN's climate model, which does not give benchmark-based credit. However, some CPUs referred to earlier are displaying benchmarks which are clearly impossible, regardless of how much overclocking etc. is performed. The only reason for doing this is to get excess credit rather than to reflect enhanced performance.
Increasing your PC's performance will increase your benchmarks too (to a lesser or greater extent), hence no need to modify benchmark results artificially. Even liquid nitrogen cooling will not increase a CPU's performance ten-fold... An anonymous computer can be used to test the amount of true processing given out in a work unit, and hence is a good way to measure credit. It will not make a good guide of how much time a different computer will take to process the work unit, but that is irrelevant. If you have boosted your PC by 50%, then your PC will perform the work unit 50% quicker than the anonymous computer, and will be granted the same amount of credit as the anonymous computer would have been given for that work unit. Hence, it will be getting credit at a 50% greater rate per hour, which is a perfectly acceptable and fair result.
Tinkering which improves the machine's speed at actually processing work units is of course entirely fair. Hence, defragmentation may improve performance by (say) 3%. Very little of this tinkering will result in performance increases above 10% (apart from overclocking, which can give much more). Giving out credit on a work-unit basis will accurately reflect this performance increase, since the PC will have processed the work unit in reduced time while doing the same amount of actual work as the anonymous reference computer.
By giving out a fixed amount of credit per work unit of the same type. Faster computers will process it quicker, hence get more credit per hour. The anonymous PC is used to calibrate the credit per work unit.
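The fixed-credit-per-work-unit scheme described above can be sketched in a few lines. This is only an illustration of the arithmetic; the credit and timing values are invented, not real BOINC or Rosetta figures.

```python
# Sketch of work-unit-based credit as argued above. The reference values
# are made up for illustration; they are not from any actual BOINC code.
REFERENCE_CREDIT = 40.0  # credit calibrated on the anonymous reference host
REFERENCE_HOURS = 8.0    # time the reference host needed for this work unit

def credit_per_hour(host_hours):
    """Every host gets the same credit per WU, so a faster host earns a higher hourly rate."""
    return REFERENCE_CREDIT / host_hours

# A host boosted to run 50% faster than the reference finishes in 2/3 the
# time and therefore earns credit at a 50% higher rate, exactly as claimed.
reference_rate = credit_per_hour(REFERENCE_HOURS)
boosted_rate = credit_per_hour(REFERENCE_HOURS / 1.5)
```

The point of the sketch is that the benchmark never appears in the formula, so inflating it gains nothing.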
An interesting idea. Credit is inherently unfair when doing a project-to-project comparison, since it is not calibrated between projects. I don't have an answer for that, other than for suggesting that some way be developed in the future for calibrating interproject credit (perhaps in the same way, by using anonymous PCs to crunch work units between different projects, and then generating a correction factor based on those results).
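The inter-project correction factor suggested here could work roughly as follows. The per-project rates below are invented purely to show the mechanism; no project actually publishes these numbers.

```python
# Run the same reference PC against each project, measure credit per hour,
# and scale every project toward a common baseline. All numbers invented.
ref_credit_per_hour = {"rosetta": 12.0, "einstein": 18.0}

baseline = sum(ref_credit_per_hour.values()) / len(ref_credit_per_hour)
correction = {project: baseline / rate
              for project, rate in ref_credit_per_hour.items()}
# Projects paying above the baseline are scaled down, and vice versa, so
# one hour of reference crunching is worth the same credit everywhere.
```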
I hope you find my responses of interest... everyone will have their own views on credit, of course. I personally find the teams-and-competition side of crunching the most boring part; having to moderate team threads on the CPDN boards is something I find mind-numbing. Credit to me is something that demonstrates what my personal interests are; it doesn't give me a big kick to compare mine with other people's. I picked my team based on the pretty signatures ... :-) |
LP Send message Joined: 4 Nov 05 Posts: 16 Credit: 177,147 RAC: 0 |
Jose, even though we can't see the owners of anonymous computers, Bakerlab still knows their IPs and in that sense knows who they are and where they are running from. 6- Let's get even more radical: Should we even give credits? If you take away the credits, you take away the fun of competition and a way to show achievement, and many people will quit running R@H and go to other projects. |
Jose Send message Joined: 28 Mar 06 Posts: 820 Credit: 48,297 RAC: 0 |
Believe me, your answers are very useful. "This and no other is the root from which a Tyrant springs; when he first appears he is a protector." (Plato) |
Jose Send message Joined: 28 Mar 06 Posts: 820 Credit: 48,297 RAC: 0 |
I am not being "cute": because the client is there. Also, some people running Rosetta 24/7 are also running other projects on their computers via the BOINC Manager. This is one of the issues I have found the hardest to understand.
So I take it that if those computers are identified, you would be agreeable to a reduction of credits, or am I over-reading you? An anonymous computer can be used to test the amount of true processing given out in a work unit, and hence is a good way to measure credit. It will not make a good guide of how much time a different computer will take to process the work unit, but that is irrelevant. If you have boosted your PC by 50%, then your PC will perform the work unit 50% quicker than the anonymous computer, and will be granted the same amount of credit as the anonymous computer would have been given for that work unit. Hence, it will be getting credit at a 50% greater rate per hour, which is a perfectly acceptable and fair result.
Can you develop the theme of the anonymous reference computer? How would you go about choosing which one to select, especially since some of them may have the "impossible to get" benchmarks ...per work unit of the same type. How does one assign credit to a certain type of work when the target protein is different, making each WU unique? Again, if I am raising questions about your comments it is because I find them valuable and they are helping me a lot in thinking through some issues. Again TY!!!! "This and no other is the root from which a Tyrant springs; when he first appears he is a protector." (Plato) |
Jose Send message Joined: 28 Mar 06 Posts: 820 Credit: 48,297 RAC: 0 |
Agreed. The issue then is finding a credit system that is fair and agreeable to most. "This and no other is the root from which a Tyrant springs; when he first appears he is a protector." (Plato) |
tralala Send message Joined: 8 Apr 06 Posts: 376 Credit: 581,806 RAC: 0 |
It's quite simple, I think: best is fixed credit per WU, like CPDN - together with a sensitive validator this makes cheating very hard. Furthermore, there is no arguing over whether a benchmark is correct or not, since no benchmark is needed. Unfortunately this approach is only problem-free if all WUs are virtually the same, and Rosetta is not one of those projects. Second best is FLOP counting, like SETI. This makes cheating harder, especially together with a quorum. In both cases the focus switches toward inter-project credit calibration, hence the heated debate over at SETI. Third is an internal bench, like the one Dr. Baker implemented in Rosetta (not released though). This makes cheating harder as well; how hard depends on the implementation details. Any of the three methods described above would put an end to the debate over whether using an optimized BOINC client is cheating or not, since it would no longer make a difference. Any of those solutions would level the playing field and reduce the cheating problem (though not eliminate it). I hope they will enable the Rosetta bench as a test soon, so anybody can study what would be different using the internal bench instead of the BOINC bench, and then we can discuss something concrete. |
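The second option above, FLOP counting, replaces the host benchmark with a tally kept by the science application itself. A minimal sketch of the idea, using a made-up calibration constant rather than SETI's real one:

```python
# The science app counts its own floating-point operations while crunching,
# so the credit claim no longer depends on the host's (spoofable) benchmark.
FLOPS_PER_CREDIT = 4.32e11  # hypothetical calibration constant, not SETI's

def claimed_credit(flops_counted):
    """Credit claim derived purely from counted work, not from host speed."""
    return flops_counted / FLOPS_PER_CREDIT

# Two hosts crunching the same work unit count (nearly) the same FLOPs and
# therefore claim the same credit, however fast or slow their clocks run.
example_claim = claimed_credit(8.64e12)
```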
MikeMarsUK Send message Joined: 15 Jan 06 Posts: 121 Credit: 2,637,872 RAC: 0 |
... But that's OK by his argument. If you're running other projects then your motivation for using the special manager is clear and he has no problem with it. What he's asking is ... if you're only running Rosetta, and you're using a manager which is designed to do two things, a) run with multiple projects, and b) enhance benchmark results, then a) has been ruled out, leaving only b). If the manager can do things c), and d), then his argument fails. If you had previously also run multiple projects, and are now only running one, then again, his argument fails, since you had a clear reason for wanting the fancy manager at the time you installed it. (I'm using 'you' in the generic sense, don't take it personally :-) )
My vote would be to set that person's credits to zero (if an egregious abuse of the credit system). But since I am merely a fellow participant, my vote doesn't count :-)
As long as whatever she was doing actually resulted in the science being generated faster, I don't see how that could be a problem. Faster science = more credits = perfectly fair, in my mind. The best way to calculate that is for the credits to reflect the science done, and nothing else.
Select (say) 3 AMD boxes and 3 Intel boxes, of different generations and with different amounts of RAM, and run the same work unit on each of them. The average time to complete would be the starting point for credit. The same boxes are used each time. These anonymous computers would be run by the project staff, and would probably be the first computers to be given any particular algorithm or protein. So they're not actually participants' computers, but representative of the types of computers that we participants use. This procedure would be repeated for each different protein and each algorithm. So protein X, with 300 residues, would be worth a lot more credit than protein Y with 50 residues. And an algorithm with a lot of relax phases would be worth more credit than one with few.
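The reference-fleet calibration described above could be computed along these lines. The machine times and the per-hour constant are hypothetical, chosen only to make the procedure concrete.

```python
# Average the completion times of the six project-run reference boxes to
# set the credit value of one work unit. All figures are invented.
ref_times_hours = [6.2, 7.5, 9.1, 5.8, 8.4, 10.0]  # 3 AMD + 3 Intel boxes
CREDIT_PER_REFERENCE_HOUR = 5.0  # assumed project-chosen constant

avg_time = sum(ref_times_hours) / len(ref_times_hours)
credit_per_wu = avg_time * CREDIT_PER_REFERENCE_HOUR
# Repeat per protein/algorithm combination: a bigger protein or a
# relax-heavy algorithm takes the fleet longer, so its WUs are worth more.
```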
Hope these ideas are interesting... Edit : tralala:
I'm pretty much in agreement with everything Tralala says. Slightly different mechanism for working out the reference credits, but it would be the same in the long run. |
Jose Send message Joined: 28 Mar 06 Posts: 820 Credit: 48,297 RAC: 0 |
First of all: TY TY TY. All of you cannot imagine how helpful you have been. When the time comes you will be able to see why. (I love the mystery :) ) (I'm using 'you' in the generic sense, don't take it personally :-) ) Unless I can find a nuclear-powered computer enhancer, neither I nor the sloth that is my computer will take it personally. Seriously, I appreciate your examples and your explanations. They have been helpful. So let me go back and use what I have learned. I hope they will enable the Rosetta-Bench as a test soon, so anybody can study what would be different with using the internal bench instead of the BOINC-bench and then we can discuss about something concrete. I think this will be coming as soon as the CASP 7 reality is dealt with. "This and no other is the root from which a Tyrant springs; when he first appears he is a protector." (Plato) |
MikeMarsUK Send message Joined: 15 Jan 06 Posts: 121 Credit: 2,637,872 RAC: 0 |
Just as a comparison, here's a system which is clearly returning inflated credits: https://boinc.bakerlab.org/rosetta/results.php?hostid=233165 But is almost certainly not doing it deliberately. Some sort of error during the benchmark is probably the cause (and it doesn't look at all stable judging from the fact that none of the recent WUs have run to completion). |
Dimitris Hatzopoulos Send message Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0 |
Reality check, people: the majority of crunchers on projects like this one are in it for the POINTS and the COMPETITION between different teams... the fact that there's some scientific gain that comes with it is a nice bonus, but believe me, it's not number 1 on the list. This is very true: most TeraFLOPS and consistent participation come from people who care about the POINTS (credits) VERY MUCH. I may not share this point of view, nor care much about points myself, crunching for a project with very TANGIBLE and relatively short-term benefits like Rosetta, but we have to accept reality. Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity |
[AF>HFR>Corsica] DocMaboul Send message Joined: 1 Mar 06 Posts: 3 Credit: 639,939 RAC: 0 |
And I believe that there are some people cheating in all major teams. Think about this: BoincStudio comes with BOINC version 5.5.0. This version can give you more credits, but you have to turn credit correction on. It makes perfect sense to use it on Einstein, SETI or SZTAKI, when someone uses optimized applications. However, why would someone use it if he is crunching 24 hours a day for Rosetta? BoincStudio is at version 0.5.5 alpha, and its modifications to the BOINC core are based on BOINC version 5.4.9. Also, credit correction is now reported with work unit results. And I will add something in the next release to disable credit correction on projects where no optimized apps are known. |
stewjack Send message Joined: 23 Apr 06 Posts: 39 Credit: 95,871 RAC: 0 |
This thread is so long that I don't claim to have read every word, but I have one question that I don't believe has been raised: FUNDING. How will these projects justify their funding if they can't document the computing power they are generating? Isn't that directly, or indirectly, related to accurate points, benchmarking, etc.? Jack |
©2024 University of Washington
https://www.bakerlab.org