Message boards : Number crunching : Result was reported too late to validate ????????????
Darren · Joined: 6 Oct 05 · Posts: 27 · Credit: 43,535 · RAC: 0
And what does the rest of that very paragraph say? Again, only half credit here. BOINC is designed to operate a certain way; it is Rosetta that does not conform to that standard. Now, I fully hope that BOINC is successfully modified to support everything Rosetta wants (and everything any other project that wants to use BOINC wants), but this adamant "it's all BOINC and no Rosetta" assignment of blame is really astounding. As it is Rosetta that is deviating from the BOINC norm, Rosetta does have some obligation to address the issue.

I never said they can't or shouldn't deviate from the BOINC norm, just that they are deviating from the norm. But they shouldn't simply discard functionality because they don't like the process that generates that functionality. If they don't like the process, they should implement a local alternative, not simply ignore it. What I said and what you're implying I said are two very different things.
Moderator9 (Volunteer moderator) · Joined: 22 Jan 06 · Posts: 1014 · Credit: 0 · RAC: 0
My apologies if I misunderstood your original statement, but I took it the same way "Scribe" took it. I think in large measure we actually agree on a lot of this.

The real issue is that while some may not like the approach that Rosetta has taken, Rosetta has not simply abandoned redundancy and done nothing about the implications of that decision. They have been working with the BOINC team to fix the problem through a standard implementation of a fair credit system. Frankly, that may produce a faster fix for the issue than if they had tried to write something in-house. And in any case, that approach will benefit ALL BOINC projects.

Nor has the problem itself been ignored. Rosetta has removed suspect credit claims when they have come up and can be proven to be false. But the idea that they will somehow create some sort of fair "leveling" process and remove credit from people on a wide scale would be more repugnant to many than the present situation. Imagine the outcry from people who have credits reduced through such a process. The project would spend massive resources just convincing everyone the process was fair.

Moreover, this is not as trivial a process as you imply. There would have to be testing, benchmarking, and code preparation, and the increased load on the servers would impact the operation of the project. Not to mention the diversion of time to prepare all this, time that really should be devoted to fixing run-time problems in the system. Those problems probably cost people more in credits than any kind of cheating of the system at this point in time.

The problem is also not as epidemic as is often implied by people who raise this issue in the forums. There are over 40,000 users attached to the project; fewer than 100 individual people have raised this issue, and some of those raise it often and loudly. When it is raised, the project looks at the problem and takes action if appropriate, but the forums give a false impression of the actual size of the problem. If people are cheating, it stands to reason that they would appear among the top individual systems. But even that is not easy to say, when a high RAC can be created by infrequent result uploads, and high credit claims can be made by identical systems that, for a number of reasons, have different but legitimate benchmarks. Not to mention that there is so much variation in work units that it is almost impossible to determine in advance how much time it will take to process each one. Even now, with the time setting, work unit run times can vary by as much as 300%. In fact, even two systems running the same work unit with the same time setting may not produce the same number of models in their results, and each of those models may have some variation in steps and length. So even some variation of the "trickle" concept used by CPDN would not work.

The answer to this problem is the proposed flops-counting system being incorporated into BOINC right now, not some arbitrary credit value assigned to each work unit and imposed by a single project. That would be no different than simply saying one work unit equals one credit. The only way to get the flops system in place is to work with the BOINC team to do it, and the only way to get that done is to demonstrate a real-world need on one of the projects to have it. That is precisely what Rosetta has done, and is doing.

As to the rest, I can understand how EMT work might impact your view of the nature of this type of research. By the way, thank you for your service in that regard. But the fact is that everyone will at some point in their life be confronted with disease. Most if not all of those diseases, or the treatments for them, can be traced to the protein level. Even the natural aging process is protein-related. So in fact, until people stop dying, this type of research DOES impact everyone. This type of research may just prepare us for the next pandemic. That is not melodrama, it is just fact.

Moderator9
ROSETTA@home FAQ · Moderator Contact
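For context on the mechanics at issue in this exchange: under the benchmark-based credit scheme BOINC used at the time (as commonly documented), a host's credit claim scales directly with its Whetstone and Dhrystone benchmarks, which is why a corrupted or inflated benchmark inflates claims even for identical work. A minimal sketch, with the constants following the usual cobblestone definition and the sample benchmark figures assumed purely for illustration:

```python
# Benchmark-based ("claimed") credit, following the usual cobblestone
# definition: 100 credits per day of CPU time on a reference machine that
# benchmarks 1,000 MFLOPS (Whetstone) and 1,000 MIPS (Dhrystone).
# Treat the exact constants as an assumption, not an official specification.

SECONDS_PER_DAY = 86_400.0
REFERENCE_OPS = 1_000.0   # reference benchmark, in million ops/sec
CREDIT_PER_DAY = 100.0    # credit earned per day by the reference machine

def claimed_credit(cpu_seconds, whetstone, dhrystone):
    """Credit claim for one result; benchmarks are in million ops/sec,
    exactly as shown on a host's details page."""
    machine_factor = (whetstone + dhrystone) / (2 * REFERENCE_OPS)
    return CREDIT_PER_DAY * machine_factor * cpu_seconds / SECONDS_PER_DAY

# Two hosts crunching the same result for the same four CPU-hours can claim
# very different credit purely because of their benchmarks (figures assumed):
print(claimed_credit(4 * 3600, 1500.0, 3000.0))  # ~37.5 credits claimed
print(claimed_credit(4 * 3600, 2200.0, 4400.0))  # ~55.0 credits claimed
```

Without redundancy there is no second claim to compare against, so whatever the formula produces from the reported benchmarks is what gets granted; that is the gap both sides of this discussion are circling.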
Darren · Joined: 6 Oct 05 · Posts: 27 · Credit: 43,535 · RAC: 0
Moderator9 said - "My apologies if I misunderstood your original statement, but I took it the same way 'Scribe' took it."

And I apologize for not being clear enough. I thought I had made it very clear that I don't like the way BOINC addresses the issue (by implementing redundancy) either, but that it does serve a purpose, so an alternative should be used by any project that chooses not to use redundancy. I even said that I thought projects should not use redundancy if it is not scientifically necessary, so I am somewhat baffled how anyone could interpret what I said to mean projects should do exactly as BOINC/SETI does and are somehow "violating" some concept if they don't.

Moderator9 said - "But the idea that they will somehow create some sort of fair 'leveling' process and remove credit from people on a wide scale would be more repugnant to many than the present situation. Imagine the outcry from people who have credits reduced through such a process."

I think any outcry would be surprisingly small. Any Rosetta participant who also participates in any other project is already fully accustomed to the fact that the amount of credit they ask for and the amount of credit they get can be different. People have complained about the benchmark variations long and loud, all the way back to when BOINC was still in beta. If Rosetta implemented a "leveling" process that corrected totally out-of-line benchmarks, rather than the normal throw-out-the-high-and-low-claim leveling process other projects now use, I think Rosetta would be considered a hero, not a villain.

Moderator9 said - "Moreover, this is not as trivial a process as you imply. There would have to be testing, benchmarking, and code preparation, and the increased load on the servers would impact the operation of the project."

My original suggestion of averaging is not so complex. It would require running against the database, which is why I suggested it be run once per day or so. If, for example, a script extracted the benchmarks of all the P4 2.8 GHz hosts and then averaged those results, you would have a starting point. The project then determines how much a benchmark can reasonably and legitimately vary from the average of all similar systems, and applies that as the maximum and minimum for that CPU. This would have to be done for each type and speed of CPU, but it would only have to be done once. One script (OK, it has to be written, but it's not a complex script) running one time could get all the averages and determine the high and low tolerance numbers. Those numbers are then used by another quite simple script that runs at project-defined intervals and looks for benchmarks that are out of range. If it finds any, too high or too low, it recalculates the credit since its last run using the project-defined maximum or minimum benchmark for that CPU. For that matter, granting of credit could even be held until after the script has OK'd the benchmarks; it's not as if people aren't already used to waiting days for credit to be issued on projects that use redundancy. Of course, any process that defines an acceptable range doesn't totally eliminate the ability to manipulate, but it does cap just how far any manipulation can take you.
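A minimal sketch of the daily leveling pass described above, assuming hosts are represented as records carrying a CPU-model label and the two benchmark scores. All names and the tolerance value here are hypothetical; a real BOINC database schema differs:

```python
# Minimal sketch of the proposed daily "leveling" pass. Table/field names
# and the 30% tolerance are assumptions for illustration only.
from statistics import mean

TOLERANCE = 0.30  # assumed: benchmarks may legitimately vary +/-30% from average

def build_benchmark_caps(hosts):
    """hosts: iterable of dicts with 'cpu_model', 'whetstone', 'dhrystone'.
    Returns {cpu_model: (low, high)} caps on the combined benchmark,
    derived from the average over all hosts reporting that CPU model."""
    by_model = {}
    for h in hosts:
        by_model.setdefault(h["cpu_model"], []).append(
            h["whetstone"] + h["dhrystone"])
    caps = {}
    for model, scores in by_model.items():
        avg = mean(scores)
        caps[model] = (avg * (1 - TOLERANCE), avg * (1 + TOLERANCE))
    return caps

def leveled_benchmark(host, caps):
    """Clamp a host's combined benchmark into its model's accepted range;
    credit since the last run would be recomputed from this value."""
    low, high = caps[host["cpu_model"]]
    return min(max(host["whetstone"] + host["dhrystone"], low), high)

# Toy data: two plausible P4 2.8 GHz hosts and one wild outlier (all assumed).
hosts = [
    {"cpu_model": "P4 2.80GHz", "whetstone": 1500.0, "dhrystone": 3000.0},
    {"cpu_model": "P4 2.80GHz", "whetstone": 1400.0, "dhrystone": 3200.0},
    {"cpu_model": "P4 2.80GHz", "whetstone": 6346.35, "dhrystone": 13034.85},
]
caps = build_benchmark_caps(hosts)
print(leveled_benchmark(hosts[2], caps))  # outlier clamped to the model's cap
```

One design caveat with a plain average, as the toy data shows: a large outlier drags the average, and therefore the cap, upward. A robust statistic such as the median would resist that; whether that refinement is needed is a judgment call beyond what the post itself specifies.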
Moderator9 said - "There are over 40,000 users attached to the project; fewer than 100 individual people have raised this issue, and some of those raise it often and loudly. When it is raised, the project looks at the problem and takes action if appropriate, but the forums give a false impression of the actual size of the problem. If people are cheating, it stands to reason that they would appear among the top individual systems. But even that is not easy to say, when a high RAC can be created by infrequent result uploads, and high credit claims can be made by identical systems that, for a number of reasons, have different but legitimate benchmarks."

I learned long ago not to form opinions based on forum rants. The only reason I joined this discussion is because I can look at the list of top computers myself and see for myself what I'm talking about. At the time that I'm writing this, the top computer has the following details:

CPU type: GenuineIntel Intel(R) Pentium(R) 4 CPU 2.80GHz
Number of CPUs: 2
Operating System: Microsoft Windows XP Professional Edition, Service Pack 2 (05.01.2600.00)
Memory: 502.98 MB
Cache: 976.56 KB
Measured floating point speed: 6346.35 million ops/sec
Measured integer speed: 13034.85 million ops/sec

Now, as you know, I'm no computer expert, but even I know that there is no legitimate way the benchmarking method BOINC uses could produce those benchmarks on a P4 2.8 GHz system. Not that I'm saying this person did anything to intentionally cheat (BOINC itself, with no user intervention, has done weirder things all by itself before), but the benchmarks are still clearly wrong. If a script ran every day or so and adjusted those to an acceptable maximum, they might still be wrong, but not as wrong. Who could complain about that? The legitimate user with totally out-of-whack benchmarks will not care that they were corrected. I see no way anyone could complain about a fair crediting system unless they're not playing fair themselves.

Moderator9 said - "The answer to this problem is the proposed flops-counting system being incorporated into BOINC right now, not some arbitrary credit value assigned to each work unit and imposed by a single project. That would be no different than simply saying one work unit equals one credit. The only way to get the flops system in place is to work with the BOINC team to do it, and the only way to get that done is to demonstrate a real-world need on one of the projects to have it. That is precisely what Rosetta has done, and is doing."

I give Rosetta full credit on that count. However, I also know from being a SETI beta tester that there are some current flaws with that concept that are just as big as the existing flaws. Granted, it's still not ready for general release, so there is some time to work some of that out. As has been pointed out over there, though, it doesn't seem to really fix the problem; it just buries it a little deeper and makes it even harder to find.

Anyway, I think all of my views here are pretty well known. Not that anyone listens or cares, but at least I feel better having screamed about it for a while. So, I'll just slip back into obscurity now.
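To put rough numbers on why those benchmarks stand out, one can plug them into the benchmark-based formula sketched earlier. The "typical" P4 figures and the 30% tolerance below are assumptions for illustration only, not measured averages:

```python
# Rough check on the quoted host's claim rate. The "typical" P4 figures and
# the 30% tolerance are illustrative assumptions, not measured values.
def credit_per_day(whetstone, dhrystone):
    # benchmark-based claim, in credits per CPU-day (see the earlier sketch)
    return 100.0 * (whetstone + dhrystone) / 2000.0

print(credit_per_day(6346.35, 13034.85))           # quoted host: ~969 credits/CPU-day
print(credit_per_day(1500.0, 3000.0))              # assumed typical P4 2.8 GHz: ~225
print(credit_per_day(1500.0 * 1.3, 3000.0 * 1.3))  # with a +30% cap: ~292
```

Under those assumptions the quoted host claims at roughly four times the plausible rate for its CPU, and a daily clamp of the kind proposed above would bound the damage even if it never identified the cause.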
Moderator9 (Volunteer moderator) · Joined: 22 Jan 06 · Posts: 1014 · Credit: 0 · RAC: 0
Darren said - "...Anyway, I think all of my views here are pretty well known. Not that anyone listens or cares, but at least I feel better having screamed about it for a while. So, I'll just slip back into obscurity now."

I am afraid we might have to disagree on this point. Your discussion could NOT be characterized as screaming, and I am listening and do care. While full agreement is not always possible, open discussion and compromise are always welcome. I have just sent an e-mail to Dr. Baker, based on our discussion, suggesting some interim ideas for credits that may help. You have presented an intelligent argument for your position, and done so in a professional way. I for one have enjoyed the discussion, and hope to see you post again freely on a range of topics.

Moderator9
ROSETTA@home FAQ · Moderator Contact
Scribe · Joined: 2 Nov 05 · Posts: 284 · Credit: 157,359 · RAC: 0
I have to agree with Darren here... those who shout the loudest often have something to hide. Others like me would not object to the "cheats" being levelled off!
Los Alcoholicos~La Muis · Joined: 4 Nov 05 · Posts: 34 · Credit: 1,041,724 · RAC: 0
Darren said - "My original suggestion of averaging is not so complex. It would require running against the database, which is why I suggested it be run once per day or so. If, for example, a script extracted the benchmarks of all the P4 2.8 GHz hosts and then averaged those results, you would have a starting point."

Unfortunately, BOINC isn't able to identify the right processor and processor speed. As long as that is the case, such a leveling will unleash a lot of indignation and discussion. In my case, BOINC misreports an Intel Celeron Tualatin 1.4 GHz as a GenuineIntel Pentium(r) II processor. A PowerPC G4 2 GHz upgrade isn't recognized, and it is still reported as a PowerPC3,1. And I know of overclocked processors which run more than 40% over stock speed without BOINC being able to report the right speed. Although I would feel a lot more comfortable with a better benchmark/crediting system, the proposed "leveling" system for sure isn't the solution, because of the false starting point.
Moderator9 (Volunteer moderator) · Joined: 22 Jan 06 · Posts: 1014 · Credit: 0 · RAC: 0
You may be right. I am just remembering the batch of bad WUs that went out in December, and how much furor that caused. When you look at the list of credit awards from that, most of the people who demanded something be done were actually only talking about 2-5 credits total. Meanwhile, people with a Max time problem were losing thousands. We will soon know how people react to this sort of large-scale credit adjustment when the Max time credits are finally awarded. But I guess my point is, if people can get that excited over 5 credits, imagine how they would react if the project lopped a few thousand off the top of the credit claims every few weeks.

Also, the basis for the calculations may not be pure. The information provided by BOINC is not always correct. One of the top computers here is a dual-core Mac G5. While his credit claims and RAC seem outrageously high, and his benchmarks do as well, they are in fact legitimate for that system. It has been the subject of repeated reports, and BOINC just does not know how to deal with it correctly. When dual-core G5 Macs are independently benchmarked, that is how they look.

In any case, these issues have been discussed a lot. Clearly the problem is not simple, and the solution will not be either.

Moderator9
ROSETTA@home FAQ · Moderator Contact
Los Alcoholicos~La Muis · Joined: 4 Nov 05 · Posts: 34 · Credit: 1,041,724 · RAC: 0
It's not only BOINC that doesn't know how to deal with it... I have some Macs and I don't either. When I use the standard BOINC client, the benchmark of my dual G5 2 GHz is a little more than half that of my Sempron 2600MHz (and that seems quite ridiculous to me). When I use the AltiVec-optimized client, it scores 4-5 times that. Am I cheating? I don't know... there are DC projects (e.g. OGR-25) where my Mac outperforms a P4 3.2 GHz by a factor of four. Does R@h use the capabilities of the G5? But as I stated before somewhere in another thread, I don't care much about credits (I think the competition is fun, but that isn't the reason why I join DC projects), and I would gladly give up my gathered credits for a better and fairer benchmark system.