Message boards : Number crunching : WUs freeze !!! computer 117981
Author | Message |
---|---|
Carlos_Pfitzner Joined: 22 Dec 05 Posts: 71 Credit: 138,867 RAC: 0 |
I have a problem: some WUs running on the subject computer freeze! This is Rosetta 4.80. Initially I just killed BOINC and, once ps xu showed the processes were gone, restarted BOINC. Note that Rosetta is the only project running on this computer, and top shows 0.0% CPU use by rosetta. Killing and restarting BOINC only makes me lose more time doing nothing, because the freeze occurs again. Today I lost more than 4 hours with the CPU idle! Finally I discovered that after aborting that WU via remote GUI RPC, the next WU crunches normally. This is a big problem for my *unmonitored* server: with these freezes I can end up with a week of idle CPU. Otherwise, to monitor for freezes and abort the offending WUs, I have to pay for a very costly dial-up connection. Please, can't these WUs auto-abort? See an example of a returned result: Result ID: 10549605; Name: BARCODE_30_1ubi__299_25012_0; Workunit: 8520285; Created: 10 Feb 2006 11:13:57 UTC; Sent: 10 Feb 2006 12:50:57 UTC; Received: 10 Feb 2006 22:11:01 UTC; Server state: Over; Outcome: Client error; Client state: Computing; Exit status: -197 (0xffffff3b); Computer ID: 117981; Report deadline: 17 Feb 2006 12:50:57 UTC; CPU time: 551.93; stderr out: <core_client_version>5.2.14</core_client_version> <message>aborted by user </message> <stderr_txt> </stderr_txt>; Validate state: Invalid; Claimed credit: 1.64110392447523; Granted credit: 0; application version: 4.80 |
milw0rm Joined: 10 Dec 05 Posts: 22 Credit: 6,212,738 RAC: 0 |
I utterly agree with this. I have had so many different units spend many hours doing nothing because the Rosetta client does not auto-cancel a unit that is broken or that runs far too long compared with the average processing time. So I have to check my computers' output stats page every so often, find out which machine is not submitting, then go to that machine, abort the unit, and get it to download a new one, all thanks to a small oversight in the Rosetta programmers' thought process: "Surely nothing could go wrong, we don't need this." Unfortunately, it does! :( Please fix :D |
Moderator9 Volunteer moderator Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0 |
|
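
The manual workaround described in this thread (noticing that a task's CPU time has stopped advancing, then aborting it) could be automated along these lines. This is only a minimal sketch under stated assumptions, not anything BOINC or Rosetta provides: the function name `find_stalled`, the snapshot format (task name mapped to CPU seconds), and the `min_progress` threshold are all hypothetical. How the CPU times are collected and how the abort is issued (for example through the GUI RPC mentioned above) is left out.

```python
from typing import Dict, List


def find_stalled(
    before: Dict[str, float],
    after: Dict[str, float],
    min_progress: float = 1.0,
) -> List[str]:
    """Return names of tasks whose CPU time barely advanced between snapshots.

    `before` and `after` map a task name to its accumulated CPU seconds,
    sampled some wall-clock interval apart (hypothetical format).
    Caveat: a suspended or preempted task also shows no progress, so a
    real watchdog would need to check the task is supposed to be running.
    """
    stalled = []
    for name, cpu_after in after.items():
        if name not in before:
            continue  # task started after the first snapshot: no baseline yet
        if cpu_after - before[name] < min_progress:
            stalled.append(name)
    return sorted(stalled)


if __name__ == "__main__":
    # Two snapshots taken, say, 30 minutes apart (illustrative values).
    earlier = {"BARCODE_30_1ubi__299_25012_0": 551.93, "some_other_task": 100.0}
    later = {"BARCODE_30_1ubi__299_25012_0": 551.93, "some_other_task": 1800.0}
    for task in find_stalled(earlier, later):
        print("candidate for abort:", task)
```

Comparing two snapshots rather than reading instantaneous CPU usage avoids flagging a task that is merely between bursts of work; the sampling interval and threshold would need tuning for the machine in question.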
©2024 University of Washington
https://www.bakerlab.org