Message boards : Number crunching : Report Problems with Rosetta Version 5.22
Author | Message |
---|---|
[B^S] Dr. Bill Skiba Send message Joined: 26 Oct 05 Posts: 5 Credit: 238,426 RAC: 0 |
Just aborted this work unit. https://boinc.bakerlab.org/rosetta/result.php?resultid=25006316 Stuck at 1hr 7min - suspended and resumed several times to no avail. The next Rosetta work unit seems to be running normally. rosetta 5.22 windows 2K athlon xp 2500 barton |
Clare Jarvis Send message Joined: 14 Dec 05 Posts: 8 Credit: 874,698 RAC: 0 |
I have been having similar problems. I cannot leave Rosetta alone or it simply hangs. But if I visit and hit "Update" every day then I get much better production. Is this a problem with Rosetta or with Boinc? It is very frustrating. I wish the statistics page had the start time and date of each run along with the deadline. |
Pepo Send message Joined: 28 Sep 05 Posts: 115 Credit: 101,358 RAC: 0 |
I have occasionally had the problem of stalled/hanging Rosettas (somewhere in the middle, not at 0%, 1% or 100% progress) for ages, on Red Hat EL 4.1. I am now using BCC 5.4.9, attached to 7 projects; Rosetta's share is ~20%. The computer runs for months between reboots, without graphics. The symptoms are that the Rosetta app seems to be running, but the CPU time does not increase. Recently I've noticed that even BCC is not able to run benchmarks when this happens. IIRC, previously if BCC was able to switch to another app, it got 0 CPU cycles (because Rosetta was consuming them all) and did not increment its time. Usually the only way to overcome this problem was to manually restart BCC. This way the Rosettas were able to continue and finish. (Whether correctly, I don't know: I can now see a few (5) "process exited with code 131 (0x83)" messages in the logs since March.) This time, a week ago, I made a few snapshots of the suspended Rosetta 5.22 result t312__CASP7_JUMPRELAX_SAVE_ALL_OUT_BARCODE_hom010__711_1635_0 and reported them in the "Rosetta WU's stall on RedHat Fedora" thread. It is stuck at 28.80% (2:43:29 CPU time), maybe for a day already. I'll try restarting BCC to see whether anything new appears in the files in its slot/3/ dir. And then abort and report, it's now after deadline anyway... Yes, it restarted happily, CPU time jumped from 2:43:29 to 1:43:29 and is incrementing, but progress stayed at 28.80% and does not move. Aborting... <core_client_version>5.4.9</core_client_version> <message> aborted by user </message> <stderr_txt> Graphics are disabled due to configuration... # random seed: 1940641 # cpu_run_time_pref: 21600 SIGSEGV: segmentation violation Stack trace (14 frames): [0x884cb9f] [0x8864cfc] [0x88cade8] [0x8621564] [0x87f229b] [0x873b844] [0x873d0af] [0x85a95e9] [0x85b190a] [0x83d6c9f] [0x86022d3] [0x84740c8] [0x88c41e4] [0x8048111] Exiting... Graphics are disabled due to configuration...
# cpu_run_time_pref: 21600 SIGSEGV: segmentation violation Stack trace (15 frames): [0x884cb9f] [0x8864cfc] [0x88cade8] [0x88e5473] [0x88b6601] [0x88b8029] [0x805fdd8] [0x83d75de] [0x83d90a0] [0x83d8f89] [0x83d72ca] [0x88cb7ef] [0x885bff0] [0x8865f65] [0x88f771a] Exiting... SIGSEGV: segmentation violation Stack trace (14 frames): [0x884cb9f] [0x8864cfc] [0x88cade8] [0x853664c] [0x854a184] [0x830867c] [0x8308fdf] [0x86c4a6a] [0x86c6f15] [0x83d6f08] [0x86022d3] [0x84740c8] [0x88c41e4] [0x8048111] Exiting... Graphics are disabled due to configuration... # cpu_run_time_pref: 21600 ERROR:: Exit at: fragments.cc line:459 FILE_LOCK::unlock(): close failed.: Bad file descriptor </stderr_txt> Peter |
Feet1st Send message Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
But if I visit and hit "Update" every day then I get much better production. Is this a problem with Rosetta or with Boinc? BOINC is responsible for contacting the projects that it needs to get work from. Performing an update wouldn't have much to do with a hung work unit. Are you saying you end up without work? Or are you saying that your existing WUs are not ending properly? Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ |
Feet1st Send message Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
Pepo: I'm not clear how long you observed the running of the WU after restarting it. But the progress % does not change very frequently and this is normal. Here is some relevant information on the subject. Perhaps you are saying you let it run for over an hour with no progress... that would be another matter. But, if not, that portion of what you are describing is probably normal and does not require your intervention to abort. Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ |
Pepo Send message Joined: 28 Sep 05 Posts: 115 Credit: 101,358 RAC: 0 |
Pepo: I'm not clear how long you observed the running of the WU after restarting it. But the progress % does not change very frequently and this is normal. Here is some relevant information on the subject. Perhaps you are saying you let it run for over an hour with no progress... that would be another matter. But, if not, that portion of what you are describing is probably normal and does not require your intervention to abort. Yes, I've read the FAQ. If you look at the "Rosetta WU's stall on RedHat Fedora" thread I mentioned, the Rosetta was hung for at least a day; I could look into the logs to tell exactly. I usually check the machine once every day or two (because of Rosetta :-) and restart Boinc if this happens. And it has been happening for a long time already, I'm pretty sure for a few months. Peter |
Pepo Send message Joined: 28 Sep 05 Posts: 115 Credit: 101,358 RAC: 0 |
Pepo: I'm not clear how long you observed the running of the WU after restarting it. I'm sorry, Feet1st, I did not read carefully enough. I aborted the result 20 minutes after restarting it. Peter |
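
For anyone who wants to check for the symptom described in this thread (the app appears to run but its CPU time stops increasing), here is a minimal Python sketch. It is not part of BOINC or Rosetta; the function names `parse_hms` and `is_stalled` are made up for illustration, and the one-hour window is only an assumption loosely based on the "over an hour with no progress" rule of thumb above. It works on CPU times you note down yourself, for example from the BOINC manager's CPU time column.

```python
def parse_hms(text):
    """Convert a CPU time string such as '2:43:29' into seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def is_stalled(samples, min_window_s=3600.0, min_cpu_gain_s=1.0):
    """Return True if CPU time has not advanced over a long enough window.

    samples: list of (wall_clock_seconds, cpu_seconds) pairs, oldest first.
    min_window_s: how much wall-clock time must pass before judging
                  (hypothetical default of one hour).
    min_cpu_gain_s: CPU gain below this counts as "not increasing".
    """
    if len(samples) < 2:
        return False  # not enough data to judge
    last_wall, last_cpu = samples[-1]
    # Walk backwards to the most recent sample that is old enough.
    for wall, cpu in reversed(samples[:-1]):
        if last_wall - wall >= min_window_s:
            return (last_cpu - cpu) < min_cpu_gain_s
    return False  # observation window still too short


if __name__ == "__main__":
    stuck = parse_hms("2:43:29")
    # Same CPU time noted three times over one hour of wall-clock time.
    print(is_stalled([(0, stuck), (1800, stuck), (3600, stuck)]))
```

Note that a 20-minute observation never counts as a stall with these defaults, and that a CPU time which moves while the progress % stays put is not flagged, since only CPU time is compared.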
©2024 University of Washington
https://www.bakerlab.org