Report Problems with Rosetta Version 5.07

Message boards : Number crunching : Report Problems with Rosetta Version 5.07

DeMatt
Joined: 30 Apr 06
Posts: 2
Credit: 188,295
RAC: 0
Message 15701 - Posted: 9 May 2006, 1:49:54 UTC

Hmmm... I just joined Rosetta@home, after CPDN stopped giving out Mac units... but thus far have had no successful runs (3 failures, 1 just started processing). They've all failed with error code -161... which I think has to do with the fact that my computer isn't dedicated to Rosetta (I run 3 other projects on it) or even ON all the time. I noticed the latest unit failed the instant it (tried to) start up...

My computer: Power Mac G5 Dual Processor @ 2 GHz, running OS X 10.3.9
BOINC Client: Command-line version 5.2.13, set to "use only 1 CPU" and "leave apps in memory"; I just recently changed the timeslice setting from 60 to 120 minutes in the hopes it would help.

Some of the log file text:
Command-line error output:
2006-05-02 16:37:22 [rosetta@home] Unrecoverable error for result AB_CASP6_t241__465_2827_1 (<file_xfer_error>
  <file_name>AB_CASP6_t241__465_2827_1_0</file_name>
  <error_code>-161</error_code>
  <error_message></error_message>
</file_xfer_error>
)
2006-05-05 22:37:25 [rosetta@home] Unrecoverable error for result HBLR_1.0_1b72_RDFLAGS_474_909_0 (<file_xfer_error>
  <file_name>HBLR_1.0_1b72_RDFLAGS_474_909_0_0</file_name>
  <error_code>-161</error_code>
  <error_message></error_message>
</file_xfer_error>
)
2006-05-08 15:37:40 [rosetta@home] Unrecoverable error for result HBLR_1.0_1n0u_RDFLAGS_484_1900_0 (<file_xfer_error>
  <file_name>HBLR_1.0_1n0u_RDFLAGS_484_1900_0_0</file_name>
  <error_code>-161</error_code>
  <error_message></error_message>
</file_xfer_error>
)


From sched_request_boinc.bakerlab.org_rosetta.html:
<result>
    <name>HBLR_1.0_1n0u_RDFLAGS_484_1900_0</name>
    <final_cpu_time>0.950000</final_cpu_time>
    <exit_status>0</exit_status>
    <state>3</state>
    <app_version_num>507</app_version_num>
<stderr_out>
<core_client_version>5.2.13</core_client_version>
<stderr_txt>
# random seed: 3903101
# random seed: 3903101
# random seed: 3903101
# cpu_run_time_pref: 10800
# random seed: 3903101
# random seed: 3903101
Too many restarts with no progress. Keep application in memory while preempted.
WARNING! attempt to gzip file ./aa1n0u.out failed: file does not exist.
# DONE ::     0 starting structures built         0 (nstruct) times
# This process generated      0 decoys from       0 attempts

</stderr_txt>
<message><file_xfer_error>
  <file_name>HBLR_1.0_1n0u_RDFLAGS_484_1900_0_0</file_name>
  <error_code>-161</error_code>
  <error_message></error_message>
</file_xfer_error>

</message>
</stderr_out>
</result>


Should I be looking for logging information somewhere else?
ID: 15701 · Rating: 0
K1100LTSE
Joined: 28 Feb 06
Posts: 7
Credit: 192,387
RAC: 0
Message 15720 - Posted: 9 May 2006, 15:18:08 UTC


ID: 15720 · Rating: 0
David@home
Joined: 7 Oct 05
Posts: 29
Credit: 185,330
RAC: 0
Message 15738 - Posted: 9 May 2006, 23:05:40 UTC

I have a Rosetta 5.07 WU apparently stuck at 1% progress. It has completed two one-hour project swap intervals, and BOINC Manager shows progress at 1.03%.

I will leave it running overnight and check in the morning. Are there any error log files I should look for on my system that may help?


ID: 15738 · Rating: 0
Ian
Joined: 14 Apr 06
Posts: 29
Credit: 361,378
RAC: 763
Message 15742 - Posted: 10 May 2006, 1:27:28 UTC

Had this WU fail on the 7th: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=16118342

Result: https://boinc.bakerlab.org/rosetta/result.php?resultid=19436273

Only just noticed - first error for ages.
Ian Cundell, St Albans, UK
ID: 15742 · Rating: 0
Rhiju
Volunteer moderator
Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 15753 - Posted: 10 May 2006, 5:06:22 UTC - in response to Message 15701.  

I think Moderator9's comments are right on, but I think what's triggering the error is that Rosetta has been preempted 5 times -- and we have a "feature" that kills WUs that have started/stopped several times. I like the idea of increasing the time to 120 (or even 240 minutes, which is what I run on my Mac!). It's a bit puzzling though, because you have "leave apps in memory" set -- it shouldn't matter if it's preempted. So please double-check that your Rosetta@home settings have "leave apps in memory" enabled, and let us know if you continue to get errors.
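
For anyone wanting to check their own stderr output for this, here is a rough sketch in Python. It assumes that each "# random seed" line marks one start of the application, which is a guess based on the output quoted in Message 15701 and not something confirmed here; the function names are made up for illustration.

```python
import re

# stderr text copied from the failed result in Message 15701.
STDERR = """\
# random seed: 3903101
# random seed: 3903101
# random seed: 3903101
# cpu_run_time_pref: 10800
# random seed: 3903101
# random seed: 3903101
Too many restarts with no progress. Keep application in memory while preempted.
WARNING! attempt to gzip file ./aa1n0u.out failed: file does not exist.
"""


def count_starts(stderr_text: str) -> int:
    """Count application starts.

    Assumption: one '# random seed' line is written per start.
    """
    return len(re.findall(r"^# random seed:", stderr_text, flags=re.MULTILINE))


def hit_restart_limit(stderr_text: str) -> bool:
    """Return True if the restart-limit message appears in the output."""
    return "Too many restarts with no progress" in stderr_text


if __name__ == "__main__":
    print("starts seen:", count_starts(STDERR))
    print("restart limit hit:", hit_restart_limit(STDERR))
```

If the count of starts lines up with the restart message, the result was probably killed by the restart check and not by a transfer problem as such.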




ID: 15753 · Rating: 0
hugothehermit
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 15760 - Posted: 10 May 2006, 9:23:35 UTC
Last modified: 10 May 2006, 9:39:13 UTC

I have a WU (Rosetta@home 5.07, Win98SE, BOINC ver 5.22, 1-day WU setting) sitting at 100% completed, and it has been doing that for about 2 days. I've had one other like this: I restarted BOINC and the WU started from the beginning again, which is strange, so I thought I'd let this one go and see what happens. The answer is that it just sits there at 100%.

What would you like me to do with it? If there is some type of Win98 debugger that you could talk me through, I would be happy to do that, or some memory / thread dump that I don't know about in 98. I assume that the watchdog will kill it sometime within about 30 hours or so.

It seems BOINC has lost the plot; the messages are:
10/05/06 10:12:34 AM||Suspending network activity - user request
10/05/06 10:54:59 AM|rosetta@home|Deferring communication with project for 1 days, 22 hours, 59 minutes, and 49 seconds
10/05/06 11:55:01 AM|rosetta@home|Deferring communication with project for 1 days, 21 hours, 59 minutes, and 46 seconds
10/05/06 12:55:07 PM|rosetta@home|Deferring communication with project for 1 days, 20 hours, 59 minutes, and 41 seconds
10/05/06 1:55:08 PM|rosetta@home|Deferring communication with project for 1 days, 19 hours, 59 minutes, and 40 seconds
10/05/06 2:55:08 PM|rosetta@home|Deferring communication with project for 1 days, 18 hours, 59 minutes, and 39 seconds
10/05/06 3:55:12 PM|rosetta@home|Deferring communication with project for 1 days, 17 hours, 59 minutes, and 36 seconds
10/05/06 4:55:14 PM|rosetta@home|Deferring communication with project for 1 days, 16 hours, 59 minutes, and 34 seconds
10/05/06 5:55:18 PM|rosetta@home|Deferring communication with project for 1 days, 15 hours, 59 minutes, and 30 seconds
10/05/06 6:55:22 PM|rosetta@home|Deferring communication with project for 1 days, 14 hours, 59 minutes, and 26 seconds


Which shouldn't happen as Suspending network activity should stop all attempts at network communication.
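
A quick way to check what those messages are doing is to add each stated delay to the time it was logged. The sketch below does that in Python for three of the lines above (the first two and the last). The regular expression and the day/month/year reading of the date are assumptions based on this log excerpt, not a documented BOINC format.

```python
import re
from datetime import datetime, timedelta

# Three of the log lines from the post (first, second and last).
LOG = """\
10/05/06 10:54:59 AM|rosetta@home|Deferring communication with project for 1 days, 22 hours, 59 minutes, and 49 seconds
10/05/06 11:55:01 AM|rosetta@home|Deferring communication with project for 1 days, 21 hours, 59 minutes, and 46 seconds
10/05/06 6:55:22 PM|rosetta@home|Deferring communication with project for 1 days, 14 hours, 59 minutes, and 26 seconds
"""

# Assumed line layout: timestamp|project|message
PATTERN = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{2} \d{1,2}:\d{2}:\d{2} [AP]M)\|[^|]*\|"
    r"Deferring communication with project for "
    r"(?P<d>\d+) days, (?P<h>\d+) hours, (?P<m>\d+) minutes, and (?P<s>\d+) seconds$",
    re.MULTILINE,
)


def deferral_targets(log: str) -> list:
    """Return, for each deferral message, the time it is counting down to."""
    targets = []
    for match in PATTERN.finditer(log):
        # Assumption: dates are day/month/year (the post is dated 10 May 2006).
        logged = datetime.strptime(match["stamp"], "%d/%m/%y %I:%M:%S %p")
        delay = timedelta(
            days=int(match["d"]),
            hours=int(match["h"]),
            minutes=int(match["m"]),
            seconds=int(match["s"]),
        )
        targets.append(logged + delay)
    return targets


if __name__ == "__main__":
    for target in deferral_targets(LOG):
        print(target)
```

For these three lines the computed times fall within a second of each other, which would fit an hourly countdown to one already-scheduled contact and not fresh network attempts. That is only one reading of the log, though.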

I doubt that I have enough (watchdog) time left to give it access to the Internet and see what happens :?

edited to add: and some spelling and stuff
I can't see the graphics (I know, I tried) as it's run via the (Win 98SE) DOS command line

Can a mod get rid of the graphic(s) that is making this so wide?
ID: 15760 · Rating: 0
duanra
Joined: 12 Feb 06
Posts: 8
Credit: 36,223
RAC: 0
Message 15769 - Posted: 10 May 2006, 12:47:35 UTC

Hello!
I'm using Rosetta@home v. 5.07 on Windows XP with an ATI Mobility Radeon graphics card. Each time I open the Rosetta screensaver to look at the graphics, it stops after a couple of minutes, my screen goes black, and then it reopens again. I have to quickly close the screensaver window or this keeps repeating.
Conclusion: I cannot view the graphics without my screen crashing.

(sorry for my poor English)
Duanra
ID: 15769 · Rating: 0
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 15772 - Posted: 10 May 2006, 12:56:30 UTC

Click on the "work/tasks" tab, highlight the running Rosetta WU, then click the "show graphics" box. Does the graphic fail there also?
ID: 15772 · Rating: 0
DeMatt
Joined: 30 Apr 06
Posts: 2
Credit: 188,295
RAC: 0
Message 16267 - Posted: 14 May 2006, 18:15:56 UTC - in response to Message 15753.  

Well... the problem is, this isn't a "cruncher" computer; it's a "home" computer, and as such, is typically turned on for less than 6 hours a day, in two or three time blocks. Hence, "leave apps in memory" doesn't help that much, as the computer's memory is getting blanked two or three times daily.

The good news is that I have since had a unit (a JUMP_ALLBARCODES_ANTIPARALLEL unit, 16268260) complete successfully under 5.07, and 5.12 and 5.13 have since been released. I've had another JUMP_ALLBARCODES unit complete successfully under 5.12... I'll have to see if 5.13 can get along with the all-too-frequent closings.



ID: 16267 · Rating: 0
Jon C Melusky
Joined: 29 Nov 05
Posts: 12
Credit: 208,931
RAC: 1,214
Message 16489 - Posted: 17 May 2006, 22:19:07 UTC - in response to Message 15681.  

Well, all I know is that Rosetta worked perfectly from 29 Nov 2005 to early April 2006 with only 384 MB of RAM, so I don't know why it used to work so well below the basic requirements. Was the requirement 512 MB of RAM back in Nov 2005? Should I not have been allowed to attach to Rosetta with 384 MB of RAM? Should I try Ralph with 384 MB of RAM?

Please advise.
Jonathan


Do try joining Ralph.
There are computers there with less than 512 MB of memory.
Anders n


Hi Anders,

Thank you for the note. I joined Ralph and it worked for one day and then stopped. I think it stopped because another thread advised people to change their settings for RALPH but leave their settings for other projects alone. But Rosetta is running perfectly again. Yay, the 7 days dead in the water were worrying, but now it is back on track with nothing done on my part. I am starting to suspect that my ISP runs late-night things on its servers to disrupt BOINC and other P2P programs. I get error messages late at night only from 3am to 5am or I see them in the morning also. I am thinking that I have to shut down BOINC at night and restart it in the morning. A long-term solution is maybe to go to a better ISP and see if things improve. I am glad my 4 projects run fine with 384 MB of RAM. Anyway, thanks for taking the time to read this far. (^:

Jonathan
ID: 16489 · Rating: 0
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 16490 - Posted: 17 May 2006, 22:30:23 UTC - in response to Message 16489.  

I get error messages late at night only from 3am to 5am or I see them in the morning also. I am thinking that I have to shut down BOINC at night and restart it in the morning.

You might also set up BOINC to not use the network during those hours of the day. This is in the General Preferences. You can set the "Use network only between the hours of" to something like 0600 - 0200. Then in your BOINC Manager, under the commands menu (called the "activity" menu in the latest BOINC version), select the option for "network activity based on preferences". BOINC will then just wait until the allowed time of day and do its uploads and downloads during those hours.
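
Note that a window like 0600 - 0200 wraps past midnight, so "allowed" means after 06:00 or before 02:00. Here is a minimal sketch of that check in Python, purely to illustrate the logic. The function name and the treatment of equal start and end hours are assumptions, not BOINC's actual code.

```python
def network_allowed(hour: float, start: float = 6.0, end: float = 2.0) -> bool:
    """Return True if `hour` (0 <= hour < 24) falls inside the allowed window.

    The window runs from `start` up to, but not including, `end`.
    If start > end, the window is taken to wrap past midnight.
    """
    if start == end:
        # Assumption: identical hours mean "no restriction".
        return True
    if start < end:
        # Ordinary window within a single day, e.g. 0800 - 1700.
        return start <= hour < end
    # Wrapping window, e.g. 0600 - 0200: allowed late in the day
    # or in the early hours before `end`.
    return hour >= start or hour < end


if __name__ == "__main__":
    for h in (1.5, 3.0, 5.5, 6.0, 23.0):
        print(h, network_allowed(h))
```

With the 0600 - 0200 example, the 3am to 5am stretch falls outside the window, so transfers would be held back until 06:00.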
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 16490 · Rating: 0
Jon C Melusky
Joined: 29 Nov 05
Posts: 12
Credit: 208,931
RAC: 1,214
Message 16520 - Posted: 18 May 2006, 6:49:16 UTC - in response to Message 16490.  


You might also set up BOINC to not use the network during those hours of the day. This is in the General Preferences. You can set the "Use network only between the hours of" to something like 0600 - 0200.


Thank you! I will try it. (^:

Jonathan
ID: 16520 · Rating: 0



©2025 University of Washington
https://www.bakerlab.org