Problems and Technical Issues with Rosetta@home

[VENETO] boboviz
Joined: 1 Dec 05
Posts: 2001
Credit: 9,780,807
RAC: 8,163
Message 108872 - Posted: 25 Feb 2024, 10:25:05 UTC - in response to Message 108870.  

Seems like 5 million tasks have become available to run!
As of 24 Feb 2024, 8:02:19 UTC [ Scheduler running ]
Total queued jobs: 4,921,248
In progress: 115,690
Successes last 24h: 43,490
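
For a sense of scale, here is a rough back-of-the-envelope sketch using only the two figures quoted above. It assumes the completion rate stays constant and that no new jobs are added to the queue, neither of which is likely to hold exactly, so treat the result as an order-of-magnitude estimate only.

```python
# Figures copied from the server status quoted above.
queued_jobs = 4_921_248
successes_per_day = 43_490

# Assumption: the completion rate stays constant and no new jobs are queued.
days_to_clear = queued_jobs / successes_per_day

print(f"Roughly {days_to_clear:.0f} days to clear the queue at the current rate")
```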


Our CPUs are ready!!
Let's do science!

Bill F
Joined: 29 Jan 08
Posts: 48
Credit: 1,612,566
RAC: 1,661
Message 108873 - Posted: 25 Feb 2024, 12:50:29 UTC

And where did they go? They are no longer showing as available, and they are not in progress.
In October 1969 I took an oath to support and defend the Constitution of the United States against all enemies, foreign and domestic;
There was no expiration date.

kotenok2000
Joined: 22 Feb 11
Posts: 271
Credit: 507,897
RAC: 496
Message 108874 - Posted: 25 Feb 2024, 12:52:37 UTC - in response to Message 108873.  

I see 4,916 Rosetta Beta tasks available right now.

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1720
Credit: 18,351,686
RAC: 24,923
Message 108875 - Posted: 26 Feb 2024, 6:14:37 UTC - in response to Message 108873.  

And where did they go? They are no longer showing as available, and they are not in progress.
?

From the Rosetta home page.
Server Status
As of 26 Feb 2024, 4:01:13 UTC [ Scheduler running ]
Total queued jobs: 4,276,875

Your i7-12700F has several tasks in progress and several completed, but you might want to sort that system out: taking 14 hours to do 10 hours of work isn't a good sign.
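
To put a number on that, here is a minimal sketch. It assumes "10 hours of work" means CPU time and "14 hours" means elapsed (wall-clock) time for the same task; the function name is just for illustration.

```python
def cpu_efficiency(cpu_hours: float, elapsed_hours: float) -> float:
    """Fraction of elapsed time the task actually spent on the CPU."""
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    return cpu_hours / elapsed_hours


# Assumed reading of the figures in the post: 10 h CPU time over 14 h elapsed.
efficiency = cpu_efficiency(10, 14)
print(f"CPU efficiency: {efficiency:.0%}")
```

A healthy system that isn't overcommitted would normally sit much closer to 100%.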
Grant
Darwin NT

kotenok2000
Joined: 22 Feb 11
Posts: 271
Credit: 507,897
RAC: 496
Message 108904 - Posted: 4 Mar 2024, 17:34:37 UTC

Vbox jobs still don't use multiattaching.

kotenok2000
Joined: 22 Feb 11
Posts: 271
Credit: 507,897
RAC: 496
Message 108905 - Posted: 4 Mar 2024, 19:22:59 UTC

And I see these warnings in the output

[VENETO] boboviz
Joined: 1 Dec 05
Posts: 2001
Credit: 9,780,807
RAC: 8,163
Message 108907 - Posted: 5 Mar 2024, 13:34:45 UTC - in response to Message 108904.  

Vbox jobs still don't use multiattaching.


It's a lost cause....

Raj
Joined: 5 Dec 05
Posts: 7
Credit: 519,043
RAC: 182
Message 108913 - Posted: 6 Mar 2024, 19:46:17 UTC

Hi, I've been running Rosetta for years now, but I've noticed over the last couple of weeks that I keep getting errors saying that Rosetta has crashed. Looking at my past history, it appears I haven't successfully completed a task since 29th Feb. It seems that at that point, I was moved from Rosetta 4.20 to Beta v6.04. Is there a way to revert back to 4.20? I've suspended Rosetta for the moment.

If it matters, I don't use VBox with BOINC, and I'm using BOINC Manager 7.4.2.1 with an Nvidia GeForce GT 1030 graphics card and an old AMD Phenom 1055T CPU.

Thanks, Raj.

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1720
Credit: 18,351,686
RAC: 24,923
Message 108920 - Posted: 7 Mar 2024, 5:53:31 UTC - in response to Message 108913.  

It seems that at that point, I was moved from Rosetta 4.20 to Beta v6.04. Is there a way to revert back to 4.20? I've suspended Rosetta for the moment.
Whatever the problem is, it's with your system.
There are no issues with the present Beta tasks; you're the only one who has posted about them erroring out.

You could try whitelisting your BOINC data folders in your AV programme. If that doesn't help, then resetting Rosetta may sort it out (it basically clears out all of the application and support files, and re-downloads them from scratch).

The errors the tasks are having can be due to faulty system memory. However, the Beta tasks use way (way!) less RAM than the Rosetta 4.20 ones do, so the cause is more likely corrupted data, application, or support files, or a corrupted database, than a physical RAM issue.
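
The reset can be done from the BOINC Manager, but for anyone who prefers the command line, here is a minimal sketch that builds the equivalent `boinccmd` call. The project URL is an assumption based on the Rosetta@home site address, so check it against what your client shows for the project; the sketch only prints the command rather than running it.

```python
import shlex

# Assumption: this is the project URL your BOINC client has Rosetta attached under.
ROSETTA_URL = "https://boinc.bakerlab.org/rosetta/"


def build_reset_command(project_url: str = ROSETTA_URL) -> list[str]:
    """Build the boinccmd argument list that resets one project."""
    return ["boinccmd", "--project", project_url, "reset"]


command = build_reset_command()
print(shlex.join(command))
# To actually run it (with the BOINC client running):
#   import subprocess; subprocess.run(command, check=True)
```

Keep in mind that a reset discards any tasks currently in progress for that project.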
Grant
Darwin NT

Raj
Joined: 5 Dec 05
Posts: 7
Credit: 519,043
RAC: 182
Message 108923 - Posted: 7 Mar 2024, 18:51:02 UTC - in response to Message 108920.  

Thanks. I've excluded the dataprojects directory from the AV, reset the project, and restarted it. I'll keep an eye on it to see if a task completes, and I'll report back when it does.

Regards, Raj.

Raj
Joined: 5 Dec 05
Posts: 7
Credit: 519,043
RAC: 182
Message 108924 - Posted: 7 Mar 2024, 19:48:28 UTC - in response to Message 108923.  

I tried again after doing a reset and got computation errors for each task after under a minute. Is there anything else I can do to debug?

Bryn Mawr
Joined: 26 Dec 18
Posts: 398
Credit: 12,294,748
RAC: 9,249
Message 108928 - Posted: 8 Mar 2024, 2:55:15 UTC - in response to Message 108924.  

I tried again after doing a reset and got computation errors for each task after under a minute. Is there anything else I can do to debug?


You are running a very old version of the Boinc Manager. Try updating to a more current version (7.16.n or 7.20.n).

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1720
Credit: 18,351,686
RAC: 24,923
Message 108931 - Posted: 8 Mar 2024, 5:17:09 UTC - in response to Message 108928.  
Last modified: 8 Mar 2024, 5:21:52 UTC

I tried again after doing a reset and got computation errors for each task after under a minute. Is there anything else I can do to debug?


You are running a very old version of the Boinc Manager. Try updating to a more current version (7.16.n or 7.20.n).
I can't see that having any effect. The BOINC Manager does just that: it manages the science applications. It's the science applications that do the work, and they are what is crashing.

And if resetting the project and excluding the data folders from the AV programme haven't sorted it, I would try the long shot of running a memory test, just to make sure it's not some sort of memory issue (although, as I said before, Rosetta 4.20 uses much more RAM).
How to do a memory test in Win10

Edit: maybe run some hardware monitoring software and check the temperature of your CPU? Rosetta Beta may be making use of instructions that your other project doesn't, so the other project doesn't push the CPU over the edge where Rosetta Beta does (although I'm grasping at straws here).
Grant
Darwin NT

Raj
Joined: 5 Dec 05
Posts: 7
Credit: 519,043
RAC: 182
Message 108943 - Posted: 8 Mar 2024, 17:04:55 UTC - in response to Message 108928.  

You are running a very old version of the Boinc Manager. Try updating to a more current version (7.16.n or 7.20.n).

Sorry, that was a typo, I'm running 7.24.1

Raj
Joined: 5 Dec 05
Posts: 7
Credit: 519,043
RAC: 182
Message 108944 - Posted: 8 Mar 2024, 17:29:10 UTC - in response to Message 108928.  

I ran the memory test and it reported no errors.

kotenok2000
Joined: 22 Feb 11
Posts: 271
Credit: 507,897
RAC: 496
Message 108945 - Posted: 8 Mar 2024, 18:14:49 UTC


Raj
Joined: 5 Dec 05
Posts: 7
Credit: 519,043
RAC: 182
Message 108972 - Posted: 10 Mar 2024, 19:07:33 UTC - in response to Message 108945.  

I ran that (in stress mode) for about an hour and received no errors or warnings at the end of it.

Bryn Mawr
Joined: 26 Dec 18
Posts: 398
Credit: 12,294,748
RAC: 9,249
Message 108983 - Posted: 14 Mar 2024, 21:25:43 UTC

Just downloaded 4 Beta 6.05 tasks, one of which failed immediately (0.02 seconds CPU) with:

<core_client_version>7.24.1</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)</message>
<stderr_txt>
command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_beta_6.05_x86_64-pc-linux-gnu @7a_hal_c_hal_7aa_12899_d40_0001.flags -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937
Using database: database_0f7f01a1b07/database

ERROR: Error in protocols::cyclic_peptide_predict::SimpleCycpepPredictpplication::set_up_n_to_c_cyclization_mover() function: residue 1 does not have a LOWER_CONNECT.
ERROR:: Exit from: src/protocols/cyclic_peptide_predict/SimpleCycpepPredictApplication.cc line: 2442
BOINC:: Error reading and gzipping output datafile: default.out
21:15:06 (176255): called boinc_finish(1)

</stderr_txt>
]]>

Boinc 7.24.1 and Ubuntu 22.04.4

Raj
Joined: 5 Dec 05
Posts: 7
Credit: 519,043
RAC: 182
Message 108984 - Posted: 14 Mar 2024, 23:08:51 UTC - in response to Message 108972.  

Just an update - I've gone back in to look at my tasks on the website, and since yesterday I've had several successful completions, although also a lot of failures that show status "Error while computing". I'm not sure if this is a public URL, but this is what I'm checking: https://boinc.bakerlab.org/rosetta/results.php?hostid=3481412

kotenok2000
Joined: 22 Feb 11
Posts: 271
Credit: 507,897
RAC: 496
Message 108986 - Posted: 14 Mar 2024, 23:32:41 UTC

Someone on the server side created incorrect work units in which residue 1 does not have a LOWER_CONNECT.

©2024 University of Washington
https://www.bakerlab.org