Problems and Technical Issues with Rosetta@home

Message boards : Number crunching : Problems and Technical Issues with Rosetta@home

Dusty
Joined: 1 Mar 08
Posts: 41
Credit: 2,667,354
RAC: 0
Message 78676 - Posted: 2 Sep 2015, 12:05:36 UTC - in response to Message 78673.  

Two of our servers went down again. We are currently looking into it. Hold tight.


All server status shows disabled except for the data-driven web pages, yet my completed tasks are being successfully uploaded. However, I am not receiving credit for any of the uploads. Will I receive credit for them later?
Well, server status shows all green but for "file deleter".

The account info shows updated stats but it looks like the stats XML files aren't generated as none of the external stats sites are showing any updates for now...

Ralf

I see that BOINC stats for the world isn't showing any RAC for 31 Aug (understandable, since the servers were down) or for the 1st, which I don't understand, since I started receiving credits yesterday afternoon. I'm not sure what time BOINC stats for the world gets the data, but I'm sure it's after midnight GMT, so there should have been some credits available for posting.
ID: 78676 · Rating: 0
BarryAZ
Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 0
Message 78681 - Posted: 3 Sep 2015, 4:36:38 UTC

I'm wondering if the folks at the project are aware that the stats are not being run for the external sites.

Sort of part of the 'wonderment' we get to observe.

ID: 78681 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 78682 - Posted: 3 Sep 2015, 15:30:39 UTC

The stats files should be available now and back to being updated. Thanks for the heads up!
ID: 78682 · Rating: 0
BarryAZ
Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 0
Message 78683 - Posted: 3 Sep 2015, 19:38:36 UTC - in response to Message 78682.  

Thanks -- looks like the stats sites are picking up some of the data -- sometimes it takes them a couple of days to get fully updated.


The stats files should be available now and back to being updated. Thanks for the heads up!


ID: 78683 · Rating: 0
alvin
Joined: 19 Jul 15
Posts: 5
Credit: 6,550,555
RAC: 0
Message 78723 - Posted: 8 Sep 2015, 2:27:19 UTC

I am currently running this project and it's all fine except for one thing: the amount of data downloaded. Here is my monthly report:

address        download          upload            total
bakerlab.org   24.0 GB (5.4 %)   6.00 GB (6.7 %)   30.0 GB (5.6 %)

It's strange, as I have the opposite issue with other projects: their download:upload ratio is 1:5 or more.

The issue is the amount of traffic: could I ask you to compress results on both the server and client side, if possible?

Could compressing data be an option in the settings?

I suppose that years ago no one cared about these amounts, but why is the imbalance between incoming and outgoing data so large?
Anyway, I think some action, either on the project side or on the BOINC side as a whole, could be taken to improve the balance and minimise traffic.
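
For reference, here is a small sketch of the arithmetic behind the monthly report above. The figures are the ones from the report; the function and variable names are only illustrative.

```python
# Figures copied from the monthly report above (gigabytes).
download_gb = 24.0
upload_gb = 6.0


def traffic_ratio(download: float, upload: float) -> float:
    """Return how many units are downloaded per unit uploaded."""
    if upload <= 0:
        raise ValueError("upload must be positive")
    return download / upload


ratio = traffic_ratio(download_gb, upload_gb)
total_gb = download_gb + upload_gb
download_share = download_gb / total_gb

print(f"download:upload = {ratio:.1f}:1")
print(f"download share of total = {download_share:.0%}")
```

So for this project the traffic is roughly 4:1 in favour of downloads, the reverse of the 1:5 pattern seen on the other projects.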
ID: 78723 · Rating: 0
Link
Joined: 4 May 07
Posts: 356
Credit: 382,349
RAC: 0
Message 78730 - Posted: 8 Sep 2015, 7:12:57 UTC - in response to Message 78723.  

The issue is the amount of traffic: could I ask you to compress results on both the server and client side, if possible?

Could compressing data be an option in the settings?

The data is compressed: you get all input files and the database as zip or gz. However, I don't know whether they use the best compression available with these formats.
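
To get a feel for how much the compression level matters with gz, here is a minimal sketch using Python's standard `gzip` module. The sample data is a made-up stand-in (repetitive, coordinate-like text), not an actual Rosetta input file, and nothing here says which level the project really uses.

```python
import gzip

# Hypothetical stand-in for a text input file: repetitive, coordinate-like lines.
sample = "".join(
    f"ATOM {i:5d}  CA  ALA A{i % 300:4d} {i * 0.37 % 100:8.3f} {i * 0.91 % 100:8.3f}\n"
    for i in range(20000)
).encode("ascii")


def gzip_sizes(data: bytes, levels=(1, 6, 9)) -> dict:
    """Compress data at each level and return {level: compressed size in bytes}."""
    return {level: len(gzip.compress(data, compresslevel=level)) for level in levels}


sizes = gzip_sizes(sample)
for level, size in sizes.items():
    print(f"level {level}: {size} bytes ({size / len(sample):.1%} of original)")
```

Running the same comparison on a real downloaded file would show whether a higher level would save a meaningful amount of traffic.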


I suppose that years ago no one cared about these amounts, but why is the imbalance between incoming and outgoing data so large?

That's simply a project-specific thing.
ID: 78730 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2125
Credit: 41,249,734
RAC: 9,368
Message 78781 - Posted: 13 Sep 2015, 1:11:48 UTC

Total queued jobs: 0

Ready to send 2,796
In progress 873,792

Basically, nothing in the pipeline to come down
ID: 78781 · Rating: 0
MELund
Joined: 12 Nov 10
Posts: 2
Credit: 112,163
RAC: 0
Message 78841 - Posted: 21 Sep 2015, 17:52:37 UTC

Just a heads-up: all my jobs are bad. The output looks like this:

9/21/2015 12:00:38 AM | rosetta@home | Task FKRP_rosetta_cm_t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_275412_278640_0 exited with zero status but no 'finished' file
9/21/2015 12:00:38 AM | rosetta@home | If this happens repeatedly you may need to reset the project.
9/21/2015 12:00:38 AM | rosetta@home | Task FKRP_rosetta_cm_t000__0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_275412_272371_0 exited with zero status but no 'finished' file
9/21/2015 12:00:38 AM | rosetta@home | If this happens repeatedly you may need to reset the project.

If you are just looking to know whether the problem is still there: it is. If you need more detail, I can try to help.
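
In case it helps anyone scan a longer event log for the same message, here is a minimal sketch that picks out the affected tasks. It assumes the `timestamp | project | message` layout shown in the lines above; the pattern and function names are just illustrative, and a real log would be read from a file rather than a string.

```python
import re
from collections import Counter

# Lines copied from the event log above; a real log would be read from a file.
LOG = """\
9/21/2015 12:00:38 AM | rosetta@home | Task FKRP_rosetta_cm_t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_275412_278640_0 exited with zero status but no 'finished' file
9/21/2015 12:00:38 AM | rosetta@home | If this happens repeatedly you may need to reset the project.
9/21/2015 12:00:38 AM | rosetta@home | Task FKRP_rosetta_cm_t000__0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_275412_272371_0 exited with zero status but no 'finished' file
"""

# Assumed layout: "<timestamp> | <project> | Task <name> exited with ..."
PATTERN = re.compile(
    r"^(?P<when>[^|]+?)\s*\|\s*(?P<project>[^|]+?)\s*\|\s*"
    r"Task (?P<task>\S+) exited with zero status but no 'finished' file$"
)


def failed_tasks(log_text: str) -> list:
    """Return (timestamp, project, task name) for each matching log line."""
    hits = []
    for line in log_text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            hits.append((match["when"], match["project"], match["task"]))
    return hits


hits = failed_tasks(LOG)
per_project = Counter(project for _, project, _ in hits)
for when, project, task in hits:
    print(when, project, task)
print(dict(per_project))
```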

ID: 78841 · Rating: 0
MELund
Joined: 12 Nov 10
Posts: 2
Credit: 112,163
RAC: 0
Message 78842 - Posted: 21 Sep 2015, 17:54:48 UTC - in response to Message 78841.  

Just a heads-up: all my jobs are bad. The output looks like this:

9/21/2015 12:00:38 AM | rosetta@home | Task FKRP_rosetta_cm_t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_275412_278640_0 exited with zero status but no 'finished' file
9/21/2015 12:00:38 AM | rosetta@home | If this happens repeatedly you may need to reset the project.
9/21/2015 12:00:38 AM | rosetta@home | Task FKRP_rosetta_cm_t000__0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_275412_272371_0 exited with zero status but no 'finished' file
9/21/2015 12:00:38 AM | rosetta@home | If this happens repeatedly you may need to reset the project.

If you are just looking to know whether the problem is still there: it is. If you need more detail, I can try to help. I have tried resetting the project a bunch of times.

ID: 78842 · Rating: 0
Timo
Joined: 9 Jan 12
Posts: 185
Credit: 45,649,459
RAC: 0
Message 78843 - Posted: 21 Sep 2015, 18:17:19 UTC - in response to Message 78842.  

Just a heads-up: all my jobs are bad. The output looks like this:

9/21/2015 12:00:38 AM | rosetta@home | Task FKRP_rosetta_cm_t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_275412_278640_0 exited with zero status but no 'finished' file
9/21/2015 12:00:38 AM | rosetta@home | If this happens repeatedly you may need to reset the project.
9/21/2015 12:00:38 AM | rosetta@home | Task FKRP_rosetta_cm_t000__0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_275412_272371_0 exited with zero status but no 'finished' file
9/21/2015 12:00:38 AM | rosetta@home | If this happens repeatedly you may need to reset the project.

If you are just looking to know whether the problem is still there: it is. If you need more detail, I can try to help. I have tried resetting the project a bunch of times.


There are two known possible causes for this "no 'finished' file" issue: a) local antivirus software interferes with files being written to the BOINC data directory used by Rosetta, or the directory otherwise cannot be written to because of security settings, permissions, etc.; or b) a known BOINC issue that crops up randomly and causes this exact error when the 'Use at most ___ % of CPU time' option is set to anything lower than 100% (this issue can affect many different projects, not just Rosetta).

I suggest reading through this thread (which, by the way, shows at the bottom an alternative to setting the CPU time % option lower than 100%, if that is the case for you).
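
For cause (a), a quick way to rule out plain permission problems is to try writing a file into the data directory yourself. The sketch below does that with Python's standard library. The `BOINC_DATA_DIR` environment variable is a made-up convenience for this example, not something BOINC defines, so substitute the path of your own data directory. Note that a passing check does not rule out antivirus interference, which may only affect files written by the science application.

```python
import os
import tempfile


def can_write(directory: str) -> bool:
    """Try to create, write, and delete a temporary file in the directory."""
    try:
        with tempfile.NamedTemporaryFile(dir=directory, prefix="write_test_") as handle:
            handle.write(b"ok")
            handle.flush()
        return True
    except OSError:
        # Covers missing directories as well as permission errors.
        return False


# Hypothetical location; substitute your own BOINC data directory.
data_dir = os.environ.get("BOINC_DATA_DIR", tempfile.gettempdir())
print(f"{data_dir}: {'writable' if can_write(data_dir) else 'NOT writable'}")
```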
ID: 78843 · Rating: 0
Link
Joined: 4 May 07
Posts: 356
Credit: 382,349
RAC: 0
Message 78847 - Posted: 23 Sep 2015, 17:28:34 UTC - in response to Message 78843.  

There are two known possible causes for this "no 'finished' file" issue: a) local antivirus software interferes with files being written to the BOINC data directory used by Rosetta, or the directory otherwise cannot be written to because of security settings, permissions, etc.; or b) a known BOINC issue that crops up randomly and causes this exact error when the 'Use at most ___ % of CPU time' option is set to anything lower than 100% (this issue can affect many different projects, not just Rosetta).

From my observation there's also

c) 100% HDD load for longer periods of time also leads to an exit with zero status but no 'finished' file.

Anyway, for FKRP* tasks I have also noticed an increase in this type of exit, and with the infrequent checkpointing (I've seen 5+ hours between checkpoints; yes, my Pentium M is quite slow) a lot of computing time is lost.
ID: 78847 · Rating: 0
Robby1959
Joined: 10 May 07
Posts: 38
Credit: 9,298,741
RAC: 0
Message 78882 - Posted: 5 Oct 2015, 18:54:56 UTC

I am still having work units error out, mainly on one machine. Is it my issue or a work unit problem?
ID: 78882 · Rating: 0
Timo
Joined: 9 Jan 12
Posts: 185
Credit: 45,649,459
RAC: 0
Message 78884 - Posted: 6 Oct 2015, 12:55:49 UTC - in response to Message 78882.  

I am still having work units error out, mainly on one machine. Is it my issue or a work unit problem?


Looking at your two computers, all of the failures are on computer 1614373, and they generally all say 'Aborted by User'. The same work units were sent out to other users and completed successfully, so whatever happened, it was on your side of things. Just some examples:



So, not sure why they were aborted, but generally 'aborted by user' indicates that someone selected the job in BOINC Manager and clicked the 'Abort' button.

ID: 78884 · Rating: 0
HenryRevel
Joined: 23 Oct 06
Posts: 1
Credit: 20,074
RAC: 0
Message 79199 - Posted: 12 Dec 2015, 22:15:20 UTC

Hello Rosetta@home,

This is my first time posting here. No errors, just some info to share about the estimated time shown in the BOINC client for your tasks. The counter does not function properly and counts forward while the client is open. When I first get a task, the estimate is about 12 hours; when all is said and done, it totals 24+ hours. Why not estimate it at or around 24 hours, then? I guess it's so as not to discourage others from doing the tasks because they take so long, maybe, lol.

I finish each task before moving on to the next one so as not to overclock or raise CPU temperatures. Yes, I know there are settings to change the usage that can compensate for it, but then the computation takes even longer if I turn it down. I use 75% of CPU and 75% CPU time when I'm using the machine. If I'm away from the keyboard, I change it to 90% CPU and 75% time.

Either way, the estimated timer still counts up while the client is open. If I close and re-open it, the estimated timer jumps back down and then slowly counts back upwards. I'm not sure whether this is a glitch or whether the task is still searching for its answer.
ID: 79199 · Rating: 0
Timo
Joined: 9 Jan 12
Posts: 185
Credit: 45,649,459
RAC: 0
Message 79200 - Posted: 12 Dec 2015, 23:01:11 UTC - in response to Message 79199.  

Counter does not function properly and counts forward when client is open.


This isn't a glitch; it's simply a testament to the difficulties of estimating remaining time in an imperfectly linear (read: not-quite-linear) process. I've never really bothered to even look at the 'remaining time', but it will never be 100% accurate, as the runtime is only an 'estimate' based on how many models/decoys can be squeezed into your preferred 'target runtime' as set in your Rosetta preferences.

Sure, it could just count down to your preferred target time, but then people would complain about it not stopping at exactly 0 seconds remaining as the target time is just a target.
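
To illustrate why a remaining-time figure can climb instead of falling, here is a minimal sketch of a naive linear estimator. This is an assumption made for illustration only, not how BOINC or Rosetta actually computes the estimate, and the progress samples are made up.

```python
def naive_remaining(elapsed_s: float, fraction_done: float) -> float:
    """Linear extrapolation: assume the rest runs at the average pace so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_s * (1 - fraction_done) / fraction_done


# Made-up progress samples (elapsed hours, fraction done): progress slows after hour 2.
samples = [(1, 0.10), (2, 0.20), (4, 0.25), (6, 0.30)]
for hours, done in samples:
    remaining_h = naive_remaining(hours * 3600, done) / 3600
    print(f"after {hours} h at {done:.0%}: about {remaining_h:.1f} h remaining")
```

While progress is steady the estimate shrinks, but as soon as the reported fraction done advances more slowly than the clock, the same formula yields a growing 'remaining' figure.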
ID: 79200 · Rating: 0
Steve
Joined: 22 Nov 15
Posts: 8
Credit: 164,345
RAC: 0
Message 79236 - Posted: 17 Dec 2015, 16:51:17 UTC

Hi, I'm experiencing the following:

Most Rosetta tasks complete in around 6 to 8 hours on my PC, but frequently I find that one or two tasks stop showing an estimated remaining time (just --- in the column) and then continue to run indefinitely. I let one run for over 36 hours; it still had not completed and was 'stuck' at about 4% done, though the percentage varies (never more than 20%).

I accept that estimating is inexact - more of a guess sometimes? - but I've taken to aborting these apparent zombie tasks that are just eating CPU and seemingly not producing results after 12 hours or so.

Am I being too pessimistic? Should I just let these run and run in the hope they will someday finish? Or are these bad tasks?

If you need more details please let me know.

Thanks
Steve
ID: 79236 · Rating: 0
Link
Joined: 4 May 07
Posts: 356
Credit: 382,349
RAC: 0
Message 79238 - Posted: 17 Dec 2015, 18:30:59 UTC - in response to Message 79236.  

I accept that estimating is inexact - more of a guess sometimes? - but I've taken to aborting these apparent zombie tasks that are just eating CPU and seemingly not producing results after 12 hours or so.

Am I being too pessimistic? Should I just let these run and run in the hope they will someday finish? Or are these bad tasks?

What is your runtime preference? In general, as long as they are indeed using CPU, let them run; if they aren't, restart the BOINC client and they should probably finish fine.
ID: 79238 · Rating: 0
Timo
Joined: 9 Jan 12
Posts: 185
Credit: 45,649,459
RAC: 0
Message 79239 - Posted: 17 Dec 2015, 20:49:10 UTC - in response to Message 79238.  

I've taken to aborting these apparent zombie tasks that are just eating CPU and seemingly not producing results after 12 hours or so.

Am I being too pessimistic? Should I just let these run and run in the hope they will someday finish? Or are these bad tasks?

restart the BOINC client and then they probably should finish fine.


(In my experience these seemingly zombie tasks actually DO finish on their own eventually, and no, the CPU is not spinning for nothing; it's generally searching a deep pocket of conformation space. With that said, I've RARELY seen any task go on for longer than an hour or so after its target runtime.)

Anyway, if it DOES happen that a task goes well past its target run time, then instead of aborting, a better solution is to exit BOINC (ensure it's set to 'Stop running tasks when exiting the BOINC Manager') and restart BOINC. This will trigger any 'stuck' tasks to finish and report themselves in properly.
ID: 79239 · Rating: 0
Link
Joined: 4 May 07
Posts: 356
Credit: 382,349
RAC: 0
Message 79240 - Posted: 17 Dec 2015, 21:29:48 UTC - in response to Message 79239.  

Anyway, if it DOES happen that a task goes well past its target run time, then instead of aborting, a better solution is to exit BOINC (ensure it's set to 'Stop running tasks when exiting the BOINC Manager') and restart BOINC. This will trigger any 'stuck' tasks to finish and report themselves in properly.

Restarting BOINC only makes sense if one of the tasks is really stuck, i.e. not using any CPU time. Always check in Task Manager first. Going over the target run time means nothing, because it's actually a target CPU time, so if you run lots of other CPU-intensive tasks, the run time might get much longer.

If all tasks are using CPU, they are not stuck. Since some Rosetta tasks checkpoint very rarely (up to several hours between checkpoints), one should avoid restarting BOINC unless it is really necessary.
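
The two points above can be sketched as follows. This is only an illustration under stated assumptions: the CPU-time readings are hypothetical numbers you would note down yourself (for example from the CPU Time column in Task Manager), and the function names and the one-second threshold are made up for the example.

```python
def is_stuck(cpu_seconds_before: float, cpu_seconds_after: float,
             min_progress_s: float = 1.0) -> bool:
    """A task counts as stuck if its cumulative CPU time barely moved between samples."""
    return (cpu_seconds_after - cpu_seconds_before) < min_progress_s


def expected_wall_hours(target_cpu_hours: float, cpu_share: float) -> float:
    """Wall-clock time needed to accumulate the target CPU time at a given CPU share."""
    if not 0 < cpu_share <= 1:
        raise ValueError("cpu_share must be in (0, 1]")
    return target_cpu_hours / cpu_share


# Hypothetical readings taken a minute apart.
print(is_stuck(5400.0, 5458.0))     # CPU time advanced by 58 s: not stuck
print(is_stuck(5400.0, 5400.0))     # no CPU time used: stuck
print(expected_wall_hours(6, 0.5))  # 6 h of CPU time when the task only gets half a core
```

A task that only gets half of a core because other programs are busy needs twice the wall-clock time to reach the same CPU-time target, which is why a long run time alone is not a sign of trouble.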
ID: 79240 · Rating: 0
Steve
Joined: 22 Nov 15
Posts: 8
Credit: 164,345
RAC: 0
Message 79244 - Posted: 18 Dec 2015, 13:30:34 UTC - in response to Message 79240.  

Thanks for all the responses, nice to know this thread is monitored :-)
OK, I'll be more patient and let them run - maybe reboot the PC every couple of days, which will restart BOINC manager and client.
Season's best wishes
Steve
ID: 79244 · Rating: 0

©2024 University of Washington
https://www.bakerlab.org