Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Dusty Send message Joined: 1 Mar 08 Posts: 41 Credit: 2,667,354 RAC: 0 |
Well, server status shows all green but for "file deleter". Two of our servers went down again. We are currently looking into it. Hold tight. I see that BOINC stats for the world isn't showing any RAC for 31 Aug (understandable since the servers were down) or the 1st, which I don't understand since I started receiving credits yesterday afternoon. I'm not sure what time BOINC stats for the world gets the data, but I'm sure it's after midnight GMT, so there should have been some credits available for posting. |
BarryAZ Send message Joined: 27 Dec 05 Posts: 153 Credit: 30,843,285 RAC: 0 |
I'm wondering if the folks at the project are aware that the stats are not being run for the external sites. Sort of part of the 'wonderment' we get to observe. |
David E K Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
The stats files should be available now and back to being updated. Thanks for the heads up! |
BarryAZ Send message Joined: 27 Dec 05 Posts: 153 Credit: 30,843,285 RAC: 0 |
Thanks -- looks like the stats sites are picking up some of the data -- sometimes it takes them a couple of days to get fully updated. The stats files should be available now and back to being updated. Thanks for the heads up! |
alvin Send message Joined: 19 Jul 15 Posts: 5 Credit: 6,550,555 RAC: 0 |
I am currently running this project and it's all fine except for one thing: the amount of downloaded data. Here is the monthly report: address bakerlab.org, download 24.0 GB (5.4 %), upload 6.00 GB (6.7 %), total 30.0 GB (5.6 %). It's strange, as I have the opposite issue with other projects: they have a huge download:upload ratio of 1:5 or more. The issue is the amount of traffic: could I ask you to pack results on the server and client side if possible? Could compressing data be an option in settings? I suppose that years ago no one cared about those amounts, but why is the imbalance between incoming and outgoing data so huge? Anyway, I think some action, either on the project side or on the BOINC side as a whole, could be taken to restore the balance and minimise traffic. |
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
The issue is amount of traffic : could I ask you to pack results on server and client side if possible? The data is compressed; you get all input files and the database as zip or gz. However, I don't know if they use the best compression available with these formats. I suppose all those years ages ago noone cares about those amounts, but why the difference disbalance between incoming data and outcoming data is so huge? That's simply a project-specific thing. |
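To illustrate the point about "best compression available" with the gz format, here is a minimal sketch that compresses the same payload at several gzip levels and compares the sizes. The sample payload is hypothetical and highly repetitive; it is not a real Rosetta input file, and nothing here reflects what the project servers actually do.

```python
import gzip


def compressed_sizes(data: bytes, levels=(1, 6, 9)) -> dict:
    """Return {level: compressed size in bytes} for each gzip level."""
    return {
        level: len(gzip.compress(data, compresslevel=level))
        for level in levels
    }


# Hypothetical, repetitive payload; real input files will compress differently.
sample = b"ATOM      1  N   MET A   1      27.340  24.430   2.614\n" * 2000

sizes = compressed_sizes(sample)
for level, size in sizes.items():
    print(f"level {level}: {size} bytes (original {len(sample)} bytes)")
```

How much a higher level helps depends entirely on the data, so the only reliable way to judge it is to measure on the actual files.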
Sid Celery Send message Joined: 11 Feb 08 Posts: 2125 Credit: 41,249,734 RAC: 9,368 |
Total queued jobs: 0. Ready to send: 2,796. In progress: 873,792. Basically, nothing in the pipeline to come down. |
MELund Send message Joined: 12 Nov 10 Posts: 2 Credit: 112,163 RAC: 0 |
Just a heads up: all my jobs are bad. Output for upload looks like this. 9/21/2015 12:00:38 AM | rosetta@home | Task FKRP_rosetta_cm_t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_275412_278640_0 exited with zero status but no 'finished' file 9/21/2015 12:00:38 AM | rosetta@home | If this happens repeatedly you may need to reset the project. 9/21/2015 12:00:38 AM | rosetta@home | Task FKRP_rosetta_cm_t000__0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_275412_272371_0 exited with zero status but no 'finished' file 9/21/2015 12:00:38 AM | rosetta@home | If this happens repeatedly you may need to reset the project. If you are just looking to know whether the problem is still there: it is. If you need more detail, I can try to help. |
MELund Send message Joined: 12 Nov 10 Posts: 2 Credit: 112,163 RAC: 0 |
I have tried resetting the project a bunch of times. |
Timo Send message Joined: 9 Jan 12 Posts: 185 Credit: 45,649,459 RAC: 0 |
Just a heads up all my jobs are bad. Output for upload looks like this. There are two possible known causes for this "no 'finished' file" issue: a) There is an issue with local antivirus software interfering with the files being written to the BOINC data directory used by Rosetta, or the directory otherwise cannot be written to due to security settings, permissions, etc.; or b) There is a known BOINC issue that randomly crops up and causes this exact error when the 'Use at most ___ % of CPU time' option is set to anything lower than 100% (this issue can affect many different projects, not just Rosetta). I suggest reading through this thread (which, by the way, shows at the bottom an alternative to setting the CPU time % option to anything lower than 100%, if that is the case for you). |
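For cause a), one quick sanity check is to see whether files can be created in the data directory at all. The sketch below is only an illustration: the function name is made up, you would pass in your own BOINC data directory path, and it should be run as the same user account that runs BOINC. It will not detect antivirus software that interferes only intermittently.

```python
import tempfile


def can_write_to(directory: str) -> bool:
    """Return True if a temporary file can be created and written in directory."""
    try:
        # The probe file is deleted automatically when the block exits.
        with tempfile.NamedTemporaryFile(dir=directory) as probe:
            probe.write(b"probe")
            probe.flush()
        return True
    except OSError:
        # Covers a missing directory, permission errors, and similar failures.
        return False


# Example call against the system temp directory; substitute your data directory.
print(can_write_to(tempfile.gettempdir()))
```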
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
There are two possible known causes for this "no 'finished' file" issue - a) There is an issue with local antivirus software interfering with the files being written to the boinc DATA directory used by Rosetta, or the directory can otherwise not be written to due to security settings or permissions, etc. or b) There is a known BOINC issue that randomly crops up and causes this exact error when the 'Use at most ___ % of CPU time' option is set to anything lower than 100% (this issue can affect many different projects not just Rosetta). From my observation there's also c) 100% HDD load for longer periods of time, which also leads to an exit with zero status but no 'finished' file. Anyway, for FKRP* tasks I also noticed an increase in this type of exit, and with the infrequent checkpointing (I've seen 5+ hours between checkpoints (yes, my Pentium M is quite slow)) lots of computing time is lost. |
Robby1959 Send message Joined: 10 May 07 Posts: 38 Credit: 9,298,741 RAC: 0 |
I am still having work units error out, mainly on one machine. Is it my issue or a work unit problem? |
Timo Send message Joined: 9 Jan 12 Posts: 185 Credit: 45,649,459 RAC: 0 |
I am still have work units error out . mainly one machine . is it my issue or a work unit problem Looking at your two computers, all of the failures are on computer 1614373, but generally they all say 'Aborted by User', and the same work units got sent out to other users and were completed successfully, so whatever happened, it was on your side of things. Just some examples:
|
HenryRevel Send message Joined: 23 Oct 06 Posts: 1 Credit: 20,074 RAC: 0 |
Hello Rosetta@home, This is my first time posting on here. No errors, just info to share about the estimated time in the BOINC client for your tasks. The counter does not function properly and counts upward while the client is open. When I first get a task, the estimate is about 12 hrs; when all is said and done, it totals 24+ hrs. Why not have it estimated at or around 24 hrs then? I guess it's so as not to discourage others from doing the task because it takes so long, maybe, lol. I finish each task first and then move to the next one so as not to overclock or raise CPU temperatures. Yes, I know there are settings to change the usage that can compensate for it, but then the computation takes even longer if I turn it down. I use 75% of CPU and 75% CPU time when I'm using the machine. When away from the keyboard, I change to 90% CPU and 75% time. Still, the estimated timer counts up either way when the client is open. If I close and then re-open it, the estimated timer jumps back down and then slowly counts back upward. Not sure if this is a glitch or if it is the task searching for its answer. |
Timo Send message Joined: 9 Jan 12 Posts: 185 Credit: 45,649,459 RAC: 0 |
Counter does not function properly and counts forward when client is open. This isn't a glitch; it's simply a testament to the difficulty of estimating remaining time in an imperfectly linear (read: not-quite-linear) process. I've never really bothered to even look at the 'remaining time', but it will never be 100% accurate, as the runtime is only an 'estimate' based on how many models/decoys can be squeezed into your preferred 'target runtime' as set in your Rosetta preferences. Sure, it could just count down to your preferred target time, but then people would complain about it not stopping at exactly 0 seconds remaining, as the target time is just a target. |
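To make the "counts upward" behaviour concrete, here is a toy sketch of a naive linear estimator. This is not BOINC's actual algorithm, and the function name and numbers are made up for illustration; it only shows that any estimate extrapolated from "fraction done" will grow whenever elapsed time increases while reported progress stands still.

```python
from typing import Optional


def naive_remaining_seconds(elapsed_s: float, fraction_done: float) -> Optional[float]:
    """Extrapolate remaining time assuming progress is perfectly linear."""
    if not 0.0 < fraction_done <= 1.0:
        return None  # no meaningful estimate without some reported progress
    return elapsed_s * (1.0 - fraction_done) / fraction_done


# One hour in, the task reports 50% done.
early = naive_remaining_seconds(3600, 0.5)

# Another hour later the reported progress is still 50%,
# so the estimate has gone up rather than down.
later = naive_remaining_seconds(7200, 0.5)

print(early, later)
```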
Steve Send message Joined: 22 Nov 15 Posts: 8 Credit: 164,345 RAC: 0 |
Hi, I'm experiencing the following: Most Rosetta tasks complete in around 6 to 8 hours on my PC, but frequently I find one or two tasks stop showing an estimated remaining time [just --- in the column] and then they continue to run indefinitely. I let one run for over 36 hours, still not completed, and 'stuck' on about 4% done, but the % varies (never more than 20% though). I accept that estimating is inexact - more of a guess sometimes? - but I've taken to aborting these apparent zombie tasks that are just eating CPU and seemingly not producing results after 12 hours or so. Am I being too pessimistic? Should I just let these run and run in the hope they will someday finish? Or are these bad tasks? If you need more details please let me know. Thanks Steve |
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
I accept that estimating is inexact - more of a guess sometimes? - but I've taken to aborting these apparent zombie tasks that are just eating CPU and seemingly not producing results after 12 hours or so. What is your runtime preference? In general, as long as they are indeed using CPU, let them run. If they aren't, restart the BOINC client and they should then probably finish fine. |
Timo Send message Joined: 9 Jan 12 Posts: 185 Credit: 45,649,459 RAC: 0 |
I've taken to aborting these apparent zombie tasks that are just eating CPU and seemingly not producing results after 12 hours or so. In my experience these seemingly zombie tasks actually DO finish on their own eventually, and no, the CPU is not spinning for nothing; it's generally searching a deep pocket of conformation space. With that said, I've RARELY seen any tasks go on for longer than an hour or so after their target runtime. Anyway, if it DOES happen that a task goes well past its target run time, then instead of aborting, a better solution is to exit BOINC (ensure it's set to 'Stop running tasks when exiting the BOINC Manager') and restart BOINC. This will trigger any 'stuck' tasks to finish and report themselves in properly. |
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
Anyways, if it DOES happen that a task goes well past its target run time then.. Instead of aborting, a better solution is actually to exit BOINC (ensure its set to 'Stop running tasks when exiting the BOINC Manager') and re-start BOINC. This will trigger any 'stuck' tasks to finish and report themselves in properly. Restarting BOINC only makes sense if one of the tasks is really stuck, i.e. not using any CPU time. Always check in Task Manager first. Going over the target run time means nothing, because it's actually a target CPU time, so if you do lots of other CPU-intensive work, the run time might get much longer. If all tasks are using CPU, they are not stuck. Since some Rosetta tasks checkpoint very rarely (up to several hours between checkpoints), one should avoid restarting BOINC if not really necessary. |
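The "check CPU time before restarting" rule can be written down as a small decision function. This is a sketch only: the function name and the 1% threshold are assumptions for illustration, and the two CPU-time readings are whatever you note down for the task's process (for example from Task Manager) at the start and end of an interval.

```python
def looks_stuck(cpu_s_before: float, cpu_s_after: float,
                wall_interval_s: float, min_cpu_share: float = 0.01) -> bool:
    """Return True if the task used almost no CPU time during the interval.

    cpu_s_before / cpu_s_after: the task's accumulated CPU seconds at the
    start and end of the interval. min_cpu_share is a hypothetical threshold.
    """
    if wall_interval_s <= 0:
        raise ValueError("wall_interval_s must be positive")
    cpu_used = cpu_s_after - cpu_s_before
    return cpu_used / wall_interval_s < min_cpu_share


# CPU time did not advance over a minute: a candidate for restarting BOINC.
print(looks_stuck(1000.0, 1000.0, 60.0))

# CPU time advanced by 45 s in a minute: the task is working, leave it alone.
print(looks_stuck(1000.0, 1045.0, 60.0))
```

A task that is merely slow because other programs are competing for the CPU still accumulates CPU time, so it would not be flagged here, which matches the advice above.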
Steve Send message Joined: 22 Nov 15 Posts: 8 Credit: 164,345 RAC: 0 |
Thanks for all the responses, nice to know this thread is monitored :-) OK, I'll be more patient and let them run - maybe reboot the PC every couple of days, which will restart BOINC manager and client. Season's best wishes Steve |
©2024 University of Washington
https://www.bakerlab.org