Problems and Technical Issues with Rosetta@home

Message boards : Number crunching : Problems and Technical Issues with Rosetta@home

Warped
Joined: 15 Jan 06
Posts: 48
Credit: 1,788,185
RAC: 0
Message 70445 - Posted: 30 May 2011, 15:57:42 UTC - in response to Message 70425.  

Hey guys, I submitted a new job for MVH, which you can read about in the protein-protein interface thread if you're interested. This job is slightly different from the previous ones (it includes more stubs), so I wanted to do some extra checking to make sure that it wouldn't break anything.

Hopefully, I'll get some jobs for Ebola targets later this week!


Shawn, thanks for sorting out the checkpointing issue - a vast improvement.
ID: 70445 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 70449 - Posted: 30 May 2011, 17:43:31 UTC - in response to Message 70442.  

But there is an intermittent problem with validation somewhere. Some of my team are getting credits straight away, some after a delay (not long but undetermined), one for much of the day. No biggie, but a kick may be in order...
I wouldn't call that an "intermittent" problem. A couple of WUs got validated during the day, but the list of "Pending credit" WU's just keeps getting longer...
Hope that isn't like a balloon that gets slowly blown up until you get a big bang (again).

Ralf



The problem persists. Credit is granted, but it does not show up in the average chart. The line just keeps going down like an airplane in a nose dive.
I hope someone from the team is reading this thread and paying attention.
ID: 70449 · Rating: 0
rochester new york
Joined: 2 Jul 06
Posts: 2842
Credit: 2,020,043
RAC: 0
Message 70450 - Posted: 30 May 2011, 18:51:54 UTC - in response to Message 70449.  

But there is an intermittent problem with validation somewhere. Some of my team are getting credits straight away, some after a delay (not long but undetermined), one for much of the day. No biggie, but a kick may be in order...
I wouldn't call that an "intermittent" problem. A couple of WUs got validated during the day, but the list of "Pending credit" WU's just keeps getting longer...
Hope that isn't like a balloon that gets slowly blown up until you get a big bang (again).

Ralf



The problem persists. Credit is granted, but it does not show up in the average chart. The line just keeps going down like an airplane in a nose dive.
I hope someone from the team is reading this thread and paying attention.


They might read it, but not much is getting done because of the holiday.
ID: 70450 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 70451 - Posted: 30 May 2011, 19:18:12 UTC - in response to Message 70450.  

But there is an intermittent problem with validation somewhere. Some of my team are getting credits straight away, some after a delay (not long but undetermined), one for much of the day. No biggie, but a kick may be in order...
I wouldn't call that an "intermittent" problem. A couple of WUs got validated during the day, but the list of "Pending credit" WU's just keeps getting longer...
Hope that isn't like a balloon that gets slowly blown up until you get a big bang (again).

Ralf



The problem persists. Credit is granted, but it does not show up in the average chart. The line just keeps going down like an airplane in a nose dive.
I hope someone from the team is reading this thread and paying attention.


They might read it, but not much is getting done because of the holiday.


Yeah, forgot about that, but do researchers/tech teams really take holidays? lol
ID: 70451 · Rating: 0
Shawn
Volunteer moderator
Project developer
Project scientist
Joined: 22 Jan 10
Posts: 17
Credit: 53,741
RAC: 0
Message 70453 - Posted: 30 May 2011, 21:52:34 UTC

Hey guys, I'm not too familiar with how credits are awarded, but I'll answer to the best of my knowledge.

As I have previously mentioned, my protocol is a pretty "jumpy" one, in which some trajectories take much longer than others. In fact, the ones that tend to be short are the ones that finish at the very beginning, because the protocol has identified those guys as being "non-productive". So basically, at the very beginning, you will have generated a bunch of models really quickly. However, if you have a model that's running more slowly, unfortunately, you might not earn as much credit, but on the other hand, those runs are much more likely to provide a useful structure, so the time you spend on those longer models is very much appreciated!

However, the credit assigned for these protocols is normalized to a running average of how long each work unit takes. Since this number is constantly changing, and because the distribution of "long vs. short" jobs is skewed, I'm not surprised that the credits awarded are a little bit finicky. But crunching more models isn't causing your credits to go down; they would have gone down anyway (and to a greater extent) because of the peculiarities of the normalization.

I'll check sometime this week with other lab members who are more familiar with this process when they are available though.
ID: 70453 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 70463 - Posted: 31 May 2011, 15:27:22 UTC

Shawn, I don't think it is your work units that are causing this problem.
I looked at some of my tasks from just before the big crash in RAC on my system, and all of them were granted 20 or more points above the claimed credit. This should keep my RAC heading up, even at a slow pace. A drop of over 100 credits of RAC in 1 day cannot really be explained by your tasks jumping all over the place.
That would account for perhaps a slow decline followed by a slow increase, if anything.

A random sampling of the tasks I have had shows quite a bit of ProteinG and others like casd and IF3 and so on. All of these are granted 12-20 pts over the claimed credit.

Perhaps you can ask Keith or someone to look at my account and see if they can explain the 100 pt drop in under 24hrs.
ID: 70463 · Rating: 0
bookwyrm
Joined: 3 Jan 11
Posts: 3
Credit: 1,232,986
RAC: 0
Message 70465 - Posted: 31 May 2011, 15:43:41 UTC - in response to Message 70463.  

Shawn, I don't think it is your work units that are causing this problem.
I looked at some of my tasks from just before the big crash in RAC on my system, and all of them were granted 20 or more points above the claimed credit. This should keep my RAC heading up, even at a slow pace. A drop of over 100 credits of RAC in 1 day cannot really be explained by your tasks jumping all over the place.
That would account for perhaps a slow decline followed by a slow increase, if anything.

A random sampling of the tasks I have had shows quite a bit of ProteinG and others like casd and IF3 and so on. All of these are granted 12-20 pts over the claimed credit.

Perhaps you can ask Keith or someone to look at my account and see if they can explain the 100 pt drop in under 24hrs.


There were no work units roughly between the 26th and 28th.
There was a drop of about 65 on my RAC due to that.
ID: 70465 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 70467 - Posted: 31 May 2011, 17:15:06 UTC - in response to Message 70465.  
Last modified: 31 May 2011, 17:16:40 UTC

Shawn, I don't think it is your work units that are causing this problem.
I looked at some of my tasks from just before the big crash in RAC on my system, and all of them were granted 20 or more points above the claimed credit. This should keep my RAC heading up, even at a slow pace. A drop of over 100 credits of RAC in 1 day cannot really be explained by your tasks jumping all over the place.
That would account for perhaps a slow decline followed by a slow increase, if anything.

A random sampling of the tasks I have had shows quite a bit of ProteinG and others like casd and IF3 and so on. All of these are granted 12-20 pts over the claimed credit.

Perhaps you can ask Keith or someone to look at my account and see if they can explain the 100 pt drop in under 24hrs.


There were no work units roughly between the 26th and 28th.
There was a drop of about 65 on my RAC due to that.


Still, there is something wrong with that picture. I looked at my account and yes, on the 27th there was no work reported. But for just a couple of tasks to cause a 100 pt drop in RAC in just a day is strange. I have aborted tasks before to get the balance right with all my projects and never had a 100 pt RAC drop in just a day.
ID: 70467 · Rating: 0
Shawn
Volunteer moderator
Project developer
Project scientist
Joined: 22 Jan 10
Posts: 17
Credit: 53,741
RAC: 0
Message 70469 - Posted: 31 May 2011, 21:15:31 UTC

I spoke to dekim about the crediting issue, and he believes it's related to the fact that there were no work units sent out last week. Since we have plenty of work queued up now, this shouldn't be an issue, but if it persists this week, please let me know again.
ID: 70469 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 70475 - Posted: 1 Jun 2011, 13:33:02 UTC - in response to Message 70469.  

I spoke to dekim about the crediting issue, and he believes it's related to the fact that there were no work units sent out last week. Since we have plenty of work queued up now, this shouldn't be an issue, but if it persists this week, please let me know again.



Theory still does not pan out. I had 2 tasks or more per day completed as I keep a buffer of work. I missed only 1 day of tasks due to the no work issue.
Since RAC is averaged over a week if I remember correctly, there still is no reason for a 100 pt drop in just 1 day!

This is what I am trying to get across to you guys.
100 pts in 1 day RAC!!!

How is that even possible?
Ralph never dropped that much in a day even when there was no work.
So how can Rosie do that?????
ID: 70475 · Rating: 0
bookwyrm
Joined: 3 Jan 11
Posts: 3
Credit: 1,232,986
RAC: 0
Message 70478 - Posted: 1 Jun 2011, 16:01:44 UTC - in response to Message 70475.  
Last modified: 1 Jun 2011, 16:10:27 UTC

I spoke to dekim about the crediting issue, and he believes it's related to the fact that there were no work units sent out last week. Since we have plenty of work queued up now, this shouldn't be an issue, but if it persists this week, please let me know again.



Theory still does not pan out. I had 2 tasks or more per day completed as I keep a buffer of work. I missed only 1 day of tasks due to the no work issue.
Since RAC is averaged over a week if I remember correctly, there still is no reason for a 100 pt drop in just 1 day!

This is what I am trying to get across to you guys.
100 pts in 1 day RAC!!!

How is that even possible?
Ralph never dropped that much in a day even when there was no work.
So how can Rosie do that?????


It was almost 2 days of no work.

According to boinc-wiki.info, RAC is calculated and updated when the project grants you credit, and it takes into account:
1) what your RAC was before the credits were granted
2) how long it has been since RAC was last calculated (the time between credit grants)
3) how much credit you've gained since the last update
Because RAC is only updated when credits are granted, you will notice on the statistics tab in BOINC for Rosetta that RAC didn't change for a couple of days when the network was down; you can compare this with the total credits graph as well. If RAC were a simple moving average, then over those days with no work it would have dropped gradually, rather than staying static for a couple of days and then dropping steeply.

If there was a large amount of time between the RAC calculations, more weight will be put on the credit gained since the last RAC update. Conversely, if very little time had passed between the RAC calculations then the current RAC will have more weight in the calculations and the new RAC will be much closer to the current RAC.
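
To make the weighting concrete, here is a minimal sketch of an exponentially weighted update of the kind described above. It is not taken from the BOINC source: the function name `update_rac`, the one-week half-life, and the sample numbers are assumptions chosen for illustration.

```python
import math

SECONDS_PER_DAY = 86_400
# Assumption: a one-week half-life; the posts above do not state the value.
HALF_LIFE_SECONDS = 7 * SECONDS_PER_DAY


def update_rac(old_rac, gap_seconds, credit_granted, half_life=HALF_LIFE_SECONDS):
    """Return the RAC after credit_granted arrives gap_seconds after the last update."""
    gap_seconds = max(gap_seconds, 0.0)
    # Weight kept by the old RAC: close to 1 for a short gap, smaller for a long one.
    weight = math.exp(-gap_seconds * math.log(2) / half_life)
    if 1.0 - weight < 1e-6:
        # Gap is effectively zero: use the limiting form to avoid dividing by ~0.
        return old_rac * weight + math.log(2) * credit_granted * SECONDS_PER_DAY / half_life
    # Credit per day earned over the gap, blended in with the remaining weight.
    credit_per_day = credit_granted / (gap_seconds / SECONDS_PER_DAY)
    return old_rac * weight + (1.0 - weight) * credit_per_day


if __name__ == "__main__":
    # Same starting RAC and same low-credit task, but different gaps since the last grant.
    print(update_rac(461, 6 * 3600, 60))   # short gap: the old RAC dominates
    print(update_rac(461, 48 * 3600, 60))  # long gap: the new credit rate pulls RAC down
```

In this sketch a long gap followed by a low-credit result lowers RAC in one step, which matches the "static, then a steep drop" shape seen in the BOINC Manager graph.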

You should check how many hours passed between the tasks you returned on the 26th, when the no-work problem started, and the next task you returned sometime on the 28th/29th, depending on how long you've set the target running time. If the first couple of tasks you returned after the WU drought were abnormally low (lower than RAC), it would make your RAC drop significantly.

Of course you can take this with a pinch of salt. I've only spent a couple of minutes looking at the equation so I might have missed something out.
ID: 70478 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 70480 - Posted: 1 Jun 2011, 16:07:51 UTC

One possible explanation would be a combination of a low-credit task (one that happens to hit a model that runs for a long time, resulting in poor credit granted compared to claimed) with a validator (the server process that others noted was behind for a time) that was slow to issue credit. It sounds like you were looking at credit in the BOINC Manager; if not, note that the stats websites are always behind due to how the credit information is disseminated. So if the stats site was behind, or if BOINC Manager had not recently completed a scheduler request (which might occur when it has received no work from a project for a day or so), then your chart may have compressed the combination of factors into one extreme data point.

Your current RAC shows 461. Multiply by 7 and that reflects a weekly total of 3,227. Let a day go by with zero credit issued, and your weekly total drops to 2,766, for a daily average of 395. So, if you went with no credit for exactly a day, one would expect your RAC to drop by 66 points. You saw it drop 100. So if you were without work for 30 hours rather than 24, or if some credit were granted 6 hours later than the normal instantaneous grant, that would get you to 100 pretty quickly.
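
The back-of-the-envelope arithmetic above can be checked with a few lines of Python. This is only a sketch of the simplified model used in this post, which treats RAC as a plain seven-day mean; the function name `rac_after_idle` and the extra idle durations are illustrative assumptions.

```python
def rac_after_idle(rac, idle_hours, window_days=7):
    """Simple-average model: remove the credit that idle_hours would have earned."""
    window_total = rac * window_days
    lost = rac * idle_hours / 24
    return (window_total - lost) / window_days


if __name__ == "__main__":
    for hours in (24, 30, 36):
        new_rac = rac_after_idle(461, hours)
        print(f"{hours} h idle: RAC {new_rac:.0f}, drop {461 - new_rac:.0f}")
```

For exactly 24 idle hours this reproduces the 66-point drop; longer idle periods or delayed grants push the drop higher.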
Rosetta Moderator: Mod.Sense
ID: 70480 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 70484 - Posted: 1 Jun 2011, 17:16:16 UTC - in response to Message 70480.  
Last modified: 1 Jun 2011, 17:17:15 UTC

One possible explanation would be a combination of a low-credit task (one that happens to hit a model that runs for a long time, resulting in poor credit granted compared to claimed) with a validator (the server process that others noted was behind for a time) that was slow to issue credit. It sounds like you were looking at credit in the BOINC Manager; if not, note that the stats websites are always behind due to how the credit information is disseminated. So if the stats site was behind, or if BOINC Manager had not recently completed a scheduler request (which might occur when it has received no work from a project for a day or so), then your chart may have compressed the combination of factors into one extreme data point.

Your current RAC shows 461. Multiply by 7 and that reflects a weekly total of 3,227. Let a day go by with zero credit issued, and your weekly total drops to 2,766, for a daily average of 395. So, if you went with no credit for exactly a day, one would expect your RAC to drop by 66 points. You saw it drop 100. So if you were without work for 30 hours rather than 24, or if some credit were granted 6 hours later than the normal instantaneous grant, that would get you to 100 pretty quickly.



Ah ok Mod.
Now that makes more sense to me.
I do not know the precise time involved, but your calculation along with the total days/hrs listed by bookwyrm seems to work out to the amount of credit lost.

BTW, the graph I was looking at was the BOINC manager graph.
I don't trust the stats sites to be anywhere near accurate.
ID: 70484 · Rating: 0
John C MacAlister
Joined: 26 Mar 11
Posts: 4
Credit: 46,289
RAC: 0
Message 70487 - Posted: 2 Jun 2011, 10:45:58 UTC
Last modified: 2 Jun 2011, 10:46:32 UTC

Hi:

The task shown has been running for about 18 hours with some 30 to go. Does anyone know if this is unusual?

John




Rosetta@home


Task details


Task ID: 426450248
Name: NTRC_looprlx_inactive_SAVE_ALL_OUT_26638_78102_0
Workunit: 389162774
Created: 1 Jun 2011 1:50:33 UTC
Sent: 1 Jun 2011 1:55:49 UTC
Received: ---
Server state: In Progress
Outcome: Unknown
Client state: New
Exit status: 0 (0x0)
Computer ID: 1447830
Report deadline: 11 Jun 2011 1:55:49 UTC
CPU time: 0
stderr out:
Validate state: Initial
Claimed credit: 0
Granted credit: 0
Application version: ---

ID: 70487 · Rating: 0
John C MacAlister
Joined: 26 Mar 11
Posts: 4
Credit: 46,289
RAC: 0
Message 70488 - Posted: 2 Jun 2011, 13:36:52 UTC
Last modified: 2 Jun 2011, 13:37:24 UTC

I suspended this task when it had run for 19:42 and still reported 31:54 remaining.
ID: 70488 · Rating: 0
John C MacAlister
Joined: 26 Mar 11
Posts: 4
Credit: 46,289
RAC: 0
Message 70490 - Posted: 2 Jun 2011, 17:10:27 UTC
Last modified: 2 Jun 2011, 17:11:03 UTC

Problem cleared when I restarted BOINC.
ID: 70490 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 70660 - Posted: 29 Jun 2011, 3:53:02 UTC

Hi.

It's been posted elsewhere, but I've put this here in case someone is monitoring this thread.

You have a bunch of services down.

rah_validator_mini bk1 Not running
rah_assimilator_beta1 bk1 Not running
rah_assimilator_mini1 bk1 Not running
rah_assimilator_mini2 bk1 Not running
rah_assimilator_mini3 bk2 Running
rah_assimilator_mini4 bk2 Running
rah_assimilator_mini5 bk1 Not running
rah_assimilator_mini6 bk1 Not running

ID: 70660 · Rating: 0
Italy98
Joined: 17 Aug 09
Posts: 8
Credit: 87,446
RAC: 0
Message 70763 - Posted: 22 Jul 2011, 2:33:24 UTC

Hello, am I downloading the incorrect work units? Each file shows 3:37:30 for the processing time; however, it seems that for every second processed, the estimated time increases by two to three seconds. The elapsed time of the current work unit is 49:20, with 3:17:13 remaining, for a total processing time of 4:06:33, an increase of about 29 minutes. Thanks.
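
As a quick check of the figures quoted in this post, the sketch below adds the elapsed and remaining times and compares the sum with the initial estimate. The helper names `to_seconds` and `to_clock` are made up for this illustration; the clock values are the ones from the post.

```python
def to_seconds(clock):
    """Convert 'h:mm:ss' or 'mm:ss' to a number of seconds."""
    seconds = 0
    for part in clock.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def to_clock(seconds):
    """Convert a number of seconds back to 'h:mm:ss'."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


initial_estimate = to_seconds("3:37:30")
projected_total = to_seconds("49:20") + to_seconds("3:17:13")

if __name__ == "__main__":
    print("projected total:", to_clock(projected_total))
    print("growth over initial estimate:", to_clock(projected_total - initial_estimate))
```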
ID: 70763 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 70782 - Posted: 24 Jul 2011, 13:22:52 UTC

Italy98, please keep in mind that time to completion is just an estimate. And BOINC's methods for estimation have their limitations. Rosetta@home allows you to define a runtime preference for tasks. If you have recently changed this, BOINC will need to process a few tasks before it understands how to revise the estimates to be more accurate.

The maximum runtime possible would be if you configured your preference up to the 24hr maximum, and then had a task that happened to run long and required the watchdog to step in and complete the task. This occurs 4 hours after your preferred runtime. So 28hrs would be the most CPU time you should see a task use.
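
The same rule can be written as a tiny helper. This is a sketch only: the constant and function names are hypothetical, and the 24-hour cap and 4-hour watchdog margin are the values stated in this post rather than anything read from the Rosetta@home code.

```python
MAX_PREFERENCE_HOURS = 24  # largest runtime preference, per the post above
WATCHDOG_GRACE_HOURS = 4   # watchdog steps in this long after the preferred runtime


def max_cpu_hours(preferred_hours):
    """Upper bound on CPU time for a task, given the runtime preference in hours."""
    if not 0 < preferred_hours <= MAX_PREFERENCE_HOURS:
        raise ValueError("runtime preference must be between 0 and 24 hours")
    return preferred_hours + WATCHDOG_GRACE_HOURS


if __name__ == "__main__":
    print(max_cpu_hours(24))  # worst case for the maximum preference
```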

So please try to not worry too much about the numbers when looking from one hour to the next. Instead try to look once per day to determine if it looks like there are any problems.
Rosetta Moderator: Mod.Sense
ID: 70782 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 70792 - Posted: 26 Jul 2011, 22:31:39 UTC

Hi.

Your systems seem to be on a go-slow today; I've got tasks that have been sitting pending for around an hour.

Not much work either.

Database status
State            Approximate #results
Ready to send    11

ID: 70792 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org