Default Run Time

Mike Gelvin
Joined: 7 Oct 05
Posts: 65
Credit: 10,612,039
RAC: 0
Message 16652 - Posted: 19 May 2006, 18:23:49 UTC

Web page for Default Run Time entry states:

Target CPU run time
(not selected defaults to 4 hours)

I have never chosen a run time, allowing the project to choose what might be best for me, so I assume the default (4 hours) is in play here. However, it actually appears to be set at 3 hours across all my machines. Is this an error?

David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 16653 - Posted: 19 May 2006, 18:30:42 UTC

It does look like an error on the web page. I believe Rhiju may have switched the default to 3 hours. I will change the page to reflect that.

Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 16659 - Posted: 19 May 2006, 20:10:33 UTC

So THAT's why so many people's WUs run in 10,000 seconds! Here I had assumed it was just common (REALLY REALLY common) to hit that point and calculate that the next model would take more than an hour.
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/

neil.hunter14
Joined: 9 May 06
Posts: 10
Credit: 278,867
RAC: 0
Message 16681 - Posted: 20 May 2006, 8:11:32 UTC

I don't quite follow the logic of being able to change the CPU Run-Time. I set mine at 12 hours yesterday, and sure enough, the WU ran for about that length of time. Now I have it set to 2 hours. And the model runs for two hours.

My question is: Do I get more credit for longer run-times? If the CPU Run Time is too short, am I wasting part of the model, that will never then be computed?

Should I leave it stuck at 4 hours as the default?
Surely a slower PC will take longer to compute, and therefore the amount of number crunching my PC can do in an hour, might take 4 hours on an older P3 machine.

What is the reason for being able to change the run-time?

Neil.

anders n
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 16684 - Posted: 20 May 2006, 8:27:01 UTC - in response to Message 16681.  
Last modified: 20 May 2006, 8:27:40 UTC

This was created to keep crunchers on modems happy: fewer MB to download.

It can now also help keep the server happy, with fewer uploads/downloads per computer.

The project considers an 8-hour setting best for them, IF it works fine on your computers.

Anders n

tralala
Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 16688 - Posted: 20 May 2006, 10:21:17 UTC - in response to Message 16681.  

My question is: Do I get more credit for longer run-times? If the CPU Run Time is too short, am I wasting part of the model, that will never then be computed?
Neil.

You do not get more credit per hour for longer run times; the CPU Run Time setting does not affect the credit rate at all. For 12 hours you get 6 times the credit you get for 2 hours (on the same machine).
Each WU does as many models as possible, with unique starting positions, given the run-time preference. Since there is an effectively infinite number of possible starting positions, it does not alter the scientific output whether you do 6 times 10 models with 6 different WUs or one time 60 models with one WU. The only difference is bandwidth consumption. The shorter run times are available for those who like shorter WUs and as a safety net for failing WUs (with the watchdog now implemented, this is probably no longer of much importance).

Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 16702 - Posted: 20 May 2006, 14:48:45 UTC
Last modified: 20 May 2006, 14:51:34 UTC

What others have noted here is basically correct. All of the Rosetta work units of the same name are identical except for a random number that determines where the processing will begin. That is almost unique to Rosetta. For example at SETI and Einstein each Work unit is unique. At CPDN each model type is identical, but the run time parameters are different. So a user adjustable run time is not practical at those projects.

Credits for all BOINC projects can be computed down to a specific number of credits for a certain number of hours of CPU time. So for any particular work unit the credits per hour are the same no matter how many hours it runs.

All work units will produce at least one model, no matter what your time setting is. So if a particular model type takes 4 hours to create the first model, then it will run at least 4 hours no matter what your time setting is. But if your time setting is 24 hours, then that same work unit will run 24 hours and create 6 models.

What confuses people is that if your time setting was say 6 hours for that same work unit, it would still only produce 1 model and complete in four hours. This is because the work units work in increments of whole models. They will not cut off half way through a model unless there is some kind of problem. So the time setting is actually approximate. This means that for work units that have long model creating intervals the accuracy of the setting can be way off (by as much as one whole model creating time interval). For work units that produce fast models (some can make a model in less than 5 min), the accuracy of the run time setting can be very high.
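
A rough way to picture the whole-model rule described above is the small Python sketch below. It is only an illustration under stated assumptions, not Rosetta's actual code: the function names are made up, and it assumes every model in a work unit takes the same fixed number of minutes.

```python
def models_completed(target_minutes: int, minutes_per_model: int) -> int:
    """Number of whole models produced under the toy rule described above."""
    # Assumption: at least one model is always produced, even if it runs
    # past the target. Beyond that, only models that fit entirely inside
    # the target run time are started.
    return max(1, target_minutes // minutes_per_model)


def actual_run_minutes(target_minutes: int, minutes_per_model: int) -> int:
    """CPU time actually used: always a whole number of models."""
    return models_completed(target_minutes, minutes_per_model) * minutes_per_model


if __name__ == "__main__":
    # A work unit whose models take 4 hours (240 minutes) each.
    for target_hours in (2, 6, 24):
        n = models_completed(target_hours * 60, 240)
        used = actual_run_minutes(target_hours * 60, 240) / 60
        print(f"target {target_hours:>2} h -> {n} model(s), about {used:.0f} h of CPU time")
```

With 4-hour models, the 2-hour and 6-hour settings both end up running about 4 hours, while the 24-hour setting yields 6 models. With 5-minute models the run lands within 5 minutes of the target, which is why the setting is far more accurate for fast work units.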

The time setting allows users to determine how many work units they will download over a certain period of time. This effectively allows them to determine how much network bandwidth they will use over time. For modem users, and people who are charged by how much data passes through their network connection, this is very important, particularly if they are running a lot of machines.
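
The bandwidth trade-off can be put into rough numbers with a short Python sketch. Everything here is an assumption for illustration: the function names are hypothetical, the 3 MB per work unit is an invented figure (real sizes vary), and the machine is assumed to crunch 24 hours a day with each work unit running exactly its target time.

```python
def work_units_per_day(run_hours: float, cores: int = 1) -> float:
    """Work units finished per day if each one runs for run_hours."""
    return cores * 24.0 / run_hours


def daily_transfer_mb(run_hours: float, mb_per_work_unit: float, cores: int = 1) -> float:
    """Approximate data moved per day, given a hypothetical size per work unit."""
    return work_units_per_day(run_hours, cores) * mb_per_work_unit


if __name__ == "__main__":
    # Hypothetical 3 MB per work unit; not a measured Rosetta figure.
    for hours in (2, 4, 8, 12):
        mb = daily_transfer_mb(hours, 3.0)
        print(f"{hours:>2} h run time -> about {mb:.0f} MB per day per core")
```

Under these assumptions, going from a 2-hour to a 12-hour setting cuts the daily transfer to one sixth, while the total CPU time donated stays the same.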

For the rest of us it is not really important, except to reduce the work for the Rosetta servers. Time settings between 8 and 12 hours produce a nice balance between providing timely results, and impact on the servers.

There is some slight advantage in the longer run time in terms of team competitions. The longer run time and the correspondingly higher credit awarded for each returned result, does produce a slightly higher RAC. But this averages out over longer time periods. This is just an artifact of the way RAC is calculated by the BOINC world.

Moderator9
ROSETTA@home FAQ
Moderator Contact

neil.hunter14
Joined: 9 May 06
Posts: 10
Credit: 278,867
RAC: 0
Message 16723 - Posted: 20 May 2006, 19:12:50 UTC


tralala
Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 16724 - Posted: 20 May 2006, 20:31:23 UTC - in response to Message 16723.  


dcdc
Joined: 3 Nov 05
Posts: 1831
Credit: 119,621,870
RAC: 11,205
Message 16828 - Posted: 22 May 2006, 12:06:57 UTC
Last modified: 22 May 2006, 12:09:27 UTC

Without wanting to drag this thread off-topic, regarding the effects of cache size, FSB, CPU core, etc. on Rosetta:

Is it possible to run the same job, with the same seed, on two computers to get a comparison? If so, how is this done?

Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 16836 - Posted: 22 May 2006, 15:13:06 UTC

Yes! Rosetta has a property that can be set to establish the seed. Unfortunately I don't recall the property name nor the file that contains it.

Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 16839 - Posted: 22 May 2006, 15:25:19 UTC - in response to Message 16828.  

I just had a long exchange with Rhiju on this subject a few days ago. The answer is Yes it is possible to do what you describe, and they run this test on a regular schedule.

In anticipation of your next question: the first model is duplicated in both runs if the random number is the same at the start. However, in subsequent models the work unit processing will diverge to some degree, because the small influences of the hydrogen atoms in the model are not considered, and these differences become cumulative as processing proceeds.

To bring it back on topic a bit: clearly longer run times might have an effect in comparisons of this type, but the goal here is to produce as many plausible structures as possible, so this divergence is not a bad thing, and longer run times help. I am informed that these differences are growing smaller as the software improves. Obviously the best situation would be if all work units reached the same conclusion, and that conclusion was correct for the particular protein. It is heading in that direction, but there is a way to go yet.

Moderator9

Mike Gelvin
Joined: 7 Oct 05
Posts: 65
Credit: 10,612,039
RAC: 0
Message 16870 - Posted: 22 May 2006, 21:04:44 UTC - in response to Message 16839.  

I'm still trying to understand what this means. It appears to mean that subsequent models are not "whole new attempts". If a model gets generated and has a terrible energy (not sure what that means)... then why continue looking in that neighborhood? Wouldn't it be better either to look in a whole new place each time (using a previous analogy), or, if the first attempt is not as good as some "X", to stop trying near there? This "X" could be fed with the work unit and be feedback from other models that have started to "zero in" on the answer. Not sure I'm making sense here, just some questions.

Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 16872 - Posted: 22 May 2006, 21:21:27 UTC - in response to Message 16870.  

Yes, you are making sense, and you are on the right track. Every work unit of a given type is identical. The only thing different is the pseudo-random number that determines the starting point of the processing. If two work units of the same type have the same random number, they will generate the same first model. They will use information from this first model when they start the next, but there are subtle differences (caused by the hydrogen atoms in the protein), and these small differences will provide a slightly different direction for the second and subsequent models.

However, at each model start, what was learned in terms of energy level is applied to the next model.

Taking some liberties with the information for brevity's sake, the energy of a structure is simply a measure of the total energy required for the structure to hold its shape. This is a function of the chemical bonds between the amino acids that make up the protein. In some cases the forces are pushing other amino acids away from the structure; in other cases they are attracting them. But the idea is that the energy you see in the graphic is a measure of the total energy within the structure. You could almost think of it as a measurement of the horsepower required for the protein to hold its shape. In nature the most efficient shapes require the least power to maintain, and that is why natural structures are always found at the lowest energy levels. That in turn is why Rosetta is looking for those low energy levels.

Moderator9

Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 16876 - Posted: 22 May 2006, 21:56:57 UTC - in response to Message 16870.  

Basically, each new model run IS a new start. It gets a different random number and takes a new perspective of looking at the protein. The Moderator was addressing the question posed about running through an identical WU with an identical random number, because they wanted to get an accurate, reproducible benchmark to measure. But this isn't what is happening by default. Your model runs will each be different.
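
The role of the random number can be illustrated with a toy Python sketch. This is not how Rosetta sets its seed (the real property name is not given in this thread); `toy_model` is a made-up stand-in that just draws a few pseudo-random numbers, to show that an identical seed reproduces a run exactly while a different seed gives a different one.

```python
import random


def toy_model(seed: int, steps: int = 5) -> list:
    """Hypothetical stand-in for one model run: a short pseudo-random trajectory."""
    rng = random.Random(seed)  # private generator, so runs do not interfere
    return [rng.random() for _ in range(steps)]


if __name__ == "__main__":
    same_a = toy_model(seed=1234)
    same_b = toy_model(seed=1234)
    other = toy_model(seed=5678)
    print("same seed, identical run:", same_a == same_b)
    print("different seed, identical run:", same_a == other)
```

Fixing the seed is what makes a benchmark comparison between two machines meaningful; in normal crunching each model run starts from a different number.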

As for "don't try anymore near here"... I believe that sort of logic is built into the algorithm as it makes each model. But there are cases where what starts out looking like a really terrible model suddenly "drops into a deep well" and looks really good. It's like the landscape and elevation analogy Dr. Baker uses: you climb a huge volcano, and it's looking worse and worse every step of the way... then suddenly you drop into the crater and find it's lower (in energy, i.e. good) than the base of the volcano where you started. In other words, it was worth the climb to discover it!
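
The volcano picture can be sketched in a few lines of Python. This is a toy under loud assumptions, not Rosetta's search algorithm: the one-dimensional `landscape` list and both function names are invented for illustration. It only shows why a rule of "never go uphill" would miss the crater.

```python
# Invented 1-D energy landscape: base of the volcano at index 0,
# the rim at index 3, and the crater floor at index 6.
landscape = [5, 6, 8, 11, 9, 4, 1, 3, 10]


def greedy_descent(energies, start=0):
    """Walk downhill only; stop as soon as no neighbour is lower."""
    pos = start
    while True:
        neighbours = [i for i in (pos - 1, pos + 1) if 0 <= i < len(energies)]
        best = min(neighbours, key=lambda i: energies[i])
        if energies[best] >= energies[pos]:
            return pos  # stuck: every direction looks worse
        pos = best


def walk_and_remember(energies, start=0):
    """Keep walking to the right, uphill or not, remembering the lowest point seen."""
    best = start
    for i in range(start, len(energies)):
        if energies[i] < energies[best]:
            best = i
    return best


if __name__ == "__main__":
    g = greedy_descent(landscape)
    w = walk_and_remember(landscape)
    print(f"downhill-only walker stops at index {g}, energy {landscape[g]}")
    print(f"climbing walker finds index {w}, energy {landscape[w]}")
```

The downhill-only walker never leaves the base, while the walker that accepts the climb finds the lower crater floor. The hard part, as the post says, is knowing ahead of time which climbs are worth it.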

If they could find a rule to identify accurately, ahead of time, when it will be worth the climb and when it's a waste of time, they would build that logic into the program. This is the sort of thing they're working towards all the time. This is the hope of the whole project: that rules of this sort can be devised which allow the same answer to be discovered with fewer and fewer model runs over time.

As for your idea of immediate feedback used as guidance to future model runs, I believe to some extent they do that at the server level as they devise some of the new WUs. I haven't seen much detail on it though.



©2024 University of Washington
https://www.bakerlab.org