Christoph Jansen's formula of protein size and model completion times

Hoelder1in Joined: 30 Sep 05 Posts: 169 Credit: 3,915,947 RAC: 0
Message 24593 - Posted: 24 Aug 2006, 8:28:53 UTC - Last modified: 24 Aug 2006, 8:48:30 UTC

I copied this very interesting post by Christoph Jansen over from the Ralph forum because my comments would have been completely off-topic there:

"...I have done a check of some 20 Rosetta WUs and have found that the time to calculate one decoy is almost exactly proportional to the number of amino acids in the protein to the power of 1.3. My formula is

(number of amino acids)^1.3 * n(decoys) / time = const. (for a given machine)

It yields pretty good values that vary by an average of 2.3% around the median. I am still collecting numbers to compare, but the latest two samples I put in after adjusting the proportionality factor had 99.9% and 100.2% of the average 'work factor' for my machine. And the length of the proteins varies from 28 to 157 amino acids, which is a factor of nearly six in length."

I think it is reassuring that the model completion times only grow polynomially with the size of the protein, and even with such a small exponent. One might have feared that, since the size of the parameter space grows exponentially with the number of amino acids in a protein, so would the model completion times, in which case the dependence would be

<something>^<number of amino acids> * n / time = const,

or more conveniently, after taking the logarithm,

log(<something>) * <number of amino acids> + log(n/time) = const

So perhaps it is the number of required models (to reach a desired rmsd) that scales exponentially with the size of the protein, rather than the individual model completion times? Or perhaps it is not protein size but contact order (how often the chain touches itself in the folded state - I hope I am right about that) which determines how many models are needed? Well, you can't determine this from the data you have, Christoph, but I am sure the Baker lab have figured this out.

These scaling laws seem to be an excellent way to test the quality of the different algorithms (imagine that with your analysis you could determine that for one particular WU type the exponent is, say, 1.15 rather than 1.3...). This is all very interesting and thought-provoking (much more so than the credit stuff)!

Team betterhumans.com - discuss and celebrate the future - hoelder1in.org
ID: 24593 · Rating: 1
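A minimal sketch of how the "work factor" and the exponent in Christoph's formula could be checked against one's own numbers. The data below is synthetic: only the end points 28 and 157 come from the post, while the other lengths and the constant 0.5 are made up for illustration, and the function names are hypothetical. The exponent is estimated as the least-squares slope of log(time per decoy) against log(number of amino acids), which is one common way to fit a power law and not necessarily how Christoph did it.

```python
import math


def work_factor(n_residues, n_decoys, seconds, exponent=1.3):
    """Quantity from the post: N^exponent * n(decoys) / time."""
    return n_residues ** exponent * n_decoys / seconds


def fit_exponent(lengths, seconds_per_decoy):
    """Least-squares slope of log(time per decoy) versus log(length)."""
    xs = [math.log(n) for n in lengths]
    ys = [math.log(t) for t in seconds_per_decoy]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic, noise-free data (assumption): time per decoy = 0.5 * N^1.3 seconds.
lengths = [28, 60, 95, 120, 157]
seconds_per_decoy = [0.5 * n ** 1.3 for n in lengths]

fitted = fit_exponent(lengths, seconds_per_decoy)
factors = [work_factor(n, 1, t) for n, t in zip(lengths, seconds_per_decoy)]

print(f"fitted exponent: {fitted:.3f}")
print("work factors:", [round(f, 3) for f in factors])
```

With real measurements the work factors would scatter around their median instead of coinciding. To test the exponential alternative, one would regress log(time per decoy) on the number of amino acids itself rather than on its logarithm and compare the quality of the two fits.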

Christoph Jansen Joined: 6 Jun 06 Posts: 248 Credit: 267,153 RAC: 0
Message 24621 - Posted: 24 Aug 2006, 9:11:11 UTC - in response to Message 24593. Last modified: 24 Aug 2006, 9:11:56 UTC

"Well, you can't determine this from the data you have, Christoph, but I am sure the Baker lab have figured this out. These scaling laws seem to be an excellent way to test the quality of the different algorithms (imagine that with your analysis you could determine that for one particular WU type the exponent is, say, 1.15 rather than 1.3...)."

Hi Hoelder1in,

very interesting thoughts. Not at all what I had in mind, but I only did that calculation out of interest and was surprised it came out so straightforwardly simple. Maybe a system identification for various algorithms can shed some light on the topic.

Whatever it says, my intention was only to share that observation. After all I am a chemist, and we are mostly awfully bad at math (I always admired those who weren't), else we would have become physicists ;-) I would not be surprised if it leads to nothing and only expresses, in a different way, things the Baker team already knows.

Regards, Christoph
ID: 24621 · Rating: 0

Hoelder1in Joined: 30 Sep 05 Posts: 169 Credit: 3,915,947 RAC: 0
Message 24630 - Posted: 24 Aug 2006, 9:30:54 UTC - in response to Message 24621.

"After all I am a chemist..."

I guess I was awfully bad in the chemistry lab or else I might have become a chemist. ;-)
ID: 24630 · Rating: 0

Christoph Jansen Joined: 6 Jun 06 Posts: 248 Credit: 267,153 RAC: 0
Message 24646 - Posted: 24 Aug 2006, 9:53:29 UTC - in response to Message 24630.

"I guess I was awfully bad in the chemistry lab or else I might have become a chemist. ;-)"

I worked in the lab course for physicists and had a group of eight people each term that I would coach both practically and in theory. I loved it, and most of my colleagues loathed it because "them guys don't know anything". A pretty enervating attitude for both sides. I always learned something new or discovered connections that I should have noticed before but hadn't. Teaching is a great way of finding out whether you really understand what you are doing yourself.
ID: 24646 · Rating: 0

dcdc Joined: 3 Nov 05 Posts: 1836 Credit: 124,981,563 RAC: 0
Message 24740 - Posted: 24 Aug 2006, 16:37:21 UTC

Good work Hoelder1in ;)

From my understanding, it makes sense that the job times only increase proportionally with the number of amino acids as, in basic terms, the algorithm first tries the best fit for the first amino acid, and then moves along the chain to the next one. Each energy calculation for each amino acid will be the same process, but in bigger proteins there will be more of these to do.

I guess this is good news for the project as we only need to increase the total amount of CPU power exponentially, and not the power of individual computers!
ID: 24740 · Rating: 0

Hoelder1in Joined: 30 Sep 05 Posts: 169 Credit: 3,915,947 RAC: 0
Message 24743 - Posted: 24 Aug 2006, 16:51:38 UTC - in response to Message 24740.

"I guess this is good news for the project as we only need to increase the total amount of CPU power exponentially, and not the power of individual computers!"

On the other hand, the power of individual computers does increase exponentially with time (the 18-month doubling time of computing power described by Moore's Law)...
ID: 24743 · Rating: 0

dcdc Joined: 3 Nov 05 Posts: 1836 Credit: 124,981,563 RAC: 0
Message 24744 - Posted: 24 Aug 2006, 16:56:35 UTC - in response to Message 24743.

"On the other hand, the power of individual computers does increase exponentially with time (the 18-month doubling time of computing power described by Moore's Law)..."

It does, but wouldn't that only be OK if we were only doubling the number of amino acids every 18 months? ;) We can do them next week at this rate!
ID: 24744 · Rating: 0
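The two scaling statements in this exchange can be combined in a rough back-of-the-envelope sketch. It assumes, purely for illustration, that time per model is exactly proportional to N^1.3 (Christoph's observed exponent) and that single-machine speed doubles every 18 months; the function names and the starting length of 100 amino acids are hypothetical. It says nothing about how the number of models needed grows with protein size, which the thread leaves open.

```python
def length_doubling_months(speed_doubling_months=18.0, exponent=1.3):
    """Months until the protein length that fits a fixed per-model time doubles.

    With time ~ N^exponent / speed, holding time fixed gives
    N ~ speed^(1/exponent), so the doubling time is stretched by `exponent`.
    """
    return speed_doubling_months * exponent


def affordable_length(n0, months, speed_doubling_months=18.0, exponent=1.3):
    """Length reachable after `months` at the same per-model time as n0 today."""
    speedup = 2.0 ** (months / speed_doubling_months)
    return n0 * speedup ** (1.0 / exponent)


print(f"length doubling time: {length_doubling_months():.1f} months")
print(f"length after 36 months, starting from 100: {affordable_length(100, 36):.0f}")
```

Under these assumptions the per-model cost alone would let the protein length double slightly more slowly than computing power does, roughly every 18 * 1.3 months.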