Graphing the gap between known sequences and solved structures

Message boards : Rosetta@home Science : Graphing the gap between known sequences and solved structures

student_

Joined: 24 Sep 05
Posts: 34
Credit: 4,743,914
RAC: 896
Message 54094 - Posted: 1 Jul 2008, 0:48:22 UTC

Since a large part of the protein structure prediction problem is that structure determination is much slower than sequence discovery, I thought I'd try to graph the number of known sequences vs. time alongside the number of solved structures vs. time to visualize that trend.

The Protein Data Bank has a nice graph of protein structures vs. time. Unfortunately, I can't find any similar data for sequences vs. time.

For sequence data, the NCBI non-redundant (nr) protein database seems ideal off the top of my head, since it aggregates several large databases and removes identical sequences. The only way I know to get the number of sequences in the nr database is to BLAST a protein and look at the report header. Does anyone know an easier way to get that kind of high-level information from NCBI's databases?
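
For reference, here is a minimal sketch of the header-reading approach described above. It assumes a plain-text BLAST report whose database summary appears as a line of the form "N sequences; M total letters". The function name `parse_db_size` is hypothetical, not part of any NCBI tool, and the report format is an assumption that may not hold for every output style.

```python
import re

# Assumed format of the database summary line in a plain-text BLAST report:
#   "<count> sequences; <count> total letters"  (counts may contain commas)
_DB_LINE = re.compile(r"([\d,]+)\s+sequences;\s+([\d,]+)\s+total letters")


def parse_db_size(report_text):
    """Return (num_sequences, total_letters) from a BLAST text report.

    Returns None if no database summary line is found.
    """
    match = _DB_LINE.search(report_text)
    if match is None:
        return None
    # Strip thousands separators before converting to integers.
    num_sequences, total_letters = (
        int(group.replace(",", "")) for group in match.groups()
    )
    return num_sequences, total_letters
```

This only gives the database size at the moment of the search, so it yields one point per query rather than a history of sequences vs. time.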




©2024 University of Washington
https://www.bakerlab.org