Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Dr Who Fan Joined: 28 May 06 Posts: 79 Credit: 273,880 RAC: 361 |
Upload/Download SERVER(s) appear to be off-line again, but the server status page is all green. 11/23/2024 16:25:35 Internet access OK - project servers may be temporarily down. |
Bryn Mawr Joined: 26 Dec 18 Posts: 398 Credit: 12,294,748 RAC: 9,249 |
ERROR: Error in protocols::cyclic_peptide_predict::SimpleCycpepPredictApplication::set_up_n_to_c_cyclization_mover() function: residue 1 does not have a LOWER_CONNECT. Unfortunately, one of the usual ones. Yes, presumably a definition error for the molecule being tested. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1722 Credit: 18,356,357 RAC: 25,250 |
"Upload/Download SERVER(s) appear to be off-line again but the server status page is all green" I'm not having any issues at all. Grant Darwin NT |
Dr Who Fan Joined: 28 May 06 Posts: 79 Credit: 273,880 RAC: 361 |
"Upload/Download SERVER(s) appear to be off-line again but the server status page is all green" "I'm not having any issues at all." Did a manual retry a few minutes ago and they downloaded successfully. |
Sid Celery Joined: 11 Feb 08 Posts: 2137 Credit: 41,518,559 RAC: 15,775 |
"Upload/Download SERVER(s) appear to be off-line again but the server status page is all green" "I'm not having any issues at all." I didn't see it here at Rosetta, but for 7 or 10 days it was happening to everyone at WCG, and each of 6 files per upload needed 5-10 tries on tasks that uploaded and downloaded 4-6 times as often. If anything happened at Rosetta in that time, it was lost among 40 files waiting to transfer to WCG at any one time. |
Bryn Mawr Joined: 26 Dec 18 Posts: 398 Credit: 12,294,748 RAC: 9,249 |
Currently experiencing transient HTTPS errors on probably half of the downloads, and this has been going on for maybe 4 days. Some downloads have taken 15 retries to clear. |
Sid Celery Joined: 11 Feb 08 Posts: 2137 Credit: 41,518,559 RAC: 15,775 |
"I didn't see it here at Rosetta, but for 7 or 10 days it was happening to everyone at WCG and each of 6 files per upload needed 5-10 tries on tasks that uploaded and downloaded 4-6 times as often." It's weird that I'm just as susceptible as anyone else to those errors coming from WCG, but don't see any here at Rosetta. The only solution I know is manually retrying for as long as it takes. |
mmonnin Joined: 2 Jun 16 Posts: 61 Credit: 25,390,629 RAC: 47,239 |
I have to retry all the time to download tasks here at Rosetta, which is something new for Rosetta. Some retries work on the 1st attempt and others won't download after a dozen attempts. I've even aborted a task to download more work, and those new ones will download. It's typically the smaller files from Rosetta that need retries. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1722 Credit: 18,356,357 RAC: 25,250 |
Still no signs of file transfer issues in my Event log; sounds like there is some sort of network issue between ISPs. Grant Darwin NT |
Bill Swisher Joined: 10 Jun 13 Posts: 38 Credit: 34,878,935 RAC: 105,916 |
"transient http errors" As a snowbird, I relocated earlier this month. Between the time I shut down one computer, packed this one in my checked baggage, and turned it on at the new location, WCG started giving me errors. LOTS of errors on multiple computers. Thinking it was because I switched ISPs, I diddled around with it a lot before I did some real testing. First I fired off the VPN and used a place in Europe as my gateway: no change. Then I really got serious. I did an ssh connect to the computers back where I live most of the time. They were clogged up also. After about a week and a half things settled down and traffic to WCG went to normal. Then Rosetta hiccuped a few times. At the moment all seems to be OK. |
JLDun Joined: 31 May 08 Posts: 8 Credit: 73,164 RAC: 123 |
I was getting "transient" errors yesterday (on a phone, using Google Fiber for WiFi). |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1722 Credit: 18,356,357 RAC: 25,250 |
Server Status is showing all green, but there is a Validation backlog starting to build up again... Edit- and I've lost 2 Tasks to a Validation error. Grant Darwin NT |
Sid Celery Joined: 11 Feb 08 Posts: 2137 Credit: 41,518,559 RAC: 15,775 |
"Server Status is showing all green, but there is a Validation backlog starting to build up again..." It didn't take too much longer: boinc-process is now being reported as down again. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1722 Credit: 18,356,357 RAC: 25,250 |
Another small batch of Rosetta Tasks (20,000), boinc-process host is still dead. Grant Darwin NT |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1722 Credit: 18,356,357 RAC: 25,250 |
And boinc-process host lives again, backlog mostly cleared. Till the next time. Grant Darwin NT |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1722 Credit: 18,356,357 RAC: 25,250 |
And another small batch of work released (25,000). Current group of Tasks using 800MB to 1.2GB of RAM each, so very low RAM systems could be having problems. Grant Darwin NT |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1722 Credit: 18,356,357 RAC: 25,250 |
1.7 million Tasks ready to send- hopefully there won't be too many that die within seconds of starting. Grant Darwin NT |
Sid Celery Joined: 11 Feb 08 Posts: 2137 Credit: 41,518,559 RAC: 15,775 |
"1.7 million Tasks ready to send - hopefully there won't be too many that die within seconds of starting." I wasn't expecting that. The first downloads I had were 7-8hrs ago, so I'm clearing everything else down to make space for a full Rosetta cache. |
Jean-David Beyer Joined: 2 Nov 05 Posts: 195 Credit: 6,613,600 RAC: 9,094 |
The ones I get that fail -- even before starting -- are canceled by the server. They send a task to someone who has not finished and not failed. Then they send me one of the same thing. Then the first person completes the work unit. Then they cancel me. This is a rude process. They should not send me a task if they are still waiting for the first user to complete. Workunit 1415251024: name rb_11_29_646130_639665__t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_3010044_217; application Rosetta; created 29 Nov 2024, 10:19:10 UTC; canonical result 1590758128; granted credit 339.97; minimum quorum 1; initial replication 1; max # of error/total/success tasks 1, 2, 1. Task 1590758128: computer 3773674; sent 29 Nov 2024, 10:20:20 UTC; time reported or deadline 2 Dec 2024, 11:52:45 UTC; status Completed and validated; run time 19,549.58 sec; CPU time 19,469.61 sec; credit 339.97; application Rosetta v4.20 windows_x86_64. Task 1590852975: computer 5910575; sent 2 Dec 2024, 10:20:27 UTC; time reported or deadline 2 Dec 2024, 12:15:32 UTC; status Cancelled by server; run time 0.00 sec; CPU time 0.00 sec; credit ---; application Rosetta v4.20 x86_64-pc-linux-gnu. |
robertmiles Joined: 16 Jun 08 Posts: 1233 Credit: 14,338,560 RAC: 2,994 |
"The ones I get that fail -- even before starting -- are canceled by the server. They send a task to someone who has not finished and not failed. Then they send me one of the same thing. Then the first person completes the work unit. Then they cancel me. This is a rude process. They should not send me a task if they are still waiting for the first user to complete." Would you prefer that they cancel workunits after they start? They obviously want one copy to finish soon, and may not have information on whether the first one will ever finish. I think, though, that they will let the second one finish and give it credit if it starts before the first one finishes. |
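
The timestamps quoted for workunit 1415251024 can be lined up to see the order of events. The sketch below only does date arithmetic on the four times given in that post. The three-day deadline is an assumption inferred from the gap between the two send times, since the thread never states the deadline, and the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Timestamp layout as it appears in the quoted workunit listing.
FMT = "%d %b %Y, %H:%M:%S UTC"

def parse_utc(text):
    # Parse a listing timestamp and mark it explicitly as UTC.
    return datetime.strptime(text, FMT).replace(tzinfo=timezone.utc)

# Values copied from the post about workunit 1415251024.
first_sent = parse_utc("29 Nov 2024, 10:20:20 UTC")      # task 1590758128
first_reported = parse_utc("2 Dec 2024, 11:52:45 UTC")   # completed and validated
second_sent = parse_utc("2 Dec 2024, 10:20:27 UTC")      # task 1590852975
second_reported = parse_utc("2 Dec 2024, 12:15:32 UTC")  # listed as cancelled by server

# ASSUMPTION: a three-day deadline, guessed from the data, not stated in the thread.
ASSUMED_DEADLINE = timedelta(days=3)

gap_before_resend = second_sent - first_sent        # wait before the second copy went out
resend_to_report = first_reported - second_sent     # second copy outstanding before first result arrived
resent_after_assumed_deadline = gap_before_resend >= ASSUMED_DEADLINE

print("Second copy sent after:", gap_before_resend)
print("First result arrived after the resend by:", resend_to_report)
print("Resend happened after the assumed deadline:", resent_after_assumed_deadline)
```

If the assumed deadline is right, the second copy went out only once the first task was already overdue, and the first host then reported about an hour and a half later, which would fit the cancellation described above.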
©2024 University of Washington
https://www.bakerlab.org