Short Deadlines

Message boards : Number crunching : Short Deadlines

James W

Joined: 25 Nov 12
Posts: 130
Credit: 1,766,254
RAC: 0
Message 80120 - Posted: 27 May 2016, 5:40:01 UTC

Is there a reason the WUs starting with rb_05_26 (rb_month_day) have such short turnaround times (2-3 days)? I have to keep an eye on this so as not to overwork my system.
ID: 80120 · Rating: 0

dcdc
Joined: 3 Nov 05
Posts: 1831
Credit: 119,617,765
RAC: 11,361
Message 80121 - Posted: 27 May 2016, 12:45:40 UTC - in response to Message 80120.  

Is there a reason the WUs starting with rb_05_26 (rb_month_day) have such short turnaround times (2-3 days)? I have to keep an eye on this so as not to overwork my system.


CASP has started, so they're pumping jobs through as quickly as possible:

We are still in the server stage of CASP and so jobs are run through the robetta platform which gives the jobs a rb_* prefix. The CASP jobs have been running with this prefix.

from here: https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6822

D
ID: 80121 · Rating: 0

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 80122 - Posted: 27 May 2016, 13:25:52 UTC
Last modified: 27 May 2016, 13:26:03 UTC

...the good news is that it is not a large number of tasks pulling the short deadlines. Otherwise, systems with large work caches can make another request for work and get a pile that all insert themselves in front of the others, and start running at "high priority". Seems to go OK so long as there are only a small fraction with short deadlines, and the cache is not more than a few days.
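As a rough illustration of why a batch of short-deadline tasks can jump the queue, here is a minimal Python sketch that orders a work cache by earliest deadline first. The task names and deadline values are made up for the example, and the real BOINC client scheduler is more involved than a single sort.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str             # hypothetical task name
    deadline_days: float  # days from now until the report deadline


def run_order(queue):
    """Order tasks so the one with the nearest deadline runs first."""
    return sorted(queue, key=lambda task: task.deadline_days)


# Two older tasks with long (assumed) deadlines, plus one new rb_* task
# with a 2-day deadline that arrived last.
queue = [
    Task("older_task_a", 10.0),
    Task("older_task_b", 9.0),
    Task("rb_05_26_example", 2.0),
]

order = [task.name for task in run_order(queue)]
print(order)
```

With only one short-deadline task in the cache the older work is delayed briefly. If most of the cache had short deadlines, the older tasks would keep getting pushed back.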
Rosetta Moderator: Mod.Sense
ID: 80122 · Rating: 0

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 80123 - Posted: 27 May 2016, 13:29:08 UTC

Each CASP target released over the coming months will have a very quick (3 days I think it is) deadline for submissions from "server" predictions, and then a longer period for "human" predictions. Some human predictions are aided by server predictions too.
Rosetta Moderator: Mod.Sense
ID: 80123 · Rating: 0

Timo
Joined: 9 Jan 12
Posts: 185
Credit: 45,649,459
RAC: 0
Message 80124 - Posted: 27 May 2016, 14:47:12 UTC
Last modified: 27 May 2016, 14:47:56 UTC

I don't mind short deadlines. In fact, I totally can't justify making a researcher wait many many days to start getting an answer to a query. How the hell can someone be expected to iterate effectively when the answers come back at such a slow pace?!

I work with databases and crunching of large datasets at my job, and I get mildly annoyed when my queries at work take more than a couple of hours because it makes iterating through questions very painful. I surely don't want to be the person making a query take a whole week. XD
ID: 80124 · Rating: 0

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 80125 - Posted: 27 May 2016, 19:22:54 UTC - in response to Message 80124.  

I don't mind short deadlines. In fact, I totally can't justify making a researcher wait many many days to start getting an answer to a query. How the hell can someone be expected to iterate effectively when the answers come back at such a slow pace?!

The champion is Climateprediction.net. They give you a year (no kidding). I, among others, have suggested that is too long. But they are resistant to change.

ID: 80125 · Rating: 0

Sid Celery
Joined: 11 Feb 08
Posts: 2124
Credit: 41,219,446
RAC: 10,842
Message 80127 - Posted: 28 May 2016, 2:35:51 UTC - in response to Message 80124.  

I don't mind short deadlines. In fact, I totally can't justify making a researcher wait many many days to start getting an answer to a query. How the hell can someone be expected to iterate effectively when the answers come back at such a slow pace?!

I work with databases and crunching of large datasets at my job, and I get mildly annoyed when my queries at work take more than a couple of hours because it makes iterating through questions very painful. I surely don't want to be the person making a query take a whole week. XD

They're planning for 2 days, so 2 days is what they set. If they wanted a result in 1 day, that's what they'd set. It's only a problem if they fail to get results back by the deadline. They need to bear in mind that task runs are defaulting to 6 hours too.

For those of us that adjust our task buffer, we should ensure we keep less than 2 days in hand during CASP so Boinc doesn't mess up on scheduling. For example, to account for runtime variation, I've dropped mine to a 1.5 day buffer.
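A back-of-the-envelope version of that buffer check, as a minimal Python sketch. It assumes the worst case is a new task waiting behind the whole buffer and then running for the full task runtime. That is a simplification for illustration, not how the BOINC client actually estimates deadlines, and the function name is hypothetical.

```python
def worst_case_turnaround_days(buffer_days, task_runtime_hours):
    """Worst case under the stated assumption: wait out the buffer, then run."""
    return buffer_days + task_runtime_hours / 24


def fits_deadline(buffer_days, task_runtime_hours, deadline_days):
    """True if the worst-case turnaround still meets the deadline."""
    return worst_case_turnaround_days(buffer_days, task_runtime_hours) <= deadline_days


# Figures from the thread: 2-day deadline, 6-hour default runtime.
for buffer_days in (1.5, 2.0):
    turnaround = worst_case_turnaround_days(buffer_days, 6)
    ok = fits_deadline(buffer_days, 6, 2)
    print(f"buffer {buffer_days} d -> worst case {turnaround} d, fits: {ok}")
```

Under these assumptions a 1.5 day buffer leaves a quarter of a day of slack for runtime variation, while a full 2 day buffer already overshoots the deadline.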
ID: 80127 · Rating: 0

dcdc
Joined: 3 Nov 05
Posts: 1831
Credit: 119,617,765
RAC: 11,361
Message 80131 - Posted: 28 May 2016, 13:12:21 UTC - in response to Message 80127.  

Is there any benefit to using the "report results immediately" setting in cc_config.xml given the desire for a quick turnaround?
ID: 80131 · Rating: 0

Link
Joined: 4 May 07
Posts: 356
Credit: 382,349
RAC: 0
Message 80134 - Posted: 28 May 2016, 22:10:58 UTC - in response to Message 80131.  

Is there any benefit to using the "report results immediately" setting in cc_config.xml given the desire for a quick turnaround?

No, this only adds unnecessary load to the servers.
ID: 80134 · Rating: 0

Sid Celery
Joined: 11 Feb 08
Posts: 2124
Credit: 41,219,446
RAC: 10,842
Message 80136 - Posted: 29 May 2016, 3:30:40 UTC - in response to Message 80127.  

For those of us that adjust our task buffer, we should ensure we keep less than 2 days in hand during CASP so Boinc doesn't mess up on scheduling. For example, to account for runtime variation, I've dropped mine to a 1.5 day buffer.

On this subject, I returned from my usual 3-4 days away to discover I'd had one of those "Rosetta Mini for Android is not available for your type of computer" messages which results in a 24hr delay in the next update attempt. I just got home in time to force a manual update and beat the task deadlines by a couple of hours when the next update was still several hours away. 20 tasks got uploaded.

What was decided on this? Is it a Boinc scheduling issue or something from the Rosetta servers that can be corrected? This is something that's more of a concern for people who don't monitor task progress. Where the default buffer is just 0.25 days there's a risk of running taskless for 18 hours - unless another project is available. Either way, it's not good for Rosetta or CASP.
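The 18 hour figure follows from simple arithmetic, sketched here in Python. It assumes the host has a full buffer when the 24 hour delay starts and no other project to fall back on. The function name is hypothetical.

```python
def idle_hours(request_delay_hours, buffer_days):
    """Hours with no work: the delay minus the work already buffered."""
    return max(0.0, request_delay_hours - buffer_days * 24)


# Default 0.25 day buffer against a 24 hour delay.
print(idle_hours(24, 0.25))
# A 1.5 day buffer covers the whole delay.
print(idle_hours(24, 1.5))
```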

ID: 80136 · Rating: 0

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 80138 - Posted: 29 May 2016, 4:35:13 UTC - in response to Message 80136.  

What was decided on this? Is it a Boinc scheduling issue or something from the Rosetta servers that can be corrected? This is something that's more of a concern for people who don't monitor task progress. Where the default buffer is just 0.25 days there's a risk of running taskless for 18 hours - unless another project is available. Either way, it's not good for Rosetta or CASP.


The servers are so busy delivering work that there are brief periods where they do not have work units ready to deliver. These periods end as soon as the active tasks preparing work have more ready, so there is a race condition for new work. If your machine happens to catch a few in a row when no prepared work is ready, the BOINC Manager backoffs double and quickly reach 24hrs.
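To put numbers on a doubling backoff, here is a simplified Python model. The 60 second starting delay is an assumption chosen for illustration, and the actual BOINC backoff policy is not spelled out in this thread. Only the 24 hour cap comes from the discussion.

```python
def backoff_delays(initial_s, cap_s):
    """Delay after each consecutive failed request, doubling until capped."""
    delays = []
    delay = initial_s
    while delay < cap_s:
        delays.append(delay)
        delay *= 2
    delays.append(cap_s)  # every later failure stays at the cap
    return delays


# Assumed 60 s starting delay, capped at 24 hours (86400 s).
delays = backoff_delays(60, 86400)
print(delays)
print(len(delays), "consecutive failures to reach the cap")
```

In this model it takes twelve failed requests in a row before the delay reaches 24 hours, so a 24 hour delay after a single failed request would point to a different cause.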
Rosetta Moderator: Mod.Sense
ID: 80138 · Rating: 0

Link
Joined: 4 May 07
Posts: 356
Credit: 382,349
RAC: 0
Message 80139 - Posted: 29 May 2016, 8:13:49 UTC - in response to Message 80138.  

If your machine happens to catch a few in a row when no prepared work is ready, the BOINC Manager backoffs double and quickly reach 24hrs.

This is not the "normal" BOINC backoff, it's 24 hours after the first failed request. It would be good if people getting this could post their sched_reply_boinc.bakerlab.org_rosetta.xml (maybe without <email_hash> and <cross_project_id>). Most interesting would be <request_delay> for the start.
ID: 80139 · Rating: 0

Sid Celery
Joined: 11 Feb 08
Posts: 2124
Credit: 41,219,446
RAC: 10,842
Message 80140 - Posted: 29 May 2016, 20:33:46 UTC - in response to Message 80139.  

If your machine happens to catch a few in a row when no prepared work is ready, the BOINC Manager backoffs double and quickly reach 24hrs.

This is not the "normal" BOINC backoff, it's 24 hours after the first failed request. It would be good if people getting this could post their sched_reply_boinc.bakerlab.org_rosetta.xml (maybe without <email_hash> and <cross_project_id>). Most interesting would be <request_delay> for the start.

This is my complete file, minus the two fields you mentioned.

Request delay shows as 242.4 which looks to be the 4 minutes or so that shows whenever I normally return results.

I agree, the 24hr back-off does appear to be after the first failed request.
<scheduler_reply>
<scheduler_version>605</scheduler_version>
<master_url>https://boinc.bakerlab.org/rosetta/</master_url>
<request_delay>242.400000</request_delay>
<project_name>rosetta@home</project_name>
<symstore>https://boinc.bakerlab.org/rosetta/symstore</symstore>
<user_name>Sid Celery</user_name>
<user_total_credit>7599263.984229</user_total_credit>
<user_expavg_credit>4990.702189</user_expavg_credit>
<user_create_time>1202737705</user_create_time>
<project_preferences>
<resource_share>2900</resource_share>
<project_specific>
<max_fps>0</max_fps>
<max_cpu>0</max_cpu>
<cpu_run_time>28800</cpu_run_time>
</project_specific>
</project_preferences>

<host_total_credit>3863834.506854</host_total_credit>
<host_expavg_credit>3812.011345</host_expavg_credit>
<host_venue></host_venue>
<host_create_time>1371356722</host_create_time>
<team_name>TheChels</team_name>
<verify_files_on_app_start/>
<gui_urls>
<gui_url>
<name>FoldIt!</name>
<description>Want to play a game?</description>
<url>http://fold.it</url>
</gui_url>
<gui_url>
<name>Science of Rosetta</name>
<description>An overview of the basic science behind Rosetta@home</description>
<url>https://boinc.bakerlab.org/rosetta/rah_education/</url>
</gui_url>
<gui_url>
<name>Message boards</name>
<description>Correspond with other users on the Rosetta@home message boards</description>
<url>https://boinc.bakerlab.org/rosetta/forum_index.php</url>
</gui_url>
<gui_url>
<name>Help</name>
<description>Ask questions and report problems</description>
<url>https://boinc.bakerlab.org/rosetta/forum_help_desk.php</url>
</gui_url>
<gui_url>
<name>Your account</name>
<description>View your account information and credit totals</description>
<url>https://boinc.bakerlab.org/rosetta/show_user.php?userid=241409</url>
</gui_url>
<gui_url>
<name>Your preferences</name>
<description>View and modify your Rosetta@home account profile and preferences</description>
<url>https://boinc.bakerlab.org/rosetta/home.php</url>
</gui_url>
<gui_url>
<name>Your results</name>
<description>View your last week (or more) of computational results and work</description>
<url>https://boinc.bakerlab.org/rosetta/results.php?userid=241409</url>
</gui_url>
<gui_url>
<name>Your computers</name>
<description>View a listing of all the computers on which you are running Rosetta@home</description>
<url>https://boinc.bakerlab.org/rosetta/hosts_user.php?userid=241409</url>
</gui_url>
<ifteam>
<gui_url>
<name>Team</name>
<description>Info about TheChels</description>
<url>https://boinc.bakerlab.org/rosetta/team_display.php?teamid=7844</url>
</gui_url>
</ifteam>
</gui_urls>
</scheduler_reply>


ID: 80140 · Rating: 0

Link
Joined: 4 May 07
Posts: 356
Credit: 382,349
RAC: 0
Message 80141 - Posted: 30 May 2016, 15:03:23 UTC - in response to Message 80140.  

@Sid Celery: is that the reply that caused the 24 hour delay? Probably not, as far as I can see... because that's the one that would be interesting, not any other one (yes, I should have written that more clearly).
ID: 80141 · Rating: 0

Sid Celery
Joined: 11 Feb 08
Posts: 2124
Credit: 41,219,446
RAC: 10,842
Message 80142 - Posted: 31 May 2016, 0:36:46 UTC - in response to Message 80141.  

@Sid Celery: is that the reply that caused the 24 hour delay? Probably not, as far as I can see... because that's the one that would be interesting, not any other one (yes, I should have written that more clearly).

Oh. So does it change for each upload? If so, I'll try to seek it out next time it happens (if it happens again).

I may still be misunderstanding your question - I'm not good at this.

Fwiw I'm adding one or two cores to processing tasks on my smartphone just to use up a few extra of those available tasks (as if it'll make any noticeable difference)
ID: 80142 · Rating: 0

Sid Celery
Joined: 11 Feb 08
Posts: 2124
Credit: 41,219,446
RAC: 10,842
Message 80143 - Posted: 31 May 2016, 2:40:29 UTC - in response to Message 80141.  

@Sid Celery: is that the reply that caused the 24 hour delay? Probably not, as far as I can see... because that's the one that would be interesting, not any other one (yes, I should have written that more clearly).

Ok, it just happened again, and as you suspected, request_delay now shows 86400 = 24hrs:
<scheduler_version>605</scheduler_version>
<master_url>https://boinc.bakerlab.org/rosetta/</master_url>
<request_delay>86400.000000</request_delay>
<message priority="high">No work sent</message>
<message priority="high">Rosetta Mini for Android is not available for your type of computer.</message>
<project_name>rosetta@home</project_name>
<symstore>https://boinc.bakerlab.org/rosetta/symstore</symstore>
<user_name>Sid Celery</user_name>
<user_total_credit>7606484.010862</user_total_credit>
<user_expavg_credit>5086.039514</user_expavg_credit>
<user_create_time>1202737705</user_create_time>
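For anyone who wants to check their own reply file, here is a minimal Python sketch that pulls out request_delay and any server messages. It assumes the file parses as well-formed XML with a scheduler_reply root. The sample below is trimmed from the values posted in this thread, and summarize_reply is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed sample built from the fields shown in this thread.
SAMPLE_REPLY = """<scheduler_reply>
<scheduler_version>605</scheduler_version>
<request_delay>86400.000000</request_delay>
<message priority="high">No work sent</message>
<message priority="high">Rosetta Mini for Android is not available for your type of computer.</message>
<project_name>rosetta@home</project_name>
</scheduler_reply>"""


def summarize_reply(xml_text):
    """Return (request_delay in seconds, list of (priority, message text))."""
    root = ET.fromstring(xml_text)
    delay_s = float(root.findtext("request_delay", default="0"))
    messages = [
        (msg.get("priority"), (msg.text or "").strip())
        for msg in root.findall("message")
    ]
    return delay_s, messages


delay_s, messages = summarize_reply(SAMPLE_REPLY)
print(f"request_delay: {delay_s:.0f} s = {delay_s / 3600:.1f} h")
for priority, text in messages:
    print(f"[{priority}] {text}")
```

To run it against a real file, read the contents of sched_reply_boinc.bakerlab.org_rosetta.xml from the BOINC data directory and pass the text to summarize_reply.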

ID: 80143 · Rating: 0

Link
Joined: 4 May 07
Posts: 356
Credit: 382,349
RAC: 0
Message 80144 - Posted: 31 May 2016, 6:55:16 UTC - in response to Message 80143.  

Well... then the project admins now know what needs to be fixed.
ID: 80144 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org