Rosetta WU delivery out of control

Message boards : Number crunching : Rosetta WU delivery out of control

Admin
Project administrator

Joined: 1 Jul 05
Posts: 4805
Credit: 0
RAC: 0
Message 95967 - Posted: 3 May 2020, 23:24:30 UTC

I updated the scheduler, so please post here if you continue to have issues with crazy cache sizes. The scheduler now uses the CPU run time preference as the estimated job duration when determining how many jobs to send to a host. Hopefully this will improve the job cache size.
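
To illustrate the idea for readers, here is a minimal sketch of how an estimated job duration feeds into the number of jobs sent to a host. This is not the project's scheduler code: the function name, its parameters, and the simple "fill the work buffer" rule are all assumptions made for the example. The only detail taken from the post is that the CPU run time preference is used as the estimated duration of one job.

```python
import math


def jobs_to_send(cache_days, n_cpus, est_job_hours, queued_cpu_hours=0.0):
    """Hypothetical estimate of how many jobs fill a host's work buffer.

    cache_days        -- days of work the host asks to keep on hand (assumed)
    n_cpus            -- CPU cores the host makes available (assumed)
    est_job_hours     -- estimated duration of one job; per the post, the
                         scheduler now uses the CPU run time preference here
    queued_cpu_hours  -- estimated CPU hours already queued on the host (assumed)
    """
    if est_job_hours <= 0:
        raise ValueError("est_job_hours must be positive")
    # Total CPU hours needed to keep every core busy for the whole buffer.
    wanted_cpu_hours = cache_days * 24.0 * n_cpus
    # Only request enough to cover what is not already queued.
    shortfall = max(0.0, wanted_cpu_hours - queued_cpu_hours)
    return math.ceil(shortfall / est_job_hours)


# A 4-core host asking for 1 day of work, with an 8-hour run time preference.
print(jobs_to_send(cache_days=1.0, n_cpus=4, est_job_hours=8.0))
# The same request if each job were estimated at only 1 hour.
print(jobs_to_send(cache_days=1.0, n_cpus=4, est_job_hours=1.0))
```

Under these assumptions, the job count scales inversely with the estimated duration, so an estimate well below the real run time would fill the same buffer with many more jobs than the host can finish.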
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1680
Credit: 17,841,115
RAC: 22,993
Message 95970 - Posted: 4 May 2020, 1:03:53 UTC - in response to Message 95967.  

Admin wrote: "I updated the scheduler, so please post here if you continue to have issues with crazy cache sizes. The scheduler now uses the CPU run time preference as the estimated job duration when determining how many jobs to send to a host. Hopefully this will improve the job cache size."

Good to hear.
Thank you for your efforts.
Grant
Darwin NT
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1680
Credit: 17,841,115
RAC: 22,993
Message 96069 - Posted: 4 May 2020, 20:25:59 UTC

I doubt there was a sudden increase in the number of users signing up, so this graph shows just how bad the issue was.


Almost 600k more Tasks in progress than usual, over a 50% increase.
As tasks miss deadlines & get resent, over the next week or so the number should end up back around the 1 to 1.1 million mark.
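
As a rough check of those figures, the sketch below works out the percentage increase and the implied peak. The baseline range comes from the "1 to 1.1 million" mark mentioned above; treating "almost 600k" as exactly 600,000 is a rounding assumption, and the helper name is made up for the example.

```python
def percent_increase(baseline, extra):
    """Percentage by which `extra` raises `baseline`."""
    return 100.0 * extra / baseline


EXTRA_TASKS = 600_000  # "almost 600k", rounded up for the example

for baseline in (1_000_000, 1_100_000):
    peak = baseline + EXTRA_TASKS
    pct = percent_increase(baseline, EXTRA_TASKS)
    print(f"{baseline:,} usual -> {peak:,} in progress (+{pct:.0f}%)")
```

With either baseline the increase comes out between roughly 55% and 60%, which matches "over a 50% increase".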


The next new application release (or surge in new users) should result in a much smaller jump in In progress Tasks, and no mass missing of deadlines or Tasks being aborted.
Good for the project & good for the crunchers.
Grant
Darwin NT
Admin
Project administrator

Joined: 1 Jul 05
Posts: 4805
Credit: 0
RAC: 0
Message 96072 - Posted: 4 May 2020, 20:32:56 UTC
Last modified: 4 May 2020, 22:42:57 UTC

Hopefully the updates I made fixed this issue. Thanks!

Edit: When I reattached, an appropriate number of jobs was downloaded, so that's a good sign.
©2024 University of Washington
https://www.bakerlab.org