Miscellaneous Work Unit Errors

Kevin

Joined: 15 Jan 06
Posts: 21
Credit: 109,496
RAC: 0
Message 12541 - Posted: 23 Mar 2006, 1:39:03 UTC

I had a unit fail with this:
<core_client_version>5.3.12.tx36</core_client_version>
<message>The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>

Any clue as to what caused this error? I think it may have occurred when I restarted my computer and Rosetta failed to quit, so Windows just ended the process. Rosetta fails to quit occasionally on both XP and OS X, and I end up with an error similar to this.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11697219

darioml
Joined: 9 Feb 06
Posts: 1
Credit: 220,203
RAC: 0
Message 12562 - Posted: 23 Mar 2006, 10:01:09 UTC

Hello.

I'm using BOINC 5.2.13 in Windows XP SP2 running Rosetta@home 4.82 and SETI@home 4.18 projects.

I changed my settings back to stay in memory while preempted and fixed the work unit run time at 2 hours, but most of the Rosetta WUs still give errors :(

For example these ones this morning:

3/23/2006 9:59:05 AM|rosetta@home|Unrecoverable error for result FA_RLXsc_hom010_1scjB_361_277_0 ( - exit code -1073741811 (0xc000000d))
3/23/2006 9:59:05 AM|rosetta@home|Unrecoverable error for result FA_RLXop_hom028_1opd__361_296_0 ( - exit code -1073741811 (0xc000000d))

I don't have ANY problems with SETI, except that sometimes the scheduler doesn't respond, but nothing related to the calculation.

When will this bug be fixed? When it crashes, sometimes the whole of BOINC crashes and XP shows the window to report the problem to Microsoft...

Thanks,

Dar
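
For anyone trying to read these numbers: the negative decimal exit code and the hex value in parentheses are the same 32-bit value, printed once as signed and once as unsigned. Below is a minimal Python sketch of that conversion. The function names and the small lookup table are assumptions made for illustration and are not part of BOINC or Rosetta; the table only lists the Windows status values that show up in this thread, and the names give the general meaning of each code, not the root cause of any particular failure.

```python
# Small, non-exhaustive table of Windows status values seen in this thread.
KNOWN_CODES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION (invalid memory access)",
    0xC000000D: "STATUS_INVALID_PARAMETER",
    0xE06D7363: "unhandled Microsoft C++ exception",
}


def to_unsigned32(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF


def describe(exit_code: int) -> str:
    """Return the hex form of an exit code, plus a name if the table has one."""
    value = to_unsigned32(exit_code)
    name = KNOWN_CODES.get(value, "not in this small table")
    return f"0x{value:08x}: {name}"


if __name__ == "__main__":
    # Exit codes copied from the log lines posted in this thread.
    for code in (-1073741811, -1073741819, -529697949, -164):
        print(code, "->", describe(code))
```

Small negative codes such as -164 are not Windows status values, so the sketch deliberately leaves them unnamed.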

Mike
Joined: 21 Dec 05
Posts: 9
Credit: 35,252
RAC: 0
Message 12564 - Posted: 23 Mar 2006, 11:10:59 UTC

Hi. Try turning off all screen savers. I did this 6 days ago and have had no problems since.

Bob Guy
Joined: 7 Oct 05
Posts: 39
Credit: 24,895
RAC: 0
Message 12601 - Posted: 24 Mar 2006, 4:23:16 UTC

Had this error:

3/23/2006 6:36:47 PM|rosetta@home|Unrecoverable error for result FA_RLXvi_hom027_2vik__362_83_0 ( - exit code -1073741819 (0xc0000005))


Leave in memory = yes

This WU was never interrupted - it ran from start to failure without being paused. I was actually eating dinner when this occurred so the computer was otherwise idle.

I NEVER use the screensaver, and I never viewed the graphics for this WU.

Runs with SETI, SETI Beta, Einstein, Predictor and QAH - no problems with any of the other projects.

Other R@H WUs complete normally.

System is not overclocked and the temps are mid-range - I don't think the CPU is working very hard.

dag
Joined: 16 Dec 05
Posts: 106
Credit: 1,000,020
RAC: 0
Message 12632 - Posted: 24 Mar 2006, 18:16:16 UTC
Last modified: 24 Mar 2006, 18:21:39 UTC


https://boinc.bakerlab.org/rosetta/result.php?resultid=14511478

15 hours in a slot on an unused laptop overnight - 1.3 hrs cpu time accumulated, ~25% progress.

But, this may be progress of a sort as it didn't hang at 1%!
dag
--Finding aliens is cool, but understanding the structure of proteins is useful.

Los Alcoholicos~Megaflix
Joined: 10 Nov 05
Posts: 24
Credit: 77,199
RAC: 0
Message 12675 - Posted: 25 Mar 2006, 11:58:03 UTC
Last modified: 25 Mar 2006, 12:00:27 UTC

exit code -1073741819

I seem to have one computer which generates a lot of workunits with the above error. It is always the same error, but at different moments in the workunit. Maybe the project researchers could look into it. The computer's name in the computer list is Megaflix. It has at least a 25% failure rate, and it's still climbing. I've installed a fresh Windows, with no software at all. Just for testing I installed only BOINC with Rosetta, and it keeps producing the errors.

Leave applications in memory has been set to yes since the beginning, so that's not an issue. I had set the work units to run for 2 hours, but since it produced errors almost from the start I set it lower, to 1 hour. It decreased the number of errors just a little bit, but not much.

Might be handy in trying to find a cure for this error.

I'm willing to use the computer as a testing guinea pig if you want to...

Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 12685 - Posted: 25 Mar 2006, 14:18:44 UTC - in response to Message 12675.  

Can you attach this computer to RALPH? It might help to see those errors over there.

Moderator9

Los Alcoholicos~Megaflix
Joined: 10 Nov 05
Posts: 24
Credit: 77,199
RAC: 0
Message 12703 - Posted: 25 Mar 2006, 23:17:03 UTC - in response to Message 12685.  


Ok. I'll finish the outstanding workunits of Rosetta and then I'll connect it to Ralph.

Carlos_Pfitzner
Joined: 22 Dec 05
Posts: 71
Credit: 138,867
RAC: 0
Message 12717 - Posted: 26 Mar 2006, 21:58:20 UTC

Exit status -164 (0xffffff5c)
https://boinc.bakerlab.org/rosetta/result.php?resultid=15006663
Rosetta 4.82 Windows

Los Alcoholicos~Megaflix
Joined: 10 Nov 05
Posts: 24
Credit: 77,199
RAC: 0
Message 12731 - Posted: 27 Mar 2006, 23:12:57 UTC - in response to Message 12685.  

BOINC ran as a service on that computer, so I installed BOINC as a single-user installation to see if I could find the source of the errors, while still finishing the few workunits left before connecting to Ralph. However, I haven't had a single error since I stopped running it as a service.

Re-installing it as a service brings those errors back. Maybe they could look into that? It might be a 'service-related' issue in BOINC. I'll still connect the computer to Ralph (running as a service) after finishing the Rosetta workunits.

UBT - Timbo
Joined: 25 Sep 05
Posts: 20
Credit: 2,299,279
RAC: 648
Message 12793 - Posted: 29 Mar 2006, 15:28:15 UTC - in response to Message 10953.  

[quote]Report all Work Unit errors on this thread that are NOT -

    "1% Hang"
    "Max Time Exceeded"
    or other "stuck" or "hung" work units

[/quote]


Hi all,

I have seen the message about downloading the PDB file (I downloaded version 4.83 to "match" the v4.83 application I have). Having had issues before, I thought that if I had problems this time around, then at least some decent reports would go back.

And I've just had the following errors:

29/03/2006 13:36:59|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5153_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:37:42|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5070_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:38:25|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5196_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:39:08|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5188_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:39:50|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5215_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:40:31|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5154_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:41:12|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5114_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:41:53|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5117_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:42:32|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5175_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:43:12|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5184_0 ( - exit code -529697949 (0xe06d7363))


This is using a 3GHz P4 (with HT), 512MB memory, Win XP (Service Pack 2) + BOINC v5.3.28

Hope this helps the "cause" to resolve the bugs.



Will go back to crunching on RALPH instead...!


regards,

Tim
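
When a host produces a run of failures like the one above, it can help to tally the exit codes before reporting them. The following is a rough Python sketch under stated assumptions: the regular expression is written only against the "Unrecoverable error" lines quoted in this thread, the function name `tally_exit_codes` is made up for illustration, and the sample lines are copied from posts here. It is not a BOINC tool, and other client versions may format these messages differently.

```python
import re
from collections import Counter

# Pattern for the "Unrecoverable error" lines as they appear in this thread.
LINE_RE = re.compile(
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\( - exit code (?P<code>-?\d+) \((?P<hex>0x[0-9a-fA-F]+)\)\)"
)


def tally_exit_codes(lines):
    """Count failed results per exit code, keyed by lowercase hex string."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[match.group("hex").lower()] += 1
    return counts


# Sample lines copied from posts in this thread.
SAMPLE = [
    "29/03/2006 13:36:59|rosetta@home|Unrecoverable error for result "
    "NO_TERM_STRAND_1hz6_383_5153_0 ( - exit code -529697949 (0xe06d7363))",
    "29/03/2006 13:37:42|rosetta@home|Unrecoverable error for result "
    "NO_TERM_STRAND_1hz6_383_5070_0 ( - exit code -529697949 (0xe06d7363))",
    "3/23/2006 6:36:47 PM|rosetta@home|Unrecoverable error for result "
    "FA_RLXvi_hom027_2vik__362_83_0 ( - exit code -1073741819 (0xc0000005))",
]

if __name__ == "__main__":
    for code, count in tally_exit_codes(SAMPLE).most_common():
        print(code, count)
```

A tally dominated by a single code on a single work unit batch, as in the ten NO_TERM_STRAND failures, points at one repeatable cause instead of random instability.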


Rom Walton (BOINC)
Volunteer moderator
Project developer
Joined: 17 Sep 05
Posts: 18
Credit: 40,071
RAC: 0
Message 12797 - Posted: 29 Mar 2006, 19:18:53 UTC - in response to Message 12793.  
Last modified: 29 Mar 2006, 19:19:24 UTC

That error code usually means the machine ran out of memory during the execution of the workunit. Since you only have 512MB of RAM and one instance of Rosetta can use up to 250MB of RAM, I would recommend turning off HT so that only one instance runs at a time.

----- Rom
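
To put rough numbers on that advice, here is a back-of-the-envelope Python sketch. It assumes that with HT enabled the client sees two logical CPUs and may run two Rosetta tasks at once, and it uses the 512MB and 250MB figures quoted above; the function name `memory_headroom_mb` is hypothetical, and real memory use varies by work unit.

```python
def memory_headroom_mb(total_ram_mb: int, concurrent_tasks: int,
                       peak_per_task_mb: int) -> int:
    """RAM left for the OS and other programs if every task peaks at once."""
    return total_ram_mb - concurrent_tasks * peak_per_task_mb


if __name__ == "__main__":
    # Figures from this thread: 512MB of RAM, up to 250MB per Rosetta instance.
    print("HT on  (2 tasks):", memory_headroom_mb(512, 2, 250), "MB left")
    print("HT off (1 task): ", memory_headroom_mb(512, 1, 250), "MB left")
```

Under these assumptions two tasks at peak leave only 12MB for Windows and everything else, while a single task leaves 262MB.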

UBT - Timbo
Joined: 25 Sep 05
Posts: 20
Credit: 2,299,279
RAC: 648
Message 12802 - Posted: 29 Mar 2006, 22:35:17 UTC - in response to Message 12797.  
Last modified: 29 Mar 2006, 22:37:26 UTC

Hi,

Thanks for the reply.

So, in this day and age of service to customers, why doesn't the error message say that?

(instead of "exit code -529697949")

2nd: from this page:

The minimum spec is: Windows XP CPU: 500MHz or higher HDD space: 200MB Memory: 512MB.

Think they need to "tweak" this to state: "PER PROCESS".


In the meantime, will go back to crunching for other projects.

(edit) All the other projects I crunch for don't have any issues with only having 512MB of memory...!



Oh well.....

regards,

Tim

(Unless someone's got a spare stick of 512MB PC2700 memory lying around that they might want to donate?)

rbpeake
Joined: 25 Sep 05
Posts: 168
Credit: 247,828
RAC: 0
Message 12811 - Posted: 30 Mar 2006, 3:57:48 UTC - in response to Message 12802.  

The good news is, extra memory these days is fairly inexpensive... :)

Regards,
Bob P.

hugothehermit
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 12932 - Posted: 2 Apr 2006, 7:27:15 UTC
Last modified: 2 Apr 2006, 7:42:52 UTC

I had a WU error out with a C0000005 today. The machine is a P4 3.0GHz with 1GB RAM (1022.73 as reported by BOINC), and it's only doing Rosetta@home, so there is no switching.

I went to Your account -> Computers on this account (view account) -> hugo-p3ghz -> results to find the WU to report it, and found out that somehow I'm not effectively reporting my WUs, even though I know I've done them before the time expired.

I don't care about the credits, but have a look into this as you may be losing CPU power.

I was also going to say (and attach a picture) that the WU I was doing was constantly doing a full atom relax in the middle of the energy distribution. I can't remember what the name was, though I remember that it was a simple protein that basically was curly curly curly (colours for the strands) and went up, down and up again. I actually took three screenshots and saved none of them, as I wanted to get the 90-odd % one.

It sort of looked like a funnel on top and an inverted funnel on the bottom, with a wide horizontal line where they met. The full atom relaxes were all in the middle of these "funnels".

multiple edits

anders n
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 12933 - Posted: 2 Apr 2006, 7:31:07 UTC - in response to Message 12932.  

[quote]I had a WU error out with a c0000005 today

I don't care about the credits but have a look into this as you are losing FLOPS.[/quote]

It seems to be a WIN 98 issue.

I have had this on WIN 98 too.

Anders n

hugothehermit
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 12934 - Posted: 2 Apr 2006, 7:45:13 UTC

[quote]It seems to be a WIN 98 issue.

I have had this on WIN 98 too.

Anders n[/quote]

I'm on XP. What's more, I never knew I was reporting bad results; I just never looked :?

anders n
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 12935 - Posted: 2 Apr 2006, 7:57:31 UTC - in response to Message 12934.  

Hi hugothehermit,

This is one of your results. A Win 98 machine, right?

https://boinc.bakerlab.org/rosetta/result.php?resultid=15296368

Anders n

Kevin
Joined: 15 Jan 06
Posts: 21
Credit: 109,496
RAC: 0
Message 12936 - Posted: 2 Apr 2006, 8:13:33 UTC

I had a workunit error out with

Unrecoverable error for result HIGHERTEMP_HELIX_1elwA_411_81_1 ( - exit code -1073741811 (0xc000000d))


https://boinc.bakerlab.org/rosetta/result.php?resultid=15613409

This unit was at 79781 seconds when this happened.

Hydnum.Repandum
Joined: 1 Dec 05
Posts: 1
Credit: 1,271,991
RAC: 0
Message 12940 - Posted: 2 Apr 2006, 10:36:17 UTC

I have an observation that might help with the errors:

I normally run SETI, and when it is not available I run Rosetta. I was running Rosetta happily for 2 days without any errors (waiting for SETI workunits). Then, when SETI started uploading/downloading (very slowly, with long queues of retries), I suddenly started having lots of Rosetta workunits failing halfway through on different machines (about 10 machines running Win XP Professional). This was the error message:

2006-03-29 00:25:42 [rosetta@home] Unrecoverable error for result HB_BARCODE_30_1bq9A_351_34656_0 ( - exit code -164 (0xffffff5c))

Thus I conclude that Rosetta has problems when BOINC is busy uploading, downloading or switching to other projects.

©2024 University of Washington
https://www.bakerlab.org