Miscellaneous Work Unit Errors

Los Alcoholicos~Megaflix

Send message
Joined: 10 Nov 05
Posts: 24
Credit: 77,199
RAC: 0
Message 12731 - Posted: 27 Mar 2006, 23:12:57 UTC - in response to Message 12685.  

exit code -1073741819

I seem to have one computer that generates a lot of workunits with the above error. It is always the same error, but at different moments in the workunit. Maybe the project researchers could look into it. The computer's name in the computer list is Megaflix. It has at least a 25% failure rate, and it's still climbing. I've installed a fresh copy of Windows, with no other software at all. Just for testing I installed only BOINC with Rosetta, and it keeps producing the errors.

Leave applications in memory has been set to yes since the beginning, so that's not an issue. I had set the work units to run for 2 hours, but since it produced errors almost from the start I set it lower, to 1 hour. It decreased the number of errors just a little bit, but not much.

Might be handy in trying to find a cure for this error.

I'm willing to use the computer as a testing guinea pig if you want to...


Can you attach this computer to RALPH? It might help to see those errors over there.


BOINC ran as a service on that computer, so I reinstalled BOINC as a single-user installation to see if I could find the source of the errors, while still finishing the few workunits left before connecting to Ralph. However, I haven't had a single error since I stopped running it as a service.

Re-installing it as a service brings those errors back. Maybe they could look into that? Might it be a service-related issue in BOINC? I'll still connect the computer to Ralph (running as a service) after finishing the Rosetta workunits.
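
For readers comparing codes across this thread: the negative decimal exit codes BOINC prints on Windows are signed 32-bit views of unsigned status values. The sketch below shows the conversion only. The helper names are hypothetical, and the small lookup table is an illustrative assumption holding two well-known NTSTATUS names, not anything taken from BOINC or Rosetta itself.

```python
# Illustrative lookup of two well-known Windows NTSTATUS values.
# This table is an assumption for the example, not part of BOINC.
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000000D: "STATUS_INVALID_PARAMETER",
}


def to_unsigned32(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned value."""
    return exit_code & 0xFFFFFFFF


def describe(exit_code: int) -> str:
    """Format an exit code as hex, with a name when one is in the table."""
    value = to_unsigned32(exit_code)
    name = KNOWN_STATUS.get(value, "unknown")
    return f"0x{value:08X} {name}"


if __name__ == "__main__":
    # The exit code reported at the top of this post.
    print(describe(-1073741819))
```

The name only says what kind of fault Windows reported (here, an access violation); it does not say why running as a service would trigger it.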
ID: 12731 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile UBT - Timbo

Send message
Joined: 25 Sep 05
Posts: 20
Credit: 2,299,279
RAC: 0
Message 12793 - Posted: 29 Mar 2006, 15:28:15 UTC - in response to Message 10953.  

[quote]Report all Work Unit errors on this thread that are NOT -

    "1% Hang"
    "Max Time Exceeded"
    or other "stuck" or "hung" workunits

[/quote]


Hi all,

Have seen the message about downloading the PDB file (I dl'd version 4.83 to "match" the v4.83 application I have) and having had issues before, thought that maybe, if I had problems this time around, then at least some decent reports will go back.

And I've just had the following errors:

29/03/2006 13:36:59|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5153_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:37:42|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5070_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:38:25|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5196_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:39:08|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5188_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:39:50|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5215_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:40:31|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5154_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:41:12|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5114_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:41:53|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5117_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:42:32|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5175_0 ( - exit code -529697949 (0xe06d7363))
29/03/2006 13:43:12|rosetta@home|Unrecoverable error for result NO_TERM_STRAND_1hz6_383_5184_0 ( - exit code -529697949 (0xe06d7363))
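
When a machine produces a run of failures like this, it can help to tally them before reporting. The following is a minimal sketch that parses pipe-delimited message lines of the shape shown above and counts results per exit code. The regular expression is an assumption based only on these ten lines, and the function name is hypothetical; other BOINC versions format their messages differently.

```python
import re
from collections import Counter

# Pattern inferred from the log lines in this post (an assumption, not a
# documented BOINC format): timestamp|project|message text.
LINE_RE = re.compile(
    r"^(?P<timestamp>[^|]+)\|(?P<project>[^|]+)\|"
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\( - exit code (?P<code>-?\d+) \((?P<hex>0x[0-9a-fA-F]+)\)\)$"
)


def parse_line(line: str):
    """Return the fields of one error line as a dict, or None if no match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


sample = [
    "29/03/2006 13:36:59|rosetta@home|Unrecoverable error for result "
    "NO_TERM_STRAND_1hz6_383_5153_0 ( - exit code -529697949 (0xe06d7363))",
    "29/03/2006 13:37:42|rosetta@home|Unrecoverable error for result "
    "NO_TERM_STRAND_1hz6_383_5070_0 ( - exit code -529697949 (0xe06d7363))",
]

records = [r for r in map(parse_line, sample) if r is not None]
counts = Counter(r["hex"] for r in records)

if __name__ == "__main__":
    for hex_code, n in counts.items():
        print(f"{hex_code}: {n} failed result(s)")
```

Lines that do not match are skipped rather than raising an error, so the same loop can be pointed at a full message log.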


This is using a 3GHz P4 (with HT), 512MB memory, Win XP (Service Pack 2) + BOINC v5.3.28

Hope this helps the "cause" to resolve the bugs.



Will go back to crunching on RALPH instead...!


regards,

Tim

ID: 12793 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Rom Walton (BOINC)
Volunteer moderator
Project developer

Send message
Joined: 17 Sep 05
Posts: 18
Credit: 40,071
RAC: 0
Message 12797 - Posted: 29 Mar 2006, 19:18:53 UTC - in response to Message 12793.  
Last modified: 29 Mar 2006, 19:19:24 UTC

That error code usually means the machine ran out of memory during the execution of the workunit. Since you only have 512MB of RAM and one instance of Rosetta can use up to 250MB of RAM, I would recommend turning off HT.
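
The arithmetic behind that recommendation can be sketched as follows. The 512MB total and the 250MB per-instance peak are the figures quoted in this thread; the assumption that Hyper-Threading leads BOINC to run two Rosetta instances at once (one per logical CPU) is mine, and the function name is hypothetical.

```python
def headroom_mb(total_ram_mb: int, peak_per_task_mb: int, concurrent_tasks: int) -> int:
    """RAM left for the OS and everything else if every task hits its peak."""
    return total_ram_mb - peak_per_task_mb * concurrent_tasks


TOTAL_RAM_MB = 512       # from the machine description in this thread
PEAK_PER_TASK_MB = 250   # upper bound quoted for one Rosetta instance

if __name__ == "__main__":
    # Assumption: HT on means two concurrent tasks, HT off means one.
    for label, tasks in (("HT on", 2), ("HT off", 1)):
        left = headroom_mb(TOTAL_RAM_MB, PEAK_PER_TASK_MB, tasks)
        print(f"{label}: {tasks} task(s), about {left} MB left for everything else")
```

With two tasks at peak only about 12MB would remain for Windows itself, which is consistent with the out-of-memory explanation; with one task roughly 262MB remains.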

----- Rom
My Blog
ID: 12797 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile UBT - Timbo

Send message
Joined: 25 Sep 05
Posts: 20
Credit: 2,299,279
RAC: 0
Message 12802 - Posted: 29 Mar 2006, 22:35:17 UTC - in response to Message 12797.  
Last modified: 29 Mar 2006, 22:37:26 UTC

Hi,

Thanks for the reply.

So, in this day and age of service to customers, why doesn't the error message say that?

(instead of "exit code -529697949")

2nd: from this page:

The minimum spec is: Windows XP CPU: 500MHz or higher HDD space: 200MB Memory: 512MB.

Think they need to "tweak" this to state: "PER PROCESS".


In the meantime, will go back to crunching for other projects.

(edit) None of the other projects I crunch for have any issues with only 512MB of memory...!



Oh well.....

regards,

Tim

(Unless someone's got a spare stick of 512MB PC2700 memory lying around they might want to donate?)
ID: 12802 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile rbpeake

Send message
Joined: 25 Sep 05
Posts: 168
Credit: 247,828
RAC: 0
Message 12811 - Posted: 30 Mar 2006, 3:57:48 UTC - in response to Message 12802.  

Oh well.....

regards,

Tim

(Unless someone's got a spare stick of 512MB PC2700 memory lying around they might want to donate?)

The good news is, extra memory these days is fairly inexpensive... :)

Regards,
Bob P.
ID: 12811 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
hugothehermit

Send message
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 12932 - Posted: 2 Apr 2006, 7:27:15 UTC
Last modified: 2 Apr 2006, 7:42:52 UTC

I had a WU error out with a C0000005 today on a P4 3.0GHz with 1GB RAM (1022.73 as reported by BOINC). It's only doing Rosetta@home, so no switching.

I went to Your account -> Computers on this account (view account) -> hugo-p3ghz -> results to find the WU to report it, and found out that somehow I'm not effectively reporting my WUs. I know I've done the WUs before the time expired.

I don't care about the credits, but have a look into this as you may be losing CPU power.

I was also going to say (and attach a picture) that the WU I was doing was constantly doing a full atom relax in the middle of the energy distribution, but I can't remember what its name was. I do remember that it was a simple protein that basically was curly curly curly (colours for the strands) that went up, down and up again. I actually took three screenshots and saved none of them, as I wanted to get the 90-odd % one.

It sort of looked like a funnel on top and an inverted funnel on the bottom, with a wide horizontal line where they met; the full atom relaxes were all in the middle of these "funnels".

multiple edits
ID: 12932 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile anders n

Send message
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 12933 - Posted: 2 Apr 2006, 7:31:07 UTC - in response to Message 12932.  

I had a WU error out with a c0000005 today

I don't care about the credits but have a look into this as you are losing FLOPS.




It seems to be a Win 98 issue.

I have had this on Win 98 too.

Anders n
ID: 12933 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
hugothehermit

Send message
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 12934 - Posted: 2 Apr 2006, 7:45:13 UTC

It seems to be a Win 98 issue.

I have had this on Win 98 too.

Anders n


I'm on XP, what's more I never knew I was reporting bad results, I just never looked :?
ID: 12934 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile anders n

Send message
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 12935 - Posted: 2 Apr 2006, 7:57:31 UTC - in response to Message 12934.  

It seems to be a Win 98 issue.

I have had this on Win 98 too.

Anders n


I'm on XP, what's more I never knew I was reporting bad results, I just never looked :?



Hi hugothehermit

This is one of your results. A Win 98 machine, right?

https://boinc.bakerlab.org/rosetta/result.php?resultid=15296368

Anders n
ID: 12935 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Kevin

Send message
Joined: 15 Jan 06
Posts: 21
Credit: 109,496
RAC: 0
Message 12936 - Posted: 2 Apr 2006, 8:13:33 UTC

I had a workunit error out with

Unrecoverable error for result HIGHERTEMP_HELIX_1elwA_411_81_1 ( - exit code -1073741811 (0xc000000d))


https://boinc.bakerlab.org/rosetta/result.php?resultid=15613409

This unit was at 79781 seconds when this happened.
ID: 12936 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Hydnum.Repandum

Send message
Joined: 1 Dec 05
Posts: 1
Credit: 1,271,991
RAC: 0
Message 12940 - Posted: 2 Apr 2006, 10:36:17 UTC

I have an observation that might help with the errors:

I normally run SETI, and when that is not available I run Rosetta. I was running Rosetta very happily for 2 days without any errors (waiting for SETI workunits). Then, when SETI started uploading/downloading (very slowly, with long queues of retries), I suddenly started having lots of Rosetta workunits failing halfway through on different machines (about 10 machines running Win XP Professional). This was the error message:

2006-03-29 00:25:42 [rosetta@home] Unrecoverable error for result HB_BARCODE_30_1bq9A_351_34656_0 ( - exit code -164 (0xffffff5c))

Thus I conclude that Rosetta has problems when BOINC is busy uploading/downloading/switching to other projects.
ID: 12940 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
hugothehermit

Send message
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 12972 - Posted: 3 Apr 2006, 7:42:55 UTC
Last modified: 3 Apr 2006, 8:07:00 UTC


Hi hugothehermit

This is one of your results. A Win 98 machine, right?

https://boinc.bakerlab.org/rosetta/result.php?resultid=15296368

Anders n


Yep, I have three machines running Rosetta@Home, and my smoothwall router/internet server etc. is running SETI 'cause it's a bit small for anything else :)

P4 3.0 GHz, 1GB RAM, Win XP Home SP2 (running Rosetta and Ralph; Ralph's out of work at the moment)
P4 1.0 GHz, 256 RAM, Win 98se (running Rosetta)
P3 933 MHz, 256 RAM, Win 98se (running Rosetta)
P3 500 MHz, 128 RAM, GNU/Linux (running smoothwall and seti@home)

Edit: GNU/Linux, not Redhat :oops
ID: 12972 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
casio7131

Send message
Joined: 10 Oct 05
Posts: 35
Credit: 149,748
RAC: 0
Message 12979 - Posted: 3 Apr 2006, 13:48:31 UTC

3/04/2006 11:38:45 PM|rosetta@home|Unrecoverable error for result HB_BARCODE_30_4ubpA_351_49332_0 ( - exit code -1073741811 (0xc000000d))

resultid=15780509

ID: 12979 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile Charles Dennett
Avatar

Send message
Joined: 27 Sep 05
Posts: 102
Credit: 2,081,660
RAC: 0
Message 13004 - Posted: 3 Apr 2006, 19:23:12 UTC
Last modified: 3 Apr 2006, 19:24:02 UTC

Please check out the thread:

https://boinc.bakerlab.org/rosetta/forum_thread.php?id=1323

There seems to be a new kind of problem that has popped up over the past few days on older Win98* machines, where the workunits are not reporting the CPU time back to the core client.

I know these older machines do not meet the minimum specs of the project, but at least mine and those of another person who report the same problem have been working fine up to now.

Just wanted to make sure this was brought to the attention of the project leaders.

Charlie

-Charlie
ID: 13004 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Monitor-Man

Send message
Joined: 19 Dec 05
Posts: 4
Credit: 6,034,589
RAC: 0
Message 13040 - Posted: 4 Apr 2006, 10:18:14 UTC

Win 98 machines running 4.83 now complete WUs with 0 time. No errors are reported and the result shows as a success, but no credit is given because the CPU time shows zero. They do get some way through, and the time and work done do increment, but then they suddenly say 100% and report.

I have 2 of these machines, which are too old and don't have enough memory to run XP. This may be related to 4.83, as it seems to be a recent problem.

Machine names

dick.workgroup
piii-1g.WorkGroup

Regards

Rich
ID: 13040 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile anders n

Send message
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 13041 - Posted: 4 Apr 2006, 10:48:07 UTC

The first time I saw the 0 time problem on Win 98 was with
Rosetta 4.82 Windows.

Anders n
ID: 13041 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Send message
Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 13068 - Posted: 5 Apr 2006, 6:14:25 UTC - in response to Message 13004.  

Please check out the thread:

https://boinc.bakerlab.org/rosetta/forum_thread.php?id=1323

There seems to be a new kind of problem that has popped up over the past few days on older Win98* machines, where the workunits are not reporting the CPU time back to the core client.

I know these older machines do not meet the minimum specs of the project, but at least mine and those of another person who report the same problem have been working fine up to now.

Just wanted to make sure this was brought to the attention of the project leaders.

Charlie


Thanks. I asked Rom about this today, and he said that BOINC had a special fix to deal with Win98's lack of a timer function, and that his fix to the "leave in memory" problem might have messed up the BOINC timekeeper. He is looking into fixing it. In the meantime, all the results Win98 computers are producing are being properly collected and are helping us, and we will award credit for all of these jobs, so please bear with the problem for a bit longer.

ID: 13068 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
jomebrew

Send message
Joined: 31 Mar 06
Posts: 2
Credit: 25,914,516
RAC: 0
Message 13084 - Posted: 5 Apr 2006, 16:14:56 UTC

I have had this three times on this machine since I started Rosetta on 3/31. I get a Windows XP dialog box saying that the application errored. This is what is in the event log:

Faulting application rosetta_4.83_windows_intelx86.exe, version 0.0.0.0, faulting module rosetta_4.83_windows_intelx86.exe, version 0.0.0.0, fault address 0x004da3d4.
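
For anyone collecting several of these event log entries, the fields can be pulled out mechanically. This is a minimal sketch that assumes the message always has the shape of the single entry quoted above; the pattern, the function name, and the `in_main_executable` flag are hypothetical conveniences, not part of any Windows or BOINC API.

```python
import re

# Pattern inferred from the one event log entry quoted in this post.
EVENT_RE = re.compile(
    r"Faulting application (?P<app>[^,]+), version (?P<app_version>[\d.]+), "
    r"faulting module (?P<module>[^,]+), version (?P<module_version>[\d.]+), "
    r"fault address (?P<address>0x[0-9a-fA-F]+)\."
)


def parse_fault(message: str):
    """Extract application, module and fault address, or None if no match."""
    match = EVENT_RE.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    # True when the crash is inside the application binary itself rather
    # than in a separate DLL.
    fields["in_main_executable"] = fields["app"] == fields["module"]
    return fields


message = (
    "Faulting application rosetta_4.83_windows_intelx86.exe, version 0.0.0.0, "
    "faulting module rosetta_4.83_windows_intelx86.exe, version 0.0.0.0, "
    "fault address 0x004da3d4."
)

if __name__ == "__main__":
    fault = parse_fault(message)
    print(fault["module"], fault["address"], fault["in_main_executable"])
```

Comparing the fault address across repeated crashes, and across the two machines, would show whether they fail at the same place in the executable.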

This has happened a few times on this machine, a P4 3GHz with Windows XP Pro. It has also occurred on my AMD64 X2, but I do not know if the event log says the same thing.



ID: 13084 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile anders n

Send message
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 13085 - Posted: 5 Apr 2006, 18:08:25 UTC

This WU https://boinc.bakerlab.org/rosetta/result.php?resultid=16041300

was aborted since it did not count up the steps and did not register the energy changes.

Anders n
ID: 13085 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Kevin

Send message
Joined: 15 Jan 06
Posts: 21
Credit: 109,496
RAC: 0
Message 13148 - Posted: 7 Apr 2006, 3:35:31 UTC - in response to Message 13085.  

This WU https://boinc.bakerlab.org/rosetta/result.php?resultid=16041300

was aborted since it did not count up the steps and did not register the energy changes.

Anders n



I just got one of these workunits and saw no energies or steps are being registered. How will this affect the workunit and what is returned to the server?
ID: 13148 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote



©2025 University of Washington
https://www.bakerlab.org