Miscellaneous Work Unit Errors

Message boards : Number crunching : Miscellaneous Work Unit Errors

Rebel Alliance

Joined: 4 Nov 05
Posts: 50
Credit: 3,579,531
RAC: 0
Message 13241 - Posted: 8 Apr 2006, 15:06:43 UTC

These are starting to hit my machines as well. On the one machine that has them, 3 of the 4 work units crunched have failed with the same messages other people are reporting:
"***UNHANDLED EXCEPTION****
Reason: Access Violation (0xc0000005) at address 0x00599FF4 read attempt to address 0x07CDFF48"

"***UNHANDLED EXCEPTION****
Reason: Access Violation (0xc0000005) at address 0x00599FF4 read attempt to address 0x07CDFF60"

and

"***UNHANDLED EXCEPTION****
Reason: Access Violation (0xc0000005) at address 0x007022EA read attempt to address 0x07B5FC7C"

This machine is an AMD 2000XP and has never had a problem with work units before.
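
For anyone collecting these reports, here is a minimal Python sketch that groups them by the faulting address. It assumes every "Reason:" line has exactly the shape quoted above; the function name and the sample constant are made up for illustration and are not part of BOINC or Rosetta.

```python
import re
from collections import defaultdict

# Assumed shape of the "Reason:" line, taken from the messages quoted in this thread.
REASON_RE = re.compile(
    r"Access Violation \((0x[0-9a-fA-F]+)\) at address (0x[0-9a-fA-F]+) "
    r"(\w+) attempt to address (0x[0-9a-fA-F]+)"
)


def group_by_fault_address(text):
    """Map each faulting instruction address to the data addresses it tried to access."""
    groups = defaultdict(list)
    for match in REASON_RE.finditer(text):
        _code, fault_addr, _kind, target_addr = match.groups()
        groups[fault_addr.lower()].append(target_addr.lower())
    return dict(groups)


# The three messages from the post above, used as sample input.
SAMPLE = """
Reason: Access Violation (0xc0000005) at address 0x00599FF4 read attempt to address 0x07CDFF48
Reason: Access Violation (0xc0000005) at address 0x00599FF4 read attempt to address 0x07CDFF60
Reason: Access Violation (0xc0000005) at address 0x007022EA read attempt to address 0x07B5FC7C
"""

if __name__ == "__main__":
    for fault_addr, targets in group_by_fault_address(SAMPLE).items():
        print(fault_addr, len(targets), targets)
```

If the same one or two faulting addresses keep turning up across machines, that would point at a specific spot in the application rather than at flaky hardware, though a grouping like this only hints at that.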

[DPC]Charley
Joined: 18 Mar 06
Posts: 9
Credit: 295,915
RAC: 0
Message 13242 - Posted: 8 Apr 2006, 15:10:12 UTC
Last modified: 8 Apr 2006, 15:11:28 UTC

I'm getting tons of errors on the HBLR_1* stuff as well.
Out of 11 work units on one box (193403, for the admins), 9 returned an error, after anywhere from about 1 minute to a couple of minutes short of an hour.

Error codes:
8-4-2006 5:56:24|rosetta@home|Unrecoverable error for result HBLR_1.0_1mky_425_5187_0 ( - exit code -1073741819 (0xc0000005))
8-4-2006 6:03:32|rosetta@home|Unrecoverable error for result HBLR_1.0_2tif_425_7375_0 ( - exit code -1073741819 (0xc0000005))
8-4-2006 6:04:41|rosetta@home|Unrecoverable error for result HBLR_1.0_1n0u_425_9208_0 ( - exit code -1073741819 (0xc0000005))
8-4-2006 7:02:35|rosetta@home|Unrecoverable error for result HBLR_1.0_1mky_425_9364_0 ( - exit code -1073741819 (0xc0000005))
8-4-2006 7:30:31|rosetta@home|Unrecoverable error for result HBLR_1.0_1ogw_425_9448_0 ( - exit code -1073741819 (0xc0000005))
8-4-2006 7:43:45|rosetta@home|Unrecoverable error for result HBLR_1.0_1di2_426_203_0 ( - exit code -1073741819 (0xc0000005))
8-4-2006 12:05:59|rosetta@home|Unrecoverable error for result HBLR_1.0_2tif_426_571_0 ( - exit code -1073741819 (0xc0000005))
8-4-2006 16:08:16|rosetta@home|Unrecoverable error for result HBLR_1.0_2tif_426_3762_0 ( - exit code -1073741819 (0xc0000005))
8-4-2006 16:10:55|rosetta@home|Unrecoverable error for result HBLR_1.0_2reb_426_4608_0 ( - exit code -1073741819 (0xc0000005))


Second machine (181715) is also pumping out errors, after anywhere from 150 to 1230 seconds. These are the first units it's doing with 4.97.
Error codes:
08/04/2006 16:26:31|rosetta@home|Unrecoverable error for result HBLR_1.0_1ogw_426_283_1 ( - exit code -1073741819 (0xc0000005))
08/04/2006 16:49:21|rosetta@home|Unrecoverable error for result HBLR_1.0_1r69_426_428_1 ( - exit code -1073741819 (0xc0000005))
08/04/2006 16:53:59|rosetta@home|Unrecoverable error for result HBLR_1.0_1mky_426_4883_0 ( - exit code -1073741819 (0xc0000005))


Number three (193007) isn't doing any better, 4 out of 4 errors. Can't reach those error codes right now.

Number four (187877) is doing slightly better, with only 1 error so far out of 4 units.
Error codes:
8-4-2006 10:52:59|rosetta@home|Unrecoverable error for result HBLR_1.0_1mky_426_753_0 ( - exit code -1073741819 (0xc0000005))


All boxen are running windows XP home or pro and Rosetta 4.97.
The differences I can make out on my four boxen:
SP1 generates fewer errors than SP2. (Box 4 is still on SP1.)
Pentium generates fewer errors than AMD. (Box 4 is a P3 733MHz; the other boxen are AMD XP 2500+ and AMD 64 3700+.)
Important note: not statistically relevant data of course, need more people for that ;)
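
If anyone wants to tally these logs rather than eyeball them, here is a rough Python sketch. It assumes the pipe-delimited line layout shown above, and it guesses that the third underscore-separated field of the result name (1mky, 2tif, ...) identifies the target. Both the helper name and that reading of the result name are assumptions, not anything the project has confirmed.

```python
import re
from collections import Counter

# Assumed layout of the BOINC message text, copied from the log lines in this post.
LINE_RE = re.compile(
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\( - exit code (?P<code>-?\d+) \((?P<hex>0x[0-9a-fA-F]+)\)\)"
)


def tally_errors(log_text):
    """Count unrecoverable errors per target (third field of the result name)."""
    by_target = Counter()
    for match in LINE_RE.finditer(log_text):
        parts = match.group("result").split("_")
        # Assumption: names look like HBLR_1.0_<target>_<batch>_<number>_<try>.
        target = parts[2] if len(parts) > 2 else match.group("result")
        by_target[target] += 1
    return by_target


# Three lines from the first box, used as sample input.
SAMPLE_LOG = """\
8-4-2006 5:56:24|rosetta@home|Unrecoverable error for result HBLR_1.0_1mky_425_5187_0 ( - exit code -1073741819 (0xc0000005))
8-4-2006 6:03:32|rosetta@home|Unrecoverable error for result HBLR_1.0_2tif_425_7375_0 ( - exit code -1073741819 (0xc0000005))
8-4-2006 7:02:35|rosetta@home|Unrecoverable error for result HBLR_1.0_1mky_425_9364_0 ( - exit code -1073741819 (0xc0000005))
"""

if __name__ == "__main__":
    for target, count in tally_errors(SAMPLE_LOG).most_common():
        print(target, count)
```

With only a handful of boxes this says nothing about SP1 versus SP2 or Pentium versus AMD, but pooled logs from more people could be counted the same way.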

Species8472

Joined: 7 Apr 06
Posts: 1
Credit: 55,732
RAC: 0
Message 13243 - Posted: 8 Apr 2006, 15:14:18 UTC
Last modified: 8 Apr 2006, 15:55:40 UTC

4.97 WUs have an 85% failure rate on my A64 X2 3800+, running at stock speed...
4.83 WU's finished fine.

Errors are all of the same type:

***UNHANDLED EXCEPTION****
Reason: Access Violation (0xc0000005) at address 0x007022EA read attempt to address 0x0704FFA0

Time to failure ranges from 10 seconds to 90 minutes per unit.

Buffalo Bill
Joined: 25 Mar 06
Posts: 71
Credit: 1,630,458
RAC: 0
Message 13244 - Posted: 8 Apr 2006, 15:39:13 UTC

Lots here too...

Errors: Desktop
08/04/2006 7:09:37 AM|rosetta@home|Unrecoverable error for result HBLR_1.0_1r69_426_2525_0 ( - exit code -1073741819 (0xc0000005))
08/04/2006 9:11:01 AM|rosetta@home|Unrecoverable error for result HBLR_1.0_1dcj_426_4094_0 ( - exit code -1073741819 (0xc0000005))
08/04/2006 9:12:53 AM|rosetta@home|Unrecoverable error for result HBLR_1.0_1r69_426_4262_0 ( - exit code -1073741819 (0xc0000005))

Errors: Laptop
4/8/2006 12:20:26 AM|rosetta@home|Unrecoverable error for result HBLR_1.0_1n0u_425_4428_0 ( - exit code -1073741819 (0xc0000005))
4/8/2006 12:22:28 AM|rosetta@home|Unrecoverable error for result HBLR_1.0_2tif_425_9497_0 ( - exit code -1073741819 (0xc0000005))
4/8/2006 12:30:01 AM|rosetta@home|Unrecoverable error for result HBLR_1.0_1n0u_426_963_0 ( - exit code -1073741819 (0xc0000005))
4/8/2006 4:57:17 AM|rosetta@home|Unrecoverable error for result HBLR_1.0_1ogw_426_1087_0 ( - exit code -1073741819 (0xc0000005))
4/8/2006 5:02:37 AM|rosetta@home|Unrecoverable error for result HBLR_1.0_1ogw_425_7374_1 ( - exit code -1073741819 (0xc0000005))
4/8/2006 5:14:59 AM|rosetta@home|Unrecoverable error for result HBLR_1.0_2tif_425_5274_1 ( - exit code -1073741819 (0xc0000005))

[DPC] C0w Crunch3rz
Joined: 11 Feb 06
Posts: 1
Credit: 286,166
RAC: 0
Message 13246 - Posted: 8 Apr 2006, 16:29:01 UTC

Is it (technically) possible to switch back to 4.83? It seems that these errors only occur in 4.97.

genes
Joined: 8 Oct 05
Posts: 60
Credit: 702,872
RAC: 777
Message 13247 - Posted: 8 Apr 2006, 16:54:51 UTC
Last modified: 8 Apr 2006, 16:56:25 UTC

I've gotten 8 errors with 4.97 over the last 2 days on several machines, and that's just with Rosetta! There's also Ralph, which is currently using 4.97, and I'm having errors there as well.

They are ALL 0xC0000005 errors (access violation). I could list them here, but there are already plenty to look at. Just checking in.

I think I have had only one finish without errors.
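
For reference, the -1073741819 in the BOINC messages and 0xC0000005 are the same 32-bit value, read once as a signed number and once as unsigned. A minimal Python sketch of the conversion follows; the helper names are made up for illustration.

```python
def to_unsigned32(exit_code):
    """Reinterpret a (possibly negative) exit code as an unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF


def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


if __name__ == "__main__":
    # The exit code reported throughout this thread.
    print(hex(to_unsigned32(-1073741819)))
    print(to_signed32(0xC0000005))
```

So the two numbers in each log line are not two different errors, just two spellings of the same access-violation code.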

Jimi@0wned.org.uk
Joined: 10 Mar 06
Posts: 29
Credit: 335,252
RAC: 0
Message 13249 - Posted: 8 Apr 2006, 17:07:11 UTC

When the last 4.83 WU finishes, I'm pulling my boxes til this gets fixed. All the 4.97s have failed; some have failed elsewhere before, others have gone on to fail elsewhere. It's a show-stopper, whatever the change was.

Nite Owl
Joined: 2 Nov 05
Posts: 87
Credit: 3,019,449
RAC: 0
Message 13251 - Posted: 8 Apr 2006, 17:22:06 UTC
Last modified: 8 Apr 2006, 17:26:36 UTC

Somehow I get the feeling 4.97 is NOT ready for prime time.... 24 of 25 WU's failed running at Rosetta since about 5:20 EDT yesterday, and 15 of 16 running at Ralph since it was released...... Hopefully this can be resolved without having to wait until Monday..... Hopefully
Join the Teddies@WCG

Dimitris Hatzopoulos
Joined: 5 Jan 06
Posts: 336
Credit: 80,939
RAC: 0
Message 13253 - Posted: 8 Apr 2006, 17:42:49 UTC
Last modified: 8 Apr 2006, 18:26:13 UTC

Mostly failures with v4.97 here too.

PS: It'd been almost 2 months since I had errors on my machines, so it's probably a 4.97 thing...
Best UFO Resources
Wikipedia R@h
How-To: Join Distributed Computing projects that benefit humanity

rbpeake
Joined: 25 Sep 05
Posts: 168
Credit: 247,828
RAC: 0
Message 13258 - Posted: 8 Apr 2006, 18:27:17 UTC - in response to Message 13238.  

"Unfortunately, I'm just leaving for a family weekend trip so can't figure things out right away. Please bear with us for a couple of days."

I guess the moral of the story is, make new releases early in the week when project people are available to help if that becomes necessary. Hindsight is of course wonderful, but perhaps a useful suggestion for the future. :)

Regards,
Bob P.

odb
Joined: 1 Jan 06
Posts: 1
Credit: 33,903
RAC: 0
Message 13259 - Posted: 8 Apr 2006, 18:28:19 UTC

All the 4.97s I've done with this machine have been erroring out. I've stopped crunching to save on power bills until it's fixed.

https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=197500

Sam-TNO-
Joined: 15 Feb 06
Posts: 2
Credit: 252,795
RAC: 0
Message 13260 - Posted: 8 Apr 2006, 18:33:07 UTC

https://boinc.bakerlab.org/rosetta/result.php?resultid=16647818
https://boinc.bakerlab.org/rosetta/result.php?resultid=16647880
https://boinc.bakerlab.org/rosetta/result.php?resultid=16647868
https://boinc.bakerlab.org/rosetta/result.php?resultid=16647860

All of the first four WUs I got with 4.97 failed. Three out of four within 85 seconds...

Sam-TNO-
Team-SciFi

Rebel Alliance
Joined: 4 Nov 05
Posts: 50
Credit: 3,579,531
RAC: 0
Message 13275 - Posted: 8 Apr 2006, 20:56:35 UTC

Only one of my machines has made it to these work units. So far 6 out of 8 have failed on that machine.

DonAnalog

Joined: 8 Apr 06
Posts: 1
Credit: 7,433
RAC: 0
Message 13276 - Posted: 8 Apr 2006, 21:12:06 UTC

I just started this today, and I was getting "Unrecoverable error for result" on HBLR 1.0 units after about 10 minutes, for 6 tries. Then one ran for about 3 hours before giving that error, and another ran about 90 minutes before a different error: Unrecoverable error for result FARELAX_NOFILTERS_1aiu....
I am turning this time waster off. Someone might reply to me IF these results are somewhat usable; otherwise I am not going to donate my spare computer time to filling up a BIT bucket!

Don Jones

Dimitris Hatzopoulos

Joined: 5 Jan 06
Posts: 336
Credit: 80,939
RAC: 0
Message 13278 - Posted: 8 Apr 2006, 21:48:55 UTC

Linux v4.97 seems to work OK so far (even for 8-hour long WUs).

Windows keeps having problems.

So far I've reduced WU runtime to just 1-hr WUs as suggested (to get through the weekend), although I'll probably just set the "No new work" flag for the project on Win PCs and crunch for my backup projects until it gets fixed.

That's another good thing about BOINC projects.
Best UFO Resources
Wikipedia R@h
How-To: Join Distributed Computing projects that benefit humanity

Carlos_Pfitzner
Joined: 22 Dec 05
Posts: 71
Credit: 138,867
RAC: 0
Message 13279 - Posted: 8 Apr 2006, 21:58:37 UTC
Last modified: 8 Apr 2006, 22:01:51 UTC

Exit status -1073741819 (0xc0000005)
at address 0x00599FF4 read attempt to address 0x0ACFFB34
https://boinc.bakerlab.org/rosetta/result.php?resultid=16646578
Rosetta 4.97 Windows
Messages related to above error ...
Date | Location | Project | ID | Message
4/8/2006 4:37:40 PM | carlos.cp3 | rosetta@home | 130 | Starting result HBLR_1.0_2tif_426_5745_0 using rosetta version 497
4/8/2006 4:37:41 PM | carlos.cp3 | --- | 131 | request_reschedule_cpus: process exited
4/8/2006 5:16:26 PM | carlos.cp3 | rosetta@home | 132 | Started download of aa1dcj_09_05.400_v1_3.gz
4/8/2006 5:34:15 PM | carlos.cp3 | rosetta@home | 133 | Finished download of aa1dcj_09_05.400_v1_3.gz
4/8/2006 5:34:15 PM | carlos.cp3 | rosetta@home | 134 | Throughput 2698 bytes/sec
4/8/2006 5:34:16 PM | carlos.cp3 | --- | 135 | request_reschedule_cpus: files downloaded
4/8/2006 6:15:09 PM | carlos.cp3 | rosetta@home | 136 | Unrecoverable error for result HBLR_1.0_2tif_426_5745_0 ( - exit code -1073741819 (0xc0000005))
4/8/2006 6:15:09 PM | carlos.cp3 | --- | 137 | request_reschedule_cpus: process exited
4/8/2006 6:15:09 PM | carlos.cp3 | rosetta@home | 138 | Computation for result HBLR_1.0_2tif_426_5745_0 finished


Nite Owl

Joined: 2 Nov 05
Posts: 87
Credit: 3,019,449
RAC: 0
Message 13280 - Posted: 8 Apr 2006, 22:03:43 UTC
Last modified: 8 Apr 2006, 22:04:22 UTC

Stopped running Rosetta........... again. Somebody ring a bell when this mess is fixed.... In the meantime I'll be at Grid.org...

Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 13282 - Posted: 8 Apr 2006, 22:35:29 UTC - in response to Message 13280.  

"Stopped running Rosetta........... again. Somebody ring a bell when this mess is fixed.... In the meantime I'll be at Grid.org..."


I have put in a report of the errors for Version 4.97 to the project team. Usually they react very fast, so we should see something very soon. If I get some word off line I will post it here.

Moderator9
ROSETTA@home FAQ
Moderator Contact

Lucky Angel~AES_koetje
Joined: 18 Mar 06
Posts: 4
Credit: 0
RAC: 0
Message 13283 - Posted: 8 Apr 2006, 22:39:58 UTC - in response to Message 13238.  

I am new here and using version 4.97. I too have almost all my WU's failing with similar codes. ***unrecoverable error for result HBLR_1.0_2reb_426_1061_0 (-exit code -1073741819 (0xc0000005))***



"I'm really sorry about these problems. I checked yesterday on RALPH and everything seemed fine, but there clearly is a problem. Unfortunately, I'm just leaving for a family weekend trip so can't figure things out right away. Please bear with us for a couple of days."

Nothing wrong with your trip. But I wonder if you do realize the consequences if nobody else seems to react to serious problems.

Carlos_Pfitzner
Joined: 22 Dec 05
Posts: 71
Credit: 138,867
RAC: 0
Message 13284 - Posted: 8 Apr 2006, 22:44:25 UTC

List of results that errored out using Rosetta 4.97 on my Windows PCs
*Until the date of this post only - tomorrow I will make a new list!

Exit status -1073741819 (0xc0000005)
https://boinc.bakerlab.org/rosetta/result.php?resultid=16623409

Exit status -1073741819 (0xc0000005)
https://boinc.bakerlab.org/rosetta/result.php?resultid=16525178

Exit status -1073741819 (0xc0000005)
https://boinc.bakerlab.org/rosetta/result.php?resultid=16525161

Exit status -1073741819 (0xc0000005)
https://boinc.bakerlab.org/rosetta/result.php?resultid=16646341

Exit status -1073741819 (0xc0000005)
https://boinc.bakerlab.org/rosetta/result.php?resultid=16518794

Exit status -1073741819 (0xc0000005)
https://boinc.bakerlab.org/rosetta/result.php?resultid=16523459



©2024 University of Washington
https://www.bakerlab.org