Miscellaneous Work Unit Errors

Message boards : Number crunching : Miscellaneous Work Unit Errors

Laurenu2

Joined: 6 Nov 05
Posts: 57
Credit: 3,818,778
RAC: 0
Message 13286 - Posted: 8 Apr 2006, 22:48:58 UTC

David, what is up with all the BAD W/Us? I must have close to 500
4/8/2006 3:22:39 PM|rosetta@home|Unrecoverable error for result HBLR_1.0_1hz6_426_5085_0 ( - exit code -1073741819 (0xc0000005))
messages on all of my nodes running Rosetta.
You asked for my/our help here. Well, you must realize that if you continue to give out W/Us that stall our PCs or that cannot complete, the DC'ers here WILL lose faith in this project and in the quality of the data we produce.
As I see it, you must do some, or more, in-house testing before a new version or WU batch.
And as stated below, you should NOT do any releases at a time when you cannot have a full staff on hand to make fixes.
This project costs me/us a lot of money to run, not counting my time. And I do expect a lot more than 1 week of good W/Us to keep running.
Please remove the bad W/Us or the version that is causing this so we/I can stop spinning our cooling fans.
If You Want The Best You Must forget The Rest
---------------And Join Free-DC----------------
ID: 13286 · Rating: 0
Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 13288 - Posted: 8 Apr 2006, 23:11:41 UTC

I just got this message from David Kim who is currently addressing this problem.

"I just reverted back to the previous app. You should notice a version
4.98 now, which is really version 4.83 for windows and mac, and 4.82
for linux."


You all should see some relief very soon. If you force an update it should load the new version once the server is set up.

Moderator9
ROSETTA@home FAQ
Moderator Contact
ID: 13288 · Rating: 0
rbpeake
Joined: 25 Sep 05
Posts: 168
Credit: 247,828
RAC: 0
Message 13289 - Posted: 8 Apr 2006, 23:12:37 UTC - in response to Message 13286.  

David, what is up with all the BAD W/Us? I must have close to 500
4/8/2006 3:22:39 PM|rosetta@home|Unrecoverable error for result HBLR_1.0_1hz6_426_5085_0 ( - exit code -1073741819 (0xc0000005))
messages on all of my nodes running Rosetta.
You asked for my/our help here. Well, you must realize that if you continue to give out W/Us that stall our PCs or that cannot complete, the DC'ers here WILL lose faith in this project and in the quality of the data we produce.

This is what hurts the project when no one is around to respond, like on a weekend. As I mentioned below, new releases should be made when the project team is immediately available and on high alert to respond quickly to any problems (they probably should be on extra high alert the first day or so of any new release, even one that has been beta tested).

With so many people making so many contributions in energy, time, and effort, a project needs to be extra prudent with any changes.

Just my 2 cents! :)

Regards,
Bob P.
ID: 13289 · Rating: 0
Dimitris Hatzopoulos
Joined: 5 Jan 06
Posts: 336
Credit: 80,939
RAC: 0
Message 13292 - Posted: 8 Apr 2006, 23:32:59 UTC
Last modified: 8 Apr 2006, 23:35:37 UTC

OK, "new" v4.98 downloaded for Win and running already.

Meanwhile, for the last 4 hours, I had set WU-runtime = 1hr and all WUs completed OK without errors on Win too.

Btw, I just re-connected to RALPH (Rosetta's ALPHA test project), which I had set to "No new work" recently, as my PCs never had any problems for the past 2 months anyway LOL
Best UFO Resources
Wikipedia R@h
How-To: Join Distributed Computing projects that benefit humanity
ID: 13292 · Rating: 0
Dorphas
Joined: 14 Feb 06
Posts: 2
Credit: 60,275
RAC: 0
Message 13295 - Posted: 9 Apr 2006, 0:55:25 UTC

since last night ver .97 is causing TONS of errors on all my computers. this is why people avoid this project. i have had about enough of these bugs myself. about to cross the line and crunch for something else more stable.
ID: 13295 · Rating: 0
Nuadormrac
Joined: 27 Sep 05
Posts: 37
Credit: 202,469
RAC: 0
Message 13305 - Posted: 9 Apr 2006, 7:57:34 UTC - in response to Message 13283.  
Last modified: 9 Apr 2006, 7:58:12 UTC

I am new here and using version 4.97. I too have almost all my WU's failing with similar codes. ***unrecoverable error for result HBLR_1.0_2reb_426_1061_0 (-exit code -1073741819 (0xc0000005))***



I'm really sorry about these problems. I checked yesterday on RALPH and everything seemed fine, but there clearly is a problem. Unfortunately, I'm just leaving for a family weekend trip so can't figure things out right away. Please bear with us for a couple of days.

Nothing wrong with your trip. But I wonder if you do realize the consequences if nobody else seems to react to serious problems


Yeah, sorry we didn't catch something sooner, but on the WU types we were testing earlier, everything was going up and validating successfully. That was until we got the HBLR units in the morning, and only then did failures start coming out...

A couple HBLR units did validate on my machine here (over on RALPH), but the vast majority were a no go, with many of them getting 3 failures, and some 2 with 1 success. Others didn't have a report then, so not sure what became of them...

Until we got the newer WU types, wasn't able to report on any problems with them obviously, and could only report on the older types... Sorry us testers weren't able to catch this problem before it started rolling out...
ID: 13305 · Rating: 0
Klaws
Joined: 23 Nov 05
Posts: 1
Credit: 0
RAC: 0
Message 13307 - Posted: 9 Apr 2006, 8:38:05 UTC

Yesterday, my machine (W2K SP4) displayed a message box which told me that there's no CD in the CD drive (drive F:). The message box was caused by Boinc (which runs nothing else but Rosetta). I tried the "Abort" button, the message box popped up again, and I hit it again.

The Boinc manager showed me the following line:

08.04.2006 04:23:46|rosetta@home|Unrecoverable error for result HBLR_1.0_1hz6_425_4925_0 ( - exit code -1073741819 (0xc0000005))

The next time the message box appeared, I tried "Continue". Same effect.

I then placed a CD in the drive. No more message boxes, but the Boinc manager insists on:

08.04.2006 21:39:33|rosetta@home|Unrecoverable error for result FARELAX_NOFILTERS_1ptq__427_57_0 ( - exit code -1073741819 (0xc0000005))

Seems every work unit fails.
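
For anyone who wants to count how many of these messages a machine has collected, here is a minimal Python sketch. It assumes the `timestamp|project|text` layout seen in the lines quoted in this thread; the regular expression and the `tally_errors` helper are illustrative only and are not part of BOINC.

```python
import re
from collections import Counter

# Layout inferred only from the message lines quoted in this thread:
#   <timestamp>|<project>|Unrecoverable error for result <name> ( - exit code <n> (<hex>))
LINE_RE = re.compile(
    r"^(?P<when>[^|]+)\|(?P<project>[^|]+)\|"
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\(\s*-\s*exit code (?P<code>-?\d+) \((?P<hex>0x[0-9a-fA-F]+)\)\)\s*$"
)

def tally_errors(lines):
    """Count unrecoverable errors per (batch prefix, hex exit code)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            # Batch prefix = text before the first underscore (e.g. HBLR).
            batch = m.group("result").split("_", 1)[0]
            counts[(batch, m.group("hex").lower())] += 1
    return counts

sample = [
    "4/8/2006 3:22:39 PM|rosetta@home|Unrecoverable error for result HBLR_1.0_1hz6_426_5085_0 ( - exit code -1073741819 (0xc0000005))",
    "08.04.2006 04:23:46|rosetta@home|Unrecoverable error for result HBLR_1.0_1hz6_425_4925_0 ( - exit code -1073741819 (0xc0000005))",
    "08.04.2006 21:39:33|rosetta@home|Unrecoverable error for result FARELAX_NOFILTERS_1ptq__427_57_0 ( - exit code -1073741819 (0xc0000005))",
]

counts = tally_errors(sample)
for (batch, code), n in sorted(counts.items()):
    print(batch, code, n)
```

Grouping by batch prefix makes it easier to see whether only one WU type (such as HBLR) is failing or whether every type is affected.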

HOWEVER, the question remains: why does Boinc/Rosetta attempt to access my CD drive? Boinc is installed on drive E: (C: is the "Windows drive", D: is the swap drive, E: is my "app drive", F: is the CD burner (no media inside when the first errors occurred), G: is the DVD burner (no media inside at any "error time"), H: is my "scratch drive", I: is a (historic) FAT drive for MS-DOS dual boot...yup, J: is a FAT USB Flash disc (was not present when the errors occurred), K: is a removable 120GB HD, L: is a removable FAT HD (for backup purposes), M: is a remote 120GB "Firewire" HD (powered down and not accessible at this time), P: is a removable 200GB HD, U: is a virtual DVD ROM, drives V:-Z: are network drives, some available, some not...and the missing drive letters correspond to currently removed HDs). The reason for my elaborate listing of all drive letters is that Boinc/Rosetta appears to be interested only in drive F: (the CD drive, now with CD inside), and doesn't proceed to drive G:, even if drive F: now works. That relieves me somehow from the suspicion that the software is trying to spy out my machine!

Still, I consider the behavior suspicious. At least it produces errors, so something should get fixed ASAP, IMHO!

Best regards, KLaus
- Klaws
ID: 13307 · Rating: 0
KWSN Sir Clark
Joined: 18 Sep 05
Posts: 46
Credit: 387,432
RAC: 0
Message 13321 - Posted: 9 Apr 2006, 14:24:29 UTC

WU 13577480 produced:

<core_client_version>5.2.13</core_client_version>
<message> - exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
# random seed: 1359430
# cpu_run_time_pref: 7200
No heartbeat from core client for 31 sec - exiting
# cpu_run_time_pref: 7200

***UNHANDLED EXCEPTION****
Reason: Access Violation (0xc0000005) at address 0x007022EA read attempt to address 0x0EB4FC6C


Dump of the Worker(offending) thread:
1: 04/09/06 15:06:25


Dump of the Timer thread:
2: 04/09/06 15:06:25


Dump of the Graphics thread:
3: 04/09/06 15:06:25


Exiting...

</stderr_txt>


Only BOINC running, apps sent to be pre-empted.

Errored out three times now, so the WU has been cancelled.
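
For readers wondering how the negative exit code relates to the hex value in the dump: -1073741819 is the signed 32-bit view of 0xC0000005, the Windows status code for an access violation. The Python sketch below shows the conversion and pulls the addresses out of the "Reason:" line. The helper name and the regular expression are illustrative, and allowing "write" as well as "read" in the pattern is an assumption, since only a read fault appears in this thread.

```python
import re

STATUS_ACCESS_VIOLATION = 0xC0000005  # Windows status code for an access violation

def to_unsigned32(code):
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF

exit_code = -1073741819
print(hex(to_unsigned32(exit_code)))

# Pattern based on the single "Reason:" line posted above; "write" is an
# assumed alternative that does not appear in the thread.
FAULT_RE = re.compile(
    r"Access Violation \((0x[0-9a-fA-F]+)\) at address (0x[0-9a-fA-F]+) "
    r"(read|write) attempt to address (0x[0-9a-fA-F]+)"
)

line = ("Reason: Access Violation (0xc0000005) at address 0x007022EA "
        "read attempt to address 0x0EB4FC6C")
m = FAULT_RE.search(line)
status, instr_addr, access, target_addr = m.groups()
print(status, instr_addr, access, target_addr)
```

The same instruction address showing up across many failed results would point at one bug in the application rather than at flaky hardware on individual hosts.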
ID: 13321 · Rating: 0
Dimitris Hatzopoulos
Joined: 5 Jan 06
Posts: 336
Credit: 80,939
RAC: 0
Message 13337 - Posted: 9 Apr 2006, 17:21:35 UTC - in response to Message 13295.  

since last night ver .97 is causing TONS of errors on all my computers. this is why people avoid this project. i have had about enough of these bugs myself. about to cross the line and crunch for something else more stable.


Before rushing to condemn, let's summarise yesterday's event: it was a problem with v4.97 under Windows (as far as I could tell from my machines, Linux v4.97 had no problems) which lasted about 16 hours, until D.Kim rolled back the executable to the previous stable version.

IMHO it was much less of a *real* issue than the "stuck at 1%" issue (which btw I encountered just ONCE in 3 months of crunching on 3x P4 PCs, but I realise it occurs much more often for other people), because yesterday WUs simply errored out for several hours, but NO manual intervention whatsoever was/is needed by the operator.

End result: probably about half a day of crunching lost for Win PCs.

So, my Win PCs spent about half a day crunching for nothing.

But I look at it this way: currently all other BOINC projects use a "quorum" of 3 or 4 and an initial replication of 3-5, i.e. they send the very same WU to 3-5 (!!!) PCs, effectively using just 1/3rd to 1/5th (and sometimes even less) of the raw CPU time donated by us, the BOINC donors.
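
The 1/3rd to 1/5th figure can be worked out in a couple of lines of Python. This is only a sketch under a simplifying assumption that is not stated in the post: every replica of a WU runs to completion and exactly one of the results counts as useful output. The function name is made up for illustration.

```python
from fractions import Fraction

def useful_fraction(initial_replication):
    """Share of donated CPU time that yields a distinct result,
    assuming all replicas finish and one result per WU is kept."""
    return Fraction(1, initial_replication)

for replicas in (3, 4, 5):
    print(replicas, "replicas ->", useful_fraction(replicas), "of CPU time")
```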

PS: Of course I'm unhappy that the project did an upgrade this way, without someone monitoring it closely for the next 6-12 hours. But it's Murphy's law and I try to keep things in perspective.
Best UFO Resources
Wikipedia R@h
How-To: Join Distributed Computing projects that benefit humanity
ID: 13337 · Rating: 0
Carlos_Pfitzner
Joined: 22 Dec 05
Posts: 71
Credit: 138,867
RAC: 0
Message 13349 - Posted: 9 Apr 2006, 19:36:32 UTC

Until we got the newer WU types, wasn't able to report on any problems with them obviously, and could only report on the older types... Sorry us testers weren't able to catch this problem before it started rolling out...


I disagree with the quote above.

I started the thread at ralph@home announcing the new version 4.97 for testing
on 7 April at 09:37 UTC.

By 12:52 UTC on 7 April, I had already reported this error on Windows 4.97.

By 00:24 UTC on 8 April, Son Goku posted that 4.97 was working fine.

After that time, on 8 April, 4.97 went to rosetta@home ...


I wonder why 4.95, which was working very well and fixed several problems,
is not what was placed into rosetta@home instead -:(

http://www.fadbeens.co.uk/phpBB2/viewtopic.php?t=53&start=165

Now that it has been rolled back to 4.83, I know why I crunched all day
without completing even one WU of 4.98 on two of my PCs.

-> see my signature ... my RAC is falling on rosetta -:(

So I will STOP crunching for rosetta again until a new version comes in that
checkpoints often enough to allow apps to be removed from RAM when swapped out.

4.95 was that version !!! 4.96 too
Click signature for global team stats
ID: 13349 · Rating: 0
Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 13354 - Posted: 9 Apr 2006, 20:03:31 UTC - in response to Message 13349.  

Until we got the newer WU types, wasn't able to report on any problems with them obviously, and could only report on the older types... Sorry us testers weren't able to catch this problem before it started rolling out...


I disagree with the quote above.

I started the thread at ralph@home announcing the new version 4.97 for testing
on 7 April at 09:37 UTC.

By 12:52 UTC on 7 April, I had already reported this error on Windows 4.97.

By 00:24 UTC on 8 April, Son Goku posted that 4.97 was working fine.

After that time, on 8 April, 4.97 went to rosetta@home ...


I wonder why 4.95, which was working very well and fixed several problems,
is not what was placed into rosetta@home instead -:(

http://www.fadbeens.co.uk/phpBB2/viewtopic.php?t=53&start=165

Now that it has been rolled back to 4.83, I know why I crunched all day
without completing even one WU of 4.98 on two of my PCs.

-> see my signature ... my RAC is falling on rosetta -:(

So I will STOP crunching for rosetta again until a new version comes in that
checkpoints often enough to allow apps to be removed from RAM when swapped out.

4.95 was that version !!! 4.96 too


In fact the information that I have is that Rosetta version 4.97 WAS in fact RALPH version 4.95. The version number was all that changed when it was implemented for Rosetta. What was not known at the time was how the newer WUs would react in the production environment. What is interesting here is that the RALPH testers are usually running BOINC version 5.2.32 and most Rosetta users are running BOINC 5.2.13. This may be part of the issue.

In any case you are wrong about what was implemented in Rosetta. While the version number for the Rosetta application is different, it is the same application that was working well in RALPH. RALPH version 4.97 is not what was deployed in Rosetta.

The workunit testing in RALPH did not show any problems with the newer workunits, however RALPH is a VERY limited subset of the types of systems and configurations running in Rosetta. Because of this fact it is not possible to test every possible issue before new work unit types are deployed.

As has been pointed out on many occasions, a number of your systems are below the minimum memory requirements for the project. The ones that are not are reporting a significant portion of the memory as not available for Rosetta to use. This single fact has been and will continue to be the largest problem facing you in running Rosetta or Ralph. There are almost no problems reported for systems running with more memory unless a batch of bad work units comes along, and that will happen from time to time.

Moderator9
ROSETTA@home FAQ
Moderator Contact
ID: 13354 · Rating: 0
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 13356 - Posted: 9 Apr 2006, 20:31:04 UTC - in response to Message 13354.  
Last modified: 9 Apr 2006, 20:34:52 UTC


What is interesting here is that the RALPH testers are usually running BOINC version 5.2.32 and most Rosetta users are running BOINC 5.2.13. This may be part of the issue.

5.2.32? I can't find that build anywhere. I know I never tested it. 5.3.31 is the latest alpha build. Here's a list of all V5 builds from Boinc to date and release dates/times
(note: I trimmed mac, apple, and pdbs' from the list to shorten it):

boinc_5.1.1_i686-pc-linux-gnu.sh 30-Aug-2005 21:37 2.8M
boinc_5.1.1_windows_intelx86.exe 30-Aug-2005 21:37 9.9M
boinc_5.1.2_i686-pc-linux-gnu.sh 07-Sep-2005 11:35 2.8M
boinc_5.1.2_windows_intelx86.exe 07-Sep-2005 11:26 10M
boinc_5.1.3_i686-pc-linux-gnu.sh 09-Sep-2005 11:16 2.8M
boinc_5.1.3_windows_intelx86.exe 09-Sep-2005 11:17 10M
boinc_5.1.4_i686-pc-linux-gnu.sh 20-Sep-2005 14:48 2.8M
boinc_5.1.4_windows_intelx86.exe 20-Sep-2005 14:48 10M
boinc_5.1.5_i686-pc-linux-gnu.sh 29-Sep-2005 00:46 2.8M
boinc_5.1.5_windows_intelx86.exe 29-Sep-2005 00:46 10M
boinc_5.1.6_i686-pc-linux-gnu.sh 01-Oct-2005 02:53 2.8M
boinc_5.1.6_windows_intelx86.exe 01-Oct-2005 02:54 10M
boinc_5.1.8_i686-pc-linux-gnu.sh 05-Oct-2005 19:28 2.8M
boinc_5.1.8_windows_intelx86.exe 05-Oct-2005 19:28 10M
boinc_5.1.9_i686-pc-linux-gnu.sh 09-Oct-2005 18:42 2.9M
boinc_5.1.9_windows_intelx86.exe 09-Oct-2005 18:42 10M
boinc_5.1.10_i686-pc-linux-gnu.sh 10-Oct-2005 12:27 2.9M
boinc_5.1.10_windows_intelx86.exe 10-Oct-2005 12:27 10M
boinc_5.2.0_i686-pc-linux-gnu.sh 10-Oct-2005 16:48 2.9M
boinc_5.2.0_windows_intelx86.exe 10-Oct-2005 16:48 10M
boinc_5.2.1_i686-pc-linux-gnu.sh 10-Oct-2005 19:55 2.9M
boinc_5.2.1_windows_intelx86.exe 10-Oct-2005 19:56 10M
boinc_5.2.2_i686-pc-linux-gnu.sh 17-Oct-2005 14:33 2.9M
boinc_5.2.2_windows_intelx86.exe 17-Oct-2005 14:34 10M
boinc_5.2.3_i686-pc-linux-gnu.sh 19-Oct-2005 19:59 3.4M
boinc_5.2.4_i686-pc-linux-gnu.sh 19-Oct-2005 23:22 3.4M
boinc_5.2.5_i686-pc-linux-gnu.sh 27-Oct-2005 14:01 3.4M
boinc_5.2.5_windows_intelx86.exe 27-Oct-2005 14:02 10M
boinc_5.2.6_i686-pc-linux-gnu.sh 31-Oct-2005 17:05 3.4M
boinc_5.2.6_windows_intelx86.exe 31-Oct-2005 17:06 10M
boinc_5.2.7_i686-pc-linux-gnu.sh 08-Nov-2005 00:52 3.4M
boinc_5.2.7_windows_intelx86.exe 08-Nov-2005 00:53 10M
boinc_5.2.8_i686-pc-linux-gnu.sh 22-Nov-2005 17:47 3.4M
boinc_5.2.8_windows_intelx86.exe 22-Nov-2005 17:48 10M
boinc_5.2.9_windows_intelx86.exe 25-Nov-2005 20:06 10M
boinc_5.2.10_windows_intelx86.exe 26-Nov-2005 01:08 10M
boinc_5.2.11_windows_intelx86.exe 26-Nov-2005 03:48 10M
boinc_5.2.12_windows_intelx86.exe 26-Nov-2005 18:22 10M
boinc_5.2.13_i686-pc-linux-gnu.sh 29-Nov-2005 02:46 3.5M
boinc_5.2.13_windows_intelx86.exe 29-Nov-2005 02:47 10M
boinc_5.2.14_windows_intelx86.exe 04-Dec-2005 03:51 10M
boinc_5.2.15_i686-pc-linux-gnu.sh 28-Dec-2005 06:14 3.5M
boinc_5.2.15_windows_intelx86.exe 28-Dec-2005 06:14 10M
boinc_5.3.2_windows_intelx86.exe 06-Dec-2005 03:32 10M
boinc_5.3.3_windows_intelx86.exe 19-Dec-2005 05:59 10M
boinc_5.3.6_windows_intelx86.exe 28-Dec-2005 05:50 10M
boinc_5.3.15_i686-pc-linux-gnu.sh 30-Jan-2006 15:27 3.5M
boinc_5.3.15_windows_intelx86.exe 27-Jan-2006 13:22 10M
boinc_5.3.16_i686-pc-linux-gnu.sh 30-Jan-2006 19:19 3.5M
boinc_5.3.16_windows_intelx86.exe 30-Jan-2006 19:09 10M
boinc_5.3.17_windows_intelx86.exe 02-Feb-2006 13:19 10M
boinc_5.3.20_i686-pc-linux-gnu.sh 23-Feb-2006 00:57 3.5M
boinc_5.3.20_windows_intelx86.exe 23-Feb-2006 00:49 10M
boinc_5.3.21_i686-pc-linux-gnu.sh 24-Feb-2006 00:21 3.5M
boinc_5.3.21_windows_intelx86.exe 24-Feb-2006 00:24 10M
boinc_5.3.22_i686-pc-linux-gnu.sh 24-Feb-2006 17:35 3.5M
boinc_5.3.22_windows_intelx86.exe 24-Feb-2006 18:00 10M
boinc_5.3.23_i686-pc-linux-gnu.sh 01-Mar-2006 03:08 3.5M
boinc_5.3.23_windows_intelx86.exe 01-Mar-2006 03:05 10M
boinc_5.3.24_i686-pc-linux-gnu.sh 06-Mar-2006 13:14 3.5M
boinc_5.3.24_windows_intelx86.exe 06-Mar-2006 12:39 10M
boinc_5.3.26_i686-pc-linux-gnu.sh 14-Mar-2006 01:21 3.5M
boinc_5.3.26_windows_intelx86.exe 14-Mar-2006 01:11 10M
boinc_5.3.27_i686-pc-linux-gnu.sh 17-Mar-2006 02:18 3.6M
boinc_5.3.27_windows_intelx86.exe 17-Mar-2006 02:14 10M
boinc_5.3.28_i686-pc-linux-gnu.sh 21-Mar-2006 15:20 3.6M
boinc_5.3.28_windows_intelx86.exe 21-Mar-2006 15:03 10M
boinc_5.3.29_i686-pc-linux-gnu.sh 28-Mar-2006 00:09 3.6M
boinc_5.3.29_windows_intelx86.exe 28-Mar-2006 00:32 10M
boinc_5.3.30_i686-pc-linux-gnu.sh 28-Mar-2006 23:26 3.6M
boinc_5.3.30_windows_intelx86.exe 29-Mar-2006 00:02 10M
boinc_5.3.31_i686-pc-linux-gnu.sh 30-Mar-2006 18:37 3.6M
boinc_5.3.31_windows_intelx86.exe 30-Mar-2006 19:54 8.6M
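
A listing like this can be checked mechanically. The Python sketch below parses the version out of each file name and compares versions numerically, so that 5.1.10 sorts after 5.1.2. Only a handful of file names from the list are included, the helper name is made up, and treating an even minor number as the recommended line is an assumption taken from how versions are discussed in this thread.

```python
import re

VERSION_RE = re.compile(r"^boinc_(\d+)\.(\d+)\.(\d+)_")

def parse_version(filename):
    """Return (major, minor, patch) as integers, or None if no match."""
    m = VERSION_RE.match(filename)
    return tuple(int(part) for part in m.groups()) if m else None

# A few entries copied from the listing above.
names = [
    "boinc_5.1.2_windows_intelx86.exe",
    "boinc_5.1.10_windows_intelx86.exe",
    "boinc_5.2.13_windows_intelx86.exe",
    "boinc_5.2.15_windows_intelx86.exe",
    "boinc_5.3.28_windows_intelx86.exe",
    "boinc_5.3.31_windows_intelx86.exe",
]

versions = sorted(v for v in map(parse_version, names) if v)
latest = versions[-1]
# Assumption: an even minor number marks the recommended release line.
latest_even_minor = max(v for v in versions if v[1] % 2 == 0)

print("latest build:", latest)
print("latest even-minor build:", latest_even_minor)
print("5.2.32 in sample:", (5, 2, 32) in versions)
```

Comparing integer tuples instead of raw strings is the design choice that matters here: a plain string sort would place "5.1.10" before "5.1.2".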


Perhaps you've some new source I'm not aware of, and I'd like to try it. I've seen you mention it a couple times.

ID: 13356 · Rating: 0
Dimitris Hatzopoulos
Joined: 5 Jan 06
Posts: 336
Credit: 80,939
RAC: 0
Message 13358 - Posted: 9 Apr 2006, 20:40:23 UTC
Last modified: 9 Apr 2006, 20:45:42 UTC

It seems like BOINC v5.3.31 is latest, to see the BETA versions one can use the URL

http://boinc.berkeley.edu/download.php?dev=1

whereas the "official" stable versions are at

http://boinc.berkeley.edu/download.php
Best UFO Resources
Wikipedia R@h
How-To: Join Distributed Computing projects that benefit humanity
ID: 13358 · Rating: 0
Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 13362 - Posted: 9 Apr 2006, 21:06:32 UTC - in response to Message 13358.  

It seems like BOINC v5.3.31 is latest, to see the BETA versions one can use the URL

http://boinc.berkeley.edu/download.php?dev=1

whereas the "official" stable versions are at

http://boinc.berkeley.edu/download.php


I guess I need to turn on the lights when I type. I meant to say BOINC 5.2.28. Rom had asked all the RALPH testers to upgrade to this version for improved error reporting. I think almost all of them performed the upgrade.

In any case it would be interesting to see if this had any impact on what has happened over the last day or so.

Moderator9
ROSETTA@home FAQ
Moderator Contact
ID: 13362 · Rating: 0
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 13371 - Posted: 9 Apr 2006, 23:52:37 UTC

I hate to post this, but do you mean 5.3.28? The highest recommended release, with the even minor version number two (2), is 5.2.15.
ID: 13371 · Rating: 0
Carlos_Pfitzner
Joined: 22 Dec 05
Posts: 71
Credit: 138,867
RAC: 0
Message 13372 - Posted: 10 Apr 2006, 1:11:06 UTC - in response to Message 13354.  
Last modified: 10 Apr 2006, 1:14:50 UTC

Until we got the newer WU types, wasn't able to report on any problems with them obviously, and could only report on the older types... Sorry us testers weren't able to catch this problem before it started rolling out...


I disagree with the quote above.

I started the thread at ralph@home announcing the new version 4.97 for testing
on 7 April at 09:37 UTC.

By 12:52 UTC on 7 April, I had already reported this error on Windows 4.97.

By 00:24 UTC on 8 April, Son Goku posted that 4.97 was working fine.

After that time, on 8 April, 4.97 went to rosetta@home ...


I wonder why 4.95, which was working very well and fixed several problems,
is not what was placed into rosetta@home instead -:(

http://www.fadbeens.co.uk/phpBB2/viewtopic.php?t=53&start=165

Now that it has been rolled back to 4.83, I know why I crunched all day
without completing even one WU of 4.98 on two of my PCs.

-> see my signature ... my RAC is falling on rosetta -:(

So I will STOP crunching for rosetta again until a new version comes in that
checkpoints often enough to allow apps to be removed from RAM when swapped out.

4.95 was that version !!! 4.96 too


In fact the information that I have is that Rosetta version 4.97 WAS in fact RALPH version 4.95. The version number was all that changed when it was implemented for Rosetta. What was not known at the time was how the newer WUs would react in the production environment. What is interesting here is that the RALPH testers are usually running BOINC version 5.2.32 and most Rosetta users are running BOINC 5.2.13. This may be part of the issue.

In any case you are wrong about what was implemented in Rosetta. While the version number for the Rosetta application is different, it is the same application that was working well in RALPH. RALPH version 4.97 is not what was deployed in Rosetta.

The workunit testing in RALPH did not show any problems with the newer workunits, however RALPH is a VERY limited subset of the types of systems and configurations running in Rosetta. Because of this fact it is not possible to test every possible issue before new work unit types are deployed.

As has been pointed out on many occasions, a number of your systems are below the minimum memory requirements for the project. The ones that are not are reporting a significant portion of the memory as not available for Rosetta to use. This single fact has been and will continue to be the largest problem facing you in running Rosetta or Ralph. There are almost no problems reported for systems running with more memory unless a batch of bad work units comes along, and that will happen from time to time.


Read here, scroll down to end
http://ralph.bakerlab.org/forum_thread.php?id=155
Thanks,

Click signature for global team stats
ID: 13372 · Rating: 0
Fuzzy Hollynoodles
Joined: 7 Oct 05
Posts: 234
Credit: 15,020
RAC: 0
Message 13388 - Posted: 10 Apr 2006, 13:06:51 UTC - in response to Message 13362.  
Last modified: 10 Apr 2006, 13:10:12 UTC



I guess I need to turn on the lights when I type. I meant to say BOINC 5.2.28. Rom had asked all the RALPH testers to upgrade to this version for improved error reporting. I think almost all of them performed the upgrade.

In any case it would be interesting to see if this had any impact on what has happened over the last day or so.


I guess you mean the 5.3.28 version? :-) Maybe some more light is needed? ;-)

That was the one Rom asked us to upgrade to. And it is pretty stable, runs fine on my computer.



"I'm trying to maintain a shred of dignity in this world." - Me

ID: 13388 · Rating: 0
STE\/E
Joined: 17 Sep 05
Posts: 125
Credit: 4,100,301
RAC: 114
Message 13393 - Posted: 10 Apr 2006, 15:18:41 UTC - in response to Message 13388.  
Last modified: 10 Apr 2006, 15:41:38 UTC

I guess you mean the 5.3.28 version? :-) Maybe some more light is needed? ;-)

That was the one Rom asked us to upgrade to. And it is pretty stable, runs fine on my computer.


v5.3.28 blows as far as I'm concerned. The BOINC Manager will use between 2-3% of the CPU even when it isn't open. The only way to get it to stop using the 2-3% is to close the Manager completely.

It will also use 5-50% of the CPU if you open the Manager and the Work window (now called the Task window), and it acts real jerky at times too when adjusting the windows ... I never saw any of this with v5.2.15, the previous version I was using ...
ID: 13393 · Rating: 0
Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 13394 - Posted: 10 Apr 2006, 15:40:29 UTC - in response to Message 13388.  
Last modified: 10 Apr 2006, 15:44:29 UTC



I guess I need to turn on the lights when I type. I meant to say BOINC 5.2.28. Rom had asked all the RALPH testers to upgrade to this version for improved error reporting. I think almost all of them performed the upgrade.

In any case it would be interesting to see if this had any impact on what has happened over the last day or so.


I guess you mean the 5.3.28 version? :-) Maybe some more light is needed? ;-)

That was the one Rom asked us to upgrade to. And it is pretty stable, runs fine on my computer.



OK, Ok! I get so used to typing the same version number over and over, it is only natural that from time to time I will mess it up. ;>) But I get it now, you just want to pick on the sleepy moderator sitting in the dark room.

Yes what I meant to say was BOINC version
5.3.28

So Tony, you and "Fuzzy" leave me alone to sulk.

Moderator9
ROSETTA@home FAQ
Moderator Contact
ID: 13394 · Rating: 0
Robinski
Joined: 7 Mar 06
Posts: 51
Credit: 85,383
RAC: 0
Message 13416 - Posted: 10 Apr 2006, 19:53:51 UTC - in response to Message 13362.  

It seems like BOINC v5.3.31 is latest, to see the BETA versions one can use the URL

http://boinc.berkeley.edu/download.php?dev=1

whereas the "official" stable versions are at

http://boinc.berkeley.edu/download.php


I guess I need to turn on the lights when I type. I meant to say BOINC 5.2.28. Rom had asked all the RALPH testers to upgrade to this version for improved error reporting. I think almost all of them performed the upgrade.

In any case it would be interesting to see if this had any impact on what has happened over the last day or so.


I did run some Ralph WUs on a 5.2.13 Boinc client and they finished fine.
It was, however, on a machine that I didn't have running this weekend when the 4.97 problems hit.
Member of the Dutch Power Cows

Trying to get the world on IPv6, do you have it? check here: IPv6.RHarmsen.nl
ID: 13416 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org