Odd problem boinc 5.4.9

Message boards : Number crunching : Odd problem boinc 5.4.9

Gaijin

Joined: 3 May 06
Posts: 5
Credit: 4,520
RAC: 0
Message 16208 - Posted: 14 May 2006, 4:15:04 UTC

Hey guys, here is my problem.
I saw that the BOINC manager had been updated to a newer version, so I downloaded it. I closed my old one (5.2.13), installed the new one, and when I opened the new version, that is where all my problems started. Before installing the new version I paused the 2 WUs I had running, and when I opened the new version of BOINC I decided to resume them. Shortly after resuming them, I saw that nothing was crunching. I checked my log and found a strange error saying that the CPU can't compute (or something like that), that it would try to restart the WU, and that if it happened again I might need to restart the project. I checked my log again after 5 minutes and had something like 20 of those errors. So I decided to reset the project. Bad luck again: the same thing happened with the 2 new WUs. Then I decided to downgrade to the old version, which had been running very smoothly. I don't understand why, but the 2 WUs from the start were there, and one finished computing before the other failed. Now this brings me to another problem. I've checked my results page and my computer page: I now have 3 computers (all are the same one) with different WUs that are listed as "running" when obviously they aren't, because I've reset the project 2 times... or 3...

All I would like to know is whether this problem is due to some kind of error I've made, and how to "repair" it so that I can continue crunching WUs with only 1 computer on the "my computer" page that will have all the results on it.

I don't know if you guys understand what I've tried to explain...

my computer page
Moderator9
Volunteer moderator

Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 16209 - Posted: 14 May 2006, 4:23:54 UTC - in response to Message 16208.  

Hey guys, here is my problem
...All I would like to know is whether this problem is due to some kind of error I've made, and how to "repair" it so that I can continue crunching WUs with only 1 computer on the "my computer" page that will have all the results on it.

I don't know if you guys understand what I've tried to explain...

my computer page


Well, starting at the end of the question: the Merge function is currently disabled, and the last info I have from the project is that it will require a server update to fix it. That update is not currently scheduled, so that will have to wait.

You will have fewer problems with BOINC 5.4.9 than you will with 5.2.13. So my advice would be to upgrade to 5.4.9 and we can go from there. I have never seen a message like the one you describe, so what I would suggest is this: after you install BOINC version 5.4.9, start it up and make certain you are attached to Rosetta. If not, attach to the project. If it does not start to work, go to the Messages tab and copy the messages into a post here so we can see what you are seeing. From that we might be able to figure out what is happening.

Moderator9
ROSETTA@home FAQ
Moderator Contact
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 16210 - Posted: 14 May 2006, 4:27:59 UTC - in response to Message 16208.  

All I would like to know is whether this problem is due to some kind of error I've made, and how to "repair" it so that I can continue crunching WUs with only 1 computer on the "my computer" page that will have all the results on it.

Well, as for the 3 computers, no. Until the merge function is activated again, you will still have them all. It doesn't hurt anything. They all belong to you, the same user on the same team etc.

I believe he may be talking about a message like this:
5/13/2006 12:38:17 PM|ralph@home|Task HOMOLOG_ABRELAX_hom007_t283__511_12_0 exited with zero status but no 'finished' file
5/13/2006 12:38:17 PM|ralph@home|If this happens repeatedly you may need to reset the project.


Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
Gaijin
Joined: 3 May 06
Posts: 5
Credit: 4,520
RAC: 0
Message 16211 - Posted: 14 May 2006, 4:38:10 UTC

I believe he may be talking about a message like this:

5/13/2006 12:38:17 PM|ralph@home|Task HOMOLOG_ABRELAX_hom007_t283__511_12_0 exited with zero status but no 'finished' file
5/13/2006 12:38:17 PM|ralph@home|If this happens repeatedly you may need to reset the project.


This is exactly the message I had. What should I do now: finish my current WU on BOINC 5.2.13 and then upgrade to 5.4.9?
What will happen with the other WU that are marked as "running" on the "other" computer, I suppose I will have to forget about those WU...
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 16212 - Posted: 14 May 2006, 4:47:37 UTC - in response to Message 16211.  
Last modified: 14 May 2006, 4:49:46 UTC

What will happen with the other WU that are marked as "running" on the "other" computer, I suppose I will have to forget about those WU...

Sounds like those WUs are gone. You no longer have them. The server has no way to know you've reloaded and aren't actually "running" them. I suppose it would be more accurate of them to say the WU was "downloaded".

Bottom line, they will reach their deadline and be reused or discarded. It doesn't hurt anything. People have PCs die, or they decide to turn one off for 2 weeks, or whatever. There are many reasons for WUs to not report back at all. The project is built to be "fault tolerant".

You SHOULD be able to upgrade to the new BOINC version and continue crunching the WUs you have. I mean, what you tried to do the first time should have worked. No need to suspend the WUs though. Just end BOINC, install (to same directory), start BOINC. I've done two PCs with WUs in progress and they were able to continue crunching on the new version.
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 16213 - Posted: 14 May 2006, 4:55:56 UTC - in response to Message 16208.  

Hey guys, here is my problem.
I saw that the BOINC manager had been updated to a newer version, so I downloaded it. I closed my old one (5.2.13), installed the new one, and when I opened the new version ...

What did you click on to close "the old one"?

tony
Gaijin
Joined: 3 May 06
Posts: 5
Credit: 4,520
RAC: 0
Message 16214 - Posted: 14 May 2006, 4:59:09 UTC

Well, to close "the old one" I simply paused the 2 WUs, told BOINC not to accept any more work, and pressed the red "X" in the top right corner...
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 16215 - Posted: 14 May 2006, 5:02:53 UTC - in response to Message 16214.  
Last modified: 14 May 2006, 5:03:24 UTC

Well, to close "the old one" I simply paused the 2 WUs, told BOINC not to accept any more work, and pressed the red "X" in the top right corner...

Ahh, I think I found the problem: the "red X" only closes the BOINC manager. It does NOT shut down "boinc.exe" (the daemon that actually does the work). Installing the new version over the old while boinc.exe was running caused the corruption.

To exit all of BOINC, you need to click "File > Exit", or right-click the "B" in the systray and select Exit.
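One way to confirm that the daemon really has stopped before running the installer is to look for boinc.exe in the Windows process list. The following is only a rough sketch: it assumes the standard Windows `tasklist /FO CSV /NH` command, whose first CSV column is the image name, and the helper names `client_is_running` and `query_tasklist` are made up for this illustration.

```python
import csv
import io
import subprocess


def client_is_running(tasklist_csv: str, image_name: str = "boinc.exe") -> bool:
    """Return True if image_name appears in `tasklist /FO CSV /NH` output."""
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # The first CSV column is the image name; compare case-insensitively.
        if row and row[0].strip().lower() == image_name.lower():
            return True
    return False


def query_tasklist() -> str:
    """Windows only: capture the current process list as CSV text."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On a Windows machine, `client_is_running(query_tasklist())` returning True would suggest that only the manager window was closed and the daemon is still alive.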

tony
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 16216 - Posted: 14 May 2006, 5:05:40 UTC

At that point your "client_state.xml" file became irrevocably corrupted. All the work is GONE.

Have you managed to get new work to replace it, and is it going OK?
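For anyone who wants a quick sanity check on the state file, a minimal sketch follows. It assumes the file is plain XML (named client_state.xml in the BOINC directory) and only tests whether it is well-formed; a file that parses can still hold inconsistent data, so an "ok" result is not proof that the client will accept it. The function name `state_file_status` is hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def state_file_status(path: Path) -> str:
    """Classify a state file as 'missing', 'corrupt' (not well-formed), or 'ok'."""
    if not path.is_file():
        return "missing"
    try:
        # ET.parse raises ParseError on truncated or malformed XML.
        ET.parse(str(path))
    except ET.ParseError:
        return "corrupt"
    return "ok"
```

It would be called with the path to the reader's own copy, for example `state_file_status(Path("client_state.xml"))` from inside the BOINC directory.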
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 16218 - Posted: 14 May 2006, 5:08:54 UTC

FYI, there are three separate parts to BOINC itself: the manager, the daemon, and the screensaver (if used). BOINC controls the project application files for all projects (Rosetta, for example). It decides what to run, when to run it, how much to download, when to upload, when to report, etc. BOINC doesn't make any application run faster, as it's really just a manager for applications.

It's the project applications that actually crunch the numbers.
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 16219 - Posted: 14 May 2006, 5:15:00 UTC
Last modified: 14 May 2006, 5:17:00 UTC

Tony, aren't you supposed to be at the races? :)

Gaijin, I just wanted to point out that when Tony says the WUs are "gone", he means that they aren't on your computer anymore, and there's no way to get them back. You do still see them on the website... and as I said earlier... that's O... K.

So, Tony, The chicken's got it wrong then? You are certainly the one who knows. Does BOINC actually have a link detailing proper upgrade steps?
Gaijin
Joined: 3 May 06
Posts: 5
Credit: 4,520
RAC: 0
Message 16220 - Posted: 14 May 2006, 5:16:21 UTC
Last modified: 14 May 2006, 5:23:23 UTC

Have you managed to get new work to replace it, and is it going OK?


Yeah well I have a WU working right now.. I guess what I'll do after this one is that I'm going to close boinc, uninstall it and then install the new version.

Everything should be back to normal after that I guess...


EDIT: I really appreciate your help, guys. It's great to see that the people here are helping each other. Thank you Feet1st, Moderator9 and Tony for your help and quick answers :D
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 16221 - Posted: 14 May 2006, 5:18:32 UTC - in response to Message 16220.  

Everything should be back to normal after that I guess...

Yep, I edited in a link to my last post. There the BOINC upgrade is discussed and you'll see that the "uninstall" step is not required. Just the "exit" that Tony described.

Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 16222 - Posted: 14 May 2006, 5:24:41 UTC - in response to Message 16220.  

Have you managed to get new work to replace it, and is it going OK?


Yeah well I have a WU working right now.. I guess what I'll do after this one is that I'm going to close boinc, uninstall it and then install the new version.

Everything should be back to normal after that I guess...

Uninstalling BOINC through Windows "Add/Remove Programs" doesn't remove the BOINC folder and subfolders located at C:\Program Files\BOINC\ et al. It even leaves the common project files untouched.

I'm sorry you've had to learn about the red x the way you did.

If I were you, I'd uninstall, then reinstall BOINC 5.4.9. After this I'd do a detach/re-attach to Rosetta. Detaching will delete the Rosetta folders and files.

This will start you off with a clean install.

tony
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 16223 - Posted: 14 May 2006, 5:26:45 UTC - in response to Message 16221.  
Last modified: 14 May 2006, 5:29:18 UTC

Everything should be back to normal after that I guess...

Yep, I edited in a link to my last post. There the BOINC upgrade is discussed and you'll see that the "uninstall" step is not required. Just the "exit" that Tony described.

Sorry, Feet1st, I didn't see your post. I just recommended he do an uninstall. I'm not sure what he's goofed up in BOINC, and since he's new, it'd be easier to start him off fresh rather than risk future problems which may cause him to lose interest in Rosetta.

I don't often recommend "nuking" the whole thing, but this is one of those few times I will.

tony
Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 16234 - Posted: 14 May 2006, 6:20:48 UTC - in response to Message 16223.  

Everything should be back to normal after that I guess...

Yep, I edited in a link to my last post. There the BOINC upgrade is discussed and you'll see that the "uninstall" step is not required. Just the "exit" that Tony described.

Sorry, Feet1st, I didn't see your post. I just recommended he do an uninstall. I'm not sure what he's goofed up in BOINC, and since he's new, it'd be easier to start him off fresh rather than risk future problems which may cause him to lose interest in Rosetta.

I don't often recommend "nuking" the whole thing, but this is one of those few times I will.

tony


I concur with Tony. You need to start fresh this time. There would be no other way to be certain something else is not messed up.

But just a note about the "finished file" message, should you see it again: that is not a fatal error. It is a warning. The system will move on to the next work unit. If you see it many times in a row, that is when you should consider a reset.
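To judge whether the warning is a one-off or a repeat offender, the messages copied from the Messages tab can be tallied per task. This is a small sketch that assumes the pipe-separated layout seen in the lines quoted earlier in this thread (timestamp|project|message); the function name `count_no_finished_file` is hypothetical, and how many repeats justify a reset is left to the reader.

```python
import re
from collections import Counter

# Matches the warning text exactly as it appears in the quoted log lines.
PATTERN = re.compile(
    r"^Task (\S+) exited with zero status but no 'finished' file$"
)


def count_no_finished_file(lines):
    """Count 'no finished file' warnings per (project, task name)."""
    counts = Counter()
    for line in lines:
        parts = line.strip().split("|", 2)
        if len(parts) != 3:
            continue  # not in the timestamp|project|message layout
        _timestamp, project, message = parts
        match = PATTERN.match(message)
        if match:
            counts[(project, match.group(1))] += 1
    return counts
```

Passing in the pasted log as a list of lines gives a counter keyed by project and task, so a task that keeps tripping the warning stands out from one that hit it once.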

Moderator9
Gaijin
Joined: 3 May 06
Posts: 5
Credit: 4,520
RAC: 0
Message 16239 - Posted: 14 May 2006, 7:46:39 UTC

Wow, thank you all for the big help!! Quick and friendly answers, which I really appreciate! Everything is running smoothly now; I just did what Tony and Moderator9 said. I'm sorry though about the 4 WUs lost, I would have loved to crunch them :P
FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 369,635
RAC: 0
Message 16245 - Posted: 14 May 2006, 9:51:30 UTC

[Talking Windows here]
BOINC should not need closing: the installation program detects whether it is running (as I mentioned in the thread linked by Feet1st), closes it, installs, and starts it up again. I have done this on 5 computers so far, with all three installation types: shared user, single user, and service install.

You could have tried just reinstalling, as the installation program will see that the versions match and give you a repair option (aside: which is also useful if you try a different boinc.exe out and forget to back up ;-)).
Maybe it was because you suspended/paused the work units (now called 'tasks', btw) and this caused the corruption on the update? Or you just had bad luck.

Though when everything goes wrong, it is certainly easier to start afresh :-) [unless it's a CPDN model at 75% or so ;-)]
Team mauisun.org



©2024 University of Washington
https://www.bakerlab.org