Processing Ceases

Questions and Answers : Windows : Processing Ceases

Gregory D. MELLOTT
Joined: 8 May 09
Posts: 1
Credit: 1,484,976
RAC: 0
Message 66733 - Posted: 1 Jul 2010, 14:42:22 UTC

Hi, I have also been finding hung or dropped processes that BOINC is supposed to manage. I have not been able to determine at what point a process ceased to work. BOINC seems to leave them running, though they no longer use any CPU time, and goes on to start another. If I get a bunch of them, Windows starts looking for more virtual memory.
The usual way I spot this is to start the Task Manager [Ctrl+Alt+Del], look at the list of processes, and then also look at BOINC's list. Those using 0 CPU time will usually not be in BOINC's list. Anyway, I close BOINC and shut it down (along with the processes it is managing, as asked about in a window). Then only those that have dropped out of BOINC's control remain in the Task Manager's list. So I just 'End' the BOINC-started processes that are now hung (never with any active CPU time that I remember) and restart BOINC. It seems to happen on just about all the machines I've used at some time or other, though one seems to have it happen more often. It may be the ratio of usage by the various projects that has it working out that way. It also seems to happen on just about all the projects, though 'when' does vary.
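
The manual check described above (compare the Task Manager list against BOINC's list and look for science processes with no CPU activity) can be sketched in Python. This is only an illustration under stated assumptions: the `ProcSample` structure, the `find_orphans` helper, the PIDs, and the executable name `minirosetta.exe` are hypothetical stand-ins, not BOINC's actual data or API. Real process data would have to come from the operating system and from BOINC's own task list.

```python
from dataclasses import dataclass


@dataclass
class ProcSample:
    """One process observed at two points in time (hypothetical structure)."""
    pid: int
    name: str
    cpu_seconds_before: float
    cpu_seconds_after: float


def find_orphans(samples, boinc_pids, science_app_names):
    """Return processes that look like science apps BOINC no longer tracks.

    A process is flagged when its name is a known science application,
    its PID is absent from BOINC's own task list, and it used no CPU
    time between the two samples.
    """
    orphans = []
    for s in samples:
        idle = s.cpu_seconds_after <= s.cpu_seconds_before
        if s.name in science_app_names and s.pid not in boinc_pids and idle:
            orphans.append(s)
    return orphans


# Hypothetical snapshot: PIDs, names, and CPU times are made up.
samples = [
    ProcSample(101, "minirosetta.exe", 3600.0, 3600.0),  # idle, not tracked
    ProcSample(102, "minirosetta.exe", 120.0, 180.0),    # busy, tracked
    ProcSample(103, "explorer.exe", 50.0, 50.0),         # idle, unrelated
]
orphans = find_orphans(
    samples,
    boinc_pids={102},
    science_app_names={"minirosetta.exe"},
)
print([p.pid for p in orphans])
```

Only processes flagged this way would be candidates for 'End' in Task Manager. Anything still in BOINC's list is left alone.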

E the P
Joined: 5 Jun 06
Posts: 36
Credit: 28,333,251
RAC: 0
Message 67808 - Posted: 24 Sep 2010, 12:28:25 UTC

For whatever it's worth, I've begun to see the same thing, but only on one of my machines. I'll try to keep track of which WUs are causing the problem.

E the P
Joined: 5 Jun 06
Posts: 36
Credit: 28,333,251
RAC: 0
Message 67852 - Posted: 28 Sep 2010, 17:12:55 UTC - in response to Message 67808.  

I've been checking and there is no pattern that I can tell. Different WUs with no common name.

alan.conwell
Joined: 30 Aug 07
Posts: 1
Credit: 255,341
RAC: 0
Message 67990 - Posted: 8 Oct 2010, 14:03:36 UTC

This WU runs for just a few minutes, then memory use ramps up until all memory is consumed and I get an "out of memory" popup message. Then the whole sequence repeats. I've aborted this WU.

rs_stg0_lrlx_t363__run1_SAVE_ALL_OUT_19372_1246_0
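
A memory ramp like the one described in the post above can be told apart from ordinary fluctuation by sampling a task's memory use at intervals and checking for steady growth. The sketch below is a minimal illustration, assuming memory readings in MB collected by some other means. The `grows_steadily` function and its thresholds are hypothetical and would need tuning for real workloads.

```python
def grows_steadily(samples_mb, min_samples=4, min_total_growth_mb=500):
    """Return True if every sample exceeds the previous one and the
    total growth reaches the given threshold (values are assumptions)."""
    if len(samples_mb) < min_samples:
        return False
    rising = all(b > a for a, b in zip(samples_mb, samples_mb[1:]))
    total_growth = samples_mb[-1] - samples_mb[0]
    return rising and total_growth >= min_total_growth_mb


# Made-up readings: the first series ramps up, the second just fluctuates.
print(grows_steadily([200, 450, 900, 1600]))
print(grows_steadily([200, 210, 205, 215]))
```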

E the P
Joined: 5 Jun 06
Posts: 36
Credit: 28,333,251
RAC: 0
Message 68148 - Posted: 20 Oct 2010, 14:47:20 UTC - in response to Message 67852.  

Interesting note: since they moved to 2.16, the problem has not happened again. Could it have been related to the reported memory issues?

TPCBF
Joined: 29 Nov 10
Posts: 111
Credit: 5,076,752
RAC: 1,533
Message 68782 - Posted: 7 Dec 2010, 17:25:27 UTC - in response to Message 68148.  

I have at least a very similar problem, but with the latest v2.17. After running a few WUs just fine, two PCs simply stopped processing a task at a very low percentage (0.4 in one case, 1.19 in the other). Both machines have been rebooted a couple of times, but with no change. One machine has now been sitting, mainly unused until I typed this, for 24h with no progress... :-(



Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 68787 - Posted: 8 Dec 2010, 7:05:36 UTC

TPCBF, welcome to Rosetta.

I see you have 7 machines. Please open a new thread on the Number Crunching board (where more people will see it), and tell us more about which machine, and what you are seeing on the tasks tab of BOINC and in Windows task manager. Were you describing the % of CPU utilization from task manager? Or the % complete shown in BOINC?
Rosetta Moderator: Mod.Sense

Torsten Persson
Joined: 11 Feb 08
Posts: 5
Credit: 31,638,460
RAC: 3,752
Message 69791 - Posted: 12 Mar 2011, 3:54:27 UTC - in response to Message 66496.  

Well, here's another one:

rs_stg0_lrlx_T389_casp8_SAVE_ALL_OUT_20772_2567_0

It seems a waste that these work units complete 10%-15% before crashing. This one quit processing in the middle of the night, again. BOINC points a finger at the project, and the project just shrugs.

Is anybody watching? Does anybody care? Is this a normal occurrence? Should this information be posted elsewhere?

WTF!

deesy

I have the same problem and haven't been able to spot any pattern. I've used Windows XP and a few Linux versions, changed memory modules, and improved the cooling. The only things I haven't changed are the motherboard (Asus P5KPL) and the processor (Core2Duo 2.66 GHz).

Torsten

Murasaki
Joined: 20 Apr 06
Posts: 303
Credit: 511,418
RAC: 0
Message 69796 - Posted: 12 Mar 2011, 13:26:42 UTC - in response to Message 69791.  

Before taking the drastic step of pulling apart your computer, it is usually best to post a message in the Number crunching forum to ask if other users are getting similar errors. If other people are affected, then it is not a problem with your computer.

Looking at recent failures on your systems, a lot of your problems seem to be with tasks called "Ferredoxin-like_abinitio_SAVE_ALL_OUT_design..."

Taking a look at the minirosetta 2.17 thread in the Number crunching forum shows that several other users are seeing problems with tasks called "Ferredoxin...". This means it is probably a problem with that batch of work, and the problem will be fixed when the scientists withdraw that batch and send out a replacement.
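
Checking whether failures cluster in one batch, as suggested above, can be done by grouping failed task names by their leading portion. The sketch below is a rough illustration that assumes the text before `_SAVE_ALL_OUT` identifies a batch. That is a guess based on the task names quoted in this thread, not a documented naming rule. The `batch_of` and `failures_by_batch` helpers and the `example_batch` task names are hypothetical.

```python
from collections import Counter

# Assumed marker separating the batch name from the per-task suffix.
MARKER = "_SAVE_ALL_OUT"


def batch_of(task_name):
    """Return the part of the name before the marker, or the whole name."""
    head, sep, _ = task_name.partition(MARKER)
    return head if sep else task_name


def failures_by_batch(failed_task_names):
    """Count failed tasks per (assumed) batch name."""
    return Counter(batch_of(name) for name in failed_task_names)


failed = [
    "rs_stg0_lrlx_t363__run1_SAVE_ALL_OUT_19372_1246_0",  # quoted in this thread
    "rs_stg0_lrlx_T389_casp8_SAVE_ALL_OUT_20772_2567_0",  # quoted in this thread
    "example_batch_SAVE_ALL_OUT_00001_1_0",  # hypothetical
    "example_batch_SAVE_ALL_OUT_00001_2_0",  # hypothetical
    "example_batch_SAVE_ALL_OUT_00002_7_0",  # hypothetical
]
counts = failures_by_batch(failed)
print(counts.most_common(1))
```

If one batch accounts for most of the failures, and other users report the same batch, the work units are the more likely culprit than the hardware.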
©2024 University of Washington
https://www.bakerlab.org