Questions and Answers : Windows : Processing Ceases
Author | Message |
---|---|
Gregory D. MELLOTT Joined: 8 May 09 Posts: 1 Credit: 1,484,976 RAC: 0 |
Hi, I have also been finding hung or dropped processes that BOINC is supposed to manage. I have not been able to determine at what point a process ceased to work. BOINC always seems to just keep them running, though they are not using any CPU time anymore, and goes on to start another. If I get a bunch of them, Windows starts looking for more virtual memory. The usual way I notice the matter is to start the Task Manager [Ctrl+Alt+Del] and look at the list of processes, and then also look at BOINC's list. Those using 0 CPU time will usually not be in BOINC's list. Anyway, I close BOINC, then shut it down (along with the processes it is managing [as asked about in a window]). Then only those that have dropped out of BOINC's control will remain in the Task Manager's list. So I just 'End' the BOINC-started processes that are now hung (never with any active CPU time that I remember) and restart BOINC. It seems to happen on just about all the machines I've used at some time or other, though on one it seems to happen more often. It may be the ratio of usage by the various projects that has it working out that way. It also seems to happen on just about all the projects, though 'when' does vary. |
E the P Joined: 5 Jun 06 Posts: 36 Credit: 28,333,251 RAC: 0 |
For whatever it's worth, I've begun to see the same thing, but only on one of my machines. I'll try to keep track of which WUs are causing the problem. |
E the P Joined: 5 Jun 06 Posts: 36 Credit: 28,333,251 RAC: 0 |
"For whatever it's worth, I've begun to see the same thing, but only on one of my machines. I'll try to keep track of which WUs are causing the problem." I've been checking and there is no pattern that I can tell: different WUs with no common name. |
alan.conwell Joined: 30 Aug 07 Posts: 1 Credit: 255,341 RAC: 0 |
This WU runs for just a few minutes, then memory use ramps up until all memory is consumed, I get an "out of memory" popup message, and the whole sequence repeats. I've aborted this WU: rs_stg0_lrlx_t363__run1_SAVE_ALL_OUT_19372_1246_0 |
E the P Joined: 5 Jun 06 Posts: 36 Credit: 28,333,251 RAC: 0 |
"For whatever it's worth, I've begun to see the same thing, but only on one of my machines. I'll try to keep track of which WUs are causing the problem." Interesting note: once they moved to 2.16, the problem has not happened again. Could it have been related to the reported memory issues? |
TPCBF Joined: 29 Nov 10 Posts: 111 Credit: 5,077,437 RAC: 1,580 |
"For whatever it's worth, I've begun to see the same thing, but only on one of my machines. I'll try to keep track of which WUs are causing the problem." I have at least a very similar problem, however with the latest v2.17. After running a few WUs just fine, two PCs just stop actually processing a task at a very low percentage (0.4 in one case, 1.19 in the other). Both machines have been rebooted a couple of times, but with no change. One machine has now been sitting, mainly unused until I typed this, for 24h with no progress... :-( |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
TPCBF, welcome to Rosetta. I see you have 7 machines. Please open a new thread on the Number Crunching board (where more people will see it), and tell us more about which machine, and what you are seeing on the tasks tab of BOINC and in Windows task manager. Were you describing the % of CPU utilization from task manager? Or the % complete shown in BOINC? Rosetta Moderator: Mod.Sense |
Torsten Persson Joined: 11 Feb 08 Posts: 5 Credit: 31,638,907 RAC: 3,777 |
Well, here's another one: I have the same problem and haven't been able to spot any pattern. I've used Windows XP and a few Linux versions, changed the memory modules, and improved the cooling. The only things I haven't changed are the motherboard (Asus P5KPL) and the processor (Core2Duo 2.66 GHz). Torsten |
Murasaki Joined: 20 Apr 06 Posts: 303 Credit: 511,418 RAC: 0 |
"Well, here's another one:" Before taking the drastic step of pulling apart your computer, it is usually best to post a message in the Number Crunching forum to ask if other users are getting similar errors. If other people are affected, then it is not a problem with your computer. Looking at recent failures on your systems, a lot of your problems seem to be with tasks called "Ferredoxin-like_abinitio_SAVE_ALL_OUT_design...". The minirosetta 2.17 thread in the Number Crunching forum shows that several other users are seeing problems with tasks called "Ferredoxin...". This means it is probably a problem with that batch of work, and the problem will be fixed when the scientists withdraw that batch and send out a replacement. |
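
The manual check described in the first post (look for project processes that use no CPU time and no longer appear in BOINC's task list) could be sketched roughly as below. This is only an illustration under stated assumptions: `ProcSample`, `find_orphaned_tasks`, the sample PIDs, and the executable name `minirosetta_2.17.exe` are all hypothetical, and collecting the two process listings (for example from Task Manager) is left to the reader.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcSample:
    """One row of a process listing (hypothetical structure for this sketch)."""
    pid: int
    name: str
    cpu_seconds: float  # total CPU time the process has consumed so far


def find_orphaned_tasks(earlier, later, boinc_task_pids, app_prefixes):
    """Return processes that look like hung science apps BOINC no longer tracks.

    earlier, later:   lists of ProcSample taken some minutes apart
    boinc_task_pids:  set of PIDs that BOINC currently lists as running tasks
    app_prefixes:     tuple of lowercase executable-name prefixes to consider
    """
    cpu_before = {p.pid: p.cpu_seconds for p in earlier}
    orphans = []
    for proc in later:
        if not proc.name.lower().startswith(app_prefixes):
            continue  # not a project application
        if proc.pid in boinc_task_pids:
            continue  # BOINC still manages this one
        if proc.pid not in cpu_before:
            continue  # started between the two samples; cannot judge yet
        if proc.cpu_seconds - cpu_before[proc.pid] <= 0.0:
            orphans.append(proc)  # no CPU time used between samples
    return orphans


# Made-up sample data: PID 101 is idle and unknown to BOINC, PID 102 is healthy.
earlier = [
    ProcSample(101, "minirosetta_2.17.exe", 50.0),
    ProcSample(102, "minirosetta_2.17.exe", 10.0),
    ProcSample(200, "explorer.exe", 5.0),
]
later = [
    ProcSample(101, "minirosetta_2.17.exe", 50.0),
    ProcSample(102, "minirosetta_2.17.exe", 70.0),
    ProcSample(200, "explorer.exe", 5.0),
]
orphans = find_orphaned_tasks(earlier, later, {102}, ("minirosetta",))
print([p.pid for p in orphans])
```

The sketch only reports candidates; whether to end such a process remains a manual decision, as in the procedure described above.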
©2024 University of Washington
https://www.bakerlab.org