Message boards : Number crunching : Client Errors
Author | Message |
---|---|
wbblakemore Joined: 18 Dec 07 Posts: 33 Credit: 4,181 RAC: 0 |
OK .... we're coming up on three months since this thread was opened back in mid-February. At this point, I'm tempted to just write Rosetta off as a bad idea. How about it, support people? Are we any closer to a fix for this problem? |
mikey Joined: 5 Jan 06 Posts: 1895 Credit: 9,183,973 RAC: 3,314 |
OK .... we're coming up on three months since this thread was opened back in mid-February. At this point, I'm tempted to just write Rosetta off as a bad idea. I AGREE. Since Ralph works, WHY have they not discussed moving everything over there and at least doing SOMETHING worthwhile?!!! For a project like Rosetta, as big as they say they are and with the funding they get, this is PATHETIC!!! |
(retired account) Joined: 4 May 12 Posts: 5 Credit: 200,841 RAC: 0 |
Is there anyone with our problem that does not have all of the following attributes: Joined yesterday to participate in the Pentathlon, but all workunits so far have failed with client errors. 1) yes 2) yes 3) no, only an AMD Phenom II X6 Stopped Einstein GPU units, but no effect, still client errors. Since I lack the time for tweaking at the moment, I will try to participate only with my subnotebook, which is unfortunately a lot weaker. System specs for the record: CPU: AMD Phenom II X6 1090T @ 3.20GHz (stock speed); RAM: 8GB; GPU: NVIDIA GeForce GTX 560 Ti (2048MB); driver: 285.62; OS: Win7 Prof. x64 Edition; BOINC: 7.0.25 (64bit). Regards |
A.M. Joined: 13 Jun 06 Posts: 12 Credit: 954,586 RAC: 0 |
I've been getting some good WUs... still a lot of errors, although not of the type seen previously. Most of what I'm seeing right now seems to be memory Access Violations. |
(retired account) Joined: 4 May 12 Posts: 5 Credit: 200,841 RAC: 0 |
Joined yesterday to participate in the Pentathlon, but all workunits so far have failed with client errors. Footnote: astonishingly enough, I have accumulated credits without getting granted credits... ? |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
R@h awards credit, even for tasks that end with computation errors. This credit is awarded on a daily basis and is not reflected in the work units display of granted credit. You have to look at each specific task's details to see the granted credit. Welcome aboard! I see your 6 CPU system is having consistent client errors. What BOINC version are you running on that machine? Rosetta Moderator: Mod.Sense |
woland Joined: 17 Dec 05 Posts: 5 Credit: 124,792 RAC: 0 |
Guys, I'm sorry to say that, but this is really embarrassing. I'm also a software developer and I also have to deal with user-reported bugs, and I cannot imagine having a bug reported 3 months ago, with tons of data to reproduce the issue, and no answer. There's a bug in the validation code - where else could it be? Results are calculated correctly but marked as invalid because of the CUDA information in them. How long can it take to debug the validation code and cover the uncaught parsing exception or whatever it is... |
mikey Joined: 5 Jan 06 Posts: 1895 Credit: 9,183,973 RAC: 3,314 |
Guys, I'm sorry to say that, but this is really embarrassing. I'm also a software developer and I also have to deal with user-reported bugs, and I cannot imagine having a bug reported 3 months ago, with tons of data to reproduce the issue, and no answer. There's a bug in the validation code - where else could it be? Results are calculated correctly but marked as invalid because of the CUDA information in them. How long can it take to debug the validation code and cover the uncaught parsing exception or whatever it is... I am not sure they care enough yet! Sure, they care some, but as long as people are still sending back units, Rosetta is turning out the research; it is just NOT as helpful as it should be. When their workunits dry up, probably not anytime soon as they are STILL doing Challenges even now, then they will say 'sorry we missed it, it was a bug in a couple of lines of code and should be fixed now, yada, yada, yada'! I wish we had the power to write some of their sponsors and put a bug in their ear about the problems! I don't say that to be mean, I LIKE Rosetta!!! Rosetta is getting to be like the GSA movies, 'PARTY ON DUDE, the money is rolling in, who cares that we have very little results to show for it!' I just don't think they care about the crunchers right now, too many OTHER things going on!! Many years ago Seti had a problem, and they stopped sending out units instead of overloading the server by sending out a unit, getting it back as bad and having to send it out again and again and again!! As bad as Seti could, and can, be, it STILL did some things to perfection!!! |
In Memory of Kimsey M Fowler Sr Joined: 10 Mar 12 Posts: 26 Credit: 39,033,222 RAC: 0 |
Guys, I'm sorry to say that, but this is really embarrassing. I'm also a software developer and I also have to deal with user-reported bugs, and I cannot imagine having a bug reported 3 months ago, with tons of data to reproduce the issue, and no answer. There's a bug in the validation code - where else could it be? I'm just shaking my head in frustration about this too... I'm a former software engineer from an SEI CMM Level 5 software design organization in Seattle. I was dumbstruck by an e-mail back from Rosetta staff yesterday saying that no further effort will be expended to determine why the Rosetta servers are rejecting WU's. One can assume that the staff doesn't see this problem as widespread enough to make it worth their time to look into. Yesterday's post by David Baker suggests he is thrilled with the computing power available to the project at the present time. Might I suggest that those of you experiencing this problem consider donating your computing resources to folding@home at Stanford Medical School. Their software is self-contained, stable, doesn't run under BOINC middleware, computes on your choice of CPU and/or GPU, has an excellent working simulation of the protein molecule that can be manipulated with the mouse to view/rotate/enlarge/etc, lots of interesting information that's easily accessible about each protein you're folding and why it is important, and is the world's largest computing network. |
wbblakemore Joined: 18 Dec 07 Posts: 33 Credit: 4,181 RAC: 0 |
Thanks for the update. Words simply fail me when I try to express my contempt for support staff that can't be bothered with actually providing support. My best regards to all those valiant users who tried to assist in dealing with this issue. You're good people who have earned my respect. I'm outta here ... |
woland Joined: 17 Dec 05 Posts: 5 Credit: 124,792 RAC: 0 |
I was dumbstruck by an e-mail back from Rosetta staff yesterday that no further effort will be expended to determine why the Rosetta servers are rejecting WU's. Please tell me that this is a joke... |
Sky King Joined: 28 Feb 12 Posts: 11 Credit: 15,912 RAC: 0 |
Might I suggest that those of you experiencing this problem consider donating your computing resources to folding@home at Stanford Medical School. Their software is self-contained, stable, doesn't run under BOINC middleware, computes on your choice of CPU and/or GPU, has an excellent working simulation of the protein molecule that can be manipulated with the mouse to view/rotate/enlarge/etc, lots of interesting information that's easily accessible about each protein you're folding and why it is important, and is the world's largest computing network. If people want some help/advice on F@H, there are probably some here who can provide a lot of insight and help in getting started. I myself am a 10 million point F@H contributor, and have been at the very bleeding edge of beta'ing the newest Windows SMP, Windows GPU, and linux VM appliance clients. I don't want to use up a lot of R@H's forum space touting a "competitor" but here are my observations about F@H. Running a basic F@H client as a service in the background of your PC is very straightforward and involves very little interaction from you, the user. Install it, fire it up, let it run, and check on your stats every week or so, and you're good to go, and maybe churn out 2,000 PPD. However, as you move up the performance curve to optimized SMP or GPU configs and you're trying to squeeze out every last bonus point, the workload increases quite a bit. That was kind of the downside of F@H for me, I was squeezing every last point out of my i7 and getting big bonuses. (Under an initiative called "-bigadv", users of higher end i7s and up can opt in to a bonus program where you get time-sensitive bonus points for returning very large, complex units quickly--like 3 day deadlines.) In fact, I couldn't run the GPU client because the i7 needs every spare cycle in order to make the deadlines and thus be bonus eligible... and the bonuses for this scientifically urgent work were way more points than my ATI 4850 could churn.
After over a year of being deeply involved in profiling the performance optimization of the i7 using both Windows native SMP clients and VMware linux appliances, suddenly I got hammered by Stanford about 3 months ago... 8 core i7's are no longer bonus eligible, you have to be running at least 16 cores on the same WU or you can't make the deadlines and you lose all your points. Not even a high overclock can get you home on 8 cores. Suddenly I was going from 20,000 PPD on my i7 alone to a max of about 3 or 4,000 PPD. Feeling somewhat abandoned, but committed to protein folding, I decided to bail out of the huge workload of super-optimized F@H and opted into the simple life of BOINC-managed folding at R@H. What could be easier, this will be great! Of course, I got 8 WUs a day and couldn't figure out why my CPU was never busy, and when I investigated, I found the client error issue, hence my bump of this thread a few months ago. So I am right back where I started... Big ass CPU, big ass GPU, and feeling like no one is all that jazzed about my willingness to contribute it all to optimized folding. R@H doesn't want my cycles because I don't want to have to downgrade back to my ATI 4850 card and abandon my brand new NV 560. So, I can pull my new nvidia 560 out and run R@H... or I can leave it in and run pretty "stock" F@H clients easily for about 2K PPD, or I can put in a lot of work and run a carefully managed, optimized F@H config and maybe get 6K PPD in return for the huge fan noise, heat, and power cost associated with running the CPU and both GPU cores at 100%. I haven't decided what to do. For now, I have the i7 on BOINC with all my cycles going to the World Community Grid. I was staying on BOINC in the hopes that R@H would be fixed, but I will probably wait for a long weekend with some down time and switch my iron back to F@H. But my point is, if people here need help with F@H, there are some pretty experienced folders here. |
mikey Joined: 5 Jan 06 Posts: 1895 Credit: 9,183,973 RAC: 3,314 |
Might I suggest that those of you experiencing this problem consider donating your computing resources to folding@home at Stanford Medical School. Their software is self-contained, stable, doesn't run under BOINC middleware, computes on your choice of CPU and/or GPU, has an excellent working simulation of the protein molecule that can be manipulated with the mouse to view/rotate/enlarge/etc, lots of interesting information that's easily accessible about each protein you're folding and why it is important, and is the world's largest computing network. I think Poem does folding type work too but under Boinc. |
mikey Joined: 5 Jan 06 Posts: 1895 Credit: 9,183,973 RAC: 3,314 |
Guys, I'm sorry to say that, but this is really embarrassing. I'm also a software developer and I also have to deal with user-reported bugs, and I cannot imagine having a bug reported 3 months ago, with tons of data to reproduce the issue, and no answer. There's a bug in the validation code - where else could it be? I agree with the total frustration being expressed above! I have over 1 million Rosetta credits and will not earn a single one more as Rosetta does NOT CARE anymore!! Rosetta, YOU are a selfish project with your sights set so low that you are at present unable, but more likely unwilling, to make YOUR project work alongside other projects as Boinc itself is DESIGNED TO DO!!! In the future I expect to see Rosetta on the trash pile of projects that could have been something, but instead died off! |
Lance Stringham Joined: 8 Oct 06 Posts: 3 Credit: 38,575,303 RAC: 0 |
Any progress on resolving this server validation bug for clients that have GPUs installed? I'm still being affected by it and there has been no new information in this thread for a while. I really am starting to lose my patience with this problem. Thank you. |
woland Joined: 17 Dec 05 Posts: 5 Credit: 124,792 RAC: 0 |
No, they simply don't care. Sorry Rosetta, I've already left you for Poem. If you don't care - why should I? |
peristalsis Joined: 29 Mar 09 Posts: 8 Credit: 2,421,694 RAC: 0 |
I take a look at my Boinc messages this morning to see how things are going. I see a lot of errors with Rossmann2x3. I check here and see all of my errors duplicated by another machine. It's a relief that it is a problem with Rosetta and not my machine. It is not an enjoyable experience knowing I've wasted some of my bandwidth allowance on processing crap coding. Aborted the remaining Rossmann2x3 unit. Calm down, it's not important, life is not perfect. Just blowing off steam...p |
The-Real-Link Joined: 27 Dec 10 Posts: 6 Credit: 2,676,652 RAC: 0 |
Hey guys, same problem here for my E5645 config. Now it's interesting, I was able to run with my old E5620 system for months without any issues at all, and then they started failing. Oddly enough though, I can't even get these new processors to complete a valid unit at all. I let the project stay detached for a good week or so and then it did seem to fix itself by downloading my preferred workload (several days) as it queued up a few dozen units. They all appeared to be crunched successfully and uploaded, yet, on my stats page there are pages of "over" and "client errors" shown. Also running an EVGA board, EVGA GTX 680, with Windows 7 x64. I wouldn't mind crunching for this project but I simply can't get any work. Despite my log saying the work is successful and that the project was also successfully uploaded, my work queue is stuck at 8 per day (which is odd because that would be true with my old E5620s but not my E5645s) - I'd imagine I should be seeing a minimum of 12 units per day. I turn work in and yet don't see any more than the 8 come back when I should see a doubling if I understand it right. Any help is appreciated. Sorry for the rambling, just frustrated. |
Sid Celery Joined: 11 Feb 08 Posts: 2126 Credit: 41,254,333 RAC: 7,970 |
Hey guys, same problem here for my E5645 config. Now it's interesting, I was able to run with my old E5620 system for months without any issues at all, and then they started failing. Oddly enough though, I can't even get these new processors to complete a valid unit at all. Urgh... Usually I can see something obvious, but the spec of your machine looks high (more than mine anyway) - no idea why yours aren't validating when they seem to complete successfully. Take a look at this message and see if you can spot anything in your Boinc manager settings that might be a problem. If you can't then it's a real mystery. I doubt it's anything to do with your 1 hour run setting :( |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
The most obvious thing that follows recent pattern is that you are running the newer version of BOINC Manager: <core_client_version>7.0.25</core_client_version> ...which is the topic in this thread. Rosetta Moderator: Mod.Sense |
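
For anyone who wants to check their own failed tasks for the pattern described in the last post, here is a minimal sketch in Python. It only assumes that the task output contains a `<core_client_version>` tag in the form quoted above. The function names and the 7.0.0 threshold are hypothetical, chosen for illustration; the thread only mentions 7.0.25 and does not say which versions are affected.

```python
import re
from typing import Optional, Tuple

# Matches the tag exactly as it appears in the task output quoted above.
_VERSION_TAG = re.compile(
    r"<core_client_version>\s*(\d+)\.(\d+)\.(\d+)\s*</core_client_version>"
)


def core_client_version(task_output: str) -> Optional[Tuple[int, int, int]]:
    """Return (major, minor, release) found in the task output, or None if the tag is absent."""
    match = _VERSION_TAG.search(task_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_newer_client(version, threshold=(7, 0, 0)) -> bool:
    """Hypothetical check: True when the parsed version is at or above the threshold.

    The default threshold is an assumption, not something stated in the thread.
    """
    return version is not None and version >= threshold


if __name__ == "__main__":
    sample = "<core_client_version>7.0.25</core_client_version>"
    version = core_client_version(sample)
    print(version, is_newer_client(version))
```

Pasting the text of a task's output into `core_client_version` gives a tuple that can be compared across several failed and valid tasks, which is a quicker way to spot a version pattern than reading each result page by hand.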
©2024 University of Washington
https://www.bakerlab.org