Problems and Technical Issues with Rosetta@home

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,842,216
RAC: 9,161
Message 107071 - Posted: 3 Oct 2022, 17:27:21 UTC
Last modified: 3 Oct 2022, 17:28:23 UTC

Looks pretty simple to me. Your first task manager post shows all your CPU allocated to Boinc, which is good. There's nothing wrong outside Boinc.

But within Boinc, 25% of the CPU goes to python, so whatever that is (the GPU task you mentioned?), it is using a quarter of your 12 cores, i.e. 3 of them. That's more than you're accounting for. You only have 9 left, so you can't run more than 9 Rosettas at full speed, and in fact probably 8 due to various other activity.
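
As a rough check of that arithmetic, here is a minimal Python sketch. The function name is hypothetical; the 12 cores and the 25% share are the figures from the Task Manager discussion.

```python
def cores_left(total_cores: int, busy_percent: float) -> float:
    """Cores still free once one process takes busy_percent of total CPU time."""
    busy_cores = total_cores * busy_percent / 100
    return total_cores - busy_cores


total = 12            # logical cores on the host (from the post)
python_share = 25.0   # percent of total CPU used by the python processes

free = cores_left(total, python_share)
print(f"python uses {total - free:g} cores, {free:g} left for Rosetta")
```

Leaving one of those free cores for the OS and other activity gives the "probably 8" figure.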

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 107072 - Posted: 3 Oct 2022, 17:35:48 UTC - in response to Message 107071.  
Last modified: 3 Oct 2022, 17:39:41 UTC

Looks pretty simple to me. Your first task manager post shows all your CPU allocated to Boinc, which is good. There's nothing wrong outside Boinc.

But within Boinc, 25% of the CPU goes to python, so whatever that is (the GPU task you mentioned?), it is using a quarter of your 12 cores, i.e. 3 of them. That's more than you're accounting for. You only have 9 left, so you can't run more than 9 Rosettas at full speed, and in fact probably 8 due to various other activity.



It's going to take a few days to finish that GPU task. That project has been set to no new tasks for now.
If it's going to interfere, I'll go to Milkyway or something.
And I figured, based on last night's conversation, that Python was eating up too much, and when I looked it was huge. But I thought it was all contained in the GPU? Or is the GPU a controller now and Python runs on all the CPUs?

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 107073 - Posted: 3 Oct 2022, 17:38:04 UTC - in response to Message 107070.  


That is just a repeat of task manager and we have already established what the problem is now.

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,842,216
RAC: 9,161
Message 107074 - Posted: 3 Oct 2022, 18:25:19 UTC - in response to Message 107072.  

It's going to take a few days to finish that GPU task. That project has been set to no new tasks for now.
If it's going to interfere, I'll go to Milkyway or something.
And I figured, based on last night's conversation, that Python was eating up too much, and when I looked it was huge. But I thought it was all contained in the GPU? Or is the GPU a controller now and Python runs on all the CPUs?

You said earlier "I have WCG running on GPU, but nothing else besides FAH is running GPU."

I run WCG GPU and also folding. Folding uses about 1 core along with a GPU. WCG uses about half a core for a GPU of 8000Gflops like mine.

But why do you have an entry called "python" in your task manager? Neither of those use python AFAIK.

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 107075 - Posted: 3 Oct 2022, 19:38:01 UTC - in response to Message 107074.  

Because if you look at the images, you will see that it is called python.
GPU grid runs python.
The specific task name is Python apps for GPU hosts.
I wish you would just stay with the conversation and not argue with me over names.

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 107076 - Posted: 3 Oct 2022, 19:41:42 UTC

I'm stuffing a lot of tasks into the GPUs and they handle them just fine.
FAH runs on both GPUs, then there's GPU Grid (but I guess that's more a controller than a processor). Einstein also has GPU and WCG has GPU.

I'm maxed out on CPU projects. And now I have enough GPU projects to keep those cards busy.
So I am not adding anything more. I probably don't need to add Milkyway or whatever into the mix as WCG is back. But they keep coughing up those damn transient errors now and then. I would have thought they had that fixed by now, the way they are hyping being back up and running.

kotenok2000
Joined: 22 Feb 11
Posts: 259
Credit: 497,274
RAC: 903
Message 107077 - Posted: 3 Oct 2022, 19:44:59 UTC - in response to Message 107076.  

Looks like they haven't enabled statistics yet.
They sent an email on 01.10.2022 that said "Last Result: February 15, 2022".
I have crunched before 01.10.2022.
(10 is October.)

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,842,216
RAC: 9,161
Message 107078 - Posted: 3 Oct 2022, 20:04:52 UTC - in response to Message 107075.  

Because if you look at the images, you will see that it is called python.
GPU grid runs python.
The specific task name is Python apps for GPU hosts.
I wish you would just stay with the conversation and not argue with me over names.

You said WCG and Folding; you didn't say GPUgrid. If you're going to give me incorrect information, I can't help you.

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,842,216
RAC: 9,161
Message 107079 - Posted: 3 Oct 2022, 20:06:15 UTC - in response to Message 107076.  

WCG is back. But they keep coughing up those damn transient errors now and then. I would have thought they had that fixed by now, the way they are hyping being back up and running.

No errors here. I'm on 2 million credit per day with zero failed tasks.

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 107080 - Posted: 3 Oct 2022, 21:54:02 UTC - in response to Message 107078.  
Last modified: 3 Oct 2022, 21:57:13 UTC

Because if you look at the images, you will see that it is called python.
GPU grid runs python.
The specific task name is Python apps for GPU hosts.
I wish you would just stay with the conversation and not argue with me over names.

You said WCG and Folding; you didn't say GPUgrid. If you're going to give me incorrect information, I can't help you.


It's all ok...we figured it out.
I was trying to make dinner and answer the two of you hitting me with questions at the same time and try to get images uploaded. All this after work. So excuse me if I was jumbling answers.

We know now what the issue is and I figured that was probably the case last night when I opened up the tab in task manager for BOINC. Was shocked at how many processes were running with GPU Python.

Anyway...I'll dump GPU Grid when it finishes and see how the system balances out after that.

Sid Celery
Joined: 11 Feb 08
Posts: 2125
Credit: 41,249,070
RAC: 9,333
Message 107081 - Posted: 4 Oct 2022, 2:04:07 UTC - in response to Message 107079.  

WCG is back. But they keep coughing up those damn transient errors now and then. I would have thought they had that fixed by now the way they are hyping being back up and running.
No errors here. I'm on 2 million credit per day with zero failed tasks.

Tasks are running fine from WCG. What Greg's talking about is the downloading of tasks throwing up HTTP transient errors by the bucketful.
I'm getting the same and it's driving me crackers.
They did solve it a week or two back, but it's returned with a vengeance in the last several days.
It's taking longer to get a successful download of all the task components than the OPN GPU tasks are taking to run here.

Aurum
Joined: 12 Jul 17
Posts: 32
Credit: 38,158,977
RAC: 0
Message 107083 - Posted: 4 Oct 2022, 14:05:05 UTC - in response to Message 107080.  
Last modified: 4 Oct 2022, 14:07:10 UTC

Was shocked at how many processes were running with GPU Python.
Anyway...I'll dump GPU Grid when it finishes and see how the system balances out after that.

PythonGPU works well if run right. Do not try to run it on a CPU with fewer than 32 threads. I've tried 24 threads and it's very slow.
Running 2 PythonGPU WUs and nothing else is best. The 2 PythonGPU WUs play well together and can share those CPU threads. If you try to run a different project alongside a PythonGPU WU, it'll have annoying quirks. Since most of the work is actually done on the CPU, it can use less powerful GPUs (e.g. a 1080) than the acemd4 WU needs. Here's the app_config I use on an i9-10980XE with a 3060 Ti:
<app_config>
<!-- i9-10980XE   18c36t   2x16=32 GB   L3 Cache 24.75 MB  3060 Ti -->
    <app>
        <name>PythonGPU</name>
        <plan_class>cuda1131</plan_class>
        <gpu_versions>
            <cpu_usage>32</cpu_usage>
            <gpu_usage>0.5</gpu_usage>
        </gpu_versions>
        <max_concurrent>2</max_concurrent>
        <fraction_done_exact/>
    </app>
</app_config>
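
To make the numbers in that config easier to read, here is a rough Python sketch of the budget it implies. It assumes the usual reading of these settings (a `gpu_usage` of 0.5 lets two tasks share one GPU, and `cpu_usage` is the number of CPUs budgeted per task); the `budget` helper is hypothetical, and how the BOINC client actually enforces the budget is not modeled.

```python
import math
import xml.etree.ElementTree as ET

# Trimmed copy of the settings from the app_config above.
APP_CONFIG = """<app_config>
    <app>
        <name>PythonGPU</name>
        <gpu_versions>
            <cpu_usage>32</cpu_usage>
            <gpu_usage>0.5</gpu_usage>
        </gpu_versions>
        <max_concurrent>2</max_concurrent>
    </app>
</app_config>"""


def budget(xml_text: str, gpus: int = 1) -> tuple[int, float]:
    """Return (tasks running at once, CPUs budgeted for them)."""
    app = ET.fromstring(xml_text).find("app")
    cpu_usage = float(app.findtext("gpu_versions/cpu_usage"))
    gpu_usage = float(app.findtext("gpu_versions/gpu_usage"))
    max_concurrent = int(app.findtext("max_concurrent"))
    per_gpu = math.floor(1 / gpu_usage)          # tasks that fit on one GPU
    running = min(per_gpu * gpus, max_concurrent)
    return running, running * cpu_usage


tasks, cpus = budget(APP_CONFIG)
print(f"{tasks} PythonGPU tasks at once, {cpus:g} CPUs budgeted")
```

With one GPU this budgets more CPUs than the 36 threads of the i9-10980XE, which fits the advice to run two PythonGPU WUs and nothing else.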

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,842,216
RAC: 9,161
Message 107084 - Posted: 4 Oct 2022, 14:15:49 UTC

I've given up on GPUgrid. Despite countless promises, they refuse to produce tasks that run on AMD GPUs, which is what I have exclusively. AMD cards are faster for the cost than Nvidia, which severely cripples double precision. DP is needed for a lot of projects, not just Milkyway: some of the calculations for most projects are DP, a lot more than the 1/64 nonsense that Nvidia give us.

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,842,216
RAC: 9,161
Message 107085 - Posted: 4 Oct 2022, 15:06:47 UTC

I've also given up on Folding@Home. Nice idea, because there's not much biology on Boinc. But if Folding aren't willing to join the rest of us on Boinc, I can no longer be bothered. It's too much effort to run a different program alongside Boinc, which has no clue what other projects are running on the processors and GPUs, so balancing loads is ridiculous.

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 107086 - Posted: 4 Oct 2022, 18:20:19 UTC - in response to Message 107085.  

I've also given up on Folding@Home. Nice idea, because there's not much biology on Boinc. But if Folding aren't willing to join the rest of us on Boinc, I can no longer be bothered. It's too much effort to run a different program alongside Boinc, which has no clue what other projects are running on the processors and GPUs, so balancing loads is ridiculous.



I have no problem with FAH on my system.
Seems to balance fine with BOINC

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 107087 - Posted: 4 Oct 2022, 18:23:04 UTC

Is this guy done with 4.2 already? 0 tasks in queue!
FFS...this is nuts.
I guess 4.2 is small batches of stuff.
Where is all the rest of the stuff that Robetta has stocked up?
All for the in house systems?

kotenok2000
Joined: 22 Feb 11
Posts: 259
Credit: 497,274
RAC: 903
Message 107088 - Posted: 4 Oct 2022, 18:25:00 UTC - in response to Message 107086.  

Just divide 100 by the number of cores, subtract the resulting number from 100, and put that number in the "Use at most X% of the CPUs" setting.
Example 100/8=12.5
100-12.5=87.5
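
The same rule as a small Python sketch, for anyone who wants to plug in their own core count. The function name and the `cores_to_free` parameter are hypothetical, added only for illustration.

```python
def cpu_limit_percent(cores: int, cores_to_free: int = 1) -> float:
    """Percentage to enter so that cores_to_free cores stay out of BOINC's hands."""
    return 100 - 100 * cores_to_free / cores


for n in (4, 8, 16):
    print(f"{n} cores -> use at most {cpu_limit_percent(n):g}% of the CPUs")
```

For 8 cores this reproduces the 87.5 from the example above.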

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,842,216
RAC: 9,161
Message 107089 - Posted: 4 Oct 2022, 18:33:27 UTC - in response to Message 107086.  

I've also given up on Folding@Home. Nice idea, because there's not much biology on Boinc. But if Folding aren't willing to join the rest of us on Boinc, I can no longer be bothered. It's too much effort to run a different program alongside Boinc, which has no clue what other projects are running on the processors and GPUs, so balancing loads is ridiculous.


I have no problem with FAH on my system.
Seems to balance fine with BOINC

I don't see how. If I run two projects in Boinc, it chooses to run one or the other. If I run Folding as well, Folding doesn't know when Boinc managed to get WCG tasks.

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,842,216
RAC: 9,161
Message 107090 - Posted: 4 Oct 2022, 18:34:25 UTC - in response to Message 107088.  

Just divide 100 by the number of cores, subtract the resulting number from 100, and put that number in the "Use at most X% of the CPUs" setting.
Example 100/8=12.5
100-12.5=87.5

Until you have 24 cores and get recurring decimals... I have to remember Boinc rounds down for CPUs and up for GPUs!
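
A quick Python sketch of why that matters with 24 cores. It assumes, as stated above, that the CPU count is rounded down after applying the percentage; the `usable_cpus` helper is hypothetical and only models that assumption.

```python
import math


def usable_cpus(cores: int, percent: float) -> int:
    """CPUs available if cores * percent / 100 is rounded down (assumed behaviour)."""
    return math.floor(cores * percent / 100)


exact = 100 - 100 / 24          # 95.8333..., a recurring decimal
print(f"exact setting for 24 cores: {exact:.4f}%")

# Typing a truncated value loses an extra core; rounding the setting up avoids it.
print("95.83% ->", usable_cpus(24, 95.83), "CPUs")
print("95.84% ->", usable_cpus(24, 95.84), "CPUs")
```

So when the percentage is not exact, entering it rounded up to two decimals is the safer choice under that assumption.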

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 107091 - Posted: 4 Oct 2022, 18:44:53 UTC - in response to Message 107089.  

I've also given up on Folding@Home. Nice idea, because there's not much biology on Boinc. But if Folding aren't willing to join the rest of us on Boinc, I can no longer be bothered. It's too much effort to run a different program alongside Boinc, which has no clue what other projects are running on the processors and GPUs, so balancing loads is ridiculous.


I have no problem with FAH on my system.
Seems to balance fine with BOINC
I don't see how. If I run two projects in Boinc, it chooses to run one or the other. If I run Folding as well, Folding doesn't know when Boinc managed to get WCG tasks.



I don't know...maybe because I am not using CPU for FAH. I have so many CPU projects there is no room for anything else. So I put FAH in GPU only mode.

Since this project doesn't have anything going on, I am going back to GPU and see what that does to the other projects I run in CPU. If it cuts them in half then I will terminate GPU and find something else.

But this all takes time since I run only 14 hrs a day.
©2024 University of Washington
https://www.bakerlab.org