16% more GPU performance for Panthor

There is a problem with the clock configuration of the RK3588 in every known implementation (mainline, BSP, etc.): the GPU/NPU clocks are not set to their nominal maximum values when the CRU is used.

To understand why, it helps to look at how clocks are routed to the individual SoC cores.

The base source frequency is 24 MHz; it feeds everything and is fixed.

Several PLLs take this frequency and synthesize other frequencies from it. The PLLs are generally not changed once they are configured: they are set up when the device boots and that is it.

Individual cores (e.g. the GPU) get their source clock from a PLL output. The cores, however, dynamically derive their own clock by dividing that output, so they can only reduce the PLL frequency to their desired clock, never multiply it.

So, in short, it looks like this:

24 MHz -> PLL (multipliers and dividers [p, m, s, k]) -> GPU/NPU (divider) = dynamic frequency

There are a bunch of PLLs, but for the GPU only CPLL, GPLL, V0PLL, AUPLL, SPLL and NPLL are relevant. The GPU can select one of them and divide it by a value in the range [1-32].

The default configuration for those PLLs is:

AUPLL: 786.432 MHz
CPLL : 1500 MHz
V0PLL: 1188 MHz
NPLL : 850 MHz
GPLL : 1188 MHz

You can verify this with:

sudo cat /sys/kernel/debug/clk/clk_summary | grep pll_

The problem with a frequency divider is that the steps are fine-grained in the lower frequency range, but in the higher range the steps become colossal.

E.g. for CPLL: 1500/1 = 1500 MHz, 1500/2 = 750 MHz, 1500/3 = 500 MHz, 1500/4 = 375 MHz ...

So you can jump from 1500 to 750 and from there to 500, but the frequencies in between cannot be set.

And if you take all the frequencies of the PLLs listed above and divide them, you get the following steps:

... 500, 594, 702, 750, 786, 850, 1188, 1500 MHz. And this is exactly the problem.
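If you want to double check the math, here is a tiny Python sketch (plain arithmetic, nothing from the kernel) that reproduces those steps from the PLL rates listed above and a [1-32] divider; the 702 MHz step presumably comes from SPLL, whose rate is not in the list above:

# default PLL rates in MHz, as listed above (SPLL omitted)
pll_rates_mhz = {"AUPLL": 786.432, "CPLL": 1500, "V0PLL": 1188, "NPLL": 850, "GPLL": 1188}

# every PLL rate divided by every allowed integer divider
steps = sorted({int(rate // div) for rate in pll_rates_mhz.values() for div in range(1, 33)})

# only the upper range is interesting for the GPU
print([f for f in steps if f >= 500])
# -> [500, 594, 750, 786, 850, 1188, 1500]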

When you look at the opp_table of the GPU/NPU, you can see that it already requests ... 500, 600, 700, 800, 900, 1000 MHz, but the PLLs and the core dividers cannot produce such frequencies.

Instead, when you request e.g. 1000 MHz, the clock driver picks the highest available frequency that is not above the requested one. In this case that is 850 MHz. This is the top frequency you can get, and that is the problem.
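That round-down behaviour is easy to illustrate with the step list from above (just a sketch of the idea, not the actual clk framework code):

# achievable GPU frequencies in MHz, from the PLL/divider exercise above
available = (500, 594, 702, 750, 786, 850, 1188, 1500)

def best_effort(requested_mhz):
    # pick the highest achievable frequency that does not exceed the request
    candidates = [f for f in available if f <= requested_mhz]
    return max(candidates) if candidates else min(available)

print(best_effort(1000))  # -> 850
print(best_effort(800))   # -> 786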

You can also verify this yourself. Set the GPU governor to performance:

sudo bash -c "echo performance > /sys/class/devfreq/fb000000.gpu/governor"

(Note: your fb000000.gpu node might have a different name, check it first.)

Then check the actually assigned clock:
sudo cat /sys/kernel/debug/clk/clk_summary | grep gpu

You will see that it is at most 850 MHz, no matter what.

The Solution:

The proper solution is to tune the PLL clocks so that they can provide the frequencies requested in the opp_tables, but this is not as easy as it sounds: the PLLs above also feed a bunch of other cores, and if your new clock does not meet the frequency tolerance of those cores, you will break other hardware blocks.

But there is an easy approach. NPLL is mainly used for the NPU, which has the same opp_table as the GPU, so modifying it has the least impact on other components.

So I simply pumped the NPLL clock from 850 MHz to 4 GHz so that we get a bunch more divided frequencies. With that change, the available frequencies become:

500, 571, 594, 666, 702, 750, 786, 800, 1000, 1188, 1333, 1500 ...

Now we can get a proper 1 GHz or 800 MHz.
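Rerunning the small division exercise from above with NPLL bumped to 4000 MHz shows where the new steps come from:

npll_mhz = 4000  # after the patch
print(sorted({npll_mhz // div for div in range(1, 9)}))
# -> [500, 571, 666, 800, 1000, 1333, 2000, 4000]
# 4000/4 = 1000 and 4000/5 = 800 are the ones we actually want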

Here is the fix https://github.com/hbiyik/linux/commit/e4fd428dd34fe13cbd5fa6ed79e2f787bc7655b0

Now, when the governor is set to performance, you actually get 1 GHz, and you can verify it with:
sudo cat /sys/kernel/debug/clk/clk_summary | grep gpu

I have also benchmarked this with glmark2-wayland -b terrain on weston.

Before the fix the score was 112 fps, with the fix it is 130 fps! So this is how we get the lost 16% performance back.

It is worth noting that setting NPLL to 4 GHz is just a workaround, because step frequencies like 900 MHz are still not available. Also, 4 GHz is way above the 1.5 GHz maximum defined in the BSP sources, which in turn contradicts the TRM statement of 4.5 GHz (the FracPLL Fout part).

Alternatively it can be set to 1 GHz to be more conservative, but then the 4000/5 = 800 MHz option is gone…

Future Plans

It is also possible to bypass the kernel and set the individual CRU registers with the mmm tool that I created. I have already used it to set the GPU frequency to 1.5 GHz, so the tool works, but a second tool is needed to change the regulator voltages over the RK806 to feed the GPU properly at such frequencies, otherwise it will just crash when the voltage is not enough.

A few general tips to get started with mmm:

// to get the actual CRU register status
sudo python mmm.py get -c rk3588 -d CRU
// to get the clocks in the CRU
sudo python mmm.py get -c rk3588 -d CRU -p clock
// to set the PLL source of the GPU to GPLL
sudo python mmm.py set -c rk3588 -d CRU -r GPU_CLKSEL -p sel GPLL
// to set the PLL source of the GPU to SPLL
sudo python mmm.py set -c rk3588 -d CRU -r GPU_CLKSEL -p sel SPLL
// to set the GPU clock divider to 1 (caution: value - 1 must be entered)
sudo python mmm.py set -c rk3588 -d CRU -r GPU_CLKSEL -p div 0
// to set the GPU clock divider to 2 (caution: value - 1 must be entered)
sudo python mmm.py set -c rk3588 -d CRU -r GPU_CLKSEL -p div 1
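If you want to script the source/divider choice instead of picking it by hand, a small Python helper like the one below can do it. This is my own sketch, not part of mmm, and the SPLL rate in it is an assumption since it is not listed in the summary above:

# selectable GPU clock sources and their rates in MHz
plls_mhz = {"GPLL": 1188, "CPLL": 1500, "AUPLL": 786, "NPLL": 850, "SPLL": 702}  # SPLL rate assumed

def pick_source(target_mhz):
    # best (source, divider) pair that stays at or below the target frequency
    name, div = max(((n, d) for n, r in plls_mhz.items() for d in range(1, 33)
                     if r // d <= target_mhz),
                    key=lambda nd: plls_mhz[nd[0]] // nd[1])
    # the div field in GPU_CLKSEL wants the divider minus one
    return name, div - 1

print(pick_source(1000))  # -> ('NPLL', 0) with the stock 850 MHz NPLL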

So the tool is quite capable, but be careful: don't burn your device or break it. You have been warned…
Any developer or tinkerer interest in the tool is appreciated, but again, be careful, it is a scalpel for a surgeon.

The PVTPLL situation is continued here


You just squashed the performance trigger on the chip! Congratulations, great post ma dude! :facepunch: :beer:

Awesome work!
I’ve seen some dts overclocking the rk3588 to higher values, but probably nothing as complete as this.
Hopefully we will get the best values optimized in a future release.
Earlier @tkaiser was able to get the best out of this SoC, maybe he can share his methods? :slight_smile:

I have these values: I mean, how do I know at which frequency the GPU is running?
root@rock5b:/home/rock# cat /sys/kernel/debug/clk/clk_summary | grep GPU

 scmi_clk_gpu                         1        1        0  1000000000          0     0  50000         Y
    clk_gpu_pvtm                      0        0        0    24000000          0     0  50000         N
          clk_gpu_src                 3        3        0   198000000          0     0  50000         Y
             clk_core_gpu_pvtm        0        0        0   198000000          0     0  50000         N
             clk_gpu_stacks           1        3        0   198000000          0     0  50000         Y
             clk_gpu_coregroup        1        3        0   198000000          0     0  50000         Y
             clk_gpu                  1        3        0   198000000          0     0  50000         Y

rock@rock5b:~$ glmark2-es2-wayland -b terrain
arm_release_ver of this libmali is 'g6p0-01eac0', rk_so_ver is '6'.
=======================================================
    glmark2 2021.02
=======================================================
    OpenGL Information
    GL_VENDOR:     ARM
    GL_RENDERER:   Mali-LODX
    GL_VERSION:    OpenGL ES 3.2 v1.g6p0-01eac0.ba52c908d926792b8f5fe28f383a2b03
=======================================================
[terrain] <default>: FPS: 319 FrameTime: 3.135 ms
=======================================================
                                  glmark2 Score: 319 
=======================================================

Nice work! Just out of curiosity, have you checked whether the power draw increases a bit by bumping the PLL to 4 GHz ? I think it should be negligible but possibly observable.

I did not check it, but just pumping the PLL shouldn't increase the consumption, it is just a free-running clock; the consumption comes from a core that is attached to that clock through its own driver. In our case the GPU will use divider 4 to get 1 GHz.

If you are concerned about the 4 GHz, you might as well set it to 1 GHz, it will have the same effect.

That is actually interesting, you are running the blob driver.

When I dump the GPU clock with the blob driver I get the following:

[alarm@alarm mmm]$ sudo python mmm.py get -c rk3588 -d CRU -r GPU_CLKSEL
-c rk3588 -d CRU -r GPU_CLKSEL -p div = 5, (default=0), (values=[0~31])
-c rk3588 -d CRU -r GPU_CLKSEL -p sel = GPLL, (default=GPLL), (values=GPLL,CPLL,AUPLL,NPLL,SPLL)
-c rk3588 -d CRU -r GPU_CLKSEL -p testout_div = 31, (default=0), (values=[0~31])
-c rk3588 -d CRU -r GPU_CLKSEL -p testout_mux = PLL, (default=PLL), (values=PLL,PVTM)
-c rk3588 -d CRU -r GPU_CLKSEL -p mux = PLL, (default=PLL), (values=PLL,PVTM)
-c rk3588 -d CRU -r GPU_CLKSEL -p reserved = 0, (default=0)
-c rk3588 -d CRU -r GPU_CLKSEL -p clock = 198 Mhz

[alarm@alarm mmm]$ sudo python mmm.py get -c rk3588 -d CRU -p clock     
-c rk3588 -d CRU -r V0PLL_CON0 -p clock = 1188 Mhz
-c rk3588 -d CRU -r AUPLL_CON0 -p clock = 786 Mhz
-c rk3588 -d CRU -r CPLL_CON0 -p clock = 1500 Mhz
-c rk3588 -d CRU -r GPLL_CON0 -p clock = 1188 Mhz
-c rk3588 -d CRU -r NPLL_CON0 -p clock = 850 Mhz
-c rk3588 -d CRU -r GPU_CLKSEL -p clock = 198 Mhz

So the GPU clock is set to use the PLL path (not PVTM), the source is GPLL, and the divider is 5+1 = 6, so the clock is 1188/6 = 198 MHz.

But I cannot make sense of the glmark results. They are way too high for 200 MHz.

My only theory is that the blob driver is using SMCCC to set the clocks, so the requested clocks are actually set by BL31. BL31 runs at a different exception level than the normal kernel, so the SoC might somehow expose a different IO base address to BL31, and what the normal registers report would then not be valid. In any case, a weird situation…

That's a very valid but very hard to answer question :slight_smile:

It’s really not a concern, mostly a matter of curiosity. PLLs are free-running clocks controlled on their phase after the divide and at such frequencies they can usually draw a few milliamps. Thanks!


Just thinking, I’ve used opengl a tiny little bit several years ago and found that it was apparently possible to port generic code there, but the communication latency with the host was horrible for me (I really don’t know the right way to do things). Maybe it would be feasible to simply port the mhz utility to the GPU for this, if we find a way to accurately measure the processing time.

That should be somehow possible with the PMU (performance monitoring unit) of the GPU, but the code will not be very portable, I assume.

@boogiepop

I applied your proposed tuning but there is no change in the case of the Mali blob.

root@rock5b:/home/rock# cat /sys/kernel/debug/clk/clk_summary | grep gpu
 scmi_clk_gpu                         1        1        0  1000000000          0     0  50000         Y
    clk_gpu_pvtm                      0        0        0    24000000          0     0  50000         N
          clk_gpu_src                 3        3        0   666666667          0     0  50000         Y
             clk_core_gpu_pvtm        0        0        0   666666667          0     0  50000         N
             clk_gpu_stacks           1        3        0   666666667          0     0  50000         Y
             clk_gpu_coregroup        1        3        0   666666667          0     0  50000         Y
             clk_gpu                  1        3        0   666666667          0     0  50000         Y
root@rock5b:/home/rock# 

rock@rock5b:~$ glmark2-es2-wayland -b terrain
arm_release_ver of this libmali is 'g6p0-01eac0', rk_so_ver is '6'.
=======================================================
    glmark2 2021.02
=======================================================
    OpenGL Information
    GL_VENDOR:     ARM
    GL_RENDERER:   Mali-LODX
    GL_VERSION:    OpenGL ES 3.2 v1.g6p0-01eac0.ba52c908d926792b8f5fe28f383a2b03
=======================================================
[terrain] <default>: FPS: 320 FrameTime: 3.125 ms
=======================================================
                                  glmark2 Score: 320 
=======================================================
rock@rock5b:~$ 

@willy
Unfortunately, porting your mhz utility to a GPU is beyond my knowledge, but I can run it here if someone does.


Yes, the Mali blob won't take advantage of it, because it is using PVTPLL I think. I am also checking this in detail, maybe I should update the title accordingly to be more precise.

It’s beyond my knowledge as well :wink: But it makes an interesting project I should consider. I have no idea where to start to run code on the GPU there, I’m totally ignorant of these things.

So, I would like to clarify a bit more what I have learned about the clock adventures of the rk35xx.

In the previous post I mentioned that the GPU takes its PLL frequency from one of CPLL, GPLL, AUPLL, V0PLL, SPLL. This is not complete. For small cores like i2c, spi, pcie it is correct, but for the bigger cores like CPU, GPU, NPU etc. there is another PLL source called PVTPLL.

PVTPLLs are dedicated to a core and not shared across different cores; sometimes there are even multiple PVTPLLs for a single core (e.g. the CPU has a separate PVTPLL for the little cores and for each big-core cluster).

Unlike normal PLLs, PVTPLLs are meant to be dynamically configured, with a twist: a PVTPLL gives the best possible frequency output for a given voltage, temperature and chip quality.

E.g. you request 1 GHz from a PVTPLL, then set the voltage to your target voltage and let the PVTPLL circuit do its job: it runs a very small hardware benchmark circuit called a ring oscillator and locks the frequency output to the maximum achievable. That can be 999 MHz, 950 MHz, or 1 GHz. The core then gets this frequency and uses it.
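Conceptually (grossly simplified, and only my mental model, not real driver or hardware code) the locking behaviour looks like this:

def pvtpll_lock(target_mhz, ring_oscillator_mhz):
    # the ring oscillator reflects what the silicon can actually sustain at the
    # current voltage and temperature; the PVTPLL never goes above the target
    return min(target_mhz, ring_oscillator_mhz)

print(pvtpll_lock(1000, ring_oscillator_mhz=985))   # -> 985, weaker chip or lower voltage
print(pvtpll_lock(1000, ring_oscillator_mhz=1040))  # -> 1000, target met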

Now comes the complicated part. This is my understanding, someone may correct me if I am wrong, but the PVTPLL is not directly configured by the kernel. Instead it is configured by BL31.

The kernel uses an interface called SMCCC to communicate with BL31 and request the frequency; BL31 then sets up the PVTPLL and configures the core. This whole communication between BL31 and the kernel is sometimes referred to as firmware or SCMI. There are also other transports than SMCCC, but on our rk3588 it is SMCCC.

So the initial problem of the GPU clock not reaching 1 GHz is that, with the Panthor driver, the GPU was using the normal PLLs rather than the PVTPLL. Even though the scmi clock of the GPU is defined in the GPU block of the mainline DTS, it looks to me like devfreq is not taking care of it. I think something needs to be done about this in mainline. When the issue is resolved in mainline, I can hopefully also backport it to the BSP.

When it comes to the Mali blob driver, it is actually using the PVTPLL as a source and can successfully set the frequency to the desired 1 GHz. However, there still seems to be a problem: when you request a frequency from BL31 with PVTPLL, it reports the requested frequency as the set frequency, not what the PVTPLL actually provides. You can see that in the reference TF-A implementation for rk3588.

So how do we know what the actual frequency is?

When I probe the GPU_GRF register with the mmm tool, I get an instant kernel crash. I interpret this as some kind of security mechanism, since direct access to those registers from the kernel or mmapped userspace does not seem to be allowed (theory). So my approach now is to use pysmccc to probe BL31, but use the sip_smc_secure_reg_read and sip_smc_secure_reg_write callbacks to probe the GPU_GRF registers. Normally those callbacks are meant to access the OTP registers, but it is worth a shot. I also don't know whether they are even implemented in BL31.
