Thanks very much! It is slightly different from the RK3588S, but I will interpret it.
ROCK 5B Debug Party Invitation
I have some info of your board’s pvtm but what about pvtm?
Can you provide the output of dmesg | grep cpu.cpu?
It just depends on chip generations. Older ones were measuring leakage in the fab and using that to adjust voltage. Newer ones use PVTM to measure it at boot time.
Sure!
~$ dmesg|grep cpu.cpu
[ 4.706906] cpu cpu0: leakage=14
[ 4.706927] cpu cpu0: Looking up cpu-supply from device tree
[ 4.708386] cpu cpu0: pvtm=1481
[ 4.708476] cpu cpu0: pvtm-volt-sel=3
[ 4.708503] cpu cpu0: Looking up cpu-supply from device tree
[ 4.708619] cpu cpu0: Looking up mem-supply from device tree
[ 4.708955] cpu cpu4: leakage=10
[ 4.708973] cpu cpu4: Looking up cpu-supply from device tree
[ 4.715458] cpu cpu4: pvtm=1719
[ 4.719393] cpu cpu4: pvtm-volt-sel=5
[ 4.719418] cpu cpu4: Looking up cpu-supply from device tree
[ 4.719914] cpu cpu4: Looking up mem-supply from device tree
[ 4.720625] cpu cpu6: leakage=10
[ 4.720641] cpu cpu6: Looking up cpu-supply from device tree
[ 4.727141] cpu cpu6: pvtm=1740
[ 4.731097] cpu cpu6: pvtm-volt-sel=5
[ 4.731120] cpu cpu6: Looking up cpu-supply from device tree
[ 4.731611] cpu cpu6: Looking up mem-supply from device tree
[ 4.732680] cpu cpu0: avs=0
[ 4.733371] cpu cpu4: avs=0
[ 4.734058] cpu cpu6: avs=0
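To compare boards it can help to pull the per-cluster numbers out of such a log automatically. The following is only a minimal sketch: the function name `parse_cpu_lines` and the regular expression are my own assumptions, based purely on the line format shown above, not on any Rockchip tool.

```python
import re
from collections import defaultdict

# Matches lines such as "[ 4.708386] cpu cpu0: pvtm=1481".
# The longer key "pvtm-volt-sel" is listed before "pvtm" on purpose.
LINE_RE = re.compile(r"cpu (cpu\d+): (leakage|pvtm-volt-sel|pvtm|avs)=(\d+)")


def parse_cpu_lines(dmesg_text):
    """Return {cpu_name: {key: int_value}} for the key=value log lines."""
    result = defaultdict(dict)
    for line in dmesg_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            cpu, key, value = match.groups()
            result[cpu][key] = int(value)
    return dict(result)


# A few lines copied from the log above.
SAMPLE = """\
[ 4.706906] cpu cpu0: leakage=14
[ 4.708386] cpu cpu0: pvtm=1481
[ 4.708476] cpu cpu0: pvtm-volt-sel=3
[ 4.715458] cpu cpu4: pvtm=1719
[ 4.719393] cpu cpu4: pvtm-volt-sel=5
[ 4.727141] cpu cpu6: pvtm=1740
[ 4.731097] cpu cpu6: pvtm-volt-sel=5
"""

if __name__ == "__main__":
    for cpu, values in parse_cpu_lines(SAMPLE).items():
        print(cpu, values)
```

Feeding it the real output of dmesg instead of SAMPLE should give one entry per cluster (cpu0, cpu4, cpu6).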
One of the two big clusters is close to the boundary and randomly switches between two volt-sel values (I believe it’s the 2nd one which oscillates between 5 and 6 but it might also be the 1st one between 4 and 5, I don’t remember).
At least with the doc above I could improve the measurement time in the DT to make it more stable.
Yes, the 2nd one, according to the start of the debug party. If the pvtm value is between 1744 and 1776 --> pvtm-volt-sel=6, with 1743 --> pvtm-volt-sel=5
Quick test with Rock 5B (with a ‘good’ RK3588 on it: pvtm-volt-sel 5/7/7), RPi USB-C power brick, only GbE networking and Netio 4KF Powerbox measuring idle consumption:
- DMC governor and all cpufreq policies set to powersave: 1280mW
- just switching policy0 to performance (the four A55 idling now at ~1820 MHz instead of ~400 MHz): 1350mW
- all cpufreq policies set to performance, as such all CPU cores idling at highest cpufreq OPP: 1500mW
- now also DMC governor set to performance: 2100mW
That’s a whopping 70mW or 17.5mW per A55 depending on whether it is doing nothing at ~400 MHz or ~1800 MHz. With the A76, whether they’re idling at ~400 MHz or ~2350 MHz makes a difference of 150mW or 37.5mW per core. So as long as they aren’t doing anything, the consumption difference between staying on the lowest vs. highest cpufreq OPP is negligible. Now we would need another Rock 5B Rev 1.3 with a ‘weak’ RK3588 on it (pvtm-volt-sel=2) to compare, with the same equipment and methodology, how much of a difference this makes.
Memory is a different thing: the difference between 528 MHz and 2112 MHz is a massive 600mW compared to CPU clockspeeds. As such the DMC governor and the up_threshold setting really matter.
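The arithmetic behind those figures can be reproduced from the four idle readings. This is just a sketch: the dictionary keys and the function name `idle_deltas` are labels I made up, and the core counts (four A55, four A76) are the ones mentioned in the post.

```python
# Idle readings in mW, copied from the four measurements quoted above.
READINGS_MW = {
    "all_powersave": 1280,
    "policy0_performance": 1350,
    "all_cpufreq_performance": 1500,
    "dmc_performance_too": 2100,
}
A55_CORES = 4
A76_CORES = 4


def idle_deltas(readings):
    """Return the step-by-step idle consumption differences in mW."""
    a55 = readings["policy0_performance"] - readings["all_powersave"]
    a76 = readings["all_cpufreq_performance"] - readings["policy0_performance"]
    dmc = readings["dmc_performance_too"] - readings["all_cpufreq_performance"]
    return {
        "a55_total": a55,
        "a55_per_core": a55 / A55_CORES,
        "a76_total": a76,
        "a76_per_core": a76 / A76_CORES,
        "dmc": dmc,
    }


if __name__ == "__main__":
    for name, value in idle_deltas(READINGS_MW).items():
        print(f"{name}: {value} mW")
```

The DMC step alone (600 mW) is larger than both CPU steps combined (220 mW), which is the point being made about memory clocks.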
The small consumption difference with ARM cores doing nothing regardless of cpufreq (sitting on a ‘Wait for Interrupt’ instruction, which was already there with ARMv7) might have driven the decision on ARM to have one cpufreq policy per cluster (which you call profile?), unlike on other platforms like x86_64 where each core seems to behave independently (no idea how much influence the cpufreq driver on modern x86_64 implementations really has, or whether it’s just reporting what something else inside an Intel/AMD CPU decides).
And I guess the decision to split the four A76 into two clusters was mostly motivated by PVTM, so as to be able to drive them with different supply voltages, and not so much to keep idle A76 on a lower cpufreq OPP.
Yeah, I don’t know enough, but I meant policy (apols). Arm-wise I presume it’s set up in a similar way to the newer triple-cluster architecture, and maybe one policy can push further in MHz than the other, in a similar way to what the core count in Intel SpeedStep may do? I guess it’s really a verybig.biG.Little config that just uses the same cores for verybig & biG.
This time I am really asking you, as I expect you have measured hard data for the various governors.
I have only got one of those really crappy USB-C meters and haven’t bothered to get a readout.
Using my crappy USB-C meter and a 5V Pi supply (as it doesn’t handle PD), it’s telling me approx. 3.3 W at idle.
Running Glmark2 as an example, with no governor or taskset changes, ups the wattage to approx. 4.6 W on average, sometimes above 6 W but often below (probably a decent guesstimate of the average), and ends up with a lacklustre glmark2 score of 582.
Doing something weird and wonderful, setting only echo 2 > /sys/devices/system/cpu/cpufreq/policy4/6/ondemand/up_threshold, gives another horrid guesstimate wattage of approx. 5.3 W on average and seems to have the same occasional max-outs of just over 6 W, with a not brilliant but OK glmark2 score of 920.
After running it drops to that 3.3watt approx idle again.
echo performance | sudo tee /sys/bus/cpu/devices/cpu[046]/cpufreq/scaling_governor /sys/class/devfreq/dmc/governor
This only seems to up the wattage to a guesstimate of 5.8 W, whilst idle rises to 3.9 W, and returns a glmark2 score of 1014.
Lols, as yeah, it makes no sense to me, but I really need a logging meter to get a true average rather than guesstimating it. Actually, I just noticed it does have a resettable mWh log, doh.
I encountered exactly the same problem as you. When I set a voltage higher than 1000 mV, the frequency is locked to 400 MHz.
[ 476.994512] vdd_cpu_big0_s0: Restricting voltage, 1025000-1000000uV
[ 476.994567] vdd_cpu_big0_s0: Restricting voltage, 1025000-1000000uV
[ 476.994591] cpu cpu4: rockchip_cpufreq_set_volt: failed to set voltage (1025000 1025000 1025000 uV): -22
[ 476.994613] cpufreq: __target_index: Failed to change cpu frequency: -22
opp table:
opp-2400000000 {
opp-supported-hw = <0xfd 0xffff>;
opp-hz = <0x00 0x8f0d1800>;
opp-microvolt = <0xFA3E8 0xFA3E8 0xFA3E8 0xFA3E8 0xFA3E8 0xFA3E8>;
clock-latency-ns = <0x9c40>;
};
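Since the decompiled device tree shows everything in hex, it may be easier to follow with the cells converted to decimal. This is a small sketch; the helper names are mine, and the only assumption is the usual device tree convention that opp-hz is a 64-bit value split into two 32-bit cells.

```python
# Cell values copied from the opp-2400000000 node quoted above.
OPP_HZ_CELLS = (0x00, 0x8F0D1800)
OPP_MICROVOLT_CELLS = (0xFA3E8,) * 6
CLOCK_LATENCY_NS_CELL = 0x9C40


def cells_to_u64(high, low):
    """Combine two 32-bit device tree cells into one 64-bit value."""
    return (high << 32) | low


def decode_opp(hz_cells, microvolt_cells, latency_cell):
    """Return the OPP properties as plain decimal numbers."""
    return {
        "hz": cells_to_u64(*hz_cells),
        "microvolt": [int(v) for v in microvolt_cells],
        "clock_latency_ns": int(latency_cell),
    }


if __name__ == "__main__":
    print(decode_opp(OPP_HZ_CELLS, OPP_MICROVOLT_CELLS, CLOCK_LATENCY_NS_CELL))
```

So 0xFA3E8 is 1025000 µV, which is exactly the value the kernel complains about in the log lines above.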
Have you found a way to solve it?
How and why!
I tried to set a higher voltage, but got this:
[ 1070.744651] vdd_cpu_big0_s0: Restricting voltage, 1025000-1000000uV
[ 1070.744701] vdd_cpu_big0_s0: Restricting voltage, 1025000-1000000uV
[ 1070.744726] cpu cpu4: rockchip_cpufreq_set_volt: failed to set voltage (1025000 1025000 1025000 uV): -22
[ 1070.744746] cpufreq: __target_index: Failed to change cpu frequency: -22
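One way to read these messages: the requested range is intersected with an existing limit, and when the requested minimum ends up above the allowed maximum there is nothing left to pick, hence -22 (EINVAL). The sketch below only illustrates that logic. The function name is hypothetical, the 1000000 µV ceiling is taken from the log line, and the 550000 µV floor is an assumption; where the ceiling actually comes from is not established in this thread.

```python
import errno


def merge_voltage_request(request_uv, limit_uv):
    """Intersect a (min, max) request with a (min, max) limit.

    Returns (0, (lo, hi)) when a usable range remains, otherwise
    (-EINVAL, (lo, hi)) with the contradictory bounds for logging.
    """
    req_min, req_max = request_uv
    lim_min, lim_max = limit_uv
    lo = max(req_min, lim_min)
    hi = min(req_max, lim_max)
    if lo > hi:
        return -errno.EINVAL, (lo, hi)
    return 0, (lo, hi)


if __name__ == "__main__":
    # Request from the log: min = max = 1025000 uV.
    # Limit: assumed floor of 550000 uV, ceiling of 1000000 uV as logged.
    code, (lo, hi) = merge_voltage_request((1025000, 1025000), (550000, 1000000))
    print(f"Restricting voltage, {lo}-{hi}uV -> {code}")
```

With these inputs the leftover "range" is 1025000-1000000, the same impossible pair that appears in the kernel message.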
Not entirely sure what this Glmark2 testing is about (as @icecream95 already explained it’s not testing what people expect).
In general I would always test with every governor set to powersave and then with performance again to get worst and best case. Afterwards with a somewhat ‘smart’ powermeter an optimization process could happen to get a good compromise between performance (when needed) and consumption. Only then the other governors and further settings like ondemand/up_threshold (DMC) or ondemand/io_is_busy (CPU) will be explored to find a good balance (as we’ve seen ondemand/up_threshold=40 isn’t a good choice but nobody cares, same with io_is_busy).
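The worst case / best case approach could be scripted roughly like this. It is a sketch under assumptions: the sysfs paths are the ones used earlier in this thread (cpu0, cpu4 and cpu6 as one core per cluster, plus the dmc devfreq node), while the function names and the `root` parameter for dry runs are my own invention.

```python
import os

# One CPU per cluster, as in the cpu[046] glob used earlier in the thread.
CLUSTER_CPUS = (0, 4, 6)
DMC_GOVERNOR = "/sys/class/devfreq/dmc/governor"


def governor_writes(governor, cpus=CLUSTER_CPUS):
    """Return the (path, value) pairs needed to set one governor everywhere."""
    writes = [
        (f"/sys/bus/cpu/devices/cpu{cpu}/cpufreq/scaling_governor", governor)
        for cpu in cpus
    ]
    writes.append((DMC_GOVERNOR, governor))
    return writes


def apply_writes(writes, root="/"):
    """Write each value to its file; needs root privileges on a real system."""
    for path, value in writes:
        with open(os.path.join(root, path.lstrip("/")), "w") as handle:
            handle.write(value)


if __name__ == "__main__":
    # Dry run: only print what would be written for both extremes.
    for governor in ("powersave", "performance"):
        for path, value in governor_writes(governor):
            print(f"{value} -> {path}")
```

Measure idle and load consumption after each of the two runs; everything else (ondemand tunables and so on) is then judged against those two extremes.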
But without power measurements that generate graphs this is a process most probably not worth the time and efforts.
BTW there’s also /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/ defaulting to simple_ondemand and as such dynamically clocking the GPU between 200 and 1000 MHz. Is it worth measuring ‘GL performance’ without taking this into account?
You mean I only need to set the values at positions 3 and 6?
Like this?
opp-2400000000 { //org 2400000000
opp-supported-hw = <0xff 0xffff>;
opp-hz = <0x00 0x8f0d1800>; //org <0x00 0x8f0d1800>
opp-microvolt = <0xf4240 0xf4240 0xFA3E8 0xf4240 0xf4240 0xFA3E8>;
clock-latency-ns = <0x9c40>;
};
After I did that, the warning no longer appears, but the voltage has not changed; it is still 1000 mV.
root@firefly2:~# cat /sys/kernel/debug/regulator/regulator_summary | grep big
vdd_cpu_big0_s0 1 3 0 normal 1000mV 0mA 550mV 1050mV
vdd_cpu_big0_s0 0 0mA 0mV 0mV
vdd_cpu_big1_s0 1 3 0 normal 1000mV 0mA 550mV 1050mV
vdd_cpu_big1_s0 0 0mA 0mV 0mV
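A possible explanation, offered only as a guess: if the six opp-microvolt cells are read as two <target min max> triplets (the layout the generic OPP binding describes, here presumably one triplet per supply), then the modified node still asks for a target of 1000000 µV and merely permits up to 1025000 µV. The sketch below just groups the cells that way; the function name and dictionary keys are mine.

```python
# Cells copied from the modified opp-2400000000 node above.
MODIFIED_CELLS = [0xF4240, 0xF4240, 0xFA3E8, 0xF4240, 0xF4240, 0xFA3E8]


def split_triplets(cells):
    """Group opp-microvolt cells into <target min max> triplets (assumed layout)."""
    if len(cells) % 3 != 0:
        raise ValueError("expected a multiple of three cells")
    return [
        {"target_uv": cells[i], "min_uv": cells[i + 1], "max_uv": cells[i + 2]}
        for i in range(0, len(cells), 3)
    ]


if __name__ == "__main__":
    for index, triplet in enumerate(split_triplets(MODIFIED_CELLS)):
        print(index, triplet)
```

Under that reading, a regulator staying at 1000 mV would satisfy the request, so no warning and no voltage change would both be expected. Raising the first cell of each triplet is what would change the target, though that may well bring the earlier error back.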
It has far more relevance than the synthetic benchmarks that are just walls of constant load that you keep using.
It was just the 1st thing I noticed that has a more ‘normal’ load and really shows off how bad the frequency bouncing of the ondemand governor can be.
It could be any of a huge range of normal applications of varying load. The Glmark2 tests also provide secondary bonus info on how the graphics subsystem is working; the score doesn’t really matter but is a bonus metric of how the scheduler is doing.
Choose from the range of running apps that Phoronix or Geekbench base many of their tests on, but don’t use a very synthetic wall of load that doesn’t create ondemand frequency-bounce inefficiency.
Pick something that has a more normal application load profile that does ‘bounce’ around with load.
Choice is yours Glmark2 to Tensorflow-lite ASR or image detection or some Browser tests that do bounce load.
I have no idea really, as the symmetric load-based governors don’t seem able to be both performant and efficient on the asymmetric Arm core layouts we have, which is why I was asking if anyone knew anything about Sched-capacity & Sched-energy.
Benchmarks don’t mean anything. You have to test your use case and adjust your CPU governor accordingly. A simple DNS and DHCP server doesn’t need the same policy as a media streaming server or desktop system.
Yep, and that is what we should be benching: real apps, which Glmark2 is closer to than just some block mass load of 7zip.
This is what I am asking about: common, more desktop-like applications running the way the schedulers will likely be provided. Even more server-like applications than just simple DNS & DHCP, as I doubt many will use a Rock 5B solely for that, even if they could.
SuperTuxKart or maybe the demo level of Quake as one example, or anything bouncy with load that involves multiple subsystems.
I am not taking a swipe at benchmarks but I am questioning if the schedulers we commonly use are any good.
The performance governor as above didn’t add that many watts to idle, so maybe the answer is just to use performance when not mobile; it’s a question I am asking more than anything else. Are there better asymmetric schedulers and methods to test them?
I’ve had strange results yesterday when increasing the voltage. I got some combinations that worked for some frequencies. But not every time, sometimes after a reboot they were not accepted anymore. I got 2.35 GHz working at 1.037 V at positions 1,2,4,5 and 1.05 V at 3,6, but couldn’t reproduce it later. I’ve seen that modifying then reverting the DTS resulted in some values no longer working, but I really think that instead it was caused by the reboot. I still haven’t figured out the complete extent of this mess yet. After reading the document shared above, I think that the issue might be that we’re using too narrow ranges between min and max and that maybe it’s not possible to configure an exact voltage value that matches. I’ll need to do more tests on this.
For anything non-vsynced or at a moderately high resolution, it’s pretty easy to push the GPU to “1000” (990 for me) MHz. I don’t think that’s a big concern, at least not compared to the CPU clocking itself too low.
If you want both maximum performance and minimum consumption in this silly SBC world (ARM and not x86) you need benchmarks to be able to optimize stuff since defaults suck. Benchmarks that represent at least one use case you’re interested in (e.g. ‘server workloads in general’) are the prerequisite to
- test out tunables/settings (something we as users need to do since all we get from SoC vendors is crap suited for Android use cases)
- take decisions (wrt scheduling, for example on hybrid systems: which tasks or interrupts should be pinned to ‘little’ and which to ‘big’ cores, when do I need a big core since a little becomes a real-world bottleneck)
As if it would be that easy and as if there would be only the cpufreq governor. It’s about more than this and settings matter regardless which governor has been chosen (repeating myself again and again in this thread)
Example 1: clocking LPDDR with dmc_ondemand memory governor and the up_threshold=40 default --> lower idle consumption at the cost of almost 15% performance loss. up_threshold=25 is better to keep idle consumption low but ramp up DRAM clockspeed immediately when needed and as such retain max performance at min idle consumption.
Example 2: I/O activity: ondemand/io_is_busy wins over schedutil if it’s about maximum I/O performance while keeping consumption at the minimum.
I’m testing with schedutil since IIRC kernel 4.6 or something like that but on ARM never had any success compared to ondemand/io_is_busy. And again my use cases are very limited (server workloads) and my main goal is minimum consumption and maximum performance combined. Which is possible just not with SoC vendors defaults since we’re dealing here with the ‘Android e-waste world’ needing a lot of adjustments for ‘Linux use cases’.
I’m not interested in any of these use cases so why should I care about this crap. Other than your perception (‘walls of constant load’) the chosen benchmarks are of actual value to help with developing settings that combine min consumption and max performance with the use cases I’m interested in since this is my only goal (if this wasn’t the goal why would I deal with this shitty ‘Android e-waste world’ and the horrible software support situation in the first place?)
I wouldn’t be too surprised if all of this stuff works amazingly well on expensive Android smartphones featuring triple CPU core clusters with Samsung or HiSilicon BSP kernels + userlands (tons of vulnerabilities included since SoC vendor’s BSP) while with mainline kernel you get a different experience.
Though I’ve no clue since not using anything Android and still being a fan of manually adjusting SMP/IRQ affinity since for the use cases I’m interested in it makes more sense.
@icecream95 thank you for your ‘security advisory’ wrt this 5.10 BSP kernel. Of course some issues exist with RK BSP kernels for a longer time already: https://github.com/armbian/build/search?q=CONFIG_DRM_IGNORE_IOTCL_PERMIT