Well, adjusting this (and providing one of the two USB3 enclosures with its own power supply) fixed the issues. But it seems every tweak I added to Armbian over the years (like coherent pool size, IRQ affinity and ondemand governor tweaks) needs to be applied to Radxa's OS images too.
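For context, these are just kernel cmdline parameters and sysfs writes. A rough sketch of what's meant (the pool size, IRQ number and CPU mask below are placeholders, not the values Armbian actually ships):

# enlarge the DMA coherent pool via the kernel cmdline, e.g. via extraargs in /boot/armbianEnv.txt
# extraargs=coherent_pool=2M
# route a USB/storage IRQ (number taken from /proc/interrupts) to a big core, e.g. CPU 7 (mask 0x80)
echo 80 > /proc/irq/100/smp_affinity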
Testing the RAID0 with a simple iozone -e -I -a -s 1000M -r 1024k -r 16384k -i 0 -i 1:
     kB  reclen    write  rewrite     read   reread
1024000    1024   258830   270261   341979   344249
1024000   16384   270022   271088   667757   679947
That was running on a little core and is total crap. Now on a big core (by prefixing the iozone call with taskset -c 7):
     kB  reclen    write  rewrite     read   reread
1024000    1024   475382   457597   374120   374863
1024000   16384   777913   771802   736511   734899
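For reference, the pinned call from above expands to this (CPU 7 being one of the big cores):

taskset -c 7 iozone -e -I -a -s 1000M -r 1024k -r 16384k -i 0 -i 1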
Still crappy but when checking with sbc-bench -m it's obvious that CPU clockspeeds don't ramp up quickly and to the max:
Time        big.LITTLE  load %cpu %sys %usr %nice %io %irq    Temp
19:43:03:  408/1008MHz  0.35   3%   1%   0%    0%  1%   0%  32.4°C
19:43:08:  408/1008MHz  0.33   0%   0%   0%    0%  0%   0%  32.4°C
19:43:13:  816/1200MHz  0.38  13%   3%   0%    0%  8%   1%  32.4°C
19:43:18: 1416/1200MHz  0.43  13%   2%   0%    0% 10%   0%  32.4°C
19:43:23:  600/1008MHz  0.56  14%   4%   0%    0%  7%   1%  33.3°C
19:43:28:  408/1008MHz  0.51   1%   0%   0%    0%  0%   0%  32.4°C
One echo 1 > /sys/devices/system/cpu/cpufreq/policy6/ondemand/io_is_busy later we're at:
     kB  reclen    write  rewrite     read   reread
1024000    1024   604081   603160   505333   506815
1024000   16384   791865   789173   775160   778706
Still not perfect (less than 800 MB/sec) but at least the big core now immediately switches to the highest clockspeed:
Time        big.LITTLE  load %cpu %sys %usr %nice %io %irq    Temp
19:50:44:  408/1008MHz  0.00   2%   0%   0%    0%  0%   0%  32.4°C
19:50:49: 2400/1800MHz  0.00   5%   1%   0%    0%  3%   0%  32.4°C
19:50:54: 2400/1800MHz  0.08  12%   1%   0%    0% 11%   0%  32.4°C
19:50:59: 2400/1800MHz  0.15  13%   2%   0%    0%  9%   0%  34.2°C
19:51:04:  408/1008MHz  0.14   2%   0%   0%    0%  1%   0%  32.4°C
But 790 MB/s is close to what can be achieved with a RAID0 over two USB3 SuperSpeed buses, since each bus maxes out at slightly above 400 MB/s.
Moral of the story: if you're shipping with the ondemand cpufreq governor you need to take care of io_is_busy and friends.
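A quick way to check whether an image gets this right (a sketch, assuming the ondemand governor is active so the tunables are exposed per policy):

grep . /sys/devices/system/cpu/cpufreq/policy*/ondemand/io_is_busy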
After adding the following to /etc/rc.local, storage performance is ok-ish even on a little core:
# set the PCIe ASPM policy back to default
echo default >/sys/module/pcie_aspm/parameters/policy
# tune the ondemand governor on all three cpufreq policies (CPU clusters)
for cpufreqpolicy in 0 4 6 ; do
    # count time spent waiting for I/O as load so clockspeeds ramp up during storage activity
    echo 1 > /sys/devices/system/cpu/cpufreq/policy${cpufreqpolicy}/ondemand/io_is_busy
    # switch to higher clockspeeds already at 25% load
    echo 25 > /sys/devices/system/cpu/cpufreq/policy${cpufreqpolicy}/ondemand/up_threshold
    # stay at higher clockspeeds for a while before sampling down again
    echo 10 > /sys/devices/system/cpu/cpufreq/policy${cpufreqpolicy}/ondemand/sampling_down_factor
    # re-evaluate load every 200 ms
    echo 200000 > /sys/devices/system/cpu/cpufreq/policy${cpufreqpolicy}/ondemand/sampling_rate
done
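One assumption about the target image: on Debian-based systems /etc/rc.local is only executed at boot if the file starts with a shebang and is executable, so a one-time chmod may be needed as well:

chmod +x /etc/rc.local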
This is the result tested with taskset -c 1 iozone -e -I -a -s 1000M -r 1024k -r 16384k -i 0 -i 1:
     kB  reclen    write  rewrite     read   reread
1024000    1024   524857   526483   458726   459194
1024000   16384   780470   774856   733638   734297