You can make some comparisons using Geekbench results. The Pi 4 (depending on clock speed) is rated at about 250/650 (1500 MHz) and up to 310/810 (2200 MHz). The best scores for the Rock 3A are around 168/533 (and most are 130/450). That is roughly 80% of the Pi 4’s performance, about 70% for the lower scores, and down to roughly 50% compared to an overclocked Pi. Of course it’s just a benchmark and I don’t expect all results to be directly comparable, but you can get some idea of how the big cores compare to the little ones.
The real strength of the RK3568 lies somewhere else: its newer and faster I/O with PCIe 3.0.
I don’t think SO-DIMMs make sense here; it’s just better and cheaper to put memory on board. Please note that the RK3568 supports up to 8 GB, so you will not get anything better than the Rock 3A with its onboard memory (which is $75 for the 8 GB version).
ROCK 3B ideas and discussion
You do realize you can just put in another controller for 2x or more additional SATA ports or, if you absolutely insist on a single native SATA port, use this cheap M.2 B-key 2242 board that has both PCIe and SATA wired to it?
No you can not, as the PCIe lanes are already used by the first one, so all the bandwidth is taken even if you could find an M.2 splitter; there is only a single PCIe 3.0 x2.
Or go to all the expense of a splitter and PCIe bifurcation to get half the bandwidth, whilst a single onboard solution would mean x6 at full speed.
The other M.2 is A+E key.
https://www.aliexpress.com/item/1005003839974544.htm
It was just a bad choice of format and I/O for that SoC; if you look at alternative devices built on that SoC, they are almost all either router or NAS devices.
By the time you have hacked in extender boards and converters it becomes massively less cost effective. The compute modules were more like what it should have been, but even then there are some curious choices and they are even less cost effective.
I am not going to do anything, as I didn’t purchase one and wouldn’t, as I don’t think it makes a great SBC at that price for what it offers.
We are talking about Rock 3B, right?
@stuartiannaylor A JMB585 will indeed occupy the lower M.2 M-key slot with PCIe 3.0 x2, but you still have PCIe 2.1 x1 muxed with SATA in the upper M.2 B-key slot, and can choose to use either. How is that bad? You can just put this in the upper slot without any adapters, instead of messing with the A+E card you linked. Or a 4-port Marvell controller, again without a riser or adapter. And again, for a single SATA port you can use this.
The upper PCIe 2 slot is A+E, and no, I can not use one of those cards as said. I am not going to do anything, as I didn’t purchase one and wouldn’t, as I don’t think it makes a great SBC at that price for what it offers.
If you check the update from @hipboi, the upper PCIe2 on Rock 3B will be B key, not E key.
That’s entirely up to you. I just pointed out that you are wrong about SBC capabilities, and provided examples how you could solve the problem described in your earlier post.
It’s more feedback than anything, as I usually give whatever Radxa is offering a go, but I always felt the RK356x SoCs were extremely application focused whilst the big dog RK3588 is the general purpose desktop part.
I probably could use it if it is B-key; I thought it was A+E. But I am still undecided, as that port would be SATA II as opposed to the others being SATA III, which I guess would all drop to the lower port speed, or at least be queuing and waiting for it.
I just don’t like the idea, or the Pi-like format.
That port will be SATA 3 even if you use the muxed SATA via a simple adapter, let alone one of the PCIe to SATA 3 controllers linked above.
You’re right, I had forgotten that it has an 8GB max; better to have onboard RAM.
I just did a quick search on Geekbench; the best scores I can find are approximately these:
Pi3+ r1.3 @1.4GHz = 120/350 (80%/78%)
RK3568 @2GHz = 150/450 (100%/100%)
Pi4 r1.1 @1.5GHz = 230/680 (153%/151%) <-- have it running 1.8GHz passively cooled
Pi4 r1.4 @1.8GHz = 270/740 (180%/164%)
RK3399 @1.4GHz = 270/750 (180%/167%)
RK3588 @1.8GHz = 500/2000 (333%/444%)
Seems like the Pi is quite a bit faster, though it has no crypto engine; currently the Pi 4 4GB is around US$100, and I think a Rock 3 with only 4GB RAM should be quite a bit cheaper…
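For anyone double-checking, the percentages in the list are simply each pair of scores divided by the RK3568 @ 2GHz baseline; a minimal sketch using the numbers as listed above:

```python
# Relative Geekbench 5 performance vs. the RK3568 @ 2 GHz baseline,
# using the (single, multi) scores from the list above.
scores = {
    "Pi3+ @1.4GHz": (120, 350),
    "RK3568 @2GHz": (150, 450),
    "Pi4 @1.5GHz": (230, 680),
    "Pi4 @1.8GHz": (270, 740),
    "RK3399 @1.4GHz": (270, 750),
    "RK3588 @1.8GHz": (500, 2000),
}

base_single, base_multi = scores["RK3568 @2GHz"]
for name, (single, multi) in scores.items():
    print(f"{name:<15} {single / base_single:>4.0%} single / {multi / base_multi:>4.0%} multi")
```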
Not sure that QSGMII can be used for 2x2.5GbE at all… normally it is used for 4xGbE, but I doubt Rock 3B would have that.
Sure? Why not take 183/589 instead?
Which use case does Geekbench represent? And what do these numbers mean? Those combined scores?
How is this huge result variation possible given that Geekbench claims to benchmark hardware?
Why do you list RK3588 with 1.8 GHz?
I am not very sure if you really know how benchmarking works…
Anyway, variation happens when one uses a different OS, hardware revision, config/DTS, and sometimes even different connected devices. Proper benchmarking requires a CONTROLLED environment, and I don’t think that is easy to come by, like a temperature/humidity controlled room.
Back to the “comment”: the results I used are based on Android 12, while the one you stated is based on Ubuntu 20.04 LTS, that’s why. And why 1.8GHz for RK3588? You may go check it out yourself.
https://browser.geekbench.com/v5/cpu/search?utf8=✓&q=rk3588
And perhaps I should just focus on INT performance too…
LOL! Too funny!
I did exactly that. Of course not relying on this Geekbench garbage but by examining SoC and software behaviour.
And on RK3588 (as well as RK3566/RK3568) we’re dealing with PVTM which usually results in the rather uninteresting little cores clocking in at 1800 MHz but the more important A76 ones being clocked up to 2400 MHz: https://github.com/ThomasKaiser/Knowledge/blob/master/articles/Quick_Preview_of_ROCK_5B.md#pvtm
We’ve recently seen some comical PVTM failure resulting in the A76 cores being clocked at only 400 MHz.
Geekbench, meanwhile, is stupid enough to rely on sysfs entries that merely claim certain clockspeeds. And the 1800 MHz you were fooled by is due to Geekbench presenting the clockspeed of cpu0, which on ARM is usually a little core. Geekbench just generates some funny numbers and its target audience usually doesn’t give a sh*t but blindly trusts these numbers.
Then maybe you should better look at my sbc-bench than this Geekbench garbage…
sbc-bench tries to generate insights and not just numbers. See these two Rock 3A results:
http://ix.io/40TX: 7-zip multi-threaded score of 5110
http://ix.io/48eg: 7-zip multi-threaded score of just 2690
How is that possible? Fortunately sbc-bench results also contain the answers. It’s not just silicon variation (the RK3568 in the 2nd test clocks severely lower than 2000 MHz) but also broken thermal trip points, AKA bad settings.
As for result variations with RK3568 we have:
- PVTM
- boot BLOBs that do DRAM initialization (I have a couple of older sbc-bench results in my collection with really bad memory performance)
- relevant kernel config like CONFIG_HZ, or thermal trip points resulting in moderate up to absurd throttling like above (a quick way to inspect these is sketched below)
- different environmental conditions like temperature
- background activity that ruins benchmark numbers
Geekbench doesn’t care about any of these, just generates and uploads random numbers.
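To make the list above a bit more concrete, here is a minimal inspection sketch (nothing sbc-bench specific, just standard Linux sysfs/procfs reads). Keep in mind that scaling_cur_freq is only the reported clock, not a measured one, and /proc/config.gz only exists if the kernel was built with CONFIG_IKCONFIG_PROC:

```python
# Quick dump of reported cpufreq, thermal trip points and CONFIG_HZ
# on a running Linux board (sketch only, no error handling).
import glob, gzip, os

# Reported (not measured!) per-core clockspeeds
for path in sorted(glob.glob("/sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq")):
    cpu = path.split("/")[5]
    print(cpu, open(path).read().strip(), "kHz (reported)")

# Thermal trip points – broken or far too low values explain absurd throttling
for zone in sorted(glob.glob("/sys/class/thermal/thermal_zone*")):
    for trip in sorted(glob.glob(os.path.join(zone, "trip_point_*_temp"))):
        trip_type = open(trip.replace("_temp", "_type")).read().strip()
        print(os.path.basename(zone), trip_type, int(open(trip).read()) / 1000, "°C")

# Kernel CONFIG_HZ (only available with CONFIG_IKCONFIG_PROC)
if os.path.exists("/proc/config.gz"):
    with gzip.open("/proc/config.gz", "rt") as cfg:
        print([line.strip() for line in cfg if line.startswith("CONFIG_HZ")])
```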
the results I used are based on Android 12, while the one you stated is based on Ubuntu 20.04 LTS, that’s why.
LOL! Too funny!
Sure, if you have the time to go into all of this, good for you; I used to play with benchmarking when I was in college (yep, they have a class for that too), but that was so many years ago and I don’t have time to “study all these” anymore.
And of course Geekbench does not care… how they would support a SoC is beyond me, but there are so many SoCs and SBCs in the world that I don’t think it is economical for them to “support everything”. Perhaps the best way to benchmark is to make it work like memtest86, so that the system only runs what is intended (the benchmark) and nothing more.
Obviously Geekbench is targeting a much wider audience than SBCs, so can you tell how a Rock 5 using a cheap PCIe x4 SSD fares against an M2 MacBook Air? Or some old tablet running an Intel Z8350 dual booting Windows 10 and Android? With your benchmark? Different benchmarks serve different purposes, so I won’t just call them crap, unless they deliberately lie.
There is no need to ‘support’ a SoC. It’s due diligence when generating these numbers. And Geekbench sucks here big time.
They don’t give a sh*t about a sane benchmarking environment and deliberately upload garbage numbers to their website even if those are just numbers without meaning.
It is not hard to
- measure CPU clockspeeds instead of trusting some (often faked) sysfs entry. An ‘amateur’ like me uses @willy’s mhz utility for this in sbc-bench. Geekbench clockspeed reporting is pure BS. They report 2700MHz or 4000MHz for the same Core i7 in my old MacBook Pro depending on whether the benchmark runs on Windows or Linux. ARM’s big.LITTLE has been around for a decade now but Geekbench doesn’t care. They (try to) measure single-threaded performance on a big core but report the little core’s sysfs clockspeed. Stupid, isn’t it?
- check background activity. It’s easy to spot this and to refuse to upload crappy numbers if this happened (a minimal sketch of such a check follows after this list)
- check throttling or thermals in general. It’s easy to spot this and to refuse to upload crappy numbers if this happened
- get the CPU’s cluster details and report them correctly and also test cores of all clusters individually. If ‘amateurs’ like me are able to do this in sbc-bench those Geekbench guys should be able to do this as well. At least benchmarking is their only job
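To illustrate the ‘check background activity’ point above: a minimal sketch (not what sbc-bench actually does; the 5% threshold and the 5 second sampling window are arbitrary example values) that samples /proc/stat twice and refuses to benchmark on a busy system:

```python
# Refuse to benchmark on a busy system: compare two /proc/stat snapshots.
import time

def cpu_busy_fraction(interval=5.0):
    def snapshot():
        with open("/proc/stat") as f:
            fields = [int(x) for x in f.readline().split()[1:]]
        idle = fields[3] + fields[4]   # idle + iowait columns
        return idle, sum(fields[:8])   # user..steal, skip guest columns
    idle1, total1 = snapshot()
    time.sleep(interval)
    idle2, total2 = snapshot()
    return 1.0 - (idle2 - idle1) / (total2 - total1)

busy = cpu_busy_fraction()
if busy > 0.05:   # arbitrary example threshold
    raise SystemExit(f"~{busy:.0%} CPU busy in the background – the numbers would be garbage")
print("system idle enough, proceeding with the benchmark")
```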
Use case first! Always!
Which use case does Geekbench represent other than… none? I have asked various people this many times and never got an answer. Which f*cking use case do these Geekbench combined scores represent?
With a ‘use case first’ approach it’s easy: I need macOS for my main computer so regardless of any performance considerations it’s the MacBook.
Then if the use case is ‘use this thing as desktop computer’ the important stuff is
- high random I/O performance of the boot drive (the drive the OS is installed on)
- as much graphics acceleration possible (both GPU and VPU)
- sufficient amount of RAM since today’s software is a sh*tload of complexity (crappy software stacking layers of complexity one over another) needing tons of RAM. Once the machine starts to pageout/swap especially on storage with low random I/O performance it’s game over
What will Geekbench tell you about this? Nothing, since it only focusses on (multi-threaded) CPU performance, which is something for totally different use cases. The target audience is consumers not willing/able to think about what’s important but just interested in numbers/graphs and the ‘less is better’ or ‘more is better’ label so they can compare $whatever without using their brain.
When talking about an M2 MacBook… why don’t you talk about an M1 MacBook Air?
You can get them less expensive (secondhand/refurbished) compared to the recent M2 model. Geekbench will tell you that there’s only a minor difference in ‘performance’ but if your use case is video editing and you’re stuck to the Apple ecosystem then there’s a massive performance difference since M2 has an Apple ProRes encoder/decoder ‘in hardware’.
It’s not my benchmark. sbc-bench is just a primitive tool that executes a small number of well established benchmarks in a controlled environment to be able to trash all those results that are garbage. It’s designed to get some real insights and not just numbers/graphs. IMHO developing a new benchmark is almost as stupid as developing cryptography if you’re not an absolute expert like @NicoD
Benchmarking is part of my day job and honestly +95% of all my benchmarking results need to be trashed since something went wrong (some detail you overlooked, some strange background activity, something you need to take into account since you learned something important on the way and so on). Compare with Geekbench or the Phoronix crap: both generate a cemetery of broken numbers. But this is part of their business model so why should this change?
As long as users are happy swallowing these BS numbers it works for them…
I have to agree with @tkaiser here. Geekbench is widely known for reporting random numbers that depend on the OS, its version, Geekbench’s version, plenty of other variables nobody knows about, and probably the user’s mood and expectations, and all this without accurately reporting the execution conditions. Over the years we’ve read so many garbage values that its metric is not even a reliable one to validate how Geekbench itself would perform on this or that machine, in the unlikely case that your target use case is only to run Geekbench on the machine. As to how this translates to real world use cases… honestly dickbench is only useful to compare one’s to others and claim “mine is bigger”.
And indeed, a benchmark serves to provide a figure. The most important part of a benchmark is that it is reproducible. There’s no problem if reproducing it requires 10 pages of setup, provided that at the end you get the same results again, as it proves you’ve got the model right. That definitely requires eliminating all background noise or at least limiting it to a known level. Geekbench’s indications seem extremely far from this at the moment, with only random elements being extracted to characterize the environment, and the most important one (CPU frequency) being reported wrong on a machine where it can vary 6-fold.
As a hint, a benchmark that only returns one value is garbage and totally useless. You need to find multiple dimensions and figure which one best matches your use case (i.e. mostly I/O, mostly CPU, mostly RAM, mostly GPU etc), and make sure about the condition they were executed in.
You don’t need to create a new benchmark; running a specific application that matches your use case can be a perfectly valid benchmark for your use case. You just need to limit undesired variations. For example, I’m interested in compilation times, and for this I keep an old tar.gz of some software relevant to me and build it with a binary build of the compiler that I use on all benchmarks. That doesn’t prevent me from testing other combinations, but at least I can compare the machines’ performance on this specific test, knowing that there are limits (e.g. number of cores etc).
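Not @willy’s actual setup, but a minimal sketch of that idea with made-up paths: keep one fixed source tarball and one binary compiler build, then simply time the build. The -j value has to be chosen consciously, since core count is one of the limits mentioned above:

```python
# Time a build of a pinned tarball with a pinned compiler (hypothetical paths).
import subprocess, tarfile, tempfile, time

TARBALL = "/bench/myproject-1.0.tar.gz"    # fixed source snapshot, never changes
CC = "/bench/gcc-10.3-binary/bin/gcc"      # same compiler binary on every machine

with tempfile.TemporaryDirectory() as tmp:
    with tarfile.open(TARBALL) as tar:
        tar.extractall(tmp)
    start = time.perf_counter()
    subprocess.run(["make", f"CC={CC}", "-j4"], cwd=f"{tmp}/myproject-1.0",
                   check=True, stdout=subprocess.DEVNULL)
    print(f"build took {time.perf_counter() - start:.1f} s")
```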
@willy @tkaiser – everything you complain about regarding GB is true, and it’s easy to find out that the produced results are not super reliable. It’s easy to get much lower scores, and there are sometimes manipulated scores uploaded to the GB browser (like 80k results from cellphones, much better than a big and hungry AMD Epyc).
Right now the best scores are flooded with some AMD engineering samples with 192 cores. Is that something real, or yet again some fun from developers or bored users? We also know nothing about any other result: was the CPU overclocked, was it cooled and how? All of those are unknown (as well as many other very important things) and this can alter results.
We all know that GB is also outdated and not precise about “cores”. Right now we have big.LITTLE cores, including the newest Intel CPUs. Software (system and kernel) can also limit some numbers, and it’s not clear whether you don’t have some feature or whether it’s just turned off or the configuration is wrong. It’s probably just easier to get scores that are too low than too high.
So what do you expect from this benchmark? Should it just split from “single/multi core” into something like “single small, single big, all cores”? Some results depend on storage, so maybe it should also benchmark that? And what about kernel and system? I think you will never have a precise number that represents those resources. Also, if it’s worth changing the method, sooner or later there will be something like Geekbench 6.
I saw the discussion at sbc-bench about a web interface for results. Even when most tests don’t use the full CPU or have some bottlenecks, they still represent some state. For the same reason GB scores are somewhat useful, plus they are easy to view, save, compare etc. It’s probably easier to get a score that is too low than too high. One number alone means almost nothing if you can’t compare it to something else, like another SBC, another OS, other components. The real advantage is when you find that your setup performs much slower than most others: you get some idea that something is limiting your score and maybe fix it (or not, if you find out that it’s just your specific usage). The more recorded results, the higher the probability that someone has the same setup.
So the GB score is not perfect and I don’t expect it to be. It gives some overview, something to compare wisely, like pi-benchmarks for storage gives some idea of what you can get with certain SD cards or M.2 drives via adapters. The original topic was about the Rock 3A’s 4x small cores vs the older big cores from the Pi 4, and that can be checked there via some results; some day maybe something like Cortex A59 small cores would be better and much more energy efficient than those from the Pi 4.
For now I said that the RK3568 is rather slower than the Pi 4, and that is visible in the results. You could get two boards, perform any task you want (like @willy’s tar.gz build), measure the time and be sure. You can also ask for those files and the .sh script and find out yourself, but I think there is something similar inside the GB subtasks, and you can compare that task alone. It’s just easier to browse results than to look for someone who has that board/computer etc.
In a ‘cemetery of bogus numbers’, which is what the Geekbench browser relies on (and which also applies to the PTS stuff over at openbenchmarking.org).
How do you explain @enoch coming up with these GB single/multi scores:
While realistic scores for RK3568 are much higher, e.g. 183/589 instead. He spent/wasted some time to create a comparison table based on BS numbers.
The main problem is Geekbench allowing to upload garbage. But that’s part of their business model so nothing will change.
Want to search for RK3588? Only 2 results today! RK3588S: 10 hits! So 12 combined!
While in reality there were already 205 a month ago: https://github.com/ThomasKaiser/sbc-bench/blob/master/results/geekbench-rk3588/results.md (this was an attempt to do some data mining in the garbage bin – I was interested in RK3588 PVTM variation).
They even manage to hide the garbage they collect…
One last time about this Geekbench garbage.
The ‘natural’ clockspeed of RPi 4 is 1800 MHz (the 1500 MHz limitation applied only to the early BCM2711 Rev. B0 SoCs). Let’s assume BCM2711 C0 with natural clockspeed gets a GB 280/730 score.
Ok, we shouldn’t assume but browse the results: https://browser.geekbench.com/search?q=bcm2711
First surprise: RPi 4 when running in Linux has only 1 CPU core while being equipped with 4 when running Android!
Then, looking at the 1st results page only, we see these scores for 1800 MHz:
- 184/280
- 194/429
- 210/529
- 212/637
- 193/451
- 206/450
Impressive, since these are far, far away from the assumed 280/730 score. And the result variation is so high that we already know we’re not dealing with a benchmark but with a random number generator.
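Just to quantify that spread (a trivial sketch using the six results listed above):

```python
# Spread of the six BCM2711 @ 1800 MHz Geekbench results listed above.
singles = [184, 194, 210, 212, 193, 206]
multis = [280, 429, 529, 637, 451, 450]

print("single-threaded spread:", round(max(singles) / min(singles), 2))   # 1.15x
print("multi-threaded spread: ", round(max(multis) / min(multis), 2))     # 2.27x
```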
Anyway: back to the assumed 280/730 score. If we compare with RK3568 and use 180/590, the A55 is at 65% of the A72 with single-threaded tasks but 80% with tasks utilizing all 4 cores.
- BCM2711 achieves a 2.6 ratio when comparing single-threaded with multi-threaded (with silly synthetic benchmarks we would see a 4.0 ratio when comparing one core with 4)
- RK3568 achieves a 3.3 ratio when comparing single-threaded with multi-threaded
So RK3568 is a factor of 1.26 more efficient when all cores are busy compared to BCM2711, which hints at the latter being a SoC suffering from internal bottlenecks. And if we take this 1.26 factor and look at the multi-threaded efficiency (80% / 1.26 = 64%) we see that the GB results are at least consistent.
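The arithmetic spelled out, as a small sketch using the assumed 280/730 and 180/590 scores from above:

```python
# Multi/single scaling ratios from the assumed Geekbench scores above.
bcm_single, bcm_multi = 280, 730   # assumed BCM2711 C0 score at its natural clockspeed
rk_single, rk_multi = 180, 590     # realistic RK3568 score

print(f"A55 single-threaded: {rk_single / bcm_single:.0%} of A72")   # 64%, the ~65% above
print(f"A55 multi-threaded:  {rk_multi / bcm_multi:.0%} of A72")     # 81%, the ~80% above

bcm_ratio = bcm_multi / bcm_single   # ~2.6 multi/single scaling
rk_ratio = rk_multi / rk_single      # ~3.3
print(f"scaling advantage:   {rk_ratio / bcm_ratio:.2f}x")           # 1.26
print(f"consistency check:   {0.80 / (rk_ratio / bcm_ratio):.0%}")   # 64%
```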
But which use case do Geekbench combined scores represent? Exactly: none.
When we look at some real-world task like compression/decompression with a specific algorithm (e.g. 7-zip’s internal benchmark, whose multi-threaded score is a rough representation of ‘server workloads in general’), then the A55 is at 83% of the A72 with single-threaded tasks but 91% with tasks utilizing all 4 cores (see below). I’m comparing RPi 4 at 1.8 GHz with Rock 3A at 2.0 GHz (‘natural’ clockspeeds, no ‘overclocking’ or the like).
While @enoch got fooled by the Geekbench result browser into believing the A55 is at 56% of the A72 with single-threaded tasks but 61% with tasks utilizing all 4 cores.
56% vs. 83% single-threaded and 61% vs. 91% multi-threaded. By relying on numbers without meaning.
Single-threaded 7-zip
1 x Cortex-A72 @ 2000 MHz (BCM2711)
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,4 CPUs LE)
LE
CPU Freq: - - - - - - - - -
RAM size: 3794 MB, # CPU hardware threads: 4
RAM usage: 435 MB, # Benchmark threads: 1
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 1675 100 1631 1630 | 24872 100 2125 2124
23: 1605 100 1637 1636 | 24365 100 2111 2109
24: 1527 100 1642 1642 | 23919 100 2100 2100
25: 1420 100 1622 1622 | 23271 100 2072 2071
---------------------------------- | ------------------------------
Avr: 100 1633 1632 | 100 2102 2101
Tot: 100 1868 1867
1 x Cortex-A72 @ 1800 MHz (BCM2711)
Executing benchmark single-threaded on cpu0 (Cortex-A72)
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,4 CPUs LE)
LE
CPU Freq: - 64000000 - - - - - - -
RAM size: 958 MB, # CPU hardware threads: 4
RAM usage: 435 MB, # Benchmark threads: 1
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 1688 100 1643 1643 | 22726 100 1941 1940
23: 1531 100 1561 1561 | 22395 100 1939 1939
24: 1442 100 1551 1551 | 21999 100 1932 1931
25: 1351 100 1543 1543 | 21476 100 1912 1912
---------------------------------- | ------------------------------
Avr: 100 1575 1574 | 100 1931 1930
Tot: 100 1753 1752
1 x Cortex-A55 @ 2000 MHz (RK3568)
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,4 CPUs LE)
LE
CPU Freq: - - - - - - - - -
RAM size: 3924 MB, # CPU hardware threads: 4
RAM usage: 435 MB, # Benchmark threads: 1
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 1039 100 1011 1011 | 22872 100 1953 1953
23: 955 100 974 974 | 22343 100 1934 1934
24: 906 100 974 974 | 21790 100 1913 1913
25: 844 100 965 965 | 21048 100 1874 1873
---------------------------------- | ------------------------------
Avr: 100 981 981 | 100 1919 1918
Tot: 100 1450 1450
Multi-threaded 7-zip
4 x Cortex-A72 @ 2000 MHz (BCM2711)
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,4 CPUs LE)
LE
CPU Freq: - - - - - - - - -
RAM size: 3794 MB, # CPU hardware threads: 4
RAM usage: 882 MB, # Benchmark threads: 4
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 4257 348 1188 4141 | 94921 396 2043 8098
23: 3910 367 1086 3984 | 92508 395 2024 8004
24: 3892 374 1120 4185 | 90618 396 2009 7955
25: 3824 370 1179 4366 | 88417 397 1983 7869
---------------------------------- | ------------------------------
Avr: 365 1143 4169 | 396 2015 7982
Tot: 381 1579 6075
4 x Cortex-A72 @ 1800 MHz (BCM2711)
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,4 CPUs LE)
LE
CPU Freq: - - - - - - - - -
RAM size: 958 MB, # CPU hardware threads: 4
RAM usage: 882 MB, # Benchmark threads: 4
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 4232 350 1177 4117 | 88569 399 1894 7556
23: 4030 359 1145 4107 | 87096 399 1887 7536
24: 3978 372 1151 4278 | 85017 398 1877 7463
25: 2601 307 968 2970 | 82219 399 1834 7317
---------------------------------- | ------------------------------
Avr: 347 1110 3868 | 399 1873 7468
Tot: 373 1491 5668
4 x Cortex-A55 @ 2000 MHz (RK3568)
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,4 CPUs LE)
LE
CPU Freq: - - 64000000 64000000 - - - - -
RAM size: 3924 MB, # CPU hardware threads: 4
RAM usage: 882 MB, # Benchmark threads: 4
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 3115 359 845 3031 | 86765 399 1857 7402
23: 2966 367 824 3022 | 84797 399 1838 7337
24: 2903 378 826 3122 | 82388 399 1811 7233
25: 2759 383 823 3151 | 78741 395 1772 7008
---------------------------------- | ------------------------------
Avr: 372 829 3081 | 398 1820 7245
Tot: 385 1325 5163
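For reference, this is how the 83% / 91% figures quoted further up fall out of the ‘Tot’ ratings above (Rock 3A’s A55 @ 2.0 GHz vs. RPi 4’s A72 @ 1.8 GHz):

```python
# A55 (RK3568 @ 2.0 GHz) vs. A72 (BCM2711 @ 1.8 GHz), 7-zip 'Tot' MIPS from above.
a72_single, a55_single = 1752, 1450
a72_multi, a55_multi = 5668, 5163

print(f"single-threaded: {a55_single / a72_single:.0%}")   # 83%
print(f"multi-threaded:  {a55_multi / a72_multi:.0%}")     # 91%
```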
What do you think “realistic” is?
You easily found out that the scores are neither consistent nor easy to reproduce. You can get some picture if there are many results at a certain level (still not all of them). Of course they are not perfect, but they test a few things and give some vision of the whole set (software and hardware). It’s easy to get BS, but it also gives some insights, let’s say on calculating pi.
Yes, I totally agree that results for Android are usually higher than those for Ubuntu. Maybe there is something optimized so it gives a better score. Of course the information about cores is not correct.
OC makes all results much harder to compare, but it’s still some information, because not every board is able to do that. I try to avoid such scores because they usually require additional cooling (or just bigger cooling) and can cause some stability issues, but some can just do OC and be happy.