Is it dual channel or not?
Radxa 5B with 32 GB: is it a scam or not?
Yes, I have checked all that, thanks.
@Goetterfunke said
“RAM is 2x Rayson RS4G32LV4”… not sure what that means.
And people have mentioned a 64 GB version; RAM chips are usually 32 GB max per rank, so…
there’s also this i found on the web:
“RK3588 has a high-performance 4-channel external memory interface (LPDDR4/LPDDR4X/LPDDR5), capable of supporting demanding memory bandwidth.”
So not everyone is a gamer. Bandwidth-dependent applications are plentiful, and that would have a HUGE impact on sales. I'd buy a few myself just for this.
Congratulations on your new SBC as my 5B blue 16 GB is on the way and expected next week.
All come with 2x DDR4; just the 2x 32gb 32bit was much harder to source in the same form factor.
Glad yours worked out fine. Somehow my delivery label got botched at the local post office: it has my first name twice and only the first digit of the house number (which resulted in an invalid address). So they want to return it automatically. We'll see if I ever receive mine at this point.
Hi There,
I don’t think it’s dual channel currently. Tried with
sudo dmidecode -t 17 | awk 'BEGIN { FS=":"; OFS="\t" } /Size|Channel/ { line = (line ? line OFS : "") $2 } /^$/ { print line; line="RAM" }' | grep -iv 'no'
From here https://superuser.com/questions/1678916/checking-if-ram-works-in-dual-channel-mode-on-linux
But no output.
Attached some pics. (Not everything is working yet, of course. Running RC7.7.)
Update: And of course I hope you all get yours too shortly.
dmidecode won’t work because there is no DMI.
Just run mem benchmark, like this one
wget https://raw.githubusercontent.com/ThomasKaiser/sbc-bench/master/sbc-bench.sh
sudo /bin/bash ./sbc-bench.sh -c
And by the way, this SBC doesn’t support dmidecode, since dmidecode reads the DMI/SMBIOS tables that PC firmware (BIOS) provides, and there are none here. So you can’t use it on this board.
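If you want to check for this before reaching for dmidecode: on Linux, when the firmware provides DMI/SMBIOS data, the kernel exposes the raw table under /sys/firmware/dmi/tables. Here is a minimal sketch assuming that standard sysfs location. The function name and the `root` parameter are made up for illustration.

```python
from pathlib import Path


def has_dmi_tables(root="/sys/firmware/dmi/tables"):
    """Return True if the kernel exposes a raw DMI table under `root`.

    Assumption: the table lives in a file called "DMI" inside `root`,
    which is the usual Linux sysfs layout. `root` is a parameter only
    so the check can be pointed at another directory.
    """
    return (Path(root) / "DMI").is_file()


if __name__ == "__main__":
    if has_dmi_tables():
        print("DMI tables found, dmidecode should have something to read")
    else:
        print("no DMI tables exposed, dmidecode has nothing to read")
```

If this reports no tables, empty dmidecode output says nothing about the memory configuration either way.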
Thank you, that's a looong output. I hope I got the right parts (please tell me if it's dual channel or not :D):
Memory performance (all 2 CPU clusters measured individually):
memcpy: 2940.1 MB/s (Cortex-A55)
memset: 9569.3 MB/s (Cortex-A55)
memcpy: 5564.7 MB/s (Cortex-A76)
memset: 9917.6 MB/s (Cortex-A76)
Cpuminer total scores (5 minutes execution): 8.36,8.35,8.34,8.33,8.32 kH/s
7-zip total scores (3 consecutive runs): 7635,7627,7656, single-threaded: 1291
OpenSSL results (all 2 CPU clusters measured individually):
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-128-cbc 69920.19k 210327.57k 414177.96k 549927.25k 609214.46k 612564.99k (Cortex-A55)
aes-128-cbc 216500.04k 445785.77k 584401.49k 629754.54k 648596.14k 650106.20k (Cortex-A76)
aes-192-cbc 66626.85k 187401.13k 337420.71k 424537.77k 457509.55k 460789.08k (Cortex-A55)
aes-192-cbc 203714.70k 401191.83k 499121.32k 526581.08k 541144.41k 542081.02k (Cortex-A76)
aes-256-cbc 65204.37k 172122.18k 291922.94k 354556.25k 378432.17k 380316.33k (Cortex-A55)
aes-256-cbc 200576.10k 356975.79k 432751.70k 455417.51k 463885.65k 464825.00k (Cortex-A76)
Unable to upload full test results. Please copy&paste the below stuff to pastebin.com and
provide the URL. Check the output for throttling and swapping please.
sbc-bench v0.9.60 Radxa ROCK 5 Model B (Fri, 05 Jan 2024 00:20:33 +0000)
Distributor ID: Debian
Description: Debian GNU/Linux 12 (bookworm)
Release: 12
Codename: bookworm
/usr/bin/gcc (Debian 12.2.0-14) 12.2.0
Uptime: 00:20:33 up 8 min, 3 users, load average: 1.27, 1.04, 0.64, 50.8°C, 22400603
....
C copy backwards : 4732.2 MB/s (3, 0.3%)
C copy backwards (32 byte blocks) : 4675.8 MB/s (2)
C copy backwards (64 byte blocks) : 4677.2 MB/s (3, 0.2%)
C copy : 5327.4 MB/s (3, 0.1%)
C copy prefetched (32 bytes step) : 5558.4 MB/s (2)
C copy prefetched (64 bytes step) : 5510.8 MB/s (2)
C 2-pass copy : 1821.3 MB/s (3, 0.6%)
C 2-pass copy prefetched (32 bytes step) : 2515.7 MB/s (2)
C 2-pass copy prefetched (64 bytes step) : 2923.7 MB/s (2)
C scan 8 : 407.4 MB/s (2)
C scan 16 : 814.8 MB/s (2)
C scan 32 : 1629.3 MB/s (2)
C scan 64 : 3257.6 MB/s (2)
C fill : 9902.4 MB/s (3, 0.8%)
C fill (shuffle within 16 byte blocks) : 9922.6 MB/s (2)
C fill (shuffle within 32 byte blocks) : 9899.4 MB/s (3, 0.2%)
C fill (shuffle within 64 byte blocks) : 9842.6 MB/s (3, 0.4%)
---
libc memcpy copy : 5564.7 MB/s (3, 0.2%)
libc memchr scan : 5718.8 MB/s (3, 0.1%)
libc memset fill : 9917.6 MB/s (3, 0.8%)
---
NEON LDP/STP copy : 5528.7 MB/s (3, 0.3%)
NEON LDP/STP copy pldl2strm (32 bytes step) : 5551.7 MB/s (3, 0.1%)
NEON LDP/STP copy pldl2strm (64 bytes step) : 5587.2 MB/s (3, 0.2%)
NEON LDP/STP copy pldl1keep (32 bytes step) : 5661.2 MB/s (3)
NEON LDP/STP copy pldl1keep (64 bytes step) : 5666.3 MB/s (2)
NEON LD1/ST1 copy : 5412.8 MB/s (2)
NEON LDP load : 7088.4 MB/s (2)
NEON LDNP load : 7113.2 MB/s (2)
NEON STP fill : 9911.0 MB/s (3, 1.0%)
NEON STNP fill : 9926.2 MB/s (3, 0.5%)
ARM LDP/STP copy : 5553.5 MB/s (3, 0.3%)
ARM LDP load : 6480.5 MB/s (2)
ARM LDNP load : 6824.1 MB/s (2)
ARM STP fill : 9891.9 MB/s (3, 0.9%)
ARM STNP fill : 9905.7 MB/s (3, 0.4%)
....
block size : single random read / dual random read, [MADV_NOHUGEPAGE]
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.1 ns / 0.0 ns
16384 : 0.4 ns / 0.4 ns
32768 : 1.9 ns / 2.4 ns
65536 : 3.3 ns / 5.9 ns
131072 : 9.0 ns / 14.5 ns
262144 : 21.3 ns / 30.8 ns
524288 : 31.6 ns / 39.6 ns
1048576 : 37.9 ns / 42.5 ns
2097152 : 44.4 ns / 43.9 ns
4194304 : 79.8 ns / 112.9 ns
8388608 : 139.0 ns / 185.9 ns
16777216 : 169.0 ns / 216.4 ns
33554432 : 187.2 ns / 237.9 ns
67108864 : 204.3 ns / 263.8 ns
block size : single random read / dual random read, [MADV_HUGEPAGE]
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.4 ns / 0.5 ns
32768 : 1.5 ns / 2.6 ns
65536 : 3.5 ns / 6.0 ns
131072 : 6.3 ns / 9.8 ns
262144 : 21.0 ns / 30.7 ns
524288 : 31.3 ns / 39.6 ns
1048576 : 37.8 ns / 42.5 ns
2097152 : 44.5 ns / 43.4 ns
4194304 : 76.5 ns / 105.7 ns
8388608 : 128.2 ns / 171.0 ns
16777216 : 153.2 ns / 193.0 ns
33554432 : 165.9 ns / 201.7 ns
67108864 : 172.6 ns / 203.9 ns
....
NEON LDP/STP copy (from framebuffer) : 5360.3 MB/s (3, 0.1%)
NEON LDP/STP 2-pass copy (from framebuffer) : 2649.2 MB/s (2)
NEON LD1/ST1 copy (from framebuffer) : 5292.9 MB/s (2)
NEON LD1/ST1 2-pass copy (from framebuffer) : 2380.8 MB/s (2)
ARM LDP/STP copy (from framebuffer) : 5351.8 MB/s (2)
ARM LDP/STP 2-pass copy (from framebuffer) : 2470.0 MB/s (3, 0.2%)
...
Executing ramlat on cpu0 (Cortex-A55), results in ns:
size: 1x32 2x32 1x64 2x64 1xPTR 2xPTR 4xPTR 8xPTR
4k: 3.750 3.767 3.696 3.698 2.464 3.710 5.132 10.07
8k: 3.692 3.692 3.694 3.691 2.460 3.702 4.998 10.08
16k: 3.713 3.692 3.739 3.692 2.479 3.693 5.004 10.08
32k: 3.760 3.703 3.763 3.705 2.498 3.705 5.020 10.58
64k: 23.10 25.55 23.05 25.52 23.40 25.56 35.20 64.42
128k: 23.51 25.19 23.59 25.30 25.92 25.17 39.95 80.12
256k: 39.27 46.10 39.36 46.06 35.35 45.80 67.24 119.1
512k: 44.05 44.51 44.09 44.54 42.25 45.00 70.16 135.3
1024k: 44.23 46.17 44.21 44.76 42.60 45.20 71.46 136.3
2048k: 44.95 44.77 44.56 45.22 42.81 45.47 71.53 136.4
4096k: 94.70 101.1 83.44 101.6 81.79 117.5 169.7 334.7
8192k: 168.7 164.7 150.4 169.2 149.2 171.0 291.4 530.5
16384k: 170.7 172.9 175.3 174.4 165.6 195.4 334.1 589.3
32768k: 177.1 178.1 178.5 178.8 172.2 190.6 343.9 599.2
65536k: 182.1 185.2 181.9 182.7 178.7 191.8 346.9 604.2
131072k: 182.2 183.4 182.4 183.6 179.3 192.3 349.7 603.1
Executing ramlat on cpu4 (Cortex-A76), results in ns:
size: 1x32 2x32 1x64 2x64 1xPTR 2xPTR 4xPTR 8xPTR
4k: 4.910 4.908 4.909 4.909 4.908 4.908 4.908 9.314
8k: 4.908 4.908 4.908 4.908 4.909 4.909 4.908 9.564
16k: 4.908 4.908 4.908 4.909 4.908 4.909 4.909 9.562
32k: 4.909 4.910 4.908 4.909 4.908 4.908 4.909 9.570
64k: 4.916 4.913 4.916 4.913 4.916 4.916 4.916 9.577
128k: 14.73 14.73 14.72 14.72 14.72 16.52 20.61 37.15
256k: 14.75 14.73 14.73 14.73 14.72 16.26 20.86 37.23
512k: 15.30 15.59 15.48 15.66 15.40 16.63 20.92 37.50
1024k: 50.42 51.04 48.97 50.69 49.13 50.63 58.80 92.03
2048k: 50.12 51.17 49.19 51.41 49.20 51.72 61.72 93.21
4096k: 102.1 86.13 89.40 85.89 85.61 84.07 91.94 115.4
8192k: 167.5 144.5 149.9 141.5 156.7 138.5 149.1 165.8
16384k: 179.1 170.8 177.1 178.2 180.6 167.9 181.8 188.8
32768k: 184.3 184.9 184.0 185.3 189.3 175.3 182.4 193.5
65536k: 189.3 187.9 191.1 189.4 189.0 178.9 185.4 194.2
131072k: 193.6 193.8 193.5 193.2 194.1 183.9 189.0 197.2
....
CPU Freq: - - - - - - - - -
RAM size: 31782 MB, # CPU hardware threads: 8
RAM usage: 1765 MB, # Benchmark threads: 8
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 7594 724 1020 7388 | 83666 707 1009 7136
23: 7871 764 1050 8020 | 82676 708 1011 7155
24: 7676 764 1081 8253 | 81675 709 1012 7169
25: 7651 769 1137 8736 | 80444 707 1013 7159
---------------------------------- | ------------------------------
Avr: 755 1072 8099 | 708 1011 7155
Tot: 731 1041 7627
7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C.UTF-8,Utf16=on,HugeFiles=on,64 bits,8 CPUs LE)
LE
CPU Freq: 64000000 - - - - - - - -
RAM size: 31782 MB, # CPU hardware threads: 8
RAM usage: 1765 MB, # Benchmark threads: 8
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 7974 762 1018 7757 | 83850 709 1009 7152
23: 7733 753 1046 7879 | 82901 710 1011 7174
24: 7595 756 1081 8166 | 81765 709 1012 7176
25: 7710 774 1138 8803 | 80263 705 1013 7143
---------------------------------- | ------------------------------
Avr: 761 1071 8152 | 708 1011 7161
Tot: 735 1041 7656
Compression: 8122,8099,8152
Decompression: 7147,7155,7161
Total: 7635,7627,7656
....
Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: ARM
Model name: Cortex-A55
Model: 0
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Stepping: r2p0
BogoMIPS: 48.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
Model name: Cortex-A76
Model: 0
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Stepping: r4p0
BogoMIPS: 48.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
L1d cache: 384 KiB (8 instances)
L1i cache: 384 KiB (8 instances)
L2 cache: 2.5 MiB (8 instances)
L3 cache: 3 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0-7
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Mitigation; CSV2, BHB
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
....
vdd_cpu_big0_s0: 800 mV (1050 mV max)
vdd_cpu_big1_s0: 800 mV (1050 mV max)
opp-table-cluster0:
408 MHz 750.0 mV
600 MHz 750.0 mV
816 MHz 750.0 mV
1008 MHz 750.0 mV
1200 MHz 775.0 mV
1416 MHz 825.0 mV
1608 MHz 875.0 mV
1800 MHz 950.0 mV
opp-table-cluster1:
408 MHz 600.0 mV
600 MHz 600.0 mV
816 MHz 600.0 mV
1008 MHz 625.0 mV
1200 MHz 650.0 mV
1416 MHz 675.0 mV
1608 MHz 700.0 mV
1800 MHz 775.0 mV
2016 MHz 850.0 mV
2208 MHz 925.0 mV
opp-table-cluster2:
408 MHz 600.0 mV
600 MHz 600.0 mV
816 MHz 600.0 mV
1008 MHz 625.0 mV
1200 MHz 650.0 mV
1416 MHz 675.0 mV
1608 MHz 700.0 mV
1800 MHz 775.0 mV
2016 MHz 850.0 mV
2208 MHz 925.0 mV
Man, using pastebin would be simpler, but okay.
Personally I have no idea. Now we need to find out the following things:
- Results of tinymembench on an x86 PC with single channel 16GB
- Results of tinymembench on the same x86 PC with dual channel 32GB
- Results of tinymembench from a Rock 5B with 16GB
and compare the % difference between them. While we already have the third (there are a lot of sbc-bench results for the Rock 5B), you still need the first two to understand how the channel count affects the latency.
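The comparison above could be sketched roughly like this: pull the MB/s figures out of two tinymembench outputs and compute the percent difference per test. The regex assumes the "name : 1234.5 MB/s" line format seen in the sbc-bench output earlier in this thread. The function names are made up, and the numbers in the demo are placeholders, not measurements.

```python
import re

# Matches lines such as " libc memcpy copy : 5564.7 MB/s (3, 0.2%)"
LINE_RE = re.compile(r"^\s*(?P<name>.+?)\s*:\s*(?P<rate>\d+(?:\.\d+)?)\s*MB/s")


def parse_results(text):
    """Map test name -> throughput in MB/s for every matching line."""
    results = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            results[m.group("name")] = float(m.group("rate"))
    return results


def percent_difference(baseline, other):
    """Percent change of `other` relative to `baseline`, per shared test."""
    diffs = {}
    for name in sorted(baseline.keys() & other.keys()):
        if baseline[name] == 0:
            continue  # avoid dividing by zero on a bogus result
        diffs[name] = (other[name] - baseline[name]) / baseline[name] * 100.0
    return diffs


if __name__ == "__main__":
    # Placeholder numbers for illustration only, NOT real measurements.
    single = parse_results(" libc memcpy copy : 1000.0 MB/s (2)")
    dual = parse_results(" libc memcpy copy : 1500.0 MB/s (2)")
    for name, pct in percent_difference(single, dual).items():
        print(f"{name}: {pct:+.1f}%")
```

Only tests present in both outputs are compared, so differing tinymembench versions would not break it. The latency tables (ns values) are ignored by this sketch.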
I’m not sure what the concerns about dual channel memory are. If you review the schematics/datasheet you will see the memory controller is configured for quad channel, 4 x 16 bit I/O. Regardless of the memory capacity (4/8/16/32GB) there are 2 LPDDR4 RAM chips, and each is dual channel to support 2 x 16 bit. The only difference between memory capacities may be clock speed; however, I doubt this, and you would need to look at the bootloader/kernel. Here is the output from the serial console on boot for the 8GB version:
DDR Version V1.08 20220617
LPDDR4X, 2112MHz
channel[0] BW=16 Col=10 Bk=8 CS0 Row=16 CS1 Row=16 CS=2 Die BW=16 Size=2048MB
channel[1] BW=16 Col=10 Bk=8 CS0 Row=16 CS1 Row=16 CS=2 Die BW=16 Size=2048MB
channel[2] BW=16 Col=10 Bk=8 CS0 Row=16 CS1 Row=16 CS=2 Die BW=16 Size=2048MB
channel[3] BW=16 Col=10 Bk=8 CS0 Row=16 CS1 Row=16 CS=2 Die BW=16 Size=2048MB
Clock speed is 2112 MHz and each channel is addressing 2GB, therefore the concept of dual channel as used for x86 may not apply.
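The boot log above has enough information to estimate total bus width, capacity and a theoretical peak bandwidth. This is a rough sketch, assuming the "2112MHz" figure is the memory clock and that data is transferred on both clock edges (the usual LPDDR4X convention, though the log itself does not say so). The function and key names are made up for illustration.

```python
import re

BOOT_LOG = """\
LPDDR4X, 2112MHz
channel[0] BW=16 Col=10 Bk=8 CS0 Row=16 CS1 Row=16 CS=2 Die BW=16 Size=2048MB
channel[1] BW=16 Col=10 Bk=8 CS0 Row=16 CS1 Row=16 CS=2 Die BW=16 Size=2048MB
channel[2] BW=16 Col=10 Bk=8 CS0 Row=16 CS1 Row=16 CS=2 Die BW=16 Size=2048MB
channel[3] BW=16 Col=10 Bk=8 CS0 Row=16 CS1 Row=16 CS=2 Die BW=16 Size=2048MB
"""


def summarize(log):
    """Sum per-channel width and size; estimate peak bandwidth in GB/s."""
    clock_mhz = int(re.search(r"(\d+)MHz", log).group(1))
    # Capture each channel's bus width (first BW= on the line) and size.
    channels = re.findall(
        r"^channel\[\d+\] BW=(\d+).*?Size=(\d+)MB", log, flags=re.M
    )
    bus_bits = sum(int(bw) for bw, _ in channels)
    size_mb = sum(int(size) for _, size in channels)
    # Assumption: double data rate, i.e. two transfers per clock cycle.
    peak_gb_s = clock_mhz * 1e6 * 2 * bus_bits / 8 / 1e9
    return {
        "channels": len(channels),
        "bus_bits": bus_bits,
        "size_mb": size_mb,
        "peak_gb_s": peak_gb_s,
    }


if __name__ == "__main__":
    for key, value in summarize(BOOT_LOG).items():
        print(f"{key}: {value}")
```

For this log, that works out to four channels, a 64-bit aggregate bus, 8192 MB and a peak of roughly 33.8 GB/s under the stated assumption. The single-threaded memcpy numbers posted earlier are not directly comparable to that aggregate figure.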
If I were purchasing a 16/32GB model I would be more concerned about how useful the extra memory is for your use case, because some of the peripheral blocks can only access physical memory in the 4GB range, i.e. PCIe (GICv3 only supporting 32-bit addresses) and the NPU; these are the 2 I have looked into.
I have looked further into real-world use cases like AI inferencing, and it seems the RK3588 is not powerful enough to make use of the bandwidth even in the default single channel configuration (about 35 GB/s theoretically).
But the bf16 > int8 transition is expected in a few months; if the Radxa guys can make the NPU fully functional, then you could offload things.