Rock 4 SE eMMC input/output errors, read-only FS

Hello,
Today I received the 32 GB eMMC module. I installed Armbian on it (the build for the 4B, as indicated on the Armbian website). Unfortunately, after a short time I get input/output errors and the filesystem turns read-only.

In my dmesg I can see this:

[ 161.444544] mmc1: cqhci: spurious TCN for tag 14
[ 161.444629] WARNING: CPU: 0 PID: 494 at drivers/mmc/host/cqhci-core.c:787 cqhci_irq+0x4b4/0x640
[ 161.444658] Modules linked in: lz4hc snd_soc_es8316 snd_soc_simple_card snd_soc_audio_graph_card btsdio snd_soc_rockchip_i2s snd_soc_simple_card_utils joydev snd_soc_hdmi_codec lz4 snd_soc_rockchip_pcm hci_uart snd_soc_core hantro_vpu(C) btqca btrtl rockchip_vdec(C) btbcm rockchip_iep btintel rockchip_rga v4l2_h264 videobuf2_dma_contig bluetooth videobuf2_dma_sg v4l2_mem2mem videobuf2_vmalloc snd_pcm_dmaengine snd_pcm snd_timer brcmfmac videobuf2_memops videobuf2_v4l2 videobuf2_common brcmutil cfg80211 videodev mc snd soundcore rfkill cpufreq_dt zram ip_tables x_tables autofs4 realtek panfrost gpu_sched dw_hdmi_cec dw_hdmi_i2s_audio dwmac_rk stmmac_platform stmmac pcs_xpcs
[ 161.444931] CPU: 0 PID: 494 Comm: kworker/0:2H Tainted: G C 5.15.80-rockchip64 #22.11.1
[ 161.444944] Hardware name: Radxa ROCK Pi 4B (DT)
[ 161.444952] Workqueue: kblockd blk_mq_run_work_fn
[ 161.444972] pstate: 800000c5 (Nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 161.444985] pc : cqhci_irq+0x4b4/0x640
[ 161.444997] lr : cqhci_irq+0x4b4/0x640
[ 161.445008] sp : ffff800008003d10
[ 161.445014] x29: ffff800008003d10 x28: ffff00000626ac40 x27: ffff000005348580
[ 161.445036] x26: ffff00000530a298 x25: ffff800009529d90 x24: ffff800009d06d88
[ 161.445056] x23: ffff80000954e228 x22: 0000000000000002 x21: ffff000005348000
[ 161.445076] x20: 000000000000000e x19: ffff00000530a280 x18: 0000000000000001
[ 161.445096] x17: ffff8000edfe6000 x16: ffff800008004000 x15: 00000000000002aa
[ 161.445116] x14: ffff800008003a20 x13: 00000000ffffffea x12: ffff800009b2fd10
[ 161.445136] x11: 0000000000000003 x10: ffff800009b17cd0 x9 : ffff800009b17d28
[ 161.445157] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000000001
[ 161.445176] x5 : ffff8000edfe6000 x4 : 0000000000000000 x3 : 0000000000010004
[ 161.445196] x2 : 0000000000010003 x1 : 806d525e9addfc00 x0 : 0000000000000000
[ 161.445215] Call trace:
[ 161.445222] cqhci_irq+0x4b4/0x640
[ 161.445234] sdhci_arasan_cqhci_irq+0x5c/0x88
[ 161.445246] sdhci_irq+0xcc/0x10c0
[ 161.445261] __handle_irq_event_percpu+0x60/0x248
[ 161.445277] handle_irq_event_percpu+0x38/0x88
[ 161.445289] handle_irq_event+0x48/0xe8
[ 161.445301] handle_fasteoi_irq+0xb8/0x148
[ 161.445313] handle_domain_irq+0x90/0xd8
[ 161.445325] gic_handle_irq+0xb8/0x134
[ 161.445338] call_on_irq_stack+0x28/0x54
[ 161.445350] do_interrupt_handler+0x58/0x68
[ 161.445362] el1_interrupt+0x30/0x78
[ 161.445374] el1h_64_irq_handler+0x18/0x28
[ 161.445384] el1h_64_irq+0x74/0x78
[ 161.445394] preempt_count_sub+0x20/0xc0
[ 161.445407] _raw_spin_unlock_irqrestore+0x20/0x40
[ 161.445424] sdhci_cqe_enable+0x130/0x228
[ 161.445437] sdhci_arasan_cqe_enable+0x94/0xb8
[ 161.445449] cqhci_request+0xd0/0x650
[ 161.445460] mmc_cqe_start_req+0xb4/0x198
[ 161.445474] mmc_blk_mq_issue_rq+0x49c/0x9b0
[ 161.445486] mmc_mq_queue_rq+0x114/0x2b0
[ 161.445497] blk_mq_dispatch_rq_list+0x124/0x838
[ 161.445512] __blk_mq_sched_dispatch_requests+0xc4/0x1c8
[ 161.445523] blk_mq_sched_dispatch_requests+0x3c/0x78
[ 161.445533] __blk_mq_run_hw_queue+0x64/0xa0
[ 161.445545] blk_mq_run_work_fn+0x20/0x30
[ 161.445557] process_one_work+0x20c/0x4c8
[ 161.445571] worker_thread+0x48/0x478
[ 161.445582] kthread+0x138/0x150
[ 161.445593] ret_from_fork+0x10/0x20
[ 161.445605] ---[ end trace 5f45817fb9424389 ]---
[ 196.340714] mmc1: running CQE recovery
[ 196.373413] mmc1: running CQE recovery
[ 299.787333] mmc1: running CQE recovery
[ 303.075469] mmc1: running CQE recovery

Also, I see these messages about the I/O errors:

[ 444.878511] Buffer I/O error on device mmcblk1p1, logical block 4164568
[ 444.878537] Buffer I/O error on device mmcblk1p1, logical block 4164569
[ 444.878546] Buffer I/O error on device mmcblk1p1, logical block 4164570
[ 444.878555] Buffer I/O error on device mmcblk1p1, logical block 4164571
[ 444.878563] Buffer I/O error on device mmcblk1p1, logical block 4164572
[ 444.878572] Buffer I/O error on device mmcblk1p1, logical block 4164573
[ 444.878580] Buffer I/O error on device mmcblk1p1, logical block 4164574
[ 444.878588] Buffer I/O error on device mmcblk1p1, logical block 4164575
[ 444.878595] Buffer I/O error on device mmcblk1p1, logical block 4164576
[ 444.878603] Buffer I/O error on device mmcblk1p1, logical block 4164577
[ 445.277760] mmc1: running CQE recovery
[ 445.281986] mmc1: running CQE recovery
[ 445.288143] mmc1: running CQE recovery
[ 445.300596] mmc1: running CQE recovery
[ 445.308118] mmc1: running CQE recovery
[ 445.309240] blk_update_request: I/O error, dev mmcblk1, sector 33453056 op 0x1:(WRITE) flags 0x4000 phys_seg 9 prio class 0
[ 445.316588] mmc1: running CQE recovery
[ 445.320245] mmc1: running CQE recovery
[ 445.321490] blk_update_request: I/O error, dev mmcblk1, sector 33455104 op 0x1:(WRITE) flags 0x0 phys_seg 9 prio class 0
[ 445.321517] EXT4-fs warning (device mmcblk1p1): ext4_end_bio:348: I/O error 10 writing to inode 124552 starting block 4182016)

Is this a problem with the Armbian image, or is the eMMC module somehow damaged?
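
For anyone comparing logs across boards, here is a minimal sketch that tallies the relevant kernel messages. The function name `summarize_dmesg` and the category names are made up for illustration; the patterns are matched only against the message formats quoted in the logs above and may need adjusting for other kernel versions.

```python
import re
from collections import Counter

# Patterns are based only on the messages quoted in this thread.
PATTERNS = {
    "spurious_tcn": re.compile(r"cqhci: spurious TCN for tag \d+"),
    "cqe_recovery": re.compile(r"running CQE recovery"),
    "buffer_io_error": re.compile(r"Buffer I/O error on device \S+,"),
    "blk_io_error": re.compile(r"I/O error, dev \S+, sector \d+"),
    "ext4_warning": re.compile(r"EXT4-fs warning \(device \S+\)"),
}


def summarize_dmesg(text):
    """Count how many log lines fall into each eMMC-related category."""
    counts = Counter()
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                break  # one category per line is enough
    return counts


# Small excerpt from the log above, used as a demonstration input.
EXCERPT = """\
[ 161.444544] mmc1: cqhci: spurious TCN for tag 14
[ 196.340714] mmc1: running CQE recovery
[ 444.878511] Buffer I/O error on device mmcblk1p1, logical block 4164568
"""

for category, count in sorted(summarize_dmesg(EXCERPT).items()):
    print(f"{category}: {count}")
```

To run it on a live system, the output of `dmesg` can be saved to a file and its contents passed to `summarize_dmesg` in place of `EXCERPT`.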

Hi kse,

Does the error recur after a restart or after re-flashing the image?

If you have an SD card, flash the same image to it and test.

By the way, what power supply do you use?

Yes, the problem comes back after reflashing the eMMC module. It only takes a few write operations, such as an apt upgrade.

It does not occur with a microSD card.
I tried migrating to NVMe with armbian-install as an alternative, but there systemd fails on boot and ends up in an emergency shell.

I’m using a 33 W USB-C PD adapter as the power supply.

Is there an official Debian minimal image from Radxa? I could only find the Xfce version, so I had to look to Armbian.

We don’t have a Debian minimal image at the moment.

You can use Ubuntu to test if the problem repeats.

Likely the same regression, from v5.10.44 commit aea6cb99703e17019e025aa71643b4d3e0a24413 and its follow-up 98e48cd9283dbac0e1445ee780889f10b3d1db6a, that I also reported in https://forum.armbian.com/topic/20002-nanopc-t4-new-kernel-2202-generates-issues-on-mmc2-and-makes-system-not-properly-working/ . I thought that 8a866d527ac0 (“regulator: core: Resolve supply name earlier to prevent double-init”) would fix it for good, but it did not.

I’m having this issue as well, also on Armbian with the latest stable version of everything. It means you can’t install the OS from the SSD to the eMMC module because of the data corruption that occurs.

Hello,
I was curious to know if anything has changed in this thread.
I’m looking for modules now (they are scarce in NA), or am I wasting my time?

Hi, I’m having the same issue with two Rock 4 SEs with 32 GB eMMCs. I have tried both official images from Radxa (Debian and Armbian trunk) and Armbian images direct from Armbian. Under heavy load the eMMCs start erroring and eventually the kernel remounts the rootfs read-only. I can reduce the errors by lowering the CPU clock speed, but that is not a solution. Both SBCs work perfectly fine with the same images running off microSD cards. Quite frustrating; I have spent a decent number of hours trying to resolve the issue.

Hi, @Damo

Can you provide a way to reproduce this issue? What do you mean by heavy load?

Just start compiling something or run a bigger apt upgrade (without a kernel update).

I was only able to get the Radxa Ubuntu 20 Server image running stably.
Sadly, there is no official headless Debian 11.

I have the same issues with a Rock 4 SE and a 64 GB eMMC.

It boots from freshly flashed Debian/Ubuntu/Armbian images, but afterwards, when I try to install the graphical environment with tasksel, it dies.

Is there any solution for this? I need eMMC boot; without it the board is useless to me.

[ 74.111070] hdmi-audio-codec hdmi-audio-codec.6.auto: ASoC: error at snd_soc_dai_startup on i2s-hifi: -22
[ 461.374685] blk_update_request: I/O error, dev mmcblk1, sector 3179520 op 0x1:(WRITE) flags 0x4000 phys_seg 8 prio class 0
[ 461.375863] EXT4-fs warning (device mmcblk1p2): ext4_end_bio:345: I/O error 10 writing to inode 46288 starting block 397824)
[ 461.376118] Buffer I/O error on device mmcblk1p2, logical block 389120
[ 461.376928] Buffer I/O error on device mmcblk1p2, logical block 389121
[ 461.377697] Buffer I/O error on device mmcblk1p2, logical block 389122
[ 461.378506] Buffer I/O error on device mmcblk1p2, logical block 389123
[ 461.379277] Buffer I/O error on device mmcblk1p2, logical block 389124
[ 461.380056] Buffer I/O error on device mmcblk1p2, logical block 389125
[ 461.380824] Buffer I/O error on device mmcblk1p2, logical block 389126
[ 461.381591] Buffer I/O error on device mmcblk1p2, logical block 389127
[ 461.382402] Buffer I/O error on device mmcblk1p2, logical block 389128
[ 461.383176] Buffer I/O error on device mmcblk1p2, logical block 389129
[ 470.090454] blk_update_request: I/O error, dev mmcblk1, sector 3342336 op 0x1:(WRITE) flags 0x4800 phys_seg 11 prio class 0
[ 470.091818] EXT4-fs warning (device mmcblk1p2): ext4_end_bio:345: I/O error 10 writing to inode 46867 starting block 417931)
[ 470.092000] buffer_io_error: 502 callbacks suppressed
[ 470.092008] Buffer I/O error on device mmcblk1p2, logical block 409600
[ 470.092814] Buffer I/O error on device mmcblk1p2, logical block 409601
[ 470.093595] Buffer I/O error on device mmcblk1p2, logical block 409602

@jack For me it is easily reproducible. Just install the base system with CLI.

Then run:

apt install tasksel dialog apt-utils git mc exfat-fuse exfat-utils nodejs npm

Then select the Xfce environment in tasksel and let the system try to install everything. (If it gets stuck at plymouth, just kill the install processes and continue with the dpkg command the system suggests when you run apt-get -f install.)

Hi @jack

As other commenters have said, it is easy to reproduce: just run a larger apt upgrade or a firmware upgrade through armbian-config. Another way I’ve found to reproduce the error is to simulate heavy write operations with dd, for example: dd if=/dev/zero of=/tmp/test1.img bs=1G count=5 oflag=dsync
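
A scripted variant of the same idea, as a rough sketch: it writes random data with O_DSYNC (similar in spirit to `oflag=dsync`) and then reads the file back to compare checksums. The function name `write_and_verify` and the example path are hypothetical. Two assumptions to keep in mind: the target path has to live on the eMMC-backed filesystem (on some images /tmp is a tmpfs in RAM, which would not touch the eMMC at all), and the read-back may be served from the page cache, so in practice a failing module is more likely to show up as an OSError (EIO) during the write than as a checksum mismatch.

```python
import hashlib
import os


def write_and_verify(path, total_bytes, chunk_bytes=1 << 20):
    """Write total_bytes of random data synchronously, then compare SHA-256 on read-back."""
    chunk = os.urandom(chunk_bytes)
    written = hashlib.sha256()
    # O_DSYNC makes each write wait for the data to reach the device, where supported.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_DSYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        remaining = total_bytes
        while remaining > 0:
            piece = chunk[: min(chunk_bytes, remaining)]
            view = memoryview(piece)
            while len(view) > 0:
                sent = os.write(fd, view)  # raises OSError on I/O errors
                view = view[sent:]
            written.update(piece)
            remaining -= len(piece)
        os.fsync(fd)
    finally:
        os.close(fd)

    read_back = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_bytes), b""):
            read_back.update(block)
    return written.hexdigest() == read_back.hexdigest()
```

For example, `write_and_verify("/root/emmc-test.img", 5 * 1024**3)` would write roughly the same 5 GiB as the dd command; the path here is only a placeholder.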

Hi,

I also encountered this eMMC problem with kernel 6.1.22, on both a RockPi 4B v1.4 and a RockPi 4SE. The same image worked just fine on a RockPi 4B+ v1.73. The 4B+ uses an on-board eMMC module, whereas the two failing boards use external eMMCs.

I tried to follow @abws’s advice, first reverting the seemingly offending commits and, in another attempt, cherry-picking the commits mentioned in [1]. In both cases the problem persists.

However, using the patch from [2], which limits the eMMC max clock frequency to 150 MHz, solved the issue for me. Instead of patching the kernel device tree, setting max-frequency in a separate DT overlay is also viable.
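
To confirm that a clock limit actually took effect, the negotiated bus settings can be read from debugfs. A small sketch, assuming debugfs is mounted, the script runs as root, and the eMMC host is mmc1 as in the dmesg output earlier in the thread; the helper names `parse_mmc_ios` and `clock_hz` are invented here, and the sample text is illustrative rather than captured from a real board.

```python
import re


def parse_mmc_ios(text):
    """Turn the 'key: value' lines of an mmc 'ios' debugfs file into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def clock_hz(info):
    """Return the bus clock in Hz, or None if the field is missing."""
    match = re.match(r"(\d+)\s*Hz", info.get("clock", ""))
    return int(match.group(1)) if match else None


# Illustrative sample only; on a board, read /sys/kernel/debug/mmc1/ios instead.
SAMPLE = "clock:\t\t150000000 Hz\nbus width:\t3 (8 bits)\ntiming spec:\t9 (mmc HS200)\n"

settings = parse_mmc_ios(SAMPLE)
print(clock_hz(settings), settings.get("timing spec"))
```

With a 150 MHz limit in place, the clock field should not exceed 150000000; the "timing spec" field shows which speed mode was negotiated.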

Kind regards

I have to amend my own statement. I just bought a new Foresee 64 GB eMMC module, flashed one of my images and booted up a RockPi 4B v1.4. Despite using an overlay to limit the eMMC max clock frequency, I see "spurious TCN for tag" errors and get lots of filesystem errors.

I tried some other patches, e.g. to disable eMMC command queue mode, but the problem persists.

Does Radxa see these errors with their provided images?
If not, does Radxa apply any special patches to get eMMC working reliably? I searched the Radxa GitHub repos for a while but didn’t see any obvious patches for the eMMC-related device tree or drivers.

Edit: I tested all kernels from 6.1.22 to 6.1.27 and 6.3. They all have this issue.

Kind regards

As of yesterday I also observed the described eMMC issues on a 4B+ v1.73.

We have recently reduced ROCK 4’s eMMC controller to HS200 as a workaround for some eMMC-related issues. The updated kernel should be available in the production channel soon.