Is there any hardware overheat protection if rockchip,hw-tshut-temp is not defined in tsadc of device tree?

Without something like rockchip,hw-tshut-temp = <100000>; in &tsadc, does that mean there is no hardware overheat protection?

Let’s say there is a kernel oops and the GPU driver runs away with itself. Will the SoC destroy itself if rockchip,hw-tshut-temp is not defined? Otherwise we are relying entirely on the kernel-controlled cooling maps.

Earlier Rock 4C+ device trees kept tsadc disabled altogether, so I could not even monitor temperatures.
Later ones, which I have used to create an overlay on Armbian, enable it but do not define rockchip,hw-tshut-temp.

I have tested rockchip,hw-tshut-temp and it does work if defined (I set it to 60 degrees, and it reset the chip when I ran a CPU stress test).

I am just wondering because one of my Rock 4C+ boards died yesterday. I will not be able to get to it until next week, but I got an alert that it went above 80 degrees very quickly, then it died, and now it won’t boot. The device tree cooling maps are set to do a clean shutdown at 95 degrees and a soft reset at 100 degrees, but both are kernel controlled. tsadc is enabled so I can read temperatures, but rockchip,hw-tshut-temp is not defined at all.
The device in question has a reasonable 40x40x20mm heatsink on it too.
In normal use, it is very hard to even get the SoC to go over 85 degrees with s-tui cpu stresstest. Even with no heatsink, I have to point a hairdryer at it to get it above 85 and test that the kernel cooling-maps work (which they do, under normal non-oops circumstances).
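
For anyone wanting to check what the kernel actually registered while doing this kind of test, here is a minimal sketch that lists each thermal zone's current temperature and trip points from sysfs. It assumes the standard Linux thermal sysfs layout under /sys/class/thermal (thermal_zone*/temp, trip_point_N_temp, trip_point_N_type); the helper name read_trips is just mine, and zone names will differ per board.

```python
from pathlib import Path

THERMAL_ROOT = Path("/sys/class/thermal")  # standard location, may be absent


def read_trips(zone: Path) -> list[tuple[str, int]]:
    """Return (trip type, temperature in millidegrees C) for one zone."""
    trips = []
    for temp_file in sorted(zone.glob("trip_point_*_temp")):
        type_file = temp_file.with_name(temp_file.name.replace("_temp", "_type"))
        kind = type_file.read_text().strip() if type_file.exists() else "unknown"
        trips.append((kind, int(temp_file.read_text().strip())))
    return trips


if __name__ == "__main__":
    for zone in sorted(THERMAL_ROOT.glob("thermal_zone*")):
        try:
            name = (zone / "type").read_text().strip()
            now = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # sensor disabled or unreadable
        print(f"{name}: {now / 1000:.1f} C")
        for kind, milli_c in read_trips(zone):
            print(f"  trip {kind}: {milli_c / 1000:.1f} C")
```

These are only the software trip points handled by the kernel. The hardware shutdown threshold in the TSADC is not listed here.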

Jan 19 17:30:01 lp-omet-11 CRON[5266]: pam_unix(cron:session): session closed for user root
Jan 19 17:35:01 lp-omet-11 CRON[5294]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Jan 19 17:35:01 lp-omet-11 CRON[5294]: pam_unix(cron:session): session closed for user root
Jan 19 17:45:01 lp-omet-11 CRON[5333]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Jan 19 17:45:01 lp-omet-11 CRON[5334]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Jan 19 17:45:01 lp-omet-11 CRON[5333]: pam_unix(cron:session): session closed for user root
Jan 19 17:45:01 lp-omet-11 CRON[5334]: pam_unix(cron:session): session closed for user root
Jan 19 17:55:01 lp-omet-11 CRON[5381]: pam_unix(cron:session): session closed for user root
Jan 19 17:55:01 lp-omet-11 CRON[5381]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Jan 19 18:00:01 lp-omet-11 CRON[5401]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Jan 19 18:00:01 lp-omet-11 CRON[5401]: pam_unix(cron:session): session closed for user root
Jan 19 18:05:01 lp-omet-11 CRON[5430]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Jan 19 18:05:01 lp-omet-11 CRON[5430]: pam_unix(cron:session): session closed for user root
Jan 19 18:13:16 lp-omet-11 kernel: [45314.166322] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000003840000
Jan 19 18:13:16 lp-omet-11 kernel: [45314.166045]   CM = 0, WnR = 1
Jan 19 18:13:16 lp-omet-11 kernel: [45314.164901]   FSC = 0x0f: level 3 permission fault
Jan 19 18:13:16 lp-omet-11 kernel: [45314.164327]   SET = 0, FnV = 0
Jan 19 18:13:16 lp-omet-11 kernel: [45314.163560]   ESR = 0x9600004f
Jan 19 18:13:16 lp-omet-11 kernel: [45314.162457] Unable to handle kernel write to read-only memory at virtual address ffff80000a8dba81
Jan 19 18:13:16 lp-omet-11 kernel: [45314.165689]   ISV = 0, ISS = 0x0000004f
Jan 19 18:13:16 lp-omet-11 kernel: [45314.164610]   EA = 0, S1PTW = 0
Jan 19 18:13:16 lp-omet-11 kernel: [45314.165335] Data abort info:
Jan 19 18:13:16 lp-omet-11 kernel: [45314.163300] Mem abort info:
Jan 19 18:13:16 lp-omet-11 kernel: [45314.163845]   EC = 0x25: DABT (current EL), IL = 32 bits
Jan 19 18:13:16 lp-omet-11 kernel: [45314.175911] pc : drm_gem_plane_helper_prepare_fb+0x24/0x148
Jan 19 18:13:16 lp-omet-11 kernel: [45314.174067] CPU: 1 PID: 1355 Comm: sway Tainted: G         C        5.15.80-rockchip64 #22.11.1
Jan 19 18:13:16 lp-omet-11 kernel: [45314.178607] x23: ffff000007f55b80 x22: ffff000001091900 x21: ffff00000426b800
Jan 19 18:13:16 lp-omet-11 kernel: [45314.168596] Modules linked in: sunrpc lz4hc lz4 btsdio bluetooth rockchip_vdec(C) hantro_vpu(C) rockchip_rga rockchip_iep v4l2_h264 videobuf2_dma_contig brcmfmac snd_soc_simple_card v4l2_mem2mem videobuf2_vmalloc videobuf2_dma_sg snd_soc_hdmi_codec snd_soc_rk817 snd_soc_simple_card_utils brcmutil videobuf2_memops cfg80211 videobuf2_v4l2 snd_soc_core snd_pcm_dmaengine videobuf2_common snd_pcm snd_timer snd hid_multitouch rfkill videodev joydev mc soundcore cpufreq_dt zram sch_fq_codel ramoops reed_solomon pstore_blk pstore_zone ip_tables x_tables autofs4 realtek panfrost dw_hdmi_i2s_audio dw_hdmi_cec gpu_sched dwmac_rk stmmac_platform stmmac pcs_xpcs
Jan 19 18:13:16 lp-omet-11 kernel: [45314.176971] sp : ffff80000a8dba80
Jan 19 18:13:16 lp-omet-11 kernel: [45314.176441] lr : drm_atomic_helper_prepare_planes+0x110/0x1b8
Jan 19 18:13:16 lp-omet-11 kernel: [45314.174852] Hardware name: Radxa ROCK Pi 4C+ (DT)
Jan 19 18:13:16 lp-omet-11 kernel: [45314.168087] Internal error: Oops: 9600004f [#1] PREEMPT SMP
Jan 19 18:13:16 lp-omet-11 kernel: [45314.175279] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
Jan 19 18:13:16 lp-omet-11 kernel: [45314.177276] x29: ffff80000a8dba80 x28: 0000000000000028 x27: 0000aaab0b852ad0
Jan 19 18:13:16 lp-omet-11 kernel: [45314.177942] x26: ffff000001ad9c00 x25: 0000000000000001 x24: ffff000007f55e80
Jan 19 18:13:16 lp-omet-11 kernel: [45314.166928] [ffff80000a8dba81] pgd=10000000f7fff003, p4d=10000000f7fff003, pud=10000000f7ffe003, pmd=10000000010c5003, pte=00680000f0ee5703
Jan 19 18:13:16 lp-omet-11 kernel: [45314.183911] Call trace:
Jan 19 18:13:16 lp-omet-11 kernel: [45314.180598] x14: 07d8089807800780 x13: 0000000000024414 x12: 0000000000000804
Jan 19 18:13:16 lp-omet-11 kernel: [45314.184629]  drm_atomic_helper_prepare_planes+0x110/0x1b8
Jan 19 18:13:16 lp-omet-11 kernel: [45314.185561]  drm_atomic_nonblocking_commit+0x4c/0x60
Jan 19 18:13:16 lp-omet-11 kernel: [45314.179271] x20: ffff000007f55b80 x19: 0000000000000000 x18: 0000000000000000
Jan 19 18:13:16 lp-omet-11 kernel: [45314.181261] x11: 0000000000000000 x10: 0000000000000780 x9 : 0000000000000000
Jan 19 18:13:16 lp-omet-11 kernel: [45314.179935] x17: 0048000000000465 x16: 0441043c04650438 x15: 0438000008980804
Jan 19 18:13:16 lp-omet-11 kernel: [45314.185129]  drm_atomic_helper_commit+0x8c/0x370
Jan 19 18:13:16 lp-omet-11 kernel: [45314.184142]  drm_gem_plane_helper_prepare_fb+0x24/0x148
Jan 19 18:13:16 lp-omet-11 kernel: [45314.182586] x5 : 0000000000000040 x4 : ffff8000090f6978 x3 : 00000000ffffffff
Jan 19 18:13:16 lp-omet-11 kernel: [45314.181924] x8 : ffff000001091300 x7 : 0000000000000000 x6 : 000000000000003f
Jan 19 18:13:16 lp-omet-11 kernel: [45314.183248] x2 : 0000000000000013 x1 : 0000000000000000 x0 : ffff000001fe1800
Jan 19 18:13:16 lp-omet-11 kernel: [45314.283875] detected fb_set_par error, error code: -16
Jan 19 18:13:16 lp-omet-11 kernel: [45314.188309]  do_el0_svc+0x24/0x88
Jan 19 18:13:16 lp-omet-11 kernel: [45314.188627]  el0_svc+0x20/0x50
Jan 19 18:13:16 lp-omet-11 kernel: [45314.190222] ---[ end trace 27518b4bffa9a414 ]---
Jan 19 18:13:16 lp-omet-11 kernel: [45314.189313]  el0t_64_sync+0x180/0x184
Jan 19 18:13:16 lp-omet-11 kernel: [45314.189669] Code: a90153f3 a9025bf5 aa0103f6 52800001 (f8001bf7)
Jan 19 18:13:16 lp-omet-11 kernel: [45314.187512]  invoke_syscall+0x44/0x108
Jan 19 18:13:16 lp-omet-11 kernel: [45314.187870]  el0_svc_common.constprop.3+0x94/0xf8
Jan 19 18:13:16 lp-omet-11 kernel: [45314.188922]  el0t_64_sync_handler+0x90/0xb8
Jan 19 18:13:16 lp-omet-11 kernel: [45314.186814]  drm_ioctl+0x244/0x470
Jan 19 18:13:16 lp-omet-11 kernel: [45314.186438]  drm_ioctl_kernel+0xc0/0x110
Jan 19 18:13:16 lp-omet-11 kernel: [45314.186020]  drm_mode_atomic_ioctl+0x9d8/0xb68
Jan 19 18:13:16 lp-omet-11 kernel: [45314.187140]  __arm64_sys_ioctl+0xa8/0xe8
Jan 19 18:13:16 lp-omet-11 systemd-logind[903]: Session c1 logged out. Waiting for processes to exit.
Jan 19 18:15:01 lp-omet-11 CRON[5474]: pam_unix(cron:session): session closed for user root
Jan 19 18:15:01 lp-omet-11 CRON[5473]: pam_unix(cron:session): session closed for user root
Jan 19 18:15:01 lp-omet-11 CRON[5473]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Jan 19 18:15:01 lp-omet-11 CRON[5474]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Jan 19 18:17:01 lp-omet-11 CRON[5491]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Jan 19 18:17:01 lp-omet-11 CRON[5491]: pam_unix(cron:session): session closed for user root
Jan 19 18:18:09 lp-omet-11 sshd[5504]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
Jan 19 18:18:09 lp-omet-11 sshd[5504]: Accepted password for root from 192.168.1.93 port 10622 ssh2

This is how it looks now after a power cycle. I will not be able to get my hands on it until next week.

This is the last 3 days, so you can see it is usually fine all day.

You know what, I’m talking rubbish.

I can see from the running device tree (/sys/firmware/devicetree/base) that rockchip,hw-tshut-temp is already defined elsewhere and it’s set to 95 °C (0x17318, i.e. 95000 millidegrees). It would be better at 100 °C, mind you, since the cooling map says to shut down gracefully at 95 °C. It can’t shut down gracefully and hardware reset at the same time :)
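
In case it helps someone else check their own board, here is a rough sketch of how the value can be read back from the running device tree. Properties under /sys/firmware/devicetree/base are raw bytes, and a single-cell property like this one is a 32-bit big-endian integer, so 00 01 73 18 decodes to 95000. The helper names (decode_cell, hw_leaves_headroom) are my own, and I search for the property rather than hard-coding the tsadc node path, since that path is an assumption that varies by SoC.

```python
from pathlib import Path

DT_BASE = Path("/sys/firmware/devicetree/base")
PROP = "rockchip,hw-tshut-temp"


def decode_cell(raw: bytes) -> int:
    """Decode one 32-bit big-endian device tree cell."""
    if len(raw) != 4:
        raise ValueError(f"expected 4 bytes, got {len(raw)}")
    return int.from_bytes(raw, "big")


def hw_leaves_headroom(hw_milli_c: int, shutdown_trip_milli_c: int) -> bool:
    """True if the hardware threshold sits above the graceful-shutdown trip."""
    return hw_milli_c > shutdown_trip_milli_c


if __name__ == "__main__":
    # Search the whole tree so the tsadc node address need not be known.
    for prop in DT_BASE.rglob(PROP):
        milli_c = decode_cell(prop.read_bytes())
        print(f"{prop.parent.name}: {milli_c} ({milli_c / 1000:.0f} C)")
        # 95000 is the graceful-shutdown trip from my cooling map.
        print("  headroom over 95 C trip:", hw_leaves_headroom(milli_c, 95000))
```

With my values (hardware threshold 95000, graceful shutdown trip 95000) the headroom check comes out false, which is the overlap I mean above.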

Weird though. Wonder what happened to the board.