Disable a single RAM Channel/Bank or Chip

Hi,
I am experiencing some DDR training problems and, based on some tests I did, I suspect that only one RAM chip is affected. I would like to modify the DDR training blob using the ddrbin_tool and boot_merger tools, but I have no idea which settings to change so that DDR training runs only for a particular channel or chip. The channel that should be disabled is CH2. If there is a hardware solution to disable the channel, I am also open to that. I am not interested in replacing the IC, because one channel is enough for me to continue my tests.

The logs for the current DDR training:

`DDR 9fa84341ce typ 24/09/06-09:51:11,fwver: v1.18
ch0 ttot10
ch0 ttot10
ch1 ttot10
ch2 ttot10
ch3 ttot10
ch0 ttot16
LPDDR4, 2112MHz
channel[0] BW=16 Col=10 Bk=8 CS0 Row=17 CS1 Row=17 CS=2 Die BW=16 Size=4096MB
ch1 ttot16
channel[1] BW=16 Col=10 Bk=8 CS0 Row=17 CS1 Row=17 CS=2 Die BW=16 Size=4096MB
channel[2] BW=16 Col=10 Bk=8 CS0 Row=17 CS1 Row=17 CS=2 Die BW=16 Size=4096MB
ch3 ttot16
channel[3] BW=16 Col=10 Bk=8 CS0 Row=17 CS1 Row=17 CS=2 Die BW=16 Size=4096MB
Manufacturer ID:0xff
ch:2 dq0 fail,write:0x1,read:0x2
ch:2 dq1 fail,write:0x2,read:0x4
ch:2 dq2 fail,write:0x4,read:0x8
ch:2 dq3 fail,write:0x8,read:0x10
ch:2 dq4 fail,write:0x10,read:0x20
ch:2 dq5 fail,write:0x20,read:0x0
ch:2 dq6 fail,write:0x40,read:0x80
ch:2 dq7 fail,write:0x80,read:0x0
ch:2 dq8 fail,write:0x100,read:0x200
ch:2 dq9 fail,write:0x200,read:0x400
ch:2 dq10 fail,write:0x400,read:0x800
ch:2 dq11 fail,write:0x800,read:0x1000
ch:2 dq12 fail,write:0x1000,read:0x2000
ch:2 dq13 fail,write:0x2000,read:0x4000
ch:2 dq14 fail,write:0x4000,read:0x8000
ch:2 dq15 fail,write:0x8000,read:0x0
ch:2 dq0 fail,write:0xfffffffe,read:0xffbfffbd
ch:2 dq1 fail,write:0xfffffffd,read:0xffbfffbb
ch:2 dq2 fail,write:0xfffffffb,read:0xffbfffb7
ch:2 dq3 fail,write:0xfffffff7,read:0xffbfffaf
ch:2 dq4 fail,write:0xffffffef,read:0xffbfff9f
ch:2 dq5 fail,write:0xffffffdf,read:0xffbfffbf
ch:2 dq6 fail,write:0xffffffbf,read:0xffbfff3f
ch:2 dq7 fail,write:0xffffff7f,read:0x0
ch:2 dq8 fail,write:0xfffffeff,read:0xffbffdbf
ch:2 dq9 fail,write:0xfffffdff,read:0xffbffbbf
ch:2 dq10 fail,write:0xfffffbff,read:0xffbff7bf
ch:2 dq11 fail,write:0xfffff7ff,read:0xffbfefbf
ch:2 dq12 fail,write:0xffffefff,read:0xffbfdfbf
ch:2 dq13 fail,write:0xffffdfff,read:0xffbfbfbf
ch:2 dq14 fail,write:0xffffbfff,read:0xffbf7fbf
ch:2 dq15 fail,write:0xffff7fff,read:0x0
error
ERR`
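
To check that every reported failure really belongs to a single channel, the failing lines can be grouped by channel. The following is a minimal Python sketch; the regular expression is an assumption based only on the line format visible in the log above, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Matches lines such as: ch:2 dq0 fail,write:0x1,read:0x2
# (pattern inferred from the log above, not from any Rockchip documentation)
FAIL_RE = re.compile(
    r"ch:(\d+) dq(\d+) fail,write:(0x[0-9a-fA-F]+),read:(0x[0-9a-fA-F]+)"
)

def failing_dq_by_channel(log_text):
    """Return {channel: sorted list of DQ lines reported as failing}."""
    failures = defaultdict(set)
    for match in FAIL_RE.finditer(log_text):
        channel = int(match.group(1))
        dq = int(match.group(2))
        failures[channel].add(dq)
    return {ch: sorted(dqs) for ch, dqs in failures.items()}

sample = """\
ch:2 dq0 fail,write:0x1,read:0x2
ch:2 dq1 fail,write:0x2,read:0x4
ch:2 dq15 fail,write:0x8000,read:0x0
"""
print(failing_dq_by_channel(sample))  # {2: [0, 1, 15]}
```

If the result contains only one key, all failures come from that channel, which fits the suspicion that only CH2 is affected.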

Any insight is appreciated.

Thanks.
Update: It is possible to mask out individual DRAM channels using the channel mask parameter of the DDR training blob.

  1. Get the rkbin tools and binaries from the Rockchip repo.

  2. Extract the parameter file from the DDR loader using ddrbin_tool (the manual is in the repo).

  3. Edit the mask so that only the channel(s) you want to use are enabled (e.g., channel mask = 1 for ch0).

  4. Generate the modified DDR loader using ddrbin_tool.

  5. Build the SPL loader with the modified DDR loader using the boot_merger tool.

  6. Modify U-Boot to change the maximum allowed SDRAM size, SDRAM_MAX_SIZE, found in "include/configs/rk3588_common.h". I set it to SDRAM_MAX_SIZE = 0x3c000000.

  7. Build U-Boot and flash it (in maskrom mode):

     `rkdeveloptool db modified_spl_loader_masked_channel.bin`

     `rkdeveloptool wl 64 idbloader.img && rkdeveloptool wl 16384 u-boot.itb`

    PreSerial: 2, raw, 0xfeb50000
    This is DRAM:  958 MiB
    Sysmem: init
    Relocation Offset: 39a1e000
    Relocation fdt: 379f9140 - 379fecb0
    CR: M/C/I
    Bad memblk0: 0x3c000000 - 0xff140000
    Using default environment
    
    Hotkey: ctrl+
    mmc@fe2c0000: 1, mmc@fe2e0000: 0
    Bootdev(atags): mmc 0
    MMC0: HS200, 200Mhz
    PartType: EFI
    DM: v2
    No misc partition
    boot mode: normal
    FIT: No boot partition
    No resource partition
    No resource partition
    Failed to load DTB, ret=-19
    No find valid DTB, ret=-22
    Failed to get kernel dtb, ret=-22
    In:    serial
    Out:   serial
    Err:   serial
    

Only 1 GB is detected instead of 4 GB, so I may have messed something up along the way, but I am glad I brought my board back from the dead. I will update once more if I figure it out.
This setup can also work around boards with DRAM chips of different sizes.
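
A quick arithmetic check on the numbers above: the SDRAM_MAX_SIZE value chosen in step 6 works out to 960 MiB. That is close to the 958 MiB U-Boot reports, and it is also where the "Bad memblk0" range begins. So the 1 GB figure may simply reflect that cap rather than a detection problem. This is an assumption based on the arithmetic only; how U-Boot arrives at exactly 958 MiB has not been checked here.

```python
MIB = 1024 * 1024

def to_mib(size_bytes):
    """Convert a size in bytes to MiB."""
    return size_bytes / MIB

# Value set for SDRAM_MAX_SIZE in step 6 of the post.
SDRAM_MAX_SIZE = 0x3C000000
print(to_mib(SDRAM_MAX_SIZE))  # 960.0

# The DDR log reports Size=4096MB for one channel; in bytes that is:
one_channel_bytes = 4096 * MIB
print(hex(one_channel_bytes))  # 0x100000000
```

The second value only shows how large the cap would need to be to cover the full 4096 MB of one channel. Whether U-Boot boots correctly with a larger SDRAM_MAX_SIZE on this board is untested.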

Can you tell me how you did these tests? I seem to have RAM problems as well. Also, please enclose the output lines between ``` signs; it makes them easier to read.

I had an issue with one I2C bus where the 1V8 rail was missing due to a trace problem, either a bad via in the inner layers or a broken trace. After I soldered a wire to fix it, the board stopped working and started showing those errors, probably because of excessive heat during soldering. The log above is the UART output when running the rkdeveloptool db command. That’s why I suspect that one of the RAM chips near the soldering point was affected, either the chip itself or its traces.

I managed to find the parameter that needs to be changed to disable individual channels.
You will need ddrbin_tool and the DDR loader file; you can get them from here

You can extract the parameters from the loader file:

    ddrbin_tool rk3588 -g gen_param.txt rk3588_ddr_lp4_2112MHz_lp5_2400MHz_v1.18.bin

Then edit the parameter “channel mask”. Usually the value is 15 (0x0F), which enables all four channels. Change it to:
channel mask = 1 for ch0,
or channel mask = 3 for ch0 and ch1,
or channel mask = 7 for ch0, ch1 and ch2.
I haven’t tried the other values; I only set channel mask = 1, as that seems to work most reliably in my case.
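
The values listed (1, 3, 7 and 15) are consistent with a plain bitmask in which bit n enables channel n. Under that assumption, which is inferred from those values and not confirmed against Rockchip documentation, a mask for any combination of channels can be computed as in this Python sketch. The function names are hypothetical.

```python
def channel_mask(channels):
    """Build a mask with bit n set for each channel n in `channels` (0..3)."""
    mask = 0
    for ch in channels:
        if not 0 <= ch <= 3:
            raise ValueError(f"channel {ch} out of range 0..3")
        mask |= 1 << ch
    return mask

def channels_in_mask(mask):
    """List the channel numbers whose bits are set in `mask`."""
    return [ch for ch in range(4) if mask & (1 << ch)]

print(channel_mask([0]))         # 1
print(channel_mask([0, 1]))      # 3
print(channel_mask([0, 1, 3]))   # 11
print(channels_in_mask(0x0F))    # [0, 1, 2, 3]
```

By the same logic, skipping only ch2 would give a mask of 11 (0x0B), but that value has not been tried in this thread and the DDR blob may not accept non-contiguous channel combinations.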

Run the tool again, this time without -g, to write the edited parameters back into the loader:

    ddrbin_tool rk3588 gen_param.txt rk3588_ddr_lp4_2112MHz_lp5_2400MHz_v1.18.bin

Then build the SPL loader as well using boot_merger; don’t forget to modify the RKBOOT file.

Next, build idbloader.img using the modified loader, then proceed as follows:

    rkdeveloptool db modified_SPL_loader.bin
    rkdeveloptool wl 64 idbloader.img

That’s as far as I got. I am now trying to figure out how to modify U-Boot so it can continue booting with only that memory channel. Currently it shows the following:

DDR 9fa84341ce typ 24/09/06-09:51:11,fwver: v1.18
ch0 ttot10
ch0 ttot10
ch0 ttot16
LPDDR4, 2112MHz
channel[0] BW=16 Col=10 Bk=8 CS0 Row=17 CS1 Row=17 CS=2 Die BW=16 Size=4096MB
Manufacturer ID:0xff
DQS rds:l1,l1
DDR 9fa84341ce typ 24/09/06-09:51:11,fwver: v1.18
ch0 ttot10
ch0 ttot10
ch0 ttot16
LPDDR4, 2112MHz
channel[0] BW=16 Col=10 Bk=8 CS0 Row=17 CS1 Row=17 CS=2 Die BW=16 Size=4096MB
Manufacturer ID:0xff
DQS rds:l3,l2
CH0 RX Vref:32.6%, TX Vref:15.2%,16.2%
DQ rds:h3 l0 l0 h1 l0 h1 l1 l0, l0 l1 l0 l2 l1 l0 l1 h3

stride=0x0, ddr_config=0x0
hash ch_mask0-1 0x10 0x20, bank_mask0-3 0x200 0x400 0x800 0x0, rank_mask0 0x40000000
change to F1: 528MHz
ch0 ttot10
change to F2: 1068MHz
ch0 ttot12
change to F3: 1560MHz
ch0 ttot14
change to F0: 2112MHz
ch0 ttot16
out
INFO:    Preloader serial: 2
NOTICE:  BL31: v2.3():v2.3-765-g588059758:derrick.huang, fwver: v1.46
NOTICE:  BL31: Built : 18:13:16, Apr 29 2024
INFO:    spec: 0x13
INFO:    code: 0x88
INFO:    ext 32k is valid
INFO:    ddr: stride-en 4CH
INFO:    GICv3 without legacy support detected.
INFO:    ARM GICv3 driver initialized in EL3
INFO:    valid_cpu_msk=0xff bcore0_rst = 0x0, bcore1_rst = 0x0
INFO:    l3 cache partition cfg-0
INFO:    system boots from cpu-hwid-0
INFO:    disable memory repair
INFO:    idle_st=0x21fff, pd_st=0x11fff9, repair_st=0xfff70001
INFO:    dfs DDR fsp_params[0].freq_mhz= 2112MHz
INFO:    dfs DDR fsp_params[1].freq_mhz= 528MHz
INFO:    dfs DDR fsp_params[2].freq_mhz= 1068MHz
INFO:    dfs DDR fsp_params[3].freq_mhz= 1560MHz
INFO:    BL31: Initialising Exception Handling Framework
INFO:    BL31: Initializing runtime services
WARNING: No OPTEE provided by BL2 boot loader, Booting device without OPTEE initialization. SMC`s destined for OPTEE will return SMC_UNK
ERROR:   Error initializing runtime service opteed_fast
INFO:    BL31: Preparing for EL3 exit to normal world
INFO:    Entry point address = 0x200000
INFO:    SPSR = 0x3c9


U-Boot rknext-2017.09-38-22849bf-gb89b1c0 #runner (Nov 11 2024 - 06:42:50 +0000)

Model: Radxa ROCK 5A
MPIDR: 0x81000000
PreSerial: 2, raw, 0xfeb50000
DRAM:  "Error" handler, esr 0xbe000011

* Reason:        Exception from SError interrupt
* ELR(PC)    =   00000000002ae678
* LR         =   000000000021aea8
* SP         =   000000000057fc00
* ESR_EL2    =   00000000be000011
* Reloc Off  =   0000000000000000

x0 : 00000000eb9ffeb8 x1 : 0000000000000000
x2 : 0000000000000148 x3 : 0000000000000038
x4 : 0000000000000000 x5 : 0000000000000110
x6 : 0000000000000000 x7 : 00000000ff140000
x8 : 000000000057fa38 x9 : 00000000fe680000
x10: 0000000000000001 x11: 000000000f140000
x12: 0000000000000000 x13: 0000000000200000
x14: 00000000003240f0 x15: 0000000000000002
x16: 0000000000000000 x17: 0000000000000000
x18: 000000000057fe30 x19: 00000000002b9610
x20: 00000000002b9528 x21: 0000000000000000
x22: 0000000000000000 x23: 0000000000000000
x24: 0000000000000000 x25: 0000000000000000
x26: 0000000000000000 x27: 0000000000000000
x28: 0000000000000000 x29: 000000000057fde0


Call trace:
  PC:   [< 002ae678 >]
  LR:   [< 0021aea8 >]

Stack:
        [< 002ae678 >]

Copy info from "Call trace..." to a file(eg. dump.txt), and run
command in your U-Boot project: ./scripts/stacktrace.sh dump.txt

Resetting CPU ...

### ERROR ### Please RESET the board ###