Converted Stable Diffusion 1.5 Models yield black images

I’ve converted a few models from Civitai now, and every one so far produces solid black images. Is there something I’m not doing correctly? The only thing I see in the log is some memory errors during startup of the Docker image:

main_app-1  | [bmlib_memory][error] bm_alloc_gmem failed, dev_id = 0, size = 0x75d63000
main_app-1  | [BM_CHECK][error] BM_CHECK_RET fail /workspace/libsophon/bmlib/src/bmlib_memory.cpp: sg_malloc_device_byte: 639
main_app-1  | [bmlib_memory][error] sg_device_mem_range_valid saddr=0xfffffffff eaddr=0x1001ffffff out of range
main_app-1  | [bmlib_memory][error] sg_device_mem_range_valid saddr=0x1001ffffff eaddr=0x1003ffffff out of range
main_app-1  | [bmlib_memory][error] sg_device_mem_range_valid saddr=0x1003ffffff eaddr=0x1005ef6fff out of range
main_app-1  | 2024-08-04 05:11:12 [INFO]    tpu_kernel_module loaded from binary
main_app-1  | [bmlib_memory][error] bm_alloc_gmem failed, dev_id = 0, size = 0x753b9000
main_app-1  | [BM_CHECK][error] BM_CHECK_RET fail /workspace/libsophon/bmlib/src/bmlib_memory.cpp: sg_malloc_device_byte: 639
main_app-1  | [bmlib_memory][error] sg_device_mem_range_valid saddr=0xfffffffff eaddr=0x1001ffffff out of range
main_app-1  | [bmlib_memory][error] sg_device_mem_range_valid saddr=0x1001ffffff eaddr=0x1003ffffff out of range
main_app-1  | [bmlib_memory][error] sg_device_mem_range_valid saddr=0x1003ffffff eaddr=0x1004178fff out of range
main_app-1  | [bmlib_memory][error] free gmem addr 0xfffffffff is invalide!
main_app-1  | [bmlib_memory][error] free gmem addr 0xfffffffff is invalide!
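
The `size` field in the `bm_alloc_gmem failed` lines appears to be a hexadecimal byte count, so converting it to MiB gives a number that can be compared with the device memory heap sizes. A minimal Python sketch, assuming only the line format shown in the log above (the function name `failed_allocations` is made up for illustration):

```python
import re

# Matches lines such as:
#   [bmlib_memory][error] bm_alloc_gmem failed, dev_id = 0, size = 0x75d63000
ALLOC_FAIL = re.compile(
    r"bm_alloc_gmem failed, dev_id = (\d+), size = (0x[0-9a-fA-F]+)"
)


def failed_allocations(log_text):
    """Return (dev_id, size_in_MiB) for every failed bm_alloc_gmem line."""
    results = []
    for match in ALLOC_FAIL.finditer(log_text):
        dev_id = int(match.group(1))
        size_bytes = int(match.group(2), 16)  # assumed to be a byte count
        results.append((dev_id, size_bytes / (1024 * 1024)))
    return results


# Two lines copied from the startup log in this post.
sample = """\
main_app-1  | [bmlib_memory][error] bm_alloc_gmem failed, dev_id = 0, size = 0x75d63000
main_app-1  | [bmlib_memory][error] bm_alloc_gmem failed, dev_id = 0, size = 0x753b9000
"""

for dev_id, mib in failed_allocations(sample):
    print(f"device {dev_id}: failed to allocate {mib:.1f} MiB")
```

Both failed requests come out just under 2 GiB (about 1885 MiB and 1876 MiB), which is a concrete figure to hold against whatever heap sizes the device reports.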

In this screenshot I’m using ChilloutMix (https://civitai.com/models/6424?modelVersionId=11745):

I have tried both the GUI and the CLI methods of converting, as detailed in the website documentation. I’m not sure why this isn’t working. Is there another log I should be looking at?

Hi, @Landside2,

Are you using the SD1.5 Docker image in CasaOS?
If yes, please set the container Memory Limit to maximum in the CasaOS settings, as in this photo.

If not, could you check the TPU memory with this command:

memory_edit.sh -p bm1684x_sm7m_v1.0.dtb

Make sure it has enough TPU memory. The recommended config for most models is NPU->7615MB, VPU->2360MB, VPP->2360MB.

best,
Morgan

@Morgan thanks for the reply.

I am using this in CasaOS, and I do have the Memory Limit set to maximum.

I checked and adjusted the memory settings as you suggested:

admin@Airbox:~$ sudo memory_edit.sh -p bm1684x_sm7m_v1.0.dtb
Info: use dts file /opt/sophon/memory_edit/output/bm1684x_sm7m_v1.0.dts
Info: chip is bm1684x
Info: =======================================================================
Info: get ddr information ...
Info: ddr12_size 8589934592 Byte [8192 MiB]
Info: ddr3_size 4294967296 Byte [4096 MiB]
Info: ddr4_size 4294967296 Byte [4096 MiB]
Info: ddr_size 16384 MiB
Info: =======================================================================
Info: get max memory size ...
Info: max npu size: 0x1dbf00000 [7615 MiB]
Info: max vpu size: 0xb8000000 [2944 MiB]
Info: max vpp size: 0x100000000 [4096 MiB]
Info: =======================================================================
Info: get now memory size ...
Info: now npu size: 0x1dbf00000 [7615 MiB]
Info: now vpu size: 0x77359400 [1907 MiB]
Info: now vpp size: 0x77359400 [1907 MiB]
admin@Airbox:~$ sudo memory_edit.sh -c -npu 7615 -vpu 2360 -vpp 2360 bm1684x_sm7m_v1.0.dtb
Info: use dts file /opt/sophon/memory_edit/output/bm1684x_sm7m_v1.0.dts
Info: chip is bm1684x
Info: =======================================================================
Info: get ddr information ...
Info: ddr12_size 8589934592 Byte [8192 MiB]
Info: ddr3_size 4294967296 Byte [4096 MiB]
Info: ddr4_size 4294967296 Byte [4096 MiB]
Info: ddr_size 16384 MiB
Info: =======================================================================
Info: get max memory size ...
Info: max npu size: 0x1dbf00000 [7615 MiB]
Info: max vpu size: 0xb8000000 [2944 MiB]
Info: max vpp size: 0x100000000 [4096 MiB]
Info: =======================================================================
Info: output configuration results ...
Info: vpu mem area(ddr3): 0x8000000 [128 MiB] 0x64800000 -> 0x6c7fffff
Info: ion npu mem area(ddr1): 0x1dbf00000 [7615 MiB] 0x24100000 -> 0x1ffffffff
Info: ion vpu mem area(ddr3): 0x93800000 [2360 MiB] 0x6c800000 -> 0xffffffff
Info: ion vpp mem area(ddr4): 0x93800000 [2360 MiB] 0x6c800000 -> 0xffffffff
Info: =======================================================================
Info: start check memory size ...
Info: check npu size: 0x1dbf00000 [7615 MiB]
Info: check vpu size: 0x93800000 [2360 MiB]
Info: check vpp size: 0x93800000 [2360 MiB]
Info: check edit size ok
Info: en_emmcfile ok, please run cmd and reboot system:
sudo cp /opt/sophon/memory_edit/emmcboot.itb /boot/emmcboot.itb && sync
admin@Airbox:~$ sudo cp /opt/sophon/memory_edit/emmcboot.itb /boot/emmcboot.itb && sync
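
To compare the `memory_edit.sh` output against the recommended values without reading hex by eye, the size lines can be parsed mechanically. This is a rough Python sketch that assumes the `Info: now|check <heap> size: 0x... [N MiB]` line format shown above; `heap_sizes` and `shortfalls` are illustrative names, not part of any Sophon tool:

```python
import re

# Recommended values (in MiB) quoted earlier in this thread.
RECOMMENDED_MIB = {"npu": 7615, "vpu": 2360, "vpp": 2360}

# Matches e.g. "Info: now vpu size: 0x77359400 [1907 MiB]"
SIZE_LINE = re.compile(
    r"Info: (now|check) (npu|vpu|vpp) size: (0x[0-9a-fA-F]+) \[(\d+) MiB\]"
)


def heap_sizes(output, kind):
    """Map heap name -> whole MiB, computed from the hex byte count.

    kind is "now" (current setting) or "check" (value about to be written).
    """
    sizes = {}
    for found_kind, heap, hex_bytes, _reported in SIZE_LINE.findall(output):
        if found_kind == kind:
            sizes[heap] = int(hex_bytes, 16) // (1024 * 1024)
    return sizes


def shortfalls(sizes):
    """Return {heap: (actual, recommended)} for heaps below the recommendation."""
    return {
        heap: (sizes.get(heap, 0), wanted)
        for heap, wanted in RECOMMENDED_MIB.items()
        if sizes.get(heap, 0) < wanted
    }


before = """\
Info: now npu size: 0x1dbf00000 [7615 MiB]
Info: now vpu size: 0x77359400 [1907 MiB]
Info: now vpp size: 0x77359400 [1907 MiB]
"""
after = """\
Info: check npu size: 0x1dbf00000 [7615 MiB]
Info: check vpu size: 0x93800000 [2360 MiB]
Info: check vpp size: 0x93800000 [2360 MiB]
"""

print("before:", shortfalls(heap_sizes(before, "now")))
print("after: ", shortfalls(heap_sizes(after, "check")))
```

The tool's own message asks for a reboot after copying `emmcboot.itb`, so re-running the `-p` query after rebooting is what would confirm that the `now` values actually changed.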

Still no luck: every model generates a black image except the AbsoluteReality model that ships with the Docker image. Is there anything else I can try or look at?

Hi, @Landside2,

Does that mean only AbsoluteReality still works?

If so, I think your model is affected by tpu-mlir. Could you provide your tpu-mlir version?

By the way, use the command

bmrt_test --bmodel ./YOUR_BMODEL_PATH

to check whether the bmodel can load and run inference correctly on the Airbox.

best,
Morgan
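
`bmrt_test` output is long, so it can help to scan it for stages whose printed output values are `nan`. The sketch below is a heuristic in Python that assumes the `The network[...] stage[...] output_data` and `output data #N shape: [...] < ... >` line formats that `bmrt_test` prints. Only the first few values of each tensor are printed, so a clean result here does not prove the whole tensor is finite. `nan_report` is an illustrative name.

```python
import re

# "+++ The network[sdv15_te] stage[0] output_data +++"
HEADER = re.compile(
    r"The network\[(?P<net>[^\]]+)\] stage\[(?P<stage>\d+)\] output_data"
)
# "output data #0 shape: [1 77 768 ] < nan nan ... > len=59136"
OUTPUT = re.compile(
    r"output data #(?P<idx>\d+) shape: \[(?P<shape>[^\]]*)\] <(?P<values>[^>]*)>"
)


def nan_report(log_text):
    """Return a list of (net, stage, output_index, has_nan) tuples."""
    report = []
    net, stage = None, None
    for line in log_text.splitlines():
        header = HEADER.search(line)
        if header:
            net, stage = header.group("net"), int(header.group("stage"))
            continue
        out = OUTPUT.search(line)
        if out and net is not None:
            values = out.group("values").split()
            has_nan = any(v.lower() in ("nan", "-nan") for v in values)
            report.append((net, stage, int(out.group("idx")), has_nan))
    return report


# Sample lines in the format bmrt_test prints.
sample = """\
[BMRT][bmrt_test:1042] INFO:+++ The network[sdv15_te] stage[0] output_data +++
[BMRT][print_array:706] INFO:output data #0 shape: [1 77 768 ] < nan nan nan nan ... > len=59136
[BMRT][bmrt_test:1042] INFO:+++ The network[sdv15_unet_fuse] stage[0] output_data +++
[BMRT][print_array:706] INFO:output data #0 shape: [1 4 64 64 ] < -0.75 0.151245 -0.0391541 ... > len=16384
[BMRT][bmrt_test:1042] INFO:+++ The network[sdv15_unet_fuse] stage[1] output_data +++
[BMRT][print_array:706] INFO:output data #0 shape: [1 4 96 64 ] < nan nan nan nan ... > len=24576
"""

for net, stage, idx, has_nan in nan_report(sample):
    status = "NaN" if has_nan else "ok"
    print(f"{net} stage {stage} output {idx}: {status}")
```

A stage that reports NaN for a fixed test input points at the compiled bmodel itself rather than at the prompt or sampler settings, which narrows the search to the conversion step.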

Hi, @Landside2

Could you share your bmodel via Google Drive? I am willing to help you solve this problem or build a new bmodel for you.

best,
Morgan

Hey @Morgan,

I think you may be right and the problem is with the conversion process. I grabbed some of the other models from https://github.com/radxa-edge/TPU-Edge-AI/releases (majicMIX_realistic, RealCartoon2.5D) and those work.

tpu_mlir is version 1.6.502 (installed via the instructions at https://docs.radxa.com/en/sophon/airbox/local-ai-deploy/large-model/sd-lcm and https://docs.radxa.com/en/sophon/airbox/model-compile/tpu_mlir_env)

root@9650c14a8fbd:/workspace/lcm-lora-sdv1-5# pip show tpu-mlir
Name: tpu_mlir
Version: 1.6.502
Summary: Machine learning compiler based on MLIR for Sophgo TPU v1.6.502-gcc378fc2c-20240513
Home-page: https://github.com/sophgo/tpu-mlir
Author: SOPHGO
Author-email: sales@sophgo.__
License: 2-Clause BSD
Location: /usr/local/lib/python3.10/dist-packages
Requires: graphviz, numpy, opencv-python-headless, pandas, Pillow, plotly, protobuf, pycocotools, scikit-image, scipy, tqdm, transformers
Required-by:
root@9650c14a8fbd:/workspace/lcm-lora-sdv1-5#
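
When the version needs to be recorded from inside a script (for example, next to each converted model), the same information can be read with the standard library instead of `pip show`. A small Python sketch, assuming the distribution name `tpu_mlir` exactly as `pip show` reports it above; `installed_version` is an illustrative helper:

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("tpu_mlir")
print("tpu_mlir:", version if version is not None else "not installed")
```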

The layout of the folder is as such:

workspace
├── SD-lcm-tpu
├── lcm-lora-sdv1-5
└── tpu-mlir

I generated the file using these two commands:

python3 export_from_safetensor_sd15_cli_wrapper.py -u limitlessvision_v40.safetensors -l /workspace/lcm-lora-sdv1-5/ -b 1 -o limitless_cli_pt

python3 convert_bmodel_cli_wrapper.py -n limitless_cli_pt_sd15 -o limitless_cli_bmodel -s 512 512 768 512 512 768 -b 1 -v sd15D

I have also tried these without the LoRA.
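
For reference, the `-s 512 512 768 512 512 768` argument can be read as three image sizes, and Stable Diffusion 1.5 works on latents that are 8x smaller than the image in each spatial dimension, with 4 channels. The Python sketch below works out the latent shape each size implies. It assumes the flat list is consumed in pairs and that each pair is (height, width); the wrapper script may define the order differently, and `latent_shapes` is a made-up helper name.

```python
def latent_shapes(flat_sizes, batch=1, latent_channels=4, vae_factor=8):
    """Turn a flat [h0, w0, h1, w1, ...] list into NCHW latent shapes.

    Assumption: each consecutive pair is (height, width) in pixels.
    """
    if len(flat_sizes) % 2 != 0:
        raise ValueError("sizes must come in pairs")
    shapes = []
    for i in range(0, len(flat_sizes), 2):
        height, width = flat_sizes[i], flat_sizes[i + 1]
        if height % vae_factor or width % vae_factor:
            raise ValueError(f"{height}x{width} is not a multiple of {vae_factor}")
        shapes.append(
            (batch, latent_channels, height // vae_factor, width // vae_factor)
        )
    return shapes


for shape in latent_shapes([512, 512, 768, 512, 512, 768]):
    print(shape)
```

Under that assumption the three sizes give latents of 64x64, 96x64 and 64x96, so a multi-size UNet or VAE bmodel built from them would be expected to expose three stages with those input shapes.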

The Docker container is running on my workstation, and I transfer the models to the Airbox after conversion. Here’s the uname -a and lscpu output of the workstation, if that helps:

Linux archif 6.10.2-arch1-2 #1 SMP PREEMPT_DYNAMIC Sat, 03 Aug 2024 17:56:17 +0000 x86_64 GNU/Linux

Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          48 bits physical, 48 bits virtual
  Byte Order:             Little Endian
CPU(s):                   16
  On-line CPU(s) list:    0-15
Vendor ID:                AuthenticAMD
  Model name:             AMD Ryzen 7 5700U with Radeon Graphics
    CPU family:           23
    Model:                104
    Thread(s) per core:   2
    Core(s) per socket:   8
    Socket(s):            1
    Stepping:             1
    CPU(s) scaling MHz:   35%
    CPU max MHz:          4372.0000
    CPU min MHz:          400.0000
    BogoMIPS:             3594.00
    Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip rdpid overflow_recov succor smca
Virtualization features:  
  Virtualization:         AMD-V
Caches (sum of all):      
  L1d:                    256 KiB (8 instances)
  L1i:                    256 KiB (8 instances)
  L2:                     4 MiB (8 instances)
  L3:                     8 MiB (2 instances)
NUMA:                     
  NUMA node(s):           1
  NUMA node0 CPU(s):      0-15
Vulnerabilities:          
  Gather data sampling:   Not affected
  Itlb multihit:          Not affected
  L1tf:                   Not affected
  Mds:                    Not affected
  Meltdown:               Not affected
  Mmio stale data:        Not affected
  Reg file data sampling: Not affected
  Retbleed:               Mitigation; untrained return thunk; SMT enabled with STIBP protection
  Spec rstack overflow:   Mitigation; Safe RET
  Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:             Mitigation; Retpolines; IBPB conditional; STIBP always-on; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
  Srbds:                  Not affected
  Tsx async abort:        Not affected

The output from bmrt_test is as follows:

bmrt_test --bmodel sdv15_text.bmodel 
[BMRT][deal_with_options:1446] INFO:Loop num: 1
[BMRT][bmrt_test:723] WARNING:setpriority failed, cpu time might flutuate.
[BMRT][bmcpu_setup:406] INFO:cpu_lib 'libcpuop.so' is loaded.
bmcpu init: skip cpu_user_defined
open usercpu.so, init user_cpu_init 
[BMRT][load_bmodel:1084] INFO:Loading bmodel from [sdv15_text.bmodel]. Thanks for your patience...
[BMRT][load_bmodel:1023] INFO:pre net num: 0, load net num: 1
[BMRT][show_net_info:1520] INFO: ########################
[BMRT][show_net_info:1521] INFO: NetName: sdv15_te, Index=0
[BMRT][show_net_info:1523] INFO: ---- stage 0 ----
[BMRT][show_net_info:1532] INFO:   Input 0) 'te_input' shape=[ 1 77 ] dtype=INT32 scale=1 zero_point=0
[BMRT][show_net_info:1542] INFO:   Output 0) 'last_hidden_state_LayerNormalization' shape=[ 1 77 768 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1545] INFO: ########################
[BMRT][bmrt_test:782] INFO:==> running network #0, name: sdv15_te, loop: 0
[BMRT][bmrt_test:868] INFO:reading input #0, bytesize=308
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=77
[BMRT][bmrt_test:1005] INFO:reading output #0, bytesize=236544
[BMRT][print_array:706] INFO:  --> output ref_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=59136
[BMRT][bmrt_test:1039] INFO:net[sdv15_te] stage[0], launch total time is 21863 us (npu 21786 us, cpu 77 us)
[BMRT][bmrt_test:1042] INFO:+++ The network[sdv15_te] stage[0] output_data +++
[BMRT][print_array:706] INFO:output data #0 shape: [1 77 768 ] < nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan ... > len=59136
[BMRT][bmrt_test:1083] INFO:load input time(s): 0.000213
[BMRT][bmrt_test:1084] INFO:calculate  time(s): 0.021867
[BMRT][bmrt_test:1085] INFO:get output time(s): 0.000236
[BMRT][bmrt_test:1086] INFO:compare    time(s): 0.000302
bmrt_test --bmodel sdv15_unet_multisize.bmodel
[BMRT][deal_with_options:1446] INFO:Loop num: 1
[BMRT][bmrt_test:723] WARNING:setpriority failed, cpu time might flutuate.
[BMRT][bmcpu_setup:406] INFO:cpu_lib 'libcpuop.so' is loaded.
bmcpu init: skip cpu_user_defined
open usercpu.so, init user_cpu_init 
[BMRT][load_bmodel:1084] INFO:Loading bmodel from [sdv15_unet_multisize.bmodel]. Thanks for your patience...
[BMRT][load_bmodel:1023] INFO:pre net num: 0, load net num: 1
[BMRT][show_net_info:1520] INFO: ########################
[BMRT][show_net_info:1521] INFO: NetName: sdv15_unet_fuse, Index=0
[BMRT][show_net_info:1523] INFO: ---- stage 0 ----
[BMRT][show_net_info:1532] INFO:   Input 0) 'sample.1' shape=[ 1 4 64 64 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 1) 'timestep.1' shape=[ 1 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 2) 'encoder_hidden_states.1' shape=[ 1 77 768 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 3) 'mid_block_additional_residual.1' shape=[ 1 1280 8 8 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 4) 'down_block_additional_residuals_0.1' shape=[ 1 320 64 64 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 5) 'down_block_additional_residuals_1.1' shape=[ 1 320 64 64 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 6) 'down_block_additional_residuals_2.1' shape=[ 1 320 64 64 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 7) 'down_block_additional_residuals_3.1' shape=[ 1 320 32 32 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 8) 'down_block_additional_residuals_4.1' shape=[ 1 640 32 32 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 9) 'down_block_additional_residuals_5.1' shape=[ 1 640 32 32 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 10) 'down_block_additional_residuals_6.1' shape=[ 1 640 16 16 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 11) 'down_block_additional_residuals_7.1' shape=[ 1 1280 16 16 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 12) 'down_block_additional_residuals_8.1' shape=[ 1 1280 16 16 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 13) 'down_block_additional_residuals_9.1' shape=[ 1 1280 8 8 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 14) 'down_block_additional_residuals_10.1' shape=[ 1 1280 8 8 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 15) 'down_block_additional_residuals_11.1' shape=[ 1 1280 8 8 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1542] INFO:   Output 0) '5703_f32' shape=[ 1 4 64 64 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1523] INFO: ---- stage 1 ----
[BMRT][show_net_info:1532] INFO:   Input 0) 'sample.1' shape=[ 1 4 96 64 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 1) 'timestep.1' shape=[ 1 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 2) 'encoder_hidden_states.1' shape=[ 1 77 768 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 3) 'mid_block_additional_residual.1' shape=[ 1 1280 12 8 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 4) 'down_block_additional_residuals_0.1' shape=[ 1 320 96 64 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 5) 'down_block_additional_residuals_1.1' shape=[ 1 320 96 64 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 6) 'down_block_additional_residuals_2.1' shape=[ 1 320 96 64 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 7) 'down_block_additional_residuals_3.1' shape=[ 1 320 48 32 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 8) 'down_block_additional_residuals_4.1' shape=[ 1 640 48 32 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 9) 'down_block_additional_residuals_5.1' shape=[ 1 640 48 32 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 10) 'down_block_additional_residuals_6.1' shape=[ 1 640 24 16 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 11) 'down_block_additional_residuals_7.1' shape=[ 1 1280 24 16 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 12) 'down_block_additional_residuals_8.1' shape=[ 1 1280 24 16 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 13) 'down_block_additional_residuals_9.1' shape=[ 1 1280 12 8 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 14) 'down_block_additional_residuals_10.1' shape=[ 1 1280 12 8 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 15) 'down_block_additional_residuals_11.1' shape=[ 1 1280 12 8 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1542] INFO:   Output 0) '5703_f32' shape=[ 1 4 96 64 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1523] INFO: ---- stage 2 ----
[BMRT][show_net_info:1532] INFO:   Input 0) 'sample.1' shape=[ 1 4 64 96 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 1) 'timestep.1' shape=[ 1 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 2) 'encoder_hidden_states.1' shape=[ 1 77 768 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 3) 'mid_block_additional_residual.1' shape=[ 1 1280 8 12 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 4) 'down_block_additional_residuals_0.1' shape=[ 1 320 64 96 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 5) 'down_block_additional_residuals_1.1' shape=[ 1 320 64 96 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 6) 'down_block_additional_residuals_2.1' shape=[ 1 320 64 96 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 7) 'down_block_additional_residuals_3.1' shape=[ 1 320 32 48 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 8) 'down_block_additional_residuals_4.1' shape=[ 1 640 32 48 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 9) 'down_block_additional_residuals_5.1' shape=[ 1 640 32 48 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 10) 'down_block_additional_residuals_6.1' shape=[ 1 640 16 24 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 11) 'down_block_additional_residuals_7.1' shape=[ 1 1280 16 24 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 12) 'down_block_additional_residuals_8.1' shape=[ 1 1280 16 24 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 13) 'down_block_additional_residuals_9.1' shape=[ 1 1280 8 12 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 14) 'down_block_additional_residuals_10.1' shape=[ 1 1280 8 12 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1532] INFO:   Input 15) 'down_block_additional_residuals_11.1' shape=[ 1 1280 8 12 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1542] INFO:   Output 0) '5703_f32' shape=[ 1 4 64 96 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1545] INFO: ########################
[BMRT][bmrt_test:782] INFO:==> running network #0, name: sdv15_unet_fuse, loop: 0
[BMRT][bmrt_test:868] INFO:reading input #0, bytesize=65536
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=16384
[BMRT][bmrt_test:868] INFO:reading input #1, bytesize=4
[BMRT][print_array:706] INFO:  --> input_data: < 0 >
[BMRT][bmrt_test:868] INFO:reading input #2, bytesize=236544
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=59136
[BMRT][bmrt_test:868] INFO:reading input #3, bytesize=327680
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=81920
[BMRT][bmrt_test:868] INFO:reading input #4, bytesize=5242880
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=1310720
[BMRT][bmrt_test:868] INFO:reading input #5, bytesize=5242880
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=1310720
[BMRT][bmrt_test:868] INFO:reading input #6, bytesize=5242880
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=1310720
[BMRT][bmrt_test:868] INFO:reading input #7, bytesize=1310720
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=327680
[BMRT][bmrt_test:868] INFO:reading input #8, bytesize=2621440
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=655360
[BMRT][bmrt_test:868] INFO:reading input #9, bytesize=2621440
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=655360
[BMRT][bmrt_test:868] INFO:reading input #10, bytesize=655360
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=163840
[BMRT][bmrt_test:868] INFO:reading input #11, bytesize=1310720
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=327680
[BMRT][bmrt_test:868] INFO:reading input #12, bytesize=1310720
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=327680
[BMRT][bmrt_test:868] INFO:reading input #13, bytesize=327680
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=81920
[BMRT][bmrt_test:868] INFO:reading input #14, bytesize=327680
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=81920
[BMRT][bmrt_test:868] INFO:reading input #15, bytesize=327680
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=81920
[BMRT][bmrt_test:1005] INFO:reading output #0, bytesize=65536
[BMRT][print_array:706] INFO:  --> output ref_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=16384
[BMRT][bmrt_test:1039] INFO:net[sdv15_unet_fuse] stage[0], launch total time is 114835 us (npu 114715 us, cpu 120 us)
[BMRT][bmrt_test:1042] INFO:+++ The network[sdv15_unet_fuse] stage[0] output_data +++
[BMRT][print_array:706] INFO:output data #0 shape: [1 4 64 64 ] < -0.75 0.151245 -0.0391541 0.0606079 0.0619202 -0.00511551 0.012001 -0.00222397 -0.022522 -0.0433044 -0.0566711 -0.0669556 -0.0713501 -0.0565186 -0.0287018 -0.00335503 ... > len=16384
[BMRT][bmrt_test:1083] INFO:load input time(s): 0.025509
[BMRT][bmrt_test:1084] INFO:calculate  time(s): 0.114855
[BMRT][bmrt_test:1085] INFO:get output time(s): 0.000114
[BMRT][bmrt_test:1086] INFO:compare    time(s): 0.000155
[BMRT][bmrt_test:868] INFO:reading input #0, bytesize=98304
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=24576
[BMRT][bmrt_test:868] INFO:reading input #1, bytesize=4
[BMRT][print_array:706] INFO:  --> input_data: < 0 >
[BMRT][bmrt_test:868] INFO:reading input #2, bytesize=236544
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=59136
[BMRT][bmrt_test:868] INFO:reading input #3, bytesize=491520
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=122880
[BMRT][bmrt_test:868] INFO:reading input #4, bytesize=7864320
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=1966080
[BMRT][bmrt_test:868] INFO:reading input #5, bytesize=7864320
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=1966080
[BMRT][bmrt_test:868] INFO:reading input #6, bytesize=7864320
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=1966080
[BMRT][bmrt_test:868] INFO:reading input #7, bytesize=1966080
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=491520
[BMRT][bmrt_test:868] INFO:reading input #8, bytesize=3932160
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=983040
[BMRT][bmrt_test:868] INFO:reading input #9, bytesize=3932160
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=983040
[BMRT][bmrt_test:868] INFO:reading input #10, bytesize=983040
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=245760
[BMRT][bmrt_test:868] INFO:reading input #11, bytesize=1966080
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=491520
[BMRT][bmrt_test:868] INFO:reading input #12, bytesize=1966080
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=491520
[BMRT][bmrt_test:868] INFO:reading input #13, bytesize=491520
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=122880
[BMRT][bmrt_test:868] INFO:reading input #14, bytesize=491520
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=122880
[BMRT][bmrt_test:868] INFO:reading input #15, bytesize=491520
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=122880
[BMRT][bmrt_test:1005] INFO:reading output #0, bytesize=98304
[BMRT][print_array:706] INFO:  --> output ref_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=24576
[BMRT][bmrt_test:1039] INFO:net[sdv15_unet_fuse] stage[1], launch total time is 195690 us (npu 195599 us, cpu 91 us)
[BMRT][bmrt_test:1042] INFO:+++ The network[sdv15_unet_fuse] stage[1] output_data +++
[BMRT][print_array:706] INFO:output data #0 shape: [1 4 96 64 ] < nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan ... > len=24576
[BMRT][bmrt_test:1083] INFO:load input time(s): 0.037982
[BMRT][bmrt_test:1084] INFO:calculate  time(s): 0.195693
[BMRT][bmrt_test:1085] INFO:get output time(s): 0.000119
[BMRT][bmrt_test:1086] INFO:compare    time(s): 0.000129
[BMRT][bmrt_test:868] INFO:reading input #0, bytesize=98304
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=24576
[BMRT][bmrt_test:868] INFO:reading input #1, bytesize=4
[BMRT][print_array:706] INFO:  --> input_data: < 0 >
[BMRT][bmrt_test:868] INFO:reading input #2, bytesize=236544
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=59136
[BMRT][bmrt_test:868] INFO:reading input #3, bytesize=491520
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=122880
[BMRT][bmrt_test:868] INFO:reading input #4, bytesize=7864320
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=1966080
[BMRT][bmrt_test:868] INFO:reading input #5, bytesize=7864320
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=1966080
[BMRT][bmrt_test:868] INFO:reading input #6, bytesize=7864320
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=1966080
[BMRT][bmrt_test:868] INFO:reading input #7, bytesize=1966080
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=491520
[BMRT][bmrt_test:868] INFO:reading input #8, bytesize=3932160
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=983040
[BMRT][bmrt_test:868] INFO:reading input #9, bytesize=3932160
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=983040
[BMRT][bmrt_test:868] INFO:reading input #10, bytesize=983040
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=245760
[BMRT][bmrt_test:868] INFO:reading input #11, bytesize=1966080
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=491520
[BMRT][bmrt_test:868] INFO:reading input #12, bytesize=1966080
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=491520
[BMRT][bmrt_test:868] INFO:reading input #13, bytesize=491520
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=122880
[BMRT][bmrt_test:868] INFO:reading input #14, bytesize=491520
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=122880
[BMRT][bmrt_test:868] INFO:reading input #15, bytesize=491520
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=122880
[BMRT][bmrt_test:1005] INFO:reading output #0, bytesize=98304
[BMRT][print_array:706] INFO:  --> output ref_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=24576
[BMRT][bmrt_test:1039] INFO:net[sdv15_unet_fuse] stage[2], launch total time is 197835 us (npu 197750 us, cpu 85 us)
[BMRT][bmrt_test:1042] INFO:+++ The network[sdv15_unet_fuse] stage[2] output_data +++
[BMRT][print_array:706] INFO:output data #0 shape: [1 4 64 96 ] < nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan ... > len=24576
[BMRT][bmrt_test:1083] INFO:load input time(s): 0.038048
[BMRT][bmrt_test:1084] INFO:calculate  time(s): 0.197843
[BMRT][bmrt_test:1085] INFO:get output time(s): 0.000120
[BMRT][bmrt_test:1086] INFO:compare    time(s): 0.000196
bmrt_test --bmodel sdv15_vd_multisize.bmodel 
[BMRT][deal_with_options:1446] INFO:Loop num: 1
[BMRT][bmrt_test:723] WARNING:setpriority failed, cpu time might flutuate.
[BMRT][bmcpu_setup:406] INFO:cpu_lib 'libcpuop.so' is loaded.
bmcpu init: skip cpu_user_defined
open usercpu.so, init user_cpu_init 
[BMRT][load_bmodel:1084] INFO:Loading bmodel from [sdv15_vd_multisize.bmodel]. Thanks for your patience...
[BMRT][load_bmodel:1023] INFO:pre net num: 0, load net num: 1
[BMRT][show_net_info:1520] INFO: ########################
[BMRT][show_net_info:1521] INFO: NetName: sdv15_vd, Index=0
[BMRT][show_net_info:1523] INFO: ---- stage 0 ----
[BMRT][show_net_info:1532] INFO:   Input 0) 'x.1' shape=[ 1 4 64 64 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1542] INFO:   Output 0) '697_f32' shape=[ 1 3 512 512 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1523] INFO: ---- stage 1 ----
[BMRT][show_net_info:1532] INFO:   Input 0) 'x.1' shape=[ 1 4 96 64 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1542] INFO:   Output 0) '697_f32' shape=[ 1 3 768 512 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1523] INFO: ---- stage 2 ----
[BMRT][show_net_info:1532] INFO:   Input 0) 'x.1' shape=[ 1 4 64 96 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1542] INFO:   Output 0) '697_f32' shape=[ 1 3 512 768 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1545] INFO: ########################
[BMRT][bmrt_test:782] INFO:==> running network #0, name: sdv15_vd, loop: 0
[BMRT][bmrt_test:868] INFO:reading input #0, bytesize=65536
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=16384
[BMRT][bmrt_test:1005] INFO:reading output #0, bytesize=3145728
[BMRT][print_array:706] INFO:  --> output ref_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=786432
[BMRT][bmrt_test:1039] INFO:net[sdv15_vd] stage[0], launch total time is 336867 us (npu 336785 us, cpu 82 us)
[BMRT][bmrt_test:1042] INFO:+++ The network[sdv15_vd] stage[0] output_data +++
[BMRT][print_array:706] INFO:output data #0 shape: [1 3 512 512 ] < nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan ... > len=786432
[BMRT][bmrt_test:1083] INFO:load input time(s): 0.000342
[BMRT][bmrt_test:1084] INFO:calculate  time(s): 0.336873
[BMRT][bmrt_test:1085] INFO:get output time(s): 0.002312
[BMRT][bmrt_test:1086] INFO:compare    time(s): 0.000546
[BMRT][bmrt_test:868] INFO:reading input #0, bytesize=98304
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=24576
[BMRT][bmrt_test:1005] INFO:reading output #0, bytesize=4718592
[BMRT][print_array:706] INFO:  --> output ref_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=1179648
[BMRT][bmrt_test:1039] INFO:net[sdv15_vd] stage[1], launch total time is 503871 us (npu 503819 us, cpu 52 us)
[BMRT][bmrt_test:1042] INFO:+++ The network[sdv15_vd] stage[1] output_data +++
[BMRT][print_array:706] INFO:output data #0 shape: [1 3 768 512 ] < nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan ... > len=1179648
[BMRT][bmrt_test:1083] INFO:load input time(s): 0.000212
[BMRT][bmrt_test:1084] INFO:calculate  time(s): 0.503874
[BMRT][bmrt_test:1085] INFO:get output time(s): 0.003376
[BMRT][bmrt_test:1086] INFO:compare    time(s): 0.000737
[BMRT][bmrt_test:868] INFO:reading input #0, bytesize=98304
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=24576
[BMRT][bmrt_test:1005] INFO:reading output #0, bytesize=4718592
[BMRT][print_array:706] INFO:  --> output ref_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=1179648
[BMRT][bmrt_test:1039] INFO:net[sdv15_vd] stage[2], launch total time is 507822 us (npu 507772 us, cpu 50 us)
[BMRT][bmrt_test:1042] INFO:+++ The network[sdv15_vd] stage[2] output_data +++
[BMRT][print_array:706] INFO:output data #0 shape: [1 3 512 768 ] < nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan ... > len=1179648
[BMRT][bmrt_test:1083] INFO:load input time(s): 0.000209
[BMRT][bmrt_test:1084] INFO:calculate  time(s): 0.507825
[BMRT][bmrt_test:1085] INFO:get output time(s): 0.003411
[BMRT][bmrt_test:1086] INFO:compare    time(s): 0.000707
bmrt_test --bmodel sdv15_ve_multisize.bmodel 
[BMRT][deal_with_options:1446] INFO:Loop num: 1
[BMRT][bmrt_test:723] WARNING:setpriority failed, cpu time might flutuate.
[BMRT][bmcpu_setup:406] INFO:cpu_lib 'libcpuop.so' is loaded.
bmcpu init: skip cpu_user_defined
open usercpu.so, init user_cpu_init 
[BMRT][load_bmodel:1084] INFO:Loading bmodel from [sdv15_ve_multisize.bmodel]. Thanks for your patience...
[BMRT][load_bmodel:1023] INFO:pre net num: 0, load net num: 1
[BMRT][show_net_info:1520] INFO: ########################
[BMRT][show_net_info:1521] INFO: NetName: sdv15_ve, Index=0
[BMRT][show_net_info:1523] INFO: ---- stage 0 ----
[BMRT][show_net_info:1532] INFO:   Input 0) 'x.1' shape=[ 1 3 512 512 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1542] INFO:   Output 0) '568_f32' shape=[ 1 8 64 64 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1523] INFO: ---- stage 1 ----
[BMRT][show_net_info:1532] INFO:   Input 0) 'x.1' shape=[ 1 3 768 512 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1542] INFO:   Output 0) '568_f32' shape=[ 1 8 96 64 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1523] INFO: ---- stage 2 ----
[BMRT][show_net_info:1532] INFO:   Input 0) 'x.1' shape=[ 1 3 512 768 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1542] INFO:   Output 0) '568_f32' shape=[ 1 8 64 96 ] dtype=FLOAT32 scale=1 zero_point=0
[BMRT][show_net_info:1545] INFO: ########################
[BMRT][bmrt_test:782] INFO:==> running network #0, name: sdv15_ve, loop: 0
[BMRT][bmrt_test:868] INFO:reading input #0, bytesize=3145728
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=786432
[BMRT][bmrt_test:1005] INFO:reading output #0, bytesize=131072
[BMRT][print_array:706] INFO:  --> output ref_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=32768
[BMRT][bmrt_test:1039] INFO:net[sdv15_ve] stage[0], launch total time is 163083 us (npu 162974 us, cpu 109 us)
[BMRT][bmrt_test:1042] INFO:+++ The network[sdv15_ve] stage[0] output_data +++
[BMRT][print_array:706] INFO:output data #0 shape: [1 8 64 64 ] < nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan ... > len=32768
[BMRT][bmrt_test:1083] INFO:load input time(s): 0.003035
[BMRT][bmrt_test:1084] INFO:calculate  time(s): 0.163091
[BMRT][bmrt_test:1085] INFO:get output time(s): 0.000177
[BMRT][bmrt_test:1086] INFO:compare    time(s): 0.000169
[BMRT][bmrt_test:868] INFO:reading input #0, bytesize=4718592
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=1179648
[BMRT][bmrt_test:1005] INFO:reading output #0, bytesize=196608
[BMRT][print_array:706] INFO:  --> output ref_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=49152
[BMRT][bmrt_test:1039] INFO:net[sdv15_ve] stage[1], launch total time is 243577 us (npu 243518 us, cpu 59 us)
[BMRT][bmrt_test:1042] INFO:+++ The network[sdv15_ve] stage[1] output_data +++
[BMRT][print_array:706] INFO:output data #0 shape: [1 8 96 64 ] < nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan ... > len=49152
[BMRT][bmrt_test:1083] INFO:load input time(s): 0.006236
[BMRT][bmrt_test:1084] INFO:calculate  time(s): 0.243584
[BMRT][bmrt_test:1085] INFO:get output time(s): 0.000206
[BMRT][bmrt_test:1086] INFO:compare    time(s): 0.000170
[BMRT][bmrt_test:868] INFO:reading input #0, bytesize=4718592
[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=1179648
[BMRT][bmrt_test:1005] INFO:reading output #0, bytesize=196608
[BMRT][print_array:706] INFO:  --> output ref_data: < 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... > len=49152
[BMRT][bmrt_test:1039] INFO:net[sdv15_ve] stage[2], launch total time is 245110 us (npu 245051 us, cpu 59 us)
[BMRT][bmrt_test:1042] INFO:+++ The network[sdv15_ve] stage[2] output_data +++
[BMRT][print_array:706] INFO:output data #0 shape: [1 8 64 96 ] < nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan ... > len=49152
[BMRT][bmrt_test:1083] INFO:load input time(s): 0.004655
[BMRT][bmrt_test:1084] INFO:calculate  time(s): 0.245117
[BMRT][bmrt_test:1085] INFO:get output time(s): 0.000194
[BMRT][bmrt_test:1086] INFO:compare    time(s): 0.000164

@Morgan
https://drive.google.com/drive/folders/1-1iTJZvvfA8yp0x86FmDIE-Wjw36KnGA?usp=drive_link here is the drive link you requested. I appreciate the offer to build a new bmodel but I would much rather figure out and resolve the conversion issue I’m experiencing. Thank you very much! :slight_smile:

Hi, @Landside2

After checking the bmrt_test output, I found that every result is nan. That is why the images come out black.
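For anyone who wants to run the same check on their own converted bmodels, here is a minimal sketch that scans saved bmrt_test output for outputs containing nan. It assumes the log lines look like the `output data #0 shape: [...] < ... >` lines pasted above; the function name `find_nan_outputs` is only illustrative and is not part of any Sophgo tool.

```python
import re

# Matches bmrt_test lines such as:
# [BMRT][print_array:706] INFO:output data #0 shape: [1 3 512 512 ] < nan nan ... > len=786432
# (format assumed from the logs in this thread)
OUTPUT_LINE = re.compile(r"output data #(\d+) shape: \[([^\]]*)\] <([^>]*)>")


def find_nan_outputs(log_text):
    """Return (output_index, shape) for every printed output that contains NaN."""
    bad = []
    for line in log_text.splitlines():
        match = OUTPUT_LINE.search(line)
        if match is None:
            continue  # not an output line (e.g. input_data or ref_data)
        values = match.group(3).split()
        if any(v.lower() in ("nan", "-nan") for v in values):
            shape = [int(dim) for dim in match.group(2).split()]
            bad.append((int(match.group(1)), shape))
    return bad


if __name__ == "__main__":
    sample = (
        "[BMRT][print_array:706] INFO:  --> input_data: < 0 0 0 0 ... > len=16384\n"
        "[BMRT][print_array:706] INFO:output data #0 shape: [1 3 512 512 ] "
        "< nan nan nan nan ... > len=786432\n"
    )
    for index, shape in find_nan_outputs(sample):
        print(f"output #{index} with shape {shape} contains NaN")
```

Note that bmrt_test only prints the first few values of each tensor, so a clean result from this scan does not prove the whole output is free of nan.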

I will check this version of tpu-mlir.

By the way, I have just requested access to your Drive files.

best,
Morgan

@Morgan tpu-mlir is version 1.6.502 on the workstation that’s converting the files:

root@9650c14a8fbd:/workspace/lcm-lora-sdv1-5# pip show tpu-mlir
Name: tpu_mlir
Version: 1.6.502
Summary: Machine learning compiler based on MLIR for Sophgo TPU v1.6.502-gcc378fc2c-20240513
Home-page: https://github.com/sophgo/tpu-mlir
Author: SOPHGO
Author-email: sales@sophgo.com
License: 2-Clause BSD
Location: /usr/local/lib/python3.10/dist-packages
Requires: graphviz, numpy, opencv-python-headless, pandas, Pillow, plotly, protobuf, pycocotools, scikit-image, scipy, tqdm, transformers
Required-by:
root@9650c14a8fbd:/workspace/lcm-lora-sdv1-5#

Hi, @Landside2

I checked tpu-mlir 1.6.502 and reproduced this problem, so I have uploaded a rollback version, which I have tested, for everyone to use.

It is available at https://github.com/radxa-edge/TPU-Edge-AI/releases/download/v0.1.0/tpu_mlir-1.6.404-py3-none-any.whl

Please download it and install it in your Docker container.
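After installing the wheel with pip, it may be worth confirming which tpu-mlir version the conversion environment actually picks up before converting again. The snippet below is a small sketch using Python's standard `importlib.metadata`. The two version strings come from this thread; the helper names `classify_tpu_mlir` and `installed_tpu_mlir_version` are made up for illustration.

```python
from importlib import metadata

KNOWN_BAD = "1.6.502"   # version reported in this thread as producing nan outputs
KNOWN_GOOD = "1.6.404"  # rollback version from the wheel linked above


def classify_tpu_mlir(version):
    """Classify a tpu-mlir version string based only on what this thread reports."""
    if version == KNOWN_BAD:
        return "bad"
    if version == KNOWN_GOOD:
        return "good"
    return "unknown"


def installed_tpu_mlir_version():
    """Return the installed tpu_mlir version, or None if it is not installed."""
    try:
        return metadata.version("tpu_mlir")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_tpu_mlir_version()
    if version is None:
        print("tpu_mlir is not installed in this environment")
    else:
        print(f"tpu_mlir {version}: {classify_tpu_mlir(version)}")
```

If the script still reports 1.6.502 after installing the wheel, the install probably went into a different Python environment than the one used for conversion.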

Thanks for finding this problem.

best,
Morgan


@Morgan thanks for looking into this! I was wondering, when generating the model, is it possible to supply multiple LoRAs, or are we limited to just one?

@Landside2

It supports multiple LoRAs, but you might need to change the ConvertorPtOnnx.py code a little.
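As a rough illustration of what applying several LoRAs means numerically, the sketch below folds more than one low-rank update into a single weight matrix using the usual LoRA formula W' = W + sum(scale_i * up_i x down_i). This is not the code from ConvertorPtOnnx.py, which is not shown in this thread; the function names and the plain nested-list matrices are assumptions chosen to keep the example self-contained.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_loras(weight, loras):
    """Fold several LoRA updates into one base weight matrix.

    weight: (out x in) matrix.
    loras:  list of (up, down, scale), with up (out x r) and down (r x in).
    Returns a new matrix; the input weight is left unchanged.
    """
    merged = [row[:] for row in weight]
    for up, down, scale in loras:
        delta = matmul(up, down)  # low-rank update with the same shape as weight
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += scale * value
    return merged


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]
    lora_a = ([[1.0], [0.0]], [[0.0, 2.0]], 0.5)
    lora_b = ([[0.0], [1.0]], [[4.0, 0.0]], 0.25)
    print(merge_loras(base, [lora_a, lora_b]))
```

Because the updates are simply summed, the order in which the LoRAs are merged does not matter in this formulation; only the per-LoRA scale changes how strongly each one contributes.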
