[FFmpeg] Introduce FFmpeg-Rockchip for hyper fast video transcoding via CLI

Yep, NV12 from a MIPI camera. I don’t know whether 12-bit modes (MEDIA_BUS_FMT_SRGGB12_1X12) could trigger this. Note: I think 2016x1080 is a 10-bit mode.

I will try to investigate a bit more.

But some feedback:

-- mpp latest commit --
Author: Yanjun Liao <yanjun.liao@rock-chips.com>
Date:   Wed Jul 24 15:47:54 2024 +0800

    fix[265e]:Fix the st refernce frame err in tsvc
    
    This is a bug caused by the mark and use of ltr frames,
    when presence of multiple short temporal refenence frame,
    may lead to errors in reference relationships.
    
    Change-Id: I1962d81e39b704086a51b4e4098ba3feb64c47c6
    Signed-off-by: Yanjun Liao <yanjun.liao@rock-chips.com>

-- rga-multi latest commit --
Author: Yu Qiaowei <cerf.yu@rock-chips.com>
Date:   Thu Aug 29 15:01:34 2024 +0800

    normal: fix wrong full_csc_clip size in memcpy
    
    This causes the gauss mode to be checked when using full_csc with driver
    versions 1.3.5 and above.
    
    update to 1.10.1_[3]
    
    Signed-off-by: Yu Qiaowei <cerf.yu@rock-chips.com>
    Change-Id: I40c3c503c68a77f2c435a8932de0195b447d1d46

-- ffmpeg-rockchip latest commit --
Author: nyanmisaka <nst799610810@gmail.com>
Date:   Wed Oct 23 21:42:03 2024 +0800

    fixup! lavf/rkrga: add RKRGA scale, vpp and overlay filter
    
    fix nv24/nv42 check on rga2p
    
    Signed-off-by: nyanmisaka <nst799610810@gmail.com>

works, but dma-buf issue:

[AVFilterGraph @ 0xaaaae71af360] No such filter: 'scale_rkrga'

Have I missed some build parameters?

works, but dma-buf issue.

Update: the dma-buf issue is only when rendering on screen (ffplay)

In the RK 6.1 kernel, their custom dma-heap driver was removed. It was replaced with the upstream dma-heap driver, which might be the problem. The patch added in my MPP branch will use the DRM allocator instead of dma-heap by default.

As for the missing scale_rkrga filter, check if --enable-rkrga is configured in your FFmpeg.

--enable-gpl --enable-version3 --enable-libdrm --enable-rkmpp --enable-rkrga

This fixed the 1px problem, and possibly the dma-buf issue (I think). Thanks!

PSA: please always use the MPP and RGA branches I mentioned in the Wiki and the latest ffmpeg-rockchip. Those small fixes are carefully selected and verified in Jellyfin Server.

Just for the record, I still need to test with your branches. I just tested a UVC webcam that outputs H.264 and H.265 with the Rockchip branches, and the 1-pixel problem (it looks more like 8 pixels) is back.
To decode HEVC from the webcam, I added this:

diff --git a/libavdevice/v4l2-common.c b/libavdevice/v4l2-common.c
index 1926179..d169cba 100644
--- a/libavdevice/v4l2-common.c
+++ b/libavdevice/v4l2-common.c
@@ -57,6 +57,9 @@ const struct fmt_map ff_fmt_conversion_table[] = {
 #ifdef V4L2_PIX_FMT_H264
     { AV_PIX_FMT_NONE,    AV_CODEC_ID_H264,     V4L2_PIX_FMT_H264    },
 #endif
+#ifdef V4L2_PIX_FMT_HEVC
+    { AV_PIX_FMT_NONE,    AV_CODEC_ID_H265,     V4L2_PIX_FMT_HEVC    },
+#endif
 #ifdef V4L2_PIX_FMT_MPEG4
     { AV_PIX_FMT_NONE,    AV_CODEC_ID_MPEG4,    V4L2_PIX_FMT_MPEG4   },
 #endif

I haven’t looked at libavdevice/v4l2-common.c, except to include a patch from rigaya to add nv16/nv24 formats. Please dump a frame as rawvideo and inspect it with YUView. Either it’s a problem in the upstream FFmpeg code, or RK needs to fix their kernel driver.

... -vframes 1 -f rawvideo /path/to/1.yuv
... -vframes 1 -f rawvideo /path/to/2.rgb

FYI, I cloned the latest ffmpeg-rockchip, then built and ran it again: get_format() is choosing NV12 (which is the wrong format; it should be yuv420p).

./ffplay -f hevc -vcodec hevc_rkmpp -i ~/ffmpeg_1920x1080_100frames.h265 -loglevel debug
ffplay version 9dbaf5a Copyright (c) 2003-2023 the FFmpeg developers
  built with gcc 12 (Ubuntu 12.3.0-1ubuntu1~22.04)
  configuration: --prefix=/usr --disable-libopenh264 --disable-vaapi --disable-vdpau --disable-decoder=h264_v4l2m2m --disable-decoder=vp8_v4l2m2m --disable-decoder=mpeg2_v4l2m2m --disable-decoder=mpeg4_v4l2m2m --disable-libxvid --disable-libx264 --disable-libx265 --enable-rkmpp --enable-nonfree --enable-gpl --enable-version3 --enable-libmp3lame --enable-libpulse --enable-libv4l2 --enable-libdrm --enable-libxml2 --enable-librtmp --enable-libfreetype --enable-openssl --enable-opengl --enable-libopus --enable-libvorbis --disable-shared --enable-decoder='aac,ac3,flac' --disable-cuvid --enable-rkrga
  libavutil      58. 29.100 / 58. 29.100
  libavcodec     60. 31.102 / 60. 31.102
  libavformat    60. 16.100 / 60. 16.100
  libavdevice    60.  3.100 / 60.  3.100
  libavfilter     9. 12.100 /  9. 12.100
  libswscale      7.  5.100 /  7.  5.100
  libswresample   4. 12.100 /  4. 12.100
  libpostproc    57.  3.100 / 57.  3.100
arm_release_ver of this libmali is 'g6p0-01eac0', rk_so_ver is '6'.
arm_release_ver of this libmali is 'g6p0-01eac0', rk_so_ver is '6'.
arm_release_ver of this libmali is 'g6p0-01eac0', rk_so_ver is '6'.
Initialized opengles2 renderer.
[hevc @ 0xffff5c000c20] Opening '/home/rock/ffmpeg_1920x1080_100frames.h265' for reading
[file @ 0xffff5c001290] Setting default whitelist 'file,crypto,data'
[hevc @ 0xffff5c000c20] Before avformat_find_stream_info() pos: 0 bytes read:32768 seeks:0 nb_streams:1
[hevc @ 0xffff5c009a10] nal_unit_type: 32(VPS), nuh_layer_id: 0, temporal_id: 0
[hevc @ 0xffff5c009a10] nal_unit_type: 33(SPS), nuh_layer_id: 0, temporal_id: 0
[hevc @ 0xffff5c009a10] nal_unit_type: 34(PPS), nuh_layer_id: 0, temporal_id: 0
[hevc @ 0xffff5c009a10] nal_unit_type: 19(IDR_W_RADL), nuh_layer_id: 0, temporal_id: 0
[hevc @ 0xffff5c009a10] Decoding VPS
[hevc @ 0xffff5c009a10] Main profile bitstream
[hevc @ 0xffff5c009a10] Decoding SPS
[hevc @ 0xffff5c009a10] Main profile bitstream
[hevc @ 0xffff5c009a10] Decoding VUI
[hevc @ 0xffff5c009a10] Decoding PPS
[extract_extradata @ 0xffff5c01cde0] nal_unit_type: 32(VPS), nuh_layer_id: 0, temporal_id: 0
[extract_extradata @ 0xffff5c01cde0] nal_unit_type: 33(SPS), nuh_layer_id: 0, temporal_id: 0
[extract_extradata @ 0xffff5c01cde0] nal_unit_type: 34(PPS), nuh_layer_id: 0, temporal_id: 0
[extract_extradata @ 0xffff5c01cde0] nal_unit_type: 19(IDR_W_RADL), nuh_layer_id: 0, temporal_id: 0
[hevc @ 0xffff5c009a10] nal_unit_type: 32(VPS), nuh_layer_id: 0, temporal_id: 0
[hevc @ 0xffff5c009a10] nal_unit_type: 33(SPS), nuh_layer_id: 0, temporal_id: 0
[hevc @ 0xffff5c009a10] nal_unit_type: 34(PPS), nuh_layer_id: 0, temporal_id: 0
[hevc @ 0xffff5c009a10] nal_unit_type: 19(IDR_W_RADL), nuh_layer_id: 0, temporal_id: 0
[hevc @ 0xffff5c009a10] Decoding VPS
[hevc @ 0xffff5c009a10] Main profile bitstream
[hevc @ 0xffff5c009a10] Decoding SPS
[hevc @ 0xffff5c009a10] Main profile bitstream
[hevc @ 0xffff5c009a10] Decoding VUI
[hevc @ 0xffff5c009a10] Decoding PPS
[hevc @ 0xffff5c009a10] Format yuvj420p chosen by get_format().
[hevc @ 0xffff5c009a10] Output frame with POC 0.
[hevc @ 0xffff5c009a10] Decoded frame with POC 0.
[hevc @ 0xffff5c009a10] nal_unit_type: 32(VPS), nuh_layer_id: 0, temporal_id: 0
[hevc @ 0xffff5c009a10] nal_unit_type: 33(SPS), nuh_layer_id: 0, temporal_id: 0
[hevc @ 0xffff5c009a10] nal_unit_type: 34(PPS), nuh_layer_id: 0, temporal_id: 0
[hevc @ 0xffff5c009a10] Decoding VPS
[hevc @ 0xffff5c009a10] Main profile bitstream
[hevc @ 0xffff5c009a10] Decoding SPS
[hevc @ 0xffff5c009a10] Main profile bitstream
[hevc @ 0xffff5c009a10] Decoding VUI
[hevc @ 0xffff5c009a10] Decoding PPS
[hevc @ 0xffff5c009a10] nal_unit_type: 1(TRAIL_R), nuh_layer_id: 0, temporal_id: 0
    Last message repeated 2 times
[hevc @ 0xffff5c009a10] nal_unit_type: 32(VPS), nuh_layer_id: 0, temporal_id: 0
[hevc @ 0xffff5c009a10] nal_unit_type: 33(SPS), nuh_layer_id: 0, temporal_id: 0
[hevc @ 0xffff5c009a10] nal_unit_type: 34(PPS), nuh_layer_id: 0, temporal_id: 0
[hevc @ 0xffff5c009a10] nal_unit_type: 19(IDR_W_RADL), nuh_layer_id: 0, temporal_id: 0
[hevc @ 0xffff5c009a10] Decoding VPS
[hevc @ 0xffff5c009a10] Main profile bitstream
[hevc @ 0xffff5c009a10] Decoding SPS
[hevc @ 0xffff5c009a10] Main profile bitstream
[hevc @ 0xffff5c009a10] Decoding VUI
[hevc @ 0xffff5c009a10] Decoding PPS
[hevc @ 0xffff5c009a10] nal_unit_type: 1(TRAIL_R), nuh_layer_id: 0, temporal_id: 0
    Last message repeated 44 times
[hevc @ 0xffff5c000c20] All info found
[hevc @ 0xffff5c000c20] After avformat_find_stream_info() pos: 524288 bytes read:524288 seeks:0 frames:50
Input #0, hevc, from '/home/rock/ffmpeg_1920x1080_100frames.h265':
  Duration: N/A, bitrate: N/A
  Stream #0:0, 50, 1/1200000: Video: hevc (Main), 1 reference frame, yuvj420p(pc, bt709, left), 1920x1080 (1920x1088), 0/1, 25 fps, 25 tbr, 1200k tbn
[hevc_mp4toannexb @ 0xffff5c01cde0] The input looks like it is Annex B already
[hevc_rkmpp @ 0xffff5c307780] Format nv12 chosen by get_format().
[hevc_rkmpp @ 0xffff5c307780] Created a RKMPP hardware device
[hevc_rkmpp @ 0xffff5c307780] Decoder flushing
[hevc_rkmpp @ 0xffff5c307780] Wrote 14425 bytes to decoder
[hevc_rkmpp @ 0xffff5c307780] Wrote 16502 bytes to decoder
[hevc_rkmpp @ 0xffff5c307780] Wrote 5682 bytes to decoder
[hevc_rkmpp @ 0xffff5c307780] Noticed an info change
[hevc_rkmpp @ 0xffff5c307780] Format nv12 chosen by get_format().
[hevc_rkmpp @ 0xffff5c307780] Decoder options: deint=true afbc=0 fast_parse=true buf_mode=0
[hevc_rkmpp @ 0xffff5c307780] Configured with size: 1920x1080 | pix_fmt: nv12 | sw_pix_fmt: nv12
[hevc_rkmpp @ 0xffff5c307780] Wrote 14920 bytes to decoder
[hevc_rkmpp @ 0xffff5c307780] Received a frame
Video frame changed from size:0x0 format:none serial:-1 to size:1920x1080 format:nv12 serial:1
detected 8 logical cores
[ffplay_buffer @ 0xffff540016b0] Setting 'video_size' to value '1920x1080'
[ffplay_buffer @ 0xffff540016b0] Setting 'pix_fmt' to value '23'
[ffplay_buffer @ 0xffff540016b0] Setting 'time_base' to value '1/1200000'
[ffplay_buffer @ 0xffff540016b0] Setting 'pixel_aspect' to value '0/1'
[ffplay_buffer @ 0xffff540016b0] Setting 'frame_rate' to value '25/1'
[ffplay_buffer @ 0xffff540016b0] w:1920 h:1080 pixfmt:nv12 tb:1/1200000 fr:25/1 sar:0/1
[auto_scale_0 @ 0xffff540020f0] w:iw h:ih flags:'' interl:0
[ffplay_buffersink @ 0xffff54001aa0] auto-inserting filter 'auto_scale_0' between the filter 'ffplay_buffer' and the filter 'ffplay_buffersink'
[AVFilterGraph @ 0xffff54003860] query_formats: 2 queried, 0 merged, 1 already done, 0 delayed
[auto_scale_0 @ 0xffff540020f0] picking yuv420p out of 5 ref:nv12 alpha:0
[auto_scale_0 @ 0xffff540020f0] w:1920 h:1080 fmt:nv12 sar:0/1 -> w:1920 h:1080 fmt:yuv420p sar:0/1 flags:0x00000004
[auto_scale_0 @ 0xffff540020f0] w:1920 h:1080 fmt:nv12 sar:0/1 -> w:1920 h:1080 fmt:yuv420p sar:0/1 flags:0x00000004
    Last message repeated 2 times
[hevc_rkmpp @ 0xffff5c307780] Wrote 48590 bytes to decoder
[hevc_rkmpp @ 0xffff5c307780] Received a frame
Created 1920x1080 texture with SDL_PIXELFORMAT_IYUV.
[hevc_rkmpp @ 0xffff5c307780] Received a frame
    Last message repeated 2 times
[hevc_rkmpp @ 0xffff5c307780] Wrote 7279 bytes to decoder
[hevc_rkmpp @ 0xffff5c307780] Wrote 7487 bytes to decoder
[hevc_rkmpp @ 0xffff5c307780] Wrote 6723 bytes to decoder
[hevc_rkmpp @ 0xffff5c307780] Received a frame
[hevc_rkmpp @ 0xffff5c307780] Wrote 6448 bytes to decoder
[hevc_rkmpp @ 0xffff5c307780] Received a frame
    Last message repeated 2 times

ffmpeg_1920x1080_100frames.h265.zip (1.0 MB)

This is expected behavior. The hardware decoders are designed to output semi-planar formats, namely NV12. YUV420P is only used by the software decoder.
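To make the difference between the two layouts concrete, here is a minimal sketch of a CPU-side NV12 to yuv420p (I420) repack. The function name `nv12_to_i420` is made up for illustration, and the sketch assumes tightly packed buffers with even dimensions and no row padding, which is not necessarily what the hardware decoder hands you.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Repack one tightly packed NV12 frame as I420 (yuv420p).
 * NV12: Y plane (w*h bytes), then ONE interleaved plane U0 V0 U1 V1 ...
 * I420: Y plane (w*h bytes), then a full U plane, then a full V plane.
 * Assumption: even width/height and no row padding in either buffer.
 */
void nv12_to_i420(const uint8_t *nv12, uint8_t *i420, int w, int h)
{
    assert(w > 0 && h > 0 && w % 2 == 0 && h % 2 == 0);

    size_t y_size = (size_t)w * (size_t)h;
    size_t c_size = y_size / 4;            /* samples per chroma plane */
    const uint8_t *uv = nv12 + y_size;     /* interleaved chroma */
    uint8_t *u = i420 + y_size;
    uint8_t *v = u + c_size;

    memcpy(i420, nv12, y_size);            /* luma is identical in both layouts */
    for (size_t i = 0; i < c_size; i++) {
        u[i] = uv[2 * i];                  /* even bytes are U */
        v[i] = uv[2 * i + 1];              /* odd bytes are V */
    }
}
```

Both layouts take the same number of bytes (w*h*3/2); only the chroma arrangement differs, which is why a renderer that expects one and receives the other shows wrong colors rather than a wrong picture size.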

ffmpeg-rockchip no longer glues the RGA filter and MPP decoder together, because this does not conform to FFmpeg conventions. You have to use the scale_rkrga filter in your code.

I think I get it, but how do I force auto_scale to pick NV12?

[auto_scale_0 @ 0xffff640022a0] w:1920 h:1080 fmt:nv12 sar:0/1 -> w:1920 h:1080 fmt:yuv420p sar:0/1 flags:0x00000004

-vf scale_rkrga=format=nv12

I tried to force nv12 but I got an error:

[ffplay_buffer @ 0xffff68001860] Setting 'video_size' to value '1920x1080'
[ffplay_buffer @ 0xffff68001860] Setting 'pix_fmt' to value '23'
[ffplay_buffer @ 0xffff68001860] Setting 'time_base' to value '1/1200000'
[ffplay_buffer @ 0xffff68001860] Setting 'pixel_aspect' to value '0/1'
[ffplay_buffer @ 0xffff68001860] Setting 'frame_rate' to value '25/1'
[ffplay_buffer @ 0xffff68001860] w:1920 h:1080 pixfmt:nv12 tb:1/1200000 fr:25/1 sar:0/1
[AVFilterGraph @ 0xffff680039c0] Setting 'format' to value 'nv12'
[auto_scale_0 @ 0xffff68002770] w:iw h:ih flags:'' interl:0
[Parsed_scale_rkrga_0 @ 0xffff68001dc0] auto-inserting filter 'auto_scale_0' between the filter 'ffplay_buffer' and the filter 'Parsed_scale_rkrga_0'
Impossible to convert between the formats supported by the filter 'ffplay_buffer' and the filter 'auto_scale_0'

ffplay does not support hardware filters. You can use MPV player instead.

MPV is not an option in my case. GStreamer and ffplay 4.4.2 decode and render it fine (hw decoder).
I suppressed the filter in ffplay and rendered the NV12 frame as a texture; it works, but the green bar still exists. I need to investigate the NV12 frame format. I will dig into it. Thanks for your information and time.

The green bar means that the actual pitch/stride or offset of the decoder output image does not match the one you calculated. You have to fix it in your code.
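As a rough illustration of the pitch/offset point, the sketch below copies a padded NV12 frame into a tightly packed one. The helper names (`align_up`, `nv12_copy_packed`) are hypothetical, and the 16-row alignment is only an assumption suggested by the "1920x1080 (1920x1088)" line in the log above. In real code the stride and plane offset should be read from the frame itself (for example `linesize`, or the `pitch` and `offset` fields of the DRM plane descriptor) rather than computed from width and height.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round v up to a multiple of a (a must be a power of two). */
size_t align_up(size_t v, size_t a)
{
    return (v + a - 1) & ~(a - 1);
}

/*
 * Copy a padded NV12 frame into a tightly packed w x h NV12 buffer.
 * src_stride:  bytes per row in the source (>= w)
 * src_vstride: rows allocated for the Y plane in the source (>= h);
 *              the UV plane is assumed to start at
 *              src + src_stride * src_vstride.
 */
void nv12_copy_packed(const uint8_t *src, size_t src_stride, size_t src_vstride,
                      uint8_t *dst, size_t w, size_t h)
{
    assert(src_stride >= w && src_vstride >= h);
    assert(w % 2 == 0 && h % 2 == 0);

    const uint8_t *src_uv = src + src_stride * src_vstride;
    uint8_t *dst_uv = dst + w * h;

    for (size_t y = 0; y < h; y++)         /* luma rows */
        memcpy(dst + y * w, src + y * src_stride, w);
    for (size_t y = 0; y < h / 2; y++)     /* interleaved chroma rows */
        memcpy(dst_uv + y * w, src_uv + y * src_stride, w);
}
```

If the height really is padded to 1088 rows and the renderer assumes the UV plane starts at 1920*1080 instead of 1920*1088, it samples padding or luma bytes as chroma, which would be one plausible source of a colored bar at the edge of the picture.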