• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science, 2023, Vol. 45, Issue 11: 1991-1998.

• Graphics and Images •

A multi-scale feature fusion network based fast CU partitioning in HEVC intra coding

LIU Yu-mo1,2, LIU Jian-fei1,2, HAO Lu-guo3, ZENG Wen-bin4

  (1. School of Electronic and Information Engineering, Hebei University of Technology, Tianjin 300131;
    2. National Experimental Teaching Demonstration Center of Electronic and Communication Engineering, Hebei University of Technology, Tianjin 300131;
    3. School of Information Engineering, Guangdong University of Technology, Guangzhou 510006;
    4. School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China)
  • Received: 2022-07-18 Revised: 2023-02-13 Accepted: 2023-11-25 Online: 2023-11-25 Published: 2023-11-16
  • Funding:
    Natural Science Foundation of Hebei Province (F2021202054); Hebei Province Graduate Demonstration Course Project (KCJSX2020014)

Abstract: High Efficiency Video Coding (HEVC) significantly improves coding efficiency but also increases coding complexity, most notably in the quadtree-based coding unit (CU) partitioning process, so fast CU partitioning is an important research problem. A multi-scale feature fusion network can achieve fast HEVC CU partitioning. To this end, the UcuNet network is designed by combining U-Net with the characteristics of CU partitioning, and asymmetric convolution (AC) and the CBAM attention mechanism are adopted to strengthen feature extraction from pixels at different scales. To train the deep learning model more effectively, original videos at different resolutions and their corresponding encoding information are collected to build a large-scale dataset. Finally, the model is embedded into the HEVC coding architecture to predict the CU partitioning result in advance. This skips the recursive rate-distortion optimization (RDO) computation of the original CU partitioning method and thereby effectively reduces the coding complexity caused by CU partitioning. Experimental results show that, compared with the official HEVC test model (HM16.20), UcuNet shortens the average coding time by 68.13% with a BD-BR loss of only 2.63%.

Keywords: high efficiency video coding (HEVC), coding unit partitioning, deep learning, asymmetric convolution
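
The abstract's central idea, replacing the recursive RDO search with a predicted partition, can be illustrated with a minimal sketch. It assumes the standard HEVC quadtree of one 64×64 CTU with CU sizes from 64×64 down to 8×8. The names `predict_split`, `toy_predictor`, `partition_ctu`, and `exhaustive_cu_count` are hypothetical and chosen for illustration only; they are not the UcuNet interface or any HM function, and the toy predictor merely stands in for a trained network's split decisions.

```python
from typing import Callable, List, Tuple

CTU_SIZE = 64     # largest CU size in HEVC (assumed default configuration)
MIN_CU_SIZE = 8   # smallest CU size in HEVC

CU = Tuple[int, int, int]  # (x, y, size) of a square coding unit


def partition_ctu(predict_split: Callable[[int, int, int], bool],
                  x: int = 0, y: int = 0, size: int = CTU_SIZE) -> List[CU]:
    """Return the leaf CUs of one CTU by following predicted split flags.

    Only CUs on the predicted path are visited, so no rate-distortion
    cost has to be computed for the alternatives that were not chosen.
    """
    if size > MIN_CU_SIZE and predict_split(x, y, size):
        half = size // 2
        leaves: List[CU] = []
        # Visit the four sub-CUs: top-left, top-right, bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(partition_ctu(predict_split, x + dx, y + dy, half))
        return leaves
    return [(x, y, size)]


def exhaustive_cu_count(size: int = CTU_SIZE) -> int:
    """Count the CUs a full recursive search evaluates for one CTU."""
    if size == MIN_CU_SIZE:
        return 1
    return 1 + 4 * exhaustive_cu_count(size // 2)


def toy_predictor(x: int, y: int, size: int) -> bool:
    """Hypothetical stand-in for the network: split only the top-left corner down to 16x16."""
    return x == 0 and y == 0 and size > 16


leaves = partition_ctu(toy_predictor)
print(len(leaves), "leaf CUs from prediction;",
      exhaustive_cu_count(), "CUs in an exhaustive search")
```

In this toy case the predicted partition yields 7 leaf CUs, whereas an exhaustive quadtree search would evaluate 85 candidate CUs (1 + 4 + 16 + 64). The count is only a rough proxy for the encoder's actual workload and is not a reproduction of the timing results reported in the abstract.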