• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (09): 1655-1664.

• Graphics and Images •

A lightweight text recognition algorithm for Chinese guide signs

YI Chao-jie (宜超杰), CHEN Li (陈莉), BAO Yu-xiang (包宇翔)

  1. (School of Information Science & Technology, Northwest University, Xi'an 710100, China)
  • Received: 2021-01-28 Revised: 2021-05-26 Accepted: 2022-09-25 Online: 2022-09-25 Published: 2022-09-25
  • Supported by:
     National Key Research and Development Program of China (2020YFC1523301); Key Research and Development Program of Shaanxi Province (2019ZDLGY10-01)

Abstract: Multi-directional, multi-angle text on Chinese traffic guide signs is difficult to extract and recognize. To address this, a lightweight text extraction and recognition algorithm for Chinese traffic guide signs is proposed that combines convolutional neural networks with traditional machine learning methods. First, the YOLOv5l object detection network is slimmed down into a lighter network, YOLOv5t, which extracts the text regions from guide signs. Second, the M-split algorithm, which combines the projection histogram method with polynomial fitting, segments the extracted text regions into characters. Finally, the lightweight MobileNetV3 network recognizes the text. On the self-made TS-Detect dataset, the proposed algorithm reaches a close-range text recognition accuracy of 90.1% and a detection speed of 40 fps, with a weight file of only 24.45 MB. The experimental results show that the algorithm is lightweight and accurate, and can perform real-time text extraction and recognition on Chinese guide signs under complex shooting conditions.

Key words: traffic sign, text recognition, polynomial fitting, YOLO, MobileNet
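
The abstract names the two building blocks of M-split (a projection histogram and polynomial fitting) but does not say how they are combined. The sketch below is therefore only an illustration of those two generic techniques on a binarized text region, not the authors' M-split algorithm. The function names `split_columns` and `fit_centerline`, the `min_width` and `degree` parameters, and the toy image are all assumptions made for this example.

```python
import numpy as np


def split_columns(binary, min_width=1):
    """Split a binarized text line into character spans.

    Illustrative only: uses the vertical projection histogram (column sums)
    and cuts wherever the histogram drops to zero. Returns a list of
    (start, end) column ranges, with `end` exclusive.
    """
    profile = binary.sum(axis=0)  # vertical projection histogram
    has_ink = profile > 0
    spans, start = [], None
    for x, flag in enumerate(has_ink):
        if flag and start is None:
            start = x  # a run of foreground columns begins
        elif not flag and start is not None:
            if x - start >= min_width:
                spans.append((start, x))
            start = None  # the run ended at a blank column
    if start is not None and len(has_ink) - start >= min_width:
        spans.append((start, len(has_ink)))
    return spans


def fit_centerline(binary, degree=1):
    """Fit y = f(x) through the foreground pixels with np.polyfit.

    Illustrative only: for a tilted or curved text line, the fitted
    polynomial approximates its center line. Coefficients are returned
    highest power first, as np.polyfit defines them.
    """
    ys, xs = np.nonzero(binary)
    return np.polyfit(xs, ys, degree)


# Toy example (hypothetical data): three rectangular "characters".
img = np.zeros((10, 20), dtype=np.uint8)
img[2:8, 1:5] = 1
img[2:8, 7:11] = 1
img[2:8, 14:19] = 1

print(split_columns(img))   # [(1, 5), (7, 11), (14, 19)]
print(fit_centerline(img))  # slope close to 0, intercept close to 4.5
```

Cutting at zero-valued columns works only when characters are separated by clear gaps. Skewed or multi-angle text, the case the abstract targets, would presumably need the fitted center line to rectify the region before projecting.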