• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2014, Vol. 36 ›› Issue (05): 797-803.

• Articles •

Design and implementation of a SIMD floating-point divider based on the SRT-8 algorithm
(一种基于SRT-8算法的SIMD浮点除法器的设计与实现)

DENG Ziye (邓子椰), CHEN Shuming (陈书明), PENG Yuanxi (彭元喜), LEI Yuanwu (雷元武)

  1. (College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China)
  • Received: 2013-07-05 Revised: 2013-11-12 Online: 2014-05-25 Published: 2014-05-25

Abstract:

Division is one of the basic operations commonly used in scientific computing, digital signal processing, communications, and image processing. Based on the SRT-8 division algorithm, a SIMD floating-point divider compliant with the IEEE 754 standard is designed. On the same hardware, it performs either one double-precision floating-point division or two parallel single-precision floating-point divisions. The SRT-8 iterative division structure is optimized by processing quotient selection and remainder addition in parallel and by adopting a quotient-digit storage technique, which shortens the delay of each division iteration and raises the clock frequency. In addition, a reuse strategy reduces hardware resource overhead and saves area. Experiments show that, with a 40 nm technology library, the synthesized cell area of the design is 18601.9681 μm² and the operating frequency reaches 2.5 GHz, with the critical-path delay reduced by 23.81% relative to a conventional SRT-8 implementation.

Key words: SRT-8; SIMD; floating-point divider; double-precision floating-point; SIMD single-precision floating-point
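
The radix-8 digit recurrence named in the abstract can be illustrated with a small software model. The sketch below is not the paper's hardware design. It assumes the quotient digit set {-4, ..., 4}, which the abstract does not state, and it selects each digit by exact rounding on rational numbers, whereas a hardware SRT divider would normally use a selection table driven by truncated estimates of the remainder and divisor. The names `srt8_divide` and `select_digit` are hypothetical. Each iteration computes w[j+1] = 8·w[j] − q[j+1]·d, retiring one radix-8 digit (3 bits) of the quotient.

```python
from fractions import Fraction

RADIX = 8
MAX_DIGIT = 4  # assumed digit set {-4, ..., 4}; not stated in the abstract


def select_digit(shifted_remainder, divisor):
    """Pick the digit nearest to shifted_remainder / divisor, clamped to the digit set."""
    digit = round(shifted_remainder / divisor)  # exact rounding on Fractions
    return max(-MAX_DIGIT, min(MAX_DIGIT, digit))


def srt8_divide(x, d, iterations):
    """Approximate x / d for significands x, d in [1, 2).

    Returns (digits, quotient), where digits are the signed radix-8
    quotient digits and quotient is the value they represent.
    """
    x, d = Fraction(x), Fraction(d)
    assert 1 <= x < 2 and 1 <= d < 2
    # Convergence bound on the partial remainder: |w| <= (4/7) * d.
    bound = Fraction(MAX_DIGIT, RADIX - 1) * d
    w = x / 4  # pre-scale so the initial remainder satisfies the bound
    digits = []
    quotient = Fraction(0)
    for j in range(1, iterations + 1):
        shifted = RADIX * w               # shift the remainder left by 3 bits
        q = select_digit(shifted, d)      # quotient selection
        w = shifted - q * d               # remainder update
        assert abs(w) <= bound            # the recurrence stays bounded
        digits.append(q)
        quotient += Fraction(q, RADIX ** j)
    return digits, 4 * quotient           # undo the initial pre-scaling


if __name__ == "__main__":
    digits, q = srt8_divide(Fraction(3, 2), Fraction(5, 4), 18)
    print(digits)
    print(float(q))
```

In this model the digit selection and the remainder update run one after the other inside each loop pass. The abstract's optimization concerns overlapping those two steps in hardware, which this sequential sketch does not attempt to reproduce.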