[1] Brown T B,Mann B,Ryder N,et al.Language models are few-shot learners[C]∥Proc of the 34th International Conference on Neural Information Processing Systems,2020:1877-1901.
[2] Smith S,Patwary M,Norick B,et al.Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B,a large-scale generative language model[J].arXiv:2201.11990,2022.
[3] Shoeybi M,Patwary M,Puri R,et al.Megatron-LM:Training multi-billion parameter language models using model parallelism[J].arXiv:1909.08053,2019.
[4] OpenAI.Techniques for training large neural networks[EB/OL].[2022-06-09].https://openai.com/index/techniques-for-training-large-neural-networks.
[5] Narayanan D,Shoeybi M,Casper J,et al.Efficient large-scale language model training on GPU clusters using Megatron-LM[C]∥Proc of the International Conference for High Performance Computing,Networking,Storage and Analysis,2021:1-15.
[6] Rashidi S,Sridharan S,Srinivasan S,et al.ASTRA-SIM:Enabling SW/HW co-design exploration for distributed DL training platforms[C]∥Proc of 2020 IEEE International Symposium on Performance Analysis of Systems and Software,2020:81-92.
[7] Foley D,Danskin J.Ultra-performance Pascal GPU and NVLink interconnect[J].IEEE Micro,2017,37(2):7-17.
[8] Ugnius.Wafer-scale processors:The time has come[EB/OL].[2019-09-06].https://www.cerebras.net/blog/wafer-scale-processors-the-time-has-come/.
[9] Chun S R,Kuo T H,Tsai H Y,et al.InFO_SoW (system-on-wafer) for high performance computing[C]∥Proc of 2020 IEEE 70th Electronic Components and Technology Conference,2020:1-6.
[10] Lewington R.An AI chip with unprecedented performance to do the unimaginable[EB/OL].[2021-08-17].https://www.cerebras.net/blog/an-ai-chip-with-unprecedented-performance-to-do-the-unimaginable/.
[11] Talpes E,Williams D D,Sarma D D.DOJO:The microarchitecture of Tesla's exa-scale computer[C]∥Proc of 2022 IEEE Hot Chips 34 Symposium,2022:1-28.
[12] Klender J.Tesla's in-house Dojo chip teased by legendary engineer ahead of AI day[EB/OL].[2021-08-03].https://www.teslarati.com/tesla-dojo-chip-images-dennis-hong/.
[13] Jia Z,Tillman B,Maggioni M,et al.Dissecting the Graphcore IPU architecture via microbenchmarking[J].arXiv:1912.03413,2019.
[14] Hewitt C,Bishop P,Steiger R.A universal modular ACTOR formalism for artificial intelligence[C]∥Proc of the 3rd International Joint Conference on Artificial Intelligence,1973:235-245.
[15] Agha G A,Mason I A,Smith S F,et al.A foundation for actor computation[J].Journal of Functional Programming,1997,7(1):1-72.
[16] Virding R,Wikström C,Williams M.Concurrent programming in ERLANG (2nd ed.)[M].Hertfordshire:Prentice Hall International (UK) Ltd.,1996.
[17] Akka:A concurrency framework for Java and Scala[EB/OL].[2023-05-15].https://akka.io/.
[18] Bernstein P,Bykov S,Geller A,et al.Orleans:Distributed virtual actors for programmability and scalability[EB/OL].[2023-05-05].https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/Orleans-MSR-TR-2014-41.pdf.
[19] Yuan J H,Li X Q,Cheng C,et al.OneFlow:Redesign the distributed deep learning framework from scratch[J].arXiv:2110.15032,2021.
[20] Srirama S N,Vemuri D.CANTO:An actor model-based distributed fog framework supporting neural networks training in IoT applications[J].Computer Communications,2023,199:1-9.
[21] Choquette J,Gandhi W,Giroux O,et al.NVIDIA A100 Tensor Core GPU:Performance and innovation[J].IEEE Micro,2021,41(2):29-35.
[22] Huang Y P,Cheng Y L,Bapna A,et al.GPipe:Efficient training of giant neural networks using pipeline parallelism[C]∥Proc of the 33rd International Conference on Neural Information Processing Systems,2019:103-112.