  • Publication of the China Computer Federation (CCF)
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Current Issue

    • User QoS-aware deep learning task dynamic scheduling on GPU clusters
      LUO Lei, CHEN Zhao-yun, WANG Li-xuan
      2021, 43(08): 1331-1340.
      A QoS (Quality of Service)-aware dynamic scheduling method for deep learning tasks on GPU clusters is proposed. An offline evaluation module profiles deep learning tasks and builds a computational performance prediction model. Based on this model and each task's expected QoS, an online scheduling module decides task placement and execution order. Experiments on a distributed GPU cluster demonstrate that the proposed method achieves a higher QoS-guarantee percentage and higher cluster resource utilization than other baseline schedulers.
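The pipeline described above, predicting each task's runtime offline and then scheduling placement and execution order online against its expected QoS, can be sketched roughly as follows. The task tuples, the earliest-deadline-first ordering, and the greedy choice of the earliest-idle GPU are illustrative assumptions, not the paper's actual algorithm.

```python
def schedule(tasks, num_gpus):
    """tasks: list of (name, predicted_runtime, qos_deadline)."""
    free_at = [0.0] * num_gpus           # time at which each GPU becomes idle
    placement, met = {}, 0
    # Earliest-deadline-first order is a common heuristic for QoS guarantees.
    for name, runtime, deadline in sorted(tasks, key=lambda t: t[2]):
        gpu = min(range(num_gpus), key=free_at.__getitem__)
        start = free_at[gpu]
        free_at[gpu] = start + runtime
        placement[name] = (gpu, start)
        if start + runtime <= deadline:  # did the task meet its QoS deadline?
            met += 1
    return placement, met / len(tasks)

tasks = [("A", 4.0, 5.0), ("B", 2.0, 3.0), ("C", 3.0, 9.0)]
plan, qos_rate = schedule(tasks, num_gpus=2)
```

A real scheduler would replace the fixed `predicted_runtime` values with the output of the offline performance model and re-run the placement step as tasks arrive.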
      Data center energy efficiency simulation based on genetic algorithm
      2021, 43(08): 1341-1352.
      With the continuous growth in the number and scale of data centers, energy consumption has become a key constraint on their cost and reliability. Moreover, as hardware is iteratively updated after a data center enters operation, server heterogeneity increases further and energy efficiency diverges considerably from the initial design. Therefore, dynamic energy efficiency simulation and analysis of the entire data center, based on server structure and hardware configuration, helps to grasp the energy efficiency status in real time, enables energy-efficiency-aware load scheduling, and opens the way to energy efficiency optimization. Based on SPECpower test results for enterprise-level servers, this paper first analyzes the energy efficiency development trends of servers in recent years and their influencing factors. Secondly, energy efficiency optimization of the data center is simulated with a genetic algorithm, and a prototype data center energy efficiency simulator is designed. The simulator can dynamically simulate and adjust the operating status of data center servers according to power supply limits, load conditions, throughput targets, etc., and can simulate the energy efficiency of data centers of different sizes and server types. The proposed genetic-algorithm-based data center energy efficiency simulation achieves smaller errors and shorter simulation time on the energy minimization problem.
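The genetic-algorithm step above can be illustrated with a toy energy-minimization run. The linear server power model, the capacities, and all GA parameters here are illustrative assumptions, not the paper's simulator.

```python
import random

random.seed(0)

SERVERS = [(100, 300), (60, 250), (80, 280)]   # (idle W, peak W) per server
LOAD, CAP = 12, 8                              # load units to place, per-server cap

def power(assign):
    # Linear model: idle + (peak - idle) * utilization, with a heavy penalty
    # for assignments that exceed a server's capacity.
    total = 0.0
    for (idle, peak), u in zip(SERVERS, assign):
        total += idle + (peak - idle) * (u / CAP)
        if u > CAP:
            total += 1e6
    return total

def rand_ind():
    ind = [0] * len(SERVERS)
    for _ in range(LOAD):
        ind[random.randrange(len(SERVERS))] += 1
    return ind

def evolve(pop=30, gens=60):
    population = [rand_ind() for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=power)
        survivors = population[:pop // 2]     # elitist truncation selection
        children = []
        while len(survivors) + len(children) < pop:
            a, b = random.sample(survivors, 2)
            child = a[:1] + b[1:]             # one-point crossover
            while sum(child) > LOAD:          # repair: keep exactly LOAD units
                i = random.randrange(len(child))
                if child[i] > 0:
                    child[i] -= 1
            while sum(child) < LOAD:          # mutation-style top-up
                child[random.randrange(len(child))] += 1
            children.append(child)
        population = survivors + children
    return min(population, key=power)

best = evolve()
```

The fitness function is where a real simulator would plug in measured SPECpower curves instead of the straight-line model used here.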

      Security enhancement technologies of security subsystem in microprocessors
      SHI Wei, LIU Wei, GONG Rui, WANG Lei, ZHANG Jian-feng
      2021, 43(08): 1353-1359.
      With the rapid development of information technology, information security is becoming ever more important. As the core component of an information system, processor security plays an important role in system security. Building a secure and trusted execution environment on the processor is an important way to improve its security. However, many security technologies still rely on independent security chips, such as the trusted platform module (TPM) and trusted cryptography module (TCM). In recent years, the root of trust, the security foundation of a computer system, has gradually shifted into the processor. This paper discusses security enhancement technologies for the on-chip security subsystem. Firstly, the architecture of the security processor is studied. Secondly, the components of the security subsystem, such as the processor core, interconnection network, storage, and cipher module, are examined. System security protection technologies such as key management, life-cycle management, secure boot, and physical attack resistance are also covered. Finally, a security subsystem for desktop processors is implemented and analyzed.

      Lightweight secure memory: Security enhancement for RISC-V embedded microprocessors
      NIU Shi-quan
      2021, 43(08): 1360-1365.
      In recent years, new types of attacks targeting hardware in embedded devices have emerged, seriously threatening their security. In particular, as non-volatile memory begins to be used in embedded devices, it is necessary to consider how to protect embedded systems equipped with it. Secure memory enhances the security of embedded systems by protecting their memory: it uses memory encryption to encrypt data in memory, thereby protecting the sensitive data it holds. A lightweight, low-overhead secure memory encryption engine is designed and integrated into a RISC-V embedded microprocessor, and is evaluated on FPGA. The evaluation results show that the engine provides security while maintaining reasonable memory access performance and small area overhead. The research results have good reference value and application prospects.
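As a rough illustration of the memory-encryption idea (not the engine evaluated in the paper), a counter-mode scheme derives a per-address keystream so that encryption and decryption are the same XOR operation. The SHA-256-based keystream below is a stdlib stand-in for a real block cipher, and the key and address layout are purely hypothetical.

```python
import hashlib

KEY = b"demo-key"   # hypothetical device key

def keystream(addr, length):
    # Derive a keystream unique to this memory address: hash key || address
    # || block counter, so each memory line encrypts differently.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(
            KEY + addr.to_bytes(8, "big") + counter.to_bytes(4, "big")
        ).digest()
        counter += 1
    return out[:length]

def xcrypt(addr, data):
    # XOR with the keystream; applying it twice restores the plaintext.
    ks = keystream(addr, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

line = b"secret cache line"
cipher = xcrypt(0x1000, line)
plain = xcrypt(0x1000, cipher)      # same operation decrypts
```

Counter-mode designs are popular for memory encryption precisely because the keystream can be computed in parallel with the memory access, hiding most of the latency.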


      Monitoring subsystem for exascale HPC systems: Challenges and design
      YUAN Yuan, LI Shi-jie, XING Jian-ying, JIANG Ju-ping
      2021, 43(08): 1366-1375.
      High-performance computing (HPC) systems built for future exascale computing require a several-fold increase in assembly density alongside a large expansion of node scale. This presents huge challenges for the HPC monitoring subsystem in terms of scalability, reliability, serviceability, and maintenance. In response, this paper introduces the design of the monitoring subsystem from four aspects, architecture, network, functionality, and maintenance, and verifies the feasibility and advantages of several designs on a prototype system, which can significantly benefit the construction of future exascale HPC systems.


      A quantitative evaluation method of social network users’ privacy leakage
      XIE Xiao-jie, LIANG Ying, WANG Zi-sen, DONG Xiang-xiang
      2021, 43(08): 1376-1386.
      Quantitative assessment of social network users' privacy leakage can help users understand their personal privacy exposure, raise public awareness of privacy protection, and provide a basis for designing personalized privacy protection methods. Existing quantitative privacy assessment methods mainly evaluate the protective effect of privacy protection schemes and cannot effectively assess the privacy leakage risk of social network users. A quantitative evaluation method of social network users' privacy leakage is proposed. Firstly, a user's subjective attribute sensitivity is calculated by Pearson similarity over a privacy preference matrix and averaged to obtain objective attribute sensitivity. Attribute openness is calculated as the information entropy of the posterior distribution inferred by a sensitive-attribute inference method. Transition probability and user importance are used to estimate the visible range of user data and compute data visibility. Then, a privacy score is calculated by aggregating attribute sensitivity, attribute openness, and data visibility. Finally, a fine-grained privacy evaluation is conducted based on the user's privacy score, which supports dynamic evaluation of user privacy and provides a basis for personalized privacy protection. Experimental results on Sina Weibo data show that the proposed method can effectively quantify a user's privacy leakage status.
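The aggregation step, combining attribute sensitivity, openness, and visibility into a single score, can be sketched minimally as below. The example attributes, the product form, and the equal-weight average are illustrative assumptions, not the paper's exact formulas.

```python
def privacy_score(attributes):
    """attributes: list of (sensitivity, openness, visibility), each in [0, 1].

    An attribute leaks more when it is sensitive, openly inferable, and
    visible to many users, so each term multiplies the three factors.
    """
    per_attr = [s * o * v for s, o, v in attributes]
    return sum(per_attr) / len(per_attr)   # normalized to [0, 1]

user = [
    (0.9, 0.8, 0.7),   # e.g. home address: sensitive, inferable, widely seen
    (0.3, 0.5, 0.2),   # e.g. favourite sport: low on every factor
]
score = privacy_score(user)
```

Because each factor is normalized, the score supports the fine-grained, per-attribute comparison the abstract mentions: one can rank attributes by their individual `s * o * v` terms to see which ones drive the leakage.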


      A federated ensemble algorithm for multi-source data security
      2021, 43(08): 1387-1397.
      Federated learning is a hot topic in privacy protection, but it suffers from the difficulty of aggregating local model parameters and from data leakage through gradient updates. This paper proposes a federated ensemble algorithm. It uses a 256-byte key to distribute different types of initialization models to the data sources for training, and uses different ensemble algorithms to integrate the local model parameters, thus greatly improving the security of both the data and the model. Simulation results show that, for small and medium datasets, the model obtained with the AdaBoost ensemble algorithm reaches an accuracy of 92.505% with a variance of about 8.6×10⁻⁸; for large datasets, the model obtained with the stacking ensemble algorithm reaches 92.495% with a variance of about 8.85×10⁻⁸. Compared with the traditional method of training on pooled data, the proposal maintains accuracy while safeguarding both the data and the model.

      A passive indoor fingerprint positioning algorithm based on CSI signal
      LIU Yan-xing, HAO Zhan-jun, TIAN Ran
      2021, 43(08): 1398-1404.
      Positioning technology based on channel state information (CSI) has attracted wide attention in indoor applications. To mitigate the impact of the WiFi multipath effect on the received signal strength indicator and improve indoor positioning accuracy and stability, this paper proposes a passive indoor fingerprint positioning algorithm based on CSI signals. In the offline phase, the area is divided into blocks of equal size. The raw data at each access point are filtered by an adaptive Kalman filter with variance compensation, and the filtered data are then grouped by bisecting K-means clustering. The amplitude and phase of the processed CSI are used as fingerprints. In the online phase, real-time data collected at test points are matched against the fingerprint database, and the located target does not need to carry any equipment. Simulation and field experiments show that, by exploiting sub-carrier characteristics of CSI signals, the proposed algorithm effectively reduces multipath attenuation at the signal receiver and significantly improves positioning accuracy.
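The online matching step can be illustrated with a tiny fingerprint database and inverse-distance-weighted nearest-neighbour matching. The block centres, CSI amplitude vectors, and the weighting scheme are hypothetical; the paper's filtering and clustering stages are omitted.

```python
import math

FINGERPRINTS = {                  # block centre (x, y) -> mean CSI amplitudes
    (0, 0): [20.1, 18.3, 22.4],
    (0, 2): [15.2, 19.8, 17.1],
    (2, 0): [24.6, 14.2, 19.9],
}

def locate(sample, k=2):
    # Rank blocks by Euclidean distance in feature space, then average the
    # k nearest block centres weighted by inverse distance.
    dists = sorted(
        (math.dist(sample, fp), pos) for pos, fp in FINGERPRINTS.items()
    )
    top = dists[:k]
    wsum = sum(1.0 / (d + 1e-9) for d, _ in top)
    x = sum(p[0] / (d + 1e-9) for d, p in top) / wsum
    y = sum(p[1] / (d + 1e-9) for d, p in top) / wsum
    return x, y

x, y = locate([20.0, 18.0, 22.0])   # a sample very close to block (0, 0)
```

Inverse-distance weighting lets a close fingerprint dominate the estimate while still smoothing across block boundaries, which is why the sample above resolves near, but not exactly at, the (0, 0) block centre.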

      A three-dimensional positioning algorithm based on hop correction and lion swarm optimization in WSNs
      GOU Ping-zhang, LIU Xue-zhi, SUN Meng-yuan, HE Bo
      2021, 43(08): 1405-1412.
      To reduce the positioning error of DV-Hop in three-dimensional space and improve node positioning accuracy, a three-dimensional positioning algorithm based on hop correction and lion swarm optimization in wireless sensor networks (WSNs), named HCLSO-3D, is proposed. Firstly, through propagation with multiple communication radii, node hop counts are divided precisely to obtain optimized hop values. Secondly, a similar-path search algorithm finds, for each unlocalized node and its corresponding anchor, the path of the most similar anchor-node pair, and the average hop distance of this path is corrected to obtain the average hop distance from the unlocalized node to the target anchor. Finally, the lion swarm optimization algorithm solves for the coordinates of the unlocalized node. Simulation results show that, compared with the 3D-DVHop algorithm and the algorithm in reference [16], HCLSO-3D significantly improves positioning accuracy in the same network environment.
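The classical DV-Hop distance step that HCLSO-3D corrects can be sketched in a few lines: each anchor derives an average hop distance from its known distances to other anchors, and an unknown node estimates its range as hop count times that average. The anchor coordinates and hop counts below are made-up, and the final least-squares/optimization solve is omitted.

```python
import math

ANCHORS = {"a1": (0, 0, 0), "a2": (10, 0, 0), "a3": (0, 10, 0)}
HOPS = {("a1", "a2"): 5, ("a1", "a3"): 5, ("a2", "a3"): 7}

def avg_hop_distance(anchor):
    # Sum of straight-line distances to the other anchors over total hops.
    dist = hops = 0.0
    for (u, v), h in HOPS.items():
        if anchor in (u, v):
            other = v if u == anchor else u
            dist += math.dist(ANCHORS[anchor], ANCHORS[other])
            hops += h
    return dist / hops

d1 = avg_hop_distance("a1")   # a1 sees a2 and a3: (10 + 10) / (5 + 5) = 2.0
est_range = 4 * d1            # an unknown node 4 hops from a1
```

It is exactly this per-anchor average that the paper's similar-path search corrects before lion swarm optimization solves for the node coordinates.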

      DWSN technology analysis and performance optimization based on SDN
      ZHOU Jing, GUAN Yu-rong
      2021, 43(08): 1413-1421.
      At present, distributed wireless sensor networks (DWSNs) are widely used in many fields, but technical and performance optimization difficulties remain in their network architecture and nodes, which affect application effectiveness. Software defined networking (SDN) is a new network architecture that separates control from forwarding in a programmable way and can virtualize node devices and inter-layer protocols in software. To improve the security and operability of DWSN key technologies and performance optimization, and to reduce network complexity, the SDN framework is introduced into DWSN to form a new SDN-based distributed wireless sensor network (SDDWSN) framework. Firstly, the main points of the SDDWSN framework are discussed along with the necessity of introducing SDN. Secondly, the related technologies and performance optimization of the SDDWSN framework are studied from the aspects of network communication, resource processing, and control management. Finally, the advantages of SDDWSN are demonstrated through design cases and a qualitative comparison of security and event maneuverability.

      Illumination estimation based on image decomposition
      CAO Tian-chi, LI Xiu-shi, LI Dan, CHEN Jia-nan, LIU Shuang, XIANG Wei-lai, HU Ying-song
      2021, 43(08): 1422-1428.
      The purpose of augmented reality is to superimpose computer-generated virtual objects onto real scenes, which requires estimating the scene lighting. For highlight scenes, the different kinds of reflected light in the scene are used to estimate illumination effectively. Firstly, the diffuse reflection map and the specular reflection map are obtained, and intrinsic image decomposition of the diffuse map further yields the albedo map and the shading map. Then, the illumination of the input image is calculated from the decomposition result and the scene depth. Finally, a global illumination model renders the virtual object, producing a tightly fused illumination effect between the virtual and real scenes.

      A survey of small object detection based on deep learning
      LIU Hong-jiang, WANG Mao, LIU Li-hua, WU Ji-bing, HUANG Hong-bin
      2021, 43(08): 1429-1442.
      Object detection is one of the core and most challenging problems in computer vision. With the wide application of deep learning, the efficiency and accuracy of object detection have gradually improved, in some respects reaching or even exceeding the resolution of the human eye. However, because small objects cover little area, have low resolution, and show weak features, existing object detection methods perform poorly on them, and many dedicated methods have been created to improve small object detection. Based on an extensive literature review, this paper thoroughly analyzes the reasons why small object detection is difficult, and discusses methods for improving it from multiple aspects, including multi-scale processing, contextual features, anchor box settings, intersection-over-union matching strategies, non-maximum suppression, loss functions, generative adversarial networks, and detection network structure.
      3D human dimension prediction based on improved GA-BP-MC neural network
      HU Xin-rong, LIU Jia-wen, LIU Jun-ping, PENG Tao, HE Ru-han
      2021, 43(08): 1443-1453.
      Due to the lack of depth information in 2D pictures, it is difficult to obtain three-dimensional size information of the human body. The size fitted by classical linear regression is the mean of the threshold interval to which the body belongs; because it ignores the heterogeneity of human bodies, the fitting error is very large. Model reconstruction can improve the accuracy of size acquisition, but the scale of computation and parameters in deep neural networks makes them difficult to deploy on mobile devices. Therefore, a 3D human dimension prediction model based on an improved GA-BP-MC neural network is proposed. The model optimizes the BP network structure, weights, and thresholds through adaptive crossover and mutation probabilities in the genetic algorithm, and a Markov residual network further improves the prediction accuracy of the UGA-BP model. Finally, 210 sample sets were compared and analyzed in an engineering example; the results show that, compared with the hyperelliptic curve method, the multivariate function method, and the GA-BP model, the proposed UGA-BP-MC reduces the average prediction error by 2.8 cm, 1.62 cm, and 0.94 cm respectively.

      A lightweight semantic segmentation algorithm based on ENet
      XU Shi-jie, DU Yu, LU Xin, WU Si-fan
      2021, 43(08): 1454-1460.
      Semantic segmentation algorithms classify images at the pixel level and are widely used in fields such as autonomous driving, medical image processing, and industrial automation, so they have important research value. Research on semantic segmentation focuses on three aspects: improving segmentation accuracy, reducing the number of parameters, and increasing inference speed. The lightweight semantic segmentation algorithm ENet uses a multi-layer convolutional encoder-decoder and a large number of dilated convolutions to avoid excessive downsampling and loss of spatial information. Although it retains some spatial information and a large receptive field, the encoder-decoder is bloated, spatial information is transmitted poorly, and the oversized receptive field causes a gridding effect. To address these problems, this paper prunes the ENet structure, designs a spatial information transmission module using an attention mechanism and pyramid dilated convolutions, optimizes the algorithm structure, enlarges the receptive field, and preserves the transmission of spatial information. Experimental results on the public datasets Cityscapes and BDD100K show that the new module improves the performance of the original algorithm with fewer parameters and computations, demonstrating the redundancy of the original algorithm and the effectiveness of the designed module.

      Source cell-phone identification under practical noises based on temporal convolutional network
      WU Zhang-qian, SU Zhao-pin, WU Qin-fang, ZHANG Guo-fu
      2021, 43(08): 1461-1469.
      To identify source cell-phones under practical environmental noise, a method based on linear discriminant analysis and a temporal convolutional network is proposed. Firstly, the classification performance of different speech features under practical noise is analyzed in detail, and on this basis a new mixed speech feature is constructed from the band energy descriptor, the constant Q transform, and linear discriminant analysis. The mixed feature is then fed to a temporal convolutional network for training and classification. Test results on a practical-noise speech database covering 10 brands, 47 cell-phone models, and 32,900 speech samples show that the average recognition accuracy of the proposed method reaches 99.82%. Moreover, compared with two classical methods, one based on the band energy descriptor and a support vector machine and the other on the constant-Q-transform domain and a convolutional neural network, the proposed method increases average recognition accuracy by about 0.44 and 0.54 percentage points, average recall by about 0.45 and 0.55 percentage points, average precision by about 0.41 and 0.57 percentage points, and average F1-score by about 0.49 and 0.55 percentage points, respectively. The experimental results show that the proposed method has better overall identification performance.

      An automatic streetlight monitoring system based on LoRa and STM32
      TIAN Xu-fei, YAO Kai-xue, WANG Kai-peng, WANG Yun-feng
      2021, 43(08): 1470-1478.
      Nowadays, streetlights in most areas still rely on traditional control, on-site manual inspection, and traditional circuit troubleshooting, with no unified management platform, resulting in considerable energy waste. In view of this, an automatic streetlight monitoring system based on LoRa and STM32 is designed. The system uses an STM32-series MCU as the core processor and transmits data between streetlights and the back-end server over 4G. It adopts LoRa cascade networking and provides a Bluetooth interface for on-site maintenance and parameter setting. Its main functions include real-time monitoring and control of streetlight status, real-time intelligent dimming, geographic location detection, environmental data collection, real-time fault detection and alarm, and real-time power monitoring and early warning. Tests verify that the system is stable and reliable with accurate data collection, and can be widely deployed in streetlight systems, saving electric energy, improving the scientific management of streetlights, and improving the appearance of the areas they serve.

      An improved bat algorithm for the vehicle routing problem with time windows and capacity constraints
      ZHANG Jin, HONG Li, DAI Er-zhuang
      2021, 43(08): 1479-1487.
      The vehicle routing problem with time windows and capacity constraints is one of the most important extensions of the vehicle routing problem and is NP-hard. Exact algorithms are inefficient and cannot deliver an optimal solution in limited time, especially for large-scale instances. To meet the rapid and effective distribution needs of enterprises and customers, intelligent optimization algorithms are usually adopted, since they can produce near-optimal solutions in limited time. This paper studies an improved discrete bat algorithm for the vehicle routing problem with capacity and time window constraints. After customer points are clustered by location, a variable-step-size search strategy and the 2-opt method are introduced into the algorithm's local search, strengthening the perturbation mechanism and improving search speed and accuracy. Simulation results show that the designed algorithm has better optimization ability and practical value.
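The 2-opt step mentioned above can be shown on a single route: reverse a segment and keep the change whenever the route gets shorter. The coordinates are made-up, and the bat algorithm, clustering, and capacity/time-window checks are omitted.

```python
import math

def route_length(route, pts):
    return sum(math.dist(pts[route[i]], pts[route[i + 1]])
               for i in range(len(route) - 1))

def two_opt(route, pts):
    best = route[:]
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 2):
            for j in range(i + 1, len(best) - 1):
                # Reverse the segment [i, j]; keep it if the route shrinks.
                cand = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if route_length(cand, pts) < route_length(best, pts) - 1e-12:
                    best, improved = cand, True
    return best

pts = {0: (0, 0), 1: (0, 1), 2: (1, 0), 3: (1, 1)}
route = [0, 1, 2, 3, 0]       # self-crossing route of length 2 + 2*sqrt(2)
best = two_opt(route, pts)    # uncrossed square of length 4
```

In the full algorithm this local search runs inside each bat's move, with the variable step size controlling how large a segment is considered.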
      A tongue image classification method based on deep transfer learning
      SONG Chao, WANG Bin, XU Jia-tuo
      2021, 43(08): 1488-1496.
      Tongue image analysis is an important topic in the objective and quantitative application of computer vision to the diagnosis and treatment of Traditional Chinese Medicine (TCM); its two key steps are tongue segmentation and tongue image classification. A cascade classifier automatically segments the tongue region from the original image, and the segmented tongue image is then trained on GoogLeNet and ResNet. The resulting deep networks classify tooth marks, cracks, and the thickness of the tongue coating. A dataset was built from 2,245 tongue images obtained from specialized TCM medical institutions. In experiments classifying the three types of tongue features (tooth marks, cracks, and coating thickness), this method outperforms traditional tongue image feature classification methods, verifying the effectiveness of tongue feature classification based on deep transfer learning.

      A Chinese-Vietnamese neural machine translation method based on synonym data augmentation
      YOU Cong-cong, GAO Sheng-xiang, YU Zheng-tao, MAO Cun-li, PAN Run-hai
      2021, 43(08): 1497-1502.
      The scarcity of Chinese-Vietnamese parallel corpora greatly limits the quality of Chinese-Vietnamese machine translation, and data augmentation is an effective way to improve it. Vocabulary replacement based on a bilingual dictionary is currently a popular augmentation method. However, since Chinese-Vietnamese is a low-resource language pair, bilingual dictionaries are hard to obtain, whereas synonyms of low-frequency words are easier to obtain from monolingual word vectors. We therefore propose a data augmentation method based on synonym replacement of low-frequency words that uses only a small-scale parallel corpus. Firstly, monolingual word vectors are learned to obtain a synonym list for low-frequency words on one side, and the low-frequency words are replaced with their synonyms. Secondly, a language model filters the replaced sentences. Finally, each filtered sentence is paired with the corresponding sentence on the other side to obtain an extended parallel corpus. Chinese-Vietnamese translation experiments show that the proposed method performs well, improving the BLEU score by 1.8 and 1.1 points compared with the baseline and back-translation methods, respectively.
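The replacement step can be sketched with a toy English corpus: only words below a frequency threshold are swapped, and only sentences that actually change are kept. The synonym table stands in for nearest neighbours in a monolingual word-vector space, and the threshold is an illustrative assumption; the language-model filter is omitted.

```python
from collections import Counter

SYNONYMS = {"prompt": "quick", "crimson": "red"}   # hypothetical neighbours

def augment(corpus, max_freq=1):
    # Count word frequencies so that only low-frequency words are replaced.
    freq = Counter(w for sent in corpus for w in sent.split())
    new_sents = []
    for sent in corpus:
        words = sent.split()
        replaced = [SYNONYMS.get(w, w) if freq[w] <= max_freq else w
                    for w in words]
        if replaced != words:          # keep only genuinely new sentences
            new_sents.append(" ".join(replaced))
    return new_sents

corpus = ["the prompt fox runs", "the fox sees a crimson sky", "the fox runs"]
extra = augment(corpus)
```

Each kept sentence would then be paired with the unchanged sentence on the other side of the parallel corpus, since replacing a word with a synonym should leave the translation valid.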

      Optimizing deep neural networks using a modified genetic algorithm
      LI Jing, MO Si-min
      2021, 43(08): 1503-1511.
      Deep feed-forward neural networks are well suited to classification and regression problems, but network performance is greatly affected by their structure and hyper-parameters. To obtain high-performance networks, a genetic algorithm with a modified selection strategy is designed and then employed to optimize the number of network layers, the number of nodes in each layer, the learning rate, and the weights, which are encoded by binary coding and real-number coding respectively. In the modified selection strategy, from the 2n individuals formed by combining the parent and offspring populations, the top-fitness individuals are selected, and some worse-fitness individuals are also selected with high probability, to achieve better diversity and avoid falling into local optima. The Dropout method is introduced to avoid overfitting the training data. Seven datasets (Ring, Breast cancer, Twonorm, Heart, Blood, Ionosphere, Monk) are used in the experiments. The results show that, compared with algorithms in the related literature, the modified genetic algorithm yields higher-performance neural networks.
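The modified selection strategy described above, keeping the best of the merged 2n individuals while admitting some worse ones for diversity, can be sketched as follows. The elite fraction, survival probability, and toy fitness values are illustrative assumptions, not the paper's exact settings.

```python
import random

random.seed(1)

def select(combined, n, elite_frac=0.6, worse_prob=0.5):
    """combined: list of (fitness, individual); higher fitness is better."""
    ranked = sorted(combined, key=lambda x: x[0], reverse=True)
    n_elite = int(n * elite_frac)
    chosen = ranked[:n_elite]            # best individuals always survive
    pool = ranked[n_elite:]
    for cand in pool:                    # worse individuals survive by chance
        if len(chosen) == n:
            break
        if random.random() < worse_prob:
            chosen.append(cand)
    for cand in pool:                    # top up to exactly n if needed
        if len(chosen) == n:
            break
        if cand not in chosen:
            chosen.append(cand)
    return [ind for _, ind in chosen]

combined = [(f, f"ind{f}") for f in range(10)]   # 2n = 10 merged candidates
next_gen = select(combined, n=5)
```

Admitting a stochastic share of weaker individuals is what keeps the population diverse; pure truncation of the top n would converge faster but is more prone to local optima.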