Computer Engineering & Science

Overview on the energy efficiency of job

scheduling for high performance computing

ZHENG Wen-xu,PAN Xiao-dong,MA Di,WANG Hao

2019, 41(09): 1526-1533. doi:

Abstract ( 279 )

PDF (595KB) ( 387 ) 　　

Review attachment

Due to the increasing demand for high performance computing in scientific research and commercial applications, the performance and system scale of high performance computing are developing rapidly. However, the rapid increase in power consumption severely limits the design and use of high-performance computing (HPC) systems, which makes low-power technologies become the key technology in the HPC field. As the core component of the whole system and based on limited system resources, the job scheduling system distributes job resources to the applications submitted by the users. Its energy efficiency plays an important role in the control and regulation of the energy consumption of the whole HPC system. We first introduce the main energy efficiency techniques and common job scheduling strategies, then analyze the energy efficiency of current HPC job scheduling, and discuss the challenges it faces and future development directions.

An improved primary/backup scheduling algorithm

based on simulated annealing algorithm

ZHU Yong-chao1,ZHOU Chuan1,CUI Yu-wei2,GUO Jian1,WU Yi-fei1

2019, 41(09): 1534-1540. doi:

Abstract ( 194 )

PDF (691KB) ( 293 ) 　　

Review attachment

Aiming at the priority constraint oriented task scheduling in heterogeneous distributed systems, we propose an improved primary/backup scheduling algorithm based on simulated annealing algorithm called SAPB. The task model is presented by a directed acyclic graph (DAG) and the proposed algorithm schedules two copies (primary and backup) of each task. For task sorting, we adopt the HEFT task sorting method to avoid the limitation of the task model description in primary/backup scheduling strategies such as eFRD. In task processor selection stage, the simulated annealing algorithm is used to search the solution with higher reliability under the condition of meeting the deadline and it considers preparing one more backup copy to solve the likely scheduling failure caused by task priority constraints when scheduling backup copies have fewer processors. Finally, the simulation experiments based on randomly generated DAGs are carried out. The results show that the proposed algorithm outperforms the eFRD in terms of the schedulability of backup copies and system reliability.

Analyzing runtime variability of multi-core

processors for safety-critical systems

BIN Jing-yi,DING Xu-yang,QI Xiao-ming,CHEN Hao

2019, 41(09): 1541-1550. doi:

Abstract ( 153 )

PDF (1518KB) ( 295 ) 　　

Review attachment

In order to solve the runtime variability problem caused by the contentions of shared hardware resources in multi-core processors for safety-critical systems, we propose a generalized measurement method based on performance monitor register (PMR) and resource stressing benchmark (RSB). The method captures the hardware events related to the execution time variability to analyze the sharing of hardware resources, the variability of execution time, and the black box or gray box behavior of the hardware platform. The proposed method can not only estimate the performance of hardware platforms, but also the resource consumption of applications, which thus can provide guidance for the worst case execution time (WCET) prediction in a pre-determined co-running context.

A novel sequence alignment method for third-generation

sequencing based on low frequency seeds

SONG Si-yi1,2，CHEN Hao-yu1,2，XU Yun1,2

2019, 41(09): 1551-1556. doi:

Abstract ( 154 )

PDF (692KB) ( 235 ) 　　

Review attachment

With the development of sequencing technology, the third-generation sequencing has been widely used in genetic research. It can generate longer sequences but has a higher error rate. It is difficult to align sequences to the reference genome quickly and accurately. Existing methods utilizes seeds which are subsequences selected from test sequences to speed up the alignment process. However, the seed frequency is not fully considered, which results in a large time consumption in the stage of finding candidate regions. We therefore propose a sequence alignment method for third-generation sequencing based on low frequency seeds. Its key idea is a modified seed-voting strategy, which adopts frequency seeds for voting to reduce the time consumption for counting the votes. Moreover, the alignment method re-filters the candidate regions based on the position and the number of votes, further increasing the speed of alignment. Experimental results show that the method is about 3 times faster than existing methods while ensuring sensitivity and accuracy.

A convolutional neural network accelerator

based on Winograd-Sparse algorithm

XU Rui，MA Sheng，GUO Yang，HUANG You，LI Yi-huang

2019, 41(09): 1557-1566. doi:

Abstract ( 232 )

PDF (996KB) ( 324 ) 　　

Review attachment

As convolutional neural networks are widely used, more researchers pay attention to customized hardware accelerators to solve the problem of complex computation. However, current hardware accelerators mostly use the traditional convolution algorithm. There is lack of support for sparse neural networks, and there is little improvement space for these accelerators from an algorithmic perspective. We redesign a convolutional neural network accelerator based on the Winograd-Sparse algorithm, and we prove that it can effectively reduce the computational complexity of convolutional neural networks and also well adapts to sparse neural networks. A combination of hardware and the algorithm can achieve considerable computational efficiency while reducing hardware resources. Experiments show that compared with the traditional algorithm, our accelerator can improve the computation speed by nearly 4.15 times. From the perspective of multiplier utilization, we have increase the utilization rate by up to 9 times in comparison with other existing designs

A generalized positive-definite and skew-Hermitian

splitting iteration method for saddle point problems

DONG Bei-bei,BAO Liang

2019, 41(09): 1567-1573. doi:

Abstract ( 179 )

PDF (1156KB) ( 280 ) 　　

Review attachment

We propose a generalized positive-definite and skew-Hermitian splitting method to solve the large sparse saddle point problems based on the positive-definite splitting. The method first uses the positive-definite splitting of matrix to construct two splitting forms of the saddle-point matrix. Then, the two splitting formats are used to construct the GPSS iteration. Thirdly, the necessary and sufficient conditions for the iteration convergence are given. Finally, some numerical comparison experiments are carried out and the results show that the GPSS is more effective than "the positive-definite and skew-Hermitian splitting" (PSS) method and "the Hermitian and skew-Hermitian splitting" (HSS) method.

A fair load-balanced dominant

resource allocation algorithm

LIU Zi-xuan，ZHOU Jian-tao

2019, 41(09): 1574-1580. doi:

Abstract ( 369 )

PDF (934KB) ( 382 ) 　　

Review attachment

With the trend of using cloud computing to handle computation problems concurrently and reliably, various cloud computing platforms have emerged. In cloud computing, the fairness of multiple resource scheduling strategies is very important. Dominant resource fairness (DRF) can effectively implement the fairness in multiple resource environments, however, it can introduce uneven cluster load in the process of resource allocation. Therefore, we propose a load-balanced dominant resource fairness allocation algorithm. It uses the DRF algorithm to allocate resources, perform K-means clustering on the slaves according to the resource utilization of each slave in the cluster, and allocate the resources to the tasks according to the clustering results to improve the load balance of the cluster. Simulations of the improved DRF algorithm are implemented on CloudSim4.0. Experimental results show that the proposed load-balanced DRF algorithm can improve the load balancing of the cluster more effective than the original DRF algorithm and the improved DRF algorithm based on analytic hierarchy process (AHP).

A secure medical record storage and

sharing scheme based on dual-blockchain

ZHANG Li-hua1，LAN Fan1，JIANG Pan-pan2，JIANG Teng-fei2

2019, 41(09): 1581-1587. doi:

Abstract ( 304 )

PDF (811KB) ( 541 ) 　　

In current medical informatization construction, the storage and sharing of electronic medical records (EMRs) incurs the risk of privacy leakage of patients, resulting in the loss of reputation and property. Currently, centralized management nodes are used in most medical record storage and sharing schemes for privacy preservation, which is vulnerable to the threat of single-point failure and malicious tampering caused by concentrated attacks. In response to these problems, we design a medical record storage & sharing scheme based on dual-blockchain structure, called electronic medical records share & storage blockchain (EMRSBC). It adopts two consortium chains for medical record storage and sharing, which solves the problems of poor scalability and low throughput in traditional single-chain applications. It also separates the sharing and storage of medical records, and utilizes the decentralization feature of the blockchain and the chain code of the smart contract to achieve access control in an untrusted environment, which prevents the privacy data of patients from leaking during storage and sharing.

A batch message authentication scheme based on

certificateless key insulation in vehicular ad hoc networks

WANG Rui,CAO Su-zhen,WANG Fei,LANG Xiao-li,DU Xia-ling

2019, 41(09): 1588-1596. doi:

Abstract ( 162 )

PDF (538KB) ( 318 ) 　　

In order to solve the security problem of anonymous authentication in vehicle ad hoc networks (VANET), we propose an efficient anonymous authentication scheme. The scheme combines certificateless password system and key insulation technique in vehicle ad hoc networks by updating the key of the road side unit (RSU) and the vehicle user . The leakage of temporary private key in a certain period of time will not affect the forward and backward security, and the security of the scheme is proved under the random prediction model. Finally, the performance analysis results show that the scheme can not only improve the efficiency of anonymous authentication of message signature, but also reduce the operation cost of the whole system, which has good theoretical significance and practical value.

A dynamically updated password

authorization multi-secret sharing scheme

WANG Cai-fen1，2,SU Shun-chang1,YANG Xiao-dong1

2019, 41(09): 1597-1602. doi:

Abstract ( 175 )

PDF (405KB) ( 287 ) 　　

As an important branch of cryptography, secret sharing plays an important role in secret key escrow, secure multi-party computing, missile launching and many other fields. Most of the existing secret sharing schemes are based on the (t, n)-Shamir threshold scheme, whose core idea is that the secret distributor divides the secret s into n shadow secrets and distributes them to the holder by secret polynomial. Any less than t shadow secret cannot get any information of the main secret. However, traditional schemes cannot realize dynamic update of the secret number and password authorization of the secret holder. Based on the traditional Shamir secret sharing scheme and the modular operation over finite fields and the RSA cryptosystem, we propose a verifiable password-authorized multi-secret sharing scheme. In the secret sharing process, it can prevent distributors from deceiving and malicious participants’ attack, and achieve dynamic update of the secret number and password authorization of the secret holder, which makes the scheme more practical.

LSTM modle based early warning of

internet public opinion on food security

A Yong-jun，CHEN Hai-shan

2019, 41(09): 1603-1611. doi:

Abstract ( 199 )

PDF (699KB) ( 356 ) 　　

In recent years, increasing events of internet public opinion on food security have attracted great attention of the Chinese government. The current early warning index system of internet public opinion on food security lacks a comprehensive consideration of theme attributes, propagation and diffusion indexes, and does not consider the inherent characteristics and evolution law of the public opinion in depth. Moreover, as the current early warning model of internet public opinion fails to take the interrelationship between different characteristics of the public opinion into consideration, which leads to the low accuracy of early warning of public opinion. Aiming at the above problems, we construct an index system consisting of five dimensions, including theme attributes and propagation and diffusion indexes. Based on this, we propose a regularization long short term memory (Re-LSTM) model, which uses the regularization method to constrain the input weight of each unit in the network, and replaces the tanh activation function by the softsign activation function. Compared with other classic models, the proposed model can not only improve the accuracy of early warning, but also better avoid the problem of gradient disappearance and overfitting.

A cache replacement algorithm based on

potential energy cooling for content-centric networks

ZHANG Jian-wei1，WANG Xu-hui2，CAI Zeng-yu2，HUANG Wan-wei1，DU Chun-feng2

2019, 41(09): 1612-1617. doi:

Abstract ( 148 )

PDF (860KB) ( 246 ) 　　

In view of the inefficiency problem, of the cache replacement strategy for current content-centric networks, we propose a cache replacement algorithm based on potential energy cooling (PEC-Rep), which combines the “natural coding” phenomenon with the reference of the concept of “potential energy”. The PEC-Rep can evaluate content value according to the access number and time interval. And it deletes the content with the minimum value when it performs content replacement so that the node can maintain maximum value to satisfy the subsequent requests of users. Simulation results show that the PEC-Rep can effectively improve the domain-cache hit ratio, reduce the server load and improve the overall performance of the CCN.

Service composition optimization based on the fireworks

algorithm mixing chaotic mechanism and Levy mutation

#br#

LIU Ting1,2,YANG Qiu-xiang1

2019, 41(09): 1618-1626. doi:

Abstract ( 160 )

PDF (863KB) ( 251 ) 　　

In order to pick out the service composition that meets the complex application requirements of users and the high comprehensive performance in the large-scale Web service environment, we propose an improved fireworks algorithm mixing chaotic mechanism and Levy mutation. Firstly, the chaos theory is used to generate the initial fireworks population and avoid uneven dispersion of individuals which can result in repeated local optimization. Then the Levy mutation operator is introduced to the search process to enhance the global search capability of the algorithm and avoid premature convergence. Finally, the elite selection strategy is adopted to reduce the time expenses of the algorithm in the process of selecting next generation fireworks population. Experimental results verify the optimization performance and stability of the algorithm .

Small target pedestrian detection

based on multi-scale feature fusion

ZHANG Si-yu1,2,ZHANG Yi1,2

2019, 41(09): 1627-1634. doi:

Abstract ( 304 )

PDF (790KB) ( 401 ) 　　

Given the problems of missing detection and detection failure for small targets in the single shot multibox detector (SSD), we propose an hourglass SSD model based on the idea of deconvolution and feature fusion, called hgSSD model. It deconvolutes the conventional SSD feature, which is then combined with shallower features to detect small target pedestrians in complex scenes. In order to preserve shallow network characteristics, ensure real-time detection and save computing resources, we use the VGG-16 instead of the deeper RestNet-101 as the basic network. In order to enhance the detection of small targets, Conv3_3 in VGG16 is improved as the feature layer added into the training. The fused network is more complex than the conventional SSD, but the real-time performance is basically guaranteed. It can successfully detect most of the small targets that are missed by the conventional SSD network, and the network has a higher accuracy than the conventional SSD model. In the case where the default box confidence threshold of 0.3, it basically detects the small targets undetected by the conventional SSD. In VOC 2007+2012, the pedestrian average precision value is increased from 0.765 to 0.83 in comparison with the conventional SSD.

Application of similarity dimension in point cloud

LIU Ni，ZHANG Zhi-yi

2019, 41(09): 1635-1641. doi:

Abstract ( 142 )

PDF (1004KB) ( 291 ) 　　

Shape attribute acquisition is necessary in 3D point cloud registration, segmentation, recognition and other tasks. Traditional shape attributes are sensitive to scale changes, complicated to calculate，and simple to express the geometric meaning. Thus, combined with the concept of similarity dimension in fractal geometry, we propose a dimension definition that can be used as the shape attribute of the point cloud model. Firstly, we obtain a set of points that composed of the k neighborhood of each point in the model, and calculate the radius of the circumscribed sphere of the set of points. Secondly, we calculate the volume and area of the set of points, and solve the scale-sensitive problem by scaling. Finally, the dimension value of each point in the point cloud model is calculated using the similarity dimension expression, and the value is used to represent the shape attribute of the point cloud model. Experimental results show that the similarity dimension has the ability to express the shape and can clearly express the global characteristics of the model.

A quality evaluation method for structural

RAO Yong-ming1,WU Shuai1,2,YAN Feng2,LUO Yue-tong1

2019, 41(09): 1642-1649. doi:

Abstract ( 148 )

PDF (840KB) ( 236 ) 　　

Since chip surface symbols are used to identify chip names, performance, and functional information, their quality evaluation is an important part of chip production. Currently chip surface symbol quality evaluation methods based on machine vision are widely used, and most of them are evaluation methods based on pixel-by-pixel comparison, facing two main problems: (1) a small amount of deformation of the symbols leads to poor quality evaluation results; (2) the internal defect characteristics of the symbols are not considered. We propose an evaluation method for structural defects of printed symbols on chip surface. Firstly, the thin-plate spline interpolation is used to align the symbols to be evaluated with the reference symbols to eliminate the influence of small deformation. At the same time, we propose a deformation formula to determine the larger deformation symbols to be evaluated. Secondly, we propose a defect detection method, define the concept of defect cluster and key position, and obtain the two main influencing factors of quality assessment: the inherent characteristics of defects and whether the location of defects is critical. Finally, we define a reasonable scoring strategy based on the above characteristics. This quality evaluation method is an objective evaluation method, and its advantages are: (1) on the premise of only one reference image, it has good robustness when evaluating the quality of images with deformation; (2) The method focuses on the structure quality evaluation of image symbols rather than that of the image itself, so the evaluation results are objective and conform to human's actual feeling for symbol contents. The experimental data of an actual production line demonstrates the validity of the proposed method.

Depth information estimation based

on multi-constraint optimization

YUAN Hao-xiang,CHEN Shu,LIN Min

2019, 41(09): 1650-1655. doi:

Abstract ( 158 )

PDF (630KB) ( 273 ) 　　

null

An adversarial domain adaptation image

classification algorithm with dual discriminators

XU Hao,GUO Wei-bin

2019, 41(09): 1656-1661. doi:

Abstract ( 170 )

PDF (542KB) ( 290 ) 　　

Since the appearance of generative adversarial networks, adversarial learning has been widely used in various branches of machine learning, which drives new developments. In domain adaptation, adversarial domain adaptation methods use a shared feature extractor to extract the domain invariant representation and a discriminator to
judge. Both reach optimal solution through the iterative update of adversarial learning. In terms of data sources, generative adversarial networks and domain adaptation have two similar domains. In the aspect of objective functions, both try to pursue consistency. We attempt to further enhance the domain adaption performance from a deeper level by taking advantage of the mature framework of generative adversarial networks and analyzing intrinsic similarities between the two based on the theoretical level and logical structure. By analogy, the model employs two discriminators to solve the mode collapse problem existing in previous adversarial domain adaptation algorithms, and utilzies pseudo labels for structural improvement. Finally, experiments on standard domain adaptation tasks confirm the feasibility and effectiveness of the method.

An optic disc positioning method in

fundus images based on YOLO

JIANG Yun，PENG Ting-ting，TAN Ning，HOU Jin-quan

2019, 41(09): 1662-1670. doi:

Abstract ( 297 )

PDF (893KB) ( 411 ) 　　

The parameters of the optic disc are important indicators for measuring the health status and lesions of the fundus. The detection and localization of the optic disc is especially important for observing the shape of the optic disc. Research on optic disc positioning in the past mainly depends on the shape and brightness of the optic disc, and the direction of the fundus blood vessels, and image-processing methods are used to locate the optic disc in fundus images. Due to the influence of human factors, the feature extraction time is long and the optic disc positioning efficiency is low. We propose a method of locating optic disc of the fundus image based on the YOLO algorithm. The YOLO algorithm is used to divide the fundus image into N×N grids, and each grid is responsible for detecting whether the center point of the disc falls into the grid. The multi-scale method and the residual layer are fused with low-level features to locate the disc, and bounding boxes of different sizes are obtained. The bounding box with the highest score is finally selected through non-maximum suppression. We test the proposed localization method on three open databases of fundus images (DRIVE、DRISHTI-GS1 and MESSIDOR). The positioning accuracy is 100%, and the center point coordinates of the optic disc are located in the experiment. The average Euclidean distances to the center points are 22.36 px, 2.52 px, 21.42 px, respectively, which verifies the accuracy and versatility of the method.

A garment retrieval method based on deformable convolution

WANG Zhen,QUAN Hong-yan

2019, 41(09): 1671-1678. doi:

Abstract ( 143 )

PDF (802KB) ( 346 ) 　　

Traditional garment retrieval methods use fixed-shape receptive fields, and they cannot extract features effectively when the garment target has geometric deformation. To solve this problem, we propose a garment retrieval method based on deformable convolution and similarity learning. Firstly, we build a deformable convolutional network which can automatically learn the sampling locations of garment features and the Hash code of garment images. Secondly, a similarity learning network is cascaded to measure the similarity of the Hash code. Finally, we obtain the retrieval results according to similarity scores. Experimental results show that this method can effectively extract the features of garment objects with geometric deformation, thus reducing the impact of image background features and improving the accuracy of the retrieval model.

Current Issue

Author center

Review center

Online journal