Computer Engineering & Science

A gradient-based H.265/HEVC intra prediction hardware acceleration algorithm

LI Wenwu, SUN Shuwei, GUO Yang

2019, 41(4): 575-582. doi:

High Efficiency Video Coding (HEVC), also known as H.265, is the latest video coding standard. Compared with the previous generation of video coding standards, H.265/HEVC significantly improves compression efficiency, but at a much higher computational complexity, especially in intra prediction. To address this issue, we propose an improved, hardware-friendly gradient-based intra prediction acceleration algorithm that skips some intra prediction modes and, for some depth partitions, the intra prediction process itself, thereby reducing computation. The algorithm estimates the texture direction and texture complexity of coding units from gradient information. The texture direction is used to estimate the optimal intra prediction direction of the coding unit, and the texture complexity determines whether to skip intra prediction for the current depth partition. Experimental results show that, compared with the HEVC test model HM16.18, the proposed algorithm reduces encoding time by 60.59%, with only an 8.52% BD-rate increase and a 0.38 dB BD-PSNR decrease.
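The gradient analysis described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' hardware design: the Sobel operator, the 8-bin orientation histogram, the function names `block_gradient_stats` and `should_skip_split`, and the threshold value are all assumptions, and the mapping from a histogram bin to HEVC angular modes is left out.

```python
import math

# Sobel kernels for horizontal (SOBEL_X) and vertical (SOBEL_Y) intensity change.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def block_gradient_stats(block, bins=8):
    """Return (dominant_bin, complexity) for a 2-D list of luma samples.

    dominant_bin indexes the strongest texture orientation in [0, 180) degrees,
    split into `bins` equal sectors. complexity is the mean of |gx| + |gy|
    over the interior pixels of the block.
    """
    h, w = len(block), len(block[0])
    hist = [0.0] * bins
    total = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * block[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * block[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            mag = abs(gx) + abs(gy)  # cheap magnitude, no square root
            if mag == 0:
                continue
            # The texture (edge) direction is perpendicular to the gradient.
            angle = (math.degrees(math.atan2(gy, gx)) + 90.0) % 180.0
            hist[int(angle // (180.0 / bins)) % bins] += mag
            total += mag
    complexity = total / max(1, (h - 2) * (w - 2))
    dominant = max(range(bins), key=hist.__getitem__)
    return dominant, complexity


def should_skip_split(complexity, threshold=8.0):
    """Hypothetical rule: a smooth block is not partitioned further."""
    return complexity < threshold


# A block with a vertical edge: left half dark, right half bright.
edge_block = [[0] * 4 + [255] * 4 for _ in range(8)]
flat_block = [[128] * 8 for _ in range(8)]
edge_bin, edge_complexity = block_gradient_stats(edge_block)
flat_bin, flat_complexity = block_gradient_stats(flat_block)
```

In an encoder, only the candidate modes near the dominant orientation would then be evaluated, and a block whose complexity falls below the threshold would not be split further.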

Parallel optimization for coupling N-S/DSMC algorithm of near-continuum transitional flow region on 100P domestic supercomputing system

XU Jinxiu1, LI Zhonghua2, SUN Jun1, LI Zhihui3, ZHENG Yan1

2019, 41(4): 590-597. doi:

Numerical simulation of the near-continuum transitional flow region is a very challenging field of aerodynamics. We first introduce the Navier-Stokes (N-S) solver and the direct simulation Monte Carlo (DSMC) method, and on this basis propose an N-S/DSMC coupling numerical algorithm by developing the modular particle-continuum (MPC) technique. This extends the application range of both the DSMC method and the N-S method to the transitional flow region. We then describe in detail the multi-level parallel optimization of the coupling algorithm on the 100P supercomputing system at the National Supercomputing Center in Wuxi, where many-core parallelization is realized for the first time. Tests show that the process-level optimization achieves super-linear speedup. Although the many-core optimization does not achieve the expected effect because of the features of the original algorithm and the computer architecture, we discuss and analyze the causes in depth. Our research and analysis provide a reference for many-core parallelization of the coupling N-S/DSMC algorithm and an effective approach to the numerical simulation of hypersonic aerodynamics.

An SpMV partitioning and optimization algorithm on heterogeneous computing platforms

TAN Zhaonian, JI Weixing, AKREM Benatia, GAO Jianhua, LI Anmin, WANG Yizhuo

2019, 41(4): 590-597. doi:

Sparse matrix-vector multiplication (SpMV) is widely used in solving scientific computing and engineering problems. The distribution of the nonzero elements of the sparse matrix greatly affects the computational efficiency of SpMV, and significant performance improvements can be achieved by using specific algorithms for different data distribution patterns. The CPU has strong control capability, which suits general-purpose computing, while the GPU has many cores and a high degree of parallelism, which suits data-intensive computing. Combining the two can yield greater SpMV performance. We study task partitioning and optimization methods for SpMV on a CPU-GPU hybrid architecture and propose a task-based rescheduling algorithm based on SVR for two main data distribution patterns: Quasi-diagonal and Tetris. These two representative sparse matrix patterns account for 66% of the data in practical scientific and engineering applications. Experimental results show that, on heterogeneous platforms, our approach achieves average speedups of 1.74x and 2.15x for Quasi-diagonal and Tetris, respectively, over running on the GPU alone.
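To make the partitioning idea concrete, here is a minimal sketch of SpMV over a matrix in compressed sparse row (CSR) form, with the rows split into two parts by nonzero count. It is an illustration under stated assumptions: the CSR layout, the function names, and the fixed `cpu_share` parameter are not taken from the paper, where the split is driven by an SVR-based prediction, and both parts run in plain Python here rather than on a CPU and a GPU.

```python
def spmv_csr(indptr, indices, data, x, row_start=0, row_end=None):
    """Multiply rows [row_start, row_end) of a CSR matrix by the vector x."""
    if row_end is None:
        row_end = len(indptr) - 1
    y = []
    for row in range(row_start, row_end):
        acc = 0.0
        # The nonzeros of this row occupy data[indptr[row]:indptr[row + 1]].
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y.append(acc)
    return y


def split_row_by_nnz(indptr, cpu_share):
    """Return the first row boundary at which the leading part holds
    at least cpu_share of all nonzeros (hypothetical partitioning rule)."""
    target = cpu_share * indptr[-1]
    for row, end in enumerate(indptr):
        if end >= target:
            return row
    return len(indptr) - 1


# 3x3 example matrix: [[4, 0, 0], [0, 5, 1], [2, 0, 3]]
indptr = [0, 1, 3, 5]
indices = [0, 1, 2, 0, 2]
data = [4.0, 5.0, 1.0, 2.0, 3.0]
x = [1.0, 2.0, 3.0]

boundary = split_row_by_nnz(indptr, cpu_share=0.5)
part_a = spmv_csr(indptr, indices, data, x, 0, boundary)      # "CPU" rows
part_b = spmv_csr(indptr, indices, data, x, boundary, None)   # "GPU" rows
y = part_a + part_b
```

Splitting by nonzero count rather than by row count keeps the work balanced when rows differ greatly in density, which is the situation the two matrix patterns in the abstract describe.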

A service-oriented dynamic collaboration method between cloud and edge

CAO Yunmeng1,2, ZHOU Shengjun3, LIU Chen1,2, HAN Yanbo1,2

2019, 41(4): 598-605. doi:

Edge computing can improve the processing quality of big IoT stream data and reduce network operating cost by moving computation onto edge devices. However, there are two challenges in integrating cloud and edge computing for big stream data. Firstly, edge devices usually have very limited computing and storage capabilities and cannot support real-time processing of big stream data on their own. Secondly, the unpredictability of stream data leads to constant changes in edge-side collaboration. It is therefore necessary to divide work flexibly between edge services and cloud services. We propose a service-based approach that seamlessly integrates cloud and edge devices so that cloud computing and edge computing can collaborate on large-scale stream data. This approach divides a cloud service into two parts that run on the cloud and on the edge, respectively. We also propose a dynamic service scheduling mechanism based on improved bipartite graphs, which deploys the cloud service on an edge node at the appropriate time during event generation. The effectiveness of the proposed approach is demonstrated on real cases from China's State Power Grid. Experimental results verify the effectiveness and efficiency of our approach.

An SoC trusted startup framework based on trusted cryptographic module

WANG Xiji, ZHANG Gongxuan, GUO Ziheng

2019, 41(4): 606-611. doi:

We design an SoC trusted startup framework based on a trusted cryptographic module to satisfy the requirement for information security on embedded terminals. The framework partitions the boot program U-Boot by function and stores the parts in different nonvolatile memories. In addition, we add communication modules that enable U-Boot to transmit and receive files before OS startup. Trusted entities, including the parts of U-Boot and the OS core files, are transmitted to the trusted cryptographic module for integrity measurement. If they pass the measurement, the trusted cryptographic module sends back a signal to start the next phase and stores the trusted entities in its local memory. Otherwise, no initialization signal is sent. Experimental results show that the proposed framework is feasible and effective, and that it satisfies the requirement for information security on embedded terminals.
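The measure-then-start flow in this abstract can be sketched in a few lines. This is a hedged illustration only: SHA-256 from Python's `hashlib` stands in for whatever hash the trusted cryptographic module actually uses (the abstract does not say), and the stage names, the reference-digest table, and the function `trusted_boot` are hypothetical.

```python
import hashlib


def measure(image: bytes) -> str:
    """Integrity measurement of one boot entity (SHA-256 as a stand-in hash)."""
    return hashlib.sha256(image).hexdigest()


def trusted_boot(stages, reference_digests):
    """Measure each stage in order and stop at the first mismatch.

    stages: list of (name, image_bytes) in boot order.
    reference_digests: dict mapping name to the expected digest.
    Returns the names of the stages that were allowed to start.
    """
    started = []
    for name, image in stages:
        if measure(image) != reference_digests.get(name):
            break  # no start signal is sent, so later stages never run
        started.append(name)
    return started


# Hypothetical boot chain with known-good reference digests.
good_chain = [("uboot_part1", b"stage-1"), ("uboot_part2", b"stage-2"),
              ("os_kernel", b"kernel-image")]
references = {name: measure(image) for name, image in good_chain}

# Same chain with the second stage modified.
tampered_chain = [good_chain[0], ("uboot_part2", b"stage-2-modified"),
                  good_chain[2]]

clean_result = trusted_boot(good_chain, references)
tampered_result = trusted_boot(tampered_chain, references)
```

Stopping at the first failed measurement is what makes the chain trustworthy: a later stage is only ever started by an earlier stage that has itself been verified.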

Crosstalk simulation and analysis of DDR4 parallel interconnection

LI Chuan, WANG Yanhui, ZHENG Hao

2019, 41(4): 612-617. doi:

Due to everincreasing demand for memory bandwidth, memory access rate and interconnect density become higher and higher. DDR4, as a very popular and fast parallel interconnect technology in main memory, features 100ps level of signal rise/fall time, which brings noticeable crosstalk issue between signals. Thus we design a trilinear disturbance model based on a certain DDR driver model and its board-level embedded application, and respectively simulate the effect of four factors on striline transmission crosstalk from the time domain angle, including line space, disturbing source phase, date rate, and coupling transmission line length. The results show that the crosstalk is close to 0 mV when the line pace reaches 5 times of dielectric thickness and different disturbing source phases cause double two-dimensional difference in total crosstalk. For a certain data rate, a periodic relationship between coupling transmission line length and extreme value of crosstalk is revealed. Utilizing this relationship to design reasonable line length for DDR data groups, crosstalk maximum value is avoidable.

Hierarchical navigable small world graph algorithms based on quantization coding

LI Qiuzhen1, BAI Xingqiang2, LI Lixia1, WANG Ying1

2019, 41(4): 618-625. doi:

With the rapid development of big data and artificial intelligence, structured processing and content-based retrieval of multimedia data have received growing attention. Given the massive number of high-dimensional feature vectors produced by structuring multimedia data, achieving fast and accurate search is a problem that artificial intelligence must solve when dealing with large-scale data. The recently proposed hierarchical navigable small world (HNSW) graph retrieval algorithm achieves the best performance on a number of public data sets, but it suffers from a large memory overhead. Retrieval algorithms based on quantization coding can greatly compress the vectors of a data set and thus greatly reduce memory usage. Combining quantization coding with the HNSW graph algorithm, we propose two improved algorithms: HNSWSQ, which uses scalar-quantized vectors, and HNSWPQ, which uses product-quantized vectors. The two algorithms adopt different quantization strategies to encode and store the original vectors, which reduces memory overhead, and then use the HNSW algorithm to build an index, which reduces retrieval time. On multiple data sets, the HNSWSQ algorithm has a recall rate and average retrieval time similar to those of the HNSW algorithm, with a greatly reduced memory overhead. Experimental results show that, compared with the HNSW algorithm, the memory overhead of the HNSWSQ algorithm on the SIFT-1M and GIST-1M data sets is reduced by 45.1% and 70.4%, respectively.
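As a rough illustration of the scalar quantization step, the sketch below maps each vector component to one byte using a per-dimension minimum and maximum. The 8-bit code width, the min/max training rule, and the function names are assumptions made for this example; the paper's HNSWSQ encoding and its integration with the HNSW index are not reproduced here.

```python
def sq_train(vectors):
    """Learn the per-dimension value range from a sample of vectors."""
    dims = range(len(vectors[0]))
    lo = [min(v[d] for v in vectors) for d in dims]
    hi = [max(v[d] for v in vectors) for d in dims]
    return lo, hi


def sq_encode(vec, lo, hi):
    """Quantize each component to an 8-bit code in [0, 255]."""
    codes = []
    for value, a, b in zip(vec, lo, hi):
        span = b - a
        q = 0 if span == 0 else round((value - a) / span * 255)
        codes.append(min(255, max(0, q)))  # clamp values outside the range
    return bytes(codes)


def sq_decode(codes, lo, hi):
    """Reconstruct an approximate vector from its 8-bit codes."""
    return [a + c / 255 * (b - a) for c, a, b in zip(codes, lo, hi)]


sample = [[0.0, 10.0], [1.0, 20.0], [0.5, 15.0]]
lo, hi = sq_train(sample)
codes = sq_encode([0.5, 15.0], lo, hi)
approx = sq_decode(codes, lo, hi)
```

Each component then occupies one byte instead of the four bytes of a 32-bit float, at the cost of a reconstruction error of at most about one quantization step per dimension.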

A secure electronic transaction protocol with privacy protection

CAO Suzhen，WANG Fei，LANG Xiaoli

2019, 41(4): 626-632. doi:

While the popularization of electronic transactions brings convenience to users, it also exposes privacy protection and security problems in transaction payment to varying degrees. To address this problem, we propose a secure electronic transaction protocol. The optimized signcryption algorithm in the protocol secures the electronic transaction. Meanwhile, payment service providers are able to de-anonymize users, so that responsibility can be traced while user privacy is otherwise protected. Performance analysis shows that the proposed protocol improves communication performance, meets the need for confidentiality and non-repudiation of information, and guarantees the anonymity and traceability of buyers and the fairness of electronic transactions.

An improved identity authentication scheme of dynamic ID multi-factor remote users

CAO Shouqi, SUN Qing, CAO Liling

2019, 41(4): 633-640. doi:

A remote user authentication scheme based on dynamic ID can achieve mutual authentication between remote users and the server while ensuring the anonymity and untraceability of the remote user's identity. We analyze the security of Yang's remote user authentication scheme based on dynamic ID and point out that it cannot resist replay attacks, server masquerade attacks, or user masquerade attacks, and does not provide mutual authentication. To remedy these defects, we propose a multi-factor remote user authentication scheme and analyze its security and efficiency. The analysis results show that the improved scheme overcomes the shortcomings of Yang's scheme and has higher security performance.

Multi-source data privacy protection based on transfer learning

FU Yuxiang1, QIN Yongbin1,2, SHEN Guowei1,2

2019, 41(4): 641-648. doi:

Multisource data analysis with privacy protection is a research hotspot in big data analysis. Learning classifiers from multiparty privacy data has important applications. We propose a twostage privacy protection analyzer model. Firstly, we use the PATET model with privacy protection to train the classifier for private data. Then we gather the multi-party classifier, and use transfer learning to transfer the set knowledge to the global classifier to establish an accurate global classifier with differential privacy. The global classifier does not need to access any party’s private data. Experimental results show that the global classifier can not only interpret each local classifier well, but also protect the details of the privacy training data of all parties.

Energy internet data protection based on attribute based hidden access strategy

LIU Peng1,2, HE Qian1,2, LI Shuangfu1,2, XU Hong1,2

2019, 41(4): 649-656. doi:

The communication data of entities in different security domains of the energy internet contains sensitive information. The ciphertext-policy attribute-based encryption (CP-ABE) scheme can achieve fine-grained protection. However, decryption in the traditional CP-ABE scheme is complicated, and revoking an attribute requires a complete update of the entire ciphertext. Besides, its access policy is prone to leaking private information, which limits its application in the energy internet. To solve these problems, and building on secure data sharing in energy internet cloud storage, we propose an energy internet data protection scheme based on a hidden access strategy. The access strategy supports arbitrary thresholds or Boolean expressions, and the attributes in the access strategy are obscured to realize policy hiding. The scheme introduces a decryption agent to outsource the main part of the high-complexity attribute-based decryption process to the server, thus reducing the decryption overhead at the receiving end. The attribute revocation process involves only the attribute authority and the decryption agent, which makes the process less difficult. Comparative experiments show that the decryption performance of this scheme is greatly improved.

An image segmentation algorithm using variable precision least square rough entropy

SHE Zhiyong, DUAN Chao, ZHANG Lei

2019, 41(4): 657-664. doi:

Image processing is an important way to obtain information and is widely used in fields such as military, medicine and transportation. Image segmentation plays an important role in image processing. To handle the uncertainty in the segmentation process and obtain more accurate segmentation results, we propose a single-threshold image segmentation algorithm based on variable precision least square rough entropy and particle swarm optimization. It uses the variable precision rough set to represent the image, the variable precision least square rough entropy to solve for the optimal segmentation threshold, and particle swarm optimization to improve segmentation efficiency. Experimental results show that the single-threshold segmentation algorithm is superior to the maximum average entropy method and demonstrate that variable precision rough entropy can resolve the uncertainty problem in the image segmentation process.

Improved pedestrian re-identification based on CNN

XIONG Wei1,2, FENG Chuan1, XIONG Zijie1, WANG Juan1,2, LIU Min1,2, ZENG Chunyan1,2

2019, 41(4): 665-672. doi:

For the lack of training samples in pedestrian reidentification (reID) research, we propose a pedestrian re-ID method based on convolutional neural network (CNN) to improve the recognition accuracy and generalization ability. Firstly, we employ the unsupervised learning method for the generative adversarial network to generate unlabeled images, so the training data set is expanded. Secondly, the original data set is collaborated to perform semi-supervised CNN training, and a Siamese network is constructed to perform training according to the features of the identification model and the verification model. Finally, the unlabeled image category distribution method is introduced, and the cross entropy loss is calculated to perform similarity measurement. Experiments on the Market1501, CUHK03, and DukeMTMC-reID datasets show that the proposed method has a nearly 3% to 5% improvement in performance indicators such as rank1 and mAP in comparison with the original Siamese method. The proposed method has certain application value in small sample scenarios.pedestrian re-identification;convolutional neural network (CNN);generative adversarial network (GAN);cross entropy;Siamese

A corner matching algorithm based on image sharpness

XING Caiyan,ZHANG Zhiyi,HU Shaojun,GENG Nan

2019, 41(4): 665-672. doi:

Matching the key points of the same object across images shot from multiple angles is crucial to 3D image reconstruction. We propose a new corner matching algorithm based on image sharpness to obtain precise corner matching pairs. The algorithm has three steps. Firstly, coarse edge information is extracted from the image with the Canny operator, and, to minimize noise interference, the 8-neighborhood contour tracking algorithm then tracks the edge points to obtain the edge contour. Secondly, the sharpness of the contour lines is calculated to find the key corner points in the image. Thirdly, one-to-many matching relationships between corner points in different images are established through the zero-mean normalized cross-correlation method, which gives a coarse matching of point pairs; an optimized relaxation iteration method then reduces the number of iterations and yields a one-to-one precise matching of point pairs. Experimental results show that the proposed algorithm improves the running efficiency and accuracy of corner matching, thus realizing precise matching of point pairs.
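The coarse matching step relies on zero-mean normalized cross-correlation (ZNCC), which can be sketched as below. The ZNCC formula itself is standard; the patch representation, the function `coarse_matches`, and the 0.8 threshold are assumptions for this example, and the relaxation iteration that refines the result to one-to-one pairs is not shown.

```python
import math


def zncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation of two equally sized patches.

    Returns a value in [-1, 1]; 0.0 is returned if either patch is constant.
    """
    a = [p for row in patch_a for p in row]
    b = [p for row in patch_b for p in row]
    if len(a) != len(b):
        raise ValueError("patches must have the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [p - mean_a for p in a]
    db = [p - mean_b for p in b]
    denom = math.sqrt(sum(d * d for d in da) * sum(d * d for d in db))
    if denom == 0:
        return 0.0
    return sum(u * v for u, v in zip(da, db)) / denom


def coarse_matches(corners_a, corners_b, threshold=0.8):
    """One-to-many candidates: for each corner in image A, list every corner
    in image B whose surrounding patch correlates above the threshold."""
    return {
        id_a: [id_b for id_b, patch_b in corners_b.items()
               if zncc(patch_a, patch_b) >= threshold]
        for id_a, patch_a in corners_a.items()
    }


patch = [[1, 2], [3, 4]]
brighter = [[12, 14], [16, 18]]   # same pattern, different gain and offset
inverted = [[4, 3], [2, 1]]
flat = [[7, 7], [7, 7]]

candidates = coarse_matches({"a0": patch},
                            {"b0": brighter, "b1": inverted, "b2": flat})
```

Because the mean is subtracted and the result is normalized, the score is unaffected by uniform brightness and contrast changes between the two shots, which is why the brighter patch still matches.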

A neighborhood search artificial bee colony algorithm improved by dynamic adjustment factors

CHANG Xiaogang1, ZHAO Hongxing2

2019, 41(4): 673-681. doi:

The artificial bee colony (ABC) algorithm attracts wide attention because of its simple process, small number of control parameters, strong global convergence ability and easy implementation. However, it has several disadvantages, such as low convergence precision and slow convergence speed. Inspired by the rules of the optimal biological individual in a neighborhood, we propose a neighborhood search artificial bee colony (NABC) algorithm to address these issues. It improves the searching speed of the population by searching for food sources around the best food source in the neighborhood. Moreover, to dynamically adjust the search process, we also propose a dynamic neighborhood search ABC (DNABC) algorithm based on trigonometric function regulatory factors, which makes the algorithm focus on global search in the early stage of exploration and on depth search in the late stage. Experiments on 12 benchmark functions show that the NABC algorithm has high convergence precision and fast convergence speed during function optimization, and that the trigonometric adjustment factors further improve the NABC algorithm.

ZHANG Bing1,DONG Xiaoxiong2,LI Wen1,MENG Xiangfei1,LI Chao1

2019, 41(4): 692-698. doi:

Gas concentration prediction based on neural network with random hidden weight

ZHANG Yiwen1, GUO Haishuai1, TU Hui2, YU Guofeng2

2019, 41(4): 699-707. doi:

Safe production in coal mines has always been a key research subject. Among the many safety accidents in coal mining, gas accidents account for the majority. Predicting gas concentration on underground production lines accurately and in real time, and thereby anticipating whether the production environment is in a safe state, is critical for the safety of coal mines. To address this problem, we propose a gas concentration prediction method based on the random hidden layer neural network trained by NSGA-II (BNSGA-II NN). On the one hand, NSGA-II requires few parameters to be set and is convenient to use. On the other hand, the crossover and mutation mechanism in NSGA-II avoids falling into a local optimal solution, a problem of traditional methods. To demonstrate the prediction quality of the neural network with random hidden weights trained by NSGA-II, we compare the BNSGA-II NN with the PSOGSA NN through experiments. Experimental results show that the prediction performance of the BNSGA-II NN is significantly better than that of the PSOGSA NN.

Topic detection based on graph analytical method and cosine similarity

MA Changlin, CHENG Mengli, WANG Tao

2019, 41(4): 708-712. doi:

Automatically extracting valuable topic information from massive texts has become an important technical challenge. Currently, most methods assume that topics are independent. However, there are complicated inherent relationships between topics. To solve this problem, we combine the correlated theory with an improved graph analytical approach to model topic detection based on topic correlation and term co-occurrence. Highly accurate semantic information and potential co-occurrence relationships are considered simultaneously to discover important and meaningful topics and trends. Simulation results verify the validity of the proposed model.
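Two building blocks named in this entry, cosine similarity and term co-occurrence, can be sketched as follows. This is a generic illustration: the bag-of-words term counts, the document-level co-occurrence window, and the function names are assumptions for the example, and the paper's correlated topic modeling is not reproduced.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_similarity(terms_a, terms_b):
    """Cosine of the angle between two term-count vectors."""
    ca, cb = Counter(terms_a), Counter(terms_b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def cooccurrence_graph(documents):
    """Edge weight = number of documents in which both terms appear."""
    edges = Counter()
    for doc in documents:
        for pair in combinations(sorted(set(doc)), 2):
            edges[pair] += 1
    return edges


docs = [["grid", "power", "outage"],
        ["grid", "power", "price"],
        ["match", "goal"]]
edges = cooccurrence_graph(docs)
sim_related = cosine_similarity(docs[0], docs[1])
sim_unrelated = cosine_similarity(docs[0], docs[2])
```

Heavily weighted edges mark terms that tend to appear together, and documents or term groups with high cosine similarity are candidates for the same topic.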

Military named entity recognition based on bidirectional LSTM

LI Jianlong，WANG Panqing，HAN Qiyu

2019, 41(4): 711-718. doi:

To reduce the large amount of manual feature engineering that traditional named entity recognition requires, we obtain distributed vector representations of a military domain corpus through unsupervised training and use the bidirectional LSTM (BLSTM) recurrent neural network model to solve the problem of identifying named entities in the military field. The BLSTM recurrent neural network model is extended and improved by adding word-binding input vectors and an attention mechanism to enhance the recognition of named entities in the military field. Experimental results show that the proposed method can identify named entities in the military field, and the F-value on the test set corpus reaches 87.38%.

An evaluation method for multiclass classification SVM structure based on IG ratio of classification attributes

LI Jundi, ZHANG Zhengjun, ZHUANG Lichun, ZHANG Naijin

2019, 41(4): 719-726. doi:

The multiclass classification SVM based on the combination of binary tree structures requires only a small number of binary SVMs and can avoid the occurrence of inseparable and repellent regions. Since the combination methods of multiclass classification SVM based on binary tree structure lack specific evaluation criteria for category combination, we propose a multiclass classification SVM structure evaluation method based on the information gain (IG) ratio of classification attributes. We define the IG ratio of classification attributes and divide the classes into a left category and a right category. For each possible combination of categories, we calculate the IG ratio of each variable with respect to the classification attribute and take the maximum value of the IG ratio as the evaluation criterion of this combination. Empirical analysis on the Iris data set in the UCI database shows that the proposed method achieves a high recognition rate for multiclass classification SVM when the maximum value is taken as the evaluation criterion.
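One plausible reading of the evaluation criterion is sketched below: for a candidate split of the classes into a left and a right group, compute the information gain ratio of every feature with respect to the left/right label and keep the maximum. The gain ratio formula is the standard one; the function names, the use of already discretized features, and the toy data are assumptions, and the paper's exact definition may differ.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(feature_values, labels):
    """Information gain of the feature divided by its split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(feature_values)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return (entropy(labels) - conditional) / split_info


def score_grouping(rows, classes, left):
    """Score a left/right class grouping by its best feature's gain ratio.

    rows: list of discretized feature tuples, one per sample.
    classes: original class label of each sample.
    left: set of class labels assigned to the left subtree.
    """
    side = ["L" if c in left else "R" for c in classes]
    return max(gain_ratio([r[j] for r in rows], side)
               for j in range(len(rows[0])))


# Toy data: feature 0 separates {a, b} from {c}; feature 1 is uninformative.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
classes = ["a", "b", "c", "c"]
score_ab = score_grouping(rows, classes, left={"a", "b"})
score_a = score_grouping(rows, classes, left={"a"})
```

The grouping with the highest score would be placed at the current node of the binary tree. Continuous attributes such as those in Iris would need to be discretized first, which this sketch does not do.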
