Computer Engineering & Science

State-of-the-art analysis of China HPC 2016

YUAN Guoxing1,YAO Jifeng2

2016, 38(12): 2375-2380. doi:

Based on the latest China HPC TOP100 list, released by the SAMSS in late October, we present the overall performance trends of the China HPC TOP100 and TOP10 for 2016, and analyze in detail the characteristics of performance, manufacturers and application areas.

A highly efficient E-Flash accelerator based on pre-fetch and cache principles

JIANG Jinsong1,HUANG Kai1,CHEN Chen2,WANG Yubo3,YAN Xiaolang1

2016, 38(12): 2381-2391. doi:

Based on prefetch and cache principles, we realize a Flash acceleration controller which is used to improve the efficiency of Flash memory in different embedded applications. The controller contains two acceleration schemes: prefetch cache and highspeed cache. The prefetchcache method uses the data width extension and prefetch technology to accelerate the access to sequence instructions, and uses the branchbuffer technology to reduce the missing penalty caused by the branch instruction. The highspeed cache method uses the setassociative and waypredict technology to improve instruction reuse and reduce the Flash access frequency and power consumption. The two acceleration methods can not only be selected statically by configuring related registers, but also be switched dynamically by the software flow. Several benchmark results prove the feasibility and efficiency of the proposed Flash acceleration controller in terms of performance and power optimization.

Design and implementation of a routing & switching array node chip and its system

QIN Jilong1,LI Qinghua2,WANG Endong1,GONG Weifeng1,ZHANG Feng1,NIU Yun3,WU Liji3,ZHANG Xiangmin3

2016, 38(12): 2392-2401. doi:

We introduce a routing & switching array system for networks that combine computing, control and storage, including the design and implementation of its node controller chip and a prototype. The system uses software-defined networking (SDN) over the Internet to provide integrated programmable control of the internal routing control and security modules of a high-speed secure switching network, so as to meet the data center's demand for data transmission bandwidth. It also alleviates the network transmission bottleneck in parallel computing, which avoids the waste caused by long-term occupation of data center network resources and lays a foundation for next-generation data center solutions. In addition, we briefly describe the current state of research on its application in financial transaction systems.

A method for ensuring data confidentiality in cloud storage

REN Jingsi1,2,WANG Jinlin1,CHEN Xiao1,YE Xiaozhou1

2016, 38(12): 2402-2408. doi:

The most common way to ensure the confidentiality of users' data is to encrypt the data stored in the cloud. We propose a new method for ensuring data confidentiality in cloud storage with the following properties: (1) an encryption system, deployed in front of the cloud storage servers, processes user data between clients and servers; (2) user data is encrypted in real time: it is encrypted while being uploaded and decrypted while being downloaded; (3) the encryption system is transparent to both clients and cloud servers. Widely used HTTP-based cloud storage systems, such as Amazon S3 and OpenStack Swift, can use this method directly. Test results show that the method effectively offloads the burden of data encryption and decryption without reducing throughput.

A data flow programming model and compiler optimization for Storm

YANG Qiuji，YU Junqing，MO Binsheng,HE Yunfeng

2016, 38(12): 2409-2418. doi:

As a domain-specific programming model, data flow programming combines the features of media applications and programming languages and offers an attractive way to express parallelism. However, the hierarchical storage structure of multicore cluster architectures poses new challenges to the performance of data flow applications, and programmability remains a significant challenge for the compiler. To address the problems that the data flow programming model faces in processing big data in the digital media field, we design and implement an integration of a data flow programming model with a distributed computing framework, and propose a compiler optimization framework for Storm based on COStream. The compiler optimization for Storm consists of two steps: hierarchical task partitioning and scheduling, and pipeline scheduling and code generation. Hierarchical task partitioning and scheduling assigns tasks to the multicore nodes of the cluster, balancing the workload across cores while keeping inter-cluster communication overhead small. Pipeline scheduling and code generation builds software pipelines between cluster nodes and between the cores within a node, and generates the corresponding object code. We build a Storm distributed architecture on a multicore cluster as the target platform, choose typical digital media processing programs as benchmarks, and evaluate and analyze the optimization performance for Storm. Experimental results verify the effectiveness of the proposed model.

An energyandperformanceaware virtual machine

placement optimization algorithm in cloud computing

FANG Bingwu1,2,HUANG Zhiqiu1

2016, 38(12): 2419-2424. doi:

Optimizing virtual machine (VM) placement is an important way to reduce energy consumption in data centers. Most existing VM placement algorithms reduce energy consumption significantly, but excessive VM migration and consolidation cause considerable degradation of system performance. To solve this problem, we first build an optimization model of VM placement and then propose a two-phase iterative heuristic algorithm to solve it. The first phase uses a VM placement optimization algorithm, based on the first-fit-decreasing bin-packing algorithm, to minimize the number of hosts. The second phase uses a VM live migration selection algorithm to minimize the number of VM migrations. Experimental results show that the proposed algorithms effectively reduce energy consumption, with a lower service level agreement (SLA) violation rate and better time performance.
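
The first-fit-decreasing heuristic that the first phase builds on can be sketched as follows. This is the textbook bin-packing routine only, under the assumption of a single resource dimension and identical hosts; the function name and the sample demands are hypothetical, and the paper's own placement and migration-selection algorithms are not reproduced here.

```python
def first_fit_decreasing(vm_demands, host_capacity):
    """Pack VM demands onto as few identical hosts as the heuristic finds.

    Returns a list of hosts, each a list of the demands placed on it.
    """
    hosts = []
    # Place the largest VMs first, each on the first host with enough room.
    for demand in sorted(vm_demands, reverse=True):
        if demand > host_capacity:
            raise ValueError(f"demand {demand} exceeds host capacity")
        for host in hosts:
            if sum(host) + demand <= host_capacity:
                host.append(demand)
                break
        else:
            # No existing host fits, so power on a new one.
            hosts.append([demand])
    return hosts


if __name__ == "__main__":
    placement = first_fit_decreasing([4, 8, 1, 4, 2, 1], host_capacity=10)
    print(len(placement), "hosts:", placement)
```

Fewer active hosts means fewer machines to power, which is the energy argument in the abstract; the second phase then limits how many running VMs must actually move to reach such a packing.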

MapReduce training of BP neural networks based on local weight matrix evolution

CHEN Wanghu,YU Maoyi,MA Shengjun,LI Jingrong,JIA Wenbo

2016, 38(12): 2425-2433. doi:

To improve the efficiency of BP neural network (BPNN) training, we propose a MapReduce approach based on local weight matrix evolution. The local weight matrices that the Map tasks produce from their input data splits are passed to the Reduce tasks as the initial population of a genetic algorithm. After the current population evolves, the weight matrix with the highest fitness is selected as the initial weight matrix for the next round of training. Training continues until the weight matrix converges on the whole sample data. Experiments show that the proposed approach maintains the global convergence of a BPNN on its whole training sample data and improves the efficiency of BPNN training with MapReduce.

MRI: A MapReduce model for parallel iteration

MA Zhiqiang,ZHANG Li,YANG Shuangtao

2016, 38(12): 2434-2441. doi:

The MapReduce model has not been widely used for iterative computation because it handles iteration poorly. However, most machine learning algorithms rely on iterative computation to obtain optimal parameters. We propose and implement MRI, a parallel iterative model based on MapReduce for solving for optimal parameters. MRI adds an iterate phase to MapReduce to update and distribute parameters and to control iteration. We then modify the MapReduce state machine to reuse node tasks and avoid unnecessary performance overhead. To speed up the iterative process, MRI also caches data blocks on the task nodes and implements a memory-based block caching mechanism on the Map nodes. Experimental results on the gradient descent algorithm show that the proposed MRI model outperforms MapReduce.
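
The map, reduce and iterate phases described above can be mimicked in a single process. The sketch below is an assumption-laden illustration, not the MRI implementation: the function names, the one-parameter model y = w * x, the learning rate and the sample blocks are all hypothetical. It shows data blocks that stay in memory across iterations while only the parameter is updated and redistributed.

```python
def map_phase(block, w):
    """Per-block partial gradient of the squared error for y = w * x."""
    grad = sum((w * x - y) * x for x, y in block)
    return grad, len(block)


def reduce_phase(partials):
    """Combine partial gradients into the mean gradient over all samples."""
    total_grad = sum(g for g, _ in partials)
    count = sum(n for _, n in partials)
    return total_grad / count


def iterate_phase(w, grad, lr, tol):
    """Update the parameter and decide whether iteration should stop."""
    new_w = w - lr * grad
    return new_w, abs(new_w - w) < tol


def run_mri(blocks, w=0.0, lr=0.1, tol=1e-9, max_iters=1000):
    # `blocks` stays in memory for every iteration, standing in for the
    # cached data blocks on the Map nodes.
    for _ in range(max_iters):
        partials = [map_phase(block, w) for block in blocks]
        grad = reduce_phase(partials)
        w, converged = iterate_phase(w, grad, lr, tol)
        if converged:
            break
    return w


if __name__ == "__main__":
    blocks = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
    print(run_mri(blocks))
```

With plain MapReduce each pass of the loop would be a separate job that re-reads its input; keeping the loop inside one job is the point of the added iterate phase.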

Cross-media composition methods

ZHANG Jibo，XUE Jinyun，WANG Xiong，XIA Jing，XIONG Xiaozhou

2016, 38(12): 2442-2449. doi:

Crossmedia calculation technology based on multimedia calculation has become a hotspot of information technology research in the era of big data. However, we are short of effective ways for crossmedia acquisition, composition and applications. We present two different crossmedia composition methods, and exploit the course of crossmedia traversing binary trees. The first method utilize the web service model which utilizes multiple multimedia data and joins them as a service composition, thus constituting crossmedia services. The second one based on the "New SQL" technology and Apla language program developed by our own PAR design platform, utilizes the multimedia database technology of PAR platform to implement the storage and retrieval of crossmedia, and generalizes the Apla language program. Thus, the multimedia service deployed in cloud can be realized, and the problem of crossmedia composition and storage is efficiently solved.

A parallel solution to mass remote sensing data classification and application

ZHAI Hao1,YUAN Zhanliang1,HUANG Xiangzhi2,3,ZANG Wenqian2,ZHANG Zhouwei2,ZHOU Ke2,4

2016, 38(12): 2450-2455. doi:

As the volume of remote sensing data grows massively, fast image classification and information mining in applications, together with a higher level of operational processing, have become an important research direction. To address this problem, we propose an efficient solution. First, based on the "five-layer fifteen-level" data structure, we segment images scene by scene and build an image data organization system based on image slices. Second, we use big data storage technology to store the slices in a distributed manner across a cluster. Third, we adopt pixel-based and object-based supervised classification algorithms as the processing algorithms, and adapt the parallel architecture and driving mechanism to the computational requirements of the cluster environment. Finally, we implement the solution and carry out experiments on GF-2 multispectral slices. The results show that the proposed solution improves the efficiency of classification while maintaining accuracy.

An encryption method based on iSCSI network storage system

MENG Xianghui1,2,ZENG Xuewen2,CHEN Xiao2,YE Xiaozhou2

2016, 38(12): 2456-2462. doi:

Because the iSCSI protocol does not provide security services and most network storage systems lack encryption capabilities, we propose a real-time encryption module for iSCSI; once the module is loaded, the network storage system can provide users with transparent real-time encryption services. We design an encrypted-write and decrypted-read process for the iSCSI target. Since the encryption module is independent of the original network storage system, the operating system kernel does not need to be changed. The iSCSI initiator is unaware of the encryption, so clients based on the standard iSCSI protocol can use the service directly. In addition, we use the security coprocessor of a multicore network processor to optimize read and write performance. Experimental results show that adding the encryption module to a network storage system does not cause serious performance loss, and the system performance is satisfactory.

Influence of CVSS environmental metrics on system security

ZHOU Shiyang，FU Li

2016, 38(12): 2463-2470. doi:

The Common Vulnerability Scoring System (CVSS) evaluates the threat posed by the vulnerabilities of a particular system at three levels, and the final environmental score reflects the system's degree of security. Among the CVSS metrics, the environmental metrics are the only ones that depend on the conditions of the target organization or system, so obtaining their values is the key and most difficult step for users implementing security risk management and control strategies. To address this issue, we study the influence of the environmental metrics on the final CVSS environmental score, and give an overall estimate of the influence of the environmental metrics vector on the score, as well as formulas for the influence of each vector component. Experimental results show that the new estimation method improves accuracy for both the overall impact of the environmental metrics and the influence of each sub-index on the CVSS environmental score, bringing it within the fully accepted range of the de facto standard.

A multilevel P2P traffic classification method

CHEN Yuan,LIN Haitao

2016, 38(12): 2471-2477. doi:

P2P traffic classification is very important for network management and network security. Because P2P traffic has become increasingly diverse, no single traditional classification method can classify it accurately. After analyzing the current state of P2P traffic classification methods, we propose a multilevel P2P traffic classification method that builds on the advantages of existing methods. It is composed of four P2P traffic modules, and the division of labor and the feedback mechanism between the modules improve classification performance. Experimental results verify its accuracy and efficiency.

Design and implementation of network fault tolerance on embedded devices

FAN Hao1,2,DENG Haojiang1,CHEN Jun1,LI Mingzhe1,2

2016, 38(12): 2478-2482. doi:

To ensure the service reliability of embedded network devices, existing network fault tolerance schemes widely use a redundant design with dual network cards. We propose a network fault tolerance method for devices with a single network card and multiple network ports, which makes effective use of the system's bandwidth resources. It includes a network port state inspection mechanism and a method for migrating the service data of a faulty network, on which we build several functional modules. The fault detection module implements a loopback-based detection method. The fault-tolerant processing module migrates the service load of the faulty network when a network fault is detected. The proposed method effectively detects the network port state and preserves load data under faults. It is application independent and has a low resource occupancy rate, and its feasibility is verified through several tests.

Construction of 2nperiodic periodic binary sequences

with given kerror linear complexity spectrum

BI Songsong,DAI Xiaoping,ZHOU Jianqin,WANG Xifeng

2016, 38(12): 2483-2492. doi:

The kerror linear complexity is an important stability index of pseudorandom sequences. Based on the cube theory and the reverse process of the GamesChan algorithm, we propose an constructive approach for constructing 2nperiodic binary sequences with given kerror linear complexity spectrum. We use the standard cube decomposition algorithm to classify 2nperiodic binary sequences with the kerror linear complexity of them with the first descent point k=2, the second descent point k′=6 and the third descent point k″=10. We then discuss the relationship between linear complexity parameters in each category. Finally, we derive the counting formula and construction process on the number of the periodic sequences for each case. In fact, we can construct 2n periodic binary sequence with more descent points of kerror linear complexity by the methods.

An APIT localization algorithm based on the minimum sum of signal strength for wireless sensor networks

JI Changpeng1,GAO Liang1,FENG Zhusong2

2016, 38(12): 2493-2498. doi:

We study localization algorithms for wireless sensor networks and propose a new point-in-triangulation test (NAPIT) based on the minimum sum of signal strength to solve the boundary-effect problem of the traditional APIT algorithm. First, the algorithm finds the point in the triangle of anchor nodes where the sum of signal strength is minimal. It then uses this minimum as a threshold to judge whether the triangle is effective. Finally, it determines the coordinates of unknown nodes with the grid scanning algorithm. Simulation results show that the NAPIT algorithm partly reduces the frequency of InToOut and OutToIn errors, effectively improves localization precision and remarkably enlarges localization coverage.
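
The final grid-scanning step can be sketched as follows. The sketch makes a strong simplifying assumption: the set of anchor triangles judged to contain the node is already known, and containment of a grid cell is checked geometrically. The signal-strength threshold test that NAPIT uses to select those triangles is not modeled, and the function names, area size and grid step are hypothetical.

```python
def point_in_triangle(p, a, b, c):
    """Return True if point p lies inside or on the edge of triangle abc."""
    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])

    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    # Inside when p is on the same side of all three edges.
    return not (has_neg and has_pos)


def grid_scan(triangles, area_size, step=1.0):
    """Estimate a position as the centroid of the most-overlapped grid cells."""
    best_count, best_cells = 0, []
    cells_per_side = int(area_size / step)
    for i in range(cells_per_side):
        for j in range(cells_per_side):
            center = ((i + 0.5) * step, (j + 0.5) * step)
            count = sum(point_in_triangle(center, *t) for t in triangles)
            if count > best_count:
                best_count, best_cells = count, [center]
            elif count == best_count and count > 0:
                best_cells.append(center)
    if not best_cells:
        return None  # no triangle covers any cell
    xs = [c[0] for c in best_cells]
    ys = [c[1] for c in best_cells]
    return sum(xs) / len(xs), sum(ys) / len(ys)


if __name__ == "__main__":
    triangles = [((0, 0), (4, 0), (0, 4)), ((0, 0), (4, 0), (4, 4))]
    print(grid_scan(triangles, area_size=4))
```

A wrong inside/outside decision for one triangle shifts the overlap region, which is why reducing InToOut and OutToIn errors translates into better localization precision.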

A multidimensional reputation computation

model for O2O Ecommerce

ZHU Wenqiang，ZHONG Yuansheng

2016, 38(12): 2499-2506. doi:

In recent years, as mobile communication and trusted payment technologies have matured, O2O e-commerce has developed rapidly. Although reputation management is the foundation of trust between the two parties to a transaction, there is little research on reputation management for O2O e-commerce at present. Furthermore, existing reputation management research on P2P, B2C and C2C e-commerce does not reflect the features of O2O e-commerce and is not suitable for computing the reputation of O2O sellers. To address these problems, we propose a novel reputation computation model for O2O e-commerce sellers, called ESRep. The model incorporates factors such as the seller's operating time, customer flow, customer ratings, the degree of transaction price deviation, and the spatial distance between the two parties, so that it reflects both the offline and online features of O2O e-commerce. Simulation results show that the model effectively indicates the real reputation of O2O e-commerce sellers, reduces the influence of ratings from hostile nodes, and resists reputation collusion by hostile nodes.

A selfadaptive adjusting backoff algorithm based

on contention window diminishment factor

ZHANG Changsen,CHEN Pengpeng

2016, 38(12): 2507-2513. doi:

Numerous wireless network protocols, including IEEE 802.11 and 802.15.4, manage the retransmission of data frames with the binary exponential backoff (BEB) mechanism. In a dynamic distributed environment, the fixed way in which BEB adjusts the contention window cannot adapt to an ever-changing network size. To solve this problem, we propose an improved self-adaptive backoff algorithm based on a contention window diminishment factor. By introducing this factor, the algorithm adaptively adjusts the waiting time of wireless nodes to maximize network throughput. In addition, to track the changing number of competing nodes, we propose a heuristic algorithm for implementing the new backoff scheme. Simulations using the same physical layer parameters as the IEEE 802.11 DCF protocol show that the proposed algorithm improves throughput and decreases frame delay.
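
The baseline that this abstract sets out to improve can be sketched in a few lines. The code below shows only standard BEB as used in 802.11-style DCF: double the contention window after a collision, reset it after a success. The function names are hypothetical, and the defaults of 15 and 1023 slots are illustrative assumptions, since the actual limits depend on the physical layer. The proposed diminishment factor is not modeled.

```python
import random


def next_contention_window(cw, collided, cw_min=15, cw_max=1023):
    """Return the contention window to use for the next transmission attempt."""
    if collided:
        # Double the window (in slots, counting from zero) up to the maximum.
        return min(2 * (cw + 1) - 1, cw_max)
    # Standard BEB resets to the minimum after any successful transmission.
    return cw_min


def draw_backoff_slots(cw, rng=random):
    """Pick a uniformly random backoff counter in [0, cw]."""
    return rng.randint(0, cw)


if __name__ == "__main__":
    cw = 15
    for attempt in range(4):
        print("attempt", attempt, "window", cw, "slots", draw_backoff_slots(cw))
        cw = next_contention_window(cw, collided=True)
```

The abrupt reset to the minimum after one success is the fixed behavior the abstract criticizes: in a crowded network it discards what the node has just learned about contention.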

A social simulation model of language competition and its computation experiments

ZENG Zhenhua1，BI Guihong2，ZHANG Shouming1，CAI Zilong2

2016, 38(12): 2514-2528. doi:

Language evolution is an important problem in the social sciences. It is a typical social science problem to which computation experiments could not be applied before the social simulation method was introduced. We apply the social simulation method to this problem and present a new microscopic social simulation model of language competition based on multi-agent and social circle theory. The agents of the model are characterized by their vocabulary structure. The model can show the macroscopic process of language evolution through the evolution of the speakers' vocabulary structure. We run a large number of computation experiments, and the results show that the range of parameter values that allow language coexistence is very small, so coexistence is difficult to achieve without intervention. However, the range broadens significantly when intervention is undertaken within a certain time window. Furthermore, we find that intervention is more effective in a society with a higher degree of openness. This finding extends the understanding of language competition models on complex networks.

Study on power control of home area access control network of smart grid based on cognitive radio technology

LIN Xuesong1,YANG Wanqing1,XIA Yuandou1,CHENG Song2

2016, 38(12): 2529-2535. doi:

The main purpose of applying cognitive radio (CR) technology in the home area network (HAN) of the smart grid is to improve spectrum utilization. Power control is one of the main ways to improve spectrum efficiency and spectrum sharing. By reducing the interference between secondary users (SUs), it improves each user's signal to interference plus noise ratio (SINR) and thus ensures normal communication among SUs. We study the performance of three power control algorithms in a CR-based home area access control network of the smart grid. Simulation results show that the three algorithms converge rapidly and are highly efficient; they reduce system overhead to a certain extent and improve network performance, and thus provide useful guidance for practical engineering.
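
The abstract does not name the three algorithms it compares. As a generic illustration of SINR-driven power control, the sketch below uses the classic distributed target-SINR update (often attributed to Foschini and Miljanic), in which each user scales its power by the ratio of the target SINR to its measured SINR. This is a swapped-in textbook technique, not one of the paper's algorithms; the gain matrix, noise level and target value are hypothetical.

```python
def sinr(gains, powers, noise, i):
    """SINR of user i, where gains[i][j] is the gain from transmitter j to receiver i."""
    interference = sum(
        gains[i][j] * powers[j] for j in range(len(powers)) if j != i
    )
    return gains[i][i] * powers[i] / (interference + noise)


def power_control_step(gains, powers, noise, target):
    """One synchronous update: each user rescales its power toward the target SINR."""
    return [
        target / sinr(gains, powers, noise, i) * powers[i]
        for i in range(len(powers))
    ]


if __name__ == "__main__":
    gains = [[1.0, 0.1], [0.1, 1.0]]   # strong direct links, weak cross links
    powers = [1.0, 1.0]                # must start strictly positive
    for _ in range(60):
        powers = power_control_step(gains, powers, noise=0.01, target=2.0)
    print(powers, [sinr(gains, powers, 0.01, i) for i in range(2)])
```

The update converges only when the target SINR is feasible for the given gains; with the weak cross links assumed here it is, and both users settle at the same low power.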
