• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science

• Papers

Feature-based multi-dimensional file browsing in massive file systems

贺扬,何连跃,陈博,徐俊,徐照淼   

  1. (College of Computer, National University of Defense Technology, Changsha, Hunan 410073)
  • Received: 2017-01-15 Revised: 2017-03-22 Online: 2017-05-25 Published: 2017-05-25

Multi-dimension browsing based on features in massive file system

HE Yang, HE Lian-yue, CHEN Bo, XU Jun, XU Zhao-miao

  1. (College of Computer, National University of Defense Technology, Changsha 410073, China)
  • Received: 2017-01-15 Revised: 2017-03-22 Online: 2017-05-25 Published: 2017-05-25

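Abstract:

SMDFS can efficiently manage tens of billions of files. For massive data such as photos and music, however, files often need to be browsed quickly along multiple dimensions, a requirement that the traditional approach of organizing massive numbers of files by directory structure can hardly meet. Building on the SMDFS file system, this work introduces feature attributes for files and proposes a feature-based inverted indexing technique and a distributed indexing technique for massive small files, so that SMDFS can browse files quickly by multiple features. Experimental data show that feature-enabled SMDFS provides efficient management and fast multi-dimensional browsing of massive small files, while the performance of accessing massive small files through the file directory structure does not decline noticeably.

Keywords: massive small files, retrieval, inverted index, dynamic reconstruction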

Abstract:

The small files distributed file system (SMDFS) can efficiently manage tens of billions of files. However, for massive data such as photos and music, users often need to browse files quickly along multiple dimensions, and traditional file organization schemes that manage massive numbers of files through a directory structure cannot easily meet this requirement. Building on SMDFS, we introduce feature attributes for files and propose a feature-based inverted indexing technique and a distributed indexing technique for massive small files, which enable SMDFS to browse files quickly by multiple features. Experimental results show that feature-enabled SMDFS provides efficient management and fast multi-dimensional browsing of massive small files, while the performance of accessing massive small files through the file directory structure does not decrease significantly.

Key words: massive small files, search, inverted index, dynamic reconstruction
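
The idea of a feature-based inverted index can be illustrated with a minimal sketch. It is not the SMDFS implementation: the class `FeatureIndex`, its methods, and the sample paths and features are all hypothetical, and the index lives in a single process in memory, so the distributed indexing described in the abstract is not modeled. Each (feature, value) pair maps to the set of files carrying it, and browsing along several dimensions is an intersection of those sets.

```python
from collections import defaultdict


class FeatureIndex:
    """Toy in-memory inverted index (hypothetical, not SMDFS code).

    Maps each (feature name, feature value) pair to the set of file paths
    that carry that feature.
    """

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, path, features):
        """Register a file under every one of its feature values."""
        for name, value in features.items():
            self._postings[(name, value)].add(path)

    def browse(self, **criteria):
        """Return the sorted paths that match all given feature values."""
        if not criteria:
            return []
        # .get() avoids creating empty posting sets for unknown features.
        sets = [self._postings.get(item, set()) for item in criteria.items()]
        # Start from the smallest posting set to keep intersections cheap.
        sets.sort(key=len)
        result = set(sets[0])
        for postings in sets[1:]:
            result &= postings
        return sorted(result)


# Sample data, invented for illustration only.
index = FeatureIndex()
index.add("/photos/a.jpg", {"type": "photo", "year": 2016, "place": "Changsha"})
index.add("/photos/b.jpg", {"type": "photo", "year": 2017, "place": "Changsha"})
index.add("/music/c.mp3", {"type": "music", "year": 2016})

print(index.browse(type="photo", year=2016))
print(index.browse(year=2016))
```

In this sketch the same file is reachable by type, by year, or by place without moving it in the directory tree, which is the browsing behavior the abstract describes; how SMDFS stores and partitions the index is not stated in the abstract and is left out here.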