• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2006, Vol. 28 ›› Issue (9): 15-17.

• Paper •

An XSLT-Based Web Wrapper Environment

廖灵睿 肖田元   

  • Publication date: 2006-09-01 Release date: 2010-05-20

  • Online:2006-09-01 Published:2010-05-20

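Abstract (translated from the Chinese):

A Web wrapper converts Web page content into XML for use in system integration. XSLT, a technology for XML transformation, is well suited to the information extraction and organization that a wrapper requires. Starting from a wrapper description file (XML) that contains the query interface, the result schema, and the mapping rules, this paper presents a technical scheme for automatically generating executable code. Both the execution of a wrapper and its generation are based entirely on XSLT, which makes the system highly portable. A "metadata alignment" method is proposed to assist content-based locating, which improves tolerance to page changes. The implementation of a prototype system verifies the feasibility of these techniques.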

Keywords: Web wrapper, XSLT, XML Schema

Abstract:

By converting Web pages into XML, Web wrappers can support the integration of existing systems. XSLT is designed for XML transformation, and its capabilities in information extraction and reorganization show good potential for wrappers. A wrapper description incorporates the query interfaces, result schemas, and mapping rules. This XML file is used to generate executable code automatically. Wrapper execution and code generation are both based on standard XSLT, which makes the system highly portable. A content-based locating method called "Meta-Data Aligning" is also put forward to improve wrappers' tolerance to changes in Web pages. The ideas and techniques are validated in the implementation of a prototype.

Key words: Web wrapper, XSLT, XML Schema