Traditional techniques struggle to aggregate and share data from the Internet,
which hosts a large number of free, open, and valuable noncontractual Earth observation
data sources. These sources are typically accessed through web page query forms, keep massive
amounts of data hidden in back-end databases, are spread across diverse data-sharing platforms,
and involve different kinds of spatial data platforms that must be interconnected. To address
these problems, we propose a passive aggregation architecture for noncontractual, heterogeneous,
distributed data sources based on deep web crawler technology. We also design a data source
identification standard, a noncontractual data source discovery mechanism, a mode for building
the noncontractual data source search tree, a noncontractual data source indexing mechanism, and
asynchronous update rules for data sources. Using these mechanisms, we aggregate five data sources
from large data-sharing systems, including the three widely used resources NASA, USGS, and ASAR,
and build a tool set for the automatic aggregation and updating of Earth observation data resources.
Finally, users can retrieve information on noncontractual Earth observation data resources
through a unified query interface.
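
As a rough illustration of how indexing, asynchronous updates, and a unified query interface might fit together, the following minimal Python sketch registers data sources, refreshes only those whose index entries are stale, and answers keyword queries from a single index. It is not the implementation described here: the names DataSource, Aggregator, update_stale, query, and demo_fetch, the record layout, the refresh intervals, and the example.org URLs are all assumptions made for illustration, and demo_fetch merely stands in for a real deep web crawl of a source's query form.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, Optional


@dataclass
class DataSource:
    """A registered data source (all fields are hypothetical)."""
    source_id: str
    query_url: str            # web page query entrance of the source
    refresh_seconds: float    # assumed per-source update interval
    last_indexed: Optional[float] = None

    def is_stale(self, now: float) -> bool:
        # A source is refreshed if it was never indexed or its interval elapsed.
        return (self.last_indexed is None
                or now - self.last_indexed >= self.refresh_seconds)


# A fetch function stands in for the crawler: it returns a list of
# records shaped like {"id": str, "keywords": [str, ...]}.
Fetch = Callable[[DataSource], Awaitable[list]]


@dataclass
class Aggregator:
    fetch: Fetch
    sources: dict = field(default_factory=dict)
    index: dict = field(default_factory=dict)  # keyword -> {(source_id, record_id)}

    def register(self, source: DataSource) -> None:
        self.sources[source.source_id] = source

    async def update_stale(self, now: float) -> list:
        """Re-crawl stale sources concurrently and rebuild their index entries."""
        stale = [s for s in self.sources.values() if s.is_stale(now)]
        results = await asyncio.gather(*(self.fetch(s) for s in stale))
        for source, records in zip(stale, results):
            # Drop this source's old entries before adding the fresh ones.
            for hits in self.index.values():
                hits.difference_update(
                    {h for h in hits if h[0] == source.source_id})
            for record in records:
                for keyword in record["keywords"]:
                    self.index.setdefault(keyword.lower(), set()).add(
                        (source.source_id, record["id"]))
            source.last_indexed = now
        return [s.source_id for s in stale]

    def query(self, keyword: str) -> list:
        """Unified query: one keyword lookup across all indexed sources."""
        return sorted(self.index.get(keyword.lower(), set()))


async def demo_fetch(source: DataSource) -> list:
    """Placeholder crawler that returns one made-up record per source."""
    await asyncio.sleep(0)
    return [{"id": f"{source.source_id}-scene-1",
             "keywords": ["Landsat", source.source_id]}]
```

In this sketch, a caller would register each source once, call update_stale periodically with the current time, and serve user requests through query; a real system would replace demo_fetch with form submission and result-page parsing for each source.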