Comparing Semi-Structured Documents via Graph Probing
07 November 2001
In this paper, we describe our first steps towards adapting graph probing, a new approach to graph comparison, so that a compact, efficient probe set can be pre-computed for databases of graph-structured documents (e.g., Web pages coded in HTML). We consider both comparing two graphs in their entirety and determining whether one graph contains a subgraph that closely matches the other. After presenting an overview of work in progress, we provide some preliminary experimental results and suggest directions for future research.
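As a rough illustration of the general idea, and not of the probe set developed in this work, the sketch below assumes one simple family of probes: "how many vertices have in-degree i and out-degree j?" Each graph's answers can be computed once and stored, and two graphs are then compared through their stored answers alone. The function names `degree_probes` and `probe_distance`, the edge-list representation, and the choice of degree-based probes are all assumptions made for this example.

```python
from collections import Counter

def degree_probes(edges):
    """Summarize a directed graph given as (source, target) pairs.

    Returns a Counter mapping (in_degree, out_degree) to the number of
    vertices with that degree pair. Assumption: isolated vertices are
    ignored, since an edge list cannot represent them.
    """
    indeg, outdeg = Counter(), Counter()
    vertices = set()
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
        vertices.update((u, v))
    return Counter((indeg[x], outdeg[x]) for x in vertices)

def probe_distance(p1, p2):
    """L1 distance between two probe summaries (hypothetical measure)."""
    keys = set(p1) | set(p2)
    return sum(abs(p1[k] - p2[k]) for k in keys)

# Summaries are computed once per document and can be stored in a database.
chain = degree_probes([("a", "b"), ("b", "c")])
cycle = degree_probes([("a", "b"), ("b", "c"), ("c", "a")])
relabeled_chain = degree_probes([("x", "y"), ("y", "z")])

print(probe_distance(chain, relabeled_chain))  # 0: same structure, different labels
print(probe_distance(chain, cycle))            # 4
```

Because the summaries depend only on structure, relabeling vertices does not change the distance. A distance of zero does not by itself imply that two graphs are isomorphic; under these assumptions it only means that the chosen probes cannot tell them apart.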