Research on Web Table Positioning Technology Based on Table Structure and Heuristic Rules

As a compact and efficient way to present relational data, Web tables appear frequently in Web documents. Web table positioning is an essential component of Web table information extraction and has attracted growing attention. This paper performs table positioning using Web table structure tags and user-defined heuristic rules. The method covers the handling of nested tables, the determination of table data integrity, and tree traversal. Experimental results show that the proposed Web table positioning method performs well.
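
The approach summarized above can be illustrated with a minimal sketch, which is not the paper's implementation. It uses Python's standard `html.parser` module to build a tree of `<table>` elements from the structure tags, traverses that tree depth-first, and applies a few heuristic rules. The names `TableNode`, `TableTreeBuilder`, `is_data_table`, and `locate_tables` are hypothetical. The rules themselves are assumptions chosen for illustration: a table that contains nested tables is treated as layout, and a table counts as complete only if it has at least `min_rows` rows and every row has the same number of cells, at least `min_cols`.

```python
from html.parser import HTMLParser


class TableNode:
    """One <table> element: its rows of cell text and any nested tables."""

    def __init__(self, parent=None):
        self.parent = parent
        self.rows = []        # each row is a list of cell strings
        self.children = []    # tables nested inside this one
        self.in_cell = False  # True while a <td>/<th> of this table is open


class TableTreeBuilder(HTMLParser):
    """Builds a tree of TableNode objects from table structure tags."""

    def __init__(self):
        super().__init__()
        self.roots = []       # top-level tables, in document order
        self.current = None   # innermost table that is currently open

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            node = TableNode(parent=self.current)
            if self.current is None:
                self.roots.append(node)
            else:
                self.current.children.append(node)
            self.current = node
        elif self.current is not None:
            if tag == "tr":
                self.current.rows.append([])
                self.current.in_cell = False
            elif tag in ("td", "th") and self.current.rows:
                self.current.rows[-1].append("")
                self.current.in_cell = True

    def handle_endtag(self, tag):
        if self.current is None:
            return
        if tag == "table":
            self.current = self.current.parent  # step back out of the nesting
        elif tag in ("td", "th"):
            self.current.in_cell = False

    def handle_data(self, data):
        if self.current is not None and self.current.in_cell:
            self.current.rows[-1][-1] += data.strip()


def is_data_table(node, min_rows=2, min_cols=2):
    """Illustrative heuristic rules; the thresholds are assumptions."""
    if node.children:                 # nesting rule: outer tables are layout
        return False
    rows = [row for row in node.rows if row]
    if len(rows) < min_rows:
        return False
    widths = {len(row) for row in rows}
    # Integrity rule (assumed): every row has the same number of cells.
    return len(widths) == 1 and widths.pop() >= min_cols


def locate_tables(html):
    """Return the rows of every table accepted by the heuristic rules."""
    builder = TableTreeBuilder()
    builder.feed(html)
    builder.close()
    found = []
    stack = list(reversed(builder.roots))
    while stack:                      # depth-first traversal in document order
        node = stack.pop()
        if is_data_table(node):
            found.append([row for row in node.rows if row])
        stack.extend(reversed(node.children))
    return found
```

Calling `locate_tables` on a page in which a layout table wraps a two-column data table would return only the inner table's rows. Real pages would likely need further rules, for example for `rowspan` and `colspan` attributes, which this sketch ignores.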