Research Article
An Efficient Mechanism for Deep Web Data Extraction Based on Tree-Structured Web Pattern Matching
Algorithm 2
Web database WD, data records DR, data items DI, and users U are the inputs.
Step 1: Begin with the set of input web pages that have been gathered.
Step 2: Create a tree T.
Step 3: For each (tree representation T), fill in what is missing.
Step 4: Create a matrix M.
Step 5: Check whether content similarity is present in the web pages.
Step 6: Depending on the result, move the nodes from left to right or vice versa.
Step 7: End For.
Step 8: Form a set of rules R.
Step 9: For each (R):
Step 10: Form a vector v in (WP).
Step 11: Determine whether a schema is present.
Step 12: End For.
Step 13: Distinguish between the exact representation and the leaf child nodes.
Step 14: End.

The algorithm above outlines the full process of identifying schemas and templates in order to improve deep web page extraction.
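The steps of Algorithm 2 leave several details open, so the following Python code is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that each page's tree T can be summarized by its set of root-to-node tag paths, that the matrix M holds pairwise Jaccard similarities between those sets, and that text leaves which differ between structurally similar pages are data while shared paths form the template. The names `TagPathCollector`, `parse`, `similarity_matrix`, and `split_template_and_data`, as well as the sample pages, are hypothetical.

```python
from html.parser import HTMLParser


class TagPathCollector(HTMLParser):
    """Flattens a page's tag tree into root-to-node tag paths (assumed reading of 'tree T')."""

    def __init__(self):
        super().__init__()
        self.stack = []     # tags that are currently open
        self.paths = set()  # every root-to-node tag path seen in the page
        self.leaves = []    # (path, text) pairs for non-empty text leaves

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)
        self.paths.add("/".join(self.stack))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            self.leaves.append(("/".join(self.stack), text))


def parse(page):
    """Builds the flattened tree for one HTML page."""
    collector = TagPathCollector()
    collector.feed(page)
    collector.close()
    return collector


def similarity_matrix(trees):
    """M[i][j] is the Jaccard similarity of the tag-path sets of pages i and j (an assumption)."""
    n = len(trees)
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            a, b = trees[i].paths, trees[j].paths
            union = a | b
            M[i][j] = len(a & b) / len(union) if union else 1.0
    return M


def split_template_and_data(trees):
    """Separates shared structure (template) from leaf texts that vary between pages (data)."""
    template_paths = set.intersection(*(t.paths for t in trees))
    static_leaves = set.intersection(*(set(t.leaves) for t in trees))
    records = [
        [text for path, text in t.leaves if (path, text) not in static_leaves]
        for t in trees
    ]
    return template_paths, records


# Hypothetical sample pages: A and B share a template, C does not.
PAGE_A = "<html><body><h1>Books</h1><div><span>Dune</span><b>9.99</b></div></body></html>"
PAGE_B = "<html><body><h1>Books</h1><div><span>Emma</span><b>4.50</b></div></body></html>"
PAGE_C = "<html><body><p>About us</p></body></html>"

if __name__ == "__main__":
    trees = [parse(p) for p in (PAGE_A, PAGE_B, PAGE_C)]
    print(similarity_matrix(trees))
    print(split_template_and_data(trees[:2]))
```

In this sketch, pages whose entry in M is close to 1 are treated as sharing a template, and only those pages are passed to `split_template_and_data`. The node-movement and rule-formation steps are not modeled, because the source does not specify them in enough detail.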