Research Article

[Retracted] Analysis of Application Data Mining to Capture Consumer Review Data on Booking Websites

Table 1

Web crawling and anticrawling strategies.

| Web crawling strategy | Anticrawling strategy |
| --- | --- |
| Sending requests to websites and acquiring data | When the view count of a website increases drastically during a specific period, all views come from the same IP address, and all user agents are Python-based, the site manager restricts that IP address's access to the website |
| Simulating a user agent and acquiring a proxy IP | When the view count is abnormal, all users are required to log in to their accounts before viewing the website |
| Registering an account and visiting a website through cookies or tokens | A complete account database is established, and each account must have clearance to view specific information |
| Mimicking user operations by restricting the request sending frequency | A verification code is used to determine whether website visitors are real people |
| Passing the required authentication (e.g., OpenCV authentication) | Dynamic loading pages are introduced, in which data are loaded through JavaScript to increase the difficulty of website analysis |
| Using Selenium and PhantomJS to fully mimic the browsing behavior of real users | |
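
Two of the crawling-side entries in the table, setting an explicit user agent and restricting the request sending frequency, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `ThrottledFetcher`, the `ExampleBot/0.1` user-agent string, the two-second interval, and the example URL are all hypothetical, and only the standard library is used.

```python
import time
import urllib.request


class ThrottledFetcher:
    """Hypothetical helper: sends requests with an explicit User-Agent
    and keeps a minimum delay between consecutive requests."""

    def __init__(self, user_agent, min_interval_s=2.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.user_agent = user_agent
        self.min_interval_s = min_interval_s
        # clock and sleep are injectable so the throttle can be checked
        # without real waiting.
        self._clock = clock
        self._sleep = sleep
        self._last_request_at = None

    def build_request(self, url):
        # Replace the default "Python-urllib/x.y" User-Agent, which the
        # table lists as a signal that site managers look for.
        return urllib.request.Request(
            url, headers={"User-Agent": self.user_agent}
        )

    def wait_turn(self):
        # Sleep just long enough that consecutive requests are at least
        # min_interval_s apart; return the number of seconds waited.
        waited = 0.0
        if self._last_request_at is not None:
            elapsed = self._clock() - self._last_request_at
            remaining = self.min_interval_s - elapsed
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        self._last_request_at = self._clock()
        return waited

    def fetch(self, url, timeout=10):
        # Throttle first, then perform the HTTP request and return the body.
        self.wait_turn()
        request = self.build_request(url)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
```

A call such as `ThrottledFetcher("ExampleBot/0.1").fetch("https://example.com/reviews")` would then issue at most one request every two seconds. A fixed interval is the simplest choice here; the table does not say which delay policy the authors used.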