Research Article

Construction of Online English Corpus Based on Web Crawler Technology

Table 3

Comparison of GitHub data for web crawler frameworks.

Project         Language    Watch    Star     Fork

Scrapy          Python      1840     31956    7573
pyspider        Python      888      12865    3163
webmagic        Java        809      7730     3395
Colly           Go          219      7164     536
Pholcus         Go          441      5209     1331
node-crawler    Node.js     256      4555     732
crawler4j       Java        307      3429     1719
WebCollector    Java        329      2294     1324
Apache Nutch    Java        245      1895     1135
QueryList       PHP         67       1469     250
Gecco           Java        133      1466     606
heritrix3       Java        174      1413     596
wombat          Ruby        53       1143     113