Research Article
Construction of Online English Corpus Based on Web Crawler Technology
Table 3
Comparison of GitHub data for web crawler frameworks.
| Project | Language | Watch | Star | Fork |
| --- | --- | --- | --- | --- |
| Scrapy | Python | 1840 | 31956 | 7573 |
| pyspider | Python | 888 | 12865 | 3163 |
| webmagic | Java | 809 | 7730 | 3395 |
| Colly | Go | 219 | 7164 | 536 |
| Pholcus | Go | 441 | 5209 | 1331 |
| node-crawler | Node.js | 256 | 4555 | 732 |
| crawler4j | Java | 307 | 3429 | 1719 |
| WebCollector | Java | 329 | 2294 | 1324 |
| Apache Nutch | Java | 245 | 1895 | 1135 |
| QueryList | PHP | 67 | 1469 | 250 |
| Gecco | Java | 133 | 1466 | 606 |
| heritrix3 | Java | 174 | 1413 | 596 |
| wombat | Ruby | 53 | 1143 | 113 |