怪我咯2017-04-17 17:50:02
scrapy +1
It is very convenient to use, has a lot of features, and the documentation is very clear:
scrapy official website
高洛峰2017-04-17 17:50:02
The asker already added the python tag to the question, so why are people still asking which language...
黄舟2017-04-17 17:50:02
Parsing a page with a browser, or with anything browser-like, is far slower than parsing it with regular expressions. If you want to use selectors, something has to build a document tree first, and that is not cheap.
However, the biggest problem with regex parsing is that once the site changes its layout, your patterns break and you may find them a pain to fix.
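To make that trade-off concrete, here is a minimal sketch using only the Python standard library. The HTML snippet, the `item` class, and the function names are made up for illustration. The regex is tied to one exact attribute order, while the parser-based version reads attributes by name and tolerates the reordering.

```python
import re
from html.parser import HTMLParser

# Hypothetical page fragment: the second link has its attributes
# in a different order, as might happen after a site redesign.
HTML = (
    '<ul>'
    '<li><a class="item" href="/a">A</a></li>'
    '<li><a href="/b" class="item">B</a></li>'
    '</ul>'
)

# Regex approach: no tree is built, but the pattern depends on
# the exact attribute order and quoting.
LINK_RE = re.compile(r'<a class="item" href="([^"]+)">')


def links_by_regex(html):
    return LINK_RE.findall(html)


class LinkCollector(HTMLParser):
    """Parser approach: attributes are looked up by name, not position."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attr_map = dict(attrs)
            if attr_map.get("class") == "item" and "href" in attr_map:
                self.links.append(attr_map["href"])


def links_by_parser(html):
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


print(links_by_regex(HTML))   # the regex misses the reordered link
print(links_by_parser(HTML))  # the parser finds both links
```

The regex could of course be loosened to accept either order, but every such change is another pattern to maintain by hand.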
PHP中文网2017-04-17 17:50:02
I have used Nokogiri when writing Ruby, but if you want to get things done efficiently, Python is more convenient.
大家讲道理2017-04-17 17:50:02
The language is not the problem. What matters for the specific job is the modules available: you need a good HTTP library, a good concurrency library, a good job scheduling library, and a good markup language parsing library. Given all of those, the remaining questions are whether the language performs well, whether its syntax is pleasant, and whether most people in the company can accept it. Broadly speaking, Python, Java, Ruby, Node.js, and C# all meet these conditions. As for how to choose, it depends on the following conditions.
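As a rough illustration of how those four pieces fit together, here is a minimal sketch in Python using only the standard library. The function names, the `max_pages` and `workers` parameters, and the breadth-first strategy are all assumptions made for the example, not a recommendation. A real crawler would also need error handling, rate limiting, and robots.txt support, which are left out here.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


def fetch(url):
    """HTTP piece: download a page and return its text."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


class LinkParser(HTMLParser):
    """Markup-parsing piece: collect href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_links(base_url, html):
    parser = LinkParser()
    parser.feed(html)
    # Resolve relative links against the page they were found on.
    return [urljoin(base_url, href) for href in parser.hrefs]


def crawl(seeds, fetch_page=fetch, max_pages=20, workers=4):
    """Scheduling piece: a breadth-first frontier with a seen-set.

    Concurrency piece: each batch of URLs is fetched on a thread pool.
    fetch_page is injectable so the loop can be tried without a network.
    """
    seen = set(seeds)
    frontier = deque(seeds)
    pages = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier and len(pages) < max_pages:
            size = min(workers, len(frontier), max_pages - len(pages))
            batch = [frontier.popleft() for _ in range(size)]
            # pool.map returns results in the same order as batch.
            for url, html in zip(batch, pool.map(fetch_page, batch)):
                pages[url] = html
                for link in extract_links(url, html):
                    if link not in seen:
                        seen.add(link)
                        frontier.append(link)
    return pages
```

Calling `crawl(["https://example.com/"])` would fetch up to 20 pages reachable from that seed. Every language in the list above has equivalents for each of these pieces, which is the point: the structure stays the same and only the libraries change.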