I am about to start my sophomore year. I taught myself Python and know the basic syntax. I want to learn web crawling, but it seems to involve a lot of knowledge. Could anyone with experience summarize what I need to know, or how to go about learning Python crawling?
某草草2017-07-05 10:36:11
When learning crawlers, start from a concrete need. There are plenty of beginner crawlers online that scrape jokes, pictures of pretty women, and so on; you can write simple crawlers like these within about three days.
Going deeper is much harder, though, and touches many areas.
Getting started is not difficult. You can read this:
How to learn Python crawler [Introduction] https://zhuanlan.zhihu.com/p/...
仅有的幸福2017-07-05 10:36:11
In principle, a crawler is just HTTP requests; one step further you deal with sessions and cookies, and one step beyond that, CAPTCHA (verification code) recognition.
As for tools, you can send requests with urllib2 (urllib.request in Python 3) or, better, the requests library. Once the response comes back and needs to be parsed, use BeautifulSoup.
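A minimal sketch of that fetch-then-parse flow, assuming the requests and beautifulsoup4 packages are installed. The function names `fetch_html` and `extract_links` and the `https://example.com/` address are placeholders made up for illustration, not something from the linked tutorials.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, session=None):
    """Download a page and return its HTML as text."""
    # A Session object remembers cookies between requests, which is what
    # you need for sites that require a login. (Illustrative helper.)
    session = session or requests.Session()
    response = session.get(url, timeout=10)
    response.raise_for_status()  # fail on 4xx/5xx instead of parsing an error page
    return response.text


def extract_links(html):
    """Return (text, href) pairs for every <a> tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]


if __name__ == "__main__":
    # Placeholder URL: replace it with the page you actually want to crawl.
    for text, href in extract_links(fetch_html("https://example.com/")):
        print(text, href)
```

Keeping the download and the parsing in separate functions means you can try out the parsing on a saved HTML string without hitting the site every time.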
Python basic tutorial | Novice tutorial http://www.runoob.com/python/...
Beautiful Soup 4.2.0 documentation — Beautiful Soup 4.2.0 documentation https://www.crummy.com/softwa...
Crawler performance: NodeJs VS Python - QueenKing - SegmentFault /a/11...
Use KNN for verification code recognition - QueenKing - SegmentFault /a/11...
伊谢尔伦2017-07-05 10:36:11
Take a look at Scrapy, a Python crawler framework; a Chinese-language manual is available for it.