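The most painful part of writing crawlers is having to write a parser for every single site, one by one; it feels like pure grunt work. Is there a way to generate XPath expressions automatically from the fields of interest? For example, when scraping logistics listings, could the XPath of the matching elements be generated automatically from vehicle length, vehicle type, origin, and destination? Are there any papers or GitHub projects along these lines?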
大家讲道理 2017-04-17 16:49:07
Since you mentioned papers, let me recommend one (though reading it is not of much practical use): "Web data extraction, applications and techniques: A survey".
It summarizes and classifies the research on structured and semi-structured data extraction over the past few decades and introduces the basic ideas. You can use this paper as an index for reading the related research.
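For a concrete starting point, here is a minimal sketch of one simple heuristic: find the element whose text contains each field label and report its absolute XPath. This is not a method taken from the survey; the class `XPathIndexer`, the function `xpaths_for_keywords`, and the sample page are all hypothetical, and it assumes the labels actually appear as visible text on the page. It uses only the Python standard library.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "img", "input", "hr", "meta", "link"}


class XPathIndexer(HTMLParser):
    """Records the absolute XPath of every element that directly contains text."""

    def __init__(self):
        super().__init__()
        self.stack = []        # (tag, position) for each currently open element
        self.counters = [{}]   # per-depth count of siblings seen, keyed by tag
        self.text_paths = []   # (text, xpath) pairs in document order

    def handle_starttag(self, tag, attrs):
        siblings = self.counters[-1]
        siblings[tag] = siblings.get(tag, 0) + 1
        if tag in VOID_TAGS:
            return
        self.stack.append((tag, siblings[tag]))
        self.counters.append({})

    def handle_endtag(self, tag):
        # Only pop when the closing tag matches the innermost open element.
        if self.stack and self.stack[-1][0] == tag:
            self.stack.pop()
            self.counters.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            path = "/" + "/".join(f"{t}[{i}]" for t, i in self.stack)
            self.text_paths.append((text, path))


def xpaths_for_keywords(html, keywords):
    """Maps each keyword to the XPath of the first element whose text contains it."""
    indexer = XPathIndexer()
    indexer.feed(html)
    indexer.close()
    found = {}
    for keyword in keywords:
        for text, path in indexer.text_paths:
            if keyword in text:
                found[keyword] = path
                break
    return found


# Hypothetical listing page, used only for illustration.
SAMPLE = """
<html><body>
<table>
<tr><td>Vehicle length</td><td>9.6 m</td></tr>
<tr><td>Vehicle type</td><td>Box truck</td></tr>
<tr><td>Origin</td><td>Shanghai</td></tr>
</table>
</body></html>
"""

if __name__ == "__main__":
    for keyword, path in xpaths_for_keywords(SAMPLE, ["Vehicle length", "Origin"]).items():
        print(keyword, "->", path)
```

In a label/value table like this sample, the value cell would usually be reachable by appending `/following-sibling::td[1]` to the label's path, but that depends on the page layout. Absolute paths like these are also brittle: real pages with malformed markup or changing layouts would need a more tolerant parser and more robust anchors.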