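Web crawler - Python crawler pagination problem: how do I turn pages with this code, and the price is only visible after logging in, so how should I handle that?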

Question

{code...}

黄舟 · Answer

?pageNum=" + str(pageIndex)

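Isn't this already your page-number control? Increment pageIndex in a loop to move from one page to the next.
If the price only shows up after logging in, either send the cookie from a logged-in session or simulate the login with a username and password, then fetch the page.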
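
As a rough sketch of the paging idea above: assuming the listing accepts a `pageNum` query parameter as in the snippet, one URL can be generated per page and fetched in turn. The base URL, the `fetch` callable, and the stop-on-empty-page rule are illustrative assumptions, not something taken from the question's code.

```python
from typing import Callable, Iterator


def page_url(base_url: str, page_index: int) -> str:
    """Append the pageNum query parameter used in the answer above."""
    return base_url + "?pageNum=" + str(page_index)


def crawl_pages(base_url: str, fetch: Callable[[str], str], max_pages: int) -> Iterator[str]:
    """Yield the body of each page, stopping early at the first empty one.

    `fetch` is a hypothetical callable that takes a URL and returns the page
    text; in a real crawler it would wrap an HTTP client call.
    """
    for page_index in range(1, max_pages + 1):
        body = fetch(page_url(base_url, page_index))
        if not body:  # assumption: an empty body means there are no more pages
            break
        yield body
```

Passing the fetcher in as a parameter keeps the paging logic separate from the HTTP client, so the same loop works whether the pages are fetched with or without a login cookie.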

迷茫 · Answer

httplib2 should be able to handle pretty much any HTTP request you need.

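import httplib2
import urllib.parse

http = httplib2.Http()
url = 'URL to fetch'
header = {'Accept': 'text/html',
     'Accept-Encoding': 'gzip, deflate',  # sdch dropped: httplib2 only decompresses gzip and deflate
     'Accept-Language': 'zh-CN,zh;q=0.8',
     'Cache-Control': 'max-age=0',
     'Connection': 'keep-alive',
     'Content-Type': 'application/x-www-form-urlencoded',  # needed when sending a form body
     'Cookie': 'cookie contents',  # if paging requires a logged-in session, log in first and put the cookie here
     'Upgrade-Insecure-Requests': '1',
     'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'}
body_value = {'username': 'test', 'password': '123456'}  # all of the form fields
body_value = urllib.parse.urlencode(body_value)  # URL-encode the form (Python 3; Python 2 used urllib.urlencode)
response, content = http.request(url, 'POST', headers=header, body=body_value)  # use 'GET' and omit body for a plain page fetch
html = content.decode('utf-8')  # content is the raw response bytes; this assumes the page is UTF-8
# html is the returned page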
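
As an alternative to pasting a cookie into the headers by hand, the login can be simulated so that the session cookie is captured and resent automatically. The sketch below uses only the standard library rather than httplib2; the `username` and `password` field names follow the form in the answer, while the function names and any URLs passed in are hypothetical, and a real site may require extra fields such as a CSRF token.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_login_request(login_url: str, username: str, password: str) -> urllib.request.Request:
    """Build a form POST; the field names follow the answer's example form."""
    form = urllib.parse.urlencode({"username": username, "password": password})
    return urllib.request.Request(
        login_url,
        data=form.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def make_session() -> urllib.request.OpenerDirector:
    """Return an opener that stores cookies from responses and resends them."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def fetch_after_login(login_url: str, page_url: str, username: str, password: str) -> str:
    """Log in once, then fetch a page with the same cookie-carrying opener."""
    session = make_session()
    session.open(build_login_request(login_url, username, password)).close()
    with session.open(page_url) as response:
        return response.read().decode("utf-8")  # assumes the page is UTF-8
```

To crawl several pages, the opener returned by `make_session()` can be created once and reused for every page request, so the login only happens a single time.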
