Detailed explanation of examples of writing crawlers in Python with the Requests library
Basic GET request:
# -*- coding: utf-8 -*-
import requests

# The URL must include the scheme (http:// or https://)
url = 'http://www.baidu.com'
r = requests.get(url)
print(r.text)
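Requests expects an absolute URL that includes the scheme. The following is a minimal sketch, written for Python 3, of what happens when the scheme is left out. The helper name `can_prepare` is hypothetical, and the request is only prepared locally with `requests.Request(...).prepare()`, so nothing is sent over the network:

```python
import requests
from requests.exceptions import MissingSchema


def can_prepare(url):
    """Return True if Requests accepts the URL, False if the scheme is missing.

    Hypothetical helper for illustration only.
    """
    try:
        # prepare() validates and normalizes the URL without sending anything
        requests.Request('GET', url).prepare()
        return True
    except MissingSchema:
        return False


print(can_prepare('www.baidu.com'))         # no scheme: rejected
print(can_prepare('http://www.baidu.com'))  # scheme included: accepted
```

Calling `requests.get('www.baidu.com')` goes through the same URL preparation step, so it fails with the same exception before any connection is attempted.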
GET request with parameters:
# -*- coding: utf-8 -*-
import requests

url = 'http://www.baidu.com'
# params is encoded into the query string of the URL
payload = {'key1': 'value1', 'key2': 'value2'}
r = requests.get(url, params=payload)
print(r.text)
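To see how the `params` dictionary ends up in the URL, the request can be prepared locally and inspected. This is a small sketch, assuming Python 3, that reuses the `url` and `payload` values from the example and sends nothing over the network:

```python
import requests

url = 'http://www.baidu.com'
payload = {'key1': 'value1', 'key2': 'value2'}

# Build and prepare the request without sending it
prepared = requests.Request('GET', url, params=payload).prepare()

# The dictionary has been appended to the URL as a query string
print(prepared.url)
```

After a real request, the same final URL is available as `r.url`.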
POST request to simulate a login, and some attributes and methods of the response object:
# -*- coding: utf-8 -*-
import requests

url1 = 'http://www.example.com/login'  # login URL
url2 = 'http://www.example.com/main'   # page that can only be accessed after logging in
data = {"user": "user", "password": "pass"}
headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;",
    "Accept-Encoding": "gzip",
    "Accept-Language": "zh-CN,zh;q=0.8",
    "Referer": "http://www.example.com/",
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.90 Safari/537.36"
}

res1 = requests.post(url1, data=data, headers=headers)
# Reuse the cookies returned by the login response
res2 = requests.get(url2, cookies=res1.cookies, headers=headers)

print(res2.content)       # response body as raw bytes
print(res2.raw)           # underlying raw response object
print(res2.raw.read(50))  # only useful with stream=True, and before .content is read
print(type(res2.text))    # response body decoded to unicode
print(res2.url)           # final URL of the response
print(res2.history)       # responses for any redirects that were followed
print(res2.cookies)
print(res2.cookies['example_cookie_name'])  # raises KeyError if the cookie is absent
print(res2.headers)
print(res2.headers['Content-Type'])
print(res2.headers.get('content-type'))  # header names are case-insensitive
print(res2.json())        # parse the body as JSON; raises ValueError if it is not JSON
print(res2.encoding)      # encoding used to decode res2.text
print(res2.status_code)   # HTTP status code
res2.raise_for_status()   # raises HTTPError for a 4xx/5xx status, otherwise returns None
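Note that `raise_for_status()` does not return a status code: it returns `None` for a successful response and raises `requests.exceptions.HTTPError` for a 4xx or 5xx status. The sketch below, assuming Python 3, illustrates this offline. The helpers `make_response` and `is_error` are hypothetical, and the `Response` objects are built by hand purely for demonstration (normally Requests creates them for you):

```python
import requests
from requests.exceptions import HTTPError


def make_response(status_code):
    """Build a bare Response by hand (illustration only)."""
    resp = requests.Response()
    resp.status_code = status_code
    resp.url = 'http://www.example.com/main'
    return resp


def is_error(resp):
    """Return True if raise_for_status() raises, i.e. the status is 4xx or 5xx."""
    try:
        resp.raise_for_status()  # returns None when the status is not an error
        return False
    except HTTPError:
        return True


print(is_error(make_response(200)))  # successful status
print(is_error(make_response(404)))  # client error
```

In a crawler, a common pattern is to call `raise_for_status()` right after the request and handle `HTTPError` in one place, instead of checking `status_code` by hand on every response.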
Using a Session() object with Prepared Requests:
# -*- coding: utf-8 -*-
import requests

s = requests.Session()
url1 = 'http://www.example.com/login'  # login URL
url2 = 'http://www.example.com/main'   # page that can only be accessed after logging in
data = {"user": "user", "password": "pass"}
headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;",
    "Accept-Encoding": "gzip",
    "Accept-Language": "zh-CN,zh;q=0.8",
    "Referer": "http://www.example.com/",
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.90 Safari/537.36"
}

# Request.prepare() builds the request without any session state,
# which is fine for the login request itself
prepped1 = requests.Request('POST', url1, data=data, headers=headers).prepare()
s.send(prepped1)  # the session stores the cookies set by the response

# It can also be written like this:
# req = requests.Request('POST', url1, data=data, headers=headers)
# prepared = s.prepare_request(req)
# # do something with prepared.body
# # do something with prepared.headers
# s.send(prepared)

# Use s.prepare_request() here so that the session cookies are attached
prepped2 = s.prepare_request(requests.Request('GET', url2, headers=headers))
res2 = s.send(prepped2)
print(res2.content)
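The two ways of preparing a request are not equivalent: `Request.prepare()` knows nothing about the session, while `Session.prepare_request()` merges in session state such as cookies. The following sketch, assuming Python 3, shows the difference offline. The cookie name and value are made up, and the cookie is set by hand to stand in for one received from a login response:

```python
import requests

s = requests.Session()
# Pretend an earlier login response set this cookie (hypothetical name and value)
s.cookies.set('sessionid', 'abc123', domain='www.example.com', path='/')

req = requests.Request('GET', 'http://www.example.com/main')

plain = req.prepare()                   # ignores the session entirely
with_session = s.prepare_request(req)   # merges session cookies and headers

print(plain.headers.get('Cookie'))         # no Cookie header
print(with_session.headers.get('Cookie'))  # Cookie header built from the session
```

For a page that requires login, the request therefore needs to be prepared through the session, otherwise the login cookies are not sent.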
Another, simpler way of writing it:
# -*- coding: utf-8 -*-
import requests

s = requests.Session()
url1 = 'http://www.example.com/login'  # login URL
url2 = 'http://www.example.com/main'   # page that can only be accessed after logging in
data = {"user": "user", "password": "pass"}
headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;",
    "Accept-Encoding": "gzip",
    "Accept-Language": "zh-CN,zh;q=0.8",
    "Referer": "http://www.example.com/",
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.90 Safari/537.36"
}

res1 = s.post(url1, data=data, headers=headers)
res2 = s.get(url2, headers=headers)  # the session sends the login cookies automatically
print(res2.content)

See also: Session API (Requests documentation).

Some other request methods:
>>> r = requests.put("http://httpbin.org/put")
>>> r = requests.delete("http://httpbin.org/delete")
>>> r = requests.head("http://httpbin.org/get")
>>> r = requests.options("http://httpbin.org/get")
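Returning to the Session examples: when every request should carry the same headers, they can be set once on the session instead of being passed to each call. This is a sketch, assuming Python 3, that uses two of the header values from the examples and prepares a request locally to inspect what would be sent (no network traffic):

```python
import requests

s = requests.Session()
# Set once; merged into every request made through this session
s.headers.update({
    "Accept-Language": "zh-CN,zh;q=0.8",
    "Referer": "http://www.example.com/",
})

# Prepare a request through the session and look at its headers
prepared = s.prepare_request(requests.Request('GET', 'http://www.example.com/main'))
print(prepared.headers['Referer'])
print(prepared.headers['Accept-Language'])
```

Headers passed to an individual call such as `s.get(url, headers=...)` are merged with the session-level ones, and the per-request value wins when both define the same header.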
Problems encountered:
When running the script under cmd, a small error came up:
UnicodeEncodeError: 'gbk' codec can't encode character u'\xbb' in position 23460: illegal multibyte sequence
Analysis:
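1. Did the error occur while encoding or decoding?
UnicodeEncodeError
Clearly, the error occurred during encoding.
2. Which encoding was being used?
'gbk' codec can't encode character
The error occurred while encoding with GBK. In other words, print was converting the unicode text to the GBK encoding of the cmd console and hit a character that GBK cannot represent.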
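The error can be reproduced without a console or a network request. The sketch below, written for Python 3, tries to encode a short made-up string containing U+00BB (the character named in the error message) with the GBK codec and reports what the exception says:

```python
# U+00BB is the character named in the error message; the text is made up
text = u'next page \xbb'

try:
    text.encode('gbk')
    outcome = 'encoded without error'
except UnicodeEncodeError as exc:
    # exc.encoding is the codec that failed, exc.start the failing position
    outcome = 'cannot encode %r at position %d with %s' % (
        text[exc.start], exc.start, exc.encoding)

print(outcome)
```

The position in the original traceback (23460) has the same meaning: it is the index of the first character in the page text that GBK cannot encode.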
Solution:
First determine the encoding of the current string. For example:
# -*- coding: utf-8 -*-
import requests

url = 'http://www.baidu.com'
r = requests.get(url)
print(r.encoding)  # prints: utf-8
Once you have determined that the HTML string is UTF-8, you can encode it as UTF-8 directly before printing:
print(r.text.encode('utf-8'))
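Encoding to UTF-8 avoids the exception, but a GBK console may not display the non-ASCII characters readably. An alternative sketch, assuming Python 3 and a console that uses GBK, replaces only the characters GBK cannot represent. The helper name `to_console_safe` and the sample text are hypothetical:

```python
def to_console_safe(text, console_encoding='gbk'):
    """Replace characters the console encoding cannot represent with '?'.

    Hypothetical helper; 'gbk' is assumed to be the console encoding.
    """
    # errors='replace' substitutes '?' instead of raising UnicodeEncodeError
    encoded = text.encode(console_encoding, errors='replace')
    return encoded.decode(console_encoding)


page = u'\u4e2d\u6587 \xbb link'  # Chinese text plus the problematic U+00BB
print(to_console_safe(page))
```

The Chinese characters survive because GBK can encode them, and only the unsupported character is replaced.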