python - Why does urlopen report 404: Not Found for a website that I can clearly access

Question

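Some people say it is because of a proxy. My browser does often have a proxy enabled, but I have turned it off. I also checked the HTTP messages specifically, and none of them went through a proxy. The error still occurs. Code: {code...} Python version: 3.5.1 Error message: urllib.error.HTTPError: HTTP Error...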

大家讲道理 · Answer

I cannot reproduce the problem with Python 3.5.2 on Windows.
I suggest capturing the packets and comparing your script's request with the one the browser sends.

Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)] on win32
>>> import urllib.request
>>> url = "http://news.dbanotes.net/"
>>> req = urllib.request.Request(url)
>>> page = urllib.request.urlopen(req).read()
>>> page
b'
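
If a packet capture tool is not at hand, one rough way to see what urllib sends by default is to point it at a throwaway local server and record the headers that arrive. The sketch below is only an illustration: `HeaderRecorder` and `received` are hypothetical names, and the local server stands in for the real site. The recorded headers can then be compared against the ones shown in the browser's developer tools.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received = {}  # headers as seen by the server, keyed by lowercase name


class HeaderRecorder(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the real site: it only records request headers."""

    def do_GET(self):
        received.update({name.lower(): value for name, value in self.headers.items()})
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):
        pass  # keep the console quiet


server = HTTPServer(("127.0.0.1", 0), HeaderRecorder)  # port 0 picks any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# An empty ProxyHandler makes sure no system or environment proxy is used.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
opener.open("http://127.0.0.1:%d/" % server.server_port).read()

server.shutdown()
server.server_close()

for name, value in sorted(received.items()):
    print("%s: %s" % (name, value))
```

The `user-agent` line in the output starts with `Python-urllib/`, which is the value a site can use to tell a script apart from a browser.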

伊谢尔伦 · Answer

This may be related to your User-Agent header: some websites check it to block non-browser clients from crawling.
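
A minimal sketch of that idea, assuming the site really does filter on User-Agent (the thread does not confirm this). `BROWSER_UA` and `fetch` are hypothetical names, and the User-Agent string is just an example of a browser-style value; any value copied from your own browser would do.

```python
import urllib.request

# Example browser-style value (an assumption); copy the real one from your browser.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"
)

url = "http://news.dbanotes.net/"

# Override the default "Python-urllib/3.x" User-Agent on this request only.
req = urllib.request.Request(url, headers={"User-Agent": BROWSER_UA})


def fetch(request):
    """Send the request and return the raw response body as bytes."""
    with urllib.request.urlopen(request) as response:
        return response.read()


# Usage (needs network access):
# page = fetch(req)
```

If the 404 disappears with this request, the User-Agent check was the likely cause; if not, other headers or cookies may matter as well.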

巴扎黑 · Answer

Copy the headers and cookies from the browser and add them to the urllib Request object, so that the request simulates a browser.
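
A rough sketch of what this could look like. Every header value below is a placeholder (an assumption, not taken from the original question); the real values come from the request shown in the browser's developer tools. `Accept-Encoding` is deliberately left out, because urllib does not decompress gzip responses on its own.

```python
import urllib.request

# Placeholder values: replace them with the ones your browser actually sends.
browser_headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "zh-CN,zh;q=0.8,en;q=0.6",
    "Cookie": "name=PLACEHOLDER",  # hypothetical cookie string
}

req = urllib.request.Request("http://news.dbanotes.net/", headers=browser_headers)

# Request normalizes header names, e.g. "Accept-Language" -> "Accept-language".
for name, value in sorted(req.header_items()):
    print("%s: %s" % (name, value))

# Usage (needs network access):
# page = urllib.request.urlopen(req).read()
```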

天蓬老师 · Answer

A very likely cause is that the User-Agent header your program sends has been blocked by the site. Try changing the User-Agent header.

阿神 · Answer

There is no need for a Request object; just pass the URL to urlopen directly.
