HTTP Error 403: Forbidden when using urlretrieve

coding2live 2021-01-29 15:27:18 0 154 python, http, python-requests, urllib

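I am trying to download a PDF and get this error: HTTP Error 403: Forbidden

My guess is that the request is being refused by the server, but I have not found a solution.

Here is my code: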

import urllib.request
import urllib.parse
import requests


def download_pdf(url):
    full_name = "Test.pdf"
    urllib.request.urlretrieve(url, full_name)


try:
    url = 'http://papers.xtremepapers.com/CIE/Cambridge IGCSE/Mathematics (0580)/0580_s03_qp_1.pdf'

    print('initialized')

    hdr = {}
    hdr = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36',
        'Content-Length': '136963',
    }

    print('HDR recieved')

    req = urllib.request.Request(url, headers=hdr)

    print('Header sent')

    resp = urllib.request.urlopen(req)

    print('Request sent')

    respData = resp.read()

    download_pdf(url)

    print('Complete')

except Exception as e:
    print(str(e))

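The following answer is for reference only.

Your guess is right.

The remote server is evidently checking the User-Agent header and rejecting requests that come from Python's urllib.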
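To see what the server is reacting to, you can inspect the default headers that urllib attaches when you do not set any yourself. This is only a small sketch; the variable names are mine, and the exact version suffix depends on your Python installation.

```python
import urllib.request

# build_opener() returns an OpenerDirector; its addheaders list holds the
# default headers urllib sends, including the User-Agent.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)

# The value looks like 'Python-urllib/3.x', which some servers block.
print(default_headers["User-agent"])
```

In the question's code, the custom headers are only attached to the urlopen() call. The later urlretrieve() call inside download_pdf() goes out with this default User-Agent again.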

urllib.request.urlretrieve() does not let you change the HTTP request headers. However, you can use urllib.request.URLopener.retrieve():

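import urllib.request

# The same URL as in the question.
url = 'http://papers.xtremepapers.com/CIE/Cambridge IGCSE/Mathematics (0580)/0580_s03_qp_1.pdf'

# URLopener lets you add request headers before retrieving the file.
opener = urllib.request.URLopener()
opener.addheader('User-Agent', 'whatever')
filename, headers = opener.retrieve(url, 'Test.pdf')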

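Note: you are using Python 3, where these functions are considered part of the "legacy interface", and URLopener has been deprecated.

So you should not keep using these old methods.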
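If you want to stay with urllib but avoid the legacy functions, one option is to build a Request with your own headers and stream the response to a file. This is a minimal sketch, not a definitive implementation: the function name download, its parameters, and the default User-Agent value are my own choices, and whether a particular server accepts that User-Agent is an assumption.

```python
import shutil
import urllib.request


def download(url, dest, user_agent="Mozilla/5.0"):
    """Fetch url with a custom User-Agent and stream the body to dest."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses such as 403.
    with urllib.request.urlopen(req) as resp, open(dest, "wb") as out:
        # Copy in chunks so a large PDF is never held fully in memory.
        shutil.copyfileobj(resp, out)
    return dest


# Example call (not executed here, it needs network access):
# download(url, "Test.pdf")
```

Unlike the question's code, this uses the same request, with the same headers, both to contact the server and to save the file.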

Also, your code goes to a lot of trouble just to fetch a URL.

Your code already imports requests, so you should use it instead of urllib.

requests is simpler to use:

import requests

url = 'http://papers.xtremepapers.com/CIE/Cambridge IGCSE/Mathematics (0580)/0580_s03_qp_1.pdf'
r = requests.get(url)
with open('0580_s03_qp_1.pdf', 'wb') as outfile:
    outfile.write(r.content)
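One more thing to watch: the URL in the question contains spaces. requests percent-encodes those for you, but as far as I know recent Python versions of urllib reject a URL with raw spaces, so there you may have to encode the path yourself. A small sketch using urllib.parse follows; the helper name quote_url_path is hypothetical, and keeping the parentheses unencoded is just my choice.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def quote_url_path(url):
    """Percent-encode unsafe characters (such as spaces) in the path of url."""
    parts = urlsplit(url)
    # Leave "/" and parentheses as they are; encode other unsafe characters.
    safe_path = quote(parts.path, safe="/()")
    return urlunsplit(parts._replace(path=safe_path))


url = 'http://papers.xtremepapers.com/CIE/Cambridge IGCSE/Mathematics (0580)/0580_s03_qp_1.pdf'
encoded_url = quote_url_path(url)
print(encoded_url)
```

Note that this helper assumes the path is not already percent-encoded; running it twice would encode the "%" signs again.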
