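The Requests Module
Requests is the only Non-GMO HTTP library for Python, safe for human consumption.
Warning: recreational use of other HTTP libraries may result in dangerous side effects, including security vulnerabilities, verbose code, reinventing the wheel, constantly reading documentation, depression, headaches, or even death.
Requests allows you to send organic, grass-fed HTTP/1.1 requests without manual labor. You do not need to add query strings to your URLs by hand or form-encode your POST data. Keep-alive and HTTP connection pooling are 100% automatic, thanks to urllib3, which Requests is built on.
Official site: https://2.python-requests.org//zh_CN/latest/index.html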
Installation
Install Requests with pip:
    pip install requests
GET requests
    import requests

    # A plain GET request
    response = requests.get("http://www.baidu.com/")

    # params accepts a dict or a query string; a dict is URL-encoded automatically, so urlencode() is not needed
    kw = {"wd": "长城"}
    response = requests.get("http://www.baidu.com/s", params=kw)
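To see what the params argument saves you from, here is a minimal sketch of the manual approach using only the standard library. The helper name build_search_url is hypothetical, and the sketch assumes the simple case of a flat dict of string values.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query):
    """Append a URL-encoded query string to base_url by hand."""
    # urlencode percent-encodes non-ASCII text as UTF-8 bytes
    return base_url + "?" + urlencode(query)


manual_url = build_search_url("http://www.baidu.com/s", {"wd": "长城"})
print(manual_url)  # http://www.baidu.com/s?wd=%E9%95%BF%E5%9F%8E
```

Passing the same dict through params should produce an equivalent URL without any helper.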
Sending request headers
    import requests

    kw = {"wd": "长城"}
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"}
    # headers takes a dict of request headers; params is URL-encoded automatically as before
    response = requests.get("http://www.baidu.com/s", params=kw, headers=headers)
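One way to check what would be sent, without touching the network, is to build the request and prepare it. This sketch uses requests.Request and its prepare() method; the short User-Agent string is a placeholder chosen for illustration, not a value from the text above.

```python
import requests

kw = {"wd": "长城"}
headers = {"User-Agent": "example-agent/1.0"}  # placeholder value

# Build the request object without sending it
request = requests.Request("GET", "http://www.baidu.com/s", params=kw, headers=headers)
prepared = request.prepare()

print(prepared.url)                    # final URL including the encoded query string
print(prepared.headers["User-Agent"])  # the header as it would be sent
```

Inspecting a prepared request this way can be handy when a server rejects a request and you want to confirm the URL and headers first.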
The Response object
    import requests

    kw = {"name": "zhangsan"}
    headers = {"User-Agent": "a niubility navigator"}
    url = "http://httpbin.org/get"
    response = requests.get(url, params=kw, headers=headers)
    print(f"Status code: {response.status_code}")
    print(f"Response headers: {response.headers}")
    print(f"Encoding: {response.encoding}")
    # response.text is the decoded body; when no encoding is declared, requests guesses one
    print(f"Decoded content: {response.text}")
    print(f"Raw bytes: {response.content}")
    print(f"Final URL: {response.url}")
The output is:
    Status code: 200
    Response headers: {'Access-Control-Allow-Credentials': 'true', 'Access-Control-Allow-Origin': '*', 'Content-Encoding': 'gzip', 'Content-Type': 'application/json', 'Date': 'Thu, 25 Apr 2019 14:38:53 GMT', 'Referrer-Policy': 'no-referrer-when-downgrade', 'Server': 'nginx', 'X-Content-Type-Options': 'nosniff', 'X-Frame-Options': 'DENY', 'X-XSS-Protection': '1; mode=block', 'Content-Length': '197', 'Connection': 'keep-alive'}
    Encoding: None
    Decoded content: {
      "args": {
        "name": "zhangsan"
      },
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Host": "httpbin.org",
        "User-Agent": "a niubility navigator"
      },
      "origin": "111.47.249.9, 111.47.249.9",
      "url": "https://httpbin.org/get?name=zhangsan"
    }
    Raw bytes: b'{\n "args": {\n "name": "zhangsan"\n }, \n "headers": {\n "Accept": "*/*", \n "Accept-Encoding": "gzip, deflate", \n "Host": "httpbin.org", \n "User-Agent": "a niubility navigator"\n }, \n "origin": "111.47.249.9, 111.47.249.9", \n "url": "https://httpbin.org/get?name=zhangsan"\n}\n'
    Final URL: http://httpbin.org/get?name=zhangsan
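The difference between content (raw bytes) and text (a decoded string) can be illustrated without a network call. The sketch below is a simplified stand-in, not the library's implementation: decode_body is a hypothetical helper that uses the declared encoding when there is one and otherwise falls back to UTF-8, whereas requests guesses the encoding from the bytes themselves.

```python
import json


def decode_body(content, declared_encoding=None, fallback="utf-8"):
    """Turn raw response bytes into text (simplified illustration)."""
    # Prefer the encoding declared by the server; otherwise use the fallback
    encoding = declared_encoding or fallback
    # errors="replace" avoids an exception on undecodable bytes
    return content.decode(encoding, errors="replace")


raw = '{"args": {"name": "张三"}}'.encode("utf-8")  # plays the role of response.content
text = decode_body(raw)                             # plays the role of response.text
parsed = json.loads(text)                           # similar in spirit to response.json()
print(parsed["args"]["name"])
```

With a real response, assigning response.encoding = "utf-8" before reading response.text makes requests use that encoding instead of a guess.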
POST requests
    import requests

    response = requests.post("http://www.baidu.com/")
Form data is passed through the data parameter:
    import time

    import requests

    data = "你好Python"
    headers = {
        "Accept": "application/json, text/javascript, */*; q=0.01",
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "Origin": "https://fanyi.qq.com",
        "Referer": "https://fanyi.qq.com/",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36",
        "X-Requested-With": "XMLHttpRequest",
        # Content-Length is left out: requests computes it from the encoded form body,
        # and the length of the raw text alone would not match that body anyway
    }
    # Form fields; requests form-encodes this dict automatically
    keyword = {
        "source": "auto",
        "target": "auto",
        "sourceText": data,
        # A millisecond timestamp keeps the session id unique
        "sessionUuid": "translate_uuid" + str(int(time.time() * 1000)),
    }
    url = "https://fanyi.qq.com/api/translate"
    result = requests.post(url, headers=headers, data=keyword).json()
    print(result["translate"]["records"][0]["targetText"])
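To see the form encoding that requests applies to a data dict, the request can be prepared without being sent. This is a sketch that assumes a trimmed-down form with only two of the fields above:

```python
import requests

form = {"source": "auto", "sourceText": "你好Python"}

# Prepare (but do not send) the POST request
prepared = requests.Request(
    "POST", "https://fanyi.qq.com/api/translate", data=form
).prepare()

print(prepared.body)                       # URL-encoded form body
print(prepared.headers["Content-Type"])    # set automatically for dict data
print(prepared.headers["Content-Length"])  # computed from the encoded body
```

Because both headers are derived from the body, setting Content-Length by hand is unnecessary for ordinary form posts.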
Cookies and Session
Cookies
    import requests

    response = requests.get("http://www.baidu.com/")
    # The cookies set by the server are available as a CookieJar
    cookiejar = response.cookies
    # Convert the CookieJar to a plain dict
    cookiedict = requests.utils.dict_from_cookiejar(cookiejar)
    print(cookiedict)
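The conversion also works in the other direction. The following sketch builds a cookie jar by hand so it can run offline; the cookie names, values, and the example.com domain are made up for illustration.

```python
import requests
from requests.cookies import RequestsCookieJar

# Build a cookie jar by hand instead of reading it from a response
jar = RequestsCookieJar()
jar.set("sessionid", "abc123", domain="example.com", path="/")
jar.set("theme", "dark", domain="example.com", path="/")

# CookieJar -> dict
cookie_dict = requests.utils.dict_from_cookiejar(jar)
print(cookie_dict)

# dict -> CookieJar, the reverse direction
restored = requests.utils.cookiejar_from_dict(cookie_dict)
print(restored.get("theme"))
```

Note that the dict form keeps only names and values; domain and path information is dropped in the conversion.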
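Session
In requests, the Session object is used very often. It represents one user session: from the moment the client connects to the server until the client disconnects.
A session lets you persist certain parameters across requests; for example, cookies are kept across all requests made from the same Session instance.
    import requests

    url = "http://httpbin.org/get"  # example URL, as in the Response section
    session = requests.Session()
    # Cookies received here are sent automatically on later requests from this session
    response = session.get(url)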
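The persistence can be observed offline by preparing requests through a session instead of sending them. A minimal sketch, assuming a placeholder User-Agent and a hypothetical token cookie set directly on the session:

```python
import requests

session = requests.Session()
# Values stored on the session apply to every request made through it
session.headers.update({"User-Agent": "example-agent/1.0"})  # placeholder value
session.cookies.set("token", "abc")                           # hypothetical cookie

# Prepare two requests through the same session without sending them
first = session.prepare_request(requests.Request("GET", "http://httpbin.org/get"))
second = session.prepare_request(requests.Request("GET", "http://httpbin.org/headers"))

for prepared in (first, second):
    print(prepared.url, prepared.headers["Cookie"], prepared.headers["User-Agent"])
```

Both prepared requests carry the same cookie and User-Agent, which is the behavior that keeps a login alive across calls when the cookie comes from a real server.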