
Python Web Scraping: The requests Module

Date: 2023-03-26 17:09:24 (Python)

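Getting response information

    import requests

    response = requests.get('http://www.baidu.com')
    print(response.status_code)   # status code
    print(response.url)           # final URL of the request
    print(response.headers)       # response headers
    print(response.cookies)       # cookies set by the response
    print(response.content)       # response body as bytes
    print(response.encoding)      # encoding used to decode the body
    response.encoding = "utf-8"   # set the encoding explicitly
    print(response.text)          # response body as text: response.content decoded with response.encoding

Sending a GET request

GET request without parameters

    response = requests.get('http://www.baidu.com')
    print(response.text)

GET request with parameters

The parameters can be written directly into the URL. A ? after the path marks the start of the parameters, and the key=value pairs are separated by &, as in the following URL:

    https://www.bilibili.com/video...

Note: URL length is limited in practice (2048 characters is a common rule of thumb; the exact limit depends on the browser and the server), and the parameters are visible in plain sight, so a URL is not a safe place for sensitive data.

The parameters can also be passed as a dictionary:

    data = {'name': 'xiaoming', 'age': 26}
    response = requests.get('http://www.abcd.com', params=data)
    print(response.text)

Sending a POST request

POST data is not written into the URL; it is usually passed as a dictionary. Note that the argument is named data, not params.

    data = {'name': 'xiaoming', 'age': 26}
    response = requests.post('http://www.abcd.com', data=data)
    print(response.text)

Adding headers

    heads = {}
    heads['User-Agent'] = 'Mozilla/5.0 ' \
                          '(Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/534.50 ' \
                          '(KHTML, like Gecko) Version/5.1 Safari/534.50'
    response = requests.get('http://www.baidu.com', headers=heads)

Using a proxy

    # Proxy URLs should include the scheme
    proxy = {
        'http': 'http://49.89.84.106:9999',
        'https': 'http://49.89.84.106:9999',
    }
    heads = {}
    heads['User-Agent'] = ('Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                           '(KHTML, like Gecko) Chrome/49.0.2623.221 Safari/537.36 SE 2.X MetaSr 1.0')
    url = 'http://www.baidu.com'  # page to fetch
    req = requests.get(url, proxies=proxy, headers=heads)
    print(req.text)

Using a proxy that requires authentication

    from requests.auth import HTTPProxyAuth

    proxies = {'http': 'http://127.0.0.1:8888', 'https': 'http://127.0.0.1:8888'}
    auth = HTTPProxyAuth('user', 'pwd')
    requests.get(url, proxies=proxies, auth=auth)

The credentials can also be written into the proxy URL:

    proxies = {
        "http": "http://user:pass@10.10.1.10:3128/",
    }
    req = requests.get(url, proxies=proxies, headers=heads)

Cookie

Getting cookies

    import requests

    response = requests.get("http://www.baidu.com")
    print(type(response.cookies))
    # convert the CookieJar object to a dictionary
    cookies = requests.utils.dict_from_cookiejar(response.cookies)
    print(cookies)

Using cookies

    # the dictionary maps each cookie name to its value
    cookie = {"name": "xxxxxxxx"}
    response = requests.get(url, cookies=cookie)

Session

A Session keeps cookies across requests, so a cookie set by one request is sent with the next one.

    session = requests.Session()
    session.get('http://httpbin.org/cookies/set/number/12345')
    response = session.get('http://httpbin.org/cookies')
    print(response.text)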
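To see how params, headers and session cookies end up in the outgoing request, the request can be built through a Session without being sent. This is only a sketch for inspection: the httpbin.org/get URL, the User-Agent value and the cookie set by hand are illustrative assumptions, while Request and Session.prepare_request are part of the requests API.

```python
import requests

session = requests.Session()
# Default header sent with every request made through this session (illustrative value)
session.headers.update({"User-Agent": "Mozilla/5.0"})
# Put a cookie into the session by hand instead of receiving it from a server
session.cookies.set("number", "12345")

# Build the request but do not send it; no network access is needed
request = requests.Request(
    "GET",
    "http://httpbin.org/get",
    params={"name": "xiaoming", "age": 26},
)
prepared = session.prepare_request(request)

print(prepared.url)                    # http://httpbin.org/get?name=xiaoming&age=26
print(prepared.headers["User-Agent"])  # header taken from the session defaults
print(prepared.headers.get("Cookie"))  # Cookie header built from the session's cookies
```

Calling session.send(prepared) would then send exactly this request.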
Limiting the response time

    import requests
    from requests.exceptions import Timeout

    try:
        response = requests.get('https://www.baidu.com', timeout=1)
        print(response.status_code)
    except Timeout:  # covers both ConnectTimeout and ReadTimeout
        print('No response within the given time')

JSON

The response.json() method converts a JSON-formatted response body into a Python object; json.loads(response.text) does the same.

    response = requests.get('http://www.abcd.com')
    print(response.text)
    print(response.json())        # raises an error if the body is not valid JSON
    print(type(response.json()))
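The timeout and JSON handling can be combined into one small helper. This is a minimal sketch rather than the author's code: fetch_json is a hypothetical name, and returning None on any failure is an assumption made to keep the example short.

```python
import requests
from requests.exceptions import RequestException, Timeout


def fetch_json(url, timeout=1):
    """Return the decoded JSON body of url, or None if anything goes wrong."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx status codes into an exception
    except Timeout:
        # Timeout must be caught before RequestException, which is its parent class
        print('No response within the given time')
        return None
    except RequestException as exc:
        print(f'Request failed: {exc}')
        return None

    try:
        return response.json()
    except ValueError:
        # response.json() raises a ValueError subclass when the body is not JSON
        print('Response body is not valid JSON')
        return None
```

For example, fetch_json('http://httpbin.org/get', timeout=3) should return a dictionary when the service is reachable, and None otherwise.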