
Python Web Crawler

Date: 2023-03-26 13:23:41 Python

1. General-purpose code for a crawler framework

    import requests

    def getHtmlText(url):
        try:
            Headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36'}
            r = requests.get(url, headers=Headers)
            r.raise_for_status()  # raises HTTPError for a 4xx or 5xx status
            r.encoding = r.apparent_encoding  # guess the encoding from the response body
            return r.text
        except requests.RequestException:
            return "An exception occurred"

    if __name__ == "__main__":
        url = "http://news.fznews.com.cn/shehui/list.shtml"
        HtmlText = getHtmlText(url)
        print(HtmlText)

2. Code for fetching an image

    import requests

    def getPicture(url):
        try:
            Headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36'}
            r = requests.get(url, headers=Headers)
            r.raise_for_status()  # raises HTTPError for a 4xx or 5xx status
            return r.content
        except requests.RequestException:
            return None  # an error string could not be written to a binary file

    if __name__ == "__main__":
        picurl = "http://img0.dili360.com/pic/2019/10/23/5db027e9441a73i93221149.jpg"
        path = "C://Users//fuxingyu//Desktop//abc.jpg"
        Pic = getPicture(picurl)
        if Pic is not None:
            with open(path, 'wb') as f:  # the with block closes the file
                f.write(Pic)
        else:
            print("An exception occurred")

Or:

    import os
    import requests

    url = "https://pic.rmb.bdstatic.com/1cf349c922d2e0faa054de841535a0788853.gif"
    root = "C://Users//fuxingyu//Desktop//"
    path = root + url.split('/')[-1]  # use the last URL segment as the file name
    try:
        if not os.path.exists(root):
            os.mkdir(root)
        if not os.path.exists(path):
            r = requests.get(url)
            with open(path, 'wb') as f:
                f.write(r.content)
            print("File saved successfully")
        else:
            print("File already exists")
    except (requests.RequestException, OSError):
        print("Download failed")
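The second image script takes the file name from the URL and skips the write when the file already exists. Below is a minimal sketch of that step as two small helpers. The names `filename_from_url` and `save_bytes` are hypothetical and not part of the original code. The sketch uses only the standard library, so it assumes the image bytes have already been fetched, for example with `requests.get(url).content`.

```python
import os
from urllib.parse import urlsplit


def filename_from_url(url):
    """Return the last path segment of a URL, ignoring any query string."""
    return os.path.basename(urlsplit(url).path)


def save_bytes(data, root, name):
    """Write data to root/name unless the file already exists.

    Returns True if a new file was written, False if it was already there.
    """
    os.makedirs(root, exist_ok=True)  # also creates missing parent directories
    path = os.path.join(root, name)
    if os.path.exists(path):
        return False
    with open(path, "wb") as f:  # the with block closes the file
        f.write(data)
    return True
```

Parsing the URL before taking the last segment keeps a query string such as `?x=1` out of the file name, which a plain `url.split('/')[-1]` would include. `os.makedirs(..., exist_ok=True)` replaces the separate existence check before `os.mkdir`.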