Downloading Large Files in Python: Which Method Is Faster?

Posted: 2023-03-17 10:56:30, Tech Observer

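We usually reach for the requests library to download files, because it is so convenient to use.

Method 1

Use the streaming code below. No matter how large the file is, Python's memory usage does not grow:

    import requests

    def download_file(url):
        local_filename = url.split('/')[-1]
        # Note the stream=True argument
        with requests.get(url, stream=True) as r:
            r.raise_for_status()
            with open(local_filename, 'wb') as f:
                # Write the body to disk 8 KB at a time
                for chunk in r.iter_content(chunk_size=8192):
                    f.write(chunk)
        return local_filename

If the response uses chunked encoding, do not pass a fixed chunk_size, and guard each write with an if check. Pass chunk_size=None explicitly, because the default is 1 byte per iteration; with stream=True, None yields data in whatever sizes it arrives. Decoding each chunk separately with chunk.decode("utf-8") can fail when a multi-byte character is split across two chunks, so the version below lets iter_content[1] do the decoding itself through its decode_unicode=True parameter:

    import requests

    def download_file(url):
        local_filename = url.split('/')[-1]
        # Note the stream=True argument
        with requests.get(url, stream=True) as r:
            r.raise_for_status()
            # decode_unicode only takes effect when an encoding is set
            if r.encoding is None:
                r.encoding = 'utf-8'
            with open(local_filename, 'w', encoding='utf-8') as f:
                for chunk in r.iter_content(chunk_size=None, decode_unicode=True):
                    if chunk:
                        f.write(chunk)
        return local_filename

Note that the number of bytes iter_content returns per iteration is not exactly chunk_size. It varies, is often considerably larger, and should be expected to differ on every iteration.

Method 2

Use Response.raw[2] together with shutil.copyfileobj[3]:

    import requests
    import shutil

    def download_file(url):
        local_filename = url.split('/')[-1]
        with requests.get(url, stream=True) as r:
            with open(local_filename, 'wb') as f:
                # Copy the raw socket stream straight into the file
                shutil.copyfileobj(r.raw, f)
        return local_filename

This also streams the file to disk without using much memory, and the code is simpler.

Note: according to the documentation, Response.raw is not decoded (for example, gzip or deflate content encodings are left as-is). If you need decoded content, you can replace the r.raw.read method before copying (this requires import functools):

    r.raw.read = functools.partial(r.raw.read, decode_content=True)

Method 2 is faster. Where Method 1 reaches 2-3 MB/s, Method 2 can reach nearly 40 MB/s.

References

[1] iter_content: https://requests.readthedocs.io/en/latest/api/#requests.Response.iter_content
[2] Response.raw: https://requests.readthedocs.io/en/latest/api/#requests.Response.raw
[3] shutil.copyfileobj: https://docs.python.org/3/library/shutil.html#shutil.copyfileobj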
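The speed figures quoted for the two methods will depend on the network, the server, and the disk, so it may be worth measuring them yourself. The following is a minimal sketch of a timing helper, not part of the original article: measure_throughput is a hypothetical name, and it assumes only that the function passed in takes a URL and returns the path of the file it wrote, as both download_file variants in the article do.

```python
import os
import time


def measure_throughput(download_fn, url):
    """Run download_fn(url) once and return (path, speed in MB/s).

    download_fn is assumed to accept a URL and return the local
    path of the file it saved (hypothetical contract, matching the
    download_file functions in the article).
    """
    start = time.perf_counter()
    path = download_fn(url)
    elapsed = time.perf_counter() - start

    # Size on disk after the download finished, in megabytes
    size_mb = os.path.getsize(path) / (1024 * 1024)

    # Guard against a zero interval on very coarse timers
    speed = size_mb / elapsed if elapsed > 0 else float('inf')
    return path, speed
```

To compare the two approaches, call measure_throughput once with each download function on the same URL and compare the returned speeds. A single run is noisy, so repeating the measurement a few times gives a fairer picture.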
