Python in Practice: One-Click Export of Books and Notes from WeRead (WeChat Reading)

Posted: 2023-03-25 20:07:02 | Python

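The era of reading for everyone has arrived. Reading apps currently have 210 million users and more than 5 million daily active users. Young users aged 19 to 35 account for over 60% of them, users with a bachelor's degree or higher for as much as 80%, and users in Beijing, Shanghai, Guangzhou, Shenzhen and other provincial capitals and municipalities for over 80%. I usually read on WeRead (WeChat Reading). To make it easier to organize my books and export my notes, I wrote this small tool.

Code walkthrough

1. Directory structure

First, the overall directory layout:

    code
    ├─ excel_func.py    read and write Excel files
    ├─ pyqt_gui.py      PyQt GUI
    └─ wereader.py      WeRead-related API calls

excel_func.py uses the xlrd and xlwt libraries to read and write Excel files.
pyqt_gui.py uses PyQt to draw the GUI.
wereader.py calls the WeRead APIs, which were found by capturing and analyzing network traffic.

2. excel_func.py

    import xlwt

    def write_excel_xls(path, sheet_name_list, value):
        # Create a new workbook
        workbook = xlwt.Workbook()
        # Number of rows to write
        index = len(value)
        for sheet_name in sheet_name_list:
            # Add a sheet to the workbook
            sheet = workbook.add_sheet(sheet_name)
            # Write the data into the sheet, cell by cell
            for i in range(0, index):
                for j in range(0, len(value[i])):
                    sheet.write(i, j, value[i][j])
        # Save the workbook
        workbook.save(path)

The flow of this function: create the Excel file, create the sheets, and write the data into each sheet.

3. pyqt_gui.py

    # Imports assume PyQt5; the post itself only says "PyQt"
    from PyQt5.QtCore import Qt, QUrl
    from PyQt5.QtWidgets import QMainWindow
    from PyQt5.QtWebEngineWidgets import QWebEngineView, QWebEngineProfile

    class MainWindow(QMainWindow):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self.DomainCookies = {}
            self.setWindowTitle('微信阅读助手')  # window title ("WeRead Assistant")
            self.resize(900, 600)  # window size
            self.setWindowFlags(Qt.WindowMinimizeButtonHint)  # no maximize button
            self.setFixedSize(self.width(), self.height())  # window cannot be resized
            url = 'https://weread.qq.com/#login'  # target URL
            self.browser = QWebEngineView()  # browser widget
            # Delete all cookies when the program starts
            QWebEngineProfile.defaultProfile().cookieStore().deleteAllCookies()
            # Call self.onCookieAdd() whenever a cookie is added
            QWebEngineProfile.defaultProfile().cookieStore().cookieAdded.connect(self.onCookieAdd)
            # Call self.onLoadFinished() when the page has finished loading
            self.browser.loadFinished.connect(self.onLoadFinished)
            self.browser.load(QUrl(url))  # load the page
            self.setCentralWidget(self.browser)  # set the central widget

The flow of this function:
- create a Qt window
- instantiate a QWebEngineView object
- bind the self.onCookieAdd handler
- bind the self.onLoadFinished handler
- load the web page

        # Page-load-finished handler
        def onLoadFinished(self):
            global USER_VID
            global HEADERS
            # Collect the cookies
            cookies = ['{}={};'.format(key, value) for key, value in self.DomainCookies.items()]
            cookies = ' '.join(cookies)
            # Add the cookies to the request headers
            HEADERS.update(Cookie=cookies)
            # Check whether the WeRead login succeeded
            if login_success(HEADERS):
                print('Logged in to WeRead successfully!')
                # Get the user vid
                if 'wr_vid' in self.DomainCookies.keys():
                    USER_VID = self.DomainCookies['wr_vid']
                    print('User id: {}'.format(USER_VID))
                    # Close the whole Qt window
                    self.close()
            else:
                print('Please scan the QR code to log in to WeRead...')

The flow of this function: when the page has finished loading, check whether the WeRead login succeeded. If it did, close the Qt window and start exporting data. If it did not, keep waiting for the user to scan the QR code.

        # Cookie-added handler
        def onCookieAdd(self, cookie):
            if 'weread.qq.com' in cookie.domain():
                name = cookie.name().data().decode('utf-8')
                value = cookie.value().data().decode('utf-8')
                if name not in self.DomainCookies:
                    self.DomainCookies.update({name: value})

The flow of this function: store the cookies set for the WeRead domain so that later requests can use them.

    books = get_bookshelf(USER_VID, HEADERS)  # fetch the books on the shelf
    books_finish_read = books['finishReadBooks']
    books_recent_read = books['recentBooks']
    books_all = books['allBooks']
    # Append each list to its own sheet of the Excel file
    write_excel_xls_append(data_dir + 'MyBookshelf.xls', 'Finished books', books_finish_read)
    write_excel_xls_append(data_dir + 'MyBookshelf.xls', 'Recently read books', books_recent_read)
    write_excel_xls_append(data_dir + 'MyBookshelf.xls', 'All books', books_all)

    # Export the notes of each finished book
    for index, book in enumerate(books_finish_read):
        book_id = book[0]
        book_name = book[1]
        notes = get_bookmarklist(book_id, HEADERS)
        # Write as UTF-8 so Chinese text is saved correctly on every platform
        with open(note_dir + book_name + '.txt', 'w', encoding='utf-8') as f:
            f.write(notes)
        print('Exported notes {} ({}/{})'.format(note_dir + book_name + '.txt', index + 1, len(books_finish_read)))

The flow of this code: call write_excel_xls_append to save the book lists, then export the notes.

4. wereader.py

    from collections import defaultdict
    import requests

    def get_bookshelf(userVid, headers):
        """Get all books on the shelf."""
        url = "https://i.weread.qq.com/shelf/friendCommon"
        params = dict(userVid=userVid)
        r = requests.get(url, params=params, headers=headers, verify=False)
        if r.ok:
            data = r.json()
        else:
            raise Exception(r.text)
        books_finish_read = set()  # finished books
        books_recent_read = set()  # recently read books
        books_all = set()          # all books on the shelf
        # books_finish_read is filled the same way from the finished-books
        # list in the response (that loop is not shown here)
        for book in data['recentBooks']:
            if not book['bookId'].isdigit():  # skip official-account articles
                continue
            b = Book(book['bookId'], book['title'], book['author'], book['cover'], book['intro'], book['category'])
            books_recent_read.add(b)
        # Sets are combined with |, not + (set + set raises TypeError)
        books_all = books_finish_read | books_recent_read
        return dict(finishReadBooks=books_finish_read, recentBooks=books_recent_read, allBooks=books_all)

The flow of this function: fetch the recently read books, the finished books and all books, filter out official-account entries, and return the book data as a dictionary.

    def get_bookmarklist(bookId, headers):
        """Get the notes of one book and return them as Markdown text."""
        url = "https://i.weread.qq.com/book/bookmarklist"
        params = dict(bookId=bookId)
        r = requests.get(url, params=params, headers=headers, verify=False)
        if r.ok:
            data = r.json()
            # clipboard.copy(json.dumps(data, indent=4, sort_keys=True))
        else:
            raise Exception(r.text)
        chapters = {c['chapterUid']: c['title'] for c in data['chapters']}
        contents = defaultdict(list)
        for item in sorted(data['updated'], key=lambda x: x['chapterUid']):
            # for item in data['updated']:
            chapter = item['chapterUid']
            text = item['markText']
            create_time = item["createTime"]
            start = int(item['range'].split('-')[0])
            contents[chapter].append((start, text))
        chapters_map = {title: level for level, title in get_chapters(int(bookId), headers)}
        res = ''
        for c in sorted(chapters.keys()):
            title = chapters[c]
            res += '#' * chapters_map[title] + ' ' + title + '\n'
            for start, text in sorted(contents[c], key=lambda e: e[0]):
                res += '> ' + text.strip() + '\n\n'
            res += '\n'
        return res

The flow of this function: fetch the notes of one book, convert the returned data to Markdown, and return the resulting text.

How to run

    # Change to the project directory
    cd <directory name>
    # Uninstall the dependencies first
    pip uninstall -y -r requirement.txt
    # Then reinstall them
    pip install -r requirement.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
    # Start the program
    python pyqt_gui.py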
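The Markdown step inside get_bookmarklist can be tried without logging in or touching the network. The sketch below is only an illustration under stated assumptions: the function name bookmarks_to_markdown, the chapter_levels mapping (standing in for the result of get_chapters) and the sample payload are hypothetical. Only the field names chapterUid, title, markText and range come from the code above.

```python
from collections import defaultdict


def bookmarks_to_markdown(data, chapter_levels):
    """Render a bookmark payload as Markdown headings and block quotes.

    data           -- dict shaped like the response used in get_bookmarklist
    chapter_levels -- hypothetical {title: heading level} mapping
    """
    chapters = {c['chapterUid']: c['title'] for c in data['chapters']}

    # Group highlights by chapter, remembering where each one starts
    contents = defaultdict(list)
    for item in data['updated']:
        start = int(item['range'].split('-')[0])
        contents[item['chapterUid']].append((start, item['markText']))

    parts = []
    for uid in sorted(chapters):
        title = chapters[uid]
        # Fall back to a top-level heading if the title has no known level
        level = chapter_levels.get(title, 1)
        parts.append('#' * level + ' ' + title + '\n')
        # Highlights appear in reading order within the chapter
        for _, text in sorted(contents[uid], key=lambda e: e[0]):
            parts.append('> ' + text.strip() + '\n\n')
        parts.append('\n')
    return ''.join(parts)


# Made-up sample data, not a real API response
sample = {
    'chapters': [
        {'chapterUid': 2, 'title': 'Chapter 2'},
        {'chapterUid': 1, 'title': 'Chapter 1'},
    ],
    'updated': [
        {'chapterUid': 1, 'markText': ' second highlight ', 'range': '300-320'},
        {'chapterUid': 1, 'markText': 'first highlight', 'range': '10-25'},
        {'chapterUid': 2, 'markText': 'another highlight', 'range': '5-9'},
    ],
}

markdown = bookmarks_to_markdown(sample, {'Chapter 1': 1, 'Chapter 2': 2})
print(markdown)
```

Collecting the pieces in a list and joining them once avoids repeated string concatenation. The fallback heading level is a choice made for this sketch: the code above would raise a KeyError for a title missing from chapters_map.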