Converting a web page to PDF in Python 3 requires the pdfkit module, which has to be installed first:
pip install pdfkit
A quick search on Baidu then turns up some code for generating a PDF. For my test, I converted my blog's home page:
#!/usr/bin/python3
# coding: utf-8
import requests
import pdfkit

# Download the page and save it as a local HTML file
response = requests.get("https://sulao.cn/")
htmls = response.content.decode("utf-8")
with open("index.html", "w", encoding="utf-8") as f:
    f.write(htmls)

# Convert the saved HTML file to PDF
pdfkit.from_file("index.html", "test.pdf")
Running it raised an error. Reading the message carefully, it says:
OSError: No wkhtmltopdf executable found: "b''" If this file exists please check that this process can read it. Otherwise please install wkhtmltopdf - https://github.com/JazzCore/python-pdfkit/wiki/Installing-wkhtmltopdf
It looks like we need to install the wkhtmltopdf program. After another search, it can be downloaded here:
https://wkhtmltopdf.org/downloads.html
After installing it, note where it was installed, then modify the code to point at the executable:
#!/usr/bin/python3
# coding: utf-8
import requests
import pdfkit

# Location of the installed wkhtmltopdf executable
pdk_path = r'D:\python3\wkhtmltopdf\bin\wkhtmltopdf.exe'
config = pdfkit.configuration(wkhtmltopdf=pdk_path)

response = requests.get("https://sulao.cn/")
htmls = response.content.decode("utf-8")
with open("index.html", "w", encoding="utf-8") as f:
    f.write(htmls)

pdfkit.from_file("index.html", "test.pdf", configuration=config)
Run it again and the PDF is generated successfully.
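If you would rather not hard-code the install path, one option is to look for wkhtmltopdf on the PATH first and fall back to a fixed location only when that fails. This is just a sketch: the helper names `find_wkhtmltopdf` and `make_config` are my own, not part of pdfkit, and the fallback is simply the path used in this post.

```python
import shutil

# Fallback location (the install path used in this post); adjust for your machine.
FALLBACK_PATH = r"D:\python3\wkhtmltopdf\bin\wkhtmltopdf.exe"


def find_wkhtmltopdf(fallback=FALLBACK_PATH):
    """Return the wkhtmltopdf executable found on PATH, else the fallback path."""
    return shutil.which("wkhtmltopdf") or fallback


def make_config(fallback=FALLBACK_PATH):
    """Build a pdfkit configuration from whichever path was found.

    pdfkit is imported here so the lookup helper works without it installed.
    """
    import pdfkit

    return pdfkit.configuration(wkhtmltopdf=find_wkhtmltopdf(fallback))
```

The result of `make_config()` can then be passed as `configuration=` in the same way as `config` in the code above.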
The module has other features too, so let's take a look.
To convert a remote page straight to PDF, use the following code:
#!/usr/bin/python3
# coding: utf-8
import pdfkit

# Location of the installed wkhtmltopdf executable
pdk_path = r'D:\python3\wkhtmltopdf\bin\wkhtmltopdf.exe'
config = pdfkit.configuration(wkhtmltopdf=pdk_path)

pdfkit.from_url("https://sulao.cn/", "test.pdf", configuration=config)
Generating a PDF directly from a string works much like generating one from an HTML file:
#!/usr/bin/python3
# coding: utf-8
import pdfkit

pdk_path = r'D:\python3\wkhtmltopdf\bin\wkhtmltopdf.exe'  # install location
config = pdfkit.configuration(wkhtmltopdf=pdk_path)

# Read the HTML into a string, then convert the string
with open("index.html", "r", encoding="utf-8") as f:
    htmls = f.read()
pdfkit.from_string(htmls, "test.pdf", configuration=config)
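All three pdfkit functions also accept an `options` dictionary, which pdfkit passes on to wkhtmltopdf as command-line flags. The sketch below is not something I tested for this post: the option values are only examples, and the names `PDF_OPTIONS` and `url_to_pdf` are made up for illustration.

```python
# Keys are wkhtmltopdf option names without the leading "--".
# The values here are illustrative assumptions, not recommendations.
PDF_OPTIONS = {
    "page-size": "A4",
    "encoding": "UTF-8",
    "margin-top": "10mm",
    "margin-bottom": "10mm",
}


def url_to_pdf(url, output_path, wkhtmltopdf_path):
    """Convert a remote page to PDF using the options above."""
    import pdfkit  # imported here so the options can be inspected without pdfkit

    config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path)
    return pdfkit.from_url(url, output_path, options=PDF_OPTIONS, configuration=config)
```

For example, `url_to_pdf("https://sulao.cn/", "test.pdf", pdk_path)` should produce the same file as before, but on A4 pages with the margins given above.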
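Those are the commonly used pdfkit features I wanted to share. Next, let's look at generating images, which works much the same way as generating PDFs. First we need to install the imgkit module: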
pip install imgkit
Here is the code:
#!/usr/bin/python3
# coding: utf-8
import imgkit

imgkit_path = r'D:\python3\wkhtmltopdf\bin\wkhtmltoimage.exe'
config = imgkit.config(wkhtmltoimage=imgkit_path)

imgkit.from_url("https://sulao.cn/", "test1.png", config=config)
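The main change is swapping wkhtmltopdf.exe for wkhtmltoimage.exe. Generating images from a string or from an HTML file works the same way, so I won't write out every variant.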
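For reference, the file and string variants might look roughly like this. It is a sketch only: the function name `html_to_images` and the output file names are my own choices, and it assumes an index.html saved earlier as in the pdfkit examples.

```python
def html_to_images(html_path, wkhtmltoimage_path):
    """Render one HTML file to two PNGs: once by path, once from its contents."""
    import imgkit  # imported here so the module loads even without imgkit

    config = imgkit.config(wkhtmltoimage=wkhtmltoimage_path)

    # Variant 1: give imgkit the path of the HTML file
    imgkit.from_file(html_path, "test2.png", config=config)

    # Variant 2: read the HTML into a string and convert the string
    with open(html_path, "r", encoding="utf-8") as f:
        html = f.read()
    imgkit.from_string(html, "test3.png", config=config)
```

Calling `html_to_images("index.html", imgkit_path)` would write test2.png and test3.png next to the script.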