
html2pdf: converting web pages to PDF

 5 years ago
source link: https://studygolang.com/articles/14299?amp%3Butm_medium=referral

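GitHub link

I searched all over Google and Baidu and still found no good solution.

On GitHub, the JS libraries with more than ten thousand stars turned out to be underwhelming: a blurry img passed off as a polished PDF?

After a day of hard thinking, and with the help of HiQPdf (iText, Spire.PDF and similar libraries would probably work too, since the same approach applies), I finally arrived at what looks like the best solution so far. And it is ridiculously simple, believe it or not.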

<img src="https://github.com/yshch2008/perfecthtml2pdf_with_Hiq/blob/master/timg2.jpg"/>

First, a screenshot of the result (= ̄ω ̄=)

<img src="https://github.com/yshch2008/perfecthtml2pdf_with_Hiq/blob/master/git_index.png"/>

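Simple version: read from a file. The upside is that it ignores any site-specific customization; the downside, of course, is that it is not automated (,,・ω・,,)

Put a receiver in your backend. Java, Golang and Python versions will follow once my laziness eases up: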

public ActionResult getWOW()
        {
            // read the saved page source; System.IO.File is spelled out because
            // File(...) inside a controller refers to the MVC file-result helper
            var htm = System.IO.File.ReadAllText(Server.MapPath("/App_Data/wow.txt"), Encoding.UTF8);
            // create the HTML to PDF converter
            HtmlToPdf htmlToPdfConverter = new HtmlToPdf();

            // set browser width
            htmlToPdfConverter.BrowserWidth = 1440;

            // set browser height if specified, otherwise use the default
            htmlToPdfConverter.BrowserHeight = htmlToPdfConverter.BrowserWidth * 2;

            // set HTML Load timeout
            //htmlToPdfConverter.HtmlLoadedTimeout = int.Parse(textBoxLoadHtmlTimeout.Text);

            // set PDF page size and orientation
            //htmlToPdfConverter.Document.PageSize = new PdfPageSize((float)(width / 2.4), (float)(height / 2.4));
            htmlToPdfConverter.Document.PageSize = new PdfPageSize(htmlToPdfConverter.BrowserWidth, htmlToPdfConverter.BrowserHeight);
            htmlToPdfConverter.Document.PageOrientation = PdfPageOrientation.Portrait;

            // set the PDF standard used by the document
            htmlToPdfConverter.Document.PdfStandard = PdfStandard.Pdf;//checkBoxPdfA.Checked ? PdfStandard.PdfA :

            // set PDF page margins
            htmlToPdfConverter.Document.Margins = new PdfMargins(0);

            // set whether to embed the true type font in PDF
            htmlToPdfConverter.Document.FontEmbedding = true;

            // set triggering mode; for WaitTime mode set the wait time before convert
            htmlToPdfConverter.TriggerMode = ConversionTriggerMode.Auto;

            // set header and footer
            //SetHeader(htmlToPdfConverter.Document);
            //SetFooter(htmlToPdfConverter.Document);

            // set the document security
            //htmlToPdfConverter.Document.Security.OpenPassword = textBoxOpenPassword.Text;
            htmlToPdfConverter.Document.Security.AllowPrinting = true;

            // set the permissions password too if an open password was set
            if (htmlToPdfConverter.Document.Security.OpenPassword != null && htmlToPdfConverter.Document.Security.OpenPassword != String.Empty)
                htmlToPdfConverter.Document.Security.PermissionsPassword = htmlToPdfConverter.Document.Security.OpenPassword + "_admin";

            //Cursor = Cursors.WaitCursor;

            // convert the HTML string to a PDF memory buffer
            byte[] pdfBuffer = htmlToPdfConverter.ConvertHtmlToMemory(htm, null);

            // send the PDF document to the browser; the last argument is the download
            // file name, so appending a timestamp to it avoids duplicate names
            return File(pdfBuffer, "application/octet-stream", "test.pdf");
        }

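Press F12 in the browser to open the console and enter document.getElementsByTagName("html")[0].innerHTML

Copy the resulting string and save it to the path the controller reads (/App_Data/wow.txt in the code above)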
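The console step can also be wrapped in a small helper. This is only a sketch: `serializePage` is a hypothetical name, and it uses `outerHTML` plus the doctype instead of the `innerHTML` call above, on the assumption that keeping the `<html>` tag and doctype in the saved file is harmless or helpful for rendering.

```javascript
// Serialize a whole page, including the doctype and the <html> element itself.
// `doc` is a Document (or any object exposing doctype and documentElement).
function serializePage(doc) {
  var doctype = doc.doctype ? '<!DOCTYPE ' + doc.doctype.name + '>\n' : '';
  return doctype + doc.documentElement.outerHTML;
}
```

In the Chrome or Firefox console, `copy(serializePage(document))` should place the string on the clipboard; `copy` is a DevTools console utility, not part of page JavaScript.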

Then start the server and hit this controller at " http://localhost:3095/ "

A PDF will then download. Open it and take a look. Excited? Happy?

Advanced version: frontend-backend interaction

  1. Controller:
public ActionResult GetPDFfromHtmlCode(int width, int height, string htm)
        {
            htm = Server.UrlDecode(htm);
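            //var path = Server.MapPath("/").Replace("\\", "/");
            // hard-coded site root for this demo; point it at wherever your resource files live
            var path = @"C:\Users\Public\Documents\DevExpress Demos 18.1\Components\ASP.NET\CS\ASPxCardViewDemos";
            // rewrite relative resource paths so the converter can load them from disk
            htm = htm.Replace("/Content/",path+ "/Content/")
                .Replace("src=\"", "src=\""+path)
                //.Replace("<script src=\"", "<script src=\"" + path)
                //.Replace("type=\"text/css\" href=\"", "type=\"text/css\" href=\"" + path)
                // caution: this also turns "http://" into "http:/", so it only suits pages without absolute URLs
                .Replace("//","/")
                ;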
            // create the HTML to PDF converter
            HtmlToPdf htmlToPdfConverter = new HtmlToPdf();

            // set browser width
            htmlToPdfConverter.BrowserWidth = 1440;

            // set browser height if specified, otherwise use the default
            htmlToPdfConverter.BrowserHeight = height;

            // set HTML Load timeout
            //htmlToPdfConverter.HtmlLoadedTimeout = int.Parse(textBoxLoadHtmlTimeout.Text);

            // set PDF page size and orientation
            //htmlToPdfConverter.Document.PageSize = new PdfPageSize((float)(width / 2.4), (float)(height / 2.4));
            htmlToPdfConverter.Document.PageSize = new PdfPageSize(htmlToPdfConverter.BrowserWidth, height);
            htmlToPdfConverter.Document.PageOrientation = PdfPageOrientation.Portrait;

            // set the PDF standard used by the document
            htmlToPdfConverter.Document.PdfStandard = PdfStandard.Pdf;//checkBoxPdfA.Checked ? PdfStandard.PdfA :

            // set PDF page margins
            htmlToPdfConverter.Document.Margins = new PdfMargins(0);

            // set whether to embed the true type font in PDF
            htmlToPdfConverter.Document.FontEmbedding = true;

            // set triggering mode; for WaitTime mode set the wait time before convert
            htmlToPdfConverter.TriggerMode = ConversionTriggerMode.Auto;

            // set header and footer
            //SetHeader(htmlToPdfConverter.Document);
            //SetFooter(htmlToPdfConverter.Document);

            // set the document security
            //htmlToPdfConverter.Document.Security.OpenPassword = textBoxOpenPassword.Text;
            htmlToPdfConverter.Document.Security.AllowPrinting = true;

            // set the permissions password too if an open password was set
            if (htmlToPdfConverter.Document.Security.OpenPassword != null && htmlToPdfConverter.Document.Security.OpenPassword != String.Empty)
                htmlToPdfConverter.Document.Security.PermissionsPassword = htmlToPdfConverter.Document.Security.OpenPassword + "_admin";

            //Cursor = Cursors.WaitCursor;

            // convert the HTML string to a PDF memory buffer
            byte[] pdfBuffer = htmlToPdfConverter.ConvertHtmlToMemory(htm, null);

            // send the PDF document to the browser as a download named test.pdf
            return File(pdfBuffer, "application/octet-stream","test.pdf");
        }
  2. Trigger it from JS (remember to escape the HTML):
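var DownLoadFile = function (options) {
            var config = $.extend(true, { method: 'post' }, options);
            // a form's target matches an iframe by name, so set name as well as id
            var $iframe = $('<iframe id="down-file-iframe" name="down-file-iframe" style="display:none" />');
            var $form = $('<form target="down-file-iframe" />').attr({ method: config.method, action: config.url });
            for (var key in config.data) {
                // set values through jQuery so quotes in the data cannot break the markup
                $('<input type="hidden" />').attr('name', key).val(config.data[key]).appendTo($form);
            }
            $(document.body).append($iframe, $form);
            $form[0].submit();
            // clean up later; removing the iframe right away could cancel the pending download
            setTimeout(function () { $form.remove(); $iframe.remove(); }, 60000);
        }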
         DownLoadFile({
                url: "http://localhost:3095/test/getPDFfromHtmlCode", //the URI of your controller
                data: { "height": $("body").outerHeight(), "width": $("body").outerWidth(), "htm": escape(document.getElementsByTagName("html")[0].innerHTML), "head": "test" }//the data in your request ,according to your controller
            });
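
It may help to see why the controller still calls Server.UrlDecode even though form fields are normally decoded once on arrival: the browser percent-encodes the already-escaped field a second time when it submits the form. The sketch below illustrates this with the standard `URLSearchParams`; `buildFormBody` is a hypothetical helper, and `encodeURIComponent` stands in for the `escape` call used above.

```javascript
// Build the application/x-www-form-urlencoded body that a hidden form would send.
function buildFormBody(data) {
  var params = new URLSearchParams();
  Object.keys(data).forEach(function (key) {
    params.append(key, String(data[key]));
  });
  return params.toString();
}

// The HTML is encoded once by hand, then once more by the form encoding.
var body = buildFormBody({ height: 900, width: 1440, htm: encodeURIComponent('<p>hi</p>') });
console.log(body);
// height=900&width=1440&htm=%253Cp%253Ehi%253C%252Fp%253E

// Decoding the form body once gives back the hand-encoded string, not the raw HTML.
console.log(new URLSearchParams(body).get('htm'));
// %3Cp%3Ehi%3C%2Fp%3E
```

The second decode, on the hand-encoded string, is what recovers the original markup on the server.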

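Final notes

First, the page you convert should ideally reference all of its resource files by absolute paths. Even the cleverest cook cannot make a meal without rice: if your server cannot fetch the resource files to render them, it cannot give you a perfect PDF.

Second, if the paths are relative, use Replace on the backend to turn them into absolute paths that the server can reach.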
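
The same rewriting could also be done in the browser before the HTML is posted, instead of with Replace on the backend. The following is a minimal sketch under that assumption; `absolutizeUrls` and its `baseUrl` parameter are hypothetical names, and the regular expression only handles quoted `src` and `href` attributes.

```javascript
// Rewrite relative src/href attribute values into absolute URLs.
// baseUrl is the address the page was loaded from.
function absolutizeUrls(html, baseUrl) {
  return html.replace(/\b(src|href)=(["'])(.*?)\2/gi, function (match, attr, quote, value) {
    // leave empty values, in-page anchors and non-fetchable schemes untouched
    if (value === '' || /^(#|data:|javascript:|mailto:)/i.test(value)) {
      return match;
    }
    try {
      return attr + '=' + quote + new URL(value, baseUrl).href + quote;
    } catch (e) {
      return match; // not a parsable URL; keep the original text
    }
  });
}

console.log(absolutizeUrls('<img src="/Content/a.png">', 'http://localhost:3095/test/page'));
// <img src="http://localhost:3095/Content/a.png">
```

In the page itself this would be called as `absolutizeUrls(document.documentElement.innerHTML, location.href)` before escaping the result.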

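Third, the last resort is to manually download all of the page's resource files to the server and point the relative paths at that location. Apart from the most exotic styles, you basically get WYSIWYG.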
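
To find out which files would need downloading, the page source can be scanned for resource references first. This is a rough sketch, not part of the original project: `listResourceUrls` is a hypothetical helper that only looks at quoted `src`/`href` attributes on `img`, `script` and `link` tags and ignores resources referenced from inside CSS.

```javascript
// Collect the distinct absolute URLs of images, scripts and stylesheets in an HTML string.
function listResourceUrls(html, baseUrl) {
  var pattern = /<(?:img|script|link)\b[^>]*?\b(?:src|href)=(["'])(.*?)\1/gi;
  var seen = {};
  var urls = [];
  var match;
  while ((match = pattern.exec(html)) !== null) {
    var value = match[2];
    if (value === '' || /^data:/i.test(value)) {
      continue; // nothing to download for empty or inline data values
    }
    var absolute;
    try {
      absolute = new URL(value, baseUrl).href;
    } catch (e) {
      continue; // skip values that cannot be resolved against baseUrl
    }
    if (!seen[absolute]) {
      seen[absolute] = true;
      urls.push(absolute);
    }
  }
  return urls;
}
```

Each returned URL could then be fetched once and stored under the folder that the rewritten paths point to.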

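The free version of HiQPdf stamps a watermark in the header area of the file. This is where our approach pays off: the exported PDF is not an image but real PDF content, so you can open it in software such as Foxit PDF and delete the watermark.

(Of course you could also replace the encoded content directly on the backend. I won't bother with that here, my eyes hurt (๑´ڡ`๑))

With a little imagination, you can pair this with Markdown, online Excel, online résumé builders and so on, and easily get good-looking PDF documents, charts and résumés ~~

So don't fixate on whether my code is pretty; what matters is the approach: cut through the mountains, fill in the seas. The frontend can't render well, so move the work to the backend; the backend can't resolve relative paths, so convert them to absolute ones; doing it by hand is tedious, so wrap it in a POST together with encode and decode functions.

Look, html2canvas-to-PDF solutions have so many stars, yet the result is really just a blurry image, while we, with a simple and low-tech idea, got a nearly perfect PDF file. So it's all about the approach, folks ~~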

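Disclaimer:

  • I have no intention of supporting piracy. If you have a HiQPdf license, just use it; trying the free version is not against the law either, and the results are still decent ~~
  • For non-developers or pure frontend folks, this method does carry some cost, and I apologize for that
  • Given the second point, I plan to set up an online site once my job settles down. I already have an implementation plan, and I hope like-minded experts will offer plenty of advice and help when the time comes ~~

Since you've read this far, move your mouse to the top right and give it a star ( ¯ ³¯ )♡ㄘゅ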

