Three ways to save web pages as Word files in PHP

WBOY
2016-07-13 10:34:57

1. Two approaches to generating Word documents with PHP

1. Use the COM component on Windows
2. Use PHP to write the content directly into a .doc file

The implementation of each approach is described below.

2. Use the COM component on Windows

Principle: COM is a class provided by a PHP extension on Windows. On a server with Office installed, PHP can instantiate word.application through COM and drive Word to generate documents. PHP manual: http://www.php.net/manual/en/class.com.php

The example from the official manual:

The code is as follows:
<?php
// starting word
$word = new COM("word.application") or die("Unable to instantiate Word");
echo "Loaded Word, version {$word->Version}\n";

//bring it to front
$word->Visible = 1;

//open an empty document
$word->Documents->Add();

//do some weird stuff
$word->Selection->TypeText("This is a test...");
$word->Documents[1]->SaveAs("Useless test.doc");

//closing word
$word->Quit();

//free the object
$word = null;
?>

Personal suggestion: to find out what the methods on the COM object do, you have to look them up in the official documentation, and the editor offers no code completion for them, which is inconvenient. This approach is also slow, so it is not recommended.
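
Because the COM approach only works on a Windows server, it can help to check for it before calling new COM(...). The following is a minimal sketch, not part of the original article: the function name canUseWordCom is hypothetical, and it assumes PHP 7.2 or later for the PHP_OS_FAMILY constant. It only checks that the COM class is present; it does not verify that Word itself is installed.

```php
<?php
// Hypothetical helper (not from the original article): reports whether the
// COM class is usable on the current server.
function canUseWordCom(): bool
{
    // The COM class comes from the com_dotnet extension, which exists only on Windows.
    return PHP_OS_FAMILY === 'Windows'
        && extension_loaded('com_dotnet')
        && class_exists('COM');
}

if (canUseWordCom()) {
    echo "COM is available; Word automation can be attempted.\n";
} else {
    echo "COM is not available; use one of the file-writing methods instead.\n";
}
```

On Linux or macOS the second branch always runs, which is one more reason to prefer the file-writing methods described next.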

3. Use PHP to write the content into the doc file
This approach can be done in two ways:

1. Generate MHT format (very similar to HTML) and write it into the .doc file
2. Write plain HTML into the .doc file

1), generate MHT format (very similar to HTML) and write it into the .doc file

The code is as follows:
/**
 * Build Word document content from HTML code.
 * The generated document is really an MHT file. The function analyzes the HTML and downloads the images referenced in the page from their remote locations.
 * This function depends on the class MhtFileMaker.
 * It parses img tags and extracts the value of the src attribute. The src value must be enclosed in quotes, otherwise it cannot be extracted.
 *
 * @param string $content HTML content
 * @param string $absolutePath Absolute base path of the web page. If image paths in the HTML are relative, pass this parameter so the function can complete them into absolute paths. It must end with /
 * @param bool $isEraseLink Whether to remove links from the HTML content
 * @return string The MHT document content
 */
function getWordDocument($content, $absolutePath = "", $isEraseLink = true)
{
    $mht = new MhtFileMaker();
    if ($isEraseLink) {
        // Remove links but keep their text
        $content = preg_replace('/<a\s*.*?\s*>(\s*.*?\s*)<\/a>/i', '$1', $content);
    }

    $images = array();
    $files = array();
    $matches = array();
    // This algorithm requires the src attribute value to be enclosed in quotes.
    // Pattern reconstructed: matches <img ... src="..."> with a quoted src.
    if (preg_match_all('/<img[^>]*?src\s*=\s*["\'](.*?)["\'][^>]*>/i', $content, $matches)) {
        $arrPath = $matches[1];
        for ($i = 0; $i < count($arrPath); $i++) {
            $imgPath = trim($arrPath[$i]);
            if ($imgPath != "") {
                $files[] = $imgPath;
                if (substr($imgPath, 0, 7) != 'http://') {
                    // Relative link: prepend the base path (absolute links get no prefix)
                    $imgPath = $absolutePath . $imgPath;
                }
                $images[] = $imgPath;
            }
        }
    }
    $mht->AddContents("tmp.html", $mht->GetMimeType("tmp.html"), $content);

    for ($i = 0; $i < count($images); $i++) {
        $image = $images[$i];
        $imgcontent = @file_get_contents($image);
        if ($imgcontent !== false) {
            $mht->AddContents($files[$i], $mht->GetMimeType($image), $imgcontent);
        } else {
            echo "file:" . $image . " not exist!<br />";
        }
    }

    return $mht->GetFile();
}

The main job of this function is to find all image addresses in the HTML code and download them one by one. Once an image's content has been fetched, the function calls the MhtFileMaker class to add the image to the MHT file. The details of adding a file are encapsulated in the MhtFileMaker class.
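
The image-analysis step described above can be looked at in isolation. The following is a minimal sketch under stated assumptions: the function name extractImageSources and the example URLs are hypothetical, and unlike the original function it also treats https:// addresses as absolute. It returns a map from the src value as written in the HTML to the address that would be downloaded.

```php
<?php
// Hypothetical helper: collect the quoted src values of <img> tags and
// resolve relative ones against $basePath (assumed to end with "/").
function extractImageSources(string $html, string $basePath = ''): array
{
    $sources = [];
    if (preg_match_all('/<img\b[^>]*?\bsrc\s*=\s*["\']([^"\']*)["\']/i', $html, $matches)) {
        foreach ($matches[1] as $src) {
            $src = trim($src);
            if ($src === '') {
                continue;
            }
            // Absolute URLs are kept as they are; anything else gets the base path prepended.
            $isAbsolute = preg_match('#^https?://#i', $src) === 1;
            $sources[$src] = $isAbsolute ? $src : $basePath . $src;
        }
    }
    return $sources;
}

$html = '<p><img src="images/a.png"> <img alt="b" src=\'http://example.com/b.jpg\'></p>';
print_r(extractImageSources($html, 'http://www.yoursite.com/Music/etc/'));
```

As in the original function, an unquoted src value such as src=a.png is not matched, so the quoting requirement from the doc comment still applies.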

Usage method 1: Remote call

The code is as follows:
$url = "http://www.***.com";

$content = file_get_contents($url);

$fileContent = getWordDocument($content, "http://www.yoursite.com/Music/etc/");
$fp = fopen("test.doc", 'w');
fwrite($fp, $fileContent);
fclose($fp);

Here, $content should be the HTML source code, and the URL passed as the second argument is the base address used to complete relative image paths in that HTML.

Usage method 2: Locally generated call

The code is as follows:

header("Cache-Control: no-cache, must-revalidate");
header("Pragma: no-cache");
$wordStr = 'PHP tutorial website--jb51.net';
$fileContent = getWordDocument($wordStr);
$intro = 'intro'; // placeholder label; the original snippet uses $intro without defining it
$fileName = iconv("utf-8", "GBK", 'PHP tutorial' . '_' . $intro . '_' . rand(100, 999));
header("Content-Type: application/msword");
header("Content-Disposition: attachment; filename=" . $fileName . ".doc");
echo $fileContent;

Note that before using this function you need to include the MhtFileMaker class, which generates the MHT document.

The code is as follows:
<?php
/***********************************************************************
Class:        Mht File Maker
Version:      1.2 beta
Date:         02/11/2007
Author:       Wudi
Description:  The class can make .mht file.
***********************************************************************/

class MhtFileMaker{
    var $config = array();
    var $headers = array();
    var $headers_exists = array();
    var $files = array();
    var $boundary;
    var $dir_base;
    var $page_first;

    function __construct($config = array()){
        $this->config = $config;
    }

    function SetHeader($header){
        $this->headers[] = $header;
        $key = strtolower(substr($header, 0, strpos($header, ':')));
        $this->headers_exists[$key] = TRUE;
    }

    function SetFrom($from){
        $this->SetHeader("From: $from");
    }

    function SetSubject($subject){
        $this->SetHeader("Subject: $subject");
    }

    function SetDate($date = NULL, $istimestamp = FALSE){
        if ($date == NULL) {
            $date = time();
        }
        if ($istimestamp == TRUE) {
            $date = date('D, d M Y H:i:s O', $date);
        }
        $this->SetHeader("Date: $date");
    }

    function SetBoundary($boundary = NULL){
        if ($boundary == NULL) {
            $this->boundary = '--' . strtoupper(md5(mt_rand())) . '_MULTIPART_MIXED';
        } else {
            $this->boundary = $boundary;
        }
    }

    function SetBaseDir($dir){
        $this->dir_base = str_replace("\\", "/", realpath($dir));
    }

    function SetFirstPage($filename){
        $this->page_first = str_replace("\\", "/", realpath("{$this->dir_base}/$filename"));
    }

    function AutoAddFiles(){
        if (!isset($this->page_first)) {
            exit ('Not set the first page.');
        }
        $filepath = str_replace($this->dir_base, '', $this->page_first);
        $filepath = 'http://mhtfile' . $filepath;
        $this->AddFile($this->page_first, $filepath, NULL);
        $this->AddDir($this->dir_base);
    }

    function AddDir($dir){
        $handle_dir = opendir($dir);
        while ($filename = readdir($handle_dir)) {
            if (($filename!='.') && ($filename!='..') && ("$dir/$filename"!=$this->page_first)) {
                if (is_dir("$dir/$filename")) {
                    $this->AddDir("$dir/$filename");
                }elseif (is_file("$dir/$filename")) {
                    $filepath = str_replace($this->dir_base, '', "$dir/$filename");
                    $filepath = 'http://mhtfile' . $filepath;
                    $this->AddFile("$dir/$filename", $filepath, NULL);
                }
            }
        }
        closedir($handle_dir);
    }

    function AddFile($filename, $filepath = NULL, $encoding = NULL){
        if ($filepath == NULL) {
            $filepath = $filename;
        }
        $mimetype = $this->GetMimeType($filename);
        $filecont = file_get_contents($filename);
        $this->AddContents($filepath, $mimetype, $filecont, $encoding);
    }

    function AddContents($filepath, $mimetype, $filecont, $encoding = NULL){
        if ($encoding == NULL) {
            $filecont = chunk_split(base64_encode($filecont), 76);
            $encoding = 'base64';
        }
        $this->files[] = array('filepath' => $filepath,
                               'mimetype' => $mimetype,
                               'filecont' => $filecont,
                               'encoding' => $encoding);
    }

    function CheckHeaders(){
        if (!array_key_exists('date', $this->headers_exists)) {
            $this->SetDate(NULL, TRUE);
        }
        if ($this->boundary == NULL) {
            $this->SetBoundary();
        }
    }

    function CheckFiles(){
        if (count($this->files) == 0) {
            return FALSE;
        } else {
            return TRUE;
        }
    }

    function GetFile(){
        $this->CheckHeaders();
        if (!$this->CheckFiles()) {
            exit ('No file was added.');
        }
        $contents = implode("\r\n", $this->headers);
        $contents .= "\r\n";
        $contents .= "MIME-Version: 1.0\r\n";
        $contents .= "Content-Type: multipart/related;\r\n";
        $contents .= "\tboundary=\"{$this->boundary}\";\r\n";
        $contents .= "\ttype=\"" . $this->files[0]['mimetype'] . "\"\r\n";
        $contents .= "X-MimeOLE: Produced By Mht File Maker v1.0 beta\r\n";
        $contents .= "\r\n";
        $contents .= "This is a multi-part message in MIME format.\r\n";
        $contents .= "\r\n";
        foreach ($this->files as $file) {
            $contents .= "--{$this->boundary}\r\n";
            $contents .= "Content-Type: $file[mimetype]\r\n";
            $contents .= "Content-Transfer-Encoding: $file[encoding]\r\n";
            $contents .= "Content-Location: $file[filepath]\r\n";
            $contents .= "\r\n";
            $contents .= $file['filecont'];
            $contents .= "\r\n";
        }
        $contents .= "--{$this->boundary}--\r\n";
        return $contents;
    }

    function MakeFile($filename){
        $contents = $this->GetFile();
        $fp = fopen($filename, 'w');
        fwrite($fp, $contents);
        fclose($fp);
    }

    function GetMimeType($filename){
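        $pathinfo = pathinfo($filename);
        // Files without an extension fall through to the default type
        $extension = isset($pathinfo['extension']) ? strtolower($pathinfo['extension']) : '';
        switch ($extension) {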
            case 'htm': $mimetype = 'text/html'; break;
            case 'html': $mimetype = 'text/html'; break;
            case 'txt': $mimetype = 'text/plain'; break;
            case 'cgi': $mimetype = 'text/plain'; break;
            case 'php': $mimetype = 'text/plain'; break;
            case 'css': $mimetype = 'text/css'; break;
            case 'jpg': $mimetype = 'image/jpeg'; break;
            case 'jpeg': $mimetype = 'image/jpeg'; break;
            case 'jpe': $mimetype = 'image/jpeg'; break;
            case 'gif': $mimetype = 'image/gif'; break;
            case 'png': $mimetype = 'image/png'; break;
            default: $mimetype = 'application/octet-stream'; break;
        }
        return $mimetype;
    }
}
?>
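
For orientation, the document that GetFile() assembles is a MIME multipart/related message: a header block, then one base64-encoded part per file, each introduced by a boundary line. The following is a minimal sketch of that layout with a single HTML part. The function name buildMinimalMht, the boundary string, and the sample HTML are assumptions for illustration, and the sketch is a simplification, not a replacement for the class above.

```php
<?php
// Minimal sketch of an MHT (MIME multipart/related) document with one HTML part.
// The boundary string and the Content-Location value are arbitrary assumptions.
function buildMinimalMht(string $html, string $boundary = 'EXAMPLE_BOUNDARY'): string
{
    $eol = "\r\n"; // MIME lines end with CRLF

    $out  = 'MIME-Version: 1.0' . $eol;
    $out .= 'Content-Type: multipart/related; boundary="' . $boundary . '"; type="text/html"' . $eol;
    $out .= $eol;

    // Each part starts with "--" followed by the boundary.
    $out .= '--' . $boundary . $eol;
    $out .= 'Content-Type: text/html' . $eol;
    $out .= 'Content-Transfer-Encoding: base64' . $eol;
    $out .= 'Content-Location: tmp.html' . $eol;
    $out .= $eol;
    // base64 data is wrapped at 76 characters per line, as in AddContents().
    $out .= chunk_split(base64_encode($html), 76, $eol);

    // The final delimiter carries a trailing "--".
    $out .= '--' . $boundary . '--' . $eol;
    return $out;
}

echo buildMinimalMht('<html><body><p>Hello Word</p></body></html>');
```

Images are added the same way: one more part per image, with its own Content-Type and a Content-Location equal to the src value used in the HTML.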

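Comment: the drawback of this method is that it does not support batch generation for download, because one page can send only one set of headers (whether called remotely or generated locally, a page that declares download headers can output only one file). Even if you generate in a loop, only one Word file is produced (you can of course modify the approach above to work around this).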
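
One way to work around the single-download limit is to skip the download headers and write each generated document to its own file on the server, as usage method 1 does for a single file. The following is a minimal sketch: the function name saveDocuments and the sample contents are hypothetical, and in real use each value would be the string returned by getWordDocument().

```php
<?php
// Hypothetical helper: write each generated document to its own file instead of
// sending it with download headers. $documents maps a base file name to its content.
function saveDocuments(array $documents, string $directory): array
{
    $written = [];
    foreach ($documents as $name => $body) {
        $path = rtrim($directory, '/\\') . DIRECTORY_SEPARATOR . $name . '.doc';
        if (file_put_contents($path, $body) !== false) {
            $written[] = $path;
        }
    }
    return $written;
}

// Placeholder contents stand in for the output of getWordDocument().
$paths = saveDocuments(
    ['report_1' => 'first document', 'report_2' => 'second document'],
    sys_get_temp_dir()
);
print_r($paths);
```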

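2), write plain HTML into the .doc file

Principle:

Use ob_start to buffer the HTML page first (this avoids the problem of one page sending multiple headers, so batch generation becomes possible), then write the buffered content into the .doc file.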
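
The buffering step on its own can be sketched as follows. This is a minimal illustration, not the article's class: the helper name captureOutput and the sample markup are hypothetical. Everything echoed while the buffer is active is returned as a string instead of being sent to the browser, so it can then be written to a file.

```php
<?php
// Hypothetical helper: run $render while output buffering is active and
// return everything it echoed as a string.
function captureOutput(callable $render): string
{
    ob_start();
    try {
        $render();
        return (string) ob_get_contents();
    } finally {
        // Discard the buffer so nothing reaches the client.
        ob_end_clean();
    }
}

$doc = captureOutput(function () {
    echo '<html><body>';
    echo '<p>Report 1</p>';
    echo '</body></html>';
});

// The captured markup can now be saved under any file name.
file_put_contents(sys_get_temp_dir() . '/buffer-demo.doc', $doc);
```

Because no headers are involved, the same call can be repeated in a loop to produce several files in one request.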

Code:

The code is as follows:
class word
{
    function start()
    {
        ob_start();
        echo '<html xmlns:o="urn:schemas-microsoft-com:office:office"
        xmlns:w="urn:schemas-microsoft-com:office:word"
        xmlns="http://www.w3.org/TR/REC-html40">';
    }
    function save($path)
    {

        echo "</html>";
        $data = ob_get_contents();
        ob_end_clean();

        $this->writefile($path, $data);
    }

    function writefile($fn, $data)
    {
        $fp = fopen($fn, "wb");
        fwrite($fp, $data);
        fclose($fp);
    }
}

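The code is as follows:
// Sample content; the table markup is a minimal reconstruction and any HTML can be used here
$html = '
<table border="1">
<tr>
  <td>PHP10086</td>
  <td>http://www.jb51.net</td>
</tr>
<tr>
  <td>PHP10086</td>
  <td>http://www.jb51.net</td>
</tr>
<tr>
  <td colspan="2">
  PHP10086<br>
  The most reliable PHP technology sharing site
  </td>
</tr>
</table>
';

// Batch generation
for ($i = 1; $i <= 3; $i++) {
    $word = new word();
    $word->start();
    //$html = "aaa".$i;
    $wordname = 'PHP tutorial website--jb51.net' . $i . ".doc";
    echo $html;
    $word->save($wordname);
    if (ob_get_level() > 0) {
        ob_flush(); // flush the buffer before the next iteration
    }
    flush();
}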

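Personal comment: this method works best, for three reasons:

First, the code is concise and easy to understand
Second, it supports batch generation of Word files (this is important)
Third, it supports complete HTML code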
