Home >Backend Development >PHP Tutorial >Detailed explanation of examples of php parsing mht files and converting them into html

Detailed explanation of examples of php parsing mht files and converting them into html

黄舟
黄舟Original
2017-03-14 16:33:382493browse

The following editor will bring you an example of converting php files into html by parsing mht files. The editor thinks it is quite good, so I will share it with you now and give it as a reference for everyone. Let's follow the editor and take a look.

php parses the mht file. Use editor to open it and you can see the base64 encoding. Therefore, mht can be converted into html.


<?php

/**
 * 针对Mht格式的文件进行解析
* 使用例子:
* 
* function mhtmlParseBody($filename) {

	if (file_exists ( $filename )) {
		if (is_dir ( $filename )) return false;
		
		$filename = strtolower ( $filename );
		if (strpos ( $filename, &#39;.mht&#39;, 1 ) == FALSE) return false;
			
		
		$o_mhtml = new mhtml ();
		$o_mhtml->set_file ( $filename );
		$o_mhtml->extract ();
		return $o_mhtml->get_part_to_file(0);

	}
	return null;
}

function mhtmlParseAll($filename) {

	if (file_exists ( $filename )) {
		if (is_dir ( $filename )) return false;

		$filename = strtolower ( $filename );
		if (strpos ( $filename, &#39;.mht&#39;, 1 ) == FALSE) return false;
			

		$o_mhtml = new mhtml ();
		$o_mhtml->set_file ( $filename );
		$o_mhtml->extract ();
		return $o_mhtml->get_all_part_file();

	}
	return null;
}
*/

class mhtparse {

	var $file = &#39;&#39;;
	var $boundary = &#39;&#39;;
	var $filedata = &#39;&#39;;
	var $countparts = 1;
	var $log = &#39;&#39;;
	
	function extract() {
		$this->read_filedata ();
		$this->file_parts ();

		return 1;
	}
	
	function set_file($p) {
		$this->file = $p;
	}
	
	function get_log() {
		return $this->log;
	}
	
	function file_parts() {
		$lines = explode ( "\n", substr ( $this->filedata, 0, 8192 ) );
		foreach ( $lines as $line ) {
			$line = trim ( $line );
			if (strpos ( $line, &#39;=&#39; ) !== FALSE) {
				if (strpos ( $line, &#39;boundary&#39;, 0 ) !== FALSE) {
					$range = $this->getrange ( $line, &#39;"&#39;, &#39;"&#39;, 0 );
					$this->boundary = "--" . $range [&#39;range&#39;];
					$this->filedata = str_replace ( $line, &#39;&#39;, $this->filedata );
					break;
				}
			}
		}
		if ($this->boundary != &#39;&#39;) {
			$this->filedata = explode ( $this->boundary, $this->filedata );
			unset ( $this->filedata [0] );
			$this->filedata = array_values ( $this->filedata );
			$this->countparts = count ( $this->filedata );
		} else {
			$tmp = $this->filedata;
			$this->filedata = array (
					$tmp 
			);
		}
	}
	
	function get_all_part_file() {
		return $this->filedata;
	}
	
	function get_part_to_file($i) {
		$line_data_start = 0;
		$encoding = &#39;&#39;;
		$part_lines = explode ( "\n", ltrim ( $this->filedata [$i] ) );
		foreach ( $part_lines as $line_id => $line ) {
			$line = trim ( $line );
			if ($line == &#39;&#39;) {
				if (trim ( $part_lines [0] ) == &#39;--&#39;)
					return 1;
				$line_data_start = $line_id;
				break;
			}
			if (strpos ( $line, &#39;:&#39; ) !== FALSE) {
				$pos = strpos ( $line, &#39;:&#39; );
				$k = strtolower ( trim ( substr ( $line, 0, $pos ) ) );
				$v = trim ( substr ( $line, $pos + 1, strlen ( $line ) ) );
				if ($k == &#39;content-transfer-encoding&#39;) {
					$encoding = $v;
				}
				if ($k == &#39;content-location&#39;) {
					$location = $v;
				}
				if ($k == &#39;content-type&#39;) {
					$contenttype = $v;
				}
			}
		}
		
		foreach ( $part_lines as $line_id => $line ) {
			if ($line_id <= $line_data_start)
				$part_lines [$line_id] = &#39;&#39;;
		}
		
		$part_lines = implode ( &#39;&#39;, $part_lines );
		if ($encoding == &#39;base64&#39;)
			$part_lines = base64_decode ( $part_lines );
		elseif ($encoding == &#39;quoted-printable&#39;)
			$part_lines = imap_qprint ( $part_lines );
		
		return $part_lines;
	}
	
	function read_filedata() {
		$handle = fopen ( $this->file, &#39;r&#39; );
		$this->filedata = fread ( $handle, filesize ( $this->file ) );
		fclose ( $handle );
	}
	
	function getrange(&$subject, $Beginmark_str = &#39;{&#39;, $Endmark_str = &#39;}&#39;, $Start_pos = 0) {
		/*
		 * $str="sssss { x { xx } {xx{xx } x} x} sssss"; $range=string::getRange($str,&#39;{&#39;,&#39;}&#39;,0); echo $range[&#39;range&#39;]; 
		 //tulem: " x { xx } {xx{xx } x} x" echo $range[&#39;behin&#39;]; 
		 //tulem: 6 echo $range[&#39;end&#39;]; 
		 //tulem: 30 (&#39; &#39;) -- l5pumärgist järgnev out: array(&#39;range&#39;=>$Range,&#39;begin&#39;=>$Begin_firstOccurence_pos,&#39;end&#39;=>$End_sequel_pos) | 
		 false v1.1 2004-2006,Uku-Kaarel J5esaar,ukjoesaar@hot.ee,http://www.php.cn/,+3725110693
		 */
		if (empty ( $Beginmark_str ))
			$Beginmark_str = &#39;{&#39;;
		$Beginmark_str_len = strlen ( $Beginmark_str );
		
		if (empty ( $Endmark_str ))
			$Endmark_str = &#39;}&#39;;
		$Endmark_str_len = strlen ( $Endmark_str );
		
		/* $Start_pos_cache = 0; */
		do {
			/* !algus */
			if (! is_int ( $Begin_firstOccurence_pos ))
				$Start_pos_cache = $Start_pos;
				
				/* ?algus-test */
			$Start_pos_cache = @strpos ( $subject, $Beginmark_str, $Start_pos_cache );
			
			/* this is possible start for range */
			if (is_int ( $Start_pos_cache )) {
				/* skip */
				$Start_pos_cache = ($Start_pos_cache + $Beginmark_str_len);
				/* test possible range start pos */
				if (is_int ( $Begin_firstOccurence_pos )) {
					if ($Start_pos_cache < $range_end_pos)
						$rangeClean = 0;
					elseif ($Start_pos_cache > $range_end_pos)
						$rangeClean = 1;
				}
				/* here it is */
				if (! is_int ( $Begin_firstOccurence_pos ))
					$Begin_firstOccurence_pos = $Start_pos_cache;
			} /* VIGA NR 0 ALGUST EI OLE */
			
			if (! is_int ( $Start_pos_cache )) {
				/* !algus */
	/* VIGA NR 1 ALGUSMARKI EI LEITUD : VIIMANE VOIMALIK ALGUS */
	if (is_int ( $Begin_firstOccurence_pos ) and ($Start_pos_cache < $range_end_pos))
					$rangeClean = 1;
				else
					return false;
			}
			if (is_int ( $Begin_firstOccurence_pos ) and ($rangeClean != 1)) {
				if (! is_int ( $End_pos_cache ))
					$End_sequel_pos = $Begin_firstOccurence_pos;
				
				$End_pos_cache = strpos ( $subject, $Endmark_str, $End_sequel_pos );
				
				/* ok */
				if (is_int ( $End_pos_cache ) and ($rangeClean != 1)) {
					$range_current_lenght = ($End_pos_cache - $Begin_firstOccurence_pos);
					$End_sequel_pos = ($End_pos_cache + $Endmark_str_len);
					$range_end_pos = $End_pos_cache;
				}
				/* VIGA NR 2 LOPPU EI LEITUD */
				if (! is_int ( $End_pos_cache ))
					if ($End_pos_cache == false)
						return false;
			}
		} while ( $rangeClean < 1 );
		
		if (is_int ( $Begin_firstOccurence_pos ) and is_int ( $range_current_lenght ))
			$Range = substr ( $subject, $Begin_firstOccurence_pos, $range_current_lenght );
		else
			return false;
		
		return array (
				&#39;range&#39; => $Range,
				&#39;begin&#39; => $Begin_firstOccurence_pos,
				&#39;end&#39; => $End_sequel_pos 
		);
	} // end getrange()
} // class
?>

The above is the detailed content of Detailed explanation of examples of php parsing mht files and converting them into html. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn