Batch conversion of file encoding in PHP

WBOY (original), 2016-07-13 10:36:18, 806 views

One caveat: the conversion must not be run twice on the same file. For example, if a file is converted from GBK to UTF-8 and then converted "from GBK" to UTF-8 again, the result is garbled characters. I originally tried to detect the encoding before converting, but that seems to have failed. I tested one file specifically, checking whether it was GBK and whether it was UTF-8, and both checks returned true. I don't understand this.
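A likely explanation, offered as a sketch rather than a diagnosis of the file I tested: mb_check_encoding() only reports whether the bytes form a valid sequence in the given encoding, not which encoding the text was written in. Pure ASCII content is valid in both GBK and UTF-8, so both checks pass. The sample strings below are my own illustrations.

```php
<?php
// Pure ASCII bytes are valid in GBK and in UTF-8, so both checks pass.
$ascii = "<?php echo 'hello';";
var_dump(mb_check_encoding($ascii, 'GBK'));   // bool(true)
var_dump(mb_check_encoding($ascii, 'UTF-8')); // bool(true)

// The two characters "中文" encoded as GBK bytes: 0xD6 0xD0 0xCE 0xC4.
$gbk = "\xD6\xD0\xCE\xC4";
var_dump(mb_check_encoding($gbk, 'GBK'));     // bool(true)
var_dump(mb_check_encoding($gbk, 'UTF-8'));   // bool(false): 0xD0 is not a continuation byte
```

The UTF-8 check is the stricter of the two here: GBK text containing non-ASCII characters usually fails it, whereas a passing GBK check says little about where the bytes came from.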

The code is as follows:

/**
* Convert file encoding
* Depends on the filesystem and mbstring extensions
* @example
*
* include_once 'ConvertEncode.php';
* $convert = new ConvertEncode();
* try {
*     $convert->setPath('my', true, true); // Directory
*     //$convert->setPath('my.php'); // Single file
*     $convert->setEncode('GBK', 'UTF-8');
*     $convert->convert();
* } catch (ConvertException $e) {
*     echo $e->getMessage();
* }
*
*/
class ConvertEncode {

 /**
* The encoding to be converted to
* @var string
*/
 private $_to_encoding;

 /**
* Encoding before conversion
* @var string
*/
 private $_from_encoding;

 /**
* Directory or single file to be converted
* @var string
*/
 private $_path;

 /**
* Whether the path is a directory; set to true only when a directory is given.
* @var boolean
*/
 private $_directory;

 /**
* Whether to traverse recursively, only valid for directories
* @var boolean
*/
 private $_recursion;

 /**
* All files to be converted; used only when converting the files in a directory
* @var array
*/
 private $_files = array();

 /**
* Constructor
*/
 public function __construct() {
  if( ! function_exists('mb_convert_encoding') ) {
   throw new ConvertException('The mbstring extension is required.');
  }
 }

 /**
* Set the directory or single file to be converted
* @param string $path directory or file
* @param boolean $is_dir whether the path is a directory
* @param boolean $rec whether to traverse subdirectories recursively
* @return boolean
*/
 public function setPath($path, $is_dir = false, $rec = false) {
  $this->_path = $path;
  $this->_directory = $is_dir;
  $this->_recursion = $rec;
  return true;
 }

 /**
* Set the encoding before conversion and the encoding to be converted to
* @param string $encode_from The encoding before conversion
* @param string $encode_to The encoding to convert to
* @return boolean
*/
 public function setEncode($encode_from, $encode_to) {
  $this->_from_encoding = $encode_from;
  $this->_to_encoding   = $encode_to;
  return true;
 }

 /**
* Convert the encoding, handling the path as a directory or a single file according to the setPath() setting
* @return boolean
*/
 public function convert() {
  if($this->_directory ) {
   return $this->_convertDirectory();
  }
  return $this->_convertFile();
 }

 /**
* Convert file
* @throws ConvertException
* @return boolean
*/
 private function _convertFile() {
  if( ! file_exists($this->_path) ) {
   $message = $this->_path . ' does not exist.';
   throw new ConvertException($message);
  }
  if( ! is_file($this->_path) ) {
   $message = $this->_path . ' is not a file.';
   throw new ConvertException($message);
  }
  if( ! $this->_isWR() ) {
   $message = $this->_path . ' must be readable and writable.';
   throw new ConvertException($message);
  }
  $file_real_path    = realpath($this->_path);
  $file_content_from = file_get_contents( $file_real_path );
  if( mb_check_encoding($file_content_from, $this->_from_encoding) ) {
   $file_content_to   = mb_convert_encoding( $file_content_from, $this->_to_encoding, $this->_from_encoding );
   file_put_contents( $file_real_path, $file_content_to );
  }
  return true;

 }

 /**
* Convert directory
* @throws ConvertException
* @return boolean
*/
 private function _convertDirectory() {
  if( ! file_exists($this->_path) ) {
   $message = $this->_path . ' does not exist.';
   throw new ConvertException($message);
  }
  if( ! is_dir($this->_path) ) {
   $message = $this->_path . ' is not a directory.';
   throw new ConvertException($message);
  }
  if( ! $this->_isWR() ) {
   $message = $this->_path . ' must be readable and writable.';
   throw new ConvertException($message);
  }
  $this->_scanDirFiles();
  if( empty($this->_files) ) {
   $message = $this->_path . ' is an empty directory.';
   throw new ConvertException($message);
  }
  foreach( $this->_files as $value ) {
   $file_content_from = file_get_contents( $value );
   if( mb_check_encoding($file_content_from, $this->_from_encoding) ) {
    $file_content_to   = mb_convert_encoding( $file_content_from, $this->_to_encoding, $this->_from_encoding );
    file_put_contents( $value, $file_content_to );
   }
  }
  return true;
 }

 /**
* Determine whether the file or directory is readable and writable
* @return boolean returns true if it is readable and writable, otherwise returns false
*/
 private function _isWR() {
  if( is_readable($this->_path) && is_writable($this->_path) ) {
   return true;
  }
  return false;
 }

 /**
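* Traverse the directory and collect the absolute path of every file, skipping entries whose names start with a dot
* @param string $dir directory to scan; when empty, the path given to setPath() is used
* @return boolean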
*/
 private function _scanDirFiles($dir = '') {
  $base_path = empty( $dir ) ? realpath($this->_path) . DIRECTORY_SEPARATOR : realpath($dir) . DIRECTORY_SEPARATOR;
  $files_tmp = empty( $dir ) ? scandir($this->_path) : scandir($dir);
  foreach( $files_tmp as $value ) {
   if( $value == '.' || $value == '..' || ( strpos($value, '.') === 0 ) ) {
    continue;
   }
   $value = $base_path . $value;
   if( is_dir($value) ) {
    if( $this->_recursion ) {
     $this->_scanDirFiles($value);
    }
   }
   elseif( is_file($value) ) {
    $this->_files[] = $value;
   }
  }
  return true;
 }
}

/**
* Conversion exception
*
*/
class ConvertException extends Exception {

}
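
One way to guard against the double conversion described at the top is to skip content that is already valid UTF-8 before converting. The function below is a hypothetical helper, not part of the ConvertEncode class above; its name and signature are my own, it assumes PHP 7 or later for the type declarations, and it only covers the case where the target encoding is UTF-8. It relies on the heuristic that non-ASCII GBK text rarely happens to be valid UTF-8, which is not a guarantee.

```php
<?php
/**
 * Hypothetical helper: convert $content to UTF-8 at most once.
 * Content that already validates as UTF-8 (including pure ASCII) is returned unchanged.
 */
function convertToUtf8Once(string $content, string $from_encoding): string {
    // Already valid UTF-8: assume it was converted earlier and leave it alone.
    if (mb_check_encoding($content, 'UTF-8')) {
        return $content;
    }
    // Not valid in the declared source encoding either: do not touch it.
    if (!mb_check_encoding($content, $from_encoding)) {
        return $content;
    }
    return mb_convert_encoding($content, 'UTF-8', $from_encoding);
}

// "中文" as GBK bytes, and the same two characters as UTF-8 bytes.
$gbk  = "\xD6\xD0\xCE\xC4";
$utf8 = "\xE4\xB8\xAD\xE6\x96\x87";

$once  = convertToUtf8Once($gbk, 'GBK');
$twice = convertToUtf8Once($once, 'GBK'); // second run is a no-op

var_dump($once === $utf8);  // bool(true)
var_dump($twice === $utf8); // bool(true)
```

Inside the class, the same test could replace the mb_check_encoding() call on the source encoding in _convertFile() and _convertDirectory() when the target is UTF-8.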
