How Can I Parse DOC and DOCX Files in PHP?
DOC and DOCX are very different formats, and PHP's ability to read them differs accordingly. A DOCX file is a ZIP archive of XML parts, so PHP can open it with the zip extension and extract the text from the XML. The legacy DOC format is binary, and PHP has no built-in capability to parse it. Let's look at the available options for each format.
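To read a DOCX file, you can use the following function. It pulls the main document part (word/document.xml) out of the archive, turns paragraph ends into line breaks, and strips the remaining XML tags: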
<code class="php">
// Note: the procedural zip_* functions are deprecated as of PHP 8.0;
// ZipArchive is the object-oriented replacement.
function read_file_docx($filename)
{
    $content = '';

    if (!$filename || !file_exists($filename)) return false;

    // zip_open() returns an integer error code on failure.
    $zip = zip_open($filename);
    if (!$zip || is_numeric($zip)) return false;

    while ($zip_entry = zip_read($zip)) {
        // Only the main document part holds the body text.
        if (zip_entry_name($zip_entry) != "word/document.xml") continue;
        if (zip_entry_open($zip, $zip_entry) == FALSE) continue;

        $content .= zip_entry_read($zip_entry, zip_entry_filesize($zip_entry));
        zip_entry_close($zip_entry);
    }
    zip_close($zip);

    // Separate adjacent table cells with a space and paragraphs with a line break.
    $content = str_replace('</w:r></w:p></w:tc><w:tc>', " ", $content);
    $content = str_replace('</w:r></w:p>', "\r\n", $content);

    // Drop the remaining XML markup, leaving only the text.
    $stripped_content = strip_tags($content);

    return $stripped_content;
}

$filename = "filepath"; // or /var/www/html/file.docx

$content = read_file_docx($filename);
if ($content !== false) {
    echo nl2br($content);
} else {
    echo 'Couldn\'t read the file. Please check that file.';
}
</code>
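The procedural zip_* functions used above are deprecated as of PHP 8.0, so on current PHP versions the same idea may be easier to express with ZipArchive. The following is a minimal sketch, not the original author's method: it assumes the zip and dom extensions are enabled, and read_docx_paragraphs() is a hypothetical helper name chosen for illustration. Instead of string replacement, it parses word/document.xml and collects the w:t text runs of each w:p paragraph.

```php
<?php
// Sketch: extract paragraph text from a DOCX file with ZipArchive + DOMDocument.
// read_docx_paragraphs() is a hypothetical name, not part of any library.
function read_docx_paragraphs(string $filename)
{
    if ($filename === '' || !is_file($filename)) {
        return false;
    }

    $zip = new ZipArchive();
    // open() returns true on success or an integer error code on failure.
    if ($zip->open($filename) !== true) {
        return false;
    }
    // getFromName() returns the entry contents, or false if the entry is missing.
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false || $xml === '') {
        return false;
    }

    $dom = new DOMDocument();
    if (!@$dom->loadXML($xml)) {
        return false;
    }

    // Namespace bound to the "w:" prefix in WordprocessingML documents.
    $wordNs = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

    $paragraphs = [];
    foreach ($dom->getElementsByTagNameNS($wordNs, 'p') as $paragraph) {
        $text = '';
        foreach ($paragraph->getElementsByTagNameNS($wordNs, 't') as $run) {
            $text .= $run->textContent;
        }
        $paragraphs[] = $text;
    }

    // One line per paragraph.
    return implode("\n", $paragraphs);
}
```

Because textContent returns decoded text, escape it before printing into an HTML page, for example with echo nl2br(htmlspecialchars($text));. This sketch targets simple documents; content such as text boxes, headers, footers, and footnotes lives in other parts or nested structures and is not handled here.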
Unfortunately, PHP does not provide a native solution for parsing DOC files. External libraries or command-line tools are required to handle this file format.
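As one illustration of the command-line route, the sketch below shells out to antiword, a separate tool that prints the text of a DOC file to standard output. This is an assumption-laden example rather than a recommended implementation: it assumes antiword is installed and on the PATH, that exec() is not disabled in your PHP configuration, and that the server runs a Unix-like shell. read_file_doc_antiword() is a hypothetical helper name.

```php
<?php
// Sketch: extract text from a legacy .doc file by calling the external
// antiword tool (assumed to be installed; it is not part of PHP).
// read_file_doc_antiword() is a hypothetical name used for illustration.
function read_file_doc_antiword(string $filename)
{
    if ($filename === '' || !is_file($filename)) {
        return false;
    }

    // escapeshellarg() guards against shell injection through the file name.
    // "2>/dev/null" discards error output and assumes a Unix-like shell.
    $command = 'antiword ' . escapeshellarg($filename) . ' 2>/dev/null';

    $output = [];
    $exitCode = 1;
    exec($command, $output, $exitCode);

    // A non-zero exit code means the tool is missing or could not read the file.
    if ($exitCode !== 0) {
        return false;
    }

    return implode("\n", $output);
}
```

The function returns false both when the file is missing and when the tool fails, so the calling code can treat it the same way as read_file_docx() above.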