PHP implements OCR text recognition
According to Baidu, OCR (Optical Character Recognition) is the process by which an electronic device (such as a scanner or digital camera) examines characters printed on paper, determines their shapes by detecting patterns of dark and light, and then uses character recognition methods to translate those shapes into computer text. In other words, for printed characters, the text in a paper document is optically converted into a black-and-white dot-matrix image file, and recognition software then converts the text in that image into a text format that word-processing software can further edit and process.
As an engineer, you may occasionally need to extract and display the text contained in an image, which calls for OCR. Since I work in PHP, I looked at PHP options first. I found an OCR extension for PHP and tested it, but it turned out to be unusable (address: http://sourceforge.net/projects/phpocr.berlios). I also looked at many demos that people have posted online. The basic principle is to decompose the image into a matrix of 0s and 1s and then map that matrix to the corresponding characters based on its features. I tested several of them and none worked in practice. I then saw others say that PHP is rarely used for OCR and is not well suited to it: these algorithms need high efficiency and the language is too slow, so you are better off trying an OCR algorithm written in C, MATLAB, and so on. Many MATLAB users work on algorithm-heavy problems such as OCR.
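To make the "matrix of 0s and 1s" idea concrete, here is a minimal sketch of how such demos roughly work, under my own assumptions: the input is already a small grid of grayscale values (0 to 255), each glyph is 3x3 pixels, and the closest template wins. The names `binarize`, `signature` and `matchGlyph`, the threshold of 128, and the two templates are all hypothetical and only for illustration. Real OCR is far more involved than this.

```php
<?php
// Hypothetical helper: threshold a grayscale grid (0-255) into a 0/1 matrix.
// Dark pixels (below the threshold) become 1, light pixels become 0.
function binarize(array $gray, int $threshold = 128): array
{
    $bits = array();
    foreach ($gray as $row) {
        $bitRow = array();
        foreach ($row as $value) {
            $bitRow[] = ($value < $threshold) ? 1 : 0;
        }
        $bits[] = $bitRow;
    }
    return $bits;
}

// Hypothetical helper: flatten a 0/1 matrix into a string such as "010010010".
function signature(array $bits): string
{
    $out = '';
    foreach ($bits as $row) {
        $out .= implode('', $row);
    }
    return $out;
}

// Hypothetical helper: return the template character whose signature differs
// from $sig in the fewest positions, or null if no template has the same length.
function matchGlyph(string $sig, array $templates): ?string
{
    $best = null;
    $bestDist = PHP_INT_MAX;
    foreach ($templates as $char => $tpl) {
        if (strlen($tpl) !== strlen($sig)) {
            continue;
        }
        $dist = 0;
        for ($i = 0; $i < strlen($sig); $i++) {
            if ($sig[$i] !== $tpl[$i]) {
                $dist++;
            }
        }
        if ($dist < $bestDist) {
            $bestDist = $dist;
            $best = (string) $char;
        }
    }
    return $best;
}

// Made-up 3x3 templates for two letters.
$templates = array('I' => '010010010', 'L' => '100100111');

// A made-up 3x3 grayscale patch: a vertical stroke with one faded pixel.
$gray = array(
    array(250,  30, 240),
    array(245,  20, 250),
    array(240, 200, 235),
);

$sig = signature(binarize($gray));   // "010010000"
echo matchGlyph($sig, $templates);   // prints "I"
```

Even this toy version compares every pixel against every template, which hints at why people say a pure PHP implementation becomes slow on real images.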
But my skills are limited and I can't write C. Then I stumbled on an OCR API provided by Baidu: http://apistore.baidu.com/apiworks/servicedetail/146.html.
Written for fun:
<?php
header("Content-type: text/html; charset=utf-8");

// Send the image to the Baidu OCR API and return the decoded JSON response,
// or null if the HTTP request itself failed.
function curl($img){
    $ch = curl_init();
    $url = 'http://apis.baidu.com/apistore/idlocr/ocr'; // Baidu OCR API
    $header = array(
        'Content-Type:application/x-www-form-urlencoded',
        'apikey:69c2ace1ef297ce88869f0751cb1b618',
    );
    $data_temp = file_get_contents($img);
    $data_temp = urlencode(base64_encode($data_temp));
    // Assemble the required parameters
    $data = "fromdevice=pc&clientip=127.0.0.1&detecttype=LocateRecognize&languagetype=CHN_ENG&imagetype=1&image=".$data_temp;
    curl_setopt($ch, CURLOPT_HTTPHEADER, $header); // Add the apikey to the header
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $data); // Add the parameters
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    // Execute the HTTP request
    $res = curl_exec($ch);
    if($res === FALSE){
        echo "cURL Error: ".curl_error($ch);
        curl_close($ch);
        return null; // Nothing to decode
    }
    curl_close($ch);
    $temp_var = json_decode($res, true);
    return $temp_var;
}

$wordArr = curl('4.jpg');
// Only treat the call as successful if a valid response with errNum 0 came back
if(is_array($wordArr) && isset($wordArr['errNum']) && $wordArr['errNum'] == 0){
    var_dump($wordArr);
}else{
    echo "Recognition error: ".($wordArr["errMsg"] ?? "no valid response");
}
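As a side note, the hand-built query string in the script could probably also be assembled with PHP's built-in `http_build_query`, which URL-encodes each value for you. This is only a minimal sketch, assuming the same parameter names and values as the request above. `buildOcrPayload` is a hypothetical helper name of my own, not part of Baidu's API.

```php
<?php
// Hypothetical helper: build the form-encoded POST body for the OCR request.
// Parameter names and values are copied from the request above.
function buildOcrPayload(string $imageBytes): string
{
    return http_build_query(array(
        'fromdevice'   => 'pc',
        'clientip'     => '127.0.0.1',
        'detecttype'   => 'LocateRecognize',
        'languagetype' => 'CHN_ENG',
        'imagetype'    => 1,
        // http_build_query URL-encodes the base64 text, so no urlencode() call is needed
        'image'        => base64_encode($imageBytes),
    ));
}
```

With this in place, the POST body could be set with `curl_setopt($ch, CURLOPT_POSTFIELDS, buildOcrPayload(file_get_contents($img)));`, which avoids forgetting to encode the `+`, `/` and `=` characters that base64 output can contain.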