
Scraping the data from this web page with PHP (data only, no HTML), then JSON-encoding it and writing it to a file: beginner asking for help

WBOY
2016-06-13 12:02:48

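I want to use PHP to scrape the data from this web page. I only need the data, not the HTML markup, and then I want to JSON-encode it and write it to a file. Beginner asking for help.
http://www.okooo.com/Upload/sohu/table_23.html
The hard part for me is the regular expression; I don't know how to write regex.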
------Solution--------------------

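$url = 'http://www.okooo.com/Upload/sohu/table_23.html';
$s = file_get_contents($url);
// Capture every <table>...</table> block (non-greedy, case-insensitive, dot matches newlines).
preg_match_all('#<table.+</table>#isU', $s, $m);
$res = array(); // initialize so the result is defined even if no table matches
foreach(array_map('strip_tags', $m[0]) as $r) {
  // Split the tag-free text on whitespace into individual cell values.
  $a = preg_split('/\s+/', $r, -1, PREG_SPLIT_NO_EMPTY);
  // Drop the last token, then group the remaining values into rows of three.
  $res[] = array_chunk(array_slice($a, 0, -1), 3);
}
print_r($res);
echo json_encode($res);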

Array
(
    [0] => Array
        (
            [0] => Array
                (
                    [0] => 排名
                    [1] => 球队
                    [2] => 积分
                )

            [1] => Array
                (
                    [0] => 1
                    [1] => 尤文图斯
                    [2] => 102
                )

            [2] => Array
                (
                    [0] => 2
                    [1] => 罗马
                    [2] => 85
                )

            [3] => Array
                (
                    [0] => 3
                    [1] => 那不勒斯
                    [2] => 78
                )

            [4] => Array
                (
                    [0] => 4
                    [1] => 佛罗伦萨
                    [2] => 65
                )

            [5] => Array
                (
                    [0] => 5
                    [1] => 国际米兰
                    [2] => 60
                )

            [6] => Array
                (
                    [0] => 6
                    [1] => 帕尔马
                    [2] => 58
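The question also asks for the JSON to be written to a file, which the code above only echoes. A minimal sketch of that last step could look like the following. The sample `$res` array is a small stand-in built from the first rows of the output above, and the path in `$file` is an assumption; neither comes from the original answer. `JSON_UNESCAPED_UNICODE` (PHP 5.4+) keeps the Chinese text readable in the file instead of `\uXXXX` escapes.

```php
<?php
// Stand-in for the $res array produced by the scraping code
// (first rows of the output shown above; not fetched from the network).
$res = array(
    array(
        array('排名', '球队', '积分'),
        array('1', '尤文图斯', '102'),
        array('2', '罗马', '85'),
    ),
);

// Assumed output path; any writable location works.
$file = sys_get_temp_dir() . '/table_23.json';

// Encode without escaping multibyte characters, pretty-printed for readability.
$json = json_encode($res, JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT);
if ($json === false) {
    // json_encode returns false on failure, e.g. for input that is not valid UTF-8.
    die('json_encode failed: ' . json_last_error_msg());
}

// file_put_contents returns the number of bytes written, or false on failure.
if (file_put_contents($file, $json) === false) {
    die('could not write ' . $file);
}

// Read the file back and decode it to confirm the round trip.
$back = json_decode(file_get_contents($file), true);
```

If `json_encode` fails on the real page, one possible cause is that the fetched HTML is not UTF-8; in that case the text would need converting (for example with `mb_convert_encoding`) before encoding.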