Scraping this web page with PHP: I only want the data, not the HTML, then JSON-encode it and write it to a file. Beginner asking for help
http://www.okooo.com/Upload/sohu/table_23.html
Beginner asking for help. The hard part is the regular expression; I don't know how to write one.
------Solution--------------------
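$url = 'http://www.okooo.com/Upload/sohu/table_23.html';
// Download the page as a string.
$s = file_get_contents($url);
// Capture every <table>...</table> block (U = ungreedy, s = dot matches newlines).
preg_match_all('#<table.+</table>#isU', $s, $m);
$res = array();
// Strip the tags from each table, leaving whitespace-separated cell text.
foreach (array_map('strip_tags', $m[0]) as $r) {
    // Split the remaining text into tokens, skipping empty ones.
    $a = preg_split('/\s+/', $r, -1, PREG_SPLIT_NO_EMPTY);
    // Drop the trailing token, then group into rows of 3 (rank, team, points).
    $res[] = array_chunk(array_slice($a, 0, -1), 3);
}
print_r($res);
echo json_encode($res);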
Array
(
    [0] => Array
        (
            [0] => Array
                (
                    [0] => 排名
                    [1] => 球队
                    [2] => 积分
                )

            [1] => Array
                (
                    [0] => 1
                    [1] => 尤文图斯
                    [2] => 102
                )

            [2] => Array
                (
                    [0] => 2
                    [1] => 罗马
                    [2] => 85
                )

            [3] => Array
                (
                    [0] => 3
                    [1] => 那不勒斯
                    [2] => 78
                )

            [4] => Array
                (
                    [0] => 4
                    [1] => 佛罗伦萨
                    [2] => 65
                )

            [5] => Array
                (
                    [0] => 5
                    [1] => 国际米兰
                    [2] => 60
                )

            [6] => Array
                (
                    [0] => 6
                    [1] => 帕尔马
                    [2] => 58
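The answer above prints the array and echoes the JSON, but it does not write anything to a file, which is what the question asked for. Below is a minimal sketch of one way to do that. It is not the original poster's method: it matches `<tr>` rows and `<td>`/`<th>` cells individually instead of splitting on whitespace, so a cell containing a space stays in one piece. The function names `table_rows` and `save_json` and the output name `table_23.json` are hypothetical, and the sketch assumes the page text is UTF-8 (if it is not, `json_encode` may fail and the text would need converting first, for example with `iconv`).

```php
<?php
// Hypothetical helper: turn each <tr> in an HTML string into an array of cell texts.
function table_rows(string $html): array
{
    $rows = [];
    // Match each table row, non-greedy, across newlines.
    preg_match_all('#<tr\b[^>]*>(.*?)</tr>#is', $html, $trs);
    foreach ($trs[1] as $tr) {
        // Match each <td> or <th> cell inside the row.
        preg_match_all('#<t[dh]\b[^>]*>(.*?)</t[dh]>#is', $tr, $cells);
        $row = [];
        foreach ($cells[1] as $cell) {
            // Remove nested tags, decode entities, trim surrounding whitespace.
            $row[] = trim(html_entity_decode(strip_tags($cell), ENT_QUOTES, 'UTF-8'));
        }
        if ($row !== []) {
            $rows[] = $row;
        }
    }
    return $rows;
}

// Hypothetical helper: encode $data as JSON and write it to $path.
function save_json(array $data, string $path): bool
{
    // JSON_UNESCAPED_UNICODE keeps Chinese text readable instead of \uXXXX escapes.
    $json = json_encode($data, JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT);
    return $json !== false && file_put_contents($path, $json) !== false;
}

// Example usage (makes a network request, so it is left commented out;
// the output file name is an assumption):
// $html = file_get_contents('http://www.okooo.com/Upload/sohu/table_23.html');
// save_json(table_rows($html), 'table_23.json');
```

Unlike the answer's code, this returns one flat list of rows for the whole page rather than one list per table, so adjust it if the page holds several tables that need to stay separate.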