
Help needed: matching URLs with a regular expression

WBOY
2016-06-23 14:08:45 · 1085 views

I want to extract the URL of every search result on http://www.so.com/s?q=csdn&pn=7&j=0, but the regular expression below returns an empty result. What is wrong with it?

$c1 = "/<h3 class=\"res-title (?:mark\-nowrap)?\">\s*<a target=\"_blank\" data-m=\"(?:.*)\" data-pos\"(?:\d+)\" data-e=\"(?:\d+)\" data-st=\"(?:\d+)\" href=\"(.*)\">(?:.*)<\/a>\s*<\/h3>/Uis";
$content = get_content('http://www.so.com/s?q=csdn&pn=7&j=0');
preg_match_all($c1, $content, $arr1);
print_r($arr1);
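One defect is visible in the pattern itself: every attribute is written as `name=\"...\"` except `data-pos\"`, which has no `=`. The sketch below checks the effect of that one character on a hypothetical HTML fragment. The sample markup and its attribute order are assumptions for illustration, since the real so.com page may differ, so this may not be the only reason the match comes back empty.

```php
<?php
// Hypothetical fragment shaped like what the pattern expects (not real so.com output).
$sample = '<h3 class="res-title "><a target="_blank" data-m="abc" data-pos="1" '
        . 'data-e="0" data-st="0" href="http://www.csdn.net/">CSDN</a></h3>';

// The pattern from the question, written in a single-quoted string.
// Note data-pos"(?:\d+)" with no "=" after the attribute name.
$broken = '/<h3 class="res-title (?:mark\-nowrap)?">\s*<a target="_blank" '
        . 'data-m="(?:.*)" data-pos"(?:\d+)" data-e="(?:\d+)" data-st="(?:\d+)" '
        . 'href="(.*)">(?:.*)<\/a>\s*<\/h3>/Uis';

// Same pattern with the missing "=" restored.
$fixed = str_replace('data-pos"', 'data-pos="', $broken);

$brokenCount = preg_match_all($broken, $sample, $unused);
$fixedCount  = preg_match_all($fixed, $sample, $m);

print_r([$brokenCount, $fixedCount, $m[1]]);
```

Even with the `=` restored, the pattern still depends on the exact attribute order and spacing of the live page, so it is worth comparing it against the raw HTML that `get_content` actually returns.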


Replies (Solutions)

You should reconsider how you ask your question. Asked this way, few people will be willing to answer.

Not a single one can be found in the full text.

I am used to DOM parsing, so my regex skills are rusty. The href is very long, so I used preg_match_all twice and just barely got it to match.


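// Fetch the search results page with cURL.
$urls = 'http://www.so.com/s?q=csdn&pn=7&j=0';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $urls);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$content = curl_exec($ch);
curl_close($ch);

// Pass 1: capture each <li class="res-list"> block that contains a link.
$c1 = "/<li class=\"res-list\">(.*?)<a href=\"(.*?)\">(.*?)<\/a>(.*?)<\/li>/is";
preg_match_all($c1, $content, $arr1);

// Pass 2: pull the first quoted href value that is followed by whitespace out of each block.
// \1 is a backreference, so the closing quote must be the same character as the opening one.
$c2 = '/href=(\'|")(.*?)\1\s+/is';
foreach ($arr1[0] as $part) {
    if (preg_match_all($c2, $part, $arr2)) {
        echo $arr2[2][0] . '<br />';
    }
}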
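Since the reply mentions DOM parsing as the more familiar route, here is a minimal sketch of that alternative using PHP's bundled `DOMDocument` and `DOMXPath`. The function name `extractResultUrls` and the sample markup are hypothetical, and the sketch assumes each result link sits inside an `h3` whose class list contains `res-title`, as the pattern in the question suggests.

```php
<?php
// Hypothetical helper: collect the href of every link inside an <h3 class="res-title ...">.
function extractResultUrls(string $html): array
{
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Match "res-title" as a whole class token, whatever other classes are present.
    $query = '//h3[contains(concat(" ", normalize-space(@class), " "), " res-title ")]//a[@href]';

    $urls = [];
    foreach ($xpath->query($query) as $link) {
        $urls[] = $link->getAttribute('href');
    }
    return $urls;
}

// Hypothetical sample markup, not real so.com output.
$sample = '<ul>'
    . '<li class="res-list"><h3 class="res-title "><a target="_blank" href="http://www.csdn.net/">CSDN</a></h3></li>'
    . '<li class="res-list"><h3 class="res-title mark-nowrap"><a data-pos="2" href="http://blog.csdn.net/">Blog</a></h3></li>'
    . '<li><a href="http://example.com/ad">not a result</a></li>'
    . '</ul>';

$result = extractResultUrls($sample);
print_r($result);
```

Unlike the regular expressions above, this does not depend on attribute order or on which quote character the page uses, so it tends to survive small changes in the markup.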
