
A PHP scraper question

WBOY, 2016-06-23 14:14:00

   <td id="prodImageCell" height="280" width="280"><a href="http://www.amazon.co.jp/gp/product/images/B007HDJPOU/ref=dp_image_0/375-3424220-7267256?ie=UTF8&n=3828871&s=kitchen" target="AmazonHelp" onclick="return amz_js_PopWin(this.href,'AmazonHelp','width=700,height=600,resizable=1,scrollbars=1,toolbar=1,status=1');"  ><img onload="if (typeof uet == 'function') { if(typeof setCSMReq=='function'){setCSMReq('af');setCSMReq('cf');}else{uet('af');uet('cf');amznJQ.completedStage('amznJQ.AboveTheFold');} } " src="http://ec2.images-amazon.com/images/I/418Qdk6bctL._SL500_AA280_.jpg" id="prodImage"  width="280"    style="max-width:90%" border="0" alt="FJK ?・ル蜒吝・??maxell??????ョ?募с蜒?・???? LR1130??10?・・??" onmouseover="" /></a>  <td id="prodImageCell" height="280" width="280"><a href="http://www.amazon.co.jp/gp/product/images/B005318B0C/ref=dp_image_0/376-2257022-2490017?ie=UTF8&n=13299531&s=toys" target="AmazonHelp" onclick="return amz_js_PopWin(this.href,'AmazonHelp','width=700,height=600,resizable=1,scrollbars=1,toolbar=1,status=1');"  ><img onload="if (typeof uet == 'function') { if(typeof setCSMReq=='function'){setCSMReq('af');setCSMReq('cf');}else{uet('af');uet('cf');amznJQ.completedStage('amznJQ.AboveTheFold');} } " src="http://ec2.images-amazon.com/images/I/51bsLAswKVL._SL500_AA280_.jpg" id="prodImage"  width="280"    style="max-width:90%" border="0" alt="PSP2000?PSP3000? ????????1000????? ???????-543547" onmouseover="" /></a>



I scraped these two snippets from two product pages. All I want from each is the product image URL:
http://ec2.images-amazon.com/images/I/418Qdk6bctL._SL500_AA280_.jpg
http://ec2.images-amazon.com/images/I/51bsLAswKVL._SL500_AA280_.jpg

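Each page contains a lot of markup, but I only want this one image. The prefix
http://ec2.images-amazon.com/images/I/
and the suffix
._SL500_AA280_.jpg
are the same on every page. How do I filter out just the image URL I want?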


Replies (solutions)

Adding id="prodImage" to the match should be enough to tell it apart from the other images, right?
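The id-based idea could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name findProdImageSrc is made up, and the pattern assumes that attribute values contain no ">" character and that src is written with double quotes, as in the two snippets above.

```php
<?php
// Hypothetical helper (not from the thread): return the src of the first
// <img> tag that carries id="prodImage", or null if there is no such tag.
function findProdImageSrc(string $html): ?string
{
    // The lookahead checks that id="prodImage" occurs somewhere inside the
    // same tag, so the order of the src and id attributes does not matter.
    // Assumption: no '>' inside attribute values, src uses double quotes.
    $pattern = '#<img\b(?=[^>]*\sid="prodImage")[^>]*\ssrc="([^"]+)"#i';
    if (preg_match($pattern, $html, $m) === 1) {
        return $m[1];
    }
    return null;
}

// Shortened sample markup modelled on the snippets in the question.
$html = '<td><img src="http://example.com/other.jpg" />'
      . '<img onload="x" src="http://ec2.images-amazon.com/images/I/51bsLAswKVL._SL500_AA280_.jpg"'
      . ' id="prodImage" width="280" /></td>';

var_dump(findProdImageSrc($html));
```

For markup that is less regular than these samples, an HTML parser would likely be a safer choice than a regular expression.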

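No need for that.
Matching on http://ec2.images-amazon.com/images/I/
and ._SL500_AA280_.jpg is enough. Only the ID in the middle differs, and every address of this form on a given page points to the same image.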

http://ec2.images-amazon.com/images/I/51bsLAswKVL._SL500_AA280_.jpg
This URL is unique within a page; only the middle part, 51bsLAswKVL here, changes from page to page.

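// $s holds the scraped HTML; capture group 1 is the image URL itself.
// The i flag lets [a-z\d] match mixed-case IDs such as 418Qdk6bctL.
preg_match_all('#src="(http://ec2\.images-amazon\.com/images/I/[a-z\d]+\._SL500_AA280_\.jpg)"#is',$s,$m);
print_r($m[1]);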
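
Since the same URL can appear several times on one page, the result may contain duplicates. Below is a minimal sketch of the same approach wrapped in a function that removes them. The name extractProductImageUrls is hypothetical, and the looser ID class [^"/]+ is an assumption made in case an ID contains characters other than letters and digits; the thread itself only shows alphanumeric IDs.

```php
<?php
// Hypothetical helper (not part of the original reply): collect every
// distinct product image URL of the fixed prefix/suffix form.
function extractProductImageUrls(string $html): array
{
    // Prefix and suffix are taken from the question; [^"/]+ is an assumed,
    // more tolerant stand-in for the ID between them.
    $pattern = '#src="(http://ec2\.images-amazon\.com/images/I/[^"/]+\._SL500_AA280_\.jpg)"#i';
    preg_match_all($pattern, $html, $matches);

    // Drop repeats of the same URL and reindex the array from 0.
    return array_values(array_unique($matches[1]));
}

// Shortened sample: the same image referenced twice on one page.
$html = '<img src="http://ec2.images-amazon.com/images/I/418Qdk6bctL._SL500_AA280_.jpg" id="prodImage" />'
      . '<img src="http://ec2.images-amazon.com/images/I/418Qdk6bctL._SL500_AA280_.jpg" />';

print_r(extractProductImageUrls($html));
```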
