Regex for scraping: can an expert help?
<h4 class="cat-hd fst-cat-hd ">
  <i class="cat-icon fst-cat-icon active-trigger"></i>
  <a class="cat-name fst-cat-name" href="http://bosidengny.tmall.com/category-907362758.htm?search=y&catName=%D0%C2%C6%B7%D7%A8%C7%F8" >新品专区</a>
</h4>
</li>
<li class="cat fst-cat">
  <h4 class="cat-hd fst-cat-hd has-children">
    <i class="cat-icon fst-cat-icon active-trigger"></i>
    <a class="cat-name fst-cat-name" href="http://bosidengny.tmall.com/category-907362759.htm?search=y&catName=%B1%A3%C5%AF%C9%CF%D7%B0" >保暖上装</a>
  </h4>
  <div class="snd-pop">
    <div class="snd-pop-inner">
      <ul class="fst-cat-bd">
        <li class="cat snd-cat">
          <h4 class="cat-hd snd-cat-hd">
            <i class="cat-icon snd-cat-icon"></i>
            <a class="cat-name snd-cat-name" href="http://bosidengny.tmall.com/category-907362760.htm?search=y&parentCatId=907362759&parentCatName=%B1%A3%C5%AF%C9%CF%D7%B0&catName=%BC%D9%C1%BD%BC%FE%A3%A8%B3%C4%C9%C0%C1%EC%A3%A9" > 假两件(衬衫领) </a>
          </h4>
        </li>
        <li class="cat snd-cat">
          <h4 class="cat-hd snd-cat-hd">
            <i class="cat-icon snd-cat-icon"></i>
            <a class="cat-name snd-cat-name" href="http://bosidengny.tmall.com/category-907362761.htm?search=y&parentCatId=907362759&parentCatName=%B1%A3%C5%AF%C9%CF%D7%B0&catName=V%C1%EC%C9%CF%D7%B0" > V领上装 </a>
          </h4>
        </li>
Top-level category: /
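Why go to the trouble of writing a regex for this? If the site changes its markup even slightly, all that effort is wasted.
There are plenty of simple, practical tools available online, so why not use one?
For example, this one: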
$s = <<<TXT
<h4 class="cat-hd fst-cat-hd ">
  <i class="cat-icon fst-cat-icon active-trigger"></i>
  <a class="cat-name fst-cat-name" href="http://bosidengny.tmall.com/category-907362758.htm?search=y&catName=%D0%C2%C6%B7%D7%A8%C7%F8" >新品专区</a>
</h4>
</li>
<li class="cat fst-cat">
  <h4 class="cat-hd fst-cat-hd has-children">
    <i class="cat-icon fst-cat-icon active-trigger"></i>
    <a class="cat-name fst-cat-name" href="http://bosidengny.tmall.com/category-907362759.htm?search=y&catName=%B1%A3%C5%AF%C9%CF%D7%B0" >保暖上装</a>
  </h4>
  <div class="snd-pop">
    <div class="snd-pop-inner">
      <ul class="fst-cat-bd">
        <li class="cat snd-cat">
          <h4 class="cat-hd snd-cat-hd">
            <i class="cat-icon snd-cat-icon"></i>
            <a class="cat-name snd-cat-name" href="http://bosidengny.tmall.com/category-907362760.htm?search=y&parentCatId=907362759&parentCatName=%B1%A3%C5%AF%C9%CF%D7%B0&catName=%BC%D9%C1%BD%BC%FE%A3%A8%B3%C4%C9%C0%C1%EC%A3%A9" > 假两件(衬衫领) </a>
          </h4>
        </li>
        <li class="cat snd-cat">
          <h4 class="cat-hd snd-cat-hd">
            <i class="cat-icon snd-cat-icon"></i>
            <a class="cat-name snd-cat-name" href="http://bosidengny.tmall.com/category-907362761.htm?search=y&parentCatId=907362759&parentCatName=%B1%A3%C5%AF%C9%CF%D7%B0&catName=V%C1%EC%C9%CF%D7%B0" > V领上装 </a>
          </h4>
        </li>
TXT;

// simple_html_dom.php is the third-party PHP Simple HTML DOM Parser;
// the file must be on the include path.
include 'simple_html_dom.php';

$p = new simple_html_dom;
$p->load($s);

foreach ($p->find('a') as $v) {
    echo $v->class, PHP_EOL;             // the class, which tells the category levels apart
    echo $v->href, PHP_EOL;              // the URL
    echo trim($v->innertext()), PHP_EOL; // the link text
}
cat-name fst-cat-name
http://bosidengny.tmall.com/category-907362758.htm?search=y&catName=%D0%C2%C6%B7%D7%A8%C7%F8
新品专区
cat-name fst-cat-name
http://bosidengny.tmall.com/category-907362759.htm?search=y&catName=%B1%A3%C5%AF%C9%CF%D7%B0
保暖上装
cat-name snd-cat-name
http://bosidengny.tmall.com/category-907362760.htm?search=y&parentCatId=907362759&parentCatName=%B1%A3%C5%AF%C9%CF%D7%B0&catName=%BC%D9%C1%BD%BC%FE%A3%A8%B3%C4%C9%C0%C1%EC%A3%A9
假两件(衬衫领)
cat-name snd-cat-name
http://bosidengny.tmall.com/category-907362761.htm?search=y&parentCatId=907362759&parentCatName=%B1%A3%C5%AF%C9%CF%D7%B0&catName=V%C1%EC%C9%CF%D7%B0
V领上装
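Since the class on each link is what separates the levels, the top-level categories can be picked out by filtering on `fst-cat-name`, and the second-level ones on `snd-cat-name`. Below is a minimal sketch of that idea. It assumes PHP's bundled DOM extension (`DOMDocument` and `DOMXPath`) in place of simple_html_dom, and `findCategoryLinks` is a hypothetical helper name, not something from the thread. The sample is a shortened copy of the HTML in the question.

```php
<?php
// Hypothetical helper (not from the original thread): returns the href and
// text of every <a> whose class attribute contains the given class token.
function findCategoryLinks(string $html, string $classToken): array
{
    $doc = new DOMDocument();
    // The input is an HTML fragment, so collect libxml warnings silently.
    $previous = libxml_use_internal_errors(true);
    // The meta tag is an assumption of this sketch: it tells libxml that
    // the fragment is UTF-8 so the Chinese link text is decoded correctly.
    $doc->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $html
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match a whole class token rather than a substring of another class.
    $query = sprintf(
        '//a[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $classToken
    );

    $links = [];
    foreach ($xpath->query($query) as $a) {
        $links[] = [
            'href' => $a->getAttribute('href'),
            'text' => trim($a->textContent),
        ];
    }
    return $links;
}

// Shortened sample taken from the question's HTML.
$sample = <<<'HTML'
<ul>
  <li class="cat fst-cat">
    <h4 class="cat-hd fst-cat-hd ">
      <a class="cat-name fst-cat-name" href="http://bosidengny.tmall.com/category-907362758.htm?search=y&catName=%D0%C2%C6%B7%D7%A8%C7%F8" >新品专区</a>
    </h4>
  </li>
  <li class="cat snd-cat">
    <h4 class="cat-hd snd-cat-hd">
      <a class="cat-name snd-cat-name" href="http://bosidengny.tmall.com/category-907362761.htm?search=y&parentCatId=907362759&parentCatName=%B1%A3%C5%AF%C9%CF%D7%B0&catName=V%C1%EC%C9%CF%D7%B0" > V领上装 </a>
    </h4>
  </li>
</ul>
HTML;

// Print only the top-level category links.
foreach (findCategoryLinks($sample, 'fst-cat-name') as $link) {
    echo $link['text'], ' ', $link['href'], PHP_EOL;
}
```

Passing `'snd-cat-name'` instead should return the second-level links. Matching on a whole class token keeps the filter from also hitting unrelated classes that merely contain the same substring.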