Colly - How to get the value of a child property?
Colly is a web scraping framework written in Go. It provides callbacks for handling requests and responses, plus helpers for selecting HTML elements and extracting data from them. A common task is reading a value from a child of a matched element, such as the href attribute of a link or, as in the question below, the text of a heading.
This is a sample page I have been working on: https://www.lazada.vn/-i1701980654-s7563711492.html
This is the element I want to get (the product title):
...
<div>
  <img src="https://lzd-img-global.slatic.net/g/tps/imgextra/i1/o1cn01juoyif22n3uu7jx4r_!!6000000007107-2-tps-162-48.png" class="pdp-mod-product-badge" alt="lazmall">
  <h1 class="pdp-mod-product-badge-title">
    yierku 【free shipping miễn phí vận chuyển】giày nam mùa thu và mùa đông giày thường xu hướng nam thể thao tất cả các trận đấu giày da tăng chiều cao giày nam
  </h1>
</div>
...
I want to get the text between the <h1> tags, that is, yierku [Free shipping miễn phí vận chuyển] giày n....
Here's what I've tried so far:
c := colly.NewCollector()

c.OnError(func(_ *colly.Response, err error) {
	log.Println("Something went wrong:", err)
})

c.OnXML("/html/body", func(e *colly.XMLElement) {
	child := e.ChildAttrs("div[4]/div/div[3]/div[2]/div/div[1]/div[3]/div/div/h1", "class")
	fmt.Println(child)
})
It gives a response of pdp-mod-product-badge-title
When I change it to

child := e.ChildAttrs("div[4]/div/div[3]/div[2]/div/div[1]/div[3]/div/div/h1", "text")

it doesn't give me any results.
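Use func (*XMLElement) ChildText instead. ChildAttrs returns the values of a named attribute, and the <h1> element has no attribute called "text", so the result is empty. The title is the element's text content, which is what ChildText returns.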
package main

import (
	"fmt"

	"github.com/gocolly/colly/v2"
)

func main() {
	c := colly.NewCollector()

	c.OnError(func(_ *colly.Response, err error) {
		fmt.Println("Something went wrong:", err)
	})

	c.OnXML("/html/body", func(e *colly.XMLElement) {
		// ChildText returns the text content of the matched element.
		child := e.ChildText("div[4]/div/div[3]/div[2]/div/div[1]/div[3]/div/div/h1")
		fmt.Println(child)
	})

	c.Visit("https://www.lazada.vn/-i1701980654-s7563711492.html")
	// Output:
	// Yierku 【Free Shipping Miễn phí vận chuyển】Giày nam mùa thu và mùa đông giày thường xu hướng nam thể thao tất cả các trận đấu giày da tăng chiều cao giày nam
}
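To see why asking for a "text" attribute returns nothing, it may help to look at the attribute/text distinction in isolation. The sketch below does not use Colly at all: it parses a shortened stand-in for the `<h1>` with the standard library's `encoding/xml`. The helper `attrsAndText` and the shortened fragment are hypothetical, introduced only for illustration, and the fragment is assumed to be well-formed XML (real HTML often is not).

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// attrsAndText is a hypothetical helper, not part of Colly. It parses one
// well-formed element and returns the root element's attributes together
// with the trimmed text content found inside it.
func attrsAndText(fragment string) (map[string]string, string, error) {
	dec := xml.NewDecoder(strings.NewReader(fragment))
	attrs := map[string]string{}
	var text strings.Builder
	depth := 0
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, "", err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			// Only record attributes of the outermost element.
			if depth == 0 {
				for _, a := range t.Attr {
					attrs[a.Name.Local] = a.Value
				}
			}
			depth++
		case xml.EndElement:
			depth--
		case xml.CharData:
			// Character data is the element's text, separate from attributes.
			text.Write(t)
		}
	}
	return attrs, strings.TrimSpace(text.String()), nil
}

func main() {
	// Shortened stand-in for the <h1> in the question (assumed sample data).
	fragment := `<h1 class="pdp-mod-product-badge-title"> yierku giày nam </h1>`

	attrs, text, err := attrsAndText(fragment)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	_, hasText := attrs["text"]
	fmt.Println("class attribute:", attrs["class"])
	fmt.Println(`has "text" attribute:`, hasText)
	fmt.Println("text content:", text)
}
```

The only attribute on the element is `class`, so a lookup for `text` comes back empty, while the title itself arrives as character data. That mirrors the split between `ChildAttrs` and `ChildText` in the answer above.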