How does Go Colly find the requested element?
Go Colly is a lightweight web-scraping framework written in Go, known for high performance, good concurrency support, and easy extensibility. When scraping with Colly, you often need to locate specific elements on a page. So how does Colly find the requested element? The question and answer below work through one concrete case.
I am trying to use colly to loop through the contents of a specific table, but the table is not recognized. This is what I have so far:
package main

import (
	"fmt"

	"github.com/gocolly/colly"
)

func main() {
	c := colly.NewCollector(
		colly.AllowedDomains("wikipedia.org", "en.wikipedia.org"),
	)

	links := make([]string, 0)

	c.OnHTML("div.mw-parser-output", func(e *colly.HTMLElement) {
		e.ForEach("table.wikitable.sortable.jquery-tablesorter > tbody > tr", func(_ int, elem *colly.HTMLElement) {
			fmt.Println(elem.ChildAttr("a[href]", "href"))
			links = append(links, elem.ChildAttr("a[href]", "href"))
		})
	})

	c.OnRequest(func(r *colly.Request) {
		fmt.Println("Visiting", r.URL.String())
	})

	c.Visit("https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population")

	fmt.Println("Found urls for", len(links), "countries.")
}
I need to loop through all tr elements in the table.
It turns out that the class selector that matches is actually wikitable.sortable, even though the class list appears in the Chrome console as wikitable sortable jquery-tablesorter. I don't know why the names are so different, but it solved my problem.
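A likely explanation, offered here as an assumption rather than something the answer confirms, is that the jquery-tablesorter class is added by JavaScript in the browser after the page loads. Colly only sees the HTML the server returns and does not run scripts. The sketch below uses only the standard library to illustrate how a compound class selector is evaluated against a class attribute. It is not Colly's own matching code. The function name matchesClasses and both sample class strings are hypothetical and chosen for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesClasses reports whether every class in required appears in the
// space-separated class attribute value. This mirrors the semantics of a
// compound class selector such as "table.wikitable.sortable".
// Hypothetical helper for illustration; not part of Colly.
func matchesClasses(classAttr string, required ...string) bool {
	present := make(map[string]bool)
	for _, c := range strings.Fields(classAttr) {
		present[c] = true
	}
	for _, r := range required {
		if !present[r] {
			return false
		}
	}
	return true
}

func main() {
	// Assumed class attributes: what the fetched HTML might contain
	// versus what the browser shows after scripts have run.
	served := "wikitable sortable"
	inBrowser := "wikitable sortable jquery-tablesorter"

	// Selector .wikitable.sortable against the served HTML.
	fmt.Println(matchesClasses(served, "wikitable", "sortable")) // true

	// Selector .wikitable.sortable.jquery-tablesorter against the served HTML.
	fmt.Println(matchesClasses(served, "wikitable", "sortable", "jquery-tablesorter")) // false

	// The same selector against the class list seen in the browser.
	fmt.Println(matchesClasses(inBrowser, "wikitable", "sortable", "jquery-tablesorter")) // true
}
```

Applied to the code in the question, this corresponds to changing the ForEach selector to "table.wikitable.sortable > tbody > tr", which is the fix the answer describes.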