
How does Go Colly find the requested element?

PHPz
2024-02-13 13:57:08


Go Colly is a lightweight web-scraping framework written in Go. It is designed for fast, concurrent crawling and is easy to extend. When scraping with Colly, a common task is locating specific elements on a page. So how does Colly find the requested element, and why might a selector that looks right in the browser match nothing? The question and answer below work through one such case.

Question content

I am trying to use Colly to loop through the contents of a specific table, but the table is not recognized. This is what I have so far:

package main

import (
    "fmt"
    
    "github.com/gocolly/colly"
)

func main() {
    c := colly.NewCollector(
        colly.AllowedDomains("wikipedia.org", "en.wikipedia.org"),
    )
    
    links := make([]string, 0)

    c.OnHTML("div.mw-parser-output", func(e *colly.HTMLElement) {
        
        e.ForEach("table.wikitable.sortable.jquery-tablesorter > tbody > tr", func(_ int, elem *colly.HTMLElement) {
            fmt.Println(elem.ChildAttr("a[href]", "href"))
            links = append(links, elem.ChildAttr("a[href]", "href"))
        })
    })
    
    c.OnRequest(func(r *colly.Request) {
        fmt.Println("Visiting", r.URL.String())
    })

    c.Visit("https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population")
    fmt.Println("Found urls for", len(links), "countries.")
}

I need to loop through all tr elements in the table.

Workaround

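It turns out that the selector that works is table.wikitable.sortable, even though the Chrome console shows the table's classes as wikitable sortable jquery-tablesorter. The difference comes from JavaScript: the jquery-tablesorter class is added in the browser by the page's table-sorting script after the page loads. Colly does not execute JavaScript and only sees the HTML the server returns, where the table carries just the classes wikitable and sortable, so a selector that requires jquery-tablesorter matches nothing. Dropping that class from the selector solved my problem.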


Statement:
This article is reproduced from stackoverflow.com.