1) What I want to capture is the fans of a celebrity on Instagram
2) The Instagram PC site uses a lot of js rendering
3) I have never written a crawler, and the boss will need the data tomorrow
I am currently using BeautifulSoup
, selenium
and phantomjs
The code demo is probably
driver = webdriver.PhantomJS(self.browser)
driver.get(self.url)
driver.implicitly_wait(3)
element = driver.find_element_by_class_name("_s53mj")
element.click()
html = driver.page_source
soup = BeautifulSoup(html)
The problem is:
1) I don’t know whether the click is executed successfully, whether the click element is correct, the driver seems to have no return value for my reference
2) Even if the click is successful, does it only call What should I do if the click() method in js is not triggered?
3) I don’t know whether to render page_source
first or click
first. Assume that the click execution is successful. , will it not be returned to the source?
Ah, thank you all reptile bosses
我想大声告诉你2017-06-28 09:24:20
What do you mean? I am puzzled. . .
Selenium automation, click can imitate user clicks, just like you click on the page yourself, everything is done in the virtual browser driver.
Look at your business logic. . . For example, some data needs to be clicked to obtain, so click first and then get the source code.