How to Scrape JavaScript-Generated HTML Pages with Python 3: Worked Examples


黄舟

May 18, 2018 pm 03:51 PM


This article shows how to scrape HTML that is generated dynamically by JavaScript from Python 3. It uses the selenium library and a few worked examples to read elements of pages whose content is produced by JavaScript.

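Fetching a page with urllib or a similar library returns only the static source that the server sends; content generated by JavaScript is not in it.

The reason is that urllib only downloads the HTTP response. It does not run JavaScript or wait for scripts to load and execute, so anything a script adds to the page afterwards never reaches urllib.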
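To make the point concrete without touching the network, here is a minimal sketch. `RAW_HTML`, `TagCollector`, and `tags_in_static_html` are hypothetical names made up for illustration; the string stands in for what urllib would download. The `<p>` element exists only as text inside the script, so a static parser never sees it as a tag.

```python
from html.parser import HTMLParser

# Stand-in for the bytes urllib would download: the <p> element is only
# created when a browser actually runs the script.
RAW_HTML = """<html><body>
<div id="content"></div>
<script>
document.getElementById('content').innerHTML = '<p>Hello from JS</p>';
</script>
</body></html>"""


class TagCollector(HTMLParser):
    """Records the name of every start tag found in the static source."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tags_in_static_html(html):
    collector = TagCollector()
    collector.feed(html)
    return collector.tags


if __name__ == '__main__':
    # The script body is treated as plain text, so 'p' is not listed.
    print(tags_in_static_html(RAW_HTML))
```

A browser that executes the script would end up with a `<p>` inside the `<div>`; the static source never contains it as markup.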

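Does that mean JavaScript-generated content cannot be read at all? Not so.

The Python library for the job is selenium. This article was written against version 2.44.0.

Install it first: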

pip install -U selenium
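Because `-U` installs the newest release rather than 2.44.0, it can help to check which version is actually present. The sketch below is only an illustration: `parse_version`, `has_legacy_finders`, and `installed_selenium_version` are hypothetical helpers, and the 4.3 cutoff reflects the assumption that the `find_element_by_*` helpers were removed in Selenium 4.3.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '4.3.0' into (4, 3, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split('.'):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def has_legacy_finders(selenium_version):
    """True if find_element_by_* should still exist (assumed removed in 4.3)."""
    return parse_version(selenium_version) < (4, 3)


def installed_selenium_version():
    """Return the installed selenium version string, or None if it is missing."""
    try:
        return version('selenium')
    except PackageNotFoundError:
        return None


if __name__ == '__main__':
    found = installed_selenium_version()
    if found is None:
        print('selenium is not installed')
    else:
        print(found, 'legacy finders available:', has_legacy_finders(found))
```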

The three examples below show how to use it:

【Example 0】

Open a Firefox browser
Load the page at the given URL

from selenium import webdriver
browser = webdriver.Firefox()
browser.get('http://www.baidu.com/')
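Once a page is loaded this way, the rendered HTML (including whatever JavaScript has added) can be read from the driver's `page_source` attribute. The following is a minimal sketch, not the author's code: `wait_for_page_source`, `fetch_rendered_html`, and the `marker` argument are hypothetical, and the simple polling loop is a stand-in for Selenium's own explicit waits.

```python
import time


def wait_for_page_source(driver, marker, timeout=10.0, poll=0.5):
    """Poll driver.page_source until `marker` appears, then return the HTML.

    `driver` is any object with a `page_source` attribute, such as a
    Selenium WebDriver. Raises TimeoutError if the marker never shows up.
    """
    deadline = time.monotonic() + timeout
    while True:
        html = driver.page_source
        if marker in html:
            return html
        if time.monotonic() >= deadline:
            raise TimeoutError('marker %r not found within %.1fs' % (marker, timeout))
        time.sleep(poll)


def fetch_rendered_html(url, marker, timeout=10.0):
    """Open `url` in Firefox and return the HTML after JavaScript has run."""
    # Imported here so the polling helper above works without selenium installed.
    from selenium import webdriver

    browser = webdriver.Firefox()
    try:
        browser.get(url)
        return wait_for_page_source(browser, marker, timeout)
    finally:
        browser.quit()  # Always close the browser, even on timeout
```

A call might look like `fetch_rendered_html('http://www.baidu.com/', '百度')`, where the marker is some text expected only after the page has finished rendering.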

【Example 1】

Open a Firefox browser
Load the Baidu home page
Search for "seleniumhq"
Close the browser

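from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

browser = webdriver.Firefox()
browser.get('http://www.baidu.com')
assert '百度' in browser.title
# Find the search box; Baidu's input is assumed to be named 'wd' (the original used 'p')
# find_element(By.NAME, ...) is used because newer Selenium releases dropped find_element_by_name
elem = browser.find_element(By.NAME, 'wd')
elem.send_keys('seleniumhq' + Keys.RETURN)  # Type the query and press Enter
browser.quit()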

【Example 2】

Selenium WebDriver is often used to test web applications. Here is an example that uses unittest from the Python standard library:

import unittest
from selenium import webdriver

class BaiduTestCase(unittest.TestCase):
    def setUp(self):
        # Start a fresh browser for each test and make sure it is closed afterwards
        self.browser = webdriver.Firefox()
        self.addCleanup(self.browser.quit)

    def testPageTitle(self):
        self.browser.get('http://www.baidu.com')
        self.assertIn('百度', self.browser.title)

if __name__ == '__main__':
    unittest.main(verbosity=2)
