How to Extract Hyperlinks from a Web Page with Python - Python Tutorial - PHP中文網


高洛峰

Feb 22, 2017, 04:52 PM

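Many people who are just starting to learn Python plan to use it for writing web crawlers. A crawler first has to fetch a web page and then extract the hyperlink addresses from it. This article shares a simple way to do that, for anyone who needs a reference.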

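The simplest implementation is shown below: fetch the target page first, then use a regular expression to match the href attribute of a tags and obtain the hyperlinks.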

The code is as follows:

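import re
import urllib.request  # Python 3: urllib.request replaces Python 2's urllib2

url = 'http://www.sunbloger.com/'

# Fetch the page; the with block closes the connection when done.
req = urllib.request.Request(url)
with urllib.request.urlopen(req) as con:
    # read() returns bytes, so decode before matching (UTF-8 is assumed here).
    doc = con.read().decode('utf-8', errors='replace')

# Capture absolute http:// URLs that appear in href="..." attributes.
links = re.findall(r'href="(http://[a-zA-Z0-9./]+)"', doc)
for a in links:
    print(a)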
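
The regular expression above only picks up double-quoted, absolute http:// addresses made of letters, digits, dots and slashes, so https links, relative paths, single-quoted attributes and URLs containing characters such as hyphens or query strings are skipped. As a possible alternative, here is a minimal sketch that uses the standard library's html.parser and urllib.parse.urljoin instead of a regular expression. The names LinkCollector and extract_links, and the sample HTML, are made up for illustration and are not part of the original article; the sketch parses a string, so it can be tried without any network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != 'a':
            return
        for name, value in attrs:
            if name == 'href' and value:
                # Resolve relative paths against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the resolved href values of all <a> tags in html."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Sample markup (an assumption for demonstration only).
sample = '''
<a href="http://www.sunbloger.com/about">About</a>
<a href='https://example.com/my-page?id=1'>Single quotes, https</a>
<a class="nav" href="/archive/">Relative</a>
<link href="http://www.sunbloger.com/style.css">
'''

for link in extract_links(sample, 'http://www.sunbloger.com/'):
    print(link)
```

To use it on a live page, the decoded page text and the page URL from the code above could be passed in as extract_links(doc, url). Unlike the regular expression, this version ignores href attributes on tags other than a, such as the link tag in the sample.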


