A Python Script for Checking Whether a Link Exists on a Website

WBOY

Jun 21, 2016, 08:52 AM

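I had long heard that Python is easy to work with, and it lives up to its reputation: a few short lines were enough to implement the basic functionality.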

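Checking whether a given URL appears on a target website is a simple process:

1. Fetch the HTML of the target page.

2. Search that HTML for the specified URL.

3. If the URL is found, report OK; otherwise, report Error.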
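The three steps above can be sketched as follows. This is a minimal illustration that assumes Python 3 and uses a plain substring search as the simplest stand-in for step 2; the names `fetch_html`, `page_has_url`, and `report` are hypothetical and do not come from the original script. The `fetch` parameter is an added convenience so the logic can be tried without network access.

```python
from urllib.request import urlopen


def fetch_html(page_url, timeout=10):
    """Step 1: download the page and return its HTML as text."""
    with urlopen(page_url, timeout=timeout) as resp:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def page_has_url(page_url, target_url, fetch=fetch_html):
    """Step 2: report whether target_url occurs anywhere in the page's HTML."""
    html = fetch(page_url)
    return target_url in html


def report(page_url, target_url, fetch=fetch_html):
    """Step 3: turn the result into an OK / ERROR message."""
    status = "OK" if page_has_url(page_url, target_url, fetch) else "ERROR"
    return "The link from %s is %s!" % (page_url, status)
```

A substring search also matches URLs that appear in plain text or scripts rather than in links, which is one reason to parse the `<a>` tags instead.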

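The whole program uses two libraries, urllib2 and sgmllib, both from the Python 2 standard library.

urllib2 mainly defines functions and classes for opening URLs (mostly over HTTP).

sgmllib is mainly responsible for parsing the HTML.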
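Both modules exist only in Python 2. In Python 3, urllib2 was split into `urllib.request` and `urllib.error`, and sgmllib was removed; `html.parser.HTMLParser` is the usual standard-library substitute. The sketch below shows the parsing half under that assumption. The names `LinkCollector` and `extract_hrefs` are hypothetical, chosen only for this illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser calls this for every opening tag, so filter for <a>.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.urls.append(value)


def extract_hrefs(html):
    """Return all href values found in <a> tags, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.urls
```

For example, `extract_hrefs('<a href="/about">About</a>')` returns a list containing `/about`.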

import urllib2
from sgmllib import SGMLParser

class URLLister(SGMLParser):
    def reset(self):
        SGMLParser.reset(self)
        self.urls = []

    def start_a(self, attrs):
        # Called by SGMLParser for every opening <a> tag.
        href = [v for k, v in attrs if k == 'href']
        if href:
            # 'http://<website URL>' is a placeholder: replace it with the URL to look for.
            if (href[0].count('http://<website URL>') == 1):
                self.urls.extend(href)


# Pages to scan for the target URL.
links = ['http://www.google.com/',
         'http://www.baidu.com',
         'http://www.sohu.net',
         'http://www.163.com',
         'http://www.cnblogs.com',
         'http://www.qq.com',
         'http://www.yahoo.com/',
         'http://www.bing.com/',
         'http://www.360.com',]

for eachlink in links:
    f = urllib2.urlopen(eachlink)
    if f.code == 200:
        parser = URLLister()
        parser.feed(f.read())
        f.close()
        # At least one matching link means the target URL is on the page.
        if (len(parser.urls) >= 1):
            print 'The link from ' + eachlink + ' is OK!'
        else:
            print 'The link from ' + eachlink + ' is ERROR!'

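The main functions involved are:

1. urllib2.urlopen(url[, data][, timeout]) // opens a URL

2. SGMLParser.feed(data) // passes the parser the HTML data to be parsed

3. SGMLParser.start_tag(attributes) // handles the opening of a particular HTML tag. This program defines start_a, which means it handles the <a> tags in the HTML. Reading the value of each <a> tag's href attribute yields every link on the page; as long as the specified URL is among them, the check passes.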
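The start_tag naming convention is specific to sgmllib: the parser looks up a method named after each tag it encounters. Python 3's `HTMLParser` instead sends every opening tag to one `handle_starttag` method, but the same dispatch can be emulated in a few lines. The sketch below assumes Python 3; `TagDispatchParser` and the `target_url` constructor argument are hypothetical additions for illustration, not part of either library.

```python
from html.parser import HTMLParser


class TagDispatchParser(HTMLParser):
    """Route each opening tag to a start_<tag>(attrs) method, if one exists."""

    def handle_starttag(self, tag, attrs):
        handler = getattr(self, "start_" + tag, None)
        if handler is not None:
            handler(attrs)


class URLLister(TagDispatchParser):
    """Keep only the <a> links whose href contains target_url."""

    def __init__(self, target_url):
        super().__init__()
        self.target_url = target_url
        self.urls = []

    def start_a(self, attrs):
        # Only <a> tags reach this method; other tags have no handler.
        href = [v for k, v in attrs if k == "href" and v]
        if href and self.target_url in href[0]:
            self.urls.extend(href)
```

After feeding a page to `URLLister("http://example.com")`, a non-empty `urls` list corresponds to the OK case in the script.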

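This is a very small script, but it still got me excited. For one thing, I have stepped into the world of Python and used it to solve a real problem at work. For another, its simple syntax and indentation-based formatting really impressed me. From now on I hope to use Python much more for the problems that come up in my work, and to put what I learn into practice.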
