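I had long heard that Python is easy to work with, and it lives up to the reputation: a few short lines were enough to implement the basic functionality.
Checking whether a target website contains a specified URL is a simple process:
1. Fetch the HTML of the target web page
2. Search the HTML for the specified URL
3. If it is found, report OK; otherwise, report Error
The program uses two libraries, urllib2 and sgmllib, both from the Python 2 standard library.
urllib2 defines functions and classes for opening URLs (mostly over HTTP).
sgmllib is responsible for parsing the HTML.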
# -*- coding: utf-8 -*-
import urllib2
from sgmllib import SGMLParser


class URLLister(SGMLParser):
    def reset(self):
        SGMLParser.reset(self)
        self.urls = []

    def start_a(self, attrs):
        # Called by the parser for every <a> tag; attrs is a list of (name, value) pairs.
        href = [v for k, v in attrs if k == 'href']
        if href:
            # 'http://网站URL' is a placeholder: replace it with the URL to look for.
            if href[0].count('http://网站URL') == 1:
                self.urls.extend(href)


# Sites to check
links = ['http://www.google.com/',
         'http://www.baidu.com',
         'http://www.sohu.net',
         'http://www.163.com',
         'http://www.cnblogs.com',
         'http://www.qq.com',
         'http://www.yahoo.com/',
         'http://www.bing.com/',
         'http://www.360.com',]

for eachlink in links:
    f = urllib2.urlopen(eachlink)
    if f.code == 200:
        parser = URLLister()
        parser.feed(f.read())
        f.close()
        # Report inside this branch so parser is always defined when it is read.
        if len(parser.urls) >= 1:
            print 'The link from ' + eachlink + ' is OK!'
        else:
            print 'The link from ' + eachlink + ' is ERROR!'
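One thing to watch in the loop: urlopen raises an exception for HTTP error statuses and unreachable hosts instead of returning a response, so a single bad site stops the whole run. The following is a minimal sketch of a more forgiving fetch step, written for Python 3, where urllib2 was split into urllib.request and urllib.error. The helper name fetch_html and the 10-second default timeout are assumptions made for illustration, not part of the original script.

```python
import urllib.error
import urllib.request


def fetch_html(url, timeout=10):
    """Return the page body as text, or None if the request fails.

    Hypothetical helper: the name and the default timeout are illustrative.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # Fall back to UTF-8 when the server does not declare a charset.
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except (urllib.error.URLError, ValueError, OSError):
        # URLError covers HTTPError (4xx/5xx) and unreachable hosts;
        # ValueError is raised for strings that are not valid URLs.
        return None
```

With a helper like this, the loop can treat a None result as ERROR and move on to the next site.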
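The main functions used in the script are:
1. urllib2.urlopen(url[, data][, timeout]) // opens a URL
2. SGMLParser.feed(data) // passes in the HTML data to be parsed
3. SGMLParser.start_tag(attributes) // handler for the HTML tag to be parsed. This program defines start_a, which the parser calls for each <a> tag in the HTML. Reading the value of each tag's href attribute gives all the links on the page; as long as the specified URL is among them, the check reports OK.
This is a very small script, but it still excited me. First, I have stepped into the world of Python and used it to solve a real problem at work. Second, its simple syntax and indentation-based layout really caught my eye. I hope to use Python much more from now on to solve all kinds of practical problems at work and put what I learn into practice.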
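sgmllib exists only in Python 2, so the same idea has to be expressed differently on Python 3. As a rough sketch, assuming the standard html.parser module as a substitute: its single handle_starttag method plays the role of the per-tag start_a handler. The names LinkFinder and contains_link and the sample HTML are made up for illustration, and the match uses a plain substring test rather than count(...) == 1.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collect href values of <a> tags that contain a target URL."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # Called for every opening tag; tag names arrive lowercased,
        # and attrs is a list of (name, value) pairs.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and self.target in value:
                self.urls.append(value)


def contains_link(html, target):
    """Return True if any <a href> in html contains target."""
    parser = LinkFinder(target)
    parser.feed(html)
    return bool(parser.urls)


# Illustrative input, not taken from a real site.
sample = '<p><a href="http://example.com/page">x</a> <a href="/local">y</a></p>'
print(contains_link(sample, "http://example.com"))  # True
print(contains_link(sample, "http://example.org"))  # False
```

Feeding the parser a string keeps the matching logic separate from the network request, which makes it easy to try out on saved HTML.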
