How to implement CSS selector field parsing

小云云

Feb 02, 2018, 10:30 AM

This article shows how to implement field parsing with CSS selectors. Readers who need it can use it as a reference, and I hope it helps. Building on the basic CSS syntax covered earlier, let's now implement field parsing, starting with the title. Open the browser's web developer tools and find the source code that corresponds to the title.

The title sits in the h1 node under the element with class="entry-header", so I opened scrapy shell to debug the selector.

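But that result still includes the h1 tag, and I only want the text inside it. What should I do? In this case, append the ::text pseudo-element that Scrapy adds to its CSS selectors: .entry-header h1::text returns only the text of the node.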

Note the two colons. CSS selectors are really convenient for this. In the same way, I use CSS selectors to parse the remaining fields. The code is as follows:

# -*- coding: utf-8 -*-
import re

import scrapy


class JobboleSpider(scrapy.Spider):
    name = 'jobbole'
    allowed_domains = ['blog.jobbole.com']
    start_urls = ['http://blog.jobbole.com/113549/']

    def parse(self, response):
        # XPath versions of the same extractions, kept for comparison:
        # title = response.xpath('//p[@class = "entry-header"]/h1/text()').extract()[0]
        # create_date = response.xpath("//p[@class = 'entry-meta-hide-on-mobile']/text()").extract()[0].strip().replace("·","").strip()
        # praise_numbers = response.xpath("//span[contains(@class,'vote-post-up')]/h10/text()").extract()[0]
        # fav_nums = response.xpath("//span[contains(@class,'bookmark-btn')]/text()").extract()[0]
        # match_re = re.match(r".*?(\d+).*", fav_nums)
        # if match_re:
        #     fav_nums = match_re.group(1)
        # comment_nums = response.xpath("//a[@href='#article-comment']/span").extract()[0]
        # match_re = re.match(r".*?(\d+).*", comment_nums)
        # if match_re:
        #     comment_nums = match_re.group(1)
        # content = response.xpath("//p[@class='entry']").extract()[0]

        # Extract the fields with CSS selectors
        title = response.css(".entry-header h1::text").extract()[0]
        create_date = response.css(".entry-meta-hide-on-mobile::text").extract()[0].strip().replace("·", "").strip()
        praise_numbers = response.css(".vote-post-up h10::text").extract()[0]
        fav_nums = response.css("span.bookmark-btn::text").extract()[0]
        # Keep only the first run of digits in the bookmark text
        match_re = re.match(r".*?(\d+).*", fav_nums)
        if match_re:
            fav_nums = match_re.group(1)
        comment_nums = response.css("a[href='#article-comment'] span::text").extract()[0]
        # Keep only the first run of digits in the comment text
        match_re = re.match(r".*?(\d+).*", comment_nums)
        if match_re:
            comment_nums = match_re.group(1)
        content = response.css("p.entry").extract()[0]
        tags = response.css("p.entry-meta-hide-on-mobile a::text").extract()[0]
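The two re.match steps in the spider repeat the same logic, so they could be factored into a helper. The following is a minimal sketch using only the standard library. The name extract_number and the default of 0 are assumptions made for illustration. The article's code instead leaves the raw string unchanged when no digit is found, and keeps the number as a string rather than converting it to int.

```python
import re

# Same pattern as the spider: lazily skip leading characters,
# then capture the first run of digits.
NUMBER_RE = re.compile(r".*?(\d+).*")


def extract_number(text, default=0):
    """Return the first integer found in text, or default if there is none."""
    match = NUMBER_RE.match(text)
    if match:
        return int(match.group(1))
    return default


# Made-up sample strings; the real page text may look different.
print(extract_number(" 2 bookmarks"))
print(extract_number(" bookmarks"))
```

With such a helper, each count field in parse() would become a single call on the extracted text.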

