Simple usage example of Jsoup

Test web page



<!DOCTYPE html><!-- http://jwc.yangtzeu.edu.cn/ -->
    <meta>
    <title>长江大学</title>
    <link>
    <link>
    <link>
    <script></script>
    <script></script>
    <script></script>

    <p>

        <!-- Top image p -->
        </p><p></p>

        <!-- Top menu p -->
        <p>

            </p><p>

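                </p><p><a>Home</a></p>
                <p><a>Organization</a></p>
                <p><a>Rules and Regulations</a></p>
                <p><a>Teaching Development</a></p>
                <p><a>Academic Administration</a></p>
                <p><a>Examination Administration</a></p>
                <p><a>Practice and Innovation</a></p>
                <p><a>Quality Assessment</a></p>
                <p><a>Student Affairs</a></p>
                <p><a>Service Guide</a></p>
                <p><a>Download Center</a></p>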

            

        
        <p></p>

        <!-- Top time p -->
        <p></p>
        <p></p>

        <!-- Middle table p -->
        <p>

            <!-- Left table-cell -->
            </p><p>

                </p><p></p>
                <p></p>

                <h2>Higher Education News<a>+MORE</a></h2>
                
                

                

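Friendly links

Academic notices | This week's affairs

  • On organizing the 2017 (12th) Yangtze University undergraduate chemistry experiment (2017-03-30)
  • Notice on verifying the grades of 2013-intake graduating students in the liberal arts colleges (2017-03-30)
  • Notice on organizing applications for the second batch of university-level bilingual teaching demonstration courses (2017-03-30)
  • View more...

  • Main teaching work schedule for June~July, second semester of the 2016~2017 academic year (2017-03-30)
  • Main teaching work schedule for May, second semester of the 2016~2017 academic year (2017-03-30)
  • Main teaching work schedule for April, second semester of the 2016~2017 academic year (2017-03-30)
  • Main teaching work schedule for March, second semester of the 2016~2017 academic year (2017-03-30)
  • View more...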

                                      

            

                 

                 

        

              

    <script> setup(); switchTab(elementById("notice")); addEventss(); </script>

Java code


import java.io.File;
import java.util.ArrayList;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class App {
    public static void main(String[] args) {
        try {
            File input = new File("/Users/YouXianMing/Documents/Project/HTML Project/yangtze/yangtze.html");
            Document doc = Jsoup.parse(input, "UTF-8", "http://yangtze.com/");

            // Get an element by its id
            {
                Element content = doc.getElementById("content");
                System.out.println(content);
            }
            // Get a list of elements by CSS class name
            {
                ArrayList<Element> list = doc.getElementsByClass("space");
                for (Element element : list) {
                    System.out.println(element + "\n");
                }
            }
            // Get a list of elements by tag name
            {
                ArrayList<Element> list = doc.getElementsByTag("p");
                for (Element element : list) {
                    System.out.println(element + "\n");
                }
            }
            // Get a list of elements that carry a given attribute
            {
                ArrayList<Element> list = doc.getElementsByAttribute("href");
                for (Element element : list) {
                    System.out.println(element + "\n");
                }
            }
            // Navigate from an element to its parent, children and siblings
            {
                Element content = doc.getElementById("header-menu-table");
                // Parent of the element
                System.out.println(content.parent());
                // All child elements of the element
                System.out.println(content.children());
                // First sibling at the same level as the element
                System.out.println(content.child(0).firstElementSibling());
                // Last sibling at the same level as the element
                System.out.println(content.child(0).lastElementSibling());
                // Previous sibling of the element
                System.out.println(content.child(1).previousElementSibling());
                // Next sibling of the element
                System.out.println(content.child(0).nextElementSibling());
            }
            // Data held by a single element
            {
                Element content = doc.getElementsByClass("ul-type-1").first().child(0);
                // Text content
                System.out.println(content.text());
                // Tag name
                System.out.println(content.tagName());
                // Tag object
                System.out.println(content.tag());
                // Attribute map
                System.out.println(content.attributes());
                // Inner HTML
                System.out.println(content.html());
                // Outer HTML
                System.out.println(content.outerHtml());
                // Value of the style attribute
                System.out.println(content.attr("style"));
            }
            // Find elements with the selector syntax
            {
                Elements elements;
                // Find by tag
                elements = doc.select("a");
                System.out.println(elements);
                // Find by id
                elements = doc.select("#content");
                System.out.println(elements);
                // Find by class
                elements = doc.select(".ul-type-1");
                System.out.println(elements);
                // Find by attribute
                elements = doc.select("[href]");
                System.out.println(elements);
                // Find by attribute-name prefix
                elements = doc.select("[^hr]");
                System.out.println(elements);
                // Find by attribute value
                elements = doc.select("[id=notice]");
                System.out.println(elements);
                // Attribute value starts with
                elements = doc.select("[onmouseover^=swit]");
                System.out.println(elements);
                // Attribute value ends with
                elements = doc.select("[onmouseover$=(this)]");
                System.out.println(elements);
                // Attribute value contains
                elements = doc.select("[onmouseover*=Tab]");
                System.out.println(elements);
                // Attribute value matches a regular expression
                elements = doc.select("ul[id~=^notice]");
                System.out.println(elements);
            }
        } catch (Exception e) {
            System.out.println(e);
        }
    }
}
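
To see what each attribute selector matches, it can help to run the same queries against a small inline document. The following is a minimal sketch rather than the test page itself: the `SelectorSketch` class, the `count` helper and the `HTML` string are hypothetical, and the ids and `onmouseover` values are assumptions modelled on the selectors used in the code.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class SelectorSketch {
    // Hypothetical markup: ids and onmouseover values are assumed, not copied from the real page.
    static final String HTML =
        "<ul id=\"notice\" onmouseover=\"switchTab(this)\"><li><a href=\"/a\">A</a></li></ul>"
      + "<ul id=\"affairs\" onmouseover=\"switchTab(this)\"><li><a href=\"/b\">B</a></li></ul>"
      + "<ul id=\"links\"><li><a>no link</a></li></ul>";

    // Number of elements in the sample document matched by a selector query.
    static int count(String cssQuery) {
        Document doc = Jsoup.parse(HTML);
        return doc.select(cssQuery).size();
    }

    public static void main(String[] args) {
        String[] queries = {
            "[href]",                 // has an href attribute
            "[^hr]",                  // has an attribute whose name starts with "hr"
            "[id=notice]",            // attribute value equals
            "[onmouseover^=swit]",    // attribute value starts with
            "[onmouseover$=(this)]",  // attribute value ends with
            "[onmouseover*=Tab]",     // attribute value contains
            "ul[id~=^notice]"         // attribute value matches a regular expression
        };
        for (String query : queries) {
            System.out.println(query + " -> " + count(query));
        }
    }
}
```

Printing the match count next to each query makes it easy to compare the operators side by side before applying them to a real page.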

Note

Replace the local file path in the code (and the base URI, if you need one) with your own values. This example loads the HTML from a local file.
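
If no local file is at hand, the same kind of `Document` can be built from a string. The sketch below assumes a hypothetical `ParseFromString` class and a made-up link; only the base URI `http://yangtze.com/` comes from the example. It also shows what the base URI is for: resolving relative links to absolute ones.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ParseFromString {
    // Resolves the href of the first link in the markup against the given base URI.
    // Returns an empty string when the markup contains no link with an href.
    static String firstAbsoluteLink(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        Element link = doc.select("a[href]").first();
        return link == null ? "" : link.absUrl("href");
    }

    public static void main(String[] args) {
        // Hypothetical markup with a relative link
        String html = "<p><a href=\"/notice/1.html\">Notice</a></p>";
        System.out.println(firstAbsoluteLink(html, "http://yangtze.com/"));
    }
}
```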

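The example covers several ways of obtaining elements: by id, by CSS class name, by tag name, by attribute, by navigating to parents, children and siblings, and by selector queries.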
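
As a compact illustration of the lookup and navigation calls, here is a sketch against a few lines of inline markup. The `TraversalSketch` class, its helper methods and the markup are hypothetical. Only the id `header-menu-table` and the class `space` are borrowed from the example, and their placement here is an assumption.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class TraversalSketch {
    // Hypothetical markup: a container with three paragraph children.
    static final String HTML =
        "<div id=\"header-menu-table\">"
      + "<p class=\"space\"><a href=\"/\">Home</a></p>"
      + "<p><a href=\"/rules\">Rules</a></p>"
      + "<p><a>Downloads</a></p>"
      + "</div>";

    static int countByTag(String tag) {
        return Jsoup.parse(HTML).getElementsByTag(tag).size();
    }

    static int countByClass(String className) {
        return Jsoup.parse(HTML).getElementsByClass(className).size();
    }

    static int countByAttribute(String name) {
        return Jsoup.parse(HTML).getElementsByAttribute(name).size();
    }

    // Text of the element that follows the first child of the menu container.
    static String textAfterFirstChild() {
        Document doc = Jsoup.parse(HTML);
        Element menu = doc.getElementById("header-menu-table");
        return menu.child(0).nextElementSibling().text();
    }

    // Text of the last sibling, reached from the first child.
    static String lastSiblingText() {
        Document doc = Jsoup.parse(HTML);
        Element menu = doc.getElementById("header-menu-table");
        return menu.child(0).lastElementSibling().text();
    }

    public static void main(String[] args) {
        System.out.println(countByTag("p"));
        System.out.println(countByClass("space"));
        System.out.println(countByAttribute("href"));
        System.out.println(textAfterFirstChild());
        System.out.println(lastSiblingText());
    }
}
```

Note that `getElementById` returns `null` when no element has the id, so real code would usually check the result before calling `child` or `parent` on it.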
