
How to read the contents of a web page in Java without using any external library

Reposted by 王林: 2023-09-02 08:45:08, 1193 views

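The URL class of the java.net package represents a Uniform Resource Locator, which points to a resource (a file, a directory, or a reference) on the World Wide Web.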

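The openStream() method of this class opens a connection to the URL represented by the current object and returns an InputStream object that you can use to read data from the URL.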
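To illustrate what the returned InputStream gives you, here is a minimal sketch that reads a stream line by line with a BufferedReader. The class name OpenStreamSketch, the helper readLines, and the choice of UTF-8 as the encoding are assumptions made for illustration; they do not come from the original article. When no URL is passed on the command line, an in-memory stream stands in for url.openStream() so the sketch also runs offline.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class OpenStreamSketch {
    // Hypothetical helper: reads every line from the stream, decoding the
    // bytes as UTF-8 (an assumption about the page's encoding), and joins
    // the lines with '\n'. The reader, and with it the stream, is closed.
    public static String readLines(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        InputStream in;
        if (args.length > 0) {
            // openStream() opens the connection and hands back the response body
            in = new URL(args[0]).openStream();
        } else {
            // Offline stand-in for a real page
            in = new ByteArrayInputStream(
                    "<h1>It works!</h1>\n".getBytes(StandardCharsets.UTF_8));
        }
        System.out.print(readLines(in));
    }
}
```

Naming the charset explicitly avoids depending on the platform default, which may differ from the encoding the server actually uses.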

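Therefore, to read data from a web page (using the URL class):

Instantiate the java.net.URL class by passing the URL of the desired web page as an argument to its constructor.
Invoke the openStream() method and retrieve the InputStream object.
Instantiate the Scanner class by passing the InputStream object retrieved above as an argument to its constructor.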

Example

import java.io.IOException;
import java.net.URL;
import java.util.Scanner;

public class ReadingWebPage {
   public static void main(String[] args) throws IOException {
      // Instantiate the URL class with the address of the page to read
      URL url = new URL("http://www.something.com/");
      // StringBuilder holds the result (no synchronization is needed here)
      StringBuilder sb = new StringBuilder();
      // Open the stream and wrap it in a Scanner; try-with-resources closes both
      try (Scanner sc = new Scanner(url.openStream())) {
         // next() returns whitespace-delimited tokens, so spaces and line breaks are dropped
         while (sc.hasNext()) {
            sb.append(sc.next());
         }
      }
      // Retrieve the String from the StringBuilder object
      String result = sb.toString();
      System.out.println(result);
      // Remove the HTML tags
      result = result.replaceAll("<[^>]*>", "");
      System.out.println("Contents of the web page: " + result);
   }
}

Output

<html><body><h1>Itworks!</h1></body></html>
Contents of the web page: Itworks!
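The output shows "Itworks!" without a space because sc.next() returns whitespace-delimited tokens and the example appends them back to back. If the original spacing matters, one possible variant (a sketch, not part of the original article) is to set the Scanner delimiter to "\\A", the start-of-input anchor, so the whole stream comes back as a single token. The class name ReadingWebPageKeepSpaces, the helpers readAll and stripTags, and the UTF-8 charset are assumptions made for illustration, and an in-memory page replaces the network call so the sketch runs offline.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class ReadingWebPageKeepSpaces {
    // Hypothetical helper: returns the whole stream as one String.
    // "\\A" matches only at the start of input, so the Scanner never finds
    // a delimiter inside the text and yields everything as a single token.
    public static String readAll(InputStream in) {
        try (Scanner sc = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return sc.hasNext() ? sc.next() : "";
        }
    }

    // Same tag-stripping regex as the example above
    public static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        // Stand-in for url.openStream(); the page content is made up
        InputStream in = new ByteArrayInputStream(
                "<html><body><h1>It works!</h1></body></html>"
                        .getBytes(StandardCharsets.UTF_8));
        String page = readAll(in);
        System.out.println("Contents of the web page: " + stripTags(page));
    }
}
```

With the in-memory page above, the text between the tags keeps its space. Note that the regex only removes tags; it is not an HTML parser and leaves entities, scripts, and style content untouched.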


Disclaimer:

This article is reposted from tutorialspoint.com. In case of infringement, contact admin@php.cn for removal.
