


Java Program Receiving 403 Forbidden Error While Web Browsers Succeed
Problem:
A Java program designed to retrieve the result count for a given Google search query returns a 403 Forbidden error, while the same query yields results in web browsers. Code snippet:
import java.io.BufferedReader; import java.io.IOException; import java.io.InputStreamReader; import java.net.URL; public class DataGetter { public static void main(String[] args) throws IOException { getResultAmount("test"); } private static int getResultAmount(String query) throws IOException { BufferedReader r = new BufferedReader(new InputStreamReader(new URL("https://www.google.com/search?q=" + query).openConnection() .getInputStream())); String line; String src = ""; while ((line = r.readLine()) != null) { src += line; } System.out.println(src); return 1; } }
Error:
Exception in thread "main" java.io.IOException: Server returned HTTP response code: 403 for URL: https://www.google.com/search?q=test at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source) at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(Unknown Source) at DataGetter.getResultAmount(DataGetter.java:15) at DataGetter.main(DataGetter.java:10)
Solution:
The issue arises because Java's URLConnection class does not fake an actual user-agent by default. Modifying the code to set a user-agent header resolves this problem:
URLConnection connection = new URL("https://www.google.com/search?q=" + query).openConnection(); connection.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.95 Safari/537.11"); connection.connect(); BufferedReader r = new BufferedReader(new InputStreamReader(connection.getInputStream(), Charset.forName("UTF-8")));
Additionally, SSL handling is transparent, as indicated by the exception stacktrace.
Further Considerations:
Retrieving result counts involves further steps, including faking a browser by fetching cookies and parsing redirect token links:
String cookie = connection.getHeaderField("Set-Cookie").split(";")[0]; Pattern pattern = Pattern.compile("content=\\"0;url=(.*?)\\""); Matcher m = pattern.matcher(response); if (m.find()) { String url = m.group(1); connection = new URL(url).openConnection(); connection.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.95 Safari/537.11"); connection.setRequestProperty("Cookie", cookie); connection.connect(); r = new BufferedReader(new InputStreamReader(connection.getInputStream(), Charset.forName("UTF-8"))); sb = new StringBuilder(); while ((line = r.readLine()) != null) { sb.append(line); } response = sb.toString(); pattern = Pattern.compile("<div><p>Executing this full code yields a result of 2930000000L.</p></div>
The above is the detailed content of Why Does My Java Program Get a 403 Forbidden Error When Scraping Google Search Results While My Browser Doesn't?. For more information, please follow other related articles on the PHP Chinese website!

There are subtle differences in Java's performance on different operating systems. 1) The JVM implementations are different, such as HotSpot and OpenJDK, which affect performance and garbage collection. 2) The file system structure and path separator are different, so it needs to be processed using the Java standard library. 3) Differential implementation of network protocols affects network performance. 4) The appearance and behavior of GUI components vary on different systems. By using standard libraries and virtual machine testing, the impact of these differences can be reduced and Java programs can be ensured to run smoothly.

Javaoffersrobustobject-orientedprogramming(OOP)andtop-notchsecurityfeatures.1)OOPinJavaincludesclasses,objects,inheritance,polymorphism,andencapsulation,enablingflexibleandmaintainablesystems.2)SecurityfeaturesincludetheJavaVirtualMachine(JVM)forsand

JavaScriptandJavahavedistinctstrengths:JavaScriptexcelsindynamictypingandasynchronousprogramming,whileJavaisrobustwithstrongOOPandtyping.1)JavaScript'sdynamicnatureallowsforrapiddevelopmentandprototyping,withasync/awaitfornon-blockingI/O.2)Java'sOOPf

JavaachievesplatformindependencethroughtheJavaVirtualMachine(JVM)andbytecode.1)TheJVMinterpretsbytecode,allowingthesamecodetorunonanyplatformwithaJVM.2)BytecodeiscompiledfromJavasourcecodeandisplatform-independent.However,limitationsincludepotentialp

Java'splatformindependencemeansapplicationscanrunonanyplatformwithaJVM,enabling"WriteOnce,RunAnywhere."However,challengesincludeJVMinconsistencies,libraryportability,andperformancevariations.Toaddressthese:1)Usecross-platformtestingtools,2)

JVM'sperformanceiscompetitivewithotherruntimes,offeringabalanceofspeed,safety,andproductivity.1)JVMusesJITcompilationfordynamicoptimizations.2)C offersnativeperformancebutlacksJVM'ssafetyfeatures.3)Pythonisslowerbuteasiertouse.4)JavaScript'sJITisles

JavaachievesplatformindependencethroughtheJavaVirtualMachine(JVM),allowingcodetorunonanyplatformwithaJVM.1)Codeiscompiledintobytecode,notmachine-specificcode.2)BytecodeisinterpretedbytheJVM,enablingcross-platformexecution.3)Developersshouldtestacross

TheJVMisanabstractcomputingmachinecrucialforrunningJavaprogramsduetoitsplatform-independentarchitecture.Itincludes:1)ClassLoaderforloadingclasses,2)RuntimeDataAreafordatastorage,3)ExecutionEnginewithInterpreter,JITCompiler,andGarbageCollectorforbytec


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Safe Exam Browser
Safe Exam Browser is a secure browser environment for taking online exams securely. This software turns any computer into a secure workstation. It controls access to any utility and prevents students from using unauthorized resources.

SublimeText3 English version
Recommended: Win version, supports code prompts!

MinGW - Minimalist GNU for Windows
This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.

mPDF
mPDF is a PHP library that can generate PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files "on the fly" from his website and handle different languages. It is slower than original scripts like HTML2FPDF and produces larger files when using Unicode fonts, but supports CSS styles etc. and has a lot of enhancements. Supports almost all languages, including RTL (Arabic and Hebrew) and CJK (Chinese, Japanese and Korean). Supports nested block-level elements (such as P, DIV),

Dreamweaver CS6
Visual web development tools
