When interacting with web servers, whether for web scraping or API work, Python requests headers are a powerful yet often overlooked tool. These headers communicate silently, telling the server who’s calling, why, and in what format data should be returned.
In this guide, we’ll cover everything you need to know about setting up headers with Python’s requests library, why header order matters, and how understanding headers can improve the success of your web interactions.
For those new to the library, you can get started by installing it with pip install requests to follow along with this guide.
In HTTP, headers are key-value pairs that accompany each request and response, guiding the server on how to process the request. Headers specify expectations, formats, and permissions, playing a critical role in server-client communication. For instance, headers can tell the server about the type of device sending the request, or whether the client expects a JSON response.
Each request initiates a dialogue between the client (like a browser or application) and server, with headers acting as instructions. The most common headers include User-Agent, Accept, Content-Type, and Authorization.
Headers can be easily managed using Python’s requests library, allowing you to get headers from a response or set custom headers to tailor each request.
Example: Getting Headers with Python Requests
With Python requests, you can read the headers of a response through response.headers.
    import requests

    response = requests.get('https://httpbin.dev')
    print(response.headers)

    {
      "Access-Control-Allow-Credentials": "true",
      "Access-Control-Allow-Origin": "*",
      "Content-Security-Policy": "frame-ancestors 'self' *.httpbin.dev; font-src 'self' *.httpbin.dev; default-src 'self' *.httpbin.dev; img-src 'self' *.httpbin.dev https://cdn.scrapfly.io; media-src 'self' *.httpbin.dev; script-src 'self' 'unsafe-inline' 'unsafe-eval' *.httpbin.dev; style-src 'self' 'unsafe-inline' *.httpbin.dev https://unpkg.com; frame-src 'self' *.httpbin.dev; worker-src 'self' *.httpbin.dev; connect-src 'self' *.httpbin.dev",
      "Content-Type": "text/html; charset=utf-8",
      "Date": "Fri, 25 Oct 2024 14:14:02 GMT",
      "Permissions-Policy": "fullscreen=(self), autoplay=*, geolocation=(), camera=()",
      "Referrer-Policy": "strict-origin-when-cross-origin",
      "Strict-Transport-Security": "max-age=31536000; includeSubDomains; preload",
      "X-Content-Type-Options": "nosniff",
      "X-Xss-Protection": "1; mode=block",
      "Transfer-Encoding": "chunked"
    }
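The output shows the headers the server sends back, with details like the content type (Content-Type), the response date (Date), and security policies such as Strict-Transport-Security.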
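Because response.headers is a case-insensitive mapping, individual values can be read without worrying about casing. The sketch below fills requests' CaseInsensitiveDict with a few values from the output above so that it runs without a network call. The describe_response_headers helper is a hypothetical name used only for illustration.

```python
from requests.structures import CaseInsensitiveDict


def describe_response_headers(headers):
    """Pick a few commonly used values out of a response-headers mapping."""
    return {
        "content_type": headers.get("content-type"),
        "date": headers.get("DATE"),
        # .get() returns None instead of raising when a header is absent
        "server": headers.get("Server"),
    }


# Stand-in for response.headers, using values from the output shown above
sample = CaseInsensitiveDict({
    "Content-Type": "text/html; charset=utf-8",
    "Date": "Fri, 25 Oct 2024 14:14:02 GMT",
})

summary = describe_response_headers(sample)
print(summary)
```

With a live response, passing response.headers to the same helper should behave the same way, since it is the same mapping type.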
Example: Setting Custom Headers
Custom headers, like adding a User-Agent for device emulation, can make requests appear more authentic:
    import requests

    headers = {'User-Agent': 'my-app/0.0.1'}
    response = requests.get('https://httpbin.dev/headers', headers=headers)
    print(response.json())

    {
      "headers": {
        "Accept": ["*/*"],
        "Accept-Encoding": ["gzip, deflate"],
        "Host": ["httpbin.dev"],
        "User-Agent": ["my-app/0.0.1"],
        "X-Forwarded-For": ["45.242.24.152"],
        "X-Forwarded-Host": ["httpbin.dev"],
        "X-Forwarded-Port": ["443"],
        "X-Forwarded-Proto": ["https"],
        "X-Forwarded-Server": ["traefik-2kvlz"],
        "X-Real-Ip": ["45.242.24.152"]
      }
    }
This setup helps ensure each request appears browser-like, reducing the chance of triggering anti-bot measures. In Python requests, setting headers lets you precisely control interactions with the server.
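When the same headers should accompany many requests, one option is to set them once on a requests.Session and add per-request headers on top. The following is a minimal sketch of that idea, assuming the my-app/0.0.1 User-Agent used in this guide. It prepares the request without sending it so the merged headers can be inspected offline.

```python
import requests

session = requests.Session()
# Defaults applied to every request made through this session
session.headers.update({"User-Agent": "my-app/0.0.1"})

# Per-request headers are merged on top of the session defaults
request = requests.Request(
    "GET",
    "https://httpbin.dev/headers",
    headers={"Accept": "application/json"},
)
prepared = session.prepare_request(request)

# Inspect what would be sent, without touching the network
print(dict(prepared.headers))
```

Calling session.send(prepared) would then perform the actual request.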
A frequent question when working with Python requests headers is whether header names are case-sensitive.
According to the HTTP/1.1 specification, header names are case-insensitive, meaning Content-Type, content-type, and CONTENT-TYPE are all equivalent. However, sticking to standard naming conventions like Content-Type instead of alternative casing is a good practice. Standardizing the format helps prevent confusion, especially when integrating with third-party APIs or systems that may interpret headers differently.
When web servers evaluate requests, subtle details such as inconsistent header casing can reveal the nature of a client. Many legitimate browsers and applications follow specific casing conventions, like capitalizing Content-Type. Bots or scripts, however, may not follow these conventions uniformly. By analyzing requests with unconventional casing, servers can flag or block potential bots.
In practice, Python’s requests library stores headers in a case-insensitive dictionary, so a lookup such as response.headers['content-type'] works regardless of how the name is cased. When sending, requests transmits each header name as you wrote it; it does not rewrite the casing. Servers still treat the names as equivalent, and some servers or proxies display them in a canonical form when echoing them back. Note also that while header names are case-insensitive, header values (such as “application/json” in Content-Type) may still be interpreted literally and should be formatted accurately.
In Python’s requests library, you can set headers in any case, and the library will interpret them correctly:
    headers = {'User-Agent': 'my-app/0.0.1'}
    response = requests.get('https://httpbin.dev/headers', headers=headers)
    print(response.json())

    {
      "headers": {
        "Accept": ["*/*"],
        "Accept-Encoding": ["gzip, deflate"],
        "Host": ["httpbin.dev"],
        "User-Agent": ["my-app/0.0.1"],
        "X-Forwarded-For": ["45.242.24.152"],
        "X-Forwarded-Host": ["httpbin.dev"],
        "X-Forwarded-Port": ["443"],
        "X-Forwarded-Proto": ["https"],
        "X-Forwarded-Server": ["traefik-2kvlz"],
        "X-Real-Ip": ["45.242.24.152"]
      }
    }
The echoed response lists header names in canonical form, such as User-Agent and Accept-Encoding. Keep in mind that this view comes from the server: requests itself sends each header name exactly as written, so content-type stays content-type on the wire. Because header names are case-insensitive, a compliant server treats it the same as Content-Type, which is why any casing works.
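The case handling can be checked locally by preparing a request instead of sending it. This small sketch writes the header in lowercase on purpose and then inspects the prepared request. The URL is only a placeholder, since nothing is transmitted.

```python
import requests

# Header name written in lowercase on purpose
request = requests.Request(
    "GET",
    "https://httpbin.dev/headers",
    headers={"content-type": "application/json"},
)
prepared = request.prepare()

# Lookups ignore case...
print(prepared.headers["Content-Type"])
# ...but the stored name keeps the casing it was given
print(list(prepared.headers.keys()))
```

The lookup succeeds with any casing, while the key list still shows the name as it was originally typed.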
In most standard API interactions, the order of the headers you send with Python requests does not affect functionality, as the HTTP specification does not require a specific order for headers. However, when dealing with advanced anti-bot and anti-scraping systems, header order can play an unexpectedly significant role in determining whether a request is accepted or blocked.
Anti-bot systems, such as Cloudflare, DataDome, and PerimeterX, often go beyond simple header verification and analyze the "fingerprint" of a request. This includes the order in which headers are sent. Human users (via browsers) typically send headers in a consistent order. For example, browser requests might commonly follow an order such as User-Agent, Accept, Accept-Language, Referer, and so on. In contrast, automation libraries or scrapers may send headers in a different order or add non-standard headers, which can serve as red flags for detection algorithms.
Example: Browser Headers vs. Python Requests Headers
In a browser, headers are sent in a fixed order that depends on the browser. As in the example above, that order might begin with User-Agent, Accept, Accept-Language, and Referer.
With Python’s requests library, headers might look slightly different:
    headers = {'User-Agent': 'my-app/0.0.1'}
    response = requests.get('https://httpbin.dev/headers', headers=headers)
    print(response.json())

    {
      "headers": {
        "Accept": ["*/*"],
        "Accept-Encoding": ["gzip, deflate"],
        "Host": ["httpbin.dev"],
        "User-Agent": ["my-app/0.0.1"],
        "X-Forwarded-For": ["45.242.24.152"],
        "X-Forwarded-Host": ["httpbin.dev"],
        "X-Forwarded-Port": ["443"],
        "X-Forwarded-Proto": ["https"],
        "X-Forwarded-Server": ["traefik-2kvlz"],
        "X-Real-Ip": ["45.242.24.152"]
      }
    }
This slight difference in header ordering can hint to anti-bot systems that the request might be automated, especially if combined with other signals, such as the User-Agent format or missing headers.
By analyzing this order, advanced detection systems can identify patterns often associated with automated scripts or bots. When a request does not match the usual order, the server may assume it’s coming from a bot, potentially resulting in blocked requests or captcha challenges.
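One way to influence the order with requests is to clear a session's default headers and re-insert them in the order you want, since the session keeps headers in insertion order. The sketch below assumes the example order mentioned earlier, with illustrative header values, and checks the result on a prepared request rather than a live one.

```python
import requests

# Desired order, listed explicitly (values are illustrative)
ordered_headers = {
    "User-Agent": "my-app/0.0.1",
    "Accept": "text/html",
    "Accept-Language": "en-US,en;q=0.9",
    "Accept-Encoding": "gzip, deflate",
}

session = requests.Session()
session.headers.clear()                  # drop the library defaults
session.headers.update(ordered_headers)  # insert in our own order

prepared = session.prepare_request(
    requests.Request("GET", "https://httpbin.dev/headers")
)
header_names = list(prepared.headers.keys())
print(header_names)
```

This only controls the headers requests manages. The underlying HTTP stack adds some headers itself, such as Host, so it is worth verifying the final order on the wire with a proxy tool.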
When setting up Python requests headers to mimic browser requests, it's helpful to know which headers are standard in most web browsers. These headers inform the server about the client’s capabilities and preferences, making the request appear more legitimate.
Standard headers mimic browser behavior, increasing the success of requests. Key headers include User-Agent, Accept, Accept-Language, Accept-Encoding, and Referer.
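To ensure requests mimic real browsers, capture the headers a real browser sends using one of these tools:
Browser Developer Tools: open the Network tab, select a request, and read its request headers.
Proxy Tools: route browser traffic through an intercepting proxy to inspect the headers and their order.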
Example: Mimicking Headers in Python
    import requests

    # Illustrative values; copy real ones from your browser's developer tools
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language': 'en-US,en;q=0.9',
        'Accept-Encoding': 'gzip, deflate',
        'Referer': 'https://httpbin.dev/',
    }
    response = requests.get('https://httpbin.dev/headers', headers=headers)
    print(response.json())
This request uses browser-like headers to make the interaction appear more natural. By observing the headers and header order from browser tools, you can customize these in Python to make your request as close to a real browser request as possible.
The User-Agent string plays a crucial role in how servers respond to requests. It identifies the application, operating system, and device making the request, allowing servers to tailor their responses accordingly.
User-Agent strings are typically generated by the browser itself and can vary based on the version of the browser, the operating system, and even the hardware configuration.
You can learn more in our dedicated article, How to Effectively Use User Agents for Web Scraping (https://scrapfly.io/blog/user-agent-header-in-web-scraping/).
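It helps to see what requests sends when no User-Agent is set, because that default value openly identifies the library. The short sketch below prints the default and then overrides it on a session, using the my-app/0.0.1 value from this guide as a stand-in for whatever identifier you choose.

```python
import requests
from requests.utils import default_user_agent

# The User-Agent requests sends when none is provided
print(default_user_agent())

# A fresh session carries the same value in its default headers
session = requests.Session()
print(session.headers["User-Agent"])

# Override it once for every request made through the session
session.headers["User-Agent"] = "my-app/0.0.1"
```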
When using Python requests headers with POST requests, headers play a vital role in how the server interprets the data sent by the client. POST requests are typically used to send data to a server to create, update, or modify resources, often requiring additional headers to clarify the data’s structure, format, and purpose.
Content-Type: Indicates the data format, such as application/json for JSON data, application/x-www-form-urlencoded for form submissions, or multipart/form-data for files. Setting this correctly ensures the server parses your data as expected.
User-Agent: Identifies the client application, which helps with API access and rate limit policies.
Authorization: Needed for secure endpoints to authenticate requests, often using tokens or credentials.
Accept: Specifies the desired response format (e.g., application/json), aiding in consistent data handling and error processing.
Example Usage of Headers for POST Requests
To send data in a JSON format, you typically set the Content-Type header to application/json and pass the data as JSON. Here’s an example that sets headers on a POST request to send a JSON payload:
    import requests

    headers = {
        'Content-Type': 'application/json',
        'User-Agent': 'my-app/0.0.1',
    }
    payload = {'name': 'example'}  # illustrative payload
    response = requests.post('https://httpbin.dev/post', headers=headers, json=payload)
    print(response.status_code)
Setting POST headers in this way ensures the server processes your data correctly and may prevent the request from being blocked.
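As a side note, when a payload is passed through the json= parameter and no Content-Type is supplied, requests fills in application/json on its own. The sketch below shows this on a prepared request, so nothing is sent. The payload and the /post path are assumptions made for illustration.

```python
import requests

payload = {"name": "example"}  # hypothetical payload

# json= serializes the payload and sets Content-Type for us
request = requests.Request("POST", "https://httpbin.dev/post", json=payload)
prepared = request.prepare()

print(prepared.headers["Content-Type"])
print(prepared.body)
```

Setting the header explicitly, as in the example above, is still reasonable when you want the intent to be visible in the code.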
When a server expects traffic from real users, it may check for certain browser-specific headers that are typically sent only by actual web browsers. These headers help identify and differentiate browsers from automated scripts, which is particularly important when navigating anti-bot protections on certain sites. By configuring Python requests headers to mimic these browser-specific patterns, you can make your requests appear more human-like, often increasing the chances of successful requests.
DNT (Do Not Track): Informs the server of the user’s tracking preference (1 means "do not track"), making the request more browser-like.
Sec-Fetch-Site: Shows the origin relationship, with values like same-origin, cross-site, and none, helping mimic genuine navigation context.
Sec-Fetch-Mode: Defines request purpose, such as navigate for page loads, making it useful for replicating typical browser behavior.
Sec-Fetch-Dest: Indicates content type (document, image, script), useful for mimicking specific resource requests.
Example of Browser-Specific Headers in Python Requests:
Set browser-specific headers when making requests using the requests library in Python.
    import requests

    # Values taken from the header descriptions above
    headers = {
        'User-Agent': 'my-app/0.0.1',
        'DNT': '1',
        'Sec-Fetch-Site': 'none',
        'Sec-Fetch-Mode': 'navigate',
        'Sec-Fetch-Dest': 'document',
    }
    response = requests.get('https://httpbin.dev/headers', headers=headers)
    print(response.json())
By including these headers, you can make your request appear closer to those typically sent by browsers, reducing the likelihood of being flagged as a bot or encountering access restrictions.
Anti-Bot Detection: Browser-specific headers help requests resemble regular user traffic, making it harder for anti-bot systems to flag them.
Enhanced Compatibility: Some sites offer different responses for browser-like requests, making these headers useful for sites that restrict non-browser traffic.
Request Authenticity: Mimicking browser behavior with these headers can increase request success rates by reducing the chance of blocks.
When working with Python requests headers, it’s essential to use valid, correctly formatted headers. Many servers actively monitor incoming headers to detect unusual or incomplete requests. Requests with invalid or missing headers—such as a missing User-Agent, improperly set Content-Type, or contradictory headers—are common signals of automated or suspicious traffic and can lead to immediate blocking.
For example, headers that contradict each other, like mixing Accept: text/html with Content-Type: application/json, may cause the server to reject your request, as this combination doesn’t align with typical browser behavior.
Additionally, some websites use AI-powered anti-bot tools to scrutinize headers and pinpoint bot-like inconsistencies. Testing headers for potential issues is best done on a controlled platform.
Practical tips for setting headers, like using a realistic User-Agent, matching Content-Type to the payload, and avoiding excessive headers, help reduce detection and minimize request blocking.
Taking these precautions when setting headers can significantly improve the success rate of your requests and help you bypass potential blocks effectively.
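Some of these checks can be automated before a request is ever sent. The following is a rough sketch of such a pre-flight check. The find_header_issues helper and its rules are hypothetical, cover only a few of the problems mentioned above, and do not describe how any particular server validates headers.

```python
from requests.structures import CaseInsensitiveDict


def find_header_issues(headers, has_body=False):
    """Return a list of human-readable problems found in a header mapping."""
    headers = CaseInsensitiveDict(headers)
    issues = []

    if not headers.get("User-Agent"):
        issues.append("missing User-Agent")

    # A body without a declared format leaves the server guessing
    if has_body and "Content-Type" not in headers:
        issues.append("request has a body but no Content-Type")

    # Declaring a body format on a request without a body looks unusual
    if not has_body and "Content-Type" in headers:
        issues.append("Content-Type set on a request without a body")

    return issues


issues = find_header_issues({"content-type": "application/json"})
print(issues)
```

A header set that passes, such as one containing only a User-Agent for a GET request, returns an empty list.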
While requests is a powerful HTTP client library, it's not a great tool for scraping at scale, as it's hard to scale and easy to identify and block.
ScrapFly provides web scraping, screenshot, and extraction APIs for data collection at scale.
To wrap up this guide, here are answers to some frequently asked questions about python requests headers.
Headers convey additional information with each request, such as the type of data expected, client information, and authorization details. They’re essential for communicating preferences and ensuring that servers handle requests correctly.
Headers can help bypass anti-bot detection, authenticate requests, and ensure the correct data format in responses. Customizing headers to resemble real browser requests is especially helpful for scraping and accessing restricted APIs.
Using browser developer tools, you can inspect the headers sent with each request to a website. Copying these headers into your Python requests can help your request mimic browser traffic.
Working with Python requests headers is essential for both web scraping and API interactions. Understanding how to set, get, and manipulate headers can help you create more effective and reliable requests. Whether you're dealing with GET or POST requests, mimicking browser headers, or trying to avoid detection, the way you handle headers can make or break your scraping success.
By following best practices, such as using standard headers, setting appropriate values for POST requests, and keeping a consistent header order, your requests will be better equipped to navigate the complex landscape of modern web services.