How to Read Data Directly from a URL Using Pandas?
One common task in data analysis is loading data from a URL. Pandas, a popular Python library for data manipulation, provides a read_csv function that reads CSV data from a file path or a file-like object. However, fetching a URL yourself and passing the downloaded bytes straight to read_csv results in an error.
To demonstrate this error, let's consider the example provided in the question:
<code class="python">import pandas as pd
import requests

url = "https://github.com/cs109/2014_data/blob/master/countries.csv"
s = requests.get(url).content
c = pd.read_csv(s)</code>
This code retrieves the content at the given URL with the requests library and passes it directly to read_csv. However, requests.get(url).content is a bytes object, not a file-like object, so read_csv raises an error:
Expected file path name or file-like object, got <class 'bytes'> type
To resolve this error, we need to pass read_csv something it can read from. The content retrieved from the URL is a bytes object, which is neither a file path nor a file-like object. A text file-like object can be obtained by decoding the bytes and wrapping the resulting string in io.StringIO:
<code class="python">import io
import pandas as pd
import requests

url = "https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv"
s = requests.get(url).content
c = pd.read_csv(io.StringIO(s.decode("utf-8")))</code>
Decoding with "utf-8" turns the bytes into text, and io.StringIO exposes that text through a file-like interface. This allows read_csv to successfully load the data from the URL.
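The decode-and-wrap step can also be packaged as a small helper and tried without any network access. The following is a minimal sketch, not a definitive implementation: the function name frame_from_bytes and the sample bytes are made up for illustration, with the sample standing in for what requests.get(url).content would return.

```python
import io

import pandas as pd


def frame_from_bytes(raw: bytes, encoding: str = "utf-8") -> pd.DataFrame:
    """Parse CSV bytes (for example, a downloaded response body) into a DataFrame."""
    # Decode the bytes to str, then wrap the text in StringIO so that
    # read_csv receives a file-like object instead of raw bytes.
    text = raw.decode(encoding)
    return pd.read_csv(io.StringIO(text))


# Hypothetical stand-in for a downloaded CSV payload.
sample = b"Country,Region\nAlgeria,AFRICA\nAngola,AFRICA\n"
df = frame_from_bytes(sample)
print(df.shape)  # (2, 2)
```

Keeping the parsing separate from the download makes it easy to check the parsing logic on a fixed byte string before pointing it at a real URL.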
In pandas 0.19.2 and later, there is a simpler solution: read_csv can read directly from a URL:
<code class="python">import pandas as pd

url = "https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv"
c = pd.read_csv(url)</code>
This eliminates the need for additional operations such as retrieving the content and decoding it, making the process more straightforward.
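One detail in the examples is easy to miss: the failing snippet used a github.com/.../blob/... address, which serves an HTML page, while the working snippets use the raw.githubusercontent.com address, which serves the CSV file itself. The sketch below shows one way to rewrite the former into the latter. The helper name github_raw_url is hypothetical, and the code assumes the usual user/repo/blob/branch/path layout with a branch name that contains no slash.

```python
from urllib.parse import urlsplit, urlunsplit


def github_raw_url(url: str) -> str:
    """Rewrite a github.com '/blob/' page URL to its raw.githubusercontent.com form.

    Any URL that does not match the expected layout is returned unchanged.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected layout: <user>/<repo>/blob/<branch>/<path...>
    if parts.netloc == "github.com" and len(segments) > 4 and segments[2] == "blob":
        # Drop the "blob" segment and switch to the raw content host.
        raw_path = "/" + "/".join(segments[:2] + segments[3:])
        return urlunsplit(
            (parts.scheme, "raw.githubusercontent.com", raw_path, parts.query, parts.fragment)
        )
    return url


page = "https://github.com/cs109/2014_data/blob/master/countries.csv"
print(github_raw_url(page))
# https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv
```

The returned string can then be passed to pd.read_csv in the same way as the url variable in the example above.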