Uploading a CSV file to Django REST Framework (especially in an atomic setting) sounds like a simple task, but it kept me puzzled until I discovered a few tricks, which I will be sharing with you.
In this article, I will be using Postman (in place of a frontend) and will also walk through exactly what you need to set in Postman to send the request.
What we desire
- Upload a CSV via Django REST Framework to the DB
- Make the operation atomic: any error in any row of the CSV should cause a complete rollback of the entire operation. That spares us the stress of cutting up the CSV file, i.e. the headache of identifying which rows made it to the DB and which did not because of an error midway (a partial entry). So we want an all-or-nothing operation!
Method
- Assuming you already have Django and Django REST Framework installed, the first step is to install pandas, a Python library for data manipulation:
pip install pandas
- Next, in Postman: in the Body tab, select form-data and add a key (any arbitrary name). Hover over the right edge of that key's cell and use the dropdown to change the type from Text to File. The moment you do this, Postman automatically sets Content-Type to multipart/form-data in Headers.
For the value cell, click the 'Select Files' button and upload the CSV.
Under Headers, add a Content-Disposition key with the value form-data; name="file"; filename="your_file_name.csv", replacing your_file_name.csv with your actual file name.
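If you would rather script the request than click through Postman, here is a rough equivalent using the Python requests library; the endpoint URL and file name are my own placeholders, so adjust them to your project:

import requests

# Hypothetical endpoint; replace with wherever your view is mounted
url = 'http://localhost:8000/upload-csv/'

with open('biodata.csv', 'rb') as f:
    # files=... makes requests send multipart/form-data with a per-part
    # Content-Disposition, mirroring the Postman setup above
    response = requests.post(url, files={'file': ('biodata.csv', f, 'text/csv')})

print(response.status_code, response.json())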
- In the Django view, the code is as follows:
from rest_framework import status
from rest_framework.views import APIView
from rest_framework.parsers import FileUploadParser
from rest_framework.response import Response
from .models import BiodataModel
from django.db import transaction
import pandas as pd


class UploadCSVFile(APIView):
    parser_classes = [FileUploadParser]

    def post(self, request):
        csv_file = request.FILES.get('file')
        if not csv_file:
            return Response({"error": "No file provided"}, status=status.HTTP_400_BAD_REQUEST)

        # Validate file type
        if not csv_file.name.endswith('.csv'):
            return Response({"error": "File is not CSV type"}, status=status.HTTP_400_BAD_REQUEST)

        df = pd.read_csv(csv_file, delimiter=',', skiprows=3, dtype=str).iloc[:-1]
        df = df.where(pd.notnull(df), None)

        bulk_data = []
        for index, row in df.iterrows():
            try:
                row_instance = BiodataModel(
                    name=row.get('name'),
                    age=row.get('age'),
                    address=row.get('address'))
                row_instance.full_clean()
                bulk_data.append(row_instance)
            except Exception as e:
                return Response({"error": f'Error at row {index + 2} -> {e}'}, status=status.HTTP_400_BAD_REQUEST)

        try:
            with transaction.atomic():
                BiodataModel.objects.bulk_create(bulk_data)
        except Exception as e:
            return Response({"error": f'Bulk create error--{e}'}, status=status.HTTP_400_BAD_REQUEST)

        return Response({"msg": "CSV file processed successfully"}, status=status.HTTP_201_CREATED)
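The article's focus is the view, so the URL configuration is not shown; here is a minimal sketch of how the view might be wired up (the upload-csv/ path is an assumption, name it however you like):

from django.urls import path
from .views import UploadCSVFile

urlpatterns = [
    # Hypothetical route; Postman just needs to target this URL
    path('upload-csv/', UploadCSVFile.as_view()),
]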
Explaining the view code:
The code begins by importing the necessary packages, defining a class-based view, and setting a parser class (FileUploadParser). The first part of the post method attempts to get the file from request.FILES and checks that it is present.
A light validation step then confirms the file is a CSV by checking its extension.
The next part loads it into a pandas dataframe (very much like a spreadsheet):
df = pd.read_csv(csv_file, delimiter=',',skiprows=3,dtype=str).iloc[:-1]
I will explain some of the arguments passed to the loading function (a short, self-contained demo follows this list):
skiprows
Because the CSV here is sent over a network, some metadata-like content is added to the beginning and end of the file. These extra lines are not in comma-separated-value form, so they can raise parsing errors. That is why I used skiprows=3: it skips the first 3 rows of metadata so that pandas lands directly on the real header of the CSV. If you remove skiprows or use a smaller number, you might get an error like 'Error tokenizing data. C error', or you might find the metadata rows showing up in your data.
dtype=str
Pandas likes to prove smart by guessing the datatype of each column. I wanted all values kept as strings, so I used dtype=str.
delimiter
Specifies how the cells are separated. The default is a comma.
iloc[:-1]
I had to use iloc to slice the dataframe, removing the metadata at the end of the df.
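To make these four arguments concrete, here is a small self-contained sketch; the inline 'file', its junk lines, and its column names are all invented for the demo:

from io import StringIO
import pandas as pd

# Fake upload: 3 metadata lines, a header, two data rows, one trailing metadata line
raw = (
    "export-tool v1\n"
    "generated 2024-01-01\n"
    "rows follow\n"
    "name,age,address\n"
    "Ada,30,London\n"
    "Linus,,Helsinki\n"
    "-- end of export --\n"
)

# skiprows=3 jumps the leading metadata, dtype=str stops type guessing,
# delimiter=',' is the cell separator, and iloc[:-1] drops the trailing line
df = pd.read_csv(StringIO(raw), delimiter=',', skiprows=3, dtype=str).iloc[:-1]
print(df)
#     name  age   address
# 0    Ada   30    London
# 1  Linus  NaN  Helsinki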
Then the next line, df = df.where(pd.notnull(df), None), converts all NaN values to None. NaN is the stand-in value pandas uses to represent missing data.
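Continuing the demo above, the conversion looks like this:

# NaN (pandas' missing-value marker) becomes None, which Django stores as NULL
df = df.where(pd.notnull(df), None)
print(df.iloc[1]['age'])  # None instead of NaN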
The next block is a bit tricky. We loop over every row in the dataframe and instantiate a BiodataModel object from the row's data. We call the full_clean() method for model-level validation (not serializer-level), because bulk_create bypasses Django's validation. Then we append the unsaved instance to a list called bulk_data. Yes, append, not save yet! Remember, we want an atomic operation at batch level, all or nothing, and saving rows individually would not give us that behaviour.
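The model itself never appears in the article. For orientation, here is a minimal sketch of what BiodataModel could look like; the field names come from the view, but the types and options are my assumptions:

from django.db import models

class BiodataModel(models.Model):
    # Field names taken from the view; everything else is a guess
    name = models.CharField(max_length=255)
    age = models.CharField(max_length=10, null=True, blank=True)  # kept as text, since the CSV is read with dtype=str
    address = models.CharField(max_length=255, null=True, blank=True)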
Then comes the last significant part. Within a transaction.atomic() block (which provides the all-or-nothing behaviour), we run BiodataModel.objects.bulk_create(bulk_data) to save all the rows at once.
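If the rollback behaviour feels abstract, this tiny illustration (run in a Django shell, using the assumed model above) shows it in isolation:

from django.db import transaction

try:
    with transaction.atomic():
        BiodataModel.objects.create(name='Ada', age='30')
        raise ValueError('simulated failure')  # any exception inside the block...
except ValueError:
    pass

# ...rolls back everything written inside it: the row above was never committed
print(BiodataModel.objects.filter(name='Ada').exists())  # False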
One more thing: notice the index variable and the except block in the for loop. In the error message, I add 2 to the index derived from df.iterrows(), because the raw value did not match the row number shown when the file is opened in Excel. The except block catches any error and constructs a message carrying the exact row number as seen in Excel, so the uploader can easily locate the offending line!
Thanks for reading!!!