Effortless File Parsing in NestJS: Manage CSV and XLSX Uploads in Memory for Speed, Security, and Scalability
Handling file uploads in a web application is a common task, but dealing with different file types and ensuring they are processed correctly can be challenging. Often, developers need to parse uploaded files without saving them to the server, which is especially important for reducing server storage costs and ensuring that sensitive data is not unnecessarily retained. In this article, we’ll walk through the process of creating a custom NestJS module to handle file uploads specifically for CSV and XLS/XLSX files, and we’ll parse these files in memory using Node.js streams, so no static files are created on the server.
NestJS is a progressive Node.js framework that leverages TypeScript and provides an out-of-the-box application architecture that enables you to build highly testable, scalable, loosely coupled, and easily maintainable applications. By using NestJS, we can take advantage of its modular structure, powerful dependency injection system, and extensive ecosystem.
Before we dive into the code, let’s set up a new NestJS project. If you haven’t already, install the NestJS CLI:
npm install -g @nestjs/cli
Create a new NestJS project:
nest new your-super-name
Navigate into the project directory:
cd your-super-name
We’ll need to install some additional packages to handle file uploads and parsing:
npm install @nestjs/platform-express multer exceljsfile-type
To customize the file upload process, we’ll create a custom Multer storage engine. This engine will ensure that only CSV and XLS/XLSX files are accepted, parse them in memory using Node.js streams, and return the parsed data without saving any files to disk.
Create a new file for our engine:
import { PassThrough } from 'stream'; import * as fileType from 'file-type'; import { BadRequestException } from '@nestjs/common'; import { Request } from 'express'; import { Workbook } from 'exceljs'; import { createParserCsvOrXlsx } from './parser-factory.js'; const ALLOWED_MIME_TYPES = [ 'text/csv', 'application/vnd.ms-excel', 'text/comma-separated-values', 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet', 'application/vnd.ms-excel', ] as const; export class CsvOrXlsxMulterEngine { private destKey: string; private maxFileSize: number; constructor(opts: { destKey: string; maxFileSize: number }) { this.destKey = opts.destKey; this.maxFileSize = opts.maxFileSize; } async _handleFile(req: Request, file: any, cb: any) { try { const contentLength = Number(req.headers['content-length']); if ( typeof contentLength === 'number' && contentLength > this.maxFileSize ) { throw new Error(`Max file size is ${this.maxFileSize} bytes.`); } const fileStream = await fileType.fileTypeStream(file.stream); const mime = fileStream.fileType?.mime ?? file.mimetype; if (!ALLOWED_MIME_TYPES.includes(mime)) { throw new BadRequestException('File must be *.csv or *.xlsx'); } const replacementStream = new PassThrough(); fileStream.pipe(replacementStream); const parser = createParserCsvOrXlsx(mime); const data = await parser.read(replacementStream); cb(null, { [this.destKey]: mime === 'text/csv' ? data : (data as Workbook).getWorksheet(), }); } catch (error) { cb(error); } } _removeFile(req: Request, file: any, cb: any) { cb(null); } }
This custom storage engine checks the file’s MIME type and ensures it’s either a CSV or XLS/XLSX file. It then processes the file entirely in memory using Node.js streams, so no temporary files are created on the server. This approach is both efficient and secure, especially when dealing with sensitive data.
The parser factory is responsible for determining the appropriate parser based on the file type.
Create a new file for our parser:
import excel from 'exceljs'; export function createParserCsvOrXlsx(mime: string) { const workbook = new excel.Workbook(); return [ 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet', 'application/vnd.ms-excel', ].includes(mime) ? workbook.xlsx : workbook.csv; }
This factory function checks the MIME type and returns the appropriate parser (either xlsx or csv).
Next, let’s create a controller to handle file uploads using our custom storage engine.
Generate a new controller:
nest g controller files
In the files.controller.ts, configure the file upload using Multer and the custom storage engine:
import { Controller, Post, UploadedFile, UseInterceptors, } from '@nestjs/common'; import { FileInterceptor } from '@nestjs/platform-express'; import { Worksheet } from 'exceljs'; import { CsvOrXlsxMulterEngine } from '../../shared/multer-engines/csv-xlsx/engine.js'; import { FilesService } from './files.service.js'; const MAX_FILE_SIZE_IN_MiB = 1000000000; // Only for test @Controller('files') export class FilesController { constructor(private readonly filesService: FilesService) {} @UseInterceptors( FileInterceptor('file', { storage: new CsvOrXlsxMulterEngine({ maxFileSize: MAX_FILE_SIZE_IN_MiB, destKey: 'worksheet', }), }), ) @Post() create(@UploadedFile() data: { worksheet: Worksheet }) { return this.filesService.format(data.worksheet); } }
This controller sets up an endpoint to handle file uploads. The uploaded file is processed by the CsvOrXlsxMulterEngine, and the parsed data is returned in the response without ever being saved to disk.
Finally, we need to set up a module to include our controller.
Generate a new module:
nest g module files
In the files.module.ts, import the controller:
import { Module } from '@nestjs/common'; import { FilesController } from './files.controller.js'; import { FilesService } from './files.service.js'; @Module({ providers: [FilesService], controllers: [FilesController], }) export class FilesModule {}
Make sure to import this module into your AppModule:
To test the file upload functionality, we can create a simple HTML page that allows users to upload CSV or XLS/XLSX files. This page will send the file to our /api/files endpoint, where it will be parsed and processed in memory.
Here’s the basic HTML file for testing the file upload:
<!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <title>File Upload</title> </head> <body> <h1>Upload a File (CSV or XLSX)</h1> <form action="/api/files" method="post" enctype="multipart/form-data"> <label for="file">Choose file:</label> <input type="file" id="file" name="file" accept=".csv, .xlsx" required> <br><br> <button type="submit">Upload</button> </form> </body> </html>
To render the HTML page for file uploads, we first need to install an additional NestJS module called @nestjs/serve-static. You can do this by running the following command:
npm install @nestjs/serve-static
After installing, we need to configure this module in AppModule:
import { Module } from '@nestjs/common'; import { join } from 'path'; import { ServeStaticModule } from '@nestjs/serve-static'; import { FilesModule } from './modules/files/files.module.js'; @Module({ imports: [ FilesModule, ServeStaticModule.forRoot({ rootPath: join(new URL('..', import.meta.url).pathname, 'public'), serveRoot: '/', }), ], }) export class AppModule {}
This setup will allow us to serve static files from the public directory. Now, we can open the file upload page by navigating to http://localhost:3000 in your browser.
Upload Your File
要上傳文件,請依照下列步驟操作:
文件上傳成功後,您應該會看到文件已上傳並格式化的確認訊息。
注意:我沒有包含用於格式化上傳檔案的程式碼,因為這取決於您選擇用於處理 CSV 或 XLS/XLSX 檔案的程式庫。您可以在 GitHub 上查看完整的實作。
比較記憶體檔案處理的優缺點
在決定是使用記憶體中文件處理還是將文件保存到磁碟時,了解權衡非常重要。
磁碟上沒有臨時檔案:
更快的處理:
簡化清理:
記憶體使用量:
檔案大小限制:
錯誤處理的複雜性:
中小型文件:如果您的應用程式處理相對較小的文件,記憶體中處理可以提供速度和簡單性。
安全敏感應用程式:處理不應儲存在磁碟上的敏感資料時,記憶體中處理可以降低資料外洩的風險。
高效能場景:需要高吞吐量和最小延遲的應用程式可能會受益於記憶體處理開銷的減少。
大文件:如果您的應用程式需要處理非常大的文件,可能需要基於磁碟的處理以避免記憶體不足。
資源受限環境:在伺服器記憶體有限的情況下,處理磁碟上的檔案可以防止記憶體耗盡並實現更好的資源管理。
持久儲存需求:如果您需要保留上傳檔案的副本以供審核、備份或以後檢索,則需要將檔案儲存到磁碟。
與外部儲存服務整合:對於大文件,請考慮將其上傳到 AWS S3、Google Cloud 等外部儲存服務
可擴充性:雲端儲存解決方案可以處理大量檔案並提供冗餘,確保您的資料安全且可以從多個地理位置輕鬆存取。
成本效率:使用雲端儲存處理大型檔案可以更具成本效益,因為它減少了對本地伺服器資源的需求,並提供按需付費的定價。
在本文中,我們在NestJS 中創建了一個自訂文件上傳模組,用於處理CSV 和XLS/XLSX 文件,在記憶體中解析它們,並返回解析後的數據,而不將任何文件保存到磁碟.這種方法利用了 Node.js 流的強大功能,使其既高效又安全,因為伺服器上不會留下任何臨時檔案。
我們也探討了記憶體中檔案處理與將檔案儲存到磁碟的優缺點。雖然記憶體處理提供速度、安全性和簡單性,但在採用此方法之前考慮記憶體使用和潛在的檔案大小限制非常重要。
無論您是建立企業應用程式還是小型項目,正確處理文件上傳和解析都至關重要。透過此設置,您就可以很好地掌握 NestJS 中的文件上傳,而無需擔心不必要的伺服器儲存或資料安全問題。
請隨時在下面的評論部分分享您的想法和改進!
如果您喜歡本文或發現這些工具很有用,請務必在 Dev.to 上關注我,以獲取有關編碼和開發的更多見解和技巧。我會定期分享有用的內容,讓您的程式設計之旅更加順利。
在 X (Twitter) 上關注我,我在這裡分享更多關於程式設計和技術的有趣想法、更新和討論!不要錯過 - 點擊這些關注按鈕。
您也可以在 LinkedIn 上關注我,以獲取專業見解、最新專案的更新以及有關編碼、技術趨勢等的討論。不要錯過可以幫助您提高開發技能的有價值的內容 - 讓我們聯繫!
以上是簡化 NestJS 中的檔案上傳:無需磁碟儲存即可高效能記憶體解析 CSV 和 XLSX的詳細內容。更多資訊請關注PHP中文網其他相關文章!