Uploading large files to the cloud can be challenging: network interruptions, browser limitations, and sheer file size can easily derail the process. Amazon S3 (Simple Storage Service) is a scalable, high-speed, web-based cloud storage service designed for online backup and archiving of data and applications. Uploading large files to S3, however, requires careful handling to ensure reliability and performance.
AWS S3 multipart upload is a powerful solution to this problem: it splits a large file into smaller chunks, enabling faster and more reliable uploads by handling each part independently and even uploading parts in parallel. This approach not only overcomes file-size limits (S3 requires multipart upload for files larger than 5 GB), it also minimizes the risk of failure, making it ideal for applications that need seamless, robust file uploads.
In this guide, we'll walk through the details of client-side multipart uploads to S3: why they are the smart choice for large files, how to get them running securely, and which pitfalls to watch out for. I'll provide step-by-step instructions, code examples, and best practices to help you implement a reliable client-side file upload solution.
Ready to level up your file upload experience? Let's dive in!
When designing a file upload system, you have two main options: upload files through your server (server-side) or upload them directly from the client to S3 (client-side). Each approach has its pros and cons.
Advantages of server-side uploads:
Enhanced security: all uploads are managed by the server, keeping AWS credentials safe.
Better error handling: the server can manage retries, logging, and error handling more robustly.
Centralized processing: files can be validated, processed, or transformed on the server before being stored in S3.
Drawbacks of server-side uploads:
Higher server load: large uploads consume server resources (CPU, memory, bandwidth), which can hurt performance and raise operating costs.
Potential bottlenecks: during heavy upload traffic, the server can become a single point of failure or a performance bottleneck, causing slow uploads or downtime.
Increased cost: handling uploads server-side may require scaling your infrastructure to handle peak load, adding to operating expenses.
Advantages of client-side uploads:
Reduced server load: files go straight from the user's device to S3, freeing up server resources.
Faster uploads: users experience faster uploads because the application server is bypassed.
Cost efficiency: no server infrastructure is needed to handle large uploads, potentially lowering costs.
Scalability: well suited to scaling file uploads without straining backend servers.
Drawbacks of client-side uploads:
Security risks: AWS credentials and permissions must be handled carefully, and presigned URLs must be generated securely to prevent unauthorized access.
Limited control: the server has less oversight of uploads; error handling and retries are typically managed on the client.
Browser limitations: browsers have memory and API limits that can hinder handling very large files or hurt performance on low-end devices.
Implementing client-side uploads securely involves coordination between your frontend application and a secure backend service. The backend's main role is to generate presigned URLs that let the client upload files directly to S3 without exposing sensitive AWS credentials.
To implement client-side uploads effectively, you need a backend service that generates presigned URLs and manages multipart uploads, API endpoints that expose those operations, and frontend logic that splits files and performs the actual transfers.
This architecture keeps sensitive operations safely on the backend while the frontend manages the upload process.
Presigned URLs let the client interact directly with S3, performing operations such as uploading files, without AWS credentials ever being present on the client. They are secure because they are time-limited, scoped to a specific object and operation, and signed on the server with credentials that never leave it.
Create a service class on the server that is responsible for:
a. Defining the S3 bucket and region.
b. Establishing AWS credentials securely.
c. Providing methods to generate presigned URLs and manage multipart uploads.
```javascript
// services/S3UploadService.js
import {
  S3Client,
  CreateMultipartUploadCommand,
  CompleteMultipartUploadCommand,
  UploadPartCommand,
  AbortMultipartUploadCommand,
  PutObjectCommand,
  GetObjectCommand,
  DeleteObjectCommand,
} from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';
// Import credential providers
import { fromEnv, fromInstanceMetadata } from '@aws-sdk/credential-providers';

export class S3UploadService {
  constructor() {
    this.s3BucketName = process.env.S3_BUCKET_NAME;
    this.s3Region = process.env.S3_REGION;
    this.s3Client = new S3Client({
      region: this.s3Region,
      credentials: this.getS3ClientCredentials(),
    });
  }

  // Method to resolve AWS credentials securely
  getS3ClientCredentials() {
    if (process.env.NODE_ENV === 'development') {
      // In development, use credentials from environment variables
      return fromEnv();
    }
    // In production, use credentials from EC2 instance metadata or another secure method
    return fromInstanceMetadata();
  }

  // Generate a presigned URL for single-part upload (PUT), download (GET), or deletion (DELETE)
  async generatePresignedUrl(key, operation) {
    let command;
    switch (operation) {
      case 'PUT':
        command = new PutObjectCommand({ Bucket: this.s3BucketName, Key: key });
        break;
      case 'GET':
        command = new GetObjectCommand({ Bucket: this.s3BucketName, Key: key });
        break;
      case 'DELETE':
        command = new DeleteObjectCommand({ Bucket: this.s3BucketName, Key: key });
        break;
      default:
        throw new Error(`Invalid operation "${operation}"`);
    }
    // Generate presigned URL, expiring in 1 hour
    return await getSignedUrl(this.s3Client, command, { expiresIn: 3600 });
  }

  // Methods for multipart upload
  async createMultipartUpload(key) {
    const command = new CreateMultipartUploadCommand({
      Bucket: this.s3BucketName,
      Key: key,
    });
    const response = await this.s3Client.send(command);
    return response.UploadId;
  }

  async generateUploadPartUrl(key, uploadId, partNumber) {
    const command = new UploadPartCommand({
      Bucket: this.s3BucketName,
      Key: key,
      UploadId: uploadId,
      PartNumber: partNumber,
    });
    return await getSignedUrl(this.s3Client, command, { expiresIn: 3600 });
  }

  async completeMultipartUpload(key, uploadId, parts) {
    const command = new CompleteMultipartUploadCommand({
      Bucket: this.s3BucketName,
      Key: key,
      UploadId: uploadId,
      MultipartUpload: { Parts: parts },
    });
    return await this.s3Client.send(command);
  }

  async abortMultipartUpload(key, uploadId) {
    const command = new AbortMultipartUploadCommand({
      Bucket: this.s3BucketName,
      Key: key,
      UploadId: uploadId,
    });
    return await this.s3Client.send(command);
  }
}
```
Note: make sure your AWS credentials are managed securely. In production, prefer IAM roles attached to EC2 instances or ECS tasks over hard-coded credentials or environment variables.
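For reference, a minimal policy for that role might look like the following sketch. The bucket name is a placeholder, and whether you grant the GET and DELETE actions depends on which presigned-URL operations your application actually issues:

```javascript
// Illustrative IAM policy for the backend's role (bucket name is a placeholder).
// s3:PutObject also covers CreateMultipartUpload, UploadPart, and CompleteMultipartUpload.
const uploadPolicy = {
  Version: '2012-10-17',
  Statement: [
    {
      Effect: 'Allow',
      Action: [
        's3:PutObject',            // single PUT and multipart part uploads
        's3:AbortMultipartUpload', // clean up failed multipart uploads
        's3:GetObject',            // only if you issue GET presigned URLs
        's3:DeleteObject',         // only if you issue DELETE presigned URLs
      ],
      Resource: 'arn:aws:s3:::YOUR_BUCKET_NAME/*',
    },
  ],
};

console.log(JSON.stringify(uploadPolicy, null, 2));
```

Following the principle of least privilege, remove any action your endpoints do not expose.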
Create API endpoints on the backend to handle requests from the frontend. These endpoints use the S3UploadService to perform the operations.
```javascript
// controllers/S3UploadController.js
import { S3UploadService } from '../services/S3UploadService';

const s3UploadService = new S3UploadService();

export const generatePresignedUrl = async (req, res, next) => {
  try {
    const { key, operation } = req.body; // key is the S3 object key (file identifier)
    const url = await s3UploadService.generatePresignedUrl(key, operation);
    res.status(200).json({ url });
  } catch (error) {
    next(error);
  }
};

export const initializeMultipartUpload = async (req, res, next) => {
  try {
    const { key } = req.body;
    const uploadId = await s3UploadService.createMultipartUpload(key);
    res.status(200).json({ uploadId });
  } catch (error) {
    next(error);
  }
};

export const generateUploadPartUrls = async (req, res, next) => {
  try {
    const { key, uploadId, parts } = req.body; // parts is the number of parts
    const urls = await Promise.all(
      [...Array(parts).keys()].map(async (index) => {
        const partNumber = index + 1;
        const url = await s3UploadService.generateUploadPartUrl(key, uploadId, partNumber);
        return { partNumber, url };
      })
    );
    res.status(200).json({ urls });
  } catch (error) {
    next(error);
  }
};

export const completeMultipartUpload = async (req, res, next) => {
  try {
    const { key, uploadId, parts } = req.body; // parts is an array of { ETag, PartNumber }
    const result = await s3UploadService.completeMultipartUpload(key, uploadId, parts);
    res.status(200).json({ result });
  } catch (error) {
    next(error);
  }
};

export const abortMultipartUpload = async (req, res, next) => {
  try {
    const { key, uploadId } = req.body;
    await s3UploadService.abortMultipartUpload(key, uploadId);
    res.status(200).json({ message: 'Upload aborted' });
  } catch (error) {
    next(error);
  }
};
```
Set up routes for these endpoints in your Express application, or in whichever framework you use.
The frontend handles selecting files, deciding between single-part and multipart upload based on file size, and managing the upload process.
As a general rule, AWS recommends that "when your object size reaches 100 MB, you should consider using multipart uploads instead of uploading the object in a single operation." (Source)
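Beyond that threshold, the part size you pick must also respect S3's multipart constraints: parts must be at least 5 MiB (except the last one), and an upload can have at most 10,000 parts. A small helper, sketched here with an arbitrary 10 MiB preferred part size, keeps both limits satisfied:

```javascript
// S3 multipart constraints: parts >= 5 MiB (except the last), at most 10,000 parts.
const MIN_PART_SIZE = 5 * 1024 * 1024; // 5 MiB
const MAX_PARTS = 10000;

function choosePartSize(fileSize, preferredPartSize = 10 * 1024 * 1024) {
  // Grow the part size when the preferred size would exceed 10,000 parts.
  const minSizeForPartCount = Math.ceil(fileSize / MAX_PARTS);
  return Math.max(MIN_PART_SIZE, preferredPartSize, minSizeForPartCount);
}

function partCount(fileSize, partSize) {
  return Math.max(1, Math.ceil(fileSize / partSize));
}
```

For small files the preferred size wins; for very large files the part size grows automatically so the part count stays within the limit.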
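A minimal sketch of that frontend flow is shown below. The `/api/...` endpoint paths are illustrative and must match your backend routes, and `sliceFile` is a small helper introduced here:

```javascript
// Split a File/Blob into fixed-size parts with Blob.slice (no data is read into memory here).
function sliceFile(file, partSize) {
  const parts = [];
  for (let start = 0; start < file.size; start += partSize) {
    parts.push(file.slice(start, Math.min(start + partSize, file.size)));
  }
  return parts;
}

// Orchestrate a multipart upload against backend endpoints like those sketched earlier.
async function multipartUpload(file, key, partSize = 10 * 1024 * 1024) {
  const post = async (path, body) => {
    const res = await fetch(path, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    });
    if (!res.ok) throw new Error(`Request to ${path} failed: ${res.status}`);
    return res.json();
  };

  const blobs = sliceFile(file, partSize);
  const { uploadId } = await post('/api/uploads/multipart/initialize', { key });
  try {
    const { urls } = await post('/api/uploads/multipart/part-urls', {
      key,
      uploadId,
      parts: blobs.length,
    });
    // Upload parts (sequentially here; they can also run in parallel) and collect ETags.
    const parts = [];
    for (const { partNumber, url } of urls) {
      const res = await fetch(url, { method: 'PUT', body: blobs[partNumber - 1] });
      if (!res.ok) throw new Error(`Part ${partNumber} failed: ${res.status}`);
      parts.push({ PartNumber: partNumber, ETag: res.headers.get('ETag') });
    }
    return await post('/api/uploads/multipart/complete', { key, uploadId, parts });
  } catch (err) {
    // Abort so orphaned parts don't accumulate in the bucket.
    await post('/api/uploads/multipart/abort', { key, uploadId });
    throw err;
  }
}
```

Note that for the browser to read the `ETag` response header from S3, the bucket's CORS configuration must expose it (`ExposeHeaders: ["ETag"]`).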
While AWS S3 supports objects up to 5 TiB (tebibytes) in size, uploading files that large directly from a browser is impractical, and often impossible, because of browser limitations and client-side resource constraints. Browsers can crash or become unresponsive when handling very large files, especially if those files need to be processed in memory.
Uploading large files increases the risk of network interruptions or failures mid-transfer. Implementing a robust retry strategy is essential for a good user experience and for ensuring uploads succeed.
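One common strategy, sketched here, is to wrap each part upload in a retry helper that uses exponential backoff with jitter; the retry count and base delay are arbitrary defaults:

```javascript
// Retry an async operation with exponential backoff and full jitter.
async function withRetries(operation, { retries = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      if (attempt === retries) break;
      // Full jitter: wait a random delay in [0, base * 2^attempt).
      const delay = Math.random() * baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}

// Example: wrap a single part upload (uploadPart is a placeholder for your upload call).
// await withRetries(() => uploadPart(url, blob), { retries: 4, baseDelayMs: 1000 });
```

Because a multipart upload fails or succeeds per part, retrying only the failed part avoids re-sending the whole file.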
Incomplete multipart uploads can accumulate in your S3 bucket, consuming storage and potentially incurring charges.
You can configure an S3 lifecycle rule to automatically abort incomplete multipart uploads after a set number of days, reclaiming the storage used by orphaned parts.
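As a sketch, such a lifecycle configuration might look like this (the rule ID and the 7-day window are arbitrary choices; the configuration can be applied with `PutBucketLifecycleConfigurationCommand` from `@aws-sdk/client-s3`, the AWS CLI, or the console):

```javascript
// Lifecycle configuration that aborts multipart uploads left incomplete for 7 days.
const lifecycleConfiguration = {
  Rules: [
    {
      ID: 'abort-incomplete-multipart-uploads', // arbitrary rule name
      Status: 'Enabled',
      Filter: { Prefix: '' }, // apply to the whole bucket
      AbortIncompleteMultipartUpload: {
        DaysAfterInitiation: 7, // S3 aborts the upload and deletes its parts after 7 days
      },
    },
  ],
};

console.log(JSON.stringify(lifecycleConfiguration, null, 2));
```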
Uploading large files can be resource-intensive and can block the browser's main thread, making the page unresponsive and degrading the user experience.
Browser compatibility is a genuine concern when implementing client-side multipart uploads. Browsers differ in their support for the APIs and features needed to handle large file uploads, such as the File API, Blob slicing, Web Workers, and network request handling. Handling these differences is essential for a consistent, reliable experience across all supported browsers.
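A simple capability check at startup, sketched below, lets you fall back gracefully (for example, to single-part uploads) when a feature is missing:

```javascript
// Detect the browser features the upload path relies on.
// In non-browser environments some of these checks will simply be false.
function detectUploadCapabilities() {
  return {
    fileApi: typeof Blob !== 'undefined' && typeof Blob.prototype.slice === 'function',
    fetch: typeof fetch === 'function',
    webWorkers: typeof Worker !== 'undefined',
  };
}

// Example: only take the multipart path when slicing and fetch are available.
const caps = detectUploadCapabilities();
const canMultipart = caps.fileApi && caps.fetch;
```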
By implementing client-side uploads with presigned URLs and multipart upload, you can efficiently upload files of any size directly to S3, reducing server load and improving performance. Remember to keep security front and center by managing AWS credentials safely and limiting the permissions and lifetime of your presigned URLs.
This guide has walked through a step-by-step approach to building a secure, scalable file upload system with AWS S3, the AWS SDK for JavaScript, and presigned URLs. With the code examples and best practices provided, you are well equipped to enhance your application's file upload capabilities.