The author has recently been doing some file reading, writing, and multipart-upload work in Node.js. In the process, I found that if a file read by Node exceeds 2G, which is beyond the maximum size a Blob can hold, a read exception is thrown. In addition, reading and writing files in Node is limited by the server's RAM, so large files need to be read in slices. Here I record the problems encountered and how they were solved.
- File reading and writing in node
- Node file reading and writing RAM and Blob size restrictions
- Others
1. File reading and writing in node
1.1 Regular file reading and writing
Normally, if we want to read a relatively small file, we can read it directly with:
const fs = require('fs')
let data = fs.readFileSync("./test.png")
console.log(data) // outputs data = <Buffer ...>
Generally speaking, the synchronous method is not recommended, because JS/Node.js is single-threaded and synchronous methods block the main thread. Recent versions of Node provide fs.promises, which can be used directly with async/await:
const fs = require('fs')
const readFileAsync = async () => {
  let data = await fs.promises.readFile("./test.png")
  console.log(data) // outputs data = <Buffer ...>
}
readFileAsync()
The asynchronous call here does not block the main thread, and the I/O of multiple file reads can be performed in parallel.
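For example, here is a minimal sketch of reading several files in parallel with Promise.all (the two file paths are hypothetical):

const fs = require('fs')
const readFilesParallel = async () => {
  // both reads start immediately and are awaited together
  const [a, b] = await Promise.all([
    fs.promises.readFile('./test1.png'),
    fs.promises.readFile('./test2.png'),
  ])
  console.log(a.length, b.length)
}
readFilesParallel()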
1.2 Stream file reading and writing
With conventional file reading, we read the entire file into memory at once, which is inefficient in both time and memory. Low time efficiency means the whole file must be read before any processing can begin; low memory efficiency means the whole file is held in memory at once, which consumes a lot of RAM. In such cases, we generally use a Stream to read the file:
const fs = require('fs')
const readFileTest = () => {
  var data = ''
  var rs = fs.createReadStream('./test.png');
  rs.on('data', function(chunk) {
    data += chunk; // each chunk is a <Buffer ...>; data accumulates the contents
    console.log(chunk)
  });
  rs.on('end', function() {
    console.log(data);
  });
  rs.on('error', function(err) {
    console.log(err.stack);
  });
}
readFileTest()
Reading and writing files through a Stream improves both memory efficiency and time efficiency:
- Memory efficiency: there is no need to load the entire file (or a large amount of data) into memory before processing it.
- Time efficiency: processing can start as soon as the first chunk arrives, which greatly reduces the time before data processing begins, instead of waiting until the entire file is loaded.
Streams also support a second reading style:
const fs = require('fs')
const readFileTest = () => {
  var data = ''
  var chunk;
  var rs = fs.createReadStream('./test.png');
  rs.on('readable', function() {
    while ((chunk = rs.read()) != null) {
      data += chunk;
    }
  });
  rs.on('end', function() {
    console.log(data)
  });
};
readFileTest()
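The examples above only read; writing works the same way through fs.createWriteStream. A minimal sketch (the destination path './copy.png' is hypothetical) that copies a file by piping a read stream into a write stream:

const fs = require('fs')
const copyFileTest = () => {
  const rs = fs.createReadStream('./test.png')
  const ws = fs.createWriteStream('./copy.png')
  rs.pipe(ws) // pipe handles backpressure between the two streams
  ws.on('finish', function() {
    console.log('copy finished')
  })
}
copyFileTest()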
2. Node file read and write RAM and Blob size restrictions
2.1 The basic problem
When reading a large file, there is a limit on the size of the file that can be read. For example, suppose we read a 2.5G video file:
const fs = require('fs')
const readFileTest = async () => {
  let data = await fs.promises.readFile("./video.mp4")
  console.log(data)
}
readFileTest()
Executing the above code will report an error:
RangeError [ERR_FS_FILE_TOO_LARGE]: File size (2246121911) is greater than 2 GB
We might expect that setting NODE_OPTIONS='--max-old-space-size=5000' would help, since 5000M > 2.5G, but the error still does not disappear: the size limit on files read by Node cannot be changed through this option.
The above is the conventional way to read a large file. Is there a file size limit when reading through a Stream? For example:
const fs = require('fs')
const readFileTest = () => {
  var data = ''
  var rs = fs.createReadStream('./video.mp4');
  rs.on('data', function(chunk) {
    data += chunk;
  });
  rs.on('end', function() {
    console.log(data);
  });
  rs.on('error', function(err) {
    console.log(err.stack);
  });
}
readFileTest()
Reading a 2.5G file this way does not raise the exception above, but note that an error still occurs:
data += chunk;
     ^
RangeError: Invalid string length
This is because the length of the accumulated data exceeds the maximum allowed (for example, around 2048M). Therefore, when processing with a Stream and saving the result, pay attention to the file size: the result must not exceed the default maximum of a Buffer/string. In this case, we do not need data += chunk to gather everything into one large variable; we can process the data as it is read.
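For example, a minimal sketch of read-and-process-at-the-same-time: instead of accumulating chunks, feed each one into an incremental consumer (the hash here is just an illustration, not part of the original example):

const fs = require('fs')
const crypto = require('crypto')
const hashFileTest = () => {
  const hash = crypto.createHash('md5')
  const rs = fs.createReadStream('./video.mp4')
  rs.on('data', function(chunk) {
    hash.update(chunk) // consume each chunk; nothing accumulates in memory
  })
  rs.on('end', function() {
    console.log(hash.digest('hex'))
  })
}
hashFileTest()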
2.2 Segmented reading
During file reading, createReadStream can in fact read in segments, and this segmented reading can be used as an alternative way to read large files. Especially when reading concurrently, it has certain advantages and can improve the speed of reading and processing files.
createReadStream accepts a second parameter, {start, end}. We can get the size of the file through fs.promises.stat, then split it into fragments, and finally read each fragment once, for example:
- Get the file size
const info = await fs.promises.stat(filepath)
const size = info.size
- Fragment according to the specified SIZE (such as 128M per fragment)
const SIZE = 128 * 1024 * 1024
let sizeLen = Math.floor(size / SIZE)
let total = sizeLen + 1;
for (let i = 0; i < total; i++) {
  // range of the i-th fragment (both ends inclusive)
  const start = i * SIZE
  const end = i === sizeLen ? size - 1 : (i + 1) * SIZE - 1
  // ...read the fragment [start, end]
}
- Implement the read function
const readStremfunc = () => {
  const readStream = fs.createReadStream(filepath, { start: start, end: end })
  readStream.setEncoding('binary')
  let data = ''
  readStream.on('data', chunk => {
    data = data + chunk
  })
  readStream.on('end', () => {
    // ...
  })
}
It is worth noting that in fs.createReadStream(filepath, { start, end }), start and end are both inclusive. For example, fs.createReadStream(filepath, { start: 0, end: 1023 }) reads [0, 1023], a total of 1024 bytes.
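Putting the steps together, here is a minimal sketch (readInFragments and its concurrent assembly are my own illustration, not the original code; it uses Math.ceil so that an exact multiple of SIZE does not produce an empty trailing fragment):

const fs = require('fs')

const readInFragments = async (filepath) => {
  const { size } = await fs.promises.stat(filepath)
  const SIZE = 128 * 1024 * 1024
  const total = Math.ceil(size / SIZE)
  const tasks = []
  for (let i = 0; i < total; i++) {
    const start = i * SIZE
    const end = Math.min((i + 1) * SIZE - 1, size - 1) // both inclusive
    tasks.push(new Promise((resolve, reject) => {
      const chunks = []
      const rs = fs.createReadStream(filepath, { start, end })
      rs.on('data', chunk => chunks.push(chunk))
      rs.on('end', () => resolve(Buffer.concat(chunks)))
      rs.on('error', reject)
    }))
  }
  // fragments resolve in order, so the results can be concatenated directly
  return Promise.all(tasks)
}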
3. Others
3.1 Extension: reading and writing large files on the browser side
We have been reading large files in Node.js; are there any problems reading large files on the browser side?
When the browser reads a large local file, there used to be solutions such as FileSaver and StreamSaver. Since then, the File specification has been added to browsers, so the browser itself now supports and optimizes stream reading by default and no extra work is needed; the related work is at github.com/whatwg/fs. Different browser versions still have compatibility issues, however, so we can fall back to FileSaver and similar libraries for compatibility.
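As an illustration, a minimal browser-side sketch (assuming a browser that supports the File System Access API's showOpenFilePicker) of reading a local file as a stream, chunk by chunk:

const readLocalFileStream = async () => {
  // let the user pick a local file
  const [handle] = await window.showOpenFilePicker()
  const file = await handle.getFile()
  const reader = file.stream().getReader()
  while (true) {
    const { done, value } = await reader.read()
    if (done) break
    console.log(value.byteLength) // process each chunk here
  }
}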
3.2 Requesting large static resource files
If we fetch a large static resource file in the browser, we generally only need to request it in ranges. Typical CDN-accelerated domains, whether on Alibaba Cloud or Tencent Cloud, support partial (range) requests very well, so we can put the resource behind a CDN and then request the CDN-accelerated resource directly from the browser.
The steps for fetching a large static CDN resource in fragments are as follows. First, get the file size through a HEAD request:
const getHeaderInfo = async (url: string) => {
  const res: any = await axios.head(url + `?${Math.random()}`);
  return res?.headers;
};
const header = await getHeaderInfo(source_url)
const size = header['content-length']
We can get the file size from the content-length property of the headers, then split it into fragments and segments, and finally issue range requests:
const getRangeInfo = async (url: string, start: number, end: number) => {
  const data = await axios({
    method: 'get',
    url,
    headers: {
      range: `bytes=${start}-${end}`,
    },
    responseType: 'blob',
  });
  return data?.data;
};
By specifying range: bytes=${start}-${end} in the headers, we can issue partial requests to fetch the resource in segments; start and end here are also both inclusive.
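Combining the two helpers above, a minimal sketch (downloadInRanges and the 10M CHUNK size are my own illustration) that downloads a large CDN resource in segments and merges the parts into a single Blob:

const CHUNK = 10 * 1024 * 1024
const downloadInRanges = async (url) => {
  const headers = await getHeaderInfo(url)
  const size = Number(headers['content-length'])
  const tasks = []
  for (let start = 0; start < size; start += CHUNK) {
    const end = Math.min(start + CHUNK - 1, size - 1) // both inclusive
    tasks.push(getRangeInfo(url, start, end))
  }
  // segments resolve in order, so they can be merged directly
  const parts = await Promise.all(tasks)
  return new Blob(parts)
}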