
Loop Unrolling in JavaScript?

Jul 24, 2024 pm 01:18 PM

JavaScript can feel very removed from the hardware it runs on, but thinking low-level can still be useful in limited cases.

A recent post by Kafeel Ahmad on loop optimization detailed a number of techniques for improving loop performance. That article got me thinking about the topic.

Premature Optimization

Just to get this out of the way: this is a technique very few will ever need to consider in web development. Also, focusing on optimization too early can make code harder to write and much harder to maintain. Still, taking a peek at low-level techniques can give us insight into our tools and the work in general, even if we can't apply that knowledge directly.

What is Loop Unrolling?

Loop unrolling basically duplicates the logic inside a loop so you perform multiple operations during each, well, loop. In specific cases, making the code in the loop longer can make it faster.

By intentionally performing some operations in groups rather than one-by-one, the computer may be able to operate more efficiently.

Unrolling Example

Let's take a very simple example: summing values in an array.

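// 1-to-1 looping
const simpleSum = (data) => {
  let sum = 0;
  for(let i=0; i < data.length; i += 1) {
    sum += data[i];
  }
  return sum;
};

// 1-to-2 looping (assumes data.length is even)
const parallelSum = (data) => {
  let sum1 = 0;
  let sum2 = 0;
  for(let i=0; i < data.length; i += 2) {
    sum1 += data[i];
    sum2 += data[i + 1];
  }
  return sum1 + sum2;
};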



<p>This may look very strange at first. We're managing more variables and performing additional operations that don't happen in the simple example. How can this be faster?!</p>
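<p>One practical wrinkle: a loop that reads two items per pass assumes the array has an even number of items. With an odd length, the last pass reads one index past the end, gets undefined, and the sum becomes NaN. Below is a minimal sketch of one way to handle that, using a cleanup step for the leftover item. The name parallelSumSafe is hypothetical and is not part of the benchmarks in this article.</p>

```javascript
// Hypothetical variant of the two-at-a-time sum that tolerates odd lengths.
const parallelSumSafe = (data) => {
  let sum1 = 0;
  let sum2 = 0;
  // Index where the last complete pair ends; one short of the end for odd lengths.
  const pairedEnd = data.length - (data.length % 2);
  for (let i = 0; i < pairedEnd; i += 2) {
    sum1 += data[i];
    sum2 += data[i + 1];
  }
  // Cleanup: add the single leftover element, if there is one.
  if (pairedEnd < data.length) {
    sum1 += data[pairedEnd];
  }
  return sum1 + sum2;
};

console.log(parallelSumSafe([1, 2, 3, 4, 5])); // 15
```

<p>The cleanup runs at most once, outside the hot loop, so it should not noticeably affect the comparison for large arrays.</p>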

<h3>
  
  
  Measuring the Difference
</h3>

<p>I ran some comparisons over a variety of data sizes and multiple runs, as well as sequential or interleaved testing. The parallelSum performance varied, but was almost always better, excepting some odd results for very small data sizes. I tested this using RunJS, which is built on Chrome's V8 engine.</p>

<p>Different data sizes gave <em>very roughly</em> these results:</p>

  • Small (
  • Medium (10k-100k): Typically ~20-80% faster
  • Large (> 1M): Consistently twice as fast
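For anyone who wants to repeat this kind of comparison, here is a rough sketch of a timing helper. It is not the harness used for the numbers above: timeIt, the run count, and the test data are all assumptions for illustration, and the two sum functions are restated so the snippet runs on its own. It relies on performance.now(), which is available in browsers and recent Node.js versions.

```javascript
// Hypothetical timing helper; not the harness behind the results in this article.
const timeIt = (fn, data, runs = 20) => {
  fn(data); // warm-up call so the JIT has a chance to optimize before timing
  let result = 0;
  const start = performance.now();
  for (let r = 0; r < runs; r += 1) {
    result = fn(data);
  }
  const elapsed = performance.now() - start;
  return { result, msPerRun: elapsed / runs };
};

// One item per iteration.
const simpleSum = (data) => {
  let sum = 0;
  for (let i = 0; i < data.length; i += 1) {
    sum += data[i];
  }
  return sum;
};

// Two items per iteration (assumes an even number of items).
const parallelSum = (data) => {
  let sum1 = 0;
  let sum2 = 0;
  for (let i = 0; i < data.length; i += 2) {
    sum1 += data[i];
    sum2 += data[i + 1];
  }
  return sum1 + sum2;
};

// One million small integers, so both versions must return the same total.
const data = Array.from({ length: 1000000 }, (_, i) => i % 100);

console.log("simpleSum:", timeIt(simpleSum, data));
console.log("parallelSum:", timeIt(parallelSum, data));
```

Single measurements like this are noisy, so it is worth running the script several times and swapping the order of the two calls before drawing conclusions.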

Then I created a JSPerf with 1 million records to try across different browsers. Try it yourself!

Chrome ran parallelSum twice as fast as simpleSum, as expected from the RunJS testing.

Safari was almost identical to Chrome, in both the percentage difference and operations per second.

Firefox on the same system performed almost the same for simpleSum but parallelSum was only about 15% faster, not twice as fast.

This variation sent me looking for more information. While it's nothing definitive, I found a StackOverflow comment from 2016 discussing some of the JS engine issues with loop unrolling. It's an interesting look at how engines and optimizations can affect code in ways we don't expect.

Variations

I tried a third version as well, which added two values in a single operation to see if there was a noticeable difference between one variable and two.

const parallelSum = (data) => {
  let sum = 0;
  for(let i=0; i < data.length; i += 2) {
    sum += data[i] + data[i + 1];
  }
  return sum;
};



<p>Short answer: No. The two "parallel" versions were within the reported margin of error of each other.</p>

<h2>
  
  
  So, How Does it Work?
</h2>

<p>While JavaScript is single-threaded, the interpreters, compilers, and hardware underneath can perform optimizations for us when certain conditions are met.</p>

<p>In the simple example, the operation needs the value i to know what data to fetch, and it needs the latest value of sum to update. Because both of these change on every pass, the computer has to wait for each pass through the loop to complete before it can start the next one. While it may seem obvious to us what i += 1 will do, the computer mostly understands "the value will change, check back later", so it has difficulty optimizing.</p>

<p>Our parallel versions load multiple data entries for each value of i. We still depend on sum in each iteration, but we can load and process twice as much data per iteration. But that doesn't mean it runs <em>twice as fast</em>.</p>
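<p>One way to picture this is to count only the additions that have to wait on a previous addition. The sketch below is an idealized model, not a measurement: it assumes each accumulator is a separate chain, ignores loads, the loop counter, and caching, and dependentAdds is a hypothetical helper written for this illustration.</p>

```javascript
// Idealized model (an assumption for illustration): count the additions that
// must happen one after another when summing `items` values with a given
// number of independent accumulators.
const dependentAdds = (items, accumulators) => {
  // Each accumulator works through its share of the items in sequence...
  const perAccumulator = Math.ceil(items / accumulators);
  // ...and then the partial sums are combined one at a time.
  const combine = accumulators - 1;
  return perAccumulator + combine;
};

console.log(dependentAdds(1000000, 1)); // 1000000
console.log(dependentAdds(1000000, 2)); // 500001
```

<p>In this model, two accumulators roughly halve the longest chain of dependent work. Whether that turns into half the running time depends on the engine and the hardware, which is why the measured gains vary.</p>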

<h3>
  
  
  Deeper Dive
</h3>

<p>To understand why loop unrolling works, we look to the low-level operation of a computer. Processors with superscalar architectures can have multiple pipelines to perform simultaneous operations. They can support out-of-order execution so operations that don't depend on each other can happen as soon as possible. For some operations, SIMD can perform one action on multiple pieces of data at once. Beyond that we start getting into caching, data fetching, and branch prediction...</p>

<p>But this is a JavaScript article! We're not going that deep. If you want to know more about processor architectures, Anandtech has some excellent Deep Dives.</p><h2>
  
  
  Limits and Drawbacks
</h2>

<p>Loop unrolling is not magic. There are limits and diminishing returns that appear because of program or data size, operation complexity, computer architecture, and more. But so far we've only tested one or two operations per iteration, and modern processors can often work on four or more at once.</p>

<p>To try some larger increments, I made another JSPerf that processes 1, 2, 4, and 10 records per iteration and ran it on an Apple M1 Max MacBook Pro running macOS 14.5 Sonoma, and an AMD Ryzen 9 3950X PC running Windows 11.</p>

<p>On the Mac, ten records at a time was 2.5-3.5x faster than the base loop, but only 12-15% faster than processing four records. On the PC we still saw the 2x improvement going from one record to two, but ten records was just 2% faster than four, which I would not have predicted for a 16-core processor (though since this loop runs on a single thread, the core count matters less than how much work one core can overlap).</p>
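<p>For reference, a larger increment might look something like the sketch below, which handles four records per iteration. This is an assumption about the general shape, not the code from the JSPerf: sumBy4 and its variable names are hypothetical, and a cleanup loop is added so that lengths that are not a multiple of four still sum correctly.</p>

```javascript
// Hypothetical four-at-a-time sum; not the code from the author's JSPerf.
const sumBy4 = (data) => {
  let s1 = 0;
  let s2 = 0;
  let s3 = 0;
  let s4 = 0;
  const n = data.length;
  // Last index covered by complete groups of four.
  const unrolledEnd = n - (n % 4);
  let i = 0;
  for (; i < unrolledEnd; i += 4) {
    s1 += data[i];
    s2 += data[i + 1];
    s3 += data[i + 2];
    s4 += data[i + 3];
  }
  // Cleanup loop for the 0-3 items that do not fill a complete group.
  for (; i < n; i += 1) {
    s1 += data[i];
  }
  return s1 + s2 + s3 + s4;
};

console.log(sumBy4([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])); // 55
```

<p>Each extra accumulator adds code to maintain, so the small gains beyond four records in the results above are a reasonable argument for stopping early.</p>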

<h3>
  
  
  Platforms and Updates
</h3>

<p>These different results remind us to be careful with optimization. Optimizing for your computer could create a worse experience on less-capable or just different hardware. Performance or functionality issues on older or entry-level hardware are a common problem when developers work on fast, powerful machines, and fixing them is something I've been tasked with multiple times in my career.</p>

<p>For a sense of scale, a currently-available entry-level Chromebook from HP has an Intel Celeron N4120 processor. This is roughly equivalent to my 2013 Core i5-4250U MacBook Air. It has just <em>one ninth</em> the performance of the M1 Max in a synthetic benchmark. On that 2013 MacBook Air, running the latest version of Chrome, <em>the 4-record function</em> was faster than the 10-record function, but still only 60% faster than the single-record function!</p>

<p>Browsers and standards are constantly changing, too. A routine browser update or a different processor architecture could make optimized code <em>slower</em> than a regular loop. When you find yourself deeply optimizing, you may need to ensure your optimization is relevant to your consumers, and that it <em>stays relevant</em>.</p>

<p>It reminds me of the book High Performance JavaScript by Nicholas Zakas, which I read back in 2012. It was a great book and contained a lot of insight. However, by 2014 a number of the significant performance issues identified in the book had been resolved or substantially reduced by browser engine updates, and we were able to focus more effort on writing maintainable code.</p>

<p>If you are trying to stay on the edge of performance optimization, be prepared for change and regular validation.</p>

<h3>
  
  
  Lessons from the Past
</h3>

<p>While researching this topic I came across a Linux Kernel Mailing List thread from the year 2000 about removing some loop unrolling optimizations which ultimately improved the application performance. It included this still-relevant point (emphasis mine):</p>

<blockquote>
<p><strong>The bottom line is that our intuitive assumptions of what's fast and what isn't can often be wrong,</strong> especially given how much CPU's have changed over the past couple of years.<br>
– Theodore Ts'o</p>
</blockquote>

<h2>
  
  
  Conclusion
</h2>

<p>There are times you may need to squeeze performance out of a loop, and if you are processing enough items, this could be one of the ways you do that. It's good to know about these kinds of optimizations, but for most work, You Aren't Gonna Need It™.</p>

<p>Still, I hope you've enjoyed my rambling, and that maybe in the future your memory will be jogged about performance optimization considerations.</p>

<p>Thanks for reading!</p>


          

            
        
