


Building a Scalable Reverse Proxy Server like Nginx with Node.js and TypeScript
The Inspiration
In today's microservices architecture, reverse proxies play a crucial role in managing and routing incoming requests to various backend services.
A reverse proxy sits in front of an application's web servers and intercepts the requests coming from client machines. This has many benefits: load balancing, hiding origin servers' IP addresses for better security, caching, rate limiting, and more.
In a distributed, microservice-based architecture, a single entry point is necessary. Reverse proxy servers like Nginx help in such scenarios. If we have multiple instances of our server running, managing and ensuring efficient request routing becomes tricky. A reverse proxy like Nginx is a perfect solution here: we can point our domain at the IP address of the Nginx server, and Nginx will route each incoming request to one of the instances according to its configuration, while taking account of the load each instance is handling.
How Does Nginx Do It So Well?
I recommend reading this article from Nginx, which explains in detail how Nginx supports a huge scale of requests with great reliability and speed: Nginx Architecture
In short, Nginx has a master process and a number of worker processes, along with helper processes like the cache loader and cache manager. The master and worker processes do all the heavy lifting.
- Master Process: Manages configuration and spawns child processes.
- Cache Loader/Manager: Handle cache loading and pruning with minimal resources.
- Worker Processes: Manage connections, disk I/O, and upstream communication, running nonblocking and independently.
Worker processes handle multiple connections in a non-blocking fashion, reducing context switches. They are single-threaded, run independently, and use shared memory for shared resources like the cache and session data. This architecture lets Nginx keep context switches to a minimum and makes it faster than a blocking, multi-process architecture.
Taking inspiration from this, we will use the same master/worker-process concept and implement our own event-driven reverse proxy server, capable of handling thousands of connections per worker process.
Project Architecture
Our reverse proxy implementation follows these key design principles:
- Configuration-Driven: All proxy behavior is defined in a YAML configuration file, making it easy to modify routing rules.
- Type Safety: TypeScript and Zod schemas ensure configuration validity and runtime type safety.
- Scalability: Node.js cluster module enables utilizing multiple CPU cores for better performance.
- Modularity: Clear separation of concerns with distinct modules for configuration, server logic, and schema validation.
Project Structure
```
├── config.yaml          # Server configuration
├── src/
│   ├── config-schema.ts # Configuration validation schemas
│   ├── config.ts        # Configuration parsing logic
│   ├── index.ts         # Application entry point
│   ├── server-schema.ts # Server message schemas
│   └── server.ts        # Core server implementation
└── tsconfig.json        # TypeScript configuration
```
Key Components
- config.yaml: Defines the server's configuration, including the port, worker processes, upstream servers, headers, and routing rules.
- config-schema.ts: Defines validation schemas using the Zod library to ensure the configuration structure is correct.
- server-schema.ts: Specifies message formats exchanged between the master and worker processes.
- config.ts: Provides functions for parsing and validating the YAML configuration file.
- server.ts: Implements the reverse proxy server logic, including cluster setup, HTTP handling, and request forwarding.
- index.ts: Serves as the entry point, parsing command-line options and initiating the server.
Configuration Management
The configuration system uses YAML. Here's how it works:
```yaml
server:
  listen: 8080            # Port the server listens on.
  workers: 2              # Number of worker processes to handle requests.
  upstreams:              # Define upstream servers (backend targets).
    - id: jsonplaceholder
      url: jsonplaceholder.typicode.com
    - id: dummy
      url: dummyjson.com
  headers:                # Custom headers added to proxied requests.
    - key: x-forward-for
      value: $ip          # Adds the client IP to the forwarded request.
    - key: Authorization
      value: Bearer xyz   # Adds an authorization token to requests.
  rules:                  # Define routing rules for incoming requests.
    - path: /test
      upstreams:
        - dummy           # Routes requests to "/test" to the "dummy" upstream.
    - path: /
      upstreams:
        - jsonplaceholder # Routes all other requests to "jsonplaceholder".
```
Incoming requests are evaluated against the rules. Based on the path, the reverse proxy determines which upstream server to forward the request to.
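To make the matching concrete, here is a minimal sketch of first-match prefix routing. The function name and the exact matching strategy are assumptions for illustration, not the code from server.ts:

```typescript
type Rule = { path: string; upstreams: string[] };

// First rule whose path is a prefix of the request URL wins,
// so more specific paths should be listed before "/".
function matchRule(rules: Rule[], url: string): Rule | undefined {
  return rules.find((rule) => url.startsWith(rule.path));
}

const rules: Rule[] = [
  { path: "/test", upstreams: ["dummy"] },
  { path: "/", upstreams: ["jsonplaceholder"] },
];

console.log(matchRule(rules, "/test")?.upstreams[0]);  // "dummy"
console.log(matchRule(rules, "/todos")?.upstreams[0]); // "jsonplaceholder"
```

Because the catch-all "/" rule matches every URL, rule order in config.yaml matters.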
Configuration Validation (config-schema.ts)
We use Zod to define strict schemas for configuration validation:
```typescript
import { z } from "zod";

const upstreamSchema = z.object({
  id: z.string(),
  url: z.string(),
});

const headerSchema = z.object({
  key: z.string(),
  value: z.string(),
});

const ruleSchema = z.object({
  path: z.string(),
  upstreams: z.array(z.string()),
});

const serverSchema = z.object({
  listen: z.number(),
  workers: z.number().optional(),
  upstreams: z.array(upstreamSchema),
  headers: z.array(headerSchema).optional(),
  rules: z.array(ruleSchema),
});

export const rootConfigSchema = z.object({
  server: serverSchema,
});

export type ConfigSchemaType = z.infer<typeof rootConfigSchema>;
```
Parsing and Validating Configurations (config.ts)
The config.ts module provides utility functions to parse and validate the configuration file.
```typescript
import fs from "node:fs/promises";
import { parse } from "yaml";
import { rootConfigSchema } from "./config-schema";

export async function parseYAMLConfig(filepath: string) {
  const configFileContent = await fs.readFile(filepath, "utf8");
  const configParsed = parse(configFileContent);
  return JSON.stringify(configParsed);
}

export async function validateConfig(config: string) {
  const validatedConfig = await rootConfigSchema.parseAsync(
    JSON.parse(config)
  );
  return validatedConfig;
}
```
Reverse Proxy Server Logic (server.ts)
The server uses the Node.js cluster module for scalability and the http module for handling requests. The master process distributes requests to worker processes, which forward them to upstream servers. Let's explore the server.ts file in detail; it contains the core logic of our reverse proxy server. We'll break down each component and see how they work together to create a scalable proxy server.
The server implementation follows a master-worker architecture using Node.js's cluster module. This design allows us to:
- Utilize multiple CPU cores
- Handle requests concurrently
- Maintain high availability
- Isolate request processing
Master Process:
- Creates worker processes
- Distributes incoming requests across workers
- Manages the worker pool
- Handles worker crashes and restarts
Worker Processes:
- Handle individual HTTP requests
- Match requests against routing rules
- Forward requests to upstream servers
- Process responses and send them back to clients
Master Process Setup
The master process creates a pool of workers and passes the configuration to each worker through environment variables. This ensures all workers have access to the same configuration.
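Since environment variables can only carry strings, this configuration hand-off is essentially a JSON round-trip. Here is a small sketch with a hypothetical config object, not the project's actual wiring:

```typescript
// The validated config must cross the process boundary as a string,
// which is why the config helpers return JSON rather than a live object.
const config = { server: { listen: 8080, workers: 2 } };

// Master side: roughly what cluster.fork({ config: ... }) passes along.
const env: Record<string, string> = { config: JSON.stringify(config) };

// Worker side: rebuild the object from the process.env-style string.
const workerConfig = JSON.parse(env.config);

console.log(workerConfig.server.listen); // 8080
```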
Request Distribution
The master process uses a simple random distribution strategy to assign requests to workers. While not as sophisticated as round-robin or least-connections algorithms, this approach provides decent load distribution for most use cases. The request distribution logic:
- Randomly selects a worker from the pool
- Creates a balanced workload across workers
- Handles edge cases where workers might be unavailable
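A minimal sketch of that selection step, using a mock pool of worker names in place of real cluster workers:

```typescript
// Mock stand-in for the pool of forked cluster workers.
const WORKER_POOL = ["worker-0", "worker-1"];

// Randomly select one member of the pool, guarding against an empty pool.
function pickWorker<T>(pool: T[]): T {
  const index = Math.floor(Math.random() * pool.length);
  const worker = pool[index];
  if (worker === undefined) throw new Error("Worker not found.");
  return worker;
}

console.log(WORKER_POOL.includes(pickWorker(WORKER_POOL))); // true
```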
Worker Process Request Logic
Each worker listens for messages, matches requests against routing rules, and forwards them to the appropriate upstream server.
The master process communicates with workers by constructing a standardized message payload, including all necessary request information, using Node.js IPC (Inter-Process Communication) and validating message structure using Zod schemas.
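A sketch of what such a payload and its trip across the IPC boundary might look like; the WorkerMessage type here is a hand-written stand-in for the Zod schema in server-schema.ts:

```typescript
// Hypothetical shape of the master → worker message; the real schema
// lives in server-schema.ts and is validated with Zod at runtime.
type WorkerMessage = {
  requestType: "HTTP";
  headers: Record<string, string>;
  body: string | null;
  url: string;
};

const payload: WorkerMessage = {
  requestType: "HTTP",
  headers: { host: "localhost:8080" },
  body: null,
  url: "/todos",
};

// Node IPC carries strings, so the payload is serialized before
// worker.send() and parsed back on the worker side.
const wire = JSON.stringify(payload);
const received = JSON.parse(wire) as WorkerMessage;
console.log(received.url); // "/todos"
```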
Workers handle the actual request processing and proxying. Each worker:
- Loads its configuration from environment variables
- Validates the configuration using Zod schemas
- Maintains its own copy of the configuration
Workers select upstream servers by:
- Finding the appropriate upstream ID from the rule
- Locating the upstream server configuration
- Validating the upstream server exists
The request forwarding mechanism:
- Creates a new HTTP request to the upstream server
- Streams the response data
- Aggregates the response body
- Sends the response back to the master process
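The upstream-selection steps above can be sketched as a pure function; taking the first upstream listed on a rule is an assumption made for illustration:

```typescript
type Upstream = { id: string; url: string };

// Resolve a rule's upstream IDs against the configured upstream servers,
// failing loudly if the configuration references an unknown ID.
function resolveUpstream(upstreams: Upstream[], ruleUpstreamIds: string[]): Upstream {
  const id = ruleUpstreamIds[0]; // take the first upstream listed on the rule
  const upstream = upstreams.find((u) => u.id === id);
  if (!upstream) throw new Error(`Upstream ${id} not found.`);
  return upstream;
}

const upstreams: Upstream[] = [
  { id: "jsonplaceholder", url: "jsonplaceholder.typicode.com" },
  { id: "dummy", url: "dummyjson.com" },
];

console.log(resolveUpstream(upstreams, ["dummy"]).url); // "dummyjson.com"
```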
Running the Server
To run the server, first build the TypeScript project (for example with npx tsc), then start the compiled entry point, passing it the path to config.yaml. During development, a file watcher such as tsx or ts-node-dev can run src/index.ts directly and restart on changes.

On startup, the master block in server.ts forks the workers, then accepts connections and hands each request off to a worker:

```typescript
if (cluster.isPrimary) {
  console.log("Master Process is up");

  // Fork the worker pool, passing the parsed config to each worker
  // through its environment (workerCount comes from config.server.workers).
  for (let i = 0; i < workerCount; i++) {
    cluster.fork({ config: JSON.stringify(config) });
  }

  const server = http.createServer((req, res) => {
    // Pick a random worker and forward the request details to it.
    const index = Math.floor(Math.random() * WORKER_POOL.length);
    const worker = WORKER_POOL.at(index);
    if (!worker) throw new Error("Worker not found.");

    const payload: WorkerMessageSchemaType = {
      requestType: "HTTP",
      headers: req.headers,
      body: null,
      url: `${req.url}`,
    };
    worker.send(JSON.stringify(payload));

    // Wait for the worker's reply and relay it to the client.
    worker.once("message", async (workerReply: string) => {
      const reply = await workerMessageReplySchema.parseAsync(
        JSON.parse(workerReply)
      );
      if (reply.errorCode) {
        res.writeHead(parseInt(reply.errorCode));
        res.end(reply.error);
      } else {
        res.writeHead(200);
        res.end(reply.data);
      }
    });
  });

  server.listen(port, () => {
    console.log(`Reverse Proxy listening on port: ${port}`);
  });
}
```
Once started, there is one master process and two worker processes running, and our reverse proxy server is listening on port 8080.
In the config.yaml file, we define two upstream servers: jsonplaceholder and dummy. If we want all requests coming to our server to be routed to jsonplaceholder, we add the rule:
```yaml
rules:
  - path: /
    upstreams:
      - jsonplaceholder
```
Similarly, if we want requests to the /test endpoint to be routed to our dummy upstream server, we add the rule:
```yaml
rules:
  - path: /test
    upstreams:
      - dummy
```
Let's test this out!
Wow, that is cool! We navigate to localhost:8080, and in response we receive the homepage of jsonplaceholder.typicode.com. The end user never knows the response came from a separate server. That is why reverse proxy servers are important: if we have multiple servers running the same code and don't want to expose all of their ports to end users, we use a reverse proxy as an abstraction layer. Users hit the reverse proxy, a very robust and fast server, and it determines which server to route each request to.
Let's hit localhost:8080/todos now and see what happens.
Our request got reverse-proxied to the jsonplaceholder server again, and we received a JSON response from the resolved URL: jsonplaceholder.typicode.com/todos.
Communication Flow
Let's visualize the complete request flow:
Client sends request → Master Process
Master Process → Selected Worker
Worker → Upstream Server
Upstream Server → Worker
Worker → Master Process
Master Process → Client
Performance Considerations
The multi-process architecture provides several performance benefits:
- CPU Utilization: Worker processes can run on different CPU cores, utilizing available hardware resources.
- Process Isolation: A crash in one worker doesn't affect others, improving reliability.
- Load Distribution: Random distribution of requests helps prevent any single worker from becoming overwhelmed.
Future Improvements
While functional, the current implementation could be enhanced with:
- Better Load Balancing: Implement more sophisticated algorithms like round-robin or least-connections.
- Health Checks: Add periodic health checks for upstream servers.
- Caching: Implement response caching to reduce upstream server load.
- Metrics: Add prometheus-style metrics for monitoring.
- WebSocket Support: Extend the proxy to handle WebSocket connections.
- HTTPS Support: Add SSL/TLS termination capabilities.
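As a taste of the first improvement, a round-robin selector can replace the random pick in only a few lines. This is a sketch, not part of the current codebase:

```typescript
// Round-robin selection: cycle through the pool in order instead of
// picking at random, so load spreads evenly even over short bursts.
function makeRoundRobin<T>(pool: T[]): () => T {
  let next = 0;
  return () => {
    const item = pool[next % pool.length];
    next++;
    return item;
  };
}

const pick = makeRoundRobin(["w0", "w1", "w2"]);
console.log(pick(), pick(), pick(), pick()); // w0 w1 w2 w0
```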
Wrapping Up
Building a reverse proxy server from scratch might seem intimidating at first, but as we’ve explored, it’s a rewarding experience. By combining Node.js clusters, TypeScript, and YAML-based configuration management, we’ve created a scalable and efficient system inspired by Nginx.
There’s still room to enhance this implementation — better load balancing, caching, or WebSocket support are just a few ideas to explore. But the current design sets a strong foundation for experimenting and scaling further. If you’ve followed along, you’re now equipped to dive deeper into reverse proxies or even start building custom solutions tailored to your needs.
If you’d like to connect or see more of my work, check out my GitHub, LinkedIn.
The repository for this project can be found here.
I’d love to hear your thoughts, feedback, or ideas for improvement. Thanks for reading, and happy coding!