Text Detection in Images Using AWS Rekognition and Node.js

Hey everyone! In this article, we'll be creating a simple application to perform image text detection using AWS Rekognition with Node.js.

What is AWS Rekognition?

Amazon Rekognition is a service that makes it easy to add image and video analysis to your applications. It offers features like text detection, facial recognition, and even celebrity detection.
While Rekognition can analyze images or videos stored in S3, for this tutorial, we'll be working without S3 to keep things simple.
We'll be using Express for the backend and React for the frontend.

First Steps

Before we start, you'll need to create an AWS account and set up an IAM user. If you already have these, you can skip this section.

Creating IAM user

  • Log in to AWS: Start by logging into your AWS root account.
  • Search for IAM: In the AWS console, search for IAM and select it.
  • Go to the Users section and click Create User.
  • Set the user name, and under Set Permissions, choose Attach policies directly.
  • Search for and select a Rekognition policy (for example, AmazonRekognitionFullAccess), then click Next and create the user.
  • Create Access Keys: After creating the user, select the user, and under the Security credentials tab, create an access key. Be sure to download the .csv file containing your access key and secret access key.
  • For more detailed instructions, refer to the official AWS documentation: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html

Configuring aws-sdk

  • Install AWS CLI: Install the AWS CLI on your system.
  • Verify Installation: Open a terminal or command prompt and type aws --version to ensure the CLI is installed correctly.
  • Configure the AWS CLI: Run aws configure and provide the access key and secret access key from the .csv file you downloaded, along with the default region you want to use (the region is not in the .csv file).

Project Directory

my-directory/
│
├── client/
│   ├── src/
│   │   └── App.jsx
│   ├── public/
│   ├── package.json
│   └── ... (other React project files)
│
└── server/
    ├── index.js
    └── rekognition/
        └── aws.rek.js

Setting up frontend

npm create vite@latest . -- --template react
Run this command from inside the client folder. It creates the React project there.

In App.jsx:

import { useState } from "react";

function App() {
  const [img, setImg] = useState(null);

  const handleImg = (e) => {
    setImg(e.target.files[0]);  // Store the selected image in state
  };

  const handleSubmit = (e) => {
    e.preventDefault();
    if (!img) return;

    const formData = new FormData();
    formData.append("img", img);
    console.log(formData.get("img"));  // Log the file stored under the "img" field
  };

  return (
    <div>
      <form onSubmit={handleSubmit}>
        <input type="file" name="img" accept="image/*" onChange={handleImg} />
        <br />
        <button type="submit">Submit</button>
      </form>
    </div>
  );
}

export default App;

Let's test this out by ensuring the image is logged to the console after submitting.
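
To see everything a FormData object holds, you can iterate over its entries, since logging the object itself often shows little in the browser console. This is a minimal sketch, assuming a runtime with the standard FormData and Blob globals (modern browsers, Node.js 18+). The helper name describeFormData and the sample bytes are hypothetical and not part of the original project.

```javascript
// Hypothetical helper: summarize each FormData entry as a plain object.
function describeFormData(formData) {
  const summary = [];
  for (const [field, value] of formData.entries()) {
    if (typeof value === "string") {
      // Plain text fields are stored as strings.
      summary.push({ field, kind: "string", value });
    } else {
      // File and Blob values expose size (in bytes) and MIME type.
      summary.push({ field, kind: "file", size: value.size, type: value.type });
    }
  }
  return summary;
}

// Stand-in for the file picked in the <input>; the bytes are arbitrary.
const fakeImage = new Blob([new Uint8Array([1, 2, 3, 4])], { type: "image/png" });
const formData = new FormData();
formData.append("img", fakeImage, "sample.png");

console.log(describeFormData(formData));
```

In handleSubmit, the same loop could replace the single console.log call while debugging.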

Now let's move to the backend, which is the heart of this project.

Initializing the backend

In the server folder, run:

npm init -y
npm install express cors nodemon multer @aws-sdk/client-rekognition

The code below uses ES module import syntax, so add "type": "module" to the server's package.json. I created a separate rekognition folder to hold the analysis logic, with a file named aws.rek.js inside it.

//aws.rek.js

import {
  RekognitionClient,
  DetectTextCommand,
} from "@aws-sdk/client-rekognition";

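// With no options, the client reads credentials and region from the
// environment or the shared config written by `aws configure`.
const client = new RekognitionClient({});

// Sends raw image bytes (a Buffer or Uint8Array) to DetectText.
export const Reko = async (imageBytes) => {
  try {
    const command = new DetectTextCommand({
      Image: {
        Bytes: imageBytes, // pass the bytes directly instead of an S3 object
      },
    });
    return await client.send(command);
  } catch (error) {
    console.error(error.message);
    throw error; // let the caller decide how to respond
  }
};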

Explanation

  • We initialize a RekognitionClient object. Because the credentials and region were already set with aws configure, the SDK can pick them up from the shared configuration, so we can leave the braces empty.
  • We create an async function, Reko, to process the image. Inside it, we initialize a DetectTextCommand object, which takes the image as Bytes.
  • DetectTextCommand is the command used specifically for text detection.
  • The function waits for the response and returns it.
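
The DetectText response contains both LINE detections and the individual WORD detections inside them, each with a Confidence score, so the same text can appear more than once. The sketch below filters a response down to confident lines. The helper name extractLines, the 80% threshold, and the sample response are illustrative assumptions. Only the field names (TextDetections, DetectedText, Type, Confidence, Id, ParentId) come from the Rekognition response format.

```javascript
// Hypothetical helper: keep only LINE detections at or above a confidence threshold.
function extractLines(response, minConfidence = 80) {
  const detections = response?.TextDetections ?? [];
  return detections
    .filter((d) => d.Type === "LINE" && d.Confidence >= minConfidence)
    .map((d) => d.DetectedText);
}

// Hand-written sample in the shape of a DetectText response (values are made up).
const sampleResponse = {
  TextDetections: [
    { Id: 0, Type: "LINE", DetectedText: "Hello world", Confidence: 99.1 },
    { Id: 1, Type: "LINE", DetectedText: "b1urry", Confidence: 42.0 },
    { Id: 2, ParentId: 0, Type: "WORD", DetectedText: "Hello", Confidence: 99.3 },
    { Id: 3, ParentId: 0, Type: "WORD", DetectedText: "world", Confidence: 98.9 },
  ],
};

console.log(extractLines(sampleResponse)); // [ 'Hello world' ]
```

Whether to keep lines, words, or both depends on how the results will be displayed. The tutorial's code keeps every detection.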

Creating the API

In the server folder, create a file named index.js (or any name you prefer).

//index.js

import express from "express"
import multer from "multer"
import cors from "cors"
import { Reko } from "./rekognition/aws.rek.js";

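const app = express()
app.use(cors())

// Keep uploads in memory so the image bytes are available as req.file.buffer
const storage = multer.memoryStorage()
const upload = multer({ storage })

app.post("/img", upload.single("img"), async (req, res) => {
    if (!req.file) {
        return res.status(400).send("No image uploaded")
    }
    try {
        const data = await Reko(req.file.buffer)
        // Build a fresh array per request so results don't accumulate across uploads
        const texts = (data?.TextDetections ?? []).map((item) => item.DetectedText)
        res.status(200).send(texts)
    } catch (error) {
        res.status(500).send("Text detection failed")
    }
})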

app.listen(3000, () => {
    console.log("server started");
})

Explanation

  • We initialize Express and start the server.
  • We use multer to handle the multipart form data and keep the uploaded file temporarily in memory as a Buffer.
  • We create a POST route to receive the image from the user. Its handler is an async function.
  • After the user uploads the image, it is available in req.file.
  • req.file has several properties, including buffer, which holds the image data as a Buffer (a sequence of bytes).
  • We pass req.file.buffer to the Rekognition function. After analyzing the image, it returns a response whose TextDetections property is an array of objects.
  • We send the DetectedText of each of those objects back to the user.
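
When image bytes are sent directly rather than through S3, Rekognition accepts JPEG or PNG images up to 5 MB, so it can be worth rejecting other uploads before calling the API. This is a minimal sketch, assuming the mimetype and size properties that multer sets on req.file. The helper name validateUpload and the error messages are hypothetical.

```javascript
const MAX_BYTES = 5 * 1024 * 1024; // 5 MB limit for images passed as raw bytes
const ALLOWED_TYPES = new Set(["image/jpeg", "image/png"]);

// Hypothetical helper: returns null when the file looks acceptable,
// otherwise a short message suitable for a 400 response.
function validateUpload(file) {
  if (!file) return "No image uploaded";
  if (!ALLOWED_TYPES.has(file.mimetype)) return "Only JPEG and PNG images are supported";
  if (file.size > MAX_BYTES) return "Image must be 5 MB or smaller";
  return null;
}

// Objects shaped like multer's req.file (only the fields used here).
console.log(validateUpload({ mimetype: "image/png", size: 1024 }));       // null
console.log(validateUpload({ mimetype: "image/gif", size: 1024 }));       // Only JPEG and PNG images are supported
console.log(validateUpload({ mimetype: "image/jpeg", size: 6_000_000 })); // Image must be 5 MB or smaller
```

In the /img route, this check could run right after reading req.file, responding with status 400 when a message is returned. The mimetype is supplied by the client, so this is a cheap first check rather than proof of the file's real format.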

Coming back to frontend

Install axios in the client folder with npm install axios, then update App.jsx:

import axios from "axios";
import { useState } from "react";
import "./App.css"; 

function App() {
  const [img, setImg] = useState(null);
  const [pending, setPending] = useState(false);
  const [texts, setTexts] = useState([]);

  const handleImg = (e) => {
    setImg(e.target.files[0]);
  };

  const handleSubmit = async (e) => {
    e.preventDefault();
    if (!img) return; 

    const formData = new FormData();
    formData.append("img", img);

    try {
      setPending(true);
      const response = await axios.post("http://localhost:3000/img", formData);
      setTexts(response.data);
    } catch (error) {
      console.log("Error uploading image:", error);
    } finally {
      setPending(false);
    }
  };

  return (
    <div className="app-container">
      <div className="form-container">
        <form onSubmit={handleSubmit}>
          <input type="file" name="img" accept="image/*" onChange={handleImg} />
          <br />
          <button type="submit" disabled={pending}>
            {pending ? "Uploading..." : "Upload Image"}
          </button>
        </form>
      </div>

      <div className="result-container">
        {pending && <h1 id="Loading">Loading...</h1>}
        {texts.length > 0 && (
          <ul>
            {texts.map((text, index) => (
              <li key={index}>{text}</li>
            ))}
          </ul>
        )}
      </div>
    </div>
  );
}

export default App;

Explanation

  • We use Axios to post the image and store the response in the texts state.
  • We display the texts. For now, I am using the index as the key, although using the index as a key is discouraged.
  • I have also added a few extras, such as a loading state and some styles.
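
One way to avoid index keys is to have the server return each detection's Id alongside its text, since Rekognition assigns every detection an Id that is unique within a single DetectText response. This is a sketch, assuming the backend is changed to send objects instead of plain strings. The helper name toListItems and the sample detections are hypothetical.

```javascript
// Hypothetical helper: turn DetectText detections into { id, text } pairs
// that a React list can render as <li key={item.id}>{item.text}</li>.
function toListItems(detections) {
  return detections.map((d) => ({ id: d.Id, text: d.DetectedText }));
}

// Made-up detections in the shape Rekognition returns.
const items = toListItems([
  { Id: 0, Type: "LINE", DetectedText: "Hello world" },
  { Id: 1, Type: "WORD", DetectedText: "Hello" },
  { Id: 2, Type: "WORD", DetectedText: "world" },
]);

console.log(items.map((item) => item.id)); // [ 0, 1, 2 ]
```

The Id is only meaningful within one response, which is enough here because the list is replaced on every upload.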

Final Output

After clicking the "Upload Image" button, the backend processes the image and returns the detected text, which is then displayed to the user.

For the complete code, check out my GitHub repo.

Thank You!!!

Follow me on: Medium, GitHub, LinkedIn, X, Instagram
