Text Detection in Images Using AWS Rekognition and Node.js

王林

Aug 26, 2024 pm 09:35 PM

Hey everyone! In this article, we'll be creating a simple application to perform image text detection using AWS Rekognition with Node.js.

What is AWS Rekognition?

Amazon Rekognition is a service that makes it easy to add image and video analysis to your applications. It offers features like text detection, facial recognition, and even celebrity detection.
While Rekognition can analyze images or videos stored in S3, for this tutorial, we'll be working without S3 to keep things simple.
We'll be using Express for the backend and React for the frontend.

First Steps

Before we start, you'll need to create an AWS account and set up an IAM user. If you already have these, you can skip this section.

Creating IAM user

Log in to AWS: Start by logging into your AWS root account.
Search for IAM: In the AWS console, search for IAM and select it.
Go to the Users section and click Create User.
Set the user name, and under Set Permissions, choose Attach policies directly.
Search for and select a Rekognition policy (for example, the AWS managed policy AmazonRekognitionFullAccess), then click Next and create the user.
Create Access Keys: After creating the user, select the user, and under the Security credentials tab, create an access key. Be sure to download the .csv file containing your access key and secret access key.
For more detailed instructions, refer to the official AWS documentation: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html

Configuring aws-sdk

Install AWS CLI: Install the AWS CLI on your system.
Verify Installation: Open a terminal or command prompt and type aws --version to ensure the CLI is installed correctly.
Configure the AWS CLI: Run aws configure and provide the access key and secret access key from the .csv file you downloaded, along with the AWS region you want to use.

Project Directory

my-directory/
│
├── client/
│   ├── src/
│   │   └── App.jsx
│   ├── public/
│   ├── package.json
│   └── ... (other React project files)
│
└── server/
    ├── index.js
    └── rekognition/
        └── aws.rek.js

Setting up frontend

npm create vite@latest . -- --template react
Run this command inside the client folder; it creates the React project there.

In App.jsx:

import { useState } from "react";

function App() {
  const [img, setImg] = useState(null);

  const handleImg = (e) => {
    setImg(e.target.files[0]);  // Store the selected image in state
  };

  const handleSubmit = (e) => {
    e.preventDefault();
    if (!img) return;

    const formData = new FormData();
    formData.append("img", img);
    console.log(formData.get("img"));  // Log the attached file to the console
  };

  return (
    <div>
      <form onSubmit={handleSubmit}>
        <input type="file" name="img" accept="image/*" onChange={handleImg} />
        <br />
        <button type="submit">Submit</button>
      </form>
    </div>
  );
}

export default App;

Let's test this out by ensuring the image is logged to the console after submitting.
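If you want to see every entry a FormData carries rather than a single field, one option is to walk its entries, since logging the FormData object itself usually prints an opaque object. The sketch below is only an illustration: the helper name describeFormData and the sample file are made up, and a Blob stands in for the file a user would pick.

```javascript
// Hypothetical helper: summarize each FormData entry so it can be logged.
function describeFormData(formData) {
  return Array.from(formData.entries()).map(([name, value]) =>
    typeof value === "string"
      ? { name, kind: "text", value }
      : { name, kind: "file", fileName: value.name, type: value.type, size: value.size }
  );
}

// Stand-in for the file chosen in <input type="file"> (assumed sample data).
const fakeImage = new Blob(["fake image bytes"], { type: "image/png" });

const formData = new FormData();
formData.append("img", fakeImage, "sample.png");

const summary = describeFormData(formData);
console.log(summary);
```

In the component, the same helper could be called inside handleSubmit right after formData.append("img", img).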

Now let's move to the backend, which is the heart of this project.

Initializing the backend

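In the server folder, run:

npm init -y
npm install express cors nodemon multer @aws-sdk/client-rekognition
Because the server code uses ES module import syntax, add "type": "module" to the server's package.json.
I have created a separate folder, rekognition, to hold the analysis logic, with a file named aws.rek.js inside it.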

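// aws.rek.js

import {
  RekognitionClient,
  DetectTextCommand,
} from "@aws-sdk/client-rekognition";

// Credentials and region come from the configuration created with `aws configure`
const client = new RekognitionClient({});

// Sends the raw image bytes to Rekognition and returns the DetectText response
export const Reko = async (imageBytes) => {
  try {
    const command = new DetectTextCommand({
      Image: {
        Bytes: imageBytes, // we are using Bytes directly instead of S3
      },
    });
    const response = await client.send(command);
    return response;
  } catch (error) {
    console.log(error.message);
    throw error; // rethrow so the caller can decide how to respond
  }
};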

Explanation

We initialize a RekognitionClient object. Since we've already configured credentials and a region with aws configure, the SDK picks them up automatically and we can leave the braces empty.
We create an async function, Reko, to process the image. Inside it, we initialize a DetectTextCommand object, which takes the image as Bytes.
DetectTextCommand is the command used specifically for text detection.
The function waits for the response and returns it.
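Since the image travels as raw bytes, it can help to reject obviously unsuitable uploads before calling Rekognition. The sketch below is not part of the original project: the helper name checkImageBytes is made up, and the limits it encodes (JPEG or PNG only, at most 5 MB when sent as bytes) are my understanding of the service limits, so check the current AWS documentation before relying on them.

```javascript
// Assumed limits: Rekognition accepts JPEG and PNG, and images sent as raw
// bytes are capped at 5 MB. Verify against the AWS docs.
const MAX_IMAGE_BYTES = 5 * 1024 * 1024;

// File signatures ("magic bytes") that PNG and JPEG files start with.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const JPEG_SIGNATURE = Buffer.from([0xff, 0xd8, 0xff]);

// Hypothetical helper: returns { ok, reason } for a candidate image buffer.
function checkImageBytes(bytes) {
  if (!Buffer.isBuffer(bytes) || bytes.length === 0) {
    return { ok: false, reason: "empty or not a Buffer" };
  }
  if (bytes.length > MAX_IMAGE_BYTES) {
    return { ok: false, reason: "larger than 5 MB" };
  }
  const isPng = bytes.subarray(0, PNG_SIGNATURE.length).equals(PNG_SIGNATURE);
  const isJpeg = bytes.subarray(0, JPEG_SIGNATURE.length).equals(JPEG_SIGNATURE);
  if (!isPng && !isJpeg) {
    return { ok: false, reason: "not a PNG or JPEG" };
  }
  return { ok: true, reason: null };
}

// Made-up inputs: a buffer that starts like a PNG, and plain text.
const pngLike = Buffer.concat([PNG_SIGNATURE, Buffer.alloc(16)]);
console.log(checkImageBytes(pngLike));
console.log(checkImageBytes(Buffer.from("hello")));
```

A check like this could run on the buffer just before it is handed to Reko, so that bad uploads fail fast with a clear reason.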

Creating the API

In the server folder, create a file named index.js (or whatever name you prefer).

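// index.js

import express from "express";
import multer from "multer";
import cors from "cors";
import { Reko } from "./rekognition/aws.rek.js";

const app = express();
app.use(cors());

// Keep uploaded files in memory so they are available as req.file.buffer
const storage = multer.memoryStorage();
const upload = multer({ storage });

app.post("/img", upload.single("img"), async (req, res) => {
  const file = req.file;
  if (!file) {
    return res.status(400).send("No image uploaded");
  }
  try {
    const data = await Reko(file.buffer);
    // Build a fresh array per request so texts from earlier uploads don't pile up
    const texts = data.TextDetections.map((item) => item.DetectedText);
    res.status(200).send(texts);
  } catch (error) {
    res.status(500).send("Text detection failed");
  }
});

app.listen(3000, () => {
  console.log("server started");
});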

Explanation

We initialize Express and start the server.
We use multer to handle the multipart form data and hold the uploaded file temporarily in memory as a Buffer.
We create a POST route to receive the image from the user; its handler is an async function.
After the user uploads the image, it is available in req.file.
req.file has several properties, including buffer, which holds the image data as a Node.js Buffer (raw bytes).
We pass req.file.buffer to the Reko function. After analyzing the image, it returns a response object whose TextDetections property is an array of objects.
We send the DetectedText value from each of those objects back to the user.
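DetectText reports both whole lines and the individual words inside them, so sending every DetectedText value tends to repeat the same text. A minimal sketch of one way to keep only confident, full lines is below. The helper name extractLines, the threshold of 80, and the sample response are assumptions for illustration; the field names Type, Confidence, and DetectedText follow the DetectText response shape as I understand it.

```javascript
// Hypothetical helper: keep only full lines at or above a confidence threshold.
function extractLines(response, minConfidence = 80) {
  const detections = response?.TextDetections ?? [];
  return detections
    .filter((d) => d.Type === "LINE" && d.Confidence >= minConfidence)
    .map((d) => d.DetectedText);
}

// Made-up sample shaped like a DetectText response (not real service output).
const sampleResponse = {
  TextDetections: [
    { Id: 0, Type: "LINE", DetectedText: "Hello world", Confidence: 99.1 },
    { Id: 1, Type: "LINE", DetectedText: "blurry", Confidence: 42.0 },
    { Id: 2, ParentId: 0, Type: "WORD", DetectedText: "Hello", Confidence: 99.3 },
    { Id: 3, ParentId: 0, Type: "WORD", DetectedText: "world", Confidence: 98.9 },
  ],
};

console.log(extractLines(sampleResponse)); // ["Hello world"]
```

In the route handler, the result of extractLines(data) could be sent in place of the full list of texts. The helper also returns an empty array when the response is missing, which avoids a crash if detection fails.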

Coming back to frontend

import axios from "axios";
import { useState } from "react";
import "./App.css"; 

function App() {
  const [img, setImg] = useState(null);
  const [pending, setPending] = useState(false);
  const [texts, setTexts] = useState([]);

  const handleImg = (e) => {
    setImg(e.target.files[0]);
  };

  const handleSubmit = async (e) => {
    e.preventDefault();
    if (!img) return; 

    const formData = new FormData();
    formData.append("img", img);

    try {
      setPending(true);
      const response = await axios.post("http://localhost:3000/img", formData);
      setTexts(response.data);
    } catch (error) {
      console.log("Error uploading image:", error);
    } finally {
      setPending(false);
    }
  };

  return (
    <div className="app-container">
      <div className="form-container">
        <form onSubmit={handleSubmit}>
          <input type="file" name="img" accept="image/*" onChange={handleImg} />
          <br />
          <button type="submit" disabled={pending}>
            {pending ? "Uploading..." : "Upload Image"}
          </button>
        </form>
      </div>

      <div className="result-container">
        {pending && <h1 id="Loading">Loading...</h1>}
        {texts.length > 0 && (
          <ul>
            {texts.map((text, index) => (
              <li key={index}>{text}</li>
            ))}
          </ul>
        )}
      </div>
    </div>
  );
}

export default App;

We use Axios to post the image and store the response in the texts state. Install it in the client folder first with npm install axios.
We display the texts in a list. For now I am using the array index as the key, although this is generally discouraged because a key should be a stable identifier for each item.
I have also added a loading state and some styles.
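One possible alternative to index keys is to reuse the Id that each text detection carries, which as far as I know is unique within a single DetectText call. The sketch below assumes the backend is changed to send objects instead of plain strings. The helper name toListItems and the sample detections are hypothetical.

```javascript
// Hypothetical helper: turn detections into { id, text } items for rendering.
function toListItems(detections) {
  return detections.map((d) => ({ id: d.Id, text: d.DetectedText }));
}

// Made-up detections shaped like entries of TextDetections.
const items = toListItems([
  { Id: 0, Type: "LINE", DetectedText: "Hello world", Confidence: 99.1 },
  { Id: 1, Type: "LINE", DetectedText: "Second line", Confidence: 97.4 },
]);

console.log(items);
```

In the component, the list could then be rendered with items.map((item) => <li key={item.id}>{item.text}</li>), so each key stays tied to a detection rather than to its position in the array.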

Final Output

After clicking the "Upload Image" button, the backend processes the image and returns the detected text, which is then displayed to the user.

For the complete code, check out my GitHub repo.

Thank You!!!

Follow me on: Medium, GitHub, LinkedIn, X, Instagram
