
Text Detection in Images Using AWS Rekognition and Node.js

王林 (original) · 2024-08-26 21:35:02 · 1021 views

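Hi everyone! In this article, we'll build a simple application that detects text in images using AWS Rekognition and Node.js.

What is AWS Rekognition?

Amazon Rekognition is a service that makes it easy to add image and video analysis to your applications. It offers features such as text detection, facial recognition, and even celebrity detection.
Rekognition can analyze images or videos stored in S3, but to keep this tutorial simple we won't use S3.
We'll use Express for the backend and React for the frontend.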

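Getting Started

Before starting, you need an AWS account and an IAM user. If you already have these, you can skip this section.

Creating an IAM User

Log in to AWS: Start by logging in to your AWS root account.
Search for IAM: In the AWS console, search for IAM and select it.
Go to the Users section and click Create user.
Set a user name, then under Set permissions choose Attach policies directly.
Search for and select the Rekognition policy, then click Next and create the user.
Create access keys: After the user is created, select the user and create an access key under the Security credentials tab. Be sure to download the .csv file containing your access key and secret access key.
For more detailed instructions, see the official AWS documentation: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html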
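
If you would rather not attach a broad managed policy, a narrower custom policy is one option. The sketch below builds a policy document that allows only the rekognition:DetectText action, which is the one call this tutorial makes. The variable name detectTextPolicy is illustrative, and using a custom policy in place of the managed one is an assumption, not the setup the steps above describe.

```javascript
// Hypothetical least-privilege policy: allows only the DetectText call used in this tutorial.
// The printed JSON can be pasted into the IAM console's JSON policy editor.
const detectTextPolicy = {
  Version: "2012-10-17",
  Statement: [
    {
      Effect: "Allow",
      Action: "rekognition:DetectText",
      // DetectText does not target a named Rekognition resource, so "*" is used here.
      Resource: "*",
    },
  ],
};

console.log(JSON.stringify(detectTextPolicy, null, 2));
```

If you later call other Rekognition operations, their actions would need to be added to the Action field.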

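Configuring the AWS SDK

Install the AWS CLI: Install the AWS CLI on your system.
Verify the installation: Open a terminal or command prompt and run aws --version to make sure the CLI is installed correctly.
Configure the AWS CLI: Run aws configure and provide the access key and secret access key from the .csv file you downloaded, along with your preferred region.

Project Directory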

my-directory/
│
├── client/
│   ├── src/
│   │   └── App.jsx
│   ├── public/
│   ├── package.json
│   └── ... (other React project files)
│
└── server/
    ├── index.js
    └── rekognition/
        └── aws.rek.js

Setting Up the Frontend

npm create vite@latest . -- --template react
Run this inside the client folder; it creates the React project there.

In App.jsx

import { useState } from "react";

function App() {
  const [img, setImg] = useState(null);

  const handleImg = (e) => {
    setImg(e.target.files[0]);  // Store the selected image in state
  };

  const handleSubmit = (e) => {
    e.preventDefault();
    if (!img) return;

    const formData = new FormData();
    formData.append("img", img);
    console.log(formData.get("img")); // Log the attached file to the console
  };

  return (
    <div>
      <form onSubmit={handleSubmit}>
        <input type="file" name="img" accept="image/*" onChange={handleImg} />
        <br />
        <button type="submit">Submit</button>
      </form>
    </div>
  );
}

export default App;

Let's test it by making sure the image is logged to the console after submitting.
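
Note that logging a FormData object directly often prints what looks like an empty object, because its entries are not plain properties. A minimal sketch of a helper that lists the entries instead is shown here. describeFormData is a hypothetical name, and the sample uses a Blob as a stand-in for the file chosen in the input. It assumes an environment with global FormData and Blob (modern browsers, Node.js 18 or later).

```javascript
// Hypothetical helper: summarize each FormData entry as a readable string.
function describeFormData(formData) {
  const lines = [];
  for (const [name, value] of formData.entries()) {
    if (typeof value === "string") {
      lines.push(`${name}: "${value}"`);
    } else {
      // File entries expose name, type and size (in bytes).
      lines.push(`${name}: file "${value.name}" (${value.type}, ${value.size} bytes)`);
    }
  }
  return lines;
}

// Stand-in for e.target.files[0]: a 4-byte blob labelled as a PNG.
const fakeImage = new Blob([new Uint8Array([1, 2, 3, 4])], { type: "image/png" });
const formData = new FormData();
formData.append("img", fakeImage, "sample.png");

console.log(describeFormData(formData));
```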

Now let's move to the backend and start building the soul of this project.

Initializing the Backend

In the server folder

npm init -y
npm install express cors nodemon multer @aws-sdk/client-rekognition
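Because the server code uses ES module import syntax, also set "type": "module" in the server's package.json.
I created a separate folder for Rekognition to hold the analysis logic, and created a file inside that folder.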

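//aws.rek.js

import {
  RekognitionClient,
  DetectTextCommand,
} from "@aws-sdk/client-rekognition";

// Empty config: region and credentials come from the AWS CLI configuration
const client = new RekognitionClient({});

// Sends the image bytes to Rekognition and returns the DetectText response
export const Reko = async (params) => {
  try {
    const command = new DetectTextCommand({
      Image: {
        Bytes: params, // we are using Bytes directly instead of S3
      },
    });
    const response = await client.send(command);
    return response;
  } catch (error) {
    console.log(error.message);
    throw error; // rethrow so the caller does not receive undefined
  }
};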

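Explanation

We initialize a RekognitionClient object. Since the SDK is already configured, we can leave the braces empty.
We create an async function, Reko, to process the image. Inside it we create a DetectTextCommand object, which takes the image as bytes.
DetectTextCommand is designed specifically for text detection.
The function awaits the response and returns it.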
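
Because the image is sent as raw bytes, it can help to validate the buffer before calling Rekognition. The sketch below checks two documented constraints: images must be JPEG or PNG, and images passed as bytes can be at most 5 MB. validateImageBytes and MAX_IMAGE_BYTES are hypothetical names, and the format check only inspects the leading signature bytes of the file.

```javascript
// Rekognition accepts JPEG or PNG images; images passed as raw bytes are limited to 5 MB.
const MAX_IMAGE_BYTES = 5 * 1024 * 1024;

const JPEG_MAGIC = Buffer.from([0xff, 0xd8, 0xff]);
const PNG_MAGIC = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical helper: returns { ok: true } or { ok: false, reason }.
function validateImageBytes(buffer) {
  if (!Buffer.isBuffer(buffer) || buffer.length === 0) {
    return { ok: false, reason: "empty or missing image" };
  }
  if (buffer.length > MAX_IMAGE_BYTES) {
    return { ok: false, reason: "image is larger than 5 MB" };
  }
  // Compare the first bytes of the upload with a known file signature.
  const startsWith = (magic) => buffer.subarray(0, magic.length).equals(magic);
  if (!startsWith(JPEG_MAGIC) && !startsWith(PNG_MAGIC)) {
    return { ok: false, reason: "image is not JPEG or PNG" };
  }
  return { ok: true };
}

// Example: a buffer that starts with the PNG signature passes; plain text does not.
console.log(validateImageBytes(Buffer.concat([PNG_MAGIC, Buffer.alloc(16)])));
console.log(validateImageBytes(Buffer.from("hello")));
```

In an Express route, a check like this could run on req.file.buffer before the call to Reko, with a 400 response when ok is false.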

Creating the API

In the server folder, create a file named index.js, or any name you like.

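//index.js

import express from "express"
import multer from "multer"
import cors from "cors"
import { Reko } from "./rekognition/aws.rek.js";

const app = express()
app.use(cors())
// Keep uploads in memory so the image bytes are available as req.file.buffer
const storage = multer.memoryStorage()
const upload = multer({ storage })

app.post("/img", upload.single("img"), async (req, res) => {
    if (!req.file) {
        return res.status(400).send("No image uploaded")
    }
    try {
        const data = await Reko(req.file.buffer)
        // Build the list per request so results from earlier uploads don't accumulate
        const texts = (data?.TextDetections ?? []).map((item) => item.DetectedText)
        res.status(200).send(texts)
    } catch (error) {
        console.log(error.message)
        res.status(500).send("Text detection failed")
    }
})

app.listen(3000, () => {
    console.log("server started");
})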

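Explanation

We initialize Express and start the server.
We use multer to handle the multipart form data and hold it temporarily in an in-memory buffer.
We create a POST route that receives the image from the user. Its handler is an async function.
Once the user uploads an image, it is available in req.file.
req.file has several properties, including buffer, which holds the image data as a Buffer of 8-bit bytes.
We need those bytes, so we pass req.file.buffer to the Rekognition function. After analysis, the function returns a response whose TextDetections property is an array of objects.
We send the text from those objects back to the user.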
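
One detail worth knowing: DetectText reports each piece of text twice, once as part of a LINE entry and again as individual WORD entries, so a list built from every detection contains duplicates. A minimal sketch of a filter that keeps only lines above a confidence threshold follows. extractLines and the default threshold of 80 are assumptions for illustration, and sampleResponse is made-up data shaped like a DetectText response, not real service output.

```javascript
// Hypothetical helper: keep only LINE detections at or above a confidence threshold.
function extractLines(response, minConfidence = 80) {
  const detections = response?.TextDetections ?? [];
  return detections
    .filter((item) => item.Type === "LINE" && item.Confidence >= minConfidence)
    .map((item) => item.DetectedText);
}

// Made-up sample shaped like a DetectText response (not real service output).
const sampleResponse = {
  TextDetections: [
    { DetectedText: "Hello world", Type: "LINE", Id: 0, Confidence: 99.1 },
    { DetectedText: "blurry", Type: "LINE", Id: 1, Confidence: 42.0 },
    { DetectedText: "Hello", Type: "WORD", Id: 2, ParentId: 0, Confidence: 99.3 },
    { DetectedText: "world", Type: "WORD", Id: 3, ParentId: 0, Confidence: 98.9 },
  ],
};

console.log(extractLines(sampleResponse));
```

Whether to return lines or words depends on the use case; lines read more naturally in a list, while words are handier for search or matching.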

Back to the Frontend

import axios from "axios";
import { useState } from "react";
import "./App.css"; 

function App() {
  const [img, setImg] = useState(null);
  const [pending, setPending] = useState(false);
  const [texts, setTexts] = useState([]);

  const handleImg = (e) => {
    setImg(e.target.files[0]);
  };

  const handleSubmit = async (e) => {
    e.preventDefault();
    if (!img) return; 

    const formData = new FormData();
    formData.append("img", img);

    try {
      setPending(true);
      const response = await axios.post("http://localhost:3000/img", formData);
      setTexts(response.data);
    } catch (error) {
      console.log("Error uploading image:", error);
    } finally {
      setPending(false);
    }
  };

  return (
    <div className="app-container">
      <div className="form-container">
        <form onSubmit={handleSubmit}>
          <input type="file" name="img" accept="image/*" onChange={handleImg} />
          <br />
          <button type="submit" disabled={pending}>
            {pending ? "Uploading..." : "Upload Image"}
          </button>
        </form>
      </div>

      <div className="result-container">
        {pending && <h1>Loading...</h1>}
        {texts.length > 0 && (
          <ul>
            {texts.map((text, index) => (
              <li key={index}>{text}</li>
            ))}
          </ul>
        )}
      </div>
    </div>
  );
}

export default App;

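We use Axios to post the image (install it in the client folder with npm install axios) and store the response in the texts state.
We display the texts. For now I'm using the index as the key, although using the index as a key is discouraged.
I've also added a few extras, such as a loading state and some styling.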
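
One way to avoid index keys is to have the server send each detection's Id along with its text. Rekognition assigns an Id to every detection, unique within a single response. The sketch below shows the mapping; toKeyedItems is a hypothetical helper and the sample data is made up. The server in this article sends plain strings, so adopting this would mean changing both the route and the list rendering.

```javascript
// Hypothetical helper: pair each detected text with the Id Rekognition assigned to it.
function toKeyedItems(detections) {
  return detections.map((item) => ({ id: item.Id, text: item.DetectedText }));
}

// Made-up detections shaped like entries of TextDetections.
const items = toKeyedItems([
  { DetectedText: "Hello world", Type: "LINE", Id: 0 },
  { DetectedText: "Hello", Type: "WORD", Id: 1, ParentId: 0 },
]);

console.log(items);
// In the component, the list could then be rendered as:
//   items.map((item) => <li key={item.id}>{item.text}</li>)
```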

Final Output


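After clicking the "Upload Image" button, the backend processes the image and returns the detected text, which is then displayed to the user.

For the complete code, check out my GitHub Repo.

Thank you!!!

Follow me: Medium, GitHub, LinkedIn, X, Instagram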
