
Beginner Python Project: Build an Augmented Reality Drawing App Using OpenCV and Mediapipe

Linda Hamilton, original: 2025-01-02 14:47:38


In this Python project, we'll build a simple AR drawing application. Using your webcam and hand gestures, you can draw virtually on the screen, customize your brush, and even save your creations!

Installation

To get started, create a new folder and initialize a new virtual environment (the activation command shown is for Windows; on macOS or Linux, use source venv/bin/activate instead):

python -m venv venv

./venv/Scripts/activate

Next, install the required libraries using pip or the installer of your choice:

pip install mediapipe

pip install opencv-python

Note

You may have trouble installing Mediapipe on the latest version of Python. At the time of writing, I am using Python 3.11.2. Make sure you use a Python version that Mediapipe supports.
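If the Mediapipe install fails, it can help to confirm which interpreter the virtual environment is actually running. The sketch below compares the running version against 3.11, the version the author reports using. The helper name matches_tested_version and the choice to compare only major.minor are assumptions made for illustration; other Python versions may well work too.

```python
import sys

# Version the article's author reports using (3.11.2); only major.minor is compared.
TESTED_VERSION = (3, 11)


def matches_tested_version(version_info=None, tested=TESTED_VERSION):
    """Return True if the interpreter's major.minor equals the tested version."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(tested)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if matches_tested_version():
        print(f"Python {current}: same minor version the article was written with.")
    else:
        print(f"Python {current}: differs from 3.11; if pip cannot install mediapipe, "
              "try a virtual environment built with a different Python version.")
```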

Step 1: Capture the Webcam Feed

The first step is to set up your webcam and display the video feed. We'll use OpenCV's VideoCapture to access the camera and display frames continuously.

import cv2  

# The argument '0' specifies the default camera (usually the built-in webcam).
cap = cv2.VideoCapture(0)

# Start an infinite loop to continuously capture video frames from the webcam
while True:
    # Read a single frame from the webcam
    # `ret` is a boolean indicating success; `frame` is the captured frame.
    ret, frame = cap.read()

    # Check if the frame was successfully captured
    # If not, break the loop and stop the video capture process.
    if not ret:
        break

    # Flip the frame horizontally (like a mirror image)
    frame = cv2.flip(frame, 1)

    # Display the current frame in a window named 'Webcam Feed'
    cv2.imshow('Webcam Feed', frame)

    # Wait for a key press for 1 millisecond
    # If the 'q' key is pressed, break the loop to stop the video feed.
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the webcam resource to make it available for other programs
cap.release()

# Close all OpenCV-created windows
cv2.destroyAllWindows()

Did you know?

When using cv2.waitKey() in OpenCV, the returned key code may include extra bits depending on the platform. To detect key presses reliably, mask the result with 0xFF to isolate the lower 8 bits (the actual ASCII value). Without this, your key comparisons may fail on some systems, so always use & 0xFF for consistent behavior!
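The effect of the mask can be seen without a camera. In the sketch below, key_matches is a hypothetical helper, and the raw value with an extra high bit is an illustrative number rather than one captured from a real platform.

```python
def key_matches(raw_code, char):
    """Return True if the low 8 bits of a cv2.waitKey() result equal `char`."""
    if raw_code == -1:  # cv2.waitKey() returns -1 when no key was pressed
        return False
    return (raw_code & 0xFF) == ord(char)


# Illustrative value only: 'q' (0x71) with an extra high bit set.
raw_with_extra_bits = 0x100000 | ord('q')

print(raw_with_extra_bits == ord('q'))        # False
print(key_matches(raw_with_extra_bits, 'q'))  # True
print(key_matches(-1, 'q'))                   # False
```

In Python, & binds more tightly than ==, so the expression cv2.waitKey(1) & 0xFF == ord('q') used in the tutorial needs no extra parentheses.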

Step 2: Integrate Hand Detection

Using Mediapipe's Hands solution, we'll detect the hand and extract the positions of key landmarks such as the tips of the index and middle fingers.

import cv2  
import mediapipe as mp

# Initialize the MediaPipe Hands module
mp_hands = mp.solutions.hands  # Load the hand-tracking solution from MediaPipe
hands = mp_hands.Hands(
    min_detection_confidence=0.9,
    min_tracking_confidence=0.9 
)

cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break 

    # Flip the frame horizontally to create a mirror effect
    frame = cv2.flip(frame, 1)

    # Convert the frame from BGR (OpenCV default) to RGB (MediaPipe requirement)
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Process the RGB frame to detect and track hands
    result = hands.process(frame_rgb)

    # If hands are detected in the frame
    if result.multi_hand_landmarks:
        # Iterate through all detected hands
        for hand_landmarks in result.multi_hand_landmarks:
            # Get the frame dimensions (height and width)
            h, w, _ = frame.shape

            # Calculate the pixel coordinates of the tip of the index finger
            cx, cy = int(hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP].x * w), \
                     int(hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP].y * h)

            # Calculate the pixel coordinates of the tip of the middle finger
            mx, my = int(hand_landmarks.landmark[mp_hands.HandLandmark.MIDDLE_FINGER_TIP].x * w), \
                     int(hand_landmarks.landmark[mp_hands.HandLandmark.MIDDLE_FINGER_TIP].y * h)

            # Draw a circle at the index finger tip on the original frame
            cv2.circle(frame, (cx, cy), 10, (0, 255, 0), -1)  # Green circle with radius 10

    # Display the processed frame in a window named 'Webcam Feed'
    cv2.imshow('Webcam Feed', frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break  # Exit the loop if 'q' is pressed

# Release the webcam resources for other programs
cap.release()
cv2.destroyAllWindows()
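The code above repeats the same conversion from normalized landmark coordinates to pixels for each fingertip. A minimal sketch of a helper that factors it out is shown below. The name to_pixel is hypothetical, and clamping to the frame is an added precaution for landmarks estimated near or past the image edge, not something the original code does.

```python
def to_pixel(x_norm, y_norm, width, height):
    """Convert normalized (x, y) coordinates to integer pixel coordinates.

    The result is clamped to the frame so it is always a valid pixel position.
    """
    px = min(max(int(x_norm * width), 0), width - 1)
    py = min(max(int(y_norm * height), 0), height - 1)
    return px, py


# Example with a 640x480 frame
print(to_pixel(0.5, 0.25, 640, 480))   # (320, 120)
print(to_pixel(1.2, -0.1, 640, 480))   # (639, 0)

# Inside the tutorial's loop this could be used as (assuming the same names):
#   tip = hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP]
#   cx, cy = to_pixel(tip.x, tip.y, w, h)
```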

Step 3: Track Finger Position and Draw

We'll track the index finger and allow drawing only when the tips of the index and middle fingers are farther apart than a threshold distance.

We'll maintain a list of index fingertip coordinates to draw on the original frame. Whenever the middle finger comes close enough, drawing pauses, and when it resumes we append None to the list to mark a break between strokes.

import cv2  
import mediapipe as mp  
import math  

# Initialize the MediaPipe Hands module
mp_hands = mp.solutions.hands
hands = mp_hands.Hands(
    min_detection_confidence=0.9,  
    min_tracking_confidence=0.9   
)

# Variables to store drawing points and reset state
draw_points = []  # A list to store points where lines should be drawn
reset_drawing = False  # Flag to indicate when the drawing should reset

# Brush settings
brush_color = (0, 0, 255)  
brush_size = 5 


cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()  
    if not ret:
        break 

    frame = cv2.flip(frame, 1) 
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) 
    result = hands.process(frame_rgb)  

    # If hands are detected
    if result.multi_hand_landmarks:
        for hand_landmarks in result.multi_hand_landmarks:
            h, w, _ = frame.shape  # Get the frame dimensions (height and width)

            # Get the coordinates of the index finger tip
            cx, cy = int(hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP].x * w), \
                     int(hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP].y * h)

            # Get the coordinates of the middle finger tip
            mx, my = int(hand_landmarks.landmark[mp_hands.HandLandmark.MIDDLE_FINGER_TIP].x * w), \
                     int(hand_landmarks.landmark[mp_hands.HandLandmark.MIDDLE_FINGER_TIP].y * h)

            # Calculate the distance between the index and middle finger tips
            distance = math.sqrt((mx - cx) ** 2 + (my - cy) ** 2)

            # Threshold distance to determine if the fingers are close (used to reset drawing)
            threshold = 40 

            # If the fingers are far apart
            if distance > threshold:
                if reset_drawing:  # Check if the drawing was previously reset
                    draw_points.append(None)  # None means no line
                    reset_drawing = False  
                draw_points.append((cx, cy))  # Add the current point to the list for drawing
            else:  # If the fingers are close together, set the flag to reset drawing
                reset_drawing = True

    # Draw the lines between points in the `draw_points` list
    for i in range(1, len(draw_points)):
        if draw_points[i - 1] and draw_points[i]:  # Only draw if both points are valid
            cv2.line(frame, draw_points[i - 1], draw_points[i], brush_color, brush_size)


    cv2.imshow('Webcam Feed', frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the webcam and close all OpenCV windows
cap.release()
cv2.destroyAllWindows()
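Because None marks a break in draw_points, the flat list can also be viewed as a sequence of separate strokes. The sketch below shows one way to split it. The function name split_strokes is hypothetical, and the sample coordinates are made up for illustration.

```python
def split_strokes(points):
    """Split a flat list of points into strokes, using None as the separator."""
    strokes = []
    current = []
    for point in points:
        if point is None:
            # A break marker closes the current stroke (if it has any points)
            if current:
                strokes.append(current)
            current = []
        else:
            current.append(point)
    if current:
        strokes.append(current)
    return strokes


draw_points = [(10, 10), (20, 15), None, (100, 50), (110, 60), (120, 80)]
print(split_strokes(draw_points))
# [[(10, 10), (20, 15)], [(100, 50), (110, 60), (120, 80)]]
```

Working with whole strokes can make later features simpler, for example undoing the last stroke by dropping the final sub-list.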

Step 4: Enhancements

Use OpenCV's rectangle() and putText() to draw buttons for toggling brush size and color.
Add an option to save the frame.
Add an eraser tool that uses the new coordinates to modify the draw_points list.
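As a starting point, the enhancements listed above could be sketched as follows. Everything here is an assumption for illustration: the button layout, the helper names (button_at, erase_near, draw_buttons, save_frame), and the default file name drawing.png are not from the original article. Only cv2.rectangle, cv2.putText, and cv2.imwrite are actual OpenCV calls.

```python
import math

# Hypothetical button layout: label -> (x1, y1, x2, y2) in pixels.
BUTTONS = {
    "Color": (10, 10, 110, 50),
    "Size": (120, 10, 220, 50),
    "Save": (230, 10, 330, 50),
}


def button_at(point, buttons=BUTTONS):
    """Return the label of the button containing `point`, or None."""
    x, y = point
    for label, (x1, y1, x2, y2) in buttons.items():
        if x1 <= x <= x2 and y1 <= y <= y2:
            return label
    return None


def erase_near(points, center, radius):
    """Return a copy of `points` where points within `radius` of `center` become None.

    Replacing erased points with None reuses the tutorial's break marker,
    so the remaining line segments are not joined across the erased gap.
    """
    cx, cy = center
    return [
        None if p is not None and math.hypot(p[0] - cx, p[1] - cy) <= radius else p
        for p in points
    ]


def draw_buttons(frame, buttons=BUTTONS):
    """Draw each button as a filled rectangle with a text label (needs OpenCV)."""
    import cv2  # imported here so the pure helpers above work without OpenCV

    for label, (x1, y1, x2, y2) in buttons.items():
        cv2.rectangle(frame, (x1, y1), (x2, y2), (50, 50, 50), -1)
        cv2.putText(frame, label, (x1 + 10, y2 - 12),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 255, 255), 2)


def save_frame(frame, path="drawing.png"):
    """Write the frame to disk; returns True on success (needs OpenCV)."""
    import cv2

    return cv2.imwrite(path, frame)


print(button_at((130, 30)))  # Size
print(erase_near([(0, 0), (5, 5), (50, 50)], (4, 4), 10))
# [None, None, (50, 50)]
```

In the main loop, one option is to call button_at((cx, cy)) with the index fingertip position and react to the returned label, and to call erase_near on draw_points while an eraser mode is active.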
