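How to build a face and gender recognition Python project using deep learning and VGG16.
Deep learning is a subcategory of machine learning based on neural networks with three or more layers. These networks attempt to simulate the behavior of the human brain by learning from large amounts of data. A neural network with a single layer can still make approximate predictions, but additional hidden layers help optimize and refine accuracy.
Deep learning improves automation by performing tasks without human intervention. It is found in digital assistants, voice-enabled TV remotes, credit card fraud detection, and self-driving cars.
See the full code on GitHub: https://github.com/alexiacismaru/face-recognision
Download the VGG16 Face Dataset and the Haar Cascade XML file used for face detection, which serves as a preprocessing step in the face recognition task.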
    import os
    import tarfile
    from urllib import request

    import cv2
    import numpy as np
    from google.colab.patches import cv2_imshow  # assumption: the code runs in Google Colab

    base_path = "."  # assumption: the original does not show where base_path points

    # download the VGG Face dataset
    vgg_face_dataset_url = "http://www.robots.ox.ac.uk/~vgg/data/vgg_face/vgg_face_dataset.tar.gz"
    with request.urlopen(vgg_face_dataset_url) as r, open(os.path.join(base_path, "vgg_face_dataset.tar.gz"), 'wb') as f:
        f.write(r.read())

    # extract VGG dataset
    with tarfile.open(os.path.join(base_path, "vgg_face_dataset.tar.gz")) as f:
        f.extractall(os.path.join(base_path))

    # download Haar Cascade for face detection
    trained_haarcascade_url = "https://raw.githubusercontent.com/opencv/opencv/master/data/haarcascades/haarcascade_frontalface_default.xml"
    with request.urlopen(trained_haarcascade_url) as r, open(os.path.join(base_path, "haarcascade_frontalface_default.xml"), 'wb') as f:
        f.write(r.read())

    # haar cascade detects faces in images; load it only after the XML file has been downloaded
    faceCascade = cv2.CascadeClassifier(os.path.join(base_path, "haarcascade_frontalface_default.xml"))
Selectively load and process a specific number of images for a set of predefined subjects from the VGG Face dataset.
    # populate the list with the files of the celebrities that will be used for face recognition
    selected_subjects = ("Jesse_Eisenberg", "Sarah_Hyland", "Michael_Cera", "Mila_Kunis")
    all_subjects = [
        subject
        for subject in sorted(os.listdir(os.path.join(base_path, "vgg_face_dataset", "files")))
        # the .txt check applies to every selected name, not only the last one
        if subject.startswith(selected_subjects) and subject.endswith(".txt")
    ]

    # define number of subjects and how many pictures to extract
    nb_subjects = 4
    nb_images_per_subject = 40
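Iterate over each subject's file by opening the text file associated with the subject and reading its contents. Each line in these files contains the URL of an image. For each URL, the code tries to load the image with urllib and convert it into a NumPy array.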
    images = []
    for subject in all_subjects[:nb_subjects]:
        with open(os.path.join(base_path, "vgg_face_dataset", "files", subject), 'r') as f:
            lines = f.readlines()

        images_ = []
        for line in lines:
            url = line[line.find("http://"): line.find(".jpg") + 4]

            try:
                res = request.urlopen(url)
                img = np.asarray(bytearray(res.read()), dtype="uint8")
                # convert the image data into a format suitable for OpenCV
                # images are colored
                img = cv2.imdecode(img, cv2.IMREAD_COLOR)
                h, w = img.shape[:2]
                images_.append(img)
                cv2_imshow(cv2.resize(img, (w // 5, h // 5)))
            except Exception:
                # skip dead links and files that cannot be decoded
                pass

            # check if the required number of images has been reached
            if len(images_) == nb_images_per_subject:
                # add the list of images to the main images list and move to the next subject
                images.append(images_)
                break
    # create arrays for all 4 celebrities
    jesse_images = []
    michael_images = []
    mila_images = []
    sarah_images = []

    faceCascade = cv2.CascadeClassifier(os.path.join(base_path, "haarcascade_frontalface_default.xml"))

    # iterate over the subjects
    for subject, images_ in zip(all_subjects, images):
        for img in images_:
            img_ = img.copy()
            # create a grayscale copy to simplify the image and reduce computation
            img_gray = cv2.cvtColor(img_, cv2.COLOR_BGR2GRAY)
            faces = faceCascade.detectMultiScale(
                img_gray,
                scaleFactor=1.2,
                minNeighbors=5,
                minSize=(30, 30),
                flags=cv2.CASCADE_SCALE_IMAGE
            )
            print("Found {} face(s)!".format(len(faces)))

            # draw a green rectangle around every detected face
            for (x, y, w, h) in faces:
                cv2.rectangle(img_, (x, y), (x+w, y+h), (0, 255, 0), 10)

            resized_img = cv2.resize(img_, (224, 224))
            cv2_imshow(resized_img)

            if "Jesse_Eisenberg" in subject:
                jesse_images.append(resized_img)
            elif "Michael_Cera" in subject:
                michael_images.append(resized_img)
            elif "Mila_Kunis" in subject:
                mila_images.append(resized_img)
            elif "Sarah_Hyland" in subject:
                sarah_images.append(resized_img)
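The detectMultiScale method identifies faces in an image and returns the coordinates of the rectangles where it believes the faces are located. For each face in the image, a rectangle is drawn around it to indicate its position. Each image is then resized to 224x224 pixels.
Split the dataset into training and validation sets.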
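As a minimal sketch of such a split, assuming every image has already been paired with a label, the snippet below shuffles the pairs with a fixed seed and holds out a fraction for validation. The function name `split_dataset`, the 80/20 ratio, the label encoding (1 = male, 0 = female), and the placeholder strings standing in for images are all illustrative assumptions, not the author's code.

```python
import random


def split_dataset(samples, labels, val_fraction=0.2, seed=42):
    """Shuffle (sample, label) pairs and split them into train and validation lists."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_val = int(round(len(indices) * val_fraction))
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    train = [(samples[i], labels[i]) for i in train_idx]
    val = [(samples[i], labels[i]) for i in val_idx]
    return train, val


# Stand-in data: 4 subjects x 40 images, represented here by placeholder strings.
samples = [f"image_{i}" for i in range(160)]
labels = [1] * 80 + [0] * 80  # hypothetical encoding: 1 = male, 0 = female

train_set, val_set = split_dataset(samples, labels)
print(len(train_set), len(val_set))
```

Shuffling before splitting matters here because the images are collected subject by subject; without it, the validation set could end up containing a single person.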
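The accuracy of a deep learning model depends on the quality, quantity, and contextual meaning of the training data. Collecting such data is one of the most common challenges in building deep learning models, and it can be costly and time-consuming. Companies use data augmentation to reduce their dependence on training examples and build high-accuracy models quickly.
Data augmentation means artificially increasing the amount of data by generating new data points from existing data. This includes adding minor alterations to the data or using machine learning models to generate new data points in the latent space of the original data to amplify the dataset.
Synthetic data is artificially generated data that does not use real-world images; it is often produced by generative adversarial networks (GANs).
Augmented data is derived from original images with some minor geometric transformation (such as flipping, translation, or rotation) or with added noise, in order to increase the diversity of the training set.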
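Data augmentation improves the performance of machine learning models through more diverse datasets and reduces the operating costs associated with data collection.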
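The small transformations mentioned above (flipping, translation, added noise) can be sketched with plain NumPy. This is only an illustration under stated assumptions: the function name `augment`, the shift range of 10 pixels, and the noise level are hypothetical choices, and the translation uses `np.roll`, which wraps pixels around the border instead of padding them.

```python
import numpy as np


def augment(image, rng):
    """Return a randomly flipped, shifted, and noised copy of an HxWx3 uint8 image."""
    out = image.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1, :]  # horizontal flip
    shift = int(rng.integers(-10, 11))  # horizontal translation in pixels
    out = np.roll(out, shift, axis=1)  # simplification: pixels wrap around the edge
    noise = rng.normal(0.0, 5.0, size=out.shape)  # mild Gaussian noise
    out = np.clip(out.astype(np.float32) + noise, 0, 255).astype(np.uint8)
    return out


rng = np.random.default_rng(0)
# Random pixels stand in for a 224x224 face image.
original = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
augmented = [augment(original, rng) for _ in range(5)]
print(len(augmented), augmented[0].shape, augmented[0].dtype)
```

Each call produces a slightly different variant of the same picture, which is how a set of 160 images can be stretched into a more varied training set.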
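VGG16 is a convolutional neural network widely used for image recognition. It is considered one of the best computer vision model architectures. It consists of 16 layers of artificial neurons that process the image incrementally to improve accuracy. In VGG16, "VGG" refers to the Visual Geometry Group at the University of Oxford, and "16" refers to the network's 16 weighted layers.
VGG16 is used for image recognition and for classifying new images. The pre-trained version of the network was trained on more than one million images from the ImageNet visual database. It can determine whether an image contains certain objects, animals, plants, and so on.
The network has 13 convolutional layers, 5 max pooling layers, and 3 dense layers. That makes 21 layers in total, but only 16 of them carry weights, that is, 16 layers with learnable parameters. VGG16 takes an input tensor of size 224x224 with 3 RGB channels. The model relies on convolutional layers with 3x3 filters and stride 1, always with same padding, and on max pooling layers with 2x2 filters and stride 2.
The Conv-1 block has 64 filters, Conv-2 has 128, Conv-3 has 256, and Conv-4 and Conv-5 have 512 each. Three fully connected layers follow: the first two have 4096 channels each, and the third performs 1000-way ILSVRC classification and therefore has 1000 channels (one per class). The final layer is a softmax layer.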
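The layer counts above can be checked with a little arithmetic. The sketch below walks through the five convolutional blocks and three dense layers, tracks how pooling shrinks the feature map, and adds up the learnable parameters. The split of the 13 convolutional layers into blocks of 2, 2, 3, 3, and 3 follows the standard VGG16 configuration; the names `CONV_BLOCKS`, `DENSE_UNITS`, and `vgg16_summary` are made up for this example.

```python
# (number of conv layers, output filters) for the five convolutional blocks
CONV_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
DENSE_UNITS = [4096, 4096, 1000]


def vgg16_summary(input_size=224, input_channels=3):
    """Count layers and learnable parameters of the standard VGG16 layout."""
    conv_layers = 0
    params = 0
    channels = input_channels
    size = input_size
    for n_layers, filters in CONV_BLOCKS:
        for _ in range(n_layers):
            # 3x3 kernels over all input channels, plus one bias per filter
            params += 3 * 3 * channels * filters + filters
            channels = filters
            conv_layers += 1
        size //= 2  # 2x2 max pooling with stride 2 halves height and width
    final_feature_map = (size, size, channels)
    features = size * size * channels  # flattened input to the first dense layer
    for units in DENSE_UNITS:
        params += features * units + units
        features = units
    return {
        "conv_layers": conv_layers,
        "pool_layers": len(CONV_BLOCKS),
        "dense_layers": len(DENSE_UNITS),
        "weight_layers": conv_layers + len(DENSE_UNITS),
        "final_feature_map": final_feature_map,
        "parameters": params,
    }


summary = vgg16_summary()
for key, value in summary.items():
    print(f"{key}: {value}")
```

The result shows 13 + 3 = 16 weight layers, a 7x7x512 feature map after the last pooling layer, and roughly 138 million parameters, most of which sit in the first dense layer.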
Start by preparing the base model.
To make sure the model classifies the images correctly, we need to extend it with additional layers.
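The GlobalAveragePooling2D layer condenses the feature maps produced by VGG16 into a single 1D vector by averaging each map down to one value. This simplifies the output and reduces the total number of parameters, which helps prevent overfitting.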
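To make the pooling step concrete, here is a small NumPy sketch of what global average pooling computes. It assumes a channels-last batch of shape (batch, height, width, channels) and uses 7x7x512, the size of VGG16's last feature map for a 224x224 input; the function name is hypothetical and random numbers stand in for real features.

```python
import numpy as np


def global_average_pooling_2d(feature_maps):
    """Average over height and width: (batch, H, W, C) -> (batch, C)."""
    return feature_maps.mean(axis=(1, 2))


# Random values stand in for the output of the convolutional base.
batch = np.random.default_rng(0).random((2, 7, 7, 512))
pooled = global_average_pooling_2d(batch)
print(batch.shape, "->", pooled.shape)
```

Flattening the same feature map would give 7 * 7 * 512 = 25088 values per image, so averaging cuts the input to the next dense layer by a factor of 49.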
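The dense layers are a sequence of fully connected layers added on top. Each contains a specified number of units (1024, 512, and 256), chosen based on common practice and experimentation. These layers further process the features extracted by VGG16.
The final dense layer (the output layer) uses a sigmoid activation, which suits binary classification (our two classes are "female" and "male").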
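The shape of this classification head can be illustrated with a NumPy forward pass. This is not training code and not the author's Keras model: the weights are random, the ReLU activation in the hidden layers is an assumption (the text does not name it), and the 512-dimensional input assumes the pooled VGG16 features. The helper names `init_head` and `head_forward` are hypothetical.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_head(input_dim, hidden_units=(1024, 512, 256), seed=0):
    """Create random (weights, bias) pairs for the hidden layers plus one output unit."""
    rng = np.random.default_rng(seed)
    dims = [input_dim, *hidden_units, 1]
    return [
        (rng.normal(0.0, 0.01, size=(d_in, d_out)), np.zeros(d_out))
        for d_in, d_out in zip(dims[:-1], dims[1:])
    ]


def head_forward(features, layers):
    """Run pooled features through the dense layers and the sigmoid output."""
    x = features
    for weights, bias in layers[:-1]:
        x = relu(x @ weights + bias)  # assumed hidden activation
    weights, bias = layers[-1]
    return sigmoid(x @ weights + bias)  # one probability per image


layers = init_head(512)
features = np.random.default_rng(1).random((4, 512))  # 4 pooled feature vectors
probs = head_forward(features, layers)
print([w.shape for w, _ in layers], probs.shape)
```

A single sigmoid unit is enough for two classes because the output can be read as the probability of one class, with the other class getting the remainder.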
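The Adam optimization algorithm is an extension of stochastic gradient descent that iteratively updates network weights based on training data. It is effective for large problems involving a lot of data or parameters, requires relatively little memory, and is computationally efficient.
The algorithm combines two gradient descent methods: momentum and root mean square propagation (RMSProp).
Momentum accelerates gradient descent by using an exponentially weighted average of the gradients.
RMSProp is an adaptive learning rate algorithm that tries to improve on AdaGrad by taking an exponential moving average of the squared gradients.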
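Here $m_t$ is the exponentially weighted average of the gradients (from momentum) and $v_t$ is the exponentially weighted average of the squared gradients (from RMSProp). Since $m_t$ and $v_t$ are both initialized to 0, they tend to be biased toward 0 because $\beta_1$ and $\beta_2$ are both close to 1. The optimizer fixes this by computing bias-corrected versions of $m_t$ and $v_t$. This also keeps the weights under control near the global minimum and prevents large oscillations close to it. The formulas used are:
$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}$$
Intuitively, we adapt the gradient descent after every iteration so that it remains controlled and unbiased throughout the process, hence the name Adam (adaptive moment estimation).
Now, instead of the normal weight parameters $m_t$ and $v_t$, we take the bias-corrected ones, $\hat{m}_t$ and $\hat{v}_t$. Putting them into the general update equation, with learning rate $\alpha$ and a small constant $\epsilon$ for numerical stability, we get:
$$w_{t+1} = w_t - \alpha \, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}$$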
Source: GeeksforGeeks, https://www.geeksforgeeks.org/adam-optimizer/
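The bias-corrected update described above fits in a few lines of plain Python. The sketch below applies it to a one-dimensional toy problem, minimizing (w - 3)^2, and is meant only as an illustration: the function name `adam_minimize`, the learning rate of 0.1, and the step count are assumptions chosen for the toy problem, while the beta and epsilon values are the commonly used defaults.

```python
import math


def adam_minimize(grad_fn, w0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimize a scalar function with Adam, given a function returning its gradient."""
    w, m, v = w0, 0.0, 0.0  # m and v start at 0, which is why bias correction is needed
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g  # momentum: average of gradients
        v = beta2 * v + (1 - beta2) * g * g  # RMSProp: average of squared gradients
        m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
        v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Toy objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_final = adam_minimize(lambda w: 2 * (w - 3.0), w0=0.0)
print(w_final)
```

In a deep learning framework the same update is applied to every weight tensor at once, and the optimizer is usually selected by name when the model is compiled.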
Set up image data preprocessing, augmentation, and model training in a deep learning environment.
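The model's performance is evaluated by making predictions on the validation set, which shows how well the model handles unseen data. A threshold is applied to these predictions to assign each image to one of the two classes ("male" or "female").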
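Thresholding the sigmoid outputs is a one-line operation. The sketch below assumes a cutoff of 0.5 and assumes that the positive class (output close to 1) is "male"; both are illustrative choices, since the mapping depends on how the labels were encoded during training.

```python
def classify(probabilities, threshold=0.5, positive="male", negative="female"):
    """Map each predicted probability to a class name using a fixed threshold."""
    return [positive if p >= threshold else negative for p in probabilities]


# Hypothetical model outputs for four validation images.
predicted = classify([0.91, 0.12, 0.50, 0.49])
print(predicted)
```

Moving the threshold away from 0.5 trades one kind of error for the other, which is the trade-off the ROC curve makes visible.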
Create a confusion matrix to visualize accuracy.
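For two classes the confusion matrix is a 2x2 table of counts. A minimal sketch in plain Python follows, assuming integer labels 0 and 1; the example labels are made up for illustration and are not results from the model.

```python
def confusion_matrix(y_true, y_pred):
    """Rows are the actual class, columns the predicted class, for labels 0 and 1."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


# Made-up labels for eight validation images.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)
accuracy = (cm[0][0] + cm[1][1]) / len(y_true)  # correct predictions sit on the diagonal
print(cm, accuracy)
```

The off-diagonal cells show which class the model confuses with which, something a single accuracy number hides.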
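For binary classification, the receiver operating characteristic (ROC) curve and the area under the curve (AUC) are useful for understanding the trade-off between the true positive rate and the false positive rate.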
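The ROC curve is obtained by sweeping the decision threshold from high to low and recording the two rates at each step. The sketch below does this in plain Python and integrates the curve with the trapezoidal rule. It assumes both classes are present in `y_true`; the function names and the example scores are illustrative, not the model's output.

```python
def roc_curve(y_true, scores):
    """Return (fpr, tpr) lists obtained by lowering the threshold over the scores."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # samples tied at the same score cross the threshold together
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / negatives)
        tpr.append(tp / positives)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve using the trapezoidal rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
        for k in range(1, len(fpr))
    )


# Made-up labels and predicted probabilities for six images.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

fpr, tpr = roc_curve(y_true, scores)
area = auc(fpr, tpr)
print(fpr, tpr, area)
```

An AUC of 1.0 means every positive example is scored above every negative one, while 0.5 corresponds to random guessing.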
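In conclusion, by using deep learning and image processing algorithms, you can build a Python project that recognizes human faces and classifies them as male or female.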