Face Recognition System based on FaceNet

Reference

Project Background

  1. Project Foundation
  2. Experiment conditions: computational resources (NVIDIA GPUs)
    • DELL Laptop: GeForce GTX 1050 (4G) x1
    • Windows Server: GeForce RTX 2080 Ti (11G) x2

Face Recognition Function Process

Install ffmpeg

Download ffmpeg from https://www.gyan.dev/ffmpeg/builds/ffmpeg-git-essentials.7z

Crop from video to images

src/crop.py

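1. Prepare a folder kol_video with one subfolder per KOL, named after that KOL, and put the KOL's videos in it.
2. Prepare another folder kol_crop. The extracted frames are saved into subfolders named after each KOL.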
$python src/crop.py C:/Users/PC/Desktop/kol_video C:/Users/PC/Desktop/kol_crop
import os
import sys
import time
import argparse
import subprocess

allowed_set = set(['avi', 'mp4', 'mkv', 'flv', 'wmv', 'mov']) 

def allowed_file(filename, allowed_set):
    check = '.' in filename and filename.rsplit('.', 1)[1].lower() in allowed_set
    return check

def main(args):

    ffmpegCmd   = "C:/Path/ffmpeg/bin/ffmpeg.exe"   # path to the ffmpeg executable
    videoDir    = args.videoDir
    cropDir     = args.cropDir

    if not os.path.exists(cropDir):
        os.makedirs(cropDir)

    kol_name = []
    for i in os.listdir(videoDir):
        kol_name.append(i)

    for i in range(len(kol_name)):
        # count = 0
        dir_name = kol_name[i]
        kol_dir = os.path.join(videoDir, kol_name[i])
        if os.path.isdir(kol_dir):
            kol_crop_dir = os.path.join(cropDir, kol_name[i])
            if not os.path.exists(kol_crop_dir):
                os.makedirs(kol_crop_dir)
            for kol_video_file in os.listdir(kol_dir):
                if allowed_file(filename=kol_video_file, allowed_set=allowed_set):
                    kol_video_path  = os.path.join(kol_dir, kol_video_file)
                    # split on the last dot only, so "a.b.mp4" keeps the stem "a.b"
                    kol_crop_path   = os.path.join(kol_crop_dir, kol_video_file.rsplit('.', 1)[0] + "_%04d.png")
                    # quote both paths so directories containing spaces still work
                    video2framesCmd = ffmpegCmd + ' -i "' + kol_video_path + '" -f image2 -vf fps=fps=3 -qscale:v 2 "' + kol_crop_path + '"'
                    try:
                        subprocess.call(video2framesCmd, shell=True)
                        print('crop from %s to %s' % (kol_video_path, kol_crop_path))
                    except Exception:
                        continue

def parse_arguments(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('videoDir', type=str, help='KOL video root directory, which contains one directory per KOL identity.')
    parser.add_argument('cropDir', type=str, help='KOL crops root directory; the extracted frames are written into one directory per KOL identity.')
    return parser.parse_args(argv)

if __name__ == '__main__':
    start_time = time.time()
    print('Program.py start at: ' + time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))
    main(parse_arguments(sys.argv[1:]))
    print('Program.py end at: ' + time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))
    print('Program.py all time: {0} seconds = {1} minutes = {2} hrs'.format((time.time() - start_time),
                                                                     (time.time() - start_time) / 60,
                                                                     (time.time() - start_time) / 3600))

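Example: extracting frames from a Douyin (TikTok) video with FFmpeg

- TikTok video: 13.2 MB, 59 seconds, 720x1280, 1751 kbps (1751 kb/s data rate / bitrate / sampling rate), 30 frames per second (30 fps frame rate)
- crop fps: 1 (yields 59 images), 2 (yields 118 images) | output files are named like <video name>_0001.png
- To change how many images are generated, adjust the fps value in ffmpeg -i /data/video_1.mp4 -f image2  -vf fps=fps=1/60 -qscale:v 2 /data/mp4-%05d.jpg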
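
The image counts quoted above are roughly the clip duration multiplied by the extraction rate. The sketch below shows only that arithmetic. `expected_frames` is a hypothetical helper, not part of src/crop.py, and the number of files ffmpeg really writes can differ by a frame or so depending on timestamps and rounding.

```python
from fractions import Fraction


def expected_frames(duration_seconds, fps):
    """Approximate number of images produced when sampling at `fps`.

    `fps` may be a number or a string such as "1/60" (one image per minute),
    matching the way the value is written after fps=fps= on the command line.
    """
    rate = Fraction(fps)
    return int(Fraction(duration_seconds) * rate)


if __name__ == "__main__":
    # The 59-second clip from the example, sampled at 1 and 2 images per second.
    for fps in ("1", "2"):
        print("fps=%s -> about %d images" % (fps, expected_frames(59, fps)))
```

A fractional rate such as "1/60" is parsed exactly, so a 10-minute clip is estimated at 10 images rather than suffering floating-point drift.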

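- Sampling rate: the higher the sampling rate per unit of time, the higher the precision and the closer the processed file is to the original. File size grows in proportion to the sampling rate, however, so almost every encoding format focuses on achieving the least distortion at the lowest bitrate. CBR (constant bitrate) and VBR (variable bitrate) both grew out of this goal. Nothing is absolute, though: for audio, the higher the bitrate, the lower the compression ratio, the smaller the quality loss, and the closer the result is to the source.
- Frame rate: the number of frames refreshed per second; it can also be understood as how many times per second the graphics processor can refresh. For video content, the frame rate is the number of still frames shown per second. Smooth, continuous animation generally needs at least 8 fps, and film runs at 24 fps. When capturing moving video, the higher this number, the better.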

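Several ways to extract frames from a video with FFmpeg

$ffmpeg -i /data/video_1.mp4 -f image2  -vf fps=fps=1/60 -qscale:v 2 /data/mp4-%05d.jpg # split the video into an image sequence and store the images in one folder
$ffmpeg -i /data/video_1.mp4 -ss 00:00:10 -t 00:00:50 -f image2 -vf fps=fps=2 -qscale:v 2 /data/mp4-%05d.jpg # -ss sets the start time; -t sets the duration to process (use -to to give an end time instead)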
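
The same commands can be assembled from Python as an argument list instead of one shell string, which avoids quoting problems when a path contains spaces. This is a minimal sketch: `build_extract_cmd` and its parameter names are made up for illustration, and it assumes an `ffmpeg` executable is reachable under the name passed in.

```python
import subprocess


def build_extract_cmd(ffmpeg, video, pattern, fps="3", start=None, duration=None):
    """Return an argument list that extracts frames from `video` into `pattern`."""
    cmd = [ffmpeg, "-i", video]
    if start is not None:
        cmd += ["-ss", start]        # start position, e.g. "00:00:10"
    if duration is not None:
        cmd += ["-t", duration]      # how long to process, e.g. "00:00:50"
    cmd += ["-f", "image2", "-vf", "fps=fps=" + str(fps), "-qscale:v", "2", pattern]
    return cmd


if __name__ == "__main__":
    cmd = build_extract_cmd("ffmpeg", "/data/video_1.mp4", "/data/mp4-%05d.jpg",
                            fps="2", start="00:00:10", duration="00:00:50")
    print(" ".join(cmd))
    # To actually run it (requires ffmpeg to be installed):
    # subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` without `shell=True` hands each argument to ffmpeg unchanged, so no manual quoting is needed.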

Detect, extract and align face images

src/align/align_dataset_mtcnn.py (modified from the original facenet code)

# This step uses a lot of CPU and memory
$python src/align/align_dataset_mtcnn.py C:/Users/PC/Desktop/kol_crop C:/Users/PC/Desktop/kol_160 --image_size 160 --margin 32 # if the GPU is powerful enough
$python src/align/align_dataset_mtcnn.py C:/Users/PC/Desktop/kol_crop C:/Users/PC/Desktop/kol_160 --image_size 160 --margin 32 --gpu_memory_fraction 0.5 # if the GPU is not powerful enough
# *********** Change 1
all_people = 0 # add an all_people counter
with open(bounding_boxes_filename, "w") as text_file:

# *********** Change 2
for cls in dataset:
    count = 0 # add a per-class counter

# *********** Change 3
for image_path in cls.image_paths:
    count += 1 # increment the counter

# *********** Change 4
# filename_base, file_extension = os.path.splitext(output_filename)
# if args.detect_multiple_faces:
#     output_filename_n = "{}_{}{}".format(filename_base, i, file_extension)
# else:
#     output_filename_n = "{}{}".format(filename_base, file_extension)
# misc.imsave(output_filename_n, scaled)
# text_file.write('%s %d %d %d %d\n' % (output_filename_n, bb[0], bb[1], bb[2], bb[3]))

filename_base, file_extension   = os.path.splitext(output_filename)
filename                        = cls.name + '_' + str(count).zfill(4)
if args.detect_multiple_faces:
    all_people += 1
    output_filename_n = "{}_{}{}".format(filename, i, file_extension)
    print("{0} people: crop {1} to {2}".format(all_people, image_path, output_filename_n))
else:
    all_people += 1
    output_filename_n = "{}{}".format(filename, file_extension)
    print("{0} people: crop {1} to {2}".format(all_people, image_path, output_filename_n))
save_filename = os.path.join(output_class_dir, output_filename_n)
misc.imsave(save_filename, scaled)
text_file.write('%s %d %d %d %d\n' % (output_filename_n, bb[0], bb[1], bb[2], bb[3]))

# *********** Change 5
import time
if __name__ == '__main__':
    start_time = time.time()
    print('label_align_dataset.py start at: ' + time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))
    main(parse_arguments(sys.argv[1:]))
    print('label_align_dataset.py end at: ' + time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))
    print('label_align_dataset.py all time: {0} seconds = {1} minutes = {2} hrs'.format((time.time() - start_time),
                                                                     (time.time() - start_time) / 60,
                                                                     (time.time() - start_time) / 3600))
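
The renaming in the fourth change builds each output name from the class (folder) name plus a zero-padded per-class counter, with the face index appended when multiple faces are detected. The sketch below isolates that naming rule. `aligned_filename` is a hypothetical helper written for illustration and does not exist in the facenet code.

```python
def aligned_filename(class_name, count, extension=".png", face_index=None):
    """Build names such as 01_0001.png, or 01_0001_0.png when a face index is given."""
    base = class_name + "_" + str(count).zfill(4)
    if face_index is not None:
        # mirrors "{}_{}{}".format(filename, i, file_extension)
        return "{}_{}{}".format(base, face_index, extension)
    # mirrors "{}{}".format(filename, file_extension)
    return "{}{}".format(base, extension)


if __name__ == "__main__":
    print(aligned_filename("01", 1))
    print(aligned_filename("01", 12, face_index=0))
```

This naming is what produces paths like kol_160/01/01_0001.png, which later steps pass to the prediction script.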

Manually clean the data set

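Manually clean the kol_160 data set. In each subfolder, delete by hand: images that do not show that KOL, images where the face is occluded (for example by a hand, an object, or overlaid text or animated stickers), and images that are not faces at all.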

Train model with face thumbnails

src/classifier.py

$python src/classifier.py TRAIN C:/Users/PC/Desktop/kol_160 models/20180402-114759/20180402-114759.pb models/kol.pkl

Validate model with face thumbnails

src/classifier.py

$python src/classifier.py CLASSIFY C:/Users/PC/Desktop/kol_160 models/20180402-114759/20180402-114759.pb models/kol.pkl

Predict KOL identity

contributed/predict.py

$python contributed/predict.py C:/Users/PC/Desktop/kol_160/01/01_0001.png models/20180402-114759 models/kol.pkl # if the GPU is powerful enough
$python contributed/predict.py C:/Users/PC/Desktop/kol_160/01/01_0001.png models/20180402-114759 models/kol.pkl --gpu_memory_fraction 0.5 # if the GPU is not powerful enough

Configure Project Virtual Environment

  1. Install the missing packages
    $conda activate facenet # the facenet environment set up in FaceNet-Configuration-and-Deployment
    $pip install waitress imutils flask pypinyin
  2. Modify the source file C:\Users\PC\.conda\envs\haha\Lib\site-packages\werkzeug\utils.py
    # Usage: from werkzeug.utils import secure_filename
    
    # *****************Action 1: modify the code (the original line is kept below as a comment)
    # filename = str(_filename_ascii_strip_re.sub("", "_".join(filename.split()))).strip("._")
    
    _filename_ascii_add_strip_re = re.compile(r'[^A-Za-z0-9_\u4E00-\u9FBF.-]')
    filename = str(_filename_ascii_add_strip_re.sub('', '_'.join( 
                    filename.split()))).strip('._')
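  3. When starting the server from cmd, use "Run as administrator"
    $cd <project root, the directory containing server.py>
    $python server.py
  4. Keep versions and models consistent
    • The package versions used at test time must match those used when the model was trained, otherwise prediction fails (so do not upgrade packages casually)
      1. AttributeError: 'SVC' object has no attribute '_probA' (or something like that)
      It turned out that I had to stay with the same version of scikit that was used to train the models I currently have. Later versions of scikit don't work with the trained face models. If you want to upgrade scikit, you have to retrain your models with the new version of scikit.
    • Use the same base model for training and testing
      - If the base model used for training was models/20180402-114759/, use the same model for prediction
      - If the base model used for training was models/20180408-102900/, use the same model for prediction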
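
The effect of the pattern patched into werkzeug in step 2 can be tried in isolation. The sketch below copies only that one line into a standalone function. `strip_filename` is a hypothetical name, and the function is a simplified stand-in that skips the other checks the real secure_filename performs.

```python
import re

# Same character class as the patched line: ASCII letters, digits, underscore,
# dot and hyphen, plus the CJK range \u4E00-\u9FBF. Everything else is removed.
_KEEP_RE = re.compile(r'[^A-Za-z0-9_\u4E00-\u9FBF.-]')


def strip_filename(filename):
    """Join whitespace-separated parts with "_" and drop disallowed characters."""
    return str(_KEEP_RE.sub('', '_'.join(filename.split()))).strip('._')


if __name__ == "__main__":
    for name in ("中文 图片.jpg", "a b?.png", "../etc/passwd"):
        print(repr(name), "->", repr(strip_filename(name)))
```

Path separators are not in the allowed set, so they are dropped along with other punctuation, while Chinese characters now survive.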

Run C/S Server to Load Model

Folder Structure

  • Framework: Python Flask + HTML
  • Directory layout
    - server.py
    - utils.py
    - templates
        - index.html
        - warning.html
        - predict_single.html
        - predict_single_result.html
        - predict_batch.html
        - predict_batch_result.html
        - find_similar_kol.html
        - find_similar_kol_result.html
    - models
        - 20180402-114759
            - 20180402-114759.pb
            - model-20180402-114759.meta
            - model-20180402-114759.ckpt-275.index
            - model-20180402-114759.ckpt-275.data-00000-of-00001
        - kol.pkl (put the trained classifier here | package versions must match between training and testing, so do not upgrade packages casually)
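
One way to make the version requirement on kol.pkl harder to violate by accident is to store the library version next to the pickled classifier and compare it at load time. The sketch below is an assumption, not what src/classifier.py does: the wrapper functions and the dictionary layout are hypothetical, and in a real setup the version string would come from something like `sklearn.__version__`.

```python
import io
import pickle


def dump_with_version(obj, fileobj, library_version):
    """Pickle `obj` together with the version of the library that produced it."""
    pickle.dump({"version": library_version, "payload": obj}, fileobj)


def load_checked(fileobj, current_version):
    """Unpickle and fail early if the recorded version differs from the current one."""
    record = pickle.load(fileobj)
    if record["version"] != current_version:
        raise RuntimeError(
            "classifier was saved with version %s but %s is installed"
            % (record["version"], current_version)
        )
    return record["payload"]


if __name__ == "__main__":
    buf = io.BytesIO()
    # Stand-in for the (model, class_names) tuple stored in kol.pkl.
    dump_with_version((["stand-in model"], ["kol_a", "kol_b"]), buf, "0.20.3")
    buf.seek(0)
    model, class_names = load_checked(buf, "0.20.3")
    print(class_names)
```

A file written this way is not interchangeable with the plain `(model, class_names)` pickle that the server loads, so both the training and serving code would need to use the wrapper.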

Program Details

Please refer to my GitHub project: Face-Recognition-Server-based-on-FaceNet

server.py


import os
import cv2
import time
import pickle
import shutil
import numpy as np
import tensorflow as tf
from waitress import serve
from scipy.misc import imread
import facenet.facenet as facenet
import align.detect_face as detect_face
from pypinyin import lazy_pinyin
from werkzeug.utils import secure_filename
from imutils.video import WebcamVideoStream
from flask import Flask, request, render_template

from utils import (
    allowed_set,
    allowed_file,
    load_and_align_data
)

tf.reset_default_graph()

app             = Flask(__name__)
app.secret_key  = os.urandom(24)
APP_ROOT        = os.path.dirname(os.path.abspath(__file__))
uploads_path    = os.path.join(APP_ROOT, 'static')

@app.route("/")
def index_page():   # select prediction mode: single | batch | top k similar
    return render_template(template_name_or_list="index.html")

@app.route("/predictSinglePage")
def predict_single_page(): # manually upload single image file for identity prediction
    return render_template(template_name_or_list="predict_single.html")

@app.route('/predictSingleImage', methods=['POST', 'GET'])
def predict_single_image(): # get single image prediction result | upload image files via POST request and feeds them to the FaceNet model to get prediction result
    images_savedir = "static/"
    if  os.path.exists(images_savedir):
        shutil.rmtree(images_savedir)
    if not os.path.exists(images_savedir):
        os.makedirs(images_savedir)
    if request.method == 'POST':
        if 'file' not in request.files:
            return render_template(
                template_name_or_list="warning.html",
                status="No 'file' field in POST request!"
            )
        file        = request.files['file']                                    # <FileStorage: 'download.jpg' ('image/jpeg')>
        filename    = secure_filename(''.join(lazy_pinyin(file.filename))) # download.jpg
        if filename == "":
            return render_template(
                template_name_or_list="warning.html",
                status="No selected file!"
            )
        upload_path = os.path.join(uploads_path, filename)
        file.save(upload_path)
        static_path             = "static/" + filename
        image_paths = []
        image_paths.append(static_path)
        path_nofperson_identity_similarity_timeused = []
        display = path_nofperson_identity_similarity_timeused
        if file and allowed_file(filename=filename, allowed_set=allowed_set):
            tf.reset_default_graph()
            start_time  = time.time()
            images, count_per_image = load_and_align_data(image_paths) # count_per_image = [x] | x = 0 means no face was detected
            if count_per_image[0] != 0:
                if count_per_image[0] == "0":
                    return render_template(
                        template_name_or_list="warning.html",
                        status="The uploaded file is illegal. Please upload safe image file!"
                    )
                feed_dict               = {images_placeholder: images , phase_train_placeholder:False}
                emb                     = sess.run(embeddings, feed_dict=feed_dict)
                classifier_filename_exp = os.path.expanduser(classifier_filename)
                if images is not None: 
                    with open(classifier_filename_exp, 'rb') as infile:
                        (model, class_names) = pickle.load(infile)
                    if model:
                        print('Loaded classifier model from file "%s"\n' % classifier_filename_exp)
                        predictions                 = model.predict_proba(emb)
                        best_class_indices          = np.argmax(predictions, axis=1) # <class 'numpy.ndarray'> [0]
                        best_class_probabilities    = predictions[np.arange(len(best_class_indices)), best_class_indices] # [0.99692782]
                        k = 0
                        for j in range(count_per_image[0]):
                            sentence        = str(static_path) + ","
                            sentence        = sentence + str(count_per_image[0]) + " people detected!,"
                            print("\npeople in image %s :" %(filename), '%s: %.3f' % (class_names[best_class_indices[k]], best_class_probabilities[k]))
                            identity        = class_names[best_class_indices[k]]
                            probabilities   = best_class_probabilities[k]
                            k+=1
                            probabilities   = "Similarity: " + str(probabilities).split('.')[0] + '.' + str(probabilities).split('.')[1][:3]
                            spent           = str(time.time() - start_time)
                            spent           = "Time consuming: " + str(spent).split('.')[0] + '.' + str(spent).split('.')[1][:2]  + " seconds"
                            sentence        = sentence + "Person " + str(k) + ": " + str(identity) + "," + str(probabilities) + "," + str(spent)
                            display.append(sentence)
                    else:
                        sentence        = str(static_path) + ","
                        sentence        = sentence + "No embedding classifier was detected!,"
                        identity        = ""
                        probabilities   = ""
                        spent           = str(time.time() - start_time)
                        spent           = "Time consuming: " + str(spent).split('.')[0] + '.' + str(spent).split('.')[1][:2]  + " seconds"
                        sentence        = sentence + str(identity) + "," + str(probabilities) + "," + str(spent)
                        display.append(sentence)
                else:
                    sentence        = str(static_path) + ","
                    sentence        = sentence + "No human face was detected!,"
                    identity        = ""
                    probabilities   = ""
                    spent           = str(time.time() - start_time)
                    spent           = "Time consuming: " + str(spent).split('.')[0] + '.' + str(spent).split('.')[1][:2]  + " seconds"
                    sentence        = sentence + str(identity) + "," + str(probabilities) + "," + str(spent)
                    display.append(sentence)
            else:
                sentence        = str(static_path) + ","
                sentence        = sentence + "No human face was detected!,"
                identity        = ""
                probabilities   = ""
                spent           = str(time.time() - start_time)
                spent           = "Time consuming: " + str(spent).split('.')[0] + '.' + str(spent).split('.')[1][:2]  + " seconds"
                sentence        = sentence + str(identity) + "," + str(probabilities) + "," + str(spent)
                display.append(sentence)
        return render_template(
            template_name_or_list='predict_single_result.html',
            display=display
        )      
    else:
        return render_template(
            template_name_or_list="warning.html",
            status="POST HTTP method required!"
        )

@app.route("/predictBatchPage")
def predict_batch_page(): # manually upload multiple image files for identity prediction
    return render_template(template_name_or_list="predict_batch.html")

@app.route('/predictBatchImage', methods=['POST', 'GET'])
def predict_batch_image(): # get multiple image prediction results | upload image files via POST request and feeds them to the FaceNet model to get prediction results
    images_savedir = "static/"
    if  os.path.exists(images_savedir):
        shutil.rmtree(images_savedir)
    if not os.path.exists(images_savedir):
        os.makedirs(images_savedir)
    if request.method == 'POST':
        if 'file' not in request.files:
            return render_template(
                template_name_or_list="warning.html",
                status="No 'file' field in POST request!"
            )
        files   = request.files.getlist('file')
        path_nofperson_identity_similarity_timeused = []
        display = path_nofperson_identity_similarity_timeused
        for file in files: # file: <FileStorage: '中文.jpg' ('image/jpeg')>  
            one_image = []  
            filename = secure_filename(''.join(lazy_pinyin(file.filename))) # 中文.jpg -> zhongwen.jpg
            if allowed_file(filename=filename, allowed_set=allowed_set):
                if filename == "":
                    return render_template(
                        template_name_or_list="warning.html",
                        status="No selected file!"
                    )
                upload_path             = os.path.join(uploads_path, filename)
                file.save(upload_path)
                static_path             = "static/" + filename
                image_paths = []
                image_paths.append(static_path)
                tf.reset_default_graph()
                start_time  = time.time()
                images, count_per_image = load_and_align_data(image_paths)
                if count_per_image[0] != 0:
                    if count_per_image[0] == "0":
                        return render_template(
                            template_name_or_list="warning.html",
                            status="The uploaded file is illegal. Please upload safe image file!"
                        )
                    feed_dict               = {images_placeholder: images , phase_train_placeholder:False}
                    emb                     = sess.run(embeddings, feed_dict=feed_dict)
                    classifier_filename_exp = os.path.expanduser(classifier_filename)
                    if images is not None: 
                        with open(classifier_filename_exp, 'rb') as infile:
                            (model, class_names) = pickle.load(infile)
                            if model:
                                print('Loaded classifier model from file "%s"\n' % classifier_filename_exp)
                                predictions                 = model.predict_proba(emb)
                                best_class_indices          = np.argmax(predictions, axis=1)
                                best_class_probabilities    = predictions[np.arange(len(best_class_indices)), best_class_indices]
                                k = 0
                                for j in range(count_per_image[0]):
                                    sentence        = str(static_path) + ","
                                    sentence        = sentence + str(count_per_image[0]) + " people detected!,"
                                    print("\npeople in image %s :" %(filename), '%s: %.3f' % (class_names[best_class_indices[k]], best_class_probabilities[k]))
                                    identity        = class_names[best_class_indices[k]]
                                    probabilities   = best_class_probabilities[k]
                                    k+=1
                                    probabilities   = "Similarity: " + str(probabilities).split('.')[0] + '.' + str(probabilities).split('.')[1][:3]
                                    spent           = str(time.time() - start_time)
                                    spent           = "Time consuming: " + str(spent).split('.')[0] + '.' + str(spent).split('.')[1][:2]  + " seconds"
                                    sentence        = sentence + "Person " + str(k) + ": " + str(identity) + "," + str(probabilities) + "," + str(spent)
                                    one_image.append(sentence)
                            else:
                                sentence        = str(static_path) + ","
                                sentence        = sentence + "No embedding classifier was detected!,"
                                identity        = ""
                                probabilities   = ""
                                spent           = str(time.time() - start_time)
                                spent           = "Time consuming: " + str(spent).split('.')[0] + '.' + str(spent).split('.')[1][:2]  + " seconds"
                                sentence        = sentence + str(identity) + "," + str(probabilities) + "," + str(spent)
                                one_image.append(sentence)
                    else:
                        sentence        = str(static_path) + ","
                        sentence        = sentence + "No human face was detected!,"
                        identity        = ""
                        probabilities   = ""
                        spent           = str(time.time() - start_time)
                        spent           = "Time consuming: " + str(spent).split('.')[0] + '.' + str(spent).split('.')[1][:2]  + " seconds"
                        sentence        = sentence + str(identity) + "," + str(probabilities) + "," + str(spent)
                        one_image.append(sentence)
                else:
                    sentence        = str(static_path) + ","
                    sentence        = sentence + "No human face was detected!,"
                    identity        = ""
                    probabilities   = ""
                    spent           = str(time.time() - start_time)
                    spent           = "Time consuming: " + str(spent).split('.')[0] + '.' + str(spent).split('.')[1][:2]  + " seconds"
                    sentence        = sentence + str(identity) + "," + str(probabilities) + "," + str(spent)
                    one_image.append(sentence)
            display.append(one_image)
        return render_template(
            template_name_or_list='predict_batch_result.html',
            display=display
        )      
    else:
        return render_template(
            template_name_or_list="warning.html",
            status="POST HTTP method required!"
        )

@app.route("/findSimilarKOLPage")
def find_similar_kol_page(): # manually upload single image file to find top k similar identities
    return render_template(template_name_or_list="find_similar_kol.html")

@app.route('/findSimilarKOLResult', methods=['POST', 'GET'])
def find_similar_kol_result(): # get top k similar identities
    images_savedir = "static/"
    if  os.path.exists(images_savedir):
        shutil.rmtree(images_savedir)
    if not os.path.exists(images_savedir):
        os.makedirs(images_savedir)
    if request.method == 'POST':
        if 'file' not in request.files:
            return render_template(
                template_name_or_list="warning.html",
                status="No 'file' field in POST request!"
            )
        file        = request.files['file']                                    # <FileStorage: 'download.jpg' ('image/jpeg')>
        filename    = secure_filename(''.join(lazy_pinyin(file.filename))) # download.jpg
        if filename == "":
            return render_template(
                template_name_or_list="warning.html",
                status="No selected file!"
            )
        upload_path = os.path.join(uploads_path, filename)
        file.save(upload_path)
        static_path             = "static/" + filename
        image_paths = []
        image_paths.append(static_path)
        kthKOL_similarity_timeused = []
        display   = kthKOL_similarity_timeused
        if file and allowed_file(filename=filename, allowed_set=allowed_set):
            tf.reset_default_graph()
            start_time  = time.time()
            images, count_per_image = load_and_align_data(image_paths) # count_per_image = [x] | x = 0 means no face was detected
            if count_per_image[0] != 0:
                if count_per_image[0] == "0":
                    return render_template(
                        template_name_or_list="warning.html",
                        status="The uploaded file is illegal. Please upload safe image file!"
                    )
                if count_per_image[0] != 1:
                    return render_template(
                        template_name_or_list="warning.html",
                        status="Please upload image which contains only one KOL face!"
                    )
                feed_dict               = {images_placeholder: images , phase_train_placeholder:False}
                emb                     = sess.run(embeddings, feed_dict=feed_dict)
                classifier_filename_exp = os.path.expanduser(classifier_filename)
                if images is not None: 
                    with open(classifier_filename_exp, 'rb') as infile:
                        (model, class_names) = pickle.load(infile)
                    if model:
                        print('Loaded classifier model from file "%s"\n' % classifier_filename_exp)
                        predictions                 = model.predict_proba(emb)
                        top_k_class_indices         = predictions[0].argsort()[-5:][::-1] # the 5 most similar KOLs
                        k = 0
                        for i in list(top_k_class_indices):
                            class_indices           = []
                            class_indices.append(i)
                            class_probabilities     = predictions[np.arange(len(class_indices)), class_indices] 
                            identity                = class_names[class_indices[0]]
                            probabilities           = class_probabilities[0]
                            probabilities           = str(probabilities).split('.')[0] + '.' + str(probabilities).split('.')[1][:3]
                            spent                   = str(time.time() - start_time)
                            spent                   = str(spent).split('.')[0] + '.' + str(spent).split('.')[1][:2]
                            k                       += 1
                            kth_kol                 = str(static_path) + "," + str(k) + "th KOL: " + identity + "," + "Similarity: " + probabilities + "," + "Time consuming: " + spent + " seconds"
                            display.append(kth_kol)
                    else:
                        return render_template(
                            template_name_or_list="warning.html",
                            status="No embedding classifier was detected!"
                        )
                else:
                    return render_template(
                        template_name_or_list="warning.html",
                        status="No human face was detected!"
                    )
            else:
                return render_template(
                    template_name_or_list="warning.html",
                    status="No human face was detected!"
                )
        return render_template(
            template_name_or_list='find_similar_kol_result.html',
            display=display
        )      
    else:
        return render_template(
            template_name_or_list="warning.html",
            status="POST HTTP method required!"
        )

if __name__ == '__main__':
    # tf.reset_default_graph()
    with tf.Graph().as_default():
        os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
        os.environ['CUDA_VISIBLE_DEVICES'] = "0" # use only the first GPU
        config                                              =  tf.ConfigProto()
        config.allow_soft_placement                         = True
        config.gpu_options.per_process_gpu_memory_fraction  = 0.7
        config.gpu_options.allow_growth                     = True
        with tf.Session(config=config) as sess:
            model               = "models/20180402-114759/"
            classifier_filename = "models/kol.pkl"
            image_size          = 160
            facenet.load_model(model)
            images_placeholder      = tf.get_default_graph().get_tensor_by_name("input:0")
            embeddings              = tf.get_default_graph().get_tensor_by_name("embeddings:0")
            phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")
            serve(app=app, host='0.0.0.0', port=5000)
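
The top-5 lookup in find_similar_kol_result and the way probabilities are shortened for display can be sketched with the standard library alone. Both helpers below are hypothetical and are not part of server.py. The formatting helper is offered because slicing `str(value)` on "." misbehaves when Python prints a small float in scientific notation such as 1e-05. Note that it rounds, whereas the slicing in the listing truncates.

```python
def top_k(probabilities, k=5):
    """Return (class_index, probability) pairs for the k largest probabilities."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def format_probability(value, digits=3):
    """Fixed-point formatting that also copes with values such as 1e-05."""
    return "%.*f" % (digits, value)


if __name__ == "__main__":
    scores = [0.10, 0.70, 0.00001, 0.19999]   # made-up classifier output
    for rank, (index, p) in enumerate(top_k(scores, k=3), start=1):
        print("rank %d: class %d, similarity %s" % (rank, index, format_probability(p)))
```

With many KOL classes the lower-ranked probabilities can become very small, which is when the fixed-point formatting matters most.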

utils.py


import os
import glob                                         # used by load_embeddings
import numpy as np
from scipy import misc
import tensorflow as tf
import align.detect_face
from six.moves import xrange
from collections import defaultdict                 # used by load_embeddings
import facenet.facenet as facenet
from scipy.misc import imresize, imsave
from tensorflow.python.platform import gfile
from align.detect_face import detect_face
from align.detect_face import create_mtcnn

allowed_set = set(['png', 'jpg', 'jpeg'])           # allowed image formats for upload


def allowed_file(filename, allowed_set):
    check = '.' in filename and filename.rsplit('.', 1)[1].lower() in allowed_set
    return check


def get_face(img, pnet, rnet, onet, image_size):
    minsize             = 20
    threshold           = [0.6, 0.7, 0.7]
    factor              = 0.709
    margin              = 44
    input_image_size    = image_size
    img_size            = np.asarray(img.shape)[0:2]
    bounding_boxes, _   = detect_face(
        img=img, minsize=minsize, pnet=pnet, rnet=rnet,
        onet=onet, threshold=threshold, factor=factor
    )

    if not len(bounding_boxes) == 0:
        for face in bounding_boxes:
            det         = np.squeeze(face[0:4])
            bb          = np.zeros(4, dtype=np.int32)
            bb[0]       = np.maximum(det[0] - margin / 2, 0)
            bb[1]       = np.maximum(det[1] - margin / 2, 0)
            bb[2]       = np.minimum(det[2] + margin / 2, img_size[1])
            bb[3]       = np.minimum(det[3] + margin / 2, img_size[0])
            cropped     = img[bb[1]: bb[3], bb[0]:bb[2], :]
            face_img    = imresize(arr=cropped, size=(input_image_size, input_image_size), mode='RGB')
            return face_img
    else:
        return None


def load_image(img, do_random_crop, do_random_flip, image_size, do_prewhiten=True):
    # to_rgb, prewhiten, crop and flip are helpers from the facenet module
    image = np.zeros((1, image_size, image_size, 3))
    if img.ndim == 2:
        img = facenet.to_rgb(img)
    if do_prewhiten:
        img = facenet.prewhiten(img)
    img = facenet.crop(img, do_random_crop, image_size)
    img = facenet.flip(img, do_random_flip)
    image[:, :, :, :] = img
    return image


def forward_pass(img, session, images_placeholder, phase_train_placeholder, embeddings, image_size):
    if img is not None:
        image = load_image(
            img=img, do_random_crop=False, do_random_flip=False,
            do_prewhiten=True, image_size=image_size
        )
        feed_dict = {images_placeholder: image, phase_train_placeholder: False}
        embedding = session.run(embeddings, feed_dict=feed_dict)
        return embedding
    else:
        return None


def load_embeddings():
    embedding_dict = defaultdict()  
    for embedding in glob.iglob(pathname='embeddings/*.npy'):  
        name                    = remove_file_extension(embedding)
        dict_embedding          = np.load(embedding)
        embedding_dict[name]    = dict_embedding
    return embedding_dict


def remove_file_extension(filename):
    filename = os.path.splitext(filename)[0]
    return filename


def identify_face(embedding, embedding_dict):
    # Find the stored embedding closest (Euclidean distance) to the query embedding.
    min_distance = 100
    try:
        for (name, dict_embedding) in embedding_dict.items():
            distance = np.linalg.norm(embedding - dict_embedding)
            if distance < min_distance:
                min_distance = distance
                identity = name
        # Accept the closest match only if it is within the distance threshold.
        if min_distance <= 1.1:
            identity = identity[11:]  # drop the 11-character "embeddings/" directory prefix
            result = str(identity) + " with distance: " + str(min_distance)
            return result
        else:
            result = "Not in the database, the distance is " + str(min_distance)
            return result
    except Exception as e:
        print(str(e))
        return str(e)


def load_model(model, input_map=None):
    model_exp = os.path.expanduser(model)
    if (os.path.isfile(model_exp)):
        print('Model filename: %s' % model_exp)
        with gfile.FastGFile(model_exp,'rb') as f:
            graph_def = tf.GraphDef()
            graph_def.ParseFromString(f.read())
            tf.import_graph_def(graph_def, input_map=input_map, name='')
    else:
        print('Model directory: %s' % model_exp)
        meta_file, ckpt_file = get_model_filenames(model_exp)
        print('Metagraph file: %s' % meta_file)
        print('Checkpoint file: %s' % ckpt_file)
        saver = tf.train.import_meta_graph(os.path.join(model_exp, meta_file), input_map=input_map)
        saver.restore(tf.get_default_session(), os.path.join(model_exp, ckpt_file))


def load_and_align_data(image_paths):

    minsize     = 20 
    threshold   = [ 0.6, 0.7, 0.7 ]  
    factor      = 0.709 
    image_size  = 160
    margin      = 44
    
    print('Creating networks and loading parameters')
    with tf.Graph().as_default():
        config                                              =  tf.ConfigProto()
        config.allow_soft_placement                         = True
        config.gpu_options.per_process_gpu_memory_fraction  = 0.7
        config.gpu_options.allow_growth                     = True
        sess                                                = tf.Session(config=config)
        with sess.as_default():
            pnet, rnet, onet = create_mtcnn(sess, None)

    nrof_samples = len(image_paths)
    img_list = [] 
    count_per_image = []
    for i in range(nrof_samples):
        # img = misc.imread(name=file, mode='RGB')
        img = misc.imread(os.path.expanduser(image_paths[i])) # (157, 320, 3)
        if img.shape[2] != 3:
            print("Cannot feed value of shape (x, y, z, 4) for Tensor 'pnet/input:0', which has shape '(?, ?, ?, 3)'")
            images = "0"
            count_per_image.append("0")
            return images, count_per_image
        img_size = np.asarray(img.shape)[0:2]
        bounding_boxes, _ = detect_face(img, minsize, pnet, rnet, onet, threshold, factor) # (3, 5) | 3 means three faces were detected
        count_per_image.append(len(bounding_boxes)) # bounding_boxes.shape = (x, 5) | x = 0 means no face was detected
        if len(bounding_boxes) == 0:
            print("No person detected in this image!")
            images = "0"
            return images, count_per_image
        else:
            for j in range(len(bounding_boxes)):
                det = np.squeeze(bounding_boxes[j, 0:4])
                bb = np.zeros(4, dtype=np.int32)
                bb[0] = np.maximum(det[0] - margin / 2, 0)
                bb[1] = np.maximum(det[1] - margin / 2, 0)
                bb[2] = np.minimum(det[2] + margin / 2, img_size[1])
                bb[3] = np.minimum(det[3] + margin / 2, img_size[0])
                cropped = img[bb[1]:bb[3], bb[0]:bb[2], :]
                aligned = misc.imresize(cropped, (image_size, image_size), interp='bilinear')
                prewhitened = facenet.prewhiten(aligned) # (160, 160, 3)
                img_list.append(prewhitened)
            images = np.stack(img_list) # (3, 160, 160, 3)
            return images, count_per_image
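
The matching rule used by identify_face (take the stored embedding with the smallest Euclidean distance and accept it only if that distance is at most 1.1) can be tried in isolation. The following is a minimal sketch using only the standard library; the function name `nearest_identity`, the toy 3-dimensional vectors, and the names in `database` are illustrative assumptions and not part of the project.

```python
import math

def nearest_identity(embedding, database, threshold=1.1):
    """Return (name, distance) of the closest stored embedding,
    or (None, distance) when nothing is within the threshold."""
    best_name, best_distance = None, math.inf
    for name, stored in database.items():
        distance = math.dist(embedding, stored)  # Euclidean distance (Python 3.8+)
        if distance < best_distance:
            best_name, best_distance = name, distance
    if best_distance <= threshold:
        return best_name, best_distance
    return None, best_distance

# Toy data (assumption): real FaceNet embeddings are much longer vectors.
database = {
    "alice": [0.0, 0.0, 1.0],
    "bob":   [1.0, 0.0, 0.0],
}

match = nearest_identity([0.1, 0.0, 0.9], database)    # close to "alice"
unknown = nearest_identity([5.0, 5.0, 5.0], database)  # far from everyone
print(match)
print(unknown)
```

Returning `None` for the name instead of a message string keeps the decision ("known or not") separate from how the result is displayed.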

templates

index.html

<!DOCTYPE html>
<html>
<head>
    <title>Upload Image</title>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.2.1/css/bootstrap.min.css">
</head>

<body>
    <div class="container text-center">
            <br><br>
        <h1>Server Online</h1>
            <br><br><br>
            <br><br><br><br><br>
        <a href="{{ url_for('predict_single_page') }}" class="btn btn-outline-dark">Click here for single image identity prediction!</a>
            <br><br><br>
        <a href="{{ url_for('predict_batch_page') }}" class="btn btn-outline-dark">Click here for batch image identity prediction!</a>
        <br><br><br>
        <a href="{{ url_for('find_similar_kol_page') }}" class="btn btn-outline-dark">Click here to find similar KOL!</a>
    </div>
</body>
</html>
warning.html

<!DOCTYPE html>
<html>
<head>
    <title>Image identity prediction</title>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.2.1/css/bootstrap.min.css">
</head>

<body>
    <div class="container text-center">
            <br><br><br><br><br><br><br>
        <h3>Warning!</h3>
            <br><br>
        <p>{{ status }}</p>
            <br><br>
        <a href="{{ url_for('index_page') }}" class="btn btn-outline-dark">Back</a>
    </div>
</body>
</html>
predict_single.html

<!DOCTYPE html>
<html>
<head>
    <title>Image identity prediction</title>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.2.1/css/bootstrap.min.css">
</head>

<body>
    <div class="container text-center">
            <br><br><br><br><br><br><br>
        <h3>Upload your image file for identity prediction:</h3>
            <br><br>
        <form method=POST enctype=multipart/form-data action="{{ url_for('predict_single_image') }}">
            <input type=file name=file class="btn btn-outline-dark">
            <input type="submit" class="btn btn-outline-dark">
        </form>
            <br><br>
        <a href="{{ url_for('index_page') }}" class="btn btn-outline-dark">Back</a>
    </div>
</body>
</html>
predict_single_result.html

<!DOCTYPE html>
<html>
    <head>
        <title>Image identity prediction</title>
        <meta charset="utf-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <meta http-equiv="X-UA-Compatible" content="ie=edge">
        <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.2.1/css/bootstrap.min.css">
    </head>

    <body>
        <div class="container text-center">
                <br><br><br><br><br><br><br>
            <h3>Result of prediction:</h3>
                <br><br>
            <p><font color="#ff0033" size="5"><b> {{ display[0].split(',')[1] }} </b></font></p>
            <p>{{ display[0].split(',')[4] }} </p>
            <p> <img style="height: 150px" src="{{display[0].split(',')[0]}}"/> </p>
                <ul>
                    {% for record in display %}
                        <p> <font color="#ff0033" size="3"> {{ record.split(',')[2] }} </font> &nbsp;&nbsp; {{ record.split(',')[3] }} </p>
                    {% endfor %}
                </ul>
                <br><br>
            <a href="{{ url_for('predict_single_page') }}" class="btn btn-outline-dark">Back</a>
        </div>
    </body>
</html>
predict_batch.html

<!DOCTYPE html>
<html>
<head>
    <title>Image identity prediction</title>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.2.1/css/bootstrap.min.css">
</head>

<body>
    <div class="container text-center">
            <br><br><br><br><br><br><br>
        <h3>Upload your image file for identity prediction:</h3>
            <br><br>
        <form method=POST enctype=multipart/form-data action="{{ url_for('predict_batch_image') }}">
            <input type=file name=file multiple="multiple" class="btn btn-outline-dark">
            <input type="submit" class="btn btn-outline-dark">
        </form>
            <br><br>
        <a href="{{ url_for('index_page') }}" class="btn btn-outline-dark">Back</a>
    </div>
</body>
</html>
predict_batch_result.html

<!DOCTYPE html>
<html>
    <head>
        <title>Image identity prediction</title>
        <meta charset="utf-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <meta http-equiv="X-UA-Compatible" content="ie=edge">
        <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.2.1/css/bootstrap.min.css">
    </head>

    <body>
        <div class="container text-center">
                <br><br><br><br><br><br><br>
            <h3>Result of prediction:</h3>
                <br><br>
                {% for image in display %}  <!-- List each image -->
                    <p> <img style="height: 150px" src="{{image[0].split(',')[0]}}"/> </p> 
                    <p> <font color="#ff0033" size="3"> {{ image[0].split(',')[1] }} </font> &nbsp;&nbsp; {{image[0].split(',')[4]}} </p>
                    {% for record in image %} <!-- Identify each person in the image (there may be more than one) -->
                        <p> {{ record.split(',')[2] }} &nbsp;&nbsp; {{ record.split(',')[3] }} </p>
                    {% endfor %}
                {% endfor %}
                <br><br>
            <a href="{{ url_for('predict_batch_page') }}" class="btn btn-outline-dark">Back</a>
        </div>
    </body>
</html>
find_similar_kol.html
<!DOCTYPE html>
<html>
<head>
    <title>Find similar KOL</title>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.2.1/css/bootstrap.min.css">
</head>

<body>
<div class="container text-center">
        <br><br><br><br><br><br><br>
    <h3>Upload your image file to find the similar KOL:</h3>
        <br><br>
    <form method=POST enctype=multipart/form-data action="{{ url_for('find_similar_kol_result') }}">
        <input type=file name=file class="btn btn-outline-dark">
        <input type="submit" class="btn btn-outline-dark">
    </form>
        <br><br>
    <a href="{{ url_for('index_page') }}" class="btn btn-outline-dark">Back</a>
</div>

</body>
</html>
find_similar_kol_result.html
<!DOCTYPE html>
<html>
    <head>
        <title>Similar KOL Finding Results</title>
        <meta charset="utf-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <meta http-equiv="X-UA-Compatible" content="ie=edge">
        <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.2.1/css/bootstrap.min.css">
    </head>

    <body>
        <div class="container text-center">
                <br><br><br><br><br><br><br>
            <h3>Similar KOL Finding Results:</h3>
                <br><br>
            <p>{{ display[0].split(',')[3] }} </p>
            <p> <img style="height: 150px" src="{{display[0].split(',')[0]}}"/> </p>
                <ul>
                    {% for record in display %}
                        <p> <font color="#ff0033" size="3"> {{ record.split(',')[1] }} </font> &nbsp;&nbsp; {{ record.split(',')[2] }} </p>
                    {% endfor %}
                </ul>
                <br><br>
            <a href="{{ url_for('find_similar_kol_page') }}" class="btn btn-outline-dark">Back</a>
        </div>
    </body>
</html>

Program Tricks

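Handling Chinese filenames

When secure_filename is applied to a filename that contains Chinese characters, those characters are dropped: secure_filename() returns ASCII characters only and filters out everything else.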

  1. Modify the library source at C:\Users\PC\.conda\envs\haha\Lib\site-packages\werkzeug\utils.py
    # Usage: from werkzeug.utils import secure_filename
    
    # ***************** Action 1: modify the code
    # filename = str(_filename_ascii_strip_re.sub("", "_".join(filename.split()))).strip("._")
    
    _filename_ascii_add_strip_re = re.compile(r'[^A-Za-z0-9_\u4E00-\u9FBF.-]')
    filename = str(_filename_ascii_add_strip_re.sub('', '_'.join(
                    filename.split()))).strip('._')
  2. Use a third-party library (pypinyin) to convert the Chinese name to pinyin
    from pypinyin import lazy_pinyin
    from werkzeug.utils import secure_filename
    filename = secure_filename(''.join(lazy_pinyin(file.filename))) # filename = secure_filename(file.filename)
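
The whitelist idea from option 1 can also be tried on its own, without editing the installed Werkzeug package. The following is a minimal sketch using only the standard library; the helper name `safe_filename` is hypothetical, the character range is copied from the regex above, and the function is not part of Werkzeug and has not been reviewed as a security measure.

```python
import os
import re

# Keep ASCII letters, digits, "_", ".", "-" and CJK ideographs; drop everything else.
# Assumption: same character range as the modified regex in option 1.
_ALLOWED = re.compile(r'[^A-Za-z0-9_\u4E00-\u9FBF.-]')

def safe_filename(filename):
    """Hypothetical stand-in for a secure_filename that tolerates Chinese characters."""
    # Turn path separators into spaces so directory components cannot survive.
    for sep in (os.path.sep, os.path.altsep, "/", "\\"):
        if sep:
            filename = filename.replace(sep, " ")
    # Collapse whitespace to underscores, strip disallowed characters,
    # then trim leading and trailing dots and underscores.
    cleaned = _ALLOWED.sub("", "_".join(filename.split()))
    return cleaned.strip("._")

print(safe_filename("../../etc/passwd"))
print(safe_filename("我的 照片.png"))
```

Keeping the helper in the project's own code means the behaviour survives a reinstall or upgrade of the environment, which an edit under site-packages would not.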

Saving file information

  • Save text content to a txt file:
    path_identity_probability = blablabla  # placeholder for the lines to write
    with open("batch.txt","w") as f:
        f.writelines(path_identity_probability)
  • Save an uploaded image to a target directory:
    upload_path = os.path.join(uploads_path, filename)
    file.save(upload_path)
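
One detail worth checking when saving text this way: `writelines` does not append line breaks, so each record needs its own "\n" if the file should hold one record per line. The following is a minimal sketch; the record strings and the temporary directory are illustrative assumptions rather than the project's real data or paths, and `encoding="utf-8"` is added so that non-ASCII names are written consistently across platforms.

```python
import os
import tempfile

# Illustrative records (assumption: comma-separated fields, one record per image).
records = [
    "uploads/a.png,alice,0.42",
    "uploads/b.png,bob,0.77",
]

out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "batch.txt")

# writelines() writes the strings back to back, so add the newline explicitly.
with open(out_path, "w", encoding="utf-8") as f:
    f.writelines(record + "\n" for record in records)

# Read the file back to confirm there is one record per line.
with open(out_path, encoding="utf-8") as f:
    loaded = [line.rstrip("\n") for line in f]

print(loaded)
```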

Debugging

print("0")
# code snippet 1
print("1")
# code snippet 2
# Run the script and watch the console: the failure is in the snippet right after the last number printed
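
Numbered print markers work, but the standard `traceback` module can report the failing line directly. The following is a minimal sketch, assuming two hypothetical functions `step_one` and `step_two` that stand in for the code snippets above; `run_steps` is likewise an illustrative helper.

```python
import traceback

def step_one():
    return 1

def step_two():
    # Deliberate failure to illustrate the idea.
    return 1 / 0

def run_steps(steps):
    """Run steps in order; return the name of the first failing step, or None."""
    for step in steps:
        try:
            step()
        except Exception:
            # Prints the exception type, the message and the line that raised it.
            traceback.print_exc()
            return step.__name__
    return None

failed = run_steps([step_one, step_two])
print("failed step:", failed)
```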
Author: ElaineXHZhong
Link: https://elainexhzhong.github.io/2021/06/03/Face-Recognition-System-based-on-FaceNet/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stating additionally.