diff --git a/cv/face/facenet/tensorflow/README.md b/cv/face/facenet/tensorflow/README.md
index c3ba2d38231b4757a808fca5fafdfd3360050b64..e4f5cd5a58548d22a2680f9d8c0d1a5c9470d2dd 100644
--- a/cv/face/facenet/tensorflow/README.md
+++ b/cv/face/facenet/tensorflow/README.md
@@ -1,44 +1,71 @@
## Facenet

## Model description

This is a FaceNet implementation in TensorFlow that can be used to train your own face recognition model.

## Step 1: Installation

```bash
-pip3 install -r requirements.txt
+# Install requirements.
+bash init.sh
+pip3 install numpy==1.23.5
```

## Step 2: Preparing datasets

The [CASIA-WebFace](http://www.cbsr.ia.ac.cn/english/CASIA-WebFace-Database.html) dataset has been used for training. After face detection, this training set consists of a total of 453,453 images over 10,575 identities. Some performance improvement has been seen when the dataset is filtered before training; more information about how this was done will come later. The best performing model has been trained on the [VGGFace2](https://www.robots.ox.ac.uk/~vgg/data/vgg_face2/) dataset, which consists of ~3.3M faces and ~9000 classes.

### Download the dataset

```bash
cd data
# Download the dataset from: https://pan.baidu.com/s/1qMxFR8H_ih0xmY-rKgRejw (password: bcrq)
# The CASIA-WebFace dataset is used for training.

# Data tree:
$ ls data/webface_182_44
0000045
......

$ ls data/lfw_data
lfw  lfw_160  lfw.tgz
```

## Pre-processing

### Face alignment using MTCNN

One problem with the above approach seems to be that the Dlib face detector misses some of the hard examples (partial occlusion, silhouettes, etc.). This makes the training set too "easy", which causes the model to perform worse on other benchmarks. To solve this, other face landmark detectors have been tested. One face landmark detector that has proven to work very well in this setting is the [Multi-task CNN](https://kpzhang93.github.io/MTCNN_face_detection_alignment/index.html). A Matlab/Caffe implementation can be found [here](https://github.com/kpzhang93/MTCNN_face_detection_alignment), and it has been used for face alignment with very good results. A Python/Tensorflow implementation of MTCNN can be found [here](https://github.com/davidsandberg/facenet/tree/master/src/align). This implementation does not give identical results to the Matlab/Caffe implementation, but the performance is very similar.

## Step 3: Training

Currently, the best results are achieved by training the model using softmax loss. Details on how to train a model with softmax loss on the CASIA-WebFace dataset can be found on the page [Classifier training of Inception-ResNet-v1](https://github.com/davidsandberg/facenet/wiki/Classifier-training-of-inception-resnet-v1).
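For orientation, the underlying classifier training (which the wrapper scripts below drive) follows the upstream `src/train_softmax.py` entry point from davidsandberg/facenet. The following is a minimal sketch of such an invocation, assuming the aligned CASIA-WebFace and LFW directories from Step 2; the flag values are illustrative and not necessarily the settings baked into `train_facenet.sh`:

```bash
# Sketch only: flags follow the upstream davidsandberg/facenet train_softmax.py;
# the paths and hyper-parameters here are assumptions, not the values used by train_facenet.sh.
python3 src/train_softmax.py \
    --logs_base_dir logs/facenet \
    --models_base_dir models/facenet \
    --data_dir data/webface_182_44 \
    --lfw_dir data/lfw_data/lfw_160 \
    --model_def models.inception_resnet_v1 \
    --image_size 160 \
    --optimizer ADAM \
    --learning_rate -1 \
    --learning_rate_schedule_file data/learning_rate_schedule_classifier_casia.txt \
    --max_nrof_epochs 90 \
    --keep_probability 0.8 \
    --random_flip \
    --weight_decay 5e-4 \
    --embedding_size 512
```

Setting `--learning_rate -1` makes the learning rate follow the schedule file, which is the pattern the provided `data/learning_rate_schedule_classifier_casia*.txt` files are written for.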
-# One Card +### One Card + ```bash nohup bash train_facenet.sh 1> train_facenet.log 2> train_facenet_error.log & tail -f train_facenet.log ``` +### Multiple cards (DDP) + +```bash +# 8 Cards(DDP) +bash train_facenet_ddp.sh +``` + ## Results -| model | FPS | LFW_Accuracy | -|---------|--------| -----------------| + +| model | FPS | LFW_Accuracy | +| --------- | -------- | ------------------ | | facenet | 216.96 | 0.98900+-0.00642 | ## Reference + https://github.com/davidsandberg/facenet diff --git a/cv/face/facenet/tensorflow/__init__.py b/cv/face/facenet/tensorflow/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..162e24b462289dcee7b7a2888b93fad1115def81 --- /dev/null +++ b/cv/face/facenet/tensorflow/__init__.py @@ -0,0 +1,14 @@ +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); you may +# not use this file except in compliance with the License. You may obtain +# a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT +# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the +# License for the specific language governing permissions and limitations +# under the License. \ No newline at end of file diff --git a/cv/face/facenet/tensorflow/contributed/__init__.py b/cv/face/facenet/tensorflow/contributed/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..162e24b462289dcee7b7a2888b93fad1115def81 --- /dev/null +++ b/cv/face/facenet/tensorflow/contributed/__init__.py @@ -0,0 +1,14 @@ +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); you may +# not use this file except in compliance with the License. You may obtain +# a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT +# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the +# License for the specific language governing permissions and limitations +# under the License. \ No newline at end of file diff --git a/cv/face/facenet/tensorflow/contributed/batch_represent.py b/cv/face/facenet/tensorflow/contributed/batch_represent.py new file mode 100644 index 0000000000000000000000000000000000000000..d1ba1b7a6789c65304ffe0536c4ff755a135dfa3 --- /dev/null +++ b/cv/face/facenet/tensorflow/contributed/batch_represent.py @@ -0,0 +1,162 @@ +#!/usr/bin/env python +# coding=utf-8 +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); you may +# not use this file except in compliance with the License. You may obtain +# a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT +# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the +# License for the specific language governing permissions and limitations +# under the License. 
+ +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +""" +Allows you to generate embeddings from a directory of images in the format: + +Instructions: + +Image data directory should look like the following figure: +person-1 +├── image-1.jpg +├── image-2.png +... +└── image-p.png + +... + +person-m +├── image-1.png +├── image-2.jpg +... +└── image-q.png + +Trained Model: +- Both the trained model metagraph and the model parameters need to exist +in the same directory, and the metagraph should have the extension '.meta'. + +#### +USAGE: +$ python batch_represent.py -d -o --trained_model_dir +### +""" + +""" +Attributions: +The code is heavily inspired by the code from by David Sandberg's ../src/validate_on_lfw.py +The concept is inspired by Brandon Amos' github.com/cmusatyalab/openface/blob/master/batch-represent/batch-represent.lua +""" + +#---------------------------------------------------- +# MIT License +# +# Copyright (c) 2017 Rakshak Talwar +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in all +# copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+#---------------------------------------------------- + +import os +import sys +import argparse +import importlib +import time + +sys.path.insert(1, "../src") +import facenet +import numpy as np +from sklearn.datasets import load_files +import tensorflow as tf +from six.moves import xrange + +def main(args): + + with tf.Graph().as_default(): + + with tf.Session() as sess: + + # create output directory if it doesn't exist + output_dir = os.path.expanduser(args.output_dir) + if not os.path.isdir(output_dir): + os.makedirs(output_dir) + + # load the model + print("Loading trained model...\n") + meta_file, ckpt_file = facenet.get_model_filenames(os.path.expanduser(args.trained_model_dir)) + facenet.load_model(args.trained_model_dir, meta_file, ckpt_file) + + # grab all image paths and labels + print("Finding image paths and targets...\n") + data = load_files(args.data_dir, load_content=False, shuffle=False) + labels_array = data['target'] + paths = data['filenames'] + + # Get input and output tensors + images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0") + embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0") + phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0") + + image_size = images_placeholder.get_shape()[1] + embedding_size = embeddings.get_shape()[1] + + # Run forward pass to calculate embeddings + print('Generating embeddings from images...\n') + start_time = time.time() + batch_size = args.batch_size + nrof_images = len(paths) + nrof_batches = int(np.ceil(1.0*nrof_images / batch_size)) + emb_array = np.zeros((nrof_images, embedding_size)) + for i in xrange(nrof_batches): + start_index = i*batch_size + end_index = min((i+1)*batch_size, nrof_images) + paths_batch = paths[start_index:end_index] + images = facenet.load_data(paths_batch, do_random_crop=False, do_random_flip=False, image_size=image_size, do_prewhiten=True) + feed_dict = { images_placeholder:images, phase_train_placeholder:False} + emb_array[start_index:end_index,:] = sess.run(embeddings, feed_dict=feed_dict) + + time_avg_forward_pass = (time.time() - start_time) / float(nrof_images) + print("Forward pass took avg of %.3f[seconds/image] for %d images\n" % (time_avg_forward_pass, nrof_images)) + + print("Finally saving embeddings and gallery to: %s" % (output_dir)) + # save the gallery and embeddings (signatures) as numpy arrays to disk + np.save(os.path.join(output_dir, "gallery.npy"), labels_array) + np.save(os.path.join(output_dir, "signatures.npy"), emb_array) + +def parse_arguments(argv): + parser = argparse.ArgumentParser(description="Batch-represent face embeddings from a given data directory") + parser.add_argument('-d', '--data_dir', type=str, + help='directory of images with structure as seen at the top of this file.') + parser.add_argument('-o', '--output_dir', type=str, + help='directory containing aligned face patches with file structure as seen at the top of this file.') + parser.add_argument('--trained_model_dir', type=str, + help='Load a trained model before training starts.') + parser.add_argument('--batch_size', type=int, help='Number of images to process in a batch.', default=50) + + return parser.parse_args(argv) + + +if __name__ == "__main__": + main(parse_arguments(sys.argv[1:])) diff --git a/cv/face/facenet/tensorflow/contributed/cluster.py b/cv/face/facenet/tensorflow/contributed/cluster.py new file mode 100644 index 0000000000000000000000000000000000000000..6bd189976183f13e706b6b2e8fad606429fa7824 --- /dev/null +++ 
b/cv/face/facenet/tensorflow/contributed/cluster.py @@ -0,0 +1,193 @@ +# MIT License +# +# Copyright (c) 2017 PXL University College +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in all +# copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. + +# Clusters similar faces from input folder together in folders based on euclidean distance matrix + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +from scipy import misc +import tensorflow as tf +import numpy as np +import os +import sys +import argparse +import facenet +import align.detect_face +from sklearn.cluster import DBSCAN + + +def main(args): + pnet, rnet, onet = create_network_face_detection(args.gpu_memory_fraction) + + with tf.Graph().as_default(): + + with tf.Session() as sess: + facenet.load_model(args.model) + + image_list = load_images_from_folder(args.data_dir) + images = align_data(image_list, args.image_size, args.margin, pnet, rnet, onet) + + images_placeholder = sess.graph.get_tensor_by_name("input:0") + embeddings = sess.graph.get_tensor_by_name("embeddings:0") + phase_train_placeholder = sess.graph.get_tensor_by_name("phase_train:0") + feed_dict = {images_placeholder: images, phase_train_placeholder: False} + emb = sess.run(embeddings, feed_dict=feed_dict) + + nrof_images = len(images) + + matrix = np.zeros((nrof_images, nrof_images)) + + print('') + # Print distance matrix + print('Distance matrix') + print(' ', end='') + for i in range(nrof_images): + print(' %1d ' % i, end='') + print('') + for i in range(nrof_images): + print('%1d ' % i, end='') + for j in range(nrof_images): + dist = np.sqrt(np.sum(np.square(np.subtract(emb[i, :], emb[j, :])))) + matrix[i][j] = dist + print(' %1.4f ' % dist, end='') + print('') + + print('') + + # DBSCAN is the only algorithm that doesn't require the number of clusters to be defined. 
+ db = DBSCAN(eps=args.cluster_threshold, min_samples=args.min_cluster_size, metric='precomputed') + db.fit(matrix) + labels = db.labels_ + + # get number of clusters + no_clusters = len(set(labels)) - (1 if -1 in labels else 0) + + print('No of clusters:', no_clusters) + + if no_clusters > 0: + if args.largest_cluster_only: + largest_cluster = 0 + for i in range(no_clusters): + print('Cluster {}: {}'.format(i, np.nonzero(labels == i)[0])) + if len(np.nonzero(labels == i)[0]) > len(np.nonzero(labels == largest_cluster)[0]): + largest_cluster = i + print('Saving largest cluster (Cluster: {})'.format(largest_cluster)) + cnt = 1 + for i in np.nonzero(labels == largest_cluster)[0]: + misc.imsave(os.path.join(args.out_dir, str(cnt) + '.png'), images[i]) + cnt += 1 + else: + print('Saving all clusters') + for i in range(no_clusters): + cnt = 1 + print('Cluster {}: {}'.format(i, np.nonzero(labels == i)[0])) + path = os.path.join(args.out_dir, str(i)) + if not os.path.exists(path): + os.makedirs(path) + for j in np.nonzero(labels == i)[0]: + misc.imsave(os.path.join(path, str(cnt) + '.png'), images[j]) + cnt += 1 + else: + for j in np.nonzero(labels == i)[0]: + misc.imsave(os.path.join(path, str(cnt) + '.png'), images[j]) + cnt += 1 + + +def align_data(image_list, image_size, margin, pnet, rnet, onet): + minsize = 20 # minimum size of face + threshold = [0.6, 0.7, 0.7] # three steps's threshold + factor = 0.709 # scale factor + + img_list = [] + + for x in xrange(len(image_list)): + img_size = np.asarray(image_list[x].shape)[0:2] + bounding_boxes, _ = align.detect_face.detect_face(image_list[x], minsize, pnet, rnet, onet, threshold, factor) + nrof_samples = len(bounding_boxes) + if nrof_samples > 0: + for i in xrange(nrof_samples): + if bounding_boxes[i][4] > 0.95: + det = np.squeeze(bounding_boxes[i, 0:4]) + bb = np.zeros(4, dtype=np.int32) + bb[0] = np.maximum(det[0] - margin / 2, 0) + bb[1] = np.maximum(det[1] - margin / 2, 0) + bb[2] = np.minimum(det[2] + margin / 2, img_size[1]) + bb[3] = np.minimum(det[3] + margin / 2, img_size[0]) + cropped = image_list[x][bb[1]:bb[3], bb[0]:bb[2], :] + aligned = misc.imresize(cropped, (image_size, image_size), interp='bilinear') + prewhitened = facenet.prewhiten(aligned) + img_list.append(prewhitened) + + if len(img_list) > 0: + images = np.stack(img_list) + return images + else: + return None + + +def create_network_face_detection(gpu_memory_fraction): + with tf.Graph().as_default(): + gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_memory_fraction) + sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) + with sess.as_default(): + pnet, rnet, onet = align.detect_face.create_mtcnn(sess, None) + return pnet, rnet, onet + + +def load_images_from_folder(folder): + images = [] + for filename in os.listdir(folder): + img = misc.imread(os.path.join(folder, filename)) + if img is not None: + images.append(img) + return images + + +def parse_arguments(argv): + parser = argparse.ArgumentParser() + + parser.add_argument('model', type=str, + help='Either a directory containing the meta_file and ckpt_file or a model protobuf (.pb) file') + parser.add_argument('data_dir', type=str, + help='The directory containing the images to cluster into folders.') + parser.add_argument('out_dir', type=str, + help='The output directory where the image clusters will be saved.') + parser.add_argument('--image_size', type=int, + help='Image size (height, width) in pixels.', default=160) + parser.add_argument('--margin', type=int, 
+ help='Margin for the crop around the bounding box (height, width) in pixels.', default=44) + parser.add_argument('--min_cluster_size', type=int, + help='The minimum amount of pictures required for a cluster.', default=1) + parser.add_argument('--cluster_threshold', type=float, + help='The minimum distance for faces to be in the same cluster', default=1.0) + parser.add_argument('--largest_cluster_only', action='store_true', + help='This argument will make that only the biggest cluster is saved.') + parser.add_argument('--gpu_memory_fraction', type=float, + help='Upper bound on the amount of GPU memory that will be used by the process.', default=1.0) + + return parser.parse_args(argv) + + +if __name__ == '__main__': + main(parse_arguments(sys.argv[1:])) diff --git a/cv/face/facenet/tensorflow/contributed/clustering.py b/cv/face/facenet/tensorflow/contributed/clustering.py new file mode 100644 index 0000000000000000000000000000000000000000..c74383097f98bcf0c86dbe8962bae4bc4e568139 --- /dev/null +++ b/cv/face/facenet/tensorflow/contributed/clustering.py @@ -0,0 +1,282 @@ +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); you may +# not use this file except in compliance with the License. You may obtain +# a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT +# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the +# License for the specific language governing permissions and limitations +# under the License. + +""" Face Cluster """ +import tensorflow as tf +import numpy as np +import importlib +import argparse +import facenet +import os +import math +def face_distance(face_encodings, face_to_compare): + """ + Given a list of face encodings, compare them to a known face encoding and get a euclidean distance + for each comparison face. The distance tells you how similar the faces are. 
+ :param faces: List of face encodings to compare + :param face_to_compare: A face encoding to compare against + :return: A numpy ndarray with the distance for each face in the same order as the 'faces' array + """ + import numpy as np + if len(face_encodings) == 0: + return np.empty((0)) + + #return 1/np.linalg.norm(face_encodings - face_to_compare, axis=1) + return np.sum(face_encodings*face_to_compare,axis=1) + +def load_model(model_dir, meta_file, ckpt_file): + model_dir_exp = os.path.expanduser(model_dir) + saver = tf.train.import_meta_graph(os.path.join(model_dir_exp, meta_file)) + saver.restore(tf.get_default_session(), os.path.join(model_dir_exp, ckpt_file)) + +def _chinese_whispers(encoding_list, threshold=0.55, iterations=20): + """ Chinese Whispers Algorithm + + Modified from Alex Loveless' implementation, + http://alexloveless.co.uk/data/chinese-whispers-graph-clustering-in-python/ + + Inputs: + encoding_list: a list of facial encodings from face_recognition + threshold: facial match threshold,default 0.6 + iterations: since chinese whispers is an iterative algorithm, number of times to iterate + + Outputs: + sorted_clusters: a list of clusters, a cluster being a list of imagepaths, + sorted by largest cluster to smallest + """ + + #from face_recognition.api import _face_distance + from random import shuffle + import networkx as nx + # Create graph + nodes = [] + edges = [] + + image_paths, encodings = zip(*encoding_list) + + if len(encodings) <= 1: + print ("No enough encodings to cluster!") + return [] + + for idx, face_encoding_to_check in enumerate(encodings): + # Adding node of facial encoding + node_id = idx+1 + + # Initialize 'cluster' to unique value (cluster of itself) + node = (node_id, {'cluster': image_paths[idx], 'path': image_paths[idx]}) + nodes.append(node) + + # Facial encodings to compare + if (idx+1) >= len(encodings): + # Node is last element, don't create edge + break + + compare_encodings = encodings[idx+1:] + distances = face_distance(compare_encodings, face_encoding_to_check) + encoding_edges = [] + for i, distance in enumerate(distances): + if distance > threshold: + # Add edge if facial match + edge_id = idx+i+2 + encoding_edges.append((node_id, edge_id, {'weight': distance})) + + edges = edges + encoding_edges + + G = nx.Graph() + G.add_nodes_from(nodes) + G.add_edges_from(edges) + + # Iterate + for _ in range(0, iterations): + cluster_nodes = G.nodes() + shuffle(cluster_nodes) + for node in cluster_nodes: + neighbors = G[node] + clusters = {} + + for ne in neighbors: + if isinstance(ne, int): + if G.node[ne]['cluster'] in clusters: + clusters[G.node[ne]['cluster']] += G[node][ne]['weight'] + else: + clusters[G.node[ne]['cluster']] = G[node][ne]['weight'] + + # find the class with the highest edge weight sum + edge_weight_sum = 0 + max_cluster = 0 + #use the max sum of neighbor weights class as current node's class + for cluster in clusters: + if clusters[cluster] > edge_weight_sum: + edge_weight_sum = clusters[cluster] + max_cluster = cluster + + # set the class of target node to the winning local class + G.node[node]['cluster'] = max_cluster + + clusters = {} + + # Prepare cluster output + for (_, data) in G.node.items(): + cluster = data['cluster'] + path = data['path'] + + if cluster: + if cluster not in clusters: + clusters[cluster] = [] + clusters[cluster].append(path) + + # Sort cluster output + sorted_clusters = sorted(clusters.values(), key=len, reverse=True) + + return sorted_clusters + +def cluster_facial_encodings(facial_encodings): + """ 
Cluster facial encodings + + Intended to be an optional switch for different clustering algorithms, as of right now + only chinese whispers is available. + + Input: + facial_encodings: (image_path, facial_encoding) dictionary of facial encodings + + Output: + sorted_clusters: a list of clusters, a cluster being a list of imagepaths, + sorted by largest cluster to smallest + + """ + + if len(facial_encodings) <= 1: + print ("Number of facial encodings must be greater than one, can't cluster") + return [] + + # Only use the chinese whispers algorithm for now + sorted_clusters = _chinese_whispers(facial_encodings.items()) + return sorted_clusters + +def compute_facial_encodings(sess,images_placeholder,embeddings,phase_train_placeholder,image_size, + embedding_size,nrof_images,nrof_batches,emb_array,batch_size,paths): + """ Compute Facial Encodings + + Given a set of images, compute the facial encodings of each face detected in the images and + return them. If no faces, or more than one face found, return nothing for that image. + + Inputs: + image_paths: a list of image paths + + Outputs: + facial_encodings: (image_path, facial_encoding) dictionary of facial encodings + + """ + + for i in range(nrof_batches): + start_index = i*batch_size + end_index = min((i+1)*batch_size, nrof_images) + paths_batch = paths[start_index:end_index] + images = facenet.load_data(paths_batch, False, False, image_size) + feed_dict = { images_placeholder:images, phase_train_placeholder:False } + emb_array[start_index:end_index,:] = sess.run(embeddings, feed_dict=feed_dict) + + facial_encodings = {} + for x in range(nrof_images): + facial_encodings[paths[x]] = emb_array[x,:] + + + return facial_encodings + +def get_onedir(paths): + dataset = [] + path_exp = os.path.expanduser(paths) + if os.path.isdir(path_exp): + images = os.listdir(path_exp) + image_paths = [os.path.join(path_exp,img) for img in images] + + for x in image_paths: + if os.path.getsize(x)>0: + dataset.append(x) + + return dataset + + +def main(args): + """ Main + + Given a list of images, save out facial encoding data files and copy + images into folders of face clusters. 
+ + """ + from os.path import join, basename, exists + from os import makedirs + import numpy as np + import shutil + import sys + + if not exists(args.output): + makedirs(args.output) + + with tf.Graph().as_default(): + with tf.Session() as sess: + image_paths = get_onedir(args.input) + #image_list, label_list = facenet.get_image_paths_and_labels(train_set) + + meta_file, ckpt_file = facenet.get_model_filenames(os.path.expanduser(args.model_dir)) + + print('Metagraph file: %s' % meta_file) + print('Checkpoint file: %s' % ckpt_file) + load_model(args.model_dir, meta_file, ckpt_file) + + # Get input and output tensors + images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0") + embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0") + phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0") + + image_size = images_placeholder.get_shape()[1] + print("image_size:",image_size) + embedding_size = embeddings.get_shape()[1] + + # Run forward pass to calculate embeddings + print('Runnning forward pass on images') + + nrof_images = len(image_paths) + nrof_batches = int(math.ceil(1.0*nrof_images / args.batch_size)) + emb_array = np.zeros((nrof_images, embedding_size)) + facial_encodings = compute_facial_encodings(sess,images_placeholder,embeddings,phase_train_placeholder,image_size, + embedding_size,nrof_images,nrof_batches,emb_array,args.batch_size,image_paths) + sorted_clusters = cluster_facial_encodings(facial_encodings) + num_cluster = len(sorted_clusters) + + # Copy image files to cluster folders + for idx, cluster in enumerate(sorted_clusters): + #save all the cluster + cluster_dir = join(args.output, str(idx)) + if not exists(cluster_dir): + makedirs(cluster_dir) + for path in cluster: + shutil.copy(path, join(cluster_dir, basename(path))) + +def parse_args(): + """Parse input arguments.""" + import argparse + parser = argparse.ArgumentParser(description='Get a shape mesh (t-pose)') + parser.add_argument('--model_dir', type=str, help='model dir', required=True) + parser.add_argument('--batch_size', type=int, help='batch size', required=30) + parser.add_argument('--input', type=str, help='Input dir of images', required=True) + parser.add_argument('--output', type=str, help='Output dir of clusters', required=True) + args = parser.parse_args() + + return args + +if __name__ == '__main__': + """ Entry point """ + main(parse_args()) diff --git a/cv/face/facenet/tensorflow/contributed/export_embeddings.py b/cv/face/facenet/tensorflow/contributed/export_embeddings.py new file mode 100644 index 0000000000000000000000000000000000000000..ccbf78755aca337184e67875e4ec2b8b05b6eee5 --- /dev/null +++ b/cv/face/facenet/tensorflow/contributed/export_embeddings.py @@ -0,0 +1,199 @@ +""" +Exports the embeddings and labels of a directory of images as numpy arrays. + +Typicall usage expect the image directory to be of the openface/facenet form and +the images to be aligned. Simply point to your model and your image directory: + python facenet/contributed/export_embeddings.py ~/models/facenet/20170216-091149/ ~/datasets/lfw/mylfw + +Output: +embeddings.npy -- Embeddings as np array, Use --embeddings_name to change name +labels.npy -- Integer labels as np array, Use --labels_name to change name +label_strings.npy -- Strings from folders names, --labels_strings_name to change name + + +Use --image_batch to dictacte how many images to load in memory at a time. 
+ +If your images aren't already pre-aligned, use --is_aligned False + +I started with compare.py from David Sandberg, and modified it to export +the embeddings. The image loading is done use the facenet library if the image +is pre-aligned. If the image isn't pre-aligned, I use the compare.py function. +I've found working with the embeddings useful for classifications models. + +Charles Jekel 2017 + +""" + +# MIT License +# +# Copyright (c) 2016 David Sandberg +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +# All Rights Reserved. +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in all +# copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import time +from scipy import misc +import tensorflow as tf +import numpy as np +import sys +import os +import argparse +import facenet +import align.detect_face +import glob + +from six.moves import xrange + +def main(args): + train_set = facenet.get_dataset(args.data_dir) + image_list, label_list = facenet.get_image_paths_and_labels(train_set) + # fetch the classes (labels as strings) exactly as it's done in get_dataset + path_exp = os.path.expanduser(args.data_dir) + classes = [path for path in os.listdir(path_exp) \ + if os.path.isdir(os.path.join(path_exp, path))] + classes.sort() + # get the label strings + label_strings = [name for name in classes if \ + os.path.isdir(os.path.join(path_exp, name))] + + with tf.Graph().as_default(): + + with tf.Session() as sess: + + # Load the model + facenet.load_model(args.model_dir) + + # Get input and output tensors + images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0") + embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0") + phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0") + + # Run forward pass to calculate embeddings + nrof_images = len(image_list) + print('Number of images: ', nrof_images) + batch_size = args.image_batch + if nrof_images % batch_size == 0: + nrof_batches = nrof_images // batch_size + else: + nrof_batches = (nrof_images // batch_size) + 1 + print('Number of batches: ', nrof_batches) + embedding_size = embeddings.get_shape()[1] + emb_array = np.zeros((nrof_images, embedding_size)) + start_time = time.time() + + for i in range(nrof_batches): + if i == nrof_batches -1: + n = nrof_images + else: + n = i*batch_size + batch_size + # Get images for the batch + if args.is_aligned is True: + images = 
facenet.load_data(image_list[i*batch_size:n], False, False, args.image_size) + else: + images = load_and_align_data(image_list[i*batch_size:n], args.image_size, args.margin, args.gpu_memory_fraction) + feed_dict = { images_placeholder: images, phase_train_placeholder:False } + # Use the facenet model to calcualte embeddings + embed = sess.run(embeddings, feed_dict=feed_dict) + emb_array[i*batch_size:n, :] = embed + print('Completed batch', i+1, 'of', nrof_batches) + + run_time = time.time() - start_time + print('Run time: ', run_time) + + # export emedings and labels + label_list = np.array(label_list) + + np.save(args.embeddings_name, emb_array) + np.save(args.labels_name, label_list) + label_strings = np.array(label_strings) + np.save(args.labels_strings_name, label_strings[label_list]) + + +def load_and_align_data(image_paths, image_size, margin, gpu_memory_fraction): + + minsize = 20 # minimum size of face + threshold = [ 0.6, 0.7, 0.7 ] # three steps's threshold + factor = 0.709 # scale factor + + print('Creating networks and loading parameters') + with tf.Graph().as_default(): + gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_memory_fraction) + sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) + with sess.as_default(): + pnet, rnet, onet = align.detect_face.create_mtcnn(sess, None) + + nrof_samples = len(image_paths) + img_list = [None] * nrof_samples + for i in xrange(nrof_samples): + print(image_paths[i]) + img = misc.imread(os.path.expanduser(image_paths[i])) + img_size = np.asarray(img.shape)[0:2] + bounding_boxes, _ = align.detect_face.detect_face(img, minsize, pnet, rnet, onet, threshold, factor) + det = np.squeeze(bounding_boxes[0,0:4]) + bb = np.zeros(4, dtype=np.int32) + bb[0] = np.maximum(det[0]-margin/2, 0) + bb[1] = np.maximum(det[1]-margin/2, 0) + bb[2] = np.minimum(det[2]+margin/2, img_size[1]) + bb[3] = np.minimum(det[3]+margin/2, img_size[0]) + cropped = img[bb[1]:bb[3],bb[0]:bb[2],:] + aligned = misc.imresize(cropped, (image_size, image_size), interp='bilinear') + prewhitened = facenet.prewhiten(aligned) + img_list[i] = prewhitened + images = np.stack(img_list) + return images + +def parse_arguments(argv): + parser = argparse.ArgumentParser() + parser.add_argument('model_dir', type=str, + help='Directory containing the meta_file and ckpt_file') + parser.add_argument('data_dir', type=str, + help='Directory containing images. If images are not already aligned and cropped include --is_aligned False.') + parser.add_argument('--is_aligned', type=str, + help='Is the data directory already aligned and cropped?', default=True) + parser.add_argument('--image_size', type=int, + help='Image size (height, width) in pixels.', default=160) + parser.add_argument('--margin', type=int, + help='Margin for the crop around the bounding box (height, width) in pixels.', + default=44) + parser.add_argument('--gpu_memory_fraction', type=float, + help='Upper bound on the amount of GPU memory that will be used by the process.', + default=1.0) + parser.add_argument('--image_batch', type=int, + help='Number of images stored in memory at a time. 
Default 500.', + default=500) + + # numpy file Names + parser.add_argument('--embeddings_name', type=str, + help='Enter string of which the embeddings numpy array is saved as.', + default='embeddings.npy') + parser.add_argument('--labels_name', type=str, + help='Enter string of which the labels numpy array is saved as.', + default='labels.npy') + parser.add_argument('--labels_strings_name', type=str, + help='Enter string of which the labels as strings numpy array is saved as.', + default='label_strings.npy') + return parser.parse_args(argv) + +if __name__ == '__main__': + main(parse_arguments(sys.argv[1:])) diff --git a/cv/face/facenet/tensorflow/contributed/face.py b/cv/face/facenet/tensorflow/contributed/face.py new file mode 100644 index 0000000000000000000000000000000000000000..bd721b06b8dedf8c390a1668f124c8ad68e84126 --- /dev/null +++ b/cv/face/facenet/tensorflow/contributed/face.py @@ -0,0 +1,158 @@ +# coding=utf-8 +"""Face Detection and Recognition""" +# MIT License +# +# Copyright (c) 2017 François Gervais +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +# All Rights Reserved. +# +# This is the work of David Sandberg and shanren7 remodelled into a +# high level container. It's an attempt to simplify the use of such +# technology and provide an easy to use facial recognition package. +# +# https://github.com/davidsandberg/facenet +# https://github.com/shanren7/real_time_face_recognition +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in all +# copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
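+#
+# Example usage (an illustrative sketch only; it assumes the model checkpoint and
+# classifier pickle referenced by the module-level constants below are available):
+#
+#   import cv2
+#   import face
+#
+#   recognition = face.Recognition()
+#   frame = cv2.imread("person.jpg")       # hypothetical input image (BGR, as read by OpenCV)
+#   for f in recognition.identify(frame):  # detect faces, embed them, and classify each one
+#       print(f.name, f.bounding_box)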
+ +import pickle +import os + +import cv2 +import numpy as np +import tensorflow as tf +from scipy import misc + +import align.detect_face +import facenet + + +gpu_memory_fraction = 0.3 +facenet_model_checkpoint = os.path.dirname(__file__) + "/../model_checkpoints/20170512-110547" +classifier_model = os.path.dirname(__file__) + "/../model_checkpoints/my_classifier_1.pkl" +debug = False + + +class Face: + def __init__(self): + self.name = None + self.bounding_box = None + self.image = None + self.container_image = None + self.embedding = None + + +class Recognition: + def __init__(self): + self.detect = Detection() + self.encoder = Encoder() + self.identifier = Identifier() + + def add_identity(self, image, person_name): + faces = self.detect.find_faces(image) + + if len(faces) == 1: + face = faces[0] + face.name = person_name + face.embedding = self.encoder.generate_embedding(face) + return faces + + def identify(self, image): + faces = self.detect.find_faces(image) + + for i, face in enumerate(faces): + if debug: + cv2.imshow("Face: " + str(i), face.image) + face.embedding = self.encoder.generate_embedding(face) + face.name = self.identifier.identify(face) + + return faces + + +class Identifier: + def __init__(self): + with open(classifier_model, 'rb') as infile: + self.model, self.class_names = pickle.load(infile) + + def identify(self, face): + if face.embedding is not None: + predictions = self.model.predict_proba([face.embedding]) + best_class_indices = np.argmax(predictions, axis=1) + return self.class_names[best_class_indices[0]] + + +class Encoder: + def __init__(self): + self.sess = tf.Session() + with self.sess.as_default(): + facenet.load_model(facenet_model_checkpoint) + + def generate_embedding(self, face): + # Get input and output tensors + images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0") + embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0") + phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0") + + prewhiten_face = facenet.prewhiten(face.image) + + # Run forward pass to calculate embeddings + feed_dict = {images_placeholder: [prewhiten_face], phase_train_placeholder: False} + return self.sess.run(embeddings, feed_dict=feed_dict)[0] + + +class Detection: + # face detection parameters + minsize = 20 # minimum size of face + threshold = [0.6, 0.7, 0.7] # three steps's threshold + factor = 0.709 # scale factor + + def __init__(self, face_crop_size=160, face_crop_margin=32): + self.pnet, self.rnet, self.onet = self._setup_mtcnn() + self.face_crop_size = face_crop_size + self.face_crop_margin = face_crop_margin + + def _setup_mtcnn(self): + with tf.Graph().as_default(): + gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_memory_fraction) + sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) + with sess.as_default(): + return align.detect_face.create_mtcnn(sess, None) + + def find_faces(self, image): + faces = [] + + bounding_boxes, _ = align.detect_face.detect_face(image, self.minsize, + self.pnet, self.rnet, self.onet, + self.threshold, self.factor) + for bb in bounding_boxes: + face = Face() + face.container_image = image + face.bounding_box = np.zeros(4, dtype=np.int32) + + img_size = np.asarray(image.shape)[0:2] + face.bounding_box[0] = np.maximum(bb[0] - self.face_crop_margin / 2, 0) + face.bounding_box[1] = np.maximum(bb[1] - self.face_crop_margin / 2, 0) + face.bounding_box[2] = np.minimum(bb[2] + self.face_crop_margin / 2, img_size[1]) + 
face.bounding_box[3] = np.minimum(bb[3] + self.face_crop_margin / 2, img_size[0]) + cropped = image[face.bounding_box[1]:face.bounding_box[3], face.bounding_box[0]:face.bounding_box[2], :] + face.image = misc.imresize(cropped, (self.face_crop_size, self.face_crop_size), interp='bilinear') + + faces.append(face) + + return faces diff --git a/cv/face/facenet/tensorflow/contributed/predict.py b/cv/face/facenet/tensorflow/contributed/predict.py new file mode 100644 index 0000000000000000000000000000000000000000..bd210cb787e14937623b81243c509319819e916e --- /dev/null +++ b/cv/face/facenet/tensorflow/contributed/predict.py @@ -0,0 +1,134 @@ + + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +#---------------------------------------------------- +# MIT License +# +# Copyright (c) 2017 Rishi Rai +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +# All Rights Reserved. +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in all +# copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+#---------------------------------------------------- + + +import tensorflow as tf +import numpy as np +import argparse +import facenet +import os +import sys +import math +import pickle +from sklearn.svm import SVC +from scipy import misc +import align.detect_face +from six.moves import xrange + +def main(args): + + images, cout_per_image, nrof_samples = load_and_align_data(args.image_files,args.image_size, args.margin, args.gpu_memory_fraction) + with tf.Graph().as_default(): + + with tf.Session() as sess: + + # Load the model + facenet.load_model(args.model) + # Get input and output tensors + images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0") + embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0") + phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0") + + # Run forward pass to calculate embeddings + feed_dict = { images_placeholder: images , phase_train_placeholder:False} + emb = sess.run(embeddings, feed_dict=feed_dict) + classifier_filename_exp = os.path.expanduser(args.classifier_filename) + with open(classifier_filename_exp, 'rb') as infile: + (model, class_names) = pickle.load(infile) + print('Loaded classifier model from file "%s"\n' % classifier_filename_exp) + predictions = model.predict_proba(emb) + best_class_indices = np.argmax(predictions, axis=1) + best_class_probabilities = predictions[np.arange(len(best_class_indices)), best_class_indices] + k=0 + #print predictions + for i in range(nrof_samples): + print("\npeople in image %s :" %(args.image_files[i])) + for j in range(cout_per_image[i]): + print('%s: %.3f' % (class_names[best_class_indices[k]], best_class_probabilities[k])) + k+=1 + +def load_and_align_data(image_paths, image_size, margin, gpu_memory_fraction): + + minsize = 20 # minimum size of face + threshold = [ 0.6, 0.7, 0.7 ] # three steps's threshold + factor = 0.709 # scale factor + + print('Creating networks and loading parameters') + with tf.Graph().as_default(): + gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_memory_fraction) + sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) + with sess.as_default(): + pnet, rnet, onet = align.detect_face.create_mtcnn(sess, None) + + nrof_samples = len(image_paths) + img_list = [] + count_per_image = [] + for i in xrange(nrof_samples): + img = misc.imread(os.path.expanduser(image_paths[i])) + img_size = np.asarray(img.shape)[0:2] + bounding_boxes, _ = align.detect_face.detect_face(img, minsize, pnet, rnet, onet, threshold, factor) + count_per_image.append(len(bounding_boxes)) + for j in range(len(bounding_boxes)): + det = np.squeeze(bounding_boxes[j,0:4]) + bb = np.zeros(4, dtype=np.int32) + bb[0] = np.maximum(det[0]-margin/2, 0) + bb[1] = np.maximum(det[1]-margin/2, 0) + bb[2] = np.minimum(det[2]+margin/2, img_size[1]) + bb[3] = np.minimum(det[3]+margin/2, img_size[0]) + cropped = img[bb[1]:bb[3],bb[0]:bb[2],:] + aligned = misc.imresize(cropped, (image_size, image_size), interp='bilinear') + prewhitened = facenet.prewhiten(aligned) + img_list.append(prewhitened) + images = np.stack(img_list) + return images, count_per_image, nrof_samples + +def parse_arguments(argv): + parser = argparse.ArgumentParser() + parser.add_argument('image_files', type=str, nargs='+', help='Path(s) of the image(s)') + parser.add_argument('model', type=str, + help='Could be either a directory containing the meta_file and ckpt_file or a model protobuf (.pb) file') + parser.add_argument('classifier_filename', + 
help='Classifier model file name as a pickle (.pkl) file. ' + + 'For training this is the output and for classification this is an input.') + parser.add_argument('--image_size', type=int, + help='Image size (height, width) in pixels.', default=160) + parser.add_argument('--seed', type=int, + help='Random seed.', default=666) + parser.add_argument('--margin', type=int, + help='Margin for the crop around the bounding box (height, width) in pixels.', default=44) + parser.add_argument('--gpu_memory_fraction', type=float, + help='Upper bound on the amount of GPU memory that will be used by the process.', default=1.0) + return parser.parse_args(argv) + +if __name__ == '__main__': + main(parse_arguments(sys.argv[1:])) + diff --git a/cv/face/facenet/tensorflow/contributed/real_time_face_recognition.py b/cv/face/facenet/tensorflow/contributed/real_time_face_recognition.py new file mode 100644 index 0000000000000000000000000000000000000000..c70198392b287d341064f060675165d475e47fee --- /dev/null +++ b/cv/face/facenet/tensorflow/contributed/real_time_face_recognition.py @@ -0,0 +1,105 @@ +# coding=utf-8 +"""Performs face detection in realtime. + +Based on code from https://github.com/shanren7/real_time_face_recognition +""" +# MIT License +# +# Copyright (c) 2017 François Gervais +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +# All Rights Reserved. +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in all +# copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+import argparse +import sys +import time + +import cv2 + +import face + + +def add_overlays(frame, faces, frame_rate): + if faces is not None: + for face in faces: + face_bb = face.bounding_box.astype(int) + cv2.rectangle(frame, + (face_bb[0], face_bb[1]), (face_bb[2], face_bb[3]), + (0, 255, 0), 2) + if face.name is not None: + cv2.putText(frame, face.name, (face_bb[0], face_bb[3]), + cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), + thickness=2, lineType=2) + + cv2.putText(frame, str(frame_rate) + " fps", (10, 30), + cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), + thickness=2, lineType=2) + + +def main(args): + frame_interval = 3 # Number of frames after which to run face detection + fps_display_interval = 5 # seconds + frame_rate = 0 + frame_count = 0 + + video_capture = cv2.VideoCapture(0) + face_recognition = face.Recognition() + start_time = time.time() + + if args.debug: + print("Debug enabled") + face.debug = True + + while True: + # Capture frame-by-frame + ret, frame = video_capture.read() + + if (frame_count % frame_interval) == 0: + faces = face_recognition.identify(frame) + + # Check our current fps + end_time = time.time() + if (end_time - start_time) > fps_display_interval: + frame_rate = int(frame_count / (end_time - start_time)) + start_time = time.time() + frame_count = 0 + + add_overlays(frame, faces, frame_rate) + + frame_count += 1 + cv2.imshow('Video', frame) + + if cv2.waitKey(1) & 0xFF == ord('q'): + break + + # When everything is done, release the capture + video_capture.release() + cv2.destroyAllWindows() + + +def parse_arguments(argv): + parser = argparse.ArgumentParser() + + parser.add_argument('--debug', action='store_true', + help='Enable some debug outputs.') + return parser.parse_args(argv) + + +if __name__ == '__main__': + main(parse_arguments(sys.argv[1:])) diff --git a/cv/face/facenet/tensorflow/data/learning_rate_schedule_classifier_casia.txt b/cv/face/facenet/tensorflow/data/learning_rate_schedule_classifier_casia.txt index adf66d7e3c6186b586fb4035245c24ded0183808..cef6e67353314cd174f4b9e380e04aa17fa5e735 100644 --- a/cv/face/facenet/tensorflow/data/learning_rate_schedule_classifier_casia.txt +++ b/cv/face/facenet/tensorflow/data/learning_rate_schedule_classifier_casia.txt @@ -1,7 +1,6 @@ # Learning rate schedule # Maps an epoch number to a learning rate -0: 0.05 +0: 0.05 60: 0.005 80: 0.0005 -90: 0.00005 -121: -1 +91: -1 diff --git a/cv/face/facenet/tensorflow/data/learning_rate_schedule_classifier_casia_ddp.txt b/cv/face/facenet/tensorflow/data/learning_rate_schedule_classifier_casia_ddp.txt new file mode 100644 index 0000000000000000000000000000000000000000..76a930d2c92f3e7ee8b09bc8761930b85d102a1d --- /dev/null +++ b/cv/face/facenet/tensorflow/data/learning_rate_schedule_classifier_casia_ddp.txt @@ -0,0 +1,7 @@ +# Learning rate schedule +# Maps an epoch number to a learning rate +0: 0.03 +20: 0.003 +30: 0.0006 +40: 0.0003 +50: -1 diff --git a/cv/face/facenet/tensorflow/facenet_report.sh b/cv/face/facenet/tensorflow/facenet_report.sh new file mode 100644 index 0000000000000000000000000000000000000000..084431b15d4f76d9f933b259a8e2978172e6b437 --- /dev/null +++ b/cv/face/facenet/tensorflow/facenet_report.sh @@ -0,0 +1,101 @@ +#!/bin/bash +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); you may +# not use this file except in compliance with the License. 
You may obtain +# a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT +# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the +# License for the specific language governing permissions and limitations +# under the License. +# +# 该脚本针对Facenet单卡训练的结果日志处理,输出符合移动集采要求的模型信息。 +# 模型源码的多卡训练,需要各厂商自行适配,该脚本需要适当修改 + +# 目标精度 +target_acc=0.9905 + +echo "------------facenet report---------------" + +# 一个epoch处理的总样本数 +total_samples=`cat log/fps.txt | grep "Epoch: " | grep "Current Train Samples" | head -n 1 | awk '{print$6}'` +echo "samples(one epoch): ${total_samples}" +echo "target acc: ${target_acc}" + +echo "--------first achieve target acc-----------" + +# 首次达到目标精度时的精度结果 +first_achieve_acc=`cat log/fps.txt | grep "Current Acc" | sort -n -k 6 | awk '$6 >= '${target_acc}'' | sort -n -k 2 | head -n 1 | awk '{print$6}'` +# 首次达到目标精度时的epoch数 +current_epoch=`cat log/fps.txt | grep "Current Acc" | sort -n -k 6 | awk '$6 >= '${target_acc}'' | sort -n -k 2 | head -n 1 | awk '{print$2}'` +# 首次达到目标精度时的训练时长 +current_train_time=`cat log/fps.txt | grep "Epoch: ${current_epoch}[[:blank:]]" | grep "All Train Time" | awk '{print$6}' | sort -n -k 1 -r | head -n 1` +# 首次达到目标精度时的评估时长 +current_eval_time=`cat log/fps.txt | grep "Epoch: ${current_epoch}[[:blank:]]" | grep "All Eval Time" | awk '{print$6}' | head -n 1` +# 首次达到目标精度时的总时长 +current_total_time=`cat log/fps.txt | grep "Epoch: ${current_epoch}[[:blank:]]" | grep "All Time" | awk '{print$5}' | head -n 1` +# 当前训练最大FPS +current_max_FPS=`cat log/fps.txt | awk '$2 <= '${current_epoch}'' | grep "Current Epoch FPS" | awk '{x[$2]+=$6}END{for(i in x){print i, x[i]}}' | sort -n -k 2 -r | head -n 1 | awk '{print$2}'` +# 当前训练平均FPS +current_average_FPS=`awk -v ts=${total_samples} -v eps=${current_epoch} -v ttt=${current_train_time} 'BEGIN{print(ts*(eps)/ttt)}'` +# 当前端到端平均FPS +current_e2e_FPS=`awk -v ts=${total_samples} -v eps=${current_epoch} -v ttt=${current_total_time} 'BEGIN{print(ts*(eps)/ttt)}'` + +echo "first achieve target acc: ${first_achieve_acc}" +current_epoch=`awk -v ce=${current_epoch} 'BEGIN{print(ce+1)}'` +echo "current epoch: ${current_epoch}" +echo "current train time: ${current_train_time}" +echo "current eval time: ${current_eval_time}" +echo "current total time: ${current_total_time}" +echo "current max FPS: ${current_max_FPS}" +echo "current average FPS: ${current_average_FPS}" +echo "current e2e FPS: ${current_e2e_FPS}" + + +echo "------------achieve best acc---------------" +# 达到最优精度时的精度结果 +best_acc=`cat log/fps.txt | grep "Current Acc" | sort -n -k 6 -r | awk '{print$6}' | head -n 1` +# 达到最优精度时的epoch数 +best_acc_epoch=`cat log/fps.txt | grep "Current Acc" | sort -n -k 6 -r -k 2 | awk '{print$2}' | head -n 1` +# 达到最优精度时的训练时长 +current_train_time=`cat log/fps.txt | grep "Epoch: ${best_acc_epoch}[[:blank:]]" | grep "All Train Time" | awk '{print$6}' | sort -n -k 1 -r | head -n 1` +# 首次最优精度时的评估时长 +current_eval_time=`cat log/fps.txt | grep "Epoch: ${best_acc_epoch}[[:blank:]]" | grep "All Eval Time" | awk '{print$6}' | head -n 1` +# 首次最优精度时的总时长 +current_total_time=`cat log/fps.txt | grep "Epoch: ${best_acc_epoch}[[:blank:]]" | grep "All Time" | awk '{print$5}' | head -n 1` +# 当前训练最大FPS +current_max_FPS=`cat log/fps.txt | awk '$2 <= '${best_acc_epoch}'' | grep "Current Epoch FPS" | awk '{x[$2]+=$6}END{for(i in x){print i, x[i]}}' | sort -n -k 2 -r | head -n 1 | awk 
'{print$2}'` +# Average training FPS up to this epoch +current_average_FPS=`awk -v ts=${total_samples} -v eps=${best_acc_epoch} -v ttt=${current_train_time} 'BEGIN{print(ts*(eps)/ttt)}'` +# Average end-to-end FPS up to this epoch +current_e2e_FPS=`awk -v ts=${total_samples} -v eps=${best_acc_epoch} -v ttt=${current_total_time} 'BEGIN{print(ts*(eps)/ttt)}'` + +echo "best acc: ${best_acc}" +best_acc_epoch=`awk -v ce=${best_acc_epoch} 'BEGIN{print(ce+1)}'` +echo "best acc epoch: ${best_acc_epoch}" +echo "current train time: ${current_train_time}" +echo "current eval time: ${current_eval_time}" +echo "current total time: ${current_total_time}" +echo "current max FPS: ${current_max_FPS}" +echo "current average FPS: ${current_average_FPS}" +echo "current e2e FPS: ${current_e2e_FPS}" + + +echo "-------------total time-------------------" +# Total number of epochs +epoch_num=`cat log/fps.txt | sort -n -k 2 -r | head -n 1 | awk '{print$2}'` +# Earliest start time +start_time=`cat log/time.txt | grep "Start" | sort -n -k 3 | awk '{print$3}'` +# Latest end time +end_time=`cat log/time.txt | grep "End" | sort -n -k 3 -r | awk '{print$3}'` +# Total program running time +all_time=`awk -v st=${start_time} -v et=${end_time} 'BEGIN{print(et-st)}'` + +epoch_num=`awk -v ce=${epoch_num} 'BEGIN{print(ce+1)}'` +echo "total epoch number: ${epoch_num}" +echo "all time: ${all_time}" \ No newline at end of file diff --git a/cv/face/facenet/tensorflow/init.sh b/cv/face/facenet/tensorflow/init.sh new file mode 100644 index 0000000000000000000000000000000000000000..7631e4b8060a61c36698b8d052972bc857d99fd5 --- /dev/null +++ b/cv/face/facenet/tensorflow/init.sh @@ -0,0 +1,24 @@ +#!/bin/bash +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); you may +# not use this file except in compliance with the License. You may obtain +# a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT +# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the +# License for the specific language governing permissions and limitations +# under the License. + +PY_VERSION=$(python3 -V 2>&1 | awk '{print $2}' | awk -F '.' '{print $2}') +if [ "$PY_VERSION" == "10" ] || [ "$PY_VERSION" == "8" ] || [ "$PY_VERSION" == "9" ]; then + pip3 install -r requirements.txt + pip3 install scipy==1.7.2 +else + pip3 install -r requirements.txt + pip3 install scipy +fi diff --git a/cv/face/facenet/tensorflow/requirements.txt b/cv/face/facenet/tensorflow/requirements.txt index f4727265e30588b8ed4cf02eb12ae4c33de9e8db..469faf575c8bbb3077a6d449cbc408d96eaa00ed 100644 --- a/cv/face/facenet/tensorflow/requirements.txt +++ b/cv/face/facenet/tensorflow/requirements.txt @@ -1,4 +1,5 @@ -scipy +#tensorflow==1.7 +#scipy scikit-learn opencv-python h5py diff --git a/cv/face/facenet/tensorflow/src/align/detect_face.py b/cv/face/facenet/tensorflow/src/align/detect_face.py index 88e90528611d5ab7b5fb2d7a4d987e2c74251c52..bd23a94f489bdeabb7a94eb34cb2cd31946bb04b 100644 --- a/cv/face/facenet/tensorflow/src/align/detect_face.py +++ b/cv/face/facenet/tensorflow/src/align/detect_face.py @@ -4,6 +4,8 @@ https://github.com/kpzhang93/MTCNN_face_detection_alignment # MIT License # # Copyright (c) 2016 David Sandberg +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +# All Rights Reserved.
# # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal @@ -82,7 +84,7 @@ class Network(object): session: The current TensorFlow session ignore_missing: If true, serialized weights for missing layers are ignored. """ - data_dict = np.load(data_path, encoding='latin1',allow_pickle=True).item() #pylint: disable=no-member + data_dict = np.load(data_path, encoding='latin1').item() #pylint: disable=no-member for op_name in data_dict: with tf.variable_scope(op_name, reuse=True): diff --git a/cv/face/facenet/tensorflow/src/facenet.py b/cv/face/facenet/tensorflow/src/facenet.py index be95d6ba229a4ab0896b29ad86a8de2635c58b41..2889f41fe8dafa5a0184d6214275ea2405c1bd11 100644 --- a/cv/face/facenet/tensorflow/src/facenet.py +++ b/cv/face/facenet/tensorflow/src/facenet.py @@ -3,6 +3,8 @@ # MIT License # # Copyright (c) 2016 David Sandberg +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +# All Rights Reserved. # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal @@ -23,8 +25,6 @@ # SOFTWARE. # pylint: disable=missing-docstring -# Copyright (c) 2023, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. -# All Rights Reserved. from __future__ import absolute_import from __future__ import division from __future__ import print_function @@ -107,7 +107,7 @@ FLIP = 16 def create_input_pipeline(input_queue, image_size, nrof_preprocess_threads, batch_size_placeholder, hvd_rank=-1, hvd=None): if hvd_rank == -1: - with tf.name_scope("tempscope"): + with tf.name_scope("tempscope"),tf.device('/cpu:0'): images_and_labels_list = [] for _ in range(nrof_preprocess_threads): filenames, label, control = input_queue.dequeue() @@ -143,85 +143,34 @@ def create_input_pipeline(input_queue, image_size, nrof_preprocess_threads, batc return image_batch, label_batch else: - with tf.name_scope("tempscope"): - images_and_labels_list = [] - for _ in range(nrof_preprocess_threads): - filenames, label, control = input_queue.dequeue() - images = [] - for filename in tf.unstack(filenames): - file_contents = tf.read_file(filename) - image = tf.image.decode_image(file_contents, 3) - image = tf.cond(get_control_flag(control[0], RANDOM_ROTATE), - lambda:tf.py_func(random_rotate_image, [image], tf.uint8), - lambda:tf.identity(image)) - image = tf.cond(get_control_flag(control[0], RANDOM_CROP), - lambda:tf.random_crop(image, image_size + (3,)), - lambda:tf.image.resize_image_with_crop_or_pad(image, image_size[0], image_size[1])) - image = tf.cond(get_control_flag(control[0], RANDOM_FLIP), - lambda:tf.image.random_flip_left_right(image), - lambda:tf.identity(image)) - image = tf.cond(get_control_flag(control[0], FIXED_STANDARDIZATION), - lambda:(tf.cast(image, tf.float32) - 127.5)/128.0, - lambda:tf.cast(tf.image.per_image_standardization(image),tf.float32),tf.float32) - image = tf.cond(get_control_flag(control[0], FLIP), - lambda:tf.image.flip_left_right(image), - lambda:tf.identity(image)) - #pylint: disable=no-member - image.set_shape(image_size + (3,)) - images.append(image) - images_and_labels_list.append([images, label]) - - diff_value = hvd.size() - len(images_and_labels_list) - if diff_value > 0: - for i in range(diff_value): - images_and_labels_list.append(images_and_labels_list[i]) - images_and_labels_list = [x for i, x in enumerate(images_and_labels_list) if i % hvd.size() == hvd.rank()] - image_batch, 
label_batch = tf.train.batch_join( - images_and_labels_list, batch_size=batch_size_placeholder, - shapes=[image_size + (3,), ()], enqueue_many=True, - capacity=4 * nrof_preprocess_threads * 100, + with tf.name_scope("tempscope"),tf.device('/cpu:0'): + filename, label, control = input_queue.dequeue() + file_contents = tf.read_file(filename[0]) + image = tf.image.decode_image(file_contents, 3) + image = tf.cond(get_control_flag(control[0], RANDOM_ROTATE), + lambda:tf.py_func(random_rotate_image, [image], tf.uint8), + lambda:tf.identity(image)) + image = tf.cond(get_control_flag(control[0], RANDOM_CROP), + lambda:tf.random_crop(image, image_size + (3,)), + lambda:tf.image.resize_image_with_crop_or_pad(image, image_size[0], image_size[1])) + image = tf.cond(get_control_flag(control[0], RANDOM_FLIP), + lambda:tf.image.random_flip_left_right(image), + lambda:tf.identity(image)) + image = tf.cond(get_control_flag(control[0], FIXED_STANDARDIZATION), + lambda:(tf.cast(image, tf.float32) - 127.5)/128.0, + lambda:tf.cast(tf.image.per_image_standardization(image),tf.float32),tf.float32) + image = tf.cond(get_control_flag(control[0], FLIP), + lambda:tf.image.flip_left_right(image), + lambda:tf.identity(image)) + #pylint: disable=no-member + image.set_shape(image_size + (3,)) + image_batch, label_batch = tf.train.batch( + [image, label[0]], + batch_size=batch_size_placeholder, shapes=[image_size+(3, ), ()], + capacity=4 * nrof_preprocess_threads * 100, num_threads=nrof_preprocess_threads, allow_smaller_final_batch=True) - return image_batch, label_batch -def create_input_pipeline_(input_queue, image_size, nrof_preprocess_threads, batch_size_placeholder): - - with tf.name_scope("tempscope"): - images_and_labels_list = [] - for _ in range(nrof_preprocess_threads): - filenames, label, control = input_queue.dequeue() - images = [] - for filename in tf.unstack(filenames): - file_contents = tf.read_file(filename) - image = tf.image.decode_image(file_contents, 3) - image = tf.cond(get_control_flag(control[0], RANDOM_ROTATE), - lambda:tf.py_func(random_rotate_image, [image], tf.uint8), - lambda:tf.identity(image)) - image = tf.cond(get_control_flag(control[0], RANDOM_CROP), - lambda:tf.random_crop(image, image_size + (3,)), - lambda:tf.image.resize_image_with_crop_or_pad(image, image_size[0], image_size[1])) - image = tf.cond(get_control_flag(control[0], RANDOM_FLIP), - lambda:tf.image.random_flip_left_right(image), - lambda:tf.identity(image)) - image = tf.cond(get_control_flag(control[0], FIXED_STANDARDIZATION), - lambda:(tf.cast(image, tf.float32) - 127.5)/128.0, - lambda:tf.cast(tf.image.per_image_standardization(image),tf.float32),tf.float32) - image = tf.cond(get_control_flag(control[0], FLIP), - lambda:tf.image.flip_left_right(image), - lambda:tf.identity(image)) - #pylint: disable=no-member - image.set_shape(image_size + (3,)) - images.append(image) - images_and_labels_list.append([images, label]) - - image_batch, label_batch = tf.train.batch_join( - images_and_labels_list, batch_size=batch_size_placeholder, - shapes=[image_size + (3,), ()], enqueue_many=True, - capacity=4 * nrof_preprocess_threads * 100, - allow_smaller_final_batch=True) - - return image_batch, label_batch - def get_control_flag(control, field): return tf.equal(tf.mod(tf.floor_div(control, field), 2), 1) @@ -256,9 +205,9 @@ def train(total_loss, global_step, optimizer, learning_rate, moving_average_deca # Generate moving averages of all losses and associated summaries. 
loss_averages_op = _add_loss_summaries(total_loss) - if hvd is not None: - lr_scaler = hvd.size() - learning_rate = learning_rate * lr_scaler + # if hvd is not None: + # lr_scaler = hvd.size() + # learning_rate = learning_rate * lr_scaler # Compute gradients. with tf.control_dependencies([loss_averages_op]): @@ -277,7 +226,7 @@ def train(total_loss, global_step, optimizer, learning_rate, moving_average_deca raise ValueError('Invalid optimization algorithm') if hvd is not None: - opt = hvd.DistributedOptimizer(opt, op=hvd.Average) + opt = hvd.DistributedOptimizer(opt, op=hvd.Sum,groups=1) #opt=tf.train.experimental.enable_mixed_precision_graph_rewrite(opt) diff --git a/cv/face/facenet/tensorflow/src/models/inception_resnet_v1.py b/cv/face/facenet/tensorflow/src/models/inception_resnet_v1.py index 67054c1563955bf96f83a4b7b06c877802070d8f..0fce235594c79d0be7576a7d2ecf2d206f5bf2e3 100644 --- a/cv/face/facenet/tensorflow/src/models/inception_resnet_v1.py +++ b/cv/face/facenet/tensorflow/src/models/inception_resnet_v1.py @@ -1,4 +1,6 @@ # Copyright 2016 The TensorFlow Authors. All Rights Reserved. +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +# All Rights Reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -12,8 +14,7 @@ # See the License for the specific language governing permissions and # limitations under the License. # ============================================================================== -# Copyright (c) 2023, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. -# All Rights Reserved. + """Contains the definition of the Inception Resnet V1 architecture. As described in http://arxiv.org/abs/1602.07261. Inception-v4, Inception-ResNet and the Impact of Residual Connections @@ -130,7 +131,7 @@ def reduction_b(net): return net def inference(images, keep_probability, phase_train=True, - bottleneck_layer_size=128, weight_decay=0.0, reuse=None): + bottleneck_layer_size=128, weight_decay=0.0, reuse=None, seed=None): batch_norm_params = { # Decay for the moving averages. 'decay': 0.995, @@ -143,18 +144,19 @@ def inference(images, keep_probability, phase_train=True, } with slim.arg_scope([slim.conv2d, slim.fully_connected], - weights_initializer=slim.initializers.xavier_initializer(), + weights_initializer=slim.initializers.xavier_initializer(seed=seed), weights_regularizer=slim.l2_regularizer(weight_decay), normalizer_fn=slim.batch_norm, normalizer_params=batch_norm_params): return inception_resnet_v1(images, is_training=phase_train, - dropout_keep_prob=keep_probability, bottleneck_layer_size=bottleneck_layer_size, reuse=reuse) + dropout_keep_prob=keep_probability, bottleneck_layer_size=bottleneck_layer_size, reuse=reuse, seed=seed) def inception_resnet_v1(inputs, is_training=True, dropout_keep_prob=0.8, bottleneck_layer_size=128, reuse=None, + seed=None, scope='InceptionResnetV1'): """Creates the Inception Resnet V1 model. 
Args: @@ -238,7 +240,7 @@ def inception_resnet_v1(inputs, is_training=True, net = slim.flatten(net) net = slim.dropout(net, dropout_keep_prob, is_training=is_training, - scope='Dropout') + scope='Dropout', seed=seed) end_points['PreLogitsFlatten'] = net diff --git a/cv/face/facenet/tensorflow/src/train_softmax.py b/cv/face/facenet/tensorflow/src/train_softmax.py index d42dd392518ab58c48d630f05b2455e64f5db532..a97d8b20e0d543f0b8ebd4f4d5af79eb4269b875 100644 --- a/cv/face/facenet/tensorflow/src/train_softmax.py +++ b/cv/face/facenet/tensorflow/src/train_softmax.py @@ -3,6 +3,8 @@ # MIT License # # Copyright (c) 2016 David Sandberg +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +# All Rights Reserved. # # Permission is hereby granted, free of charge, to any person obtaining a copy # of this software and associated documentation files (the "Software"), to deal @@ -21,8 +23,7 @@ # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE # SOFTWARE. -# Copyright (c) 2023, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. -# All Rights Reserved. + from __future__ import absolute_import from __future__ import division from __future__ import print_function @@ -46,6 +47,23 @@ from tensorflow.python.ops import data_flow_ops from tensorflow.python.framework import ops from tensorflow.python.ops import array_ops +RANK_ID = 0 +if "RANK_ID" in os.environ: + RANK_ID = int(os.environ["RANK_ID"]) +RANK_SIZE = 1 +if "RANK_SIZE" in os.environ: + RANK_SIZE = int(os.environ["RANK_SIZE"]) + +def broadcast_global_variables(root_rank): + op_list = [] + for var in tf.global_variables(): + if "float" in var.dtype.name: + inputs = [var] + outputs = hccl_ops.broadcast(tensor=inputs, root_rank=root_rank) + if outputs is not None: + op_list.append(outputs[0].op) + op_list.append(tf.assign(var, outputs[0])) + return tf.group(op_list) def main(args): network = importlib.import_module(args.model_def) @@ -109,14 +127,13 @@ def main(args): # Create a queue that produces indices into the image_list and label_list labels = ops.convert_to_tensor(label_list, dtype=tf.int32) range_size = array_ops.shape(labels)[0] - index_queue = tf.train.range_input_producer(range_size, num_epochs=None, + index_queue = tf.train.range_input_producer(range_size, num_epochs=None, shuffle=True, seed=None, capacity=32) index_dequeue_op = index_queue.dequeue_many(args.batch_size*args.epoch_size, 'index_dequeue') learning_rate_placeholder = tf.placeholder(tf.float32, name='learning_rate') batch_size_placeholder = tf.placeholder(tf.int32, name='batch_size') - print('batch_size_placeholder',batch_size_placeholder) phase_train_placeholder = tf.placeholder(tf.bool, name='phase_train') image_paths_placeholder = tf.placeholder(tf.string, shape=(None,1), name='image_paths') labels_placeholder = tf.placeholder(tf.int32, shape=(None,1), name='labels') @@ -192,10 +209,14 @@ def main(args): # Start running operations on the Graph. 
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction) gpu_options.allow_growth = True - sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) sess.run(tf.global_variables_initializer()) sess.run(tf.local_variables_initializer()) + if RANK_SIZE > 1: + bcast_global_variables_op = broadcast_global_variables(root_rank=0) + sess.run(bcast_global_variables_op) + train_op = util.set_iteration_per_loop(sess, apply_gradient_op, args.iterations_per_loop) + summary_writer = tf.summary.FileWriter(log_dir, sess.graph) coord = tf.train.Coordinator() tf.train.start_queue_runners(coord=coord, sess=sess) @@ -203,14 +224,9 @@ with sess.as_default(): if pretrained_model: - print('----------------------Restoring pretrained model: %s' % pretrained_model) + print('Restoring pretrained model: %s' % pretrained_model) saver.restore(sess, pretrained_model) - #ckpt = tf.train.get_checkpoint_state(pretrained_model) - #if ckpt and ckpt.model_checkpoint_path: - # saver.restore(sess, ckpt.model_checkpoint_path) - # print("--------------------------------restore module success!") - # Training and validation loop print('Running training') nrof_steps = args.max_nrof_epochs*args.epoch_size @@ -233,6 +249,16 @@ 'time_evaluate': np.zeros((args.max_nrof_epochs,), np.float32), 'prelogits_hist': np.zeros((args.max_nrof_epochs, 1000), np.float32), } + # Declare these as global variables + global fw + global total_training_time + global total_samples + global total_eval_time + global total_time + # Directory where the FPS log is written + fps_dir = 'log/' + os.makedirs(fps_dir, exist_ok=True) + fw = open(os.path.join(fps_dir, f'fps.txt'), 'a+', encoding='utf-8') for epoch in range(1,args.max_nrof_epochs+1): step = sess.run(global_step, feed_dict=None) # Train for one epoch @@ -267,7 +293,7 @@ print('Saving statistics') with h5py.File(stat_file_name, 'w') as f: - for key, value in stat.items(): + for key, value in stat.iteritems(): f.create_dataset(key, data=value) return model_dir @@ -292,7 +318,6 @@ def filter_dataset(dataset, data_filename, percentile, min_nrof_images_per_class for i in indices: label = label_list[i] image = image_list[i] - print("image:{}".format(image)) if image in filtered_dataset[label].image_paths: filtered_dataset[label].image_paths.remove(image) if len(filtered_dataset[label].image_paths)0, 'The training set should not be empty' val_image_list, val_label_list = facenet.get_image_paths_and_labels(val_set) - + + args.epoch_size = len(image_list) // (args.batch_size * hvd.size()) + # Create a queue that produces indices into the image_list and label_list labels = ops.convert_to_tensor(label_list, dtype=tf.int32) range_size = array_ops.shape(labels)[0] index_queue = tf.train.range_input_producer(range_size, num_epochs=None, - shuffle=True, seed=None, capacity=32) + shuffle=True, seed=args.seed, capacity=32) - index_dequeue_op = index_queue.dequeue_many(args.batch_size*args.epoch_size, 'index_dequeue') + index_dequeue_op = index_queue.dequeue_many(args.batch_size*args.epoch_size*hvd.size(), 'index_dequeue') learning_rate_placeholder = tf.placeholder(tf.float32, name='learning_rate') batch_size_placeholder = tf.placeholder(tf.int32, name='batch_size') - print('batch_size_placeholder',batch_size_placeholder) phase_train_placeholder = tf.placeholder(tf.bool, name='phase_train') image_paths_placeholder = tf.placeholder(tf.string, shape=(None,1), name='image_paths') labels_placeholder = tf.placeholder(tf.int32, shape=(None,1), name='labels')
control_placeholder = tf.placeholder(tf.int32, shape=(None,1), name='control') - + nrof_preprocess_threads = 4 input_queue = data_flow_ops.FIFOQueue(capacity=2000000, dtypes=[tf.string, tf.int32, tf.int32], @@ -148,20 +160,19 @@ def main(args): image_batch = tf.identity(image_batch, 'input') label_batch = tf.identity(label_batch, 'label_batch') - print('Number of classes in training set: %d' % nrof_classes) - print('Number of examples in training set: %d' % len(image_list)) - - print('Number of classes in validation set: %d' % len(val_set)) - print('Number of examples in validation set: %d' % len(val_image_list)) - - print('Building training graph') + if hvd.rank() == 0: + print('Number of classes in training set: %d' % nrof_classes) + print('Number of examples in training set: %d' % len(image_list)) + print('Number of classes in validation set: %d' % len(val_set)) + print('Number of examples in validation set: %d' % len(val_image_list)) + print('Building training graph') # Build the inference graph prelogits, _ = network.inference(image_batch, args.keep_probability, phase_train=phase_train_placeholder, bottleneck_layer_size=args.embedding_size, - weight_decay=args.weight_decay) + weight_decay=args.weight_decay, seed=args.seed) logits = slim.fully_connected(prelogits, len(train_set), activation_fn=None, - weights_initializer=slim.initializers.xavier_initializer(), + weights_initializer=slim.initializers.xavier_initializer(seed=args.seed), weights_regularizer=slim.l2_regularizer(args.weight_decay), scope='Logits', reuse=False) @@ -211,30 +222,20 @@ def main(args): gpu_options.allow_growth = True config = tf.ConfigProto(gpu_options=gpu_options) - #config.gpu_options.visible_device_list = str(hvd.local_rank()) os.environ['CUDA_VISIBLE_DEVICES'] = str(hvd.local_rank()) config.log_device_placement = False - # sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) - # sess = tf.Session(config=config) sess = tf.train.MonitoredTrainingSession(config=config, hooks=hooks) - # sess.run(tf.global_variables_initializer()) - # sess.run(tf.local_variables_initializer()) + summary_writer = tf.summary.FileWriter(log_dir, sess.graph) coord = tf.train.Coordinator() tf.train.start_queue_runners(coord=coord, sess=sess) - # with sess.as_default(): with tf.train.MonitoredTrainingSession(config=config, hooks=hooks) as sess: if pretrained_model: - print('----------------------Restoring pretrained model: %s' % pretrained_model) + print('Restoring pretrained model: %s' % pretrained_model) saver.restore(sess, pretrained_model) - # ckpt = tf.train.get_checkpoint_state(pretrained_model) - # if ckpt and ckpt.model_checkpoint_path: - # saver.restore(sess, ckpt.model_checkpoint_path) - # print("--------------------------------restore module success!") - # Training and validation loop if hvd.rank() == 0: print('Running training') @@ -258,8 +259,22 @@ def main(args): 'time_evaluate': np.zeros((args.max_nrof_epochs,), np.float32), 'prelogits_hist': np.zeros((args.max_nrof_epochs, 1000), np.float32), } + # Declare these as global variables + global fw + global total_training_time + global total_samples + global total_eval_time + global total_time + # Directory where the FPS log is written + fps_dir = 'log/' + if hvd.rank() == 0: + os.makedirs(fps_dir, exist_ok=True) + fw = open(os.path.join(fps_dir, f'fps.txt'), 'a+', encoding='utf-8') + for epoch in range(1, args.max_nrof_epochs+1): step = sess.run(global_step, feed_dict=None) + # Start time of this epoch + epoch_start = time.time() # Train for one epoch t = time.time() cont = train(args, sess,
epoch, image_list, label_list, index_dequeue_op, enqueue_op, image_paths_placeholder, labels_placeholder, @@ -285,10 +300,14 @@ def main(args): save_variables_and_metagraph(sess, saver, summary_writer, model_dir, subdir, epoch) # Evaluate on LFW + # Evaluation accuracy + acc = 0 + # Evaluation start time for this epoch + eval_start = time.time() t = time.time() if args.lfw_dir: if hvd.rank() == 0: - evaluate(sess, enqueue_op, image_paths_placeholder, labels_placeholder, phase_train_placeholder, batch_size_placeholder, control_placeholder, + acc = evaluate(sess, enqueue_op, image_paths_placeholder, labels_placeholder, phase_train_placeholder, batch_size_placeholder, control_placeholder, embeddings, label_batch, lfw_paths, actual_issame, args.lfw_batch_size, args.lfw_nrof_folds, log_dir, step, summary_writer, stat, epoch, args.lfw_distance_metric, args.lfw_subtract_mean, args.lfw_use_flipped_images, args.use_fixed_image_standardization) stat['time_evaluate'][epoch-1] = time.time() - t @@ -298,6 +317,26 @@ def main(args): with h5py.File(stat_file_name, 'w') as f: for key, value in stat.items(): f.create_dataset(key, data=value) + + # End time of this epoch + epoch_end = time.time() + # Compute the evaluation time of this epoch + cur_epoch_eval_time = epoch_end - eval_start + # Accumulate the total evaluation time + total_eval_time += cur_epoch_eval_time + # Accumulate the total elapsed time + total_time += (epoch_end - epoch_start) + if (fw is not None) and (hvd.rank() == 0): + fw.write('Epoch: {}\tCurrent End Time: {}\n'.format(epoch, epoch_end)) + fw.write('Epoch: {}\tCurrent Eval Time: {}\n'.format(epoch, cur_epoch_eval_time)) + fw.write('Epoch: {}\tCurrent Acc(%) Score: {}\n'.format(epoch, acc)) + fw.write('Epoch: {}\tAll Train Samples: {}\n'.format(epoch, total_samples)) + fw.write('Epoch: {}\tAll Train Time: {}\n'.format(epoch, total_training_time)) + fw.write('Epoch: {}\tAll Eval Time: {}\n'.format(epoch, total_eval_time)) + fw.write('Epoch: {}\tAll Time: {}\n'.format(epoch, total_time)) + fw.flush() + if (fw is not None) and (hvd.rank() == 0): + fw.close() return model_dir @@ -352,8 +391,12 @@ def train(args, sess, epoch, image_list, label_list, index_dequeue_op, enqueue_o return False index_epoch = sess.run(index_dequeue_op) - label_epoch = np.array(label_list)[index_epoch] - image_epoch = np.array(image_list)[index_epoch] + + images_per_gpu = args.batch_size*args.epoch_size + start_index = hvd.rank() * images_per_gpu + end_index = (hvd.rank() + 1) * images_per_gpu + label_epoch = np.array(label_list)[index_epoch[start_index : end_index]] + image_epoch = np.array(image_list)[index_epoch[start_index : end_index]] # Enqueue one epoch of image paths and labels labels_array = np.expand_dims(np.array(label_epoch),1) @@ -362,10 +405,25 @@ def train(args, sess, epoch, image_list, label_list, index_dequeue_op, enqueue_o control_array = np.ones_like(labels_array) * control_value sess.run(enqueue_op, {image_paths_placeholder: image_paths_array, labels_placeholder: labels_array, control_placeholder: control_array}) + # Start time of this epoch + epoch_start = time.time() + # Number of samples in this epoch + epoch_samples = 0 + # Training time of this epoch + cur_epoch_train_time = 0 + + global total_samples + global fw + global total_training_time + + # Training loop train_time = 0 all_fps=[] while batch_number < args.epoch_size: + total_samples += (args.batch_size * hvd.size()) + epoch_samples += (args.batch_size * hvd.size()) + # Start time of this step + train_start_time = time.time() start_time = time.time() feed_dict = {learning_rate_placeholder: lr, phase_train_placeholder:True, batch_size_placeholder:args.batch_size} tensor_list = [loss, train_op, step, reg_losses,
prelogits, cross_entropy_mean, learning_rate, prelogits_norm, accuracy, prelogits_center_loss] @@ -374,8 +432,12 @@ def train(args, sess, epoch, image_list, label_list, index_dequeue_op, enqueue_o summary_writer.add_summary(summary_str, global_step=step_) else: loss_, _, step_, reg_losses_, prelogits_, cross_entropy_mean_, lr_, prelogits_norm_, accuracy_, center_loss_ = sess.run(tensor_list, feed_dict=feed_dict) - + duration = time.time() - start_time + cur_epoch_train_time += (time.time() - train_start_time) + if not math.isfinite(loss_): + print("Loss is {}, stopping training".format(loss_)) + sys.exit(1) stat['loss'][step_-1] = loss_ stat['center_loss'][step_-1] = center_loss_ stat['reg_loss'][step_-1] = np.sum(reg_losses_) @@ -394,6 +456,15 @@ def train(args, sess, epoch, image_list, label_list, index_dequeue_op, enqueue_o train_time += duration if hvd.rank() == 0: print("AVG FPS: ", sum(all_fps)/len(all_fps)) + # Accumulate the total training time + total_training_time += cur_epoch_train_time + if (fw is not None) and (hvd.rank() == 0): + fw.write('Epoch: {}\tCurrent Start Time: {}\n'.format(epoch, epoch_start)) + fw.write('Epoch: {}\tCurrent Train Samples: {}\n'.format(epoch, epoch_samples)) + fw.write('Epoch: {}\tCurrent Train Time: {}\n'.format(epoch, cur_epoch_train_time)) + fw.write('Epoch: {}\tCurrent Epoch FPS: {}\n'.format(epoch, epoch_samples / cur_epoch_train_time)) + fw.flush() + # Add validation loss and accuracy to summary summary = tf.Summary() #pylint: disable=maybe-no-member @@ -502,7 +573,7 @@ def evaluate(sess, enqueue_op, image_paths_placeholder, labels_placeholder, phas f.write('%d\t%.5f\t%.5f\n' % (step, np.mean(accuracy), val)) stat['lfw_accuracy'][epoch-1] = np.mean(accuracy) stat['lfw_valrate'][epoch-1] = val - + return np.mean(accuracy) def save_variables_and_metagraph(sess, saver, summary_writer, model_dir, model_name, step): # Save the model checkpoint @@ -516,7 +587,6 @@ def save_variables_and_metagraph(sess, saver, summary_writer, model_dir, model_n session = session._sess return session - # saver.save(sess, checkpoint_path, global_step=step, write_meta_graph=False) saver.save(get_session(sess), checkpoint_path, global_step=step, write_meta_graph=False) save_time_variables = time.time() - start_time print('Variables saved in %.2f seconds' % save_time_variables) @@ -558,7 +628,7 @@ def parse_arguments(argv): parser.add_argument('--image_size', type=int, help='Image size (height, width) in pixels.', default=160) parser.add_argument('--epoch_size', type=int, - help='Number of batches per epoch.', default=125) + help='Number of batches per epoch.', default=1000) parser.add_argument('--embedding_size', type=int, help='Dimensionality of the embedding.', default=128) parser.add_argument('--random_crop', @@ -632,8 +702,22 @@ def parse_arguments(argv): parser.add_argument('--lfw_subtract_mean', help='Subtract feature mean before calculating distance.', action='store_true') return parser.parse_args(argv) - + if __name__ == '__main__': + hvd.init() + time_fw = None + + # time_fw is the file object for the timing log, written to 'log/time.txt' + if hvd.rank() == 0: + print("[info] hvd.size = ", hvd.size()) + os.makedirs('log/', exist_ok=True) + time_fw = open(os.path.join('log/', f'time.txt'), 'a+', encoding='utf-8') + # Write the program start time to time_fw + time_fw.write('Start Time: {:.6f}\n'.format(time.time())) main(parse_arguments(sys.argv[1:])) - + # Record the program end time + if hvd.rank() == 0: + time_fw.write('End Time: {}\n'.format(time.time())) + time_fw.flush() + time_fw.close() diff --git a/cv/face/facenet/tensorflow/train8p.sh
b/cv/face/facenet/tensorflow/train8p.sh new file mode 100644 index 0000000000000000000000000000000000000000..38cb23a71e50c8be6b62a9b1f44ec37e233adc63 --- /dev/null +++ b/cv/face/facenet/tensorflow/train8p.sh @@ -0,0 +1,49 @@ +#!/bin/bash +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); you may +# not use this file except in compliance with the License. You may obtain +# a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT +# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the +# License for the specific language governing permissions and limitations +# under the License. + +start_time=$(date +%s) + +horovodrun -np 16 --gloo python3 src/train_softmax_ddp.py \ + --logs_base_dir ./logs/facenet/ \ + --models_base_dir ./src/models/ \ + --data_dir ./data/webface_182_44 \ + --image_size 160 \ + --model_def models.inception_resnet_v1 \ + --lfw_dir ./data/lfw_data/lfw_160/ \ + --learning_rate -1 \ + --batch_size 128 \ + --optimizer ADAM \ + --max_nrof_epochs 500 \ + --keep_probability 0.8 \ + --random_flip \ + --random_crop \ + --use_fixed_image_standardization \ + --learning_rate_schedule_file ./data/learning_rate_schedule_classifier_casia_ddp.txt \ + --weight_decay 5e-4 \ + --embedding_size 512 \ + --lfw_distance_metric 1 \ + --lfw_use_flipped_images \ + --lfw_subtract_mean \ + --validation_set_split_ratio 0.01 \ + --validate_every_n_epochs 5 \ + --prelogits_norm_loss_factor 5e-4 \ + --gpu_memory_fraction 0.9 \ + --seed 43 \ + --epoch_size 200 2>&1 | tee train.log + +end_time=$(date +%s) +e2e_time=$(($end_time - $start_time)) +echo "end to end time: $e2e_time" >>total_time.log diff --git a/cv/face/facenet/tensorflow/train_facenet.sh b/cv/face/facenet/tensorflow/train_facenet.sh index 20c033b6ec027dadbe2308574b64b9d02172aba9..63a5b607fc5819bbeb3c7e27d7c6c52f4a81c9f4 100644 --- a/cv/face/facenet/tensorflow/train_facenet.sh +++ b/cv/face/facenet/tensorflow/train_facenet.sh @@ -1,4 +1,5 @@ -# Copyright (c) 2023, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +#!/bin/bash +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. # All Rights Reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); you may @@ -13,18 +14,17 @@ # License for the specific language governing permissions and limitations # under the License. - python3 src/train_softmax.py \ --logs_base_dir ./logs/facenet/ \ --models_base_dir ./src/models/ \ - --data_dir ./data/webface_160/ \ + --data_dir ./data/webface_182_44/ \ --image_size 160 \ --model_def models.inception_resnet_v1 \ --lfw_dir ./data/lfw_data/lfw_160/ \ --optimizer ADAM \ --learning_rate -1 \ - --batch_size 90 \ - --max_nrof_epochs 500\ + --batch_size 520 \ + --max_nrof_epochs 500 \ --keep_probability 0.8 \ --random_crop \ --random_flip \ diff --git a/cv/face/facenet/tensorflow/train_facenet_ddp.sh b/cv/face/facenet/tensorflow/train_facenet_ddp.sh new file mode 100644 index 0000000000000000000000000000000000000000..ce406a10de98f86e927fa63554f7a2286203b103 --- /dev/null +++ b/cv/face/facenet/tensorflow/train_facenet_ddp.sh @@ -0,0 +1,53 @@ +#!/bin/bash +# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd. +# All Rights Reserved. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); you may +# not use this file except in compliance with the License. You may obtain +# a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT +# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the +# License for the specific language governing permissions and limitations +# under the License. + +EXIT_STATUS=0 +check_status() { + if ((${PIPESTATUS[0]} != 0)); then + EXIT_STATUS=1 + fi +} + +horovodrun -np 16 --gloo python3 src/train_softmax_ddp.py \ + --logs_base_dir ./logs/facenet/ \ + --models_base_dir ./src/models/ \ + --data_dir ./data/webface_182_44 \ + --image_size 160 \ + --model_def models.inception_resnet_v1 \ + --lfw_dir ./data/lfw_data/lfw_160/ \ + --learning_rate -1 \ + --batch_size 128 \ + --optimizer ADAM \ + --max_nrof_epochs 500 \ + --keep_probability 0.8 \ + --random_flip \ + --random_crop \ + --use_fixed_image_standardization \ + --learning_rate_schedule_file ./data/learning_rate_schedule_classifier_casia_ddp.txt \ + --weight_decay 5e-4 \ + --embedding_size 512 \ + --lfw_distance_metric 1 \ + --lfw_use_flipped_images \ + --lfw_subtract_mean \ + --validation_set_split_ratio 0.01 \ + --validate_every_n_epochs 5 \ + --prelogits_norm_loss_factor 5e-4 \ + --gpu_memory_fraction 0.9 \ + --seed 43 \ + --epoch_size 200 "$@" +check_status + +exit ${EXIT_STATUS}