diff --git a/CHANGELOG.md b/CHANGELOG.md index 8704ef4b..056f3179 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Added - CVAT-3D: support lidar data on the server side () +- GPU support for Mask-RCNN and improvement in its deployment time () - CVAT-3D: Load all frames corresponding to the job instance () - Intelligent scissors with OpenCV javascript () @@ -23,7 +24,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Updated HTTPS install README section (cleanup and described more robust deploy) - Logstash is improved for using with configurable elasticsearch outputs () -- Bumped nuclio version to 1.5.16 +- Bumped nuclio version to 1.5.16 () - All methods for interative segmentation accept negative points as well - Persistent queue added to logstash () @@ -36,7 +37,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - ### Fixed - +- More robust execution of nuclio GPU functions by limiting the GPU memory consumption per worker () - Kibana startup initialization () - The cursor jumps to the end of the line when renaming a task () - SSLCertVerificationError when remote source is used () diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 277d2bc5..5fb9af05 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -122,10 +122,10 @@ You develop CVAT under WSL (Windows subsystem for Linux) following next steps. ### DL models as serverless functions -Install [nuclio platform](https://github.com/nuclio/nuclio): +Follow this [guide](/cvat/apps/documentation/installation_automatic_annotation.md) to install Nuclio: - You have to install `nuctl` command line tool to build and deploy serverless - functions. Download [the latest release](https://github.com/nuclio/nuclio/blob/development/docs/reference/nuctl/nuctl.md#download). + functions. 
- The simplest way to explore Nuclio is to run its graphical user interface (GUI) of the Nuclio dashboard. All you need in order to run the dashboard is Docker. See [nuclio documentation](https://github.com/nuclio/nuclio#quick-start-steps) diff --git a/README.md b/README.md index ed2ca0fe..dd8b68c6 100644 --- a/README.md +++ b/README.md @@ -80,7 +80,7 @@ For more information about supported formats look at the | [f-BRS](/serverless/pytorch/saic-vul/fbrs/nuclio) | interactor | PyTorch | X | | | [Inside-Outside Guidance](/serverless/pytorch/shiyinzhang/iog/nuclio) | interactor | PyTorch | X | | | [Faster RCNN](/serverless/tensorflow/faster_rcnn_inception_v2_coco/nuclio) | detector | TensorFlow | X | X | -| [Mask RCNN](/serverless/tensorflow/matterport/mask_rcnn/nuclio) | detector | TensorFlow | X | | +| [Mask RCNN](/serverless/tensorflow/matterport/mask_rcnn/nuclio) | detector | TensorFlow | X | X | diff --git a/cvat/apps/documentation/installation.md b/cvat/apps/documentation/installation.md index fdd49faa..7bdbab73 100644 --- a/cvat/apps/documentation/installation.md +++ b/cvat/apps/documentation/installation.md @@ -290,7 +290,7 @@ docker-compose -f docker-compose.yml \ ### Semi-automatic and automatic annotation -Please follow [instructions](/cvat/apps/documentation/installation_automatic_annotation.md) +Please follow this [guide](/cvat/apps/documentation/installation_automatic_annotation.md). ### Stop all containers diff --git a/cvat/apps/documentation/installation_automatic_annotation.md b/cvat/apps/documentation/installation_automatic_annotation.md index e68ce8b4..8b69fa53 100644 --- a/cvat/apps/documentation/installation_automatic_annotation.md +++ b/cvat/apps/documentation/installation_automatic_annotation.md @@ -53,47 +53,80 @@ - See [deploy_cpu.sh](/serverless/deploy_cpu.sh) for more examples. #### GPU Support - - You will need to install Nvidia Container Toolkit and make sure your docker supports GPU. 
Follow [Nvidia docker instructions](https://www.tensorflow.org/install/docker#gpu_support). - Also you will need to add `--resource-limit nvidia.com/gpu=1` to the nuclio deployment command. + You will need to install [Nvidia Container Toolkit](https://www.tensorflow.org/install/docker#gpu_support). + You will also need to add `--resource-limit nvidia.com/gpu=1 --triggers '{"myHttpTrigger": {"maxWorkers": 1}}'` to + the nuclio deployment command. You can increase `maxWorkers` if you have enough GPU memory. As an example, below will run on the GPU: ```bash - nuctl deploy tf-faster-rcnn-inception-v2-coco-gpu \ - --project-name cvat --path "serverless/tensorflow/faster_rcnn_inception_v2_coco/nuclio" --platform local \ - --base-image tensorflow/tensorflow:2.1.1-gpu \ - --desc "Faster RCNN from Tensorflow Object Detection GPU API" \ - --image cvat/tf.faster_rcnn_inception_v2_coco_gpu \ + nuctl deploy --project-name cvat \ + --path `pwd`/tensorflow/matterport/mask_rcnn/nuclio \ + --platform local --base-image tensorflow/tensorflow:1.15.5-gpu-py3 \ + --desc "GPU based implementation of Mask RCNN on Python 3, Keras, and TensorFlow." \ + --image cvat/tf.matterport.mask_rcnn_gpu \ + --triggers '{"myHttpTrigger": {"maxWorkers": 1}}' \ --resource-limit nvidia.com/gpu=1 ``` **Note:** - - - Since the model is loaded during deployment, the number of GPU functions you can deploy will be limited to your GPU memory. - + - The number of GPU functions you can deploy is limited by your GPU memory. - See [deploy_gpu.sh](/serverless/deploy_gpu.sh) script for more examples. -####Debugging Nuclio Functions: +**Troubleshooting Nuclio Functions:** - You can open nuclio dashboard at [localhost:8070](http://localhost:8070). Make sure status of your functions are up and running without any error. +- Test your deployed DL model as a serverless function. The command below should work on Linux and macOS. 
+ + ```bash + image=$(curl https://upload.wikimedia.org/wikipedia/en/7/7d/Lenna_%28test_image%29.png --output - | base64 | tr -d '\n') + cat << EOF > /tmp/input.json + {"image": "$image"} + EOF + cat /tmp/input.json | nuctl invoke openvino.omz.public.yolo-v3-tf -c 'application/json' + ``` -- To check for internal server errors, run `docker ps -a` to see the list of containers. Find the container that you are interested, e.g. `nuclio-nuclio-tf-faster-rcnn-inception-v2-coco-gpu`. Then check its logs by +
```bash - docker logs + 20.07.17 12:07:44.519 nuctl.platform.invoker (I) Executing function {"method": "POST", "url": "http://:57308", "headers": {"Content-Type":["application/json"],"X-Nuclio-Log-Level":["info"],"X-Nuclio-Target":["openvino.omz.public.yolo-v3-tf"]}} + 20.07.17 12:07:45.275 nuctl.platform.invoker (I) Got response {"status": "200 OK"} + 20.07.17 12:07:45.275 nuctl (I) >>> Start of function logs + 20.07.17 12:07:45.275 ino.omz.public.yolo-v3-tf (I) Run yolo-v3-tf model {"worker_id": "0", "time": 1594976864570.9353} + 20.07.17 12:07:45.275 nuctl (I) <<< End of function logs + + > Response headers: + Date = Fri, 17 Jul 2020 09:07:45 GMT + Content-Type = application/json + Content-Length = 100 + Server = nuclio + + > Response body: + [ + { + "confidence": "0.9992254", + "label": "person", + "points": [ + 39, + 124, + 408, + 512 + ], + "type": "rectangle" + } + ] ``` +
+- To check for internal server errors, run `docker ps -a` to see the list of containers. + Find the container that you are interested in, e.g., `nuclio-nuclio-tf-faster-rcnn-inception-v2-coco-gpu`. + Then check its logs with `docker logs `, e.g., - ```bash docker logs nuclio-nuclio-tf-faster-rcnn-inception-v2-coco-gpu ``` -- If you would like to debug a code inside a container, you can use vscode to directly attach to a container [instructions](https://code.visualstudio.com/docs/remote/attach-container). To apply your changes, make sure to restart the container. - +- To debug code inside a container, you can use VS Code to attach to the container ([instructions](https://code.visualstudio.com/docs/remote/attach-container)). + To apply your changes, make sure to restart the container. ```bash docker restart ``` - - > **⚠ WARNING:** - > Do not use nuclio dashboard to stop the container because with any modifications, it rebuilds the container and you will lose your changes. diff --git a/serverless/deploy_gpu.sh b/serverless/deploy_gpu.sh index f0b89649..3845a113 100755 --- a/serverless/deploy_gpu.sh +++ b/serverless/deploy_gpu.sh @@ -8,8 +8,18 @@ nuctl create project cvat nuctl deploy --project-name cvat \ --path "$SCRIPT_DIR/tensorflow/faster_rcnn_inception_v2_coco/nuclio" \ --platform local --base-image tensorflow/tensorflow:2.1.1-gpu \ - --desc "Faster RCNN from Tensorflow Object Detection GPU API" \ + --desc "GPU based Faster RCNN from Tensorflow Object Detection API" \ --image cvat/tf.faster_rcnn_inception_v2_coco_gpu \ - --resource-limit nvidia.com/gpu=1 + --triggers '{"myHttpTrigger": {"maxWorkers": 1}}' \ + --resource-limit nvidia.com/gpu=1 --verbose + +nuctl deploy --project-name cvat \ + --path "$SCRIPT_DIR/tensorflow/matterport/mask_rcnn/nuclio" \ + --platform local --base-image tensorflow/tensorflow:1.15.5-gpu-py3 \ + --desc "GPU based implementation of Mask RCNN on Python 3, Keras, and TensorFlow." 
\ + --image cvat/tf.matterport.mask_rcnn_gpu \ + --triggers '{"myHttpTrigger": {"maxWorkers": 1}}' \ + --resource-limit nvidia.com/gpu=1 --verbose + nuctl get function diff --git a/serverless/tensorflow/faster_rcnn_inception_v2_coco/nuclio/model_loader.py b/serverless/tensorflow/faster_rcnn_inception_v2_coco/nuclio/model_loader.py index 74aa85bc..36b5188e 100644 --- a/serverless/tensorflow/faster_rcnn_inception_v2_coco/nuclio/model_loader.py +++ b/serverless/tensorflow/faster_rcnn_inception_v2_coco/nuclio/model_loader.py @@ -15,9 +15,10 @@ class ModelLoader: serialized_graph = fid.read() od_graph_def.ParseFromString(serialized_graph) tf.import_graph_def(od_graph_def, name='') - - config = tf.ConfigProto() - config.gpu_options.allow_growth = True + gpu_fraction = 0.333 + gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_fraction, + allow_growth=True) + config = tf.ConfigProto(gpu_options=gpu_options) self.session = tf.Session(graph=detection_graph, config=config) self.image_tensor = detection_graph.get_tensor_by_name('image_tensor:0') diff --git a/serverless/tensorflow/matterport/mask_rcnn/nuclio/function.yaml b/serverless/tensorflow/matterport/mask_rcnn/nuclio/function.yaml index 1d3ee39b..0de22d1b 100644 --- a/serverless/tensorflow/matterport/mask_rcnn/nuclio/function.yaml +++ b/serverless/tensorflow/matterport/mask_rcnn/nuclio/function.yaml @@ -102,22 +102,19 @@ spec: value: /opt/nuclio/Mask_RCNN build: image: cvat/tf.matterport.mask_rcnn - baseImage: tensorflow/tensorflow:2.1.0-py3 + baseImage: tensorflow/tensorflow:1.13.1-py3 directives: postCopy: - kind: WORKDIR value: /opt/nuclio - kind: RUN - value: apt update && apt install --no-install-recommends -y git curl libsm6 libxext6 libgl1-mesa-glx + value: apt update && apt install --no-install-recommends -y git curl - kind: RUN - value: git clone https://github.com/matterport/Mask_RCNN.git + value: git clone --depth 1 https://github.com/matterport/Mask_RCNN.git - kind: RUN value: curl -L 
https://github.com/matterport/Mask_RCNN/releases/download/v2.0/mask_rcnn_coco.h5 -o Mask_RCNN/mask_rcnn_coco.h5 - kind: RUN - value: pip3 install scipy cython matplotlib scikit-image opencv-python-headless h5py \ - imgaug IPython[all] tensorflow==1.13.1 keras==2.1.0 pillow pyyaml - - kind: RUN - value: pip3 install pycocotools + value: pip3 install numpy cython pyyaml keras==2.1.0 scikit-image Pillow triggers: myHttpTrigger: diff --git a/serverless/tensorflow/matterport/mask_rcnn/nuclio/model_loader.py b/serverless/tensorflow/matterport/mask_rcnn/nuclio/model_loader.py index 1b2a4cd8..66be2a2c 100644 --- a/serverless/tensorflow/matterport/mask_rcnn/nuclio/model_loader.py +++ b/serverless/tensorflow/matterport/mask_rcnn/nuclio/model_loader.py @@ -1,4 +1,4 @@ -# Copyright (C) 2018-2020 Intel Corporation +# Copyright (C) 2020-2021 Intel Corporation # # SPDX-License-Identifier: MIT @@ -6,24 +6,13 @@ import os import numpy as np import sys from skimage.measure import find_contours, approximate_polygon - -# workaround for tf.placeholder() is not compatible with eager execution -# https://github.com/tensorflow/tensorflow/issues/18165 import tensorflow as tf -tf.compat.v1.disable_eager_execution() -#import tensorflow.compat.v1 as tf -# tf.disable_v2_behavior() - -# The directory should contain a clone of -# https://github.com/matterport/Mask_RCNN repository and -# downloaded mask_rcnn_coco.h5 model. 
MASK_RCNN_DIR = os.path.abspath(os.environ.get('MASK_RCNN_DIR')) if MASK_RCNN_DIR: sys.path.append(MASK_RCNN_DIR) # To find local version of the library - sys.path.append(os.path.join(MASK_RCNN_DIR, 'samples/coco')) - from mrcnn import model as modellib -import coco +from mrcnn.config import Config + class ModelLoader: def __init__(self, labels): @@ -31,12 +20,21 @@ class ModelLoader: if COCO_MODEL_PATH is None: raise OSError('Model path env not found in the system.') - class InferenceConfig(coco.CocoConfig): - # Set batch size to 1 since we'll be running inference on - # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU + class InferenceConfig(Config): + NAME = "coco" + NUM_CLASSES = 1 + 80 # COCO has 80 classes GPU_COUNT = 1 IMAGES_PER_GPU = 1 + # Limit GPU memory to about 33% to leave room for other nuclio GPU functions. Increase the fraction if you have memory to spare. + import keras.backend.tensorflow_backend as ktf + def get_session(gpu_fraction=0.333): + gpu_options = tf.GPUOptions( + per_process_gpu_memory_fraction=gpu_fraction, + allow_growth=True) + return tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) + + ktf.set_session(get_session()) # Print config details self.config = InferenceConfig() self.config.display() @@ -54,7 +52,7 @@ class ModelLoader: for i in range(len(output["rois"])): score = output["scores"][i] class_id = output["class_ids"][i] - mask = output["masks"][:,:,i] + mask = output["masks"][:, :, i] if score >= threshold: mask = mask.astype(np.uint8) contours = find_contours(mask, MASK_THRESHOLD) @@ -74,6 +72,4 @@ class ModelLoader: "type": "polygon", }) - return results - - + return results \ No newline at end of file
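
Reviewer note on the `gpu_fraction=0.333` default used in both model loaders: the sketch below is a rough, illustrative estimate of how many worker processes could share one GPU when each is capped at a given memory fraction. The helper name `max_gpu_processes` is hypothetical and not part of this patch, and the estimate assumes the fraction acts as a hard per-process cap while ignoring CUDA context overhead, so treat the result as an upper bound rather than a guarantee.

```python
import math


def max_gpu_processes(gpu_fraction: float) -> int:
    """Rough upper bound on processes that can each reserve `gpu_fraction` of one GPU.

    Hypothetical helper for illustration only; it is not part of CVAT or nuclio.
    """
    if not 0.0 < gpu_fraction <= 1.0:
        raise ValueError("gpu_fraction must be in (0, 1]")
    # The small epsilon guards against floating-point rounding in the division.
    return math.floor(1.0 / gpu_fraction + 1e-9)


if __name__ == "__main__":
    # With the default of 0.333 used in this patch, about three capped processes fit.
    print(max_gpu_processes(0.333))
```

Under these assumptions, raising `maxWorkers` or deploying more GPU functions past this bound is likely to exhaust GPU memory unless the fraction is lowered as well.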