Changed "prepare data on the fly" functionality (#2217)

* Added ability to upload meta information with video & some fixes

* Added documentation for data on the fly preparation

* Added ability to prepare meta information for video manually

* fix

* style: fix codacy issues

* Refactoring

* docs: add optional parameter

* Add test

* Add license header

* Update CHANGELOG

Co-authored-by: Boris Sekachev <boris.sekachev@yandex.ru>
Maria Khrustaleva 5 years ago committed by GitHub
parent e9552f84f3
commit 072482ffe8

@ -19,6 +19,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
It supports regular navigation, searching a frame according to annotations
filters and searching the nearest frame without any annotations (<https://github.com/openvinotoolkit/cvat/pull/2221>)
- MacOS users notes in CONTRIBUTING.md
- Ability to prepare meta information manually (<https://github.com/openvinotoolkit/cvat/pull/2217>)
- Ability to upload prepared meta information along with a video when creating a task (<https://github.com/openvinotoolkit/cvat/pull/2217>)
- Optional chaining plugin for cvat-canvas and cvat-ui (<https://github.com/openvinotoolkit/cvat/pull/2249>)
### Changed
@ -46,6 +48,7 @@ filters and searching the nearest frame without any annotations (<https://github
- Fixed case when a task with 0 jobs is shown as "Completed" in UI (<https://github.com/openvinotoolkit/cvat/pull/2200>)
- Fixed use case when UI throws exception: Cannot read property 'objectType' of undefined #2053 (<https://github.com/openvinotoolkit/cvat/pull/2203>)
- Fixed use case when logs could be saved twice or more times #2202 (<https://github.com/openvinotoolkit/cvat/pull/2203>)
- Fixed issues from #2112 (<https://github.com/openvinotoolkit/cvat/pull/2217>)
### Security
-

@ -0,0 +1,35 @@
# Data preparation on the fly
## Description
On-the-fly data processing is a way of working with data whose main idea is as follows:
the minimum necessary meta information is collected when the task is created.
This meta information later makes it possible to create the necessary chunks when a request is received from a client.
Generated chunks are stored in a cache of limited size with a policy of evicting less popular items.
When a request is received from a client, the required chunk is looked up in the cache.
If the chunk does not exist yet, it is created using the prepared meta information and then put into the cache.
This method of working with data allows:
- reducing the task creation time.
- storing data in a cache of limited size with a policy of evicting less popular items.
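The request flow can be pictured with a minimal sketch. The dict-backed LRU cache and the `build_chunk_from_meta` helper below are illustrative stand-ins only, not the actual CVAT cache or API:
```python
# Minimal sketch of the chunk request flow described above.
# The OrderedDict-based LRU cache and build_chunk_from_meta are placeholders.
from collections import OrderedDict

CACHE_LIMIT = 16                      # illustrative size limit
cache = OrderedDict()                 # chunk_number -> chunk data, in access order

def build_chunk_from_meta(meta_info, chunk_number):
    # Placeholder: in CVAT the chunk is assembled from the media using the
    # prepared meta information; here we just return a marker value.
    return 'chunk #{} built from {}'.format(chunk_number, meta_info)

def get_chunk(chunk_number, meta_info):
    if chunk_number in cache:                                 # cache hit
        cache.move_to_end(chunk_number)
        return cache[chunk_number]
    chunk = build_chunk_from_meta(meta_info, chunk_number)    # cache miss: build it
    cache[chunk_number] = chunk
    if len(cache) > CACHE_LIMIT:                              # evict the least popular item
        cache.popitem(last=False)
    return chunk
```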
## Prepare meta information
Different meta information is collected for different types of uploaded data.
### Video
For video, this is a valid mapping of key frame numbers to their timestamps. This information is saved to `meta_info.txt`.
Unfortunately, this method will not work for every video, even one with valid meta information.
If there are not enough key frames in the video for smooth video decoding, the task will be created in the old way.
#### Uploading meta information along with data
When creating a task, you can upload a file with meta information along with the video,
which will further reduce the task creation time.
You can see how to prepare meta information [here](/utils/prepare_meta_information/README.md).
Note that the generated file also contains the total number of frames in the video on its last line.
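For illustration, such a file can be read back with a few lines of Python that mirror the layout described above (one `key frame number pts` pair per line, with the total frame count appended as the last line); the function name below is only for the example:
```python
# Illustrative parser for the meta_info.txt layout: each line is
# "<key frame number> <pts>", and the last line is the total number of frames.
from collections import OrderedDict

def read_meta_info(path):
    with open(path) as meta_file:
        lines = meta_file.read().strip().split('\n')
    frames = int(lines.pop())                      # trailing total frame count
    key_frames = OrderedDict(
        (int(line.split()[0]), int(line.split()[1])) for line in lines
    )
    return frames, key_frames
```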
### Images
A mapping of each chunk number to the paths of the images that should go into that chunk
is saved at task creation time in files named `dummy_{chunk_number}.txt`.
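As a rough sketch of the idea only (the helper below and its layout are assumptions for illustration, not the exact CVAT implementation):
```python
# Illustrative only: write the image paths belonging to each chunk into its
# own dummy_{chunk_number}.txt file, mirroring the mapping described above.
import os

def save_chunk_mapping(image_paths, chunk_size, target_dir):
    for chunk_number, start in enumerate(range(0, len(image_paths), chunk_size)):
        chunk_paths = image_paths[start:start + chunk_size]
        with open(os.path.join(target_dir, 'dummy_{}.txt'.format(chunk_number)), 'w') as dummy_file:
            dummy_file.write('\n'.join(chunk_paths))
```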

(Binary image files changed, not shown; before: 42 KiB, after: 38 KiB.)

@ -141,17 +141,24 @@ Go to the [Django administration panel](http://localhost:8080/admin). There you
**Select files**. Press the ``My computer`` tab to choose files for annotation from your PC.
If you select the ``Connected file share`` tab, you can choose files for annotation from your network.
If you select ``Remote source``, you'll see a field where you can enter a list of URLs (one URL per line).
If you upload video data and select the ``Use cache`` option, you can attach a file with meta information along with the video file.
You can find how to prepare it [here](/utils/prepare_meta_information/README.md).
![](static/documentation/images/image127.jpg)
#### Advanced configuration
![](static/documentation/images/image128.jpg)
![](static/documentation/images/image128_use_cache.jpg)
**Z-Order**. Defines the order of drawn polygons. Check the box to enable layered displaying.
**Use zip chunks**. Forces the use of zip chunks as compressed data. Relevant for videos only.
**Use cache**. Defines how to work with data. Select the checkbox to switch to the "on-the-fly data processing",
which will reduce the task creation time (by preparing chunks when requests are received)
and store data in a cache of limited size with a policy of evicting less popular items.
See more [here](/cvat/apps/documentation/data_on_fly.md).
**Image Quality**. Use this option to specify the quality of uploaded images.
The option helps to load high resolution datasets faster.
Use the value from ``5`` (almost completely compressed images) to ``100`` (not compressed images).

@ -3,7 +3,9 @@
# SPDX-License-Identifier: MIT
import av
from collections import OrderedDict
import hashlib
import os
class WorkWithVideo:
def __init__(self, **kwargs):
@ -72,27 +74,30 @@ class PrepareInfo(WorkWithVideo):
def get_task_size(self):
return self.frames
@property
def frame_sizes(self):
frame = next(iter(self.key_frames.values()))
return (frame.width, frame.height)
def check_key_frame(self, container, video_stream, key_frame):
for packet in container.demux(video_stream):
for frame in packet.decode():
if md5_hash(frame) != md5_hash(key_frame[1]) or frame.pts != key_frame[1].pts:
self.key_frames.pop(key_frame[0])
return
def check_seek_key_frames(self):
container = self._open_video_container(self.source_path, mode='r')
video_stream = self._get_video_stream(container)
key_frames_copy = self.key_frames.copy()
for index, key_frame in key_frames_copy.items():
container.seek(offset=key_frame.pts, stream=video_stream)
flag = True
for packet in container.demux(video_stream):
for frame in packet.decode():
if md5_hash(frame) != md5_hash(key_frame) or frame.pts != key_frame.pts:
self.key_frames.pop(index)
flag = False
break
if not flag:
break
for key_frame in key_frames_copy.items():
container.seek(offset=key_frame[1].pts, stream=video_stream)
self.check_key_frame(container, video_stream, key_frame)
#TODO: correct ratio of number of frames to keyframes
if len(self.key_frames) == 0:
raise Exception('Too few keyframes')
def check_frames_ratio(self, chunk_size):
return (len(self.key_frames) and (self.frames // len(self.key_frames)) <= 2 * chunk_size)
def save_key_frames(self):
container = self._open_video_container(self.source_path, mode='r')
@ -152,4 +157,79 @@ class PrepareInfo(WorkWithVideo):
self._close_video_container(container)
return
self._close_video_container(container)
self._close_video_container(container)
class UploadedMeta(PrepareInfo):
def __init__(self, **kwargs):
super().__init__(**kwargs)
with open(self.meta_path, 'r') as meta_file:
lines = meta_file.read().strip().split('\n')
self.frames = int(lines.pop())
key_frames = {int(line.split()[0]): int(line.split()[1]) for line in lines}
self.key_frames = OrderedDict(sorted(key_frames.items(), key=lambda x: x[0]))
@property
def frame_sizes(self):
container = self._open_video_container(self.source_path, 'r')
video_stream = self._get_video_stream(container)
container.seek(offset=next(iter(self.key_frames.values())), stream=video_stream)
for packet in container.demux(video_stream):
for frame in packet.decode():
self._close_video_container(container)
return (frame.width, frame.height)
def save_meta_info(self):
with open(self.meta_path, 'w') as meta_file:
for index, pts in self.key_frames.items():
meta_file.write('{} {}\n'.format(index, pts))
def check_key_frame(self, container, video_stream, key_frame):
for packet in container.demux(video_stream):
for frame in packet.decode():
assert frame.pts == key_frame[1], "Uploaded meta information does not match the video"
return
def check_seek_key_frames(self):
container = self._open_video_container(self.source_path, mode='r')
video_stream = self._get_video_stream(container)
for key_frame in self.key_frames.items():
container.seek(offset=key_frame[1], stream=video_stream)
self.check_key_frame(container, video_stream, key_frame)
self._close_video_container(container)
def check_frames_numbers(self):
container = self._open_video_container(self.source_path, mode='r')
video_stream = self._get_video_stream(container)
# not all videos contain information about the number of frames
if video_stream.frames:
self._close_video_container(container)
assert video_stream.frames == self.frames, "Uploaded meta information does not match the video"
return
self._close_video_container(container)
def prepare_meta(media_file, upload_dir=None, meta_dir=None, chunk_size=None):
paths = {
'source_path': os.path.join(upload_dir, media_file) if upload_dir else media_file,
'meta_path': os.path.join(meta_dir, 'meta_info.txt') if meta_dir else os.path.join(upload_dir, 'meta_info.txt'),
}
analyzer = AnalyzeVideo(source_path=paths.get('source_path'))
analyzer.check_type_first_frame()
analyzer.check_video_timestamps_sequences()
meta_info = PrepareInfo(source_path=paths.get('source_path'),
meta_path=paths.get('meta_path'))
meta_info.save_key_frames()
meta_info.check_seek_key_frames()
meta_info.save_meta_info()
smooth_decoding = meta_info.check_frames_ratio(chunk_size) if chunk_size else None
return (meta_info, smooth_decoding)
def prepare_meta_for_upload(func, *args):
meta_info, smooth_decoding = func(*args)
with open(meta_info.meta_path, 'a') as meta_file:
meta_file.write(str(meta_info.get_task_size()))
return smooth_decoding

@ -6,6 +6,7 @@
import itertools
import os
import sys
from re import findall
import rq
import shutil
from traceback import print_exception
@ -16,6 +17,7 @@ from urllib import request as urlrequest
from cvat.apps.engine.media_extractors import get_mime, MEDIA_TYPES, Mpeg4ChunkWriter, ZipChunkWriter, Mpeg4CompressedChunkWriter, ZipCompressedChunkWriter
from cvat.apps.engine.models import DataChoice, StorageMethodChoice
from cvat.apps.engine.utils import av_scan_paths
from cvat.apps.engine.prepare import prepare_meta
import django_rq
from django.conf import settings
@ -24,7 +26,6 @@ from distutils.dir_util import copy_tree
from . import models
from .log import slogger
from .prepare import PrepareInfo, AnalyzeVideo
############################# Low Level server API
@ -105,7 +106,7 @@ def _save_task_to_db(db_task):
db_task.data.save()
db_task.save()
def _count_files(data):
def _count_files(data, meta_info_file=None):
share_root = settings.SHARE_ROOT
server_files = []
@ -132,11 +133,12 @@ def _count_files(data):
mime = get_mime(full_path)
if mime in counter:
counter[mime].append(rel_path)
elif findall('meta_info.txt$', rel_path):
meta_info_file.append(rel_path)
else:
slogger.glob.warn("Skip '{}' file (its mime type doesn't "
"correspond to a video or an image file)".format(full_path))
counter = { media_type: [] for media_type in MEDIA_TYPES.keys() }
count_files(
@ -151,7 +153,7 @@ def _count_files(data):
return counter
def _validate_data(counter):
def _validate_data(counter, meta_info_file=None):
unique_entries = 0
multiple_entries = 0
for media_type, media_config in MEDIA_TYPES.items():
@ -161,6 +163,9 @@ def _validate_data(counter):
else:
multiple_entries += len(counter[media_type])
if meta_info_file and media_type != 'video':
raise Exception('File with meta information can only be uploaded with video file')
if unique_entries == 1 and multiple_entries > 0 or unique_entries > 1:
unique_types = ', '.join([k for k, v in MEDIA_TYPES.items() if v['unique']])
multiply_types = ', '.join([k for k, v in MEDIA_TYPES.items() if not v['unique']])
@ -219,8 +224,12 @@ def _create_thread(tid, data):
if data['remote_files']:
data['remote_files'] = _download_data(data['remote_files'], upload_dir)
media = _count_files(data)
media, task_mode = _validate_data(media)
meta_info_file = []
media = _count_files(data, meta_info_file)
media, task_mode = _validate_data(media, meta_info_file)
if meta_info_file:
assert settings.USE_CACHE and db_data.storage_method == StorageMethodChoice.CACHE, \
"File with meta information can be uploaded if 'Use cache' option is also selected"
if data['server_files']:
_copy_data_from_share(data['server_files'], upload_dir)
@ -290,25 +299,51 @@ def _create_thread(tid, data):
if task_mode == MEDIA_TYPES['video']['mode']:
try:
analyzer = AnalyzeVideo(source_path=os.path.join(upload_dir, media_files[0]))
analyzer.check_type_first_frame()
analyzer.check_video_timestamps_sequences()
meta_info = PrepareInfo(source_path=os.path.join(upload_dir, media_files[0]),
meta_path=os.path.join(upload_dir, 'meta_info.txt'))
meta_info.save_key_frames()
meta_info.check_seek_key_frames()
meta_info.save_meta_info()
if meta_info_file:
try:
from cvat.apps.engine.prepare import UploadedMeta
if os.path.split(meta_info_file[0])[0]:
os.replace(
os.path.join(upload_dir, meta_info_file[0]),
db_data.get_meta_path()
)
meta_info = UploadedMeta(source_path=os.path.join(upload_dir, media_files[0]),
meta_path=db_data.get_meta_path())
meta_info.check_seek_key_frames()
meta_info.check_frames_numbers()
meta_info.save_meta_info()
assert len(meta_info.key_frames) > 0, 'No key frames.'
except Exception as ex:
base_msg = str(ex) if isinstance(ex, AssertionError) else \
'Invalid meta information was uploaded.'
job.meta['status'] = '{} Starting to prepare valid meta information.'.format(base_msg)
job.save_meta()
meta_info, smooth_decoding = prepare_meta(
media_file=media_files[0],
upload_dir=upload_dir,
chunk_size=db_data.chunk_size
)
assert smooth_decoding == True, 'Too few keyframes for smooth video decoding.'
else:
meta_info, smooth_decoding = prepare_meta(
media_file=media_files[0],
upload_dir=upload_dir,
chunk_size=db_data.chunk_size
)
assert smooth_decoding == True, 'Too few keyframes for smooth video decoding.'
all_frames = meta_info.get_task_size()
video_size = meta_info.frame_sizes
db_data.size = len(range(db_data.start_frame, min(data['stop_frame'] + 1 if data['stop_frame'] else all_frames, all_frames), db_data.get_frame_step()))
video_path = os.path.join(upload_dir, media_files[0])
frame = meta_info.key_frames.get(next(iter(meta_info.key_frames)))
video_size = (frame.width, frame.height)
except Exception:
except Exception as ex:
db_data.storage_method = StorageMethodChoice.FILE_SYSTEM
if os.path.exists(db_data.get_meta_path()):
os.remove(db_data.get_meta_path())
base_msg = str(ex) if isinstance(ex, AssertionError) else "Uploaded video does not support a quick way of task creation."
job.meta['status'] = "{} The task will be created using the old method.".format(base_msg)
job.save_meta()
else:#images,archive
db_data.size = len(extractor)

@ -81,6 +81,7 @@ from rest_framework.test import APIClient, APITestCase
from cvat.apps.engine.models import (AttributeType, Data, Job, Project,
Segment, StatusChoice, Task, StorageMethodChoice)
from cvat.apps.engine.prepare import prepare_meta, prepare_meta_for_upload
_setUpModule()
@ -1644,6 +1645,8 @@ class TaskDataAPITestCase(APITestCase):
path = os.path.join(settings.SHARE_ROOT, "videos", "test_video_1.mp4")
os.remove(path)
path = os.path.join(settings.SHARE_ROOT, "videos", "meta_info.txt")
os.remove(path)
def _run_api_v1_tasks_id_data_post(self, tid, user, data):
with ForceLogin(user, self.client):
@ -2057,6 +2060,31 @@ class TaskDataAPITestCase(APITestCase):
self._test_api_v1_tasks_id_data_spec(user, task_spec, task_data,
self.ChunkType.IMAGESET, self.ChunkType.IMAGESET, image_sizes)
prepare_meta_for_upload(
prepare_meta,
os.path.join(settings.SHARE_ROOT, "videos", "test_video_1.mp4"),
os.path.join(settings.SHARE_ROOT, "videos")
)
task_spec = {
"name": "my video with meta info task #11",
"overlap": 0,
"segment_size": 0,
"labels": [
{"name": "car"},
{"name": "person"},
]
}
task_data = {
"server_files[0]": os.path.join("videos", "test_video_1.mp4"),
"server_files[1]": os.path.join("videos", "meta_info.txt"),
"image_quality": 70,
"use_cache": True
}
image_sizes = self._image_sizes[task_data['server_files[0]']]
self._test_api_v1_tasks_id_data_spec(user, task_spec, task_data, self.ChunkType.VIDEO,
self.ChunkType.VIDEO, image_sizes, StorageMethodChoice.CACHE)
def test_api_v1_tasks_id_data_admin(self):
self._test_api_v1_tasks_id_data(self.admin)

@ -427,14 +427,14 @@ RESTRICTIONS = {
),
}
# http://www.grantjenks.com/docs/diskcache/tutorial.html#djangocache
CACHES = {
'default' : {
'BACKEND' : 'diskcache.DjangoCache',
'LOCATION' : CACHE_ROOT,
'TIMEOUT' : None,
'OPTIONS' : {
#'statistics' :True,
'size_limit' : 2 ** 40, # 1 тб
'size_limit' : 2 ** 40, # 1 Tb
}
}
}

@ -0,0 +1,29 @@
# Simple command line tool to prepare meta information for video data
**Usage**
```bash
usage: prepare.py [-h] [-chunk_size CHUNK_SIZE] video_file meta_directory
positional arguments:
video_file Path to video file
meta_directory Directory where the file with meta information will be saved
optional arguments:
-h, --help show this help message and exit
-chunk_size CHUNK_SIZE
Chunk size that will be specified when creating the task with specified video and generated meta information
```
**NOTE**: For smooth video decoding, the `chunk size` must be greater than or equal to the ratio of the number of frames
to the number of key frames.
You can estimate an appropriate `chunk size` by preparing the file with meta information and inspecting it.
**NOTE**: If the ratio of the number of frames to the number of key frames is small compared to the `chunk size`,
then, when creating a task with the prepared meta information, you should expect the waiting time for some chunks
to be longer than for other chunks (on the first iteration, when the chunk is not yet in the cache).
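As a minimal illustration of the first note above, and assuming the `meta_info.txt` layout produced by this tool (one `key frame number pts` pair per line, with the total frame count appended as the last line), the ratio can be estimated roughly like this:
```python
# Rough sketch: estimate the frames-to-key-frames ratio from a prepared
# meta_info.txt (layout assumed: "<frame> <pts>" pairs, total frame count last).
def estimate_chunk_size(meta_path):
    with open(meta_path) as meta_file:
        lines = meta_file.read().strip().split('\n')
    frames = int(lines.pop())          # the last line holds the total frame count
    key_frames = max(len(lines), 1)    # one key frame per remaining line
    return frames // key_frames        # chunk size should be at least this value

print(estimate_chunk_size('meta_info.txt'))
```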
**Examples**
```bash
python prepare.py ~/Documents/some_video.mp4 ~/Documents
```

@ -0,0 +1,37 @@
# Copyright (C) 2020 Intel Corporation
#
# SPDX-License-Identifier: MIT
import argparse
import sys
import os
def get_args():
parser = argparse.ArgumentParser()
parser.add_argument('video_file',
type=str,
help='Path to video file')
parser.add_argument('meta_directory',
type=str,
help='Directory where the file with meta information will be saved')
parser.add_argument('-chunk_size',
type=int,
help='Chunk size that will be specified when creating the task with specified video and generated meta information')
return parser.parse_args()
def main():
args = get_args()
try:
smooth_decoding = prepare_meta_for_upload(prepare_meta, args.video_file, None, args.meta_directory, args.chunk_size)
print('Meta information for video has been prepared')
if smooth_decoding is not None and not smooth_decoding:
print('NOTE: prepared meta information contains too few key frames for smooth decoding.')
except Exception:
print('Impossible to prepare meta information')
if __name__ == "__main__":
base_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
sys.path.append(base_dir)
from cvat.apps.engine.prepare import prepare_meta, prepare_meta_for_upload
main()