diff --git a/CHANGELOG.md b/CHANGELOG.md
index 0700f075..87dc74e1 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -19,6 +19,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   It supports regular navigation, searching a frame according to annotations
   filters and searching the nearest frame without any annotations ()
 - MacOS users notes in CONTRIBUTING.md
+- Ability to prepare meta information manually ()
+- Ability to upload prepared meta information along with a video when creating a task ()
 - Optional chaining plugin for cvat-canvas and cvat-ui ()

 ### Changed
@@ -46,6 +48,7 @@ filters and searching the nearest frame without any annotations ()
 - Fixed use case when UI throws exception: Cannot read property 'objectType' of undefined #2053 ()
 - Fixed use case when logs could be saved twice or more times #2202 ()
+- Fixed issues from #2112 ()

 ### Security
 -
diff --git a/cvat/apps/documentation/data_on_fly.md b/cvat/apps/documentation/data_on_fly.md
new file mode 100644
index 00000000..87c99264
--- /dev/null
+++ b/cvat/apps/documentation/data_on_fly.md
@@ -0,0 +1,35 @@
+# Data preparation on the fly
+
+## Description
+On-the-fly data processing is a way of working with data whose main idea is as follows:
+only the minimum necessary meta information is collected when a task is created.
+This meta information makes it possible to create the necessary chunks later, when a request is received from a client.
+
+Generated chunks are stored in a cache of limited size with a policy of evicting less popular items.
+
+When a request is received from a client, the required chunk is looked up in the cache.
+If the chunk does not exist yet, it is created using the prepared meta information and then put into the cache.
+
+This method of working with data allows you to:
+- reduce the task creation time;
+- store data in a cache of limited size with a policy of evicting less popular items.
+
+## Prepare meta information
+Different meta information is collected for different types of uploaded data.
+### Video
+For video, this is a valid mapping of key frame numbers to their timestamps. This information is saved to `meta_info.txt`.
+
+Unfortunately, this method will not work for every video, even one with valid meta information.
+If there are not enough key frames in the video for smooth video decoding, the task will be created in the old way.
+
+#### Uploading meta information along with data
+
+When creating a task, you can upload a file with meta information along with the video,
+which further reduces the task creation time.
+You can see how to prepare meta information [here](/utils/prepare_meta_information/README.md).
+
+Note that the generated file also contains the number of frames in the video, appended at the end.
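+
+For illustration (the numbers below are made up and depend on the video), a prepared `meta_info.txt`
+consists of `<key frame number> <timestamp>` pairs, one per line, followed by the total number of frames:
+
+```
+0 0
+250 128000
+500 256000
+750 384000
+900
+```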
+
+### Images
+A mapping of each chunk number to the paths of the images that should go into that chunk
+is saved at task creation time in files named `dummy_{chunk_number}.txt`.
diff --git a/cvat/apps/documentation/static/documentation/images/image128.jpg b/cvat/apps/documentation/static/documentation/images/image128.jpg
deleted file mode 100644
index e1e196d5..00000000
Binary files a/cvat/apps/documentation/static/documentation/images/image128.jpg and /dev/null differ
diff --git a/cvat/apps/documentation/static/documentation/images/image128_use_cache.jpg b/cvat/apps/documentation/static/documentation/images/image128_use_cache.jpg
new file mode 100644
index 00000000..31fae43d
Binary files /dev/null and b/cvat/apps/documentation/static/documentation/images/image128_use_cache.jpg differ
diff --git a/cvat/apps/documentation/user_guide.md b/cvat/apps/documentation/user_guide.md
index e44bbc75..690f14f7 100644
--- a/cvat/apps/documentation/user_guide.md
+++ b/cvat/apps/documentation/user_guide.md
@@ -141,17 +141,24 @@ Go to the [Django administration panel](http://localhost:8080/admin). There you
    **Select files**. Press tab ``My computer`` to choose some files for annotation from your PC.
    If you select tab ``Connected file share`` you can choose files for annotation from your network.
    If you select `` Remote source`` , you'll see a field where you can enter a list of URLs (one URL per line).
+   If you upload video data and select the ``Use cache`` option, you can attach a file with meta information along with the video file.
+   You can find how to prepare it [here](/utils/prepare_meta_information/README.md).

    ![](static/documentation/images/image127.jpg)

    #### Advanced configuration

-   ![](static/documentation/images/image128.jpg)
+   ![](static/documentation/images/image128_use_cache.jpg)

    **Z-Order**. Defines the order on drawn polygons. Check the box for enable layered displaying.

    **Use zip chunks**. Force to use zip chunks as compressed data. Actual for videos only.

+   **Use cache**. Defines how to work with data. Select the checkbox to switch to "on-the-fly data processing",
+   which reduces the task creation time (chunks are prepared when requests are received)
+   and stores data in a cache of limited size with a policy of evicting less popular items.
+   See more [here](/cvat/apps/documentation/data_on_fly.md).
+
    **Image Quality**. Use this option to specify quality of uploaded images.
    The option helps to load high resolution datasets faster. Use the value from ``5`` (almost completely compressed images) to ``100`` (not compressed images).
diff --git a/cvat/apps/engine/prepare.py b/cvat/apps/engine/prepare.py
index 3d4ca7da..9465b680 100644
--- a/cvat/apps/engine/prepare.py
+++ b/cvat/apps/engine/prepare.py
@@ -3,7 +3,9 @@
 # SPDX-License-Identifier: MIT

 import av
+from collections import OrderedDict
 import hashlib
+import os

 class WorkWithVideo:
     def __init__(self, **kwargs):
@@ -72,27 +74,30 @@ class PrepareInfo(WorkWithVideo):
     def get_task_size(self):
         return self.frames

+    @property
+    def frame_sizes(self):
+        frame = next(iter(self.key_frames.values()))
+        return (frame.width, frame.height)
+
+    def check_key_frame(self, container, video_stream, key_frame):
+        # decode the first frame at the current position and drop the key frame
+        # from the index if it does not match the expected one
+        for packet in container.demux(video_stream):
+            for frame in packet.decode():
+                if md5_hash(frame) != md5_hash(key_frame[1]) or frame.pts != key_frame[1].pts:
+                    self.key_frames.pop(key_frame[0])
+                return
+
     def check_seek_key_frames(self):
         container = self._open_video_container(self.source_path, mode='r')
         video_stream = self._get_video_stream(container)

         key_frames_copy = self.key_frames.copy()

-        for index, key_frame in key_frames_copy.items():
-            container.seek(offset=key_frame.pts, stream=video_stream)
-            flag = True
-            for packet in container.demux(video_stream):
-                for frame in packet.decode():
-                    if md5_hash(frame) != md5_hash(key_frame) or frame.pts != key_frame.pts:
-                        self.key_frames.pop(index)
-                        flag = False
-                    break
-                if not flag:
-                    break
+        for key_frame in key_frames_copy.items():
+            container.seek(offset=key_frame[1].pts, stream=video_stream)
+            self.check_key_frame(container, video_stream, key_frame)

-        #TODO: correct ratio of number of frames to keyframes
-        if len(self.key_frames) == 0:
-            raise Exception('Too few keyframes')
+    def check_frames_ratio(self, chunk_size):
+        # smooth decoding is assumed if there is, on average, at least one key frame per two chunks
+        return (len(self.key_frames) and (self.frames // len(self.key_frames)) <= 2 * chunk_size)

     def save_key_frames(self):
         container = self._open_video_container(self.source_path, mode='r')
@@ -152,4 +157,79 @@ class PrepareInfo(WorkWithVideo):
             self._close_video_container(container)
             return

-        self._close_video_container(container)
\ No newline at end of file
+        self._close_video_container(container)
+
+class UploadedMeta(PrepareInfo):
+    def __init__(self, **kwargs):
+        super().__init__(**kwargs)
+
+        with open(self.meta_path, 'r') as meta_file:
+            lines = meta_file.read().strip().split('\n')
+            # the last line of the uploaded file holds the total number of frames
+            self.frames = int(lines.pop())
+
+            key_frames = {int(line.split()[0]): int(line.split()[1]) for line in lines}
+            self.key_frames = OrderedDict(sorted(key_frames.items(), key=lambda x: x[0]))
+
+    @property
+    def frame_sizes(self):
+        container = self._open_video_container(self.source_path, 'r')
+        video_stream = self._get_video_stream(container)
+        container.seek(offset=next(iter(self.key_frames.values())), stream=video_stream)
+        for packet in container.demux(video_stream):
+            for frame in packet.decode():
+                self._close_video_container(container)
+                return (frame.width, frame.height)
+
+    def save_meta_info(self):
+        with open(self.meta_path, 'w') as meta_file:
+            for index, pts in self.key_frames.items():
+                meta_file.write('{} {}\n'.format(index, pts))
+
+    def check_key_frame(self, container, video_stream, key_frame):
+        for packet in container.demux(video_stream):
+            for frame in packet.decode():
+                assert frame.pts == key_frame[1], "Uploaded meta information does not match the video"
+                return
+
+    def check_seek_key_frames(self):
+        container = self._open_video_container(self.source_path, mode='r')
+        video_stream = self._get_video_stream(container)

+        for key_frame in self.key_frames.items():
+            container.seek(offset=key_frame[1], stream=video_stream)
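+            # after seeking to the stored pts, the next decoded frame must be the
+            # corresponding key frame; check_key_frame() asserts that the pts match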
+            self.check_key_frame(container, video_stream, key_frame)
+
+        self._close_video_container(container)
+
+    def check_frames_numbers(self):
+        container = self._open_video_container(self.source_path, mode='r')
+        video_stream = self._get_video_stream(container)
+        # not all videos contain information about the number of frames
+        if video_stream.frames:
+            self._close_video_container(container)
+            assert video_stream.frames == self.frames, "Uploaded meta information does not match the video"
+            return
+        self._close_video_container(container)
+
+def prepare_meta(media_file, upload_dir=None, meta_dir=None, chunk_size=None):
+    paths = {
+        'source_path': os.path.join(upload_dir, media_file) if upload_dir else media_file,
+        'meta_path': os.path.join(meta_dir, 'meta_info.txt') if meta_dir else os.path.join(upload_dir, 'meta_info.txt'),
+    }
+    analyzer = AnalyzeVideo(source_path=paths.get('source_path'))
+    analyzer.check_type_first_frame()
+    analyzer.check_video_timestamps_sequences()
+
+    meta_info = PrepareInfo(source_path=paths.get('source_path'),
+                            meta_path=paths.get('meta_path'))
+    meta_info.save_key_frames()
+    meta_info.check_seek_key_frames()
+    meta_info.save_meta_info()
+    smooth_decoding = meta_info.check_frames_ratio(chunk_size) if chunk_size else None
+    return (meta_info, smooth_decoding)
+
+def prepare_meta_for_upload(func, *args):
+    meta_info, smooth_decoding = func(*args)
+    with open(meta_info.meta_path, 'a') as meta_file:
+        meta_file.write(str(meta_info.get_task_size()))
+    return smooth_decoding
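+
+# Usage sketch (paths and chunk size below are illustrative):
+#   meta_info, smooth_decoding = prepare_meta('video.mp4', upload_dir='/tmp/upload', chunk_size=36)
+#   smooth_decoding = prepare_meta_for_upload(prepare_meta, '/data/video.mp4', None, '/data', 36)
+# prepare_meta() writes '<key frame number> <pts>' lines to meta_info.txt;
+# prepare_meta_for_upload() additionally appends the total number of frames, producing a file
+# that can be uploaded together with the video when creating a task.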
diff --git a/cvat/apps/engine/task.py b/cvat/apps/engine/task.py
index 4d2e6caf..fad3654f 100644
--- a/cvat/apps/engine/task.py
+++ b/cvat/apps/engine/task.py
@@ -6,6 +6,7 @@
 import itertools
 import os
 import sys
+from re import findall
 import rq
 import shutil
 from traceback import print_exception
@@ -16,6 +17,7 @@ from urllib import request as urlrequest
 from cvat.apps.engine.media_extractors import get_mime, MEDIA_TYPES, Mpeg4ChunkWriter, ZipChunkWriter, Mpeg4CompressedChunkWriter, ZipCompressedChunkWriter
 from cvat.apps.engine.models import DataChoice, StorageMethodChoice
 from cvat.apps.engine.utils import av_scan_paths
+from cvat.apps.engine.prepare import prepare_meta
 import django_rq
 from django.conf import settings

@@ -24,7 +26,6 @@ from distutils.dir_util import copy_tree
 from . import models
 from .log import slogger
-from .prepare import PrepareInfo, AnalyzeVideo

 ############################# Low Level server API

@@ -105,7 +106,7 @@ def _save_task_to_db(db_task):
         db_task.data.save()
     db_task.save()

-def _count_files(data):
+def _count_files(data, meta_info_file=None):
     share_root = settings.SHARE_ROOT
     server_files = []

@@ -132,11 +133,12 @@ def _count_files(data):
             mime = get_mime(full_path)
             if mime in counter:
                 counter[mime].append(rel_path)
+            elif findall('meta_info.txt$', rel_path):
+                meta_info_file.append(rel_path)
             else:
                 slogger.glob.warn("Skip '{}' file (its mime type doesn't "
                     "correspond to a video or an image file)".format(full_path))

-
     counter = { media_type: [] for media_type in MEDIA_TYPES.keys() }

     count_files(
@@ -151,7 +153,7 @@ def _count_files(data):

     return counter

-def _validate_data(counter):
+def _validate_data(counter, meta_info_file=None):
     unique_entries = 0
     multiple_entries = 0
     for media_type, media_config in MEDIA_TYPES.items():
@@ -161,6 +163,9 @@ def _validate_data(counter):
         else:
             multiple_entries += len(counter[media_type])

+            if meta_info_file and media_type != 'video':
+                raise Exception('A file with meta information can only be uploaded together with a video file')
+
     if unique_entries == 1 and multiple_entries > 0 or unique_entries > 1:
         unique_types = ', '.join([k for k, v in MEDIA_TYPES.items() if v['unique']])
         multiply_types = ', '.join([k for k, v in MEDIA_TYPES.items() if not v['unique']])
@@ -219,8 +224,12 @@ def _create_thread(tid, data):
     if data['remote_files']:
         data['remote_files'] = _download_data(data['remote_files'], upload_dir)

-    media = _count_files(data)
-    media, task_mode = _validate_data(media)
+    meta_info_file = []
+    media = _count_files(data, meta_info_file)
+    media, task_mode = _validate_data(media, meta_info_file)
+    if meta_info_file:
+        assert settings.USE_CACHE and db_data.storage_method == StorageMethodChoice.CACHE, \
+            "A file with meta information can only be uploaded if the 'Use cache' option is also selected"

     if data['server_files']:
         _copy_data_from_share(data['server_files'], upload_dir)
@@ -290,25 +299,51 @@ def _create_thread(tid, data):

     if task_mode == MEDIA_TYPES['video']['mode']:
         try:
-            analyzer = AnalyzeVideo(source_path=os.path.join(upload_dir, media_files[0]))
-            analyzer.check_type_first_frame()
-            analyzer.check_video_timestamps_sequences()
-
-            meta_info = PrepareInfo(source_path=os.path.join(upload_dir, media_files[0]),
-                meta_path=os.path.join(upload_dir, 'meta_info.txt'))
-            meta_info.save_key_frames()
-            meta_info.check_seek_key_frames()
-            meta_info.save_meta_info()
+            if meta_info_file:
+                try:
+                    from cvat.apps.engine.prepare import UploadedMeta
+                    if os.path.split(meta_info_file[0])[0]:
+                        os.replace(
+                            os.path.join(upload_dir, meta_info_file[0]),
+                            db_data.get_meta_path()
+                        )
+                    meta_info = UploadedMeta(source_path=os.path.join(upload_dir, media_files[0]),
+                        meta_path=db_data.get_meta_path())
+                    meta_info.check_seek_key_frames()
+                    meta_info.check_frames_numbers()
+                    meta_info.save_meta_info()
+                    assert len(meta_info.key_frames) > 0, 'No key frames.'
+                except Exception as ex:
+                    base_msg = str(ex) if isinstance(ex, AssertionError) else \
+                        'Invalid meta information was uploaded.'
+                    job.meta['status'] = '{} Preparing valid meta information.'.format(base_msg)
+                    job.save_meta()
+                    meta_info, smooth_decoding = prepare_meta(
+                        media_file=media_files[0],
+                        upload_dir=upload_dir,
+                        chunk_size=db_data.chunk_size
+                    )
+                    assert smooth_decoding == True, 'Too few keyframes for smooth video decoding.'
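+            # no meta information file was uploaded together with the video,
+            # so collect it from the video itself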
+            else:
+                meta_info, smooth_decoding = prepare_meta(
+                    media_file=media_files[0],
+                    upload_dir=upload_dir,
+                    chunk_size=db_data.chunk_size
+                )
+                assert smooth_decoding == True, 'Too few keyframes for smooth video decoding.'

             all_frames = meta_info.get_task_size()
+            video_size = meta_info.frame_sizes
+
             db_data.size = len(range(db_data.start_frame, min(data['stop_frame'] + 1 if data['stop_frame'] else all_frames, all_frames), db_data.get_frame_step()))
             video_path = os.path.join(upload_dir, media_files[0])
-            frame = meta_info.key_frames.get(next(iter(meta_info.key_frames)))
-            video_size = (frame.width, frame.height)
-
-        except Exception:
+        except Exception as ex:
             db_data.storage_method = StorageMethodChoice.FILE_SYSTEM
-
+            if os.path.exists(db_data.get_meta_path()):
+                os.remove(db_data.get_meta_path())
+            base_msg = str(ex) if isinstance(ex, AssertionError) else "Uploaded video does not support a quick way of task creation."
+            job.meta['status'] = "{} The task will be created using the old method.".format(base_msg)
+            job.save_meta()
         else:#images,archive
             db_data.size = len(extractor)
diff --git a/cvat/apps/engine/tests/_test_rest_api.py b/cvat/apps/engine/tests/_test_rest_api.py
index 1cfdd42c..8518c974 100644
--- a/cvat/apps/engine/tests/_test_rest_api.py
+++ b/cvat/apps/engine/tests/_test_rest_api.py
@@ -81,6 +81,7 @@ from rest_framework.test import APIClient, APITestCase

 from cvat.apps.engine.models import (AttributeType, Data, Job, Project,
     Segment, StatusChoice, Task, StorageMethodChoice)
+from cvat.apps.engine.prepare import prepare_meta, prepare_meta_for_upload

 _setUpModule()

@@ -1644,6 +1645,8 @@ class TaskDataAPITestCase(APITestCase):

         path = os.path.join(settings.SHARE_ROOT, "videos", "test_video_1.mp4")
         os.remove(path)
+        path = os.path.join(settings.SHARE_ROOT, "videos", "meta_info.txt")
+        os.remove(path)

     def _run_api_v1_tasks_id_data_post(self, tid, user, data):
         with ForceLogin(user, self.client):
@@ -2057,6 +2060,31 @@ class TaskDataAPITestCase(APITestCase):
         self._test_api_v1_tasks_id_data_spec(user, task_spec, task_data, self.ChunkType.IMAGESET,
             self.ChunkType.IMAGESET, image_sizes)

+        prepare_meta_for_upload(
+            prepare_meta,
+            os.path.join(settings.SHARE_ROOT, "videos", "test_video_1.mp4"),
+            os.path.join(settings.SHARE_ROOT, "videos")
+        )
+        task_spec = {
+            "name": "my video with meta info task #11",
+            "overlap": 0,
+            "segment_size": 0,
+            "labels": [
+                {"name": "car"},
+                {"name": "person"},
+            ]
+        }
+        task_data = {
+            "server_files[0]": os.path.join("videos", "test_video_1.mp4"),
+            "server_files[1]": os.path.join("videos", "meta_info.txt"),
+            "image_quality": 70,
+            "use_cache": True
+        }
+        image_sizes = self._image_sizes[task_data['server_files[0]']]
+
+        self._test_api_v1_tasks_id_data_spec(user, task_spec, task_data, self.ChunkType.VIDEO,
+            self.ChunkType.VIDEO, image_sizes, StorageMethodChoice.CACHE)
+
     def test_api_v1_tasks_id_data_admin(self):
         self._test_api_v1_tasks_id_data(self.admin)
diff --git a/cvat/settings/base.py b/cvat/settings/base.py
index 1e3fe437..7df15709 100644
--- a/cvat/settings/base.py
+++ b/cvat/settings/base.py
@@ -427,14 +427,14 @@ RESTRICTIONS = {
     ),
 }

+# http://www.grantjenks.com/docs/diskcache/tutorial.html#djangocache
 CACHES = {
     'default' : {
         'BACKEND' : 'diskcache.DjangoCache',
         'LOCATION' : CACHE_ROOT,
         'TIMEOUT' : None,
         'OPTIONS' : {
-            #'statistics' :True,
-            'size_limit' : 2 ** 40, # 1 тб
+            'size_limit' : 2 ** 40, # 1 TB
         }
     }
 }
diff --git a/utils/prepare_meta_information/README.md b/utils/prepare_meta_information/README.md
new file mode 100644
index 00000000..92d58774
--- /dev/null
+++ b/utils/prepare_meta_information/README.md
@@ -0,0 +1,29 @@
+# Simple command line tool for preparing meta information for video data
+
+**Usage**
+```bash
+usage: prepare.py [-h] [-chunk_size CHUNK_SIZE] video_file meta_directory
+
+positional arguments:
+  video_file            Path to video file
+  meta_directory        Directory where the file with meta information will be saved
+
+optional arguments:
+  -h, --help            show this help message and exit
+  -chunk_size CHUNK_SIZE
+                        Chunk size that will be specified when creating the task with the specified video and generated meta information
+```
+
+**NOTE**: For smooth video decoding, the `chunk size` must be greater than or equal to the ratio of the number of frames
+to the number of key frames.
+You can estimate an appropriate `chunk size` by preparing the file with meta information and inspecting it.
+
+**NOTE**: If the video has relatively few key frames (the ratio of the number of frames to the number of key frames
+approaches or exceeds the `chunk size`), then when creating a task with the prepared meta information you should expect
+the waiting time for some chunks to be longer than for others (on the first request, while the chunk is not yet in the cache).
+
+**Examples**
+
+```bash
+python prepare.py ~/Documents/some_video.mp4 ~/Documents
+```
diff --git a/utils/prepare_meta_information/prepare.py b/utils/prepare_meta_information/prepare.py
new file mode 100644
index 00000000..0cd200a0
--- /dev/null
+++ b/utils/prepare_meta_information/prepare.py
@@ -0,0 +1,37 @@
+# Copyright (C) 2020 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
+import argparse
+import sys
+import os
+
+def get_args():
+    parser = argparse.ArgumentParser()
+    parser.add_argument('video_file',
+                        type=str,
+                        help='Path to video file')
+    parser.add_argument('meta_directory',
+                        type=str,
+                        help='Directory where the file with meta information will be saved')
+    parser.add_argument('-chunk_size',
+                        type=int,
+                        help='Chunk size that will be specified when creating the task with the specified video and generated meta information')
+
+    return parser.parse_args()
+
+def main():
+    args = get_args()
+    try:
+        smooth_decoding = prepare_meta_for_upload(prepare_meta, args.video_file, None, args.meta_directory, args.chunk_size)
+        print('Meta information for video has been prepared')
+
+        if smooth_decoding is not None and not smooth_decoding:
+            print('NOTE: prepared meta information contains too few key frames for smooth decoding.')
+    except Exception:
+        print('Could not prepare meta information')
+
+if __name__ == "__main__":
+    base_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+    sys.path.append(base_dir)
+    from cvat.apps.engine.prepare import prepare_meta, prepare_meta_for_upload
+    main()
\ No newline at end of file