Annotation formats documentation (#719)

* added handling of truncated and difficult attributes for pascal voc
loader/dumper
added descriptions of supported annotation formats
* added YOLO example
* made match_frame an Annotations method
changed the 'image/source_id' TF feature field from int64 to string
(according to TF OD API dataset utils)
* updated README
improved match_frame function
* added unit tests for dump/load
Andrey Zhavoronkov 6 years ago committed by Nikita Manovich
parent a0f083d274
commit 05c52302f6

@@ -131,7 +131,7 @@ It allows to download and upload annotations in different formats and easily add
annotations.add_shape(shape)
```
Full examples can be found in [builtin](builtin) folder.
Full examples can be found in corresponding *.py files (cvat.py, coco.py, yolo.py, etc.).
1. Add the path to the new Python script to the annotation app settings:
```python
@@ -150,3 +150,353 @@ It allows to download and upload annotations in different formats and easily add
Possible solutions: install additional modules via a pip call into a separate directory for each Annotation Format
to reduce version conflicts, etc. Thus, custom code can run in an extended environment without affecting core CVAT
modules. This functionality can also be useful for the Auto Annotation module.
## Format specifications
### CVAT
This is the native CVAT annotation format.
[Detailed format description](cvat/apps/documentation/xml_format.md)
#### CVAT XML for images dumper
- downloaded file: a single unpacked XML file
- supported shapes: Rectangles, Polygons, Polylines, Points
#### CVAT XML for videos dumper
- downloaded file: a single unpacked XML file
- supported shapes: Rectangles, Polygons, Polylines, Points
#### CVAT XML Loader
- uploaded file: a single unpacked XML file
- supported shapes: Rectangles, Polygons, Polylines, Points
### [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/)
#### Pascal dumper description
- downloaded file: a zip archive with the following structure:
```bash
taskname.zip
├── frame_000001.xml
├── frame_000002.xml
├── frame_000003.xml
└── ...
```
Each annotation `*.xml` file has a name that corresponds to the name of the image file
(e.g. `frame_000001.xml` is the annotation for the `frame_000001.jpg` image).
Detailed structure specification of the `*.xml` file can be found
[here](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/devkit_doc.pdf).
- supported shapes: Rectangles
- additional comments: if you plan to use the 'truncated' and 'difficult' attributes, please add the corresponding
items to the CVAT label attributes:
`~checkbox=difficult:false ~checkbox=truncated:false`
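For reference, below is a hedged sketch of a single-object annotation file in this layout and of how the
`truncated` and `difficult` flags can be read with the standard library; the label and coordinates are made up
for illustration, and a missing tag defaults to 0, mirroring the loader behavior in this commit.
```python
import xml.etree.ElementTree as ET

# A made-up single-object annotation in the Pascal VOC layout,
# including the optional 'truncated' and 'difficult' flags.
VOC_XML = """<annotation>
  <filename>frame_000001.jpg</filename>
  <size><width>1280</width><height>720</height><depth>3</depth></size>
  <object>
    <name>car</name>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>200</ymax></bndbox>
  </object>
</annotation>"""

root = ET.fromstring(VOC_XML)
for obj in root.iter('object'):
    bbox = obj.find('bndbox')
    truncated = obj.find('truncated')
    # A missing tag is treated as 0, as in the Pascal VOC loader.
    truncated = truncated.text if truncated is not None else 0
    points = [float(bbox.find(tag).text) for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
    print(obj.find('name').text, points, 'truncated:', truncated)
```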
#### Pascal loader description
- uploaded file: a zip archive with the following structure:
```bash
taskname.zip
├── frame_000001.xml
├── frame_000002.xml
├── frame_000003.xml
└── ...
```
It should be possible to match the CVAT frame (image name) with the image file name from the annotation `*.xml`
file (the `filename` tag, e.g. `<filename>2008_004457.jpg</filename>`). There are two options
(see the sketch of the matching logic after this list):
1. a full match between the image name and the file name from the annotation `*.xml`
file (if the task was created from images or an archive of images).
1. a match by frame number (if CVAT cannot match by name). The file name should have the format `frame_%06d.jpg`;
it is used when the task was created from a video.
- supported shapes: Rectangles
- limitations: only the Pascal VOC object detection format is supported
- additional comments: the CVAT task should be created with the full set of labels that may appear in the annotation files
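The matching logic above can be sketched as a two-step lookup: first by base file name, then by a frame number
extracted from the name. This mirrors the helper that the loaders used before it became an `Annotations` method,
but it is illustrative only, not the exact CVAT implementation; `frame_info` maps frame numbers to image metadata.
```python
import os
import re

def match_frame(frame_info, filename):
    """frame_info: {frame_number: {"path": ...}}, as kept by the Annotation class."""
    def stem(path):
        return os.path.splitext(os.path.basename(path))[0]
    # Option 1: full match by base file name.
    target = stem(filename)
    for frame_number, info in frame_info.items():
        if stem(info["path"]) == target:
            return frame_number
    # Option 2: a single number embedded in the name, e.g. frame_000003.xml -> 3.
    numbers = re.findall(r"\d+", filename)
    if len(numbers) == 1:
        return int(numbers[0])
    raise Exception("Cannot match filename or determine frame number for filename {}".format(filename))

print(match_frame({3: {"path": "data/2008_004457.jpg"}}, "2008_004457.xml"))  # -> 3
```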
#### How to create a task from Pascal VOC dataset
1. Download the Pascal VOC dataset (it can be downloaded from the
[PASCAL VOC website](http://host.robots.ox.ac.uk/pascal/VOC/))
1. Create a CVAT task with the following labels:
```bash
aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tvmonitor
```
You can add `~checkbox=difficult:false ~checkbox=truncated:false` attributes for each label if you want to use them.
Select the image files you are interested in
(see the [Creating an annotation task](cvat/apps/documentation/user_guide.md#creating-an-annotation-task)
guide for details)
1. Zip the corresponding annotation files
1. Click the `Upload annotation` button, choose `Pascal VOC ZIP 1.0`,
and select the `*.zip` file with annotations from the previous step.
It may take some time.
### [YOLO](https://pjreddie.com/darknet/yolo/)
#### YOLO dumper description
- downloaded file: a zip archive with the following structure:
```bash
taskname.zip
├── frame_000001.txt
├── frame_000002.txt
├── ...
└── obj.names
```
Each annotation `*.txt` file has a name that corresponds to the name of the image file
(e.g. `frame_000001.txt` is the annotation for the `frame_000001.jpg` image).
Short description of the `*.txt` file structure: each line describes a label and a bounding box
in the format `label_id cx cy w h`, where `cx`, `cy`, `w`, and `h` are float values relative
to the image width and height (see the conversion sketch below).
`obj.names` contains the ordered list of label names.
- supported shapes: Rectangles
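A sketch of the conversion between CVAT's absolute `[xtl, ytl, xbr, ybr]` rectangle and the relative
`cx cy w h` values used here, based on the formulas in darknet's `voc_label.py` that the dumper references:
```python
def to_yolo(img_size, box):
    """img_size: (width, height) in pixels; box: (xtl, ytl, xbr, ybr) in pixels."""
    width, height = img_size
    xtl, ytl, xbr, ybr = box
    # Center and size, normalized to the image dimensions.
    return ((xtl + xbr) / 2 / width, (ytl + ybr) / 2 / height,
            (xbr - xtl) / width, (ybr - ytl) / height)

def from_yolo(img_size, box):
    """Inverse conversion back to absolute corner coordinates."""
    width, height = img_size
    cx, cy, w, h = box
    return ((cx - w / 2) * width, (cy - h / 2) * height,
            (cx + w / 2) * width, (cy + h / 2) * height)

restored = from_yolo((1280, 720), to_yolo((1280, 720), (100, 50, 300, 200)))
print([round(value, 6) for value in restored])  # [100.0, 50.0, 300.0, 200.0]
```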
#### YOLO loader description
- uploaded file: a zip archive with the following structure:
```bash
taskname.zip
├── frame_000001.txt
├── frame_000002.txt
├── frame_000003.txt
├── ...
└── obj.names
```
It should be possible to match the CVAT frame (image name) with the annotation file name.
There are two options:
1. a full match between the image name and the name of the annotation `*.txt` file
(if the task was created from images or an archive of images).
1. a match by frame number (if CVAT cannot match by name). The file name should have the format `frame_%06d.jpg`;
it is used when the task was created from a video.
- supported shapes: Rectangles
- additional comments: the CVAT task should be created with the full set of labels that may appear in the annotation files
#### How to create a task from a YOLO formatted dataset (from VOC, for example)
1. Follow the official [guide](https://pjreddie.com/darknet/yolo/) (see the "Training YOLO on VOC" section)
and prepare the YOLO formatted annotation files.
1. Zip the train images
```bash
zip images.zip -j -@ < train.txt
```
1. Create a CVAT task with the following labels:
```bash
aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tvmonitor
```
Select `images.zip` as data. Most likely you should use the `share`
functionality because `images.zip` is larger than 500 MB.
See [Creating an annotation task](cvat/apps/documentation/user_guide.md#creating-an-annotation-task)
guide for details.
1. Create `obj.names` with the following content:
```bash
aeroplane
bicycle
bird
boat
bottle
bus
car
cat
chair
cow
diningtable
dog
horse
motorbike
person
pottedplant
sheep
sofa
train
tvmonitor
```
1. Zip all label files together (only the label files that correspond to the train subset need to be added)
```bash
cat train.txt | while read p; do fn="${p##*/}"; echo "${p%/*/*}/labels/${fn%%.*}.txt"; done | zip labels.zip -j -@ obj.names
```
1. Click the `Upload annotation` button, choose `YOLO ZIP 1.0`, and select the `*.zip` file with labels from the previous step.
It may take some time.
### [MS COCO Object Detection](http://cocodataset.org/#format-data)
#### COCO dumper description
- downloaded file: a single unpacked `*.json` file. A detailed description of the MS COCO format can be found [here](http://cocodataset.org/#format-data)
- supported shapes: Polygons, Rectangles (interpreted as polygons)
#### COCO loader description
- uploaded file: a single unpacked `*.json` file (a loading sketch is given below this list).
- supported shapes: Polygons (the `segmentation` field must not be empty)
- additional comments: the CVAT task should be created with the full set of labels that may appear in the annotation files
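A minimal sketch of reading such a file with `pycocotools` (the same library the loader uses);
`instances.json` is a placeholder path to an unpacked annotation file:
```python
from pycocotools.coco import COCO

coco = COCO('instances.json')  # placeholder path
labels = {cat['id']: cat['name'] for cat in coco.loadCats(coco.getCatIds())}
for img_id in coco.getImgIds():
    img = coco.loadImgs(ids=img_id)[0]
    for ann in coco.loadAnns(coco.getAnnIds(imgIds=img_id)):
        # Only list-style (polygon) segmentations map to CVAT polygon shapes;
        # RLE masks are decoded separately with pycocotools' mask utilities.
        if isinstance(ann.get('segmentation'), list) and ann['segmentation']:
            print(img['file_name'], labels[ann['category_id']], ann['segmentation'][0][:6])
```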
#### How to create a task from MS COCO dataset
1. Download the [MS COCO dataset](http://cocodataset.org/#download).
For example [2017 Val images](http://images.cocodataset.org/zips/val2017.zip)
and [2017 Train/Val annotations](http://images.cocodataset.org/annotations/annotations_trainval2017.zip).
1. Create a CVAT task with the following labels:
```bash
person bicycle car motorcycle airplane bus train truck boat "traffic light" "fire hydrant" "stop sign" "parking meter" bench bird cat dog horse sheep cow elephant bear zebra giraffe backpack umbrella handbag tie suitcase frisbee skis snowboard "sports ball" kite "baseball bat" "baseball glove" skateboard surfboard "tennis racket" bottle "wine glass" cup fork knife spoon bowl banana apple sandwich orange broccoli carrot "hot dog" pizza donut cake chair couch "potted plant" bed "dining table" toilet tv laptop mouse remote keyboard "cell phone" microwave oven toaster sink refrigerator book clock vase scissors "teddy bear" "hair drier" toothbrush
```
Select `val2017.zip` as data
(see the [Creating an annotation task](cvat/apps/documentation/user_guide.md#creating-an-annotation-task)
guide for details)
1. Unpack `annotations_trainval2017.zip`
1. Click the `Upload annotation` button,
choose `COCO JSON 1.0`, and select the `instances_val2017.json` annotation file. It may take some time.
### [TFRecord](https://www.tensorflow.org/tutorials/load_data/tf_records)
TFRecord is a very flexible format, but we try to stay close to the format used in
[TF object detection](https://github.com/tensorflow/models/tree/master/research/object_detection)
with minimal modifications.
The following feature description is used:
```python
image_feature_description = {
    'image/filename': tf.io.FixedLenFeature([], tf.string),
    'image/source_id': tf.io.FixedLenFeature([], tf.string),
    'image/height': tf.io.FixedLenFeature([], tf.int64),
    'image/width': tf.io.FixedLenFeature([], tf.int64),
    # Object boxes and classes.
    'image/object/bbox/xmin': tf.io.VarLenFeature(tf.float32),
    'image/object/bbox/xmax': tf.io.VarLenFeature(tf.float32),
    'image/object/bbox/ymin': tf.io.VarLenFeature(tf.float32),
    'image/object/bbox/ymax': tf.io.VarLenFeature(tf.float32),
    'image/object/class/label': tf.io.VarLenFeature(tf.int64),
    'image/object/class/text': tf.io.VarLenFeature(tf.string),
}
```
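A sketch of reading a dumped `.tfrecord` file back with this feature description (TensorFlow 2.x eager mode
assumed; `task.tfrecord` is a placeholder name, and box coordinates follow the TF object detection convention
of being normalized to `[0, 1]`):
```python
import tensorflow as tf

image_feature_description = {
    'image/filename': tf.io.FixedLenFeature([], tf.string),
    'image/source_id': tf.io.FixedLenFeature([], tf.string),
    'image/height': tf.io.FixedLenFeature([], tf.int64),
    'image/width': tf.io.FixedLenFeature([], tf.int64),
    'image/object/bbox/xmin': tf.io.VarLenFeature(tf.float32),
    'image/object/bbox/xmax': tf.io.VarLenFeature(tf.float32),
    'image/object/bbox/ymin': tf.io.VarLenFeature(tf.float32),
    'image/object/bbox/ymax': tf.io.VarLenFeature(tf.float32),
    'image/object/class/label': tf.io.VarLenFeature(tf.int64),
    'image/object/class/text': tf.io.VarLenFeature(tf.string),
}

for record in tf.data.TFRecordDataset(['task.tfrecord']):  # placeholder file name
    example = tf.io.parse_single_example(record, image_feature_description)
    width = int(example['image/width'])
    height = int(example['image/height'])
    # VarLen features come back sparse; densify before use and scale to pixels.
    xmins = tf.sparse.to_dense(example['image/object/bbox/xmin']).numpy() * width
    print(example['image/filename'].numpy().decode('utf-8'), height, width, xmins)
```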
#### TFRecord dumper description
- downloaded file: a zip archive with the following structure:
```bash
taskname.zip
├── task2.tfrecord
└── label_map.pbtxt
```
- supported shapes: Rectangles
#### TFRecord loader description
- uploaded file: a zip archive with the following structure:
```bash
taskname.zip
└── task2.tfrecord
```
- supported shapes: Rectangles
- additional comments: the CVAT task should be created with the full set of labels that may appear in the annotation files
#### How to create a task from a TFRecord dataset (from VOC2007, for example)
1. Create a `label_map.pbtxt` file with the following content:
```js
item {
  id: 1
  name: 'aeroplane'
}
item {
  id: 2
  name: 'bicycle'
}
item {
  id: 3
  name: 'bird'
}
item {
  id: 4
  name: 'boat'
}
item {
  id: 5
  name: 'bottle'
}
item {
  id: 6
  name: 'bus'
}
item {
  id: 7
  name: 'car'
}
item {
  id: 8
  name: 'cat'
}
item {
  id: 9
  name: 'chair'
}
item {
  id: 10
  name: 'cow'
}
item {
  id: 11
  name: 'diningtable'
}
item {
  id: 12
  name: 'dog'
}
item {
  id: 13
  name: 'horse'
}
item {
  id: 14
  name: 'motorbike'
}
item {
  id: 15
  name: 'person'
}
item {
  id: 16
  name: 'pottedplant'
}
item {
  id: 17
  name: 'sheep'
}
item {
  id: 18
  name: 'sofa'
}
item {
  id: 19
  name: 'train'
}
item {
  id: 20
  name: 'tvmonitor'
}
```
1. Use [create_pascal_tf_record.py](https://github.com/tensorflow/models/blob/master/research/object_detection/dataset_tools/create_pascal_tf_record.py)
to convert the VOC2007 dataset to TFRecord format.
For example:
```bash
python create_pascal_tf_record.py --data_dir <path to VOCdevkit> --set train --year VOC2007 --output_path pascal.tfrecord --label_map_path label_map.pbtxt
```
1. Zip the train images
```bash
cat <path to VOCdevkit>/VOC2007/ImageSets/Main/train.txt | while read p; do echo <path to VOCdevkit>/VOC2007/JPEGImages/${p}.jpg ; done | zip images.zip -j -@
```
1. Create a CVAT task with the following labels:
```bash
aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tvmonitor
```
Select `images.zip` as data.
See [Creating an annotation task](cvat/apps/documentation/user_guide.md#creating-an-annotation-task)
guide for details.
1. Zip the `pascal.tfrecord` and `label_map.pbtxt` files together
```bash
zip anno.zip -j <path to pascal.tfrecord> <path to label_map.pbtxt>
```
1. Click the `Upload annotation` button, choose `TFRecord ZIP 1.0`, and select the `*.zip` file
with annotations from the previous step. It may take some time.
### PNG mask
#### Mask dumper description
- downloaded file: a zip archive with the following structure:
```bash
taskname.zip
├── frame_000001.png
├── frame_000002.png
├── frame_000003.png
├── ...
└── colormap.txt
```
The mask is a PNG image with three (RGB) channels, where each pixel has its own color corresponding to a label.
Colors are generated according to the Pascal VOC color generation
[algorithm](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/htmldoc/devkit_doc.html#sec:voclabelcolormap)
(sketched below this list).
(0, 0, 0) is used for the background.
The `colormap.txt` file contains the values of the used colors in RGB format.
- supported shapes: Rectangles, Polygons
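A sketch of the referenced colormap algorithm: the bits of each label index are spread across the R, G, and B
channels from the most significant bit down, so index 0 yields the (0, 0, 0) background color.
```python
def voc_colormap(num_colors=256):
    """Generate the Pascal VOC label colormap as a list of (R, G, B) tuples."""
    def bit(value, idx):
        return (value >> idx) & 1

    colormap = []
    for label in range(num_colors):
        r = g = b = 0
        c = label
        for shift in range(7, -1, -1):  # fill channels from the high bit down
            r |= bit(c, 0) << shift
            g |= bit(c, 1) << shift
            b |= bit(c, 2) << shift
            c >>= 3
        colormap.append((r, g, b))
    return colormap

print(voc_colormap()[:3])  # [(0, 0, 0), (128, 0, 0), (0, 128, 0)]
```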
#### Mask loader description
Not supported

@@ -118,6 +118,8 @@ class Annotation:
self._host = host
self._create_callback=create_callback
self._MAX_ANNO_SIZE=30000
self._frame_info = {}
self._frame_mapping = {}
db_labels = self._db_task.label_set.all().prefetch_related('attributespec_set').order_by('pk')
@@ -189,6 +191,10 @@ class Annotation:
"height": db_image.height,
} for db_image in self._db_task.image_set.all()}
self._frame_mapping = {
self._get_filename(info["path"]): frame for frame, info in self._frame_info.items()
}
def _init_meta(self):
db_segments = self._db_task.segment_set.all().prefetch_related('job_set')
self._meta = OrderedDict([
@@ -424,3 +430,15 @@ class Annotation:
@property
def frame_info(self):
return self._frame_info
@staticmethod
def _get_filename(path):
return os.path.splitext(os.path.basename(path))[0]
def match_frame(self, filename):
# try to match by filename
_filename = self._get_filename(filename)
if _filename in self._frame_mapping:
return self._frame_mapping[_filename]
raise Exception("Cannot match filename or determinate framenumber for {} filename".format(filename))

@@ -333,26 +333,6 @@ def load(file_object, annotations):
from pycocotools import mask as mask_utils
import numpy as np
def get_filename(path):
import os
return os.path.splitext(os.path.basename(path))[0]
def match_frame(frame_info, filename):
import re
# try to match by filename
yolo_filename = get_filename(filename)
for frame_number, info in frame_info.items():
cvat_filename = get_filename(info["path"])
if cvat_filename == yolo_filename:
return frame_number
# try to extract frame number from filename
numbers = re.findall(r"\d+", filename)
if numbers and len(numbers) == 1:
return int(numbers[0])
raise Exception("Cannot match filename or determinate framenumber for {} filename".format(filename))
coco = coco_loader.COCO(file_object.name)
labels={cat['id']: cat['name'] for cat in coco.loadCats(coco.getCatIds())}
@@ -360,7 +340,7 @@ def load(file_object, annotations):
for img_id in coco.getImgIds():
anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))
img = coco.loadImgs(ids=img_id)[0]
frame_number = match_frame(annotations.frame_info, img['file_name'])
frame_number = annotations.match_frame(img['file_name'])
for ann in anns:
group = 0
label_name = labels[ann['category_id']]

@@ -25,31 +25,12 @@ format_spec = {
def load(file_object, annotations):
from pyunpack import Archive
import os
import re
from tempfile import TemporaryDirectory
def match_frame(frame_info, filename):
def get_filename(path):
return os.path.splitext(os.path.basename(path))[0]
# try to match by filename
pascal_filename = get_filename(filename)
for frame_number, info in frame_info.items():
cvat_filename = get_filename(info['path'])
if cvat_filename == pascal_filename:
return frame_number
# try to extract framenumber from filename
numbers = re.findall(r'\d+', filename)
if numbers and len(numbers) == 1:
return int(numbers[0])
raise Exception('Cannot match filename or determinate framenumber for {} filename'.format(filename))
def parse_xml_file(annotation_file):
import xml.etree.ElementTree as ET
root = ET.parse(annotation_file).getroot()
frame_number = match_frame(annotations.frame_info, root.find('filename').text)
frame_number = annotations.match_frame(root.find('filename').text)
for obj_tag in root.iter('object'):
bbox_tag = obj_tag.find("bndbox")
@@ -58,6 +39,10 @@ def load(file_object, annotations):
ymin = float(bbox_tag.find('ymin').text)
xmax = float(bbox_tag.find('xmax').text)
ymax = float(bbox_tag.find('ymax').text)
truncated = obj_tag.find('truncated')
truncated = truncated.text if truncated is not None else 0
difficult = obj_tag.find('difficult')
difficult = difficult.text if difficult is not None else 0
annotations.add_shape(annotations.LabeledShape(
type='rectangle',
@@ -65,7 +50,10 @@ def load(file_object, annotations):
label=label,
points=[xmin, ymin, xmax, ymax],
occluded=False,
attributes=[],
attributes=[
annotations.Attribute('truncated', truncated),
annotations.Attribute('difficult', difficult),
],
))
archive_file = getattr(file_object, 'name')
@@ -97,12 +85,30 @@ def dump(file_object, annotations):
for shape in frame_annotation.labeled_shapes:
if shape.type != "rectangle":
continue
label = shape.label
xtl = shape.points[0]
ytl = shape.points[1]
xbr = shape.points[2]
ybr = shape.points[3]
writer.addObject(label, xtl, ytl, xbr, ybr)
difficult = 0
truncated = 0
for attribute in shape.attributes:
if attribute.name == 'truncated' and 'true' == attribute.value.lower():
truncated = 1
elif attribute.name == 'difficult' and 'true' == attribute.value.lower():
difficult = 1
writer.addObject(
name=label,
xmin=xtl,
ymin=ytl,
xmax=xbr,
ymax=ybr,
truncated=truncated,
difficult=difficult,
)
anno_name = os.path.basename('{}.{}'.format(os.path.splitext(image_name)[0], 'xml'))
anno_file = os.path.join(out_dir, anno_name)

@@ -81,7 +81,7 @@ def dump(file_object, annotations):
'image/height': int64_feature(height),
'image/width': int64_feature(width),
'image/filename': bytes_feature(image_name.encode('utf8')),
'image/source_id': int64_feature(img_id),
'image/source_id': bytes_feature(str(img_id).encode('utf8')),
'image/object/bbox/xmin': float_list_feature(xmins),
'image/object/bbox/xmax': float_list_feature(xmaxs),
'image/object/bbox/ymin': float_list_feature(ymins),
@@ -138,7 +138,7 @@ def load(file_object, annotations):
dataset = tf.data.TFRecordDataset(filenames)
image_feature_description = {
'image/filename': tf.io.FixedLenFeature([], tf.string),
'image/source_id': tf.io.FixedLenFeature([], tf.int64),
'image/source_id': tf.io.FixedLenFeature([], tf.string),
'image/height': tf.io.FixedLenFeature([], tf.int64),
'image/width': tf.io.FixedLenFeature([], tf.int64),
# Object boxes and classes.
@@ -152,7 +152,7 @@ def load(file_object, annotations):
for record in dataset:
parsed_record = tf.io.parse_single_example(record, image_feature_description)
frame_number = tf.cast(parsed_record['image/source_id'], tf.int64).numpy().item()
frame_number = annotations.match_frame(parsed_record['image/filename'].numpy().decode('utf-8'))
frame_height = tf.cast(parsed_record['image/height'], tf.int64).numpy().item()
frame_width = tf.cast(parsed_record['image/width'], tf.int64).numpy().item()
xmins = tf.sparse.to_dense(parsed_record['image/object/bbox/xmin']).numpy()

@@ -22,10 +22,6 @@ format_spec = {
],
}
def get_filename(path):
import os
return os.path.splitext(os.path.basename(path))[0]
def load(file_object, annotations):
from pyunpack import Archive
import os
@@ -49,24 +45,8 @@ def load(file_object, annotations):
label_id, x, y, w, h = obj.split(" ")
return int(label_id), convert_from_yolo(img_size, (float(x), float(y), float(w), float(h)))
def match_frame(frame_info, filename):
import re
# try to match by filename
yolo_filename = get_filename(filename)
for frame_number, info in frame_info.items():
cvat_filename = get_filename(info["path"])
if cvat_filename == yolo_filename:
return frame_number
# try to extract frame number from filename
numbers = re.findall(r"\d+", filename)
if numbers and len(numbers) == 1:
return int(numbers[0])
raise Exception("Cannot match filename or determinate framenumber for {} filename".format(filename))
def parse_yolo_file(annotation_file, labels_mapping):
frame_number = match_frame(annotations.frame_info, annotation_file)
frame_number = annotations.match_frame(annotation_file)
with open(annotation_file, "r") as fp:
line = fp.readline()
while line:
@@ -105,6 +85,7 @@ def load(file_object, annotations):
def dump(file_object, annotations):
from zipfile import ZipFile
import os
# conversion formulas are based on https://github.com/pjreddie/darknet/blob/master/scripts/voc_label.py
# <x> <y> <width> <height> - float values relative to width and height of image
@@ -122,7 +103,7 @@ def dump(file_object, annotations):
with ZipFile(file_object, "w") as output_zip:
for frame_annotation in annotations.group_by_frame():
image_name = frame_annotation.name
annotation_name = "{}.txt".format(get_filename(image_name))
annotation_name = "{}.txt".format(os.path.splitext(os.path.basename(image_name))[0])
width = frame_annotation.width
height = frame_annotation.height

@@ -18,6 +18,9 @@ from unittest import mock
import io
import xml.etree.ElementTree as ET
from collections import defaultdict
import zipfile
from pycocotools import coco as coco_loader
import tempfile
def create_db_users(cls):
(group_admin, _) = Group.objects.get_or_create(name="admin")
@@ -1540,7 +1543,7 @@ class TaskDataAPITestCase(APITestCase):
response = self._create_task(None, data)
self.assertEqual(response.status_code, status.HTTP_401_UNAUTHORIZED)
def compare_objects(self, obj1, obj2, ignore_keys):
def compare_objects(self, obj1, obj2, ignore_keys, fp_tolerance=.001):
if isinstance(obj1, dict):
self.assertTrue(isinstance(obj2, dict), "{} != {}".format(obj1, obj2))
for k in obj1.keys():
@@ -1553,7 +1556,10 @@ def compare_objects(self, obj1, obj2, ignore_keys):
for v1, v2 in zip(obj1, obj2):
compare_objects(self, v1, v2, ignore_keys)
else:
self.assertEqual(obj1, obj2)
if isinstance(obj1, float) or isinstance(obj2, float):
self.assertAlmostEqual(obj1, obj2, delta=fp_tolerance)
else:
self.assertEqual(obj1, obj2)
class JobAnnotationAPITestCase(APITestCase):
def setUp(self):
@@ -2117,25 +2123,38 @@ class TaskAnnotationAPITestCase(JobAnnotationAPITestCase):
return response
def _upload_api_v1_tasks_id_annotations(self, pk, user, data, query_params=""):
with ForceLogin(user, self.client):
response = self.client.put(
path="/api/v1/tasks/{0}/annotations?{1}".format(pk, query_params),
data=data,
format="multipart",
)
return response
def _get_annotation_formats(self, user):
with ForceLogin(user, self.client):
response = self.client.get(
path="/api/v1/server/annotation/formats"
)
return response
def _check_response(self, response, data):
if not response.status_code in [
status.HTTP_401_UNAUTHORIZED, status.HTTP_403_FORBIDDEN]:
compare_objects(self, data, response.data, ignore_keys=["id"])
def _run_api_v1_tasks_id_annotations(self, owner, assignee, annotator):
task, jobs = self._create_task(owner, assignee)
task, _ = self._create_task(owner, assignee)
if annotator:
HTTP_200_OK = status.HTTP_200_OK
HTTP_204_NO_CONTENT = status.HTTP_204_NO_CONTENT
HTTP_400_BAD_REQUEST = status.HTTP_400_BAD_REQUEST
HTTP_202_ACCEPTED = status.HTTP_202_ACCEPTED
HTTP_201_CREATED = status.HTTP_201_CREATED
else:
HTTP_200_OK = status.HTTP_401_UNAUTHORIZED
HTTP_204_NO_CONTENT = status.HTTP_401_UNAUTHORIZED
HTTP_400_BAD_REQUEST = status.HTTP_401_UNAUTHORIZED
HTTP_202_ACCEPTED = status.HTTP_401_UNAUTHORIZED
HTTP_201_CREATED = status.HTTP_401_UNAUTHORIZED
data = {
"version": 0,
@@ -2503,51 +2522,262 @@ class TaskAnnotationAPITestCase(JobAnnotationAPITestCase):
"create", data)
self.assertEqual(response.status_code, HTTP_400_BAD_REQUEST)
cvat_format = AnnotationFormat.objects.get(name="CVAT")
for annotation_handler in cvat_format.annotationdumper_set.all():
response = self._dump_api_v1_tasks_id_annotations(task["id"], annotator,
"format={}".format(annotation_handler.display_name))
self.assertEqual(response.status_code, HTTP_202_ACCEPTED)
response = self._dump_api_v1_tasks_id_annotations(task["id"], annotator,
"format={}".format(annotation_handler.display_name))
self.assertEqual(response.status_code, HTTP_201_CREATED)
response = self._dump_api_v1_tasks_id_annotations(task["id"], annotator,
"action=download&format={}".format(annotation_handler.display_name))
self.assertEqual(response.status_code, HTTP_200_OK)
self._check_dump_response(response, task, jobs, data)
def _check_dump_response(self, response, task, jobs, data):
if response.status_code == status.HTTP_200_OK:
def etree_to_dict(t):
d = {t.tag: {} if t.attrib else None}
children = list(t)
if children:
dd = defaultdict(list)
for dc in map(etree_to_dict, children):
for k, v in dc.items():
dd[k].append(v)
d = {t.tag: {k: v[0] if len(v) == 1 else v
for k, v in dd.items()}}
if t.attrib:
d[t.tag].update(('@' + k, v) for k, v in t.attrib.items())
if t.text:
text = t.text.strip()
if not (children or t.attrib):
d[t.tag] = text
return d
self.assertTrue(response.streaming)
content = io.BytesIO(b''.join(response.streaming_content))
xmldump = ET.fromstring(content.read())
def _run_api_v1_tasks_id_annotations_dump_load(self, owner, assignee, annotator):
if annotator:
HTTP_200_OK = status.HTTP_200_OK
HTTP_204_NO_CONTENT = status.HTTP_204_NO_CONTENT
HTTP_202_ACCEPTED = status.HTTP_202_ACCEPTED
HTTP_201_CREATED = status.HTTP_201_CREATED
else:
HTTP_200_OK = status.HTTP_401_UNAUTHORIZED
HTTP_204_NO_CONTENT = status.HTTP_401_UNAUTHORIZED
HTTP_202_ACCEPTED = status.HTTP_401_UNAUTHORIZED
HTTP_201_CREATED = status.HTTP_401_UNAUTHORIZED
def _get_initial_annotation(annotation_format):
rectangle_tracks_with_attrs = [{
"frame": 0,
"label_id": task["labels"][0]["id"],
"group": 0,
"attributes": [
{
"spec_id": task["labels"][0]["attributes"][0]["id"],
"value": task["labels"][0]["attributes"][0]["values"][0]
},
],
"shapes": [
{
"frame": 0,
"points": [1.0, 2.1, 50.1, 30.22],
"type": "rectangle",
"occluded": False,
"outside": False,
"attributes": [
{
"spec_id": task["labels"][0]["attributes"][1]["id"],
"value": task["labels"][0]["attributes"][1]["default_value"]
}
]
},
{
"frame": 1,
"points": [2.0, 2.1, 77.2, 36.22],
"type": "rectangle",
"occluded": True,
"outside": True,
"attributes": [
{
"spec_id": task["labels"][0]["attributes"][1]["id"],
"value": task["labels"][0]["attributes"][1]["default_value"]
}
]
},
]
}]
rectangle_tracks_wo_attrs = [{
"frame": 1,
"label_id": task["labels"][1]["id"],
"group": 0,
"attributes": [],
"shapes": [
{
"frame": 1,
"attributes": [],
"points": [1.0, 2.1, 50.2, 36.6],
"type": "rectangle",
"occluded": False,
"outside": False
},
{
"frame": 2,
"attributes": [],
"points": [1.0, 2.1, 51, 36.6],
"type": "rectangle",
"occluded": False,
"outside": True
}
]
}]
rectangle_shapes_with_attrs = [{
"frame": 0,
"label_id": task["labels"][0]["id"],
"group": 0,
"attributes": [
{
"spec_id": task["labels"][0]["attributes"][0]["id"],
"value": task["labels"][0]["attributes"][0]["values"][0]
},
{
"spec_id": task["labels"][0]["attributes"][1]["id"],
"value": task["labels"][0]["attributes"][1]["default_value"]
}
],
"points": [1.0, 2.1, 10.6, 53.22],
"type": "rectangle",
"occluded": False
}]
rectangle_shapes_wo_attrs = [{
"frame": 1,
"label_id": task["labels"][1]["id"],
"group": 0,
"attributes": [],
"points": [2.0, 2.1, 40, 50.7],
"type": "rectangle",
"occluded": False
}]
polygon_shapes_wo_attrs = [{
"frame": 1,
"label_id": task["labels"][1]["id"],
"group": 0,
"attributes": [],
"points": [2.0, 2.1, 100, 30.22, 40, 77, 1, 3],
"type": "polygon",
"occluded": False
}]
annotations = {
"version": 0,
"tags": [],
"shapes": [],
"tracks": [],
}
if annotation_format == "CVAT XML 1.1 for videos":
annotations["tracks"] = rectangle_tracks_with_attrs + rectangle_tracks_wo_attrs
elif annotation_format == "CVAT XML 1.1 for images":
annotations["shapes"] = rectangle_shapes_with_attrs + rectangle_shapes_wo_attrs
elif annotation_format == "PASCAL VOC ZIP 1.0" or \
annotation_format == "YOLO ZIP 1.0" or \
annotation_format == "TFRecord ZIP 1.0":
annotations["shapes"] = rectangle_shapes_wo_attrs
elif annotation_format == "COCO JSON 1.0":
annotations["shapes"] = polygon_shapes_wo_attrs
elif annotation_format == "MASK ZIP 1.0":
annotations["shapes"] = rectangle_shapes_with_attrs + rectangle_shapes_wo_attrs + polygon_shapes_wo_attrs
annotations["tracks"] = rectangle_tracks_with_attrs + rectangle_tracks_wo_attrs
return annotations
response = self._get_annotation_formats(annotator)
self.assertEqual(response.status_code, HTTP_200_OK)
if annotator is not None:
supported_formats = response.data
else:
supported_formats = [{
"name": "CVAT",
"dumpers": [{
"display_name": "CVAT XML 1.1 for images"
}],
"loaders": [{
"display_name": "CVAT XML 1.1"
}]
}]
self.assertTrue(isinstance(supported_formats, list) and supported_formats)
for annotation_format in supported_formats:
for dumper in annotation_format["dumpers"]:
# 1. create task
task, jobs = self._create_task(owner, assignee)
# 2. add annotation
data = _get_initial_annotation(dumper["display_name"])
response = self._put_api_v1_tasks_id_annotations(task["id"], annotator, data)
data["version"] += 1
self.assertEqual(response.status_code, HTTP_200_OK)
self._check_response(response, data)
# 3. download annotation
response = self._dump_api_v1_tasks_id_annotations(task["id"], annotator,
"format={}".format(dumper["display_name"]))
self.assertEqual(response.status_code, HTTP_202_ACCEPTED)
response = self._dump_api_v1_tasks_id_annotations(task["id"], annotator,
"format={}".format(dumper["display_name"]))
self.assertEqual(response.status_code, HTTP_201_CREATED)
response = self._dump_api_v1_tasks_id_annotations(task["id"], annotator,
"action=download&format={}".format(dumper["display_name"]))
self.assertEqual(response.status_code, HTTP_200_OK)
# 4. check downloaded data
if response.status_code == status.HTTP_200_OK:
self.assertTrue(response.streaming)
content = io.BytesIO(b"".join(response.streaming_content))
self._check_dump_content(content, task, jobs, data, annotation_format["name"])
content.seek(0)
# 5. remove annotation from the task
response = self._delete_api_v1_tasks_id_annotations(task["id"], annotator)
data["version"] += 1
self.assertEqual(response.status_code, HTTP_204_NO_CONTENT)
# 6. upload annotation and check annotation
uploaded_data = {
"annotation_file": content,
}
for loader in annotation_format["loaders"]:
response = self._upload_api_v1_tasks_id_annotations(task["id"], annotator, uploaded_data, "format={}".format(loader["display_name"]))
self.assertEqual(response.status_code, HTTP_202_ACCEPTED)
response = self._upload_api_v1_tasks_id_annotations(task["id"], annotator, {}, "format={}".format(loader["display_name"]))
self.assertEqual(response.status_code, HTTP_201_CREATED)
response = self._get_api_v1_tasks_id_annotations(task["id"], annotator)
self.assertEqual(response.status_code, HTTP_200_OK)
data["version"] += 2 # upload is delete + put
self._check_response(response, data)
def _check_dump_content(self, content, task, jobs, data, annotation_format_name):
def etree_to_dict(t):
d = {t.tag: {} if t.attrib else None}
children = list(t)
if children:
dd = defaultdict(list)
for dc in map(etree_to_dict, children):
for k, v in dc.items():
dd[k].append(v)
d = {t.tag: {k: v[0] if len(v) == 1 else v
for k, v in dd.items()}}
if t.attrib:
d[t.tag].update(('@' + k, v) for k, v in t.attrib.items())
if t.text:
text = t.text.strip()
if not (children or t.attrib):
d[t.tag] = text
return d
if annotation_format_name == "CVAT":
xmldump = ET.fromstring(content.read())
self.assertEqual(xmldump.tag, "annotations")
tags = xmldump.findall("./meta")
self.assertEqual(len(tags), 1)
meta = etree_to_dict(tags[0])["meta"]
self.assertEqual(meta["task"]["name"], task["name"])
elif annotation_format_name == "PASCAL VOC":
self.assertTrue(zipfile.is_zipfile(content))
elif annotation_format_name == "YOLO":
self.assertTrue(zipfile.is_zipfile(content))
elif annotation_format_name == "COCO":
with tempfile.NamedTemporaryFile() as tmp_file:
tmp_file.write(content.read())
tmp_file.flush()
coco = coco_loader.COCO(tmp_file.name)
self.assertTrue(coco.getAnnIds())
elif annotation_format_name == "TFRecord":
self.assertTrue(zipfile.is_zipfile(content))
elif annotation_format_name == "MASK":
self.assertTrue(zipfile.is_zipfile(content))
def test_api_v1_tasks_id_annotations_admin(self):
self._run_api_v1_tasks_id_annotations(self.admin, self.assignee,
@@ -2560,7 +2790,16 @@ class TaskAnnotationAPITestCase(JobAnnotationAPITestCase):
def test_api_v1_tasks_id_annotations_no_auth(self):
self._run_api_v1_tasks_id_annotations(self.user, self.assignee, None)
def test_api_v1_tasks_id_annotations_dump_load_admin(self):
self._run_api_v1_tasks_id_annotations_dump_load(self.admin, self.assignee,
self.assignee)
def test_api_v1_tasks_id_annotations_dump_load_user(self):
self._run_api_v1_tasks_id_annotations_dump_load(self.user, self.assignee,
self.assignee)
def test_api_v1_tasks_id_annotations_dump_load_no_auth(self):
self._run_api_v1_tasks_id_annotations_dump_load(self.user, self.assignee, None)
class ServerShareAPITestCase(APITestCase):
def setUp(self):
