Datumaro

Concept
Design
RC 1 vision

Concept

Datumaro is:

a tool to build composite datasets and iterate over them
a tool to create and maintain datasets
- Version control of annotations and images
- Publication (with removal of sensitive information)
- Editing
- Joining and splitting
- Exporting, format changing
- Image preprocessing
a dataset storage
a tool to debug datasets
- A network can be used to generate informative data subsets (e.g. with false-positives) to be analyzed further
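
As a rough illustration of the first point, a composite dataset can be thought of as several sources exposed through a single iterator. The sketch below is not Datumaro's API: `DatasetItem`, `CompositeDataset` and the source names are hypothetical and exist only to show the idea.

```python
from dataclasses import dataclass, field
from itertools import chain
from typing import Dict, Iterator, List


@dataclass
class DatasetItem:
    # Hypothetical item type: an id plus a list of annotation labels.
    id: str
    annotations: List[str] = field(default_factory=list)


class CompositeDataset:
    """Hypothetical composite: chains several named sources into one stream."""

    def __init__(self, sources: Dict[str, List[DatasetItem]]):
        self._sources = dict(sources)

    def __iter__(self) -> Iterator[DatasetItem]:
        # Visit the sources one after another, in insertion order.
        return chain.from_iterable(self._sources.values())

    def __len__(self) -> int:
        return sum(len(items) for items in self._sources.values())


source1 = [DatasetItem("a", ["cat"]), DatasetItem("b", ["dog"])]
source2 = [DatasetItem("c", ["cat"])]
dataset = CompositeDataset({"source1": source1, "source2": source2})
item_ids = [item.id for item in dataset]
```

User code only sees one iterable, so it does not need to know how many sources the dataset was built from.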

Requirements

User interfaces
- a library
- a console tool with visualization means
Targets: single datasets, composite datasets, single images / videos
Built-in support for well-known annotation formats and datasets: CVAT, COCO, PASCAL VOC, Cityscapes, ImageNet
Extensibility with user-provided components
Lightweightness - it should be easy to start working with Datumaro
- Minimal dependency on environment and configuration
- It should be easier to use Datumaro than to write your own code for computing statistics or manipulating datasets

Functionality and ideas

Blur sensitive areas on dataset images
Dataset annotation filters, relabelling etc.
Dataset augmentation
Calculation of statistics:
- Mean & std, custom stats
"Edit" command to modify annotations
Versioning (for images, annotations, subsets, sources etc., comparison)
Documentation generation
Provision of iterators for user code
Dataset building (export in a specific format, indexation, statistics, documentation)
Dataset exporting to other formats
Dataset debugging (run inference, generate dataset slices, compute statistics)
"Explainable AI" - highlight network attention areas (paper)
- Black-box approach
  - Classification, Detection, Segmentation, Captioning
- White-box approach
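
The "Mean & std" statistic listed above can be computed in a single pass over the dataset by accumulating sums. This is a minimal sketch, assuming single-channel images given as nested lists of numbers; `pixel_mean_std` is a hypothetical helper, not an existing Datumaro function.

```python
import math
from typing import Iterable, List, Tuple


def pixel_mean_std(images: Iterable[List[List[float]]]) -> Tuple[float, float]:
    """Mean and population std over all pixels of all images."""
    count = 0
    total = 0.0
    total_sq = 0.0
    for image in images:
        for row in image:
            for value in row:
                count += 1
                total += value
                total_sq += value * value
    if count == 0:
        raise ValueError("no pixels to compute statistics from")
    mean = total / count
    # Var[x] = E[x^2] - E[x]^2; clamp at 0 to guard against rounding error.
    variance = max(total_sq / count - mean * mean, 0.0)
    return mean, math.sqrt(variance)


# Images may have different sizes; only the running sums are kept.
images = [
    [[0.0, 2.0], [2.0, 4.0]],
    [[2.0]],
]
mean, std = pixel_mean_std(images)
```

Because only three running values are stored, the images can be streamed from an iterator and never need to be in memory at the same time.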

Research topics

exploration of network prediction uncertainty (aka Bayesian approach)
- Use case: explanation of network "quality", "stability", "certainty"
adversarial attacks on networks
dataset minification / reduction
- Use case: removal of redundant information to reach the same network quality with less training time
dataset expansion and filtration of additions
- Use case: add only important data
guidance for key frame selection for tracking (paper)
- Use case: more effective annotation, better predictions

Design

Command-line

The Docker command-line interface serves as the model. The interface is divided into contexts and shortcuts. Contexts are semantically grouped commands related to a single topic or target. Shortcuts are shorter alternatives for the most frequently used commands, plus special commands that do not fit into any specific context.
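
The split into contexts and shortcuts can be sketched with nested `argparse` subparsers. The program name `datum`, the `project` context, its `create` and `export` commands and the `--format` option are assumptions made for this example; the actual command set is not fixed by this document.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="datum")  # hypothetical program name
    top = parser.add_subparsers(dest="command", required=True)

    # Context: "project" groups the commands that act on one target.
    project = top.add_parser("project", help="project-related commands")
    project_cmds = project.add_subparsers(dest="action", required=True)

    create = project_cmds.add_parser("create", help="create a project")
    create.add_argument("name")
    create.set_defaults(target=("project", "create"))

    export = project_cmds.add_parser("export", help="export a project")
    export.add_argument("--format", default="coco")
    export.set_defaults(target=("project", "export"))

    # Shortcut: a top-level alias that resolves to the same target
    # as the longer "project export" form.
    shortcut = top.add_parser("export", help="shortcut for 'project export'")
    shortcut.add_argument("--format", default="coco")
    shortcut.set_defaults(target=("project", "export"))
    return parser


parser = build_parser()
long_form = parser.parse_args(["project", "export", "--format", "voc"])
short_form = parser.parse_args(["export", "--format", "voc"])
```

Both forms resolve to the same `target`, so a dispatcher only has to handle each command once.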

FreeMind tool link

High-level architecture

Using MVVM UI pattern

Datumaro project and environment structure

├── [datumaro module]
└── [project folder]
    ├── .datumaro/
    │   ├── config.yml
    │   ├── .git/
    │   ├── importers/
    │   │   ├── custom_format_importer1.py
    │   │   └── ...
    │   ├── statistics/
    │   │   ├── custom_statistic1.py
    │   │   └── ...
    │   ├── visualizers/
    │   │   ├── custom_visualizer1.py
    │   │   └── ...
    │   └── extractors/
    │       ├── custom_extractor1.py
    │       └── ...
    └── sources/
        ├── source1
        └── ...
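
One way the user-provided components in this layout could be picked up is to import every Python file found in the four plugin folders. This is only a sketch under that assumption: `load_plugins` and the generated module names are hypothetical, and the real loading mechanism may differ.

```python
import importlib.util
from pathlib import Path
from types import ModuleType
from typing import Dict, List

# Folder names taken from the project structure above.
PLUGIN_KINDS = ("importers", "statistics", "visualizers", "extractors")


def load_plugins(project_dir: Path) -> Dict[str, List[ModuleType]]:
    """Import every .py file found in the plugin folders of a project."""
    plugins: Dict[str, List[ModuleType]] = {kind: [] for kind in PLUGIN_KINDS}
    env_dir = Path(project_dir) / ".datumaro"
    for kind in PLUGIN_KINDS:
        folder = env_dir / kind
        if not folder.is_dir():
            continue  # a project may omit any of the plugin folders
        for path in sorted(folder.glob("*.py")):
            # Hypothetical naming scheme for the imported modules.
            name = f"datumaro_plugin_{kind}_{path.stem}"
            spec = importlib.util.spec_from_file_location(name, path)
            if spec is None or spec.loader is None:
                continue
            module = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(module)
            plugins[kind].append(module)
    return plugins
```

Calling `load_plugins(Path("project folder"))` returns the imported modules grouped by folder, which keeps custom importers, statistics, visualizers and extractors separate from the Datumaro module itself.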

RC 1 vision

In the first version, Datumaro should be a project manager for CVAT. It should only consume data from CVAT. The user can download the collected dataset and operate on it with the Datumaro CLI.

        User
          |
          v
 +------------------+
 |       CVAT       |
 +--------v---------+       +------------------+       +--------------+
 | Datumaro module  | ----> | Datumaro project | <---> | Datumaro CLI | <--- User
 +------------------+       +------------------+       +--------------+

Interfaces

Python API for user code
- Installation as a package
A command-line tool for dataset manipulations

Features

Dataset format support (reading, exporting)
- Own format
- COCO
- PASCAL VOC
- Cityscapes
- ImageNet
- CVAT
Dataset visualization (show)
- Ability to visualize a dataset
  - with TensorBoard
Calculation of statistics for datasets
- Pixel mean, std
- Object counts (detection scenario)
- Image-Class distribution (classification scenario)
- Pixel-Class distribution (segmentation scenario)
- Image clusters
- Custom statistics
Dataset building
- Composite dataset building
- Annotation remapping
- Subset splitting
- Dataset filtering (extract)
- Dataset merging (merge)
- Dataset item editing (edit)
Dataset comparison (diff)
- Annotation-annotation comparison
- Annotation-inference comparison
- Annotation quality estimation (for CVAT)
  - Provide a simple method to check annotation quality with a model and generate a summary
Dataset and model debugging
- Inference explanation (explain)
  - Black-box approach (RISE paper)
- Ability to run a model on a dataset and read the results
CVAT-integration features
- Task export
  - Datumaro project export
  - Dataset export
  - Original raw data (images, a video file) can be downloaded (exported) together with annotations, or the export can only contain links to the CVAT server (support for S3 and other storage is planned for the future)
    - Be able to use local files instead of remote links
      - Specify cache directory
- Use case "annotate for model training"
  - create a task
  - annotate
  - export the task
  - convert to a training format
  - train a DL model
- Use case "annotate and estimate quality"
  - create a task
  - annotate
  - estimate quality of annotations
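
The annotation-annotation and annotation-inference comparisons listed under "Dataset comparison (diff)" both need a way to pair up boxes from two sources. The sketch below uses greedy matching by intersection over union (IoU); that choice, the 0.5 threshold and the names `iou` and `match_boxes` are assumptions for illustration, not a description of how Datumaro does it.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_boxes(
    reference: List[Box], candidate: List[Box], threshold: float = 0.5
) -> Tuple[List[Tuple[int, int]], List[int], List[int]]:
    """Greedy one-to-one matching.

    Returns (matched index pairs, unmatched reference indices,
    unmatched candidate indices).
    """
    unused = list(range(len(candidate)))
    matches: List[Tuple[int, int]] = []
    missed: List[int] = []
    for i, ref in enumerate(reference):
        # Pick the still-unused candidate that overlaps this box the most.
        best = max(unused, key=lambda j: iou(ref, candidate[j]), default=None)
        if best is not None and iou(ref, candidate[best]) >= threshold:
            matches.append((i, best))
            unused.remove(best)
        else:
            missed.append(i)
    return matches, missed, unused


reference = [(0, 0, 10, 10), (20, 20, 30, 30)]
candidate = [(1, 1, 10, 10), (50, 50, 60, 60)]
matches, missed, extra = match_boxes(reference, candidate)
```

Unmatched reference boxes correspond to missed objects and unmatched candidate boxes to extra ones, which is the kind of summary an annotation quality check could report.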

Optional features

Dataset publishing
- Versioning (for annotations, subsets, sources, etc.)
- Blur sensitive areas on images
- Tracking of legal information
- Documentation generation
Dataset building
- Dataset minification / Extraction of the most representative subset
  - Use case: generate low-precision calibration dataset
Dataset and model debugging
- Training visualization
- Inference explanation (explain)
  - White-box approach

Properties

Lightweightness
Modularity
Extensibility
