DeepMIB - Directories and Preprocessing tab

This tab allows choosing directories with images for training and prediction as well as various parameters used during image loading and preprocessing

Back to Index --> User Guide --> Menu --> Tools Menu --> Deep learning segmentation

expand all on page

Widgets and settings
Preprocessing of files
Organization of directories without preprocessing for semantic segmentation
Organization of directories with preprocessing for semantic segmentation
Organization of directories for 2D patch-wise workflow

expand all

Widgets and settings

Directory with images and labels for training
[ used only for training ]

use these widgets to select directory that contain images and model to be used for training. For the organization of directories see the organization schemes below.
For 2D networks the files should contain individual 2D images, while for 3D networks individual 3D datasets.
The extension ▼ dropdown menu on the right-hand side can be used to specify extension of the image files.
The [✓] Bio checkbox toggles standard or Bio-format readers for loading images. If the Bio-Format file is a collection of image, the Index... edit box can be used to specify an index of the file within the container.
For better performance, it is recommended to convert Bio-Formats compatible images to standard formats or to use the Preprocessing option (see below).

Important notes considering training files

Number of model or mask files should match the number of image files (one exception is 2D networks, where it is allowed to have a single model file in MIB *.model format, when Single MIB model file: ticked). This option requires data preprocessing
For labels in standard image formats it is important to specify number of classes including the Exterior into the Number of classes edit box
Important! It is not possible to use numbers as names of materials, please name materials in a sensible way when using the *.model format!

Directory with images for prediction
[ used only for prediction ]
use these widgets to specify directory with images for prediction (named 2_Prediction in the file organization schemes below).

The image files should be placed under Images subfolder (it is also possible to place images directly into a folder specified in this panel). Optionally, when the ground truth labels for prediction images are available, they can be placed under Labels subfolder.

When the preprocessing mode is used the images from this folder are converted and saved to 3_Results\Prediction images directory. When the ground truth labels are present, they are also processed and copied to 3_Results\PredictionImages\GroundTruthLabels. These labels can be used for evaluation of results (see for details).

For 2D networks the files should contain individual 2D images or 3D stacks, while for 3D networks individual 3D datasets.

The extension ▼ dropdown menu on the right-hand side can be used to specify extension of the image files. The [✓] Bio checkbox toggles standard or Bio-format readers for loading the images. If the Bio-Format file is a collection of image, the Index edit box can be used to specify an index of the file within the container.

Directory with resulting images
use these widgets to specify the main output directory; results and all preprocessed images are stored there.

All subfolders inside this directory are automatically created by Deep MIB:

Description of directories created by DeepMIB

PredictionImages, place for the prepocessed images for prediction
PredictionImages\GroundTruthLabels, place for ground truth labels for prediction images, when available
PredictionImages\ResultsModels, the main outout directory with generated labels after prediction. The 2D models can be combined in MIB by selecting the files using the ⇧ Shift+left mouse click during loading
PredictionImages\ResultsScores, folder for generated prediction scores (probability) for each material. The score values are scaled between 0 and 255
ScoreNetwork, for accuracy and loss score plots, when the Export training plots option of the Train tab is ticked and for storing checkpoints of the network after each epoch (or specified frequency), when the [✓] Save progress after each epoch checkbox is ticked. The score files are started with a date-time tag and overwritten when a new training is started
TrainImages, images to be used for training (only for preprocessing mode)
TrainLabels, labels accompanying images to be used for training (only for preprocessing mode)
ValidationImages, images to be used for validation during training (only for preprocessing mode)
ValidationLabels, labels accompanying images for validation (only for preprocessing mode)

Label file details

The [✓] Single MIB model file checkbox, (only for 2D networks) when checked, a single model file with labels will be used
The Labels extension ▼ dropdown, (only for 2D networks) is used to select extension of files containing models. For 3D network MIB model format is used
The Number of classes edit box, (TIF or PNG formats only) is used to define number of classes (including Exterior) in labels. For label files in MIB *.model format, this field will be updated automatically
[✓] Use masking checkbox is used when some parts of the training data should be excluded from training. The masks may be provided in various formats and number of mask files should match the number of image files. When mask files are provided the preprocessing operation has to be done. When USE 0-s IN LABELS ▼ is selected the mask is assumed to be areas of label files with 0-values. This option is recommended for work with masks without the preprocessing operation
- When USE 0-s IN LABELS ▼ is used, the first material in the prediction results will be assigned to the Exterior material, i.e. will acquire index 0.
  It is recommended to have the first material in the ground truth assigned to background!
- When mask with the preprocessing operation is used, the Exterior material will be used to indicate the background areas
- Masking may give drop in precision of training due to inconsistency within the image patches, it is recommended to minimize use of masking
Mask extension ▼ is used to select extension for files that contain masks. Without preprocessing (USE 0-s IN LABELS ▼ any files are allowed); with preprocessing only *.mask format is allowed for the 3D network

Additional settings

[✓] Compress processed images, tick to compress the processed images. The processed images are stored in *.mibImg format that can be loaded in MIB. *.mibImg is a variation of standard MATLAB format and can also be directly loaded into MATLAB using similar to this command:
res = load('img01.mibImg, '-mat');.
Compression of images slows down performance!
[✓] Compress processed labels, tick to compress labels during preprocessing. The processed labels are stored in *.mibCat format that can be loaded in MIB (Menu->Models->Load model). It is a variation of a standard MATLAB format, where the model is encoded using categorical class of MATLAB.
Compression of labels slows down performance but brings significant benefit of small file sizes
[✓] Use parallel processing, when ticked DeepMIB is using multiple cores to process images. Number of cores can be specified using the Workers edit box. The parallel processing during preprocessing operation brings significant decrease in time required for preprocessing.
Fraction of images for validation, define fraction of images that will be randomly (depending on Random generator seed) assigned into the validation set. When set to 0, the validation option will not be used during the training
Random generator seed, a number to initialize random seed generator, which defines how the images for training and validation are split. For reproducibility of tests keep value fixed. When random seed is initialized with 0, the random seed generator is shuffled based on the current system time
Preprocess for ▼, select mode of operation upon press of the Preprocess button. Results of the preprocessing operation for each mode are presented in schemes below

expand all

Preprocessing of files

Originally, the preprocessing of files in DeepMIB was required for most of workflows. Currently, however, DeepMIB is capable to work with unprocessed images most of times: use the Preprocessing is not required ▼ or Split for training/validation ▼ options.

When the preprocessing step is required or recommended

The preprocessing is recommended/required in the following situations:

when labels are stored in a single *.MODEL file
when training set is coming in proprietary formats that can only be read using BioFormats reader

During preprocessing the images and model files are converted to mibImg and mibCat formats (a variation of MATLAB standard data format) that are adapted for training and prediction.

expand all

Organization of directories without preprocessing for semantic segmentation

Without preprocessing, when datasets are manually split into training and validation sets

In this mode, the training image files are not preprocessed and loaded on-demand during network training. The image files should be split into subfolders TrainImages, TrainLabels and optional subfolders ValidationImages, ValidationLabels (for details see Snapshot with the legend below). The images may also be automatically split, see the following section for details.
When Bio-Format library is used, it is recommended to preprocess images for the training process to speed up the file reading performance.

Snapshot with the directory tree

Snapshot with the legend

Without preprocessing, with automatic splitting of datasets into training and validation sets

In this mode, the image and label files are randomly split into the train and validation sets. The split is done upon press of the Preprocess button, when Preprocess for: Split for training and validation
Splitting of the files depends on a seed value provided in the Random generator seed field; when seed is 0 a new random seed value is used each time the spliting is done.

Snapshot with the directory tree

Snapshot with the legend

expand all

Organization of directories with preprocessing for semantic segmentation

This mode is enabled when the Preprocess for has one of the following selections:

Training and prediction, to preprocess images for both training and prediction
Training, to preprocess images only for training
Prediction, to preprocess images only for prediction

The preprocessing starts by pressing of the Preprocess button.

The scheme below demonstrates organization of directories, when the preprocessing mode is used.

Snapshot with the directory tree

Snapshot with the legend

expand all

Organization of directories for 2D patch-wise workflow

The 2D patch-wise workflow requires slightly different organization of images in folders. In brief, instead having Images and Labels training directories, all images are organized in Images\[ClassnameN] subfolders. Where ClassnameN encodes a directory name with images patches that belong to ClassnameN class. Number of these subfolders should match number of classes to be used for training.

In contrast to semantic segmentation, the preprocessing is not used during the patch-wise mode.

Without preprocessing, when datasets are manually split into training and validation sets

The images for training should be organized in own subfolders named by corresponding class names and placed under:

1_Training\TrainImages, images to be used for training
1_Training\ValidationImages, images to be used for validation (optionally)

The images may also be automatically split into subfolder for training and validation.

Snapshot with the directory tree

bg and spots are examples of two class names

When the ground-truth data for prediction is present, it can be arranged in a similar way to the semantic segmentation under
2_Prediction\Images and
2_Prediction\Labels directories
or in subfolders named by class names as 2_Prediction\bg and 2_Prediction\spots, where bg and spots subfolders contain patches that belong to these classes.

Snapshot with the legend

Without preprocessing, with automatic splitting of datasets into training and validation sets

In this mode, the files are randomly split (depending on *Random generator seed*) into the train and validation sets. The split is done upon press of the Preprocess button, when Preprocess for: Split for training and validation
Initially, all images for training should be organized in own subfolders named by corresponding class names and placed under: 1_Training\Images

Snapshot with the directory tree

bg and spots are examples of two class names

Snapshot with the legend

Back to Index --> User Guide --> Menu --> Tools Menu --> Deep learning segmentation

DeepMIB - Directories and Preprocessing tab

Contents

Widgets and settings

Preprocessing of files

Organization of directories without preprocessing for semantic segmentation

Organization of directories with preprocessing for semantic segmentation

Organization of directories for 2D patch-wise workflow