TA Tensors Tutorial

Introduction

Torchaug tensors (TATensors) are subclasses of Tensor. They are largely based on the Torchvision tensors (TVTensors). Their use is described in detail in this section.

Therefore, a TATensor can be used in torch operations just like any Tensor, but some operations need to be considered carefully. We recommend that you first have a look at Torchvision’s documentation.

All the TATensors

We define several TATensors, as described in the next subsections.

Image

An Image is a tensor used to represent an image, just like a TV Image.

It has at least 3 dimensions.

Video

A Video is a tensor used to represent a video, just like a TV Video.

It has at least 4 dimensions.

BoundingBoxes

A BoundingBoxes is a tensor used to represent bounding boxes, just like a TV BoundingBoxes.

It has the following added attributes:

  • canvas_size: the size of the associated image or video tensor.

  • format: a BoundingBoxFormat.
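To make the role of format concrete, here is a minimal pure-Python sketch of how the same four numbers are read differently depending on the format. It assumes the XYXY and XYWH conventions of Torchvision's BoundingBoxFormat and a canvas_size given as (height, width). The helper xywh_to_xyxy is hypothetical and is not part of Torchaug.

```python
# Hypothetical helper, not part of Torchaug: converts one box from
# XYWH (x, y, width, height) to XYXY (x1, y1, x2, y2).
def xywh_to_xyxy(box):
    x, y, w, h = box
    return [x, y, x + w, y + h]


# Assumed convention: canvas_size is (height, width) of the associated image.
canvas_size = (224, 224)

box_xywh = [10, 20, 30, 40]
box_xyxy = xywh_to_xyxy(box_xywh)
```

Without the format attribute, a transformation would have no way to know which of the two readings applies to a given tensor of four values.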

Mask

A Mask is a tensor used to represent masks, just like a TV Mask.

It has at least 2 dimensions.

Labels

A Labels is a tensor used to represent labels.

BatchImages

A BatchImages is a tensor used to represent a batch of images.

It has at least 4 dimensions.

BatchVideos

A BatchVideos is a tensor used to represent a batch of videos.

It has at least 5 dimensions.
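The minimum dimension counts stated so far can be collected in a small lookup. The sketch below is plain Python: the table only restates the numbers from this tutorial, and the function has_enough_dims is a hypothetical helper, not a Torchaug function.

```python
# Minimum number of dimensions per type, as stated in this tutorial.
MIN_NDIM = {
    "Image": 3,
    "Video": 4,
    "Mask": 2,
    "BatchImages": 4,
    "BatchVideos": 5,
}


def has_enough_dims(kind, shape):
    """Hypothetical check: True if `shape` has enough dimensions for `kind`."""
    return len(shape) >= MIN_NDIM[kind]


# A (3, 224, 224) shape is enough for an Image but not for a BatchImages,
# which needs one more leading dimension.
image_ok = has_enough_dims("Image", (3, 224, 224))
batch_ok = has_enough_dims("BatchImages", (3, 224, 224))
```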

BatchBoundingBoxes

A BatchBoundingBoxes is a tensor used to represent a batch of bounding boxes.

It has the following added attributes:

  • canvas_size: the size of the associated batch of images or videos.

  • format: a BoundingBoxFormat.

  • samples_ranges: a list of the ranges of indices of the bounding boxes for each sample.

It also has specific behavior and class methods for handling some cases, which are documented.
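The idea behind samples_ranges can be sketched in plain Python. The example assumes that each range is a half-open (start, end) pair of indices into the flat list of boxes; the exact representation used by Torchaug may differ, and build_samples_ranges is a hypothetical helper written for this illustration.

```python
# Hypothetical helper, not Torchaug's API: builds half-open (start, end)
# index ranges from the number of boxes in each sample.
def build_samples_ranges(boxes_per_sample):
    ranges = []
    start = 0
    for count in boxes_per_sample:
        ranges.append((start, start + count))
        start += count
    return ranges


# Three samples holding 2, 0, and 3 boxes, concatenated into one flat list.
boxes = ["a0", "a1", "c0", "c1", "c2"]
samples_ranges = build_samples_ranges([2, 0, 3])

# Recover the boxes that belong to the third sample.
start, end = samples_ranges[2]
third_sample_boxes = boxes[start:end]
```

Keeping the ranges next to the flat tensor lets samples with different numbers of boxes, including none, share a single batch without padding.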

BatchMasks

A BatchMasks is a tensor used to represent a batch of masks.

It has the following added attribute:

  • samples_ranges: a list of the ranges of indices of the masks for each sample.

It also has specific behavior and class methods for handling some cases, which are documented.

BatchLabels

A BatchLabels is a tensor used to represent a batch of labels.

It has the following added attribute:

  • samples_ranges: a list of the ranges of indices of the labels for each sample.

It also has specific behavior and class methods for handling some cases, which are documented.

How TATensors are used

Internally

Internally, Torchaug uses the same notion of kernel for its functionals as Torchvision. This means that each type of TATensor can have its own implementation of a given transformation. The Transforms dispatch their inputs to the matching functionals.

For example:

In your code

To use TATensors, simply import the classes and instantiate them as you would a TVTensor:

import torch
from torchaug.ta_tensors import Image

# Wrap a random uint8 tensor of shape (3, 224, 224) as an Image.
image = Image(torch.randint(0, 256, (3, 224, 224), dtype=torch.uint8))

To help you collate TATensors into batches, Torchaug provides a default_collate function to use as the collate_fn of a DataLoader.

import torch
from torch.utils.data import Dataset, DataLoader
from torchaug.data.dataloader import default_collate
from torchaug.ta_tensors import Image, BatchImages

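class CustomDataset(Dataset):
    def __init__(self, length: int = 100):
        # Named `length` to avoid shadowing the built-in `len`.
        self._length = length

    def __getitem__(self, idx):
        # Each sample is a random uint8 Image of shape (3, 224, 224).
        return Image(torch.randint(0, 256, (3, 224, 224), dtype=torch.uint8))

    def __len__(self):
        return self._length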

# default_collate gathers the per-sample Images into a single BatchImages.
dataloader = DataLoader(CustomDataset(), batch_size=2, collate_fn=default_collate)

batch = next(iter(dataloader))
assert isinstance(batch, BatchImages)
assert list(batch.shape) == [2, 3, 224, 224]