TA Tensors Tutorial#

All the TATensors#

Torchaug tensors are a subclass of Tensor. They are largely based on the torchvision tensors. Their use is described in detail in this section.

Therefore, a TATensor can be used in torch operations just like any Tensor, but some operations need to be considered carefully. We recommend that you first have a look at Torchvision’s documentation.

We define several TATensors, as described in the following subsections.

Image#

An Image is a tensor used to represent an image, just as a TV Image.

Its dimension is at least 3.

Video#

A Video is a tensor used to represent a video, just as a TV Video.

Its dimension is at least 4.

BoundingBoxes#

A BoundingBoxes is a tensor used to represent bounding boxes, just as a TV BoundingBoxes.

It has the following added attributes:

canvas_size: the size of the associated tensor image or video.
format: a BoundingBoxFormat.

Mask#

A Mask is a tensor used to represent a mask, just as a TV Mask.

Its dimension is at least 2.

BatchImages#

A BatchImages is a tensor used to represent a batch of images.

Its dimension is at least 4.

BatchVideos#

A BatchVideos is a tensor used to represent a batch of videos.

Its dimension is at least 5.

BatchBoundingBoxes#

A BatchBoundingBoxes is a tensor used to represent a batch of bounding boxes.

It has the following added attributes:

canvas_size: the size of the associated batch of images or videos.
format: a BoundingBoxFormat.
idx_sample: a list of the index of the first bounding box for each sample in the associated batch of images or videos. It is needed because each sample can define a different number of bounding boxes.

It also has specific behavior and dedicated class methods to handle some cases; these are documented.
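To make the role of idx_sample more concrete, here is a minimal pure-Python sketch of the bookkeeping it implies. It is not Torchaug's implementation: the helpers `first_indices` and `boxes_of_sample` are hypothetical names introduced for illustration, boxes are plain lists rather than tensors, and the sketch assumes the list holds only the first index of each sample, so the end of the last sample is taken from the total number of boxes.

```python
from itertools import accumulate
from typing import List, Sequence


def first_indices(counts: Sequence[int]) -> List[int]:
    """Return, for each sample, the index of its first box in the flat batch."""
    # Exclusive prefix sum: sample i starts where all previous samples end.
    return [0, *accumulate(counts)][:-1]


def boxes_of_sample(flat_boxes: list, idx_sample: List[int], i: int) -> list:
    """Recover the boxes of sample i from the flat list and the start indices."""
    start = idx_sample[i]
    # The next sample's start marks the end; the last sample runs to the end.
    end = idx_sample[i + 1] if i + 1 < len(idx_sample) else len(flat_boxes)
    return flat_boxes[start:end]


# Three samples holding 2, 1 and 3 boxes (XYXY coordinates as plain lists).
per_sample = [
    [[0, 0, 10, 10], [5, 5, 20, 20]],
    [[1, 1, 4, 4]],
    [[2, 2, 8, 8], [0, 0, 3, 3], [6, 6, 9, 9]],
]

# All boxes of the batch are stored contiguously, sample after sample.
flat_boxes = [box for sample in per_sample for box in sample]
idx_sample = first_indices([len(sample) for sample in per_sample])
```

Storing all boxes contiguously with a list of start indices avoids padding every sample to the same number of boxes, at the cost of having to keep the indices consistent whenever boxes are added or removed.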

BatchMasks#

A BatchMasks is a tensor used to represent a batch of masks.

It has the following added attributes:

idx_sample: a list of the index of the first mask for each sample in the associated batch of masks. It is needed because each sample can define a different number of masks.

It also has specific behavior and dedicated class methods to handle some cases; these are documented.

How TATensors are used#

Internally#

Internally, Torchaug uses the same notion of kernel for its functionals as in Torchvision. This means that each type of TATensor can have different transformations. The Transforms correctly dispatch the inputs to the functionals.

For example:

F.resize should work differently for BatchImages and BatchBoundingBoxes.
F.adjust_brightness_batch should be defined for BatchImages but not for BatchBoundingBoxes.
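The dispatch idea can be illustrated with a small, self-contained sketch. This is a toy model and not the Torchaug or Torchvision code: `ToyImage`, `ToyBoxes`, `register_kernel` and the two functionals below are hypothetical names, and a horizontal flip stands in for resize to keep the arithmetic short. It shows one functional with a different kernel per type, and another functional that is registered for images only.

```python
from typing import Callable, Dict, List, Tuple


class ToyImage:
    """Hypothetical stand-in for an image type: rows of pixel values."""

    def __init__(self, rows: List[List[int]]):
        self.rows = rows


class ToyBoxes:
    """Hypothetical stand-in for boxes: (x1, y1, x2, y2) plus canvas width."""

    def __init__(self, boxes: List[Tuple[int, int, int, int]], canvas_width: int):
        self.boxes = boxes
        self.canvas_width = canvas_width


# Registry mapping (functional name, input type) to the kernel to call.
_KERNELS: Dict[Tuple[str, type], Callable] = {}


def register_kernel(functional: str, tensor_cls: type) -> Callable:
    def decorator(kernel: Callable) -> Callable:
        _KERNELS[(functional, tensor_cls)] = kernel
        return kernel

    return decorator


def _dispatch(functional: str, inpt, *args):
    # Look up the kernel for this exact input type; fail loudly if missing.
    kernel = _KERNELS.get((functional, type(inpt)))
    if kernel is None:
        raise TypeError(f"{functional} has no kernel for {type(inpt).__name__}")
    return kernel(inpt, *args)


def hflip(inpt):
    return _dispatch("hflip", inpt)


def adjust_brightness(inpt, factor: float):
    return _dispatch("adjust_brightness", inpt, factor)


@register_kernel("hflip", ToyImage)
def _hflip_image(image: ToyImage) -> ToyImage:
    # Flipping an image reverses every row of pixels.
    return ToyImage([row[::-1] for row in image.rows])


@register_kernel("hflip", ToyBoxes)
def _hflip_boxes(boxes: ToyBoxes) -> ToyBoxes:
    # Flipping boxes mirrors the x coordinates around the canvas width.
    w = boxes.canvas_width
    return ToyBoxes([(w - x2, y1, w - x1, y2) for x1, y1, x2, y2 in boxes.boxes], w)


@register_kernel("adjust_brightness", ToyImage)
def _brightness_image(image: ToyImage, factor: float) -> ToyImage:
    # Only images carry pixel intensities, so no kernel is registered for boxes.
    return ToyImage([[min(255, int(p * factor)) for p in row] for row in image.rows])
```

In this sketch, calling `hflip` on a `ToyBoxes` transforms coordinates instead of pixels, while calling `adjust_brightness` on a `ToyBoxes` raises a `TypeError` because no kernel was registered for that type.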

In your code#

To use TA tensors, you can simply import the classes and instantiate them as you would a TVTensor:

import torch
from torchaug.ta_tensors import Image

image = Image(torch.randint(0, 256, (3, 224, 224), dtype=torch.uint8))
assert isinstance(image, Image)

To help you collate TATensors into batches, Torchaug provides a default_collate function to use with a DataLoader.

import torch
from torch.utils.data import Dataset, DataLoader
from torchaug.data.dataloader import default_collate
from torchaug.ta_tensors import Image, BatchImages

class CustomDataset(Dataset):
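    def __init__(self, length: int = 100):
        # Named `length` to avoid shadowing the built-in len().
        self._len = length

    def __getitem__(self, idx):
        # Each sample is a random uint8 image of shape (3, 224, 224).
        return Image(
            torch.randint(0, 256, (3, 224, 224), dtype=torch.uint8)
        )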

    def __len__(self):
        return self._len

dataloader = DataLoader(CustomDataset(), batch_size=2, collate_fn=default_collate)

batch = next(iter(dataloader))
assert isinstance(batch, BatchImages)
assert list(batch.shape) == [2, 3, 224, 224]