ElasticTransform

class torchaug.transforms.ElasticTransform(alpha=50.0, sigma=5.0, interpolation=InterpolationMode.BILINEAR, fill=0, batch_inplace=False, batch_transform=False)

Transform the input with elastic transformations.

If the input is a torch.Tensor or a TATensor (e.g. Image, Video, BoundingBoxes), it can have an arbitrary number of leading batch dimensions. For example, an image can have shape [..., C, H, W] and a bounding box can have shape [..., 4].

Given alpha and sigma, the transform generates a displacement vector for every pixel from random offsets. alpha controls the strength of the displacements and sigma controls their smoothness. The displacements are added to an identity grid, and the resulting grid is used to transform the input.
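The roles of alpha and sigma can be illustrated with a minimal one-dimensional sketch that uses only the Python standard library. It is not the library's implementation: the helper names (gaussian_kernel, smooth, elastic_grid_1d), the pixel-unit scaling, and the border clamping are all assumptions made for illustration. The sketch draws random offsets in [-1, 1], smooths them with a Gaussian of width sigma, scales them by alpha, and adds them to an identity grid.

```python
import math
import random


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian weights covering roughly +/- 4 sigma."""
    radius = max(1, int(4 * sigma))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(values, kernel):
    """Convolve values with kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    last = len(values) - 1
    out = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)
            acc += w * values[j]
        out.append(acc)
    return out


def elastic_grid_1d(n, alpha, sigma, seed=0):
    """Hypothetical helper: return (grid, displacement) for n positions."""
    rng = random.Random(seed)
    # Random offsets in [-1, 1], one per position.
    noise = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # sigma sets the smoothness, alpha sets the strength (assumed pixel units).
    displacement = [alpha * d for d in smooth(noise, gaussian_kernel(sigma))]
    # The sampling grid is the identity grid plus the displacement.
    grid = [i + d for i, d in enumerate(displacement)]
    return grid, displacement


grid, displacement = elastic_grid_1d(n=64, alpha=5.0, sigma=3.0)
```

In this sketch a larger sigma averages the noise over more neighbours, so adjacent positions move together, while alpha bounds how far any position can move.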

Note

The transformation of bounding boxes is approximate, not exact. We construct an approximation of the inverse grid as inverse_grid = identity - displacement. This is not an exact inverse of the grid used to transform images, i.e. grid = identity + displacement. The assumption is that displacement * displacement is small and can be ignored, so large displacements lead to large errors in the approximation.
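To see why the approximation degrades with large displacements, consider a one-dimensional sketch. The displacement field below (a scaled sine) and the function names are hypothetical and chosen only for illustration. A point is mapped forward with identity + displacement and then mapped back with identity - displacement. The leftover error is second order in the displacement, so it grows roughly quadratically with the amplitude.

```python
import math


def displacement(x, amplitude):
    """Hypothetical smooth displacement field used only for this demo."""
    return amplitude * math.sin(x)


def round_trip_error(x, amplitude):
    """Map x forward, then back with the approximate inverse; return |error|."""
    y = x + displacement(x, amplitude)         # grid = identity + displacement
    x_approx = y - displacement(y, amplitude)  # inverse_grid = identity - displacement
    return abs(x_approx - x)


small = round_trip_error(1.0, 0.01)  # small displacement amplitude
large = round_trip_error(1.0, 0.1)   # amplitude ten times larger
print(f"error at amplitude 0.01: {small:.2e}")
print(f"error at amplitude 0.10: {large:.2e}")
```

Under these assumptions, increasing the amplitude tenfold increases the round-trip error by far more than a factor of ten. This matches the note above: bounding boxes are most reliable when alpha is small relative to the image size.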

Applications: Randomly transforms the morphology of objects in images and produces a see-through-water-like effect.

Parameters:

alpha (Union[float, Sequence[float]], optional) – Magnitude of displacements. Default: 50.0
sigma (Union[float, Sequence[float]], optional) – Smoothness of displacements. Default: 5.0
interpolation (Union[InterpolationMode, int], optional) – Desired interpolation enum defined by torchvision.transforms.InterpolationMode. If the input is a Tensor, only InterpolationMode.NEAREST and InterpolationMode.BILINEAR are supported. The corresponding Pillow integer constants, e.g. PIL.Image.BILINEAR, are accepted as well. Default: InterpolationMode.BILINEAR
fill (Union[int, float, Sequence[int], Sequence[float], None, Dict[Union[Type, str], Union[int, float, Sequence[int], Sequence[float], None]]], optional) – Pixel fill value used when the padding_mode is constant. If a tuple of length 3, it is used to fill the R, G, B channels respectively. The fill value can also be a dictionary mapping data types to fill values, e.g. fill={ta_tensors.Image: 127, ta_tensors.Mask: 0}, where Image will be filled with 127 and Mask will be filled with 0. Default: 0
batch_inplace (bool, optional) – whether to apply the batch transform in-place. This does not prevent functionals from making copies, but it can reduce time and memory consumption. Default: False
batch_transform (bool, optional) – whether to apply the transform in batch mode. Default: False