Data Augmentation Methods in PyTorch

Tom2Code

Published 2022-11-21 12:04:45

This article is included in the column: Tom

torchvision, the vision package that accompanies PyTorch, has three workhorse modules:

datasets: contains many ready-made datasets
models: contains many pretrained models
transforms: contains methods for transforming data, including data augmentation

Today we will look at some of the methods in transforms:

1.torchvision.transforms.RandomCrop()

Crops the image at a random location. The meanings of its five parameters are given below:

Init signature:
torchvision.transforms.RandomCrop(
    size,
    padding=None,
    pad_if_needed=False,
    fill=0,
    padding_mode='constant',
)
Docstring:     
Crop the given PIL Image at a random location.

Args:
    size (sequence or int): Desired output size of the crop. If size is an
        int instead of sequence like (h, w), a square crop (size, size) is
        made.
    padding (int or sequence, optional): Optional padding on each border
        of the image. Default is None, i.e no padding. If a sequence of length
        4 is provided, it is used to pad left, top, right, bottom borders
        respectively. If a sequence of length 2 is provided, it is used to
        pad left/right, top/bottom borders, respectively.
    pad_if_needed (boolean): It will pad the image if smaller than the
        desired size to avoid raising an exception. Since cropping is done
        after padding, the padding seems to be done at a random offset.
    fill: Pixel fill value for constant fill. Default is 0. If a tuple of
        length 3, it is used to fill R, G, B channels respectively.
        This value is only used when the padding_mode is constant
    padding_mode: Type of padding. Should be: constant, edge, reflect or symmetric. Default is constant.

         - constant: pads with a constant value, this value is specified with fill

         - edge: pads with the last value on the edge of the image

         - reflect: pads with reflection of image (without repeating the last value on the edge)

            padding [1, 2, 3, 4] with 2 elements on both sides in reflect mode
            will result in [3, 2, 1, 2, 3, 4, 3, 2]

         - symmetric: pads with reflection of image (repeating the last value on the edge)

            padding [1, 2, 3, 4] with 2 elements on both sides in symmetric mode
            will result in [2, 1, 1, 2, 3, 4, 4, 3]

2.torchvision.transforms.RandomHorizontalFlip()

Flips the image horizontally at random; the single parameter p is the probability of flipping.

Init signature: torchvision.transforms.RandomHorizontalFlip(p=0.5)
Docstring:     
Horizontally flip the given PIL Image randomly with a given probability.

Args:
    p (float): probability of the image being flipped. Default value is 0.5

3.torchvision.transforms.RandomVerticalFlip()

Flips the image vertically at random; the parameter p is again the probability of flipping.

Init signature: torchvision.transforms.RandomVerticalFlip(p=0.5)
Docstring:     
Vertically flip the given PIL Image randomly with a given probability.

Args:
    p (float): probability of the image being flipped. Default value is 0.5

4.torchvision.transforms.RandomRotation()

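Rotates the image by a random angle. The first parameter, degrees, sets the range the angle is drawn from: a single number d means the range (-d, +d).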

Init signature:
torchvision.transforms.RandomRotation(
    degrees,
    resample=False,
    expand=False,
    center=None,
)
Docstring:     
Rotate the image by angle.

5.torchvision.transforms.ColorJitter()

Randomly changes the color properties of the image; the parameters are, in order, brightness, contrast, saturation, and hue.

Init signature:
torchvision.transforms.ColorJitter(
    brightness=0,
    contrast=0,
    saturation=0,
    hue=0,
)
Docstring:     
Randomly change the brightness, contrast and saturation of an image.

Args:
    brightness (float or tuple of float (min, max)): How much to jitter brightness.
        brightness_factor is chosen uniformly from [max(0, 1 - brightness), 1 + brightness]
        or the given [min, max]. Should be non negative numbers.
    contrast (float or tuple of float (min, max)): How much to jitter contrast.
        contrast_factor is chosen uniformly from [max(0, 1 - contrast), 1 + contrast]
        or the given [min, max]. Should be non negative numbers.
    saturation (float or tuple of float (min, max)): How much to jitter saturation.
        saturation_factor is chosen uniformly from [max(0, 1 - saturation), 1 + saturation]
        or the given [min, max]. Should be non negative numbers.
    hue (float or tuple of float (min, max)): How much to jitter hue.
        hue_factor is chosen uniformly from [-hue, hue] or the given [min, max].
        Should have 0<= hue <= 0.5 or -0.5 <= min <= max <= 0.5.

6.torchvision.transforms.RandomGrayscale()

Randomly converts the image to grayscale; the single parameter p is the probability of conversion.

Init signature: torchvision.transforms.RandomGrayscale(p=0.1)
Docstring:     
Randomly convert image to grayscale with a probability of p (default 0.1).

Args:
    p (float): probability that image should be converted to grayscale.

Returns:
    PIL Image: Grayscale version of the input image with probability p and unchanged
    with probability (1-p).
    - If input image is 1 channel: grayscale version is 1 channel
    - If input image is 3 channel: grayscale version is 3 channel with r == g == b

That covers six commonly used augmentation methods for today. Thanks for reading.

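This article is part of the Tencent Cloud self-media syndication program and was shared from a WeChat official account.

Originally published 2022-06-16. In case of copyright infringement, contact cloudcommunity@tencent.com to request removal.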
