uos-machine-learning

Introducing a Python Library: imgaug

Deep Learning

이산한하루 2019. 9. 12. 14:21

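Today I'd like to introduce imgaug, a library that makes image data augmentation easy when training deep learning models.

Data augmentation is a technique that perturbs the training data in order to obtain a model that generalizes better, and imgaug is a library that applies such augmentation to images.

The best part of imgaug is that it goes beyond rotating, adding noise to, or scaling the images themselves: the same transformations can also be applied to bounding boxes, heatmaps, segmentation maps, and keypoints.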
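To see why it matters that annotations follow the image, here is a minimal sketch in plain NumPy (it does not call imgaug). The helper name flip_lr_with_keypoints and the pixel-index convention for mirroring x are assumptions made for this illustration only.

```python
import numpy as np


def flip_lr_with_keypoints(image, keypoints_xy):
    """Mirror an image left-right and move its (x, y) keypoints with it."""
    width = image.shape[1]
    flipped = image[:, ::-1].copy()
    kps = np.asarray(keypoints_xy, dtype=np.float64).copy()
    # Pixel-index convention chosen for this sketch: column x moves to width-1-x.
    kps[:, 0] = (width - 1) - kps[:, 0]
    return flipped, kps


image = np.zeros((4, 6, 3), dtype=np.uint8)
image[1, 2] = 255  # mark the pixel at x=2, y=1
flipped, kps = flip_lr_with_keypoints(image, [[2, 1]])

x, y = kps[0].astype(int)
print(x, y)           # 3 1
print(flipped[y, x])  # [255 255 255]
```

If only the image were flipped, the keypoint would still point at x=2, which is now an empty pixel. A library that transforms both together avoids this kind of mismatch.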

Data augmentation examples from the GitHub repository

Installation

Since it is a Python library, installation is straightforward: it can be installed with either conda or pip.

Installation : Anaconda

conda config --add channels conda-forge
conda install imgaug

Installation : pip

pip install six numpy scipy Pillow matplotlib scikit-image opencv-python imageio Shapely
pip install imgaug

Example Images

There is a huge variety of augmenters.

https://github.com/aleju/imgaug 

 


You can browse them all, with example images, at the link above.

Code Examples

Let's see how it is used in code.

Example: Simple Training Setting

Below is the first example from the GitHub repository. It imports the imgaug.augmenters module and passes the list of augmenters to a Sequential container.

import numpy as np
import imgaug.augmenters as iaa


def load_batch(batch_idx):
    # dummy function, implement this
    # Return a numpy array of shape (N, height, width, #channels)
    # or a list of (height, width, #channels) arrays (may have different image
    # sizes).
    # Images should be in RGB for colorspace augmentations.
    # (cv2.imread() returns BGR!)
    # Images should usually be in uint8 with values from 0-255.
    return np.zeros((128, 32, 32, 3), dtype=np.uint8) + (batch_idx % 255)


def train_on_images(images):
    # dummy function, implement this
    pass


seq = iaa.Sequential([
    iaa.Crop(px=(0, 16)), # crop images from each side by 0 to 16px (randomly chosen)
    iaa.Fliplr(0.5), # horizontally flip 50% of the images
    iaa.GaussianBlur(sigma=(0, 3.0)) # blur images with a sigma of 0 to 3.0
])

for batch_idx in range(1000):
    images = load_batch(batch_idx)
    images_aug = seq.augment_images(images)  # done by the library
    train_on_images(images_aug)

Example: Very Complex Augmentation Pipeline

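The next example is a more complex augmentation pipeline. It uses helpers such as sometimes and SomeOf, which apply augmenters at random: Sometimes(0.5, ...) applies its augmenter to about half of the images, and SomeOf((0, 5), ...) applies between 0 and 5 of the listed augmenters per image. With random_order=True, the order in which the augmenters are applied is shuffled as well.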

import numpy as np
import imgaug as ia
import imgaug.augmenters as iaa


# random example images
images = np.random.randint(0, 255, (16, 128, 128, 3), dtype=np.uint8)

# Sometimes(0.5, ...) applies the given augmenter in 50% of all cases,
# e.g. Sometimes(0.5, GaussianBlur(0.3)) would blur roughly every second image.
sometimes = lambda aug: iaa.Sometimes(0.5, aug)

# Define our sequence of augmentation steps that will be applied to every image
# All augmenters with per_channel=0.5 will sample one value _per image_
# in 50% of all cases. In all other cases they will sample new values
# _per channel_.
seq = iaa.Sequential(
    [
        # apply the following augmenters to most images
        iaa.Fliplr(0.5), # horizontally flip 50% of all images
        iaa.Flipud(0.2), # vertically flip 20% of all images
        # crop images by -5% to 10% of their height/width
        sometimes(iaa.CropAndPad(
            percent=(-0.05, 0.1),
            pad_mode=ia.ALL,
            pad_cval=(0, 255)
        )),
        sometimes(iaa.Affine(
            scale={"x": (0.8, 1.2), "y": (0.8, 1.2)}, # scale images to 80-120% of their size, individually per axis
            translate_percent={"x": (-0.2, 0.2), "y": (-0.2, 0.2)}, # translate by -20 to +20 percent (per axis)
            rotate=(-45, 45), # rotate by -45 to +45 degrees
            shear=(-16, 16), # shear by -16 to +16 degrees
            order=[0, 1], # use nearest neighbour or bilinear interpolation (fast)
            cval=(0, 255), # if mode is constant, use a cval between 0 and 255
            mode=ia.ALL # use any of scikit-image's warping modes (see 2nd image from the top for examples)
        )),
        # execute 0 to 5 of the following (less important) augmenters per image
        # don't execute all of them, as that would often be way too strong
        iaa.SomeOf((0, 5),
            [
                sometimes(iaa.Superpixels(p_replace=(0, 1.0), n_segments=(20, 200))), # convert images into their superpixel representation
                iaa.OneOf([
                    iaa.GaussianBlur((0, 3.0)), # blur images with a sigma between 0 and 3.0
                    iaa.AverageBlur(k=(2, 7)), # blur image using local means with kernel sizes between 2 and 7
                    iaa.MedianBlur(k=(3, 11)), # blur image using local medians with kernel sizes between 3 and 11
                ]),
                iaa.Sharpen(alpha=(0, 1.0), lightness=(0.75, 1.5)), # sharpen images
                iaa.Emboss(alpha=(0, 1.0), strength=(0, 2.0)), # emboss images
                # search either for all edges or for directed edges,
                # blend the result with the original image using a blobby mask
                iaa.SimplexNoiseAlpha(iaa.OneOf([
                    iaa.EdgeDetect(alpha=(0.5, 1.0)),
                    iaa.DirectedEdgeDetect(alpha=(0.5, 1.0), direction=(0.0, 1.0)),
                ])),
                iaa.AdditiveGaussianNoise(loc=0, scale=(0.0, 0.05*255), per_channel=0.5), # add gaussian noise to images
                iaa.OneOf([
                    iaa.Dropout((0.01, 0.1), per_channel=0.5), # randomly remove up to 10% of the pixels
                    iaa.CoarseDropout((0.03, 0.15), size_percent=(0.02, 0.05), per_channel=0.2),
                ]),
                iaa.Invert(0.05, per_channel=True), # invert color channels
                iaa.Add((-10, 10), per_channel=0.5), # change brightness of images (by -10 to 10 of original value)
                iaa.AddToHueAndSaturation((-20, 20)), # change hue and saturation
                # either change the brightness of the whole image (sometimes
                # per channel) or change the brightness of subareas
                iaa.OneOf([
                    iaa.Multiply((0.5, 1.5), per_channel=0.5),
                    iaa.FrequencyNoiseAlpha(
                        exponent=(-4, 0),
                        first=iaa.Multiply((0.5, 1.5), per_channel=True),
                        second=iaa.ContrastNormalization((0.5, 2.0))
                    )
                ]),
                iaa.ContrastNormalization((0.5, 2.0), per_channel=0.5), # improve or worsen the contrast
                iaa.Grayscale(alpha=(0.0, 1.0)),
                sometimes(iaa.ElasticTransformation(alpha=(0.5, 3.5), sigma=0.25)), # move pixels locally around (with random strengths)
                sometimes(iaa.PiecewiseAffine(scale=(0.01, 0.05))), # sometimes move parts of the image around
                sometimes(iaa.PerspectiveTransform(scale=(0.01, 0.1)))
            ],
            random_order=True
        )
    ],
    random_order=True
)

images_aug = seq.augment_images(images)
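The random behavior of Sometimes, SomeOf, and random_order can be modeled with a few lines of plain Python. This is only a conceptual sketch, not imgaug's implementation: the functions sometimes, some_of, and tag are hypothetical names, and the toy "augmenters" just record their name instead of touching pixels.

```python
import random


def sometimes(p, aug, rng):
    """Return a function that applies `aug` with probability p."""
    def apply(x):
        return aug(x) if rng.random() < p else x
    return apply


def some_of(n_range, augs, rng, random_order=False):
    """Return a function that applies a random subset of `augs`."""
    def apply(x):
        n = rng.randint(n_range[0], min(n_range[1], len(augs)))
        # Without random_order, the chosen augmenters keep their listed order.
        chosen = sorted(rng.sample(range(len(augs)), n))
        if random_order:
            rng.shuffle(chosen)
        for i in chosen:
            x = augs[i](x)
        return x
    return apply


def tag(name):
    """Toy augmenter: append its own name to a history list."""
    def apply(history):
        return history + [name]
    return apply


rng = random.Random(0)
pipeline = some_of((0, 2), [tag("blur"), tag("sharpen"), tag("noise")],
                   rng, random_order=True)
for _ in range(3):
    print(pipeline([]))  # a list of 0 to 2 names, in shuffled order
```

Each call draws a new subset, so every image in a batch can end up with a different combination and order of augmentations.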

Example: Visualize Augmentations

The code below visualizes the augmentations. Note that the input images are expected to hold values in the 0-255 range (this also applies to the other functions).

import numpy as np
import imgaug.augmenters as iaa


images = np.random.randint(0, 255, (16, 128, 128, 3), dtype=np.uint8)
seq = iaa.Sequential([iaa.Fliplr(0.5), iaa.GaussianBlur((0, 3.0))])

# Show an image with 8*8 augmented versions of image 0 and 8*8 augmented
# versions of image 1. Identical augmentations will be applied to
# image 0 and 1.
seq.show_grid([images[0], images[1]], cols=8, rows=8)

Example: Augment Images and Keypoints

This is one of imgaug's most powerful features. Keypoints/landmarks are augmented together with the image, so imgaug can also be used with object detection models such as YOLO.

import numpy as np
import imgaug as ia
import imgaug.augmenters as iaa
from imgaug.augmentables.kps import KeypointsOnImage


images = np.random.randint(0, 50, (4, 128, 128, 3), dtype=np.uint8)

# Generate random keypoints, 1-9 per image, as float (x, y) coordinates
keypoints = []
for image in images:
    n_keypoints = np.random.randint(1, 10)  # upper bound is exclusive
    kps = np.random.random((n_keypoints, 2))
    kps[:, 0] *= image.shape[1]  # x is bounded by the image width
    kps[:, 1] *= image.shape[0]  # y is bounded by the image height
    keypoints.append(kps)

seq = iaa.Sequential([iaa.GaussianBlur((0, 3.0)),
                      iaa.Affine(scale=(0.5, 0.7))])

# augment keypoints and images
images_aug, keypoints_aug = seq(images=images, keypoints=keypoints)

# Example code to show each image and print the new keypoints coordinates
for i in range(len(images)):
    print("[Image #%d]" % (i,))
    keypoints_before = KeypointsOnImage.from_xy_array(
        keypoints[i], shape=images[i].shape)
    keypoints_after = KeypointsOnImage.from_xy_array(
        keypoints_aug[i], shape=images_aug[i].shape)
    image_before = keypoints_before.draw_on_image(images[i])
    image_after = keypoints_after.draw_on_image(images_aug[i])
    ia.imshow(np.hstack([image_before, image_after]))

    kps_zipped = zip(keypoints_before.keypoints,
                     keypoints_after.keypoints)
    for keypoint_before, keypoint_after in kps_zipped:
        x_before, y_before = keypoint_before.x, keypoint_before.y
        x_after, y_after = keypoint_after.x, keypoint_after.y
        print("before aug: x=%d y=%d | after aug: x=%d y=%d" % (
            x_before, y_before, x_after, y_after))
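The link between keypoints and object detection can be made concrete: a bounding box is just four corner points, so it can be moved by the same transformation and then re-enclosed. The sketch below uses plain NumPy and a simple scale about the image center. It is a conceptual illustration only and does not reproduce imgaug's internals; the function names and the exact center convention are assumptions.

```python
import numpy as np


def scale_points_about_center(points_xy, image_shape, scale):
    """Scale (x, y) points about the image center, like a zoom in or out."""
    height, width = image_shape[:2]
    center = np.array([width / 2.0, height / 2.0])
    return (np.asarray(points_xy, dtype=np.float64) - center) * scale + center


def transform_box(box_xyxy, image_shape, scale):
    """Move a box via its four corners and return the enclosing box."""
    x1, y1, x2, y2 = box_xyxy
    corners = np.array([[x1, y1], [x2, y1], [x2, y2], [x1, y2]],
                       dtype=np.float64)
    moved = scale_points_about_center(corners, image_shape, scale)
    return np.array([moved[:, 0].min(), moved[:, 1].min(),
                     moved[:, 0].max(), moved[:, 1].max()])


box_after = transform_box((32, 32, 96, 96), image_shape=(128, 128, 3), scale=0.5)
print(box_after)  # [48. 48. 80. 80.]
```

Taking the min/max of the moved corners matters for rotations and shears, where the transformed corners no longer form an axis-aligned rectangle.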

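So how can this be applied in real training code?

Keras

Keras provides a function called fit_generator (in recent TensorFlow versions, fit accepts a generator directly). It pulls batch_size samples at a time from a generator, and augmentation can be applied at that point through a user-defined do_augmentation function. Below are functions I implemented myself to augment data for segmentation.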

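import random

import numpy as np

# The two methods below belong to the author's training class (not shown).
# Assumption: self.seq holds the stochastic iaa.Sequential pipeline; the
# original passed a pre-built self.seq_det instead.

    def generator(self, features, labels, batch_size):
        while True:
            # Allocate new arrays for every batch so that batches already
            # handed to Keras are not overwritten.
            batch_features = np.zeros((batch_size, 128, 128, 3))
            batch_labels = np.zeros((batch_size, 128, 128, 1))
            for i in range(batch_size):
                index = random.randint(0, len(features) - 1)
                image_aug, label_aug = self.do_augmentation(
                    self.seq, features[index], labels[index])
                batch_features[i] = image_aug
                batch_labels[i] = label_aug
            yield batch_features, batch_labels

    def do_augmentation(self, seq, x_train, y_train):
        # Fix the random parameters once per sample so that the image and
        # its mask receive the same transformation.
        seq_det = seq.to_deterministic()

        # Copy the label into a (128, 128, 1) array.
        ret_y_train = np.zeros((128, 128, 1))
        ret_y_train[:, :, :1] = y_train

        aug_x_train = seq_det.augment_images([x_train])[0]
        # Caution: value-changing augmenters (blur, noise) also alter the mask.
        aug_y_train = seq_det.augment_images([ret_y_train])[0]

        return aug_x_train, aug_y_train[:, :, :1]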
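The idea the segmentation code relies on is that the random parameters are drawn once and then reused for both the image and its mask. Here is a minimal sketch of that idea in plain NumPy, without imgaug. The helpers sample_params and apply_params, and the choice of flip plus 90-degree rotation, are assumptions made for the illustration.

```python
import numpy as np


def sample_params(rng):
    """Draw one set of random augmentation parameters."""
    return {"flip": bool(rng.random() < 0.5),
            "quarter_turns": int(rng.integers(0, 4))}


def apply_params(array, params):
    """Apply the same geometric change to any (H, W, C) array."""
    if params["flip"]:
        array = array[:, ::-1]
    rotated = np.rot90(array, k=params["quarter_turns"], axes=(0, 1))
    return np.ascontiguousarray(rotated)


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(128, 128, 3), dtype=np.uint8)
mask = (image[:, :, :1] > 127).astype(np.uint8)  # toy mask derived from the image

params = sample_params(rng)              # sampled once per sample...
image_aug = apply_params(image, params)
mask_aug = apply_params(mask, params)    # ...and reused for the mask

# The mask still lines up with the image after augmentation.
print(np.array_equal(mask_aug, (image_aug[:, :, :1] > 127).astype(np.uint8)))  # True
```

Had the parameters been sampled separately for the image and the mask, the two would usually end up misaligned, and the labels would no longer describe the pixels they belong to.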

PyTorch

It is even easier to apply in PyTorch.

Colab link: https://colab.research.google.com/drive/109vu3F1LTzD1gdVV6cho9fKGx7lzbFll

 


To show only the essential part: first define ImgAugTransform, a class that runs the transform using imgaug, as below.

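import numpy as np
import torchvision
import imgaug.augmenters as iaa


class ImgAugTransform:
  def __init__(self):
    self.aug = iaa.Sequential([
        # Resize was called Scale in older imgaug releases
        iaa.Resize({"height": 224, "width": 224}),
        iaa.Sometimes(0.25, iaa.GaussianBlur(sigma=(0, 3.0))),
        iaa.Fliplr(0.5),
        iaa.Affine(rotate=(-20, 20), mode='symmetric'),
        iaa.Sometimes(0.25,
                      iaa.OneOf([iaa.Dropout(p=(0, 0.1)),
                                 iaa.CoarseDropout(0.1, size_percent=0.5)])),
        iaa.AddToHueAndSaturation(value=(-10, 10), per_channel=True)
    ])

  def __call__(self, img):
    # PIL image -> (H, W, C) uint8 array
    img = np.array(img)
    # Return a contiguous array so later transforms such as ToTensor accept it
    return np.ascontiguousarray(self.aug.augment_image(img))

# Named aug_transforms so it does not shadow torchvision.transforms
aug_transforms = ImgAugTransform()

dataset = torchvision.datasets.ImageFolder('pytorch-examples/data/', transform=aug_transforms)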

To combine it with torchvision's existing transforms, implement it as follows.

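from torchvision import datasets, transforms

# The imgaug-based transform, using the ImgAugTransform class defined above
aug_transforms = ImgAugTransform()

# Define a transform that augments, converts to a tensor, and normalizes the data
transform = transforms.Compose([# transforms.RandomRotation(20),
                                aug_transforms,
                                transforms.ToTensor(),
                                transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])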
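One detail to watch when an array-returning transform is followed by transforms.ToTensor(): ToTensor converts NumPy arrays through torch.from_numpy, which rejects arrays with negative strides. A flip implemented by slicing produces exactly such a view. Whether a given imgaug version returns a view like this is an assumption here, so the sketch below only demonstrates the NumPy side and the generic fix.

```python
import numpy as np

image = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

flipped_view = image[:, ::-1]              # a flip implemented by slicing
print(flipped_view.strides)                # (9, -3, 1)

safe = np.ascontiguousarray(flipped_view)  # same pixels, fresh contiguous buffer
print(safe.strides)                        # (9, 3, 1)
print(np.array_equal(safe, flipped_view))  # True
```

Calling np.ascontiguousarray on an array that is already contiguous returns it without copying, so applying it at the end of a custom transform costs little.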

Then create the loaders as below.

import torch

# Download and load the training data
trainset = datasets.CIFAR10('~/.pytorch/MNIST_data/', download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

# Download and load the test data
testset = datasets.CIFAR10('~/.pytorch/MNIST_data/', download=True, train=False, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=True)

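So what are the drawbacks, and how could they be addressed?

1. Augmentation is applied to every batch, which adds computation, so training becomes slower.

 - Improvement: implement the augmentation as GPU operations using Keras backend functions (difficulty: high).

2. Pixel values have to be in the 0-255 range, which makes it hard to apply to data with a variable intensity range, such as MRI.

 - Improvement: .......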
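The post leaves the second improvement open. One possible workaround, which is an assumption of this sketch and not something proposed in the original, is to window the intensities into 0-255 before augmentation and map them back afterwards. The function names to_uint8 and from_uint8 and the 1st/99th percentile window are hypothetical choices. Quantizing to 8 bits loses precision, so this only suits cases where that loss is acceptable.

```python
import numpy as np


def to_uint8(volume, low_pct=1.0, high_pct=99.0):
    """Map an arbitrary intensity range to uint8 using a percentile window."""
    lo, hi = np.percentile(volume, [low_pct, high_pct])
    hi = max(hi, lo + 1e-8)  # avoid division by zero on constant images
    scaled = np.clip((volume - lo) / (hi - lo), 0.0, 1.0)
    return np.round(scaled * 255).astype(np.uint8), (float(lo), float(hi))


def from_uint8(image_u8, window):
    """Approximately undo to_uint8 (values outside the window stay clipped)."""
    lo, hi = window
    return image_u8.astype(np.float32) / 255.0 * (hi - lo) + lo


# Toy data standing in for a slice with an arbitrary intensity range.
volume = np.random.default_rng(0).normal(1000.0, 300.0, size=(64, 64)).astype(np.float32)
image_u8, window = to_uint8(volume)
print(image_u8.dtype, image_u8.min(), image_u8.max())  # uint8 0 255

restored = from_uint8(image_u8, window)  # back to the original intensity scale
```

The augmentation would be applied to image_u8 between the two calls. Keeping the window alongside each image makes it possible to restore the original scale per sample.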

 
