API - Preprocessing
We provide a wide range of data augmentation and processing functions built on NumPy, SciPy, threading and Queue.
However, we recommend using TensorFlow operations such as tf.image.central_crop where possible; more TensorFlow data augmentation methods can be found here and in tutorial_cifar10_tfrecord.py.
Some of the code in this package is borrowed from Keras.
threading_data([data, fn, thread_count]) | Return a batch of results for the given data.
rotation(x[, rg, is_random, row_index, …]) | Rotate an image randomly or non-randomly.
rotation_multi(x[, rg, is_random, …]) | Rotate multiple images with the same arguments, randomly or non-randomly.
crop(x, wrg, hrg[, is_random, row_index, …]) | Randomly or centrally crop an image.
crop_multi(x, wrg, hrg[, is_random, …]) | Randomly or centrally crop multiple images.
flip_axis(x[, axis, is_random]) | Flip the axis of an image, such as flip left and right, up and down, randomly or non-randomly.
flip_axis_multi(x, axis[, is_random]) | Flip the axes of multiple images together, such as flip left and right, up and down, randomly or non-randomly.
shift(x[, wrg, hrg, is_random, row_index, …]) | Shift an image randomly or non-randomly.
shift_multi(x[, wrg, hrg, is_random, …]) | Shift images with the same arguments, randomly or non-randomly.
shear(x[, intensity, is_random, row_index, …]) | Shear an image randomly or non-randomly.
shear_multi(x[, intensity, is_random, …]) | Shear images with the same arguments, randomly or non-randomly.
shear2(x[, shear, is_random, row_index, …]) | Shear an image randomly or non-randomly.
shear_multi2(x[, shear, is_random, …]) | Shear images with the same arguments, randomly or non-randomly.
swirl(x[, center, strength, radius, …]) | Swirl an image randomly or non-randomly, see scikit-image swirl API and example.
swirl_multi(x[, center, strength, radius, …]) | Swirl multiple images with the same arguments, randomly or non-randomly.
elastic_transform(x, alpha, sigma[, mode, …]) | Elastic deformation of images as described in [Simard2003].
elastic_transform_multi(x, alpha, sigma[, …]) | Elastic deformation of images as described in [Simard2003].
zoom(x[, zoom_range, is_random, row_index, …]) | Zoom in and out of a single image, randomly or non-randomly.
zoom_multi(x[, zoom_range, is_random, …]) | Zoom in and out of images with the same arguments, randomly or non-randomly.
brightness(x[, gamma, gain, is_random]) | Change the brightness of a single image, randomly or non-randomly.
brightness_multi(x[, gamma, gain, is_random]) | Change the brightness of multiple images, randomly or non-randomly.
illumination(x[, gamma, contrast, …]) | Perform illumination augmentation for a single image, randomly or non-randomly.
rgb_to_hsv(rgb) | Input an RGB image [0~255], return an HSV image [0~1].
hsv_to_rgb(hsv) | Input an HSV image [0~1], return an RGB image [0~255].
adjust_hue(im[, hout, is_offset, is_clip, …]) | Adjust hue of an RGB image.
imresize(x[, size, interp, mode]) | Resize an image by given output size and method.
pixel_value_scale(im[, val, clip, is_random]) | Scale each value in the pixels of the image.
samplewise_norm(x[, rescale, …]) | Normalize an image by rescale, samplewise centering and samplewise std normalization, in that order.
featurewise_norm(x[, mean, std, epsilon]) | Normalize every pixel by the same given mean and std, which are usually computed from all examples.
channel_shift(x, intensity[, is_random, …]) | Shift the channels of an image, randomly or non-randomly, see numpy.rollaxis.
channel_shift_multi(x, intensity[, …]) | Shift the channels of images with the same arguments, randomly or non-randomly, see numpy.rollaxis.
drop(x[, keep]) | Randomly set some pixels to zero by a given keeping probability.
transform_matrix_offset_center(matrix, x, y) | Return transform matrix offset center.
apply_transform(x, transform_matrix[, …]) | Return transformed images by given transform_matrix from transform_matrix_offset_center.
projective_transform_by_points(x, src, dst) | Projective transform by given coordinates, usually 4 coordinates.
array_to_img(x[, dim_ordering, scale]) | Convert a numpy array to a PIL image object (uint8 format).
find_contours(x[, level, fully_connected, …]) | Find iso-valued contours in a 2D array for a given level value, returns list of (n, 2)-ndarrays, see skimage.measure.find_contours.
pt2map([list_points, size, val]) | Input a list of points, return a 2D image.
binary_dilation(x[, radius]) | Return fast binary morphological dilation of an image.
dilation(x[, radius]) | Return greyscale morphological dilation of an image, see skimage.morphology.dilation.
binary_erosion(x[, radius]) | Return binary morphological erosion of an image, see skimage.morphology.binary_erosion.
erosion(x[, radius]) | Return greyscale morphological erosion of an image, see skimage.morphology.erosion.
obj_box_coord_rescale([coord, shape]) | Scale down one coordinate from pixel units to the ratio of image size i.e.
obj_box_coords_rescale([coords, shape]) | Scale down a list of coordinates from pixel units to the ratio of image size i.e.
obj_box_coord_scale_to_pixelunit(coord[, shape]) | Convert one coordinate [x, y, w (or x2), h (or y2)] in ratio format to image coordinate format.
obj_box_coord_centroid_to_upleft_butright(coord) | Convert one coordinate [x_center, y_center, w, h] to [x1, y1, x2, y2] in up-left and bottom-right format.
obj_box_coord_upleft_butright_to_centroid(coord) | Convert one coordinate [x1, y1, x2, y2] to [x_center, y_center, w, h].
obj_box_coord_centroid_to_upleft(coord) | Convert one coordinate [x_center, y_center, w, h] to [x, y, w, h].
obj_box_coord_upleft_to_centroid(coord) | Convert one coordinate [x, y, w, h] to [x_center, y_center, w, h].
parse_darknet_ann_str_to_list(annotation) | Input string format of class, x, y, w, h, return list of list format.
parse_darknet_ann_list_to_cls_box(annotation) | Input list of [[class, x, y, w, h], …], return two lists of [class …] and [[x, y, w, h], …].
obj_box_left_right_flip(im[, coords, …]) | Left-right flip the image and coordinates for object detection.
obj_box_imresize(im[, coords, size, interp, …]) | Resize an image, and compute the new bounding box coordinates.
obj_box_crop(im[, classes, coords, wrg, …]) | Randomly or centrally crop an image, and compute the new bounding box coordinates.
obj_box_shift(im[, classes, coords, wrg, …]) | Shift an image randomly or non-randomly, and compute the new bounding box coordinates.
obj_box_zoom(im[, classes, coords, …]) | Zoom in and out of a single image, randomly or non-randomly, and compute the new bounding box coordinates.
pad_sequences(sequences[, maxlen, dtype, …]) | Pad each sequence to the same length: the length of the longest sequence.
remove_pad_sequences(sequences[, pad_id]) | Remove padding.
process_sequences(sequences[, end_id, …]) | Set all tokens (ids) after the END token to the padding value, and then (optionally) shorten the sequences to the maximum sequence length in this batch.
sequences_add_start_id(sequences[, …]) | Add a special start token (id) at the beginning of each sequence.
sequences_add_end_id(sequences[, end_id]) | Add a special end token (id) at the end of each sequence.
sequences_add_end_id_after_pad(sequences[, …]) | Add a special end token (id) at the end of each sequence.
sequences_get_mask(sequences[, pad_val]) | Return mask for sequences.
Threading
tensorlayer.prepro.threading_data(data=None, fn=None, thread_count=None, **kwargs)
Return a batch of results for the given data. Usually used for data augmentation.
Parameters:
- data : numpy array, file names, etc.; see Examples below.
- thread_count : the number of threads to use
- fn : the function for data processing.
- more args : the args for fn, see Examples below.
Examples
- Single array
>>> X --> [batch_size, row, col, 1] greyscale
>>> results = threading_data(X, zoom, zoom_range=[0.5, 1], is_random=True)
... results --> [batch_size, row, col, channel]
>>> tl.visualize.images2d(images=np.asarray(results), second=0.01, saveable=True, name='after', dtype=None)
>>> tl.visualize.images2d(images=np.asarray(X), second=0.01, saveable=True, name='before', dtype=None)
- List of arrays (e.g. functions with multi)
>>> X, Y --> [batch_size, row, col, 1] greyscale
>>> data = threading_data([_ for _ in zip(X, Y)], zoom_multi, zoom_range=[0.5, 1], is_random=True)
... data --> [batch_size, 2, row, col, 1]
>>> X_, Y_ = data.transpose((1,0,2,3,4))
... X_, Y_ --> [batch_size, row, col, 1]
>>> tl.visualize.images2d(images=np.asarray(X_), second=0.01, saveable=True, name='after', dtype=None)
>>> tl.visualize.images2d(images=np.asarray(Y_), second=0.01, saveable=True, name='before', dtype=None)
- Single array split across thread_count threads (e.g. functions with multi)
>>> X, Y --> [batch_size, row, col, 1] greyscale
>>> data = threading_data(X, zoom_multi, 8, zoom_range=[0.5, 1], is_random=True)
... data --> [batch_size, 2, row, col, 1]
>>> X_, Y_ = data.transpose((1,0,2,3,4))
... X_, Y_ --> [batch_size, row, col, 1]
>>> tl.visualize.images2d(images=np.asarray(X_), second=0.01, saveable=True, name='after', dtype=None)
>>> tl.visualize.images2d(images=np.asarray(Y_), second=0.01, saveable=True, name='before', dtype=None)
- Customized function for image segmentation
>>> def distort_img(data):
...     x, y = data
...     x, y = flip_axis_multi([x, y], axis=0, is_random=True)
...     x, y = flip_axis_multi([x, y], axis=1, is_random=True)
...     x, y = crop_multi([x, y], 100, 100, is_random=True)
...     return x, y
>>> X, Y --> [batch_size, row, col, channel]
>>> data = threading_data([_ for _ in zip(X, Y)], distort_img)
>>> X_, Y_ = data.transpose((1,0,2,3,4))
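The idea behind the examples above, applying one function to every item of a batch in worker threads and stacking the results, can be sketched with the standard library and NumPy. This is only an illustration under stated assumptions, not TensorLayer's implementation: `threaded_map` and `halve` are hypothetical names, and a thread pool is assumed as the scheduling mechanism.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def threaded_map(data, fn, thread_count=None, **kwargs):
    """Apply fn(item, **kwargs) to every item of data in worker threads.

    Hypothetical helper for illustration; results keep the input order.
    """
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        results = list(pool.map(lambda item: fn(item, **kwargs), data))
    return np.asarray(results)


def halve(image, scale=0.5):
    # Stand-in for an augmentation function such as zoom or rotation.
    return image * scale


batch = np.ones((4, 8, 8, 1), dtype=np.float32)  # [batch_size, row, col, channel]
out = threaded_map(batch, halve, thread_count=2, scale=0.5)
print(out.shape)
```

Extra keyword arguments are forwarded to the processing function, which mirrors how zoom_range and is_random are passed through in the examples above.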
Images
- These functions apply to a single image only; use threading_data to process a batch with multiple threads, see tutorial_image_preprocess.py.
- All functions have the argument is_random.
- Functions whose names end with multi are usually used for image segmentation, i.e. the input and output images should be matched.
Rotation
tensorlayer.prepro.rotation(x, rg=20, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)
Rotate an image randomly or non-randomly.
Parameters:
- x : numpy array
An image with dimension of [row, col, channel] (default).
- rg : int or float
Degree to rotate, usually 0 ~ 180.
- is_random : boolean, default False
If True, randomly rotate.
- row_index, col_index, channel_index : int
Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).
- fill_mode : string
Method to fill missing pixel, default ‘nearest’, more options ‘constant’, ‘reflect’ or ‘wrap’
- cval : scalar, optional
Value used for points outside the boundaries of the input if mode=’constant’. Default is 0.0
- order : int, optional
The order of interpolation. The order has to be in the range 0-5. See apply_transform.
Examples
>>> x --> [row, col, 1] greyscale
>>> x = rotation(x, rg=40, is_random=False)
>>> tl.visualize.frame(x[:,:,0], second=0.01, saveable=True, name='temp', cmap='gray')
tensorlayer.prepro.rotation_multi(x, rg=20, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)
Rotate multiple images with the same arguments, randomly or non-randomly. Usually used for image segmentation, where x=[X, Y] and X and Y should be matched.
Parameters:
- x : list of numpy arrays
List of images with dimension of [n_images, row, col, channel] (default).
- others : see rotation.
Examples
>>> x, y --> [row, col, 1] greyscale
>>> x, y = rotation_multi([x, y], rg=90, is_random=False)
>>> tl.visualize.frame(x[:,:,0], second=0.01, saveable=True, name='x', cmap='gray')
>>> tl.visualize.frame(y[:,:,0], second=0.01, saveable=True, name='y', cmap='gray')
Crop
tensorlayer.prepro.crop(x, wrg, hrg, is_random=False, row_index=0, col_index=1, channel_index=2)
Randomly or centrally crop an image.
Parameters:
- x : numpy array
An image with dimension of [row, col, channel] (default).
- wrg : int
Size of width.
- hrg : int
Size of height.
- is_random : boolean, default False
If True, randomly crop, else central crop.
- row_index, col_index, channel_index : int
Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).
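The difference between a central and a random crop can be sketched in plain NumPy. This is a minimal illustration assuming the default [row, col, channel] layout; `crop_sketch` is a hypothetical name and not the library's implementation.

```python
import numpy as np


def crop_sketch(x, wrg, hrg, is_random=False, rng=None):
    """Crop a [row, col, channel] image to hrg rows by wrg columns."""
    h, w = x.shape[0], x.shape[1]
    if hrg > h or wrg > w:
        raise ValueError("crop size must not exceed the image size")
    if is_random:
        rng = np.random.default_rng() if rng is None else rng
        # Pick the top-left corner uniformly among all valid positions.
        top = int(rng.integers(0, h - hrg + 1))
        left = int(rng.integers(0, w - wrg + 1))
    else:
        # Central crop: equal margins on both sides (up to rounding).
        top = (h - hrg) // 2
        left = (w - wrg) // 2
    return x[top:top + hrg, left:left + wrg]


image = np.arange(6 * 8 * 1).reshape(6, 8, 1)
center = crop_sketch(image, wrg=4, hrg=2)
random_crop = crop_sketch(image, wrg=4, hrg=2, is_random=True)
print(center.shape, random_crop.shape)
```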
Flip
tensorlayer.prepro.flip_axis(x, axis=1, is_random=False)
Flip the axis of an image, such as flip left and right, up and down, randomly or non-randomly.
Parameters:
- x : numpy array
An image with dimension of [row, col, channel] (default).
- axis : int
- 0, flip up and down
- 1, flip left and right
- 2, flip channel
- is_random : boolean, default False
If True, randomly flip.
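The axis values listed above map directly onto NumPy axes. The sketch below is illustrative only: `flip_axis_sketch` is a hypothetical name, and the 50% flip probability used for is_random=True is an assumption rather than something stated in this page.

```python
import numpy as np


def flip_axis_sketch(x, axis=1, is_random=False, rng=None):
    """Flip x along axis; with is_random=True, flip with probability 0.5 (assumed)."""
    if is_random:
        rng = np.random.default_rng() if rng is None else rng
        if rng.random() < 0.5:
            return x  # leave the image unchanged half of the time
    return np.flip(x, axis=axis)


image = np.array([[[1], [2], [3]],
                  [[4], [5], [6]]])  # [row=2, col=3, channel=1]
left_right = flip_axis_sketch(image, axis=1)  # reverse the column order
up_down = flip_axis_sketch(image, axis=0)     # reverse the row order
print(left_right[:, :, 0])
print(up_down[:, :, 0])
```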
tensorlayer.prepro.flip_axis_multi(x, axis, is_random=False)
Flip the axes of multiple images together, such as flip left and right, up and down, randomly or non-randomly.
Parameters:
- x : list of numpy arrays
List of images with dimension of [n_images, row, col, channel] (default).
- others : see flip_axis.
Shift
tensorlayer.prepro.shift(x, wrg=0.1, hrg=0.1, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)
Shift an image randomly or non-randomly.
Parameters:
- x : numpy array
An image with dimension of [row, col, channel] (default).
- wrg : float
Percentage of shift in axis x, usually -0.25 ~ 0.25.
- hrg : float
Percentage of shift in axis y, usually -0.25 ~ 0.25.
- is_random : boolean, default False
If True, randomly shift.
- row_index, col_index, channel_index : int
Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).
- fill_mode : string
Method to fill missing pixel, default ‘nearest’, more options ‘constant’, ‘reflect’ or ‘wrap’.
- cval : scalar, optional
Value used for points outside the boundaries of the input if mode=’constant’. Default is 0.0.
- order : int, optional
The order of interpolation. The order has to be in the range 0-5. See apply_transform.
tensorlayer.prepro.shift_multi(x, wrg=0.1, hrg=0.1, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)
Shift images with the same arguments, randomly or non-randomly. Usually used for image segmentation, where x=[X, Y] and X and Y should be matched.
Parameters:
- x : list of numpy arrays
List of images with dimension of [n_images, row, col, channel] (default).
- others : see shift.
Shear
tensorlayer.prepro.shear(x, intensity=0.1, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)
Shear an image randomly or non-randomly.
Parameters:
- x : numpy array
An image with dimension of [row, col, channel] (default).
- intensity : float
Percentage of shear, usually -0.5 ~ 0.5 (is_random==True), 0 ~ 0.5 (is_random==False), you can have a quick try by shear(X, 1).
- is_random : boolean, default False
If True, randomly shear.
- row_index, col_index, channel_index : int
Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).
- fill_mode : string
Method to fill missing pixel, default ‘nearest’, more options ‘constant’, ‘reflect’ or ‘wrap’.
- cval : scalar, optional
Value used for points outside the boundaries of the input if mode=’constant’. Default is 0.0.
- order : int, optional
The order of interpolation. The order has to be in the range 0-5. See apply_transform.
tensorlayer.prepro.shear_multi(x, intensity=0.1, is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)
Shear images with the same arguments, randomly or non-randomly. Usually used for image segmentation, where x=[X, Y] and X and Y should be matched.
Parameters:
- x : list of numpy arrays
List of images with dimension of [n_images, row, col, channel] (default).
- others : see tl.prepro.shear.
Shear V2
tensorlayer.prepro.shear2(x, shear=(0.1, 0.1), is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)
Shear an image randomly or non-randomly.
Parameters:
- x : numpy array
An image with dimension of [row, col, channel] (default).
- shear : tuple of two floats
Percentage of shear for height and width direction (0, 1).
- is_random : boolean, default False
If True, randomly shear.
- row_index, col_index, channel_index : int
Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).
- fill_mode : string
Method to fill missing pixel, default ‘nearest’, more options ‘constant’, ‘reflect’ or ‘wrap’.
- cval : scalar, optional
Value used for points outside the boundaries of the input if mode=’constant’. Default is 0.0.
- order : int, optional
The order of interpolation. The order has to be in the range 0-5. See apply_transform.
tensorlayer.prepro.shear_multi2(x, shear=(0.1, 0.1), is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)
Shear images with the same arguments, randomly or non-randomly. Usually used for image segmentation, where x=[X, Y] and X and Y should be matched.
Parameters:
- x : list of numpy arrays
List of images with dimension of [n_images, row, col, channel] (default).
- others : see tl.prepro.shear2.
Swirl
tensorlayer.prepro.swirl(x, center=None, strength=1, radius=100, rotation=0, output_shape=None, order=1, mode='constant', cval=0, clip=True, preserve_range=False, is_random=False)
Swirl an image randomly or non-randomly, see scikit-image swirl API and example.
Parameters:
- x : numpy array
An image with dimension of [row, col, channel] (default).
- center : (row, column) tuple or (2,) ndarray, optional
Center coordinate of transformation.
- strength : float, optional
The amount of swirling applied.
- radius : float, optional
The extent of the swirl in pixels. The effect dies out rapidly beyond radius.
- rotation : float, (degree) optional
Additional rotation applied to the image, usually [0, 360], relates to center.
- output_shape : tuple (rows, cols), optional
Shape of the output image generated. By default the shape of the input image is preserved.
- order : int, optional
The order of the spline interpolation, default is 1. The order has to be in the range 0-5. See skimage.transform.warp for detail.
- mode : {‘constant’, ‘edge’, ‘symmetric’, ‘reflect’, ‘wrap’}, optional
Points outside the boundaries of the input are filled according to the given mode, with ‘constant’ used as the default. Modes match the behaviour of numpy.pad.
- cval : float, optional
Used in conjunction with mode ‘constant’, the value outside the image boundaries.
- clip : bool, optional
Whether to clip the output to the range of values of the input image. This is enabled by default, since higher order interpolation may produce values outside the given input range.
- preserve_range : bool, optional
Whether to keep the original range of values. Otherwise, the input image is converted according to the conventions of img_as_float.
- is_random : boolean, default False
- If True, random swirl.
- random center = [(0 ~ x.shape[0]), (0 ~ x.shape[1])]
- random strength = [0, strength]
- random radius = [1e-10, radius]
- random rotation = [-rotation, rotation]
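The random ranges listed above can be read as follows. This sketch only samples the parameters and does not perform the swirl itself; `sample_swirl_params` is a hypothetical name, and drawing each value from a continuous uniform distribution is an assumption.

```python
import numpy as np


def sample_swirl_params(shape, strength=1, radius=100, rotation=0, rng=None):
    """Draw swirl parameters from the ranges listed for is_random=True."""
    rng = np.random.default_rng() if rng is None else rng
    rows, cols = shape[0], shape[1]
    return {
        "center": (rng.uniform(0, rows), rng.uniform(0, cols)),
        "strength": rng.uniform(0, strength),
        "radius": rng.uniform(1e-10, radius),
        "rotation": rng.uniform(-rotation, rotation),
    }


params = sample_swirl_params((64, 48, 1), strength=4, radius=100, rotation=30)
print(params)
```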
Examples
>>> x --> [row, col, 1] greyscale
>>> x = swirl(x, strength=4, radius=100)
tensorlayer.prepro.swirl_multi(x, center=None, strength=1, radius=100, rotation=0, output_shape=None, order=1, mode='constant', cval=0, clip=True, preserve_range=False, is_random=False)
Swirl multiple images with the same arguments, randomly or non-randomly. Usually used for image segmentation, where x=[X, Y] and X and Y should be matched.
Parameters:
- x : list of numpy arrays
List of images with dimension of [n_images, row, col, channel] (default).
- others : see swirl.
Elastic transform
tensorlayer.prepro.elastic_transform(x, alpha, sigma, mode='constant', cval=0, is_random=False)
Elastic deformation of images as described in [Simard2003].
Parameters:
- x : numpy array, a greyscale image.
- alpha : scalar factor.
- sigma : scalar or sequence of scalars, the smaller the sigma, the more transformation.
Standard deviation for Gaussian kernel. The standard deviations of the Gaussian filter are given for each axis as a sequence, or as a single number, in which case it is equal for all axes.
- mode : default constant, see scipy.ndimage.filters.gaussian_filter.
- cval : float, optional. Used in conjunction with mode ‘constant’, the value outside the image boundaries.
- is_random : boolean, default False
References
Examples
>>> x = elastic_transform(x, alpha=x.shape[1] * 3, sigma=x.shape[1] * 0.07)
tensorlayer.prepro.elastic_transform_multi(x, alpha, sigma, mode='constant', cval=0, is_random=False)
Elastic deformation of images as described in [Simard2003].
Parameters:
- x : list of numpy arrays
- others : see elastic_transform.
Zoom
tensorlayer.prepro.zoom(x, zoom_range=(0.9, 1.1), is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)
Zoom in and out of a single image, randomly or non-randomly.
Parameters:
- x : numpy array
An image with dimension of [row, col, channel] (default).
- zoom_range : list or tuple
- If is_random=False, (h, w) are the fixed zoom factors for the row and column axes; a factor smaller than one zooms in.
- If is_random=True, it is (min zoom out, max zoom out), the range from which different random zoom factors are drawn for x and y, e.g. (0.5, 1) zooms in 1~2 times.
- is_random : boolean, default False
If True, randomly zoom.
- row_index, col_index, channel_index : int
Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).
- fill_mode : string
Method to fill missing pixel, default ‘nearest’, more options ‘constant’, ‘reflect’ or ‘wrap’.
- cval : scalar, optional
Value used for points outside the boundaries of the input if mode=’constant’. Default is 0.0.
- order : int, optional
The order of interpolation. The order has to be in the range 0-5. See apply_transform.
tensorlayer.prepro.zoom_multi(x, zoom_range=(0.9, 1.1), is_random=False, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1)
Zoom in and out of images with the same arguments, randomly or non-randomly. Usually used for image segmentation, where x=[X, Y] and X and Y should be matched.
Parameters:
- x : list of numpy arrays
List of images with dimension of [n_images, row, col, channel] (default).
- others : see zoom.
Brightness
tensorlayer.prepro.brightness(x, gamma=1, gain=1, is_random=False)
Change the brightness of a single image, randomly or non-randomly.
Parameters:
- x : numpy array
An image with dimension of [row, col, channel] (default).
- gamma : float, smaller than 1 means brighter.
Non-negative real number. Default value is 1.
- If is_random is True, gamma is drawn from the range (1-gamma, 1+gamma).
- gain : float
The constant multiplier. Default value is 1.
- is_random : boolean, default False
- If True, randomly change brightness.
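To see why a gamma below 1 brightens an image, consider the usual gamma-correction rule out = gain * x ** gamma on values scaled to [0, 1]. The sketch below assumes that rule and that input range, and adds clipping for safety; `gamma_sketch` is a hypothetical name, not the library function.

```python
import numpy as np


def gamma_sketch(x, gamma=1.0, gain=1.0):
    """Apply out = gain * x ** gamma to an image with values in [0, 1]."""
    x = np.asarray(x, dtype=np.float64)
    return np.clip(gain * np.power(x, gamma), 0.0, 1.0)


image = np.full((2, 2, 1), 0.25)
brighter = gamma_sketch(image, gamma=0.5)  # 0.25 ** 0.5 = 0.5
darker = gamma_sketch(image, gamma=2.0)    # 0.25 ** 2 = 0.0625
print(brighter[0, 0, 0], darker[0, 0, 0])
```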
tensorlayer.prepro.brightness_multi(x, gamma=1, gain=1, is_random=False)
Change the brightness of multiple images, randomly or non-randomly. Usually used for image segmentation, where x=[X, Y] and X and Y should be matched.
Parameters:
- x : list of numpy arrays
List of images with dimension of [n_images, row, col, channel] (default).
- others : see brightness.
Brightness, contrast and saturation
tensorlayer.prepro.illumination(x, gamma=1.0, contrast=1.0, saturation=1.0, is_random=False)
Perform illumination augmentation for a single image, randomly or non-randomly.
Parameters:
- x : numpy array
An image with dimension of [row, col, channel] (default).
- gamma : change brightness (the same as tl.prepro.brightness)
- if is_random=False, one float number; smaller than one means brighter, greater than one means darker.
- if is_random=True, tuple of two float numbers, (min, max).
- contrast : change contrast
- if is_random=False, one float number; smaller than one means blurrier.
- if is_random=True, tuple of two float numbers, (min, max).
- saturation : change saturation
- if is_random=False, one float number; smaller than one means less saturated.
- if is_random=True, tuple of two float numbers, (min, max).
- is_random : whether the parameters are randomly set.
Examples
- Random
>>> x = illumination(x, gamma=(0.5, 5.0), contrast=(0.3, 1.0), saturation=(0.7, 1.0), is_random=True)
- Non-random
>>> x = illumination(x, 0.5, 0.6, 0.8, is_random=False)
RGB to HSV
HSV to RGB
Adjust Hue
tensorlayer.prepro.adjust_hue(im, hout=0.66, is_offset=True, is_clip=True, is_random=False)
Adjust hue of an RGB image. This is a convenience method that converts an RGB image to float representation, converts it to HSV, adds an offset to the hue channel, converts back to RGB and then back to the original data type. For TF, see tf.image.adjust_hue and tf.image.random_hue.
Parameters:
- im : numpy array with values between 0 and 255.
- hout : float.
- If is_offset is False, set all hue values to this value. 0 is red; 0.33 is green; 0.66 is blue.
- If is_offset is True, add this value as the offset to the hue channel.
- is_offset : boolean, default True.
- is_clip : boolean, default True.
- If True, set negative hue values to 0.
- is_random : boolean, default False.
Examples
- Random, add a random value between -0.2 and 0.2 as the offset to every hue value.
>>> im_hue = tl.prepro.adjust_hue(image, hout=0.2, is_offset=True, is_random=True)
- Non-random, set all hue values to green.
>>> im_green = tl.prepro.adjust_hue(image, hout=0.33, is_offset=False, is_random=False)
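The RGB to HSV round trip described above can be illustrated for a single pixel with the standard-library colorsys module. This is a simplified sketch: `shift_hue_pixel` is a hypothetical name, and it wraps the hue around the colour circle instead of clipping negative values as is_clip does.

```python
import colorsys


def shift_hue_pixel(rgb, hout, is_offset=True):
    """Offset (or overwrite) the hue of one RGB pixel with values in 0..255."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Either add hout to the current hue (wrapping around 1.0) or replace the hue.
    h = (h + hout) % 1.0 if is_offset else hout
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return tuple(int(round(c * 255)) for c in (r, g, b))


red = (255, 0, 0)
green = shift_hue_pixel(red, 1.0 / 3.0)                  # offset red by a third of the circle
blue = shift_hue_pixel(red, 2.0 / 3.0, is_offset=False)  # set the hue directly
print(green, blue)
```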
Resize
tensorlayer.prepro.imresize(x, size=[100, 100], interp='bicubic', mode=None)
Resize an image by given output size and method. Warning: this function will rescale the values to [0, 255].
Parameters:
- x : numpy array
An image with dimension of [row, col, channel] (default).
- size : int, float or tuple (h, w)
- int, Percentage of current size.
- float, Fraction of current size.
- tuple, Size of the output image.
- interp : str, optional
Interpolation to use for re-sizing (‘nearest’, ‘lanczos’, ‘bilinear’, ‘bicubic’ or ‘cubic’).
- mode : str, optional
The PIL image mode (‘P’, ‘L’, etc.) to convert arr before resizing.
Returns:
- imresize : ndarray
The resized image array.
Pixel value scale
tensorlayer.prepro.pixel_value_scale(im, val=0.9, clip=[], is_random=False)
Scale each value in the pixels of the image.
Parameters:
- im : numpy array for one image.
- val : float.
- If is_random=False, multiply this value with all pixels.
- If is_random=True, multiply a value between [1-val, 1+val] with all pixels.
Examples
- Random
>>> im = pixel_value_scale(im, 0.1, [0, 255], is_random=True)
- Non-random
>>> im = pixel_value_scale(im, 0.9, [0, 255], is_random=False)
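The two modes above can be sketched in NumPy as follows. This is an illustration under assumptions, not the library's code: `pixel_value_scale_sketch` is a hypothetical name, and the clip argument is assumed to be a [min, max] pair applied after scaling, as the [0, 255] in the examples suggests.

```python
import numpy as np


def pixel_value_scale_sketch(im, val=0.9, clip=None, is_random=False, rng=None):
    """Multiply all pixels by val, or by a random factor in [1-val, 1+val]."""
    im = np.asarray(im, dtype=np.float64)
    if is_random:
        rng = np.random.default_rng() if rng is None else rng
        scale = rng.uniform(1 - val, 1 + val)
    else:
        scale = val
    out = im * scale
    if clip:  # assumed to be a [min, max] pair, e.g. [0, 255]
        out = np.clip(out, clip[0], clip[1])
    return out


pixels = np.array([100.0, 250.0])
fixed = pixel_value_scale_sketch(pixels, 0.9, [0, 255])
jittered = pixel_value_scale_sketch(pixels, 0.1, [0, 255], is_random=True)
print(fixed, jittered)
```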
Normalization
tensorlayer.prepro.samplewise_norm(x, rescale=None, samplewise_center=False, samplewise_std_normalization=False, channel_index=2, epsilon=1e-07)
Normalize an image by rescale, samplewise centering and samplewise std normalization, in that order.
Parameters:
- x : numpy array
An image with dimension of [row, col, channel] (default).
- rescale : rescaling factor.
If None or 0, no rescaling is applied, otherwise we multiply the data by the value provided (before applying any other transformation)
- samplewise_center : set each sample mean to 0.
- samplewise_std_normalization : divide each input by its std.
- epsilon : small positive value added to the standard deviation before dividing.
Notes
When samplewise_center and samplewise_std_normalization are both True:
- For a greyscale image, the mean of the whole image is subtracted from every pixel, which is then divided by the std of the whole image.
- For an RGB image, every pixel is normalized by the mean and std of that pixel, i.e. the mean and std of each pixel become 0 and 1.
Examples
>>> x = samplewise_norm(x, samplewise_center=True, samplewise_std_normalization=True)
>>> print(x.shape, np.mean(x), np.std(x))
... (160, 176, 1), 0.0, 1.0
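For the greyscale case, the three steps (rescale, centre, divide by the standard deviation) can be sketched in NumPy. The helper `samplewise_norm_sketch` and its argument names are hypothetical, and the sketch uses whole-image statistics only.

```python
import numpy as np


def samplewise_norm_sketch(x, rescale=None, center=False, std_norm=False, epsilon=1e-7):
    """Rescale, centre and std-normalize one image, in that order."""
    x = np.asarray(x, dtype=np.float64)
    if rescale:
        x = x * rescale
    if center:
        x = x - x.mean()
    if std_norm:
        x = x / (x.std() + epsilon)  # epsilon guards against division by zero
    return x


image = np.random.default_rng(0).uniform(0, 255, size=(160, 176, 1))
out = samplewise_norm_sketch(image, rescale=1 / 255.0, center=True, std_norm=True)
print(out.shape, abs(out.mean()) < 1e-9, abs(out.std() - 1) < 1e-5)
```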
tensorlayer.prepro.featurewise_norm(x, mean=None, std=None, epsilon=1e-07)
Normalize every pixel by the same given mean and std, which are usually computed from all examples.
Parameters:
- x : numpy array
An image with dimension of [row, col, channel] (default).
- mean : value for subtraction.
- std : value for division.
- epsilon : small positive value added to the standard deviation before dividing.
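In contrast to samplewise normalization, the statistics here come from the whole dataset and are reused for every image. A minimal sketch, assuming scalar mean and std computed over a training set (`featurewise_norm_sketch` and `train` are hypothetical names):

```python
import numpy as np


def featurewise_norm_sketch(x, mean=None, std=None, epsilon=1e-7):
    """Subtract a dataset-level mean and divide by a dataset-level std."""
    x = np.asarray(x, dtype=np.float64)
    if mean is not None:
        x = x - mean
    if std is not None:
        x = x / (std + epsilon)
    return x


# Statistics are computed once over all examples, then applied per image.
train = np.random.default_rng(0).uniform(0, 255, size=(10, 4, 4, 1))
mean, std = train.mean(), train.std()
normed = np.stack([featurewise_norm_sketch(img, mean, std) for img in train])
print(normed.shape)
```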
Channel shift
tensorlayer.prepro.channel_shift(x, intensity, is_random=False, channel_index=2)
Shift the channels of an image, randomly or non-randomly, see numpy.rollaxis.
Parameters:
- x : numpy array
An image with dimension of [row, col, channel] (default).
- intensity : float
Intensity of shifting.
- is_random : boolean, default False
If True, randomly shift.
- channel_index : int
Index of channel, default 2.
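One common way to implement a channel shift, modelled loosely on the Keras-style approach, is to add an offset to each channel and clip to the image's original value range. The sketch below assumes that behaviour; `channel_shift_sketch` is a hypothetical name, and np.moveaxis is used in place of numpy.rollaxis.

```python
import numpy as np


def channel_shift_sketch(x, intensity, is_random=False, channel_index=2, rng=None):
    """Add an offset to every channel, clipped to the original min/max."""
    x = np.asarray(x, dtype=np.float64)
    if is_random:
        rng = np.random.default_rng() if rng is None else rng
        factor = rng.uniform(-intensity, intensity)
    else:
        factor = intensity
    lo, hi = x.min(), x.max()
    channels = np.moveaxis(x, channel_index, 0)  # bring channels to the front
    shifted = np.stack([np.clip(c + factor, lo, hi) for c in channels])
    return np.moveaxis(shifted, 0, channel_index)  # restore the original layout


image = np.array([[[0.0, 10.0, 20.0]]])  # [row=1, col=1, channel=3]
out = channel_shift_sketch(image, intensity=5.0)
print(out)
```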
tensorlayer.prepro.channel_shift_multi(x, intensity, is_random=False, channel_index=2)
Shift the channels of images with the same arguments, randomly or non-randomly, see numpy.rollaxis. Usually used for image segmentation, where x=[X, Y] and X and Y should be matched.
Parameters:
- x : list of numpy arrays
List of images with dimension of [n_images, row, col, channel] (default).
- others : see channel_shift.
Noise
Transform matrix offset
Apply affine transform by matrix
tensorlayer.prepro.apply_transform(x, transform_matrix, channel_index=2, fill_mode='nearest', cval=0.0, order=1)
Return transformed images by given transform_matrix from transform_matrix_offset_center.
Parameters:
- x : numpy array
An image with dimension of [row, col, channel] (default).
- transform_matrix : numpy array
Transform matrix (offset center), can be generated by transform_matrix_offset_center.
- channel_index : int
Index of channel, default 2.
- fill_mode : string
Method to fill missing pixel, default ‘nearest’, more options ‘constant’, ‘reflect’ or ‘wrap’
- cval : scalar, optional
Value used for points outside the boundaries of the input if mode=’constant’. Default is 0.0
- order : int, optional
The order of interpolation. The order has to be in the range 0-5:
- 0 Nearest-neighbor
- 1 Bi-linear (default)
- 2 Bi-quadratic
- 3 Bi-cubic
- 4 Bi-quartic
- 5 Bi-quintic
- scipy ndimage affine_transform
Examples
- See rotation, shift, shear, zoom.
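An "offset center" transform matrix makes an affine transform act about the image centre rather than the top-left corner. The sketch below shows one common construction (translate the centre to the origin, apply the transform, translate back). The +0.5 centre convention follows the Keras-style helper and is an assumption here, as is the name `offset_center_sketch`.

```python
import numpy as np


def offset_center_sketch(matrix, rows, cols):
    """Wrap a 3x3 homogeneous transform so that it acts about the image centre."""
    o_x = rows / 2.0 + 0.5
    o_y = cols / 2.0 + 0.5
    offset = np.array([[1, 0, o_x], [0, 1, o_y], [0, 0, 1]])
    reset = np.array([[1, 0, -o_x], [0, 1, -o_y], [0, 0, 1]])
    # Read right to left: move centre to origin, transform, move back.
    return offset @ matrix @ reset


theta = np.deg2rad(40)
rotation = np.array([[np.cos(theta), -np.sin(theta), 0],
                     [np.sin(theta), np.cos(theta), 0],
                     [0, 0, 1]])
m = offset_center_sketch(rotation, rows=32, cols=32)
center = np.array([32 / 2.0 + 0.5, 32 / 2.0 + 0.5, 1.0])
print(np.allclose(m @ center, center))  # the centre is a fixed point
```

A matrix built this way is the kind of input that apply_transform expects as transform_matrix.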
Projective transform by points
tensorlayer.prepro.projective_transform_by_points(x, src, dst, map_args={}, output_shape=None, order=1, mode='constant', cval=0.0, clip=True, preserve_range=False)
Projective transform by given coordinates, usually 4 coordinates. See scikit-image.
Parameters:
- x : numpy array
An image with dimension of [row, col, channel] (default).
- src : list or numpy array
The original coordinates, usually 4 coordinates of (width, height).
- dst : list or numpy array
The coordinates after transformation; the number of coordinates is the same as in src.
- map_args : dict, optional
Keyword arguments passed to inverse_map.
- output_shape : tuple (rows, cols), optional
Shape of the output image generated. By default the shape of the input image is preserved. Note that, even for multi-band images, only rows and columns need to be specified.
- order : int, optional
The order of interpolation. The order has to be in the range 0-5:
- 0 Nearest-neighbor
- 1 Bi-linear (default)
- 2 Bi-quadratic
- 3 Bi-cubic
- 4 Bi-quartic
- 5 Bi-quintic
- mode : {‘constant’, ‘edge’, ‘symmetric’, ‘reflect’, ‘wrap’}, optional
Points outside the boundaries of the input are filled according to the given mode. Modes match the behaviour of numpy.pad.
- cval : float, optional
Used in conjunction with mode ‘constant’, the value outside the image boundaries.
- clip : bool, optional
Whether to clip the output to the range of values of the input image. This is enabled by default, since higher order interpolation may produce values outside the given input range.
- preserve_range : bool, optional
Whether to keep the original range of values. Otherwise, the input image is converted according to the conventions of img_as_float.
Examples
>>> # Assume X is an image from CIFAR-10, i.e. shape == (32, 32, 3)
>>> src = [[0,0],[0,32],[32,0],[32,32]]     # [w, h]
>>> dst = [[10,10],[0,32],[32,0],[32,32]]
>>> x = projective_transform_by_points(X, src, dst)
Numpy and PIL¶
-
tensorlayer.prepro.
array_to_img
(x, dim_ordering=(0, 1, 2), scale=True)[source]¶ Converts a numpy array to PIL image object (uint8 format).
Parameters: - x : numpy array
An image with dimension of 3 and channels of 1 or 3.
- dim_ordering : list or tuple of 3 int
Index of row, col and channel, default (0, 1, 2), for theano (1, 2, 0).
- scale : boolean, default is True
If True, converts image to [0, 255] from any range of value like [-1, 2].
Find contours¶
-
tensorlayer.prepro.
find_contours
(x, level=0.8, fully_connected='low', positive_orientation='low')[source]¶ Find iso-valued contours in a 2D array for a given level value. Returns a list of (n, 2)-ndarrays; see skimage.measure.find_contours.
Parameters: - x : 2D ndarray of double. Input data in which to find contours.
- level : float. Value along which to find contours in the array.
- fully_connected : str, {‘low’, ‘high’}. Indicates whether array elements below the given level value are to be considered fully-connected (and hence elements above the value will only be face connected), or vice-versa. (See notes below for details.)
- positive_orientation : either ‘low’ or ‘high’. Indicates whether the output contours will produce positively-oriented polygons around islands of low- or high-valued elements. If ‘low’ then contours will wind counter-clockwise around elements below the iso-value. Alternately, this means that low-valued elements are always on the left of the contour.
Points to Image¶
Binary dilation¶
-
tensorlayer.prepro.
binary_dilation
(x, radius=3)[source]¶ Return fast binary morphological dilation of an image. see skimage.morphology.binary_dilation.
Parameters: - x : 2D array image.
- radius : int for the radius of mask.
Greyscale dilation¶
-
tensorlayer.prepro.
dilation
(x, radius=3)[source]¶ Return greyscale morphological dilation of an image, see skimage.morphology.dilation.
Parameters: - x : 2D array image.
- radius : int for the radius of mask.
Binary erosion¶
-
tensorlayer.prepro.
binary_erosion
(x, radius=3)[source]¶ Return binary morphological erosion of an image, see skimage.morphology.binary_erosion.
Parameters: - x : 2D array image.
- radius : int for the radius of mask.
Greyscale erosion¶
-
tensorlayer.prepro.
erosion
(x, radius=3)[source]¶ Return greyscale morphological erosion of an image, see skimage.morphology.erosion.
Parameters: - x : 2D array image.
- radius : int for the radius of mask.
Object detection¶
Tutorial for Image Aug¶
Here is an example of image augmentation on the VOC dataset.
import tensorlayer as tl
## download VOC 2012 dataset
imgs_file_list, _, _, _, classes, _, _,\
_, objs_info_list, _ = tl.files.load_voc_dataset(dataset="2012")
## parse annotation and convert it into list format
ann_list = []
for info in objs_info_list:
    ann = tl.prepro.parse_darknet_ann_str_to_list(info)
    c, b = tl.prepro.parse_darknet_ann_list_to_cls_box(ann)
    ann_list.append([c, b])
# read and save one image
idx = 2 # you can select your own image
image = tl.vis.read_image(imgs_file_list[idx])
tl.vis.draw_boxes_and_labels_to_image(image, ann_list[idx][0],
ann_list[idx][1], [], classes, True, save_name='_im_original.png')
# left right flip
im_flip, coords = tl.prepro.obj_box_left_right_flip(image,
ann_list[idx][1], is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_flip, ann_list[idx][0],
coords, [], classes, True, save_name='_im_flip.png')
# resize
im_resize, coords = tl.prepro.obj_box_imresize(image,
coords=ann_list[idx][1], size=[300, 200], is_rescale=True)
tl.vis.draw_boxes_and_labels_to_image(im_resize, ann_list[idx][0],
coords, [], classes, True, save_name='_im_resize.png')
# crop
im_crop, clas, coords = tl.prepro.obj_box_crop(image, ann_list[idx][0],
ann_list[idx][1], wrg=200, hrg=200,
is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_crop, clas, coords, [],
classes, True, save_name='_im_crop.png')
# shift
im_shift, clas, coords = tl.prepro.obj_box_shift(image, ann_list[idx][0],
        ann_list[idx][1], wrg=0.1, hrg=0.1,
        is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_shift, clas, coords, [],
        classes, True, save_name='_im_shift.png')
# zoom
im_zoom, clas, coords = tl.prepro.obj_box_zoom(image, ann_list[idx][0],
ann_list[idx][1], zoom_range=(1.3, 0.7),
is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_zoom, clas, coords, [],
classes, True, save_name='_im_zoom.png')
In practice, you may want to process a batch of images with multiple threads, as follows.
import tensorlayer as tl
import random
batch_size = 64
im_size = [416, 416]
n_data = len(imgs_file_list)
jitter = 0.2
def _data_pre_aug_fn(data):
    im, ann = data
    clas, coords = ann
    ## change image brightness, contrast and saturation randomly
    im = tl.prepro.illumination(im, gamma=(0.5, 1.5),
            contrast=(0.5, 1.5), saturation=(0.5, 1.5), is_random=True)
    ## flip randomly
    im, coords = tl.prepro.obj_box_left_right_flip(im, coords,
            is_rescale=True, is_center=True, is_random=True)
    ## randomly resize and crop image, it can have same effect as random zoom
    tmp0 = random.randint(1, int(im_size[0]*jitter))
    tmp1 = random.randint(1, int(im_size[1]*jitter))
    im, coords = tl.prepro.obj_box_imresize(im, coords,
            [im_size[0]+tmp0, im_size[1]+tmp1], is_rescale=True,
            interp='bicubic')
    im, clas, coords = tl.prepro.obj_box_crop(im, clas, coords,
            wrg=im_size[1], hrg=im_size[0], is_rescale=True,
            is_center=True, is_random=True)
    ## rescale value from [0, 255] to [-1, 1] (optional)
    im = im / 127.5 - 1
    return im, [clas, coords]
# randomly read a batch of image and the corresponding annotations
idexs = tl.utils.get_random_int(min=0, max=n_data-1, number=batch_size)
b_im_path = [imgs_file_list[i] for i in idexs]
b_images = tl.prepro.threading_data(b_im_path, fn=tl.vis.read_image)
b_ann = [ann_list[i] for i in idexs]
# threading process
data = tl.prepro.threading_data([_ for _ in zip(b_images, b_ann)],
_data_pre_aug_fn)
b_images2 = [d[0] for d in data]
b_ann = [d[1] for d in data]
# save all images
for i in range(len(b_images)):
    tl.vis.draw_boxes_and_labels_to_image(b_images[i],
            ann_list[idexs[i]][0], ann_list[idexs[i]][1], [],
            classes, True, save_name='_bbox_vis_%d_original.png' % i)
    tl.vis.draw_boxes_and_labels_to_image((b_images2[i]+1)*127.5,
            b_ann[i][0], b_ann[i][1], [], classes, True,
            save_name='_bbox_vis_%d.png' % i)
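The batch example above relies on tl.prepro.threading_data to apply one function to every item of a batch in parallel. As a rough illustration of that pattern only (it is not TensorLayer's implementation), the sketch below uses the standard-library ThreadPoolExecutor. The names parallel_map and _fake_aug_fn are hypothetical, and the tiny lists stand in for real images and annotations.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(data, fn, thread_count=4):
    """Apply fn to every item of data in a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        return list(pool.map(fn, data))


def _fake_aug_fn(item):
    # Hypothetical stand-in for _data_pre_aug_fn: takes (image, [classes, coords])
    # and returns the processed image together with its annotation.
    im, (clas, coords) = item
    return [v * 2 for v in im], [clas, coords]


# Two fake "images" (flat lists) with one bounding box each.
batch = [([1, 2], [[0], [[0.5, 0.5, 0.2, 0.2]]]),
         ([3, 4], [[1], [[0.1, 0.1, 0.3, 0.3]]])]
results = parallel_map(batch, _fake_aug_fn)
images = [r[0] for r in results]
anns = [r[1] for r in results]
```

Because results come back in input order, images[i] and anns[i] still belong together, which is what the visualisation loop above assumes.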
Coordinate pixel unit to percentage¶
-
tensorlayer.prepro.
obj_box_coord_rescale
(coord=[], shape=[100, 200])[source]¶ Scale down one coordinate from pixel units to a ratio of the image size, i.e. into the range [0, 1]. It is the reverse process of obj_box_coord_scale_to_pixelunit.
Parameters: - coord : list of 4 numbers for one coordinate [x, y, w, h].
- shape : list of 2 integers for [height, width] of the image.
Examples
>>> coord = obj_box_coord_rescale(coord=[30, 40, 50, 50], shape=[100, 100])
... [[0.3, 0.4, 0.5, 0.5]]
Coordinates pixel unit to percentage¶
-
tensorlayer.prepro.
obj_box_coords_rescale
(coords=[], shape=[100, 200])[source]¶ Scale down a list of coordinates from pixel unit to the ratio of image size i.e. in the range of [0, 1].
Parameters: - coords : list of list for coordinates [[x, y, w, h], [x, y, w, h], …]
- shape : list of 2 integers for [height, width] of the image.
Examples
>>> coords = obj_box_coords_rescale(coords=[[30, 40, 50, 50], [10, 10, 20, 20]], shape=[100, 100])
>>> print(coords)
... [[0.3, 0.4, 0.5, 0.5], [0.1, 0.1, 0.2, 0.2]]
>>> coords = obj_box_coords_rescale(coords=[[30, 40, 50, 50]], shape=[50, 100])
>>> print(coords)
... [[0.3, 0.8, 0.5, 1.0]]
>>> coords = obj_box_coords_rescale(coords=[[30, 40, 50, 50]], shape=[100, 200])
>>> print(coords)
... [[0.15, 0.4, 0.25, 0.5]]
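The arithmetic behind these examples can be sketched in plain Python. This is a minimal illustration, assuming x and w are divided by the image width and y and h by the image height, as the examples suggest; coords_rescale is a hypothetical name, not the library function.

```python
def coords_rescale(coords, shape):
    """Convert [[x, y, w, h], ...] from pixel units to ratios of the image size."""
    im_h, im_w = shape[0], shape[1]   # shape is [height, width]
    rescaled = []
    for x, y, w, h in coords:
        # Horizontal values use the width, vertical values use the height.
        rescaled.append([x / im_w, y / im_h, w / im_w, h / im_h])
    return rescaled


ratios = coords_rescale([[30, 40, 50, 50]], shape=[100, 200])
print(ratios)
```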
Coordinate percentage to pixel unit¶
-
tensorlayer.prepro.
obj_box_coord_scale_to_pixelunit
(coord, shape=(100, 100, 3))[source]¶ Convert one coordinate [x, y, w (or x2), h (or y2)] in ratio format to image coordinate format. It is the reverse process of obj_box_coord_rescale.
Parameters: - coord : list of float, [x, y, w (or x2), h (or y2)] in ratio format, i.e. value range [0, 1].
- shape : tuple of (height, width, channel (optional))
Examples
>>> x, y, x2, y2 = obj_box_coord_scale_to_pixelunit([0.2, 0.3, 0.5, 0.7], shape=(100, 200, 3))
... (40, 30, 100, 70)
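The reverse conversion is the same arithmetic with multiplication. The sketch below is an illustration only: coord_to_pixel is a hypothetical name, and rounding to the nearest integer is an assumption made here to avoid floating-point artefacts; the library may truncate instead.

```python
def coord_to_pixel(coord, shape):
    """Convert one ratio-format [x, y, w, h] to integer pixel units."""
    im_h, im_w = shape[0], shape[1]   # shape is (height, width[, channel])
    x, y, w, h = coord
    # Assumption: round to the nearest pixel rather than truncate.
    return (int(round(x * im_w)), int(round(y * im_h)),
            int(round(w * im_w)), int(round(h * im_h)))


pixel_box = coord_to_pixel([0.2, 0.3, 0.5, 0.7], shape=(100, 200, 3))
print(pixel_box)
```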
Coordinate [x_center, y_center, w, h] to up-left bottom-right¶
Coordinate up-left bottom-right to [x_center, y_center, w, h]¶
Coordinate [x_center, y_center, w, h] to up-left-width-height¶
Coordinate up-left-width-height to [x_center, y_center, w, h]¶
Darknet format string to list¶
Darknet format split class and coordinate¶
Image Aug - Flip¶
-
tensorlayer.prepro.
obj_box_left_right_flip
(im, coords=[], is_rescale=False, is_center=False, is_random=False)[source]¶ Left-right flip the image and coordinates for object detection.
Parameters: - im : numpy array
An image with dimension of [row, col, channel] (default).
- coords : list of list for coordinates [[x, y, w, h], [x, y, w, h], …]
- is_rescale : boolean, default False
Set to True, if the input coordinates are rescaled to [0, 1].
- is_center : boolean, default False
Set to True, if the x and y of coordinates are the centroid. (i.e. darknet format)
- is_random : boolean, default False
If True, randomly flip.
Examples
>>> im = np.zeros([80, 100])    # as an image with shape width=100, height=80
>>> im, coords = obj_box_left_right_flip(im, coords=[[0.2, 0.4, 0.3, 0.3], [0.1, 0.5, 0.2, 0.3]], is_rescale=True, is_center=True, is_random=False)
>>> print(coords)
... [[0.8, 0.4, 0.3, 0.3], [0.9, 0.5, 0.2, 0.3]]
>>> im, coords = obj_box_left_right_flip(im, coords=[[0.2, 0.4, 0.3, 0.3]], is_rescale=True, is_center=False, is_random=False)
>>> print(coords)
... [[0.5, 0.4, 0.3, 0.3]]
>>> im, coords = obj_box_left_right_flip(im, coords=[[20, 40, 30, 30]], is_rescale=False, is_center=True, is_random=False)
>>> print(coords)
... [[80, 40, 30, 30]]
>>> im, coords = obj_box_left_right_flip(im, coords=[[20, 40, 30, 30]], is_rescale=False, is_center=False, is_random=False)
>>> print(coords)
... [[50, 40, 30, 30]]
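The four cases in the examples follow one rule: only x changes, and how it changes depends on whether x is the box centre or the left edge. A minimal sketch of the box arithmetic alone (no image is flipped here); flip_boxes_left_right and im_width are hypothetical names used for illustration.

```python
def flip_boxes_left_right(coords, im_width=None, is_rescale=False, is_center=False):
    """Mirror the x value of [[x, y, w, h], ...] boxes; y, w and h are unchanged."""
    # In ratio format the image width is 1; otherwise it is given in pixels.
    width = 1.0 if is_rescale else im_width
    flipped = []
    for x, y, w, h in coords:
        if is_center:
            new_x = width - x        # x is the box centre
        else:
            new_x = width - x - w    # x is the left edge; the old right edge becomes the new left edge
        flipped.append([new_x, y, w, h])
    return flipped


print(flip_boxes_left_right([[20, 40, 30, 30]], im_width=100, is_center=True))
print(flip_boxes_left_right([[20, 40, 30, 30]], im_width=100, is_center=False))
```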
Image Aug - Resize¶
-
tensorlayer.prepro.
obj_box_imresize
(im, coords=[], size=[100, 100], interp='bicubic', mode=None, is_rescale=False)[source]¶ Resize an image, and compute the new bounding box coordinates.
Parameters: - im : numpy array
An image with dimension of [row, col, channel] (default).
- coords : list of list for coordinates [[x, y, w, h], [x, y, w, h], …]
- size, interp, mode : see tl.prepro.imresize for details.
- is_rescale : boolean, default False
Set to True if the input coordinates are already rescaled to [0, 1]; in that case the coordinates are returned unchanged.
Examples
>>> im = np.zeros([80, 100, 3])    # as an image with shape width=100, height=80
>>> _, coords = obj_box_imresize(im, coords=[[20, 40, 30, 30], [10, 20, 20, 20]], size=[160, 200], is_rescale=False)
>>> print(coords)
... [[40, 80, 60, 60], [20, 40, 40, 40]]
>>> _, coords = obj_box_imresize(im, coords=[[20, 40, 30, 30]], size=[40, 100], is_rescale=False)
>>> print(coords)
... [20, 20, 30, 15]
>>> _, coords = obj_box_imresize(im, coords=[[20, 40, 30, 30]], size=[60, 150], is_rescale=False)
>>> print(coords)
... [30, 30, 45, 22]
>>> im2, coords = obj_box_imresize(im, coords=[[0.2, 0.4, 0.3, 0.3]], size=[160, 200], is_rescale=True)
>>> print(coords, im2.shape)
... [0.2, 0.4, 0.3, 0.3] (160, 200, 3)
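For pixel-unit boxes, the new coordinates in these examples are the old ones multiplied by the per-axis scale factor. A minimal sketch of that step, assuming size is [height, width] as the output shape above indicates and that results are truncated to integers; resize_boxes is a hypothetical name and always returns a nested list.

```python
def resize_boxes(coords, im_shape, size):
    """Scale pixel-unit [[x, y, w, h], ...] boxes from im_shape to size."""
    # Both im_shape and size start with (height, width).
    scale_y = size[0] / im_shape[0]
    scale_x = size[1] / im_shape[1]
    return [[int(x * scale_x), int(y * scale_y), int(w * scale_x), int(h * scale_y)]
            for x, y, w, h in coords]


print(resize_boxes([[20, 40, 30, 30]], im_shape=(80, 100, 3), size=[60, 150]))
```

Ratio-format boxes need no such step, since a ratio of the image size is unaffected by resizing.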
Image Aug - Crop¶
-
tensorlayer.prepro.
obj_box_crop
(im, classes=[], coords=[], wrg=100, hrg=100, is_rescale=False, is_center=False, is_random=False, thresh_wh=0.02, thresh_wh2=12.0)[source]¶ Randomly or centrally crop an image, and compute the new bounding box coordinates. Objects outside the cropped image will be removed.
Parameters: - im : numpy array
An image with dimension of [row, col, channel] (default).
- classes : list of class ID (int).
- coords : list of list for coordinates [[x, y, w, h], [x, y, w, h], …]
- wrg, hrg, is_random : see tl.prepro.crop for details.
- is_rescale : boolean, default False
Set to True, if the input coordinates are rescaled to [0, 1].
- is_center : boolean, default False
Set to True, if the x and y of coordinates are the centroid. (i.e. darknet format)
- thresh_wh : float
Threshold; a box is removed if the ratio of its width (or height) to the image size is less than this value.
- thresh_wh2 : float
Threshold; a box is removed if its width-to-height ratio, or vice versa, is higher than this value.
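The two thresholds can be read as a simple filter applied to each box after cropping. The sketch below is one interpretation of the parameter descriptions, not the library code: keep_box is a hypothetical helper, w and h are assumed to be ratios of the cropped image size, and the exact comparisons used by the library may differ.

```python
def keep_box(w, h, thresh_wh=0.02, thresh_wh2=12.0):
    """Return True if a box of ratio-unit width w and height h passes both thresholds."""
    if w <= 0 or h <= 0:
        return False    # degenerate box, nothing left inside the image
    if w < thresh_wh or h < thresh_wh:
        return False    # too small relative to the image
    if w / h > thresh_wh2 or h / w > thresh_wh2:
        return False    # too elongated in either direction
    return True


boxes = [[0.5, 0.5, 0.3, 0.3], [0.5, 0.5, 0.01, 0.3], [0.5, 0.5, 0.5, 0.03]]
kept = [b for b in boxes if keep_box(b[2], b[3])]
```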
Image Aug - Shift¶
-
tensorlayer.prepro.
obj_box_shift
(im, classes=[], coords=[], wrg=0.1, hrg=0.1, row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1, is_rescale=False, is_center=False, is_random=False, thresh_wh=0.02, thresh_wh2=12.0)[source]¶ Shift an image randomly or non-randomly, and compute the new bounding box coordinates. Objects outside the cropped image will be removed.
Parameters: - im : numpy array
An image with dimension of [row, col, channel] (default).
- classes : list of class ID (int).
- coords : list of list for coordinates [[x, y, w, h], [x, y, w, h], …]
- wrg, hrg, row_index, col_index, channel_index, is_random, fill_mode, cval, order : see tl.prepro.shift.
- is_rescale : boolean, default False
Set to True, if the input coordinates are rescaled to [0, 1].
- is_center : boolean, default False
Set to True, if the x and y of coordinates are the centroid. (i.e. darknet format)
- thresh_wh : float
Threshold; a box is removed if the ratio of its width (or height) to the image size is less than this value.
- thresh_wh2 : float
Threshold; a box is removed if its width-to-height ratio, or vice versa, is higher than this value.
Image Aug - Zoom¶
-
tensorlayer.prepro.
obj_box_zoom
(im, classes=[], coords=[], zoom_range=(0.9, 1.1), row_index=0, col_index=1, channel_index=2, fill_mode='nearest', cval=0.0, order=1, is_rescale=False, is_center=False, is_random=False, thresh_wh=0.02, thresh_wh2=12.0)[source]¶ Zoom in and out of a single image, randomly or non-randomly, and compute the new bounding box coordinates. Objects outside the cropped image will be removed.
Parameters: - im : numpy array
An image with dimension of [row, col, channel] (default).
- classes : list of class ID (int).
- coords : list of list for coordinates [[x, y, w, h], [x, y, w, h], …]
- zoom_range, row_index, col_index, channel_index, is_random, fill_mode, cval, order : see tl.prepro.zoom.
- is_rescale : boolean, default False
Set to True, if the input coordinates are rescaled to [0, 1].
- is_center : boolean, default False
Set to True, if the x and y of coordinates are the centroid. (i.e. darknet format)
- thresh_wh : float
Threshold; a box is removed if the ratio of its width (or height) to the image size is less than this value.
- thresh_wh2 : float
Threshold; a box is removed if its width-to-height ratio, or vice versa, is higher than this value.
Sequence¶
More related functions can be found in tensorlayer.nlp.
Padding¶
-
tensorlayer.prepro.
pad_sequences
(sequences, maxlen=None, dtype='int32', padding='post', truncating='pre', value=0.0)[source]¶ Pads each sequence to the same length: the length of the longest sequence. If maxlen is provided, any sequence longer than maxlen is truncated to maxlen. Truncation happens off either the beginning (default) or the end of the sequence. Supports post-padding (default) and pre-padding.
Parameters: - sequences : list of lists where each element is a sequence
- maxlen : int, maximum length
- dtype : type to cast the resulting sequence.
- padding : ‘pre’ or ‘post’, pad either before or after each sequence.
- truncating : ‘pre’ or ‘post’, remove values from sequences longer than maxlen, either at the beginning or at the end of the sequence.
- value : float, the value used for padding.
Returns: - x : numpy array with dimensions (number_of_sequences, maxlen)
Examples
>>> sequences = [[1,1,1,1,1],[2,2,2],[3,3]]
>>> sequences = pad_sequences(sequences, maxlen=None, dtype='int32',
...                  padding='post', truncating='pre', value=0.)
... [[1 1 1 1 1]
...  [2 2 2 0 0]
...  [3 3 0 0 0]]
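To make the padding and truncating options concrete, here is a plain-Python sketch of the behaviour described above. It is an illustration under stated assumptions: pad_sequences_sketch is a hypothetical name, it returns a list of lists rather than a numpy array, and it ignores dtype.

```python
def pad_sequences_sketch(sequences, maxlen=None, padding='post', truncating='pre', value=0):
    """Pad or truncate every sequence to the same length."""
    if maxlen is None:
        maxlen = max(len(s) for s in sequences)   # default: longest sequence
    padded = []
    for seq in sequences:
        seq = list(seq)
        if len(seq) > maxlen:
            # 'pre' drops values from the beginning, 'post' from the end.
            seq = seq[len(seq) - maxlen:] if truncating == 'pre' else seq[:maxlen]
        fill = [value] * (maxlen - len(seq))
        # 'post' appends the padding, 'pre' puts it in front.
        padded.append(seq + fill if padding == 'post' else fill + seq)
    return padded


print(pad_sequences_sketch([[1, 1, 1, 1, 1], [2, 2, 2], [3, 3]]))
print(pad_sequences_sketch([[1, 2, 3, 4], [5]], maxlen=2, padding='pre'))
```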
Remove Padding¶
-
tensorlayer.prepro.
remove_pad_sequences
(sequences, pad_id=0)[source]¶ Remove padding.
Parameters: - sequences : list of list.
- pad_id : int.
Examples
>>> sequences = [[2,3,4,0,0], [5,1,2,3,4,0,0,0], [4,5,0,2,4,0,0,0]]
>>> print(remove_pad_sequences(sequences, pad_id=0))
... [[2, 3, 4], [5, 1, 2, 3, 4], [4, 5, 0, 2, 4]]
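Note that in the third sequence the 0 between 5 and 2 is kept: only the padding at the end is removed. A minimal sketch of that behaviour, with remove_trailing_pad as a hypothetical name:

```python
def remove_trailing_pad(sequences, pad_id=0):
    """Strip pad_id values from the end of each sequence only."""
    cleaned = []
    for seq in sequences:
        end = len(seq)
        # Walk backwards until the last non-pad value is found.
        while end > 0 and seq[end - 1] == pad_id:
            end -= 1
        cleaned.append(list(seq[:end]))
    return cleaned


print(remove_trailing_pad([[2, 3, 4, 0, 0], [4, 5, 0, 2, 4, 0, 0, 0]]))
```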
Process¶
-
tensorlayer.prepro.
process_sequences
(sequences, end_id=0, pad_val=0, is_shorten=True, remain_end_id=False)[source]¶ Set all tokens(ids) after END token to the padding value, and then shorten (option) it to the maximum sequence length in this batch.
Parameters: - sequences : numpy array or list of list with token IDs.
e.g. [[4,3,5,3,2,2,2,2], [5,3,9,4,9,2,2,3]]
- end_id : int, the special token for END.
- pad_val : int, replace the end_id and the ids after end_id to this value.
- is_shorten : boolean, default True.
Shorten the sequences.
- remain_end_id : boolean, default False.
Keep an end_id in the end.
Examples
>>> sentences_ids = [[4, 3, 5, 3, 2, 2, 2, 2],  # <-- end_id is 2
...                  [5, 3, 9, 4, 9, 2, 2, 3]]  # <-- end_id is 2
>>> sentences_ids = process_sequences(sentences_ids, end_id=vocab.end_id, pad_val=0, is_shorten=True)
... [[4, 3, 5, 3, 0], [5, 3, 9, 4, 9]]
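The example can be reproduced with a short plain-Python sketch. This is an approximation for illustration, assuming a non-empty batch; process_sequences_sketch is a hypothetical name, and its handling of edge cases (for instance sequences that contain no end_id) may differ from the library.

```python
def process_sequences_sketch(sequences, end_id=0, pad_val=0,
                             is_shorten=True, remain_end_id=False):
    """Replace everything from the first end_id onwards with pad_val."""
    kept = []
    for seq in sequences:
        seq = list(seq)
        if end_id in seq:
            cut = seq.index(end_id)      # position of the first END token
            if remain_end_id:
                cut += 1                 # keep one END token
            seq = seq[:cut]
        kept.append(seq)
    longest = max(len(s) for s in kept)
    out = []
    for original, seq in zip(sequences, kept):
        # Shortened output uses the longest kept length; otherwise keep the input length.
        target = longest if is_shorten else len(original)
        out.append(seq + [pad_val] * (target - len(seq)))
    return out


ids = [[4, 3, 5, 3, 2, 2, 2, 2], [5, 3, 9, 4, 9, 2, 2, 3]]
print(process_sequences_sketch(ids, end_id=2, pad_val=0, is_shorten=True))
```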
Add Start ID¶
-
tensorlayer.prepro.
sequences_add_start_id
(sequences, start_id=0, remove_last=False)[source]¶ Add special start token(id) in the beginning of each sequence.
Examples
>>> sentences_ids = [[4,3,5,3,2,2,2,2], [5,3,9,4,9,2,2,3]]
>>> sentences_ids = sequences_add_start_id(sentences_ids, start_id=2)
... [[2, 4, 3, 5, 3, 2, 2, 2, 2], [2, 5, 3, 9, 4, 9, 2, 2, 3]]
>>> sentences_ids = sequences_add_start_id(sentences_ids, start_id=2, remove_last=True)
... [[2, 4, 3, 5, 3, 2, 2, 2], [2, 5, 3, 9, 4, 9, 2, 2]]
- For Seq2seq
>>> input = [a, b, c]
>>> target = [x, y, z]
>>> decode_seq = [start_id, a, b]  # <-- sequences_add_start_id(input, start_id, True)
Add End ID¶
-
tensorlayer.prepro.
sequences_add_end_id
(sequences, end_id=888)[source]¶ Add special end token(id) in the end of each sequence.
Parameters: - sequences : list of list.
- end_id : int.
Examples
>>> sequences = [[1,2,3],[4,5,6,7]]
>>> print(sequences_add_end_id(sequences, end_id=999))
... [[1, 2, 3, 999], [4, 5, 6, 7, 999]]
Add End ID after pad¶
-
tensorlayer.prepro.
sequences_add_end_id_after_pad
(sequences, end_id=888, pad_id=0)[source]¶ Add a special end token (id) at the end of each padded sequence by replacing the first padding value; sequences without padding are left unchanged.
Parameters: - sequences : list of list.
- end_id : int.
- pad_id : int.
Examples
>>> sequences = [[1,2,0,0], [1,2,3,0], [1,2,3,4]]
>>> print(sequences_add_end_id_after_pad(sequences, end_id=99, pad_id=0))
... [[1, 2, 99, 0], [1, 2, 3, 99], [1, 2, 3, 4]]
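As the example shows, the end token takes the place of the first padding value, so sequence lengths do not change and a sequence with no padding gets no end token. A minimal sketch of that behaviour, assuming pad_id only occurs as trailing padding; add_end_id_after_pad_sketch is a hypothetical name.

```python
def add_end_id_after_pad_sketch(sequences, end_id=888, pad_id=0):
    """Overwrite the first pad_id in each sequence with end_id."""
    out = []
    for seq in sequences:
        seq = list(seq)                        # copy, so the input is not modified
        if pad_id in seq:
            seq[seq.index(pad_id)] = end_id    # first pad slot becomes the END token
        out.append(seq)
    return out


print(add_end_id_after_pad_sketch([[1, 2, 0, 0], [1, 2, 3, 0], [1, 2, 3, 4]],
                                  end_id=99, pad_id=0))
```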