API - Iteration¶
Data iteration.
minibatches ([inputs, targets, batch_size, …]) |
Generate a generator that input a group of example in numpy.array and their labels, return the examples and labels by the given batchsize. |
seq_minibatches (inputs, targets, batch_size, …) |
Generate a generator that return a batch of sequence inputs and targets. |
seq_minibatches2 (inputs, targets, …) |
Generate a generator that iterates on two list of words. |
ptb_iterator (raw_data, batch_size, num_steps) |
Generate a generator that iterates on a list of words, see PTB tutorial. |
Non-time series¶
-
tensorlayer.iterate.
minibatches
(inputs=None, targets=None, batch_size=None, shuffle=False)[source]¶ Generate a generator that input a group of example in numpy.array and their labels, return the examples and labels by the given batchsize.
Parameters: - inputs : numpy.array
- The input features, every row is a example.
- targets : numpy.array
- The labels of inputs, every row is a example.
- batch_size : int
The batch size.
- shuffle : boolean
Indicating whether to use a shuffling queue, shuffle the dataset before return.
Examples
>>> X = np.asarray([['a','a'], ['b','b'], ['c','c'], ['d','d'], ['e','e'], ['f','f']]) >>> y = np.asarray([0,1,2,3,4,5]) >>> for batch in tl.iterate.minibatches(inputs=X, targets=y, batch_size=2, shuffle=False): >>> print(batch) ... (array([['a', 'a'], ... ['b', 'b']], ... dtype='<U1'), array([0, 1])) ... (array([['c', 'c'], ... ['d', 'd']], ... dtype='<U1'), array([2, 3])) ... (array([['e', 'e'], ... ['f', 'f']], ... dtype='<U1'), array([4, 5]))
Time series¶
Sequence iteration 1¶
-
tensorlayer.iterate.
seq_minibatches
(inputs, targets, batch_size, seq_length, stride=1)[source]¶ Generate a generator that return a batch of sequence inputs and targets. If
batch_size = 100, seq_length = 5
, one return will have500
rows (examples).Examples
- Synced sequence input and output.
>>> X = np.asarray([['a','a'], ['b','b'], ['c','c'], ['d','d'], ['e','e'], ['f','f']]) >>> y = np.asarray([0, 1, 2, 3, 4, 5]) >>> for batch in tl.iterate.seq_minibatches(inputs=X, targets=y, batch_size=2, seq_length=2, stride=1): >>> print(batch) ... (array([['a', 'a'], ... ['b', 'b'], ... ['b', 'b'], ... ['c', 'c']], ... dtype='<U1'), array([0, 1, 1, 2])) ... (array([['c', 'c'], ... ['d', 'd'], ... ['d', 'd'], ... ['e', 'e']], ... dtype='<U1'), array([2, 3, 3, 4])) ... ...
- Many to One
>>> return_last = True >>> num_steps = 2 >>> X = np.asarray([['a','a'], ['b','b'], ['c','c'], ['d','d'], ['e','e'], ['f','f']]) >>> Y = np.asarray([0,1,2,3,4,5]) >>> for batch in tl.iterate.seq_minibatches(inputs=X, targets=Y, batch_size=2, seq_length=num_steps, stride=1): >>> x, y = batch >>> if return_last: >>> tmp_y = y.reshape((-1, num_steps) + y.shape[1:]) >>> y = tmp_y[:, -1] >>> print(x, y) ... [['a' 'a'] ... ['b' 'b'] ... ['b' 'b'] ... ['c' 'c']] [1 2] ... [['c' 'c'] ... ['d' 'd'] ... ['d' 'd'] ... ['e' 'e']] [3 4]
Sequence iteration 2¶
-
tensorlayer.iterate.
seq_minibatches2
(inputs, targets, batch_size, num_steps)[source]¶ Generate a generator that iterates on two list of words. Yields (Returns) the source contexts and the target context by the given batch_size and num_steps (sequence_length), see
PTB tutorial
. In TensorFlow’s tutorial, this generates the batch_size pointers into the raw PTB data, and allows minibatch iteration along these pointers.- Hint, if the input data are images, you can modify the code as follow.
from data = np.zeros([batch_size, batch_len) to data = np.zeros([batch_size, batch_len, inputs.shape[1], inputs.shape[2], inputs.shape[3]])
Parameters: - inputs : a list
the context in list format; note that context usually be represented by splitting by space, and then convert to unique word IDs.
- targets : a list
the context in list format; note that context usually be represented by splitting by space, and then convert to unique word IDs.
- batch_size : int
the batch size.
- num_steps : int
the number of unrolls. i.e. sequence_length
Yields: - Pairs of the batched data, each a matrix of shape [batch_size, num_steps].
Raises: - ValueError : if batch_size or num_steps are too high.
Examples
>>> X = [i for i in range(20)] >>> Y = [i for i in range(20,40)] >>> for batch in tl.iterate.seq_minibatches2(X, Y, batch_size=2, num_steps=3): ... x, y = batch ... print(x, y) ... ... [[ 0. 1. 2.] ... [ 10. 11. 12.]] ... [[ 20. 21. 22.] ... [ 30. 31. 32.]] ... ... [[ 3. 4. 5.] ... [ 13. 14. 15.]] ... [[ 23. 24. 25.] ... [ 33. 34. 35.]] ... ... [[ 6. 7. 8.] ... [ 16. 17. 18.]] ... [[ 26. 27. 28.] ... [ 36. 37. 38.]]
PTB dataset iteration¶
-
tensorlayer.iterate.
ptb_iterator
(raw_data, batch_size, num_steps)[source]¶ Generate a generator that iterates on a list of words, see PTB tutorial. Yields (Returns) the source contexts and the target context by the given batch_size and num_steps (sequence_length).
see
PTB tutorial
.e.g. x = [0, 1, 2] y = [1, 2, 3] , when batch_size = 1, num_steps = 3, raw_data = [i for i in range(100)]
In TensorFlow’s tutorial, this generates batch_size pointers into the raw PTB data, and allows minibatch iteration along these pointers.
Parameters: - raw_data : a list
the context in list format; note that context usually be represented by splitting by space, and then convert to unique word IDs.
- batch_size : int
the batch size.
- num_steps : int
the number of unrolls. i.e. sequence_length
Yields: - Pairs of the batched data, each a matrix of shape [batch_size, num_steps].
- The second element of the tuple is the same data time-shifted to the
- right by one.
Raises: - ValueError : if batch_size or num_steps are too high.
Examples
>>> train_data = [i for i in range(20)] >>> for batch in tl.iterate.ptb_iterator(train_data, batch_size=2, num_steps=3): >>> x, y = batch >>> print(x, y) ... [[ 0 1 2] <---x 1st subset/ iteration ... [10 11 12]] ... [[ 1 2 3] <---y ... [11 12 13]] ... ... [[ 3 4 5] <--- 1st batch input 2nd subset/ iteration ... [13 14 15]] <--- 2nd batch input ... [[ 4 5 6] <--- 1st batch target ... [14 15 16]] <--- 2nd batch target ... ... [[ 6 7 8] 3rd subset/ iteration ... [16 17 18]] ... [[ 7 8 9] ... [17 18 19]]