BucketGenerator
class BucketGenerator
Performs bucketing of elements in a dataset by length.
__init__
def __init__(element_length_function, batch_size, buffer_size_batches, batches_to_bucket, shuffle, seed)
Initializes the BucketGenerator.
Args
-
element_length_function: Function that takes an element and returns its length.
-
batch_size: The size of the batches to bucket the sequences into.
-
buffer_size_batches: Size of the buffer, measured in batches.
-
batches_to_bucket: Number of batches in buffer to use for bucketing. If set to buffer_size_batches, the resulting batches will be deterministic.
-
shuffle: Whether to shuffle elements across batches and the resulting buckets.
-
seed: Seed for shuffling.
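As an illustration of what could be passed as element_length_function, here is a minimal sketch. The helper name pair_length and the (source, target) element layout are assumptions made for this example and are not taken from the library; for plain sequences the built-in len is usually enough.

```python
from typing import Sequence, Tuple


def pair_length(element: Tuple[Sequence[int], Sequence[int]]) -> int:
    """Hypothetical length function for (source, target) pairs.

    Assumption for this sketch: the longer of the two sequences
    determines the length used for bucketing.
    """
    source, target = element
    return max(len(source), len(target))


# For plain sequences, the built-in len already works as a length function.
print(len([3, 3, 3]))                    # 3
print(pair_length(([1, 2], [1, 2, 3])))  # 3
```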
__call__
def __call__(data)
Returns an iterable of the data with elements ordered by bucketed sequence length. For example, with a batch size of 2 the transformation could look like this: [1], [3, 3, 3], [1], [4, 4, 4, 4] -> [1], [1], [3, 3, 3], [4, 4, 4, 4]
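The reordering in this example can be reproduced with a small stand-alone sketch. It is not the BucketGenerator implementation: the function bucket_by_length is a hypothetical stand-in that fills a buffer of batch_size * buffer_size_batches elements, sorts it by length, and emits it. Shuffling, seeding, and batches_to_bucket are left out on purpose.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def bucket_by_length(
    data: Iterable[T],
    element_length_function: Callable[[T], int],
    batch_size: int,
    buffer_size_batches: int,
) -> Iterator[T]:
    """Yield elements so that items of similar length end up adjacent.

    Hypothetical sketch only: fill a buffer, sort it by length,
    emit it, and repeat until the input is exhausted.
    """
    buffer_capacity = batch_size * buffer_size_batches
    buffer: List[T] = []
    for element in data:
        buffer.append(element)
        if len(buffer) == buffer_capacity:
            # sorted() is stable, so equal-length elements keep their input order.
            yield from sorted(buffer, key=element_length_function)
            buffer = []
    # Flush whatever is left once the input is exhausted.
    if buffer:
        yield from sorted(buffer, key=element_length_function)


sequences = [[1], [3, 3, 3], [1], [4, 4, 4, 4]]
ordered = list(
    bucket_by_length(sequences, len, batch_size=2, buffer_size_batches=2)
)
print(ordered)  # [[1], [1], [3, 3, 3], [4, 4, 4, 4]]
```

With buffer_size_batches=1 the buffer in this sketch holds only two elements at a time, so the two [1] elements are never sorted next to each other and the input order is returned unchanged. A larger buffer gives the sort more elements to group.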