Vectorizer
class Vectorizer
Transforms tuples of text into tuples of vector sequences.
__init__
def __init__(tokenizer_encoder, tokenizer_decoder, max_input_len, max_output_len)
Initializes the vectorizer.
Args
-
tokenizer_encoder: Tokenizer that encodes the input text.
-
tokenizer_decoder: Tokenizer that encodes the target text.
-
max_input_len (optional): Maximum length of input sequence, longer sequences will be truncated.
-
max_output_len (optional): Maximum length of target sequence, longer sequences will be truncated and shorter sequences will be padded to max len.
__call__
def __call__(data)
Encodes preprocessed strings into sequences of one-hot indices.
encode_input
def encode_input(text)
encode_output
def encode_output(text)
decode_input
def decode_input(sequence)
decode_output
def decode_output(sequence)