Vectorizer

class Vectorizer

Transforms tuples of text into tuples of vector sequences.

__init__

def __init__(tokenizer_encoder, tokenizer_decoder, max_input_len, max_output_len)

Initializes the vectorizer.

Args
  • tokenizer_encoder: Tokenizer that encodes the input text.

  • tokenizer_decoder: Tokenizer that encodes the target text.

  • max_input_len (optional): Maximum length of the input sequence. Longer sequences are truncated.

  • max_output_len (optional): Maximum length of the target sequence. Longer sequences are truncated, and shorter sequences are padded to the maximum length.
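
The length handling described for max_input_len and max_output_len can be illustrated with a minimal sketch. The helper names truncate and truncate_and_pad, the padding index 0, keeping the leading items when truncating, and padding on the right are all assumptions made for illustration; the reference above does not state how the library implements these details.

```python
from typing import List

PAD_ID = 0  # assumption: index used for padding; the real value is not documented here


def truncate(sequence: List[int], max_len: int) -> List[int]:
    """Keep at most max_len leading items (the input-side behaviour)."""
    return sequence[:max_len]


def truncate_and_pad(sequence: List[int], max_len: int, pad_id: int = PAD_ID) -> List[int]:
    """Truncate to max_len, then right-pad shorter sequences up to max_len."""
    clipped = sequence[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


print(truncate([1, 2, 3, 4, 5], 3))          # [1, 2, 3]
print(truncate_and_pad([1, 2], 4))           # [1, 2, 0, 0]
print(truncate_and_pad([1, 2, 3, 4, 5], 4))  # [1, 2, 3, 4]
```

Under these assumptions, only target sequences end up with a fixed length, which matches the description that padding applies to max_output_len but not to max_input_len.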

__call__

def __call__(data)

Encodes a tuple of preprocessed strings into a tuple of sequences of one-hot indices.
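
As a rough illustration of the tuple-in, tuple-out behaviour, the sketch below maps a pair of strings to a pair of index sequences. ToyTokenizer, its encode method, the reserved index 0 for unknown words, and the free function vectorize are hypothetical stand-ins; the actual tokenizer interface expected by Vectorizer is not specified here, and length handling is left out.

```python
from typing import Dict, List, Tuple


class ToyTokenizer:
    """Hypothetical whitespace tokenizer; not part of the documented API."""

    def __init__(self, vocab: List[str]) -> None:
        # Index 0 is reserved for unknown words in this toy example.
        self._word_to_id: Dict[str, int] = {w: i + 1 for i, w in enumerate(vocab)}

    def encode(self, text: str) -> List[int]:
        return [self._word_to_id.get(word, 0) for word in text.split()]


def vectorize(data: Tuple[str, str],
              tokenizer_encoder: ToyTokenizer,
              tokenizer_decoder: ToyTokenizer) -> Tuple[List[int], List[int]]:
    """Map an (input text, target text) pair to a pair of index sequences."""
    text_input, text_target = data
    return tokenizer_encoder.encode(text_input), tokenizer_decoder.encode(text_target)


enc = ToyTokenizer(["the", "cat", "sat"])
dec = ToyTokenizer(["a", "cat"])
print(vectorize(("the cat sat", "a cat"), enc, dec))  # ([1, 2, 3], [1, 2])
```

Using separate tokenizers for the two sides mirrors the tokenizer_encoder and tokenizer_decoder constructor arguments, so the same word may map to different indices in the input and target sequences.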

encode_input

def encode_input(text)

encode_output

def encode_output(text)

decode_input

def decode_input(sequence)

decode_output

def decode_output(sequence)