Tokenizer
class Tokenizer
Encodes text into sequences of token indices and decodes such sequences back into text.
encode
def encode(text)
Encodes a given string into a sequence of indices.
Args
- text: Text to encode.
decode
def decode(sequence)
Decodes a given sequence of indices into text.
Args
- sequence: Sequence to decode.
vocab_size
def vocab_size()
Returns the size of the token vocabulary.
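Example
The reference above does not say which tokenization scheme is used, so the following is only a minimal sketch of how the three methods might fit together. It assumes a character-level vocabulary built from a sample corpus; the class name CharTokenizer, the corpus constructor argument, and the itos/stoi attributes are hypothetical and not part of the documented API.

```python
class CharTokenizer:
    """Hypothetical character-level tokenizer mirroring the documented interface."""

    def __init__(self, corpus):
        # Assumption: the vocabulary is the sorted set of characters in the corpus.
        self.itos = sorted(set(corpus))
        self.stoi = {ch: i for i, ch in enumerate(self.itos)}

    def encode(self, text):
        # Map each character to its index; raises KeyError for unseen characters.
        return [self.stoi[ch] for ch in text]

    def decode(self, sequence):
        # Map each index back to its character and join the result into a string.
        return "".join(self.itos[i] for i in sequence)

    def vocab_size(self):
        # Number of distinct tokens known to the tokenizer.
        return len(self.itos)


tokenizer = CharTokenizer("hello world")
ids = tokenizer.encode("hello")
text = tokenizer.decode(ids)  # round trip gives back "hello"
size = tokenizer.vocab_size()
```

In this sketch, decode(encode(text)) returns the original text whenever every character of the text appears in the vocabulary. A real tokenizer would typically also need a policy for unknown tokens, which the reference does not describe.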