src.utils.vocabulary

Module Contents

Functions

make_char_to_ix()

Make a character to index dictionary.

make_word_to_ix(train_sentences, char_to_split_at=' ', unk_tag='<UNK>')

Make a word to index dictionary.

src.utils.vocabulary.make_char_to_ix()

Make a character to index dictionary.

Returns

character to index

Return type

dict
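
This page does not say which characters the dictionary covers, so the following is only a minimal sketch of what a character to index mapping looks like. The name sketch_char_to_ix, its alphabet parameter, and the choice of lowercase ASCII letters are all assumptions made for illustration; the real make_char_to_ix() takes no arguments and may use a different character set and ordering.

```python
import string


def sketch_char_to_ix(alphabet=string.ascii_lowercase):
    """Hypothetical stand-in for make_char_to_ix (character set is assumed)."""
    # Assign each character its position in the alphabet as its index.
    return {ch: ix for ix, ch in enumerate(alphabet)}


char_to_ix = sketch_char_to_ix()

# Encode a string as a list of character indices.
encoded = [char_to_ix[ch] for ch in "cab"]
```

With this assumed alphabet, every lowercase letter maps to a distinct integer, which is the property a downstream character embedding layer would typically rely on.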

src.utils.vocabulary.make_word_to_ix(train_sentences, char_to_split_at=' ', unk_tag='<UNK>')

Make a word to index dictionary.

Parameters
  • train_sentences (list) – list of sentences

  • char_to_split_at (str, optional) – Character at which to split each sentence (for tokenization). Defaults to " ".

  • unk_tag (str, optional) – Tag used for unknown words. Defaults to "<UNK>".

Returns

word to index

Return type

dict
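
The parameters above suggest a simple tokenize-and-enumerate scheme. The sketch below shows one way such a function could behave. It is not the module's actual implementation: the name sketch_word_to_ix is hypothetical, and both the placement of the unknown tag at index 0 and the first-seen ordering of words are assumptions.

```python
def sketch_word_to_ix(train_sentences, char_to_split_at=" ", unk_tag="<UNK>"):
    """Hypothetical stand-in for make_word_to_ix (index order is assumed)."""
    # Assumption: the unknown tag is reserved at index 0.
    word_to_ix = {unk_tag: 0}
    for sentence in train_sentences:
        # Tokenize by splitting on the given character.
        for word in sentence.split(char_to_split_at):
            if word not in word_to_ix:
                # Each new word gets the next free index.
                word_to_ix[word] = len(word_to_ix)
    return word_to_ix


word_to_ix = sketch_word_to_ix(["the cat sat", "the dog sat"])

# Words not seen during training fall back to the unknown tag's index.
indices = [word_to_ix.get(w, word_to_ix["<UNK>"]) for w in "the bird sat".split(" ")]
```

Reserving an entry for the unknown tag lets lookups at inference time fall back to a valid index instead of raising a KeyError for words that were not in the training sentences.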