crslab.data.dataset.tgredial package
Submodules
TGReDial
References
Zhou, Kun, et al. “Towards Topic-Guided Conversational Recommender System.” In COLING 2020.
- class crslab.data.dataset.tgredial.tgredial.TGReDialDataset(opt, tokenize, restore=False, save=False)
  Bases: crslab.data.dataset.base.BaseDataset
- train_data
  The training dataset.
- valid_data
  The validation dataset.
- test_data
  The test dataset.
- vocab
  {
      'tok2ind': map from token to index,
      'ind2tok': map from index to token,
      'topic2ind': map from topic to index,
      'ind2topic': map from index to topic,
      'entity2id': map from entity to index,
      'id2entity': map from index to entity,
      'word2id': map from word to index,
      'vocab_size': len(self.tok2ind),
      'n_topic': len(self.topic2ind) + 1,
      'n_entity': max(self.entity2id.values()) + 1,
      'n_word': max(self.word2id.values()) + 1,
  }
  - Type
    dict
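The size fields in vocab follow directly from the mappings listed above. The following is a minimal sketch, not CRSLab code: the toy mappings, their index values, and the invert helper are all made up for illustration. It only shows how 'vocab_size', 'n_topic', 'n_entity' and 'n_word' would be computed from the formulas in the attribute description.

```python
# Toy mappings. Every token, topic, entity, word and index below is
# hypothetical and only serves to illustrate the documented formulas.
tok2ind = {"__pad__": 0, "__unk__": 1, "hello": 2, "movie": 3}
topic2ind = {"music": 1, "film": 2}
entity2id = {"entity_a": 0, "entity_b": 4}
word2id = {"good": 0, "watch": 7}


def invert(mapping):
    """Return the index-to-key view of a key-to-index mapping."""
    return {idx: key for key, idx in mapping.items()}


vocab = {
    "tok2ind": tok2ind,
    "ind2tok": invert(tok2ind),
    "topic2ind": topic2ind,
    "ind2topic": invert(topic2ind),
    "entity2id": entity2id,
    "id2entity": invert(entity2id),
    "word2id": word2id,
    # Size fields, computed exactly as written in the vocab description.
    "vocab_size": len(tok2ind),
    "n_topic": len(topic2ind) + 1,
    "n_entity": max(entity2id.values()) + 1,
    "n_word": max(word2id.values()) + 1,
}

print(vocab["vocab_size"], vocab["n_topic"], vocab["n_entity"], vocab["n_word"])
```

Note that 'n_entity' and 'n_word' are derived from the largest index rather than from the number of entries, so with non-contiguous ids (as in the toy entity2id above) they can exceed the number of keys in the mapping.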
Notes
'unk' and 'pad_topic' must be specified in 'special_token_idx' in resources.py.

Specify tokenized resource and init base dataset.
- Parameters
opt (Config or dict) – config for dataset or the whole system.
tokenize (str) – how to tokenize dataset.
restore (bool) – whether to restore saved dataset which has been processed. Defaults to False.
save (bool) – whether to save dataset after processing. Defaults to False.
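The Notes for this class require 'unk' and 'pad_topic' to be present in 'special_token_idx'. A quick pre-flight check along these lines may help catch a missing entry before constructing the dataset. This is a sketch under assumptions: the helper missing_special_tokens is hypothetical and not part of CRSLab, and the example dict with its index values is invented rather than copied from resources.py.

```python
# Keys that the class notes say must be present in 'special_token_idx'.
REQUIRED_SPECIAL_TOKENS = ("unk", "pad_topic")


def missing_special_tokens(special_token_idx, required=REQUIRED_SPECIAL_TOKENS):
    """Return the required special-token names absent from the mapping.

    Hypothetical helper for illustration; it is not a CRSLab function.
    """
    return [name for name in required if name not in special_token_idx]


# Invented example entry; the real values live in resources.py.
special_token_idx = {"pad": 0, "start": 1, "end": 2, "unk": 3, "pad_topic": 0}

missing = missing_special_tokens(special_token_idx)
if missing:
    raise KeyError(f"special_token_idx is missing required keys: {missing}")
```

Returning the list of missing names, instead of a plain boolean, keeps the error message specific when more than one key is absent.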
-