crslab.data.dataset.durecdial package
Submodules
DuRecDial
References
Liu, Zeming, et al. “Towards Conversational Recommendation over Multi-Type Dialogs.” In ACL 2020.
class crslab.data.dataset.durecdial.durecdial.DuRecDialDataset(opt, tokenize, restore=False, save=False)
Bases: crslab.data.dataset.base.BaseDataset
train_data – train dataset.
valid_data – valid dataset.
test_data – test dataset.
vocab (dict) – a dictionary with the following entries:
    'tok2ind': map from token to index,
    'ind2tok': map from index to token,
    'entity2id': map from entity to index,
    'id2entity': map from index to entity,
    'word2id': map from word to index,
    'vocab_size': len(self.tok2ind),
    'n_entity': max(self.entity2id.values()) + 1,
    'n_word': max(self.word2id.values()) + 1,
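To make the shape of the vocab dictionary concrete, here is a minimal sketch that builds one from toy data. The tokens, entities, words, and indices are invented for illustration and are not taken from DuRecDial; only the key names and the three size formulas come from the description above.

```python
# Toy data only: these tokens, entities, and ids are made up for illustration.
tok2ind = {"__pad__": 0, "__unk__": 1, "hello": 2, "movie": 3}
ind2tok = {idx: tok for tok, idx in tok2ind.items()}

entity2id = {"entity_a": 0, "entity_b": 1}
id2entity = {idx: ent for ent, idx in entity2id.items()}

word2id = {"hello": 0, "movie": 1, "music": 2}

# Same keys and size formulas as listed for the vocab attribute.
vocab = {
    "tok2ind": tok2ind,
    "ind2tok": ind2tok,
    "entity2id": entity2id,
    "id2entity": id2entity,
    "word2id": word2id,
    "vocab_size": len(tok2ind),
    "n_entity": max(entity2id.values()) + 1,
    "n_word": max(word2id.values()) + 1,
}

print(vocab["vocab_size"], vocab["n_entity"], vocab["n_word"])
```

Note that n_entity and n_word are computed as the largest id plus one, not as the number of keys, so they stay valid upper bounds for indexing even when the ids are not contiguous.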
Notes
'unk' must be specified in 'special_token_idx' in resources.py.
Parameters
opt (Config or dict) – config for the dataset or the whole system.
tokenize (str) – how to tokenize the dataset.
restore (bool) – whether to restore a previously processed and saved dataset. Defaults to False.
save (bool) – whether to save the dataset after processing. Defaults to False.
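The restore and save flags describe a process-once, reload-later pattern. The sketch below illustrates that pattern in isolation, assuming a pickle file as the cache. The helper load_or_process, the toy_process function, and the file name processed.pkl are hypothetical and are not part of CRSLab; the library's own caching may differ.

```python
import os
import pickle
import tempfile


def load_or_process(cache_path, process_fn, restore=False, save=False):
    """Hypothetical helper mirroring the restore/save flags (not CRSLab code)."""
    if restore:
        # Reuse a dataset that was processed and saved earlier.
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    data = process_fn()
    if save:
        # Persist the processed dataset for later runs.
        with open(cache_path, "wb") as f:
            pickle.dump(data, f)
    return data


def toy_process():
    # Stand-in for real preprocessing; the values are arbitrary.
    return {"train": [[1, 2, 3]], "valid": [[4]], "test": [[5, 6]]}


cache_path = os.path.join(tempfile.mkdtemp(), "processed.pkl")
first = load_or_process(cache_path, toy_process, save=True)
second = load_or_process(cache_path, toy_process, restore=True)
```

In this sketch, the first call processes the data and writes the cache, and the second call skips processing and reads the cache back.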