Words that appear in similar contexts tend to have similar meanings. For example, the word puppy usually appears alongside words such as cute, pretty, and cuteness.
CBOW (Continuous Bag of Words) takes the surrounding context words as input and predicts the word in the middle: w1 w2 w3 ... [center] ... w5 w6 w7, with the context words all pointing inward at the center.
Skip-Gram does the opposite: it takes the center word as input and predicts the surrounding words, with the predictions pointing outward from the center word w to its context.
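The difference between the two objectives comes down to how training pairs are built from each window. A toy pure-Python sketch (gensim does this internally; the sentence and window size here are made up for illustration):

```python
def cbow_pairs(tokens, window):
    # CBOW: (all surrounding context words) -> center word
    pairs = []
    for i in range(len(tokens)):
        context = [tokens[j]
                   for j in range(max(0, i - window), min(len(tokens), i + window + 1))
                   if j != i]
        pairs.append((context, tokens[i]))
    return pairs

def skipgram_pairs(tokens, window):
    # Skip-gram: center word -> each surrounding context word, one pair at a time
    pairs = []
    for i in range(len(tokens)):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((tokens[i], tokens[j]))
    return pairs

sentence = ["the", "cute", "puppy", "runs", "fast"]
print(cbow_pairs(sentence, 2)[2])      # (['the', 'cute', 'runs', 'fast'], 'puppy')
print(skipgram_pairs(sentence, 2)[:2]) # [('the', 'cute'), ('the', 'puppy')]
```

Note that one CBOW window yields one training example, while the same window yields one Skip-gram example per context word.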
- Training Word2Vec model
from gensim.models import Word2Vec
model = Word2Vec(sentences=result, vector_size=150, window=3, min_count=5, workers=3, sg=0)
vector_size : The dimensionality of each word vector, i.e., how many numbers represent one word.
min_count : The minimum word frequency; words that appear fewer times than this are not learned.
workers : The number of worker threads to use for training.
sg : The training algorithm; 0 is CBOW, 1 is Skip-gram.
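To see what sg=0 (CBOW) means computationally: the model averages the context word vectors and scores every vocabulary word as a candidate center word. A minimal numpy sketch of one forward pass, with a made-up toy vocabulary and dimensions (not gensim's optimized implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cute", "puppy", "runs", "fast"]
word2idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 4                        # vocabulary size, vector_size

W_in = rng.normal(scale=0.1, size=(V, D))   # input (context) embeddings
W_out = rng.normal(scale=0.1, size=(V, D))  # output (center-word) embeddings

def cbow_probs(context_words):
    # Average the context word vectors, score every vocab word, softmax.
    h = W_in[[word2idx[w] for w in context_words]].mean(axis=0)
    scores = W_out @ h
    e = np.exp(scores - scores.max())
    return e / e.sum()

p = cbow_probs(["the", "cute", "runs", "fast"])
print(p.shape)  # (5,) -- a probability distribution over the vocabulary
```

Training then adjusts W_in and W_out so the true center word ("puppy" here) gets high probability; W_in is what becomes the learned word vectors.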
- Find the most similar word
print(model.wv.most_similar("only"))
wv : the model's word vectors (a KeyedVectors object).
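most_similar ranks words by cosine similarity to the query word's vector. A hand-rolled sketch over made-up toy vectors (the real KeyedVectors version is vectorized and much faster):

```python
import numpy as np

# Toy word vectors, invented for illustration.
vectors = {
    "puppy": np.array([0.9, 0.8, 0.1]),
    "dog":   np.array([0.8, 0.9, 0.2]),
    "car":   np.array([0.1, 0.0, 0.9]),
}

def most_similar(query, topn=2):
    # Cosine similarity between the query vector and every other word vector.
    q = vectors[query]
    sims = {}
    for w, v in vectors.items():
        if w == query:
            continue
        sims[w] = float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
    return sorted(sims.items(), key=lambda kv: -kv[1])[:topn]

print(most_similar("puppy"))  # 'dog' ranks above 'car'
```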
- Save the Word2Vec model
model.wv.save_word2vec_format('eng_w2v')
- Load the Word2Vec model
from gensim.models import KeyedVectors
loaded_model = KeyedVectors.load_word2vec_format('eng_w2v')
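The non-binary word2vec format saved above is plain text: a header line with "vocab_size vector_size", then one "word v1 v2 ..." line per word. A minimal pure-Python round-trip that mimics the format (for illustration only; in practice use gensim's save/load as shown):

```python
import io

def save_w2v_text(vectors, f):
    # Header line: vocabulary size and dimensionality.
    dim = len(next(iter(vectors.values())))
    f.write(f"{len(vectors)} {dim}\n")
    # One line per word: the word, then its vector components.
    for word, vec in vectors.items():
        f.write(word + " " + " ".join(f"{x:.6f}" for x in vec) + "\n")

def load_w2v_text(f):
    n, dim = map(int, f.readline().split())
    vectors = {}
    for _ in range(n):
        parts = f.readline().split()
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

buf = io.StringIO()
save_w2v_text({"puppy": [0.1, 0.2], "cute": [0.3, 0.4]}, buf)
buf.seek(0)
print(load_w2v_text(buf))  # {'puppy': [0.1, 0.2], 'cute': [0.3, 0.4]}
```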