Deep Learning

Word2Vec

Naranjito 2022. 3. 29. 17:26

Word2Vec is based on the idea that words appearing in similar contexts have similar meanings. For example, the word puppy usually appears together with words such as cute, pretty, and cuteness.

CBOW (Continuous Bag of Words) takes the surrounding context words as input and predicts the word in the middle (the center word).

w1 w2 w3...ㅁ...w5 w6 w7
→  →   →   ↑    ←   ←  ←
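
A minimal sketch of how CBOW training pairs could be formed from one tokenized sentence; the sentence and window size below are made-up examples, not part of the original post:

sentence=['the', 'cute', 'puppy', 'runs', 'fast']   # toy example sentence
window=2

for i, center in enumerate(sentence):
    # context = the words within `window` positions on either side of the center word
    context=[sentence[j] for j in range(max(0, i-window), min(len(sentence), i+window+1)) if j != i]
    print(context, '->', center)   # CBOW: the context words predict the center word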

 

 

Skip-gram takes the word in the middle (the center word) as input and predicts the surrounding context words.

ㅁㅁㅁ...w...ㅁㅁㅁ
←←←  ↓   →→→
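
Using the same toy sentence, a sketch of Skip-gram training pairs, where each center word is paired with one context word at a time (again only an illustration, not gensim's internals):

sentence=['the', 'cute', 'puppy', 'runs', 'fast']   # toy example sentence
window=2

for i, center in enumerate(sentence):
    for j in range(max(0, i-window), min(len(sentence), i+window+1)):
        if j != i:
            print(center, '->', sentence[j])   # Skip-gram: the center word predicts each context word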

 

- Training a Word2Vec model

from gensim.models import Word2Vec

# result : a list of tokenized sentences (a list of lists of words)
model=Word2Vec(sentences=result, vector_size=150, window=3, min_count=5, workers=3, sg=0)

vector_size : The dimensionality of the word vectors, i.e. how many dimensions are used to represent each word.

window : The maximum distance between the center word and the context words considered around it.

min_count : The minimum word frequency; words that occur fewer times than this are ignored during training.

workers : The number of worker threads used to train the model.

sg : The training algorithm; 0 is CBOW, 1 is Skip-gram.
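
As noted above, result is a list of tokenized sentences; below is a self-contained sketch with a tiny made-up corpus in place of the real preprocessed data (min_count is lowered to 1 so the toy corpus is not filtered out):

from gensim.models import Word2Vec

# toy corpus standing in for the real preprocessed `result`
result=[['the', 'cute', 'puppy', 'runs', 'fast'],
        ['the', 'pretty', 'puppy', 'sleeps']]

model=Word2Vec(sentences=result, vector_size=150, window=3, min_count=1, workers=3, sg=0)
print(model.wv['puppy'][:5])   # first 5 dimensions of the learned vector for 'puppy'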

 

- Find the most similar word

print(model.wv.most_similar("only"))   # words closest to "only" by cosine similarity

wv : the model's word vectors (a gensim KeyedVectors object).
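
A few other common queries on the word vectors, shown as a sketch; the query words are assumptions and must exist in the trained vocabulary:

print(model.wv.most_similar("only", topn=5))   # the 5 nearest neighbours by cosine similarity
print(model.wv.similarity("only", "just"))     # cosine similarity between two words
print(model.wv["only"].shape)                  # the raw 150-dimensional vector of a word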

 

- Save the Word2Vec model

model.wv.save_word2vec_format('eng_w2v')
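
Note that save_word2vec_format stores only the word vectors, not the full model, so training cannot be resumed from it. If the model needs further training later, gensim's native save/load can be used instead; a sketch, with an example file name:

model.save('eng_w2v.model')            # full model; training can be resumed after loading
model=Word2Vec.load('eng_w2v.model')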

 

- Load the Word2Vec model

from gensim.models import KeyedVectors

loaded_model=KeyedVectors.load_word2vec_format('eng_w2v')
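
A quick check that the reloaded vectors behave like the original ones (the query word is assumed to be in the saved vocabulary):

print(loaded_model.most_similar("only"))   # KeyedVectors exposes most_similar directly, no .wv needed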
