Words that appear in similar contexts tend to have similar meanings. For example, the word puppy usually appears alongside words such as cute, pretty, and cuteness.
CBOW (Continuous Bag of Words) takes the surrounding context words as input and predicts the word in the middle: w1 w2 w3 ... [center] ... w5 w6 w7, with the context words all pointing inward at the center.
Skip-Gram does the opposite: it takes the center word as input and predicts the surrounding words, with the predictions pointing outward from the center word w to its context.
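The difference between the two objectives comes down to how training pairs are built from each window. A toy pure-Python sketch (gensim does this internally; the sentence and window size here are made up for illustration):

```python
def cbow_pairs(tokens, window):
    # CBOW: (all surrounding context words) -> center word
    pairs = []
    for i in range(len(tokens)):
        context = [tokens[j]
                   for j in range(max(0, i - window), min(len(tokens), i + window + 1))
                   if j != i]
        pairs.append((context, tokens[i]))
    return pairs

def skipgram_pairs(tokens, window):
    # Skip-gram: center word -> each surrounding context word, one pair at a time
    pairs = []
    for i in range(len(tokens)):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((tokens[i], tokens[j]))
    return pairs

sentence = ["the", "cute", "puppy", "runs", "fast"]
print(cbow_pairs(sentence, 2)[2])      # (['the', 'cute', 'runs', 'fast'], 'puppy')
print(skipgram_pairs(sentence, 2)[:2]) # [('the', 'cute'), ('the', 'puppy')]
```

Note that one CBOW window yields one training example, while the same window yields one Skip-gram example per context word.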
- Training Word2Vec model
from gensim.models import Word2Vec
model = Word2Vec(sentences=result, vector_size=150, window=3, min_count=5, workers=3, sg=0)
vector_size : The dimensionality of each word vector, i.e., how many numbers represent one word.
min_count : The minimum word frequency; words that appear fewer times than this are not learned.
workers : The number of worker threads to use for training.
sg : The training algorithm; 0 is CBOW, 1 is Skip-gram.
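To see what sg=0 (CBOW) means computationally: the model averages the context word vectors and scores every vocabulary word as a candidate center word. A minimal numpy sketch of one forward pass, with a made-up toy vocabulary and dimensions (not gensim's optimized implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cute", "puppy", "runs", "fast"]
word2idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 4                        # vocabulary size, vector_size

W_in = rng.normal(scale=0.1, size=(V, D))   # input (context) embeddings
W_out = rng.normal(scale=0.1, size=(V, D))  # output (center-word) embeddings

def cbow_probs(context_words):
    # Average the context word vectors, score every vocab word, softmax.
    h = W_in[[word2idx[w] for w in context_words]].mean(axis=0)
    scores = W_out @ h
    e = np.exp(scores - scores.max())
    return e / e.sum()

p = cbow_probs(["the", "cute", "runs", "fast"])
print(p.shape)  # (5,) -- a probability distribution over the vocabulary
```

Training then adjusts W_in and W_out so the true center word ("puppy" here) gets high probability; W_in is what becomes the learned word vectors.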
- Find the most similar word
print(model.wv.most_similar("only"))
wv : the model's word vectors (a KeyedVectors object).
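most_similar ranks words by cosine similarity to the query word's vector. A hand-rolled sketch over made-up toy vectors (the real KeyedVectors version is vectorized and much faster):

```python
import numpy as np

# Toy word vectors, invented for illustration.
vectors = {
    "puppy": np.array([0.9, 0.8, 0.1]),
    "dog":   np.array([0.8, 0.9, 0.2]),
    "car":   np.array([0.1, 0.0, 0.9]),
}

def most_similar(query, topn=2):
    # Cosine similarity between the query vector and every other word vector.
    q = vectors[query]
    sims = {}
    for w, v in vectors.items():
        if w == query:
            continue
        sims[w] = float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
    return sorted(sims.items(), key=lambda kv: -kv[1])[:topn]

print(most_similar("puppy"))  # 'dog' ranks above 'car'
```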
- Save the Word2Vec model
model.wv.save_word2vec_format('eng_w2v')
- Load the Word2Vec model
from gensim.models import KeyedVectors
loaded_model = KeyedVectors.load_word2vec_format('eng_w2v')
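The non-binary word2vec format saved above is plain text: a header line with "vocab_size vector_size", then one "word v1 v2 ..." line per word. A minimal pure-Python round-trip that mimics the format (for illustration only; in practice use gensim's save/load as shown):

```python
import io

def save_w2v_text(vectors, f):
    # Header line: vocabulary size and dimensionality.
    dim = len(next(iter(vectors.values())))
    f.write(f"{len(vectors)} {dim}\n")
    # One line per word: the word, then its vector components.
    for word, vec in vectors.items():
        f.write(word + " " + " ".join(f"{x:.6f}" for x in vec) + "\n")

def load_w2v_text(f):
    n, dim = map(int, f.readline().split())
    vectors = {}
    for _ in range(n):
        parts = f.readline().split()
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

buf = io.StringIO()
save_w2v_text({"puppy": [0.1, 0.2], "cute": [0.3, 0.4]}, buf)
buf.seek(0)
print(load_w2v_text(buf))  # {'puppy': [0.1, 0.2], 'cute': [0.3, 0.4]}
```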