Deep Learning

encoding vs embedding

Naranjito 2022. 12. 8. 11:44
AIM
  • encoding : identifying the relationships between data points
  • embedding : numerical vectorization

 

  • encoding

Acquiring a vector representation that is both distinguishable and numerically meaningful.

For example,
'wonderful, beautiful = Seoul'
If it can be quantified, it can be vectorized as below.
(10, 6) = Seoul

Now, in this way, each city can be expressed in terms of 'wonderful' and 'beautiful', which allows us to express the similarity between cities as a 'distance' in this space.
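As a minimal sketch of this idea (the city names and scores below are made up for illustration), the similarity between city vectors can be measured as Euclidean distance:

```python
import numpy as np

# Hypothetical scores on the two axes (wonderful, beautiful)
cities = {
    "Seoul": np.array([10.0, 6.0]),
    "Paris": np.array([9.0, 8.0]),
    "Dokdo": np.array([2.0, 9.0]),
}

def distance(a, b):
    """Euclidean distance between two city vectors."""
    return float(np.linalg.norm(cities[a] - cities[b]))

# A smaller distance means the two cities are more similar
print(distance("Seoul", "Paris"))
print(distance("Seoul", "Dokdo"))
```

With these illustrative numbers, Seoul sits closer to Paris than to Dokdo in the (wonderful, beautiful) space.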

 

Encoding outputs a multidimensional vector that represents data through relativity and correlation.

 

After encoding, the distance between data points can be measured, and probabilities for new data can be obtained.

Each data point will be separated from the others on the coordinate system, and there will be some empty space between them.

When new data enters that empty space, it can examine the nearby region, comparing what data is around it and how far away it is, in order to find out where it belongs.

If this new data is close to data A (point A), it belongs with A; if it is close to data B (point B), it belongs with B.

However, it may be close to neither A nor B, in which case it might exist as a new point C rather than being forced to the nearest one.
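The nearest-point reasoning above can be sketched as follows. This is an illustration, not a standard algorithm: the points, the labels, and the distance threshold that decides "close to neither" are all assumptions chosen for the example.

```python
import numpy as np

# Illustrative encoded points for existing data A and B
points = {"A": np.array([1.0, 1.0]), "B": np.array([8.0, 8.0])}

def assign(new, threshold=3.0):
    """Return the label of the nearest existing point,
    or 'C' (a new point) if the new data is close to neither."""
    label, nearest = min(points.items(),
                         key=lambda kv: np.linalg.norm(new - kv[1]))
    if np.linalg.norm(new - nearest) > threshold:
        return "C"  # too far from both A and B: treat as a new point C
    return label

print(assign(np.array([1.5, 0.5])))  # near point A
print(assign(np.array([4.5, 4.5])))  # far from both A and B
```

The threshold is the design choice here: without it, every new point would be forced to the nearest existing one, and the "new point C" case could never occur.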

 

  • embedding

Converting data into a numeric vector so that the machine can understand it.

For example, Word2Vec is one method of converting words to vectors, e.g. Man = [0.52, 0.76, 1.21, 0.22, ...]
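At its core, an embedding is a lookup table that maps each word to a dense vector. A minimal numpy sketch (the vocabulary and dimension are illustrative, and the random values stand in for what Word2Vec would actually learn from a corpus):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"man": 0, "woman": 1, "king": 2, "queen": 3}
embedding_dim = 4

# Embedding matrix: one dense row per word. In Word2Vec these
# values are learned from word co-occurrence, not sampled randomly.
E = rng.normal(size=(len(vocab), embedding_dim))

def embed(word):
    """Look up the dense vector for a word."""
    return E[vocab[word]]

print(embed("man"))  # a 4-dimensional dense vector
```

After training, nearby vectors correspond to words used in similar contexts, which is exactly the "similarity as distance" idea from the encoding section.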
