
Vector Similarity-2. Euclidean distance

Naranjito 2021. 3. 3. 15:52
  • Euclidean distance

 

- Euclidean distance is the length of the line segment between two points.

- The distance between two objects that are not points is usually defined to be the smallest distance among pairs of points from the two objects.

- The smaller the distance, the closer (more similar) the two points are.
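
The second point above (distance between two objects that are not single points) can be sketched as follows. This is only a minimal illustration, assuming each object is represented by a finite set of sample points; the function name `min_distance` and the sample coordinates are hypothetical and not part of the original post.

```python
import numpy as np

def min_distance(a, b):
    """Smallest Euclidean distance between any point of a and any point of b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Broadcast to all pairwise differences: shape (len(a), len(b), dim)
    diff = a[:, None, :] - b[None, :, :]
    # Pairwise distances: shape (len(a), len(b))
    pairwise = np.sqrt(np.sum(diff ** 2, axis=2))
    return pairwise.min()

# Two hypothetical objects, each given as a small set of 2-D points
obj_a = [(0, 0), (1, 0), (1, 1)]
obj_b = [(4, 1), (3, 1), (4, 5)]

# The closest pair is (1, 1) and (3, 1), so the distance is 2.0
print(min_distance(obj_a, obj_b))
```

Comparing every pair is simple but grows with the product of the two set sizes, so it is best suited to small sets like the one here.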

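In three dimensions, for points p = (p1, p2, p3) and q = (q1, q2, q3) given by their Cartesian coordinates, the distance is

$$d(p, q) = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + (p_3 - q_3)^2}$$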

Reference : en.wikipedia.org/wiki/Euclidean_distance
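
As a quick check of the three-dimensional case, here is a small sketch that writes out each coordinate term explicitly. The function name `distance_3d` and the two sample points are made up for illustration.

```python
import math

def distance_3d(p, q):
    # Sum the squared differences of the x, y and z coordinates, then take the square root
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 + (p[2] - q[2]) ** 2)

p = (1, 2, 3)
q = (4, 6, 3)
print(distance_3d(p, q))  # 5.0, since sqrt(3**2 + 4**2 + 0**2) = sqrt(25)
```

The same pattern extends to any number of dimensions by adding one squared difference per coordinate, which is what the four-component document vectors in this post rely on.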

 

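import numpy as np

def distance(x, y):
  # Euclidean distance: square root of the sum of squared coordinate differences
  return np.sqrt(np.sum((x - y)**2))

# Three document vectors and one query vector
doc1 = np.array((2,3,0,1))
doc2 = np.array((1,2,3,1))
doc3 = np.array((2,1,2,2))
docQ = np.array((1,1,0,1))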

print(distance(doc1,docQ))
print(distance(doc2,docQ))
print(distance(doc3,docQ))
>>>2.23606797749979
3.1622776601683795
2.449489742783178

The distance between doc1 and docQ is the smallest of the three, so doc1 is the document closest to docQ.
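
The same comparison can be done in one step. The following is a minimal sketch, not the author's code, that assumes the three documents are stacked into one array and uses `np.linalg.norm` with `axis=1` to get all distances at once; the names `docs`, `names`, `distances` and `closest` are introduced here for illustration.

```python
import numpy as np

docs = np.array([
    (2, 3, 0, 1),  # doc1
    (1, 2, 3, 1),  # doc2
    (2, 1, 2, 2),  # doc3
])
docQ = np.array((1, 1, 0, 1))
names = ["doc1", "doc2", "doc3"]

# Row-wise Euclidean norm of the differences gives one distance per document
distances = np.linalg.norm(docs - docQ, axis=1)

# The index of the smallest distance identifies the closest document
closest = names[int(np.argmin(distances))]
print(closest)  # doc1
```

Stacking the vectors avoids calling the distance function once per document, which can be convenient when there are many documents to compare against one query.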