Deep Learning

sklearn

Naranjito 2022. 3. 8. 23:29
  • vocabulary_
Maps each word (token) seen during fitting to an integer index. The attribute exists only after the vectorizer has been fitted.
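from sklearn.feature_extraction.text import CountVectorizer

text=["Don't be fooled by the dark sounding name, Mr. Jone's Orphanage is as cheery as cheery goes for a pastry-shop."]
vector=CountVectorizer() # Or TfidfVectorizer()
vector.fit(text) # vocabulary_ is created by fit (or fit_transform)
print(vector.vocabulary_)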

>>>
{'don': 5, 'be': 1, 'fooled': 6, 'by': 2, 'the': 17, 'dark': 4, 'sounding': 16, 'name': 12, 'mr': 11, 'jone': 10, 'orphanage': 13, 'is': 9, 'as': 0, 'cheery': 3, 'goes': 8, 'for': 7, 'pastry': 14, 'shop': 15}
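
The output above shows 'don' and 'jone' but no 'a'. With the default settings, text is lowercased and only tokens of two or more word characters are kept, and the indices follow the alphabetical order of the tokens. A minimal sketch of this, where the short sentence and the variable names are illustrative assumptions, not from the example above:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Illustrative sentence (an assumption, not part of the original example).
docs = ["Don't feed a cat"]

vec = CountVectorizer()
vec.fit(docs)

# Invert the mapping: list the tokens ordered by their integer index.
tokens_by_index = sorted(vec.vocabulary_, key=vec.vocabulary_.get)
print(tokens_by_index)  # ['cat', 'don', 'feed']
```

The apostrophe splits "Don't" into "don" and "t", and the single-character tokens "t" and "a" are dropped.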
 
  • fit_transform

Fits the vectorizer and returns how many times each word occurs. Each column corresponds to one index in vocabulary_.

print(vector.fit_transform(text).toarray())

>>>
[[2 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1]]
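
fit_transform returns a sparse matrix, which is why toarray() is called before printing. To read the counts by word rather than by column position, the row can be joined with vocabulary_. A small sketch, using a shortened illustrative sentence and hypothetical variable names:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Shortened illustrative input (an assumption, not the full example sentence).
docs = ["as cheery as cheery goes"]

vec = CountVectorizer()
counts = vec.fit_transform(docs).toarray()[0]  # dense count row for the only document

# Pair every word with the count stored at its vocabulary index.
word_counts = {word: int(counts[idx]) for word, idx in vec.vocabulary_.items()}
print(word_counts)  # {'as': 2, 'cheery': 2, 'goes': 1}
```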

 

  • preprocessing.LabelEncoder

Encodes labels as integers starting from 0, assigned in sorted order of the unique labels.

from sklearn import preprocessing

lee=preprocessing.LabelEncoder()
arr=[1,2,2,5]
lee.fit(arr) # learn the unique labels
lee.transform(arr) # map each label to its integer code

>>>

array([0, 1, 1, 2])
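
The same encoder also works on string labels, and inverse_transform maps the codes back to the original labels. A minimal sketch, where the animal labels and variable names are hypothetical:

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical string labels, used only for illustration.
labels = ["cat", "dog", "dog", "bird"]

enc = LabelEncoder()
codes = enc.fit_transform(labels)  # fit and transform in one step
print(codes)  # [1 2 2 0]

# Recover the original labels from the integer codes.
restored = enc.inverse_transform(codes).tolist()
print(restored)  # ['cat', 'dog', 'dog', 'bird']
```

The codes follow sorted order ("bird" is 0, "cat" is 1, "dog" is 2), not the order in which the labels first appear.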

 

  • classes_

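The original labels, sorted. The position of each label in this array is its integer code. It is available only on a fitted encoder.

lee.classes_ # array([1, 2, 5]) for the encoder fitted above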
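
A freshly created encoder has no classes_ attribute yet, so it has to be fitted before the attribute is read. A short sketch under that assumption, with hypothetical variable names:

```python
from sklearn.preprocessing import LabelEncoder

enc = LabelEncoder()
print(hasattr(enc, "classes_"))  # False: not fitted yet

enc.fit([1, 2, 2, 5])
print(enc.classes_.tolist())  # [1, 2, 5]

# The integer code of a label is its position in classes_.
code = int(enc.transform([5])[0])
print(code, enc.classes_[code])  # 2 5
```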