numpy-empty, where, allclose, dot, argsort, corrcoef, astype, nan, hstack, argmax

Naranjito 2022. 12. 7. 15:55
  • empty

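Return a new array of the given shape and type without initializing its entries. The values are whatever happened to be in memory, so they differ from run to run.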

import numpy as np

np.empty((3,3))
>>>
array([[-1.,  0.,  5.],
       [12., 18., 24.],
       [35., 60., inf]])
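A minimal sketch of how np.empty is typically used: allocate a buffer, then fill it before reading. The names buf and zeros are only for illustration.

```python
import numpy as np

# np.empty only allocates memory; the initial contents are arbitrary.
buf = np.empty((2, 3), dtype=np.float64)
print(buf.shape, buf.dtype)  # (2, 3) float64

# Overwrite every entry before using the buffer.
buf[:] = 0.0

# After filling, it matches an array created with np.zeros.
zeros = np.zeros((2, 3))
print(np.array_equal(buf, zeros))  # True
```

If the array must start at zero anyway, np.zeros is the safer choice; np.empty only pays off when every entry will be overwritten.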

 

  • where

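With only a condition, return the indices where the condition is True (as a tuple of arrays, one per dimension). With three arguments, np.where(condition, x, y) picks elements from x where the condition is True and from y elsewhere.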

ar=np.arange(1,10)
np.where(ar>5)
>>>
(array([5, 6, 7, 8]),)
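A short sketch of both forms of np.where on the same array. The names idx and clipped are only for illustration.

```python
import numpy as np

ar = np.arange(1, 10)

# One argument: a tuple of index arrays where the condition holds.
idx = np.where(ar > 5)
print(ar[idx])  # [6 7 8 9]

# Three arguments: take from ar where True, otherwise use 0.
clipped = np.where(ar > 5, ar, 0)
print(clipped)  # [0 0 0 0 0 6 7 8 9]
```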

 

  • allclose

Returns True if two arrays are element-wise equal within a tolerance.

np.allclose([2,3],[2,3],equal_nan=True)
>>>
True

equal_nan : Whether to compare NaNs as equal. If True, NaNs in a are considered equal to NaNs in b at the same position.
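The example above compares identical arrays, so it does not show the tolerance or the NaN handling. A small sketch that does, with illustrative variable names:

```python
import numpy as np

a = np.array([1.0, 2.0])
b = np.array([1.0, 2.0 + 1e-9])

# Exact comparison fails, tolerant comparison passes
# (defaults are rtol=1e-05 and atol=1e-08).
exact_equal = np.array_equal(a, b)
close = np.allclose(a, b)
print(exact_equal, close)  # False True

# NaN is never equal to NaN unless equal_nan=True.
x = np.array([1.0, np.nan])
nan_default = np.allclose(x, x)
nan_equal = np.allclose(x, x, equal_nan=True)
print(nan_default, nan_equal)  # False True
```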

 

  • dot

Dot product of two arrays. For 2-D arrays this is matrix multiplication, not element-wise multiplication.

a=[[1,2],[4,5]]
b=[[4,5],[4,5]]
np.dot(a,b)
>>>
array([[12, 15],
       [36, 45]])
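To make the difference from element-wise multiplication concrete, here is a small sketch using the same a and b. The names matmul, elementwise and inner are only for illustration.

```python
import numpy as np

a = np.array([[1, 2], [4, 5]])
b = np.array([[4, 5], [4, 5]])

# For 2-D arrays, the @ operator gives the same result as np.dot.
matmul = a @ b
print(matmul)
# [[12 15]
#  [36 45]]

# The * operator multiplies element by element instead.
elementwise = a * b
print(elementwise)
# [[ 4 10]
#  [16 25]]

# For 1-D inputs, np.dot is the inner product: 1*4 + 2*5 + 3*6.
inner = np.dot([1, 2, 3], [4, 5, 6])
print(inner)  # 32
```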

 

  • argsort

Argument sort: returns the indices that would sort the array.

a=np.array([1.5, 0.2, 4.2, 2.5])
s=a.argsort()
print(s)
print(a[s])
>>>
[1 0 3 2]
[0.2 1.5 2.5 4.2]
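A common use is picking the largest items. A minimal sketch, assuming descending order is wanted; desc and top2 are illustrative names.

```python
import numpy as np

a = np.array([1.5, 0.2, 4.2, 2.5])

# Reverse the ascending order to get indices from largest to smallest.
desc = np.argsort(a)[::-1]
print(desc)  # [2 3 0 1]

# Indices and values of the two largest elements.
top2 = desc[:2]
print(top2)     # [2 3]
print(a[top2])  # [4.2 2.5]
```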

 

  • corrcoef

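Returns the Pearson product-moment correlation coefficients. By default each row is treated as a variable and each column as an observation, so an input of shape (n, m) gives an (n, n) matrix.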

user_id                    1    2    3    ...  941  942  943
movie title
'Til There Was You (1997)  0.0  0.0  0.0  ...  0.0  0.0  0.0
1-900 (1994)               0.0  0.0  0.0  ...
...

result.shape
>>>
(1664, 12)

corr_mat=np.corrcoef(result)
corr_mat.shape
>>>
(1664, 1664)
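The shape change from (1664, 12) to (1664, 1664) follows from rows being treated as variables. A tiny sketch with made-up data (the array below is hypothetical, not the movie data above):

```python
import numpy as np

# Three variables (rows), four observations each (columns).
data = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],  # row 0 scaled by 2: correlation +1 with row 0
    [4.0, 3.0, 2.0, 1.0],  # row 0 reversed: correlation -1 with row 0
])

corr = np.corrcoef(data)
print(corr.shape)  # (3, 3)

# corr[i, j] is the correlation between row i and row j.
print(np.round(corr[0, 1], 6), np.round(corr[0, 2], 6))
```

To correlate columns instead of rows, rowvar=False can be passed.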

 

  • astype

numpy.ndarray.astype(dtype) : Return a copy of the array cast to the given dtype.

te_array
>>>
array([[False, False, False,  True, False,  True,  True,  True,  True,
        False,  True],
       [False, False,  True,  True, False,  True, False,  True,  True,
        False,  True],
       [ True, False, False,  True, False,  True,  True, False, False,
        False, False],
       [False,  True, False, False, False,  True,  True, False, False,
         True,  True],
       [False,  True, False,  True,  True,  True, False, False,  True,
        False, False]])
        
te_array.astype('int')
>>>
array([[0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1],
       [0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1],
       [1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1],
       [0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0]])
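A small sketch of two things worth knowing about astype: it returns a new array, and casting floats to int truncates toward zero rather than rounding. Variable names are illustrative.

```python
import numpy as np

mask = np.array([True, False, True])

# astype returns a new array; mask itself keeps its bool dtype.
as_int = mask.astype(int)
print(as_int)      # [1 0 1]
print(mask.dtype)  # bool

# Casting float to int drops the fractional part (no rounding).
floats = np.array([1.9, -1.9, 2.5])
truncated = floats.astype(int)
print(truncated)   # [ 1 -1  2]
```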

 

  • nan

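np.nan is the floating-point NaN (Not a Number) constant. Here it is passed to replace() on a DataFrame column to replace NaN values with zeros.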

NTM_df['ACCD_DMG_PROTO_NM']=NTM_df['ACCD_DMG_PROTO_NM'].replace(np.nan,0)
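The same idea on a plain NumPy array, as a minimal sketch. It assumes NumPy 1.17 or later for the nan= keyword of np.nan_to_num; the array x is made up for illustration.

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0])

# NaN is not equal to anything, including itself, so == cannot find it.
print(np.nan == np.nan)  # False

# Use np.isnan to locate NaN values.
mask = np.isnan(x)
print(mask)  # [False  True False]

# np.nan_to_num returns a copy with NaN replaced by the given value.
filled = np.nan_to_num(x, nan=0.0)
print(filled)  # [1. 0. 3.]
```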

 

  • hstack

Stack arrays horizontally (column-wise). For 1-D arrays this concatenates them end to end.

a=np.array((1,2,3))
b=np.array((4,5,6))
np.hstack((a,b))

>>>
array([1, 2, 3, 4, 5, 6])
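With 2-D arrays the "horizontal" part is easier to see. A short sketch, with np.vstack shown for contrast; the arrays are made up for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5], [6]])

# hstack joins along columns; the row counts must match.
h = np.hstack((a, b))
print(h)
# [[1 2 5]
#  [3 4 6]]

# vstack joins along rows; the column counts must match.
v = np.vstack((a, [[7, 8]]))
print(v.shape)  # (3, 2)
```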

 

  • argmax

Returns the indices of the maximum values along an axis.

reference : https://geekflare.com/numpy-argmax-function-python/

 

a=np.arange(6).reshape(2,3)
a
>>>

array([[0, 1, 2],
       [3, 4, 5]])

np.argmax(a,axis=1)

>>>
array([2, 2])

np.argmax(a,axis=0)

>>>
array([1, 1, 1])

 

For example, take an array a with shape (19,19,5,80):
  • Axis 0 = 19 elements
  • Axis 1 = 19 elements
  • Axis 2 = 5 elements
  • Axis 3 = 80 elements

Negative axis numbers work as negative indices do in Python lists and NumPy arrays: they count from the end.

  • Axis -1 = 80 elements
  • Axis -2 = 5 elements
  • Axis -3 = 19 elements
  • Axis -4 = 19 elements

When you pass the axis parameter to argmax, the returned indices are positions along that axis. The result drops that axis and keeps the others.

The shape argmax returns for each axis:

  • K.argmax(a,axis= 0 or -4) returns (19,5,80) with values from 0 to 18
  • K.argmax(a,axis= 1 or -3) returns (19,5,80) with values from 0 to 18
  • K.argmax(a,axis= 2 or -2) returns (19,19,80) with values from 0 to 4
  • K.argmax(a,axis= 3 or -1) returns (19,19,5) with values from 0 to 79
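The shapes listed above can be checked with np.argmax. This is a sketch that assumes K.argmax follows the same axis rule as np.argmax; the random array only stands in for real data.

```python
import numpy as np

# Random data with the shape discussed above.
rng = np.random.default_rng(0)
a = rng.random((19, 19, 5, 80))

shapes = []
for axis in range(a.ndim):
    out = np.argmax(a, axis=axis)
    # axis - a.ndim is the negative number for the same axis (0 <-> -4, ...).
    same = np.array_equal(out, np.argmax(a, axis=axis - a.ndim))
    shapes.append(out.shape)
    print(axis, axis - a.ndim, out.shape, same)

# Shapes collected in order:
# (19, 5, 80), (19, 5, 80), (19, 19, 80), (19, 19, 5)
```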
