Note
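How to get the positions of top n values from a numpy array?
# Input
import numpy as np
np.random.seed(100)
a = np.random.uniform(1, 50, 20)
# 1: argsort sorts ascending, so the last 5 indices are the positions of the top 5
print(a.argsort()[-5:])
# output [18 7 3 10 15]
# 2: argpartition on the negated array moves the top 5 to the front (order not guaranteed)
np.argpartition(-a, 5)[:5]
# output [15 10 3 7 18]
# The methods below return the values rather than the positions.
# 1:
a[a.argsort()][-5:]
# 2:
np.sort(a)[-5:]
# 3:
np.partition(a, kth=-5)[-5:]
# 4:
a[np.argpartition(-a, 5)][:5]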
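The two approaches above can be combined when the top n positions are needed in a predictable order. The following is a minimal sketch, not part of the original post: the helper name `top_n_indices` and the sample array are made up for illustration.

```python
import numpy as np

def top_n_indices(a, n):
    """Return the indices of the n largest values of a 1-D array, largest first.

    Hypothetical helper for illustration; assumes 1 <= n <= len(a).
    """
    a = np.asarray(a)
    # argpartition moves the n largest values into the last n slots (unordered)
    idx = np.argpartition(a, -n)[-n:]
    # sort only those n candidates by value, then reverse to get descending order
    return idx[np.argsort(a[idx])[::-1]]

values = np.array([4.0, 9.5, 1.2, 7.3, 8.8, 0.5])  # made-up data
top3 = top_n_indices(values, 3)
print(top3)          # positions of the three largest values
print(values[top3])  # the values themselves, largest first
```

Partitioning first and sorting only the n selected candidates avoids a full sort, which may matter for large arrays when n is small.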
How to format all the values in a dataframe as percentages?
# Input
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.random(4), columns=['random'])
# Solution: the Styler changes only how the values are displayed, not the data
out = df.style.format({
    'random': '{0:.2%}'.format,
})
out
# output
#    random
# 0  21.66%
# 1  44.90%
# 2  85.69%
# 3  92.12%
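If the percentages are needed as plain strings (for example to write them to a text file) rather than as a styled display, one alternative is to map a format function over every column. This is a sketch of a different technique from the `df.style.format` solution above, and the input values are hand-picked for illustration.

```python
import pandas as pd

# Illustrative input: fractions between 0 and 1
df = pd.DataFrame({"random": [0.2166, 0.4490, 0.8569, 0.9212]})

# Apply the percent format to every cell of every column.
# The result holds strings, so it is for presentation only.
as_text = df.apply(lambda col: col.map("{:.2%}".format))
print(as_text)
```

The original `df` keeps its float values, so further arithmetic should be done on `df`, not on `as_text`.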
How to replace all values beyond a given value with a given cutoff?
# Input
import numpy as np
np.set_printoptions(precision=2)
np.random.seed(100)
a = np.random.uniform(1, 50, 20)
# Goal: values above 30 become 30, values below 10 become 10
# 1: Using np.clip
np.clip(a, a_min=10, a_max=30)
# 2: Using np.where
print(np.where(a < 10, 10, np.where(a > 30, 30, a)))
# output
# [ 27.63 14.64 21.8 30. 10. 10. 30. 30. 10. 29.18 30. 11.25 10.08 10. 11.77 30. 30. 10. 30. 14.43]
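As a quick check that the two methods agree, the sketch below runs `np.clip`, the nested `np.where`, and a boolean-mask assignment on a small made-up array. The mask variant is an extra illustration and is not from the original post.

```python
import numpy as np

a = np.array([27.6, 14.6, 45.0, 3.2, 30.0, 10.0])  # made-up data

# 1: clip in a single call
clipped = np.clip(a, 10, 30)

# 2: nested np.where, lower bound first, then upper bound
nested = np.where(a < 10, 10, np.where(a > 30, 30, a))

# 3: boolean masks on a copy, leaving the original array untouched
masked = a.copy()
masked[masked < 10] = 10
masked[masked > 30] = 30

print(clipped)
print(np.array_equal(clipped, nested), np.array_equal(clipped, masked))
```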
How to format or suppress scientific notations in a pandas dataframe?
# Input
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.random(4)**10, columns=['random'])
# 1: Rounding
df.round(4)
# 2: Use apply to change format (returns strings)
df.apply(lambda x: '%.4f' % x, axis=1)
# or (applymap is deprecated since pandas 2.1 in favor of DataFrame.map)
df.applymap(lambda x: '%.4f' % x)
# 3: Use set_option (affects display globally)
pd.set_option('display.float_format', lambda x: '%.4f' % x)
# 4: Assign display.float_format
pd.optio..
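`pd.set_option` changes the display for the rest of the session. When the fixed-point format is wanted only for one block of output, `pd.option_context` can scope the same option. This is a sketch of that alternative, with made-up values; it is not one of the methods listed in the post.

```python
import pandas as pd

# Made-up values, including one small enough to trigger scientific notation
df = pd.DataFrame({"random": [1.5e-7, 3.2e-3, 0.25, 0.9]})

# The float format applies only inside the with-block
with pd.option_context("display.float_format", "{:.4f}".format):
    scoped = df.to_string()

print(scoped)
# Outside the block the previous display settings are back in effect
print(df.to_string())
```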
How to find the position of the first occurrence of a value greater than a given value?
# Input:
import numpy as np
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')
# Solution: position of the first petal width (4th column) greater than 1.0
# (edit: changed argmax to argwhere. Thanks Rong!)
np.argwhere(iris[:, 3].astype(float) > 1.0)[0]
# output 50
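One reason to prefer `argwhere` over `argmax` here is the no-match case: `np.argmax` on an all-False mask returns 0, which looks like a valid position. The sketch below wraps the lookup in a hypothetical helper, `first_index_greater`, and uses a small made-up array in place of the iris column so it runs without a download.

```python
import numpy as np

def first_index_greater(values, threshold):
    """Return the index of the first element > threshold, or None if there is none.

    Hypothetical helper for illustration only.
    """
    hits = np.flatnonzero(np.asarray(values) > threshold)
    return int(hits[0]) if hits.size else None

petal_width = np.array([0.2, 0.4, 0.3, 1.4, 1.5, 0.2])  # made-up stand-in data

found = first_index_greater(petal_width, 1.0)    # a match exists
missing = first_index_greater(petal_width, 5.0)  # nothing exceeds 5.0
misleading = int(np.argmax(petal_width > 5.0))   # argmax still reports index 0

print(found, missing, misleading)
```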