List: Pandas (57)
Note
How to get the nrows, ncolumns, datatype, summary stats of each column of a dataframe? Also get the array and list equivalent.

    df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/Cars93_miss.csv')

    # number of rows and columns
    print(df.shape)
    # output
    (93, 27)

    # datatypes
    print(df.dtypes)
    # output
    Manufacturer     object
    Model            object
    Type             object
    Min.Price       float64
    Price           float64
    ..
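The excerpt above is cut off before the summary statistics and the array and list conversions. A minimal sketch of those steps follows, assuming a small made-up frame (`cars`, with invented values) in place of the Cars93 file so that it runs without a download.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Cars93 data; the values are invented for illustration.
cars = pd.DataFrame({
    "Manufacturer": ["Acura", "Audi", "BMW"],
    "Price": [15.9, 29.1, np.nan],
    "Horsepower": [140, 172, 208],
})

n_rows, n_cols = cars.shape      # number of rows and columns
dtypes = cars.dtypes             # one dtype per column
summary = cars.describe()        # count, mean, std, min, quartiles, max (numeric columns only)

as_array = cars.to_numpy()       # 2-D NumPy array of the values
as_list = as_array.tolist()      # list of rows, each row a plain Python list

print(n_rows, n_cols)
print(dtypes)
print(summary)
```

Note that `describe()` skips missing values, so the `count` row shows how many non-null entries each numeric column has.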
How to import only specified columns from a csv file?

    df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/BostonHousing.csv', usecols=['crim', 'medv'])
    print(df.head())
    # output
          crim  medv
    0  0.00632  24.0
    1  0.02731  21.6
    2  0.02729  34.7
    3  0.03237  33.4
    4  0.06905  36.2
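Besides a list of column names, `usecols` also accepts column positions or a callable that is applied to each column name. The sketch below shows the three forms on a few inline rows; the `crim` and `medv` values come from the output above, while the `zn` and `indus` columns are only assumed here to give the reader something to filter out.

```python
import io

import pandas as pd

# Inline sample so the example runs offline; zn and indus values are assumptions.
csv_text = """crim,zn,indus,medv
0.00632,18.0,2.31,24.0
0.02731,0.0,7.07,21.6
0.02729,0.0,7.07,34.7
"""

# Select by column name.
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["crim", "medv"])

# Select by zero-based column position.
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[0, 3])

# Select with a callable: keep columns whose name starts with "m".
by_rule = pd.read_csv(io.StringIO(csv_text), usecols=lambda name: name.startswith("m"))

print(by_name)
```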
How to create a dataframe with rows as strides from a given series?

    L = pd.Series(range(15))

    def gen_strides(a, stride_len=5, window_len=5):
        n_strides = ((a.size-window_len)//stride_len) + 1
        return np.array([a[s:(s+window_len)] for s in np.arange(0, a.size, stride_len)[:n_strides]])

    gen_strides(L, stride_len=2, window_len=4)
    # output
    array([[ 0,  1,  2,  3],
           [ 2,  3,  4,  5],
           [ 4,  5,  6,  7],
           [ 6,  7,  8,..
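As an alternative to the list comprehension above, the same windows can be built with NumPy's `sliding_window_view` (available from NumPy 1.20). This is a sketch rather than the post's own method; `gen_strides_view` is a hypothetical name, and the last step wraps the result in a dataframe as the title asks.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def gen_strides_view(series, stride_len=5, window_len=5):
    """Return every stride_len-th window of length window_len as a 2-D array."""
    # All overlapping windows of the requested length, one per starting offset.
    windows = sliding_window_view(series.to_numpy(), window_len)
    # Keep only the windows whose start is a multiple of stride_len.
    return windows[::stride_len]


L = pd.Series(range(15))
strided = gen_strides_view(L, stride_len=2, window_len=4)

# One row per stride.
frame = pd.DataFrame(strided)
print(frame)
```

Only complete windows are produced, so a trailing partial window is dropped, which matches the `n_strides` calculation in the original function.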
How to change column values when importing csv to a dataframe?

    # 1: Using the converters parameter
    df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/BostonHousing.csv', converters={'medv': lambda x: 'High' if float(x) > 25 else 'Low'})

    # 2: Using csv reader (expects a local copy of BostonHousing.csv)
    import csv
    with open('BostonHousing.csv', 'r') as f:
        reader = csv.reader(f)
        out = []
        for i, row in enumerate(reader):
            i..
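To make the `converters` approach easy to try without a download, here is a small sketch on inline rows taken from the `crim` and `medv` values shown earlier in this list. The helper name `label_medv` is hypothetical. A converter receives each cell as a raw string, which is why it calls `float()` first. For comparison, the second half applies the same rule after loading with `np.where`.

```python
import io

import numpy as np
import pandas as pd

csv_text = """crim,medv
0.00632,24.0
0.02731,21.6
0.02729,34.7
0.03237,33.4
0.06905,36.2
"""


def label_medv(raw):
    """Map a raw medv cell (a string) to 'High' or 'Low'."""
    return "High" if float(raw) > 25 else "Low"


# Transform the column while the file is being parsed.
converted = pd.read_csv(io.StringIO(csv_text), converters={"medv": label_medv})

# Equivalent result: load the numbers first, then relabel them in one vectorized step.
plain = pd.read_csv(io.StringIO(csv_text))
plain["medv"] = np.where(plain["medv"] > 25, "High", "Low")

print(converted)
```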
How to import only every nth row from a csv file to create a dataframe?

    # 1: Use chunks and for-loop
    df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/BostonHousing.csv', chunksize=50)
    # DataFrame.append was removed in pandas 2.0, so collect the rows in a list first
    rows = []
    for chunk in df:
        rows.append(chunk.iloc[0,:])
    df2 = pd.DataFrame(rows)

    # 2: Use chunks and list comprehension
    df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/..
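Another way to keep every nth row, not shown in the excerpt above, is to pass a callable to `skiprows`. The sketch below assumes a tiny generated CSV with made-up `id` and `value` columns. The callable receives the file's line index, where 0 is the header, and returns `True` for lines to skip.

```python
import io

import pandas as pd

# Made-up data: ten rows with ids 0..9 under a header line.
csv_text = "id,value\n" + "\n".join(f"{i},{i * 10}" for i in range(10))

n = 3  # keep every 3rd data row, starting with the first

# Line 0 is the header and is always kept; data row k sits on file line k + 1.
every_nth = pd.read_csv(
    io.StringIO(csv_text),
    skiprows=lambda i: i > 0 and (i - 1) % n != 0,
)

# Cross-check: read everything, then slice with a step.
full = pd.read_csv(io.StringIO(csv_text))
sliced = full.iloc[::n].reset_index(drop=True)

print(every_nth)
```

The `skiprows` version never materializes the skipped rows, which can matter for large files; the slicing version is simpler when the whole file fits in memory.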