List: Pandas (57)
Note
How to compute the autocorrelations of a numeric series?
# Input
ser = pd.Series(np.arange(20) + np.random.normal(1, 10, 20))
# Solution
autocorrelations = [ser.autocorr(i).round(2) for i in range(11)]
print(autocorrelations[1:])
print('Lag having highest correlation: ', np.argmax(np.abs(autocorrelations[1:]))+1)
# output
[0.29999999999999999, -0.11, -0.17000000000000001, 0.46000000000000002, 0...
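A minimal, self-contained sketch of the same idea, for readers who want to run it. The fixed seed, the `default_rng` generator, and the names `lags`, `manual_lag1`, and `best_lag` are assumptions added for repeatability and are not part of the original post. It also checks that `autocorr(lag)` agrees with the Pearson correlation between the series and a copy shifted by `lag`.

```python
import numpy as np
import pandas as pd

# Fixed seed is an assumption so the run is repeatable; the original uses np.random.normal.
rng = np.random.default_rng(0)
ser = pd.Series(np.arange(20) + rng.normal(1, 10, 20))

# Autocorrelation for lags 1..10, indexed by the lag itself.
lags = list(range(1, 11))
autocorrelations = pd.Series([ser.autocorr(lag) for lag in lags], index=lags)

# Cross-check: correlation of the series with a copy shifted by one step.
manual_lag1 = ser.corr(ser.shift(1))

# Lag whose autocorrelation has the largest absolute value.
best_lag = autocorrelations.abs().idxmax()

print(autocorrelations.round(2).tolist())
print('Lag having highest correlation:', best_lag)
```

Indexing the results by lag avoids the `+1` offset that is needed when taking `np.argmax` over a list that starts at lag 1.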
How to fill an intermittent time series so all missing dates show up with values of previous non-missing date?
# Input
ser = pd.Series([1,10,3, np.nan], index=pd.to_datetime(['2000-01-01', '2000-01-03', '2000-01-06', '2000-01-08']))
# 1
ser.resample('D').ffill() # fill with previous value
# 2
ser.resample('D').bfill() # fill with next value
ser.resample('D').bfill().ffill() # fill next else prev..
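The following sketch runs the fills from the excerpt on the same input. The variable names `filled_prev`, `filled_next`, `filled_both`, and `carried` are illustrative and not from the original post. One point worth noting: the last observation in the input is itself `NaN`, so upsampling with a forward fill alone can leave that date empty. Chaining `Series.ffill()` afterwards is one possible way to carry the last non-missing value through, shown here as an assumption rather than as the post's own solution.

```python
import numpy as np
import pandas as pd

ser = pd.Series(
    [1, 10, 3, np.nan],
    index=pd.to_datetime(['2000-01-01', '2000-01-03', '2000-01-06', '2000-01-08']),
)

# Upsample to daily frequency; new dates take the previous observation.
filled_prev = ser.resample('D').ffill()

# Upsample to daily frequency; new dates take the next observation.
filled_next = ser.resample('D').bfill()

# Take the next observation where possible, otherwise fall back to the previous one.
filled_both = ser.resample('D').bfill().ffill()

# Assumed variant: forward fill again so the original NaN is also replaced.
carried = ser.resample('D').ffill().ffill()

print(filled_prev)
print(filled_both)
print(carried)
```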
How to create a time series starting on ‘2000-01-01’ and covering 10 weekends (Saturdays), with random numbers as values?
# Solution
ser = pd.Series(np.random.randint(1,10,10), pd.date_range('2000-01-01', periods=10, freq='W-SAT'))
ser
# output
2000-01-01    6
2000-01-08    7
2000-01-15    4
2000-01-22    6
2000-01-29    8
2000-02-05    6
2000-02-12    5
2000-02-19    8
2000-02-26    1
2000-03-04    7
Freq: W-SAT, dtype: int64
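A small runnable sketch of the weekly-Saturday index, with a check that every generated date really is a Saturday. The seeded generator and the name `dates` are assumptions added for repeatability; the original uses `np.random.randint`, so the values will differ from the output shown above.

```python
import numpy as np
import pandas as pd

# Seeded generator is an assumption; values differ from the post's random output.
rng = np.random.default_rng(42)

# 'W-SAT' anchors a weekly frequency on Saturdays; 2000-01-01 is itself a Saturday.
dates = pd.date_range('2000-01-01', periods=10, freq='W-SAT')

# Integers from 1 to 9 inclusive, matching np.random.randint(1, 10, 10).
ser = pd.Series(rng.integers(1, 10, size=10), index=dates)

print(ser)
print(ser.index.day_name().unique())
```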
How to replace missing spaces in a string with the least frequent character?
# Input
my_str = 'dbc deb abed gade'
# Solution
ser = pd.Series(list(my_str))
freq = ser.value_counts()
print(freq)
# output
d    4
b    3
e    3
     3
a    2
g    1
c    1
dtype: int64
least_freq = freq.dropna().index[-1]
"".join(ser.replace(' ', least_freq))
# output
'dbccdebcabedcgade'
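The solution above relies on the sort order of `value_counts()`, so which of the tied characters (`g` or `c`) ends up last may depend on the pandas version. The sketch below is an alternative under stated assumptions: it excludes the space itself from the count and picks the minimum with `idxmin()`. The names `freq`, `least_freq`, and `result` follow the excerpt where possible; the filtering step is an addition, not the original method.

```python
import pandas as pd

my_str = 'dbc deb abed gade'
ser = pd.Series(list(my_str))

# Count characters, leaving out the space so it can never be chosen as the replacement.
freq = ser[ser != ' '].value_counts()

# Character with the smallest count; ties are resolved by the order value_counts returns.
least_freq = freq.idxmin()

# Replace every space with that character and rebuild the string.
result = ''.join(ser.replace(' ', least_freq))
print(least_freq, result)
```

Because `g` and `c` each occur once in this input, either can be selected, so the result may differ from the output shown in the excerpt.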