Note

Pandas (54)

Unknown user 2022. 10. 13. 19:22

How to find and cap outliers from a series or dataframe column?

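# Input
import numpy as np
import pandas as pd

ser = pd.Series(np.logspace(-2, 2, 30))

# Solution
def cap_outliers(ser, low_perc, high_perc):
    # low_perc and high_perc are quantile fractions in [0, 1], e.g. .05 and .95
    low, high = ser.quantile([low_perc, high_perc])
    print(f'{low_perc:.0%} quantile: {low:.6f} | {high_perc:.0%} quantile: {high:.6f}')
    # clip returns a new Series, so the caller's input is not modified in place
    return ser.clip(lower=low, upper=high)

capped_ser = cap_outliers(ser, .05, .95)

# Output
5% quantile: 0.016049 | 95% quantile: 63.876672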
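
The question also mentions finding the outliers and working on a dataframe column, which the Series solution does not show. A minimal sketch of both, assuming a hypothetical frame `df` with a column named `value` (neither name comes from the original post) and the same 5%/95% quantile bounds:

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the column name "value" is an assumption for illustration.
df = pd.DataFrame({"value": np.logspace(-2, 2, 30)})

# Same 5% / 95% quantile bounds as in the Series solution.
low, high = df["value"].quantile([0.05, 0.95])

# Find: boolean mask marking the rows that fall outside the bounds.
is_outlier = (df["value"] < low) | (df["value"] > high)

# Cap: clip returns a new Series, so the original column is left untouched.
df["value_capped"] = df["value"].clip(lower=low, upper=high)

print(is_outlier.sum(), "outliers found")
```

Writing the capped values to a separate column keeps the raw data available for comparison; assigning back to `df["value"]` would overwrite it instead.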
