Pandas (33)

Unknown user 2022. 8. 25. 00:00

How to import only every nth row from a csv file to create a dataframe?

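# 1: Use chunks and for-loop
import pandas as pd
# Read the file 50 rows at a time and keep the first row of each chunk
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/BostonHousing.csv', chunksize=50)
rows = []
for chunk in df:
    rows.append(chunk.iloc[0, :])
# DataFrame.append was removed in pandas 2.0, so build the frame in one step
df2 = pd.DataFrame(rows)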


# 2: Use chunks and list comprehension
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/BostonHousing.csv', chunksize=50)
df2 = pd.concat([chunk.iloc[0] for chunk in df], axis=1)
df2 = df2.transpose()

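# 3: Use csv reader
import csv
# Assumes BostonHousing.csv has been downloaded to the working directory
with open('BostonHousing.csv', 'r', newline='') as f:
    reader = csv.reader(f)
    out = []
    for i, row in enumerate(reader):
        # i == 0 is the header; after that, keep every 50th line of the file
        if i % 50 == 0:
            out.append(row)

# csv.reader yields strings, so every column of df2 has dtype object
df2 = pd.DataFrame(out[1:], columns=out[0])
print(df2.head())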

# output
                 crim    zn  indus chas                  nox     rm   age  \
0              0.21977   0.0   6.91    0  0.44799999999999995  5.602  62.0   
1               0.0686   0.0   2.89    0                0.445  7.416  62.5   
2   2.7339700000000002   0.0  19.58    0                0.871  5.597  94.9   
3               0.0315  95.0   1.47    0  0.40299999999999997  6.975  15.3   
4  0.19072999999999998  22.0   5.86    0                0.431  6.718  17.5   

      dis rad  tax ptratio       b  lstat  medv  
0  6.0877   3  233    17.9   396.9   16.2  19.4  
1  3.4952   2  276    18.0   396.9   6.19  33.2  
2  1.5257   5  403    14.7  351.85  21.45  15.4  
3  7.6534   3  402    17.0   396.9   4.56  34.9  
4  7.8265   7  330    19.1  393.74   6.56  26.2
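
As a possible fourth approach, `pd.read_csv` accepts a callable for `skiprows`, so the row filter can run while the file is being parsed. The sketch below is not from the original post: `read_every_nth` is a hypothetical helper name, and `csv_text` is a small made-up CSV standing in for BostonHousing.csv so the example runs without a download.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: a header line plus 200 data rows.
csv_text = "id,value\n" + "\n".join(f"{i},{i * 10}" for i in range(200))


def read_every_nth(source, n):
    """Keep the header (file line 0) and every n-th file line after it.

    The callable receives the zero-based line index of the file and
    returns True for lines that should be skipped.
    """
    return pd.read_csv(source, skiprows=lambda i: i % n != 0)


df_nth = read_every_nth(io.StringIO(csv_text), 50)
print(df_nth)
```

Because the index counts the header as line 0, this keeps the same lines as the csv reader version (#3), while the chunk-based versions (#1 and #2) keep the first row of each chunk and are therefore offset by one row. Unlike #3, the columns here come back with numeric dtypes instead of strings.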
