Note
Pandas (33)
How to import only every nth row from a csv file to create a dataframe?
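# 1: Use chunks and a for-loop
import pandas as pd

# Read the file 50 rows at a time and keep only the first row of each chunk
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/BostonHousing.csv', chunksize=50)
rows = []
for chunk in df:
    rows.append(chunk.iloc[0, :])
# DataFrame.append was removed in pandas 2.0, so build the frame once from the collected rows
df2 = pd.DataFrame(rows)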
# 2: Use chunks and list comprehension
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/BostonHousing.csv', chunksize=50)
df2 = pd.concat([chunk.iloc[0] for chunk in df], axis=1)
df2 = df2.transpose()
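# 3: Use csv reader
import csv

# Assumes BostonHousing.csv has already been downloaded to the working directory
with open('BostonHousing.csv', 'r', newline='') as f:
    reader = csv.reader(f)
    out = []
    for i, row in enumerate(reader):
        # i == 0 is the header line; after it, keep every 50th line of the file
        if i % 50 == 0:
            out.append(row)

# csv.reader yields strings, so every column of df2 holds text rather than numbers
df2 = pd.DataFrame(out[1:], columns=out[0])
print(df2.head())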
# output
                  crim    zn  indus  chas                  nox     rm   age  \
0              0.21977   0.0   6.91     0  0.44799999999999995  5.602  62.0
1               0.0686   0.0   2.89     0                0.445  7.416  62.5
2   2.7339700000000002   0.0  19.58     0                0.871  5.597  94.9
3               0.0315  95.0   1.47     0  0.40299999999999997  6.975  15.3
4  0.19072999999999998  22.0   5.86     0                0.431  6.718  17.5

      dis  rad  tax  ptratio       b  lstat  medv
0  6.0877    3  233     17.9   396.9   16.2  19.4
1  3.4952    2  276     18.0   396.9   6.19  33.2
2  1.5257    5  403     14.7  351.85  21.45  15.4
3  7.6534    3  402     17.0   396.9   4.56  34.9
4  7.8265    7  330     19.1  393.74   6.56  26.2
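
A further option, not one of the three methods above, is the callable form of the `skiprows` parameter of `pd.read_csv`, which decides line by line whether to skip. The sketch below is only an illustration: it uses a small made-up in-memory CSV (`csv_text`, with columns `a` and `b`) and a hypothetical step `n = 3` in place of the Boston housing file and a step of 50, so it runs without a download. Like the csv reader version, it keeps file lines n, 2n, ... after the header, while the chunk-based versions keep data rows 0, n, 2n, ...

```python
import io

import pandas as pd

# Made-up stand-in for BostonHousing.csv: a header plus 10 data rows.
csv_text = "a,b\n" + "\n".join(f"{i},{i * 10}" for i in range(10))

n = 3  # hypothetical step; the examples above use 50

# The callable receives each file line index and returns True to skip it.
# Line 0 is the header (0 % n == 0), so it is always kept.
every_nth = pd.read_csv(io.StringIO(csv_text), skiprows=lambda i: i % n != 0)

# The same selection after reading everything: data rows n-1, 2n-1, ...
full = pd.read_csv(io.StringIO(csv_text))
same = full.iloc[n - 1::n].reset_index(drop=True)

print(every_nth)
print(every_nth.equals(same))
```

Unlike the csv reader approach, the columns come back with numeric dtypes, because pandas still does the parsing.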