Note
Numpy (42)
How to do probabilistic sampling in numpy?
import numpy as np
# Import iris, keeping the text column intact
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')
# Solution
# Get the species column
species = iris[:, 4]
# 1: Generate probabilistically
np.random.seed(100)
a = np.array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'])
species_out = np.random.choice(a, 150, p=[0.5, 0.25, 0.25])
# 2: Probabilistic sampling (preferred)
np.random.seed(100)
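# Assign each of the 150 rows (50 per species, in file order) a cutoff in [0, 1]:
# setosa rows span 0-0.5, versicolor 0.501-0.75, virginica 0.751-1.0
probs = np.r_[np.linspace(0, 0.500, num=50), np.linspace(0.501, .750, num=50), np.linspace(.751, 1.0, num=50)]
# Each uniform draw maps to a row index, so about half the draws land on setosa rows
index = np.searchsorted(probs, np.random.random(150))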
species_out = species[index]
print(np.unique(species_out, return_counts=True))
# output
(array([b'Iris-setosa', b'Iris-versicolor', b'Iris-virginica'], dtype=object), array([77, 37, 36]))
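The second approach is a form of inverse-CDF sampling: a uniform random number is looked up in a sorted array of cutoffs. Below is a minimal sketch of the same idea using cumulative probabilities instead of per-row cutoffs. It assumes the newer `np.random.default_rng` generator rather than the `np.random.seed` calls above, and the names `labels`, `p`, and `cdf` are illustrative and not part of the original solution.

```python
import numpy as np

# Illustrative labels and target probabilities (same 2:1:1 ratio as above)
labels = np.array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'])
p = np.array([0.5, 0.25, 0.25])

rng = np.random.default_rng(100)   # seed chosen arbitrarily
cdf = np.cumsum(p)                 # cumulative cutoffs: [0.5, 0.75, 1.0]
u = rng.random(10_000)             # uniform draws in [0, 1)

# side='right' returns the first bin whose upper cutoff is strictly above u
idx = np.searchsorted(cdf, u, side='right')
sample = labels[idx]

# Empirical frequencies should be close to p
values, counts = np.unique(sample, return_counts=True)
freqs = counts / counts.sum()
print(dict(zip(values, freqs.round(3))))
```

With enough draws the printed frequencies should settle near 0.5, 0.25, and 0.25. The cutoff array here has one entry per class rather than one per row, so it samples labels directly instead of sampling rows from the dataset.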