Numpy (42)

Unknown user, 2022. 9. 6. 18:53

How to do probabilistic sampling in numpy?

import numpy as np

# Import iris keeping the text column intact
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')

# Solution
# Get the species column
species = iris[:, 4]

# 1: Generate probabilistically (draw labels directly with the target probabilities)
np.random.seed(100)
a = np.array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'])
species_out = np.random.choice(a, 150, p=[0.5, 0.25, 0.25])

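# 2: Probabilistic sampling (preferred)
np.random.seed(100)
# Cumulative-probability grid over the 150 rows, which are ordered by species (50 each):
# setosa rows span 0 to 0.5, versicolor rows 0.501 to 0.75, virginica rows 0.751 to 1.0
probs = np.r_[np.linspace(0, 0.500, num=50), np.linspace(0.501, .750, num=50), np.linspace(.751, 1.0, num=50)]
# Map each uniform draw in [0, 1) to the row index where it falls on the grid
index = np.searchsorted(probs, np.random.random(150))
# Pick the sampled rows from the actual species column
species_out = species[index]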
print(np.unique(species_out, return_counts=True))

# output

(array([b'Iris-setosa', b'Iris-versicolor', b'Iris-virginica'], dtype=object), array([77, 37, 36]))
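The second approach is a form of inverse-CDF sampling: uniform random numbers are mapped through cumulative probabilities with np.searchsorted. Below is a minimal sketch of the same idea that does not need the dataset download. It is not the original solution: it assumes the three class probabilities 0.5, 0.25, 0.25 from approach 1, uses np.random.default_rng in place of np.random.seed, and the names labels, cdf, and class_idx are illustrative only. Because the random generator differs, the counts will not match the output above.

```python
import numpy as np

# Class labels and target probabilities (taken from approach 1 above)
labels = np.array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'])
p = np.array([0.5, 0.25, 0.25])

# Cumulative distribution over the classes: [0.5, 0.75, 1.0]
cdf = np.cumsum(p)

# Seed chosen arbitrarily for reproducibility (an assumption, not from the post)
rng = np.random.default_rng(100)
u = rng.random(150)  # uniform draws in [0, 1)

# side='right' sends u in [0, 0.5) to class 0, [0.5, 0.75) to class 1,
# and [0.75, 1.0) to class 2
class_idx = np.searchsorted(cdf, u, side='right')
species_out = labels[class_idx]

values, counts = np.unique(species_out, return_counts=True)
print(values, counts)
```

Working with three cumulative values instead of a 150-point grid keeps the class probabilities explicit, so changing the target mix only requires editing p.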
