All posts (462)
Note
!pip install tweepy
# Enter the tokens issued to your developer account
consumer_key = ""
consumer_secret = ""
access_token = ""
access_token_secret = ""
import tweepy
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
# Keyword to search for
keyword = ""
# List to hold the results
result = []
# Fetch tweets (note: Tweepy 4.x renamed api.search to api.search_tweets)
tweets = api.search(q = keyword, result_type = 'recent', count ..
How to get the second largest value of an array when grouped by another array?
import numpy as np
# Import iris keeping the text column intact
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')
# Solution
# Get the species and petal length columns
petal_len_setosa = iris[iris[:, 4] == b'Iris-setosa', [2]].astype('float')
# Get t..
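Since the excerpt above is cut off, here is a minimal offline sketch of the same idea: take the second largest value within each group. The sample arrays and the helper name `second_largest_by_group` are hypothetical stand-ins for the iris columns, not the post's own solution.

```python
import numpy as np

# Hypothetical stand-ins for the iris species and petal-length columns.
species = np.array(['setosa', 'setosa', 'setosa',
                    'versicolor', 'versicolor', 'versicolor'])
petal_length = np.array([1.4, 1.9, 1.5, 4.7, 4.5, 5.1])

def second_largest_by_group(values, groups):
    """Map each group label to its second largest distinct value.

    Groups with fewer than two distinct values map to None.
    """
    out = {}
    for g in np.unique(groups):
        # np.unique returns the distinct values sorted in ascending order.
        distinct = np.unique(values[groups == g])
        out[str(g)] = float(distinct[-2]) if distinct.size >= 2 else None
    return out

second = second_largest_by_group(petal_length, species)
print(second)
```

Using `np.unique` means ties for the maximum are collapsed, so the result is the second largest distinct value; sort with `np.sort` instead if duplicates should count.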
How to use apply function on existing columns with global variables as additional arguments?
import numpy as np
import pandas as pd
# Input
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/Cars93_miss.csv')
# Solution
# Fill Min.Price with its mean and Max.Price with its median, ignoring NaNs
d = {'Min.Price': np.nanmean, 'Max.Price': np.nanmedian}
df[['Min.Price', 'Max.Price']] = df[['Min.Price', 'Max.Price']].apply(lambda x, d: x.fillna(d[x.name](x)), args=(d, ))
How to do probabilistic sampling in numpy?
import numpy as np
# Import iris keeping the text column intact
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')
# Solution
# Get the species column
species = iris[:, 4]
# 1: Generate Probabilistically
np.random.seed(100)
a = np.array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'..
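The excerpt above is truncated before the sampling call, so the following is only a sketch of two common ways to draw labels with unequal probabilities. The weights `[0.5, 0.25, 0.25]` and the sample size of 150 are assumptions chosen for illustration, not values confirmed by the post.

```python
import numpy as np

labels = np.array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'])
probs = np.array([0.5, 0.25, 0.25])  # assumed weights; must sum to 1
rng = np.random.default_rng(100)     # seeded generator for reproducibility

# Approach 1: let NumPy draw labels directly with the given weights.
sample_choice = rng.choice(labels, size=150, p=probs)

# Approach 2: inverse-CDF sampling. Draw uniforms in [0, 1) and look up
# which cumulative-probability bucket each one falls into.
cdf = np.cumsum(probs)
idx = np.searchsorted(cdf, rng.random(150), side='right')
idx = np.minimum(idx, len(labels) - 1)  # guard against rounding in cdf[-1]
sample_cdf = labels[idx]

# Observed label counts for each approach.
print(np.unique(sample_choice, return_counts=True))
print(np.unique(sample_cdf, return_counts=True))
```

Both approaches sample from the same distribution; the first is shorter, while the second makes the mechanism explicit.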
How to replace missing values of multiple numeric columns with the mean?
import pandas as pd
# Input
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/Cars93_miss.csv')
# Solution
df_out = df[['Min.Price', 'Max.Price']] = df[['Min.Price', 'Max.Price']].apply(lambda x: x.fillna(x.mean()))
print(df_out.head())
# output
#    Min.Price  Max.Price
# 0  12.900000  18.800000
# 1  29.200000  38.700000
# 2  25.9000..
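For trying this without downloading the CSV, here is a small sketch on a hypothetical three-row frame. It passes the column means straight to `fillna` instead of going through `apply`, which is an alternative formulation and not the post's own code.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the Min.Price / Max.Price columns.
df = pd.DataFrame({
    'Min.Price': [10.0, np.nan, 30.0],
    'Max.Price': [np.nan, 20.0, 40.0],
})
cols = ['Min.Price', 'Max.Price']

# df[cols].mean() is a Series indexed by column name (NaNs are skipped),
# so fillna fills each column with its own mean.
df[cols] = df[cols].fillna(df[cols].mean())
print(df)
```

Here the missing `Min.Price` becomes 20.0 and the missing `Max.Price` becomes 30.0, the means of the remaining values in each column.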