Note
Pandas (25)
How to filter valid emails from a series?
# Input
import pandas as pd
emails = pd.Series(['buying books at amazom.com', 'rameses@egypt.com', 'matt@t.co', 'narendra@modi.com'])
# 1 (as series of strings)
import re
pattern = r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}'
mask = emails.map(lambda x: bool(re.match(pattern, x)))
emails[mask]
# 2 (as a series of lists; entries with no match become an empty list)
emails.str.findall(pattern, flags=re.IGNORECASE)
# 3 (as list)
[x[0] for x in [re.findall(pattern, email) for email in emails] if len(x) > 0]
# Output (of method 3)
['rameses@egypt.com', 'matt@t.co', 'narendra@modi.com']
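Method 1 uses re.match, which only anchors the pattern at the start of each string, so an entry that begins with a valid address and then continues with other text would still pass the filter. The sketch below is not part of the original exercise: it shows the vectorized `Series.str.match` equivalent and, assuming pandas 1.1 or later, the stricter `Series.str.fullmatch`. The `extra` series is a made-up input added only to show the difference.

```python
import pandas as pd

emails = pd.Series(['buying books at amazom.com', 'rameses@egypt.com',
                    'matt@t.co', 'narendra@modi.com'])
pattern = r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}'

# Vectorized version of method 1: str.match anchors at the start only.
prefix_mask = emails.str.match(pattern)
valid = emails[prefix_mask]

# Stricter variant: str.fullmatch requires the whole string to match
# (assumes pandas >= 1.1, where Series.str.fullmatch was added).
strict_mask = emails.str.fullmatch(pattern)
valid_strict = emails[strict_mask]

# Hypothetical input (not in the original data): an address followed by text.
extra = pd.Series(['matt@t.co and more'])
extra_prefix = extra.str.match(pattern)      # passes the prefix check
extra_strict = extra.str.fullmatch(pattern)  # fails the whole-string check

print(valid.tolist())
print(valid_strict.tolist())
```

On the four sample strings both masks select the same three addresses; they only diverge on inputs like the one in `extra`.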