Pandas (25)

Unknown user 2022. 8. 16. 20:00

How to filter valid emails from a series?

# Input
import re
import pandas as pd

emails = pd.Series(['buying books at amazom.com', 'rameses@egypt.com', 'matt@t.co', 'narendra@modi.com'])

# 1 (as series of strings)
# re.match anchors at the start of each string, so only entries that begin with an email-like token pass
pattern = r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}'
mask = emails.map(lambda x: bool(re.match(pattern, x)))
emails[mask]

# 2 (as series of lists; entries with no match yield an empty list)
emails.str.findall(pattern, flags=re.IGNORECASE)

# 3 (as list)
[x[0] for x in [re.findall(pattern, email) for email in emails] if len(x) > 0]

# Output (from solution 3)
['rameses@egypt.com', 'matt@t.co', 'narendra@modi.com']
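
As a possible fourth variant, the mask from solution 1 can be built without `re` or a lambda by using the vectorized `Series.str.fullmatch` (available from pandas 1.1). This is only a sketch, not part of the original exercise. Unlike `re.match`, `fullmatch` requires the whole string to match, which is an assumption about what should count as a "valid" entry here.

```python
import pandas as pd

emails = pd.Series(['buying books at amazom.com', 'rameses@egypt.com',
                    'matt@t.co', 'narendra@modi.com'])
pattern = r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}'

# Boolean mask: True only where the entire string matches the pattern
mask = emails.str.fullmatch(pattern)

# Keep the matching entries as a Series, then show them as a plain list
valid = emails[mask]
print(valid.tolist())
# ['rameses@egypt.com', 'matt@t.co', 'narendra@modi.com']
```

With `fullmatch`, a string such as `'matt@t.co and more'` would be rejected, whereas solution 1 would accept it because `re.match` only checks the beginning of the string.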
