Collecting Video Information from a Video URL

Jun's N 2022. 4. 8. 23:53

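import time

import pandas as pd
from bs4 import BeautifulSoup as bs
from tqdm import tqdm

# Assumed to be defined before this snippet: `session` (a requests.Session),
# `headers` (a dict of request headers), and `array_video_id` (assumed here to
# hold URL paths such as "/watch?v=...", since they are appended to the domain).
video_results = {}
cnt = 0
for video_id in tqdm(array_video_id):  # list of video IDs
    if cnt % 9 == 2:  # pause for 3 seconds once every 9 requests
        time.sleep(3)
    cnt += 1
    result = {}
    video_url = "https://www.youtube.com" + video_id
    response = session.get(video_url, headers=headers)  # request the video page
    if response.status_code == 429:  # rate-limited (Too Many Requests)
        print(response)
    soup = bs(response.text, "html.parser")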
    
    try:  # if any exception occurs, skip this video
        meta = soup.find_all("meta")
        result['video_id'] = soup.find("meta", itemprop="videoId")['content']
        result['channel_id'] = soup.find("meta", itemprop="channelId")['content']
        result['title'] = soup.find("meta", property="og:title")['content']
        result['image'] = soup.find("meta", property="og:image")['content']
        result['genre'] = soup.find("meta", itemprop="genre")['content']
        result['published_date'] = soup.find("meta", itemprop="datePublished")['content']
        view_count = soup.find("meta", itemprop="interactionCount")
        if view_count is not None:
            result['views'] = view_count['content']
        else:
            result['views'] = 0  # no view-count tag on the page
        result['duration'] = soup.find("meta", itemprop="duration")['content']
        for tag in meta:
            # also keep the page's description and keywords meta tags
            if 'name' in tag.attrs and tag.attrs['name'].strip().lower() in ['description', 'keywords']:
                result[tag.attrs['name']] = tag.attrs['content']
        video_results[video_id] = result
    except Exception:  # e.g. find() returned None, or a tag has no 'content'
        continue
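final_results = pd.DataFrame(video_results).T
# 'likes' and 'ads_yn' are never collected by the loop above, so selecting them
# directly raises a KeyError; reindex() adds them as empty (NaN) columns instead.
columns = ['video_id','channel_id','title','image','published_date','views','likes','duration','genre','ads_yn']
insert_df = final_results.reindex(columns=columns)
insert_df = insert_df.rename(columns={'image': 'video_thumbnails_url', 'published_date': 'publishDate'})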
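
The loop above only prints the response when the server answers with status 429 and then carries on, so that video ends up skipped. One possible alternative is to retry the request after a growing delay. The following is a minimal sketch, not part of the original post: `get_with_backoff`, `max_retries`, and `base_delay` are hypothetical names, and `fetch` is assumed to be any callable that returns an object with a `status_code` attribute.

```python
import time


def get_with_backoff(fetch, url, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url); while it returns HTTP 429, wait and try again.

    The wait doubles on each retry (base_delay, 2 * base_delay, ...).
    The last response is returned even if it is still a 429.
    """
    response = fetch(url)
    for attempt in range(max_retries):
        if response.status_code != 429:
            break
        sleep(base_delay * 2 ** attempt)  # hypothetical backoff schedule
        response = fetch(url)
    return response
```

Inside the loop it could be used as `response = get_with_backoff(lambda u: session.get(u, headers=headers), video_url)`. The `sleep` parameter exists only so that the delay can be replaced in tests.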

License: Creative Commons Attribution-NonCommercial
