Scraping Video Information from YouTube

Web scraping is a way to extract information from the internet in an automated fashion. YouTube is a huge source of data, hosting a vast number of videos along with related information such as views and comments. In this blog we will learn how to use web scraping in Python to extract video information from a YouTube search. For each video in the search results, we will extract its title and its number of views.

To get started, we first need to install two libraries. The first is “requests”, which fetches the response for a YouTube search, and the other is “Beautiful Soup”, which parses that response as HTML.

pip install requests
pip install -U bs4

Now that we have installed the required libraries, let’s get started.

Import the libraries

from bs4 import BeautifulSoup as bs
import requests

Whenever you search on YouTube, it takes a base search URL and appends your search query to complete it. Let’s say we search for “theailearner” on YouTube. The base search URL and query can be defined as follows.

base_url = 'https://www.youtube.com/results?search_query='
search_string = 'theailearner'
URL = base_url + search_string
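
A single word like the one above can be appended as-is, but a query containing spaces or punctuation should be URL-encoded first. The following is a minimal sketch of that step, assuming a helper called build_search_url (a hypothetical name, not part of the original code) and using quote_plus from Python's standard library.

```python
from urllib.parse import quote_plus

BASE_URL = 'https://www.youtube.com/results?search_query='


def build_search_url(search_string):
    """Return the search URL with the query URL-encoded.

    build_search_url is a hypothetical helper introduced for illustration.
    """
    # quote_plus turns spaces into '+' and escapes characters such as '&'.
    return BASE_URL + quote_plus(search_string)


print(build_search_url('the ai learner'))
```

For a plain query such as 'theailearner' the result is the same as simple concatenation.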

Now we will fetch the page at this URL using the “requests” library.

response = requests.get(URL)
page = response.text
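
In practice a request can hang or come back with an error page, so it may help to set a timeout and check the status before parsing. The sketch below is one way to do that, assuming a helper called fetch_page with session and timeout parameters (all hypothetical names chosen for this example); it relies only on requests.get, its timeout argument and Response.raise_for_status.

```python
import requests


def fetch_page(url, session=None, timeout=10):
    """Fetch url and return the response body as text.

    fetch_page and its parameters are hypothetical names for this sketch.
    """
    # Any object with a requests-style get() can be passed as session.
    http = session if session is not None else requests
    response = http.get(url, timeout=timeout)
    # Raise an HTTPError on 4xx/5xx instead of parsing an error page.
    response.raise_for_status()
    return response.text
```

With this helper, the two lines above could be written as page = fetch_page(URL).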

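Once we have fetched the page, we parse it as HTML using Beautiful Soup and find every video listed in the search results. To pick out a particular piece of information, we select tags by a particular class in the HTML. The class used here, yt-uix-tile-link, is the one found on the video links in the results page when this post was written; if YouTube changes its markup, the selector has to be updated to match.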

soup = bs(page, 'html.parser')
vids = soup.findAll('a',attrs={'class':'yt-uix-tile-link'})
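
To see what this selection does without depending on a live page, here is a small sketch that runs the same kind of query on a hand-written snippet. The markup in SAMPLE_HTML is made up for illustration and is much simpler than a real results page. It uses find_all, the current spelling of findAll in Beautiful Soup 4, with the class_ keyword.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; a real results page is far larger.
SAMPLE_HTML = """
<div>
  <a class="yt-uix-tile-link spf-link" href="/watch?v=abc" title="First video">First video</a>
  <a class="other-link" href="/about">About</a>
  <a class="yt-uix-tile-link" href="/watch?v=def" title="Second video">Second video</a>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
# class_ matches any tag whose class list contains the given name.
links = soup.find_all('a', class_='yt-uix-tile-link')
titles = [a['title'] for a in links]
print(titles)
```

Only the two anchors carrying the yt-uix-tile-link class are returned; the "About" link is skipped.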

The soup.findAll() call above returns the matching anchor tags, but to make the data easy to read we need to run a short Python script that prints each video’s title and view count.

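for v in vids:
    # The anchor's title attribute holds the video heading.
    print(v['title'])
    # Search the tag's raw markup for the text "<count> views".
    html = str(v)
    views = ''
    try:
        # index() raises ValueError when 'views' is absent.
        indx = html.index('views') - 2
        # Walk backwards from the last digit to the preceding space.
        while indx >= 0 and html[indx] != ' ':
            views = views + html[indx]
            indx = indx - 1
        # The characters were collected in reverse order, so flip them.
        print(views[::-1])
    except ValueError:
        continue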
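
The backwards character walk can also be expressed with a regular expression. This is an alternative sketch, not the approach used above: extract_views is a hypothetical helper, and it assumes the count appears as digits with optional comma separators directly before the English word "views". Abbreviated counts such as "1.2M views" are not handled and yield None.

```python
import re

# A number such as "1,234" followed by whitespace and the word "views".
VIEWS_PATTERN = re.compile(r'(\d[\d,]*)\s+views')


def extract_views(text):
    """Return the view count found in text as an int, or None if absent.

    extract_views is a hypothetical helper introduced for illustration.
    """
    match = VIEWS_PATTERN.search(text)
    if match is None:
        return None
    # Drop the thousands separators before converting to an integer.
    return int(match.group(1).replace(',', ''))
```

Inside the loop it could be called as extract_views(str(v)), which gives a number that can be sorted or summed instead of a string.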

By now you should have a feel for how to scrape data from YouTube. We can also scrape other data from YouTube, such as the videos on a channel, the comments on a video, and its likes and dislikes.

Hope you enjoyed reading.

If you have any questions or suggestions, please feel free to ask and I will do my best to help or improve. Goodbye until next time.

3 thoughts on “Scraping Video Information from YouTube”

simon 7 Jun 2019 at 3:16 pm

so beautifully written and explained

freelancer parvez 30 Sep 2019 at 3:38 pm

The post like very much like this was very good about this ,very nice

Sophie 20 Jan 2020 at 8:45 am

The attribute value “yt-uix-tile-link” does not exist in page source. How did this happen?
