
Web Crawling Using Python BeautifulSoup

How can I extract the text inside paragraph tags (<p>) and list-item tags (<li>) that sit under a <div> with a named class?

Solution 1:

Use find() to locate the enclosing <div> by its class, then call find_all() on that element to collect the <p> and <li> tags inside it:

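    import requests
    from bs4 import BeautifulSoup

    url = '...'

    r = requests.get(url)
    r.raise_for_status()  # stop early on an HTTP error (4xx/5xx)
    soup = BeautifulSoup(r.text, 'html.parser')

    # find() returns only the first matching <div>, or None if there is none
    div = soup.find('div', class_='class-name')
    if div is None:
        raise SystemExit("no <div class='class-name'> found on the page")

    # find_all() searches all descendants of that <div>
    ps = div.find_all('p')
    lis = div.find_all('li')

    # print the content of all <p> tags
    for p in ps:
        print(p.text)

    # print the content of all <li> tags
    for li in lis:
        print(li.text)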
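
The same lookup can also be written with CSS selectors through select(), which may be handy when more than one <div> carries the class. The following is only a sketch under stated assumptions: the inline HTML string, the helper name extract_texts, and the class name 'class-name' are hypothetical stand-ins for the real page, used so the example runs without a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page (assumption).
html = """
<div class="class-name">
  <p>First paragraph</p>
  <ul>
    <li>Item one</li>
    <li>Item two</li>
  </ul>
  <p>Second paragraph</p>
</div>
<div class="other">
  <p>Ignored paragraph</p>
</div>
"""


def extract_texts(markup, class_name):
    """Return (paragraph texts, list-item texts) found under div.<class_name>.

    Hypothetical helper for illustration; it is not part of BeautifulSoup.
    """
    soup = BeautifulSoup(markup, 'html.parser')
    # 'div.<class> p' matches every <p> nested anywhere inside a matching <div>
    paragraphs = [p.get_text(strip=True)
                  for p in soup.select(f'div.{class_name} p')]
    # same pattern for <li>; tags outside a matching <div> are skipped
    items = [li.get_text(strip=True)
             for li in soup.select(f'div.{class_name} li')]
    return paragraphs, items


paragraphs, items = extract_texts(html, 'class-name')
print(paragraphs)
print(items)
```

Unlike find(), which stops at the first matching <div>, select() gathers matches from every <div> with that class, and it returns an empty list instead of None when nothing matches.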
    
