-
Web Scraping:
Web scraping is the process of extracting data from websites by parsing the HTML content of web pages.
-
Requests:
Requests is a Python library for sending HTTP requests and retrieving the response, such as a web page's HTML or other data, from a URL.
-
BeautifulSoup:
BeautifulSoup is a Python library for parsing HTML or XML documents into a tree of elements that can be searched to extract data in a structured way.
-
Web Scraping Process:
Send an HTTP request to a website, parse the HTML content, and extract the required data.
-
Requests Process:
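Use requests.get(url) to send an HTTP GET request. It returns a Response object whose status_code attribute holds the HTTP status code and whose text attribute holds the web page's HTML or other body content.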
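
The request step can be wrapped in a small helper that also handles failures. This is only a sketch: the function name fetch_html and the 10-second timeout are assumptions made for illustration, not part of the original notes.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page body as text, or None if the request fails.

    fetch_html and the default timeout are illustrative choices.
    """
    try:
        response = requests.get(url, timeout=timeout)
        # Raise requests.HTTPError for 4xx/5xx status codes.
        response.raise_for_status()
    except requests.RequestException as error:
        # Covers connection errors, timeouts, invalid URLs, and HTTP errors.
        print(f"Request failed: {error}")
        return None
    return response.text
```

Calling fetch_html('https://quotes.toscrape.com/') would return the page's HTML as a string on success. Without a timeout, requests.get() can wait indefinitely on an unresponsive server, which is why one is passed here.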
-
BeautifulSoup Process:
Parse the HTML content using BeautifulSoup, then use find() to get the first matching element or find_all() to get a list of every matching element.
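
The difference between find() and find_all() is easiest to see on a small inline document. The HTML string below is made up for illustration, so the sketch runs without a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for this illustration.
html = """
<div class="quote">
  <span class="text">First quote</span>
  <small class="author">Author A</small>
</div>
<div class="quote">
  <span class="text">Second quote</span>
  <small class="author">Author B</small>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag, or None if nothing matches.
first_quote = soup.find("span", class_="text")
missing = soup.find("span", class_="does-not-exist")

# find_all() returns a list-like collection of every matching tag.
all_quotes = soup.find_all("span", class_="text")

first_text = first_quote.get_text()
all_texts = [tag.get_text() for tag in all_quotes]

print(first_text)
print(all_texts)
print(missing)
```

Because find() returns None when nothing matches, it is worth checking the result before reading attributes such as .text from it.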
-
Example:
import sys

import requests
from bs4 import BeautifulSoup

url = 'https://quotes.toscrape.com/'
response = requests.get(url, timeout=10)

# Stop early if the server did not return 200 OK.
if response.status_code == 200:
    print("Request Successful!")
else:
    print("Failed to retrieve the page")
    sys.exit(1)

# Parse the HTML and collect the quote and author elements.
soup = BeautifulSoup(response.text, 'html.parser')
quotes = soup.find_all('span', class_='text')
authors = soup.find_all('small', class_='author')

# Print each quote together with its author.
for quote, author in zip(quotes, authors):
    print(f"Quote: {quote.text}")
    print(f"Author: {author.text}")
    print("-" * 50)
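
Pairing two separate lists with zip() works only when every quote has exactly one author element. A possible alternative, sketched below, is to search inside each quote's container so the text and author stay together. The function name extract_quotes and the div with class "quote" as the container are assumptions for illustration; the sample HTML is made up so the sketch runs offline.

```python
from bs4 import BeautifulSoup


def extract_quotes(html):
    """Return a list of {'text', 'author'} dicts, one per quote container.

    Assumes each quote sits in a <div class="quote"> element.
    """
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for block in soup.find_all("div", class_="quote"):
        text_tag = block.find("span", class_="text")
        author_tag = block.find("small", class_="author")
        if text_tag is None:
            # Skip containers that hold no quote text.
            continue
        results.append({
            "text": text_tag.get_text(strip=True),
            # Keep the quote even when its author is missing.
            "author": author_tag.get_text(strip=True) if author_tag else None,
        })
    return results


# Hypothetical sample markup; the second quote has no author element.
sample_html = """
<div class="quote">
  <span class="text">First quote</span>
  <small class="author">Author A</small>
</div>
<div class="quote">
  <span class="text">Second quote</span>
</div>
"""

for item in extract_quotes(sample_html):
    print(f"Quote: {item['text']}")
    print(f"Author: {item['author']}")
```

With the zip() approach, a quote without an author would shift every later author onto the wrong quote; searching within each container avoids that.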