📘 Lesson  ·  Lesson 98

Web Scraping (BeautifulSoup)

About this Project

💡 At a Glance

BeautifulSoup parses a web page's HTML into a searchable tree, so you can extract data such as titles, links, and text.

The Program

Python
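import requests
from bs4 import BeautifulSoup

url = "https://example.com"

# Fetch the page; the timeout stops the script from hanging on a slow server
response = requests.get(url, timeout=10)
response.raise_for_status()  # raise an error on 4xx/5xx responses

# Parse the HTML with Python's built-in parser
soup = BeautifulSoup(response.text, "html.parser")

# soup.title is the page's <title> element
print("Title:", soup.title.get_text())

# all links on the page
for link in soup.find_all("a"):
    print(link.get("href"))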

Output
Title: Example Domain
https://www.iana.org/domains/example
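
The href values printed by the program can be relative paths, and link.get("href") returns None for an <a> tag that has no href attribute. The sketch below shows one possible way to tidy such values with the standard library's urljoin. The helper name absolute_links and the sample values are assumptions made for illustration; they are not part of the lesson's program.

```python
from urllib.parse import urljoin


def absolute_links(base_url, hrefs):
    """Resolve raw href values against base_url, skipping missing ones."""
    resolved = []
    for href in hrefs:
        if not href:  # None or empty string: the tag had no usable href
            continue
        resolved.append(urljoin(base_url, href))
    return resolved


# Hypothetical href values, shaped like what link.get("href") might return
sample = ["/about", "contact.html", None, "https://www.iana.org/domains/example"]

for link in absolute_links("https://example.com/docs/", sample):
    print(link)
```

In a real scraper, the list passed in would be built from link.get("href") for each tag returned by soup.find_all("a").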

Scrape Responsibly

⚠️ Note

Always check a site's robots.txt and terms of service before scraping. Do not overload servers: pause between requests and fetch only the pages you need.
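
As a rough illustration of the note above, the standard library's urllib.robotparser can check whether a URL may be fetched, and time.sleep can space out requests. This is a minimal sketch under stated assumptions: the robots.txt rules, the "my-scraper" user agent, the URLs, and the one-second delay are all hypothetical, and the rules are inlined so the example runs without a network connection.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules. For a real site, load the live file instead:
#   parser.set_url("https://example.com/robots.txt"); parser.read()
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

DELAY_SECONDS = 1.0  # assumed polite pause between requests


def allowed(url, user_agent="my-scraper"):
    """Return True if the parsed rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


urls = [
    "https://example.com/index.html",
    "https://example.com/private/data.html",
]

for url in urls:
    if allowed(url):
        print("OK to fetch:", url)
        time.sleep(DELAY_SECONDS)  # wait before the next request
    else:
        print("Skipping (disallowed):", url)
```

robots.txt covers only crawler rules; a site's terms still need to be read separately.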

Summary

  • requests fetches the HTML; BeautifulSoup parses it.
  • Use soup.title to read the page's <title> element and soup.find_all() to collect every tag that matches a name, such as "a" for links.
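
The two calls in the summary can be tried without any network request by parsing a small HTML string. The snippet below is a sketch only: the HTML is made up for illustration, and find() and the href=True filter are further BeautifulSoup features that the lesson's program does not use.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, inlined so the example runs offline
html = """
<html>
  <head><title>Sample Page</title></head>
  <body>
    <h1>Welcome</h1>
    <a href="/first">First</a>
    <a href="/second">Second</a>
    <a>No href here</a>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# soup.title is the <title> element; get_text() returns the text inside it
title = soup.title.get_text()

# href=True keeps only the <a> tags that actually have an href attribute
hrefs = [a["href"] for a in soup.find_all("a", href=True)]

# find() returns the first matching tag rather than a list
heading = soup.find("h1").get_text(strip=True)

print(title)    # Sample Page
print(hrefs)    # ['/first', '/second']
print(heading)  # Welcome
```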
