Type: lib.beautifulsoup.WebsiteContentExtractor

Namespace: lib.beautifulsoup

Description

Extracts the main content from a website's HTML, removing navigation, ads, and other non-essential elements.

Keywords: scrape, web scraping, content extraction, text analysis

Use cases:
- Clean web content for further analysis
- Extract article text from news websites
- Prepare web content for summarization
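
The kind of extraction described above can be sketched with Beautiful Soup roughly as follows. This is a minimal illustration, not the node's actual implementation: the function name `extract_main_content`, the list of tags treated as non-essential, and the preference for `<article>` or `<main>` containers are all assumptions made for the example.

```python
from bs4 import BeautifulSoup

# Tags treated as non-essential. This list is an assumption for illustration;
# the node itself may use different or additional heuristics.
NON_CONTENT_TAGS = ["script", "style", "nav", "header", "footer", "aside", "form", "iframe"]


def extract_main_content(html_content: str) -> str:
    """Return the visible text of the likely main content area (hypothetical helper)."""
    soup = BeautifulSoup(html_content, "html.parser")

    # Remove matching tags one at a time. Searching again after each removal
    # avoids touching a nested tag that was already destroyed with its parent.
    while True:
        tag = soup.find(NON_CONTENT_TAGS)
        if tag is None:
            break
        tag.decompose()

    # Prefer a semantic container when present, else fall back to <body>
    # or the whole document.
    main = soup.find("article") or soup.find("main") or soup.body or soup

    # Join the remaining text nodes, one per line, with whitespace trimmed.
    return main.get_text(separator="\n", strip=True)


if __name__ == "__main__":
    sample = """
    <html><body>
      <nav><a href="/">Home</a></nav>
      <article><h1>Title</h1><p>Body text.</p></article>
      <footer>Copyright</footer>
    </body></html>
    """
    print(extract_main_content(sample))
```

A fixed tag list like this is a simple heuristic. Pages that mark up ads or navigation with generic `<div>` elements would need additional rules, such as filtering by class or id.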

Properties

| Property | Type | Description | Default |
| --- | --- | --- | --- |
| html_content | str | The raw HTML content of the website. | `""` |

Outputs

| Output | Type | Description |
| --- | --- | --- |
| output | str | The extracted main content as text. |

Metadata

Browse other nodes in the lib.beautifulsoup namespace.