Type: lib.beautifulsoup.HTMLToText

Namespace: lib.beautifulsoup

Description

Converts HTML to plain text by removing tags and decoding entities using BeautifulSoup.

Keywords: html, text, convert

Use cases:
- Cleaning HTML content for text analysis
- Extracting readable content from web pages
- Preparing HTML data for natural language processing

Properties

| Property | Type | Description | Default |
|---|---|---|---|
| text | str | | `""` |
| preserve_linebreaks | bool | Convert block-level elements to newlines | True |

Outputs

| Output | Type | Description |
|---|---|---|
| output | str | |
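
The behavior described above can be approximated directly with BeautifulSoup. The following is a minimal sketch, not the node's actual implementation: the function name `html_to_text`, the `BLOCK_TAGS` list, and the whitespace normalization are assumptions made for illustration. Only the `text` and `preserve_linebreaks` parameters and the string output come from this page.

```python
from bs4 import BeautifulSoup

# Assumption: a plausible set of block-level tags. The node's real list is not documented here.
BLOCK_TAGS = ["p", "div", "li", "tr", "blockquote", "h1", "h2", "h3", "h4", "h5", "h6"]


def html_to_text(text: str, preserve_linebreaks: bool = True) -> str:
    """Hypothetical stand-in for the node: strip tags and decode entities."""
    # The built-in "html.parser" backend decodes entities such as &amp; while parsing.
    soup = BeautifulSoup(text, "html.parser")

    if not preserve_linebreaks:
        # Collapse everything, including element boundaries, into single spaces.
        return " ".join(soup.get_text(separator=" ").split())

    # Turn explicit line breaks into newline characters.
    for br in soup.find_all("br"):
        br.replace_with("\n")
    # Add a newline after each block-level element so blocks end up on separate lines.
    for tag in soup.find_all(BLOCK_TAGS):
        tag.insert_after("\n")

    # Trim each line and drop the empty ones left over from nested blocks.
    lines = (line.strip() for line in soup.get_text().splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = "<h1>Title</h1><p>Fish &amp; chips</p><p>Line one<br>Line two</p>"
    print(html_to_text(sample))
    print(html_to_text(sample, preserve_linebreaks=False))
```

With `preserve_linebreaks` enabled, each heading, paragraph, and `<br>` in the sample lands on its own line. With it disabled, the same content is returned as a single space-separated line.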
