Type: lib.pdfplumber.ExtractText
Namespace: lib.pdfplumber
Description
Extract text content from a PDF file.
Tags: pdf, text, extract
Use cases:
- Convert PDF documents to plain text
- Extract content for analysis
- Enable text search in PDF documents
Properties
| Property | Type | Description | Default |
|---|---|---|---|
| document | document | The PDF file to extract text from | {'type': 'document', 'uri': '', 'asset_id': None, 'data': None} |
| start_page | int | The first page to extract, using 0-based indexing | 0 |
| end_page | int | The last page to extract. Use -1 to extract through the final page | 4 |
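To make the page-range properties concrete, here is a minimal Python sketch of how `start_page` and `end_page` could map onto the underlying pdfplumber library. It is not the node's actual implementation: the helper names `select_pages` and `extract_text` are hypothetical, and the sketch assumes that `end_page` is inclusive and that an out-of-range `end_page` is clamped to the last page. Only `pdfplumber.open`, `pdf.pages`, and `page.extract_text()` are real pdfplumber calls.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select_pages(pages: Sequence[T], start_page: int = 0, end_page: int = 4) -> List[T]:
    """Return the pages from start_page through end_page (0-based).

    Assumptions (not confirmed by the node documentation):
    end_page is inclusive, -1 means "through the last page",
    and an end_page past the end of the document is clamped.
    """
    if start_page < 0:
        raise ValueError("start_page must be >= 0")
    last = len(pages) - 1 if end_page == -1 else min(end_page, len(pages) - 1)
    return list(pages[start_page:last + 1])


def extract_text(path: str, start_page: int = 0, end_page: int = 4) -> str:
    """Extract plain text from the selected pages of the PDF at `path`."""
    # Third-party dependency, imported here so select_pages stays usable
    # (and testable) without pdfplumber installed.
    import pdfplumber

    with pdfplumber.open(path) as pdf:
        pages = select_pages(pdf.pages, start_page, end_page)
        # extract_text() may return None for a page with no text layer,
        # so fall back to an empty string.
        return "\n\n".join(page.extract_text() or "" for page in pages)
```

With the defaults above (`start_page=0`, `end_page=4`), this sketch reads the first five pages; passing `end_page=-1` reads from `start_page` to the end of the document.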
Outputs
| Output | Type | Description |
|---|---|---|
| output | str | The extracted text |
Related Nodes
Browse other nodes in the lib.pdfplumber namespace.