Split Markdown

Type: nodetool.document.SplitMarkdown

Namespace: nodetool.document

Description

Splits markdown text by headers while preserving the header hierarchy in metadata.

Tags: markdown, split, headers

Use cases:
- Splitting markdown documentation while preserving structure
- Processing markdown files for semantic search
- Creating context-aware chunks from markdown content
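
The header-based splitting described above can be illustrated with a small standalone sketch. This is not the node's implementation. The function name `split_by_headers` and the returned dictionary shape are assumptions made for illustration. Only the parameter names `headers_to_split_on` and `strip_headers` come from the properties table. The sketch also ignores fenced code blocks, which a real splitter would need to handle.

```python
from typing import Dict, List, Tuple


def split_by_headers(
    text: str,
    headers_to_split_on: List[Tuple[str, str]],
    strip_headers: bool = True,
) -> List[Dict]:
    """Hypothetical helper: split markdown into sections keyed by header hierarchy."""
    # Check longer symbols first so "###" is never mistaken for "#".
    headers = sorted(headers_to_split_on, key=lambda h: len(h[0]), reverse=True)
    sections: List[Dict] = []
    current_lines: List[str] = []
    metadata: Dict[str, str] = {}  # currently active header hierarchy

    def flush() -> None:
        # Emit the accumulated lines as one section, skipping empty ones.
        content = "\n".join(current_lines).strip()
        if content:
            sections.append({"text": content, "metadata": dict(metadata)})
        current_lines.clear()

    for line in text.splitlines():
        stripped = line.strip()
        match = next(
            ((sym, name) for sym, name in headers if stripped.startswith(sym + " ")),
            None,
        )
        if match is None:
            current_lines.append(line)
            continue

        flush()
        symbol, name = match
        # A new header invalidates any header at the same or a deeper level.
        for sym, other_name in headers:
            if len(sym) >= len(symbol):
                metadata.pop(other_name, None)
        metadata[name] = stripped[len(symbol):].strip()
        if not strip_headers:
            current_lines.append(line)

    flush()
    return sections


doc = "# Guide\nIntro\n## Setup\nInstall it\n## Usage\nRun it"
sections = split_by_headers(
    doc, [("#", "Header 1"), ("##", "Header 2"), ("###", "Header 3")]
)
for section in sections:
    print(section["metadata"], "->", section["text"])
```

Each section carries the titles of all headers it is nested under, which is what makes the resulting chunks context-aware for search or retrieval.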

Properties

Property	Type	Description	Default
document	`document`	Markdown document to split	`{'type': 'document', 'uri': '', 'asset_id': None, 'data': None}`
headers_to_split_on	`List[Tuple[str, str]]`	List of tuples containing (header_symbol, header_name)	`[['#', 'Header 1'], ['##', 'Header 2'], ['###', 'Header 3']]`
strip_headers	`bool`	Whether to remove headers from the output content	`True`
return_each_line	`bool`	Whether to split into individual lines instead of header sections	`False`
chunk_size	`Optional[int]`	Optional maximum chunk size for further splitting	-
chunk_overlap	`int`	Overlap size when using chunk_size	`30`
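
To make the interaction between `chunk_size` and `chunk_overlap` concrete, here is a minimal character-based sketch. The helper name `chunk_with_overlap` is hypothetical, and measuring size in characters is an assumption. The node's actual splitter may measure size differently or prefer to break on separators such as newlines.

```python
from typing import List


def chunk_with_overlap(text: str, chunk_size: int, chunk_overlap: int = 30) -> List[str]:
    """Hypothetical helper: fixed-size windows where neighbours share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


print(chunk_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2))
```

The overlap repeats the tail of one chunk at the head of the next, so a sentence that straddles a boundary is still seen whole in at least one chunk when it is shorter than the overlap.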

Outputs

Output	Type	Description
text	`str`
source_id	`str`
start_index	`int`

Metadata

Browse other nodes in the nodetool.document namespace.
