✂Text Splitter

Split CSV data, log files, lists, and other text on a delimiter of your choice. Filter and sort the split items, remove duplicates, convert them to a JSON array, or rejoin them with a different delimiter. All processing runs in your browser; no data is sent to a server.

How to Use

1. Enter text. Paste or type the text you want to split, or use the example buttons to load sample data.
2. Choose delimiter. Select newline, comma, semicolon, space, tab, or pipe, or enter a custom delimiter.
3. Configure options. Toggle trim whitespace, remove empty items, and deduplicate as needed.
4. Use the results. Browse the list of split items, search to filter, and sort. Copy the result as a joined string or as a JSON array.
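
If you want to reproduce the same steps in a script, the sketch below shows one way to do it in Python. It is not the tool's actual code: the names `split_text` and `filter_items` and their parameters are made up for illustration, and the order in which the options are applied (trim, then remove empty items, then deduplicate) is an assumption.

```python
import json


def split_text(text, delimiter="\n", trim=True, remove_empty=True, dedupe=False):
    """Split text on a delimiter and apply the optional clean-up steps.

    Hypothetical helper: option names mirror the toggles described above,
    but the processing order is an assumption.
    """
    items = text.split(delimiter)  # raises ValueError if delimiter is empty
    if trim:
        items = [item.strip() for item in items]
    if remove_empty:
        items = [item for item in items if item]
    if dedupe:
        # dict.fromkeys keeps the first occurrence and preserves order.
        items = list(dict.fromkeys(items))
    return items


def filter_items(items, keyword):
    """Keep items that contain the keyword, ignoring case (hypothetical helper)."""
    needle = keyword.lower()
    return [item for item in items if needle in item.lower()]


items = split_text("banana, apple, ,cherry, apple", delimiter=",", dedupe=True)
print(items)                      # ['banana', 'apple', 'cherry']
print(sorted(items))              # ['apple', 'banana', 'cherry']
print(filter_items(items, "an"))  # ['banana']
print(" | ".join(items))          # banana | apple | cherry
print(json.dumps(items))          # ["banana", "apple", "cherry"]
```

The last two lines correspond to the two copy formats: a string rejoined with a different delimiter, and a JSON array.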

Tips

💡Split a CSV row by comma to quickly inspect individual column values.
💡Enable deduplicate to get a list of unique values in one click.
💡Copy the JSON array output and paste it directly into your code.
💡Use the filter search to find items containing a specific keyword.
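
One thing to watch for with the CSV tip: a plain split on commas also splits inside quoted fields. Whether this tool treats quotes specially is not stated here, so the sketch below only illustrates the general difference in Python, using a made-up sample row and the standard `csv` module for comparison.

```python
import csv

# Made-up sample row: the second field contains a comma inside quotes.
line = '42,"Doe, Jane",active'

naive = line.split(",")           # plain delimiter split
parsed = next(csv.reader([line])) # CSV-aware parsing of the same row

print(naive)   # ['42', '"Doe', ' Jane"', 'active']
print(parsed)  # ['42', 'Doe, Jane', 'active']
```

For rows without quoted fields, both approaches give the same columns.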

FAQ

Q. Why is text splitting important for LLMs? A. Large Language Models (LLMs) have a fixed context window, which is the maximum number of tokens they can process at once. For RAG (Retrieval-Augmented Generation) systems, long documents must be split into chunks that fit within this limit.
Q. How do I choose the right chunk size? A. Typical chunk sizes range from 256 to 1024 tokens. Smaller chunks give more precise retrieval but can lose surrounding context; larger chunks preserve context but may include irrelevant content. Experiment based on your document type and LLM.
Q. What is chunk overlap and why does it matter? A. Overlap is the number of characters or tokens shared between consecutive chunks. It keeps information that falls at a chunk boundary from being cut off, which improves the quality of retrieved context in RAG pipelines.
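
To make chunk size and overlap concrete, here is a minimal Python sketch of fixed-size chunking with overlap. It is only an illustration: `chunk_text` is a hypothetical helper, and it counts characters rather than tokens, since counting tokens would require the tokenizer of your specific model.

```python
def chunk_text(text, chunk_size, overlap):
    """Split text into chunks of chunk_size characters.

    Consecutive chunks share `overlap` characters. Hypothetical helper;
    sizes are in characters, not tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
print(chunk_text("abcdefghij", chunk_size=4, overlap=0))
# ['abcd', 'efgh', 'ij']
```

With an overlap of 2, the last two characters of each chunk reappear at the start of the next one, so text near a boundary shows up in both chunks.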