Normalize pipe-, tab- and custom-delimited text before JSON or SQL workflows
Not every handoff arrives as RFC-compliant CSV. Mainframe extracts use pipes, European dumps use semicolons and ad hoc logs use tabs without headers. This converter splits and rejoins plain text with the delimiter you specify, entirely in the browser, so ragged files are never uploaded to a third-party service. Use it when a supplier swears the file is “CSV” but Excel only opens it after you pick the right separator, or when you need consistent commas and standard quoting before handing the file to the CSV ⇄ JSON Converter or a warehouse loader.
Delimiter workflow
- Paste a sample including the header row when one exists.
- Select input delimiter and desired output delimiter or line ending.
- Scan row counts and column counts for shifted data.
- Convert to JSON or SQL once columns align.
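The split-and-rejoin step in the workflow above can be sketched in a few lines of Python. This is a minimal illustration, not the converter's actual implementation: the function name `convert_delimiter` and the sample data are assumptions made for the example, and Python's standard `csv` module stands in for whatever parser the tool uses.

```python
import csv
import io

def convert_delimiter(text: str, in_delim: str, out_delim: str = ",") -> str:
    """Re-emit delimited text with a different separator (hypothetical helper).

    The csv writer quotes any field that contains the output delimiter,
    so a comma inside a pipe-delimited field survives the conversion.
    """
    # newline="" keeps line endings untouched so the csv module can handle them.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=in_delim)
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter=out_delim, lineterminator="\n")
    for row in reader:
        writer.writerow(row)
    return out.getvalue()

# Pipe-delimited sample with a header row; the note field contains a comma.
sample = "id|name|note\n1|Ada|likes a, b\n2|Bob|\n"
print(convert_delimiter(sample, "|"))
```

The field `likes a, b` comes out quoted in the result, which is the kind of standard quoting that downstream CSV consumers generally expect.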
Signs the delimiter is wrong
A single column that contains the entire row usually means the separator does not match the file. Sudden spikes in column count often indicate an unescaped quote or an unquoted delimiter inside a text field. Blank rows at the end of spreadsheet exports are harmless but can confuse row counters, so trim them before you share samples. Large files are processed synchronously in the browser tab and may make it stutter; work on head and tail excerpts during meetings.
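One way to check for these signs outside the tool is to tally how many columns each row has under a given delimiter. The sketch below assumes Python's standard `csv` module; the helper names `column_counts` and `suspicious_rows` and the sample text are hypothetical.

```python
import csv
import io
from collections import Counter

def parse_rows(text: str, delimiter: str) -> list[list[str]]:
    """Parse delimited text and drop blank rows (the csv reader yields [] for them)."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    return [row for row in reader if row]

def column_counts(text: str, delimiter: str) -> Counter:
    """Map each column count to the number of rows that have it."""
    return Counter(len(row) for row in parse_rows(text, delimiter))

def suspicious_rows(text: str, delimiter: str) -> list[tuple[int, int]]:
    """Return (row number, column count) for rows that differ from the most common width."""
    rows = parse_rows(text, delimiter)
    if not rows:
        return []
    expected = Counter(len(r) for r in rows).most_common(1)[0][0]
    return [(i, len(r)) for i, r in enumerate(rows, start=1) if len(r) != expected]

# Semicolon-delimited sample with one over-wide row and a trailing blank line.
sample = "a;b;c\n1;2;3\n4;5;6;7\n\n"
print(column_counts(sample, ","))    # every row collapses to one column: wrong delimiter
print(column_counts(sample, ";"))    # mostly 3 columns, one row with 4
print(suspicious_rows(sample, ";"))  # the row that needs a closer look
```

A tally dominated by a count of 1 suggests the wrong separator, while a few outliers point to quoting problems in specific rows.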
Next steps in the data path
After normalization, move structured work to the CSV ⇄ JSON Converter and generate load scripts with the CSV → SQL Import Helper. Clean duplicate keys or repeated log lines with the Duplicate Line Remover when extracts were concatenated from multiple runs. Validate JSON syntax in the JSON Validator before you diff or generate schemas downstream.
Supplier communication
Document delimiter, encoding, header presence and null sentinels in your data contract. Ask for a ten-row sample in tickets instead of full files when PII is a concern: processing stays local, but screen shares still leak. Keep before and after snippets in the ticket so the next engineer can reproduce the fix without guessing which menu option you chose.
Delimiter detection and quoting
European exports often use semicolons because the comma serves as the decimal separator in many European locales. If columns shift, open the file in a text editor and confirm the delimiter before converting. Quoted fields with embedded newlines are valid CSV but confuse simple splitters. This tool handles common cases, yet extreme files may still need a server-side ETL job. Document delimiter and encoding choices in your data contract README.
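The difference between a simple splitter and a quote-aware parser is easy to see on a small sample. The sketch below is illustrative only and assumes Python's standard `csv` module. `guess_delimiter` is a deliberately crude, hypothetical heuristic that counts candidate separators in the first line, and it is no substitute for opening the file in a text editor.

```python
import csv
import io

CANDIDATES = [",", ";", "|", "\t"]  # assumed candidate separators

def guess_delimiter(text: str) -> str:
    """Crude heuristic: the candidate that occurs most often in the first line."""
    first_line = text.splitlines()[0] if text else ""
    return max(CANDIDATES, key=first_line.count)

def naive_rows(text: str, delim: str) -> list[list[str]]:
    """Simple splitter: breaks on every newline, even inside quotes."""
    return [line.split(delim) for line in text.splitlines()]

def quote_aware_rows(text: str, delim: str) -> list[list[str]]:
    """csv.reader keeps a quoted field together across an embedded newline."""
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delim))

# Semicolon-delimited sample whose second row has a quoted two-line comment.
text = 'id;comment\n1;"line one\nline two"\n2;plain\n'
delim = guess_delimiter(text)
print(repr(delim))                         # ';'
print(len(naive_rows(text, delim)))        # 4: the quoted field is split in two
print(len(quote_aware_rows(text, delim)))  # 3: header plus two records
```

If the two row counts disagree on a real file, it likely contains embedded newlines and needs a quote-aware parser rather than a line-by-line split.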
Line endings and trailing newlines
Delimiter problems get the attention, but line endings cause just as many silent failures. Windows exports use CRLF while Unix tools expect LF. A stray carriage return can attach an invisible character to the last column of every row, enough to break a join key or a numeric cast downstream. Mixed endings in a concatenated file split rows inconsistently. A missing final newline trips some loaders while a double one creates a phantom empty row. Normalize line endings as part of this step and confirm the row count matches the source before you hand the file to a warehouse loader.
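A minimal sketch of line-ending normalization follows. The function name `normalize_newlines` is hypothetical and the code is not the tool's implementation. It works on raw text, so it would also rewrite carriage returns inside quoted fields, and it counts lines rather than parsed records.

```python
def normalize_newlines(text: str) -> str:
    """Convert CRLF and lone CR to LF and end the text with exactly one newline."""
    # Replace CRLF first so "\r\n" does not turn into two newlines.
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    # Drop trailing blank lines that would become phantom empty rows.
    body = unified.rstrip("\n")
    return body + "\n" if body else ""

# Mixed endings plus a doubled final newline, as in a concatenated export.
raw = "id,name\r\n1,Ada\r\n2,Bob\n\n"
clean = normalize_newlines(raw)

# A stray carriage return is enough to break an equality join on the last column.
print("Ada\r" == "Ada")                # False
print(repr(clean))                     # 'id,name\n1,Ada\n2,Bob\n'
print(len(clean.splitlines()))         # 3 lines: header plus two rows
```

Comparing the line count after normalization with the row count reported by the source system is a quick check before handing the file to a loader.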