What Is CSV?
CSV stands for Comma-Separated Values. It is one of the oldest and most widely used data interchange formats in computing. A CSV file stores tabular data — rows and columns — as plain text, with each row on its own line and each field within a row separated by a delimiter (usually a comma). The first row typically contains column headers.
A simple CSV file looks like this:
name,age,city
Alice,30,New York
Bob,25,Los Angeles
Charlie,35,"San Francisco"
CSV files can be opened by virtually any spreadsheet application (Excel, Google Sheets, LibreOffice Calc), database tool, or programming language. Their simplicity and universality make them the go-to format for data exchange between systems that do not share a common API.
The CSV Format Specification (RFC 4180)
Although CSV has existed since the early days of computing, it was not formally standardized until 2005 when RFC 4180 was published. The specification defines the following rules:
- Each record is on a separate line, delimited by a line break (CRLF).
- The last record in a file may or may not have a trailing line break.
- An optional header row may appear as the first line with the same format as normal record lines.
- Fields are separated by commas. Spaces adjacent to commas are part of the field and should not be ignored.
- Fields that contain commas, double quotes, or line breaks must be enclosed in double quotes.
- A double quote inside a quoted field is escaped by preceding it with another double quote (e.g., "").
Despite the specification, many CSV implementations in the wild deviate from RFC 4180 — using different delimiters, different quoting rules, or different line endings. This is why robust CSV parsers (including this tool) handle quoted fields, escaped quotes, and various delimiters.
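Python's standard csv module follows the RFC 4180 quoting rules described above; this minimal sketch shows a field containing commas and quotes being quoted and escaped automatically:

```python
import csv
import io

# csv.writer applies RFC 4180 quoting: fields containing the delimiter
# or double quotes are wrapped in quotes, and inner quotes are doubled.
buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\r\n")  # RFC 4180 specifies CRLF
writer.writerow(["name", "quote"])
writer.writerow(["Alice", 'She said "hi", then left'])

print(buf.getvalue())
# name,quote
# Alice,"She said ""hi"", then left"
```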
JSON vs CSV: When to Use Each
JSON and CSV are both data interchange formats, but they serve different purposes and have distinct strengths:
- Structure. CSV is inherently tabular — it represents a flat table of rows and columns. JSON supports nested and hierarchical data structures (objects within objects, arrays within arrays). If your data has a natural table shape, CSV is simpler. If your data has nested relationships, JSON is more expressive.
- Readability. For tabular data, CSV is more readable because each line corresponds to one record. JSON adds curly braces, brackets, and repeated key names that increase verbosity for flat data.
- Data types. JSON preserves types: strings, numbers, booleans, null, arrays, and objects. CSV treats everything as a string by default — the consuming application must infer or cast types.
- File size. For flat data, CSV is typically smaller because it avoids repeating key names on every record. For hierarchical data, JSON may be more compact because it avoids the column explosion that flattening creates.
- Tooling. CSV is supported by every spreadsheet application and database tool. JSON is the lingua franca of web APIs and JavaScript applications. Most ETL and data processing tools support both.
- Streaming. CSV is easy to stream line by line. JSON requires parsing the complete structure (unless using JSON Lines / NDJSON format, where each line is a separate JSON object).
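The streaming difference is easiest to see with JSON Lines: because each line is an independent JSON document, it can be consumed line by line just like CSV. A minimal sketch:

```python
import io
import json

# Each line of an NDJSON stream parses independently -- no need to
# load the whole document before processing the first record.
ndjson = '{"name": "Alice"}\n{"name": "Bob"}\n'
records = [json.loads(line) for line in io.StringIO(ndjson)]

# records -> [{'name': 'Alice'}, {'name': 'Bob'}]
```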
CSV Parsing Challenges
Parsing CSV is deceptively difficult. What seems like a trivial “split on commas” operation quickly breaks down when you encounter real-world data:
Quoted Fields
If a field contains the delimiter character (e.g., a comma inside a city name like “San Francisco, CA”), the entire field must be wrapped in double quotes. A naive split on commas would incorrectly split this field in two. Proper CSV parsers must track whether they are inside or outside a quoted region.
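The failure mode can be demonstrated in a few lines of Python, comparing a naive split against the standard csv module:

```python
import csv
import io

line = 'Alice,"San Francisco, CA",30'

naive = line.split(",")                    # breaks the quoted field apart
proper = next(csv.reader(io.StringIO(line)))  # tracks the quoting context

# naive  -> ['Alice', '"San Francisco', ' CA"', '30']
# proper -> ['Alice', 'San Francisco, CA', '30']
```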
Escaped Quotes
If a quoted field itself contains a double quote, the quote is escaped by doubling it: "". For example, the value She said "hello" would be encoded as "She said ""hello""". This tool correctly handles this escaping in both directions.
Line Breaks Inside Fields
A quoted field can span multiple lines. This means you cannot simply split a CSV file on newlines to get individual records — you must parse the quoting context first.
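A quoting-aware parser resolves this correctly: in the sketch below, the raw text spans three physical lines, but Python's csv module recognizes only two records because the embedded newline sits inside a quoted field.

```python
import csv
import io

# The second field of the data row contains a literal newline,
# so the record spans two physical lines.
data = 'name,notes\r\nAlice,"line one\nline two"\r\n'
rows = list(csv.reader(io.StringIO(data)))

# rows -> [['name', 'notes'], ['Alice', 'line one\nline two']]
```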
Encoding Issues
CSV files can be encoded in UTF-8, UTF-16, Latin-1, or other character sets. Many spreadsheet applications (notably Excel) default to the system's locale encoding, which can cause garbled text when files are shared across different operating systems. Adding a UTF-8 BOM (byte order mark) at the start of a CSV file helps Excel open it correctly with UTF-8 encoding.
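In Python, the "utf-8-sig" codec handles the BOM for you; this sketch shows the BOM being prefixed to the encoded bytes (writing a file with encoding="utf-8-sig" is equivalent):

```python
import codecs

# Prefixing the UTF-8 BOM (EF BB BF) helps Excel detect the encoding.
text = "name,city\nJosé,Zürich\n"
payload = codecs.BOM_UTF8 + text.encode("utf-8")

# Decoding with "utf-8-sig" strips the BOM and recovers the text.
roundtrip = payload.decode("utf-8-sig")
```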
Delimiter Ambiguity
While “CSV” implies commas, many files use tabs, semicolons, or pipes as delimiters. European locales that use commas as decimal separators (e.g., 3,14 for pi) often use semicolons as the CSV delimiter. This tool lets you choose from four common delimiters to handle these variations.
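When the delimiter is unknown, Python's csv.Sniffer can guess it from a sample, as in this sketch for a semicolon-delimited file:

```python
import csv
import io

sample = "name;age\nAlice;30\nBob;25\n"

# Restricting the candidate set makes the sniffer more reliable.
dialect = csv.Sniffer().sniff(sample, delimiters=",;\t|")
rows = list(csv.reader(io.StringIO(sample), dialect))

# dialect.delimiter -> ';'
# rows[1]           -> ['Alice', '30']
```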
Handling Nested JSON: Dot Notation Flattening
CSV is a flat format, but JSON often contains nested objects. This tool handles nested objects by flattening them using dot notation. For example:
{
"name": "Alice",
"address": {
"city": "New York",
"zip": "10001"
}
}
This becomes the following CSV columns:
name,address.city,address.zip
Alice,New York,10001
This approach preserves the hierarchical path information in the column headers, making it possible to reconstruct the nested structure later. Deeply nested objects (e.g., a.b.c.d) are fully supported.
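The flattening step can be sketched with a short recursive function (a hypothetical helper named flatten, not this tool's actual implementation):

```python
def flatten(obj, prefix=""):
    """Recursively flatten nested dicts into dot-notation keys."""
    out = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            out.update(flatten(value, path))  # recurse into nested objects
        else:
            out[path] = value
    return out

record = {"name": "Alice", "address": {"city": "New York", "zip": "10001"}}
flat = flatten(record)
# flat -> {'name': 'Alice', 'address.city': 'New York', 'address.zip': '10001'}
```

The flat dictionary's keys become the CSV header row, and its values become one data row.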
CSV in Spreadsheets: Excel and Google Sheets
CSV is the primary format for importing and exporting data in spreadsheet applications:
- Microsoft Excel: Supports CSV import/export. Use “Data → From Text/CSV” for more control over delimiters and encoding. Excel may auto-format values (converting gene names to dates, trimming leading zeros from ZIP codes), so review your data carefully after import.
- Google Sheets: Open CSV files directly from Google Drive or use “File → Import → Upload.” Google Sheets handles UTF-8 encoding well and auto-detects delimiters.
- LibreOffice Calc: Provides a detailed import dialog where you can choose delimiter, encoding, and column types before importing.
ETL and Data Pipelines
In data engineering, converting between JSON and CSV is a common step in ETL (Extract, Transform, Load) pipelines:
- Extract: Data is pulled from APIs (which typically return JSON) or databases.
- Transform: JSON data is flattened, filtered, aggregated, and converted to CSV for loading into data warehouses or analytical tools that expect tabular input.
- Load: CSV files are bulk-loaded into databases (PostgreSQL COPY, MySQL LOAD DATA INFILE), data warehouses (BigQuery, Redshift, Snowflake), or BI tools (Tableau, Power BI).
This tool is useful for quick, one-off conversions during development and debugging. For automated production pipelines, libraries like Python's pandas or Node.js's csv-parse provide programmatic conversion capabilities.
Working with CSV in JavaScript and Python
JavaScript / Node.js
In JavaScript, there is no built-in CSV parser. Popular libraries include csv-parse (for Node.js streams), PapaParse (browser and Node.js), and fast-csv. For simple cases, you can split on the delimiter, but robust parsing requires handling quoted fields and escaped quotes — exactly what this tool does internally.
Python
Python's standard library includes the csv module, which handles reading and writing CSV files with proper quoting. For JSON-to-CSV conversion, pandas is the most popular choice:
import pandas as pd
import json
# JSON to CSV
data = json.loads(json_string)
df = pd.json_normalize(data) # flattens nested objects
df.to_csv("output.csv", index=False)
# CSV to JSON
df = pd.read_csv("input.csv")
json_string = df.to_json(orient="records", indent=2)
The pd.json_normalize() function performs the same dot-notation flattening that this tool uses, making it easy to transition between quick browser-based conversions and scripted workflows.
Data Interchange Formats Beyond JSON and CSV
While JSON and CSV dominate, other formats exist for specialized use cases:
- TSV (Tab-Separated Values): Identical to CSV but uses tabs as delimiters. Less ambiguous than CSV because tabs rarely appear in data. This tool supports TSV via the tab delimiter option.
- Parquet: A columnar binary format used in big data ecosystems (Apache Spark, Apache Arrow). Much more efficient than CSV for large datasets but not human-readable.
- Avro: A row-based binary format with schema evolution support. Common in Apache Kafka and Hadoop pipelines.
- XML: Verbose but self-describing. Still used in enterprise integrations, SOAP APIs, and configuration files.
- YAML: Human-readable configuration format. Not typically used for tabular data but common for configuration files and infrastructure-as-code.
Frequently Asked Questions
Is my data safe?
Yes. This converter runs entirely in your browser. No data is transmitted to any server. You can verify this by opening your browser's Network tab — you will see zero requests during conversion.
What JSON format does this tool accept?
The JSON input must be an array of objects, e.g., [{"name":"Alice","age":30}, ...]. Each object becomes one row in the CSV output. Nested objects are flattened using dot notation. Arrays within objects are converted to strings.
How does it handle commas inside field values?
Fields containing the delimiter character, double quotes, or newlines are automatically wrapped in double quotes per RFC 4180. Double quotes within fields are escaped by doubling them.
Can I use a tab or semicolon as the delimiter?
Yes. Select your preferred delimiter (comma, tab, semicolon, or pipe) from the dropdown before converting. The same delimiter is used for both JSON-to-CSV and CSV-to-JSON operations.
What happens with deeply nested JSON objects?
Nested objects are recursively flattened using dot notation. For example, {"a":{"b":{"c":1}}} becomes a column named a.b.c with value 1. There is no depth limit.
What is the maximum input size?
Since processing happens in your browser, the limit depends on available memory. In practice, inputs up to several megabytes convert without issues. For very large datasets (hundreds of megabytes), consider using a scripted solution with Python or Node.js.
Does CSV-to-JSON preserve data types?
No. CSV does not encode type information, so all values in the JSON output are strings. If you need typed values (numbers, booleans), you will need to post-process the JSON output or use a typed format like Parquet.
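One way to post-process is a best-effort cast of each string value; this sketch (a hypothetical helper named coerce, shown for illustration) tries bool, int, and float before falling back to the original string:

```python
def coerce(value):
    """Best-effort cast of a CSV string to bool, int, or float."""
    lowered = value.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value  # leave non-numeric strings untouched

row = {"name": "Alice", "age": "30", "active": "true"}
typed = {key: coerce(value) for key, value in row.items()}
# typed -> {'name': 'Alice', 'age': 30, 'active': True}
```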