What Is a Diff?
A diff (short for “difference”) is a comparison between two pieces of text that highlights what has changed between them. The concept originated in the Unix world with the diff command, first released in 1974 as part of AT&T’s Unix. Since then, diff has become a foundational concept in software development, powering version control systems, code review platforms, and document collaboration tools.
At its core, a diff algorithm takes two inputs, often called the “original” and “modified” versions, and produces a set of operations (insertions, deletions, and unchanged sections) that describe how to transform one into the other, ideally in as few edits as possible. This output is often called a patch or changeset.
How Diff Algorithms Work
Several algorithms exist for computing text differences, each with different trade-offs between speed, memory usage, and output quality.
Longest Common Subsequence (LCS)
The most classical approach to computing diffs is based on the Longest Common Subsequence problem. The LCS of two sequences is the longest sequence of elements that appear in both inputs in the same order (but not necessarily contiguously). By finding the LCS, you can determine which lines are shared between the two texts; everything else is either an addition or a deletion.
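The standard dynamic programming solution to LCS runs in O(n × m) time and space, where n and m are the lengths of the two inputs (for a line-based diff, the number of lines in each). This is fine for most practical use cases, but the cost grows quadratically: comparing two 10,000-line files means filling a table of 100 million cells, so very large files become slow and memory-hungry.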
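As a rough illustration of the LCS approach, the sketch below fills the dynamic programming table and then walks it to label each line as unchanged, deleted, or added. The function name lcs_diff and the (tag, line) output shape are choices made for this example, not part of any standard tool.

```python
def lcs_diff(a, b):
    """Return (tag, line) pairs: ' ' unchanged, '-' only in a, '+' only in b."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the top-left corner, emitting one operation per step.
    ops = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            ops.append((" ", a[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            ops.append(("-", a[i]))
            i += 1
        else:
            ops.append(("+", b[j]))
            j += 1
    # Whatever remains on either side is a pure deletion or a pure addition.
    ops.extend(("-", line) for line in a[i:])
    ops.extend(("+", line) for line in b[j:])
    return ops


if __name__ == "__main__":
    for tag, line in lcs_diff(["a", "b", "c"], ["a", "c", "d"]):
        print(tag + line)
```

The table has (n + 1) × (m + 1) entries, which is where the O(n × m) time and space cost comes from.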
Myers’ Diff Algorithm
In 1986, Eugene Myers published “An O(ND) Difference Algorithm and Its Variations,” which became the foundation for most modern diff tools, including Git’s diff engine. Myers’ algorithm finds the shortest edit script (the minimum number of insertions and deletions) between two sequences. Its time complexity is O(N × D), where N is the combined length of the two inputs and D is the size of the shortest edit script. The fewer the differences, the less work it does, so it is extremely fast when the two inputs are mostly similar, which is exactly the common case in version control.
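The following is a minimal sketch of the greedy forward search at the heart of Myers’ approach. It only computes D, the length of the shortest edit script, and does not recover the script itself; production implementations add backtracking and further refinements. The function name is an assumption made for this example.

```python
def myers_edit_distance(a, b):
    """Length of the shortest insert/delete script turning a into b."""
    n, m = len(a), len(b)
    # v[k] is the furthest x reached so far on diagonal k, where k = x - y.
    v = {1: 0}
    for d in range(n + m + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]          # move down: take an element from b
            else:
                x = v[k - 1] + 1      # move right: drop an element from a
            y = x - k
            # Follow the diagonal for free while the elements match.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return n + m


if __name__ == "__main__":
    print(myers_edit_distance("ABCABBA", "CBABAC"))
```

The outer loop stops as soon as some path reaches the end of both inputs, so nearly identical inputs finish after only a few rounds.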
Patience Diff
Patience diff is an alternative strategy that first identifies lines that appear exactly once in each input, keeps the longest set of those matches that occur in the same order in both texts as anchors, and then recursively diffs the sections between the anchors. This often produces more human-readable output, especially when dealing with code that has many similar or blank lines. Git supports patience diff via the --patience flag.
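The anchor-finding step can be sketched as follows. This example only selects anchors; a full patience diff would then recurse on the gaps between them and fall back to another algorithm when no unique lines remain. The function name and return shape are assumptions made for illustration.

```python
from bisect import bisect_left
from collections import Counter


def patience_anchors(a, b):
    """Return (i, j) index pairs of anchor lines, increasing in both i and j."""
    count_a, count_b = Counter(a), Counter(b)
    # Position in b of every line that occurs exactly once in b.
    pos_b = {line: j for j, line in enumerate(b) if count_b[line] == 1}
    # Lines unique in both inputs, ordered by their position in a.
    pairs = [(i, pos_b[line]) for i, line in enumerate(a)
             if count_a[line] == 1 and line in pos_b]

    # Longest increasing subsequence over the b positions.
    tails = []     # tails[k]: index into pairs ending the best run of length k + 1
    tail_js = []   # b positions for tails, kept sorted so bisect can be used
    prev = [None] * len(pairs)
    for idx, (_, j) in enumerate(pairs):
        k = bisect_left(tail_js, j)
        prev[idx] = tails[k - 1] if k > 0 else None
        if k == len(tails):
            tails.append(idx)
            tail_js.append(j)
        else:
            tails[k] = idx
            tail_js[k] = j

    # Follow the back-pointers from the end of the longest run.
    anchors = []
    idx = tails[-1] if tails else None
    while idx is not None:
        anchors.append(pairs[idx])
        idx = prev[idx]
    return anchors[::-1]


if __name__ == "__main__":
    old = ["def f():", "    pass", "", "def g():", "    pass"]
    new = ["def h():", "    pass", "", "def f():", "    pass", "", "def g():", "    pass"]
    print(patience_anchors(old, new))
```

In the usage example, the repeated "    pass" and blank lines are ignored, and only the two unique function headers shared by both versions serve as anchors.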
Diff in Git and Version Control
Git uses diff extensively: every git diff, git log -p, and git show command produces diff output. Pull request reviews on platforms like GitHub, GitLab, and Bitbucket display diffs to help reviewers understand what changed. Understanding how to read a diff is an essential skill for any developer working with version control.
Git’s diff output uses the unified diff format, which shows context lines (unchanged lines surrounding a change) along with the additions and deletions. Each group of changes, called a hunk, begins with a header like @@ -10,7 +10,8 @@. Here -10,7 means the hunk covers 7 lines of the original file starting at line 10, and +10,8 means it covers 8 lines of the modified file starting at line 10. Lines beginning with + are additions, lines beginning with - are deletions, and lines with a space prefix are unchanged context.
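To make the @@ header concrete, here is a small sketch that extracts the four numbers from it. The function name parse_hunk_header is hypothetical, and the regular expression covers only the common form of the header, including the variant where a length of 1 is omitted.

```python
import re

# Matches headers such as "@@ -10,7 +10,8 @@" and "@@ -3 +3 @@ some context".
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Return (old_start, old_length, new_start, new_length) as integers."""
    match = HUNK_RE.match(line)
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    old_start, old_len, new_start, new_len = match.groups()
    # When the length is omitted, it is taken to be 1.
    return (int(old_start), int(old_len or 1), int(new_start), int(new_len or 1))


if __name__ == "__main__":
    print(parse_hunk_header("@@ -10,7 +10,8 @@"))
```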
Unified vs. Side-by-Side Format
There are two primary ways to display a diff:
- Unified format: Interleaves additions and deletions within a single stream, with + and - prefixes. This is the default output of git diff and is compact and easy to include in emails or patches.
- Side-by-side format: Displays the original text on the left and the modified text on the right, with changes highlighted in their respective positions. This format is more visual and is preferred in GUI tools and code review platforms.
This tool uses a hybrid approach: the diff is displayed inline with line numbers from both the original and modified texts shown side by side, making it easy to trace changes back to their positions in each version.
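One way to picture an inline view with both sets of line numbers is the sketch below. It uses Python’s standard difflib module as a stand-in for the matching step, so it is an illustration of the display idea rather than this tool’s actual code. The function name numbered_rows and the row layout are assumptions.

```python
import difflib


def numbered_rows(a, b):
    """Return rows of (old_line_no, new_line_no, tag, text) for an inline view."""
    rows = []
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            for i, j in zip(range(i1, i2), range(j1, j2)):
                rows.append((i + 1, j + 1, " ", a[i]))
        else:
            # 'delete', 'insert', and 'replace' all reduce to removed lines
            # followed by added lines; one of the two ranges may be empty.
            for i in range(i1, i2):
                rows.append((i + 1, None, "-", a[i]))
            for j in range(j1, j2):
                rows.append((None, j + 1, "+", b[j]))
    return rows


if __name__ == "__main__":
    for old_no, new_no, tag, text in numbered_rows(["a", "b", "c"], ["a", "c", "d"]):
        print(f"{old_no or '':>4} {new_no or '':>4} {tag}{text}")
```

Removed lines carry only an original line number and added lines only a modified one, which is what lets a reader trace each change back to its position in either version.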
Common Use Cases for Diff
Code Review
Code review is the most common use case for diff. When a developer submits a pull request, reviewers examine the diff to understand what changed, check for bugs, ensure coding standards are followed, and provide feedback. A clear, readable diff is essential for effective code review and helps teams maintain code quality.
Document Comparison
Beyond code, diff tools are valuable for comparing any text-based documents: configuration files, legal contracts, technical specifications, and more. Writers and editors use diff to track changes between document revisions, ensuring nothing is accidentally lost or altered.
Merge Conflict Resolution
When two developers modify the same file in different ways, version control systems flag a merge conflict. Resolving the conflict requires examining the diff between both versions and the common ancestor, then manually deciding which changes to keep. Diff tools with three-way comparison (the ancestor plus both modifications) make this process much easier.
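The decision rule behind a three-way comparison can be sketched for a single region of lines. This example assumes the three versions have already been aligned into corresponding regions, which is the hard part that real merge tools handle; the function name and the "ours"/"theirs" labels are illustrative.

```python
def merge_region(base, ours, theirs):
    """Merge one aligned region; return (lines, is_conflict)."""
    if ours == theirs:
        return ours, False        # both sides agree, or neither changed
    if ours == base:
        return theirs, False      # only their side changed this region
    if theirs == base:
        return ours, False        # only our side changed this region
    # Both sides changed the region in different ways: a human must decide.
    conflict = ["<<<<<<< ours", *ours, "=======", *theirs, ">>>>>>> theirs"]
    return conflict, True


if __name__ == "__main__":
    lines, is_conflict = merge_region(["timeout = 30"], ["timeout = 60"], ["timeout = 45"])
    print(is_conflict)
    print("\n".join(lines))
```

Without the common ancestor, the first three cases cannot be told apart from a real conflict, which is why the ancestor makes resolution so much easier.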
Debugging and Troubleshooting
Comparing the output of a program before and after a change can quickly reveal what broke. Diff is also useful for comparing configuration files across environments (staging vs. production) to catch discrepancies that might cause issues.
How This Tool Works
This diff checker runs entirely in your browser. Your text never leaves your device. The algorithm uses a dynamic programming approach based on the Longest Common Subsequence to identify matching, added, and removed lines. Results are displayed with color-coded highlighting: green for additions and red for deletions. Line numbers from both the original and modified texts are shown for easy reference.
You can swap the original and modified texts with one click, clear both inputs, or copy the diff output in a unified-style format for sharing.
Frequently Asked Questions
Is this tool safe for comparing sensitive data?
Yes. All comparison is performed locally in your browser using JavaScript. No text is transmitted to any server. You can verify this by monitoring the Network tab in your browser’s developer tools while using the tool.
Can I compare binary files?
This tool is designed for plain text comparison. Binary files (images, compiled executables, etc.) are not supported. For binary diff, use specialized tools like hexdump combined with diff, or use Git’s built-in binary diff support.
What is the maximum text size supported?
Since the diff runs in your browser, performance depends on your device. The LCS-based algorithm handles texts with several thousand lines comfortably. For extremely large files (50,000+ lines), consider using a desktop diff tool like VS Code’s built-in diff or a command-line tool like diff or delta.
What is the difference between diff and patch?
A diff is the output showing what changed between two texts. A patch is a file containing diff output that can be applied to the original text to reproduce the modified version. The patch command-line tool reads patch files and applies the described changes. This workflow was the standard way of distributing code changes before modern version control systems.
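The relationship can be shown with a toy example. Here a diff is represented as a list of (tag, line) pairs, where the tag is a space for unchanged, - for removed, and + for added. This representation and the function name apply_ops are assumptions made for illustration and are far simpler than what the patch command actually parses.

```python
def apply_ops(original, ops):
    """Apply (tag, line) operations to original and return the modified lines."""
    result = []
    i = 0  # index of the next unread line in original
    for tag, line in ops:
        if tag == "+":
            result.append(line)
            continue
        # ' ' and '-' must both match the next line of the original text.
        if i >= len(original) or original[i] != line:
            raise ValueError(f"patch does not apply at original line {i + 1}")
        i += 1
        if tag == " ":
            result.append(line)
    if i != len(original):
        raise ValueError("patch does not cover the whole original text")
    return result


if __name__ == "__main__":
    ops = [(" ", "a"), ("-", "b"), (" ", "c"), ("+", "d")]
    print(apply_ops(["a", "b", "c"], ops))
```

The check against the original lines mirrors why a patch can fail to apply: it only makes sense against the text it was computed from.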
How do I resolve a merge conflict?
When you encounter a merge conflict in Git, the conflicting file will contain markers like <<<<<<<, =======, and >>>>>>>. You need to manually edit the file to keep the correct changes, remove the conflict markers, then stage and commit the result. Using a diff tool to compare both versions side by side makes this process significantly easier.
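Before staging the result, it can help to confirm that no markers were left behind. The sketch below is one simple way to check a file’s text; the function name leftover_markers is hypothetical, and the pattern looks only for lines that begin with exactly seven marker characters.

```python
import re

# Seven '<', '=' or '>' characters at the start of a line, followed by
# whitespace or the end of the line.
MARKER_RE = re.compile(r"^(<{7}|={7}|>{7})(\s|$)")


def leftover_markers(text):
    """Return the 1-based line numbers that still look like conflict markers."""
    return [number
            for number, line in enumerate(text.splitlines(), start=1)
            if MARKER_RE.match(line)]


if __name__ == "__main__":
    sample = "a\n<<<<<<< HEAD\nb\n=======\nc\n>>>>>>> feature\nd"
    print(leftover_markers(sample))
```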