DOCUDIFF

Document comparison tool

Comparing versions of a document is a hard problem. Though seemingly a simple operation, the task is complicated by the organic and mutable nature of text. Current tools, although numerous and varied, often just highlight differences and can therefore be thrown off by minor changes.

This inadequacy becomes evident when sections are repositioned or blank pages are inserted. In these cases, the software wrongly treats everything after the change as altered, marking it as either 'added' or 'deleted'. This does not give a reliable picture of the changes actually made.
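To make the failure mode concrete, here is a minimal sketch of a naive comparison that checks line i of one version against line i of the other. It is an illustration only, not the behaviour of any particular tool; the function name `positional_diff` and the sample clauses are hypothetical.

```python
from itertools import zip_longest


def positional_diff(old_lines, new_lines):
    """Return the indices at which the two versions differ, position by position."""
    changed = []
    # zip_longest pads the shorter version with None so trailing lines are compared too.
    for index, (old, new) in enumerate(zip_longest(old_lines, new_lines)):
        if old != new:
            changed.append(index)
    return changed


# Hypothetical sample: version 2 only inserts one blank line after the title.
old = ["Title", "Clause A", "Clause B", "Clause C"]
new = ["Title", "", "Clause A", "Clause B", "Clause C"]

print(positional_diff(old, new))  # → [1, 2, 3, 4]
```

A single inserted line shifts everything after it, so every later position is reported as changed even though no clause was edited.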

Artificial Intelligence (AI) has emerged as a potential solution to this problem. However, using AI on its own has shortcomings. With its present capabilities, AI may overlook critical clauses or, even worse, report changes or additions that never happened (what we call hallucinations). These misses and false positives make AI unreliable on its own, particularly for work that demands high accuracy, such as legal documents or business contracts.

A robust solution lies in combining existing document comparison libraries with AI. At Ikkfoo Consulting, we first use document comparison tools to detect changes and then use AI to interpret those changes, which lets us draw on the strengths of both technologies.

This integrated approach offers a rigorous and reliable analysis of modifications while also providing a digestible, natural language interpretation. Complex modifications and subtle changes alike can be identified efficiently and then explained in a comprehensive, easy-to-understand manner.


Consider the seminal paper "Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents".

The authors investigate the feasibility of utilizing large language models (LLMs) for the task of grounding abstract, natural language instructions (such as "make breakfast") into concrete, actionable sequences (like "open fridge"), bypassing the need for explicit, step-by-step task learning.


Change control

The paper has two versions, one published in January 2022 and the second in March 2022. By applying our integrated approach, we can provide a robust analysis of the differences, as well as a comprehensive interpretation. On the left we see the “Limitations and Conclusions” of version 1. On the right, “Conclusions, Limitations & Future work” of version 2:

Basic diff tool

There are some key differences at first glance. Let's consider a raw comparison of both versions of the paper, performed using a basic diff tool:
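A raw comparison of this kind can be sketched with Python's standard `difflib` module. This is a minimal sketch rather than our production pipeline; the helper name `raw_diff`, the file labels, and the sample text are hypothetical and are not taken from the paper.

```python
import difflib


def raw_diff(old_text: str, new_text: str) -> str:
    """Return a unified diff of two text versions, compared line by line."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile="version_1",  # labels shown in the diff header (assumed names)
        tofile="version_2",
    )
    return "".join(diff)


# Hypothetical sample text standing in for the two versions of a section.
v1 = "Limitations\nThe method is evaluated in one environment.\n"
v2 = "Limitations\nThe method is evaluated in one environment.\nFuture work\n"

print(raw_diff(v1, v2))
```

Lines prefixed with `-` exist only in version 1 and lines prefixed with `+` exist only in version 2; unchanged context lines carry a leading space.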

AI Interpretation

While the raw comparison provides an overview of the changes, it lacks interpretability. This is where GPT-4, our chosen AI model, steps in. Simply asking GPT-4 to compare the two versions, without the diff output, gets us nowhere:

Traditional libraries + AI integrated solution

Feeding the entirety of both texts into the model is inefficient and expensive. By feeding the output of Python's 'difflib' module into GPT-4 instead, we get a coherent, clear, natural language description of the changes.
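The hand-off from the diff to the model can be sketched roughly as follows. This is a sketch under stated assumptions: the prompt wording, the function names, and the `ask_model` callable are all hypothetical. `ask_model` stands in for whatever client sends a prompt to GPT-4 and returns its reply; no specific API is assumed here.

```python
import difflib
from typing import Callable


def diff_for_model(old_text: str, new_text: str, context: int = 2) -> str:
    """Build a compact unified diff so only the changed regions reach the model."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="version_1",
        tofile="version_2",
        n=context,  # number of unchanged context lines kept around each change
    )
    return "".join(diff)


def build_prompt(diff_text: str) -> str:
    """Wrap the diff in instructions (hypothetical wording)."""
    return (
        "Below is a unified diff between two versions of a document.\n"
        "Describe only the changes shown in the diff, in plain language.\n\n"
        + diff_text
    )


def explain_changes(old_text: str, new_text: str,
                    ask_model: Callable[[str], str]) -> str:
    """Detect changes with difflib first, then ask the model to interpret them."""
    diff_text = diff_for_model(old_text, new_text)
    if not diff_text:
        # Nothing changed, so there is nothing for the model to interpret.
        return "No changes detected."
    return ask_model(build_prompt(diff_text))
```

Because the model only sees the diff, the prompt stays short, and the set of changes it is asked to describe is fixed by the comparison library rather than inferred by the model.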


By combining the precision of document comparison tools and the interpretability of AI, we offer an effective solution to the long-standing challenge of document version comparison, thus paving the way for seamless document editing and management.

© 2023 Ikkfoo Consulting. All Rights Reserved.