Anti-unification (computer science)

Anti-unification is the process of constructing a generalization common to two given symbolic expressions. As in unification, several frameworks are distinguished depending on which expressions (also called terms) are allowed, and which expressions are considered equal. If variables representing functions are allowed in an expression, the process is called “higher-order anti-unification”, otherwise “first-order anti-unification”. If the generalization is required to have an instance literally equal to each input expression, the process is called “syntactical anti-unification”, otherwise “E-anti-unification”, or “anti-unification modulo theory”.

An anti-unification algorithm should compute for given expressions a complete, and minimal generalization set, that is, a set covering all generalizations, and containing no redundant members, respectively. Depending on the framework, a complete and minimal generalization set may have one, finitely many, or possibly infinitely many members, or may not exist at all;[note 1] it cannot be empty, since a trivial generalization exists in any case. For first-order syntactical anti-unification, Gordon Plotkin[1][2] gave an algorithm that computes a complete and minimal singleton generalization set containing the so-called “least general generalization” (lgg).

Anti-unification should not be confused with dis-unification. The latter means the process of solving systems of inequations, that is of finding values for the variables such that all given inequations are satisfied.[note 2] This task is quite different from finding generalizations.

. . . Anti-unification (computer science) . . .

Formally, an anti-unification approach presupposes

• An infinite set V of variables. For higher-order anti-unification, it is convenient to choose V disjoint from the set of lambda-term bound variables.
• A set T of terms such that VT. For first-order and higher-order anti-unification, T is usually the set of first-order terms (terms built from variable and function symbols) and lambda terms (terms containing some higher-order variables), respectively.
• An equivalence relation
${displaystyle equiv }$

on

${displaystyle T}$

, indicating which terms are considered equal. For higher-order anti-unification, usually

${displaystyle tequiv u}$

if

${displaystyle t}$

and

${displaystyle u}$

are alpha equivalent. For first-order E-anti-unification,

${displaystyle equiv }$

reflects the background knowledge about certain function symbols; for example, if

${displaystyle oplus }$

is considered commutative,

${displaystyle tequiv u}$

if

${displaystyle u}$

results from

${displaystyle t}$

by swapping the arguments of

${displaystyle oplus }$

at some (possibly all) occurrences.[note 3] If there is no background knowledge at all, then only literally, or syntactically, identical terms are considered equal.