Title: Similarity Measures over Refinement Graphs
Publication Type: Journal Article
Year of Publication: 2012
Authors: Ontañón S, Plaza E
Journal: Machine Learning
Keywords: CBR, feature terms, Machine Learning, Similarity

Similarity assessment plays a key role in lazy learning methods such as k-nearest neighbor or case-based reasoning. In this paper we show how refinement graphs, which were originally introduced for inductive learning, can be employed to assess and reason about similarity. We define and analyze two similarity measures, $S_{\lambda}$ and $S_{\pi}$, based on refinement graphs. The \emph{anti-unification-based similarity}, $S_{\lambda}$, assesses similarity by finding the anti-unification of two instances, which is a description capturing all the information common to these two instances. The \emph{property-based similarity}, $S_{\pi}$, is based on a process of disintegrating the instances into a set of \emph{properties}, and then analyzing these property sets. Moreover, these similarity measures are applicable to any representation language for which a refinement graph that satisfies the requirements we identify can be defined. Specifically, we present a refinement graph for feature terms, in which several languages of increasing expressiveness can be defined. The similarity measures are empirically evaluated on relational data sets belonging to languages of different expressiveness.
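
The idea behind the anti-unification-based similarity can be illustrated with a deliberately simplified sketch. It is not the paper's definition: it assumes a toy language in which an instance is a flat set of atomic properties and each refinement step adds exactly one property. Under that assumption, the anti-unification of two instances is their set intersection, and one plausible score is the fraction of refinement steps the two instances share. The names `anti_unify` and `similarity` are hypothetical and do not come from the paper.

```python
def anti_unify(a: frozenset, b: frozenset) -> frozenset:
    """Toy anti-unification: the most specific description covering both
    instances, which for flat property sets is simply their intersection."""
    return a & b


def similarity(a: frozenset, b: frozenset) -> float:
    """Toy similarity: shared refinement steps divided by all refinement
    steps needed to reach both instances from the empty description.
    (Assumption: one refinement step per property.)"""
    common = anti_unify(a, b)
    shared = len(common)        # steps from the empty description to the anti-unification
    only_a = len(a - common)    # extra steps needed to reach a
    only_b = len(b - common)    # extra steps needed to reach b
    total = shared + only_a + only_b
    # Two empty descriptions are treated as identical.
    return shared / total if total else 1.0


# Hypothetical instances described by atomic properties.
a = frozenset({"red", "round", "small"})
b = frozenset({"red", "round", "large"})
score = similarity(a, b)  # 2 shared steps out of 4 in total
```

In this flat setting the score coincides with the Jaccard index of the two property sets. The feature-term languages considered in the paper are relational, so a real refinement graph would be considerably richer than set inclusion.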