Entity resolution is a long-standing challenge in many areas of computer science. State-of-the-art approaches to entity resolution favor similarity-based methods. The project has the following tasks:
(1) To conduct a literature review on similarity-based and pattern- and rule-based entity resolution techniques.
(2) To become familiar with Datalog rules for specifying entity resolution patterns.
(3) To develop an approach for incorporating similarity-based entity resolution techniques into a Datalog-based pattern framework.
(4) To analyze the efficiency of the developed approach and, potentially, to evaluate it experimentally using real or synthetic data sets.
The goal of this project is to investigate how similarity-based entity resolution methods can be enriched with pattern rules that can specify the semantics of the data.
On completion of the project, the following learning objectives are expected to be achieved:
• Have a good understanding of the literature on entity resolution techniques.
• Develop an entity resolution method based on similarity and patterns, and evaluate its effectiveness.
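
As a rough illustration of the kind of combination the project aims at, the sketch below pairs a string-similarity test with one Datalog-style pattern rule. It is not the approach to be developed. The records, the field names, the 0.85 threshold, the rule itself, and the use of Python's `difflib` as the similarity measure are all assumptions made for this example.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy records; fields and values are illustrative only.
RECORDS = {
    1: {"name": "Jon Smith", "city": "Canberra"},
    2: {"name": "John Smith", "city": "Canberra"},
    3: {"name": "John Smith", "city": "Perth"},
}


def similar(a, b, threshold=0.85):
    """Similarity predicate: True if the two strings are close enough."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def resolve(records, threshold=0.85):
    """Apply one assumed pattern rule, written here in Datalog style:

        match(X, Y) :- name(X, N1), name(Y, N2), sim(N1, N2),
                       city(X, C), city(Y, C).

    The similarity test plays the role of the sim/2 predicate, while
    the shared city is a semantic constraint on the data.
    """
    matches = set()
    for x, y in combinations(sorted(records), 2):
        names_close = similar(records[x]["name"], records[y]["name"], threshold)
        same_city = records[x]["city"] == records[y]["city"]
        if names_close and same_city:
            matches.add((x, y))
    return matches


if __name__ == "__main__":
    print(resolve(RECORDS))
```

In this toy data, records 2 and 3 have identical names but are not matched because the rule also requires the same city. This is the sense in which a pattern rule can constrain a purely similarity-based decision.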