Bibliographic databases such as Google Scholar, DBLP, the ACM Digital Library, or Scopus collect a wealth of information about scientific and research publications. They therefore allow analysis of the publications of individual researchers, groups, organisations, and disciplines. The outcomes of such investigations can help organisations and governments to formulate better research and development policies and to direct resources in ways that enhance research quality and productivity.
The aim of this one-semester project is to investigate, develop, implement and evaluate techniques that allow the analysis of temporal aspects of research output from large bibliographic databases. We specifically aim to develop techniques that calculate (and potentially visualise), for individual researchers, measures of their output over time, such as their publications per year or the impact of these publications, calculated as the number of citations received within a certain period of time after publication. Different ways of aggregating these individual measures should also be investigated.
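As an illustration only, the two measures mentioned above (publications per year, and citations received within a fixed period after publication) could be sketched as follows. The record layout (`year`, `citation_years`), the function names, and the sample data are all hypothetical assumptions made for this sketch; real bibliographic databases expose different schemas, and the project may well settle on other definitions of impact.

```python
from collections import Counter, defaultdict


def publications_per_year(publications):
    """Count one researcher's publications for each publication year."""
    counts = Counter(p["year"] for p in publications)
    return dict(sorted(counts.items()))


def citations_within_window(publications, window_years):
    """For each publication year, sum the citations that arrived no later
    than `window_years` years after the cited paper was published."""
    totals = defaultdict(int)
    for p in publications:
        last_year = p["year"] + window_years
        totals[p["year"]] += sum(
            1 for cited_in in p["citation_years"]
            if p["year"] <= cited_in <= last_year
        )
    return dict(sorted(totals.items()))


# Hypothetical records for a single researcher (not real data).
sample = [
    {"title": "Paper A", "year": 2010, "citation_years": [2010, 2011, 2013, 2016]},
    {"title": "Paper B", "year": 2010, "citation_years": [2012]},
    {"title": "Paper C", "year": 2012, "citation_years": [2013, 2014, 2019]},
]

per_year = publications_per_year(sample)
impact = citations_within_window(sample, window_years=3)
print(per_year)  # {2010: 2, 2012: 1}
print(impact)    # {2010: 4, 2012: 2}
```

Using a fixed citation window makes older and newer publications comparable, since older papers have otherwise had more time to accumulate citations. Aggregating over groups or organisations could then combine these per-researcher dictionaries, for example by summing or averaging the values for each year.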
This project is available as a one-semester Computer Science project for either undergraduate or MComp students, or as a one-year CS or MComp honours project (with an extended scope). Students interested in undertaking this project should have good programming skills (ideally including familiarity with Python) and database (SQL) skills, as well as knowledge in areas such as algorithms and data structures, string processing, etc. It is an advantage if a student has successfully completed courses on databases, data mining, machine learning, or document computing.
The recently published book 'Data Matching' (see URL below) by Peter Christen provides a broad introduction to most topics related to this project.
This is an exciting and challenging project that will involve the analysis of real-world data, cutting-edge technologies, advanced scientific techniques, and cross-sectoral collaboration between academia (Research School of Computer Science) and university administration (Central Research Office).