My final semester at The University of York has officially begun. Over the next 5 months I will be conducting 100 credits worth of (hopefully) novel research in the area of Git Source Control Querying and Analytics through the use of Model-Driven Engineering under the supervision of Dr. Dimitris Kolovos.
The original project proposal is shown below:
Advanced Querying of Git Repositories
Git is a distributed version control system that is widely used both in academia and industry. Git provides a command-line API through which basic queries can be evaluated against local repositories (e.g. git log) but lacks facilities for expressing complex queries in a concise manner. The aim of this project is to support such complex high-level queries on Git repositories. In the context of this project, the student will need to
1) identify the metadata stored in a Git repository and extract it to an object-oriented representation (e.g. using JGit)
2) develop a driver that will allow languages of the Epsilon platform to query extracted metadata at a high level of abstraction. For example, the following query would select all files larger than 200 lines and which were last modified by firstname.lastname@example.org on a Wednesday:File.all.select(f|f.lines > 200 and (f.lastModifiedBy = "email@example.com" or f.lastModifiedDay = "Wednesday"))
Such an advanced query facility would enable the development of advanced Git repository analytics and visualisation services (e.g. using Epsilon’s EGL as a server-side scripting language).
I’m currently in the very early stages of literature review and finding out what other git analytics programs are available so there isn’t too much to talk about. However, I will as ever keep the blog up to date with my progress over the next few months.