A little while ago I was curious about how software that uses transaction logs works, so I built a toy to-do command line application to get some hands-on experience.
Transaction logs are used when you want metadata detailing each action that has taken place on a system, and then actually use those logs as the data for the system. To build up the state of the system at any point in time you simply “replay” each action that took place before that time. This lets you time travel, as well as know exactly what happened for user 54 to go into their overdraft.
Above you can see every feature of todo-log (I did say it was a toy application!) being used from the macOS terminal.
With the add command you can store a todo item with both a title and some additional body text. This todo item is then assigned a short code (shown in red above) which can be used as the unique identifier to manipulate it in other commands. The transaction detailing the addition of the todo item is saved to a JSON Lines (.jsonl) file.
The ls command, without parameters, displays a list of all the todo items which are outstanding at the current time. Some basic formatting is applied, including showing a friendly date (e.g. “two hours ago”) for the creation time of a todo item.
Deleting a todo item is simple: just pass the short code of the item to the rm command. The command doesn’t delete the add transaction from the .jsonl file; instead it adds its own remove transaction detailing the time of deletion and the todo item involved.
The log command just prints out the raw .jsonl to the terminal for inspection. In the above screenshot you can see the format of both the add and remove transactions.
Finally, passing a positive integer parameter to the ls command goes back in time by that number of transactions. In this example you can see what the state of the todo list was before the previous rm command was issued. Every time ls is called it simply builds up the current state of the system by rerunning each of the actions detailed in the .jsonl file. To go back in time we just ignore the last n lines, where n is the number provided to the command.
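The replay idea can be sketched in a few lines of Python. This is only an illustration and not the actual todo-log code: the field names (action, id, title, body) and the replay function are assumptions made up for the example, and the real .jsonl format may well differ.

```python
import json


def replay(lines, back=0):
    """Rebuild the outstanding todo items from JSON Lines transactions.

    `back` is the number of most recent transactions to ignore, which is
    how the "time travel" works. Field names here are hypothetical.
    """
    keep = max(0, len(lines) - back)  # never slice with a negative length
    todos = {}
    for line in lines[:keep]:
        if not line.strip():
            continue  # tolerate blank lines in the log
        tx = json.loads(line)
        if tx["action"] == "add":
            todos[tx["id"]] = {"title": tx["title"], "body": tx.get("body", "")}
        elif tx["action"] == "remove":
            todos.pop(tx["id"], None)
    return todos


# A hypothetical three-line log: two adds followed by one remove.
log = [
    '{"action": "add", "id": "a1f", "title": "Buy milk", "body": "Semi-skimmed"}',
    '{"action": "add", "id": "b2c", "title": "Write blog post"}',
    '{"action": "remove", "id": "a1f"}',
]

print(sorted(replay(log)))     # ['b2c']
print(sorted(replay(log, 1)))  # ['a1f', 'b2c']
```

Nothing is ever deleted from the log itself, so the state before the remove is recovered just by stopping the replay one line early.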
You can have a look at this fun little project over on my GitHub.
Though I only spent a couple of hours on it, and it doesn’t do much, I feel like working on this gave me a greater appreciation of what transaction logs can be used for and the challenges around them. Systems like RDBMS transaction logs and Apache Kafka are much more interesting to me as a result.
Sometimes the easiest way to learn about something and become more interested is just to jump in and build something.
The Joel Test is a very quick way of measuring the quality of a software engineering team by asking them 12 questions, each of which can only be answered yes or no. A score of 12 is great, 11 is tolerable and 10 or less is a fail. It’s harsh, but fair: a combination of two or more fails could result in larger problems.
When trying to size up a development team I usually ask them “Have you heard of The Joel Test?”. If they’ve heard of it I ask them if they know their score, and we then usually have a discussion about each question. If not, I introduce them to it and do the same thing. Anecdotally, I’ve found that being aware of the Joel Test increases the likelihood of a tolerable or passing score.
When Joel Spolsky, a former program manager on Microsoft Excel and CEO at Stack Exchange, first published the blog post outlining this test in 2000, questions such as “Do you make daily builds?” were probably a lot more relevant than they are now, in a world where a lot of development is for the web. A lot of modern web development doesn’t require a build stage, and even when it does it’s usually quick enough to take place on every commit rather than once a day. Whilst big software packages such as Microsoft Excel probably do still require daily builds, the vast majority of software teams I talk to aren’t making things like that.
Because the teams I usually speak to are doing different things, I end up altering some of the questions, and I thought it might be worth noting down my changes, even if I’m the only one who ever refers back to them.
- Do you use source control and follow a workflow?
- Are your test and local environments highly similar to your production environment?
- Do you use a continuous integration server to build and test every change (be that at the granularity of per branch or per commit)?
- Do you have a bug database?
- Do you fix bugs and make time for refactoring before writing new code?
- Are developers involved early on in design and product decisions?
- Do you have a roadmap?
- Do programmers have quiet working conditions?
- Do you use the best tools regardless of cost or license?
- Do you make testing everyone’s concern?
- Do you use Code Reviews?
- Do new candidates write code in their interviews?
- Do developers have access to stats and metrics for the live product?
As you can see, I’ve got 13 questions rather than 12, so a perfect score is 13, a score of 12 is tolerable and anything less means you should be looking to make improvements.
1. Do you use source control and follow a workflow?
In the Joel Test, the question just asks if source control is used. I’ve not come across a commercial team, so far at least, that doesn’t utilise source control. However, merely using Git doesn’t mean you’re using it in the best way. Following a known workflow such as GitFlow or GitHub Flow makes it easier to maintain multiple versions of a product and work on multiple new features at the same time.
2. Are your test and local environments highly similar to your production environment?
There have been a number of times where I’ve had a bug that only appears in production environments, serving live traffic; this sucks because you can end up testing fixes in production and affecting your real users. Whilst having an identical system locally can be difficult — by design most large scale web systems are distributed so couldn’t be fully emulated on a single machine — it should be possible to have an environment that is highly similar at a component level. Tools like Docker make this easier.
3. Do you use a continuous integration server to build and test every change?
Joel speaks about utilising a daily build to catch mistakes programmers routinely make, such as not checking in a new file and so breaking the build. Shortening the feedback loop from once every 24 hours to a few minutes after every change, by following the principles of continuous integration, is a more modern approach. Travis CI, Jenkins and Codeship are popular tools for achieving this.
4. Do you have a bug database?
This question remains as relevant now as when Joel asked it back in 2000.
5. Do you fix bugs and make time for refactoring before writing new code?
Joel’s test focused just on fixing bugs before writing new code. However, I think making time for refactoring and the reduction of technical debt in small amounts over a period of time is also crucial to the continued effectiveness of a team.
6. Are developers involved early on in design and product decisions?
I’ve experienced development teams where web developers were given a final pixel-perfect design by a graphic designer and asked to reproduce it. This rarely works well, because a single PSD rarely shows the complex interactions a user can have with, for example, a web page. Having developers collaborate with a graphic designer and other product stakeholders from early on in the process can result in better, more complete specifications, and in a better understanding of the business and user needs by the people developing the software. That can be no bad thing.
7. Do you have a roadmap?
Joel’s question asked if the software team had access to an up-to-date schedule. Most teams I’ve worked with take a more agile approach to development and therefore don’t have a schedule set in stone, or often at all. However, it is still important to know the direction the ship is sailing in. What projects are on your roadmap?
8. Do programmers have quiet working conditions?
This one is more important than people give it credit for. I like the approach to working conditions taken by Stack Exchange.
9. Do you use the best tools regardless of cost or license?
Joel’s question originally asked if a company uses the best tools money can buy. However, in many cases now the best tools don’t need to be bought: they’re free as in beer, or open source. So I’ve simply clarified that with my version of the question.
10. Do you make testing everyone’s concern?
Some developers like having a team whose sole purpose is to test other people’s code, and Joel’s original question marks you down if you don’t have dedicated testers. However, I am of the opinion that testing should be every individual’s concern. Given that code that is easy to test exhibits different attributes to code that is difficult to test, removing the responsibility of testing from developers increases the chance that they may produce code that the testers subsequently struggle with.
However, only having developers test their own code would result in poorly tested code, because the code would be tested with the same set of assumptions it was developed with.
Therefore, everyone needs to be concerned with testing and quality in general. Developers in the first instance, a second pair of eyes — be it a dedicated test engineer or another developer — and a product stakeholder should be the minimum number of people involved in the quality assurance of any change.
11. Do you use Code Reviews?
Code reviews serve a few purposes:
1) They improve the quality of code by allowing for input and discussion with developers not initially involved in its development.
2) They help mistakes to be spotted before they cause any problems.
3) Code reviews are, in my opinion, the single best way to disseminate knowledge within a team. You can’t have a part of the system only one person understands if other people have been involved in regular code reviews.
I personally like to have a code review as part of accepting any pull request in the GitHub flow, but this isn’t necessary to pass the question. They should, however, be regular.
12. Do new candidates write code in their interviews?
This question is unchanged from The Joel Test.
13. Do developers have access to stats and metrics for the live product?
It’s important that developers can see how their work is managing in the real world, whether this be performance metrics (RAM usage, number of concurrent connections, etc.) or business statistics such as step conversion. Letting the people who are building the product see the results of their work means that they can tell whether they need to take a new approach or whether their work is paying dividends, which aids both motivation and early detection of possible problems. Note: having a business analyst or similar between the developer and the metrics doesn’t count. The developers should be able to access them directly, and in real time if possible.
This test has the same caveats as The Joel Test. You can get 0/13 and by some divine intervention have a team that is constantly delivering; conversely, you can ace the test and still be working in a dysfunctional way. And obviously you shouldn’t be using this as a checklist to see if your team is capable of working on nuclear power plant control software.
However, what this test should do is let you know how much a team has thought about quality and developer experience, and open up a dialogue which allows you to investigate further their ideals around development.
When you design software you usually have a few use cases in mind. In the case of EpsilonGit, the use case I keep coming back to is a project lead who wants an overview of how their team is working and how they are using their version control software.
A short while later I made a few small adaptations to package the orrery as a Windows Store (now called Universal Windows) Application. I thought a few people might enjoy watching the planets go around the screen, but didn’t really expect too many people to download it. To be completely honest, I mainly packaged it as an app to get points for the App Builder Rewards competition.
I haven’t touched the orrery, packaged as Solar System Simulation on Windows, for years. However, I wrote a little while back about someone who used it to teach their daughter about space, an unexpected use but a nice one.
Today I got an email from a student in Brazil who wondered if the software had a function to see planet locations at specific dates, as he liked the simple 2D graphics and wanted to use them to make a tattoo of the layout of the solar system on his birthday. Strange, but cool.
Unfortunately the Solar System Simulation (which is a gratuitous name, as it’s in no way even close to a ‘simulation’) doesn’t support this function, but it’s a cool idea, and one I wouldn’t have thought of.
It might be fun to add it in one day, and to see how popular my own ideas would be compared with those that a user wanted enough to go to the effort of emailing me about. I suspect the user-submitted ideas might be more popular, because no one knows how a customer uses your product as well as the customer does. But I might be wrong; it could be an interesting bit of research.
So, expect the unexpected uses of your software and services — both in positive ways, such as odd-but-exciting use cases, and negative, such as malformed input — but also be excited by the prospect.
P.S. ‘Solar System Simulation’ is still available and works on Windows 8, 8.1 and 10.
Last week I was lucky enough to be put in contact with Guilaume and the rest of the team at Ripple. Ripple is a start-up of three students who have recently won the UK leg of the Microsoft Imagine Cup with their idea for a location-based messaging app which allows you to contact people you may not know within a certain distance of your location, which is great for something such as freshers’ week where you want to meet new people.
Though Ripple have won the UK leg of the competition with their idea, they have yet to actually build the product. The three people currently involved with Ripple had a great idea, but lacked the programming skills required to bring it to fruition. In order to find a student able to help them develop the application they contacted the Microsoft Student Partner program. Through the MSP program I got in touch with Ripple.
After a few Skype video conferences and telephone calls the team welcomed me on board, which means that through an odd bit of luck I am now an Imagine Cup 2014 finalist.
The Imagine Cup is a global student-only competition run by Microsoft in 190 countries which seeks to get students involved in solving social, economic and environmental problems through the use of technology. Each year winners from each country go to the world final in a different city to compete against each other – this year’s final is being hosted in Microsoft’s own back-yard, Seattle.
Winners can walk away with up to $50,000 prize money and a once-in-a-lifetime opportunity to meet former Microsoft CEO Bill Gates and current CEO Satya Nadella, so wish us luck! Regardless of how well we do I can’t wait to visit Seattle and do some of the awesome activities Microsoft has lined up for us, including going up the famous Space Needle.
I will, of course, keep the blog updated throughout the course of development and the competition itself.
Last night my housemate Hayley and I were talking about the Data Mining and Decision Systems module we took last semester. During that discussion the concept of genetic algorithms came up.
In the computer science field of artificial intelligence, a genetic algorithm (GA) is a search heuristic that mimics the process of natural selection, except that GAs use a goal-oriented targeted search and natural selection isn’t a search at all.
Hayley explained the concept to me using the idea of an animal that is more likely to survive in its environment by blending in with the colour of its surroundings. So tonight, as a nice change from revision and work on my Final Year Project, I had a go at developing such an algorithm — with a somewhat humorous undertone. The result of this undertaking is a little tech demo called “The Generation Game – A Simple Genetic Algorithm”.
In the demo it is advantageous for a sheep to be green, in order to fit in with its field. However, at the beginning of the game the ten sheep in the initial flock are of completely random colours.
Turns are taken, in order, by:
- A flock of sheep, which breeds, producing one new sheep for every two sheep in the flock (if there is a leftover sheep it doesn’t breed). Each sheep produced by the mating process is the median colour of its two parents.
- A wolf pack which, depending on its hunting ability, can eat 10% to 60% of the flock in any given turn.
The closer a sheep is to the colour green, the less likely it is to be a casualty of a wolf attack (however there are other contributing factors, and a green sheep can still be killed).
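The two turns described above can be sketched roughly as follows. This is a Python illustration rather than the C#/XNA code from the demo, and several details are assumptions made for the example: colours are RGB tuples, “closeness to green” is measured as the summed channel distance from pure green, and the wolves pick victims at random with a weighting that favours less green sheep.

```python
import random

GREEN = (0, 255, 0)


def distance_from_green(colour):
    """Summed per-channel distance from pure green (0 means fully green)."""
    return sum(abs(c - g) for c, g in zip(colour, GREEN))


def breed(flock, rng):
    """Pair the sheep up at random; each pair adds one child whose colour
    is the midpoint of its parents. A leftover sheep does not breed."""
    shuffled = flock[:]
    rng.shuffle(shuffled)
    children = [
        tuple((x + y) // 2 for x, y in zip(a, b))
        for a, b in zip(shuffled[0::2], shuffled[1::2])
    ]
    return flock + children


def wolf_attack(flock, rng):
    """Remove between 10% and 60% of the flock. Less green sheep are more
    likely to be picked, but a green sheep can still be eaten."""
    kills = int(len(flock) * rng.uniform(0.1, 0.6))
    survivors = flock[:]
    for _ in range(kills):
        weights = [distance_from_green(s) + 1 for s in survivors]
        victim = rng.choices(range(len(survivors)), weights=weights)[0]
        survivors.pop(victim)
    return survivors


rng = random.Random()
flock = [tuple(rng.randrange(256) for _ in range(3)) for _ in range(10)]
for generation in range(20):
    flock = wolf_attack(breed(flock, rng), rng)

average = sum(distance_from_green(s) for s in flock) / len(flock)
print(f"{len(flock)} sheep left, average distance from green: {average:.0f}")
```

Because children are only ever a blend of their parents, the flock can never become greener than the genes it started with allow.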
As you can see from the gif image above, if the initial flock has a green sheep amongst its ranks then evolution takes place (survival of the fittest, like how Darwin described it) and within just 20 generations the flock consists of only green sheep.
Another interesting situation is when there is no green sheep in the initial flock.
As you can see from the example above, which starts with no green sheep, the flock gradually becomes a light brown colour. This is the closest colour the flock could get to green with the genes available in its gene pool.
This is just a simple tech demo (written in C# with XNA) to prove to myself I understood the concept, but it worked out quite well and I think it’s cool. I’ll be cleaning up the code and adding some comments, so be sure to check out the repository containing the program on GitHub.
That’s all for now,
A few weeks ago, when I was at Campus Party: Europe, I attended a fascinating lecture by Michael Meeks about LibreOffice. Whilst his talk was mainly about getting people interested in using LibreOffice, he also suggested that people get involved in the development of open source projects. As this was the aspect of his talk that interested me the most, I put my hand up and asked the question:
How easy is it to get involved in Open Source, particularly LibreOffice?
Michael, who works at Collabora, a company which supports LibreOffice, gave a great response and asked me to speak with him after the talk. I did so, and it was at this point he mentioned the “easy hacks” list that LibreOffice developers maintain, the purpose of which is to take the fear out of getting involved in an open source project. Items on the easy hacks list should only take a few hours to complete, be simple in their nature and allow you to learn about the coding standards and systems used by the project.
For those who aren’t quite as interested in productivity software as I am, LibreOffice is a desktop office suite which has similar functionality to that of the famous Microsoft Office package. It includes the following programs:
- Writer (similar to Microsoft Word)
- Calc (similar to Microsoft Excel)
- Impress (similar to Microsoft PowerPoint)
- Base (similar to Microsoft Access)
Fast forward a few weeks to the 12th of September and I stumbled across an item on the easy hacks list that interested me: “Bug #67158 – FORMATTING: Add shortcut Ctrl+K (or cmd + K on OS X) for inserting hyperlinks”. It interested me because it seemed like it would be relatively simple to add (it’s just an event handler, after all) and it was marked as “Medium importance, enhancement”. This meant it was considered more ‘important’ than some of the other easy hacks, and it was more likely to have an impact on users, as it was an enhancement people could actively see as opposed to code formatting changes (which, of course, are equally important, just less visible to users). Another interesting aspect of this bug was that it affected Writer, Calc and Impress, so my changes would be useful to everyone who used these programs, whether on Windows, GNU/Linux or Mac OS X, as the event would work on all three major operating systems.
I clicked through, from the list, to the Bugzilla entry for the task and assigned the task to myself and left a comment showing my interest in taking it on.
The next thing I had to do was boot into my Fedora GNU/Linux operating system and set to work on making the development environment in which I would make my changes. This is an area in which GNU/Linux, particularly its package system, really comes into its own. To install every single resource I needed to work on the project, from libraries to images, I just had to type
$ yum-builddep libreoffice
Following this I just had to download the latest version of LibreOffice from their Git repository and run a shell script which dealt with everything else. It was really easy to do following this guide.
Now all I had to do was add in the event handler, and ‘fix the bug’ so to speak. Simple, right? Well, sort of. LibreOffice has 7,075,071 lines of code (according to ohloh.net), so finding exactly where my fixes should go, and what exactly I should be writing to invoke the Hyperlink Editing and Insertion Dialog, was a daunting task. Fortunately, because my bug was on the easy hacks list, some regular contributors to the project had left some pointers on what to do. Petr Mladek from SUSE pointed me in the direction of the file which I needed to edit in order to add an event handler.
I played around with the file, finding out what the various variables did and meant, and decided to swap around what the Ctrl+C and Delete keydown events did in order to see how easy it was to edit a pre-existing event handler. It was at this point I ran into my first issue.
When I attempted to build my changes it worked. However, when I ran LibreOffice, Ctrl+C still copied things and Delete still deleted things. I was flummoxed. Thankfully, when I reached out to Petr for help he was happy to provide it. It turned out I was providing the compiler with flags that meant it would ignore the changes I had made. With this knowledge I managed to get Delete to copy things and Ctrl+C to delete things: useless and in fact harmful changes, but a great test.
Having reverted the aforementioned events to do what was expected, I added in an event for Ctrl+K, using the same code style and formatting as the other event methods. To check the event was firing I made Ctrl+K select all text in a document. Brilliantly, it worked first time.
I then set out to find what I would need to call in order to invoke the hyperlink editing dialog. This was made a lot easier by a really awesome utility LibreOffice has set up called OpenGrok. OpenGrok allows you to search through all of LibreOffice’s code, instantly, using a web interface. I quickly found the method I needed to invoke to call the Hyperlink Editing Dialog, added it to my code and compiled it. It worked! Awesome. My first patch to LibreOffice was ready to be submitted for review.
A review, in the context of open source, is when one of the project maintainers (kind of a manager of the project) looks over your code, ensures it does what it says it does and that it doesn’t break anything else, and then merges it into the main branch of code. At this point your code is part of the project and will be included in the next update made available to the 150 million people who use LibreOffice! (That’s over 100,000 downloads every day.)
Today my code was accepted into the main branch. It’s really exciting to think that soon over 150 million people, and possibly many more in the future, will be using some of my code. I can’t wait to contribute some more, not only to this project but to others too!
You’ll be able to download my fix as part of LibreOffice 4.2, the release dates for which you can see here.
Last week I was fortunate enough to be with some of my fellow Microsoft Student Partners, some Windows Ambassadors, some Microsoft Interns and some Microsoft Employees at Campus Party Europe, an event which was described by the BBC as ‘Glastonbury for geeks’.
I would say this was fairly accurate, except there was less mud! Like Glastonbury there were several stages, a whole host of interesting people to meet, and tents!
Working on the Microsoft Stand
Tuesday through to Friday I worked for 6 hours a day on the Microsoft Stand. It was really good fun! Our job was to talk to people about Windows 8.1, Windows Phone 8, Microsoft Surface and the Xbox One and endeavour to answer any questions they had about either the software or hardware. As well as that, we tried to get as many people as possible to take our surveys. In return each participant got a surprisingly stylish pair of Windows 8 branded sunglasses and a glow stick!
I was also fortunate enough to have Academic Audience Lead Phil Cross point a few developers who had questions about Visual Studio and developing for Windows platforms my way.
Throughout Wednesday and Thursday I spent much of my shifts writing a Windows 8 app for the project management website TeamworkPM. It was especially interesting to do this because my display was being projected on two 42-inch monitors above my head. This meant everyone could see what I was doing, and it attracted quite a few developers to come and talk about developing for the platform.
In the evenings when the stand got a bit quiet we would try to entice people to come and see our wares in a variety of ways, one of which was through the medium of dance :P. My highlight was the Macarena, or the Microsoft Macarena as I called it. Below you can see us all dancing and waving our glow sticks to the ever-entertaining Harlem Shake.
The main thing that first attracted me to the offer of working for Microsoft at Campus Party Europe was the fact that we could spend our down time watching some of the many speakers that came to talk about their respective fields.
I was fortunate enough to catch 2 or 3 lectures a day, from people as well respected and diverse as Jon “Maddog” Hall — chairman of Linux International — and Ian Livingstone — President of Eidos and founder of Games Workshop.
The O2 arena hosted 8 stages, all of which had talks from 10am to 10pm each day, so there was certainly a lot to take in; too much to write about here.
My favourite talks were actually those about free and open source software (sorry, Microsoft), and the relatively new phenomenon of open data.
At the end of the week my fellow MSPs and I were super happy to have witnessed one of the coolest and largest tech conferences in the world. On top of that, Microsoft were generous enough to allow us to keep the devices we had been using throughout the week to showcase both Windows 8.1 and Windows Phone 8 to customers. This meant a Nokia Lumia 920 and a Microsoft Surface RT each!
I was over the moon with the Surface RT, because I had been looking to get an RT device for a while to test the performance of a few of my apps on the lower powered ARM CPUs, but I was especially happy with the Nokia Lumia 920. My phone contract ends in a few days, and because I now have an awesome new phone I’m gonna go on a SIM only plan and save myself some money 🙂
I would like to say a massive thank-you to everyone involved at the O2, the people behind Campus Party, and of course Microsoft for making everything work like clockwork and giving me a fantastic opportunity to learn from some of the best minds in our industry, a lot of laughs, some great knowledge and some cool electronics! I hope to see you all again soon!