Unfortunately it took 2 months to make that 90p!
Since Rob and I rebuilt CS Blogs in 2016, neither of us has really done a lot with it. Whilst I am accountable for some of the 3,000 monthly page views, the most I have done with it operationally for some time has been to ensure that the servers and domain keep being paid for.
I think that at some point because of this I started seeing CS Blogs as an expense which I should try to break even on despite the fact that it started as a labour of love.
To try and recoup the small amount of money I throw at it each year, I thought I could simply include the Google AdSense Auto Ad script and let that take care of monetisation.
Auto Ad itself is quite nifty and takes all of the effort out of adding adverts to your site whilst still giving you enough flexibility to tailor them to your content and audience. However, just plugging in adverts was somewhat low effort and so, unsurprisingly, didn’t yield good results. I would go so far as to say that adding adverts was a net negative: it didn’t cover the CS Blogs bills, and it made the user experience worse.
To put the matter right I have now removed all ads from the platform and will be taking another look at ways to improve both the software and the reach of the platform (perhaps by marketing directly to Computer Science departments around the UK).
If you’d be interested in helping with these efforts please get in touch.
- how your intended developer audience should affect your technical decisions
- how a good application doesn’t always become a good open source project
- how to structure an application or system to make it easy to contribute to
- how the CS Blogs workflow fits together
After having received some feedback from Charlotte, who watched the whole thing, I intend to make some improvements to the structure of the presentation and submit it to some other conferences.
I’m not sure how much sense the slides will make without me talking around them, but they’re available here.
Where are we now?
As I mentioned in my previous post the current CS Blogs system grew out of a prototype. This meant that the requirements of the system were discovered in parallel with designing and implementing the system, resulting in the slightly weird architecture shown below.
I say it’s weird because the `web-app` component isn’t really just a web application: it’s also an API server for the Android application (and in theory any other app) and includes all the business logic of the system.
The `feed-aggregator` is a small Node.js application run as an Azure WebJob. It was hacked together in a few days and really only supports certain RSS and ATOM feeds. For example, it works great for ATOM feeds using <description> tags, but not ones which use <content> tags. These oversights arose because the software wasn’t developed against much real data (essentially only my own feed) and because of the homogeneous nature of our users’ blogs, which are almost all Blogger or WordPress.com.
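To illustrate the kind of gap described above, here is a minimal sketch of choosing a post body from whichever field a feed happens to provide. The `ParsedEntry` shape and the `entryBody` name are assumptions made for this example; they are not taken from the CS Blogs code, and a real feed parser will expose its own field names.

```typescript
// Hypothetical shape of one parsed feed entry; real parser output will differ.
interface ParsedEntry {
  title: string;
  content?: string;     // body taken from a <content> tag, if present
  summary?: string;     // body taken from a <summary> tag, if present
  description?: string; // body taken from a <description> tag, if present
}

// Return the first non-empty body field, preferring the fullest representation.
// Falls back to an empty string when the entry carries no body at all.
function entryBody(entry: ParsedEntry): string {
  const candidates = [entry.content, entry.summary, entry.description];
  for (const candidate of candidates) {
    if (candidate !== undefined && candidate.trim() !== "") {
      return candidate;
    }
  }
  return "";
}
```

Checking several fields in a fixed order means a feed that only supplies one of them is still handled, rather than producing an empty post.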
Despite the obvious and numerous flaws of the system, it has worked well for the past year or so. However, when I wanted to add the concept of organisations to the system (a way of seeing only blogs written by people at a certain company or university) I found the system to be a hodge-podge of technical debt, to the point where adding new features was going to take longer than developing a good, modular, expandable system. It was time to pay down the technical debt.
The first thing to do was to determine which parts of the old system were good (and ensure that these positive things didn’t regress in the new system), which were in need of improvement, and which new features we should add at the same time.
Fortunately CS Blogs does do a number of things well:
- Short lead time: new posts appear in the system within 15 minutes
- Good Web App: the front end works well on both desktop and mobile and is very performant due to its lack of scripts. The work Rob did on the styling makes it a joy to use
- Good Authentication: users enjoy being able to use GitHub, Stack Exchange or WordPress to sign in, and I enjoy not having to look after their passwords
A few things it could improve on are:
- Support for a larger range of RSS and ATOM feeds: ATOM support in particular isn’t great in the current system
- A lot of functionality only works in the web app: any method which requires authentication, such as signing up to the system, isn’t available through the API
- Feed aggregation downloads every author’s feed every 15 minutes. This is a lot of data and wouldn’t be economical to scale to hundreds of users
- Code maintainability is poor due to a complete lack of automated testing and linting
The additional user-facing features I want to implement are:
- Notifications of new blog posts for CS Blogs applications on Android/iOS
- Support for the aforementioned organisations feature
Designing a Distributed System
The system you can see in the diagram below was designed with the intention of fulfilling all of the requirements which I outlined above. You’ll notice the use of Amazon Web Services icons, as I have recently switched hosting from Azure to AWS. There are enough reasons for this decision to warrant its own blog post, so I won’t go into detail here.
In the new system all applications are treated as first class citizens, meaning there is nothing that the web application can do that any other application can’t. This is achieved by having all of the business logic, authentication and database interaction handled by the `api-server`, which is accessible by anything that can make HTTPS requests and handle JSON responses.
This means that the mobile applications will be able to perform actions such as registering a user and editing their details, which they cannot under the current system. Another benefit to the mobile applications that isn’t shown on this diagram is that the `feed-downloader` calls Amazon SNS with information about how many new blog posts it has found every time it runs. This in turn is relayed to the mobile applications in the form of notifications.
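As a rough sketch of the notification step, the function below builds the text that a run of the `feed-downloader` might hand to a notification service. The function name and the wording of the message are assumptions for illustration only; the actual call to Amazon SNS is deliberately left out.

```typescript
// Hypothetical helper: the text a mobile notification might carry.
// Returns null when nothing new was found, so no notification needs to be sent.
function newPostsMessage(newPostCount: number): string | null {
  if (newPostCount <= 0) {
    return null;
  }
  const noun = newPostCount === 1 ? "blog post" : "blog posts";
  return `${newPostCount} new ${noun} on CS Blogs`;
}
```

Returning `null` for an empty run keeps the decision of whether to notify at all in one place, so users are not pinged when nothing has changed.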
Whereas in the old system we used MongoDB, I’ve opted to use PostgreSQL (via the Sequelize Node.js ORM) this time around. Some of the features I want to implement in the future, such as organisations, make more sense to me as relations than as documents, and the ecosystem of applications for interacting with SQL databases, and PostgreSQL in particular, is much more mature than MongoDB’s.
The `feed-downloader` is portable, but contains an entry point so that it can be used as an infrastructure-less AWS Lambda function (and I suppose this entry point would also work for the newly released Azure Functions system). It’s a bit more clever than the old `feed-aggregator` in that it uses If-Modified-Since HTTP requests to only download and parse RSS or ATOM feeds that purport to have changed since the last time an aggregation was run.
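The conditional request idea can be sketched as two small functions: one that builds the request headers, and one that decides from the response status whether there is anything to parse. The `FeedState` type and both function names are assumptions for this example and are not taken from the `feed-downloader` source.

```typescript
// Hypothetical record of what is known about one author's feed.
interface FeedState {
  url: string;
  lastFetched?: Date; // undefined until the first successful download
}

// Build the headers for a conditional GET. If-Modified-Since expects an
// HTTP-date, which Date.prototype.toUTCString() produces.
function conditionalHeaders(feed: FeedState): Record<string, string> {
  if (feed.lastFetched === undefined) {
    return {}; // first run: download unconditionally
  }
  return { "If-Modified-Since": feed.lastFetched.toUTCString() };
}

// A 304 Not Modified response has no body, so only a 200 response
// carries a feed that needs to be parsed.
function needsParsing(statusCode: number): boolean {
  return statusCode === 200;
}
```

The headers would be passed to whichever HTTP client makes the request. Note that the saving depends on the blog host honouring the header; a server that ignores it simply replies 200 with the full feed, which is why the text says feeds that "purport" to have changed.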
The implementation of the `feed-downloader`, `api-server` and `web-app` components follows my guide to writing better quality Node.js applications. Node.js was chosen due to its abundance of good quality libraries, ease of interaction with JSON objects and the authors’ familiarity with it in production scenarios.
In order to meet the requirement of good maintainability, the `feed-downloader` was built using the test-driven development methodology and currently has 99% test coverage. These tests use real data: feeds from actual CS Blogs authors, including feeds from Blogger, WordPress.com, WordPress.org, Ghost and Jekyll.
There’s still a lot to be done before the new CS Blogs can be released, so why not hit up the contribution guide and get involved?
Rob and I have both been doing a lot of work on CS Blogs since the last time I blogged about it. It’s now in a usable state, and the public is welcome to sign up and use the service, as long as they are aware there may be some bugs and changes to public interfaces at any time.
The service has been split up into 4 main areas, which will be discussed below:
csblogs.com – The CS Blogs Web App
CSBlogs.com provides the HTML5 website interface to Computer Science Blogs. The website itself is HTML5 and CSS 3 compliant, supports all screen sizes through responsive web design and supports high and low DPI devices through its use of scalable vector graphics for iconography.
Through the web app a user can read all blog posts on the homepage, select a blogger from a list and view their profile (including links to their social media, GitHub and CV), or sign up for the service themselves.
One of the major flaws with the hullcompsciblogs system was that to sign up a user had to email the administrator and be added to a database manually. Updating a profile happened in the same way. CSBlogs.com aims to entirely remove that pain point by providing a secure, easy way to get involved. Users are prompted to sign in with a service (either GitHub, WordPress or StackExchange) and then register. This use of OAuth services means that we never know a user’s password (meaning we can’t lose it) and that we can auto-fill some of their information upon sign in, such as email address and name, saving them precious time.
As with every part of the site, a user can sign up, register, manage and update their profile entirely from a mobile device.
api.csblogs.com – The CS Blogs Application Programming Interface
Everything that can be viewed and edited on the web application can be viewed and edited from any application which can interact with a RESTful JSON API. The web application itself is actually built on top of the same API functions.
We think making our data and functions available for use outside of our system will allow people to come up with some interesting applications for a multitude of platforms that we couldn’t support on our own. Alex Pringle has already started writing an Android App.
docs.csblogs.com – The CS Blogs Documentation Website
docs.csblogs.com is the source of information for all users, from application developers consuming the API to potential web app and feed aggregator developers. Alongside pages of documentation on functions and developer workflows there are live API docs and support forums.
The screenshot below shows a docs.csblogs.com page which tells a developer the expected outcome of an API call and actually allows them to test it live on the documentation page, in a similar way to Facebook’s Graph API Explorer.
Thanks to readme.io for providing our documentation website for free due to us being an open source project they are interested in!
The CS Blogs Feed Aggregator
The feed aggregator is a Node.js application which, every five minutes, requests the RSS/ATOM feed of each blogger and adds any new blog posts to the CSBlogs database.
The job is triggered using a Microsoft Azure WebJob; however, it is written so that it could also be triggered by a standard UNIX cron job.
Whilst much of the actual RSS/ATOM parsing is provided by libraries, it has been interesting to see inconsistencies between different platforms’ handling of syndication feeds. Some give you links to images used in blog posts, some don’t; some give you “Read more here” links, some don’t. A reasonable amount of code was written to ensure that all blog posts appear the same to end-users, no matter their original source.
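As a rough illustration of that normalisation work, the sketch below pulls the first image out of a post’s HTML and strips a trailing “Read more” link. Both function names are assumptions for this example, and using regular expressions on HTML is a simplification; it is not how the aggregator is known to do it, and a proper HTML parser would be more robust.

```typescript
// Return the URL of the first <img> in the HTML, or null if there is none.
function firstImageUrl(html: string): string | null {
  const match = /<img[^>]*\ssrc=["']([^"']+)["']/i.exec(html);
  return match?.[1] ?? null;
}

// Remove any link whose text starts with "Read more" (case-insensitive),
// so that summaries look the same whichever platform produced them.
function stripReadMore(html: string): string {
  return html.replace(/<a\b[^>]*>\s*read more[^<]*<\/a>/gi, "").trim();
}
```

Keeping each quirk in its own small function makes it easy to add a test case whenever a new platform turns up with a different habit.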
I welcome anyone who wants to try the service now at http://csblogs.com. We would also love any help, whether that be submitting bugs via GitHub issues or writing code over at our public repository.
Between studying hard for my master’s degree, and applying for jobs for when it ends, I have managed to find some time to set up a new website called CSBlogs.com.
People who have been reading this blog for a few years will have seen HullCompSciBlogs.com mentioned a few times. For those who haven’t, it was a service which aggregated all of the blog feeds of computer science students at the University of Hull.
John Van Rij did a great job of keeping that service online, but unfortunately doesn’t have time to maintain it anymore. Since the service went down I have grown to miss it — I guess I didn’t realise how much enjoyment I get from seeing how well everyone is doing from back in Hull — current students, alumni and even lecturers.
In order to resolve this problem I set up CSBlogs.com with the aim of getting all of the Hull Computer Science bloggers and others from around the country onboard.
The project is entirely open source, under the MIT license, and can be forked, modified and improved by the community on GitHub.
The website itself is hosted on Microsoft Azure and utilises CloudFlare to provide security, analytics and a global content delivery network. Node.js is used as the backend programming language and the MongoDB NoSQL database is used for persistent data storage. Node.js packages are used extensively, including Express.js for routing, Handlebars for data-binding to the front end and LESS-Middleware to improve CSS development.
Complicated acronyms aside, I have worked hard to make setting up a local development environment and contributing source as easy as possible for beginners via the instructions I have written on the homepage of the GitHub repository. I would really recommend any first- or second-year students give it a go, as open source development looks great on your CV! And if you need any help, contact me as per the instructions.
We are currently in the process of setting up all of the required frameworks and technologies and writing guides for how to get involved (this has actually been one of the more challenging and interesting parts of the project so far) and hope to have a working minimum viable product in the next week.
At this point I would like to thank Charlotte Godley, Alex Pringle & Rob Crocombe for their extensive help in getting the project to where it is now. Charlotte has taken on a role of project management, Alex has developed a rudimentary database controller and Rob has been working on implementing less.js support and developing a theme for the site.
I will keep the blog updated with progress on the project.