Tag Archive | Result

Languages and Their Compilers ACW Result – 80%

You may have noticed that my blog hasn’t been updated nearly as much this semester as it has been in those last year and the year before. One of the reasons for this has been the Languages and Compilers assessed coursework, which has taken up quite a bit of time.

The aim of the coursework was to build a compiler, a bit of software which converts from a “high level” human readable language to a “low level” language a computer can execute, for a completely made up language called SPL – which stands for Simple Programming Language.

Formalizing the Grammar of the Language

SPL If Then Else Syntax Diagram

SPL If Then Else Syntax Diagram

The first thing we had to do was formally specify the grammar of the language — the grammar of a language specifies what is, and isn’t, valid in said language — we did this by reading some syntax diagrams provided to us by our lecturer, such as the one above, and then writing out Backus-Naur form notation to describe it. The image above in Backus-Naur form is notated as:

<if_statement> ::= IF <conditional> THEN <statement_list> ENDIF |
                   IF <conditional> THEN <statement_list> ELSE <statement_list> ENDIF

Items in <brackets> are other rules, so there would be another rule which describes the syntax of “conditional” and “statement list”. Using multiple rules together allows us to build up a complete syntax for the language.

The Tokenizer

Once we had developed a complete BNF grammar for the language we could move on to building the compiler itself, this was especially interesting for me because my final year project uses a lot of the same concepts.

The first stage of building the compiler was building a tokenizer. A tokenizer, as I’ve spoken about before, is a system which recognizes certain keywords, or patterns, and gives them a label which can then be used later on in the compilation. For example the C# code:

public static void main()
{
     int a = 10 / 2;
}

Consists of the following tokens:

  • Public Keyword
  • Static Keyword
  • Void Keywork
  • Identifier Pattern main
  • Open Bracket (sometimes referred to as BRA)
  • Close Bracket (sometimes referred to as KET)
  • Open Curly Bracket
  • int keyword
  • Identifier Pattern a
  • Symbol =
  • Integer Pattern
  • Symbol /
  • Integer Pattern
  • Semi Colon
  • Close Curly Backet

We used an open source version of the Unix program lex, called flex, to build our tokenizer (in my FYP I’m building a tokenizer from scratch). In our flex file we recognized tokens using regular expressions. Matching something like a keyword is rather simple, you just look for that literal string, however looking for an integer is a tad more interesting. Below you can see a few examples:

/* Primitives */
digit                  [0-9]
number                 {digit}+
/* End: Primitives*/

//Public keyword
public                 TOKEN(public);

//Integers (e.g. whole numbers)
{number}               TOKEN(integer);

//Reals (e.g. Numbers with decimal point)
{number}.{number}      TOKEN(float);

On the left hand side of the token recognition is the pattern that has to be matched. For the word public it is the literal string public, for integers it is a {number} and for a real it is a {number} a full stop and then another {number}. {number} has been declared previously to be 1 or more digits, and a digit has been declared before that to be any number between 0 and 9.

The tokenizer takes a .spl file, which contains spl code written by a programmer, and spits out an array of tokens which we can then use in the next stage.

The Syntax Parser

Once we’ve finished building our tokenizer we could then go on to building a syntax parser. The job of the syntax parser is to ensure that the array of tokens provided by the tokenizer are in an order which makes sense. To produce the syntax parser we used a free program called Bison, which is based on the Unix program YACC.

This is where we go back to the Backus-Naur Form we made previously. Using that, with some alteration, we produced a .y file which, once passed through Bison, could identify correct grammars in the list of tokens passed to it — and tell the programmer there was an issue with their code if anything was detected as incorrect.

Correct grammars were added to a parse tree, in hierarchical order. As we saw earlier, an IF statement contains a conditional, so that became one of the branch nodes of the tree which represented an IF statement.

At this point we could print out the parse tree, my software could print both to the console and to an interactive animated viewer. Below you can see both:

a.spl Parse Tree as shown in the Command Prompt

a.spl Parse Tree as shown in the Command Prompt

a.spl Parse Tree as Interactive Animated Viewer (Click to enlarge)

a.spl Parse Tree as Interactive Animated Viewer (Click to enlarge)

Code Generation

Once we have determined that the input file from the programer is syntactically correct we need to generate executable code. For this coursework instead of generating an executable directly we generated ANSI C code, and passed that to the GNU C Compiler. We did this for a few reasons:

  • Generating machine code directly wouldn’t be as portable
  • Generating machine code would take a lot longer!
  • Learning ANSI C was really useful, as its the number 1 most used programming language in the world

Generating code was a simple case of walking the parse tree and generating code for each node… for example, if you reach an IF THEN ELSE node you could simply output the following:

if(/*Generate code for the CONDITIONAL here*/)
{
    /*Generate code for the IF statement list here*/
}
else
{
    /*Generate code for the ELSE statement list here*/
}

Ensuring you generate code for the conditional and statement lists in the correct sequence. Certain types of grammar throw up a few problems, such as FOR LOOPS (which way should the iteration go?) but the concept of code generation from a tree in itself is simple.

Optimization

The final part of the coursework was to add some optimization to the compiler. Some optimizations you could implement included:

  • Don’t compile the code if it doesn’t contain and input or output (if a program has no input or output, it looks the same as having no program at all!)
  • If a programmer increments a variable by 1 generate the ++ operator rather than +1
  • Constant Folding: If a variable is assigned to the outcome of a constant set of values. e.g.  a = 4/2, then generate a = 2 because this will always be the case

I chose to do both the incrementation optimization and the no input or output optimizations. I was going to add constant folding too, but didn’t think I had enough time.

Conclusion

Today I got my grade for the coursework and achieved 80%, which is a strong first class. I’m really happy with this. 🙂

I would recommend that any second year thinking about what modules he or she should take next year seriously consider Languages and Compilers, it is by far the best module I have taken. It combines knowledge from all your previous modules and really cements that information in your head.

I’d like to thank both Dr. Martin Walker and Eur Ing Brian Tompsett for setting such an interesting coursework, and providing me with the help that they did along the way.

If I’m allowed I might upload my compiler and some documentation for SPL for people to have a play with it.

Danny

Advertisements

Systems Analysis, Design and Process Exam and Result

Image from Rob Miles' Introduction to the 08220 Module

This Image from Rob Miles’ Module Introduction Lecture Slide is what kept me sane throughout revision 😉

On Tuesday I had an Exam for the Systems Analysis, Design and Process Module. You may remember I posted about the other half of the assessment, the Mr. Bump Management System Group work before, here.

The exam mainly covered the comprehension of specifications, diagrams used in System design (State Transition, Activity, Use Case, Class, Sequence, PERT, etc) and BCS Ethics. All of which were quite interesting. 🙂

I achieved a grade of 72% — which is a first class — so hopefully, along with my Group work I will have a first class grade overall for the module.

Danny

Advanced Programming Exam and Result

C++ vs C#

Last Wednesday I had an exam for the Advanced Programming module. The exam was about the syntax and use of the C++ Language and the ability to read Assembly Language. I was fairly confident, even though I have personally developed very few C++ applications. As more of an application developer than a real time system or game developer I prefer to ease, reliability and stability of C# or other managed languages such as Java over the pure speed of C++.

Having said that, C++ and Assembly are interesting languages and by studying them I feel I have learnt a lot about programming in general, particularly optimization and how things actually work on the hardware.

The exam wen’t well, and I received a mark of 75% — a first class — which I’m very happy with 🙂

In the upcoming semester there will be a coursework and one more exam for the module. Hopefully I do as well in them as I have in this exam 🙂

Danny

Programming 2 Test Result

Today we took a test for our Programming 2 Module, and because it was an online assessment we got immediate feedback. I got 100% on questions which varied from “What is a static member?” to “What is an overloaded method?”. Can’t ask for better than that, I’m very please 🙂

Danny.

IT and Professional Skills Test Result

Today I took and got my result for the Professional Writing and Communication Skills test for the IT and Professional Skills Module. Due to the test being online, on eBridge, I got instant feedback and found I got 48/60, which is 80%, a pretty high first class grade so I’m really happy 🙂

Danny.