Troydm's Blog

A personal blog about software development

Rewriting Micro Compiler in OCaml Using Ocamllex and Ocamlyacc

In my previous post I talked about writing a micro compiler in OCaml in under 300 lines of source code. There are a number of ways to make our work easier and the number of source lines significantly smaller.

Potato Loli

Let’s rewrite our micro compiler using tools called lexer and parser generators. We’ll be using ocamllex and ocamlyacc, which are distributed with the OCaml compiler and are modeled after the famous lex and yacc tools for Unix operating systems. Those tools have better modern analogues called flex and bison, which are described in detail in the book Flex & Bison: Text Processing Tools. Nowadays, however, if you are writing a professional compiler in OCaml I strongly suggest you consider using sedlex and menhir instead of ocamllex and ocamlyacc, as both tools are quite outdated and lack some significant features that their modern analogues have, such as Unicode support for lexing, parameterized parser generation and a built-in grammar interpreter.

So what are lexer and parser generators? To put it simply, ocamllex and ocamlyacc take special .mll and .mly definition files, which mix lexer and parser rules with OCaml source code, and generate .ml source files that do the actual token generation and parsing for you. Pretty neat indeed, and it’s actually easier to use than it sounds, so let’s rewrite our micro compiler using those tools. We’ll be using the original source code of the micro compiler as a reference only, as the entire code base needs to change. You can see the end result of our rewrite in the simple branch of the micro git repository. For the actual description of the micro language see my previous post. So let’s get started!
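To give a rough idea of the kind of work a generated lexer takes off your hands, here is a small hand-written tokenizer in plain OCaml. It is only a sketch: the token type and the function names (tokenize, skip_while) are made up for illustration and are not taken from the micro repository, and the code ocamllex actually generates looks quite different (it is table-driven).

```ocaml
(* Hypothetical token type; names are illustrative only. *)
type token =
  | INT of int
  | ID of string
  | PLUS
  | ASSIGN
  | SEMI
  | EOF

let is_digit c = c >= '0' && c <= '9'
let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')

(* Index of the first character at or after [i] that does not satisfy [p]. *)
let rec skip_while p s i =
  if i < String.length s && p s.[i] then skip_while p s (i + 1) else i

(* Turn a string into a list of tokens, ending with EOF. *)
let tokenize (s : string) : token list =
  let n = String.length s in
  let rec go i acc =
    if i >= n then List.rev (EOF :: acc)
    else
      match s.[i] with
      | ' ' | '\t' | '\n' -> go (i + 1) acc
      | '+' -> go (i + 1) (PLUS :: acc)
      | ';' -> go (i + 1) (SEMI :: acc)
      | ':' when i + 1 < n && s.[i + 1] = '=' -> go (i + 2) (ASSIGN :: acc)
      | c when is_digit c ->
          let j = skip_while is_digit s i in
          go j (INT (int_of_string (String.sub s i (j - i))) :: acc)
      | c when is_alpha c ->
          let j = skip_while (fun c -> is_alpha c || is_digit c) s i in
          go j (ID (String.sub s i (j - i)) :: acc)
      | c ->
          failwith (Printf.sprintf "unexpected character %c at offset %d" c i)
  in
  go 0 []

(* Usage: a single assignment statement. *)
let () =
  assert (tokenize "x := 1 + 22;"
          = [ID "x"; ASSIGN; INT 1; PLUS; INT 22; SEMI; EOF])
```

With a lexer generator, each branch of that match becomes a one-line rule in the .mll file, and the character-by-character bookkeeping is generated for you.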

Writing Micro Compiler in OCaml

At one point or another, every software developer reaches the moment in their career when the time is ripe to write their own super cool programming language.

Lemon Loli

However, creating your own programming language with a compiler is quite a complex subject and can’t be tackled without some research first. That’s how I started reading Crafting a Compiler with C, an aged but really comprehensive book about developing your own compiler for an Ada-like programming language. The second chapter describes writing a really simple micro language targeting pseudo assembly-like output in order to explain the core concepts of developing your own compiler and writing an LL(1) parser.

Let’s try rewriting this micro compiler in OCaml, a language better suited for writing compilers that is becoming quite popular due to its clean syntax and strict evaluation semantics combined with functional and object-oriented programming styles. If you are not familiar with OCaml, try reading Real World OCaml first. Instead of outputting pseudo assembly, our micro compiler will output real nasm source code, which will be automatically compiled into a binary executable file.
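To give a feel for the LL(1) parsing mentioned above, here is a minimal recursive-descent sketch in OCaml: each function looks at exactly one token of lookahead to decide what to do next. The grammar, the token and expr types and the function names are all assumptions chosen for illustration; they are not the micro language or the code from the book.

```ocaml
(* Hypothetical tokens and syntax tree; names are illustrative only. *)
type token = INT of int | PLUS | MINUS | LPAREN | RPAREN | EOF

type expr =
  | Num of int
  | Add of expr * expr
  | Sub of expr * expr

exception Parse_error of string

(* primary ::= INT | "(" expr ")" *)
let rec parse_primary = function
  | INT n :: rest -> (Num n, rest)
  | LPAREN :: rest ->
      (match parse_expr rest with
       | e, RPAREN :: rest' -> (e, rest')
       | _ -> raise (Parse_error "expected )"))
  | _ -> raise (Parse_error "expected integer or (")

(* expr ::= primary { ("+" | "-") primary }, left-associative *)
and parse_expr tokens =
  let rec loop (lhs, tokens) =
    match tokens with
    | PLUS :: rest ->
        let rhs, rest' = parse_primary rest in
        loop (Add (lhs, rhs), rest')
    | MINUS :: rest ->
        let rhs, rest' = parse_primary rest in
        loop (Sub (lhs, rhs), rest')
    | _ -> (lhs, tokens)
  in
  loop (parse_primary tokens)

(* Parse a whole token list; only EOF may remain afterwards. *)
let parse tokens =
  match parse_expr tokens with
  | e, [EOF] -> e
  | _ -> raise (Parse_error "trailing input")

(* Usage: the tokens of "1 + (2 - 3)". *)
let () =
  assert (parse [INT 1; PLUS; LPAREN; INT 2; MINUS; INT 3; RPAREN; EOF]
          = Add (Num 1, Sub (Num 2, Num 3)))
```

The parser returns the remaining tokens along with each result, so no mutable state is needed; a real compiler would likely also track source positions for error messages.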

Making 30 Years Old Pascal Code Run Again

Recently I’ve been interested in Logic Programming, notably in learning Prolog, so I’m in the process of reading two great books, Prolog Programming for Artificial Intelligence and The Art of Prolog. If you want to get a quick feel for Prolog I recommend you take a look at Bernardo Pires’s Gentle Introduction to Prolog and the Prologomenon blog. To put it simply, Prolog is all about logic, deduction and backtracking.

Sherlock Loli

While browsing /r/prolog I stumbled upon Prolog for Programmers, originally published in 1985, an old book indeed and honestly sometimes hard to follow. I can’t recommend it as a starter book on Prolog, but it’s still quite interesting to read. However, it has two whole chapters describing the implementation of a Prolog interpreter, which is quite a complex task and sparked my interest in continuing to read the book. The authors provide the source code of two versions of the Prolog interpreter: the one they originally wrote in Pascal back in 1983 and its port to C from 2013, which, as stated on their website, was done because they couldn’t compile the old Pascal code with the Free Pascal Compiler and because… well, Pascal is quite out of fashion nowadays. Couldn’t compile?! Well, challenge accepted!

Processing & Broadcasting Financial Data in Scheme

Any software developer who has worked in the financial industry will tell you that there are a few key requirements for programming applications for the real-time market. Applications should be as fast as possible and as easily modifiable as possible. The first requirement is essential because getting and processing information takes time, and sending the processed information takes additional precious time, and in the financial world time equals money. The second requirement comes from the constantly changing business rules imposed on data processing.

(eq? 'money 'power)

Writing IRC Bot Using Perl 5 and POCO::IRC

Some people use IRC to chat, some don’t. It was invented a really long time ago and isn’t going away anytime soon, despite newer alternatives like Jabber popping up.

Personally I always have my IRC client running (I’m using weechat + tmux) and chat with lots of interesting people who inspire me to try new technologies and learn something different every day. One person, whose nickname I won’t mention, was always telling me how awesome Perl is as a programming language and how great its potential is thanks to CPAN, which has almost 124k modules for any life situation. I always thought he was exaggerating and literally acting like a Perl fanboy. Perl was the first programming language I learned, back in the late 90’s. Remembering how frustrating my experience with it was, and how cryptic it was for me to do anything with it when I was inexperienced and lacked many of the qualities that make up any decent software engineer, I was skeptical about using it again. Well, time passed, time always passes, and I haven’t written anything more than quick 50-line server scripts in Perl for almost 13 years. I’ve forgotten almost everything about Perl. Since I’d lately had this crazy idea of writing an IRC bot that could store and execute shell scripts on a server, so I could automate my servers through IRC, I thought why not write it in Perl. I remembered that the person who was always bragging about Perl’s greatness wrote an IRC bot in Perl using POE::Component::IRC, so I decided to try the same framework for my bot. It’s based on the really popular POE event loop framework, which is very easy to learn and use. Matt Cashner wrote a really good introductory article called Application Design with POE.

Hosting Your Own Remote Private Torrent Tracker

Ever wanted to share a really big file (more than 4 GB) with someone without the hassle of uploading it to some file upload server?

BitTorrent to the rescue! There are alternatives, like hosting your own ftp/sftp file server, but I won’t consider them here. So you probably already have a dedicated home file server running Linux/BSD/Solaris that also has a torrent client installed on it that you access through a web interface?

Oh, you don’t? Snap, it’s so useful that nowadays almost everyone has some kind of NAS that they use for file storage and torrents. So if you don’t have one then you are behind the times.

So what do we need to share a file over torrent? Yes indeed, we need a torrent tracker.