Progress Report (January 2021)

Dmitriy Kubyshkin
I started developing the Mass compiler in April 2020, with the majority of the early work captured in a series of YouTube videos. Due to the limitations of doing the work on video, progress in the first half-year or so was quite slow. Still, by September 2020 the language had become powerful enough to run FizzBuzz both in JIT mode and by compiling it to a Windows executable.

Four months later, FizzBuzz remains the most complex program in the test suite; however, that does not mean there was no progress. My focus throughout this time has been on two things: robustness and meta-programming capabilities. There is not much to be said about the robustness work - it is very important but rather uninteresting. Meta-programming, on the other hand, is both very important and core to this language.

There are two main parts to meta-programming: macros and compile-time execution. It may seem that one does not need macros when the ability to run arbitrary code at compile time is there. This does not seem to be the case in practice, for a couple of reasons. Firstly, macros very often provide a succinct way to express complex transformations that would be awkward to represent in straight code. Secondly, all compiled languages suffer tremendous penalties in project compilation times when doing compile-time execution. Solving this is no easy feat.

The core issue with making compile-time execution fast is the same one that interpreted languages with JIT implementations, such as LuaJIT or JavaScript engines, suffer from. If a certain piece of code is only ever executed once, straight-up interpretation is fastest. On the second call, you probably would have been better off with even a crappy JIT version. By the 10th call you have spent a couple of orders of magnitude more time than you should have due to the inherent overhead of interpretation.

Among mainstream compiled languages, only C++ and Rust provide facilities for compile-time evaluation. From my understanding, C++ constexpr code is interpreted, not compiled. Rust has compile-time procedural macros, but because Rust compilation is dead slow in general, using them carries a significant penalty, to the degree where some projects stop using the feature in certain cases in favor of offline code generation.

Among newly developed languages, Zig and Jai both use a form of byte-code interpretation. Although the evaluation approach differs a bit, if used extensively, compile-time execution will destroy your compile times in either of them. The Mass language aims to use compile-time execution to implement most of its features, which leaves no choice but to do JIT right away. As usual, this is way trickier than it sounds. Here are some of the things that need to be solved:

  1. Incremental JIT. Adding new code or data should not require any copying, recompilation, or interaction with the OS. I have a good plan of action here and am already making some progress.
  2. JIT Compilation Speed. Mass has no real AST or IR, so the speed isn't too bad, but there is currently an assembly step. It should be removed at some point in the future.
  3. Compile-time / Runtime Boundary. Nailing down the semantics of which values can be shared, and how, is tough. I expect this to be in flux until the very late stages of the project.
  4. Cross compilation. A different processor architecture or even a calling convention requires a separate version of the compiled code. Big-endian vs little-endian is also something to think about.

The topics above are what I plan to work on in the coming months. Of course, there are also smaller tasks that need to be done.

You can follow the project progress and support my work by subscribing to my YouTube channel, starring the project on GitHub or catching an occasional live stream on Twitch.
