External STG Interpreter

In my previous blog post I mentioned that I'd like to create an interpreter for the external STG IR. That is exactly what I've worked on for the past several months. I call it the external STG interpreter. It can execute programs that were compiled with GHC-WPC.
For example, it can run GHC itself (DEMO: GHC 9 running in the External STG Interpreter) or a simple interactive OpenGL application (DEMO: External STG interpreter & breakpoints).

The external STG interpreter supports most GHC primops and the FFI, and it implements the necessary RTS features such as the thread scheduler and the I/O manager. Among the primops, only the compact region, STM and GHCi bytecode primops are not implemented. In the future I plan to add support for compact regions and STM, but not for the GHCi primops. The GHCi bytecode primops will not be needed because the external STG interpreter essentially subsumes the GHC RTS bytecode interpreter.
I reused the GHC testsuite to test the external STG interpreter. Additionally, I wrote QuickCheck-based unit tests for the simplest primops.

One might ask how fast the interpreter is. Well, it is pretty slow, no question. But its performance is not the point. Of course performance is important to me as well, but the interpreter is a learning tool and a research vehicle for developing novel and performant backends for Haskell. So the main purpose of the external STG interpreter is to serve as a high-level executable specification of the GHC STG and RTS semantics. Without a simple spec it is impossible to develop new and conceptually different compiler backends for GHC.

The other important property that the interpreter delivers is program observability. Program observability is crucial to validate the effect of optimisation transformations and to find and understand runtime issues, like compiler/RTS bugs or memory leaks.

The most important design principle for the interpreter was to keep it simple, so that an average Haskell programmer could understand the Haskell runtime semantics. Without this simplicity I could not understand the compiler backend and the RTS as a whole. I specifically developed the external STG interpreter to answer my question of whether a cool compilation technique from a paper (e.g. GRIN, LoCal or SPMD-on-SIMD) is applicable to Haskell or not.
I'm interested in Haskell in general, not specifically in the GHC STG machine flavour of Haskell with its fine-tuned implementation details. If I restricted the feasibility question to the GHC STG machine, the answer would be no in most cases.

The external STG interpreter is part of the GHC-WPC tooling. GHC-WPC exports the STG IR of the compiled Haskell modules to .modpak files.
The .modpak files are zip archives that can contain module-related data such as the ext-stg IR or pretty-printed .cmm, .stg and .core files.
At link time GHC-WPC also writes out a single YAML-like .ghc_stgapp file for the linked application, which references the Haskell and C library dependencies that the application uses.
This .ghc_stgapp file is the main file for the Haskell application. It can be passed to other GHC-WPC CLI tools, such as gen-exe to generate a binary executable, or to the external STG interpreter to execute the program. mkfullpak is a convenience tool that collects all of the program's dependencies into a single .fullpak (zip) file.

Besides being a way to learn the semantics of STG, the primops and the RTS, the interpreter is also an excellent tool for debugging programs. So not only compiler backends but also their debugging and profiling tooling could be researched and prototyped using the interpreter.
It was shockingly easy to add a debugger REPL to the interpreter. It was also simple to implement a GC and quickly fix the related bugs. I never had a segmentation fault. Instead I got an easy-to-understand pattern match error with a specific source location and stack trace info.
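To illustrate why bugs surface as readable errors rather than segmentation faults, here is a toy sketch. It is not the interpreter's actual code: the `Closure` type, the association-list heap and `readCon` are all hypothetical simplifications. The idea is only that when the heap is an ordinary Haskell value, a dangling address or an unexpected closure kind ends in an error message with a call stack instead of corrupted memory.

```haskell
import GHC.Stack (HasCallStack)

-- Hypothetical, heavily simplified heap model (not the interpreter's real types).
type Addr = Int

data Closure
  = Con String [Addr]   -- evaluated constructor with pointer fields
  | Thunk String        -- unevaluated expression, named for illustration
  | BlackHole           -- thunk that is currently being evaluated

type Heap = [(Addr, Closure)]

-- Look up an evaluated constructor. An unexpected heap state stops with
-- a message and a call stack (via HasCallStack) instead of corrupting memory.
readCon :: HasCallStack => Heap -> Addr -> (String, [Addr])
readCon heap addr = case lookup addr heap of
  Just (Con name args) -> (name, args)
  Just (Thunk name)    -> error ("expected a constructor at " ++ show addr ++ ", found thunk " ++ name)
  Just BlackHole       -> error ("expected a constructor at " ++ show addr ++ ", found a black hole")
  Nothing              -> error ("dangling heap address " ++ show addr)
```

For instance, calling `readCon [(1, Con "Just" [2])] 3` in GHCi raises an exception that names the dangling address, which is the kind of failure that would be a crash in a native RTS.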

I believe this interpreter-based approach is the way to go when developing a compiler backend and RTS. Conceptually this is a cross-platform development setup, where the development is done on a powerful computer that can compensate for the interpreter's overhead. When the design and implementation are complete, a native RTS and codegen backend is written or generated for the less performant target platform. It does not matter whether the two computers have the same ISA.
In my experience, this approach makes system development just as easy and fun as regular Haskell application development.

Next, I'd like to use the external STG interpreter and its debugger to implement a heap memory analysis and visualisation tool for observing and understanding the runtime memory behaviour of Haskell programs. This tool could help to find memory leaks.
At first I'd just like to implement a low-level visualisation tool that could aid manual leak debugging. I have already read lots of papers about memory visualisation and leak detection for garbage-collected languages. (Papers for heap memory analysis and leak detection)

I hope that as I learn more about the problem domain and gain experience in fixing memory leaks, I'll be able to create a high-level tool with a simple UX. I'll write the analyses in Soufflé Datalog, because it is the most efficient and convenient way to do it. To support this approach I've extended the interpreter to save the STG state to .tsv files that Soufflé can parse out of the box.
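As a rough sketch of what such an export could look like, the snippet below writes a toy heap as two tab-separated relations: one row per object and one row per pointer edge. The `HeapObject` record, its field names and the two-relation layout are assumptions made for illustration only. The interpreter's real state and file layout are richer and may differ.

```haskell
import Data.List (intercalate)

-- Hypothetical, simplified view of one heap object (for illustration only).
data HeapObject = HeapObject
  { hoAddress     :: Int     -- address of the object
  , hoConstructor :: String  -- constructor name, assumed to contain no tabs
  , hoFields      :: [Int]   -- addresses this object points to
  }

-- One row per object: address, constructor name.
heapObjectRows :: [HeapObject] -> [[String]]
heapObjectRows objs =
  [ [show (hoAddress o), hoConstructor o] | o <- objs ]

-- One row per pointer edge: source address, field index, target address.
heapEdgeRows :: [HeapObject] -> [[String]]
heapEdgeRows objs =
  [ [show (hoAddress o), show i, show target]
  | o <- objs
  , (i, target) <- zip [0 :: Int ..] (hoFields o)
  ]

-- Render rows as tab-separated lines, one fact per line.
renderTsv :: [[String]] -> String
renderTsv = unlines . map (intercalate "\t")

-- Write both relations to the given files.
exportHeap :: FilePath -> FilePath -> [HeapObject] -> IO ()
exportHeap objectFile edgeFile objs = do
  writeFile objectFile (renderTsv (heapObjectRows objs))
  writeFile edgeFile (renderTsv (heapEdgeRows objs))
```

Each output line is one fact with tab-separated columns, which is the default input format Soufflé expects for its fact files. A Datalog rule can then, for example, compute which objects are reachable from a set of roots by following the edge relation.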

I'm really excited to work on the analysis and visualisation tool because I have never had access to run-time data at this level of detail.