OCaml Compiler Hacking: how to add a primitive

I have been hacking on the OCaml compiler recently; in particular, I added some support for coloring warning/error messages. At some point during the discussion over this pull request, it became clear that colored output should only be enabled if stderr was an interactive terminal (in opposition to a regular file handle or whatnot). The compiler does not link with the Unix library, so I finally decided to add a primitive caml_sys_is_interactive. My purpose here is to explain how it works (from what I gathered) and how to do it.

Simple Refinement Types for OCaml

For more than one year, vulnerabilies in software (especially pervasive C software) have been disclosed at an alarmingly high rate. I love OCaml, which is definitely safer, but still has gaps left open. I believe formal verification, albeit a very powerful tool, is not mature enough for most programmers (too difficult to use, requires too much efforts), so I'm thinking about alternative solutions that would be more lightweight to define and use, but would still increase the confidence in source code.

So here is a draft of a relatively simple extension to OCaml, that allows to specify more invariants in types without requiring a full-fledged proof system. It's purely hypothetical and I did not implement it (nor don't I plan to do it, not before the end of my PhD). The purpose is to propose a conservative extension of OCaml with a type system that can represent more complex invariants, without putting too much of a burden on the user, in a way so adding annotations to existing programs incrementally is feasible and even easy.

Introduction to Automated Theorem Proving with Logtk

My PhD work is centered around automated theorem proving in first-order logic. This is obviously a very cool topic (otherwise I wouldn't have focused on it), so this post is a crash course (but the program won't crash because I use OCaml) on one of the most classic method to prove (some) theorems automatically. I named... resolution!

The goal is to prove some (not too complicated) theorems automatically. In other words, we want a program that reads a bunch of axioms and a formula that we conjectured is a theorem following from the axioms, and then tries to produce a proof of the theorem. In practice it's almost always done by refutation: to prove that Γ ⊢ F (formula F is a theorem under axioms Γ), we try to deduce ⊥ (false) from Γ, ¬F). The applications for this kind of technology are multiple, but afaik the prominent one is software verification — aims at formally proving that a program satisfies a specification ("not crashing" is a good start). There has been a lot of research in this area for decades, but the problem is extremely hard (only semi-decidable: you will find a solution eventually if the problem is a theorem, but might run forever otherwise).

Batch Operations on Collections

Some very common (and useful) operations, including the classic map, filter, and flat_map, traverse their whole argument and return another collection. When several such operations are composed, intermediate collections will be created and become useless immediately after. Languages like Haskell sometimes perform optimizations that merge together the operations so as to minimize the number of intermediate collections :

map f . map g == map (f . g)

Sadly, the OCaml compiler performs very few optimizations (and in this case, the optimization would not be correct anyway, because it would change the order of evaluation and therefore side-effects).

Representing Lazy Values

A quick survey of Lazy in OCaml

I remember having heard that Jane Street had its own implementation of Lazy values, and that it was faster than the standard one. So I played a bit with several ways of managing lazy values in OCaml.