Sunday, June 04, 2017

Correctness by design vs. formal verification

One common debate in the programming language community is correctness by design vs. verifying the correctness of your program. After all, if you can guarantee that every program you write is correct by construction, then there is no need to separately prove its correctness (using program analysis, formal verification, you name it), which is arguably non-trivial in general. How might you ensure correctness by construction? One easy way is to design a programming language that *rules out* certain bugs. For example, back in 2001, when I started learning programming in C and then C++, I had to learn how to debug segmentation faults. Switching to higher-level programming languages like Java and Python meant that I no longer had to deal with segmentation faults (assuming the correctness of the JVM and the Python interpreter). Can you do the same with other kinds of errors?

An interesting CACM article that I recently read (https://cacm.acm.org/magazines/2017/5/216322-ending-null-pointer-crashes/fulltext) discusses how the programming language Eiffel eliminates the problem of null pointer crashes (which can cause serious security vulnerabilities). A null pointer dereference happens when you evaluate x.f in an object-oriented programming language, where x is a variable holding null and f is a field. As the article discusses, it is not straightforward to eliminate null pointer dereferences by language design alone without severely limiting the expressiveness and convenience of the language. Eiffel instead relies on a sophisticated combination of type checking and static analysis to ensure that a program accepted by an Eiffel compiler cannot exhibit null pointer crashes. So the case study of Eiffel seems to suggest that correctness should be achieved not by language design alone or by software verification alone, but by both of them together.
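To make the x.f problem concrete, here is a minimal sketch in Python rather than Eiffel, so it only loosely mirrors what the article describes. The names `Node`, `find_node`, `read_field_unsafe`, and `read_field_safe` are hypothetical and chosen for illustration. The type annotations say that a lookup may return None (Python's analogue of null), and an external static checker such as mypy (an assumption here, since Python itself does not enforce annotations) would be expected to flag the unguarded access while accepting the guarded one.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    f: int  # plays the role of the field f in "x.f"


def find_node(nodes: Dict[str, Node], key: str) -> Optional[Node]:
    """Hypothetical lookup: returns None when the key is absent."""
    return nodes.get(key)


def read_field_unsafe(nodes: Dict[str, Node], key: str) -> int:
    x = find_node(nodes, key)
    # x has type Optional[Node], so x may be None here. A static checker
    # is expected to reject this line; at run time it raises
    # AttributeError when x is None.
    return x.f


def read_field_safe(nodes: Dict[str, Node], key: str, default: int = 0) -> int:
    x = find_node(nodes, key)
    if x is None:
        # The None case is handled explicitly, so the access below
        # can never be reached with x being None.
        return default
    # After the check, x is known to be a Node on this path.
    return x.f


nodes = {"a": Node(f=42)}
```

The design point is that the language (or its checker) does not forbid null outright. It tracks which variables may hold it and accepts x.f only where a preceding check rules null out, which is a small-scale version of combining language design with static analysis.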

What about other kinds of programming errors, e.g., cross-site scripting in HTML5 or data races in concurrent programming?