CCjs

Reimagining C as JavaScript

Welcome to CCjs

Let’s imagine what C transformed into JavaScript might look like.

We’ll start with a very simple C program, primes.c, that prints the first N prime numbers:

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define ERROR   -1
#define SUCCESS  0
#define ZERO     0
#define DIVIDES(a, b)   ((a) % (b) == ZERO)

int defaultCount = 100;

int main(int argc, char *argv[])
{
    int count = argc > 1? atoi(argv[1]) : 0;
    if (count <= 0) count = defaultCount;
    printf("the first %d primes are: ", count);
    int prime = 2;
    while (count > 0) {
        int divisor = 2;
        int maxDivisor = sqrt(prime);
        while (divisor <= maxDivisor) {
            if (DIVIDES(prime, divisor)) {
                break;
            }
            divisor++;
        }
        if (divisor > maxDivisor) {
            printf("%d ", prime);
            count--;
        }
        prime++;
    }
    printf("\n");
    return SUCCESS;
}

Here’s how you would compile and run it from a command prompt:

cc primes.c -o primes
primes 10
the first 10 primes are: 2 3 5 7 11 13 17 19 23 29 

Here’s what an equivalent JavaScript version, primes.js, might look like:

importScripts("../../../lib/ccjs.js");
importScripts("../../../lib/math.js");
importScripts("../../../lib/stdio.js");
importScripts("../../../lib/stdlib.js");

class primes
{
    static main(argc, argv)
    {
        let count = argc > 1? stdlib.atoi(argv[1]) : 0;
        if (count <= 0) count = primes.defaultCount;
        stdio.printf("the first %d primes are: ", count);
        let prime = 2;
        while (count > 0) {
            let divisor = 2;
            let maxDivisor = math.sqrt(prime)|0;
            while (divisor <= maxDivisor) {
                if (prime % divisor == 0) {
                    break;
                }
                divisor++;
            }
            if (divisor > maxDivisor) {
                stdio.printf("%d ", prime);
                count--;
            }
            prime++;
        }
        stdio.printf("\n");
        return 0;
    }
}

primes.defaultCount = 100;

CCjs.run(primes.main);

The JavaScript version works just like the original. Below is a window containing the live output of the command “c/tests/primes/primes.js 100”:

NOTE: The windows created for these demos are intended to operate like small “terminal” windows, managed by a Terminal class that allows you to run CCjs apps as if they were built-in commands (e.g., c/tests/primes/primes.js). The Terminal class also supports ArrowUp and ArrowDown to recall previous commands and Ctrl-C to terminate a running app. A CCjs app runs in its own JavaScript worker thread; all other commands are passed to a dedicated “background” worker thread that calls eval() and returns the result.

The above demo works only because primes.js was written by hand. The challenge of CCjs is converting C to JavaScript automatically.

The first step is the CCjs Parser. It has a long way to go, but it is far enough along that it can parse an entire C application into tokens. It can be run from the command line (using Node) or on a web page (like this one).

Here’s primes.c converted into tokens, by running “src/parse.js c/tests/primes/primes.c”:

The CCjs Parser includes a complete C preprocessor and should be capable of dealing with any valid ANSI C syntax, along with a few C99 extensions (e.g., long long support). In fact, I can run all the C code from the SimH PDP10 Emulator through the CCjs Parser, compile the resulting token stream, and produce an identical binary.
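
To give a rough feel for what “converting C into tokens” involves, here is a minimal sketch of a tokenizer for a small subset of C. It is not the CCjs Parser: the tokenize() function, the token categories, and the output shape are all assumptions made for illustration, and it ignores preprocessing, comments, character literals, and floating-point constants.

```javascript
// Hypothetical sketch only; not the CCjs Parser or its token format.
// Splits a C fragment into identifier, number, string, and punctuator tokens.
function tokenize(source) {
    // Sticky ("y") matching: the pattern must match exactly at lastIndex.
    const pattern = /\s+|([A-Za-z_]\w*)|(\d+)|("(?:[^"\\\n]|\\.)*")|(\+\+|--|<=|>=|==|!=|&&|\|\||[-+*\/%=<>!&|^~?:;,.(){}\[\]])/y;
    const types = [null, "identifier", "number", "string", "punctuator"];
    const tokens = [];
    let pos = 0;
    while (pos < source.length) {
        pattern.lastIndex = pos;
        const match = pattern.exec(source);
        if (!match) {
            throw new SyntaxError("unexpected character at offset " + pos);
        }
        // Whitespace matches no capture group, so it produces no token.
        for (let i = 1; i < types.length; i++) {
            if (match[i] !== undefined) {
                tokens.push({ type: types[i], text: match[i] });
            }
        }
        pos = pattern.lastIndex;
    }
    return tokens;
}

console.log(tokenize("if (count <= 0) count = defaultCount;").map((t) => t.text).join(" "));
```

Keywords such as if and int come out as plain identifiers in this sketch; a real parser would distinguish them, and would run the preprocessor first so that macros like DIVIDES() are already expanded.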

The next steps are building an AST (Abstract Syntax Tree) from the token stream and then rewriting the AST as JavaScript. And if that weren’t enough work, other major obstacles include properly supporting all C data types:

Even our simple demo poses some challenges. For example, char *argv[] refers to an array of pointers to characters. So, when calling main(), it would be natural to create the argv array like this:

primes.main(2, ["primes", "100"]);

but that will only work if we know that nothing is done with the argv pointers other than passing them to functions that expect char * parameters. More generally, we might have to do something like this:

primes.main(2, [new pointer("primes"), new pointer("100")]);

And even more generally, if argv is defined as char **, then we might be required to do this:

primes.main(2, new pointer(new pointer("primes"), new pointer("100")));

which means that any references to, say, argv[1] must be transformed into something like argv.getElement(1).
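
As a rough sketch of that transformation, a pointer to a sequence of elements could expose getElement() relative to its current position. The ElementPointer class below, and its setElement() and add() methods, are hypothetical names chosen for illustration; plain JavaScript strings stand in for the inner char * pointers to keep the example short.

```javascript
// Hypothetical sketch; ElementPointer and its methods are illustrative
// assumptions, not an existing CCjs API.
class ElementPointer {
    constructor(...elements) {
        this.elements = elements;   // shared storage for the pointed-to elements
        this.pos = 0;               // current position within that storage
    }

    // Equivalent of p[i]
    getElement(i) {
        const index = this.pos + i;
        if (index < 0 || index >= this.elements.length) {
            throw new RangeError("element " + i + " is out of bounds");
        }
        return this.elements[index];
    }

    // Equivalent of p[i] = value
    setElement(i, value) {
        const index = this.pos + i;
        if (index < 0 || index >= this.elements.length) {
            throw new RangeError("element " + i + " is out of bounds");
        }
        this.elements[index] = value;
    }

    // Equivalent of p + n: a new pointer into the same storage
    add(n) {
        const p = new ElementPointer();
        p.elements = this.elements;
        p.pos = this.pos + n;
        return p;
    }
}

const argv = new ElementPointer("primes", "100");
console.log(argv.getElement(1));            // "100", i.e. argv[1]
console.log(argv.add(1).getElement(0));     // "100", i.e. *(argv + 1)
```

Keeping a position separate from the storage is what lets C idioms like argv++ work without copying the array.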

Pointers (and arrays) will have to be implemented as JavaScript objects with references to a buffer (up to a specified size) and position (initially 0). If no buffer is provided, then it’s effectively a “null pointer.” If a string initializer is provided, a buffer must be initialized with the contents of the string, including a terminating null.
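
Here is one way that description might translate into code. This is only a sketch under stated assumptions: the Pointer class name, its buffer and pos fields, and the isNull(), get(), and add() methods are all hypothetical, and it assumes single-byte characters stored in a Uint8Array.

```javascript
// Hypothetical sketch; names and layout are assumptions, not the CCjs design.
class Pointer {
    /**
     * @param {string|Uint8Array|null} init  string initializer, existing buffer, or null
     * @param {number} pos  position within the buffer (initially 0)
     */
    constructor(init = null, pos = 0) {
        if (typeof init === "string") {
            // One extra element for the terminating null; Uint8Array is zero-filled.
            // Assumes character codes fit in one byte.
            this.buffer = new Uint8Array(init.length + 1);
            for (let i = 0; i < init.length; i++) {
                this.buffer[i] = init.charCodeAt(i);
            }
        } else {
            this.buffer = init;     // a Uint8Array, or null for a "null pointer"
        }
        this.pos = pos;
    }

    isNull() {
        return this.buffer === null;
    }

    // Equivalent of *(p + offset)
    get(offset = 0) {
        if (this.isNull()) {
            throw new Error("null pointer dereference");
        }
        const index = this.pos + offset;
        if (index < 0 || index >= this.buffer.length) {
            throw new RangeError("pointer access out of bounds");
        }
        return this.buffer[index];
    }

    // Equivalent of p + n: a new pointer that shares the same buffer
    add(n) {
        return new Pointer(this.buffer, this.pos + n);
    }
}

const p = new Pointer("100");
console.log(p.get(0));                  // 49, the character code of "1"
console.log(p.add(3).get(0));           // 0, the terminating null
console.log(new Pointer().isNull());    // true
```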

Later, when one of those pointers is passed to a CCjs library function that expects a char * parameter, such as stdlib.atoi(), the function will generally call the pointer’s getString() method to obtain a JavaScript string, so that it can do its work using ordinary JavaScript operations.
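
A library function built this way might look something like the following sketch. The CharPointer class and the stdlibSketch object are hypothetical stand-ins, not the real lib/stdlib.js; only the getString() method name comes from the description above. The sketch also accepts a plain JavaScript string, which covers the simpler calling convention shown earlier.

```javascript
// Hypothetical sketch; CharPointer and stdlibSketch are illustrative only.
class CharPointer {
    constructor(str) {
        // Store single-byte character codes plus a terminating null.
        this.buffer = new Uint8Array(str.length + 1);
        for (let i = 0; i < str.length; i++) {
            this.buffer[i] = str.charCodeAt(i);
        }
        this.pos = 0;
    }

    // Collect characters from the current position up to the terminating null.
    getString() {
        let s = "";
        for (let i = this.pos; i < this.buffer.length && this.buffer[i] !== 0; i++) {
            s += String.fromCharCode(this.buffer[i]);
        }
        return s;
    }
}

const stdlibSketch = {
    atoi(arg) {
        const s = typeof arg === "string" ? arg : arg.getString();
        // parseInt() skips leading whitespace and stops at the first non-digit,
        // much like atoi(); "|0" turns NaN (no digits found) into 0.
        return parseInt(s, 10) | 0;
    }
};

console.log(stdlibSketch.atoi(new CharPointer("100")));     // 100
console.log(stdlibSketch.atoi(new CharPointer("12abc")));   // 12
console.log(stdlibSketch.atoi("abc"));                      // 0
```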

Pointer objects will also have methods that enable the following operations:

“Run-time” pointer assignments will likely only verify that the buffer type of the source pointer (1, 2, 4, or 8-byte elements) matches that of the target pointer; most verification checks will be performed at “compile-time” (i.e., verifying that the data type of the source and target pointers match).
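
A run-time check along those lines could be as small as the sketch below. The TypedPointer class, its elementSize field, and the assign() method are assumptions for illustration; the element size is stored on the pointer itself so that even a null pointer (one with no buffer) still knows what it is supposed to point to.

```javascript
// Hypothetical sketch; TypedPointer and assign() are illustrative assumptions.
class TypedPointer {
    constructor(elementSize, buffer = null, pos = 0) {
        this.elementSize = elementSize; // 1, 2, 4, or 8 bytes per element
        this.buffer = buffer;           // a typed array, or null
        this.pos = pos;
    }

    // Equivalent of "target = source" for pointers.
    assign(source) {
        if (source.elementSize !== this.elementSize) {
            throw new TypeError(
                "element size mismatch: " + source.elementSize + " vs " + this.elementSize
            );
        }
        this.buffer = source.buffer;
        this.pos = source.pos;
        return this;
    }
}

const chars = new TypedPointer(1, new Uint8Array(4));
const ints = new TypedPointer(4, new Int32Array(4));
const target = new TypedPointer(1);     // a null char pointer

target.assign(chars);                   // allowed: both use 1-byte elements
try {
    target.assign(ints);                // rejected: 4-byte vs 1-byte elements
} catch (e) {
    console.log(e.message);
}
```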

Motivation

To be able to run the SimH PDP10 Emulator in a browser.

Resources

Since I want to start simple, and I have a natural fondness for old architectures, I’m targeting the ANSI C language specification, finalized in the late 1980s and memorialized in The C Programming Language, Second Edition.

I’ll worry about newer features and data types (like long long) later, and as the code bases that I’m working with require them.

License

Copyright © 2017 Jeff Parsons and the PCjs Project.

CCjs, a software translation project, is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

You are required to include the above copyright notice in every modified copy of this work and to display that notice whenever the software is started.

See LICENSE for details.