#P3202. A Very Simple Compiler

A Very Simple Compiler

Description

A typical assignment for computer science students is to build a compiler that compiles a simple language for a RISC architecture, which, when it goes to extremes, could become: evaluating an expression on an ultimate RISC architecture – a One Instruction Set Computer (OISC).

Target Architecture

The target architecture has a memory consisting of 65,536 16-bit words and a 16-bit instruction pointer IP. As suggested by its name, it has only one instruction: subtract-and-branch-if-non-negative. In each instruction cycle, the processor reads three consecutive words a, b and c at the address stored in IP, then subtracts a from the word at address b. If the result is non-negative, IP is set to c, otherwise it is increased by three. (Here by “subtract” and “non-negative” we mean 16-bit signed integer arithmetic.) When IP is greater than or equal to 65,534, the processor halts.

A program is executed as follows:

  1. The program is loaded into memory starting from address 0.
  2. A word-sized input parameter is placed at address 0, overwriting the program’s first word.
  3. IP is set to 0.
  4. The processor starts execution until it halts.
  5. The word at address 0 is considered as the output.

Source Language

The source language is an arithmetic expression consisting of only ‘(’, ‘)’, ‘+’, ‘*’, ‘x’ and floating-point constants. ‘x’ represents the input. The expression is supposed to be evaluated in exactly the same manner as its literally specified semantics, i.e., subexpressions in parentheses take highest precedence, followed by multiplication then addition. All constants, intermediate results and the input ‘x’ are rounded to 16-bit half-precision floating point numbers by the round-to-even method.

Given an expression, your task is to compile it into an OISC program.


A quick reference for half-precision floating point

Memory format

Half precision memory format has a sign bit, 5 bits of exponent and 10 bits of mantissa:

bit1514~109~0
signexponentmantissa

The mathematical value of a half-precision floating point number is (−1)sign × 1.mantissa × 2exponent − 15. Below are several examples.

EncodingValue
0 01111 0000000001
0 01111 1000000001.5
1 01110 100000000−0.75

Among all possible exponents, 0 and 31 are reserved for representation of special values and not involved in this problem.

Rounding convention

Rounding is done by the round-to-even method, i.e., an unrepresentable value is rounded to the nearest number unless two possible outcomes are equally near, in which case the one with an even least significant bit is chosen. Below are several examples.

DecimalBinaryRounded in half precisionRounded in binary
0.4997558593750.01111111111110 01110 00000000000.1
2.000976562510.00000000010 10000 000000000010
0.60.10011001…0 01110 00110011010.10011001101

Below are two routines written in C for simplified conversions between half-precision and single-precision floating point numbers.

typedef unsigned short half;

float half_to_float(half h)
{
    unsigned int f;
    if ((h & 0x7ffff) == 0) return 0;
    f = ((h & 0x8000) << 16) /* sign                  */
      + ((h & 0x7fff) << 13) /* exponent and mantissa */
      + ((127 - 15) << 23);  /* adjust exponent       */
    return *(float*)&f;
}

half float_to_half(float f)
{
    unsigned int u = *(unsigned int*)&f;
    int e = ((u >> 23) & 0xff) - 127 + 15;                /* exponent                 */
    int m = ((u >> 13) & 0x3ff) + ((u >> 12) & 1);        /* mantissa rounded half-up */
    if ((u & 0x1fff) == 0x1000) m &= 0x7fe;               /* round to even            */
    if (m == 0x400) { m = 0; e++; }                       /* carry             */
    if (e <= 0) { return (u >> 16) & 0x8000; }            /* underflow or zero        */
    if (e > 30) { return 0x7bff + ((u >> 16) & 0x8000); } /* overflow                 */
    return ((u >> 16) & 0x8000) + (e << 10) + m;
}

Input

The input is presented as an expression on a single line. The expression can contain up to 8 arithmetic operators.

Output

The output should be a line of at most 65,536 space-separated integers, each representing a word in the OISC program starting from address 0. Memory unoccupied when the program is loaded will be filled with 0.

0.499755859375+0.125*(2.0009765625+2)
0 0 3 -15360 0 65535

Hint

Underflow, overflow and special values (zeroes, denormals, infinities and NaNs) will not occur in this problem.

Rounding to single precision before rounding to half precision is always safe in this problem.

Source

POJ Monthly--2007.03.04, Hou, Qiming