# Project structure

```
src/
  - index.js --> the "compiler"
  - tokenizer/ --> reads strings, returns tokens
  - parser/  --> reads tokens, returns an AST
  - semantics/ --> reads an AST, returns a typed AST
  - validation/ --> validates an AST for correctness
  - generator/ --> generates IR for the emitter from a typed AST
  - emitter/ --> reads the IR, returns the WebAssembly binary encoding
    - section/ --> contains mini-emitters for wasm binary sections
  - utils/
    - stream.js --> basic string stream object used in token generation
    - token-stream.js --> object used to hold tokens generated by tokenizer
    - output-stream.js --> object used to hold binary and disassembler data
dist/ --> build
docs/ --> .io site with Explorer/Playground
```

# Compiler Phases

To edit the compiler, it is important to understand where a change should be made. The compiler can be visualized as a chain of individual immutable operations, each taking a specific data type (usually a Node) and returning a new data structure. The one exception is the _Validation_ step, which is an identity function that may throw an exception.

From left-to-right:

`Source -> Parse -> Semantics -> Validate(identity) -> Generate -> Emit -> Binary`
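
The chain above can be sketched as a composition of pure functions. This is only an illustration: every phase below is a hypothetical stand-in, and the node and program shapes are assumptions, not the compiler's real data structures.

```javascript
// Hypothetical stand-ins for the real phases. Each returns a new value
// and never mutates its input.
const parse = source => ({ Type: "Program", source, params: [] });
const semantics = ast => Object.assign({}, ast, { typed: true });
const validate = ast => {
  if (ast.Type !== "Program") {
    throw new Error("Expected a Program node at the root");
  }
  return ast; // identity: the same tree passes through unchanged
};
const generate = ast => ({ imports: [], exports: [], code: [], root: ast.Type });
const emit = program => new Uint8Array([0x00, 0x61, 0x73, 0x6d]);

// Compose phases left-to-right, matching the order of the chain.
const pipe = (...phases) => input =>
  phases.reduce((value, phase) => phase(value), input);

const compile = pipe(parse, semantics, validate, generate, emit);
const binary = compile("const x: i32 = 0;");
```

Because each phase only depends on its input, a phase can be unit tested in isolation by handing it a hand-built value of the right shape.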


## Phase 1 - Parsing

### 1.A - Tokenizing

Before an AST is generated, the source is divided into atomic _Tokens_. The Tokens are then converted into an Abstract Syntax Tree. The tokenizing process is stable and there are no known bugs.
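
To make "atomic Tokens" concrete, here is a minimal sketch of a regex-based tokenizer. The token type names and the `{ type, value }` shape are assumptions made for illustration; the real tokenizer in `src/tokenizer/` is stream-based and may differ.

```javascript
// Hypothetical tokenizer: splits source into constants, identifiers
// and single-character punctuators, skipping whitespace.
const tokenize = source => {
  const pattern = /(\d+)|([A-Za-z_]\w*)|([^\s\w])/g;
  const tokens = [];
  let match;
  while ((match = pattern.exec(source)) !== null) {
    if (match[1] !== undefined) {
      tokens.push({ type: "constant", value: match[1] });
    } else if (match[2] !== undefined) {
      tokens.push({ type: "identifier", value: match[2] });
    } else {
      tokens.push({ type: "punctuator", value: match[3] });
    }
  }
  return tokens;
};

const tokens = tokenize("x = 42;");
```

For the input `x = 42;` this sketch yields four tokens: an identifier, a punctuator, a constant and a final punctuator.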

### 1.B - Base AST Generation

This is the initial pass over the Tokens, which produces the base Abstract Syntax Tree for the compiler. This tree contains no type information. It is usually much _lighter_ and more _high-level_ than the later trees, representing abstract ideas or the _intent of the source program_.

Only basic syntactic checks are performed in this phase. Syntax errors may be thrown here.

## Phase 2 - Semantics

The tree is mapped in this phase, producing a new tree structure which may be used to generate a valid binary. In this phase the relationships between functions, types and variables are assigned. High-level representations are reduced to lower-level operations represented as AST nodes. This is a much larger and more detailed tree than the one in Phase 1. This tree structure can be thought of as representing the WebAssembly equivalent of the Walt source code.

## Phase 3 - Validation

Validation is the final step before the AST is used to generate a binary. The connections between AST Nodes assigned in Phase 2 are sanity checked here for errors. This area could use the most help.
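
A validation pass in this style can be sketched as a tree walk that collects problems, throws if any were found, and otherwise returns the tree untouched. The node shape (`Type`, `type`, `value`, `params`) and the specific check are assumptions for illustration only.

```javascript
// Hypothetical node shape: { Type, type, value, params }.
const walk = (node, visit) => {
  visit(node);
  (node.params || []).forEach(child => walk(child, visit));
};

const validate = ast => {
  const problems = [];
  walk(ast, node => {
    // Example check: every identifier must have been given a type in Phase 2.
    if (node.Type === "Identifier" && node.type == null) {
      problems.push(`Undefined type for identifier "${node.value}"`);
    }
  });
  if (problems.length > 0) {
    throw new Error(problems.join("\n"));
  }
  return ast; // identity: the input tree is returned as-is
};
```

Collecting all problems before throwing lets a single run report every issue instead of stopping at the first one.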

## Phase 4 - Generator

In this phase we flatten and map the _Typed AST_ to generate the _Intermediate Representation (IR)_ in the form of a _Program_ object. The _Program_ represents the data, imports, exports and instructions generated from our source, and is later consumed by the emitter.
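
As a rough sketch of what "flatten" means here, the example below turns a small typed expression tree into a linear, post-order list of stack-machine instructions. The node shape and the `Program` fields are assumptions for illustration; only the instruction mnemonics (`i32.const`, `i32.add`, and so on) come from WebAssembly itself.

```javascript
// Hypothetical typed node: { Type, type, value, params }.
const OPERATORS = { "+": "add", "-": "sub", "*": "mul" };

const flatten = node => {
  switch (node.Type) {
    case "Constant":
      return [{ op: `${node.type}.const`, value: Number(node.value) }];
    case "BinaryExpression": {
      const name = OPERATORS[node.value];
      if (name === undefined) {
        throw new Error(`Unsupported operator ${node.value}`);
      }
      // Operands first, operator last: the order a stack machine expects.
      const operands = node.params.reduce(
        (acc, param) => acc.concat(flatten(param)),
        []
      );
      return operands.concat({ op: `${node.type}.${name}` });
    }
    default:
      throw new Error(`Unsupported node ${node.Type}`);
  }
};

// Hypothetical Program shape holding the flattened instructions.
const generate = typedAst => ({
  data: [],
  imports: [],
  exports: [],
  code: flatten(typedAst),
});

const program = generate({
  Type: "BinaryExpression",
  type: "i32",
  value: "+",
  params: [
    { Type: "Constant", type: "i32", value: "1", params: [] },
    { Type: "Constant", type: "i32", value: "2", params: [] },
  ],
});
```

Here `program.code` holds three instructions in order: `i32.const 1`, `i32.const 2`, `i32.add`.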

## Phase 5 - Emitter

A pure function which takes the _Program_ produced by the generator and converts it to binary, using the `output-stream`. This is the land of the WebAssembly spec. Not many changes need to happen here, as the semantics and generator phases do the heavy lifting of converting the source code into an emit-able _Program_.
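
To give a feel for emitting into an output stream, here is a minimal sketch that writes the WebAssembly preamble (the `\0asm` magic number followed by version 1). The `ByteStream` class and `emitPreamble` function are hypothetical and are not the API of `utils/output-stream.js`.

```javascript
// Hypothetical byte collector, standing in for the real output stream.
class ByteStream {
  constructor() {
    this.bytes = [];
  }
  u8(value) {
    this.bytes.push(value & 0xff);
    return this;
  }
  // Write a 32-bit value in little-endian byte order.
  u32(value) {
    for (let shift = 0; shift < 32; shift += 8) {
      this.u8(value >>> shift);
    }
    return this;
  }
  buffer() {
    return Uint8Array.from(this.bytes);
  }
}

const emitPreamble = stream =>
  stream
    .u32(0x6d736100) // magic number: the bytes 00 61 73 6d ("\0asm")
    .u32(1); // binary format version 1

const preamble = emitPreamble(new ByteStream()).buffer();
```

The eight bytes produced here already form a valid, empty module, so `WebAssembly.validate(preamble)` can be used as a quick sanity check while working on section emitters.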

# Developing

## Requirements

* `node 8`

Node 8+ has native WebAssembly support and this project takes full advantage of that fact.
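
As an illustration of that native support, the sketch below instantiates a tiny hand-assembled module directly in Node, with no extra tooling. The byte array is written by hand for this example (it is not compiler output) and exports one function, named `main` here, that returns the constant 42.

```javascript
// A hand-assembled module: one function, exported as "main", returning 42.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f, // type section: () -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x08, 0x01, 0x04, 0x6d, 0x61, 0x69, 0x6e, 0x00, 0x00, // export "main"
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x2a, 0x0b, // code: i32.const 42, end
]);

const wasmModule = new WebAssembly.Module(bytes);
const instance = new WebAssembly.Instance(wasmModule, {});
const result = instance.exports.main();
```

The same `WebAssembly.Module` and `WebAssembly.Instance` calls work on any binary the emitter produces, which is what makes end-to-end specs possible without leaving Node.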

## Commands

### Tests

Every piece of the compiler is unit tested. AVA is used as the test runner.

* `npm run tdd`
* `npm run tdd -- --watch`
* `npm run tdd -- --watch <spec_file_path>`

Do not confuse these with `npm test`, which is used for CI and generates
full code coverage reports. You may still use it if you are interested in the coverage report.

To **debug** a spec:

* `npm run debug -- <spec_file_path>`

Helpful APIs and notes:

* `prettyPrint(ast)` may be used to debug AST issues. It works on both the first-pass tree and the full semantic AST with type information.
* `debug(output-buffer)` may be used to examine the wasm opcode output.
* `String(ast-node)`: most AST nodes may be coerced to a string to retrieve the original source they represent. _Most_, because some nodes are hand-crafted and may not have any source-code equivalent.
* The `parser/fragment.js` module contains methods to generate AST fragments or Nodes. Fragments do not require a full program; they may be created from a _snippet_ of code. Fragments may only be generated from an expression or a statement. Fragments are valid AST Nodes and may be debugged with the methods mentioned above.

## Pull Requests

* `100%` statement coverage must be maintained. A PR that lowers coverage will not be accepted.
* UI/Explorer changes may require a screenshot and/or a running demo.

## Build

If you'd like to see your changes reflected in the explorer page, run the `npm run build` command.


