# Symlink Behavior In Broccoli Plugins

**Summary:** We are changing the contract between Broccoli plugins to mandate
that plugins follow symbolic links inside their input trees. This includes
recursing into symlinked directories.

## Background

Broccoli plugins often need to pass files through from their input trees to
their output trees. For instance, the CoffeeScript plugin (based on
broccoli-filter) will copy any files that do not end in `.coffee` verbatim
from its input tree to its output tree; and the merge-trees plugin will
successively copy the files in all its input trees to its output tree.

In the beginning, we used hardlinks to "copy" all the files. Hardlinking a
file takes a very small constant amount of time, regardless of file size. We
created a lot of hardlinks on each rebuild, but because hardlinks are fast,
the performance was adequate on most project sizes.

However, we later discovered that hardlinks can [cause data loss on OS
X](https://github.com/broccolijs/broccoli/blob/master/docs/hardlink-issue.md),
and as a stop-gap immediately switched all plugins to copying files
byte-by-byte rather than hardlinking.

Unfortunately, copying files turns out to be too slow. On a typical project,
it adds seconds or even tens of seconds to the rebuild time.

## Upcoming Change

To resolve this performance issue, we will be switching to symlinking files
instead of copying them. At the end of the build process, Broccoli will
automatically dereference all symlinks in the final tree, so that the output
generated by `broccoli build` only contains regular files and directories.

To make this possible, we will soon start requiring plugins to follow
(dereference) symlinks inside their input trees - that is, to treat symlinked
files the same as regular files, and recurse into symlinked directories. Until
now, we had left undefined how plugins deal with symlinks. In practice, most
plugins currently do not follow symlinks, but rather copy them verbatim or
handle them in some similar way.

This is a change in the expected behavior - the "contract" between plugins, if
you will - rather than in the programmatic API.

The change is happening in two parts.

### Part 1: Transparently Follow Symlinks

The first set of changes is making plugins follow symlinks consistently.

This change should be mostly non-breaking. Breakage can occur when there are
broken symlinks in source trees, which will now result in build failures; and
also when an application relies on plugins ignoring symlinked files or on
plugins not recursing into symlinked directories.

Fortunately, implementing this change does not require much code. Most file
system functions (like `readFile` and `readdir`) transparently follow
symlinks, both in Node and in external libraries. For example, calling
`readdir` on a symlink to a directory behaves the same as calling `readdir` on
the directory directly.
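
As a rough illustration of that transparency, here is a minimal sketch using only Node's built-in `fs`, `os`, and `path` modules. The scratch directory and the names `real-dir`, `linked-dir`, and `a.txt` are made up for this example, and it assumes a platform on which the current user may create symlinks.

```js
var fs = require('fs');
var os = require('os');
var path = require('path');

// Scratch directory; every name below is hypothetical.
var tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'symlink-demo-'));
var realDir = path.join(tmp, 'real-dir');
var linkedDir = path.join(tmp, 'linked-dir');

fs.mkdirSync(realDir);
fs.writeFileSync(path.join(realDir, 'a.txt'), 'hello');
fs.symlinkSync(realDir, linkedDir, 'dir'); // linked-dir -> real-dir

// Both calls list the same entries, because readdir follows the symlink.
var viaReal = fs.readdirSync(realDir);
var viaLink = fs.readdirSync(linkedDir);

// Reading a file through the symlinked directory works the same way.
var contents = fs.readFileSync(path.join(linkedDir, 'a.txt'), 'utf8');
```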

Here is what plugins need to do to accept symlinks:

#### Use `stat` Instead Of `lstat`

The `stat` and `lstat` functions ([syscall
documentation](http://linux.die.net/man/2/stat), [Node
documentation](http://nodejs.org/api/fs.html#fs_fs_stat_path_callback)) behave
differently with regard to symlinks: `stat` follows symlinks, whereas `lstat`
returns information on the symlink itself. For instance:

```js
var fs = require('fs');

// Treat symlinks differently (old behavior):
var lstats = fs.lstatSync(somePath);
if (lstats.isFile()) {
  // ...
} else if (lstats.isDirectory()) {
  // ...
} else if (lstats.isSymbolicLink()) {
  // Here be special symlink handling code.
  // ...
} else {
  throw new Error('Unexpected file type'); // socket, device, or similar
}

// Transparently follow symlinks (new behavior):
var stats = fs.statSync(somePath);
if (stats.isFile()) {
  // Could be file, or symlink pointing to file.
  // ...
} else if (stats.isDirectory()) {
  // Could be directory, or symlink pointing to directory.
  // ...
} else {
  throw new Error('Unexpected file type'); // socket, device, or similar
}
```

Plugins should be sure to use `fs.statSync`, rather than `fs.lstatSync`.

Note that stat'ing a broken symlink (`fs.statSync('does-not-exist')`) throws
"Error: ENOENT, no such file or directory 'does-not-exist'". We will typically
let these errors propagate and not try to handle them. It seems acceptable to
fail when we encounter broken symlinks.
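
The difference between the two calls on a broken symlink can be sketched as follows. This is only an illustration: the scratch directory and the `broken-link` name are hypothetical, and the exact error message text varies between Node versions, so the sketch inspects `err.code` rather than the message.

```js
var fs = require('fs');
var os = require('os');
var path = require('path');

// Scratch directory containing one symlink whose target does not exist.
var tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'broken-link-demo-'));
var brokenLink = path.join(tmp, 'broken-link');
fs.symlinkSync(path.join(tmp, 'does-not-exist'), brokenLink);

// lstat inspects the link itself, so it succeeds.
var isLink = fs.lstatSync(brokenLink).isSymbolicLink();

// stat follows the link, finds no target, and throws ENOENT.
var errorCode = null;
try {
  fs.statSync(brokenLink);
} catch (err) {
  errorCode = err.code;
}
```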

#### Use Up-To-Date Helper Packages

If you are using node-walk-sync or broccoli-kitchen-sink-helpers, be sure to
use the latest versions, as they have been updated to follow symlinks:

```js
"dependencies": {
  "walk-sync": "^0.1.3",
  "broccoli-kitchen-sink-helpers": "^0.2.5"
}
```

#### Do Not Crash On Broken Symlinks (Emacs Lockfiles)

Emacs in its default configuration creates lockfiles of the form `.#foo.js`,
which are broken symlinks. Trying to stat or open a broken symlink throws an
`ENOENT` exception. Plugins should therefore not crash when Emacs lockfiles
appear in input trees.

When plugins iterate over all files in their input trees, they should
generally expect to encounter Emacs lockfiles and ignore them [like
so](https://github.com/joliss/node-walk-sync/blob/b2a3b178ea7bc681d4ab0221686e945f9453645e/index.js#L34-L38).

This applies to directory traversal only. It is OK to crash when a file
explicitly specified by the user is a broken symlink.
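
For plugins that do their own directory traversal, the idea might look roughly like the sketch below. It is not the walk-sync implementation; `listUsableEntries` is a hypothetical helper name, and the sketch assumes that any `ENOENT` raised while stat'ing a directory entry comes from a broken symlink.

```js
var fs = require('fs');
var path = require('path');

// Hypothetical helper: returns the names in `dir` that can be stat'ed,
// silently skipping broken symlinks such as Emacs lockfiles.
function listUsableEntries(dir) {
  return fs.readdirSync(dir).sort().filter(function (name) {
    try {
      fs.statSync(path.join(dir, name)); // follows symlinks
      return true;
    } catch (err) {
      if (err.code === 'ENOENT') return false; // broken symlink: ignore it
      throw err; // any other error is a real problem
    }
  });
}
```

Errors other than `ENOENT` are rethrown on purpose, so that permission problems and similar failures still surface.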

#### Auto-Dereference Symlinks After Build

Once we start emitting symlinks, the final output tree generated by Broccoli
may contain symlinks into temporary directories.

As of version 0.13.0, Broccoli automatically dereferences symlinks (that is,
it replaces them with the files or directories they point to) when you call
`broccoli build`.

If you are maintaining code that uses Broccoli programmatically, use
[node-copy-dereference](https://github.com/broccolijs/node-copy-dereference)
at the end of each build to dereference symlinks, [like so](https://github.com/broccolijs/broccoli/blob/48e9b5f450f4dd59e424713c7a9c901b15bc6746/lib/cli.js#L33).
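
To make "dereferencing" concrete, here is a small sketch built only on Node's `fs` module. It is a stand-in for illustration, not the node-copy-dereference implementation: `copyFollowingSymlinks` is a hypothetical name, the destination is assumed not to exist yet, and file modes and timestamps are not preserved.

```js
var fs = require('fs');
var path = require('path');

// Hypothetical helper: recursively copies `src` to `dest`, replacing every
// symlink with a copy of the file or directory it points to.
function copyFollowingSymlinks(src, dest) {
  var stats = fs.statSync(src); // stat, not lstat: resolves symlinks
  if (stats.isDirectory()) {
    fs.mkdirSync(dest);
    fs.readdirSync(src).forEach(function (name) {
      copyFollowingSymlinks(path.join(src, name), path.join(dest, name));
    });
  } else if (stats.isFile()) {
    fs.writeFileSync(dest, fs.readFileSync(src));
  } else {
    throw new Error('Unexpected file type: ' + src); // socket, device, or similar
  }
}
```

After such a copy, the destination tree contains only regular files and directories, even when the source tree pointed into temporary directories.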

### Part 2: Emit Symlinks As An Optimization

...

## Performance Gains

...
