# Ninja File Canonicalizer

Suppose we have a tool that generates a Ninja file from some other description (think Kati and makefiles), and during
testing we discover a regression. Suppose further that the generated Ninja file is large (think millions of lines) and
that the new Ninja file has its build statements and rules in a slightly different order. Because the tool generates
the rule names, the real differences in the output of the `diff` command are drowned in noise. Enter Canoninja.

Canoninja renames each Ninja rule to the hash of its contents. After that, we can just sort the build statements, and a
simple `comm` command immediately reveals the essential differences between the files.
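
The renaming idea can be illustrated with a minimal Python sketch. This is not Canoninja's actual implementation: the `canonicalize` function is hypothetical, and the choice of SHA-1 over the indented rule body with an `R` prefix is an assumption, made only because it resembles the 40-hex-digit names shown in the example below. It will not necessarily reproduce Canoninja's rule names, and it ignores continuation lines and escaped colons.

```python
import hashlib
import re

RULE_RE = re.compile(r"^rule\s+(\S+)\s*$")


def canonicalize(text):
    """Rename every rule to 'R' + SHA-1 of its indented body (assumed scheme)."""
    lines = text.splitlines()
    renames = {}  # old rule name -> content-derived name

    # Pass 1: hash the body of each rule.
    i = 0
    while i < len(lines):
        m = RULE_RE.match(lines[i])
        if m:
            # The rule body is the run of indented lines that follows.
            j = i + 1
            while j < len(lines) and lines[j].startswith((" ", "\t")):
                j += 1
            body = "\n".join(lines[i + 1:j])
            renames[m.group(1)] = "R" + hashlib.sha1(body.encode()).hexdigest()
            i = j
        else:
            i += 1

    # Pass 2: rewrite rule declarations and the rule name in build statements.
    out = []
    for line in lines:
        m = RULE_RE.match(line)
        if m:
            line = "rule " + renames[m.group(1)]
        elif line.startswith("build "):
            # 'build <outputs>: <rule> <inputs>'; the rule is the first word
            # after ': '. Built-in rules such as 'phony' are left untouched.
            head, sep, tail = line.partition(": ")
            if sep:
                parts = tail.split(" ", 1)
                parts[0] = renames.get(parts[0], parts[0])
                line = head + sep + " ".join(parts)
        out.append(line)
    return "\n".join(out) + "\n"
```

Because the new name depends only on the rule body, two files that define the same rules under different generated names end up with identical build statements.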

## Example

Consider the following makefile:

```makefile
second :=
first: foo
foo:
	@echo foo
second: bar
bar:
	@echo bar
```

Depending on the Kati version, converting it to a Ninja file yields either:

```
$ cat /tmp/1.ninja
# Generated by kati 06f2569b2d16628608c000a76e3d495a5a5528cb

pool local_pool
 depth = 72

build _kati_always_build_: phony

build first: phony foo
rule rule0
 description = build $out
 command = /bin/sh -c "echo foo"
build foo: rule0
build second: phony bar
rule rule1
 description = build $out
 command = /bin/sh -c "echo bar"
build bar: rule1

default first
```

or

```
$ cat /tmp/2.ninja
# Generated by kati 371194da71b3e191fea6f2ccceb7b061bd0de310

pool local_pool
 depth = 72

build _kati_always_build_: phony

build second: phony bar
rule rule0
 description = build $out
 command = /bin/sh -c "echo bar"
build bar: rule0
build first: phony foo
rule rule1
 description = build $out
 command = /bin/sh -c "echo foo"
build foo: rule1

default first
```

This is a quirk in Kati; see https://github.com/google/kati/issues/238

Diffing the build statements, even after sorting them, isn't very helpful, because the generated rule names differ:

```
$ diff <(grep '^build' /tmp/1.ninja | sort) <(grep '^build' /tmp/2.ninja | sort)
1c1
< build bar: rule1
---
> build bar: rule0
3c3
< build foo: rule0
---
> build foo: rule1
```

However, running these files through `canoninja` yields:

```
$ canoninja /tmp/1.ninja
# Generated by kati 06f2569b2d16628608c000a76e3d495a5a5528cb

pool local_pool
 depth = 72

build _kati_always_build_: phony

build first: phony foo
rule R2f9981d3c152fc255370dc67028244f7bed72a03
 description = build $out
 command = /bin/sh -c "echo foo"
build foo: R2f9981d3c152fc255370dc67028244f7bed72a03
build second: phony bar
rule R62640f3f9095cf2da5b9d9e2a82f746cc710c94c
 description = build $out
 command = /bin/sh -c "echo bar"
build bar: R62640f3f9095cf2da5b9d9e2a82f746cc710c94c

default first
```

and

```
$ ~/go/bin/canoninja /tmp/2.ninja
# Generated by kati 371194da71b3e191fea6f2ccceb7b061bd0de310

pool local_pool
 depth = 72

build _kati_always_build_: phony

build second: phony bar
rule R62640f3f9095cf2da5b9d9e2a82f746cc710c94c
 description = build $out
 command = /bin/sh -c "echo bar"
build bar: R62640f3f9095cf2da5b9d9e2a82f746cc710c94c
build first: phony foo
rule R2f9981d3c152fc255370dc67028244f7bed72a03
 description = build $out
 command = /bin/sh -c "echo foo"
build foo: R2f9981d3c152fc255370dc67028244f7bed72a03

default first
```

When we extract only the build statements and sort them, `diff` prints nothing, which shows that both Ninja files define the same graph:

```shell
$ diff <(~/go/bin/canoninja /tmp/1.ninja | grep '^build' | sort) \
       <(~/go/bin/canoninja /tmp/2.ninja | grep '^build' | sort)
```

## Todo

* Optionally output only the build statements, optionally sorted
* Handle continuation lines correctly
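
As a starting point for the continuation-line item, here is a hedged Python sketch of joining physical lines into logical ones before any further processing. It assumes Ninja's documented escaping: a single `$` at the end of a line continues it, whitespace at the start of the continued line is stripped, and `$$` is a literal dollar sign. The function name `join_continuations` is hypothetical and is not part of Canoninja.

```python
def join_continuations(text):
    """Merge Ninja continuation lines ('$' before a newline) into logical lines."""
    logical = []
    pending = None  # accumulated text of a logical line still being continued
    for raw in text.splitlines():
        # Leading whitespace is dropped only on lines that continue a previous one.
        piece = raw if pending is None else raw.lstrip()
        # An odd number of trailing '$' means the last one escapes the newline;
        # an even number is just a run of literal '$$' pairs.
        trailing = len(piece) - len(piece.rstrip("$"))
        continues = trailing % 2 == 1
        if continues:
            piece = piece[:-1]
        current = piece if pending is None else pending + piece
        if continues:
            pending = current
        else:
            logical.append(current)
            pending = None
    if pending is not None:
        # The input ended in the middle of a continuation; keep what we have.
        logical.append(pending)
    return logical
```

For example, `join_continuations("build out: cc a.c $\n    b.c\n")` returns `["build out: cc a.c b.c"]`.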