# Ninja File Canonicalizer

Suppose we have a tool that generates a Ninja file from some other description (think Kati and makefiles), and during testing we discover a regression. Suppose further that the generated Ninja file is large (think millions of lines), and that the new Ninja file lists its build statements and rules in a slightly different order. Because the tool generates the rule names, the real differences in the output of the `diff` command are drowned in noise. Enter Canoninja.

Canoninja renames each Ninja rule to the hash of its contents. After that, we can just sort the build statements, and a simple `comm` command immediately reveals the essential difference between the files.

## Example

Consider the following makefile:

```makefile
second :=
first: foo
foo:
	@echo foo
second: bar
bar:
	@echo bar
```

Depending on the Kati version, converting it to a Ninja file yields either:

```
$ cat /tmp/1.ninja
# Generated by kati 06f2569b2d16628608c000a76e3d495a5a5528cb

pool local_pool
 depth = 72

build _kati_always_build_: phony

build first: phony foo
rule rule0
 description = build $out
 command = /bin/sh -c "echo foo"
build foo: rule0

build second: phony bar
rule rule1
 description = build $out
 command = /bin/sh -c "echo bar"
build bar: rule1

default first
```

or

```
$ cat /tmp/2.ninja
# Generated by kati 371194da71b3e191fea6f2ccceb7b061bd0de310

pool local_pool
 depth = 72

build _kati_always_build_: phony

build second: phony bar
rule rule0
 description = build $out
 command = /bin/sh -c "echo bar"
build bar: rule0

build first: phony foo
rule rule1
 description = build $out
 command = /bin/sh -c "echo foo"
build foo: rule1

default first
```

This is a quirk in Kati; see https://github.com/google/kati/issues/238.

Comparing the targets is not much help even after sorting them, because the generated rule names differ:

```
$ diff <(grep '^build' /tmp/1.ninja|sort) <(grep '^build' /tmp/2.ninja | sort)
1c1
< build bar: rule1
---
> build bar: rule0
3c3
< build foo: rule0
---
> build foo: rule1
```

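To make the renaming idea concrete, here is a minimal Python sketch of it. It is not Canoninja's implementation: the function names `canonicalize` and `sorted_builds` are hypothetical, and the choice of SHA-1 over the whitespace-stripped rule body is an assumption (the README does not say which hash function or which bytes are hashed), so the names it produces will not necessarily match Canoninja's. It also ignores continuation lines and escaped colons in output paths.

```python
import hashlib
import re

# A rule declaration, e.g. "rule rule0".
RULE_RE = re.compile(r"^rule\s+(\S+)\s*$")
# A build statement, e.g. "build foo: rule0 dep"; group 2 is the rule name.
BUILD_RE = re.compile(r"^(build\s+[^:]*:\s*)(\S+)(.*)$")


def canonicalize(text: str) -> str:
    """Rename every rule to 'R' + hash of its body (assumed scheme)."""
    lines = text.splitlines()

    # Pass 1: collect the indented body of each rule and hash it.
    new_names = {}
    i = 0
    while i < len(lines):
        m = RULE_RE.match(lines[i])
        if not m:
            i += 1
            continue
        body = []
        i += 1
        while i < len(lines) and lines[i].startswith((" ", "\t")):
            body.append(lines[i].strip())
            i += 1
        digest = hashlib.sha1("\n".join(body).encode()).hexdigest()
        new_names[m.group(1)] = "R" + digest

    # Pass 2: rewrite rule declarations and the build statements using them.
    out = []
    for line in lines:
        m = RULE_RE.match(line)
        if m:
            out.append("rule " + new_names[m.group(1)])
            continue
        m = BUILD_RE.match(line)
        if m and m.group(2) in new_names:
            line = m.group(1) + new_names[m.group(2)] + m.group(3)
        out.append(line)
    return "\n".join(out) + "\n"


def sorted_builds(text: str) -> list:
    """Return the build statements of a Ninja file, sorted."""
    return sorted(l for l in text.splitlines() if l.startswith("build"))
```

Because the new name depends only on the rule body, two files that define the same rules in a different order end up with identical build statements once these are sorted, which is the property the rest of this example relies on.
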
Running both files through `canoninja` instead yields

```
$ canoninja /tmp/1.ninja
# Generated by kati 06f2569b2d16628608c000a76e3d495a5a5528cb

pool local_pool
 depth = 72

build _kati_always_build_: phony

build first: phony foo
rule R2f9981d3c152fc255370dc67028244f7bed72a03
 description = build $out
 command = /bin/sh -c "echo foo"
build foo: R2f9981d3c152fc255370dc67028244f7bed72a03

build second: phony bar
rule R62640f3f9095cf2da5b9d9e2a82f746cc710c94c
 description = build $out
 command = /bin/sh -c "echo bar"
build bar: R62640f3f9095cf2da5b9d9e2a82f746cc710c94c

default first
```

and

```
$ ~/go/bin/canoninja /tmp/2.ninja
# Generated by kati 371194da71b3e191fea6f2ccceb7b061bd0de310

pool local_pool
 depth = 72

build _kati_always_build_: phony

build second: phony bar
rule R62640f3f9095cf2da5b9d9e2a82f746cc710c94c
 description = build $out
 command = /bin/sh -c "echo bar"
build bar: R62640f3f9095cf2da5b9d9e2a82f746cc710c94c

build first: phony foo
rule R2f9981d3c152fc255370dc67028244f7bed72a03
 description = build $out
 command = /bin/sh -c "echo foo"
build foo: R2f9981d3c152fc255370dc67028244f7bed72a03

default first
```

When we extract only the build statements and sort them, the `diff` output is empty, which shows that both Ninja files define the same graph:

```shell
$ diff <(~/go/bin/canoninja /tmp/1.ninja | grep '^build' | sort) \
       <(~/go/bin/canoninja /tmp/2.ninja | grep '^build' | sort)
```

# Todo

* Optionally output only the build statements, optionally sorted
* Handle continuation lines correctly