I had trouble reproducing the 9 differences. I tried these samples a couple of different ways (all on one line or broken onto separate lines) in different browsers and always came up with 6 differences instead of 9.
The weird thing is that the website highlights 6 changes for that example (the first and last three lines) but says at the top: "Number of differences: 9 differences from 9 lines of code."
And the little folding button does indeed fold all nine lines.
When I instead try [a,b,c,1,2,3,x,y,z] against [a,b,c,1,2,3,a,b,c] it only shows "Number of differences: 3 differences from 3 lines of code." and folds the first 6 and the last 3 lines.
I still cannot reproduce the problem of 9 differences. I have tried in Edge, Chrome, and Firefox on Windows 10. I will try on OSX in just a second.
The output changes once you add square brackets because the tool is language aware. Without the brackets it cannot guess a programming language, so the input is treated as plain text and split into lines. With the brackets, language detection guesses that the input is JSON, formally parses it, beautifies it, and then compares the beautified output.
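As a rough illustration of that behavior (not the tool's actual code), here is a minimal Python sketch. The helper names `to_lines` and `count_changed_lines` are made up for this example, and it assumes strict JSON via the standard `json` module, which is stricter than the tool's own detection: unquoted input such as [a,b,c] would fall back to plain text here.

```python
import difflib
import json


def to_lines(source: str) -> list[str]:
    """Return the lines to compare: beautified JSON if the input parses,
    otherwise the raw text split into lines."""
    try:
        parsed = json.loads(source)
    except ValueError:
        # Not valid JSON, so fall back to a plain line-by-line comparison.
        return source.splitlines()
    return json.dumps(parsed, indent=2).splitlines()


def count_changed_lines(before: str, after: str) -> int:
    """Count the lines that differ between the two normalized inputs."""
    a, b = to_lines(before), to_lines(after)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changed += max(i2 - i1, j2 - j1)
    return changed
```

With this sketch, `count_changed_lines('[1,2,3]', '[\n1,\n2,\n3\n]')` is 0, because both inputs beautify to the same lines before they are compared. That is why the original line breaks stop mattering once the input is recognized as a language.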
I realize that if I am going to the trouble of parsing languages, it would be far more efficient to compare the parsed token lists directly instead of the beautified text, but I haven't gotten there yet.
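To make the token-list idea concrete, here is a small Python sketch. It is only an assumption about what such a comparison could look like: the regex tokenizer below is a toy stand-in for a real language parser, and `tokenize` and `token_changes` are hypothetical names.

```python
import difflib
import re

# Toy tokenizer: quoted strings, single punctuation characters, or bare runs
# of other non-space characters. A real parser would be language specific.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|[\[\]{},:]|[^\s\[\]{},:"]+')


def tokenize(source: str) -> list[str]:
    """Split the source into tokens, discarding all whitespace."""
    return TOKEN.findall(source)


def token_changes(before: str, after: str) -> list[tuple[str, list[str], list[str]]]:
    """Return (tag, old_tokens, new_tokens) for every non-equal region."""
    a, b = tokenize(before), tokenize(after)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

Because whitespace never becomes a token, `token_changes("[1,2,3]", "[ 1,\n2, 3 ]")` returns an empty list without any beautification step, and the [a,b,c,1,2,3,x,y,z] versus [a,b,c,1,2,3,a,b,c] sample yields three replaced tokens.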