
> huge files with huge numbers of duplicates

At least on the stock macOS awk, a counter can reach 2^53 before arithmetic breaks. It doesn't wrap; it just stops going up, which means the one-liner still works.

    > echo '2^53-1' | bc
    9007199254740991
    > seq 1 10 | awk 'BEGIN{a[123]=9007199254740991;b=a[123]}{a[123]++}END{print a[123],b,a[123]-b}'
    9007199254740992 9007199254740991 1

Even with only one character per line (two bytes once you count the newline), you'd need a file of about 18 PB before you got to this limit, afaict.
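
For anyone who wants to check this on their own awk, here is a rough sketch. It assumes your awk stores numbers as IEEE doubles, which is what the common implementations do as far as I know; the function names `awk_saturates` and `petabytes_at_limit` are made up for illustration.

```shell
# Prints 1 if adding 1 to 2^53 leaves the value unchanged in this awk,
# i.e. the counter has stopped going up rather than wrapping.
# Assumes awk numbers are IEEE doubles (hypothetical helper name).
awk_saturates() {
  awk 'BEGIN { x = 2^53; y = x + 1; print ((y == x) ? 1 : 0) }'
}

# Prints the whole number of petabytes (10^15 bytes) in a file of 2^53 lines,
# each holding one character plus a newline (2 bytes per line).
petabytes_at_limit() {
  awk 'BEGIN { printf "%d\n", (2^53 * 2) / 1e15 }'
}

awk_saturates
petabytes_at_limit
```

The first function compares values instead of printing them, since how very large integers get formatted can differ between awk implementations.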



