
> There is no god damn reason why a filename should be able to contain, say, LF, DEL, or BEL. None whatsoever.

OK you want ASCII 0x07 to be disallowed. Should a filename be allowed to contain "㜇"? (U+3707)



That's not a problem, because the UTF-8 encoding of U+3707 will absolutely not contain any US-ASCII control characters, or any special shell or filesystem characters. Every byte of it will be in the range 0x80-0xFF.
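A quick way to check this claim, as a minimal Python sketch (the variable names are illustrative only):

```python
# Encode U+3707 as UTF-8 and confirm no byte falls in the ASCII range.
ch = "\u3707"
utf8_bytes = ch.encode("utf-8")

# Every byte of a multi-byte UTF-8 sequence has its high bit set.
all_high = all(b >= 0x80 for b in utf8_bytes)

print([hex(b) for b in utf8_bytes], all_high)
```

The lead byte and both continuation bytes have the high bit set, so none of them can collide with BEL (0x07) or any other ASCII character.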


There are encodings other than UTF-8, though, which is kind of my point. If your file system stores names as UTF-16 (doesn't NTFS do this?), then the byte 0x07 will be present.


I also believe that filesystems should require that all filenames be fully normalized UTF-8. I don't think the benefits (slight, IMHO) of allowing filenames to be arbitrary byte strings outweigh the costs of code complexity and security problems.
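For illustration, here is a rough sketch of what such a rule could look like as a check. The function name is hypothetical, and the choice of NFC is an assumption (the comment above doesn't say which normalization form it has in mind); a real filesystem would also have to deal with path separators and reserved names.

```python
import unicodedata


def is_acceptable_filename(raw: bytes) -> bool:
    """Hypothetical policy: valid UTF-8, NFC-normalized, no control characters."""
    try:
        name = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False  # arbitrary byte strings are rejected
    if not name or unicodedata.normalize("NFC", name) != name:
        return False  # empty or not in the assumed normal form (NFC)
    # Category "Cc" covers the C0 controls (LF, BEL, ...), DEL, and the C1 controls.
    return not any(unicodedata.category(c) == "Cc" for c in name)


print(is_acceptable_filename("㜇.txt".encode("utf-8")))
print(is_acceptable_filename(b"bell\x07.txt"))
```

Under this sketch, "㜇.txt" passes while a name containing BEL, a stray 0xFF byte, or an unnormalized "e" plus combining accent is rejected.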


That's not how UTF-8 works.


It is how UTF-16 (NTFS) works, though.


That doesn't count. In UTF-16, U+3707 is the single 16-bit word 0x3707, and what Windows doesn't allow in filenames is the 16-bit word 0x0007.
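The disagreement here comes down to bytes versus 16-bit code units. A small Python sketch of the difference (variable names are illustrative, and this only inspects the encoding, not any actual filesystem behavior):

```python
import struct

ch = "\u3707"
le = ch.encode("utf-16-le")  # little-endian byte order
be = ch.encode("utf-16-be")  # big-endian byte order

# Viewed as raw bytes, 0x07 does appear in either byte order.
has_bel_byte = 0x07 in le and 0x07 in be

# Viewed as one 16-bit code unit, the value is 0x3707, not 0x0007.
(unit,) = struct.unpack("<H", le)

print(le.hex(" "), be.hex(" "), has_bel_byte, hex(unit))
```

So a tool that scans UTF-16 data byte by byte would see a 0x07, while anything that works on 16-bit words never sees the BEL code unit.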



