That's not actually lxml's fault; it depends on the libxml2 version installed in your environment. Just like Jake232, I haven't found a site that lxml can't handle lately, using libxml2 2.9.1. You can find your libxml2 version in Python with:
>>> from lxml import etree
>>> etree.LIBXML_VERSION
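If you want to compare the libxml2 that is loaded at runtime with the one lxml was built against, something like the following should work. It's only a rough sketch: `format_version` and `libxml2_versions` are helper names made up for illustration, while `LIBXML_VERSION` and `LIBXML_COMPILED_VERSION` are lxml's own attributes.

```python
def format_version(version):
    """Turn a version tuple such as (2, 9, 1) into the string '2.9.1'."""
    return ".".join(str(part) for part in version)


def libxml2_versions():
    """Return (runtime, compiled) libxml2 version strings, or None if lxml is missing."""
    try:
        from lxml import etree
    except ImportError:
        return None
    return (
        format_version(etree.LIBXML_VERSION),           # libxml2 loaded right now
        format_version(etree.LIBXML_COMPILED_VERSION),  # libxml2 lxml was built against
    )


if __name__ == "__main__":
    versions = libxml2_versions()
    if versions is None:
        print("lxml is not installed")
    else:
        runtime, compiled = versions
        print("libxml2 in use:", runtime)
        print("libxml2 lxml was built against:", compiled)
```

If the two versions differ, the runtime one is the one doing the parsing, so that's the one to look at when a page won't parse.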
If you find a page in the wild that it can't handle, I'd love to know the URL.