- python (<< 2.8)
- python (>= 2.7~)
- python-chardet
- python-lxml
- python:any (<< 2.8)
- python:any (>= 2.7~)
- libc6 (>= 2.17)
- libxml2 (>= 2.7.4)
A fast implementation of the HTML 5 parsing spec for Python. Parsing is
done in C using a variant of the gumbo parser. The gumbo parse tree is
then transformed into an lxml tree, also in C, yielding parse times that
can be a thirtieth of the html5lib parse times. That is a speedup of 30x.
This differs, for instance, from the gumbo python bindings, where the
initial parsing is done in C but the transformation into the final
tree is done in python.
Installed Size: 463.9 kB
Architectures: arm64 amd64