- python3 (<< 3.14)
- python3 (>= 3.13~)
- python3-chardet
- python3-lxml
- python3:any
- libc6 (>= 2.17)
- libxml2 (>= 2.7.4)
A fast implementation of the HTML 5 parsing spec for Python. Parsing is
done in C using a variant of the gumbo parser. The gumbo parse tree is
then transformed into an lxml tree, also in C, yielding parse times that
can be a thirtieth of the html5lib parse times. That is a speedup of 30x.
This differs, for instance, from the gumbo Python bindings, where the
initial parsing is done in C but the transformation into the final
tree is done in Python.
Installed Size: 529.4 kB
Architectures: arm64 amd64