python-w3lib

Collection of web-related functions for Python (Python 2)


A Python module of simple, reusable functions for working with URLs, HTML, forms, and HTTP that are not found in the Python standard library.

This module can be used, for example, to:

  • remove comments or tags from HTML snippets
  • extract the base URL from HTML snippets
  • translate entities in HTML strings
  • encode multipart/form-data
  • convert raw HTTP headers to dicts and vice versa
  • construct HTTP auth headers
  • join URLs in an RFC-compliant way
  • sanitize URLs (like browsers do)
  • extract arguments from URLs
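
To make a few of the tasks above concrete, here is a minimal sketch written with the standard library only. It is not w3lib's code or API: the helper names (raw_headers_to_dict, dict_to_raw_headers, basic_auth_value, query_argument) are hypothetical and chosen for illustration, and the parsing is deliberately simplified (for example, it does not handle folded header lines).

```python
import base64

try:
    from urllib.parse import parse_qs, urlsplit  # Python 3
except ImportError:
    from urlparse import parse_qs, urlsplit  # Python 2


def raw_headers_to_dict(raw):
    """Parse a raw header block into {name: [values]} (hypothetical helper)."""
    headers = {}
    for line in raw.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split(":", 1)
        headers.setdefault(name.strip(), []).append(value.strip())
    return headers


def dict_to_raw_headers(headers):
    """Serialize {name: [values]} back into CRLF-separated header lines."""
    lines = []
    for name in sorted(headers):  # sorted only to make the output stable
        for value in headers[name]:
            lines.append("%s: %s" % (name, value))
    return "\r\n".join(lines)


def basic_auth_value(username, password):
    """Build the value of an HTTP Basic 'Authorization' header."""
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def query_argument(url, name, default=None):
    """Return the first value of a query-string argument, or a default."""
    values = parse_qs(urlsplit(url).query).get(name)
    return values[0] if values else default


if __name__ == "__main__":
    parsed = raw_headers_to_dict("Content-Type: text/html\r\nAccept: gzip")
    print(parsed)
    print(dict_to_raw_headers(parsed))
    print(basic_auth_value("user", "pass"))
    print(query_argument("http://www.example.com/index.html?id=200&foo=bar", "id"))
```

The library packages ready-made helpers for this kind of task (in modules such as w3lib.http and w3lib.url), so application code does not need to carry small functions like these itself.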

The w3lib code was originally part of the Scrapy framework but was later split out, with the aim of making it more reusable and providing a useful library of web functions that does not depend on Scrapy.

This is the Python 2 version of the package.

Related packages: python3-w3lib


Maintainer information

This software package is maintained for (Neuro)Debian by the following individuals and/or groups:

NeuroDebian Maintainers
Ignace Mouzannar

In order to get support, or to get in touch with a maintainer, please click the ‘Help’ button at the top of the page.

Advanced user information

Version control system available: Browse sources

Package availability chart
Distribution                            Base version  Our version          Architectures
Debian GNU/Linux 7.0 (wheezy)           1.0-1         1.11.0-1~nd70+1      i386, amd64, sparc
Debian GNU/Linux 8.0 (jessie)           1.5-1         1.11.0-1~nd80+1      i386, amd64, sparc
Debian testing (stretch)                1.11.0-1      1.11.0-1~nd90+1      i386, amd64, sparc
Debian unstable (sid)                   1.11.0-1      1.11.0-1~nd+1        i386, amd64, sparc
Ubuntu 14.04 “Trusty Tahr” (trusty)     1.5-1         1.11.0-1~nd14.04+1   i386, amd64, sparc
Ubuntu 14.10 “Utopic Unicorn” (utopic)  1.5-1         1.11.0-1~nd14.10+1   i386, amd64, sparc
Ubuntu 15.04 “Vivid Vervet” (vivid)     1.5-1         1.11.0-1~nd15.04+1   i386, amd64, sparc