Merge pull request #25 from sigmavirus24/release/1.0
Create 1.0 Release
commit
482573d3f2
@ -7,8 +7,6 @@ script: tox
|
||||
|
||||
matrix:
|
||||
include:
|
||||
- python: 2.6
|
||||
env: TOXENV=py26
|
||||
- python: 2.7
|
||||
env: TOXENV=py27
|
||||
- python: 3.3
|
||||
@ -19,10 +17,8 @@ matrix:
|
||||
env: TOXENV=py35
|
||||
- python: pypy
|
||||
env: TOXENV=pypy
|
||||
- python: 2.7
|
||||
env: TOXENV=py27-flake8
|
||||
- python: 3.4
|
||||
env: TOXENV=py34-flake8
|
||||
- python: 3.5
|
||||
env: TOXENV=flake8
|
||||
#- env: TOXENV=docs
|
||||
|
||||
notifications:
|
||||
|
63
HISTORY.rst
@ -1,63 +0,0 @@
|
||||
0.4.2 -- 2016-08-22
|
||||
-------------------
|
||||
|
||||
- Avoid parsing a string with just an IPv6 address as having a scheme of
|
||||
``[``.
|
||||
|
||||
0.4.1 -- 2016-08-22
|
||||
-------------------
|
||||
|
||||
- Normalize URIs constructed using ``ParseResult.from_parts`` and
|
||||
``ParseResultBytes.from_parts``
|
||||
|
||||
0.4.0 -- 2016-08-20
|
||||
-------------------
|
||||
|
||||
- Add ``ParseResult.from_parts`` and ``ParseResultBytes.from_parts`` class
|
||||
methods to easily create a ParseResult
|
||||
|
||||
- When using regular expressions, use ``[0-9]`` instead of ``\d`` to avoid
|
||||
finding ports with "numerals" that are not valid in a port
|
||||
|
||||
0.3.1 -- 2015-12-15
|
||||
-------------------
|
||||
|
||||
- Preserve empty query strings during normalization
|
||||
|
||||
0.3.0 -- 2015-10-20
|
||||
-------------------
|
||||
|
||||
- Read README and HISTORY files using the appropriate codec so rfc3986 can be
|
||||
installed on systems with locales other than UTF-8 (specifically C)
|
||||
|
||||
- Replace the standard library's urlparse behaviour
|
||||
|
||||
0.2.2 -- 2015-05-27
|
||||
-------------------
|
||||
|
||||
- Update the regular name regular expression to accept all of the characters
|
||||
allowed in the RFC. Closes bug #11 (Thanks Viktor Haag). Previously URIs
|
||||
similar to "http://http-bin.org" would be considered invalid.
|
||||
|
||||
0.2.1 -- 2015-03-20
|
||||
-------------------
|
||||
|
||||
- Check that the bytes of an IPv4 Host Address are within the valid range.
|
||||
Otherwise, URIs like "http://256.255.255.0/v1/resource" are considered
|
||||
valid.
|
||||
|
||||
- Add 6 to the list of unreserved characters. It was previously missing.
|
||||
Closes bug #9
|
||||
|
||||
0.2.0 -- 2014-06-30
|
||||
-------------------
|
||||
|
||||
- Add support for requiring components during validation. This includes adding
|
||||
parameters ``require_scheme``, ``require_authority``, ``require_path``,
|
||||
``require_query``, and ``require_fragment`` to
|
||||
``rfc3986.is_valid_uri`` and ``URIReference#is_valid``.
|
||||
|
||||
0.1.0 -- 2014-06-27
|
||||
-------------------
|
||||
|
||||
- Initial Release includes validation and normalization of URIs
|
@ -1,9 +1,8 @@
|
||||
include README.rst
|
||||
include LICENSE
|
||||
include HISTORY.rst
|
||||
include AUTHORS.rst
|
||||
include setup.cfg
|
||||
prune *.pyc
|
||||
#recursive-include docs *.rst *.py Makefile
|
||||
recursive-include docs *.rst *.py
|
||||
recursive-include tests *.py
|
||||
prune docs/_build
|
||||
|
@ -2,12 +2,12 @@ rfc3986
|
||||
=======
|
||||
|
||||
A Python implementation of `RFC 3986`_ including validation and authority
|
||||
parsing. Coming soon: `Reference Resolution <http://tools.ietf.org/html/rfc3986#section-5>`_.
|
||||
parsing.
|
||||
|
||||
Installation
|
||||
------------
|
||||
|
||||
Simply use pip to install ``rfc3986`` like so::
|
||||
Use pip to install ``rfc3986`` like so::
|
||||
|
||||
pip install rfc3986
|
||||
|
||||
|
9
docs/source/api-ref/api.rst
Normal file
@ -0,0 +1,9 @@
|
||||
===============
|
||||
API Submodule
|
||||
===============
|
||||
|
||||
.. autofunction:: rfc3986.api.urlparse
|
||||
|
||||
.. autofunction:: rfc3986.api.uri_reference
|
||||
|
||||
.. autofunction:: rfc3986.api.normalize_uri
|
23
docs/source/api-ref/builder.rst
Normal file
@ -0,0 +1,23 @@
|
||||
====================
|
||||
URI Builder Module
|
||||
====================
|
||||
|
||||
.. autoclass:: rfc3986.builder.URIBuilder
|
||||
|
||||
.. automethod:: rfc3986.builder.URIBuilder.add_scheme
|
||||
|
||||
.. automethod:: rfc3986.builder.URIBuilder.add_credentials
|
||||
|
||||
.. automethod:: rfc3986.builder.URIBuilder.add_host
|
||||
|
||||
.. automethod:: rfc3986.builder.URIBuilder.add_port
|
||||
|
||||
.. automethod:: rfc3986.builder.URIBuilder.add_path
|
||||
|
||||
.. automethod:: rfc3986.builder.URIBuilder.add_query_from
|
||||
|
||||
.. automethod:: rfc3986.builder.URIBuilder.add_query
|
||||
|
||||
.. automethod:: rfc3986.builder.URIBuilder.add_fragment
|
||||
|
||||
.. automethod:: rfc3986.builder.URIBuilder.finalize
|
16
docs/source/api-ref/index.rst
Normal file
@ -0,0 +1,16 @@
|
||||
===============
|
||||
API Reference
|
||||
===============
|
||||
|
||||
This section contains API documentation generated from the source code of
|
||||
|rfc3986|. If you're looking for an introduction to the module and how it
|
||||
can be utilized, please see :ref:`narrative` instead.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
api
|
||||
builder
|
||||
uri
|
||||
validators
|
||||
miscellaneous
|
231
docs/source/api-ref/miscellaneous.rst
Normal file
@ -0,0 +1,231 @@
|
||||
==========================
|
||||
Miscellaneous Submodules
|
||||
==========================
|
||||
|
||||
There are several submodules in |rfc3986| that are not meant to be exposed to
|
||||
users directly but which are valuable to document, regardless.
|
||||
|
||||
.. data:: rfc3986.misc.UseExisting
|
||||
|
||||
A sentinel object to make certain APIs simpler for users.
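
To illustrate why a sentinel helps here, the following is a minimal sketch of the general pattern, not the library's own code. The names ``KEEP_EXISTING``, ``Reference`` and this ``copy_with`` are hypothetical and exist only for the example. The point is that a sentinel lets a method tell "argument not passed" apart from an explicit ``None``.

```python
# Illustrative sketch only: the class and function names below are
# hypothetical and are not taken from rfc3986's source.
from collections import namedtuple


class _KeepExisting(object):
    """Marker object meaning 'leave this component as it is'."""

    def __repr__(self):
        return "<keep existing>"


KEEP_EXISTING = _KeepExisting()


class Reference(namedtuple("Reference", ["scheme", "host", "path"])):
    """A tiny stand-in for a parsed URI with three components."""

    def copy_with(self, scheme=KEEP_EXISTING, host=KEEP_EXISTING,
                  path=KEEP_EXISTING):
        # Only components the caller actually passed are replaced, so an
        # explicit None can be used to remove a component.
        passed = {"scheme": scheme, "host": host, "path": path}
        changes = {
            name: value
            for name, value in passed.items()
            if value is not KEEP_EXISTING
        }
        return self._replace(**changes)


ref = Reference("https", "github.com", "/sigmavirus24")
without_path = ref.copy_with(path=None)   # path removed, rest untouched
unchanged = ref.copy_with()               # nothing passed, nothing changed
```

Using ``None`` as the default instead would make it impossible to request "remove this component", which is presumably why a dedicated object is used.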
|
||||
|
||||
.. module:: rfc3986.abnf_regexp
|
||||
|
||||
The :mod:`rfc3986.abnf_regexp` module contains the regular expressions written
|
||||
from the RFC's ABNF. The :mod:`rfc3986.misc` module contains compiled regular
|
||||
expressions from :mod:`rfc3986.abnf_regexp` and previously contained those
|
||||
regular expressions.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.GEN_DELIMS
|
||||
.. data:: rfc3986.abnf_regexp.GENERIC_DELIMITERS
|
||||
|
||||
The string containing all of the generic delimiters as defined on
|
||||
`page 13 <https://tools.ietf.org/html/rfc3986#page-13>`__.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.GENERIC_DELIMITERS_SET
|
||||
|
||||
:data:`rfc3986.abnf_regexp.GEN_DELIMS` represented as a :class:`set`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.SUB_DELIMS
|
||||
.. data:: rfc3986.abnf_regexp.SUB_DELIMITERS
|
||||
|
||||
The string containing all of the 'sub' delimiters as defined on
|
||||
`page 13 <https://tools.ietf.org/html/rfc3986#page-13>`__.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.SUB_DELIMITERS_SET
|
||||
|
||||
:data:`rfc3986.abnf_regexp.SUB_DELIMS` represented as a :class:`set`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.SUB_DELIMITERS_RE
|
||||
|
||||
:data:`rfc3986.abnf_regexp.SUB_DELIMS` with the ``*`` escaped for use in
|
||||
regular expressions.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.RESERVED_CHARS_SET
|
||||
|
||||
A :class:`set` constructed of :data:`GEN_DELIMS` and :data:`SUB_DELIMS`.
|
||||
This union is defined on `page 13
|
||||
<https://tools.ietf.org/html/rfc3986#page-13>`__.
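
As a rough sketch of how such a union can be built, the snippet below writes the two delimiter strings out by hand from RFC 3986, section 2.2. The variable and function names are local to the example and are assumptions, not imports from ``rfc3986.abnf_regexp``.

```python
# Delimiter strings written out from RFC 3986, section 2.2. The variable
# names are local to this sketch, not imports from rfc3986.
GEN_DELIMS = ":/?#[]@"
SUB_DELIMS = "!$&'()*+,;="

# The reserved set is the union of both delimiter groups.
RESERVED_CHARS = set(GEN_DELIMS) | set(SUB_DELIMS)


def is_reserved(character):
    """Return True if ``character`` is a reserved URI character."""
    return character in RESERVED_CHARS
```

A set makes membership checks cheap, which is useful when deciding character by character whether something must be percent-encoded.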
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.ALPHA
|
||||
|
||||
The string of upper- and lower-case letters in US-ASCII.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.DIGIT
|
||||
|
||||
The string of digits 0 through 9.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.UNRESERVED
|
||||
.. data:: rfc3986.abnf_regexp.UNRESERVED_CHARS
|
||||
|
||||
The string of unreserved characters defined in :rfc:`3986#section-2.3`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.UNRESERVED_CHARS_SET
|
||||
|
||||
:data:`rfc3986.abnf_regexp.UNRESERVED_CHARS` represented as a
|
||||
:class:`set`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.NON_PCT_ENCODED_SET
|
||||
|
||||
The non-percent encoded characters represented as a :class:`set`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.UNRESERVED_RE
|
||||
|
||||
Optimized regular expression for unreserved characters.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.SCHEME_RE
|
||||
|
||||
Stricter regular expression to match and validate the scheme part
|
||||
of a URI.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.COMPONENT_PATTERN_DICT
|
||||
|
||||
Dictionary with regular expressions to match various components in
|
||||
a URI. Except for :data:`rfc3986.abnf_regexp.SCHEME_RE`, all patterns
|
||||
are from :rfc:`3986#appendix-B`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.URL_PARSING_RE
|
||||
|
||||
Regular expression composed from the components in
|
||||
:data:`rfc3986.abnf_regexp.COMPONENT_PATTERN_DICT`.
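
For orientation, here is a small sketch that uses the unmodified regular expression printed in RFC 3986, Appendix B. It is not the pattern the library builds (which, as noted above, swaps in a stricter scheme expression), and ``split_uri`` is a hypothetical helper name.

```python
import re

# The regular expression exactly as given in RFC 3986, Appendix B.
APPENDIX_B = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(uri):
    """Split ``uri`` into its five components using the Appendix B groups."""
    match = APPENDIX_B.match(uri)  # every group is optional, so this matches
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


parts = split_uri("http://www.ics.uci.edu/pub/ietf/uri/#Related")
```

The example URI is the one the RFC itself uses. Components that are absent come back as ``None``, while the path is always a (possibly empty) string.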
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.HEXDIG_RE
|
||||
|
||||
Hexadecimal characters used in each piece of an IPv6 address.
|
||||
See :rfc:`3986#section-3.2.2`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.LS32_RE
|
||||
|
||||
Least significant 32 bits of an IPv6 address.
|
||||
See :rfc:`3986#section-3.2.2`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.REG_NAME
|
||||
.. data:: rfc3986.abnf_regexp.REGULAR_NAME_RE
|
||||
|
||||
The pattern for a regular name, e.g., ``www.google.com``,
|
||||
``api.github.com``.
|
||||
See :rfc:`3986#section-3.2.2`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.IPv4_RE
|
||||
|
||||
The pattern for an IPv4 address, e.g., ``192.168.255.255``.
|
||||
See :rfc:`3986#section-3.2.2`.
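
The sketch below shows one way to express the ``dec-octet`` rule from that section so that each octet stays within 0 to 255. It is an independent illustration under that assumption. The pattern and the ``is_ipv4`` helper are not copied from ``rfc3986.abnf_regexp``.

```python
import re

# dec-octet from RFC 3986, section 3.2.2: 0-9, 10-99, 100-199, 200-249,
# 250-255. Longer alternatives come first so they are tried first.
DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"

# Four octets separated by dots, anchored at both ends.
IPV4 = re.compile(r"^(?:%s\.){3}%s\Z" % (DEC_OCTET, DEC_OCTET))


def is_ipv4(host):
    """Return True if ``host`` is a dotted-quad IPv4 address."""
    return IPV4.match(host) is not None
```

With a pattern like this, a host such as ``256.255.255.0`` is rejected because no alternative of ``dec-octet`` can consume ``256``.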
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.IPv6_RE
|
||||
|
||||
The pattern for an IPv6 address, e.g., ``::1``.
|
||||
See :rfc:`3986#section-3.2.2`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.IPv_FUTURE_RE
|
||||
|
||||
A regular expression to parse out IPv Futures.
|
||||
See :rfc:`3986#section-3.2.2`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.IP_LITERAL_RE
|
||||
|
||||
Pattern to match IPv6 addresses and IPv Future addresses.
|
||||
See :rfc:`3986#section-3.2.2`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.HOST_RE
|
||||
.. data:: rfc3986.abnf_regexp.HOST_PATTERN
|
||||
|
||||
Pattern to match and validate the host piece of an authority.
|
||||
This is composed of
|
||||
|
||||
- :data:`rfc3986.abnf_regexp.REG_NAME`
|
||||
- :data:`rfc3986.abnf_regexp.IPv4_RE`
|
||||
- :data:`rfc3986.abnf_regexp.IP_LITERAL_RE`
|
||||
|
||||
See :rfc:`3986#section-3.2.2`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.USERINFO_RE
|
||||
|
||||
Pattern to match and validate the user information portion of
|
||||
an authority component.
|
||||
|
||||
See :rfc:`3986#section-3.2.2`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.PORT_RE
|
||||
|
||||
Pattern to match and validate the port portion of an authority
|
||||
component.
|
||||
|
||||
See :rfc:`3986#section-3.2.2`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.PCT_ENCODED
|
||||
.. data:: rfc3986.abnf_regexp.PERCENT_ENCODED
|
||||
|
||||
Regular expression to match percent encoded character values.
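
As a minimal sketch, a percent-encoded value is a ``%`` followed by exactly two hexadecimal digits (RFC 3986, section 2.1). The pattern and helper below are written for illustration and may differ from the string stored in the module.

```python
import re

# pct-encoded = "%" HEXDIG HEXDIG (RFC 3986, section 2.1)
PCT_ENCODED = re.compile(r"%[0-9A-Fa-f]{2}")


def percent_encoded_octets(text):
    """Return every well-formed percent-encoded triplet found in ``text``."""
    return PCT_ENCODED.findall(text)
```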
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.PCHAR
|
||||
|
||||
Regular expression to match the ``pchar`` rule: unreserved, percent-encoded,
and sub-delimiter characters plus ``:`` and ``@``.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.PATH_RE
|
||||
|
||||
Regular expression to match and validate the path component of a URI.
|
||||
|
||||
See :rfc:`3986#section-3.3`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.PATH_EMPTY
|
||||
.. data:: rfc3986.abnf_regexp.PATH_ROOTLESS
|
||||
.. data:: rfc3986.abnf_regexp.PATH_NOSCHEME
|
||||
.. data:: rfc3986.abnf_regexp.PATH_ABSOLUTE
|
||||
.. data:: rfc3986.abnf_regexp.PATH_ABEMPTY
|
||||
|
||||
Components of the :data:`rfc3986.abnf_regexp.PATH_RE`.
|
||||
|
||||
See :rfc:`3986#section-3.3`.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.QUERY_RE
|
||||
|
||||
Regular expression to parse and validate the query component of a URI.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.FRAGMENT_RE
|
||||
|
||||
Regular expression to parse and validate the fragment component of a URI.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.RELATIVE_PART_RE
|
||||
|
||||
Regular expression to parse the relative URI when resolving URIs.
|
||||
|
||||
.. data:: rfc3986.abnf_regexp.HIER_PART_RE
|
||||
|
||||
The hierarchical part of a URI. This regular expression is used when
|
||||
resolving relative URIs.
|
||||
|
||||
See :rfc:`3986#section-3`.
|
||||
|
||||
.. module:: rfc3986.misc
|
||||
|
||||
.. data:: rfc3986.misc.URI_MATCHER
|
||||
|
||||
Compiled version of :data:`rfc3986.abnf_regexp.URL_PARSING_RE`.
|
||||
|
||||
.. data:: rfc3986.misc.SUBAUTHORITY_MATCHER
|
||||
|
||||
Compiled combination of :data:`rfc3986.abnf_regexp.USERINFO_RE`,
|
||||
:data:`rfc3986.abnf_regexp.HOST_PATTERN`,
|
||||
:data:`rfc3986.abnf_regexp.PORT_RE`.
|
||||
|
||||
.. data:: rfc3986.misc.SCHEME_MATCHER
|
||||
|
||||
Compiled version of :data:`rfc3986.abnf_regexp.SCHEME_RE`.
|
||||
|
||||
.. data:: rfc3986.misc.IPv4_MATCHER
|
||||
|
||||
Compiled version of :data:`rfc3986.abnf_regexp.IPv4_RE`.
|
||||
|
||||
.. data:: rfc3986.misc.PATH_MATCHER
|
||||
|
||||
Compiled version of :data:`rfc3986.abnf_regexp.PATH_RE`.
|
||||
|
||||
.. data:: rfc3986.misc.QUERY_MATCHER
|
||||
|
||||
Compiled version of :data:`rfc3986.abnf_regexp.QUERY_RE`.
|
||||
|
||||
.. data:: rfc3986.misc.RELATIVE_REF_MATCHER
|
||||
|
||||
Compiled combination of :data:`rfc3986.abnf_regexp.SCHEME_RE`,
|
||||
:data:`rfc3986.abnf_regexp.HIER_PART_RE`,
|
||||
:data:`rfc3986.abnf_regexp.QUERY_RE`.
|
19
docs/source/api-ref/uri.rst
Normal file
@ -0,0 +1,19 @@
|
||||
===============
|
||||
URI Submodule
|
||||
===============
|
||||
|
||||
.. autoclass:: rfc3986.uri.URIReference
|
||||
|
||||
.. automethod:: rfc3986.uri.URIReference.from_string
|
||||
|
||||
.. automethod:: rfc3986.uri.URIReference.unsplit
|
||||
|
||||
.. automethod:: rfc3986.uri.URIReference.resolve_with
|
||||
|
||||
.. automethod:: rfc3986.uri.URIReference.copy_with
|
||||
|
||||
.. automethod:: rfc3986.uri.URIReference.normalize
|
||||
|
||||
.. automethod:: rfc3986.uri.URIReference.is_absolute
|
||||
|
||||
.. automethod:: rfc3986.uri.URIReference.authority_info
|
19
docs/source/api-ref/validators.rst
Normal file
@ -0,0 +1,19 @@
|
||||
======================
|
||||
Validators Submodule
|
||||
======================
|
||||
|
||||
.. autoclass:: rfc3986.validators.Validator
|
||||
|
||||
.. automethod:: rfc3986.validators.Validator.allow_schemes
|
||||
|
||||
.. automethod:: rfc3986.validators.Validator.allow_hosts
|
||||
|
||||
.. automethod:: rfc3986.validators.Validator.allow_ports
|
||||
|
||||
.. automethod:: rfc3986.validators.Validator.allow_use_of_password
|
||||
|
||||
.. automethod:: rfc3986.validators.Validator.forbid_use_of_password
|
||||
|
||||
.. automethod:: rfc3986.validators.Validator.require_presence_of
|
||||
|
||||
.. automethod:: rfc3986.validators.Validator.validate
|
166
docs/source/conf.py
Normal file
@ -0,0 +1,166 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
#
|
||||
# rfc3986 documentation build configuration file, created by
|
||||
# sphinx-quickstart on Tue Mar 14 07:06:46 2017.
|
||||
#
|
||||
# This file is execfile()d with the current directory set to its
|
||||
# containing dir.
|
||||
#
|
||||
# Note that not all possible configuration values are present in this
|
||||
# autogenerated file.
|
||||
#
|
||||
# All configuration values have a default; values that are commented out
|
||||
# serve to show the default.
|
||||
|
||||
# If extensions (or modules to document with autodoc) are in another directory,
|
||||
# add these directories to sys.path here. If the directory is relative to the
|
||||
# documentation root, use os.path.abspath to make it absolute, like shown here.
|
||||
#
|
||||
# import os
|
||||
# import sys
|
||||
# sys.path.insert(0, os.path.abspath('.'))
|
||||
|
||||
|
||||
# -- General configuration ------------------------------------------------
|
||||
|
||||
# If your documentation needs a minimal Sphinx version, state it here.
|
||||
#
|
||||
# needs_sphinx = '1.0'
|
||||
rst_epilog = """
|
||||
.. |rfc3986| replace:: :mod:`rfc3986`
|
||||
"""
|
||||
|
||||
# Add any Sphinx extension module names here, as strings. They can be
|
||||
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
|
||||
# ones.
|
||||
extensions = [
|
||||
'sphinx.ext.autodoc',
|
||||
'sphinx.ext.doctest',
|
||||
'sphinx.ext.intersphinx',
|
||||
'sphinx.ext.coverage',
|
||||
'sphinx-prompt',
|
||||
]
|
||||
|
||||
# Add any paths that contain templates here, relative to this directory.
|
||||
templates_path = ['_templates']
|
||||
|
||||
# The suffix(es) of source filenames.
|
||||
# You can specify multiple suffix as a list of string:
|
||||
#
|
||||
# source_suffix = ['.rst', '.md']
|
||||
source_suffix = '.rst'
|
||||
|
||||
# The master toctree document.
|
||||
master_doc = 'index'
|
||||
|
||||
# General information about the project.
|
||||
project = u'rfc3986'
|
||||
copyright = u'2017, Ian Cordasco'
|
||||
author = u'Ian Cordasco'
|
||||
|
||||
# The version info for the project you're documenting, acts as replacement for
|
||||
# |version| and |release|, also used in various other places throughout the
|
||||
# built documents.
|
||||
#
|
||||
# The short X.Y version.
|
||||
version = u'1.0.0'
|
||||
# The full version, including alpha/beta/rc tags.
|
||||
release = u'1.0.0'
|
||||
|
||||
# The language for content autogenerated by Sphinx. Refer to documentation
|
||||
# for a list of supported languages.
|
||||
#
|
||||
# This is also used if you do content translation via gettext catalogs.
|
||||
# Usually you set "language" from the command line for these cases.
|
||||
language = None
|
||||
|
||||
# List of patterns, relative to source directory, that match files and
|
||||
# directories to ignore when looking for source files.
|
||||
# These patterns also affect html_static_path and html_extra_path
|
||||
exclude_patterns = []
|
||||
|
||||
# The name of the Pygments (syntax highlighting) style to use.
|
||||
pygments_style = 'sphinx'
|
||||
|
||||
# If true, `todo` and `todoList` produce output, else they produce nothing.
|
||||
todo_include_todos = False
|
||||
|
||||
|
||||
# -- Options for HTML output ----------------------------------------------
|
||||
|
||||
# The theme to use for HTML and HTML Help pages. See the documentation for
|
||||
# a list of builtin themes.
|
||||
#
|
||||
html_theme = 'alabaster'
|
||||
|
||||
# Theme options are theme-specific and customize the look and feel of a theme
|
||||
# further. For a list of options available for each theme, see the
|
||||
# documentation.
|
||||
#
|
||||
# html_theme_options = {}
|
||||
|
||||
# Add any paths that contain custom static files (such as style sheets) here,
|
||||
# relative to this directory. They are copied after the builtin static files,
|
||||
# so a file named "default.css" will overwrite the builtin "default.css".
|
||||
html_static_path = ['_static']
|
||||
|
||||
|
||||
# -- Options for HTMLHelp output ------------------------------------------
|
||||
|
||||
# Output file base name for HTML help builder.
|
||||
htmlhelp_basename = 'rfc3986doc'
|
||||
|
||||
|
||||
# -- Options for LaTeX output ---------------------------------------------
|
||||
|
||||
latex_elements = {
|
||||
# The paper size ('letterpaper' or 'a4paper').
|
||||
#
|
||||
# 'papersize': 'letterpaper',
|
||||
|
||||
# The font size ('10pt', '11pt' or '12pt').
|
||||
#
|
||||
# 'pointsize': '10pt',
|
||||
|
||||
# Additional stuff for the LaTeX preamble.
|
||||
#
|
||||
# 'preamble': '',
|
||||
|
||||
# Latex figure (float) alignment
|
||||
#
|
||||
# 'figure_align': 'htbp',
|
||||
}
|
||||
|
||||
# Grouping the document tree into LaTeX files. List of tuples
|
||||
# (source start file, target name, title,
|
||||
# author, documentclass [howto, manual, or own class]).
|
||||
latex_documents = [
|
||||
(master_doc, 'rfc3986.tex', u'rfc3986 Documentation',
|
||||
u'Ian Cordasco', 'manual'),
|
||||
]
|
||||
|
||||
|
||||
# -- Options for manual page output ---------------------------------------
|
||||
|
||||
# One entry per manual page. List of tuples
|
||||
# (source start file, name, description, authors, manual section).
|
||||
man_pages = [
|
||||
(master_doc, 'rfc3986', u'rfc3986 Documentation',
|
||||
[author], 1)
|
||||
]
|
||||
|
||||
|
||||
# -- Options for Texinfo output -------------------------------------------
|
||||
|
||||
# Grouping the document tree into Texinfo files. List of tuples
|
||||
# (source start file, target name, title, author,
|
||||
# dir menu entry, description, category)
|
||||
texinfo_documents = [
|
||||
(master_doc, 'rfc3986', u'rfc3986 Documentation',
|
||||
author, 'rfc3986', 'One line description of project.',
|
||||
'Miscellaneous'),
|
||||
]
|
||||
|
||||
|
||||
# Example configuration for intersphinx: refer to the Python standard library.
|
||||
intersphinx_mapping = {'https://docs.python.org/': None}
|
29
docs/source/index.rst
Normal file
@ -0,0 +1,29 @@
|
||||
=========
|
||||
rfc3986
|
||||
=========
|
||||
|
||||
|rfc3986| is a Python implementation of :rfc:`3986` including validation and
|
||||
authority parsing. This module also supports :rfc:`6874` which adds support
|
||||
for zone identifiers to IPv6 Addresses.
|
||||
|
||||
The maintainers strongly suggest using `pip`_ to install |rfc3986|. For
|
||||
example,
|
||||
|
||||
.. prompt:: bash
|
||||
|
||||
pip install rfc3986
|
||||
python -m pip install rfc3986
|
||||
python3.6 -m pip install rfc3986
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
:caption: Contents:
|
||||
|
||||
narrative
|
||||
api-ref/index
|
||||
release-notes/index
|
||||
|
||||
|
||||
.. links
|
||||
.. _pip:
|
||||
https://pypi.python.org/pypi/pip/
|
29
docs/source/narrative.rst
Normal file
@ -0,0 +1,29 @@
|
||||
.. _narrative:
|
||||
|
||||
====================
|
||||
User Documentation
|
||||
====================
|
||||
|
||||
|rfc3986| has several API features and convenience methods. The core of
|
||||
|rfc3986|'s API revolves around parsing, validating, and building URIs.
|
||||
|
||||
There is an API to provide compatibility with :mod:`urllib.parse`, there is an
|
||||
API to parse a URI as a URI Reference, there's an API to provide validation of
|
||||
URIs, and finally there's an API to build URIs.
|
||||
|
||||
.. note::
|
||||
|
||||
There's presently no support for IRIs as defined in :rfc:`3987`.
|
||||
|
||||
|rfc3986| parses URIs much differently from :mod:`urllib.parse` so users may
|
||||
see some subtle differences with very specific URLs that contain rough
|
||||
edge cases. Regardless, we do our best to implement the same API so you should
|
||||
be able to seamlessly swap |rfc3986| for ``urlparse``.
|
||||
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
user/parsing
|
||||
user/validating
|
||||
user/building
|
4
docs/source/release-notes/0.1.0.rst
Normal file
@ -0,0 +1,4 @@
|
||||
0.1.0 -- 2014-06-27
|
||||
-------------------
|
||||
|
||||
- Initial Release includes validation and normalization of URIs
|
7
docs/source/release-notes/0.2.0.rst
Normal file
@ -0,0 +1,7 @@
|
||||
0.2.0 -- 2014-06-30
|
||||
-------------------
|
||||
|
||||
- Add support for requiring components during validation. This includes adding
|
||||
parameters ``require_scheme``, ``require_authority``, ``require_path``,
|
||||
``require_query``, and ``require_fragment`` to
|
||||
``rfc3986.is_valid_uri`` and ``URIReference#is_valid``.
|
9
docs/source/release-notes/0.2.1.rst
Normal file
@ -0,0 +1,9 @@
|
||||
0.2.1 -- 2015-03-20
|
||||
-------------------
|
||||
|
||||
- Check that the bytes of an IPv4 Host Address are within the valid range.
|
||||
Otherwise, URIs like "http://256.255.255.0/v1/resource" are considered
|
||||
valid.
|
||||
|
||||
- Add 6 to the list of unreserved characters. It was previously missing.
|
||||
Closes bug #9
|
7
docs/source/release-notes/0.2.2.rst
Normal file
@ -0,0 +1,7 @@
|
||||
0.2.2 -- 2015-05-27
|
||||
-------------------
|
||||
|
||||
- Update the regular name regular expression to accept all of the characters
|
||||
allowed in the RFC. Closes bug #11 (Thanks Viktor Haag). Previously URIs
|
||||
similar to "http://http-bin.org" would be considered invalid.
|
||||
|
7
docs/source/release-notes/0.3.0.rst
Normal file
@ -0,0 +1,7 @@
|
||||
0.3.0 -- 2015-10-20
|
||||
-------------------
|
||||
|
||||
- Read README and HISTORY files using the appropriate codec so rfc3986 can be
|
||||
installed on systems with locales other than UTF-8 (specifically C)
|
||||
|
||||
- Replace the standard library's urlparse behaviour
|
4
docs/source/release-notes/0.3.1.rst
Normal file
@ -0,0 +1,4 @@
|
||||
0.3.1 -- 2015-12-15
|
||||
-------------------
|
||||
|
||||
- Preserve empty query strings during normalization
|
8
docs/source/release-notes/0.4.0.rst
Normal file
@ -0,0 +1,8 @@
|
||||
0.4.0 -- 2016-08-20
|
||||
-------------------
|
||||
|
||||
- Add ``ParseResult.from_parts`` and ``ParseResultBytes.from_parts`` class
|
||||
methods to easily create a ParseResult
|
||||
|
||||
- When using regular expressions, use ``[0-9]`` instead of ``\d`` to avoid
|
||||
finding ports with "numerals" that are not valid in a port
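
The difference between the two character classes can be seen with a short sketch. It assumes Python 3, where ``\d`` in a ``str`` pattern matches any Unicode decimal digit. The variable names are made up for the example.

```python
import re

# "443" written with Arabic-Indic digits (U+0664 U+0664 U+0663).
not_a_port = "\u0664\u0664\u0663"

# \d accepts any Unicode decimal digit in a str pattern on Python 3 ...
matches_backslash_d = re.match(r"^\d+$", not_a_port) is not None

# ... while an explicit ASCII range only accepts the characters 0-9.
matches_ascii_class = re.match(r"^[0-9]+$", not_a_port) is not None
```

Here ``matches_backslash_d`` is true and ``matches_ascii_class`` is false, which is the behaviour the changelog entry describes wanting for ports.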
|
5
docs/source/release-notes/0.4.1.rst
Normal file
@ -0,0 +1,5 @@
|
||||
0.4.1 -- 2016-08-22
|
||||
-------------------
|
||||
|
||||
- Normalize URIs constructed using ``ParseResult.from_parts`` and
|
||||
``ParseResultBytes.from_parts``
|
5
docs/source/release-notes/0.4.2.rst
Normal file
@ -0,0 +1,5 @@
|
||||
0.4.2 -- 2016-08-22
|
||||
-------------------
|
||||
|
||||
- Avoid parsing a string with just an IPv6 address as having a scheme of
|
||||
``[``.
|
26
docs/source/release-notes/1.0.0.rst
Normal file
@ -0,0 +1,26 @@
|
||||
1.0.0 -- 2017-05-10
|
||||
-------------------
|
||||
|
||||
- Add support for :rfc:`6874` - Zone Identifiers in IPv6 Addresses
|
||||
|
||||
See also `issue #2`_
|
||||
|
||||
- Add a more flexible and usable validation framework. See our documentation
|
||||
for more information.
|
||||
|
||||
- Add an object to aid in building new URIs from scratch. See our
|
||||
documentation for more information.
|
||||
|
||||
- Add real documentation for the entire module.
|
||||
|
||||
- Add separate submodule with documented regular expression strings for the
|
||||
collected ABNF.
|
||||
|
||||
- Allow ``None`` to be used to eliminate components via ``copy_with`` for URIs
|
||||
and ParseResults.
|
||||
|
||||
- Move release history into our documentation.
|
||||
|
||||
.. links
|
||||
.. _issue #2:
|
||||
https://github.com/sigmavirus24/rfc3986/issues/2
|
28
docs/source/release-notes/index.rst
Normal file
@ -0,0 +1,28 @@
|
||||
===========================
|
||||
Release Notes and History
|
||||
===========================
|
||||
|
||||
All of the release notes that have been recorded for |rfc3986| are organized
|
||||
here with the newest releases first.
|
||||
|
||||
1.x Release Series
|
||||
==================
|
||||
|
||||
.. toctree::
|
||||
|
||||
1.0.0
|
||||
|
||||
0.x Release Series
|
||||
==================
|
||||
|
||||
.. toctree::
|
||||
|
||||
0.4.2
|
||||
0.4.1
|
||||
0.4.0
|
||||
0.3.1
|
||||
0.3.0
|
||||
0.2.2
|
||||
0.2.1
|
||||
0.2.0
|
||||
0.1.0
|
116
docs/source/user/building.rst
Normal file
@ -0,0 +1,116 @@
|
||||
===============
|
||||
Building URIs
|
||||
===============
|
||||
|
||||
Constructing URLs often seems simple. There are some problems with
|
||||
concatenating strings to build a URL:
|
||||
|
||||
- Certain parts of the URL disallow certain characters
|
||||
|
||||
- Formatting some parts of the URL is tricky and doing it manually isn't fun
|
||||
|
||||
To make the experience better |rfc3986| provides the
|
||||
:class:`~rfc3986.builder.URIBuilder` class to generate valid
|
||||
:class:`~rfc3986.uri.URIReference` instances. The
|
||||
:class:`~rfc3986.builder.URIBuilder` class will handle ensuring that each
|
||||
component is normalized and safe for real world use.
|
||||
|
||||
|
||||
Example Usage
|
||||
=============
|
||||
|
||||
.. note::
|
||||
|
||||
All of the methods on a :class:`~rfc3986.builder.URIBuilder` are
|
||||
chainable (except :meth:`~rfc3986.builder.URIBuilder.finalize`).
|
||||
|
||||
Let's build a basic URL with just a scheme and host. First we create an
|
||||
instance of :class:`~rfc3986.builder.URIBuilder`. Then we call
|
||||
:meth:`~rfc3986.builder.URIBuilder.add_scheme` and
|
||||
:meth:`~rfc3986.builder.URIBuilder.add_host` with the scheme and host
|
||||
we want to include in the URL. Then we convert our builder object into
|
||||
a :class:`~rfc3986.uri.URIReference` and call
|
||||
:meth:`~rfc3986.uri.URIReference.unsplit`.
|
||||
|
||||
.. doctest::

    >>> from rfc3986 import builder
    >>> print(builder.URIBuilder().add_scheme(
    ...     'https'
    ... ).add_host(
    ...     'github.com'
    ... ).finalize().unsplit())
    https://github.com
|
||||
|
||||
Each time you invoke a method, you get a new instance of a
|
||||
:class:`~rfc3986.builder.URIBuilder` class so you can build several different
|
||||
URLs from one base instance.
|
||||
|
||||
.. doctest::

    >>> from rfc3986 import builder
    >>> github_builder = builder.URIBuilder().add_scheme(
    ...     'https'
    ... ).add_host(
    ...     'api.github.com'
    ... )
    >>> print(github_builder.add_path(
    ...     '/users/sigmavirus24'
    ... ).finalize().unsplit())
    https://api.github.com/users/sigmavirus24
    >>> print(github_builder.add_path(
    ...     '/repos/sigmavirus24/rfc3986'
    ... ).finalize().unsplit())
    https://api.github.com/repos/sigmavirus24/rfc3986
|
||||
|
||||
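The "new instance on every call" behaviour can be sketched with a ``namedtuple``. This is a simplified, hypothetical model (``MiniBuilder`` and ``to_string`` are invented names) and not how :class:`~rfc3986.builder.URIBuilder` is necessarily implemented.

```python
# Hypothetical miniature builder. It only models the idea that each call
# returns a fresh object and leaves the original untouched.
from collections import namedtuple


class MiniBuilder(namedtuple("MiniBuilder", ["scheme", "host", "path"])):

    def __new__(cls, scheme=None, host=None, path=None):
        return super(MiniBuilder, cls).__new__(cls, scheme, host, path)

    def add_scheme(self, scheme):
        return self._replace(scheme=scheme.lower())

    def add_host(self, host):
        return self._replace(host=host.lower())

    def add_path(self, path):
        if not path.startswith("/"):
            path = "/" + path
        return self._replace(path=path)

    def to_string(self):
        return "%s://%s%s" % (self.scheme, self.host, self.path or "")


base = MiniBuilder().add_scheme("https").add_host("api.github.com")
users = base.add_path("/users/sigmavirus24")
repos = base.add_path("/repos/sigmavirus24/rfc3986")
```

Because ``base`` is never mutated, both ``users`` and ``repos`` start from the same scheme and host, which is what makes a shared base instance safe to reuse.
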
|rfc3986| makes adding authentication credentials convenient. It takes care of
|
||||
making the credentials URL safe. There are some characters someone might want
|
||||
to include in a URL that are not safe for the authority component of a URL.
|
||||
|
||||
.. doctest::

    >>> from rfc3986 import builder
    >>> print(builder.URIBuilder().add_scheme(
    ...     'https'
    ... ).add_host(
    ...     'api.github.com'
    ... ).add_credentials(
    ...     username='us3r',
    ...     password='p@ssw0rd',
    ... ).finalize().unsplit())
    https://us3r:p%40ssw0rd@api.github.com
|
||||
|
||||
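For comparison, the same kind of escaping can be approximated with the standard library alone. The sketch below assumes Python 3 and uses :func:`urllib.parse.quote` as a stand-in. It does not claim to reproduce the exact rules |rfc3986| applies, and ``encode_userinfo`` is a hypothetical helper.

```python
from urllib.parse import quote


def encode_userinfo(username, password):
    """Percent-encode both parts and join them with a colon."""
    # safe="" means even "/" is encoded, so "@" and ":" inside either
    # part cannot be confused with the authority's own delimiters.
    return "%s:%s" % (quote(username, safe=""), quote(password, safe=""))


userinfo = encode_userinfo("us3r", "p@ssw0rd")
```

The ``@`` in the password becomes ``%40``, so the only literal ``@`` left in the final URL is the one separating the user information from the host.
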
Further, |rfc3986| attempts to simplify the process of adding query parameters
|
||||
to a URL. For example, if we were using Elasticsearch, we might do something
|
||||
like:
|
||||
|
||||
.. doctest::

    >>> from rfc3986 import builder
    >>> print(builder.URIBuilder().add_scheme(
    ...     'https'
    ... ).add_host(
    ...     'search.example.com'
    ... ).add_path(
    ...     '_search'
    ... ).add_query_from(
    ...     [('q', 'repo:sigmavirus24/rfc3986'), ('sort', 'created_at:asc')]
    ... ).finalize().unsplit())
    https://search.example.com/_search?q=repo%3Asigmavirus24%2Frfc3986&sort=created_at%3Aasc
|
||||
|
||||
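The encoded query string in that output matches what the standard library produces for the same pairs. As a small sketch (Python 3 assumed, and offered only as a comparison, not as a description of the builder's internals):

```python
from urllib.parse import urlencode

# The same key/value pairs passed to add_query_from in the example.
pairs = [("q", "repo:sigmavirus24/rfc3986"), ("sort", "created_at:asc")]

# urlencode percent-encodes ":" and "/" inside values and joins with "&".
query = urlencode(pairs)
```

Passing a list of pairs rather than a dictionary keeps the parameter order explicit, which matters when the resulting URL is compared or signed.
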
Finally, we provide a way to add a fragment to a URL. Let's build up a URL to
|
||||
view the section of the RFC that refers to fragments:
|
||||
|
||||
.. doctest::
|
||||
|
||||
>>> from rfc3986 import builder
|
||||
>>> print(builder.URIBuilder().add_scheme(
|
||||
... 'https'
|
||||
... ).add_host(
|
||||
... 'tools.ietf.org'
|
||||
... ).add_path(
|
||||
... '/html/rfc3986'
|
||||
... ).add_fragment(
|
||||
... 'section-3.5'
|
||||
... ).finalize().unsplit())
|
||||
https://tools.ietf.org/html/rfc3986#section-3.5
|
147
docs/source/user/parsing.rst
Normal file
@ -0,0 +1,147 @@
|
||||
===============
|
||||
Parsing a URI
|
||||
===============
|
||||
|
||||
There are two ways to parse a URI with |rfc3986|:
|
||||
|
||||
#. :meth:`rfc3986.api.uri_reference`
|
||||
|
||||
This is best when you're **not** replacing existing usage of
|
||||
:mod:`urllib.parse`. This also provides convenience methods around safely
|
||||
normalizing URIs passed into it.
|
||||
|
||||
#. :meth:`rfc3986.api.urlparse`
|
||||
|
||||
This is best suited to completely replace :func:`urllib.parse.urlparse`.
|
||||
It returns a class that should be indistinguishable from
|
||||
:class:`urllib.parse.ParseResult`.
|
||||
|
||||
Let's look at some code samples.
|
||||
|
||||
|
||||
Some Examples
|
||||
=============
|
||||
|
||||
First we'll parse the URL that points to the repository for this project.
|
||||
|
||||
.. testsetup:: *
|
||||
|
||||
import rfc3986
|
||||
url = rfc3986.urlparse('https://github.com/sigmavirus24/rfc3986')
|
||||
uri = rfc3986.uri_reference('https://github.com/sigmavirus24/rfc3986')
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
url = rfc3986.urlparse('https://github.com/sigmavirus24/rfc3986')
|
||||
|
||||
|
||||
Then we'll replace parts of that URL with new values:
|
||||
|
||||
.. testcode:: ex0
|
||||
|
||||
print(url.copy_with(
|
||||
userinfo='username:password',
|
||||
port='443',
|
||||
).unsplit())
|
||||
|
||||
.. testoutput:: ex0
|
||||
|
||||
https://username:password@github.com:443/sigmavirus24/rfc3986
|
||||
|
||||
This, however, does not change the current ``url`` instance of
|
||||
:class:`~rfc3986.parseresult.ParseResult`. As the method name might suggest,
|
||||
we're copying that instance and then overriding certain attributes.
|
||||
In fact, we can make as many copies as we like and nothing will change.
|
||||
|
||||
.. testcode:: ex1
|
||||
|
||||
print(url.copy_with(
|
||||
scheme='ssh',
|
||||
userinfo='git',
|
||||
).unsplit())
|
||||
|
||||
.. testoutput:: ex1
|
||||
|
||||
ssh://git@github.com/sigmavirus24/rfc3986
|
||||
|
||||
.. testcode:: ex1
|
||||
|
||||
print(url.scheme)
|
||||
|
||||
.. testoutput:: ex1
|
||||
|
||||
https
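
The copy-then-override behaviour is the same idea as ``_replace`` on a named tuple. The sketch below uses the standard library's ``urllib.parse.urlsplit`` rather than |rfc3986| to show it; treating ``copy_with`` as analogous to ``_replace`` is an assumption made for illustration.

```python
from urllib.parse import urlsplit

url = urlsplit('https://github.com/sigmavirus24/rfc3986')

# _replace returns a new tuple; the original is left untouched.
ssh_url = url._replace(scheme='ssh', netloc='git@github.com')

print(ssh_url.geturl())
print(url.scheme)
```

The first ``print`` shows the modified copy, while the second confirms that the original still reports ``https`` as its scheme.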
|
||||
|
||||
We can do similar things with URI References as well.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
uri = rfc3986.uri_reference('https://github.com/sigmavirus24/rfc3986')
|
||||
|
||||
.. testcode:: ex2
|
||||
|
||||
print(uri.copy_with(
|
||||
authority='username:password@github.com:443',
|
||||
path='/sigmavirus24/github3.py',
|
||||
).unsplit())
|
||||
|
||||
.. testoutput:: ex2
|
||||
|
||||
https://username:password@github.com:443/sigmavirus24/github3.py
|
||||
|
||||
However, URI References may have some unexpected behaviour based strictly on
|
||||
the RFC.
|
||||
|
||||
Finally, if you want to remove a component from a URI, you may pass ``None``
|
||||
to remove it, for example:
|
||||
|
||||
.. testcode:: ex3
|
||||
|
||||
print(uri.copy_with(path=None).unsplit())
|
||||
|
||||
.. testoutput:: ex3
|
||||
|
||||
https://github.com
|
||||
|
||||
This will work on both URI References and Parse Results.
|
||||
|
||||
|
||||
And Now For Something Slightly Unusual
|
||||
======================================
|
||||
|
||||
If you are familiar with GitHub, GitLab, or a similar service, you may have
|
||||
interacted with the "SSH URL" for some projects. For this project,
|
||||
the SSH URL is:
|
||||
|
||||
.. code::
|
||||
|
||||
git@github.com:sigmavirus24/rfc3986
|
||||
|
||||
|
||||
Let's see what happens when we parse this.
|
||||
|
||||
.. code-block:: pycon
|
||||
|
||||
>>> rfc3986.uri_reference('git@github.com:sigmavirus24/rfc3986')
|
||||
URIReference(scheme=None, authority=None,
|
||||
path=u'git@github.com:sigmavirus24/rfc3986', query=None, fragment=None)
|
||||
|
||||
There's no scheme present, but it is apparent to our (human) eyes that
|
||||
``git@github.com`` should not be part of the path. This is one of the areas
|
||||
where :mod:`rfc3986` suffers slightly due to its strict conformance to
|
||||
:rfc:`3986`. In the RFC, an authority must be preceded by ``//``. Let's see
|
||||
what happens when we add that to our URI:
|
||||
|
||||
.. code-block:: pycon
|
||||
|
||||
>>> rfc3986.uri_reference('//git@github.com:sigmavirus24/rfc3986')
|
||||
URIReference(scheme=None, authority=u'git@github.com:sigmavirus24',
|
||||
path=u'/rfc3986', query=None, fragment=None)
|
||||
|
||||
Somewhat better, but not much.
|
||||
|
||||
.. note::
|
||||
|
||||
The maintainers of :mod:`rfc3986` are working to discern better ways to
|
||||
parse these less common URIs in a reasonable and sensible way without
|
||||
losing conformance to the RFC.
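
The two results above follow directly from the generic splitting pattern in Appendix B of :rfc:`3986`. The sketch below rebuilds that pattern with the stricter scheme expression that appears in ``abnf_regexp.py`` in this change and applies it with only the ``re`` module. The ``split_uri`` name is hypothetical, and the sketch omits the decoding and normalization a full parser would perform.

```python
import re

# Appendix B pattern with named groups and a strict scheme expression.
URI_RE = re.compile(
    r'(?:(?P<scheme>[a-zA-Z][a-zA-Z0-9+.-]*):)?'   # scheme must end in ':'
    r'(?://(?P<authority>[^/?#]*))?'               # authority needs '//'
    r'(?P<path>[^?#]*)'
    r'(?:\?(?P<query>[^#]*))?'
    r'(?:#(?P<fragment>.*))?'
)


def split_uri(uri):
    """Return the five generic components as a dict (hypothetical helper)."""
    return URI_RE.match(uri).groupdict()


print(split_uri('git@github.com:sigmavirus24/rfc3986'))
print(split_uri('//git@github.com:sigmavirus24/rfc3986'))
```

Without a leading ``//`` nothing can be an authority, so the whole string lands in the path. With it, the authority runs up to the first ``/``, which is why ``sigmavirus24`` ends up where a port would be expected.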
|
140
docs/source/user/validating.rst
Normal file
@ -0,0 +1,140 @@
|
||||
=================
|
||||
Validating URIs
|
||||
=================
|
||||
|
||||
While not as difficult as `validating an email address`_, validating URIs is
|
||||
tricky. Different parts of the URI allow different characters. Those sets
|
||||
sometimes overlap and other times they don't, which is inconvenient.
|
||||
Luckily, |rfc3986| makes validating URIs far simpler.
|
||||
|
||||
Example Usage
|
||||
=============
|
||||
|
||||
First we need to create an instance of a
|
||||
:class:`~rfc3986.validators.Validator` which takes no parameters. After that
|
||||
we can call methods on the instance to indicate what we want to validate.
|
||||
|
||||
Let's assume that we're building something that takes user input for a URL and
|
||||
we want to ensure that URL is only ever using a specific domain with https. In
|
||||
that case, our code would look like this:
|
||||
|
||||
.. doctest::
|
||||
|
||||
>>> from rfc3986 import validators, uri_reference
|
||||
>>> user_url = 'https://github.com/sigmavirus24/rfc3986'
|
||||
>>> validator = validators.Validator().allow_schemes(
|
||||
... 'https',
|
||||
... ).allow_hosts(
|
||||
... 'github.com',
|
||||
... )
|
||||
>>> validator.validate(uri_reference(
|
||||
... 'https://github.com/sigmavirus24/rfc3986'
|
||||
... ))
|
||||
>>> validator.validate(uri_reference(
|
||||
... 'https://github.com/'
|
||||
... ))
|
||||
>>> validator.validate(uri_reference(
|
||||
... 'http://example.com'
|
||||
... ))
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
rfc3986.exceptions.UnpermittedComponentError
|
||||
|
||||
First notice that we can easily reuse our validator object for each URL.
|
||||
This allows users to not have to constantly reconstruct Validators for each
|
||||
bit of user input. Next, we have three different URLs that we validate:
|
||||
|
||||
#. ``https://github.com/sigmavirus24/rfc3986``
|
||||
#. ``https://github.com/``
|
||||
#. ``http://example.com``
|
||||
|
||||
As it stands, our validator will allow the first two URLs to pass but will
|
||||
fail the third. This is specifically because we only allow URLs using
|
||||
``https`` as a scheme and ``github.com`` as the domain name.
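
The allow-list idea itself is small enough to sketch with the standard library. The ``is_permitted`` function and the two constants below are hypothetical names, not part of |rfc3986|. Note one deliberate difference: this sketch also rejects URLs whose scheme or host is missing, whereas the validator above only checks the components that are present.

```python
from urllib.parse import urlsplit

# Hypothetical allow-lists mirroring the validator configured above.
ALLOWED_SCHEMES = {'https'}
ALLOWED_HOSTS = {'github.com'}


def is_permitted(url):
    """Return True when both the scheme and the host are on the allow-lists."""
    parts = urlsplit(url)
    return parts.scheme in ALLOWED_SCHEMES and parts.hostname in ALLOWED_HOSTS


for candidate in ('https://github.com/sigmavirus24/rfc3986',
                  'https://github.com/',
                  'http://example.com'):
    print(candidate, is_permitted(candidate))
```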
|
||||
|
||||
Next, let's imagine that we want to prevent leaking user credentials. In that
|
||||
case, we want to ensure that there is no password in the user information
|
||||
portion of the authority. Our new validator would look like this:
|
||||
|
||||
.. doctest::
|
||||
|
||||
>>> from rfc3986 import validators, uri_reference
|
||||
>>> user_url = 'https://github.com/sigmavirus24/rfc3986'
|
||||
>>> validator = validators.Validator().allow_schemes(
|
||||
... 'https',
|
||||
... ).allow_hosts(
|
||||
... 'github.com',
|
||||
... ).forbid_use_of_password()
|
||||
>>> validator.validate(uri_reference(
|
||||
... 'https://github.com/sigmavirus24/rfc3986'
|
||||
... ))
|
||||
>>> validator.validate(uri_reference(
|
||||
... 'https://github.com/'
|
||||
... ))
|
||||
>>> validator.validate(uri_reference(
|
||||
... 'http://example.com'
|
||||
... ))
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
rfc3986.exceptions.UnpermittedComponentError
|
||||
>>> validator.validate(uri_reference(
|
||||
... 'https://sigmavirus24@github.com'
|
||||
... ))
|
||||
>>> validator.validate(uri_reference(
|
||||
... 'https://sigmavirus24:not-my-real-password@github.com'
|
||||
... ))
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
rfc3986.exceptions.PasswordForbidden
|
||||
|
||||
Up until now, we have assumed that we will get a URL that has the appropriate
|
||||
components for validation. For example, we assume that we will have a URL that
|
||||
has a scheme and hostname. However, our current validation doesn't require
|
||||
that those items exist.
|
||||
|
||||
.. doctest::
|
||||
|
||||
>>> from rfc3986 import validators, uri_reference
|
||||
>>> user_url = 'https://github.com/sigmavirus24/rfc3986'
|
||||
>>> validator = validators.Validator().allow_schemes(
|
||||
... 'https',
|
||||
... ).allow_hosts(
|
||||
... 'github.com',
|
||||
... ).forbid_use_of_password()
|
||||
>>> validator.validate(uri_reference('//github.com'))
|
||||
>>> validator.validate(uri_reference('https:/'))
|
||||
|
||||
In the first case, we have a host name but no scheme and in the second we have
|
||||
a scheme and a path but no host. If we want to ensure that those components
|
||||
are there and that they are *always* what we allow, then we must add one last
|
||||
item to our validator:
|
||||
|
||||
.. doctest::
|
||||
|
||||
>>> from rfc3986 import validators, uri_reference
|
||||
>>> user_url = 'https://github.com/sigmavirus24/rfc3986'
|
||||
>>> validator = validators.Validator().allow_schemes(
|
||||
... 'https',
|
||||
... ).allow_hosts(
|
||||
... 'github.com',
|
||||
... ).forbid_use_of_password(
|
||||
... ).require_presence_of(
|
||||
... 'scheme', 'host',
|
||||
... )
|
||||
>>> validator.validate(uri_reference('//github.com'))
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
rfc3986.exceptions.MissingComponentError
|
||||
>>> validator.validate(uri_reference('https:/'))
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
rfc3986.exceptions.MissingComponentError
|
||||
>>> validator.validate(uri_reference('https://github.com'))
|
||||
>>> validator.validate(uri_reference(
|
||||
... 'https://github.com/sigmavirus24/rfc3986'
|
||||
... ))
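
To see why those two inputs lack a component, here is a small standard-library sketch of a presence check. The ``missing_components`` helper is hypothetical, and it uses ``urllib.parse`` attribute names (``netloc`` rather than ``host``), so it only approximates what ``require_presence_of`` checks.

```python
from urllib.parse import urlsplit


def missing_components(url, required=('scheme', 'netloc')):
    """List the required components that are empty (hypothetical helper)."""
    parts = urlsplit(url)
    return [name for name in required if not getattr(parts, name)]


print(missing_components('//github.com'))        # no scheme before '//'
print(missing_components('https:/'))             # no '//', so no authority
print(missing_components('https://github.com'))  # both are present
```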
|
||||
|
||||
.. links
|
||||
|
||||
.. _validating an email address:
|
||||
http://haacked.com/archive/2007/08/21/i-knew-how-to-validate-an-email-address-until-i.aspx/
|
@ -1,21 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
class RFC3986Exception(Exception):
|
||||
pass
|
||||
|
||||
|
||||
class InvalidAuthority(RFC3986Exception):
|
||||
def __init__(self, authority):
|
||||
super(InvalidAuthority, self).__init__(
|
||||
"The authority ({0}) is not valid.".format(authority))
|
||||
|
||||
|
||||
class InvalidPort(RFC3986Exception):
|
||||
def __init__(self, port):
|
||||
super(InvalidPort, self).__init__(
|
||||
'The port ("{0}") is not valid.'.format(port))
|
||||
|
||||
|
||||
class ResolutionError(RFC3986Exception):
|
||||
def __init__(self, uri):
|
||||
super(ResolutionError, self).__init__(
|
||||
"{0} is not an absolute URI.".format(uri.unsplit()))
|
219
rfc3986/misc.py
@ -1,219 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
# Copyright (c) 2014 Rackspace
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
||||
# implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
"""
|
||||
rfc3986.misc
|
||||
~~~~~~~~~~~~
|
||||
|
||||
This module contains important constants, patterns, and compiled regular
|
||||
expressions for parsing and validating URIs and their components.
|
||||
"""
|
||||
|
||||
import re
|
||||
|
||||
# These are enumerated for the named tuple used as a superclass of
|
||||
# URIReference
|
||||
URI_COMPONENTS = ['scheme', 'authority', 'path', 'query', 'fragment']
|
||||
|
||||
important_characters = {
|
||||
'generic_delimiters': ":/?#[]@",
|
||||
'sub_delimiters': "!$&'()*+,;=",
|
||||
# We need to escape the '*' in this case
|
||||
're_sub_delimiters': "!$&'()\*+,;=",
|
||||
'unreserved_chars': ('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
|
||||
'0123456789._~-'),
|
||||
# We need to escape the '-' in this case:
|
||||
're_unreserved': 'A-Za-z0-9._~\-',
|
||||
}
|
||||
# For details about delimiters and reserved characters, see:
|
||||
# http://tools.ietf.org/html/rfc3986#section-2.2
|
||||
GENERIC_DELIMITERS = set(important_characters['generic_delimiters'])
|
||||
SUB_DELIMITERS = set(important_characters['sub_delimiters'])
|
||||
RESERVED_CHARS = GENERIC_DELIMITERS.union(SUB_DELIMITERS)
|
||||
# For details about unreserved characters, see:
|
||||
# http://tools.ietf.org/html/rfc3986#section-2.3
|
||||
UNRESERVED_CHARS = set(important_characters['unreserved_chars'])
|
||||
NON_PCT_ENCODED = RESERVED_CHARS.union(UNRESERVED_CHARS).union('%')
|
||||
|
||||
# Extracted from http://tools.ietf.org/html/rfc3986#appendix-B
|
||||
component_pattern_dict = {
|
||||
# NOTE(sigmavirus24): Our list of things we want to not match includes one
|
||||
# item more than the RFC. We want to not parse the leading '[' from an
|
||||
# IPv6 address into the scheme when provided something akin to:
|
||||
# >>> rfc3986.uri_reference('[::1]')
|
||||
# We would rather that appear to be a path than not.
|
||||
'scheme': '[^:/?#[]+',
|
||||
'authority': '[^/?#]*',
|
||||
'path': '[^?#]*',
|
||||
'query': '[^#]*',
|
||||
'fragment': '.*',
|
||||
}
|
||||
|
||||
# See http://tools.ietf.org/html/rfc3986#appendix-B
|
||||
# In this case, we name each of the important matches so we can use
|
||||
# SRE_Match#groupdict to parse the values out if we so choose. This is also
|
||||
# modified to ignore other matches that are not important to the parsing of
|
||||
# the reference so we can also simply use SRE_Match#groups.
|
||||
expression = ('(?:(?P<scheme>{scheme}):)?(?://(?P<authority>{authority}))?'
|
||||
'(?P<path>{path})(?:\?(?P<query>{query}))?'
|
||||
'(?:#(?P<fragment>{fragment}))?'
|
||||
).format(**component_pattern_dict)
|
||||
|
||||
URI_MATCHER = re.compile(expression)
|
||||
|
||||
# #########################
|
||||
# Authority Matcher Section
|
||||
# #########################
|
||||
|
||||
# Host patterns, see: http://tools.ietf.org/html/rfc3986#section-3.2.2
|
||||
# The pattern for a regular name, e.g., www.google.com, api.github.com
|
||||
reg_name = '(({0})*|[{1}]*)'.format(
|
||||
'%[0-9A-Fa-f]{2}',
|
||||
important_characters['re_sub_delimiters'] +
|
||||
important_characters['re_unreserved']
|
||||
)
|
||||
# The pattern for an IPv4 address, e.g., 192.168.255.255, 127.0.0.1,
|
||||
ipv4 = '([0-9]{1,3}.){3}[0-9]{1,3}'
|
||||
# Hexadecimal characters used in each piece of an IPv6 address
|
||||
hexdig = '[0-9A-Fa-f]{1,4}'
|
||||
# Least-significant 32 bits of an IPv6 address
|
||||
ls32 = '({hex}:{hex}|{ipv4})'.format(hex=hexdig, ipv4=ipv4)
|
||||
# Substitutions into the following patterns for IPv6 patterns defined
|
||||
# http://tools.ietf.org/html/rfc3986#page-20
|
||||
subs = {'hex': hexdig, 'ls32': ls32}
|
||||
|
||||
# Below: h16 = hexdig, see: https://tools.ietf.org/html/rfc5234 for details
|
||||
# about ABNF (Augmented Backus-Naur Form) use in the comments
|
||||
variations = [
|
||||
# 6( h16 ":" ) ls32
|
||||
'(%(hex)s:){6}%(ls32)s' % subs,
|
||||
# "::" 5( h16 ":" ) ls32
|
||||
'::(%(hex)s:){5}%(ls32)s' % subs,
|
||||
# [ h16 ] "::" 4( h16 ":" ) ls32
|
||||
'(%(hex)s)?::(%(hex)s:){4}%(ls32)s' % subs,
|
||||
# [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
|
||||
'((%(hex)s:)?%(hex)s)?::(%(hex)s:){3}%(ls32)s' % subs,
|
||||
# [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
|
||||
'((%(hex)s:){0,2}%(hex)s)?::(%(hex)s:){2}%(ls32)s' % subs,
|
||||
# [ *3( h16 ":" ) h16 ] "::" h16 ":" ls32
|
||||
'((%(hex)s:){0,3}%(hex)s)?::%(hex)s:%(ls32)s' % subs,
|
||||
# [ *4( h16 ":" ) h16 ] "::" ls32
|
||||
'((%(hex)s:){0,4}%(hex)s)?::%(ls32)s' % subs,
|
||||
# [ *5( h16 ":" ) h16 ] "::" h16
|
||||
'((%(hex)s:){0,5}%(hex)s)?::%(hex)s' % subs,
|
||||
# [ *6( h16 ":" ) h16 ] "::"
|
||||
'((%(hex)s:){0,6}%(hex)s)?::' % subs,
|
||||
]
|
||||
|
||||
ipv6 = '(({0})|({1})|({2})|({3})|({4})|({5})|({6})|({7}))'.format(*variations)
|
||||
|
||||
ipv_future = 'v[0-9A-Fa-f]+.[%s]+' % (
|
||||
important_characters['re_unreserved'] +
|
||||
important_characters['re_sub_delimiters'] +
|
||||
':')
|
||||
|
||||
ip_literal = '\[({0}|{1})\]'.format(ipv6, ipv_future)
|
||||
|
||||
# Pattern for matching the host piece of the authority
|
||||
HOST_PATTERN = '({0}|{1}|{2})'.format(reg_name, ipv4, ip_literal)
|
||||
|
||||
SUBAUTHORITY_MATCHER = re.compile((
|
||||
'^(?:(?P<userinfo>[A-Za-z0-9_.~\-%:]+)@)?' # userinfo
|
||||
'(?P<host>{0}?)' # host
|
||||
':?(?P<port>[0-9]+)?$' # port
|
||||
).format(HOST_PATTERN))
|
||||
|
||||
IPv4_MATCHER = re.compile('^' + ipv4 + '$')
|
||||
|
||||
|
||||
# ####################
|
||||
# Path Matcher Section
|
||||
# ####################
|
||||
|
||||
# See http://tools.ietf.org/html/rfc3986#section-3.3 for more information
|
||||
# about the path patterns defined below.
|
||||
|
||||
# Percent encoded character values
|
||||
pct_encoded = '%[A-Fa-f0-9]{2}'
|
||||
pchar = ('([' + important_characters['re_unreserved']
|
||||
+ important_characters['re_sub_delimiters']
|
||||
+ ':@]|%s)' % pct_encoded)
|
||||
segments = {
|
||||
'segment': pchar + '*',
|
||||
# Non-zero length segment
|
||||
'segment-nz': pchar + '+',
|
||||
# Non-zero length segment without ":"
|
||||
'segment-nz-nc': pchar.replace(':', '') + '+'
|
||||
}
|
||||
|
||||
# Path types taken from Section 3.3 (linked above)
|
||||
path_empty = '^$'
|
||||
path_rootless = '%(segment-nz)s(/%(segment)s)*' % segments
|
||||
path_noscheme = '%(segment-nz-nc)s(/%(segment)s)*' % segments
|
||||
path_absolute = '/(%s)?' % path_rootless
|
||||
path_abempty = '(/%(segment)s)*' % segments
|
||||
|
||||
# Matcher used to validate path components
|
||||
PATH_MATCHER = re.compile('^(%s|%s|%s|%s|%s)$' % (
|
||||
path_abempty, path_absolute, path_noscheme, path_rootless, path_empty
|
||||
))
|
||||
|
||||
|
||||
# ##################################
|
||||
# Query and Fragment Matcher Section
|
||||
# ##################################
|
||||
|
||||
QUERY_MATCHER = re.compile(
|
||||
'^([/?:@' + important_characters['re_unreserved']
|
||||
+ important_characters['re_sub_delimiters']
|
||||
+ ']|%s)*$' % pct_encoded)
|
||||
|
||||
FRAGMENT_MATCHER = QUERY_MATCHER
|
||||
|
||||
# Scheme validation, see: http://tools.ietf.org/html/rfc3986#section-3.1
|
||||
SCHEME_MATCHER = re.compile('^[A-Za-z][A-Za-z0-9+.\-]*$')
|
||||
|
||||
# Relative reference matcher
|
||||
|
||||
# See http://tools.ietf.org/html/rfc3986#section-4.2 for details
|
||||
relative_part = '(//%s%s|%s|%s|%s)' % (
|
||||
component_pattern_dict['authority'], path_abempty, path_absolute,
|
||||
path_noscheme, path_empty
|
||||
)
|
||||
|
||||
RELATIVE_REF_MATCHER = re.compile('^%s(\?%s)?(#%s)?$' % (
|
||||
relative_part, QUERY_MATCHER.pattern, FRAGMENT_MATCHER.pattern
|
||||
))
|
||||
|
||||
# See http://tools.ietf.org/html/rfc3986#section-3 for definition
|
||||
hier_part = '(//%s%s|%s|%s|%s)' % (
|
||||
component_pattern_dict['authority'], path_abempty, path_absolute,
|
||||
path_rootless, path_empty
|
||||
)
|
||||
|
||||
# See http://tools.ietf.org/html/rfc3986#section-4.3
|
||||
ABSOLUTE_URI_MATCHER = re.compile('^%s:%s(\?%s)?$' % (
|
||||
component_pattern_dict['scheme'], hier_part, QUERY_MATCHER.pattern[1:-1]
|
||||
))
|
||||
|
||||
|
||||
# Path merger as defined in http://tools.ietf.org/html/rfc3986#section-5.2.3
|
||||
def merge_paths(base_uri, relative_path):
|
||||
"""Merge a base URI's path with a relative URI's path."""
|
||||
if base_uri.path is None and base_uri.authority is not None:
|
||||
return '/' + relative_path
|
||||
else:
|
||||
path = base_uri.path or ''
|
||||
index = path.rfind('/')
|
||||
return path[:index] + '/' + relative_path
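
As a standalone illustration of the merge routine in section 5.2.3 of RFC 3986, the sketch below works on plain strings instead of URI objects. The ``merge`` name and its signature are hypothetical. It keeps everything up to and including the right-most ``/`` of the base path, and drops the base path entirely when it contains no ``/``.

```python
def merge(base_path, base_has_authority, relative_path):
    """Merge a relative path onto a base path per RFC 3986, section 5.2.3.

    Hypothetical helper operating on strings for illustration.
    """
    if base_has_authority and not base_path:
        # A base with an authority and an empty path merges to '/' + relative.
        return '/' + relative_path
    # rfind returns -1 when there is no '/', so the slice below is then
    # empty and the whole base path is discarded.
    index = base_path.rfind('/')
    return base_path[:index + 1] + relative_path


print(merge('/b/c/d;p', True, 'g'))
print(merge('', True, 'g'))
print(merge('c', False, 'g'))
```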
|
31
setup.py
@ -1,20 +1,14 @@
|
||||
#!/usr/bin/env python
|
||||
|
||||
"""Packaging logic for the rfc3986 library."""
|
||||
import io
|
||||
import os
|
||||
import sys
|
||||
|
||||
import setuptools
|
||||
|
||||
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'src')) # noqa
|
||||
|
||||
import rfc3986
|
||||
|
||||
try:
|
||||
from setuptools import setup
|
||||
except ImportError:
|
||||
from distutils.core import setup
|
||||
|
||||
if sys.argv[-1] == 'publish':
|
||||
os.system('python setup.py bdist_wheel sdist upload')
|
||||
sys.exit()
|
||||
|
||||
packages = [
|
||||
'rfc3986',
|
||||
]
|
||||
@ -22,18 +16,16 @@ packages = [
|
||||
with io.open('README.rst', encoding='utf-8') as f:
|
||||
readme = f.read()
|
||||
|
||||
with io.open('HISTORY.rst', encoding='utf-8') as f:
|
||||
history = f.read()
|
||||
|
||||
setup(
|
||||
setuptools.setup(
|
||||
name='rfc3986',
|
||||
version=rfc3986.__version__,
|
||||
description='Validating URI References per RFC 3986',
|
||||
long_description=readme + '\n\n' + history,
|
||||
long_description=readme,
|
||||
author='Ian Cordasco',
|
||||
author_email='ian.cordasco@rackspace.com',
|
||||
url='https://rfc3986.readthedocs.org',
|
||||
author_email='graffatcolmingov@gmail.com',
|
||||
url='http://rfc3986.readthedocs.io',
|
||||
packages=packages,
|
||||
package_dir={'': 'src/'},
|
||||
package_data={'': ['LICENSE']},
|
||||
include_package_data=True,
|
||||
license='Apache 2.0',
|
||||
@ -43,10 +35,11 @@ setup(
|
||||
'Natural Language :: English',
|
||||
'License :: OSI Approved :: Apache Software License',
|
||||
'Programming Language :: Python',
|
||||
'Programming Language :: Python :: 2.6',
|
||||
'Programming Language :: Python :: 2.7',
|
||||
'Programming Language :: Python :: 3',
|
||||
'Programming Language :: Python :: 3.3',
|
||||
'Programming Language :: Python :: 3.4',
|
||||
'Programming Language :: Python :: 3.5',
|
||||
'Programming Language :: Python :: 3.6',
|
||||
),
|
||||
)
|
||||
|
@ -14,18 +14,19 @@
|
||||
# limitations under the License.
|
||||
|
||||
"""
|
||||
rfc3986
|
||||
=======
|
||||
An implementation of semantics and validations described in RFC 3986.
|
||||
|
||||
An implementation of semantics and validations described in RFC 3986. See
|
||||
http://rfc3986.rtfd.org/ for documentation.
|
||||
See http://rfc3986.readthedocs.io/ for detailed documentation.
|
||||
|
||||
:copyright: (c) 2014 Rackspace
|
||||
:license: Apache v2.0, see LICENSE for details
|
||||
"""
|
||||
|
||||
from .api import (URIReference, uri_reference, is_valid_uri, normalize_uri,
|
||||
urlparse)
|
||||
from .api import is_valid_uri
|
||||
from .api import normalize_uri
|
||||
from .api import uri_reference
|
||||
from .api import URIReference
|
||||
from .api import urlparse
|
||||
from .parseresult import ParseResult
|
||||
|
||||
__title__ = 'rfc3986'
|
||||
@ -33,7 +34,7 @@ __author__ = 'Ian Cordasco'
|
||||
__author_email__ = 'graffatcolmingov@gmail.com'
|
||||
__license__ = 'Apache v2.0'
|
||||
__copyright__ = 'Copyright 2014 Rackspace'
|
||||
__version__ = '0.4.1'
|
||||
__version__ = '1.0.0'
|
||||
|
||||
__all__ = (
|
||||
'ParseResult',
|
186
src/rfc3986/abnf_regexp.py
Normal file
@ -0,0 +1,186 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
||||
# implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
"""Module for the regular expressions crafted from ABNF."""
|
||||
|
||||
# https://tools.ietf.org/html/rfc3986#page-13
|
||||
GEN_DELIMS = GENERIC_DELIMITERS = ":/?#[]@"
|
||||
GENERIC_DELIMITERS_SET = set(GENERIC_DELIMITERS)
|
||||
# https://tools.ietf.org/html/rfc3986#page-13
|
||||
SUB_DELIMS = SUB_DELIMITERS = "!$&'()*+,;="
|
||||
SUB_DELIMITERS_SET = set(SUB_DELIMITERS)
|
||||
# Escape the '*' for use in regular expressions
|
||||
SUB_DELIMITERS_RE = "!$&'()\*+,;="
|
||||
RESERVED_CHARS_SET = GENERIC_DELIMITERS_SET.union(SUB_DELIMITERS_SET)
|
||||
ALPHA = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
|
||||
DIGIT = '0123456789'
|
||||
# https://tools.ietf.org/html/rfc3986#section-2.3
|
||||
UNRESERVED = UNRESERVED_CHARS = ALPHA + DIGIT + '._~-'
|
||||
UNRESERVED_CHARS_SET = set(UNRESERVED_CHARS)
|
||||
NON_PCT_ENCODED_SET = RESERVED_CHARS_SET.union(UNRESERVED_CHARS_SET).union('%')
|
||||
# We need to escape the '-' in this case:
|
||||
UNRESERVED_RE = 'A-Za-z0-9._~\-'
|
||||
|
||||
# Percent encoded character values
|
||||
PERCENT_ENCODED = PCT_ENCODED = '%[A-Fa-f0-9]{2}'
|
||||
PCHAR = '([' + UNRESERVED_RE + SUB_DELIMITERS_RE + ':@]|%s)' % PCT_ENCODED
|
||||
|
||||
# NOTE(sigmavirus24): We're going to use more strict regular expressions
|
||||
# than appear in Appendix B for scheme. This will prevent over-eager
|
||||
# consuming of items that aren't schemes.
|
||||
SCHEME_RE = '[a-zA-Z][a-zA-Z0-9+.-]*'
|
||||
_AUTHORITY_RE = '[^/?#]*'
|
||||
_PATH_RE = '[^?#]*'
|
||||
_QUERY_RE = '[^#]*'
|
||||
_FRAGMENT_RE = '.*'
|
||||
|
||||
# Extracted from http://tools.ietf.org/html/rfc3986#appendix-B
|
||||
COMPONENT_PATTERN_DICT = {
|
||||
'scheme': SCHEME_RE,
|
||||
'authority': _AUTHORITY_RE,
|
||||
'path': _PATH_RE,
|
||||
'query': _QUERY_RE,
|
||||
'fragment': _FRAGMENT_RE,
|
||||
}
|
||||
|
||||
# See http://tools.ietf.org/html/rfc3986#appendix-B
|
||||
# In this case, we name each of the important matches so we can use
|
||||
# SRE_Match#groupdict to parse the values out if we so choose. This is also
|
||||
# modified to ignore other matches that are not important to the parsing of
|
||||
# the reference so we can also simply use SRE_Match#groups.
|
||||
URL_PARSING_RE = (
|
||||
'(?:(?P<scheme>{scheme}):)?(?://(?P<authority>{authority}))?'
|
||||
'(?P<path>{path})(?:\?(?P<query>{query}))?'
|
||||
'(?:#(?P<fragment>{fragment}))?'
|
||||
).format(**COMPONENT_PATTERN_DICT)
|
||||
|
||||
|
||||
# #########################
|
||||
# Authority Matcher Section
|
||||
# #########################
|
||||
|
||||
# Host patterns, see: http://tools.ietf.org/html/rfc3986#section-3.2.2
|
||||
# The pattern for a regular name, e.g., www.google.com, api.github.com
|
||||
REGULAR_NAME_RE = REG_NAME = '(({0})*|[{1}]*)'.format(
|
||||
'%[0-9A-Fa-f]{2}', SUB_DELIMITERS_RE + UNRESERVED_RE
|
||||
)
|
||||
# The pattern for an IPv4 address, e.g., 192.168.255.255, 127.0.0.1,
|
||||
IPv4_RE = '([0-9]{1,3}.){3}[0-9]{1,3}'
|
||||
# Hexadecimal characters used in each piece of an IPv6 address
|
||||
HEXDIG_RE = '[0-9A-Fa-f]{1,4}'
|
||||
# Least-significant 32 bits of an IPv6 address
|
||||
LS32_RE = '({hex}:{hex}|{ipv4})'.format(hex=HEXDIG_RE, ipv4=IPv4_RE)
|
||||
# Substitutions into the following patterns for IPv6 patterns defined
|
||||
# http://tools.ietf.org/html/rfc3986#page-20
|
||||
_subs = {'hex': HEXDIG_RE, 'ls32': LS32_RE}
|
||||
|
||||
# Below: h16 = hexdig, see: https://tools.ietf.org/html/rfc5234 for details
|
||||
# about ABNF (Augmented Backus-Naur Form) use in the comments
|
||||
variations = [
|
||||
# 6( h16 ":" ) ls32
|
||||
'(%(hex)s:){6}%(ls32)s' % _subs,
|
||||
# "::" 5( h16 ":" ) ls32
|
||||
'::(%(hex)s:){5}%(ls32)s' % _subs,
|
||||
# [ h16 ] "::" 4( h16 ":" ) ls32
|
||||
'(%(hex)s)?::(%(hex)s:){4}%(ls32)s' % _subs,
|
||||
# [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
|
||||
'((%(hex)s:)?%(hex)s)?::(%(hex)s:){3}%(ls32)s' % _subs,
|
||||
# [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
|
||||
'((%(hex)s:){0,2}%(hex)s)?::(%(hex)s:){2}%(ls32)s' % _subs,
|
||||
# [ *3( h16 ":" ) h16 ] "::" h16 ":" ls32
|
||||
'((%(hex)s:){0,3}%(hex)s)?::%(hex)s:%(ls32)s' % _subs,
|
||||
# [ *4( h16 ":" ) h16 ] "::" ls32
|
||||
'((%(hex)s:){0,4}%(hex)s)?::%(ls32)s' % _subs,
|
||||
# [ *5( h16 ":" ) h16 ] "::" h16
|
||||
'((%(hex)s:){0,5}%(hex)s)?::%(hex)s' % _subs,
|
||||
# [ *6( h16 ":" ) h16 ] "::"
|
||||
'((%(hex)s:){0,6}%(hex)s)?::' % _subs,
|
||||
]
|
||||
|
||||
IPv6_RE = '(({0})|({1})|({2})|({3})|({4})|({5})|({6})|({7}))'.format(
|
||||
*variations
|
||||
)
|
||||
|
||||
IPv_FUTURE_RE = 'v[0-9A-Fa-f]+.[%s]+' % (
|
||||
UNRESERVED_RE + SUB_DELIMITERS_RE + ':'
|
||||
)
|
||||
|
||||
|
||||
# RFC 6874 Zone ID ABNF
|
||||
ZONE_ID = '(?:[' + UNRESERVED_RE + ']|' + PCT_ENCODED + ')+'
|
||||
IPv6_ADDRZ_RE = IPv6_RE + '%25' + ZONE_ID
|
||||
|
||||
IP_LITERAL_RE = '\[({0}|(?:{1})|{2})\]'.format(
|
||||
IPv6_RE,
|
||||
IPv6_ADDRZ_RE,
|
||||
IPv_FUTURE_RE,
|
||||
)
|
||||
|
||||
# Pattern for matching the host piece of the authority
|
||||
HOST_RE = HOST_PATTERN = '({0}|{1}|{2})'.format(
|
||||
REG_NAME,
|
||||
IPv4_RE,
|
||||
IP_LITERAL_RE,
|
||||
)
|
||||
USERINFO_RE = '^[A-Za-z0-9_.~\-%:]+'
|
||||
PORT_RE = '[0-9]{1,5}'
|
||||
|
||||
# ####################
|
||||
# Path Matcher Section
|
||||
# ####################
|
||||
|
||||
# See http://tools.ietf.org/html/rfc3986#section-3.3 for more information
|
||||
# about the path patterns defined below.
|
||||
segments = {
|
||||
'segment': PCHAR + '*',
|
||||
# Non-zero length segment
|
||||
'segment-nz': PCHAR + '+',
|
||||
# Non-zero length segment without ":"
|
||||
'segment-nz-nc': PCHAR.replace(':', '') + '+'
|
||||
}
|
||||
|
||||
# Path types taken from Section 3.3 (linked above)
|
||||
PATH_EMPTY = '^$'
|
||||
PATH_ROOTLESS = '%(segment-nz)s(/%(segment)s)*' % segments
|
||||
PATH_NOSCHEME = '%(segment-nz-nc)s(/%(segment)s)*' % segments
|
||||
PATH_ABSOLUTE = '/(%s)?' % PATH_ROOTLESS
|
||||
PATH_ABEMPTY = '(/%(segment)s)*' % segments
|
||||
PATH_RE = '^(%s|%s|%s|%s|%s)$' % (
|
||||
PATH_ABEMPTY, PATH_ABSOLUTE, PATH_NOSCHEME, PATH_ROOTLESS, PATH_EMPTY
|
||||
)
|
||||
|
||||
FRAGMENT_RE = QUERY_RE = (
|
||||
'^([/?:@' + UNRESERVED_RE + SUB_DELIMITERS_RE + ']|%s)*$' % PCT_ENCODED
|
||||
)
|
||||
|
||||
# ##########################
|
||||
# Relative reference matcher
|
||||
# ##########################
|
||||
|
||||
# See http://tools.ietf.org/html/rfc3986#section-4.2 for details
|
||||
RELATIVE_PART_RE = '(//%s%s|%s|%s|%s)' % (
|
||||
COMPONENT_PATTERN_DICT['authority'],
|
||||
PATH_ABEMPTY,
|
||||
PATH_ABSOLUTE,
|
||||
PATH_NOSCHEME,
|
||||
PATH_EMPTY,
|
||||
)
|
||||
|
||||
# See http://tools.ietf.org/html/rfc3986#section-3 for definition
|
||||
HIER_PART_RE = '(//%s%s|%s|%s|%s)' % (
|
||||
COMPONENT_PATTERN_DICT['authority'],
|
||||
PATH_ABEMPTY,
|
||||
PATH_ABSOLUTE,
|
||||
PATH_ROOTLESS,
|
||||
PATH_EMPTY,
|
||||
)
|
@ -13,15 +13,14 @@
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
"""
|
||||
rfc3986.api
|
||||
~~~~~~~~~~~
|
||||
Module containing the simple and functional API for rfc3986.
|
||||
|
||||
This defines the simple API to rfc3986. This module defines 3 functions and
|
||||
provides access to the class ``URIReference``.
|
||||
This module defines functions and provides access to the public attributes
|
||||
and classes of rfc3986.
|
||||
"""
|
||||
|
||||
from .uri import URIReference
|
||||
from .parseresult import ParseResult
|
||||
from .uri import URIReference
|
||||
|
||||
|
||||
def uri_reference(uri, encoding='utf-8'):
|
298
src/rfc3986/builder.py
Normal file
298
src/rfc3986/builder.py
Normal file
@ -0,0 +1,298 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
# Copyright (c) 2017 Ian Cordasco
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
||||
# implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
"""Module containing the logic for the URIBuilder object."""
|
||||
from . import compat
|
||||
from . import normalizers
|
||||
from . import uri
|
||||
|
||||
|
||||
class URIBuilder(object):
|
||||
"""Object to aid in building up a URI Reference from parts.
|
||||
|
||||
.. note::
|
||||
|
||||
This object should be instantiated by the user, but it's recommended
|
||||
that it is not provided with arguments. Instead, use the available
|
||||
methods to populate the fields.
|
||||
|
||||
"""
|
||||
|
||||
def __init__(self, scheme=None, userinfo=None, host=None, port=None,
|
||||
path=None, query=None, fragment=None):
|
||||
"""Initialize our URI builder.
|
||||
|
||||
:param str scheme:
|
||||
(optional)
|
||||
:param str userinfo:
|
||||
(optional)
|
||||
:param str host:
|
||||
(optional)
|
||||
:param int port:
|
||||
(optional)
|
||||
:param str path:
|
||||
(optional)
|
||||
:param str query:
|
||||
(optional)
|
||||
:param str fragment:
|
||||
(optional)
|
||||
"""
|
||||
self.scheme = scheme
|
||||
self.userinfo = userinfo
|
||||
self.host = host
|
||||
self.port = port
|
||||
self.path = path
|
||||
self.query = query
|
||||
self.fragment = fragment
|
||||
|
||||
def __repr__(self):
|
||||
"""Provide a convenient view of our builder object."""
|
||||
formatstr = ('URIBuilder(scheme={b.scheme}, userinfo={b.userinfo}, '
|
||||
'host={b.host}, port={b.port}, path={b.path}, '
|
||||
'query={b.query}, fragment={b.fragment})')
|
||||
return formatstr.format(b=self)
|
||||
|
||||
def add_scheme(self, scheme):
|
||||
"""Add a scheme to our builder object.
|
||||
|
||||
After normalizing, this will generate a new URIBuilder instance with
|
||||
the specified scheme and all other attributes the same.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> URIBuilder().add_scheme('HTTPS')
|
||||
URIBuilder(scheme='https', userinfo=None, host=None, port=None,
|
||||
path=None, query=None, fragment=None)
|
||||
|
||||
"""
|
||||
scheme = normalizers.normalize_scheme(scheme)
|
||||
return URIBuilder(
|
||||
scheme=scheme,
|
||||
userinfo=self.userinfo,
|
||||
host=self.host,
|
||||
port=self.port,
|
||||
path=self.path,
|
||||
query=self.query,
|
||||
fragment=self.fragment,
|
||||
)
|
||||
|
||||
def add_credentials(self, username, password):
|
||||
"""Add credentials as the userinfo portion of the URI.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> URIBuilder().add_credentials('root', 's3crete')
|
||||
URIBuilder(scheme=None, userinfo='root:s3crete', host=None,
|
||||
port=None, path=None, query=None, fragment=None)
|
||||
|
||||
>>> URIBuilder().add_credentials('root', None)
|
||||
URIBuilder(scheme=None, userinfo='root', host=None,
|
||||
port=None, path=None, query=None, fragment=None)
|
||||
"""
|
||||
if username is None:
|
||||
raise ValueError('Username cannot be None')
|
||||
userinfo = normalizers.normalize_username(username)
|
||||
|
||||
if password is not None:
|
||||
userinfo = '{}:{}'.format(
|
||||
userinfo,
|
||||
normalizers.normalize_password(password),
|
||||
)
|
||||
|
||||
return URIBuilder(
|
||||
scheme=self.scheme,
|
||||
userinfo=userinfo,
|
||||
host=self.host,
|
||||
port=self.port,
|
||||
path=self.path,
|
||||
query=self.query,
|
||||
fragment=self.fragment,
|
||||
)
|
||||
|
||||
def add_host(self, host):
|
||||
"""Add hostname to the URI.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> URIBuilder().add_host('google.com')
|
||||
URIBuilder(scheme=None, userinfo=None, host='google.com',
|
||||
port=None, path=None, query=None, fragment=None)
|
||||
|
||||
"""
|
||||
return URIBuilder(
|
||||
scheme=self.scheme,
|
||||
userinfo=self.userinfo,
|
||||
host=normalizers.normalize_host(host),
|
||||
port=self.port,
|
||||
path=self.path,
|
||||
query=self.query,
|
||||
fragment=self.fragment,
|
||||
)
|
||||
|
||||
def add_port(self, port):
|
||||
"""Add port to the URI.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> URIBuilder().add_port(80)
|
||||
URIBuilder(scheme=None, userinfo=None, host=None, port='80',
|
||||
path=None, query=None, fragment=None)
|
||||
|
||||
>>> URIBuilder().add_port(443)
|
||||
URIBuilder(scheme=None, userinfo=None, host=None, port='443',
|
||||
path=None, query=None, fragment=None)
|
||||
|
||||
"""
|
||||
port_int = int(port)
|
||||
if port_int < 0:
|
||||
raise ValueError(
|
||||
'ports are not allowed to be negative. You provided {}'.format(
|
||||
port_int,
|
||||
)
|
||||
)
|
||||
if port_int > 65535:
|
||||
raise ValueError(
|
||||
'ports are not allowed to be larger than 65535. '
|
||||
'You provided {}'.format(
|
||||
port_int,
|
||||
)
|
||||
)
|
||||
|
||||
return URIBuilder(
|
||||
scheme=self.scheme,
|
||||
userinfo=self.userinfo,
|
||||
host=self.host,
|
||||
port='{}'.format(port_int),
|
||||
path=self.path,
|
||||
query=self.query,
|
||||
fragment=self.fragment,
|
||||
)
|
||||
|
||||
def add_path(self, path):
|
||||
"""Add a path to the URI.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> URIBuilder().add_path('sigmavirus24/rfc3986')
|
||||
URIBuilder(scheme=None, userinfo=None, host=None, port=None,
|
||||
path='/sigmavirus24/rfc3986', query=None, fragment=None)
|
||||
|
||||
>>> URIBuilder().add_path('/checkout.php')
|
||||
URIBuilder(scheme=None, userinfo=None, host=None, port=None,
|
||||
path='/checkout.php', query=None, fragment=None)
|
||||
|
||||
"""
|
||||
if not path.startswith('/'):
|
||||
path = '/{}'.format(path)
|
||||
|
||||
return URIBuilder(
|
||||
scheme=self.scheme,
|
||||
userinfo=self.userinfo,
|
||||
host=self.host,
|
||||
port=self.port,
|
||||
path=normalizers.normalize_path(path),
|
||||
query=self.query,
|
||||
fragment=self.fragment,
|
||||
)
|
||||
|
||||
def add_query_from(self, query_items):
|
||||
"""Generate and add a query from a dictionary or list of tuples.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> URIBuilder().add_query_from({'a': 'b c'})
|
||||
URIBuilder(scheme=None, userinfo=None, host=None, port=None,
|
||||
path=None, query='a=b+c', fragment=None)
|
||||
|
||||
>>> URIBuilder().add_query_from([('a', 'b c')])
|
||||
URIBuilder(scheme=None, userinfo=None, host=None, port=None,
|
||||
path=None, query='a=b+c', fragment=None)
|
||||
|
||||
"""
|
||||
query = normalizers.normalize_query(compat.urlencode(query_items))
|
||||
|
||||
return URIBuilder(
|
||||
scheme=self.scheme,
|
||||
userinfo=self.userinfo,
|
||||
host=self.host,
|
||||
port=self.port,
|
||||
path=self.path,
|
||||
query=query,
|
||||
fragment=self.fragment,
|
||||
)
|
||||
|
||||
def add_query(self, query):
|
||||
"""Add a pre-formatted query string to the URI.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> URIBuilder().add_query('a=b&c=d')
|
||||
URIBuilder(scheme=None, userinfo=None, host=None, port=None,
|
||||
path=None, query='a=b&c=d', fragment=None)
|
||||
|
||||
"""
|
||||
return URIBuilder(
|
||||
scheme=self.scheme,
|
||||
userinfo=self.userinfo,
|
||||
host=self.host,
|
||||
port=self.port,
|
||||
path=self.path,
|
||||
query=normalizers.normalize_query(query),
|
||||
fragment=self.fragment,
|
||||
)
|
||||
|
||||
def add_fragment(self, fragment):
|
||||
"""Add a fragment to the URI.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> URIBuilder().add_fragment('section-2.6.1')
|
||||
URIBuilder(scheme=None, userinfo=None, host=None, port=None,
|
||||
path=None, query=None, fragment='section-2.6.1')
|
||||
|
||||
"""
|
||||
return URIBuilder(
|
||||
scheme=self.scheme,
|
||||
userinfo=self.userinfo,
|
||||
host=self.host,
|
||||
port=self.port,
|
||||
path=self.path,
|
||||
query=self.query,
|
||||
fragment=normalizers.normalize_fragment(fragment),
|
||||
)
|
||||
|
||||
def finalize(self):
|
||||
"""Create a URIReference from our builder.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> URIBuilder().add_scheme('https').add_host('github.com'
|
||||
... ).add_path('sigmavirus24/rfc3986').finalize().unsplit()
|
||||
'https://github.com/sigmavirus24/rfc3986'
|
||||
|
||||
>>> URIBuilder().add_scheme('https').add_host('github.com'
|
||||
... ).add_path('sigmavirus24/rfc3986').add_credentials(
|
||||
... 'sigmavirus24', 'not-re@l').finalize().unsplit()
|
||||
'https://sigmavirus24:not-re%40l@github.com/sigmavirus24/rfc3986'
|
||||
|
||||
"""
|
||||
return uri.URIReference(
|
||||
self.scheme,
|
||||
normalizers.normalize_authority(
|
||||
(self.userinfo, self.host, self.port)
|
||||
),
|
||||
self.path,
|
||||
self.query,
|
||||
self.fragment,
|
||||
)
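The builder above never mutates itself: every `add_*` call returns a fresh object. The sketch below illustrates that pattern with the standard library only. `MiniBuilder` is a hypothetical stand-in written for this example, not the `URIBuilder` class from the diff, and it uses `urllib.parse.quote` and `urlencode` directly where the real code goes through `normalizers` and `compat`.

```python
from collections import namedtuple
from urllib.parse import quote, urlencode

_FIELDS = 'scheme userinfo host port path query'


class MiniBuilder(namedtuple('MiniBuilder', _FIELDS)):
    """Hypothetical, simplified stand-in for an immutable URI builder."""

    __slots__ = ()

    def add_scheme(self, scheme):
        return self._replace(scheme=scheme.lower())

    def add_credentials(self, username, password=None):
        if username is None:
            raise ValueError('Username cannot be None')
        userinfo = quote(username)
        if password is not None:
            userinfo = '{}:{}'.format(userinfo, quote(password))
        return self._replace(userinfo=userinfo)

    def add_host(self, host):
        return self._replace(host=host.lower())

    def add_port(self, port):
        port_int = int(port)
        if not 0 <= port_int <= 65535:
            raise ValueError('port out of range: {}'.format(port_int))
        return self._replace(port=str(port_int))

    def add_path(self, path):
        if not path.startswith('/'):
            path = '/' + path
        return self._replace(path=path)

    def add_query_from(self, items):
        return self._replace(query=urlencode(items))

    def unsplit(self):
        authority = ''
        if self.userinfo:
            authority += self.userinfo + '@'
        if self.host:
            authority += self.host
        if self.port:
            authority += ':' + self.port
        result = self.scheme + ':' if self.scheme else ''
        if authority:
            result += '//' + authority
        result += self.path or ''
        if self.query:
            result += '?' + self.query
        return result


EMPTY = MiniBuilder(None, None, None, None, None, None)
url = (EMPTY.add_scheme('HTTPS').add_host('GitHub.com')
       .add_credentials('sigmavirus24', 'not-re@l')
       .add_path('sigmavirus24/rfc3986')
       .add_query_from([('a', 'b c')])
       .unsplit())
print(url)
# https://sigmavirus24:not-re%40l@github.com/sigmavirus24/rfc3986?a=b+c
```

Because each step returns a new object, `EMPTY` is still empty afterwards and partially built values can be reused safely.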
|
@ -12,20 +12,43 @@
|
||||
# implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
"""Compatibility module for Python 2 and 3 support."""
|
||||
import sys
|
||||
|
||||
try:
|
||||
from urllib.parse import quote as urlquote
|
||||
except ImportError: # Python 2.x
|
||||
from urllib import quote as urlquote
|
||||
|
||||
if sys.version_info >= (3, 0):
|
||||
try:
|
||||
from urllib.parse import urlencode
|
||||
except ImportError: # Python 2.x
|
||||
from urllib import urlencode
|
||||
|
||||
__all__ = (
|
||||
'to_bytes',
|
||||
'to_str',
|
||||
'urlquote',
|
||||
'urlencode',
|
||||
)
|
||||
|
||||
PY3 = (3, 0) <= sys.version_info < (4, 0)
|
||||
PY2 = (2, 6) <= sys.version_info < (2, 8)
|
||||
|
||||
|
||||
if PY3:
|
||||
unicode = str # Python 3.x
|
||||
|
||||
|
||||
def to_str(b, encoding):
|
||||
"""Ensure that b is text in the specified encoding."""
|
||||
if hasattr(b, 'decode') and not isinstance(b, unicode):
|
||||
b = b.decode(encoding)
|
||||
return b
|
||||
|
||||
|
||||
def to_bytes(s, encoding):
|
||||
"""Ensure that s is converted to bytes from the encoding."""
|
||||
if hasattr(s, 'encode') and not isinstance(s, bytes):
|
||||
s = s.encode(encoding)
|
||||
return s
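As a small illustration of what these two helpers are for, here is a Python 3 only sketch in which the `encoding` argument decides how text and bytes are converted. The `isinstance` checks and the default value for `encoding` are assumptions made for this example; the module above uses `hasattr` checks so that it also works on Python 2.

```python
def to_str(b, encoding='utf-8'):
    """Return text, decoding bytes-like input with the given encoding."""
    if isinstance(b, (bytes, bytearray)):
        return b.decode(encoding)
    return b  # already text (or None): hand it back untouched


def to_bytes(s, encoding='utf-8'):
    """Return bytes, encoding text input with the given encoding."""
    if isinstance(s, str):
        return s.encode(encoding)
    return s  # already bytes (or None): hand it back untouched


print(to_str(b'caf\xe9', 'latin-1'))   # café
print(to_bytes('café', 'latin-1'))     # b'caf\xe9'
```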
|
92
src/rfc3986/exceptions.py
Normal file
@ -0,0 +1,92 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""Exceptions module for rfc3986."""
|
||||
|
||||
|
||||
class RFC3986Exception(Exception):
|
||||
"""Base class for all rfc3986 exception classes."""
|
||||
|
||||
pass
|
||||
|
||||
|
||||
class InvalidAuthority(RFC3986Exception):
|
||||
"""Exception when the authority string is invalid."""
|
||||
|
||||
def __init__(self, authority):
|
||||
"""Initialize the exception with the invalid authority."""
|
||||
super(InvalidAuthority, self).__init__(
|
||||
"The authority ({0}) is not valid.".format(authority))
|
||||
|
||||
|
||||
class InvalidPort(RFC3986Exception):
|
||||
"""Exception when the port is invalid."""
|
||||
|
||||
def __init__(self, port):
|
||||
"""Initialize the exception with the invalid port."""
|
||||
super(InvalidPort, self).__init__(
|
||||
'The port ("{0}") is not valid.'.format(port))
|
||||
|
||||
|
||||
class ResolutionError(RFC3986Exception):
|
||||
"""Exception to indicate a failure to resolve a URI."""
|
||||
|
||||
def __init__(self, uri):
|
||||
"""Initialize the error with the failed URI."""
|
||||
super(ResolutionError, self).__init__(
|
||||
"{0} is not an absolute URI.".format(uri.unsplit()))
|
||||
|
||||
|
||||
class ValidationError(RFC3986Exception):
|
||||
"""Exception raised during validation of a URI."""
|
||||
|
||||
pass
|
||||
|
||||
|
||||
class MissingComponentError(ValidationError):
|
||||
"""Exception raised when a required component is missing."""
|
||||
|
||||
def __init__(self, uri, *component_names):
|
||||
"""Initialize the error with the missing component name."""
|
||||
verb = 'was'
|
||||
if len(component_names) > 1:
|
||||
verb = 'were'
|
||||
|
||||
self.uri = uri
|
||||
self.components = sorted(component_names)
|
||||
components = ', '.join(self.components)
|
||||
super(MissingComponentError, self).__init__(
|
||||
"{} {} required but missing".format(components, verb),
|
||||
uri,
|
||||
self.components,
|
||||
)
|
||||
|
||||
|
||||
class UnpermittedComponentError(ValidationError):
|
||||
"""Exception raised when a component has an unpermitted value."""
|
||||
|
||||
def __init__(self, component_name, component_value, allowed_values):
|
||||
"""Initialize the error with the unpermitted component."""
|
||||
super(UnpermittedComponentError, self).__init__(
|
||||
"{} was required to be one of {!r} but was {!r}".format(
|
||||
component_name, list(sorted(allowed_values)), component_value,
|
||||
),
|
||||
component_name,
|
||||
component_value,
|
||||
allowed_values,
|
||||
)
|
||||
self.component_name = component_name
|
||||
self.component_value = component_value
|
||||
self.allowed_values = allowed_values
|
||||
|
||||
|
||||
class PasswordForbidden(ValidationError):
|
||||
"""Exception raised when a URL has a password in the userinfo section."""
|
||||
|
||||
def __init__(self, uri):
|
||||
"""Initialize the error with the URI that failed validation."""
|
||||
unsplit = getattr(uri, 'unsplit', lambda: uri)
|
||||
super(PasswordForbidden, self).__init__(
|
||||
'"{}" contained a password when validation forbade it'.format(
|
||||
unsplit()
|
||||
)
|
||||
)
|
||||
self.uri = uri
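Since every class in this module derives from `RFC3986Exception`, callers can catch the base class and still inspect the specific error. The sketch below is a condensed copy of three of the classes above (Python 3 `super()` syntax, docstrings trimmed) so that it runs on its own; the URI string passed in is just sample data.

```python
class RFC3986Exception(Exception):
    """Base class, as in the module above."""


class ValidationError(RFC3986Exception):
    """Raised during validation of a URI."""


class MissingComponentError(ValidationError):
    """Raised when a required component is missing."""

    def __init__(self, uri, *component_names):
        verb = 'were' if len(component_names) > 1 else 'was'
        self.uri = uri
        self.components = sorted(component_names)
        super().__init__(
            '{} {} required but missing'.format(
                ', '.join(self.components), verb),
            uri,
            self.components,
        )


try:
    raise MissingComponentError('//example.com', 'scheme', 'path')
except RFC3986Exception as error:  # the base class catches every subclass
    message = error.args[0]
    missing = error.components

print(message)  # path, scheme were required but missing
print(missing)  # ['path', 'scheme']
```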
|
102
src/rfc3986/misc.py
Normal file
@ -0,0 +1,102 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
# Copyright (c) 2014 Rackspace
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
||||
# implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
"""
|
||||
Module containing compiled regular expressions and constants.
|
||||
|
||||
This module contains important constants, patterns, and compiled regular
|
||||
expressions for parsing and validating URIs and their components.
|
||||
"""
|
||||
|
||||
import re
|
||||
|
||||
from . import abnf_regexp
|
||||
|
||||
# These are enumerated for the named tuple used as a superclass of
|
||||
# URIReference
|
||||
URI_COMPONENTS = ['scheme', 'authority', 'path', 'query', 'fragment']
|
||||
|
||||
important_characters = {
|
||||
'generic_delimiters': abnf_regexp.GENERIC_DELIMITERS,
|
||||
'sub_delimiters': abnf_regexp.SUB_DELIMITERS,
|
||||
# We need to escape the '*' in this case
|
||||
're_sub_delimiters': abnf_regexp.SUB_DELIMITERS_RE,
|
||||
'unreserved_chars': abnf_regexp.UNRESERVED_CHARS,
|
||||
# We need to escape the '-' in this case:
|
||||
're_unreserved': abnf_regexp.UNRESERVED_RE,
|
||||
}
|
||||
|
||||
# For details about delimiters and reserved characters, see:
|
||||
# http://tools.ietf.org/html/rfc3986#section-2.2
|
||||
GENERIC_DELIMITERS = abnf_regexp.GENERIC_DELIMITERS_SET
|
||||
SUB_DELIMITERS = abnf_regexp.SUB_DELIMITERS_SET
|
||||
RESERVED_CHARS = abnf_regexp.RESERVED_CHARS_SET
|
||||
# For details about unreserved characters, see:
|
||||
# http://tools.ietf.org/html/rfc3986#section-2.3
|
||||
UNRESERVED_CHARS = abnf_regexp.UNRESERVED_CHARS_SET
|
||||
NON_PCT_ENCODED = abnf_regexp.NON_PCT_ENCODED_SET
|
||||
|
||||
URI_MATCHER = re.compile(abnf_regexp.URL_PARSING_RE)
|
||||
|
||||
SUBAUTHORITY_MATCHER = re.compile((
|
||||
'^(?:(?P<userinfo>{0})@)?' # userinfo
|
||||
'(?P<host>{1})' # host
|
||||
':?(?P<port>{2})?$' # port
|
||||
).format(abnf_regexp.USERINFO_RE,
|
||||
abnf_regexp.HOST_PATTERN,
|
||||
abnf_regexp.PORT_RE))
|
||||
|
||||
|
||||
IPv4_MATCHER = re.compile('^' + abnf_regexp.IPv4_RE + '$')
|
||||
|
||||
# Matcher used to validate path components
|
||||
PATH_MATCHER = re.compile(abnf_regexp.PATH_RE)
|
||||
|
||||
|
||||
# ##################################
|
||||
# Query and Fragment Matcher Section
|
||||
# ##################################
|
||||
|
||||
QUERY_MATCHER = re.compile(abnf_regexp.QUERY_RE)
|
||||
|
||||
FRAGMENT_MATCHER = QUERY_MATCHER
|
||||
|
||||
# Scheme validation, see: http://tools.ietf.org/html/rfc3986#section-3.1
|
||||
SCHEME_MATCHER = re.compile('^{0}$'.format(abnf_regexp.SCHEME_RE))
|
||||
|
||||
RELATIVE_REF_MATCHER = re.compile(r'^%s(\?%s)?(#%s)?$' % (
|
||||
abnf_regexp.RELATIVE_PART_RE, abnf_regexp.QUERY_RE,
|
||||
abnf_regexp.FRAGMENT_RE,
|
||||
))
|
||||
|
||||
# See http://tools.ietf.org/html/rfc3986#section-4.3
|
||||
ABSOLUTE_URI_MATCHER = re.compile(r'^%s:%s(\?%s)?$' % (
|
||||
abnf_regexp.COMPONENT_PATTERN_DICT['scheme'],
|
||||
abnf_regexp.HIER_PART_RE,
|
||||
abnf_regexp.QUERY_RE[1:-1],
|
||||
))
|
||||
|
||||
|
||||
# Path merger as defined in http://tools.ietf.org/html/rfc3986#section-5.2.3
|
||||
def merge_paths(base_uri, relative_path):
|
||||
"""Merge a base URI's path with a relative URI's path."""
|
||||
if base_uri.path is None and base_uri.authority is not None:
|
||||
return '/' + relative_path
|
||||
else:
|
||||
path = base_uri.path or ''
|
||||
index = path.rfind('/')
|
||||
return path[:index] + '/' + relative_path
|
||||
|
||||
|
||||
UseExisting = object()
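To see what `merge_paths` does with concrete values, the sketch below reproduces the function and calls it with a tiny stand-in for a URI reference. `Base` is a hypothetical namedtuple carrying only the two attributes the function reads; it is not part of the library.

```python
from collections import namedtuple

# Hypothetical stand-in exposing just the attributes merge_paths needs.
Base = namedtuple('Base', ['authority', 'path'])


def merge_paths(base_uri, relative_path):
    """Merge a base URI's path with a relative URI's path."""
    if base_uri.path is None and base_uri.authority is not None:
        # Authority but no path: the merged path hangs off the root.
        return '/' + relative_path
    path = base_uri.path or ''
    index = path.rfind('/')
    # Drop everything after the last '/' of the base path, then append.
    return path[:index] + '/' + relative_path


print(merge_paths(Base('example.com', None), 'd'))      # /d
print(merge_paths(Base('example.com', '/a/b/c'), 'd'))  # /a/b/d
print(merge_paths(Base('example.com', '/a/b/'), 'd'))   # /a/b/d
```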
|
@ -12,29 +12,48 @@
|
||||
# implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
"""Module with functions to normalize components."""
|
||||
import re
|
||||
|
||||
from .compat import to_bytes
|
||||
from .misc import NON_PCT_ENCODED
|
||||
from . import compat
|
||||
from . import misc
|
||||
|
||||
|
||||
def normalize_scheme(scheme):
|
||||
"""Normalize the scheme component."""
|
||||
return scheme.lower()
|
||||
|
||||
|
||||
def normalize_authority(authority):
|
||||
"""Normalize an authority tuple to a string."""
|
||||
userinfo, host, port = authority
|
||||
result = ''
|
||||
if userinfo:
|
||||
result += normalize_percent_characters(userinfo) + '@'
|
||||
if host:
|
||||
result += host.lower()
|
||||
result += normalize_host(host)
|
||||
if port:
|
||||
result += ':' + port
|
||||
return result
|
||||
|
||||
|
||||
def normalize_username(username):
|
||||
"""Normalize a username to make it safe to include in userinfo."""
|
||||
return compat.urlquote(username)
|
||||
|
||||
|
||||
def normalize_password(password):
|
||||
"""Normalize a password to make safe for userinfo."""
|
||||
return compat.urlquote(password)
|
||||
|
||||
|
||||
def normalize_host(host):
|
||||
"""Normalize a host string."""
|
||||
return host.lower()
|
||||
|
||||
|
||||
def normalize_path(path):
|
||||
"""Normalize the path string."""
|
||||
if not path:
|
||||
return path
|
||||
|
||||
@ -43,12 +62,14 @@ def normalize_path(path):
|
||||
|
||||
|
||||
def normalize_query(query):
|
||||
"""Normalize the query string."""
|
||||
if not query:
|
||||
return query
|
||||
return normalize_percent_characters(query)
|
||||
|
||||
|
||||
def normalize_fragment(fragment):
|
||||
"""Normalize the fragment string."""
|
||||
if not fragment:
|
||||
return fragment
|
||||
return normalize_percent_characters(fragment)
|
||||
@ -70,6 +91,10 @@ def normalize_percent_characters(s):
|
||||
|
||||
|
||||
def remove_dot_segments(s):
|
||||
"""Remove dot segments from the string.
|
||||
|
||||
See also Section 5.2.4 of :rfc:`3986`.
|
||||
"""
|
||||
# See http://tools.ietf.org/html/rfc3986#section-5.2.4 for pseudo-code
|
||||
segments = s.split('/') # Turn the path into a list of segments
|
||||
output = [] # Initialize the variable to use to store output
|
||||
@ -100,10 +125,11 @@ def remove_dot_segments(s):
|
||||
|
||||
|
||||
def encode_component(uri_component, encoding):
|
||||
"""Encode the specific component in the provided encoding."""
|
||||
if uri_component is None:
|
||||
return uri_component
|
||||
|
||||
uri_bytes = to_bytes(uri_component, encoding)
|
||||
uri_bytes = compat.to_bytes(uri_component, encoding)
|
||||
|
||||
encoded_uri = bytearray()
|
||||
|
||||
@ -111,7 +137,7 @@ def encode_component(uri_component, encoding):
|
||||
# Will return a single character bytestring on both Python 2 & 3
|
||||
byte = uri_bytes[i:i+1]
|
||||
byte_ord = ord(byte)
|
||||
if byte_ord < 128 and byte.decode() in NON_PCT_ENCODED:
|
||||
if byte_ord < 128 and byte.decode() in misc.NON_PCT_ENCODED:
|
||||
encoded_uri.extend(byte)
|
||||
continue
|
||||
encoded_uri.extend('%{0:02x}'.format(byte_ord).encode())
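The byte-by-byte loop in `encode_component` can be tried out with the self-contained sketch below. The contents of `NON_PCT_ENCODED` are an assumption here (reserved plus unreserved characters, plus `%` so that existing escapes pass through); the real set comes from `abnf_regexp` and is not shown in this diff. The return value is also an assumption, since the end of the function lies outside the hunk.

```python
import string

# Assumption: approximate character sets from RFC 3986, sections 2.2 and 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + '._~-')
RESERVED = set(":/?#[]@" + "!$&'()*+,;=")
NON_PCT_ENCODED = UNRESERVED | RESERVED | {'%'}


def encode_component(component, encoding='utf-8'):
    """Percent-encode every byte that may not appear literally."""
    if component is None:
        return component
    raw = component.encode(encoding) if isinstance(component, str) else component
    encoded = bytearray()
    for i in range(len(raw)):
        byte = raw[i:i + 1]  # single-byte bytestring
        byte_ord = ord(byte)
        if byte_ord < 128 and byte.decode() in NON_PCT_ENCODED:
            encoded.extend(byte)
            continue
        encoded.extend('%{0:02x}'.format(byte_ord).encode())
    return encoded.decode('ascii')  # only ASCII remains at this point


print(encode_component('a b'))   # a%20b
print(encode_component('é'))     # %c3%a9
```

Note that, like the loop in the diff, this emits lowercase hex digits; uppercasing them is left to a separate normalization step.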
|
@ -12,10 +12,12 @@
|
||||
# implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
"""Module containing the urlparse compatibility logic."""
|
||||
from collections import namedtuple
|
||||
|
||||
from . import compat
|
||||
from . import exceptions
|
||||
from . import misc
|
||||
from . import normalizers
|
||||
from . import uri
|
||||
|
||||
@ -44,31 +46,38 @@ class ParseResultMixin(object):
|
||||
return self.authority
|
||||
|
||||
def geturl(self):
|
||||
"""Standard library shim to the unsplit method."""
|
||||
"""Shim to match the standard library method."""
|
||||
return self.unsplit()
|
||||
|
||||
@property
|
||||
def hostname(self):
|
||||
"""Standard library shim for the host portion of the URI."""
|
||||
"""Shim to match the standard library."""
|
||||
return self.host
|
||||
|
||||
@property
|
||||
def netloc(self):
|
||||
"""Standard library shim for the authority portion of the URI."""
|
||||
"""Shim to match the standard library."""
|
||||
return self.authority
|
||||
|
||||
@property
|
||||
def params(self):
|
||||
"""Standard library shim for the query portion of the URI."""
|
||||
"""Shim to match the standard library."""
|
||||
return self.query
|
||||
|
||||
|
||||
class ParseResult(namedtuple('ParseResult', PARSED_COMPONENTS),
|
||||
ParseResultMixin):
|
||||
"""Implementation of urlparse compatibility class.
|
||||
|
||||
This uses the URIReference logic to handle compatibility with the
|
||||
urlparse.ParseResult class.
|
||||
"""
|
||||
|
||||
slots = ()
|
||||
|
||||
def __new__(cls, scheme, userinfo, host, port, path, query, fragment,
|
||||
uri_ref, encoding='utf-8'):
|
||||
"""Create a new ParseResult."""
|
||||
parse_result = super(ParseResult, cls).__new__(
|
||||
cls,
|
||||
scheme or None,
|
||||
@ -139,16 +148,19 @@ class ParseResult(namedtuple('ParseResult', PARSED_COMPONENTS),
|
||||
|
||||
@property
|
||||
def authority(self):
|
||||
"""Normalized authority generated from the subauthority parts."""
|
||||
"""Return the normalized authority."""
|
||||
return self.reference.authority
|
||||
|
||||
def copy_with(self, scheme=None, userinfo=None, host=None, port=None,
|
||||
path=None, query=None, fragment=None):
|
||||
def copy_with(self, scheme=misc.UseExisting, userinfo=misc.UseExisting,
|
||||
host=misc.UseExisting, port=misc.UseExisting,
|
||||
path=misc.UseExisting, query=misc.UseExisting,
|
||||
fragment=misc.UseExisting):
|
||||
"""Create a copy of this instance replacing with specified parts."""
|
||||
attributes = zip(PARSED_COMPONENTS,
|
||||
(scheme, userinfo, host, port, path, query, fragment))
|
||||
attrs_dict = {}
|
||||
for name, value in attributes:
|
||||
if value is None:
|
||||
if value is misc.UseExisting:
|
||||
value = getattr(self, name)
|
||||
attrs_dict[name] = value
|
||||
authority = self._generate_authority(attrs_dict)
|
||||
@ -160,6 +172,7 @@ class ParseResult(namedtuple('ParseResult', PARSED_COMPONENTS),
|
||||
return ParseResult(uri_ref=ref, encoding=self.encoding, **attrs_dict)
|
||||
|
||||
def encode(self, encoding=None):
|
||||
"""Convert to an instance of ParseResultBytes."""
|
||||
encoding = encoding or self.encoding
|
||||
attrs = dict(
|
||||
zip(PARSED_COMPONENTS,
|
||||
@ -187,8 +200,11 @@ class ParseResult(namedtuple('ParseResult', PARSED_COMPONENTS),
|
||||
|
||||
class ParseResultBytes(namedtuple('ParseResultBytes', PARSED_COMPONENTS),
|
||||
ParseResultMixin):
|
||||
"""Compatibility shim for the urlparse.ParseResultBytes object."""
|
||||
|
||||
def __new__(cls, scheme, userinfo, host, port, path, query, fragment,
|
||||
uri_ref, encoding='utf-8', lazy_normalize=True):
|
||||
"""Create a new ParseResultBytes instance."""
|
||||
parse_result = super(ParseResultBytes, cls).__new__(
|
||||
cls,
|
||||
scheme or None,
|
||||
@ -267,16 +283,19 @@ class ParseResultBytes(namedtuple('ParseResultBytes', PARSED_COMPONENTS),
|
||||
|
||||
@property
|
||||
def authority(self):
|
||||
"""Normalized authority generated from the subauthority parts."""
|
||||
"""Return the normalized authority."""
|
||||
return self.reference.authority.encode(self.encoding)
|
||||
|
||||
def copy_with(self, scheme=None, userinfo=None, host=None, port=None,
|
||||
path=None, query=None, fragment=None, lazy_normalize=True):
|
||||
def copy_with(self, scheme=misc.UseExisting, userinfo=misc.UseExisting,
|
||||
host=misc.UseExisting, port=misc.UseExisting,
|
||||
path=misc.UseExisting, query=misc.UseExisting,
|
||||
fragment=misc.UseExisting, lazy_normalize=True):
|
||||
"""Create a copy of this instance replacing with specified parts."""
|
||||
attributes = zip(PARSED_COMPONENTS,
|
||||
(scheme, userinfo, host, port, path, query, fragment))
|
||||
attrs_dict = {}
|
||||
for name, value in attributes:
|
||||
if value is None:
|
||||
if value is misc.UseExisting:
|
||||
value = getattr(self, name)
|
||||
if not isinstance(value, bytes) and hasattr(value, 'encode'):
|
||||
value = value.encode(self.encoding)
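The switch from `None` to `misc.UseExisting` as the default matters because `None` is itself a meaningful value (for example, "no port"). The sketch below shows the sentinel pattern on a hypothetical three-field `Parts` tuple; it is a simplified illustration, not the `ParseResult` implementation.

```python
from collections import namedtuple

# Unique marker object: "argument not supplied, keep the current value".
UseExisting = object()


class Parts(namedtuple('Parts', ['scheme', 'host', 'port'])):
    """Hypothetical, reduced stand-in for a parse result."""

    __slots__ = ()

    def copy_with(self, scheme=UseExisting, host=UseExisting,
                  port=UseExisting):
        attrs = {}
        for name, value in zip(self._fields, (scheme, host, port)):
            if value is UseExisting:
                value = getattr(self, name)
            attrs[name] = value
        return Parts(**attrs)


original = Parts('https', 'example.com', '8443')
# With a None default this call could not remove the port;
# with the sentinel, None is passed through as a real value.
without_port = original.copy_with(port=None)
print(without_port)
# Parts(scheme='https', host='example.com', port=None)
```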
|
@ -1,3 +1,4 @@
|
||||
"""Module containing the implementation of the URIReference class."""
|
||||
# -*- coding: utf-8 -*-
|
||||
# Copyright (c) 2014 Rackspace
|
||||
# Copyright (c) 2015 Ian Cordasco
|
||||
@ -15,24 +16,76 @@
|
||||
# limitations under the License.
|
||||
from collections import namedtuple
|
||||
|
||||
from .compat import to_str
|
||||
from .exceptions import InvalidAuthority, ResolutionError
|
||||
from .misc import (
|
||||
ABSOLUTE_URI_MATCHER, FRAGMENT_MATCHER, IPv4_MATCHER, PATH_MATCHER,
|
||||
QUERY_MATCHER, SCHEME_MATCHER, SUBAUTHORITY_MATCHER, URI_MATCHER,
|
||||
URI_COMPONENTS, merge_paths
|
||||
)
|
||||
from .normalizers import (
|
||||
encode_component, normalize_scheme, normalize_authority, normalize_path,
|
||||
normalize_query, normalize_fragment
|
||||
)
|
||||
from . import compat
|
||||
from . import exceptions as exc
|
||||
from . import misc
|
||||
from . import normalizers
|
||||
from . import validators
|
||||
|
||||
|
||||
class URIReference(namedtuple('URIReference', URI_COMPONENTS)):
|
||||
class URIReference(namedtuple('URIReference', misc.URI_COMPONENTS)):
|
||||
"""Immutable object representing a parsed URI Reference.
|
||||
|
||||
.. note::
|
||||
|
||||
This class is not intended to be directly instantiated by the user.
|
||||
|
||||
This object exposes attributes for the following components of a
|
||||
URI:
|
||||
|
||||
- scheme
|
||||
- authority
|
||||
- path
|
||||
- query
|
||||
- fragment
|
||||
|
||||
.. attribute:: scheme
|
||||
|
||||
The scheme that was parsed for the URI Reference. For example,
|
||||
``http``, ``https``, ``smtp``, ``imap``, etc.
|
||||
|
||||
.. attribute:: authority
|
||||
|
||||
Component of the URI that contains the user information, host,
|
||||
and port sub-components. For example,
|
||||
``google.com``, ``127.0.0.1:5000``, ``username@[::1]``,
|
||||
``username:password@example.com:443``, etc.
|
||||
|
||||
.. attribute:: path
|
||||
|
||||
The path that was parsed for the given URI Reference. For example,
|
||||
``/``, ``/index.php``, etc.
|
||||
|
||||
.. attribute:: query
|
||||
|
||||
The query component for a given URI Reference. For example, ``a=b``,
|
||||
``a=b%20c``, ``a=b+c``, ``a=b,c=d,e=%20f``, etc.
|
||||
|
||||
.. attribute:: fragment
|
||||
|
||||
The fragment component of a URI. For example, ``section-3.1``.
|
||||
|
||||
This class also provides extra attributes for easier access to information
|
||||
like the subcomponents of the authority component.
|
||||
|
||||
.. attribute:: userinfo
|
||||
|
||||
The user information parsed from the authority.
|
||||
|
||||
.. attribute:: host
|
||||
|
||||
The hostname, IPv4, or IPv6 address parsed from the authority.
|
||||
|
||||
.. attribute:: port
|
||||
|
||||
The port parsed from the authority.
|
||||
"""
|
||||
|
||||
slots = ()
|
||||
|
||||
def __new__(cls, scheme, authority, path, query, fragment,
|
||||
encoding='utf-8'):
|
||||
"""Create a new URIReference."""
|
||||
ref = super(URIReference, cls).__new__(
|
||||
cls,
|
||||
scheme or None,
|
||||
@ -44,6 +97,7 @@ class URIReference(namedtuple('URIReference', URI_COMPONENTS)):
|
||||
return ref
|
||||
|
||||
def __eq__(self, other):
|
||||
"""Compare this reference to another."""
|
||||
other_ref = other
|
||||
if isinstance(other, tuple):
|
||||
other_ref = URIReference(*other)
|
||||
@ -67,49 +121,52 @@ class URIReference(namedtuple('URIReference', URI_COMPONENTS)):
|
||||
:param str encoding: The encoding of the string provided
|
||||
:returns: :class:`URIReference` or subclass thereof
|
||||
"""
|
||||
uri_string = to_str(uri_string, encoding)
|
||||
uri_string = compat.to_str(uri_string, encoding)
|
||||
|
||||
split_uri = URI_MATCHER.match(uri_string).groupdict()
|
||||
return cls(split_uri['scheme'], split_uri['authority'],
|
||||
encode_component(split_uri['path'], encoding),
|
||||
encode_component(split_uri['query'], encoding),
|
||||
encode_component(split_uri['fragment'], encoding), encoding)
|
||||
split_uri = misc.URI_MATCHER.match(uri_string).groupdict()
|
||||
return cls(
|
||||
split_uri['scheme'], split_uri['authority'],
|
||||
normalizers.encode_component(split_uri['path'], encoding),
|
||||
normalizers.encode_component(split_uri['query'], encoding),
|
||||
normalizers.encode_component(split_uri['fragment'], encoding),
|
||||
encoding,
|
||||
)
|
||||
|
||||
def authority_info(self):
|
||||
"""Returns a dictionary with the ``userinfo``, ``host``, and ``port``.
|
||||
"""Return a dictionary with the ``userinfo``, ``host``, and ``port``.
|
||||
|
||||
If the authority is not valid, it will raise a ``InvalidAuthority``
|
||||
Exception.
|
||||
If the authority is not valid, it will raise a
|
||||
:class:`~rfc3986.exceptions.InvalidAuthority` Exception.
|
||||
|
||||
:returns:
|
||||
``{'userinfo': 'username:password', 'host': 'www.example.com',
|
||||
'port': '80'}``
|
||||
:rtype: dict
|
||||
:raises InvalidAuthority: If the authority is not ``None`` and can not
|
||||
be parsed.
|
||||
:raises rfc3986.exceptions.InvalidAuthority:
|
||||
If the authority is not ``None`` and can not be parsed.
|
||||
"""
|
||||
if not self.authority:
|
||||
return {'userinfo': None, 'host': None, 'port': None}
|
||||
|
||||
match = SUBAUTHORITY_MATCHER.match(self.authority)
|
||||
match = misc.SUBAUTHORITY_MATCHER.match(self.authority)
|
||||
|
||||
if match is None:
|
||||
# In this case, we have an authority that was parsed from the URI
|
||||
# Reference, but it cannot be further parsed by our
|
||||
# SUBAUTHORITY_MATCHER. In this case it must not be a valid
|
||||
# misc.SUBAUTHORITY_MATCHER. In this case it must not be a valid
|
||||
# authority.
|
||||
raise InvalidAuthority(self.authority.encode(self.encoding))
|
||||
raise exc.InvalidAuthority(self.authority.encode(self.encoding))
|
||||
|
||||
# We had a match, now let's ensure that it is actually a valid host
|
||||
# address if it is IPv4
|
||||
matches = match.groupdict()
|
||||
host = matches.get('host')
|
||||
|
||||
if (host and IPv4_MATCHER.match(host) and not
|
||||
valid_ipv4_host_address(host)):
|
||||
if (host and misc.IPv4_MATCHER.match(host) and not
|
||||
validators.valid_ipv4_host_address(host)):
|
||||
# If we have a host, it appears to be IPv4 and it does not have
|
||||
# valid bytes, it is an InvalidAuthority.
|
||||
raise InvalidAuthority(self.authority.encode(self.encoding))
|
||||
raise exc.InvalidAuthority(self.authority.encode(self.encoding))
|
||||
|
||||
return matches
|
||||
|
||||
@ -118,16 +175,16 @@ class URIReference(namedtuple('URIReference', URI_COMPONENTS)):
|
||||
"""If present, a string representing the host."""
|
||||
try:
|
||||
authority = self.authority_info()
|
||||
except InvalidAuthority:
|
||||
except exc.InvalidAuthority:
|
||||
return None
|
||||
return authority['host']
|
||||
|
||||
@property
|
||||
def port(self):
|
||||
"""If present, the port (as a string) extracted from the authority."""
|
||||
"""If present, the port extracted from the authority."""
|
||||
try:
|
||||
authority = self.authority_info()
|
||||
except InvalidAuthority:
|
||||
except exc.InvalidAuthority:
|
||||
return None
|
||||
return authority['port']
|
||||
|
||||
@ -136,7 +193,7 @@ class URIReference(namedtuple('URIReference', URI_COMPONENTS)):
|
||||
"""If present, the userinfo extracted from the authority."""
|
||||
try:
|
||||
authority = self.authority_info()
|
||||
except InvalidAuthority:
|
||||
except exc.InvalidAuthority:
|
||||
return None
|
||||
return authority['userinfo']
|
||||
|
||||
@ -148,10 +205,10 @@ class URIReference(namedtuple('URIReference', URI_COMPONENTS)):
|
||||
:returns: ``True`` if it is an absolute URI, ``False`` otherwise.
|
||||
:rtype: bool
|
||||
"""
|
||||
return bool(ABSOLUTE_URI_MATCHER.match(self.unsplit()))
|
||||
return bool(misc.ABSOLUTE_URI_MATCHER.match(self.unsplit()))
|
||||
|
||||
def is_valid(self, **kwargs):
|
||||
"""Determines if the URI is valid.
|
||||
"""Determine if the URI is valid.
|
||||
|
||||
:param bool require_scheme: Set to ``True`` if you wish to require the
|
||||
presence of the scheme component.
|
||||
@ -175,82 +232,69 @@ class URIReference(namedtuple('URIReference', URI_COMPONENTS)):
|
||||
]
|
||||
return all(v(r) for v, r in validators)
|
||||
|
||||
def _is_valid(self, value, matcher, require):
|
||||
if require:
|
||||
return (value is not None
|
||||
and matcher.match(value))
|
||||
|
||||
# require is False and value is not None
|
||||
return value is None or matcher.match(value)
|
||||
|
||||
def authority_is_valid(self, require=False):
|
||||
"""Determines if the authority component is valid.
|
||||
"""Determine if the authority component is valid.
|
||||
|
||||
:param str require: Set to ``True`` to require the presence of this
|
||||
component.
|
||||
:returns: ``True`` if the authority is valid. ``False`` otherwise.
|
||||
:rtype: bool
|
||||
:param bool require:
|
||||
Set to ``True`` to require the presence of this component.
|
||||
:returns:
|
||||
``True`` if the authority is valid. ``False`` otherwise.
|
||||
:rtype:
|
||||
bool
|
||||
"""
|
||||
try:
|
||||
self.authority_info()
|
||||
except InvalidAuthority:
|
||||
except exc.InvalidAuthority:
|
||||
return False
|
||||
|
||||
is_valid = self._is_valid(self.authority,
|
||||
SUBAUTHORITY_MATCHER,
|
||||
require)
|
||||
|
||||
# Ensure that IPv4 addresses have valid bytes
|
||||
if is_valid and self.host and IPv4_MATCHER.match(self.host):
|
||||
return valid_ipv4_host_address(self.host)
|
||||
|
||||
# Perhaps the host didn't exist or if it did, it wasn't an IPv4-like
|
||||
# address. In either case, we want to rely on the `_is_valid` check,
|
||||
# so let's return that.
|
||||
return is_valid
|
||||
return validators.authority_is_valid(
|
||||
self.authority,
|
||||
host=self.host,
|
||||
require=require,
|
||||
)
|
||||
|
||||
def scheme_is_valid(self, require=False):
|
||||
"""Determines if the scheme component is valid.
|
||||
"""Determine if the scheme component is valid.
|
||||
|
||||
:param str require: Set to ``True`` to require the presence of this
|
||||
component.
|
||||
:returns: ``True`` if the scheme is valid. ``False`` otherwise.
|
||||
:rtype: bool
|
||||
"""
|
||||
return self._is_valid(self.scheme, SCHEME_MATCHER, require)
|
||||
return validators.scheme_is_valid(self.scheme, require)
|
||||
|
||||
def path_is_valid(self, require=False):
|
||||
"""Determines if the path component is valid.
|
||||
"""Determine if the path component is valid.
|
||||
|
||||
:param str require: Set to ``True`` to require the presence of this
|
||||
component.
|
||||
:returns: ``True`` if the path is valid. ``False`` otherwise.
|
||||
:rtype: bool
|
||||
"""
|
||||
return self._is_valid(self.path, PATH_MATCHER, require)
|
||||
return validators.path_is_valid(self.path, require)
|
||||
|
||||
def query_is_valid(self, require=False):
|
||||
"""Determines if the query component is valid.
|
||||
"""Determine if the query component is valid.
|
||||
|
||||
:param str require: Set to ``True`` to require the presence of this
|
||||
component.
|
||||
:returns: ``True`` if the query is valid. ``False`` otherwise.
|
||||
:rtype: bool
|
||||
"""
|
||||
return self._is_valid(self.query, QUERY_MATCHER, require)
|
||||
return validators.query_is_valid(self.query, require)
|
||||
|
||||
def fragment_is_valid(self, require=False):
|
||||
"""Determines if the fragment component is valid.
|
||||
"""Determine if the fragment component is valid.
|
||||
|
||||
:param str require: Set to ``True`` to require the presence of this
|
||||
component.
|
||||
:returns: ``True`` if the fragment is valid. ``False`` otherwise.
|
||||
:rtype: bool
|
||||
"""
|
||||
return self._is_valid(self.fragment, FRAGMENT_MATCHER, require)
|
||||
return validators.fragment_is_valid(self.fragment, require)
|
||||
|
||||
def normalize(self):
|
||||
"""Normalize this reference as described in Section 6.2.2
|
||||
"""Normalize this reference as described in Section 6.2.2.
|
||||
|
||||
This is not an in-place normalization. Instead this creates a new
|
||||
URIReference.
|
||||
@ -260,12 +304,12 @@ class URIReference(namedtuple('URIReference', URI_COMPONENTS)):
|
||||
"""
|
||||
# See http://tools.ietf.org/html/rfc3986#section-6.2.2 for logic in
|
||||
# this method.
|
||||
return URIReference(normalize_scheme(self.scheme or ''),
|
||||
normalize_authority(
|
||||
return URIReference(normalizers.normalize_scheme(self.scheme or ''),
|
||||
normalizers.normalize_authority(
|
||||
(self.userinfo, self.host, self.port)),
|
||||
normalize_path(self.path or ''),
|
||||
normalize_query(self.query),
|
||||
normalize_fragment(self.fragment),
|
||||
normalizers.normalize_path(self.path or ''),
|
||||
normalizers.normalize_query(self.query),
|
||||
normalizers.normalize_fragment(self.fragment),
|
||||
self.encoding)
|
||||
|
||||
def normalized_equality(self, other_ref):
|
||||
@ -291,13 +335,14 @@ class URIReference(namedtuple('URIReference', URI_COMPONENTS)):
|
||||
:returns: A new URIReference which is the result of resolving this
|
||||
reference using ``base_uri``.
|
||||
:rtype: :class:`URIReference`
|
||||
:raises ResolutionError: If the ``base_uri`` is not an absolute URI.
|
||||
:raises rfc3986.exceptions.ResolutionError:
|
||||
If the ``base_uri`` is not an absolute URI.
|
||||
"""
|
||||
if not isinstance(base_uri, URIReference):
|
||||
base_uri = URIReference.from_string(base_uri)
|
||||
|
||||
if not base_uri.is_absolute():
|
||||
raise ResolutionError(base_uri)
|
||||
raise exc.ResolutionError(base_uri)
|
||||
|
||||
# This is optional per
|
||||
# http://tools.ietf.org/html/rfc3986#section-5.2.1
|
||||
@ -311,12 +356,14 @@ class URIReference(namedtuple('URIReference', URI_COMPONENTS)):
|
||||
|
||||
# http://tools.ietf.org/html/rfc3986#page-32
|
||||
if resolving.scheme is not None:
|
||||
target = resolving.copy_with(path=normalize_path(resolving.path))
|
||||
target = resolving.copy_with(
|
||||
path=normalizers.normalize_path(resolving.path)
|
||||
)
|
||||
else:
|
||||
if resolving.authority is not None:
|
||||
target = resolving.copy_with(
|
||||
scheme=base_uri.scheme,
|
||||
path=normalize_path(resolving.path)
|
||||
path=normalizers.normalize_path(resolving.path)
|
||||
)
|
||||
else:
|
||||
if resolving.path is None:
|
||||
@ -332,10 +379,10 @@ class URIReference(namedtuple('URIReference', URI_COMPONENTS)):
|
||||
)
|
||||
else:
|
||||
if resolving.path.startswith('/'):
|
||||
path = normalize_path(resolving.path)
|
||||
path = normalizers.normalize_path(resolving.path)
|
||||
else:
|
||||
path = normalize_path(
|
||||
merge_paths(base_uri, resolving.path)
|
||||
path = normalizers.normalize_path(
|
||||
misc.merge_paths(base_uri, resolving.path)
|
||||
)
|
||||
target = resolving.copy_with(
|
||||
scheme=base_uri.scheme,
|
||||
@ -365,8 +412,26 @@ class URIReference(namedtuple('URIReference', URI_COMPONENTS)):
|
||||
result_list.extend(['#', self.fragment])
|
||||
return ''.join(result_list)
|
||||
|
||||
def copy_with(self, scheme=None, authority=None, path=None, query=None,
|
||||
fragment=None):
|
||||
def copy_with(self, scheme=misc.UseExisting, authority=misc.UseExisting,
|
||||
path=misc.UseExisting, query=misc.UseExisting,
|
||||
fragment=misc.UseExisting):
|
||||
"""Create a copy of this reference with the new components.
|
||||
|
||||
:param str scheme:
|
||||
(optional) The scheme to use for the new reference.
|
||||
:param str authority:
|
||||
(optional) The authority to use for the new reference.
|
||||
:param str path:
|
||||
(optional) The path to use for the new reference.
|
||||
:param str query:
|
||||
(optional) The query to use for the new reference.
|
||||
:param str fragment:
|
||||
(optional) The fragment to use for the new reference.
|
||||
:returns:
|
||||
New URIReference with provided components.
|
||||
:rtype:
|
||||
URIReference
|
||||
"""
|
||||
attributes = {
|
||||
'scheme': scheme,
|
||||
'authority': authority,
|
||||
@ -375,14 +440,8 @@ class URIReference(namedtuple('URIReference', URI_COMPONENTS)):
|
||||
'fragment': fragment,
|
||||
}
|
||||
for key, value in list(attributes.items()):
|
||||
if value is None:
|
||||
if value is misc.UseExisting:
|
||||
del attributes[key]
|
||||
uri = self._replace(**attributes)
|
||||
uri.encoding = self.encoding
|
||||
return uri
|
||||
|
||||
|
||||
def valid_ipv4_host_address(host):
|
||||
# If the host exists, and it might be IPv4, check each byte in the
|
||||
# address.
|
||||
return all([0 <= int(byte, base=10) <= 255 for byte in host.split('.')])
|
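Review note: the `copy_with` hunk above replaces `None` defaults with the `misc.UseExisting` sentinel, which appears to let a caller pass `None` explicitly to clear a component. The sketch below illustrates that pattern in isolation. `USE_EXISTING` and the toy `Ref` class are hypothetical stand-ins, not rfc3986's own objects.

```python
from collections import namedtuple

# Hypothetical stand-in for misc.UseExisting: a unique sentinel object that
# can never be confused with a legitimate component value such as None.
USE_EXISTING = object()


class Ref(namedtuple('Ref', 'scheme authority path query fragment')):
    """Toy five-component reference; not rfc3986's URIReference."""

    def copy_with(self, scheme=USE_EXISTING, authority=USE_EXISTING,
                  path=USE_EXISTING, query=USE_EXISTING,
                  fragment=USE_EXISTING):
        attributes = {
            'scheme': scheme,
            'authority': authority,
            'path': path,
            'query': query,
            'fragment': fragment,
        }
        # Keep only the components the caller actually passed. An explicit
        # None survives this filter, so it overwrites the old value.
        changes = {
            key: value for key, value in attributes.items()
            if value is not USE_EXISTING
        }
        return self._replace(**changes)


ref = Ref('https', 'github.com', '/path', 'q=1', 'frag')
without_query = ref.copy_with(query=None)  # query is cleared
same = ref.copy_with()                     # nothing changes
```

With the earlier `None` defaults, `copy_with(query=None)` could not be told apart from "query not supplied", so a component could not be removed this way.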
311
src/rfc3986/validators.py
Normal file
@ -0,0 +1,311 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
# Copyright (c) 2017 Ian Cordasco
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
||||
# implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
"""Module containing the validation logic for rfc3986."""
|
||||
from . import exceptions
|
||||
from . import misc
|
||||
from . import normalizers
|
||||
|
||||
|
||||
class Validator(object):
|
||||
"""Object used to configure validation of all objects in rfc3986.
|
||||
|
||||
Example usage::
|
||||
|
||||
>>> from rfc3986 import api, validators
|
||||
>>> uri = api.uri_reference('https://github.com/')
|
||||
>>> validator = validators.Validator().require_presence_of(
|
||||
... 'scheme', 'host', 'path',
|
||||
... ).allow_schemes(
|
||||
... 'http', 'https',
|
||||
... ).allow_hosts(
|
||||
... '127.0.0.1', 'github.com',
|
||||
... )
|
||||
>>> validator.validate(uri)
|
||||
>>> invalid_uri = api.uri_reference('imap://mail.google.com')
|
||||
>>> validator.validate(invalid_uri)
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
rfc3986.exceptions.MissingComponentError: ('path was required but
|
||||
missing', URIReference(scheme=u'imap', authority=u'mail.google.com',
|
||||
path=None, query=None, fragment=None), ['path'])
|
||||
|
||||
"""
|
||||
|
||||
COMPONENT_NAMES = frozenset([
|
||||
'scheme',
|
||||
'userinfo',
|
||||
'host',
|
||||
'port',
|
||||
'path',
|
||||
'query',
|
||||
'fragment',
|
||||
])
|
||||
|
||||
def __init__(self):
|
||||
"""Initialize our default validations."""
|
||||
self.allowed_schemes = set()
|
||||
self.allowed_hosts = set()
|
||||
self.allowed_ports = set()
|
||||
self.allow_password = True
|
||||
self.required_components = {
|
||||
'scheme': False,
|
||||
'userinfo': False,
|
||||
'host': False,
|
||||
'port': False,
|
||||
'path': False,
|
||||
'query': False,
|
||||
'fragment': False,
|
||||
}
|
||||
|
||||
def allow_schemes(self, *schemes):
|
||||
"""Require the scheme to be one of the provided schemes.
|
||||
|
||||
:param schemes:
|
||||
Schemes, without ``://`` that are allowed.
|
||||
:returns:
|
||||
The validator instance.
|
||||
:rtype:
|
||||
Validator
|
||||
"""
|
||||
for scheme in schemes:
|
||||
self.allowed_schemes.add(normalizers.normalize_scheme(scheme))
|
||||
return self
|
||||
|
||||
def allow_hosts(self, *hosts):
|
||||
"""Require the host to be one of the provided hosts.
|
||||
|
||||
:param hosts:
|
||||
Hosts that are allowed.
|
||||
:returns:
|
||||
The validator instance.
|
||||
:rtype:
|
||||
Validator
|
||||
"""
|
||||
for host in hosts:
|
||||
self.allowed_hosts.add(normalizers.normalize_host(host))
|
||||
return self
|
||||
|
||||
def allow_ports(self, *ports):
|
||||
"""Require the port to be one of the provided ports.
|
||||
|
||||
:param ports:
|
||||
Ports that are allowed.
|
||||
:returns:
|
||||
The validator instance.
|
||||
:rtype:
|
||||
Validator
|
||||
"""
|
||||
for port in ports:
|
||||
port_int = int(port, base=10)
|
||||
if 0 <= port_int <= 65535:
|
||||
self.allowed_ports.add(port)
|
||||
return self
|
||||
|
||||
def allow_use_of_password(self):
|
||||
"""Allow passwords to be present in the URI."""
|
||||
self.allow_password = True
|
||||
return self
|
||||
|
||||
def forbid_use_of_password(self):
|
||||
"""Prevent passwords from being included in the URI."""
|
||||
self.allow_password = False
|
||||
return self
|
||||
|
||||
def require_presence_of(self, *components):
|
||||
"""Require the components provided.
|
||||
|
||||
:param components:
|
||||
Names of components from :attr:`Validator.COMPONENT_NAMES`.
|
||||
:returns:
|
||||
The validator instance.
|
||||
:rtype:
|
||||
Validator
|
||||
"""
|
||||
components = [c.lower() for c in components]
|
||||
for component in components:
|
||||
if component not in self.COMPONENT_NAMES:
|
||||
raise ValueError(
|
||||
'"{}" is not a valid component'.format(component)
|
||||
)
|
||||
self.required_components.update({
|
||||
component: True for component in components
|
||||
})
|
||||
return self
|
||||
|
||||
def validate(self, uri):
|
||||
"""Check a URI for conditions specified on this validator.
|
||||
|
||||
:param uri:
|
||||
Parsed URI to validate.
|
||||
:type uri:
|
||||
rfc3986.uri.URIReference
|
||||
:raises MissingComponentError:
|
||||
When a required component is missing.
|
||||
:raises UnpermittedComponentError:
|
||||
When a component is not one of those allowed.
|
||||
:raises PasswordForbidden:
|
||||
When a password is present in the userinfo component but is
|
||||
not permitted by configuration.
|
||||
"""
|
||||
if not self.allow_password:
|
||||
check_password(uri)
|
||||
|
||||
required_components = [
|
||||
component
|
||||
for component, required in self.required_components.items()
|
||||
if required
|
||||
]
|
||||
if required_components:
|
||||
ensure_required_components_exist(uri, required_components)
|
||||
|
||||
ensure_one_of(self.allowed_schemes, uri, 'scheme')
|
||||
ensure_one_of(self.allowed_hosts, uri, 'host')
|
||||
ensure_one_of(self.allowed_ports, uri, 'port')
|
||||
|
||||
|
||||
def check_password(uri):
|
||||
"""Assert that there is no password present in the uri."""
|
||||
userinfo = uri.userinfo
|
||||
if not userinfo:
|
||||
return
|
||||
credentials = userinfo.split(':', 1)
|
||||
if len(credentials) <= 1:
|
||||
return
|
||||
raise exceptions.PasswordForbidden(uri)
|
||||
|
||||
|
||||
def ensure_one_of(allowed_values, uri, attribute):
|
||||
"""Assert that the uri's attribute is one of the allowed values."""
|
||||
value = getattr(uri, attribute)
|
||||
if value is not None and allowed_values and value not in allowed_values:
|
||||
raise exceptions.UnpermittedComponentError(
|
||||
attribute, value, allowed_values,
|
||||
)
|
||||
|
||||
|
||||
def ensure_required_components_exist(uri, required_components):
|
||||
"""Assert that all required components are present in the URI."""
|
||||
missing_components = sorted([
|
||||
component
|
||||
for component in required_components
|
||||
if getattr(uri, component) is None
|
||||
])
|
||||
if missing_components:
|
||||
raise exceptions.MissingComponentError(uri, *missing_components)
|
||||
|
||||
|
||||
def is_valid(value, matcher, require):
|
||||
"""Determine if a value is valid based on the provided matcher.
|
||||
|
||||
:param str value:
|
||||
Value to validate.
|
||||
:param matcher:
|
||||
Compiled regular expression to use to validate the value.
|
||||
:param require:
|
||||
Whether or not the value is required.
|
||||
"""
|
||||
if require:
|
||||
return (value is not None
|
||||
and matcher.match(value))
|
||||
|
||||
# require is False and value is not None
|
||||
return value is None or matcher.match(value)
|
||||
|
||||
|
||||
def authority_is_valid(authority, host=None, require=False):
|
||||
"""Determine if the authority string is valid.
|
||||
|
||||
:param str authority:
|
||||
The authority to validate.
|
||||
:param str host:
|
||||
(optional) The host portion of the authority to validate.
|
||||
:param bool require:
|
||||
(optional) Specify if authority must not be None.
|
||||
:returns:
|
||||
``True`` if valid, ``False`` otherwise
|
||||
:rtype:
|
||||
bool
|
||||
"""
|
||||
validated = is_valid(authority, misc.SUBAUTHORITY_MATCHER, require)
|
||||
if validated and host is not None and misc.IPv4_MATCHER.match(host):
|
||||
return valid_ipv4_host_address(host)
|
||||
return validated
|
||||
|
||||
|
||||
def scheme_is_valid(scheme, require=False):
|
||||
"""Determine if the scheme is valid.
|
||||
|
||||
:param str scheme:
|
||||
The scheme string to validate.
|
||||
:param bool require:
|
||||
(optional) Set to ``True`` to require the presence of a scheme.
|
||||
:returns:
|
||||
``True`` if the scheme is valid. ``False`` otherwise.
|
||||
:rtype:
|
||||
bool
|
||||
"""
|
||||
return is_valid(scheme, misc.SCHEME_MATCHER, require)
|
||||
|
||||
|
||||
def path_is_valid(path, require=False):
|
||||
"""Determine if the path component is valid.
|
||||
|
||||
:param str path:
|
||||
The path string to validate.
|
||||
:param bool require:
|
||||
(optional) Set to ``True`` to require the presence of a path.
|
||||
:returns:
|
||||
``True`` if the path is valid. ``False`` otherwise.
|
||||
:rtype:
|
||||
bool
|
||||
"""
|
||||
return is_valid(path, misc.PATH_MATCHER, require)
|
||||
|
||||
|
||||
def query_is_valid(query, require=False):
|
||||
"""Determine if the query component is valid.
|
||||
|
||||
:param str query:
|
||||
The query string to validate.
|
||||
:param bool require:
|
||||
(optional) Set to ``True`` to require the presence of a query.
|
||||
:returns:
|
||||
``True`` if the query is valid. ``False`` otherwise.
|
||||
:rtype:
|
||||
bool
|
||||
"""
|
||||
return is_valid(query, misc.QUERY_MATCHER, require)
|
||||
|
||||
|
||||
def fragment_is_valid(fragment, require=False):
|
||||
"""Determine if the fragment component is valid.
|
||||
|
||||
:param str fragment:
|
||||
The fragment string to validate.
|
||||
:param bool require:
|
||||
(optional) Set to ``True`` to require the presence of a fragment.
|
||||
:returns:
|
||||
``True`` if the fragment is valid. ``False`` otherwise.
|
||||
:rtype:
|
||||
bool
|
||||
"""
|
||||
return is_valid(fragment, misc.FRAGMENT_MATCHER, require)
|
||||
|
||||
|
||||
def valid_ipv4_host_address(host):
|
||||
"""Determine if the given host is a valid IPv4 address."""
|
||||
# If the host exists, and it might be IPv4, check each byte in the
|
||||
# address.
|
||||
return all([0 <= int(byte, base=10) <= 255 for byte in host.split('.')])
|
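Review note: a minimal, self-contained sketch of how the `require` flag and the IPv4 octet check in the new `validators.py` seem to interact. The two regular expressions are simplified assumptions, not the patterns defined in `rfc3986.misc`. `host_is_valid` is a hypothetical helper added for illustration. Unlike the diff's `is_valid`, this version coerces the match object to `bool`.

```python
import re

# Simplified stand-ins; the real matchers live in rfc3986.misc and are stricter.
SCHEME_RE = re.compile(r'^[A-Za-z][A-Za-z0-9+.-]*$')
IPV4_LIKE_RE = re.compile(r'^[0-9]{1,3}(\.[0-9]{1,3}){3}$')


def is_valid(value, matcher, require):
    """Return True if value is acceptable under the given matcher."""
    if require:
        # A required component must be present and must match.
        return value is not None and bool(matcher.match(value))
    # An optional component may be absent; if present it must match.
    return value is None or bool(matcher.match(value))


def valid_ipv4_host_address(host):
    """Check that every dotted-decimal octet is in the range 0..255."""
    return all(0 <= int(octet, base=10) <= 255 for octet in host.split('.'))


def host_is_valid(host):
    """Hypothetical helper: range-check only hosts that look like IPv4."""
    if host is not None and IPV4_LIKE_RE.match(host):
        return valid_ipv4_host_address(host)
    return True
```

The octet check is only reached for hosts that already look like dotted decimals, so `int(octet, base=10)` is not expected to see non-digit input in this sketch.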
@ -7,16 +7,30 @@ import pytest
|
||||
SNOWMAN = b'\xe2\x98\x83'
|
||||
|
||||
valid_hosts = [
|
||||
'[21DA:00D3:0000:2F3B:02AA:00FF:FE28:9C5A]', '[::1]',
|
||||
'[21DA:D3:0:2F3B:2AA:FF:FE28:9C5A]', '[FE80::2AA:FF:FE9A:4CA2]',
|
||||
'[FF02::2]', '[FF02:3::5]', '[FF02:0:0:0:0:0:0:2]',
|
||||
'[FF02:30:0:0:0:0:0:5]', '127.0.0.1', 'www.example.com', 'localhost',
|
||||
'[21DA:00D3:0000:2F3B:02AA:00FF:FE28:9C5A]',
|
||||
'[::1]',
|
||||
'[::1%25lo]', # With ZoneID
|
||||
'[FF02:0:0:0:0:0:0:2%25en01]', # With ZoneID
|
||||
'[FF02:30:0:0:0:0:0:5%25en1]', # With ZoneID
|
||||
'[21DA:D3:0:2F3B:2AA:FF:FE28:9C5A]',
|
||||
'[FE80::2AA:FF:FE9A:4CA2]',
|
||||
'[FF02::2]',
|
||||
'[FF02:3::5]',
|
||||
'[FF02:0:0:0:0:0:0:2]',
|
||||
'[FF02:30:0:0:0:0:0:5]',
|
||||
'127.0.0.1',
|
||||
'www.example.com',
|
||||
'localhost',
|
||||
'http-bin.org',
|
||||
]
|
||||
|
||||
invalid_hosts = [
|
||||
'[FF02::3::5]', # IPv6 can only have one ::
|
||||
'[FADF:01]', # Not properly compacted (missing a :)
|
||||
'[FADF:01%en0]', # Not properly compacted (missing a :), Invalid ZoneID
|
||||
'[FADF::01%en0]', # Invalid ZoneID separator
|
||||
'[FADF::01%]', # Invalid ZoneID separator and no ZoneID
|
||||
'[FADF::01%25]', # Missing ZoneID
|
||||
'localhost:80:80:80', # Too many ports
|
||||
'256.256.256.256', # Invalid IPv4 Address
|
||||
SNOWMAN.decode('utf-8')
|
||||
|
165
tests/test_builder.py
Normal file
@ -0,0 +1,165 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
# Copyright (c) 2017 Ian Cordasco
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
||||
# implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
"""Module containing the tests for the URIBuilder object."""
|
||||
import pytest
|
||||
|
||||
from rfc3986 import builder
|
||||
|
||||
|
||||
def test_builder_default():
|
||||
"""Verify the default values."""
|
||||
uribuilder = builder.URIBuilder()
|
||||
assert uribuilder.scheme is None
|
||||
assert uribuilder.userinfo is None
|
||||
assert uribuilder.host is None
|
||||
assert uribuilder.port is None
|
||||
assert uribuilder.path is None
|
||||
assert uribuilder.query is None
|
||||
assert uribuilder.fragment is None
|
||||
|
||||
|
||||
def test_repr():
|
||||
"""Verify our repr looks like our class."""
|
||||
uribuilder = builder.URIBuilder()
|
||||
assert repr(uribuilder).startswith('URIBuilder(scheme=None')
|
||||
|
||||
|
||||
@pytest.mark.parametrize('scheme', [
|
||||
'https',
|
||||
'hTTps',
|
||||
'Https',
|
||||
'HtTpS',
|
||||
'HTTPS',
|
||||
])
|
||||
def test_add_scheme(scheme):
|
||||
"""Verify schemes are normalized when added."""
|
||||
uribuilder = builder.URIBuilder().add_scheme(scheme)
|
||||
assert uribuilder.scheme == 'https'
|
||||
|
||||
|
||||
@pytest.mark.parametrize('username, password, userinfo', [
|
||||
('user', 'pass', 'user:pass'),
|
||||
('user', None, 'user'),
|
||||
('user@domain.com', 'password', 'user%40domain.com:password'),
|
||||
('user', 'pass:word', 'user:pass%3Aword'),
|
||||
])
|
||||
def test_add_credentials(username, password, userinfo):
|
||||
"""Verify we normalize usernames and passwords."""
|
||||
uribuilder = builder.URIBuilder().add_credentials(username, password)
|
||||
assert uribuilder.userinfo == userinfo
|
||||
|
||||
|
||||
def test_add_credentials_requires_username():
|
||||
"""Verify one needs a username to add credentials."""
|
||||
with pytest.raises(ValueError):
|
||||
builder.URIBuilder().add_credentials(None, None)
|
||||
|
||||
|
||||
@pytest.mark.parametrize('hostname', [
|
||||
'google.com',
|
||||
'GOOGLE.COM',
|
||||
'gOOgLe.COM',
|
||||
'goOgLE.com',
|
||||
])
|
||||
def test_add_host(hostname):
|
||||
"""Verify we normalize hostnames in add_host."""
|
||||
uribuilder = builder.URIBuilder().add_host(hostname)
|
||||
assert uribuilder.host == 'google.com'
|
||||
|
||||
|
||||
@pytest.mark.parametrize('port', [
|
||||
-100,
|
||||
'-100',
|
||||
-1,
|
||||
'-1',
|
||||
65536,
|
||||
'65536',
|
||||
1000000,
|
||||
'1000000',
|
||||
'',
|
||||
'abc',
|
||||
'0b10',
|
||||
])
|
||||
def test_add_invalid_port(port):
|
||||
"""Verify we raise a ValueError for invalid ports."""
|
||||
with pytest.raises(ValueError):
|
||||
builder.URIBuilder().add_port(port)
|
||||
|
||||
|
||||
@pytest.mark.parametrize('port, expected', [
|
||||
(0, '0'),
|
||||
('0', '0'),
|
||||
(1, '1'),
|
||||
('1', '1'),
|
||||
(22, '22'),
|
||||
('22', '22'),
|
||||
(80, '80'),
|
||||
('80', '80'),
|
||||
(443, '443'),
|
||||
('443', '443'),
|
||||
(65535, '65535'),
|
||||
('65535', '65535'),
|
||||
])
|
||||
def test_add_port(port, expected):
|
||||
"""Verify we normalize our port."""
|
||||
uribuilder = builder.URIBuilder().add_port(port)
|
||||
assert uribuilder.port == expected
|
||||
|
||||
|
||||
@pytest.mark.parametrize('path', [
|
||||
'sigmavirus24/rfc3986',
|
||||
'/sigmavirus24/rfc3986',
|
||||
])
|
||||
def test_add_path(path):
|
||||
"""Verify we normalize our path value."""
|
||||
uribuilder = builder.URIBuilder().add_path(path)
|
||||
assert uribuilder.path == '/sigmavirus24/rfc3986'
|
||||
|
||||
|
||||
@pytest.mark.parametrize('query_items, expected', [
|
||||
({'a': 'b c'}, 'a=b+c'),
|
||||
({'a': 'b+c'}, 'a=b%2Bc'),
|
||||
([('a', 'b c')], 'a=b+c'),
|
||||
([('a', 'b+c')], 'a=b%2Bc'),
|
||||
([('a', 'b'), ('c', 'd')], 'a=b&c=d'),
|
||||
([('a', 'b'), ('username', '@d')], 'a=b&username=%40d'),
|
||||
])
|
||||
def test_add_query_from(query_items, expected):
|
||||
"""Verify the behaviour of add_query_from."""
|
||||
uribuilder = builder.URIBuilder().add_query_from(query_items)
|
||||
assert uribuilder.query == expected
|
||||
|
||||
|
||||
def test_add_query():
|
||||
"""Verify we do not modify the provided query string."""
|
||||
uribuilder = builder.URIBuilder().add_query('username=@foo')
|
||||
assert uribuilder.query == 'username=@foo'
|
||||
|
||||
|
||||
def test_add_fragment():
|
||||
"""Verify our handling of fragments."""
|
||||
uribuilder = builder.URIBuilder().add_fragment('section-2.5.1')
|
||||
assert uribuilder.fragment == 'section-2.5.1'
|
||||
|
||||
|
||||
def test_finalize():
|
||||
"""Verify the whole thing."""
|
||||
uri = builder.URIBuilder().add_scheme('https').add_credentials(
|
||||
'sigmavirus24', 'not-my-re@l-password'
|
||||
).add_host('github.com').add_path('sigmavirus24/rfc3986').finalize(
|
||||
).unsplit()
|
||||
expected = ('https://sigmavirus24:not-my-re%40l-password@github.com/'
|
||||
'sigmavirus24/rfc3986')
|
||||
assert expected == uri
|
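Review note: the `add_port` implementation is not part of this excerpt, so the following is only a sketch of port handling that would satisfy the parametrized cases in `test_add_port` and `test_add_invalid_port` above. `normalize_port` is a hypothetical name, not a function in rfc3986.

```python
def normalize_port(port):
    """Hypothetical helper: return the port as a decimal string.

    Raises ValueError for anything that is not a base-10 integer in the
    range 0..65535.
    """
    # int() raises ValueError for '', 'abc' and '0b10' when base=10 is given.
    port_int = int(str(port), base=10)
    if not 0 <= port_int <= 65535:
        raise ValueError(
            'port must be between 0 and 65535, got {!r}'.format(port)
        )
    return str(port_int)
```

Converting through `str` first lets the same code path accept both integers and strings, which is the mix of inputs the tests exercise.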
196
tests/test_validators.py
Normal file
@ -0,0 +1,196 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
"""Tests for the validators module."""
|
||||
import rfc3986
|
||||
from rfc3986 import exceptions
|
||||
from rfc3986 import validators
|
||||
|
||||
import pytest
|
||||
|
||||
|
||||
def test_defaults():
|
||||
"""Verify the default Validator settings."""
|
||||
validator = validators.Validator()
|
||||
|
||||
assert validator.required_components == {
|
||||
c: False for c in validator.COMPONENT_NAMES
|
||||
}
|
||||
assert validator.allow_password is True
|
||||
assert validator.allowed_schemes == set()
|
||||
assert validator.allowed_hosts == set()
|
||||
assert validator.allowed_ports == set()
|
||||
|
||||
|
||||
def test_allowing_schemes():
|
||||
"""Verify the ability to select schemes to be allowed."""
|
||||
validator = validators.Validator().allow_schemes('http', 'https')
|
||||
|
||||
assert 'http' in validator.allowed_schemes
|
||||
assert 'https' in validator.allowed_schemes
|
||||
|
||||
|
||||
def test_allowing_hosts():
|
||||
"""Verify the ability to select hosts to be allowed."""
|
||||
validator = validators.Validator().allow_hosts(
|
||||
'pypi.python.org', 'pypi.org',
|
||||
)
|
||||
|
||||
assert 'pypi.python.org' in validator.allowed_hosts
|
||||
assert 'pypi.org' in validator.allowed_hosts
|
||||
|
||||
|
||||
def test_allowing_ports():
|
||||
"""Verify the ability select ports to be allowed."""
|
||||
validator = validators.Validator().allow_ports('80', '100')
|
||||
|
||||
assert '80' in validator.allowed_ports
|
||||
assert '100' in validator.allowed_ports
|
||||
|
||||
|
||||
def test_requiring_invalid_component():
|
||||
"""Verify that we validate required component names."""
|
||||
with pytest.raises(ValueError):
|
||||
validators.Validator().require_presence_of('frob')
|
||||
|
||||
|
||||
def test_use_of_password():
|
||||
"""Verify the behaviour of {forbid,allow}_use_of_password."""
|
||||
validator = validators.Validator()
|
||||
assert validator.allow_password is True
|
||||
|
||||
validator.forbid_use_of_password()
|
||||
assert validator.allow_password is False
|
||||
|
||||
validator.allow_use_of_password()
|
||||
assert validator.allow_password is True
|
||||
|
||||
|
||||
@pytest.mark.parametrize('uri', [
|
||||
rfc3986.uri_reference('https://user:password@github.com'),
|
||||
rfc3986.uri_reference('https://user:password@github.com/path'),
|
||||
rfc3986.uri_reference('https://user:password@github.com/path?query'),
|
||||
rfc3986.uri_reference('https://user:password@github.com/path?query#frag'),
|
||||
rfc3986.uri_reference('//user:password@github.com'),
|
||||
])
|
||||
def test_forbidden_passwords(uri):
|
||||
"""Verify that passwords are disallowed."""
|
||||
validator = validators.Validator().forbid_use_of_password()
|
||||
with pytest.raises(exceptions.PasswordForbidden):
|
||||
validator.validate(uri)
|
||||
|
||||
|
||||
@pytest.mark.parametrize('uri', [
|
||||
rfc3986.uri_reference('https://user@github.com'),
|
||||
rfc3986.uri_reference('https://user@github.com/path'),
|
||||
rfc3986.uri_reference('https://user@github.com/path?query'),
|
||||
rfc3986.uri_reference('https://user@github.com/path?query#frag'),
|
||||
rfc3986.uri_reference('//user@github.com'),
|
||||
rfc3986.uri_reference('//github.com'),
|
||||
rfc3986.uri_reference('https://github.com'),
|
||||
])
|
||||
def test_passwordless_uris_pass_validation(uri):
|
||||
"""Verify password-less URLs validate properly."""
|
||||
validator = validators.Validator().forbid_use_of_password()
|
||||
validator.validate(uri)
|
||||
|
||||
|
||||
@pytest.mark.parametrize('uri', [
    rfc3986.uri_reference('https://'),
    rfc3986.uri_reference('/path/to/resource'),
])
def test_missing_host_component(uri):
    """Verify that missing host components cause errors."""
    validator = validators.Validator().require_presence_of('host')
    with pytest.raises(exceptions.MissingComponentError):
        validator.validate(uri)


@pytest.mark.parametrize('uri', [
    rfc3986.uri_reference('https://'),
    rfc3986.uri_reference('//google.com'),
    rfc3986.uri_reference('//google.com?query=value'),
    rfc3986.uri_reference('//google.com#fragment'),
    rfc3986.uri_reference('https://google.com'),
    rfc3986.uri_reference('https://google.com#fragment'),
    rfc3986.uri_reference('https://google.com?query=value'),
])
def test_missing_path_component(uri):
    """Verify that missing path components cause errors."""
    validator = validators.Validator().require_presence_of('path')
    with pytest.raises(exceptions.MissingComponentError):
        validator.validate(uri)


@pytest.mark.parametrize('uri', [
    rfc3986.uri_reference('//google.com'),
    rfc3986.uri_reference('//google.com?query=value'),
    rfc3986.uri_reference('//google.com#fragment'),
])
def test_multiple_missing_components(uri):
    """Verify that multiple missing components are caught."""
    validator = validators.Validator().require_presence_of('scheme', 'path')
    with pytest.raises(exceptions.MissingComponentError) as captured_exc:
        validator.validate(uri)
    exception = captured_exc.value
    assert 2 == len(exception.args[-1])


@pytest.mark.parametrize('uri', [
    rfc3986.uri_reference('smtp://'),
    rfc3986.uri_reference('telnet://'),
])
def test_ensure_uri_has_a_scheme(uri):
    """Verify validation with allowed schemes."""
    validator = validators.Validator().allow_schemes('https', 'http')
    with pytest.raises(exceptions.UnpermittedComponentError):
        validator.validate(uri)


@pytest.mark.parametrize('uri, failed_component', [
    (rfc3986.uri_reference('git://github.com'), 'scheme'),
    (rfc3986.uri_reference('http://github.com'), 'scheme'),
    (rfc3986.uri_reference('ssh://gitlab.com'), 'host'),
    (rfc3986.uri_reference('https://gitlab.com'), 'host'),
])
def test_allowed_hosts_and_schemes(uri, failed_component):
    """Verify each of these fails."""
    validator = validators.Validator().allow_schemes(
        'https', 'ssh',
    ).allow_hosts(
        'github.com', 'git.openstack.org',
    )
    with pytest.raises(exceptions.UnpermittedComponentError) as caught_exc:
        validator.validate(uri)

    exc = caught_exc.value
    assert exc.component_name == failed_component


@pytest.mark.parametrize('uri', [
    rfc3986.uri_reference('https://github.com/sigmavirus24'),
    rfc3986.uri_reference('ssh://github.com/sigmavirus24'),
    rfc3986.uri_reference('ssh://ssh@github.com:22/sigmavirus24'),
    rfc3986.uri_reference('https://github.com:443/sigmavirus24'),
    rfc3986.uri_reference('https://gitlab.com/sigmavirus24'),
    rfc3986.uri_reference('ssh://gitlab.com/sigmavirus24'),
    rfc3986.uri_reference('ssh://ssh@gitlab.com:22/sigmavirus24'),
    rfc3986.uri_reference('https://gitlab.com:443/sigmavirus24'),
    rfc3986.uri_reference('https://bitbucket.org/sigmavirus24'),
    rfc3986.uri_reference('ssh://bitbucket.org/sigmavirus24'),
    rfc3986.uri_reference('ssh://ssh@bitbucket.org:22/sigmavirus24'),
    rfc3986.uri_reference('https://bitbucket.org:443/sigmavirus24'),
    rfc3986.uri_reference('https://git.openstack.org/sigmavirus24'),
    rfc3986.uri_reference('ssh://git.openstack.org/sigmavirus24'),
    rfc3986.uri_reference('ssh://ssh@git.openstack.org:22/sigmavirus24'),
    rfc3986.uri_reference('https://git.openstack.org:443/sigmavirus24'),
])
def test_successful_complex_validation(uri):
    """Verify we do not raise ValidationErrors for good URIs."""
    validators.Validator().allow_schemes(
        'https', 'ssh',
    ).allow_hosts(
        'github.com', 'bitbucket.org', 'gitlab.com', 'git.openstack.org',
    ).allow_ports(
        '22', '443',
    ).require_presence_of(
        'scheme', 'host', 'path',
    ).validate(uri)
41
tox.ini
@ -1,5 +1,5 @
[tox]
envlist = py26,py27,py32,py33,py34,pypy,{py27,py34}-flake8
envlist = py27,py33,py34,py35,py36,pypy,flake8

[testenv]
pip_pre = False
@ -12,31 +12,38 @@ commands =
deps = {[testenv]deps}
commands = py.test {posargs}

[testenv:py27-flake8]
basepython = python2.7
[testenv:flake8]
basepython = python3
deps =
    flake8
commands = flake8 {posargs} rfc3986
    flake8-docstrings
    flake8-import-order
commands = flake8 {posargs} src/rfc3986

[testenv:py34-flake8]
basepython = python3.4
[testenv:venv]
commands = {posargs}

[testenv:build]
deps =
    flake8
commands = flake8 {posargs} rfc3986
    wheel
commands =
    python setup.py sdist bdist_wheel

[testenv:release]
deps =
    wheel
    {[testenv:build]deps}
    twine>=1.4.0
commands =
    python setup.py sdist bdist_wheel
    {[testenv:build]commands}
    twine upload {posargs:--skip-existing dist/*}

[testenv:docs]
deps =
    sphinx>=1.3.0
    sphinx-prompt
commands =
    sphinx-build -E -c docs -b html docs/ docs/_build/html
    sphinx-build -WE -c docs/source/ -b html docs/source/ docs/build/html
    sphinx-build -WE -c docs/source/ -b doctest docs/source/ docs/build/html

[testenv:readme]
deps =
@ -47,3 +54,15 @@ commands =
[pytest]
addopts = -q
norecursedirs = *.egg .git .* _*

[flake8]
exclude =
    .tox,
    .git,
    __pycache__,
    *.pyc,
    *.egg-info,
    .cache,
    .eggs
max-complexity = 10
import-order-style = google