Building URIs
Constructing URLs often seems simple. There are some problems with concatenating strings to build a URL:
Certain parts of the URL disallow certain characters
Formatting some parts of the URL is tricky and doing it manually isn’t fun
To make the experience better rfc3986 provides the
URIBuilder class to generate valid
URIReference instances. The
URIBuilder class will handle ensuring that each
component is normalized and safe for real world use.
Example Usage
Note
All of the methods on a URIBuilder are
chainable (except finalize() and
geturl() as neither returns a
URIBuilder).
Building From Scratch
Let’s build a basic URL with just a scheme and host. First we create an
instance of URIBuilder. Then we call
add_scheme() and
add_host() with the scheme and host
we want to include in the URL. Then we convert our builder object into
a URIReference and call
unsplit().
>>> from rfc3986 import builder
>>> print(builder.URIBuilder().add_scheme(
... 'https'
... ).add_host(
... 'github.com'
... ).finalize().unsplit())
https://github.com
Replacing Components of a URI
It is possible to update an existing URI by constructing a builder from an
instance of URIReference or a textual representation:
>>> from rfc3986 import builder
>>> print(builder.URIBuilder.from_uri("http://github.com").add_scheme(
... 'https'
... ).finalize().unsplit())
https://github.com
The Builder is Immutable
Each time you invoke a method, you get a new instance of a
URIBuilder class so you can build several different
URLs from one base instance.
>>> from rfc3986 import builder
>>> github_builder = builder.URIBuilder().add_scheme(
... 'https'
... ).add_host(
... 'api.github.com'
... )
>>> print(github_builder.add_path(
... '/users/sigmavirus24'
... ).finalize().unsplit())
https://api.github.com/users/sigmavirus24
>>> print(github_builder.add_path(
... '/repos/sigmavirus24/rfc3986'
... ).finalize().unsplit())
https://api.github.com/repos/sigmavirus24/rfc3986
Convenient Path Management
Because our builder is immutable, one could use the
URIBuilder class to build a class to make HTTP
Requests that used the provided path to extend the original one.
>>> from rfc3986 import builder
>>> github_builder = builder.URIBuilder().add_scheme(
... 'https'
... ).add_host(
... 'api.github.com'
... ).add_path(
... '/users'
... )
>>> print(github_builder.extend_path("sigmavirus24").geturl())
https://api.github.com/users/sigmavirus24
>>> print(github_builder.extend_path("lukasa").geturl())
https://api.github.com/users/lukasa
Convenient Credential Handling
rfc3986 makes adding authentication credentials convenient. It takes care of
making the credentials URL safe. There are some characters someone might want
to include in a URL that are not safe for the authority component of a URL.
>>> from rfc3986 import builder
>>> print(builder.URIBuilder().add_scheme(
... 'https'
... ).add_host(
... 'api.github.com'
... ).add_credentials(
... username='us3r',
... password='p@ssw0rd',
... ).finalize().unsplit())
https://us3r:p%40ssw0rd@api.github.com
Managing Query String Parameters
Further, rfc3986 attempts to simplify the process of adding query parameters
to a URL. For example, if we were using Elasticsearch, we might do something
like:
>>> from rfc3986 import builder
>>> print(builder.URIBuilder().add_scheme(
... 'https'
... ).add_host(
... 'search.example.com'
... ).add_path(
... '_search'
... ).add_query_from(
... [('q', 'repo:sigmavirus24/rfc3986'), ('sort', 'created_at:asc')]
... ).finalize().unsplit())
https://search.example.com/_search?q=repo%3Asigmavirus24%2Frfc3986&sort=created_at%3Aasc
If one also had an existing URL with query string that we merely wanted to
append to, we can also do that with rfc3986.
>>> from rfc3986 import builder
>>> print(builder.URIBuilder().from_uri(
... 'https://search.example.com/_search?q=repo%3Asigmavirus24%2Frfc3986'
... ).extend_query_with(
... [('sort', 'created_at:asc')]
... ).finalize().unsplit())
https://search.example.com/_search?q=repo%3Asigmavirus24%2Frfc3986&sort=created_at%3Aasc
Adding Fragments
Finally, we provide a way to add a fragment to a URL. Let’s build up a URL to view the section of the RFC that refers to fragments:
>>> from rfc3986 import builder
>>> print(builder.URIBuilder().add_scheme(
... 'https'
... ).add_host(
... 'tools.ietf.org'
... ).add_path(
... '/html/rfc3986'
... ).add_fragment(
... 'section-3.5'
... ).finalize().unsplit())
https://tools.ietf.org/html/rfc3986#section-3.5