Internet functionality

Working with URLs

urllib is a package in the Python standard library that provides modules for working with URLs, such as urllib.parse and urllib.request.

Example of parsing a URL using urllib.parse:

import urllib.parse
sample_url = "https://www.google.com/search?q=python"
parsed_url = urllib.parse.urlparse(sample_url)
print(parsed_url)
print(parsed_url.scheme)
print(parsed_url.hostname)
print(parsed_url.path)

The result is like this:

ParseResult(scheme='https', netloc='www.google.com', path='/search', params='', query='q=python', fragment='')
https
www.google.com
/search

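quote() replaces special characters in a string with their percent-encoded (URL-encoded) equivalents. quote_plus() does the same but also encodes spaces as +, the form used in query strings. For example,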

sample_string = "Hello El Niño!"
print(urllib.parse.quote(sample_string))
print(urllib.parse.quote_plus(sample_string))

The result is like this:

Hello%20El%20Ni%C3%B1o%21
Hello+El+Ni%C3%B1o%21

urlencode() converts a dictionary to a URL query string. For example,
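A minimal sketch of urlencode(), assuming a hypothetical dictionary of search parameters and an illustrative example.com address (neither comes from the original text). It also shows parse_qs(), which reverses the conversion:

```python
import urllib.parse

# Hypothetical query parameters, chosen only for illustration.
query_params = {"q": "python urllib", "page": 2, "lang": "en"}

# urlencode() percent-encodes keys and values and joins them with "&".
# Spaces become "+" because it uses quote_plus() by default.
query_string = urllib.parse.urlencode(query_params)
print(query_string)  # q=python+urllib&page=2&lang=en

# Append the query string to a base URL (example.com is a placeholder).
full_url = "https://www.example.com/search?" + query_string
print(full_url)

# parse_qs() goes the other way; every value comes back as a list of strings.
print(urllib.parse.parse_qs(query_string))
```

Non-string values such as the integer 2 are converted with str() before encoding, so they come back from parse_qs() as strings.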

The result is like this:

Retrieving internet data

urllib.request provides functions and classes for retrieving data from the web. For example, we can use it to open a URL, read the response, and send data to a URL (for instance with a POST request).
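As a sketch of the sending side, the snippet below builds a POST request with urllib.request.Request without actually transmitting it. The form fields and the example.com address are assumptions made for illustration only:

```python
import urllib.parse
import urllib.request

# Hypothetical form fields; the body of a request must be bytes.
form_fields = {"name": "Ada", "topic": "urllib"}
payload = urllib.parse.urlencode(form_fields).encode("utf-8")

# Build the request object. Nothing is sent over the network yet.
request = urllib.request.Request(
    "https://www.example.com/submit",  # placeholder address
    data=payload,
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)

print(request.get_method())  # POST
print(request.full_url)      # https://www.example.com/submit
print(request.data)          # b'name=Ada&topic=urllib'

# Sending it would need network access and a server that accepts the data:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status)
```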

Example of creating a request to retrieve data using urllib.request.urlopen():
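A minimal sketch of such a request, wrapped in a hypothetical helper named fetch_text() so the pieces are easy to see. The helper name, the timeout value, and the example.com address are assumptions, and the actual call is left commented out because it needs network access:

```python
import urllib.request


def fetch_text(url, timeout=10):
    """Open url and return (status, content type, decoded body)."""
    # urlopen() returns a response object that works as a context manager.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        status = response.status  # e.g. 200 for a successful HTTP request
        content_type = response.headers.get("Content-Type")
        # Use the charset declared by the server, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        body = response.read().decode(charset, errors="replace")
    return status, content_type, body


# Example call (requires network access; the URL is only an illustration):
# status, content_type, body = fetch_text("https://www.example.com/")
# print(status)
# print(content_type)
# print(body[:200])
```

read() returns bytes, so the body is decoded before it is treated as text. Network failures raise urllib.error.URLError, which real code would normally catch.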

Printed result is like this:

Parsing HTML

We can parse HTML data with the HTMLParser class from the html.parser module. For example,
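A minimal sketch, assuming a hypothetical subclass named LinkCollector and a small hand-written HTML snippet. HTMLParser is used by subclassing it and overriding the handler methods for the events of interest:

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect link targets and visible text from an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)

    def handle_data(self, data):
        # Called for text between tags; skip whitespace-only chunks.
        text = data.strip()
        if text:
            self.text_parts.append(text)


sample_html = (
    "<html><body><h1>Title</h1>"
    '<p>See <a href="https://www.python.org/">Python</a>.</p>'
    "</body></html>"
)

parser = LinkCollector()
parser.feed(sample_html)
parser.close()

print(parser.links)       # ['https://www.python.org/']
print(parser.text_parts)  # ['Title', 'See', 'Python', '.']
```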

Using JSON

JSON stands for JavaScript Object Notation. It is a lightweight data-interchange format. json is a module in the Python standard library that provides functions for working with JSON data.

Example:
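A minimal sketch, assuming a small hypothetical JSON document. json.loads() turns JSON text into Python objects, and json.dumps() turns Python objects back into JSON text:

```python
import json

# Hypothetical JSON text, used only for illustration.
json_text = '{"name": "Ada", "languages": ["Python", "C"], "active": true, "score": null}'

# Decode: objects become dicts, arrays become lists,
# true/false become True/False, and null becomes None.
data = json.loads(json_text)
print(data["name"])          # Ada
print(data["languages"][0])  # Python
print(data["active"], data["score"])  # True None

# Modify the Python object, then encode it back to JSON text.
data["languages"].append("Rust")
encoded = json.dumps(data, indent=2, sort_keys=True)
print(encoded)
```

For files, json.load() and json.dump() do the same work with a file object instead of a string.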
