API reference

See Modules for a short description of each module. For a full listing of the contents of all modules, see the Module content overview.

Modules

algorithms
click
configuration
data_frame
debug
dict
difflib Module difflib – helpers for computing deltas between objects.
exceptions Exception classes and utilities.
hashlib hashlib extensions.
iterable
logging Logging utilities.
multi_dict Multi-dict class, a dict which maps keys to one or more values.
numpy numpy extensions.
observable Observable collections.
parse File parsers
path pathlib extensions.
pkg_resources pkg_resources extensions.
series
set Set utilities.
sqlalchemy
test
various Various utilities.
write File writers

Module content overview

algorithms

multi_way_partitioning
toset_from_tosets

click

option
password_option

configuration

ConfigurationLoader

data_frame

assert_equals
equals
replace_na_with_none
split_array_like

debug

pretty_memory_info

dict

DefaultDict
pretty_print_head
invert

difflib

line_diff

exceptions

InvalidOperationError Operation is illegal/invalid in the current state.
UserException Exceptional user input/action.
exc_info Get exc_info tuple from exception.

hashlib

base85_digest Get base 85 encoded digest of hash.

iterable

is_sorted

logging

configure Configure root logger to log INFO to stderr and DEBUG to log file.
set_level Temporarily change log level of logger.

multi_dict

MultiDict A multi-dict, a dict which maps keys to one or more values.

numpy

ArrayLike Type representing any value that can be passed to numpy.array to create an array.

observable

Set Observable set

parse

csv Parse CSV file.
tsv

path

TemporaryDirectory An extension to tempfile.TemporaryDirectory.
chmod Change file mode bits.
hash Hash file or directory.
is_descendant Get whether path is descendant of other path.
is_descendant_or_self Get whether path is descendant of other path or is equivalent to it.
remove Remove file or directory (recursively), if it exists.
sorted_lines Lines of file, sorted.

pkg_resources

resource_copy Copy file/dir resource to destination.
resource_path Like resource_filename but return a Path instead.

series

assert_equals
equals
invert
split

set

merge_by_overlap Of a list of sets, merge those that overlap, in place.

sqlalchemy

log_sql
pretty_sql

test

assert_dir_equals
assert_dir_unchanged
assert_file_equals
assert_file_mode
assert_lines_equal
assert_matches
assert_search_matches
assert_text_contains
assert_text_equals
assert_xlsx_equals
assert_xml_equals
reset_loggers
temp_dir_cwd

various

join_multiline Join multiline text into a single line.

write

csv Write CSV file.
tsv

pytil.algorithms

pytil.click

pytil.configuration

pytil.data_frame

pytil.debug

pytil.dict

pytil.difflib

pytil.exceptions

Exception classes and utilities.

exception pytil.exceptions.InvalidOperationError[source]

Bases: Exception

Operation is illegal/invalid in the current state.

If an invalid argument was given, use ValueError instead. An operation can be a method/function call or getting/setting an attribute.

exception pytil.exceptions.UserException[source]

Bases: Exception

Exceptional user input/action.

pytil.exceptions.exc_info(exception)[source]

Get exc_info tuple from exception.

See also

sys.exc_info()
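
To illustrate what such a tuple looks like, the sketch below rebuilds one from a caught exception using only attributes that every exception carries. The helper name exc_info_from is hypothetical, and the body is an assumption about the behaviour rather than pytil's actual implementation.

```python
def exc_info_from(exception):
    """Build a (type, value, traceback) tuple like sys.exc_info() returns.

    Hypothetical stand-in for illustration; pytil's own code may differ.
    """
    return (type(exception), exception, exception.__traceback__)


try:
    raise ValueError("bad input")
except ValueError as ex:
    # The tuple can be kept and passed on after the handler has finished,
    # e.g. to a logging call that accepts an exc_info argument.
    info = exc_info_from(ex)
```

A tuple in this shape is what the standard logging functions accept through their exc_info keyword.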

pytil.hashlib

hashlib extensions.

pytil.hashlib.base85_digest(hash_)[source]

Get base 85 encoded digest of hash.

Parameters:hash_ (hash object) – E.g. as returned by hashlib.sha512().
Returns:Base 85 encoded digest.
Return type:str
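
A minimal sketch of the idea, assuming the encoding is the one provided by the standard library's base64.b85encode. The function name base85_digest_sketch is hypothetical and the real implementation may use a different base 85 alphabet.

```python
import base64
import hashlib


def base85_digest_sketch(hash_):
    """Return the digest of a hashlib hash object as base 85 text.

    Hypothetical stand-in: assumes the alphabet of base64.b85encode.
    """
    return base64.b85encode(hash_.digest()).decode("ascii")


digest = base85_digest_sketch(hashlib.sha512(b"hello"))
# Base 85 turns every 4 bytes into 5 characters, so the 64-byte
# SHA-512 digest becomes an 80-character string.
```

Compared to a hex digest (128 characters for SHA-512), the base 85 form is noticeably shorter, which is presumably the motivation.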

pytil.iterable

pytil.logging

Logging utilities.

pytil.logging.configure(log_file)[source]

Configure root logger to log INFO to stderr and DEBUG to log file.

The log file is appended to. Stderr uses a terse format, while the log file uses a verbose unambiguous format.

Root level is set to INFO.

Parameters:log_file (Path) – File to log to.
Returns:Stderr and file handler respectively.
Return type:Tuple[StreamHandler, FileHandler]
pytil.logging.set_level(logger, level)[source]

Temporarily change log level of logger.

Parameters:
  • logger (str or Logger) – Logger name or logger whose log level to change.
  • level (int) – Log level to set.

Examples

>>> with set_level('sqlalchemy.engine', logging.INFO):
...     pass  # sqlalchemy log level is set to INFO in this block
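
The temporary nature of the change can be sketched with a context manager that remembers the previous level and restores it on exit. The name set_level_sketch and the logger name used below are hypothetical; this is not necessarily how pytil implements it.

```python
import logging
from contextlib import contextmanager


@contextmanager
def set_level_sketch(logger, level):
    """Temporarily set a logger's level, restoring the old one on exit.

    Hypothetical stand-in; accepts a logger name or a Logger instance.
    """
    if isinstance(logger, str):
        logger = logging.getLogger(logger)
    original = logger.level
    logger.setLevel(level)
    try:
        yield
    finally:
        # Restore the level even if the block raised
        logger.setLevel(original)


demo_logger = logging.getLogger("set_level_demo")
demo_logger.setLevel(logging.WARNING)
with set_level_sketch("set_level_demo", logging.DEBUG):
    level_inside = demo_logger.level
level_after = demo_logger.level
```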

pytil.multi_dict

Multi-dict class, a dict which maps keys to one or more values.

class pytil.multi_dict.MultiDict(dict_)[source]

Bases: object

A multi-dict, a dict which maps keys to one or more values.

Warning

This is very much a work in progress.

Parameters:dict_ (Dict[Hashable, Set[Hashable]]) – Dict to create a multi-dict view of. No copy is made. Editing the multi-dict edits the underlying dict. Changes to the underlying dict affect the multi-dict.

Notes

A multi-dict (or multi map) is a dict that maps each key to one or more values.

Multi-dicts provided by other libraries tend to be more feature rich, while this interface is far more conservative. Instead of wrapping an existing dict, they provide an interface that mixes regular and multi-dict access. Additionally, other multi-dicts map keys to lists of values, allowing a key to map to the same value multiple times.

dict

Get the underlying dict.

Returns:The underlying dict.
Return type:Dict[Hashable, Set[Hashable]]
invert()[source]

Invert by swapping each value with its key.

Returns:Inverted multi-dict.
Return type:MultiDict

Examples

>>> MultiDict({1: {1}, 2: {1,2,3}, 4: {}}).invert()
MultiDict({1: {1,2}, 2: {2}, 3: {2}})
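
The inversion can be sketched on a plain dict of sets, which is the structure MultiDict wraps. The function name invert_multi_dict is hypothetical and returns a plain dict rather than a MultiDict.

```python
def invert_multi_dict(dict_):
    """Invert a dict of sets by swapping each value with its key.

    Hypothetical stand-in operating on a plain Dict[Hashable, Set[Hashable]].
    """
    inverted = {}
    for key, values in dict_.items():
        for value in values:
            inverted.setdefault(value, set()).add(key)
    return inverted


result = invert_multi_dict({1: {1}, 2: {1, 2, 3}, 4: set()})
# Key 4 maps to no values, so it has no counterpart in the inverted dict.
```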

pytil.numpy

numpy extensions.

class pytil.numpy.ArrayLike[source]

Bases: typing.Generic

Type representing any value that can be passed to numpy.array to create an array.

ArrayLike[T] is an array-like with data type T.

pytil.observable

Observable collections.

class pytil.observable.Set(*args, **kwargs)[source]

Bases: set

Observable set

change_listeners

Get change listeners.

Each change listener is called immediately after a mutating operation that actually changed the set. E.g. redundant additions are ignored.

Returns:List of change listeners. Each change listener takes 2 arguments: the items that were added, and the items that were removed. Note: Items can be added and removed from a set in a single operation. When a listener raises, the change is rolled back without further notification.
Return type:List[Callable[[FrozenSet, FrozenSet], None]]
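
The listener contract described above can be sketched with a small set subclass. This is a simplified illustration covering only add() and discard(); the class name ObservableSetSketch is hypothetical and the real class handles every mutating operation.

```python
class ObservableSetSketch(set):
    """Minimal observable set covering only add() and discard().

    Hypothetical stand-in for illustration purposes.
    """

    def __init__(self, iterable=()):
        super().__init__(iterable)
        self.change_listeners = []

    def _notify(self, added, removed):
        for listener in self.change_listeners:
            listener(frozenset(added), frozenset(removed))

    def add(self, item):
        if item in self:
            return  # redundant addition: nothing changed, no notification
        super().add(item)
        try:
            self._notify({item}, set())
        except Exception:
            super().discard(item)  # roll back without further notification
            raise

    def discard(self, item):
        if item not in self:
            return  # nothing to remove, no notification
        super().discard(item)
        try:
            self._notify(set(), {item})
        except Exception:
            super().add(item)  # roll back without further notification
            raise


events = []
observed = ObservableSetSketch({1})
observed.change_listeners.append(
    lambda added, removed: events.append((added, removed))
)
observed.add(1)  # already present, so no listener is called
observed.add(2)
observed.discard(1)
```

After these calls, events holds one entry for the addition of 2 and one for the removal of 1, each as a pair of frozensets.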

pytil.parse

File parsers

class pytil.parse.csv(file, *args, **kwargs)[source]

Bases: object

Parse CSV file.

Parameters:
  • file (Path) –
  • *args – csv.DictReader args (except the f arg)
  • **kwargs – csv.DictReader args

Examples

for row in parse.csv(file):
    print(row['column'])

with parse.csv(file) as reader:
    for row in reader:
        pass
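
Since the extra arguments are passed on to csv.DictReader, each row is a dict keyed by the header row. The following sketch shows that behaviour with the standard library alone; the sample file and its column names are made up for illustration, and parse.csv itself is not used.

```python
import csv
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as temp_dir:
    file = Path(temp_dir) / "example.csv"  # hypothetical sample file
    file.write_text("column,other\na,1\nb,2\n")

    # csv.DictReader yields one dict per data row, keyed by the header row
    with file.open(newline="") as f:
        values = [row["column"] for row in csv.DictReader(f)]
```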

pytil.path

pathlib extensions.

pytil.path.TemporaryDirectory(suffix=None, prefix=None, dir=None, on_error='ignore')[source]

An extension to tempfile.TemporaryDirectory.

Unlike with tempfile, a Path is yielded on __enter__, not a str.

Parameters:
pytil.path.chmod(path, mode, operator='=', recursive=False)[source]

Change file mode bits.

When recursively chmodding a directory, executable bits in mode are ignored when applying to a regular file. E.g. chmod(path, mode=0o777, recursive=True) would apply mode=0o666 to regular files.

Symlinks are ignored.

Parameters:
  • path (Path) – Path to chmod.
  • mode (int) – Mode bits to apply, e.g. 0o777.
  • operator (str) –

    How to apply the mode bits to the file, one of:

    '='
    Replace mode with given mode.
    '+'
    Add to current mode.
    '-'
    Subtract from current mode.
  • recursive (bool) – Whether to chmod recursively.
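
The bit arithmetic behind the three operators, and the dropping of executable bits for regular files during a recursive chmod, can be sketched as a pure function. The helper combine_mode and its is_regular_file parameter are hypothetical; the function only computes the resulting mode and never touches the file system.

```python
import stat


def combine_mode(current, mode, operator="=", is_regular_file=False,
                 recursive=False):
    """Compute the mode bits that would result from applying `mode`.

    Hypothetical helper for illustration; pytil's own code may differ.
    """
    if recursive and is_regular_file:
        # Drop the executable bits, e.g. 0o777 becomes 0o666
        mode &= ~(stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    if operator == "=":
        return mode
    if operator == "+":
        return current | mode
    if operator == "-":
        return current & ~mode
    raise ValueError("operator must be one of '=', '+', '-'")


replaced = combine_mode(0o644, 0o777, "=", is_regular_file=True, recursive=True)
added = combine_mode(0o600, 0o044, "+")
subtracted = combine_mode(0o777, 0o022, "-")
```
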
pytil.path.hash(path, hash_function=<built-in function openssl_sha512>)[source]

Hash file or directory.

Parameters:
  • path (Path) – File or directory to hash.
  • hash_function (Callable[[], hash object]) – Function which creates a hashlib hash object when called. Defaults to hashlib.sha512.
Returns:

hashlib hash object of file/directory contents. File/directory stat data is ignored. The directory digest covers file/directory contents and their location relative to the directory being digested. The directory name itself is ignored.

Return type:

hash object
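
One way to obtain a digest with these properties is to feed the hash each entry's path relative to the root, followed by the file contents, in a stable order. The sketch below assumes exactly that layout; the name hash_path_sketch is hypothetical and its digests will not match those produced by pytil.

```python
import hashlib
import tempfile
from pathlib import Path


def hash_path_sketch(path, hash_function=hashlib.sha512):
    """Hash a file's contents, or a directory's contents and relative layout.

    Hypothetical stand-in; the byte layout fed to the hash is an assumption.
    """
    hash_ = hash_function()
    if path.is_dir():
        # Sort for a stable order and use paths relative to the root, so
        # the root directory's own name does not influence the digest.
        for child in sorted(path.rglob("*")):
            hash_.update(child.relative_to(path).as_posix().encode("utf-8"))
            if child.is_file():
                hash_.update(child.read_bytes())
    else:
        hash_.update(path.read_bytes())
    return hash_


with tempfile.TemporaryDirectory() as temp_dir:
    # Two directories with different names but identical contents
    for name in ("first", "second"):
        root = Path(temp_dir) / name
        (root / "sub").mkdir(parents=True)
        (root / "sub" / "data.txt").write_text("same contents")
    digest_first = hash_path_sketch(Path(temp_dir) / "first").hexdigest()
    digest_second = hash_path_sketch(Path(temp_dir) / "second").hexdigest()
```

Both directories yield the same digest here, because only the contents and their relative locations are hashed.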

pytil.path.is_descendant(descendant, ancestor)[source]

Get whether path is descendant of other path.

Uses the absolute path, so symlinks, … do not affect this.

Parameters:
  • descendant (Path) – Supposed descendant.
  • ancestor (Path) – Supposed ancestor.
Returns:

Whether descendant is indeed a descendant of ancestor.

Return type:

bool

See also

is_descendant_or_self()
Get whether path is descendant of other path or is equivalent to it

Examples

>>> is_descendant(Path('a'), Path('a'))
False
>>> is_descendant(Path('a/b'), Path('a'))
True
>>> is_descendant(Path('a'), Path('a/b'))
False
>>> is_descendant(Path('a'), Path('a/..'))
False
pytil.path.is_descendant_or_self(descendant, ancestor)[source]

Get whether path is descendant of other path or is equivalent to it.

Uses the absolute path, so symlinks, … do not affect this.

Parameters:
  • descendant (Path) – Supposed descendant.
  • ancestor (Path) – Supposed ancestor.
Returns:

Whether descendant is indeed a descendant of ancestor or they are equivalent (equal after path normalisation).

Return type:

bool

See also

is_descendant()
Get whether path is descendant of other path

Examples

>>> is_descendant_or_self(Path('a'), Path('a'))
True
>>> is_descendant_or_self(Path('a/b'), Path('a'))
True
>>> is_descendant_or_self(Path('a'), Path('a/b'))
False
>>> is_descendant_or_self(Path('a'), Path('a/..'))
False
pytil.path.remove(path, force=False)[source]

Remove file or directory (recursively), if it exists.

On NFS file systems, if a directory contains .nfs* temporary files (sometimes created when deleting a file), it waits for them to go away.

Parameters:
  • path (Path) – Path to remove.
  • force (bool) – If True, will remove files and directories even if they are read-only (as if first doing chmod -R +w).
pytil.path.sorted_lines(file)[source]

Lines of file, sorted.

Parameters:file (Path) – Path to file whose lines to read.
Returns:Sorted lines of file.
Return type:List[str]

pytil.pkg_resources

pkg_resources extensions.

pytil.pkg_resources.resource_copy(package_or_requirement, resource_name, destination)[source]

Copy file/dir resource to destination.

Parameters:
  • package_or_requirement (str) –
  • resource_name (str) –
  • destination (Path) – Path to copy to, it must not exist.
pytil.pkg_resources.resource_path(package_or_requirement, resource_name)[source]

Like resource_filename but return a Path instead.

Parameters:
  • package_or_requirement (str) –
  • resource_name (str) –
Returns:

Path to resource.

Return type:

Path

pytil.series

pytil.set

Set utilities.

pytil.set.merge_by_overlap(sets)[source]

Of a list of sets, merge those that overlap, in place.

The result isn’t necessarily a subsequence of the original sets.

Parameters:sets (Sequence[Set[Any]]) – Sets of which to merge those that overlap. Empty sets are ignored.

Notes

Implementation is based on this StackOverflow answer. It outperformed all other algorithms in the thread (visited in December 2015) on Python 3.4 across a wide range of inputs.

Examples

>>> merge_by_overlap([{1,2}, set(), {2,3}, {4,5,6}, {6,7}])
[{1,2,3}, {4,5,6,7}]
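
For illustration, the merging can be sketched with a simple quadratic approach. This is not the StackOverflow-based algorithm referred to above, and unlike merge_by_overlap it returns a new list instead of working in place; the name merge_overlapping is hypothetical.

```python
def merge_overlapping(sets):
    """Return a new list in which all overlapping sets have been merged.

    Hypothetical stand-in: simple but quadratic. Empty sets are ignored.
    """
    merged = []  # invariant: the sets in here are pairwise disjoint
    for current in sets:
        if not current:
            continue
        current = set(current)  # copy, so the input sets stay untouched
        remaining = []
        for other in merged:
            if other & current:
                current |= other  # absorb every set that overlaps
            else:
                remaining.append(other)
        remaining.append(current)
        merged = remaining
    return merged


result = merge_overlapping([{1, 2}, set(), {2, 3}, {4, 5, 6}, {6, 7}])
```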

pytil.sqlalchemy

pytil.test

pytil.various

Various utilities.

pytil.various.join_multiline(text)[source]

Join multiline text into a single line.

pytil.write

File writers

pytil.write.csv(file, *args, **kwargs)[source]

Write CSV file.

Parameters:
  • file (Path) –
  • *args – csv.DictWriter args (except the f arg)
  • **kwargs – csv.DictWriter args

Examples

with write.csv(file) as writer:
    writer.writerow((1,2,3))