bpw_pde package

Module contents

Top-level bpw_pde package.

Submodules

bpw_pde.common module

Common functions used across the index, series, and dataframe modules.

bpw_pde.common.clean_name(name: str) → str

Lowercases name, splits it on whitespace, and rejoins the tokens with underscores.

Parameters: name – The name to be cleaned.
Returns: The cleaned name.
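
The three steps described above can be sketched in plain Python. This is a minimal illustration based only on the description, not the package's actual implementation; the function name clean_name_sketch is hypothetical.

```python
def clean_name_sketch(name: str) -> str:
    """Lowercase, split on whitespace, and rejoin the tokens with underscores."""
    # str.split() with no arguments splits on runs of whitespace and
    # discards leading and trailing whitespace.
    return "_".join(name.lower().split())


print(clean_name_sketch("  Total   Sales Amount "))  # total_sales_amount
```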

bpw_pde.common.md5(text: str) → str

Computes the MD5 hash of text.

Parameters: text – The text to be hashed using the MD5 hashing algorithm.
Returns: The MD5 hash of the text.
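
A function with this signature could be built on the standard library's hashlib. The sketch below assumes the text is encoded as UTF-8 and that the hash is returned as a hexadecimal string; neither detail is stated in this reference, and the name md5_sketch is hypothetical.

```python
import hashlib


def md5_sketch(text: str) -> str:
    """Return the MD5 digest of text as a 32-character hex string."""
    # Assumption: UTF-8 encoding. hashlib operates on bytes, not str.
    return hashlib.md5(text.encode("utf-8")).hexdigest()


print(md5_sketch("hello"))  # 5d41402abc4b2a76b9719d911017c592
```

MD5 is suitable for fingerprinting or de-duplicating values, but it should not be relied on for security purposes.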

bpw_pde.common.unicode_normalize(text: str, form: str = 'NFKC') → str

Applies Unicode normalization to text using the normalization form given by form.

Parameters

text – The text to be normalized.
form – The Unicode normalization form to use for normalization.

Returns

The Unicode normalized text.
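
To illustrate what the form parameter controls, here is a small sketch using the standard library's unicodedata.normalize(), which accepts 'NFC', 'NFD', 'NFKC', and 'NFKD'. It is assumed, not confirmed by this reference, that the package delegates to the same function; the name unicode_normalize_sketch is hypothetical.

```python
import unicodedata


def unicode_normalize_sketch(text: str, form: str = "NFKC") -> str:
    """Normalize text with the given Unicode normalization form."""
    # Note the argument order: unicodedata.normalize takes the form first.
    return unicodedata.normalize(form, text)


# NFKC replaces compatibility characters such as the "fi" ligature (U+FB01).
print(unicode_normalize_sketch("\ufb01le"))  # file

# NFC composes a base letter and a combining accent into one code point.
print(len(unicode_normalize_sketch("e\u0301", "NFC")))  # 1
```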

bpw_pde.dataframe module

Module for the pandas.DataFrame accessor class.

class bpw_pde.dataframe.DataframeAccessor(df: pandas.core.frame.DataFrame)

Bases: object

Accessor class for pandas.DataFrame.
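
For readers unfamiliar with the pattern, a pandas accessor is a class that receives the DataFrame in its constructor (the "captured" DataFrame) and is exposed as an attribute on every DataFrame. The sketch below uses pandas' public registration decorator; the accessor name "sketch" and the n_cells property are hypothetical and are not part of bpw_pde, whose registered accessor name is not given in this reference.

```python
import pandas as pd


@pd.api.extensions.register_dataframe_accessor("sketch")  # hypothetical name
class SketchAccessor:
    def __init__(self, df: pd.DataFrame):
        # pandas passes in the DataFrame the accessor is called on.
        self._df = df

    @property
    def n_cells(self) -> int:
        """Number of cells (rows times columns) in the captured DataFrame."""
        rows, cols = self._df.shape
        return rows * cols


df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
print(df.sketch.n_cells)  # 4
```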

property clean_names: pandas.core.frame.DataFrame

Applies bpw_pde.common.clean_name() to all column names of the captured pandas.DataFrame.

Returns: A copy of the captured pandas.DataFrame with cleaned column names.
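
The effect of this property can be approximated with plain pandas. The sketch below is an assumption-labeled illustration, not the package's code: it uses DataFrame.rename() with a callable, which returns a new DataFrame and leaves the original untouched, and a hypothetical helper that mirrors the documented cleaning steps.

```python
import pandas as pd


def clean_column_names_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df whose column names are lowercased and underscored."""
    # str(c) guards against non-string column labels (an assumption here).
    return df.rename(columns=lambda c: "_".join(str(c).lower().split()))


df = pd.DataFrame({"First Name": ["Ada"], "  Last   Name": ["Lovelace"]})
cleaned = clean_column_names_sketch(df)

print(list(cleaned.columns))  # ['first_name', 'last_name']
print(list(df.columns))       # ['First Name', '  Last   Name']
```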

freq(*args, **kwargs)

Alias to sidetable.SideTableAccessor.freq().

Parameters

args – args forwarded to sidetable.SideTableAccessor.freq().
kwargs – kwargs forwarded to sidetable.SideTableAccessor.freq().

Returns

The result of calling sidetable.SideTableAccessor.freq() on the captured pandas.DataFrame.

property msno

Adapted missingno interface to work as a nested accessor.

Returns: An adapter class that provides access to the functions in missingno.

samp(n: int = 5) → pandas.core.frame.DataFrame

Transposed random sample of the captured pandas.DataFrame.

It is common to want to take a quick look at a few randomly sampled rows from a pandas.DataFrame and have them all fit on screen without mucking with pandas display settings. This usually does a better job of that than calling pandas.DataFrame.sample() alone.

Parameters: n – The number of rows to sample, defaults to 5.
Returns: A transposed random sample of rows from the captured pandas.DataFrame.
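
In plain pandas, the behavior described here amounts to sampling rows and transposing the result, so that each sampled row becomes a column and the original column names run down the side. A minimal sketch, assuming nothing beyond that description (the name samp_sketch is hypothetical):

```python
import pandas as pd


def samp_sketch(df: pd.DataFrame, n: int = 5) -> pd.DataFrame:
    """Return n randomly sampled rows of df, transposed."""
    # DataFrame.sample raises ValueError if n exceeds the number of rows.
    return df.sample(n=n).T


df = pd.DataFrame({"a": range(10), "b": range(10, 20), "c": list("abcdefghij")})
result = samp_sketch(df)

print(result.shape)  # (3, 5)
```

A wide table with many columns then prints as a tall, narrow table, which tends to fit on screen better.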

property stb

Alias to sidetable pandas.DataFrame accessor stb.

Returns: The stb accessor on the captured pandas.DataFrame.

bpw_pde.index module

Module for the pandas.Index accessor class.

Currently, this module has no useful functionality.

class bpw_pde.index.IndexAccessor(index: pandas.core.indexes.base.Index)

Bases: object

Accessor class for pandas.Index.

bpw_pde.series module

Module for the pandas.Series accessor class.

class bpw_pde.series.SeriesAccessor(series: pandas.core.series.Series)

Bases: object

Accessor class for pandas.Series.

property clean_name: pandas.core.series.Series

Applies bpw_pde.common.clean_name() to the name of the captured pandas.Series.

Returns: A copy of the captured pandas.Series with a cleaned name.

property hashify: pandas.core.series.Series

Applies bpw_pde.common.md5() to the captured pandas.Series.

Returns: A copy of the captured pandas.Series containing the MD5 hashes of its values.
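
One plausible reading of this property is an element-wise application of an MD5 function to the Series values. The sketch below assumes that reading, assumes string values encoded as UTF-8, and assumes hex output; the helper names are hypothetical.

```python
import hashlib

import pandas as pd


def md5_hex(text: str) -> str:
    # Assumption: UTF-8 encoding and hexadecimal output.
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def hashify_sketch(series: pd.Series) -> pd.Series:
    """Return a new Series with each value replaced by its MD5 hex digest."""
    # Series.map returns a new Series and keeps the index and name.
    return series.map(md5_hex)


tokens = pd.Series(["hello", ""], name="token")
hashed = hashify_sketch(tokens)

print(hashed.iloc[0])  # 5d41402abc4b2a76b9719d911017c592
```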

normalize(form: str = 'NFKC') → pandas.core.series.Series

Applies bpw_pde.common.unicode_normalize() to the captured pandas.Series.

Parameters: form – The Unicode normalization form, defaults to 'NFKC'.
Returns: A Unicode-normalized copy of the captured pandas.Series.
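
For comparison, pandas itself offers element-wise Unicode normalization through Series.str.normalize(), which takes the same form names. The sketch below shows that built-in rather than this package's method; whether bpw_pde uses it internally is not stated in this reference.

```python
import pandas as pd

raw = pd.Series(["\ufb01le", "\uff21"], name="label")

# NFKC maps the "fi" ligature (U+FB01) and the fullwidth "A" (U+FF21)
# to their plain ASCII equivalents.
normalized = raw.str.normalize("NFKC")

print(normalized.tolist())  # ['file', 'A']
```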