Series methods

`_obj = pandas_obj` `instance-attribute`

`__init__(pandas_obj)`

`assert_all_nulls(fail_message=' ㄨ Assert all nulls failed ', pass_message=' ✔️ Assert all nulls passed ', raise_exception=True, exception_to_raise=DataError, verbose=False)`

Tests whether Series has all nulls. Optionally raises an exception. Does not modify the Series itself.

Example

(
    iris
    ["sepal_length"]
    .check.assert_all_nulls()
)

# Will raise an exception, "ㄨ Assert all nulls failed"

See docs for .check.assert_data() for examples of how to customize assertions.

Parameters:

Name	Type	Description	Default
`fail_message`	`str`	Message to display if the condition fails.	`' ㄨ Assert all nulls failed '`
`pass_message`	`str`	Message to display if the condition passes.	`' ✔️ Assert all nulls passed '`
`raise_exception`	`bool`	Whether to raise an exception if the condition fails.	`True`
`exception_to_raise`	`Type[BaseException]`	The exception to raise if the condition fails and raise_exception is True.	`DataError`
`verbose`	`bool`	Whether to display the pass message if the condition passes.	`False`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`assert_data(condition, fail_message=' ㄨ Assertion failed ', pass_message=' ✔️ Assertion passed ', raise_exception=True, exception_to_raise=DataError, message_shows_condition=True, verbose=False)`

Tests whether Series meets condition. Optionally raises an exception. Does not modify the Series itself.

Example

(
    iris
    ["sepal_length"]

    # Validate that a Series has at least 1 row:
    .check.assert_data(lambda s: s.shape[0]>0)

    # Or customize the message displayed when assert fails
    .check.assert_data(lambda s: s.shape[0]>0, "Assertion failed, Series has no rows!")

    # Or show a warning instead of raising an exception
    .check.assert_data(lambda s: s.shape[0]>0, "FYI Series has no rows", raise_exception=False)
)

Parameters:

Name	Type	Description	Default
`condition`	`Callable`	Assertion criteria in the form of a lambda function, such as `lambda s: s.shape[0]>10`.	required
`fail_message`	`str`	Message to display if the condition fails.	`' ㄨ Assertion failed '`
`pass_message`	`str`	Message to display if the condition passes.	`' ✔️ Assertion passed '`
`raise_exception`	`bool`	Whether to raise an exception if the condition fails.	`True`
`exception_to_raise`	`Type[BaseException]`	The exception to raise if the condition fails and raise_exception is True.	`DataError`
`message_shows_condition`	`bool`	Whether the fail/pass message should also print the assertion criteria.	`True`
`verbose`	`bool`	Whether to display the pass message if the condition passes.	`False`

Returns:

Type	Description
`Series`	The original Series, unchanged.
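The pattern described above (test a condition, optionally raise, then hand back the same Series so the chain can continue) can be sketched in plain pandas. This is a minimal illustration, not the library's implementation: `check_condition` and `SketchDataError` are hypothetical names introduced only for this example, and reporting a non-raising failure as a warning is an assumption.

```python
import warnings
from typing import Callable

import pandas as pd


class SketchDataError(Exception):
    """Hypothetical stand-in for the exception raised on a failed assertion."""


def check_condition(
    s: pd.Series,
    condition: Callable[[pd.Series], bool],
    fail_message: str = "Assertion failed",
    raise_exception: bool = True,
) -> pd.Series:
    """Hypothetical helper: test `condition` and return `s` unchanged."""
    if not condition(s):
        if raise_exception:
            raise SketchDataError(fail_message)
        # Assumption: a failed check that does not raise is reported as a warning.
        warnings.warn(fail_message)
    return s


sepal_length = pd.Series([5.1, 4.9, 4.7], name="sepal_length")

# The helper returns the very same object, so it can sit in the middle of a chain.
result = check_condition(sepal_length, lambda s: s.shape[0] > 0)
```

Because the input is returned as-is, a failed check with `raise_exception=False` does not interrupt the rest of the chain.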

`assert_datetime(fail_message=None, pass_message=' ✔️ Assert datetime passed ', raise_exception=True, exception_to_raise=TypeError, verbose=False)`

Tests whether Series is datetime or timestamp. Optionally raises an exception. Does not modify the Series itself.

Example

(
    df
    ["datetime_col"]
    .check.assert_datetime()
)

See docs for .check.assert_data() for examples of how to customize assertions.

Parameters:

Name	Type	Description	Default
`fail_message`	`Union[str, None]`	Message to display if the condition fails. If None, will report expected vs observed type.	`None`
`pass_message`	`str`	Message to display if the condition passes.	`' ✔️ Assert datetime passed '`
`raise_exception`	`bool`	Whether to raise an exception if the condition fails.	`True`
`exception_to_raise`	`Type[BaseException]`	The exception to raise if the condition fails and raise_exception is True.	`TypeError`
`verbose`	`bool`	Whether to display the pass message if the condition passes.	`False`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`assert_float(fail_message=None, pass_message=' ✔️ Assert float passed ', raise_exception=True, exception_to_raise=TypeError, verbose=False)`

Tests whether Series is floats. Optionally raises an exception. Does not modify the Series itself.

Example

(
    df
    ["float_col"]
    .check.assert_float()
)

See docs for .check.assert_data() for examples of how to customize assertions.

Parameters:

Name	Type	Description	Default
`fail_message`	`Union[str, None]`	Message to display if the condition fails. If None, will report expected vs observed type.	`None`
`pass_message`	`str`	Message to display if the condition passes.	`' ✔️ Assert float passed '`
`raise_exception`	`bool`	Whether to raise an exception if the condition fails.	`True`
`exception_to_raise`	`Type[BaseException]`	The exception to raise if the condition fails and raise_exception is True.	`TypeError`
`verbose`	`bool`	Whether to display the pass message if the condition passes.	`False`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`assert_greater_than(min, fail_message=' ㄨ Assert minimum failed ', pass_message=' ✔️ Assert minimum passed ', or_equal_to=False, raise_exception=True, exception_to_raise=DataError, verbose=False)`

Tests whether all values in Series are > or >= a minimum threshold. Optionally raises an exception. Does not modify the Series itself.

Example

(
    iris
    ["sepal_length"]
    # Validate that the Series is always >= 0
    .check.assert_greater_than(0, or_equal_to=True)
)

See docs for .check.assert_data() for examples of how to customize assertions.

Parameters:

Name	Type	Description	Default
`min`	`Any`	The minimum value to compare the Series to. Accepts any type that can be used in >, such as int, float, str, datetime.	required
`fail_message`	`str`	Message to display if the condition fails.	`' ㄨ Assert minimum failed '`
`pass_message`	`str`	Message to display if the condition passes.	`' ✔️ Assert minimum passed '`
`or_equal_to`	`bool`	Whether to test for >= min (True) or > min (False).	`False`
`raise_exception`	`bool`	Whether to raise an exception if the condition fails.	`True`
`exception_to_raise`	`Type[BaseException]`	The exception to raise if the condition fails and raise_exception is True.	`DataError`
`verbose`	`bool`	Whether to display the pass message if the condition passes.	`False`

Returns:

Type	Description
`Series`	The original Series, unchanged.
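The difference between the strict and the inclusive comparison can be illustrated with plain pandas. This is a rough equivalent that assumes the check is applied element-wise to every value; it is not taken from the library's source.

```python
import pandas as pd

s = pd.Series([0.0, 1.5, 3.2])

# Analogous to or_equal_to=False: every value must be strictly above the threshold.
strictly_greater = bool((s > 0).all())

# Analogous to or_equal_to=True: values equal to the threshold are accepted.
greater_or_equal = bool((s >= 0).all())

print(strictly_greater, greater_or_equal)
```

Here the value `0.0` fails the strict comparison but passes the inclusive one.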

`assert_int(fail_message=None, pass_message=' ✔️ Assert integeer passed ', raise_exception=True, exception_to_raise=TypeError, verbose=False)`

Tests whether Series is integers. Optionally raises an exception. Does not modify the Series itself.

Example

(
    df
    ["int_col"]
    .check.assert_int()
)

See docs for .check.assert_data() for examples of how to customize assertions.

Parameters:

Name	Type	Description	Default
`fail_message`	`Union[str, None]`	Message to display if the condition fails. If None, will report expected vs observed type.	`None`
`pass_message`	`str`	Message to display if the condition passes.	`' ✔️ Assert integeer passed '`
`raise_exception`	`bool`	Whether to raise an exception if the condition fails.	`True`
`exception_to_raise`	`Type[BaseException]`	The exception to raise if the condition fails and raise_exception is True.	`TypeError`
`verbose`	`bool`	Whether to display the pass message if the condition passes.	`False`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`assert_less_than(max, fail_message=' ㄨ Assert maximum failed ', pass_message=' ✔️ Assert maximum passed ', or_equal_to=False, raise_exception=True, exception_to_raise=DataError, verbose=False)`

Tests whether all values in Series are < or <= a maximum threshold. Optionally raises an exception. Does not modify the Series itself.

Example

(
    iris
    ["sepal_length"]

    # Validate that sepal_length is always < 1000
    .check.assert_less_than(1000)

    # Validate that it's always <= 1000
    .check.assert_less_than(1000, or_equal_to=True)
)

See docs for .check.assert_data() for examples of how to customize assertions.

Parameters:

Name	Type	Description	Default
`max`	`Any`	The maximum value to compare the Series to. Accepts any type that can be used in <, such as int, float, str, datetime.	required
`fail_message`	`str`	Message to display if the condition fails.	`' ㄨ Assert maximum failed '`
`pass_message`	`str`	Message to display if the condition passes.	`' ✔️ Assert maximum passed '`
`or_equal_to`	`bool`	Whether to test for <= max (True) or < max (False).	`False`
`raise_exception`	`bool`	Whether to raise an exception if the condition fails.	`True`
`exception_to_raise`	`Type[BaseException]`	The exception to raise if the condition fails and raise_exception is True.	`DataError`
`verbose`	`bool`	Whether to display the pass message if the condition passes.	`False`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`assert_negative(fail_message=' ㄨ Assert negative failed ', pass_message=' ✔️ Assert negative passed ', assert_no_nulls=True, raise_exception=True, exception_to_raise=DataError, verbose=False)`

Tests whether Series has all negative values. Optionally raises an exception. Does not modify the Series itself.

Example

(
    df
    ["column_name"]
    .check.assert_negative()
)

See docs for .check.assert_data() for examples of how to customize assertions.

Parameters:

Name	Type	Description	Default
`fail_message`	`str`	Message to display if the condition fails.	`' ㄨ Assert negative failed '`
`pass_message`	`str`	Message to display if the condition passes.	`' ✔️ Assert negative passed '`
`assert_no_nulls`	`bool`	Whether to also enforce that data has no nulls.	`True`
`raise_exception`	`bool`	Whether to raise an exception if the condition fails.	`True`
`exception_to_raise`	`Type[BaseException]`	The exception to raise if the condition fails and raise_exception is True.	`DataError`
`verbose`	`bool`	Whether to display the pass message if the condition passes.	`False`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`assert_no_nulls(fail_message=' ㄨ Assert no nulls failed ', pass_message=' ✔️ Assert no nulls passed ', raise_exception=True, exception_to_raise=DataError, verbose=False)`

Tests whether Series has no nulls. Optionally raises an exception. Does not modify the Series itself.

Example

(
    iris
    ["sepal_length"]
    .check.assert_no_nulls()
)

See docs for .check.assert_data() for examples of how to customize assertions.

Parameters:

Name	Type	Description	Default
`fail_message`	`str`	Message to display if the condition fails.	`' ㄨ Assert no nulls failed '`
`pass_message`	`str`	Message to display if the condition passes.	`' ✔️ Assert no nulls passed '`
`raise_exception`	`bool`	Whether to raise an exception if the condition fails.	`True`
`exception_to_raise`	`Type[BaseException]`	The exception to raise if the condition fails and raise_exception is True.	`DataError`
`verbose`	`bool`	Whether to display the pass message if the condition passes.	`False`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`assert_nrows(nrows, fail_message=' ㄨ Assert nrows failed ', pass_message=' ✔️ Assert nrows passed ', raise_exception=True, exception_to_raise=DataError, verbose=False)`

Tests whether Series has a given number of rows. Optionally raises an exception. Does not modify the Series itself.

Example

(
    iris
    ["species"]
    .check.assert_nrows(20)
)

See docs for .check.assert_data() for examples of how to customize assertions.

Parameters:

Name	Type	Description	Default
`nrows`	`int`	The expected number of rows.	required
`fail_message`	`str`	Message to display if the condition fails.	`' ㄨ Assert nrows failed '`
`pass_message`	`str`	Message to display if the condition passes.	`' ✔️ Assert nrows passed '`
`raise_exception`	`bool`	Whether to raise an exception if the condition fails.	`True`
`exception_to_raise`	`Type[BaseException]`	The exception to raise if the condition fails and raise_exception is True.	`DataError`
`verbose`	`bool`	Whether to display the pass message if the condition passes.	`False`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`assert_positive(fail_message=' ㄨ Assert positive failed ', pass_message=' ✔️ Assert positive passed ', assert_no_nulls=True, raise_exception=True, exception_to_raise=DataError, verbose=False)`

Tests whether Series has all positive values. Optionally raises an exception. Does not modify the Series itself.

Example

(
    iris
    ["sepal_length"]
    .check.assert_positive()
)

See docs for .check.assert_data() for examples of how to customize assertions.

Parameters:

Name	Type	Description	Default
`fail_message`	`str`	Message to display if the condition fails.	`' ㄨ Assert positive failed '`
`pass_message`	`str`	Message to display if the condition passes.	`' ✔️ Assert positive passed '`
`assert_no_nulls`	`bool`	Whether to also enforce that data has no nulls.	`True`
`raise_exception`	`bool`	Whether to raise an exception if the condition fails.	`True`
`exception_to_raise`	`Type[BaseException]`	The exception to raise if the condition fails and raise_exception is True.	`DataError`
`verbose`	`bool`	Whether to display the pass message if the condition passes.	`False`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`assert_same_nrows(other, fail_message=' ㄨ Assert same_nrows failed ', pass_message=' ✔️ Assert same_nrows passed ', raise_exception=True, exception_to_raise=DataError, verbose=False)`

Tests whether Series has the same number of rows as another DataFrame/Series has.

Optionally raises an exception. Does not modify the Series itself.

Example

(
    df1
    ["column"]
    .check.assert_same_nrows(df2)
)

See docs for .check.assert_data() for examples of how to customize assertions.

Parameters:

Name	Type	Description	Default
`other`	`Union[DataFrame, Series]`	The DataFrame or Series expected to have the same number of rows as this Series.	required
`fail_message`	`str`	Message to display if the condition fails.	`' ㄨ Assert same_nrows failed '`
`pass_message`	`str`	Message to display if the condition passes.	`' ✔️ Assert same_nrows passed '`
`raise_exception`	`bool`	Whether to raise an exception if the condition fails.	`True`
`exception_to_raise`	`Type[BaseException]`	The exception to raise if the condition fails and raise_exception is True.	`DataError`
`verbose`	`bool`	Whether to display the pass message if the condition passes.	`False`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`assert_str(fail_message=None, pass_message=' ✔️ Assert string passed ', raise_exception=True, exception_to_raise=TypeError, verbose=False)`

Tests whether Series is strings. Optionally raises an exception. Does not modify the Series itself.

Example

(
    iris
    ["species"]
    .check.assert_str()
)

See docs for .check.assert_data() for examples of how to customize assertions.

Parameters:

Name	Type	Description	Default
`fail_message`	`Union[str, None]`	Message to display if the condition fails. If None, will report expected vs observed type.	`None`
`pass_message`	`str`	Message to display if the condition passes.	`' ✔️ Assert string passed '`
`raise_exception`	`bool`	Whether to raise an exception if the condition fails.	`True`
`exception_to_raise`	`Type[BaseException]`	The exception to raise if the condition fails and raise_exception is True.	`TypeError`
`verbose`	`bool`	Whether to display the pass message if the condition passes.	`False`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`assert_timedelta(fail_message=None, pass_message=' ✔️ Assert timedelta passed ', raise_exception=True, exception_to_raise=TypeError, verbose=False)`

Tests whether Series is of type timedelta. Optionally raises an exception. Does not modify the Series itself.

Example

(
    df
    ["timedelta_col"]
    .check.assert_timedelta()
)

See docs for .check.assert_data() for examples of how to customize assertions.

Parameters:

Name	Type	Description	Default
`fail_message`	`Union[str, None]`	Message to display if the condition fails. If None, will report expected vs observed type.	`None`
`pass_message`	`str`	Message to display if the condition passes.	`' ✔️ Assert timedelta passed '`
`raise_exception`	`bool`	Whether to raise an exception if the condition fails.	`True`
`exception_to_raise`	`Type[BaseException]`	The exception to raise if the condition fails and raise_exception is True.	`TypeError`
`verbose`	`bool`	Whether to display the pass message if the condition passes.	`False`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`assert_type(dtype, fail_message=None, pass_message=' ✔️ Assert type passed ', raise_exception=True, exception_to_raise=TypeError, verbose=False)`

Tests whether Series meets type assumption. Optionally raises an exception. Does not modify the Series itself.

Example

# Validate that a column of mixed types has overall type `object`:
(
    df
    ["column_with_mixed_types"]
    .check.assert_type(object)
)

See docs for .check.assert_data() for examples of how to customize assertions.

Parameters:

Name	Type	Description	Default
`dtype`	`Type[Any]`	The required variable type	required
`fail_message`	`Union[str, None]`	Message to display if the condition fails. If None, will report expected vs observed type.	`None`
`pass_message`	`str`	Message to display if the condition passes.	`' ✔️ Assert type passed '`
`raise_exception`	`bool`	Whether to raise an exception if the condition fails.	`True`
`exception_to_raise`	`Type[BaseException]`	The exception to raise if the condition fails and raise_exception is True.	`TypeError`
`verbose`	`bool`	Whether to display the pass message if the condition passes.	`False`

Returns:

Type	Description
`Series`	The original Series, unchanged.
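As background for the mixed-type example above, the following plain-pandas snippet shows how a Series with mixed Python types ends up with dtype `object`, while a homogeneous numeric Series gets a specific dtype. It only inspects `Series.dtype` and makes no claim about how the check compares types internally.

```python
import pandas as pd

mixed = pd.Series([1, "two", 3.0])  # int, str, and float in one column
floats = pd.Series([1.0, 2.5])      # homogeneous floats

print(mixed.dtype)
print(floats.dtype)
```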

`assert_unique(fail_message=' ㄨ Assert unique failed ', pass_message=' ✔️ Assert unique passed ', raise_exception=True, exception_to_raise=DataError, verbose=False)`

Validates that a Series has no duplicate values. Optionally raises an exception. Does not modify the Series itself.

Example

(
    df
    ["id_column"]
    .check.assert_unique()
)

See docs for .check.assert_data() for examples of how to customize assertions.

Parameters:

Name	Type	Description	Default
`fail_message`	`str`	Message to display if the condition fails.	`' ㄨ Assert unique failed '`
`pass_message`	`str`	Message to display if the condition passes.	`' ✔️ Assert unique passed '`
`raise_exception`	`bool`	Whether to raise an exception if the condition fails.	`True`
`exception_to_raise`	`Type[BaseException]`	The exception to raise if the condition fails and raise_exception is True.	`DataError`
`verbose`	`bool`	Whether to display the pass message if the condition passes.	`False`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`describe(fn=lambda s: s, check_name='📏 Distribution', **kwargs)`

Displays descriptive statistics about a Series, without modifying the Series itself.

See Pandas docs for describe() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Example

(
    iris
    ["sepal_length"]
    .check.describe()
)

Parameters:

Name	Type	Description	Default
`fn`	`Callable`	An optional lambda function to apply to the Series before running Pandas describe(). Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional name for the check to preface the result with.	`'📏 Distribution'`
`**kwargs`	`Any`	Optional, additional arguments that are accepted by Pandas describe() method.	`{}`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`disable_checks(enable_asserts=True)`

Turns off Pandas Checks globally, such as in production mode. Calls to .check functions will not be run. Does not modify the Series itself.

Example

(
    iris
    ["sepal_length"]
    .check.disable_checks()
    .check.assert_data(lambda s: s.shape[0]>10) #  This check will NOT be run
    .check.enable_checks() # Subsequent calls to .check will be run
)

Parameters:

Name	Type	Description	Default
`enable_asserts`	`bool`	Optionally, whether to also enable or disable assert statements.	`True`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`dtype(fn=lambda s: s, check_name='🗂️ Data type')`

Displays the data type of a Series, without modifying the Series itself.

Example

(
    iris
    ["sepal_length"]
    .check.dtype()
)

Parameters:

Name	Type	Description	Default
`fn`	`Callable`	An optional lambda function to apply to the Series before running Pandas dtype. Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional name for the check to preface the result with.	`'🗂️ Data type'`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`enable_checks(enable_asserts=True)`

Globally enables Pandas Checks. Subsequent calls to .check methods will be run. Does not modify the Series itself.

Example

(
    iris
    ["sepal_length"]
    .check.disable_checks()
    .check.assert_data(lambda s: s.shape[0]>10) #  This check will NOT be run
    .check.enable_checks() # Subsequent calls to .check will be run
)

Parameters:

Name	Type	Description	Default
`enable_asserts`	`bool`	Optionally, whether to globally enable or disable calls to .check.assert_data().	`True`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`function(fn=lambda s: s, check_name=None)`

Applies an arbitrary function on a Series and shows the result, without modifying the Series itself.

Example

(
    iris
    ["sepal_length"]
    .check.function(fn=lambda s: s.shape[0]>10, check_name='Has at least 10 rows?')
)
# Will display "True"

Parameters:

Name	Type	Description	Default
`fn`	`Callable`	The lambda function to apply to the Series. Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional name for the check to preface the result with.	`None`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`get_mode(check_name='⚙️ Pandas Checks mode')`

Displays the current values of Pandas Checks global options enable_checks and enable_asserts. Does not modify the Series itself.

Example

(
    iris
    ["sepal_length"]
    .check.get_mode()
)

# The check will print: "🐼🩺 Pandas Checks mode: {'enable_checks': True, 'enable_asserts': True}"

Parameters:

Name	Type	Description	Default
`check_name`	`Union[str, None]`	An optional name for the check. Will be used as a preface to the printed result.	`'⚙️ Pandas Checks mode'`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`head(n=5, fn=lambda s: s, check_name=None)`

Displays the first n rows of a Series, without modifying the Series itself.

See Pandas docs for head() for additional usage information.

Example

(
    iris
    ["sepal_length"]
    .check.head(10)
)

Parameters:

Name	Type	Description	Default
`n`	`int`	The number of rows to display.	`5`
`fn`	`Callable`	An optional lambda function to apply to the Series before running Pandas head(). Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional name for the check, to be printed as preface to the result.	`None`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`hist(fn=lambda s: s, check_name=None, **kwargs)`

Displays a histogram for the Series's distribution, without modifying the Series itself.

See Pandas docs for hist() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Example

(
    iris
    ["sepal_length"]
    .check.hist()
)

Parameters:

Name	Type	Description	Default
`fn`	`Callable`	An optional lambda function to apply to the Series before running Pandas hist(). Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional name for the check, to be printed as preface to the result.	`None`
`**kwargs`	`Any`	Optional, additional arguments that are accepted by Pandas hist() method.	`{}`

Returns:

Type	Description
`Series`	The original Series, unchanged.

Note

Plots are only displayed when code is run in IPython/Jupyter, not in terminal.

`info(fn=lambda s: s, check_name='ℹ️ Series info', **kwargs)`

Displays summary information about a Series, without modifying the Series itself.

See Pandas docs for info() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Example

(
    iris
    ["sepal_length"]
    .check.info()
)

Parameters:

Name	Type	Description	Default
`fn`	`Callable`	An optional lambda function to apply to the Series before running Pandas info(). Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional name for the check, to be printed as preface to the result.	`'ℹ️ Series info'`
`**kwargs`	`Any`	Optional, additional arguments that are accepted by Pandas info() method.	`{}`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`memory_usage(fn=lambda s: s, check_name='💾 Memory usage', **kwargs)`

Displays the memory footprint of a Series, without modifying the Series itself.

See Pandas docs for memory_usage() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Example

(
    iris
    ["sepal_length"]
    .check.memory_usage()
)

Parameters:

Name	Type	Description	Default
`fn`	`Callable`	An optional lambda function to apply to the Series before running Pandas memory_usage(). Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional name for the check, to be printed as preface to the result.	`'💾 Memory usage'`
`**kwargs`	`Any`	Optional, additional arguments that are accepted by Pandas memory_usage() method.	`{}`

Returns:

Type	Description
`Series`	The original Series, unchanged.

Note

Include argument deep=True to get further memory usage of object dtypes. See Pandas docs for memory_usage() for more info.
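The effect of `deep=True` can be seen with pandas' own `Series.memory_usage()`. The sketch below forces `dtype=object` so that the values are stored as Python string objects; the sample values are illustrative.

```python
import pandas as pd

species = pd.Series(["setosa", "versicolor", "virginica"], dtype=object)

# Shallow: counts the index plus the array of object pointers.
shallow = species.memory_usage()

# Deep: additionally counts the memory held by the Python strings themselves.
deep = species.memory_usage(deep=True)

print(shallow, deep)
```

For object columns the deep figure is larger, since the shallow figure does not include the string contents.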

`ndups(fn=lambda s: s, check_name=None, **kwargs)`

Displays the number of duplicated rows in the Series, without modifying the Series itself.

See Pandas docs for duplicated() for additional usage information, including more configuration options (the keep argument) you can pass to this Pandas Checks method.

Example

(
    iris
    ["sepal_length"]
    .check.ndups()
)

Parameters:

Name	Type	Description	Default
`fn`	`Callable`	An optional lambda function to apply to the Series before counting the number of duplicates. Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional name for the check, to be printed as preface to the result.	`None`
`**kwargs`	`Any`	Optional, additional arguments that are accepted by Pandas duplicated() method.	`{}`

Returns:

Type	Description
`Series`	The original Series, unchanged.
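How the `keep` argument changes the count can be sketched with pandas' `Series.duplicated()`. This assumes the check counts the rows flagged by `duplicated()`; the snippet uses only plain pandas.

```python
import pandas as pd

s = pd.Series([1, 1, 2, 2, 2, 3])

# Default keep="first": only repeats after the first occurrence are flagged.
n_default = int(s.duplicated().sum())

# keep=False: every member of a duplicated group is flagged.
n_keep_false = int(s.duplicated(keep=False).sum())

print(n_default, n_keep_false)
```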

`nnulls(fn=lambda s: s, check_name='👻 Rows with NaNs')`

Displays the number of rows with null values in the Series, without modifying the Series itself.

See Pandas docs for isna() for additional usage information.

Example

(
    iris
    ["sepal_length"]
    .check.nnulls()
)

Parameters:

Name	Type	Description	Default
`fn`	`Callable`	An optional lambda function to apply to the Series before counting rows with nulls. Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional name for the check, to be printed as preface to the result.	`'👻 Rows with NaNs'`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`nrows(fn=lambda s: s, check_name='☰ Rows')`

Displays the number of rows in a Series, without modifying the Series itself.

Example

(
    iris
    ["sepal_width"]
    .check.nrows()
)

Parameters:

Name	Type	Description	Default
`fn`	`Callable`	An optional lambda function to apply to the Series before counting the number of rows. Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional name for the check, to be printed as preface to the result.	`'☰ Rows'`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`nunique(fn=lambda s: s, check_name=None, **kwargs)`

Displays the number of unique rows in a Series, without modifying the Series itself.

See Pandas docs for nunique() for additional usage information, including more configuration options (the dropna argument) you can pass to this Pandas Checks method.

Example

(
    iris
    ["sepal_width"]
    .check.nunique()
)

Parameters:

Name	Type	Description	Default
`fn`	`Callable`	An optional lambda function to apply to the Series before running Pandas nunique(). Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional name for the check, to be printed as preface to the result.	`None`
`**kwargs`	`Any`	Optional, additional arguments that are accepted by Pandas nunique() method.	`{}`

Returns:

Type	Description
`Series`	The original Series, unchanged.
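The `dropna` argument mentioned above decides whether nulls count as a distinct value. A short plain-pandas illustration using `Series.nunique()` directly:

```python
import pandas as pd

s = pd.Series([3.0, 3.0, 3.5, None])

# Default dropna=True: the null is ignored.
n_without_nulls = s.nunique()

# dropna=False: the null counts as one additional distinct value.
n_with_nulls = s.nunique(dropna=False)

print(n_without_nulls, n_with_nulls)
```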

`plot(fn=lambda s: s, check_name='', **kwargs)`

Displays a plot of the Series, without modifying the Series itself.

See Pandas docs for plot() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Example

# Visualize the distribution of a Series with a box plot:
(
    iris
    ["sepal_width"]
    .check.plot(kind="box", title="Distribution of sepal width")
)

Parameters:

Name	Type	Description	Default
`fn`	`Callable`	An optional lambda function to apply to the Series before running Pandas plot(). Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional title for the plot.	`''`
`**kwargs`	`Any`	Optional, additional arguments that are accepted by Pandas plot() method.	`{}`

Returns:

Type	Description
`Series`	The original Series, unchanged.

Note

Plots are only displayed when code is run in IPython/Jupyter, not in terminal.

If you pass a 'title' kwarg, it becomes the plot title, overriding check_name.

`print(object=None, fn=lambda s: s, check_name=None, max_rows=10)`

Displays text, another object, or (by default) the current Series's head. Does not modify the Series itself.

Example

(
    iris
    ["sepal_width"]

    # Print messages such as milestones
    .check.print("Starting data cleaning...")
    ...

    # Inspect a Series, such as the interim result of data processing
    .check.print(fn=lambda s: s[s<0], check_name="Negative values of sepal_width") # Will print those values if they exist
)

Parameters:

Name	Type	Description	Default
`object`	`Any`	Object to print. Can be anything printable: str, int, list, another DataFrame, etc. If None, print the Series's head (with `max_rows` rows).	`None`
`fn`	`Callable`	An optional lambda function to apply to the Series before printing `object`. Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional name for the check, to be printed as preface to the result.	`None`
`max_rows`	`int`	Maximum number of rows to print if object=None.	`10`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`print_time_elapsed(start_time, lead_in='Time elapsed', units='auto')`

Displays the time elapsed since start_time.

Example

import pandas_checks as pdc

start_time = pdc.start_timer()

(
    iris
    ["species"]
    ... # Do some data processing
    .check.print_time_elapsed(start_time, "Cleaning took")

    ... # Do more
    .check.print_time_elapsed(start_time, "Processing total time", units="seconds") # Force units to stay in seconds

)

# Result: "Cleaning took: 17.298324584960938 seconds"
#         "Processing total time: 71.0400543212890625 seconds"

Parameters:

Name	Type	Description	Default
`start_time`	`float`	The index time when the stopwatch started, which comes from the Pandas Checks start_timer()	required
`lead_in`	`Union[str, None]`	Optional text to print before the elapsed time.	`'Time elapsed'`
`units`	`str`	The units in which to display the elapsed time. Allowed values: "auto", "milliseconds", "seconds", "minutes", "hours" or shorthands "ms", "s", "m", "h".	`'auto'`

Raises:

Type	Description
`ValueError`	If `units` is not one of allowed values.

Returns:

Type	Description
`Series`	The original Series, unchanged.
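The stopwatch idea (record a start time, later report the difference in a chosen unit, and reject unknown units with a `ValueError`) can be sketched with the standard library alone. `format_elapsed` and `SECONDS_PER_UNIT` are hypothetical names for this illustration, not part of Pandas Checks, and the `"auto"` option is deliberately left out because its selection rule is not described here.

```python
import time

# Hypothetical conversion table; the keys mirror the allowed fixed `units` values listed above.
SECONDS_PER_UNIT = {
    "milliseconds": 0.001, "ms": 0.001,
    "seconds": 1.0, "s": 1.0,
    "minutes": 60.0, "m": 60.0,
    "hours": 3600.0, "h": 3600.0,
}


def format_elapsed(start_time: float, lead_in: str = "Time elapsed", units: str = "seconds") -> str:
    """Hypothetical helper: format the time since `start_time` in fixed units."""
    if units not in SECONDS_PER_UNIT:
        raise ValueError(f"Unsupported units: {units!r}")
    elapsed = (time.time() - start_time) / SECONDS_PER_UNIT[units]
    return f"{lead_in}: {elapsed} {units}"


start_time = time.time()
message = format_elapsed(start_time, "Cleaning took", units="seconds")
print(message)
```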

`reset_format()`

Globally restores all Pandas Checks formatting options to their default "factory" settings. Does not modify the Series itself.

Example

(
    iris
    ["sepal_width"]
    .check.set_format(precision=9, use_emojis=False)

    # Print Series summary stats with precision 9 digits and no Pandas Checks emojis
    .check.describe()

    .check.reset_format() # Go back to default precision and emojis 🥳
)

Returns:

Type	Description
`Series`	The original Series, unchanged.

`set_format(**kwargs)`

Configures selected formatting options for Pandas Checks. Does not modify the Series itself.

Run pandas_checks.describe_options() to see a list of available options.

See .check.reset_format() to restore defaults.

Example

(
    iris
    ["sepal_width"]
    .check.set_format(precision=9, use_emojis=False)

    # Print Series summary stats with precision 9 digits and no Pandas Checks emojis
    .check.describe()

    .check.reset_format() # Go back to default precision and emojis 🥳
)

Parameters:

Name	Type	Description	Default
`**kwargs`	`Any`	Pairs of setting name and its new value.	`{}`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`set_mode(enable_checks, enable_asserts)`

Configures the operation mode for Pandas Checks globally. Does not modify the Series itself.

Example

(
    iris
    ["sepal_width"]

    # Disable checks but keep running assertions. Same as using `.check.disable_checks()`:
    .check.set_mode(enable_checks=False, enable_asserts=True)
    .check.describe() # This check will not be run
    .check.assert_data(lambda s: s.shape[0]>10) #  This check will still be run

    # Disable checks _and_ assertions
    .check.set_mode(enable_checks=False, enable_asserts=False)
)

Parameters:

Name	Type	Description	Default
`enable_checks`	`bool`	Whether to run any Pandas Checks methods globally. Does not affect .check.assert_*() calls.	required
`enable_asserts`	`bool`	Whether to run calls to Pandas Checks .check.assert_*() globally.	required

Returns:

Type	Description
`Series`	The original Series, unchanged.
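
One way to picture the two switches is as a pair of global flags that each method consults before doing any work. The sketch below is a hypothetical model in plain Python, not the Pandas Checks source; `_MODE`, `set_mode`, `run_check`, and `run_assert` are illustrative names only.

```python
# Hypothetical model of a global mode; not the Pandas Checks internals.
_MODE = {"enable_checks": True, "enable_asserts": True}


def set_mode(enable_checks, enable_asserts):
    """Update both global switches at once."""
    _MODE["enable_checks"] = enable_checks
    _MODE["enable_asserts"] = enable_asserts


def run_check(data, describe):
    """Run a display-only check if checks are enabled; always return data."""
    if _MODE["enable_checks"]:
        print(describe(data))
    return data


def run_assert(data, condition):
    """Evaluate an assertion if asserts are enabled; always return data."""
    if _MODE["enable_asserts"] and not condition(data):
        raise ValueError("Assertion failed")
    return data


values = [3.5, 3.0, 3.2]

set_mode(enable_checks=False, enable_asserts=True)
run_check(values, len)                      # skipped: prints nothing
run_assert(values, lambda v: len(v) > 0)    # still evaluated, passes

set_mode(enable_checks=False, enable_asserts=False)
run_assert(values, lambda v: len(v) > 10)   # skipped: no exception raised
```

Because both functions return `data` whether or not they ran, disabling checks does not break the chain.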

`shape(fn=lambda s: s, check_name='📐 Shape')`

Displays the Series's dimensions, without modifying the Series itself.

Example

(
    iris
    ["sepal_width"]
    .check.shape()
    .check.shape(fn=lambda s: s[s<5], check_name="Shape of sepal_width series with values <5")
)

Parameters:

Name	Type	Description	Default
`fn`	`Callable`	An optional lambda function to apply to the Series before running Pandas `shape`. Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional name for the check, to be printed as preface to the result.	`'📐 Shape'`

Returns:

Type	Description
`Series`	The original Series, unchanged.
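
The `fn` parameter affects only what is displayed; the method still returns the original Series. The sketch below models that pass-through pattern with a plain Python list so that it runs without pandas. `check_shape` is a hypothetical stand-in, not a Pandas Checks function, and its output format is an assumption.

```python
def check_shape(data, fn=lambda s: s, check_name="Shape"):
    """Print the shape of fn(data), then return data itself untouched."""
    view = fn(data)                    # possibly transformed, used only for display
    print(f"{check_name}: ({len(view)},)")
    return data                        # the original object flows down the chain


sepal_width = [3.5, 3.0, 5.2, 3.1]

result = check_shape(sepal_width)                      # prints "Shape: (4,)"
result = check_shape(
    result,
    fn=lambda s: [x for x in s if x < 5],
    check_name="Shape of values <5",
)                                                      # prints "Shape of values <5: (3,)"
```

After both calls, `result` is still the full four-element list, since the filter was applied only to the displayed view.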

`tail(n=5, fn=lambda s: s, check_name=None)`

Displays the last n rows of the Series, without modifying the Series itself.

See Pandas docs for tail() for additional usage information.

Example

(
    iris
    ["sepal_width"]
    .check.tail(10)
)

Parameters:

Name	Type	Description	Default
`n`	`int`	Number of rows to show.	`5`
`fn`	`Callable`	An optional lambda function to apply to the Series before running Pandas tail(). Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional name for the check, to be printed as preface to the result.	`None`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`unique(fn=lambda s: s, check_name=None)`

Displays the unique values in a Series, without modifying the Series itself.

See Pandas docs for unique() for additional usage information.

Example

(
    iris
    ["species"]
    .check.unique()
)
# The check will print: "🌟 Unique values of species: ['setosa', 'versicolor', 'virginica']"

Parameters:

Name	Type	Description	Default
`fn`	`Callable`	An optional lambda function to apply to the Series before running Pandas unique(). Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional name for the check, to be printed as preface to the result.	`None`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`value_counts(fn=lambda s: s, max_rows=10, check_name=None, **kwargs)`

Displays the value counts for a Series, without modifying the Series itself.

See Pandas docs for value_counts() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Example

(
    iris
    ["sepal_length"]
    .check.value_counts()
)

Parameters:

Name	Type	Description	Default
`max_rows`	`int`	Maximum number of rows to show in the value counts.	`10`
`fn`	`Callable`	An optional lambda function to apply to the Series before running Pandas value_counts(). Example: `lambda s: s.dropna()`.	`lambda s: s`
`check_name`	`Union[str, None]`	An optional name for the check, to be printed as preface to the result.	`None`
`**kwargs`	`Any`	Optional, additional arguments that are accepted by Pandas value_counts() method.	`{}`

Returns:

Type	Description
`Series`	The original Series, unchanged.

`write(path, format=None, fn=lambda s: s, verbose=False, **kwargs)`

Exports Series to file, without modifying the Series itself.

The file format is inferred from the extension. Supports:

- .csv
- .feather
- .parquet
- .pkl # Pickle
- .tsv # Tab-separated data file
- .xlsx

This function uses the corresponding Pandas export function, such as to_csv() or to_feather(). See the Pandas docs for those export functions for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Note

Exporting to some formats such as Excel, Feather, and Parquet may require you to install additional packages.

Example

(
    iris
    ["sepal_length"]

    # Process data
    ...

    # Export the interim data for inspection
    .check.write("sepal_length_interim.xlsx")

    # Continue processing
    ...
)

Parameters:

Name	Type	Description	Default
`path`	`str`	Path to write the file to.	required
`format`	`Union[str, None]`	Optional file format to force for the export. If None, format is inferred from the file's extension in `path`.	`None`
`fn`	`Callable`	An optional lambda function to apply to the Series before exporting. Example: `lambda s: s.dropna()`.	`lambda s: s`
`verbose`	`bool`	Whether to print a message when the file is written.	`False`
`**kwargs`	`Any`	Optional, additional keyword arguments to pass to the Pandas export function (e.g. `.to_csv()`).	`{}`

Returns:

Type	Description
`Series`	The original Series, unchanged.
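
The extension-based format inference can be sketched as a lookup from file suffix to a pandas export method name. This is an illustration under stated assumptions, not the Pandas Checks implementation: `resolve_exporter` and `_EXPORTERS` are hypothetical names, and accepting a `format` value with or without a leading dot is a guess.

```python
from pathlib import Path

# Assumed mapping from file extension to a pandas export method name,
# plus any extra keyword arguments. Illustrative only.
_EXPORTERS = {
    ".csv": ("to_csv", {}),
    ".feather": ("to_feather", {}),
    ".parquet": ("to_parquet", {}),
    ".pkl": ("to_pickle", {}),
    ".tsv": ("to_csv", {"sep": "\t"}),
    ".xlsx": ("to_excel", {}),
}


def resolve_exporter(path, format=None):
    """Return (method_name, kwargs) for a path, honoring a forced format."""
    ext = format if format is not None else Path(path).suffix
    ext = ext.lower()
    if not ext.startswith("."):
        ext = "." + ext          # accept "csv" as well as ".csv" (assumption)
    if ext not in _EXPORTERS:
        raise ValueError(f"Unsupported file format: {ext!r}")
    return _EXPORTERS[ext]


print(resolve_exporter("sepal_length_interim.xlsx"))   # ('to_excel', {})
print(resolve_exporter("interim.tsv"))                 # ('to_csv', {'sep': '\t'})
print(resolve_exporter("interim.dat", format="csv"))   # ('to_csv', {})
```

The last call shows why a `format` override is useful: the path's extension is not in the supported list, so the format has to be stated explicitly.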
