DataFrame methods
_obj = pandas_obj
instance-attribute
__init__(pandas_obj)
assert_data(condition, subset=None, pass_message=' ✔️ Assertion passed ', fail_message=' ㊨ Assertion failed ', raise_exception=True, exception_to_raise=DataError, message_shows_condition=True, verbose=False)
Tests whether the DataFrame meets a condition. Optionally raises an exception. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
condition | Callable | Assertion criteria in the form of a lambda function. | required |
subset | Union[str, List, None] | Optional, which column or columns to check the condition against. Applied after fn. Subsetting can also be done within the condition itself. | None |
pass_message | str | Message to display if the condition passes. | ' ✔️ Assertion passed ' |
fail_message | str | Message to display if the condition fails. | ' ㊨ Assertion failed ' |
raise_exception | bool | Whether to raise an exception if the condition fails. | True |
exception_to_raise | Type[BaseException] | The exception to raise if the condition fails and raise_exception is True. | DataError |
message_shows_condition | bool | Whether the pass/fail message should also print the assertion criteria. | True |
verbose | bool | Whether to display the pass message if the condition passes. | False |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
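A minimal sketch of the behavior described above, written in plain pandas rather than with Pandas Checks itself. The helper assert_data_sketch and its default of ValueError are hypothetical, and how the library evaluates condition internally is an assumption: here the result is reduced with all(), so a bool, a boolean Series, or a boolean DataFrame are all accepted.

```python
import numpy as np
import pandas as pd


def assert_data_sketch(df, condition, subset=None, raise_exception=True,
                       exception_to_raise=ValueError):
    """Hypothetical stand-in for the documented behavior, not the library's code."""
    data = df[subset] if subset is not None else df
    result = condition(data)
    # Accept a plain bool, a boolean Series, or a boolean DataFrame.
    passed = bool(np.asarray(result).all())
    if not passed and raise_exception:
        raise exception_to_raise("Assertion failed")
    return df  # the original object, so the call can sit inside a method chain


df = pd.DataFrame({"speed": [3.0, 4.5, 7.2], "name": ["a", "b", "c"]})

# Passing condition: the same DataFrame object comes back.
same = assert_data_sketch(df, lambda d: d > 0, subset="speed")

# Failing condition: the chosen exception is raised.
try:
    assert_data_sketch(df, lambda d: d["speed"] > 5)
    raised = False
except ValueError:
    raised = True
```

Returning the input object is what lets an assertion sit in the middle of a chain of DataFrame operations.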
assert_datetime(subset=None, pass_message=' ✔️ Assert datetime passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)
Tests whether the DataFrame or a subset of its columns is of datetime or timestamp type. Optionally raises an exception. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
subset | Union[str, List, None] | Optional, which column or columns to check the condition against. | None |
pass_message | str | Message to display if the condition passes. | ' ✔️ Assert datetime passed ' |
fail_message | Union[str, None] | Message to display if the condition fails. | None |
raise_exception | bool | Whether to raise an exception if the condition fails. | True |
exception_to_raise | Type[BaseException] | The exception to raise if the condition fails and raise_exception is True. | TypeError |
verbose | bool | Whether to display the pass message if the condition passes. | False |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
assert_float(subset=None, pass_message=' ✔️ Assert float passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)
Tests whether the DataFrame or a subset of its columns is of float type. Optionally raises an exception. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
subset | Union[str, List, None] | Optional, which column or columns to check the condition against. | None |
pass_message | str | Message to display if the condition passes. | ' ✔️ Assert float passed ' |
fail_message | Union[str, None] | Message to display if the condition fails. | None |
raise_exception | bool | Whether to raise an exception if the condition fails. | True |
exception_to_raise | Type[BaseException] | The exception to raise if the condition fails and raise_exception is True. | TypeError |
verbose | bool | Whether to display the pass message if the condition passes. | False |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
assert_greater_than(min, or_equal_to=True, subset=None, pass_message=' ✔️ Assert minimum passed ', fail_message=' ㊨ Assert minimum failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)
Tests whether the DataFrame or a subset of its columns is > or >= a value. Optionally raises an exception. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
min | Any | The minimum value to compare the DataFrame to. Accepts any type that can be used in >, such as int, float, str, datetime. | required |
or_equal_to | bool | Whether to test for >= min (True) or > min (False). | True |
subset | Union[str, List, None] | Optional, which column or columns to check the condition against. | None |
pass_message | str | Message to display if the condition passes. | ' ✔️ Assert minimum passed ' |
fail_message | str | Message to display if the condition fails. | ' ㊨ Assert minimum failed ' |
raise_exception | bool | Whether to raise an exception if the condition fails. | True |
exception_to_raise | Type[BaseException] | The exception to raise if the condition fails and raise_exception is True. | DataError |
verbose | bool | Whether to display the pass message if the condition passes. | False |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
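The effect of or_equal_to can be illustrated with a plain pandas comparison. This is only a sketch of the documented semantics: meets_minimum is a hypothetical helper, not part of Pandas Checks, and it returns a bool instead of raising.

```python
import pandas as pd


def meets_minimum(df, minimum, or_equal_to=True, subset=None):
    """Hypothetical helper: True when every selected value clears the minimum."""
    data = df[subset] if subset is not None else df
    mask = data >= minimum if or_equal_to else data > minimum
    return bool(mask.to_numpy().all())


df = pd.DataFrame({"qty": [1, 2, 3], "price": [0.5, 1.0, 2.5]})

inclusive = meets_minimum(df, 1, subset="qty")                  # 1 >= 1 holds
strict = meets_minimum(df, 1, or_equal_to=False, subset="qty")  # 1 > 1 does not
whole_frame = meets_minimum(df, 0.5)                            # checks every column
```

The same pattern with <= and < mirrors the maximum-bound check.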
assert_int(subset=None, pass_message=' ✔️ Assert integeer passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)
Tests whether the DataFrame or a subset of its columns is of integer type. Optionally raises an exception. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
subset | Union[str, List, None] | Optional, which column or columns to check the condition against. | None |
pass_message | str | Message to display if the condition passes. | ' ✔️ Assert integeer passed ' |
fail_message | Union[str, None] | Message to display if the condition fails. | None |
raise_exception | bool | Whether to raise an exception if the condition fails. | True |
exception_to_raise | Type[BaseException] | The exception to raise if the condition fails and raise_exception is True. | TypeError |
verbose | bool | Whether to display the pass message if the condition passes. | False |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
assert_less_than(max, or_equal_to=True, subset=None, pass_message=' ✔️ Assert maximum passed ', fail_message=' ㊨ Assert maximum failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)
Tests whether the DataFrame or a subset of its columns is < or <= a value. Optionally raises an exception. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
max | Any | The maximum value to compare the DataFrame to. Accepts any type that can be used in <, such as int, float, str, datetime. | required |
or_equal_to | bool | Whether to test for <= max (True) or < max (False). | True |
subset | Union[str, List, None] | Optional, which column or columns to check the condition against. | None |
pass_message | str | Message to display if the condition passes. | ' ✔️ Assert maximum passed ' |
fail_message | str | Message to display if the condition fails. | ' ㊨ Assert maximum failed ' |
raise_exception | bool | Whether to raise an exception if the condition fails. | True |
exception_to_raise | Type[BaseException] | The exception to raise if the condition fails and raise_exception is True. | DataError |
verbose | bool | Whether to display the pass message if the condition passes. | False |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
assert_negative(subset=None, assert_not_null=True, pass_message=' ✔️ Assert negative passed ', fail_message=' ㊨ Assert negative failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)
Tests whether the DataFrame or a subset of its columns has all negative values. Optionally raises an exception. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
subset | Union[str, List, None] | Optional, which column or columns to check the condition against. | None |
assert_not_null | bool | Whether to also enforce that data has no nulls. | True |
pass_message | str | Message to display if the condition passes. | ' ✔️ Assert negative passed ' |
fail_message | str | Message to display if the condition fails. | ' ㊨ Assert negative failed ' |
raise_exception | bool | Whether to raise an exception if the condition fails. | True |
exception_to_raise | Type[BaseException] | The exception to raise if the condition fails and raise_exception is True. | DataError |
verbose | bool | Whether to display the pass message if the condition passes. | False |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
assert_not_null(subset=None, pass_message=' ✔️ Assert no nulls passed ', fail_message=' ㊨ Assert no nulls failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)
Tests whether the DataFrame or a subset of its columns has no nulls. Optionally raises an exception. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
subset | Union[str, List, None] | Optional, which column or columns to check the condition against. | None |
pass_message | str | Message to display if the condition passes. | ' ✔️ Assert no nulls passed ' |
fail_message | str | Message to display if the condition fails. | ' ㊨ Assert no nulls failed ' |
raise_exception | bool | Whether to raise an exception if the condition fails. | True |
exception_to_raise | Type[BaseException] | The exception to raise if the condition fails and raise_exception is True. | DataError |
verbose | bool | Whether to display the pass message if the condition passes. | False |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
assert_null(subset=None, pass_message=' ✔️ Assert all nulls passed ', fail_message=' ㊨ Assert all nulls failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)
Tests whether the DataFrame or a subset of its columns has all nulls. Optionally raises an exception. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
subset | Union[str, List, None] | Optional, which column or columns to check the condition against. | None |
pass_message | str | Message to display if the condition passes. | ' ✔️ Assert all nulls passed ' |
fail_message | str | Message to display if the condition fails. | ' ㊨ Assert all nulls failed ' |
raise_exception | bool | Whether to raise an exception if the condition fails. | True |
exception_to_raise | Type[BaseException] | The exception to raise if the condition fails and raise_exception is True. | DataError |
verbose | bool | Whether to display the pass message if the condition passes. | False |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
assert_positive(subset=None, assert_not_null=True, pass_message=' ✔️ Assert positive passed ', fail_message=' ㊨ Assert positive failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)
Tests whether the DataFrame or a subset of its columns has all positive values. Optionally raises an exception. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
subset | Union[str, List, None] | Optional, which column or columns to check the condition against. | None |
assert_not_null | bool | Whether to also enforce that data has no nulls. | True |
pass_message | str | Message to display if the condition passes. | ' ✔️ Assert positive passed ' |
fail_message | str | Message to display if the condition fails. | ' ㊨ Assert positive failed ' |
raise_exception | bool | Whether to raise an exception if the condition fails. | True |
exception_to_raise | Type[BaseException] | The exception to raise if the condition fails and raise_exception is True. | DataError |
verbose | bool | Whether to display the pass message if the condition passes. | False |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
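The interaction between the sign check and assert_not_null can be sketched in plain pandas. The helper all_positive is hypothetical, and the treatment of nulls when assert_not_null is False (they are skipped) is an assumption about the intended semantics rather than a statement about the library's internals.

```python
import numpy as np
import pandas as pd


def all_positive(series, assert_not_null=True):
    """Hypothetical helper mirroring the documented options."""
    if assert_not_null and series.isna().any():
        return False  # a null alone is enough to fail
    # Assumption: with assert_not_null=False, nulls are ignored.
    return bool((series.dropna() > 0).all())


s = pd.Series([1.5, np.nan, 3.0])

strict = all_positive(s)                          # fails because of the null
lenient = all_positive(s, assert_not_null=False)  # only non-null values are tested
```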
assert_str(subset=None, pass_message=' ✔️ Assert string passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)
Tests whether the DataFrame or a subset of its columns is of string type. Optionally raises an exception. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
subset | Union[str, List, None] | Optional, which column or columns to check the condition against. | None |
pass_message | str | Message to display if the condition passes. | ' ✔️ Assert string passed ' |
fail_message | Union[str, None] | Message to display if the condition fails. | None |
raise_exception | bool | Whether to raise an exception if the condition fails. | True |
exception_to_raise | Type[BaseException] | The exception to raise if the condition fails and raise_exception is True. | TypeError |
verbose | bool | Whether to display the pass message if the condition passes. | False |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
assert_timedelta(subset=None, pass_message=' ✔️ Assert timedelta passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)
Tests whether the DataFrame or a subset of its columns is of type timedelta. Optionally raises an exception. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
subset | Union[str, List, None] | Optional, which column or columns to check the condition against. | None |
pass_message | str | Message to display if the condition passes. | ' ✔️ Assert timedelta passed ' |
fail_message | Union[str, None] | Message to display if the condition fails. | None |
raise_exception | bool | Whether to raise an exception if the condition fails. | True |
exception_to_raise | Type[BaseException] | The exception to raise if the condition fails and raise_exception is True. | TypeError |
verbose | bool | Whether to display the pass message if the condition passes. | False |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
assert_type(dtype, subset=None, pass_message=' ✔️ Assert type passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)
Tests whether the DataFrame or a subset of its columns meets a type assumption. Optionally raises an exception. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dtype | Type[Any] | The required variable type. | required |
subset | Union[str, List, None] | Optional, which column or columns to check the condition against. | None |
pass_message | str | Message to display if the condition passes. | ' ✔️ Assert type passed ' |
fail_message | Union[str, None] | Message to display if the condition fails. | None |
raise_exception | bool | Whether to raise an exception if the condition fails. | True |
exception_to_raise | Type[BaseException] | The exception to raise if the condition fails and raise_exception is True. | TypeError |
verbose | bool | Whether to display the pass message if the condition passes. | False |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
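For orientation, the kinds of questions the type assertions ask can be expressed with the predicates in pandas.api.types. This is a plain pandas sketch; whether Pandas Checks uses these particular predicates internally is not stated in this reference and should be treated as an assumption.

```python
import pandas as pd
from pandas.api import types as ptypes

df = pd.DataFrame({
    "when": pd.to_datetime(["2024-01-01", "2024-01-02"]),
    "wait": pd.to_timedelta(["1 days", "2 days"]),
    "ratio": [0.25, 0.75],
    "count": [1, 2],
})

# One dtype predicate per column, each a real pandas.api.types function.
checks = {
    "when": ptypes.is_datetime64_any_dtype,
    "wait": ptypes.is_timedelta64_dtype,
    "ratio": ptypes.is_float_dtype,
    "count": ptypes.is_integer_dtype,
}
results = {col: bool(check(df[col])) for col, check in checks.items()}

# An integer column does not satisfy the float predicate.
count_is_float = bool(ptypes.is_float_dtype(df["count"]))
```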
assert_unique(subset=None, pass_message=' ✔️ Assert unique passed ', fail_message=' ㊨ Assert unique failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)
Tests whether the DataFrame or a subset of its columns has no duplicate rows. Optionally raises an exception. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
subset | Union[str, List, None] | Optional, which column or columns to check the condition against. | None |
pass_message | str | Message to display if the condition passes. | ' ✔️ Assert unique passed ' |
fail_message | str | Message to display if the condition fails. | ' ㊨ Assert unique failed ' |
raise_exception | bool | Whether to raise an exception if the condition fails. | True |
exception_to_raise | Type[BaseException] | The exception to raise if the condition fails and raise_exception is True. | DataError |
verbose | bool | Whether to display the pass message if the condition passes. | False |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
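Uniqueness depends on which columns are considered, which is what subset controls. A plain pandas sketch using DataFrame.duplicated() shows the difference; the data and variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 2], "city": ["Oslo", "Lima", "Pune"]})

# Across all columns, every row is distinct.
whole_rows_unique = not df.duplicated().any()

# Restricted to "id", the value 2 appears twice.
ids_unique = not df.duplicated(subset=["id"]).any()
```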
columns(fn=lambda df: df, subset=None, check_name='🏛️ Columns')
Prints the column names of a DataFrame, without modifying the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Callable | An optional lambda function to apply to the DataFrame before printing columns. | lambda df: df |
subset | Union[str, List, None] | An optional list of column names or a string to select a subset of columns before printing their names. Applied after fn. | None |
check_name | Union[str, None] | An optional name for the check to preface the result with. | '🏛️ Columns' |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
describe(fn=lambda df: df, subset=None, check_name='📏 Distributions', **kwargs)
Displays descriptive statistics about a DataFrame without modifying the DataFrame itself.
See Pandas docs for describe() for additional usage information, including more configuration options you can pass to this Pandas Checks method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Callable | An optional lambda function to apply to the DataFrame before running Pandas describe(). | lambda df: df |
subset | Union[str, List, None] | An optional list of column names or a string to select a subset of columns before running Pandas describe(). Applied after fn. | None |
check_name | Union[str, None] | An optional name for the check to preface the result with. | '📏 Distributions' |
**kwargs | Any | Optional, additional arguments that are accepted by Pandas describe() method. | {} |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
disable_checks(enable_asserts=True)
Turns off Pandas Checks globally, such as in production mode. Calls to .check functions will not be run. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
enable_asserts | bool | Optionally, whether to also enable or disable assert statements. | True |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
dtypes(fn=lambda df: df, subset=None, check_name='🗂️ Data types')
Displays the data types of a DataFrame's columns without modifying the DataFrame itself.
See Pandas docs for dtypes for additional usage information.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Callable | An optional lambda function to apply to the DataFrame before running Pandas dtypes. | lambda df: df |
subset | Union[str, List, None] | An optional list of column names or a string to select a subset of columns before running Pandas .dtypes. Applied after fn. | None |
check_name | Union[str, None] | An optional name for the check to preface the result with. | '🗂️ Data types' |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
enable_checks(enable_asserts=True)
Globally enables Pandas Checks. Subsequent calls to .check methods will be run. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
enable_asserts | bool | Optionally, whether to globally enable or disable calls to .check.assert_data(). | True |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
function(fn=lambda df: df, subset=None, check_name=None)
Applies an arbitrary function on a DataFrame and shows the result, without modifying the DataFrame itself.
Example
.check.function(fn=lambda df: df.shape[0]>10, check_name='Has more than 10 rows?'), which will display True or False.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Callable | A lambda function to apply to the DataFrame. Example: lambda df: df.shape[0]>10 | lambda df: df |
subset | Union[str, List, None] | An optional list of column names or a string to select a subset of columns. Applied after fn. | None |
check_name | Union[str, None] | An optional name for the check to preface the result with. | None |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
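The "show a result, return the original" pattern can be reproduced in plain pandas with DataFrame.pipe(). The helper show below is hypothetical and far simpler than the library's method (it has no subset handling); it only illustrates why a check can be dropped into a method chain without changing what flows through it.

```python
import pandas as pd


def show(df, fn=lambda d: d, check_name=None):
    """Hypothetical helper: print fn's result, hand back the untouched frame."""
    if check_name:
        print(check_name)
    print(fn(df))
    return df


df = pd.DataFrame({"x": range(12)})

out = (
    df
    .pipe(show, fn=lambda d: d.shape[0] > 10, check_name="Has more than 10 rows?")
    .assign(y=lambda d: d["x"] * 2)  # the chain continues on the original data
)
```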
get_mode(check_name='🐼🩺 Pandas Checks mode')
Displays the current values of Pandas Checks global options enable_checks and enable_asserts. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
check_name | Union[str, None] | An optional name for the check. Will be used as a preface to the printed result. | '🐼🩺 Pandas Checks mode' |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
head(n=5, fn=lambda df: df, subset=None, check_name=None)
Displays the first n rows of a DataFrame, without modifying the DataFrame itself.
See Pandas docs for head() for additional usage information.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n | int | The number of rows to display. | 5 |
fn | Callable | An optional lambda function to apply to the DataFrame before running Pandas head(). | lambda df: df |
subset | Union[str, List, None] | An optional list of column names or a string to select a subset of columns before running Pandas head(). Applied after fn. | None |
check_name | Union[str, None] | An optional name for the check, to be printed as preface to the result. | None |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
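The parameter descriptions state an order of operations: fn is applied first, then subset, then the display step. A plain pandas sketch of that order for the head case, with illustrative data and names:

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Pune", "Kyiv"],
    "temp": [4, 19, 31, 9],
    "rain": [True, False, False, True],
})

fn = lambda d: d[d["temp"] > 5]   # step 1: transform or filter
subset = ["city", "temp"]         # step 2: keep only these columns
preview = fn(df)[subset].head(2)  # step 3: the first n rows of what is left

# df itself is not altered by any of these steps.
```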
hist(fn=lambda df: df, subset=[], check_name=None, **kwargs)
Displays a histogram for the DataFrame, without modifying the DataFrame itself.
See Pandas docs for hist() for additional usage information, including more configuration options you can pass to this Pandas Checks method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Callable | An optional lambda function to apply to the DataFrame before running Pandas hist(). | lambda df: df |
subset | Union[str, List, None] | An optional list of column names or a string to select a subset of columns before running Pandas hist(). Applied after fn. | [] |
check_name | Union[str, None] | An optional name for the check, to be printed as preface to the result. | None |
**kwargs | Any | Optional, additional arguments that are accepted by Pandas hist() method. | {} |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
Note
If more than one column is passed, displays a grid of histograms
Only renders in interactive mode (IPython/Jupyter), not in terminal
info(fn=lambda df: df, subset=None, check_name='ℹ️ Info', **kwargs)
Displays summary information about a DataFrame, without modifying the DataFrame itself.
See Pandas docs for info() for additional usage information, including more configuration options you can pass to this Pandas Checks method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Callable | An optional lambda function to apply to the DataFrame before running Pandas info(). | lambda df: df |
subset | Union[str, List, None] | An optional list of column names or a string to select a subset of columns before running Pandas info(). Applied after fn. | None |
check_name | Union[str, None] | An optional name for the check, to be printed as preface to the result. | 'ℹ️ Info' |
**kwargs | Any | Optional, additional arguments that are accepted by Pandas info() method. | {} |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
memory_usage(fn=lambda df: df, subset=None, check_name='💾 Memory usage', **kwargs)
Displays the memory footprint of a DataFrame, without modifying the DataFrame itself.
See Pandas docs for memory_usage() for additional usage information, including more configuration options you can pass to this Pandas Checks method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Callable | An optional lambda function to apply to the DataFrame before running Pandas memory_usage(). | lambda df: df |
subset | Union[str, List, None] | An optional list of column names or a string to select a subset of columns before running Pandas memory_usage(). Applied after fn. | None |
check_name | Union[str, None] | An optional name for the check, to be printed as preface to the result. | '💾 Memory usage' |
**kwargs | Any | Optional, additional arguments that are accepted by Pandas memory_usage() method. | {} |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
Note
Include argument deep=True to get further memory usage of object dtypes in the DataFrame. See Pandas docs for memory_usage() for more info.
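The effect of deep=True can be seen directly on the underlying pandas method. This sketch calls DataFrame.memory_usage() itself rather than the Pandas Checks wrapper; the exact byte counts depend on the pandas version and platform, so none are shown.

```python
import pandas as pd

df = pd.DataFrame({"n": [1, 2, 3], "label": ["alpha", "beta", "gamma"]})

shallow = df.memory_usage()        # per-column bytes, plus an "Index" entry
deep = df.memory_usage(deep=True)  # also inspects the objects held in object columns

extra_bytes = int(deep.sum() - shallow.sum())
```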
ncols(fn=lambda df: df, subset=None, check_name='🏛️ Columns')
Displays the number of columns in a DataFrame, without modifying the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Callable | An optional lambda function to apply to the DataFrame before counting the number of columns. | lambda df: df |
subset | Union[str, List, None] | An optional list of column names or a string to select a subset of columns before counting the number of columns. Applied after fn. | None |
check_name | Union[str, None] | An optional name for the check, to be printed as preface to the result. | '🏛️ Columns' |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
ndups(fn=lambda df: df, subset=None, check_name=None, **kwargs)
Displays the number of duplicated rows in a DataFrame, without modifying the DataFrame itself.
See Pandas docs for duplicated() for additional usage information, including more configuration options you can pass to this Pandas Checks method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Callable | An optional lambda function to apply to the DataFrame before counting the number of duplicates. | lambda df: df |
subset | Union[str, List, None] | An optional list of column names or a string to select a subset of columns before counting duplicate rows. Applied after fn. | None |
check_name | Union[str, None] | An optional name for the check, to be printed as preface to the result. | None |
**kwargs | Any | Optional, additional arguments that are accepted by Pandas duplicated() method. | {} |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
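Because extra keyword arguments are described as being accepted by Pandas duplicated(), options such as keep change what gets counted. The sketch below uses DataFrame.duplicated() directly to show the difference; it does not call Pandas Checks.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "city": ["Oslo", "Lima", "Lima", "Pune"],
})

# Default keep="first": only the later copy of a repeated row is flagged.
n_later_copies = int(df.duplicated().sum())

# keep=False: every member of a duplicate group is flagged.
n_all_members = int(df.duplicated(keep=False).sum())
```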
nnulls(fn=lambda df: df, subset=None, by_column=True, check_name='👻 Rows with NaNs')
Displays the number of rows with null values in a DataFrame, without modifying the DataFrame itself.
See Pandas docs for isna() for additional usage information.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Callable | An optional lambda function to apply to the DataFrame before counting the number of rows with a null. | lambda df: df |
subset | Union[str, List, None] | An optional list of column names or a string to select a subset of columns before counting nulls. Applied after fn. | None |
by_column | bool | If True, count null values within each column separately. If False, count rows with a null value in any column. | True |
check_name | Union[str, None] | An optional name for the check, to be printed as preface to the result. | '👻 Rows with NaNs' |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
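The two counting modes selected by by_column correspond to two different pandas expressions built on isna(). This is a plain pandas sketch of the distinction, not the library's implementation.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, None, 3.0],
    "b": [None, None, "x"],
})

# by_column=True: one null count per column.
per_column = df.isna().sum()

# by_column=False: number of rows that contain a null in any column.
rows_with_any_null = int(df.isna().any(axis=1).sum())
```

The per-column counts can add up to more than the row count, since one row may hold nulls in several columns.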
nrows(fn=lambda df: df, subset=None, check_name='☰ Rows')
Displays the number of rows in a DataFrame, without modifying the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Callable | An optional lambda function to apply to the DataFrame before counting the number of rows. | lambda df: df |
subset | Union[str, List, None] | An optional list of column names or a string name of one column to limit which columns are considered when counting rows. Applied after fn. | None |
check_name | Union[str, None] | An optional name for the check, to be printed as preface to the result. | '☰ Rows' |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
nunique(column, fn=lambda df: df, check_name=None, **kwargs)
Displays the number of unique values in a single column, without modifying the DataFrame itself.
See Pandas docs for nunique() for additional usage information, including more configuration options you can pass to this Pandas Checks method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
column | str | The name of a column to count uniques in. Applied after fn. | required |
fn | Callable | An optional lambda function to apply to the DataFrame before running Pandas nunique(). | lambda df: df |
check_name | Union[str, None] | An optional name for the check, to be printed as preface to the result. | None |
**kwargs | Any | Optional, additional arguments that are accepted by Pandas nunique() method. | {} |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
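One option of the underlying Pandas nunique() that is often worth passing through is dropna, which decides whether a missing value counts as its own category. The sketch calls Series.nunique() directly on illustrative data.

```python
import pandas as pd

df = pd.DataFrame({"grade": ["a", "a", None, "b"]})

without_nulls = df["grade"].nunique()          # missing values are ignored by default
with_nulls = df["grade"].nunique(dropna=False) # the missing value counts as one more
```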
plot(fn=lambda df: df, subset=None, check_name='', **kwargs)
Displays a plot of the DataFrame, without modifying the DataFrame itself.
See Pandas docs for plot() for additional usage information, including more configuration options you can pass to this Pandas Checks method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Callable | An optional lambda function to apply to the DataFrame before running Pandas plot(). | lambda df: df |
subset | Union[str, List, None] | An optional list of column names or a string name of one column to limit which columns are plotted. Applied after fn. | None |
check_name | Union[str, None] | An optional title for the plot. | '' |
**kwargs | Any | Optional, additional arguments that are accepted by Pandas plot() method. | {} |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
Note
Plots are only displayed when code is run in IPython/Jupyter, not in terminal.
If you pass a 'title' kwarg, it becomes the plot title, overriding check_name
print(object=None, fn=lambda df: df, subset=None, check_name=None, max_rows=10)
Displays text, another object, or (by default) the current DataFrame's head. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
object | Any | Object to print. Can be anything printable: str, int, list, another DataFrame, etc. If None, prints the DataFrame's head (up to max_rows rows). | None |
fn | Callable | An optional lambda function to apply to the DataFrame before printing. | lambda df: df |
subset | Union[str, List, None] | An optional list of column names or a string name of one column to limit which columns are printed. Applied after fn. | None |
check_name | Union[str, None] | An optional name for the check, to be printed as preface to the result. | None |
max_rows | int | Maximum number of rows to print if object=None. | 10 |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
print_time_elapsed(start_time, lead_in='Time elapsed', units='auto')
Displays the time elapsed since start_time.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
start_time | float | The index time when the stopwatch started, which comes from the Pandas Checks start_timer() function. | required |
lead_in | Union[str, None] | Optional text to print before the elapsed time. | 'Time elapsed' |
units | str | The units in which to display the elapsed time. Allowed values: "auto", "milliseconds", "seconds", "minutes", "hours" or shorthands "ms", "s", "m", "h". | 'auto' |
Raises:
Type | Description |
---|---|
ValueError | If units is not one of the allowed values. |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
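The units handling described above can be sketched with the standard library alone. Everything here is hypothetical: format_elapsed is not a Pandas Checks function, time.perf_counter() stands in for the library's start_timer(), and the rule used for "auto" (pick the largest unit that keeps the value at 1 or more) is an assumption.

```python
import time

_UNIT_SECONDS = {"milliseconds": 0.001, "seconds": 1.0, "minutes": 60.0, "hours": 3600.0}
_SHORTHANDS = {"ms": "milliseconds", "s": "seconds", "m": "minutes", "h": "hours"}


def format_elapsed(elapsed_seconds, units="auto"):
    """Hypothetical helper: render a duration in the requested units."""
    units = _SHORTHANDS.get(units, units)
    if units == "auto":
        # Assumed rule: largest unit for which the value is at least 1.
        if elapsed_seconds >= 3600:
            units = "hours"
        elif elapsed_seconds >= 60:
            units = "minutes"
        elif elapsed_seconds >= 1:
            units = "seconds"
        else:
            units = "milliseconds"
    if units not in _UNIT_SECONDS:
        raise ValueError(f"Unsupported units: {units!r}")
    return f"{elapsed_seconds / _UNIT_SECONDS[units]:.2f} {units}"


start = time.perf_counter()  # stand-in for a stopwatch start value
elapsed = time.perf_counter() - start
message = "Time elapsed: " + format_elapsed(elapsed)
```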
reset_format()
Globally restores all Pandas Checks formatting options to their default "factory" settings. Does not modify the DataFrame itself.
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
set_format(**kwargs)
Configures selected formatting options for Pandas Checks. Does not modify the DataFrame itself.
Run pandas_checks.describe_options() to see a list of available options.
For example, .check.set_format(check_text_tag="h1", use_emojis=False) will globally change Pandas Checks to display text results as H1 headings and remove all emojis.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
**kwargs | Any | Pairs of setting name and its new value. | {} |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
set_mode(enable_checks, enable_asserts)
Configures the operation mode for Pandas Checks globally. Does not modify the DataFrame itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
enable_checks | bool | Whether to run any Pandas Checks methods globally. Does not affect .check.assert_data(). | required |
enable_asserts | bool | Whether to run Pandas Checks .check.assert_data() calls globally. | required |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
shape(fn=lambda df: df, subset=None, check_name='📐 Shape')
Displays the DataFrame's dimensions, without modifying the DataFrame itself.
See Pandas docs for shape for additional usage information.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fn | Callable | An optional lambda function to apply to the DataFrame before running Pandas shape. | lambda df: df |
subset | Union[str, List, None] | An optional list of column names, or the name of a single column as a string, to limit which columns are considered when printing the shape. Applied after fn. | None |
check_name | Union[str, None] | An optional name for the check, printed as a preface to the result. | '📐 Shape' |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
Note
See also .check.nrows() and .check.ncols()
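The order of operations described in the parameter table (fn first, then subset, then the shape) can be reproduced in plain pandas. The helper `shape_after` below is a hypothetical stand-in, not the library's code. Wrapping a single column name in a list so the result stays two-dimensional is an assumption.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "species": ["cat", "dog", "cat", "bird"],
        "weight_kg": [4.0, 20.5, 3.5, 0.1],
        "age": [3, 5, 2, 1],
    }
)


def shape_after(df, fn=lambda df: df, subset=None):
    """Hypothetical stand-in: apply fn, then subset, then read .shape."""
    result = fn(df)
    if subset is not None:
        # Assumption: a single column name is wrapped in a list so the
        # result stays a two-dimensional DataFrame.
        columns = [subset] if isinstance(subset, str) else list(subset)
        result = result[columns]
    return result.shape


full_shape = shape_after(df)
cats_shape = shape_after(
    df, fn=lambda d: d[d["species"] == "cat"], subset=["species", "age"]
)
one_col_shape = shape_after(df, subset="age")
print(full_shape, cats_shape, one_col_shape)
```

The input DataFrame is never reassigned, which mirrors the "does not modify the DataFrame" guarantee stated for each method.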
tail(n=5, fn=lambda df: df, subset=None, check_name=None)
Displays the last n rows of the DataFrame, without modifying the DataFrame itself.
See Pandas docs for tail() for additional usage information, including more configuration options you can pass to this Pandas Checks method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n | int | Number of rows to show. | 5 |
fn | Callable | An optional lambda function to apply to the DataFrame before running Pandas tail(). | lambda df: df |
subset | Union[str, List, None] | An optional list of column names, or the name of a single column as a string, to limit which columns are displayed. Applied after fn. | None |
check_name | Union[str, None] | An optional name for the check, printed as a preface to the result. | None |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
unique(column, fn=lambda df: df, check_name=None)
Displays the unique values in a column, without modifying the DataFrame itself.
See Pandas docs for unique() for additional usage information, including more configuration options you can pass to this Pandas Checks method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
column | str | Column to check for unique values. | required |
fn | Callable | An optional lambda function to apply to the DataFrame before calling Pandas unique(). | lambda df: df |
check_name | Union[str, None] | An optional name for the check, printed as a preface to the result. | None |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
Note
fn is applied to the DataFrame before selecting column. If you want to select the column before modifying it, set column=None and start fn with a column selection, e.g. fn=lambda df: df["my_column"].stuff()
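The difference between the two orderings in the note can be seen with plain pandas calls. This sketch does not call Pandas Checks; it only mimics the described sequence, and the sample data and lambdas are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"my_column": ["a", "b", "a", "c"], "score": [1, 2, 3, 4]}
)

# Ordering described in the note: fn runs on the whole DataFrame first,
# and only then is the column selected.
fn = lambda d: d[d["score"] > 1]
unique_after_fn = list(fn(df)["my_column"].unique())

# Alternative from the note: fn itself starts with the column selection,
# so it can use Series methods on that column.
select_first = lambda d: d["my_column"].str.upper()
unique_select_first = list(select_first(df).unique())

print(unique_after_fn, unique_select_first)
```

In the first form fn can filter on other columns such as score; in the second form fn receives the DataFrame but returns a Series, so no separate column selection is needed afterwards.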
value_counts(column, fn=lambda df: df, max_rows=10, check_name=None, **kwargs)
Displays the value counts for a column, without modifying the DataFrame itself.
See Pandas docs for value_counts() for additional usage information, including more configuration options you can pass to this Pandas Checks method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
column | str | Column to check for value counts. | required |
max_rows | int | Maximum number of rows to show in the value counts. | 10 |
fn | Callable | An optional lambda function to apply to the DataFrame before running Pandas value_counts(). | lambda df: df |
check_name | Union[str, None] | An optional name for the check, printed as a preface to the result. | None |
**kwargs | Any | Optional additional arguments accepted by the Pandas value_counts() method. | {} |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
Note
fn is applied to the DataFrame before selecting column. If you want to select the column before modifying it, set column=None and start fn with a column selection, e.g. fn=lambda df: df["my_column"].stuff()
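As a rough illustration of how fn, column, max_rows, and **kwargs might combine, the following sketch uses only pandas. The helper `value_counts_preview` is hypothetical; it assumes that extra keyword arguments such as normalize are forwarded to pandas `Series.value_counts()` and that max_rows simply truncates the result.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "city": ["Oslo", "Lima", "Oslo", "Pune", "Oslo", "Lima"],
        "sales": [5, 0, 7, 3, 0, 9],
    }
)


def value_counts_preview(df, column, fn=lambda df: df, max_rows=10, **kwargs):
    """Hypothetical stand-in: fn, then column selection, then value_counts()."""
    # Assumption: kwargs go straight to pandas, and max_rows caps the output.
    return fn(df)[column].value_counts(**kwargs).head(max_rows)


counts = value_counts_preview(df, "city")
shares = value_counts_preview(
    df, "city", fn=lambda d: d[d["sales"] > 0], max_rows=2, normalize=True
)
print(counts)
print(shares)
```

Here normalize=True turns counts into proportions of the filtered rows, so the filter in fn changes the denominator as well as the counts.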
write(path, format=None, fn=lambda df: df, subset=None, verbose=False, **kwargs)
Exports DataFrame to file, without modifying the DataFrame itself.
The format is inferred from the extension of path, such as .csv.
This function uses the corresponding Pandas export function, such as to_csv(). See Pandas docs for those functions for additional usage information, including more configuration options you can pass to this Pandas Checks method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | Path to write the file to. | required |
format | Union[str, None] | Optional file format to force for the export. If None, the format is inferred from the file extension in path. | None |
fn | Callable | An optional lambda function to apply to the DataFrame before exporting. | lambda df: df |
subset | Union[str, List, None] | An optional list of column names, or the name of a single column as a string, to limit which columns are exported. Applied after fn. | None |
verbose | bool | Whether to print a message when the file is written. | False |
**kwargs | Any | Optional additional keyword arguments to pass to the Pandas export function, such as to_csv(). | {} |
Returns:
Type | Description |
---|---|
DataFrame | The original DataFrame, unchanged. |
Note
Exporting to some formats such as Excel, Feather, and Parquet may require you to install additional packages.
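Inferring the export function from the path extension, with an optional forced format, could look like the sketch below. The helper `pick_exporter` and its extension table are hypothetical and are not the library's actual lookup. The method names it returns (`to_csv`, `to_excel`, `to_feather`, `to_parquet`, `to_json`) are real pandas DataFrame methods.

```python
from pathlib import Path

# Hypothetical extension table for illustration; the set of formats that
# Pandas Checks actually supports may differ.
_EXPORTERS = {
    ".csv": "to_csv",
    ".xlsx": "to_excel",
    ".feather": "to_feather",
    ".parquet": "to_parquet",
    ".json": "to_json",
}


def pick_exporter(path, format=None):
    """Return the name of the pandas export method for a path or forced format."""
    if format is None:
        extension = Path(path).suffix.lower()
    else:
        # A forced format overrides whatever the path extension says.
        extension = "." + format.lower().lstrip(".")
    try:
        return _EXPORTERS[extension]
    except KeyError:
        raise ValueError(f"Unsupported export format: {extension!r}") from None


print(pick_exporter("out/data.csv"))
print(pick_exporter("report.txt", format="parquet"))
```

With a DataFrame df in hand, the chosen method could then be called as getattr(df, pick_exporter(path))(path, **kwargs), which is where the extra keyword arguments would reach pandas.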