
About


Introduction

Pandas Checks is a Python library for data science and data engineering. It adds non-invasive health checks for Pandas method chains.

What are method chains?

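Method chains are one of the coolest features of the Pandas library! A method chain strings several method calls together in a single expression, with each call operating on the result of the one before it. Chains let you write more functional code with fewer intermediate variables and fewer side effects. If you're familiar with R, method chains are the Pandas counterpart to dplyr pipes.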
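As a rough illustration, here is a minimal sketch of a method chain in plain Pandas. The sample data and the column names (`species`, `weight_kg`, `weight_g`) are invented for this example and are not taken from the Pandas Checks docs.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration.
df = pd.DataFrame(
    {
        "species": ["cat", "dog", "cat", "dog"],
        "weight_kg": [4.0, 20.0, 5.0, None],
    }
)

# One expression, no intermediate variables.
# Each method returns a new object, so `df` itself is left untouched.
mean_weights = (
    df
    .dropna(subset=["weight_kg"])                      # drop rows with a missing weight
    .assign(weight_g=lambda d: d["weight_kg"] * 1000)  # add a derived column
    .groupby("species")["weight_g"]                    # group, then select one column
    .mean()                                            # average weight per species
)
```

Wrapping the chain in parentheses lets each step sit on its own line, which makes it easy to comment out or reorder individual steps.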

Why use Pandas Checks?

Pandas Checks adds the ability to inspect and validate your Pandas data at any point in the method chain, without modifying the underlying data. Think of Pandas Checks as a drone you can send up to check on your pipeline, whether it's in exploratory data analysis, prototyping, or production.

That way you don't need to chop up a method chain, or create intermediate variables, every time you need to diagnose, treat, or prevent problems with your data processing pipeline.
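To make the idea concrete, here is a sketch of a non-invasive check written with plain Pandas only. It does not use the Pandas Checks API: `peek` is a hypothetical helper defined just for this example, and the sample data is invented. It relies on the standard `DataFrame.pipe` method to run a function mid-chain and pass the data through unchanged.

```python
import pandas as pd


def peek(df: pd.DataFrame, label: str) -> pd.DataFrame:
    """Hypothetical helper: report on the data, then return it unchanged."""
    null_count = int(df.isna().sum().sum())
    print(f"{label}: shape={df.shape}, nulls={null_count}")
    return df  # returning the same object keeps the chain intact


# Hypothetical sample data, invented for illustration.
df = pd.DataFrame(
    {
        "species": ["cat", "dog", "cat", "dog"],
        "weight_kg": [4.0, 20.0, 5.0, None],
    }
)

result = (
    df
    .pipe(peek, "raw")                # inspect before cleaning
    .dropna(subset=["weight_kg"])
    .pipe(peek, "after dropna")       # inspect after cleaning
    .assign(weight_g=lambda d: d["weight_kg"] * 1000)
)
```

Removing the two `.pipe(peek, ...)` lines leaves `result` exactly the same, which is the property that makes a check safe to drop into a chain at any point.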

As Fleetwood Mac says, you would never break the chain.


Giving feedback and contributing

If you run into trouble or have questions, I'd love to know. Please open an issue.

Contributions are appreciated! Please open an issue or submit a pull request. Pandas Checks uses the wonderful libraries poetry for package and dependency management, nox for test automation, and mkdocs for docs.

License

Pandas Checks is licensed under the BSD 3-Clause License.

🐼🩺