Alkmaar     +31659380335

Data Engineering and Analytics

Dynamically Calculating Yesterday’s Date

By Farnam Iranpour

Streamlining ETL Pipelines: Dynamically Calculating Yesterday’s Date

As a data engineer, one of the most common tasks is building ETL pipelines that process data from the previous day. While it may seem simple to manually enter a date before running the pipeline, this approach is both inefficient and error-prone. A small mistake, such as entering the wrong date format, can lead to unnecessary debugging and delays.

Fortunately, there are simple and effective ways to dynamically calculate yesterday’s date. Whether you’re working in Python or MySQL, you can automate this step to save time and reduce the risk of errors. Let’s explore how to achieve this.

Using Python

Python’s datetime module provides a straightforward way to calculate dates. Here’s how you can dynamically generate yesterday’s date:

This script:

  1. Retrieves the current date using datetime.today().

  2. Subtracts one day using timedelta(days=1).

  3. Formats the resulting date in the desired format, such as YYYY-MM-DD, using strftime.

Using MySQL

If your ETL pipeline relies on SQL queries, MySQL has a built-in function to calculate yesterday’s date:

This query:

  1. Uses CURDATE() to get the current date.

  2. Subtracts one day using the DATE_SUB function with the INTERVAL keyword.

  3. Returns the result as a column labeled yesterday.

Why Automate This Step?

  1. Efficiency: Automating the calculation of yesterday’s date eliminates the need for manual input, allowing your pipelines to run seamlessly.

  2. Accuracy: By leveraging code, you minimize the risk of human error, ensuring that your pipelines always process the correct date range.

  3. Scalability: Automated date calculations make your code more adaptable to changes and reusable across multiple projects.

Conclusion

Manually entering dates is a thing of the past. By dynamically calculating yesterday’s date using Python or MySQL, you can streamline your ETL workflows, improve reliability, and save valuable time. Whether you prefer scripting in Python or writing SQL queries, these methods are easy to implement and will enhance your data engineering practices.