The Clearly Podcast

Pre-Aggregation of Power BI Data

Summary

As we seem to mention this on almost every episode, we decided to devote an entire podcast to it - pre-aggregation.

To start off with, a matter of terminology.  The official terminology is "aggregation".  In the title we talk about pre-aggregation - this is a matter of clarity since every aggregate function (sum, average etc) carries out an aggregation.  What is officially called aggregation is a technique that aggregates the data in-memory during the loading process and link these aggregations to your measure definitions.

The documentation describes the basics of how to build aggregations, but as ever, Reza Rad has a fantastic tutorial on how to do this.

You can download Power BI Desktop from here.

If you already use Power BI, or are considering it, we strongly recommend you join your local Power BI user group here.

Transcript

Andy: Welcome to the Clearly Podcast, Episode 12. Today, we discuss Power BI pre-aggregation—what it is, when to use it, and its benefits and pitfalls. Let's dive in.

Pre-Aggregation Overview

Andy: Tom, can you explain pre-aggregation and how it differs from regular aggregation?

Tom: Sure. Pre-aggregation, as I call it, is aggregating data while loading it into a data model. This is useful for handling large data sets you don't want to import fully. For instance, summing data by customer and month for faster queries.

Shailan: Even if data is in-memory, using direct query requires aggregation at query time, which can slow things down. Pre-aggregation helps by doing this at the query level.

Benefits of Pre-Aggregation

Andy: How does pre-aggregation help system responsiveness?

Tom: It reduces load on the source server by handling common queries in Power BI. Less frequent queries can still pull from the source.

Andy: So, for large projects, especially with many transactions, this technique can be crucial?

Shailan: Yes, and other tools like Tableau and Click have similar features. The key is pre-aggregating common queries to improve performance.

Implementing Pre-Aggregation

Andy: How do you decide what to pre-aggregate?

Tom: Focus on the most common use cases. Don’t create every possible combination of aggregates; it becomes unmanageable.

Andy: Can you give examples of when to avoid pre-aggregation?

Tom: If your data fits into memory, pre-aggregation may be unnecessary and complicate maintenance. Also, avoid it if you have many infrequent queries or exploratory data analyses.

Shailan: Performance tools can help identify slow queries that might benefit from pre-aggregation.

Practical Advice and Pitfalls

Andy: What are some pitfalls?

Tom: Overly granular aggregation reduces benefits. It's rare to undo pre-aggregation, but if your model is too complex, it can become a maintenance headache.

Shailan: Also, ensure historical data is pre-aggregated, as it's less likely to change.

Top Tips for Pre-Aggregation

Andy: Final tips?

Shailan: If possible, import data instead of using direct query. Focus on common queries like year-to-date or last year's figures.

Tom: Don’t overdo it. Too many combinations can cause issues. Ensure pre-aggregations improve performance significantly.

Andy: Always verify your figures after pre-aggregation to ensure accuracy.

Conclusion

Andy: Thanks for the insights, everyone. Remember, approach pre-aggregation thoughtfully to maintain performance and manageability. Goodbye, and stay tuned for our next episode on project management of Power BI projects.