Skip to main content

Power BI

Needs Votes

Support Time Series Data Compression

Vote (5) Share
Sam I's profile image

Sam I on 30 Nov 2020 18:03:10

This can be done in one of two ways:
1) Enable Date/Time compression in the VertiPAQ Engine.
2) Support formatting UNIX times into date times natively in Power BI Visuals (already implemented in open source plotting libraries)

On the former, VertiPaq is good at compressing data in general. However, it is beyond useless when it comes to storing Date/Time columns as they tend to be entirely unique with time series data. Now, if the compression engine could natively split Date/Time into Year, Month, Date, Hour, Minute, and Seconds columns, then compression becomes viable. Even just splitting Date/Time along Date and Time allows for efficient dictionary encoding.

This technique is even recommended for storage savings in a SQLBI article:
https://www.sqlbi.com/articles/optimizing-high-cardinality-columns-in-vertipaq/
And this explains how VertiPaq compresses data:
https://www.microsoftpressstore.com/articles/article.aspx?p=2449192&seqNum=3

On the latter, VertiPaq is already good at compressing integers. There's just no way to take advantage of that with UNIX time because there isn't a visual that cares for it. Many open-source JS graphing libraries support UNIX - namely D3 and Plotly. Both accept UNIX milliseconds as a valid date-time. This can easily be done in PBI by formatting the ticks and tooltips as Date/Time with little performance reduction - the underlying data doesn't need to be converted to Date/Time in memory, it only needs to appear that way.

Either way, working with Timeseries data in Power BI would become much more practical.

Comments (2)
Sam I's profile image Profile Picture

Kenneth Barber on 22 May 2023 16:00:39

RE: Support Time Series Data Compression

Power BI should support some sort of native datetime data type where times with precision down to the second or better can be represented exactly with integers. This could be Unix time or even ticks.Currently, times are stored as a fraction of a day, which most are not a power of 2, and so they require infinite "decimal" places to express exactly. Since we don't have infinite "decimal" places, times are subject to rounding error, especially when doing any sort of math on the times. You also get less precision available for times the further in time you go since more of the limited precision will be used up to specify the date portion of a datetime.

Sam I's profile image Profile Picture

Sam I on 10 Dec 2020 14:07:51

RE: Support Time Series Data Compression

A Power BI employee has informed me VertiPaq already does what it can with date/time columns. However, I can't edit the idea to reflect this, so forgive the ignoramus tone. It's a question of how well it could handle splitting and knitting date and time columns together while maintaining cardinality.

So, it looks like it's a better bet to support UNIX integers as a valid date/time type.