Kimball has been a standard in the design and implementation of data warehouses over the years, mainly for gains in data reading performance.
Lately, with the progress in analytics, the question has arisen whether the use of the Kimball methodology is still relevant.
Why does this question arise?
The emergence of columnar storage has challenged the main advantage of the methodology: performance.
Traditional relational databases are stored in row pages, to get a column from a particular row, you need to pull the whole row.
With columnar storage, data is stored in compressed column segments. As each column is stored individually, it is possible to read only the desired columns.
Column storage brings performance gains in OLAP workloads. Therefore, having a single wide table with all the columns will performed better, since it eliminates the joins without losing performance.
So, is Kimball still relevant in a modern DW architecture?
It depends, but for most data warehouse the answer is… yes, but the reason it is not performance anymore.
Despite a wide denormalised table has improved performance; it can be difficult to maintain. Data management is very important in this kind of systems.
Updating a field in a SCD type 1 dimension may require updates on millions of records on our denormalised table. Also, evolving the system, like adding a new field to a dimension can be a huge task on a denormalised table.
If we are loading information from several different systems, something typical in a DW, a Kimball approach helps to identify and standardize common dimensions.
Also when implementing dashboards, facts with common dimensions can take advantage of these links for interaction (e.g. PowerBI), something more complicated between two or more denormalized tables.
What to do?
In conclusion, the reasons for using Kimball will be not for performance but for data management.
This do not exclude that putting everything in a single denormalised table cannot be a solution for some scenarios where performance is needed. We can adopt a hybrid architecture according to the requirements.
Kimball is not the only solution for a datawarehouse, other methodologies, such as Data Vault or Inmon, exists and should be considered. However, they suffer from the same problems discussed in this article.