Modelling Analytical Problems (Basics)
Analytical Problems
Similar to mathematical problems, analytical problems can be expressed and simplified. Problems can be broken down into multiple expressions and then later merged and rationalized into a single design. Like mathematics it also has a spectrum of complexity. On one extreme we have the statistical "please enlighten me" problems and on the other we have the simple "tell me what is going on" questions. Let's leave the statistical problems for another time.
Dimensional Modelling
Dimensional modelling is a framework that allows us to describe our understanding of the analytical requirements. The framework is a translation of the question into a dimensional model which in turn can be used to design the database schema.
Dimensional Model
The model is a construction of 2 basic building blocks – dimensions and measurements. Dimensions are concepts used to qualify a result while measurements are the results. For example: "iPhone4 Sales in January" - Sales is the measurement and it is for the product called iPhone4 in the month of January. This may be represented as SALES <PRODUCT, MONTH>.
Dimensions and Hierarchies
Generally anything used to describe a thing or a group of things are considered as dimensions. While the concept is called a dimension, the actual description is called a dimension element. And the group element if there is any would normally be referred to as the parent element organized into levels within a hierarchy. It is important to realize that there is no hard and fast rule to this and it all depends on the perspective.
Example: In the perspective of a fruit seller, an apple may be the element of fruit dimension. However, in the perspective of a grocery shop who also sells vegetables, an apple may then belong to a group called fruit in the food dimension. This is to cater for carrots that belongs to vegetables in the same food dimension. Confused yet?
To confuse the matter even further, a dimension element (like customer) may actually belongs to more than 1 group. This dimension is considered to have multiple hierarchies. Customer 01 is very loyal and therefore is treated as Gold; And Customer 01 is a still in college hence the customer is classified as Student.
To simplify matters, hierarchies can be treated like separate dimensions of the same elements.
One final note about dimensions is the elements may be derived from a calculation or formula. Example: Age is derived from the Data Of Birth and Loyalty may be calculated based on the number or value of the accumulated sales.
Measures and Calculations
Generally anything that is aggregated will be considered as measures. Examples:[Sum of Sales], [Average Cost], [Customer Count], [Sales per Customer] = [Sum of Sales]/[Customer Count]. It is actually as simple as that. However, it is worth mentioning that these measurements can be further enriched with OLAP funtions like Running Total(), Moving Average(), Percentage(), etc...
Now that you have understood the technicalities... Would you consider Average Age a Dimension or a Measure?
Dimensionality
Dimensionality describes the set of dimensions that qualifies a measure. It is useful in helping modelers understand which measures are compatible. If 2 different measures have the same set of dimensions, it is considered compatible. Example: Sales Value by product and month is compatible with Sales Quantity by product and month. Sales Value by product and month is not compatible with Cost Of Sales by region and month. Sales Value by month is compatible with Cost Of Sales by month.
It is easy to imagine the results of an operation of compatible measures. Example: For the respective Product and Month, Sales Volume can be divided with the Sales Quantity.
However, it makes little sense to have an operation of incompatible measures. Example: For the respective Month, operation between incompatible dimensionality of Product and Region will create a Cartesian product for the combination of Product-Region.
Though, when used carefully, operation of incompatible measures can be useful. Example: For the respective Month and Product, Price can be multiplied with Sales Quantity by Region. The condition is that one of the measures must have the superset dimensionality.
Also when used in this manner with the division operation on percentages, it becomes an effective allocation method.
Mastering Dimensional Modelling
The best way to master dimensional modelling is to have an appreciation of the schema it will eventually conclude to. The measure is the center of the “Universe” while the dimensions and hierarchies pans out from there. The illustration below resembles a snowflake, which gives it its name - snowflake schema.
Rule of Thumb
- Dimension modelling helps translate business questions into schemas
- Dimensionality design depends on perspective
- Measures of different dimensionality are not compatible in any operation
Complex Deal Planning & Execution | Pre-sales/Sales Operations & Management | Value Advisor | Business Architect | Enterprise Architect | Solution Specialist
9yThanks Andy Tan for your kind words. Dimension modeling I believe will be getting more attention with the Big Data wave.
I am passionate about delivering exceptional service to my clients by helping them reduce their annual software maintenance expenses and improve operational efficiency with the aid of an experienced support team.
9yI know how hard you work and how busy you can get so it's even more impressive that you can still squeeze in time to do this.
Complex Deal Planning & Execution | Pre-sales/Sales Operations & Management | Value Advisor | Business Architect | Enterprise Architect | Solution Specialist
9yHi Samit Bodas, a trip down memory lane...
Global Head of GTM Operational Excellence & Business Planning | Business Strategy & Operations
9yGood one Adrian. Reminded me of my days when I used to build star schemas :)
Complex Deal Planning & Execution | Pre-sales/Sales Operations & Management | Value Advisor | Business Architect | Enterprise Architect | Solution Specialist
9yThought I put it down before I forget this timeless info ;)