OLAP数据库由非规范化形式的数据组成。
但一种流行的OLAP数据库设计是事实-维模型。事实表将存储基于数值的条目(销售数量等)
我的问题是,在这个设计中,它看起来一点也不规范,因为所有维度表都有对事实表的外键引用。它与OLTP设计有何不同?
发布于 2013-08-26 19:46:13
反规范化是在星型模式中的维度表中进行的:例如,在产品表中,您在这个表中显式地有许多列,比如产品类别的几个级别,而不是每个级别都有一个表,并使用引用这些值的外键。
这意味着您可以对事实进行标准化,但停止对维度表进行标准化。
此外,您经常甚至不能完全规范化事实。一个典型的例子是这样的:在一个完全规范化的表中,您将只使用两个列'number of units sold‘和'price per unit',但在OLAP数据库中,冗余地拥有另一列'sales value’可能是有意义的,该列可以很容易地通过将售出的单位与单位价格相乘来计算。
发布于 2018-10-27 04:36:28
You can get the difference if you study first "highly normalized schemas".
https://www2.microstrategy.com/producthelp/10.6/ProjectDesignGuide/WebHelp/Lang_1033/Content/ProjectDesign/Highly_normalized_schema__Minimal_storage_space.htm
Will give you an example: Consider a "city" inside a "country" for a "person",
all what you need to store for a person is only his "city" because anyway that city resides in a "country".
so you don't have also to store the "country" in the "person" table.
This approach will have advantage of "minimal" storage.
But as disadvantage it will be annoying to retrieve "country" for a "person"
since you will have to do many joins to achieve that.
So regarding your question, in your design, if we stored both "city_id" and "country_code" in "person" table,
this will cause little redundancy but as advantage it will be more easier to get "person" "country" by directly joining the two tables "Countries" and "person" together.
Normalization main purpose is to remove redundancy. And to achieve data consistency.
For example, in your case OLAP , developer can make mistake by inserting correct "city_id" and wrong "country_id"
for example he can insert "Paris" as city and by mistake he can insert "Germany" as the country which is wrong.
If the schema is fully normalized, this cannot never happens since it will store only "Paris" "city id" in "party" table and will not store "country" id.
So yes, OLAP is de-normalized since it allows data redundancy and developers (application) mistakes (if any).
https://stackoverflow.com/questions/18434012
复制相似问题