Star-schema naming conventions
Is it common practice in a star schema to prefix table names as a dimension or fact table? Is it also common practice to have column names prefixed with the table name?
In my normal OLTP databases, I don't do thi开发者_开发问答s, but I'm seeing examples of this type of naming in star schemas.
Does it make sense to have a different set of naming standards for data warehouse schemas vs OLTP schemas?
Thanks Dwight
Table Names:
- I like this convention: [type][subject][name]
- where type is 'dim' or 'fact' (or 'facts' for aggregates)
- where subject is the subject area within the warehouse ('comm' for common, 'fw' for firewall, 'ids', etc)
- where name is ideally a single word name, or abbreviations of dimensions in the case of an aggregate table
- ex: dim_comm_org for the organizational dimension
- ex: fact_scan for the scan fact table
- ex: facts_scan_org_sev_daily - fact scan summary table grouped at the org, sev & day level
Column Names:
- don't prefix with the entire table name - that's way too long
- do prefix with just a meaningful part of it - this helps tremendously when writing or reading queries.
Warehouse vs OLTP Naming:
- the two are very different. Warehouse table & column names often end up in metadata, on reports, being read by both developers and users. Not so much with OLTP.
- I think table prefixes are still useful in OLTP - but there I think it's best if it's something meaningful about that subset of the model rather than a fact/dimension distinction.
The tablename_column name convention is used to ensure that all fields within a database are unique, although it is somewhat excessive it can be used for when there is a standard / requirement for unique naming (Which some client IT departments demand.)
Product.Name => Product.Product_Name
Part.Name => Part.Part_Name
It removes any ambiguity over where Name would come from.
I prefer not to name tables with a prefex at all (assuming that does not break the local standards of a company), since whilst it might be a table today, it could be re-implemented as a view or partitioned view tomorrow but expose the same schema, and I would then have to accept objects prefixed incorrectly or update everyones reference to the new name / create a synonym.
Having consistency though tends to be the winner, if every DBA / Dev implemented their own version it would be chaos, so I would tend to find the company standards and apply them.
It is common in DWs to name columns with "long names" because those columns end up as column headers in reports (query results) and are supposed to be business-user friendly. So instead of having Product.Name
and Customer.Name
which would both show up as "Name" (unless alias is used) it is common to use Product.ProductName
and Customer.CustomerName
so they show up as "ProductName" and "CustomerName" in top row of a report (query) once the star is flattened via joins. Underscores are frequently used instead of camel-case and blanks, if allowed by the DB. Prefixes dim and fact are recommended in large DWs when table's role in the schema may not be obvious; I actually like them.
with Oracle Business Intelligence Repository there is a preference towards the use of suffixes.
Ken, I like your [type] [subject] [name] convention, where type is 'dim' or 'fact' (or 'facts' for aggregates) The problem is that when creating the Star schema model in the Oracle Business Intelligence Repository, best practices suggest that we should create alias names for the dimension and fact tables with a DIM_ (or dim), and FACT (or fact_) prefixes for the dimension and fact tables.
In order to avoid having alias dimension and fact tables to read dim_dim[table name] or fact_fact_fact_[table_name), it is preferred to name the dimension tables with a _DM (or _dm) suffix, and the fact tables with a _FT (or _ft) suffix.
精彩评论