Archive: May ’10

Bill Inmon on Incremental Data Warehouse Development 27 May ’10

So I will admit I am little behind on my reading…but I just got around to looking at a blog posting on the BeyeNETWORK from Bill Inmon entitled “Incremental Data Warehouse Development – The Only Way to Fly”

I totally agree with what he says.  The SDLC approach for data warehousing doesn’t work, never has worked, and never will work.  How can you use an approach predicated on solid requirements when you know you cannot get them? 

The more I read articles like this, the more I am convinced that WhereScape is on the right track building in support for agile data warehousing practices.  This doesn’t need to mean Agile with a capital “A”, but agile in the context of small, fast pieces of work.  Watch this space - as well as working on a huge amount of new functionality in WhereScape RED to make it even faster to build data warehouses, we have a new product under development as well.  More on that later, but I can say it will be totally focused on helping developers build data warehouses faster.

And thanks to Raphael Klebanov for sending the link through.

Type 6 Dimensions 11 May ’10

WhereScape RED natively supports type 1, 2 and 3 dimensions. But what on earth is a type 6 dimension? It is a combination of a type1, type 2 and type 3! Basically there are two columns used to track a type 6 attribute – one holds the newest value, and one holds a point in time value that references a start and end datetime. This is a great way of being able to:

  • choose whether to report or filter by an employee’s current department or their department when an event / fact was recorded.
  • choose whether to report or filter by a customer’s current suburb or their suburb when an transaction / fact was recorded.
  • do a “where are they now?” type of report eg where are employees working now, that worked in this location 2 years ago.

WhereScape RED can build these type 6 dimensions without resorting to customizing the code. The standard WhereScape RED procedures will generate change detection code for time variant data (type 2 and 3 attributes) and RED will add change tracking columns to manage date ranges in those dimensions.
Just to summarise:

  • A type 1 slowly changing dimension (a Normal dimension in RED’s dimension wizard) will overwrite historical changes with the newest data – one record for each unique business key.
  • A type 2 slowly changing dimension (a Slowly Changing dimension in RED’s dimension wizard) will create new records each time an attribute that is marked as slowly changing does not match the “current” value for that business key. Unique start and end date ranges will manage which record is relevant at any point in time.
  • A type 3 dimension (a Previous Values dimension in RED’s dimension wizard) will hold the current and the previous attribute values for a tracked attribute in two attributes in the dimension.
  • The mysterious date ranged dimension in WhereScape RED is essentially a type 2 dimension that relies on the source system to maintain and pass changed data with date ranges.

So how would you build a type 6 dimension in RED using standard RED code? By copying a source attribute into two dimension attributes, one to hold the current value and one for a point in time value (with suffixes of _cur and _pit). This dimension should be built as a type 2 dimension in RED, with the _cur attributes marked as type 1, whilst the _pit attributes selected as type 2 attributes.

Thanks to Mr Kimball for writing about them here.
And thanks to Andrew Cann and the chaps from Mater for explaining why type 6’s were useful to them!