Merry Christmas WhereScape 2011 19 Jan ’12

Posted by: Steve Dickens

The WhereScape Christmas night out for staff was in Auckland in late December 2011 and kicked off with a few swift “sharpeners” at the very trendy Britomart Country Club bar in the afternoon. Luckily there were some other BI bodies from Vodafone there to keep us company before we headed to “Monsoon Poon” in the Viaduct in Auckland. It’s a large Asian fusion restaurant and they didn’t have a table big enough to accommodate the 32 of us who turned up. This got a bit rowdy after three and a half hours of food, cocktails and jugs of beer so we moved on to the even more trendy Snap Dragon bar, also in the Viaduct. At this stage some of us were a little wobbly and had to bluff our way past the ridiculous number of security staff. Apparently some of us also looked a little too scruffy to be allowed in but we all got in eventually. After a few more hours there we finished the night with no style at all at Danny Doolan’s dark, dingy bar until after 3am.

Thanks to all who made the effort to turn up and made it a fun night, especially the Hamilton crew.

In the photos you can see two things:
i) Mike holding one of the longest bills ever at Monsoon Poon. Coincidentally we had all suddenly gone outside at this point so thanks for paying.
ii) The tragic effects of having one too many as a couple of BI Guys do a bit of boy bonding in Snap Dragon.

Happy New Year to all WhereScapers, 2012 will be a great year!

A special farewell to Phillip James Considine 12 Jan ’12

Posted by: Perry Sansom

On 10 January 2012, Phil, who was the Head of Consulting at MIP (WhereScape Master Distributor in Australia), passed away. Our thoughts & prayers go out to Gail and the family at this difficult time.

Phil was passionate about his agile business intelligence work and he was instrumental in establishing WhereScape RED in the Australian market.

It was a pleasure to have known him & he will be missed.

Farewell big fella … Perry

How to Print Out the Entire RED Documentation 20 Dec ’11

Posted by: Raphael Klebanov

To print the complete RED documentation set, you need a script that merges all of the RED documentation’s HTML files into one HTML file. The steps are as follows:

  • Put the script_html_merge.bat script into the RED documentation directory.
  • Double-click it to run it manually; it will create the merged file doc.html. The script can also be scheduled via the RED Scheduler.
  • Download PDF Creator, a free PDF printer for Windows, from http://sourceforge.net/projects/pdfcreator/. Any similar product can be used instead.
  • Installing it creates a PDF Creator printer.
  • Open doc.html and print it to the PDF Creator printer; this creates the PDF file.
  • For PDF creation you can use any other tool that prints to PDF or converts HTML to PDF; I used PDF Creator simply because it is free and it does a good job.
  • The code is really just two lines; see the script below.

Note: You can also print one document at a time by right-clicking it and choosing Print from the dropdown menu.

 

REM  *****************************************************************************
REM  Script Name    :    script_html_merge
REM  Description    :    Merges all the RED Documentation HTML files into one HTML file
REM  Generated by   :    WhereScape RED, manually
REM  Generated for  :    WhereScape Customer
REM  Author         :    Raphael Klebanov
REM  *****************************************************************************

@echo off

REM Start the merged file with the index page.
copy /b index.html doc.html

REM Alternative: merge every documentation file matching w*.html.
REM FOR %%G IN (w*.html) DO (copy /b doc.html+%%G d && copy /b d doc.html && del d)

REM This line puts the "tech" files first, then the "user" files.
FOR %%G IN (w*tech.html w*user.html) DO (copy /b doc.html+%%G d && copy /b d doc.html && del d)

REM  *****************************************************************************
REM  Notes
REM  *****************************************************************************
REM  1. "FOR %%G IN (...)" means for each file matching the patterns in the parentheses.
REM  2. FOR expands the wildcards itself, so no "dir" command (or its "/B" bare-format switch) is needed inside the parentheses.
REM  3. DO (... && ...) means run the command following && only if the command preceding it is successful.
REM  4. Small "d" is a temporary file used as a temporary copy.
REM  5. If the list of files always contains the same files, the remaining files can be added before or after w*, e.g. glossary.html
REM  6. To prevent images from being cut off, configure the PDF printer, e.g. use landscape instead of portrait.
REM  7. You can modify the list of HTML files in the FOR statement to include or exclude documentation files.
REM  8. "copy /b" joins the files in binary mode, so no end-of-file character is added to the merged file.
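For anyone who would rather not use a batch file, the same merge can be sketched in Python. This is only an illustration of the idea, not part of RED: the function name merge_html and its parameters are made up for this example, and the default patterns simply mirror the "tech first, then user" order used in the batch script. Like the batch script, it just joins the raw HTML files end to end.

```python
from pathlib import Path


def merge_html(doc_dir, patterns=("w*tech.html", "w*user.html"),
               first="index.html", output="doc.html"):
    """Join the index page and every file matching the patterns into one file.

    All names and defaults here are illustrative assumptions that mirror the
    batch script above; adjust them to match your documentation directory.
    """
    doc_dir = Path(doc_dir)
    out_path = doc_dir / output
    parts = [doc_dir / first]
    for pattern in patterns:
        # sorted() gives a repeatable order inside each pattern group
        for path in sorted(doc_dir.glob(pattern)):
            # never merge the output file into itself, and skip repeats
            if path != out_path and path not in parts:
                parts.append(path)
    # read everything first, then write, so the output is built in one pass
    merged = b"".join(part.read_bytes() for part in parts)
    out_path.write_bytes(merged)
    return [part.name for part in parts]


if __name__ == "__main__":
    import sys

    # Usage: python merge_html.py <RED documentation directory>
    if len(sys.argv) > 1:
        for name in merge_html(sys.argv[1]):
            print("merged", name)
```

The returned list shows which files were merged and in what order, which is handy for checking that nothing was missed before printing doc.html to PDF.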

 

Good luck!

Harbour Bridge Bike Ride for MS 12 Dec ’11

Posted by: Jason Laws

Sunday 11 December 2011 was the first time cyclists have been able to ride over the Auckland Harbour Bridge since it opened in 1959. They also got to ride up the Northern Busway. The event was the “Telstra Challenge”. Several thousand people took the opportunity to ride the bridge.

The tough guy option was the 110km (68 mile) race. This was a difficult ride including 1550m of ascent (that’s 5000ft) and finishing at Kumeu show grounds north west of Auckland. Lucky participants got to ride past the front door of New Zealand’s only maximum security prison at Paremoremo then through some great scenery. There was even a 3.5km gravel (loose stone) section just before half way.

The not-so-tough guy option was a 15km bike ride, the equivalent of a “Sunday stroll”. It included a leisurely ride over the harbour bridge, a gentle cruise up the busway to Constellation Drive and a gentle roll back down to Shakespeare Road.

All money raised by the event will be used to support the work of the MS Society in Auckland and for MS research. The latest estimate of funds raised is $82,000.

William Hayman (fresh from completing the round Lake Taupo 160km race two weeks ago) and Jason Laws both get massive congratulations for completing the 110km race. Super fit Jacob Hendrickx also deserves a mention for getting out of bed earlier than normal on a Sunday to roll his bike over the 15km ride.

The 6am start turned into a 7:15 start for most riders due to an “unknown hold up”. Even the elite professional group (averaging 1% body fat) were left shivering in 25 knot winds waiting to start. Luckily Jason and William were better prepared for the wind than the elite group and quickly warmed up with all the hill climbing. Both agreed it was a harder ride than Taupo (although shorter).

Jason finished in 05:05 on his single speed bike. William, with 20 gears to choose from, was just behind managing an admirable 05:19. There’s a rumour 10% of the 110km starters didn’t finish. After taking the smart option and choosing the untimed “Sunday stroll” ride, Jacob was sighted in a cafe in Takapuna at 9am.

Santa(s) Run for Charity – Dec 2011 9 Dec ’11

Posted by: Steve Dickens

Being keen supporters of any charity who are prepared to sponsor and organise an event and give away goodie bags, WhereScape had a strong contingent entered in this year’s “The Great New Zealand Santa Run”.

All jokes aside, the KidsCan charity is a fantastic charity that supports disadvantaged Kiwi children.  With at least one in six New Zealand children living in poverty, this was a cause we wanted to raise money for.

KidsCan helps by providing food, school shoes, rain coats and other items to children in lower socio economic areas.  KidsCan assists children in schools where the needs are most acute and provides them with resources to overcome the physical obstacles to education.

 The event itself takes place in 6 locations across New Zealand and we entered the Auckland one, which consisted of a very short, 3km in length, completely flat course in the Viaduct Harbour.  Compared to some of the events we have done we thought this would be a walk in the park but we had forgotten the most difficult thing would be to run on a really hot summer night in a really hot and itchy Santa suit with a beard that was hard to breathe through!

 Last year there were 350 Santas taking part, but this year the numbers were up to 850 which was an amazing sight to see, especially for people dining in the Viaduct.  Despite the Santa suits all being “One size fits all” you can clearly see from the photo that some were a little tighter than others.

“What about the Chicken?” I hear you ask.  Well, that just happens to be the WhereScape CEO and Co Founder, Michael Whitehead.  We are not sure why he chose to run in an outfit that was even hotter than our “super hot” Santa suits (he even added a Santa skirt after the photo was taken), but the crowds loved and cheered the WhereScape “Chicken Man”.  They took some persuading that Mike was actually supposed to be a Christmas Turkey, but it was a great effort.

A great laugh for a great cause and as usual we’ll be looking for even more people next year to take part to see if WhereScape can beat eleven Santas and one chicken, I mean turkey…..

A big thanks to all who took part.

 

Lake Taupo Cycle Challenge – Nov 2011 9 Dec ’11

Posted by: Steve Dickens

The Lake Taupo Cycle Challenge is New Zealand’s largest cycling event and the world’s largest cycling relay.  Being a little competitive and big talkers, we all decided to skip the relay and do the “solo” event which is the full 160km anti clockwise circuit of the lake.

Apprehension crept in the night before as the wind woke some of us in the middle of the night, and sure enough by 7.30am the gusts had got even worse.  We seemed to have a headwind for the first 90km (including up hills!) then some side winds then no wind then some more headwinds as we approached the last 20km back into Taupo.  Where was the tailwind we all wondered…..

The highlights were the scenery, the amazing sight of thousands of cyclists doing an event, the endless hills on the western side of the lake and crossing the finish line.

 This is definitely one event for the “Tough Wall” in the office as it was the hardest event all of us had attempted, physically, mentally and towards the end…..emotionally.

None of us managed to break seven hours, which we all agreed was a very long time to be sitting on a tiny plastic seat.

As usual, congratulations to Stephen Dickens, Wayne Lanting & William Hayman who all finished this gruelling ride.  We’re all looking forward to next year and will hopefully have a larger team!

WhereScape finishers from previous Lake Taupo Cycle Challenge events:  Jason Laws, Scott McKay & Chris Wyllie.

3D Speeds Up Analysis 23 Nov ’11

Posted by: Jason Laws

The other day I set out to build a small data mart combining support data and development data from two database sources and several other sources.

I needed to quickly find out which tables in the two database sources I needed to use.

My development system has 104 tables and my support system 244 tables.

So I pointed WhereScape 3D at both sources.

Very quickly I had:
- complete ERDs of both sources
- profiling results for all tables and columns
- identified the 12 tables in the development system I needed
- identified the 13 tables in the support system I needed

Less than 30 minutes after starting, I was ready to build my data mart using WhereScape RED.

I’m sure it would have taken me at least a day to figure out both source systems without 3D.

Here’s the ERD I ended up with for the subset of the development system I needed:

And here’s the one for my subset of the support source system:

RED’s Supported Platforms and Databases after 6.5.5 23 Nov ’11

Posted by: Jason Laws

The current release of WhereScape RED, version 6.5.5, will be the last release of RED supporting the following operating systems:

· Windows 2000
· Windows XP before Service Pack 3
· Windows 2003 Release 1

It will also be the last release of WhereScape RED supporting the following database versions for metadata repositories and target data warehouses:

· SQL Server 2000
· Oracle 8.1.x and 9.0.x
· DB2 9.1 and 9.5
· Teradata V2R5.x and V2R6.x

This will make the minimum supported Windows version for RED:

· Workstation: Windows XP SP3
· Server: Windows 2003 R2

And the minimum supported database versions for RED metadata repositories and target data warehouses:

· SQL Server 2005
· Oracle 9.2
· DB2 9.7
· Teradata 12

As is the case now, source systems using older versions of these databases and any other databases will still be accessible using RED.

26.2, because 26.3 would be crazy 21 Nov ’11

Posted by: Michael Whitehead

6.30am, a police escort through the streets of New York; this marathon thing is really going to happen…

It is actually easy for international runners to get into the New York marathon.  You can enter the ballot (around 28,000 spots this year, but there is a bias against international entries), or qualify (currently that would take a 3.10 time for me, moving to 2.58 next year – no chance on that one), but the easiest way is through an accredited travel agent or through a charity.

After missing out on the ballot, I put it out of my mind until I got an email introducing Fred’s Team (Memorial Sloan-Kettering Cancer Center in New York).  As far as choosing a charity goes, cancer research was an easy decision, and Sloan-Kettering checked out. Why not combine a run with raising money?

 
Fred’s Team turned out to be a superb choice (although admittedly I didn’t look at any others).  They offered guaranteed spots if you raised $3500 or $5000.  I signed up for the $5000 level, but set a goal of $10,000.  What I didn’t realize at the time was how organized and supportive Fred’s Team would be.  As well as the police escort, their buses leave at 6.30am (official transport starts at 5.00am, and with the last wave going at 10.40am, that would be a long wait).  They have a tent at the start village, and a tent at Cherry Hill at the end of the race.  Their support team was amazingly helpful: from answering silly questions in the lead up, to the expo, the pre marathon dinner, the marathon breakfast and the support at the start and finish (not to mention the energetic support on the course). All the interactions with Fred’s Team were professional, enthusiastic and supportive.

The fundraising support was more than I could have hoped for.  Together we raised $10,580. My donation page reads like a who’s who of the business intelligence industry.  I can’t thank enough Scott Humphrey, Claudia Imhoff, William McKnight, Jill Dyche (poker star), Simon Arkell, Tamara Dull, Donald Farmer and all the fabulous Pacific BI Summit participants.  Thanks to my anonymous donors, to the WhereScape team, the motley group of mates, our customers and partners:  Paul Glass, Graeme Boag, Doug Barrett, Steve Dickens, Trevor Eastabrook, Nick Lambert, Chris Wylie, Jason Laws and the team at Barclays, Jeremy Rees, Lindsay Esler, Daniel Barnes, Doug Hoogervorst, Stuart Preston, Tony Millar, Gerhardt van der Westhuizen, Peter Newey, Martin Sowter, James Arbuckle, Sandra Lukey, Raphael Klebanov, John Quirk, Perry Sansom, Rob Briscoe, Peter Wogan, Wayne Richmond, Mark Budzinski, Mary Edie Meredith, Martyn Levy and David Morris…you all rock!

   
Each donation was a huge motivation, both to get out there and train and also to ask for more donations.  It is a great feeling to come back from a long run and see that while you were out another donation has come in. 

Training for marathons is hard work.   I ended up doing training runs in Auckland, Taupo, Oregon, California, Colorado, New York, Budapest and Sydney.  I got two plans from Brendon Downey of marathontraining.co.nz, a 12 week marathon plan preceded by a 7 week “get ready for the marathon plan” plan.   I also took the opportunity to eat better, and Jonny from Mission Nutrition had me eating cottage cheese and oatmeal.

It all came together on November 6th. The actual race was tougher than I expected.  I had looked at the elevation charts, and it didn’t look as steep as some of the hills in Auckland.  What I hadn’t really taken account of was where the hills come in.  In Auckland the second half is all flat, in New York there are still hills at 20+ miles, and the finish line is up hill – what’s with that?

Over 47,000 people gathered in the starting village in Staten Island.  I was in the blue start of wave 2, which means running the same course as the pros, but starting at 10.10 (all three start courses join by the 8 mile mark).

First up is the Verrazano Bridge, and I can confirm (from observation not participation) that it is not anecdotal: if you are in the green start (lower level) you definitely do not want to run on the outside.  The bridges are the big hills on the course, but none are particularly tough.  You do lose time going up them, but what I hadn’t really contemplated was you don’t make the time up going downhill.  Particularly on the Verrazano Bridge, you run at the pace of the crowd.  You would normally expect to be back on goal time after going up and down a hill, but I found myself behind my goal time right from the start.

The first half marathon is basically in Brooklyn.  Each neighborhood has a distinct feel, with no bigger contrast than Williamsburg which goes from Orthodox Jewish to New York hipster in the space of a block.  Two feet past the seventh light pole on the Pulaski Bridge between Brooklyn and Queens marks the half-way point.  I was still running fine as I passed the sign, but with the time I had lost knew that sub 4 was now my goal.

It was around here I passed the only New Zealand flag I saw on the entire race, but I did chat with two New Zealand runners.  One of them I ran with up the Queensboro Bridge between Queens and Manhattan.  For some reason he was not keen on a long chat.

 
The theory was Felicity could follow me on the Marathon iPad app.  Unfortunately the app never worked so she had to guess when I would run through based on projected times and rumours of a delayed start time.  The technology on the course was a little disappointing.  Clocks are only set up for the first wave.  ASICS set up giant screens where personal messages flash up as you pass over sensors – but they don’t work if a group crosses a sensor pad together.  My plan of live tweeting also didn’t come to fruition (but this may well have been user error).  I had set up some tweets to come out as I passed sensors at key milestones but they only appeared on Facebook (to my eight friends) and not on Twitter.

In the end Felicity just waited for me on First Avenue, at around Mile 16.  After a tired wave I veered over to the other side of the street to pick up the support from the cheering section at the Memorial Sloan-Kettering Hospital, and they did not disappoint with a noisy reception.  I found this section one of the hardest of the race.  The runners were spread out enough to be able to run at your own pace, but did I have enough in my legs to push at the 18 mile mark?  And it is up hill here!  It may not seem like it when you are in a taxi or walking along, but I can assure you it is uphill.

There still may be some traumatized people around 110th and 1st as I had to stop, strip off and donate my compression top to the streets of New York.  I had underestimated how warm it got, and felt much better running in only one layer.  The crowd thins a bit in Spanish Harlem, and then it is over another bridge into the Bronx.

You are only in the Bronx for a short period of time, and then it is into Harlem.  At this point it was all about survival, not walking, and getting to Central Park.  I am not sure how this works, but it is definitely uphill along 5th Avenue as well.

 
The turn into Central Park is a big moment.  At this point you know you can do it.  It is up and down in Central Park.  The section at the bottom of the park when you drop out by the Plaza Hotel and run along to Columbus Circle was another of those tough sections.  You feel like you are there, yet you still have to keep going.  Turning back into Central Park it was all on.  By this time I knew I was pushing the 4 hour mark, so picked up the pace.  At least I thought I did – looking at the times later I did the same split times for the last three miles.

It is a superb feeling to head up the hill to the finish line.  I had been warned not to look at my watch while crossing the line (or you are left with a great finishing photo of the top of your head), and ended up fist pumping for the last 20 yards or so.


At the finish of the race you get a heat blanket, finishers pack and of course the medal.  For most it is then a mile long walk to pick up your gear.  Fred’s Team members get picked out by volunteers and escorted to Cherry Hill where you are handed water, Gatorade and pretzels.  After 26.2 miles all were welcome.

Despite the claims of my children about not being able to move post marathons, after catching up with Felicity we walked back from 77thish to the Hilton at 54th, and I was up to going out to dinner that night.

Overall, what a superb experience.  I was thrilled to be able to meet my fundraising goal, as well as my (admittedly amended) finishing goal.  Thanks again to everyone for the donations – I owe you all.

For the runners, I finished in 14876th place, 1748th in my age group, at a pace of 9.08/mile or 5.40/km with a time of 3.59.07.  Split times were:

Keyboard Productivity – WhereScape RED shortcut keys 10 Nov ’11

Posted by: Jacob Hendrickx

Every time you take your hand off the keyboard, reach for your mouse, locate the cursor, move the cursor where you need it, click, move again, and click, you are wasting precious seconds. Precious, precious seconds, which over the course of a day will add up to minutes. And which over the course of a week, will add up to hours. Hours of time lost, per week, moving your hand back and forth.

The trick to saving time and increasing your productivity is to mitigate those seconds lost by keeping your fingers engaged with the keyboard and the task at hand. It’s a little known fact that almost everything you do with your mouse in WhereScape RED, can in fact be done instead using keyboard shortcuts.

There are many shortcuts which will speed up your development time, directly improving your project’s delivery time, whilst making you look like a super star. Don’t be intimidated though; there are a lot of shortcuts out there. For best results, print off the accompanying cheat sheet and pin it to your monitor. Try to identify just a couple of functions you use often and begin incorporating them into your everyday routine. Then next week, pick up a couple more, incorporating them as well. Before you know it, almost everything you do will be via shortcuts and you may as well unplug your mouse and save your business the power usage as well.

Download the Keyboard Productivity cheatsheet from our marketplace page – http://www.wherescape.com/support/marketplace/keyboard-productivity/

WhereScape Europe – “RED & AMG Test Drive” 27 Oct ’11

Posted by: Richard Noble

October 21, 2011 - Mercedes-Benz World, Brooklands, on the site of the world’s first professional motor sport race track, was the historic setting for the WhereScape RED product test drive.

Situated on the top floor in the AMG suite, eight hand-picked delegates, representing WhereScape Europe’s prospects and business partners, were set the challenge of completing a section of the WhereScape RED Enterprise Data Warehouse training course in only 90 minutes.

Some healthy competition surfaced as the attendees fuelled by breakfast, and the prospect of winning podium prizes, started to build a data mart from scratch. During the product test drive Senior Business Managers connected with their technical roots, and Technical Managers discovered how their teams could design and build data warehouses 10 to 100 times faster.

As the finishing line drew near, the pit crew of Terry Mooney and Paul Watson-Gover (Senior WhereScape Consultants) ensured that all drivers were prepared for their final exercise. Source tables were loaded, 3NF structures were linked and populated, all ready for a final dimensional transformation and cube generation.

At the chequered flag Richard Noble, Head of Consulting, declared a dead heat, and all attendees were treated to an AMG test drive round the Mercedes-Benz World test track.

The real result – WhereScape RED delivered with great speed, agility and ease… much like the race tuned Mercedes AMG!

RED - Speed, agility and ease of use

WhereScape Europe – “Healthcare Efficiency Through Technology” 27 Oct ’11

Posted by: Paul Watson-Gover

October 4, 2011 - A week after the New Zealand Embassy event, it was time to head back into London, to Olympia this time, for The Healthcare Efficiency Through Technology (HETT) conference. This event gave healthcare professionals a chance to learn about and discuss the technological approaches that can help them to achieve greater efficiency and effectiveness in their organisations.
 
This was a very well organised event and it turned out to be one of the busiest so far! There were lots of WhereScape RED demos and business conversations with potential customers and partners in the Healthcare sector.
 
We would like to extend thanks to all the people we spoke to and look forward to the next event.

WhereScape Europe – “Breakfast Workshop” 27 Oct ’11

Posted by: Paul Watson-Gover

A busy week for the WhereScape Europe team kicked off on Thursday 29th September. WhereScape Europe hosted a breakfast workshop in central London, and the venue for this auspicious event was the Penthouse of New Zealand Embassy.  What a venue it was, with fantastic views over London, as the photos below prove.


We were just putting the finishing touches to the room as our guests began to arrive. The guests got stuck into their breakfast and engaged in a little networking before the main event. Rob Mellor, UK Country Manager, got the event underway with an excellent Agile Data Warehousing presentation. This was seamlessly followed by the always reliable Terry Mooney, demonstrating an end to end Data Mart built in only 45 minutes. After a short coffee break our esteemed guest speakers took to the stage for their presentations. First up was Lawrence Corr with an introduction to his new book “Agile Data Warehouse Design – from Whiteboard to Star Schema” (available now in all good book shops and online book retailers!). The final presentation was by Shawn Lewis, BI Manager at Vodafone. He gave an inspiring account of how Vodafone are utilising RED globally to enhance their data warehousing approach.
 
The event was drawn to a successful close with most attendees staying behind to network with each other and WhereScape staff. Overall the morning was a great success, capped off by the team and some guests venturing out onto the busy streets of London for a well earned lunch and a couple of drinks to celebrate.

BI Guys Strike Back 19 Oct ’11

Posted by: Steve Dickens

 

The WhereScape NZ Team at laser Quest

The WhereScape Auckland team at laser quest

The WhereScape Auckland team meeting for September 2011 was held at the Megazone Laser Tag venue just off Ponsonby Road. For our second visit, fifteen people could make it this time so we had three teams of five which was perfect for the two level maze.

You can read the previous blog “Star Wars meets Ponsonby” to get the general idea of what we got up to, but needless to say it was three sessions of good, sweaty, laser fuelled fun in the dark.  Unfortunately for most of us there, age, stamina and experience didn’t seem to be any advantage over youth and exuberance.

As usual, some bar cruising Ponsonby style was the order of the day post Laser Quest, followed by some rather oversized burgers & fries at BurgerFuel.

Thanks again to those who could make it.

Steve

Comparative Analysis 3NF vs. DV 27 Sep ’11

Posted by: Raphael Klebanov

Purpose

A number of projects I have been working on recently have benefited from the use of Data Vaults. The success has led to questions regarding Data Vaults and their application to Enterprise Data Warehousing (EDW).

This comparative analysis has been created to assist customers with making decisions regarding the design approach for an EDW Project.  It compares and contrasts two different approaches: Third Normal Form (3NF), historically a very popular option for EDW, and Data Vaults, a modeling technique purpose-designed for EDW. There is no “right” answer; both options have advantages and drawbacks, and the final decision should be based on an organization’s unique circumstances. WhereScape professional services are experienced in both techniques, and are available to make recommendations where required.

 

Third Normal Form (3NF) model

Definition of 3NF:

  • A database in which each attribute in a relation is a fact about the key, the whole key and nothing but the key. 3NF usually refers to a fully normalized structure where information is stored in 3rd Normal Form (E. F. Codd’s 3NF).
  • A 3NF structure is used for the EDW popularized in W. H. Inmon’s Corporate Information Factory (CIF).
  • The generally accepted goal is that a company has one centralized EDW – Data Marts and other Analytical Business Structures are fed from the EDW.
  • The data warehouse is stored using Database Normalization rules in 3NF – tables are structured into subject areas.
  • 3NF models are used for Operational Data Stores as well as Enterprise Data Warehouses.
  • 3NF structures are not recommended for queries and reports.
  • 3NF databases offer performance and stability for Online Transaction Processing Applications (OLTP).

 

Data Vault Model

 

Definition of Data Vault:

  • The Data Vault (DV) is a detail-oriented, history-tracking, uniquely linked set of normalized tables that support one or more functional areas of business. It is a hybrid approach designed to encompass the best of breed between third normal form (3NF) and star schema. The design is flexible, scalable, consistent, and adaptable to the needs of the enterprise.
  • A DV is intended to address the business challenges that DW practitioners meet. It is designed to avoid or minimize issues related to Dimensional Model and 3NF methodologies.
  • DV Modeling is an EDW design methodology that provides historical storage from multiple sources with complete tracking back to system of origin.
  • This method can be adapted to changes in the business environment.
  • The Data Vault is organized around existing business keys.
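To make the definition a little more concrete, here is a minimal sketch of the three table types a Data Vault is normally built from: Hubs (business keys), Links (relationships between business keys) and Satellites (descriptive history). It uses Python’s built-in sqlite3 module purely for illustration. Every table name, column name and data value is an assumption made up for this example, and this is not the code WhereScape RED generates.

```python
import sqlite3

# Hypothetical schema: all names below are illustrative assumptions.
SCHEMA = """
CREATE TABLE hub_customer (
    customer_key    INTEGER PRIMARY KEY,
    customer_number TEXT NOT NULL UNIQUE,   -- business key
    load_date       TEXT NOT NULL,
    record_source   TEXT NOT NULL           -- tracking back to system of origin
);
CREATE TABLE hub_account (
    account_key     INTEGER PRIMARY KEY,
    account_number  TEXT NOT NULL UNIQUE,   -- business key
    load_date       TEXT NOT NULL,
    record_source   TEXT NOT NULL
);
CREATE TABLE link_customer_account (
    link_key        INTEGER PRIMARY KEY,
    customer_key    INTEGER NOT NULL REFERENCES hub_customer (customer_key),
    account_key     INTEGER NOT NULL REFERENCES hub_account (account_key),
    load_date       TEXT NOT NULL,
    record_source   TEXT NOT NULL,
    UNIQUE (customer_key, account_key)
);
CREATE TABLE sat_customer (
    customer_key    INTEGER NOT NULL REFERENCES hub_customer (customer_key),
    load_date       TEXT NOT NULL,
    name            TEXT,
    city            TEXT,
    record_source   TEXT NOT NULL,
    PRIMARY KEY (customer_key, load_date)   -- one row per change
);
"""


def build_demo():
    """Create the sketch schema in memory and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO hub_customer VALUES (?, ?, ?, ?)", [
        (1, "C001", "2011-09-01", "support"),
        (2, "C002", "2011-09-01", "support"),
    ])
    conn.executemany("INSERT INTO hub_account VALUES (?, ?, ?, ?)", [
        (1, "A100", "2011-09-01", "billing"),
        (2, "A200", "2011-09-01", "billing"),
    ])
    # The link holds any cardinality: C001 has two accounts and A100 has two
    # customers, with no change to the table structures.
    conn.executemany("INSERT INTO link_customer_account VALUES (?, ?, ?, ?, ?)", [
        (1, 1, 1, "2011-09-01", "billing"),
        (2, 1, 2, "2011-09-01", "billing"),
        (3, 2, 1, "2011-09-02", "billing"),
    ])
    conn.execute("INSERT INTO sat_customer VALUES (1, '2011-09-01', 'Ann', 'Auckland', 'support')")
    return conn


def accounts_per_customer(conn):
    """Count linked accounts for each customer business key."""
    return conn.execute("""
        SELECT c.customer_number, COUNT(*)
        FROM link_customer_account AS l
        JOIN hub_customer AS c ON c.customer_key = l.customer_key
        GROUP BY c.customer_number
        ORDER BY c.customer_number
    """).fetchall()


def customers_per_account(conn):
    """Count linked customers for each account business key."""
    return conn.execute("""
        SELECT a.account_number, COUNT(*)
        FROM link_customer_account AS l
        JOIN hub_account AS a ON a.account_key = l.account_key
        GROUP BY a.account_number
        ORDER BY a.account_number
    """).fetchall()


if __name__ == "__main__":
    demo = build_demo()
    print(accounts_per_customer(demo))
    print(customers_per_account(demo))
```

The point of the sketch is the Link table: because the relationship lives in its own table, a relationship that starts out as 1:M can become M:M simply by loading more link rows.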

  

Comparison between 3NF and Data Vault for Enterprise Data Warehousing

 

Category | 3NF | DV | Notes
Initial Purpose | OLTP | EDW | The DV link structure allows flexibility in handling relationships between the Business Keys (M:M, M:1, 1:M, 1:1, etc.) without changing the model structure. In 3NF, relationships between Business Keys in the same table are “solidified” to, for example, 1:M. When past or future data requires different relationships (e.g. a new set of data requires M:M instead of 1:M), the whole structure has to be changed in 3NF, while a DV supports this case in the Link table.
Number of Objects | Less | More | The number of objects in a DV vs. 3NF is approximately a 3:1 ratio at the beginning of the EDW build and approx. 2:1 for mature warehouses. Objects in DVs are generally smaller and easier to confederate.
Code Complexity | Higher | Lower | Although code is generated by WhereScape RED in both cases, the DV code is slightly less complex to follow and document, because the objects are smaller and have specific meaning.
Speed of Processing | More to Same | Less to Same | DVs allow a high level of parallelism, therefore the total time to complete the DW loading, including the Data Mart layer, is often lower than it would be for 3NF. DVs generally offer more stability and performance.
Number of Joins | Less | More | DVs utilize more joins; however, this does not decrease throughput of the DW process due to high parallelism and effectiveness of the foreign key (FK) joins.
Number of Indexes | Less to Same | More to Same | The DV utilizes more indexes because there are more tables; however, most additional indexes are on the PK/FK. The number of indexes may be equal for mature DWs because of the need to index non-key columns in 3NF in order to support the variety of queries against attributes in the tables.
Adaptability to Changes | Medium | High | Adaptability is necessary in case there are changes in the business, e.g. adding a new data source or changing analytical requirements. 3NF structures need rework, while with DV additional Hubs/Links/Satellites can be added without disturbing the existing structure. The changes are typically in a single place and easily adapted because the code is likely to be in only one or two places.
Agile Development | Medium | High | Although both models work well with Agile Development principles, DVs allow the whole EDW project to be broken into well-defined stories/sprints/tasks, etc. based on Units of Work. It is easier to put the DV design into stories/sprints/tasks.
Historical Data Handling | Harder | Easier | Although both models allow historical data to be handled, in DVs it is done in separate object types (one or many Satellites), while in 3NF history is held in a Normalized History object very much like an SCD. When built in WhereScape RED, Satellites are built the same way as 3NF history tables. This means Satellites = Normalized History tables. Another point to consider is the issue of cascading snapshot dates in master-detail-detail 3NF models. This adds to the complexity of the ETL code and makes it difficult to extract “as-of” data sets (i.e. need to use “select(max(snapshot date))” type sub queries).
Real-Time, Near-RT Loads | Good | Good | Although both 3NF and DVs handle Real-Time and Near Real-Time loads well, DVs are preferred because of high parallelism, hence shorter individual loads.
Very Large DW Loads | Good | Good | Both 3NF and DVs handle very large database loads well. Nevertheless, any changes in business that require a vast initial load are easier to handle in DVs.
“Forgiveness” of the Model | Less Strict | More Strict | DVs require DW practitioners to have more discipline in following architectural and processing rules; defying the DV rules can potentially destroy the whole DW structure.
Historical Tracking | Good | Best | DVs allow complete “geological” type tracing from the lowest grain of data to the source system’s atomic level. 3NF is mainly designed for holding operational data history.
Data Access | Bad | Worse | DVs and 3NF are both impractical for direct querying; it is particularly difficult for DW users to combine data from different sources into meaningful information in 3NF. A Presentation Layer is required for both models and the amount of effort to build the layer is about the same. On the other hand, if these queries are performed by a super user, I believe DVs are better (not worse) because the user can create and follow a template to generate SQL Views to extract data from DVs much more easily than with 3NF.
Access to Skilled Modellers | Less | More | There are more people trained in 3NF than there are in DV.
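
To make the “as-of” point from the Historical Data Handling row concrete, here is a minimal sketch of an as-of extraction from a single Satellite (or Normalized History table), using an in-memory SQLite database. The table name sat_customer, its columns and the sample rows are hypothetical. Even for one table a max(load_date) sub query is needed; with cascading snapshot dates across master-detail-detail tables, this pattern has to be repeated at every level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sat_customer (
        customer_hk INTEGER NOT NULL,
        load_date   TEXT NOT NULL,      -- ISO dates sort correctly as text
        city        TEXT,
        PRIMARY KEY (customer_hk, load_date)
    )
    """
)
conn.executemany(
    "INSERT INTO sat_customer VALUES (?, ?, ?)",
    [
        (1, "2011-01-01", "Auckland"),
        (1, "2011-06-01", "Hamilton"),
        (1, "2012-01-01", "Wellington"),
        (2, "2011-03-01", "Sydney"),
    ],
)


def as_of(date):
    """Return the version of each customer that was current on `date`."""
    return conn.execute(
        """
        SELECT s.customer_hk, s.city
        FROM sat_customer s
        WHERE s.load_date = (
            SELECT MAX(s2.load_date)
            FROM sat_customer s2
            WHERE s2.customer_hk = s.customer_hk
              AND s2.load_date <= ?
        )
        ORDER BY s.customer_hk
        """,
        (date,),
    ).fetchall()


mid_2011 = as_of("2011-07-15")
early_2011 = as_of("2011-02-01")
print(mid_2011)
print(early_2011)
```

Customer 2 does not appear in the early 2011 result because no Satellite row had been loaded for it by that date.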

 

Figure 1 Predisposition of various “flavors” of Data Warehousing Models in relation to various purposes of Warehouses


 

Diagrammatic Representation

The following shows the data models, built from the same source data, for a 3NF and a DV architecture. The diagrams are an example of what can be produced by WhereScape 3D at the planning stage of an EDW Project.

 

Figure 2 Northwind Tutorial Data Model built in Data Vault “flavor”.


 

Figure 3 Northwind Tutorial Data Model built in Third Normal Form “flavor”


 

 

References:

  1. Building the Operational Data Store, 2nd Edition by William H. Inmon
  2. DW2.0, 2008 by William H. Inmon. Bill Inmon stated that the “Data Vault is the optimal approach for modeling the EDW in the DW2.0 framework.” (DW2.0)
  3. Supercharge your Data Warehouse, 2010-2011 by Dan Linstedt
  4. Data Vault and Data Modeling, 2010 Genesee Academy