Archive: December ’10

Degenerate Dimensions in Analysis Services 28 Dec ’10

Analysis Services supports the ability to define dimensions based on columns in the fact table – also known as Fact Dimensions.  These dimensions are very useful when attributes that exist in the fact table need to be exposed in our OLAP query or Drill through action, such as Invoice Number or Order Reference.

WhereScape RED does not directly support this dimension type, but it is simple to create equivalent functionality using RED and database views.  RED provides the ability to quickly and easily modify the data warehouse database using views to accommodate cube design, rather than modify the DSV within a cube database.  This approach is useful when equivalent queries are required from both the data warehouse and cubes. 

To define and add a degenerate / fact dimension to an OLAP cube already defined in RED follow these steps:

1.      First we need to define a dimension view built from the fact table, containing the degenerate attribute(s).  The dimension view is defined in RED by dragging the fact table attribute(s) into the Dimension list to create a new Dimension View.  This dimension view needs to have a single unique key column – perhaps just the degenerate attribute.  Next, define the dimension’s key column as an artificial key by checking the artificial key checkbox in the key column properties in RED. 

Note: If the degenerate dimension has a composite key, then a single unique key needs to be created in the dimension view by concatenating the composite key columns.  This key also needs to be added to the fact table or in a view over the fact table to enable a fact / dimension relationship to be defined using a single column.

2.    Once created, the degenerate dimension view can be dragged over to the list of OLAP dimensions to create a new OLAP dimension in RED.  Ensure the key column is identified correctly in the OLAP dimension properties.

3.    The OLAP dimension can be added to a cube in RED by right clicking the cube and choosing Add Dimension.  Choose the new degenerate dimension. 

 

4.    To define the relationship between fact and degenerate dimension right click the cube and choose Display Measure Group Dimensions.  The relationship between fact and degenerate dimension will be shown as “No Relationship”.  Double click this relationship and change the Relationship type to Regular, and choose the key column in the fact that matches the dimension’s key column. 

 

5.    Now you can create and process the cube.   The degenerate attribute(s) will be exposed in the new degenerate dimension.

This method has been tried and tested on client sites.  If you run into problems please let us know.

SQL Server Data Acquisition using RED and Native Support for Microsoft SQL Server Integration Services 10 Dec ’10

The other day I ran some comparative tests using different options to load data into a SQL Server data warehouse from an Oracle database using WhereScape RED.  Here’s a summary of what I did and what I found:

1. The new WhereScape RED load option Integration Services load is the fastest way to move data into SQL Server from another relational database.

2. Native ODBC loads are the second fastest way to move data into SQL Server.  Linked servers are significantly slower than other options.

3. Whether the architecture is 32-bit or 64-bit makes no difference loading data into a load table in SQL Server 2008 from another database.

Details follow, apologies if this looks like a high school science report!

Aim:

The objectives were to:

1. Show the new WhereScape RED load option:  Integration Services load.  This uses the new Native Support for Microsoft SQL Server Integration Services.

2. Determine the fastest way to move data from a relational database into SQL Server 2008 database using WhereScape RED.

3. Quantify any difference between 32-bit and 64-bit platform performance.

 

Method:

A series of tests were run on two identical machines, one 32-bit and one 64-bit.  The two machines were configured as follows:

32-bit Server:

- Windows Server 2003 R2 Enterprise Edition Service Pack 2

- AMD Athlon II X4 635 Processor 2.91 GHz

- 6GB of RAM (PAE)

64-bit Server:

- Windows Server 2003 R2 Enterprise x64 Edition Service Pack 2

- AMD Athlon II X4 635 Processor 2.91 GHz

- 6GB of RAM

Both machines were built using vm-ware 6.5 and tests were performed using vm-ware server on identical host machines.  The 64-bit virtual machine ran on a 64-bit host.  The 32-bit virtual machine ran on a 32-bit host.  Both had 2 virtual CPUs assigned to them (i.e.: 2 cores of the host machines CPU).

SQL Server 2008 R1 was used as the data warehouse database, the 32-bit Enterprise and 64-bit Enterprise versions being used respectively.

The source system was in an Oracle 11.2.0.1.0 database – Enterprise Edition.  Again 32-bit and 64-bit versions being used as appropriate.

SQL Server and Oracle were installed on the same server.

There were some additional set up complexities on the 64-bit machine.  Specifically:

- An oracle 32-bit client was also required (in addition to the 64-bit client installed with the database)

- A 64-bit oledb provider from oracle had to be installed as Microsoft does not supply a 64-bit version of MSDAORA (they do supply a 32-bit version of MSDAORA with 32-bit SQL Server).

The testing involved a large table being loaded into SQL Server from Oracle.  The source table contained approximately 13.6 million rows and used about 2Gb of space in Oracle.  It contained most common data types.

The following tests were performed:

- 32-bit machine:  Integration Services load in RED

- 32-bit machine:  Native ODBC load – 32-bit MS Driver

- 32-bit machine:  Native ODBC load – 32-bit ORA Driver

- 32-bit machine:  Native ODBC load – 32-bit ORA Driver and RAM Disk  ***1

- 32-bit machine:  Linked Server in RED – 32-bit – 4 part names

- 32-bit machine:  Linked Server in RED – 32-bit – open query

- 32-bit machine:  Linked Server in a Query Wind – 32-bit – 4 part names

- 64-bit machine:  Integration Services load in RED

- 64-bit machine:  Native ODBC load – 32-bit MS Driver

- 64-bit machine:  Native ODBC load – 32-bit ORA Driver

- 64-bit machine:  Native ODBC load – 64-bit ORA Driver  ***2

- 64-bit machine:  Native ODBC load – 32-bit ORA Driver and RAM Disk  ***1

- 64-bit machine:  Linked Server in RED – 64-bit – 4 part names

- 64-bit machine:  Linked Server in RED – 64-bit – open query

- 64-bit machine:  Linked Server in a Query Wind – 64-bit – 4 part names

Notes:

***1:  A RAM disk was used in one Native ODBC load test to check for IO constraints.

***2:  This test was run using a specially compiled 64-bit version of the RED native load extract utility.

 

Results:

Conclusions:

1. The new WhereScape RED load option Integration Services load is the fastest way to move data into SQL Server from another relational database.

2. Native ODBC loads are the second fastest way to move data into SQL Server.  Linked servers are significantly slower than other options.

3. Whether the architecture is 32-bit or 64-bit makes no difference loading data into a load table in SQL Server 2008 from another database.