Archive: October ’07

Prototype versus model or prototype and model? 22 Oct ’07

On the surface there appears to be a conflict between model driven and prototype driven approaches to data warehouse design.  Prototypers eschew designs as being too theoretical and taking too long, designers are suspicious that prototypers don’t have a big picture view.

Experienced data warehouse practitoners utilize both approaches.  Modeling based on best practice design - look at the plethora of industry data models available from the likes of Teradata, IBM and Sybase - can save time and effort, but are not conducive to fruitful requirements discussions with business users. 

The strength of data warehouse prototypes are that utilizing real data and providing real answers enables business users to see what is possible and to frame requirements and  set priorities.  The two techniques do work well together.  We were recently engaged in a Teradata implementation.  The logical data model provided the first target for development, and the user community was essentially the business analysts who would probe and question to ensure all the data that would be useful was extracted and modeled to answer as yet unknown requirements.  Prototyping was used with the  model area as the source.  Prototypes could be built rapidly as the data was  (mostly) in place, and business users were engaged as they were seeing their data in a way they were going to use it.

In the mid market we have seen experienced practitoners undertake prototyping without modeling, in the enterprise space we see modeling and prototyping as complementary.

Parameters Limit the Need for Custom Code 18 Oct ’07

Parameters are a means of passing information between two or more procedures and between the RED environment and procedures. WhereScape Red supplies two functions to allow procedures to read and write parameters. These functions are WsParameterRead and WsParameterWrite. Using these functions a procedure can load and use the contents of a parameter, or modify an existing parameter, or add a new parameter.

Importantly the use of these parameter functions is not limited to custom procedures.

When creating a data warehouse using WhereScape Red a major benefit is the ability to keep custom code to a minimum. Parameters are an important aspect of the RED environment that facilitates the use of standard RED generated code.

The following is a list of some of the areas that these functions can be used in.

• Load table objects. Use $P$ in the ‘where clause’, ‘remote view’, ‘load override statement’ and transformations.

 

• Stage table objects. In the ‘where clause’ and ‘column transformations’.

 

• Fact table objects. In the ‘where clause’ and ‘column transformations’.

If performance issues dictate that in some situations the use of the parameter functions is not an option. You can also use the parameter table itself as a data source table.

This can be done through WhereScape RED generated views or loading the parameter data required directly from the metadata table that stores the parameter values. The parameter values are stored in a table called ‘dss_parameter’. The resulting load table or view can then be used to assist in filtering data sets via the use of joins during stage processing to other tables containing source data.

Examples of function use:

In The Load Table Where Clause AND TO_CHAR(AL3.CONN_ACTIVE_DATE,’YYYYMM’) IN (’$PCURRENT_CAL_MONTH$’ , ‘$PPREVIOUS_CAL_MONTH$’ , ‘$PPRIOR_CAL_MONTH$’)

In The Stage/Fact Table Where Clause

Where WsParameterRead(’QTR_RE_SUBMISSION’) = ‘Y’;

In the Stage/Fact Table Column Transformation

‘NZLQ’

|| substr(WsParameterRead(’PREVIOUS_CAL_MONTH’),5,2)

|| substr(WsParameterRead(’PREVIOUS_CAL_MONTH’),3,2)

Examples of dss_parameter use as a source table:

Consider the use of a load table sourced from a parameter that contains a variable piece of data that must be attributed to each row of your source data. Each source row should be associated with this data, but the content can change from one load to the next. This may be dependant on the month/ week /day of processing or something else entirely.

Using a load table that loads the parameter required from dss_parameter you can then join this table to your source data during staging without the need for any complexity involved in obtaining the variable data needed across multiple stage table processes. Any complexity can be handled once through a custom procedure that maintains the parameter.

Loading Data over Wide Area Networks 18 Oct ’07

Every now and then we encounter environments where data loading must be performed over a wide area network and are asked how we handle this scenario. Data loads over wide area networks present an interesting challenge. Our experience has been that the problem can be tackled from features within the tool itself and also how the warehouse ETL process is designed.

From a tool perspective, WhereScape RED provides some good options to help reduce network traffic and load times;

• RED provides a facility for permanent staging/mapping tables. These tables are kept intact over time and can be used to store valuable mapping/lookup information required during transformations, etc. This eliminates the need to potentially extract lots of lookup information during each and every extract. In this case you would only need to bring over the changed information and keep the mapping table(s) updated.

• Our standard approach is to perform change-data-capture (CDC) on all data in the warehouse – dimensions and facts, where possible. Rather than reloading entire dimension/fact tables each time they are processed, we would only bring across the changed data and insert/update the appropriate rows in the target tables. This will have the effect of reducing the overall extraction times and also reducing the network traffic. RED is designed with this approach in mind.

• We attempt to get only the raw data from the source system during the extract process. Once the data has been placed into our Load tables we would then go about the process of performing the complex transformations, etc. By performing all the transformations on the warehouse database we would hopefully reduce the amount of data that travels over the wide area network.

• “Remote Views” on load tables provide a facility where the SQL for the data extraction is created as a view on the source system. This forces the extraction query to be performed on the source system rather than allowing the optimizer to choose how it is performed, which could potentially lead to the query being performed across the database link. This option is available for Oracle environments and can provide significant performance gains and reduction in network traffic.

Other non-tool related considerations;

• Try to perform extracts when there is minimal load on the network. Running extracts outside normal business hours when user activity is at a minimum. Another consideration is to schedule extracts around other IT maintenance activities, such as system backups, which could consume large chunks of network bandwidth.

• Since data is going over a wide area network try to design all data extraction with a CDC approach in mind.

By taking advantage of the built-in functionality of WhereScape RED and designing the ETL process with a Wide Area Network in mind you can make significant improvements in the overall performance.

Post Load Procedures 18 Oct ’07

A post-load procedure is executed after the load table has completed and after any ‘after’ transformations have occurred.

So what is the post load procedure for?

The post load procedure is primarily available for situations that occur which require non standard solutions. The use of a post load procedure is a handy way of isolating custom code that might otherwise cause generic stage processing to become more complex and require custom modifications.

How do you decide to use it?

In general if you come up against load table transformation processing that meets some or all of the following criteria.

• Cannot be handled during the load from the source system.

 

• Cannot be handled during ‘after’ transformations.

 

• Involves processing that is isolated to the load table itself.

 

• Does not sit correctly within stage transformation processing.

Examples of situations where you might use a post load procedure are detailed below.

• Delete processing. Filtering on the source system can sometimes cause a performance issue. If this occurs and time permits the ‘whole’ dataset can be loaded. Stage tables could filter this data out, but if multiple stage tables require data from this table it may be quicker overall to delete the data from the load table prior to stage processing. The post load procedure can be used to do this.

 

• Data error checking and load validation. Certain data problems may require checking and validation prior to stage processing. The post load procedure can be used to raise errors and halt processing.

 

• Populate the load table with data. When multiple steps are required to obtain data from a table on the source system and the data warehouse is using database links to source the load data it can occasionally be worth loading the data during post load processing. As an example, when dealing with a header table that lacks a timestamp but the timestamp is available on the detail table. If standard join techniques become problematic the post load procedure can be used to write custom sql that is ‘tuned’ to load the data as quickly as possible.

There are as many different examples of when and how to use a post load procedure as there are different data warehouses and source system applications

Populating a Target Table from Multiple Sources 18 Oct ’07

A common problem encountered when building a data warehouse is to populate a single target table (dimension or fact) from multiple sources. WhereScape RED supports this capability through staging tables via the “Set Merge” Procedure option.

The Set Merge Procedure option allows the merging of two or more identical tables. The table to be merged must have exactly the same number of columns and column names. If necessary, blank columns could be added to one or other of the tables to make them identical. To use this procedure you must have the tables to be merged mentioned at least once in the ‘Source Table’ field of a columns properties.

To illustrate this capability, we will step through a simple example. We are attempting to build a customer dimension that will contain customer id, name and other demographic information. Depending on the type of customer (online or store) their information is stored in separate tables within the source system and the process for sourcing each type of customer varies:

1. First step is to create a load table (load_customer_store) that will source all customer information for those customers which purchase directly from a store.

 

2. Next step is to create a load table (load_customer_online) that will source all customer information for those customers which are deemed online (ie: purchase via web). Ensure that the column names for the table are identical to the ones in the load_customer_store table.

 

3. We now create a staging table (stage_customer) that will initially source from the load_customer_store table. Then alter one of the columns on the stage_customer table and modify the ‘Source Table’ property to source from load_customer_online table.

 

4. Now create the update procedure for stage_customer table and select procedure type of ‘Set Merge’. If you look at the generated code for this procedure you will notice that it loads data from both load_customer_store and load_customer_online tables into the stage_customer table.

 

5. Last step is to create the dimension dim_customer, using the stage_customer staging table as the source for this dimension.

Note: When scheduling the table for loading you must ensure that the dimension table (dim_customer) is not processed until after the staging table (stage_customer) is updated.

By taking advantage of the built-in functionality of WhereScape RED you eliminate the need to write and maintain custom code.

Multiple Source Systems – One Set of Load Tables. How Does it Work? 18 Oct ’07

Consider an environment where a source system is duplicated across 3 sites. Site A, Site B and Site C. There may be 400 load tables that require loading from each of these duplicated source systems, in practical terms they are the same tables but in different implementations of the same application. When building WhereScape RED objects to load data from these tables it is preferable to build the one set of load objects that provide data to the one set of dimension, stage and fact processes. All objects can then be used regardless of the source system providing the actual data. The key to resolving this problem is the use of a WhereScape RED callable procedure called Ws_Connect_Replace. This procedure allows the caller to replace the contents of a connection from another connection. For example, four connections are created:

Connection Site A

Connection Site B

Connection Site C

Connection Site Generic

Stores all connection details for Site A

Stores all connection details for Site B

Stores all connection details for Site C

When called Ws_Connect_Replace swaps A, B or C connectoin details into this connection

All 400 load tables use Connection Site Generic as the connection defined in the metadata.Prior to loading the data from the load tables the procedure Ws_Connect_Replace is called swapping site specific connection details into the generic connection. Processing can then continue in the knowledge that the correct data is being loaded.If you use a custom procedure at the start of a job that contains the call to Ws_Connect_Replace along side some parameter controls. Automation of which site to process can be achieved. In this way we can also have the one set of job metadata to maintain.

If your requirements are smaller and only involve one load table that requires source system or schema swapping, look at the WhereScape RED callable procedure Ws_Load_Change. This procedure allows the caller to change either the connection or the schema of a load table.

Migrating Between RDBMS Platforms 18 Oct ’07

A powerful feature within WhereScape RED is the ability to migrate a metadata repository from one RDBMS platform to another. This will allow you to take an existing data warehouse environment on one database, such as SQL Server, and move it to another like Oracle. While rarely used, this functionality is of immense value should you need to move RDBMS platforms at any stage after the data warehouse has been created.

The following steps can be performed to move an existing WhereScape RED metadata repository from one RDBMS platform to another. Note: this only moves the metadata structures and content, it does not move the data contained in the data warehouse.

Prior to performing the metadata migration you may wish to perform the following tasks to ease the process;

1. Upgrade to the latest version of WhereScape RED, to ensure you have the latest functionality.

 

2. Archive the Scheduler Audit and Detail log tables if you do not need to bring across this information. The archive process moves the records from the audit trail log to an audit trail archive table. This will reduce the amount of data that needs to be imported into the target metadata repository. To perform the archive, go to the Scheduler tab within RED and choose from the menu: Logs

Retrofitting Code in Teradata 18 Oct ’07

WhereScape RED automatically generates, runs and schedules Teradata stored procedures.   It also has support for Teradata macros and bteq scripts, allowing them to be held in the metadata, run and scheduled.

Many Teradata customers have large numbers of existing bteq scripts and macros.   WhereScape has a utility for importing existing procedures, bteq scripts and macros into the WhereScape RED metadata repository.

The WhereScape Code Retrofit Utility creates one procedure, macro or bteq script object in the WhereScape RED Repository for each procedure, macro or bteq script imported.

Large numbers of procedures, macros or bteq scripts can be imported in a single import.   The utility can also be called from a script inside WhereScape RED and scheduled if required:retrofitting-code-shrunk.JPG 

This utility can be supplied on request (including a sample script).

When to Consider Editing Dependencies on a Job 18 Oct ’07

In general, the recommendation regarding editing dependencies is…avoid editing dependencies on a job.

With some thought and configuration the standard WhereScape RED scheduler options for configuring task groupings and job dependencies will provide a job that runs optimally.  Of course a generalist approach can sometimes be circumvented for a valid reason, so if dependencies must be manually modified the option is available.

What can you do when editing dependencies?

Dependencies can be removed, modified and created.

When should this be considered?

When you have given real consideration as to how standard WhereScape RED scheduler configurations can provide the performance you require from the set of tasks. Having gone through this process of elimination and concluded that you are still better off editing dependencies. Edit them.

Having edited dependencies on a job how do I manage ongoing changes?

This question is presented to you after making changes to the job through the tasks edit box.

For a job with edited dependencies never answer yes to the prompt “Do you want to rebuild ALL dependencies”.  If you do all custom dependency edits will be lost.

If you add a task to the job, WhereScape RED will create the necessary dependencies for you without the need to rebuild all dependencies. These dependencies can then be reviewed and edited if necessary.

If you change the order of an existing task you will be presented with the following statement when leaving the task edit box:

Some tasks have beem modified in either their order or action.  NO changes have been made to the dependencies in these modified tasks.  Either make these changes manually (recommended if no standard dependencies) or edit again (make a change) and rebuild the dependencies.

WhereScape RED is now directing you to manually modify existing dependencies and add any new dependencies required using the dependency editor. WhereScape RED will make no additions or modifications to the existing dependencies in this situation.

Are jobs versioned?

No. When you make changes you risk losing a previously edited and working job if things go wrong. This can result in a significant amount of re-work. The advice is to back up jobs with edited dependencies separately. This can be done in two easy ways.

1. Prior to changing the job take a copy (right click and choose ‘insert copy of job’).

 

2. Create an application that holds all jobs with edited dependencies.

Custom Job Control 18 Oct ’07

In some situations you may choose to write custom code to control the activation of a job through the WhereScape RED Scheduler. Situations that can cause the need to consider this requirement are:• Activate a job based on a trigger event. Such as working day # of a calendar month.

• Maintain one version of the job metadata but name its scheduled run iteration dependent on the date of processing or the source system that provides the data

.• Ability to control the scheduled iteration of the job a given number of times, perhaps for a table rebuild.

• Have a job run as a task with a master job but using an alternate scheduler type. The windows scheduler and not the unix scheduler for example.

When writing this custom code it becomes necessary to communicate with some WhereScape Red scheduler metadata objects. The table below lists and describes some key metadata objects and how they can be used when writing custom code.

Object Namews_wrk_schedular

ws_wrk_job_run

 ws_job_restart

ws_job_create

ws_admin_v_sched

DescriptionA table listing schedulers and their active status.

A WhereScape RED audit meta data table that maintains a row for jobs that are running or failed.

A WhereScape RED audit meta data table that maintains a row for jobs that are running or failed

A WhereScape RED callable procedure that allows the creation of a new job from another job that must be in either a hold or waiting state.

A WhereScape RED meta data view. Provides a view across multiple RED audit metadata providing access to the current status of a job

Useful forThis table can be used to confirm that a particular scheduler or scheduler type is available.

If a job has failed a row will exist in this table. This table can then be used to establish if a restart or a job create is required.A restart is very useful because a job may fail having completed 90% of its tasks. When the issue is resolved the job can then pick up from the point that it failed.To restart a failed job

 To create a new job based on another job

This view can be monitored to establish the status of the job

The configuration of the custom procedure will depend heavily on your specific requirements. However in general you will need to communicate with the above objects in almost all cases. The following sql is provided as an example of how it might be done.

– ==============================================================================
– DBMS Name : Oracle
– Script Name : custom_task_job_control
– Description :
– Generated by :
– Generated for :
– Author :
– ==============================================================================
– Notes / History

WhereScape Limited Custom Job control.doc 28 July 2006 Page 1 of 5
– This is designed to be executed as a task within a job.
– It runs another job using Ws_Job_Create and then looks at the audit tables
– for successful completion of that job before completing successfully itself.
– This is needed because RED executes a task job within a job using only one schedular type.
– Unix OR Windows. Not a combination of the two.
– This procedure will execute a task job from within a job and use a different schedular type.
– The only draw back is that the task job audit meta data will be held seperatley to the
– calling job audit meta data.

CREATE OR REPLACE procedure custom_task_job_control
(
p_sequence IN number,
p_job_name IN varchar2,
p_task_name IN varchar2,
p_job_id IN number,
p_task_id IN number,
p_return_msg OUT varchar2,
p_status OUT number
)
AS
–===============================================================
– Control variables used in most programs
–===============================================================
v_MsgText varchar2(255); — Text for audit_trail
v_sql varchar2(4000);– sql statements
v_step integer := 0; — return code
v_update_count integer := 0; — no of records updated
v_insert_count integer := 0; — no of records inserted
v_count integer := 0; — General counter;
v_iteration integer := 0; — iteration counter;
v_fact_inv_start varchar2(30); — date;
v_fact_inv_end varchar2(30); — date;
v_fact_inv_current varchar2(30); — date;
v_status integer := 0; — Status holder
v_return_code varchar2(1);
v_return_msg varchar2(256);
v_wait_limit integer := 0;
v_wait_total integer := 0;
v_sleep integer := 0;
v_template_job_name varchar2(256);
v_task_job_name varchar2(256);
v_gpm_current_cal_month integer := 0; — date (yyyymm);
v_task_sequence integer := 0; — the sequence number of the created task job
v_task_status varchar2(30); — the sequence number of the created task job
BEGIN
v_step := 100;
–=======================================================
– Establish if a Windows schedular is available to
– run the load job.
– If not fail and halt processing
–=======================================================
SELECT count(*)
INTO v_count
FROM ws_wrk_scheduler
WHERE ws_type_ind = ‘W’
AND ws_stop_date is null;
IF v_count = 0 THEN
p_return_msg := ‘Exception no windows schedular active’||
‘ Step ‘ ||v_step;
p_status := WsWrkAudit(’F’,p_job_name, p_task_name, p_sequence,
p_return_msg,SQLCODE,SQLERRM,p_task_id, p_job_id);
p_status := -3;
RETURN;
END IF;
–=======================================================
– Set Some variables
–=======================================================
– Name of job to copy
v_template_job_name := ‘Populate dw fact’;
– Month of data to process
v_gpm_current_cal_month := WsParameterRead(’CURRENT_CAL_MONTH’);
– Combine the month of data to process and the template job name to
– create a task job name.
v_task_job_name := v_gpm_current_cal_month||’ ‘||v_template_job_name;
–=======================================================
WhereScape Limited Custom Job control.doc 28 July 2006 Page 2 of 5
– Descision time create or restart
– If a task job is in a failed state then initiate a restart
– (because the majority of work may already be done)
– else use ws_job_create
–=======================================================
SELECT count(*)
INTO v_count
FROM ws_wrk_job_run
WHERE ws_wrk_job_run.wjr_name = v_task_job_name;
IF v_count != 0 THEN
–=======================================================
– This procedure allows the automated restarting of a
– failed job.
–=======================================================
v_status := Ws_Job_Restart( p_sequence
, p_job_name
, p_task_name
, p_job_id
, p_task_id
, v_task_job_name
, v_return_code
, v_return_msg);
ELSE
–=======================================================
– Run the job using Ws_Job_Create
– This procedure allows the automated creating of a new job
– from an existing job that is in either a holding or waiting
– state.
–=======================================================
v_step := 300;
v_status := Ws_Job_Create( p_sequence
, p_job_name
, p_task_name
, p_job_id
, p_task_id
, v_template_job_name
, v_task_job_name
, NULL
, ‘ONCE’
, NULL
, NULL
, NULL
, NULL
, NULL
, NULL
, NULL
, NULL
, NULL
, NULL
, NULL
, NULL
, NULL
, v_return_code
, v_return_msg);
END IF;
p_return_msg := v_task_job_name||’ ‘||v_return_code||’ ‘||v_return_msg;
p_status := WsWrkAudit(’I’,p_job_name, p_task_name, p_sequence,
p_return_msg,NULL,NULL,p_task_id, p_job_id);
COMMIT;
–=======================================================
– Get the sequence number of the task job
–=======================================================
SELECT MAX(”Sequence”)
INTO v_task_sequence
FROM WS_ADMIN_V_SCHED
WHERE “Job Name” = v_task_job_name;
–=======================================================
– Check for completion of the job
–=======================================================

– Set wait limit for job to MORE than the expected completion time
– of the job called.
– Wait limit and sleep require SECONDS
WhereScape Limited Custom Job control.doc 28 July 2006 Page 3 of 5

v_sleep := 120; — 2 minutes
v_count := 0;
v_wait_total := 0;
v_wait_limit := 28800; — 8 hours

– Sleep for a few minutes before beginning the task job status loop

DBMS_LOCK.SLEEP(180);
WHILE v_wait_total < v_wait_limit LOOP

– Sleep for value in v_sleep

DBMS_LOCK.SLEEP(v_sleep);
v_wait_total := v_wait_total + v_sleep;
v_step := 400;
SELECT “Status”
INTO v_task_status
FROM WS_ADMIN_V_SCHED
WHERE “Sequence” = v_task_sequence;

– Look for ‘On Hold’,’Running’,’Pending’,’Waiting’,’Completed’,’Blocked’,’Failed’,’Failed - Aborted’,’Error Completion’,’Unknown’

IF v_task_status = ‘Completed’ THEN
– exit the loop as the job has completed successfully
EXIT;
ELSIF v_task_status IN (’Pending’,’Waiting’,’Running’) THEN
– keep looping through looking for completion status
NULL;
ELSIF v_task_status IN (’Failed’,’Failed - Aborted’,’Error Completion’,’Unknown’) THEN
– raise error and fail
p_return_msg := ‘Exception in custom_task_job_control. Called job ‘||v_task_job_name||’ has failed. Check audit trail.’;
p_status := WsWrkAudit(’F’,p_job_name, p_task_name, p_sequence,
p_return_msg,NULL,NULL,p_task_id, p_job_id);
p_status := -3;
RETURN;
ELSE
– ‘Blocked’,’On Hold’ or any other status not catered for
– should not occur raise error and fail
p_return_msg := ‘Exception in custom_task_job_control. Called job ‘||v_task_job_name||’ status is ‘||v_task_status;
p_status := WsWrkAudit(’F’,p_job_name, p_task_name, p_sequence,
p_return_msg,NULL,NULL,p_task_id, p_job_id);
p_status := -3;
RETURN;
END IF;
END LOOP;
–=======================================================
– If we have exceeded the wait limit it is likely that
– the task job has experienced a problem.
– This calling task will fail.
–=======================================================
v_step := 500;
IF v_wait_total >= v_wait_limit THEN
p_return_msg := ‘Exception in custom_task_job_control. Wait limit exceeded. Step ‘ ||v_step;
p_status := WsWrkAudit(’F’,p_job_name, p_task_name, p_sequence,
p_return_msg,NULL,NULL,p_task_id, p_job_id);
p_status := -3;
COMMIT;
WhereScape Limited Custom Job control.doc 28 July 2006 Page 4 of 5
RETURN;
END IF;
–===============================================================
–All Done report the results and return.
–p_status is the return code. Valid values are:
– 1 successfull completion
– -2 failed with error
– -3 failed with unhandled error
–p_return_msg is a 256 character message which should
– provide a summary of the result of this procedure.
–===============================================================
COMMIT;
v_step := 700;
p_status := 1;
p_return_msg := ‘custom_task_job_control completed. ‘;
RETURN;
EXCEPTION
–=================================================================
– Unhandle Exception. Return with Fatal Error
–=================================================================
WHEN OTHERS THEN
p_return_msg := ‘Unhandled Exception in custom_task_job_control. ‘||
‘ Step ‘ ||v_step || ‘ ‘||SQLERRM;
p_status := WsWrkAudit(’F’,p_job_name, p_task_name, p_sequence,
p_return_msg,SQLCODE,SQLERRM,p_task_id, p_job_id);
p_status := -3;
RETURN;
END custom_task_job_control;

Fact Tables with No Business Key 18 Oct ’07

Occasionally a fact table is required in which you are unable to define a unique Business Key for the table. A common scenario is with retail or concession data, where you are tracking individual transactions that occur at the cash register. In this case it is possible that you could have multiple rows in the fact table which are identical. One option is to aggregate the duplicate rows and just insert a single row into the fact table. However, most likely you will want to have each individual transaction loaded into the fact table. In this situation you would be unable to define a business key on the fact table as there are no columns that would make for a unique row. To accommodate this, WhereScape RED allows you to take a slightly different approach to loading the staging and fact tables.

Note: Like other functionality within WhereScape RED, all of the resulting code is auto-generated.

Staging Table

If you are doing a “Cursor” approach for populating the staging table, you will be prompted for business key during the procedure creation process. When prompted for the business key, do not select any columns and then deselect the “include update statement” option. The “Set” based methods will not prompt for a business key.

Fact Table

When you build the fact table update procedure, do not select any columns for the business key definition. At this point you can select the “Set based insert” option, on the “Define fact business key(s)” dialog box. If you wish to delete rows from the fact table, prior to loading the staging table data you can also select the “Include delete before insert” option as well (same dialog box).

If you selected the “Include delete before insert” option you will presented with a “Definition for a delete statement before the fact table insert” dialog box. This allows you to build a delete statement that will run prior to inserting any data into the fact table and delete rows based on a specific criterion – typically a column in the staging table. For example, you could delete all rows from the fact table which have a matching “batch id” in the staging table. Once the delete has completed the rows in the staging table will then be loaded into the fact table.

Whether you build this delete statement depends on how the data is being loaded into the fact table. ie: what is the process for loading new/changed rows. If you are continually appending data and never have a need to run a delete, then you don’t select the “Include before insert” option. However, the delete option is important as it can be used to ensure that you do not double-load data into the fact table.

While the majority of fact tables will have a business key defined, there are some scenarios where this is not possible. With a couple minor changes during the procedure build process you can easily accommodate these scenarios.

Keeping track of time when working in remote locations 18 Oct ’07

If you are working in different locations in different time zones scheduling can be confusing as the local PC time is different from the scheduler time.  To assist with this the WhereScape RED scheduler displays the “Database time at last action” in the information line at the bottom of the window.  Simple, but can be useful.

An Alternative Method for Impact Analysis 18 Oct ’07

One of the strengths of WhereScape RED has always been the ability to work in a changing environment; one where source systems as well as user requirements are volatile.  In order to identify the impact of changes, most customers use the track back and track forward capabilities.  These are available in two places – through the diagrammatical view for table level impact analysis, and through the “Track back on specified column usage” report available from the Reports menu.

Another technique is to use the “Search for string” menu option under Tools.  This search allows you to select whether you want to search procedures, scripts and/or tables for a specific string.  It will search the meta data for all occurrences of the specified string in the selected objects. 

Using the search capability can be the fastest way to see which objects will be impacted by an intended change.