Friday, September 19, 2008

Datastep with modify and the second history table.

It is amaizing how simple SAS allows different update strategies of
your datawarehouse. So I decided to post in my blog about it.
I use modified update strategy of type 1 (Ralph Kimball). The records
are replaced but the history is kept in the separate table. May be it
is update of type 4 or 5, but I am not sure. Have to check.

The datset with insurance polises OVK_PRO is weekly refreshed with
new polises but also existing polises that have changed. For example
the end date of the polis. The modify statement uses a KEY= Option
and a SET Statement: An individual index based record lookup. My
problem was that I could not fully follow the changes. I update four
fields for the existing polises: DATING=DATING_IN DATEND=DATEND_IN
DATREG=DATREG_IN KLANT_SOORT=KLANT_SOORT_IN and set the update date.
Let's call these four fields - modify fields. So at the end I can see
which records are changed but not which modify fields exactly.
For several resons the changes of the end date of the polis are
intresting to track. However I do not want to keep the history in the
main table because it is alreary too big. So update of type 2 is not
an option. So I tried to shift into a hybrid update strategy: modify
statement (type 1) as usial but with a second dataset
wrsas.ovk_hist_stg for the history.

With few modifications of the original datastep it worked!

So here it is:
When the matching record is found and at least one of the four fields
is different the program comes to the point :

do;
output wrsas.ovk_hist_stg;
....

the copy of the original records is sent to ovk_hist_stg. After that
the fields are updated as usual. I had to change
replace wrsas.OVK_PRO;
Normaly replace works without dataset name. but if there are two sets
, you have to specify otherwise SAS will
complate that it cannot modify the second dataset which is obviously
open for output and not for modifications.
So this is the code:

OVK_PRO - main table
OVK_PRO_UPDATE data staging table with new or modified polises
ovk_hist_stg- the hisotorical table to track changes
OVK_KEY - composite key defined on OVK_PRO . (the fields in the key
must be present in OVK_PRO_UPDATE for the lookup )

%LET STEP=UPDATE OVK_PRO ;
data wrsas.OVK_PRO wrsas.ovk_hist_stg;
set OVK_PRO_UPDATE(
keep=PRODUCT_ID NUMOVK NUMVNRKPR NUMVNRPRD DATING DATEND DATREG KLANT_SOORT
rename=(DATING=DATING_IN DATEND=DATEND_IN DATREG=DATREG_IN
KLANT_SOORT=KLANT_SOORT_IN)
);
do while (_iorc_=0);
modify wrsas.OVK_PRO key=OVK_KEY;
select (_IORC_);
when (%SYSRC(_SOK)) do;
if (DATING ne DATING_IN
or DATEND ne DATEND_IN
or DATREG ne DATREG_IN
or KLANT_SOORT ne KLANT_SOORT_IN) then
do;
output wrsas.ovk_hist_stg;
DATING=DATING_IN;
DATEND=DATEND_IN;
DATREG=DATREG_IN;
KLANT_SOORT=KLANT_SOORT_IN;
UPDATED_WHEN=date();
replace wrsas.OVK_PRO;
end;
_iorc_=0;
end;
when (%SYSRC(_DSENOM)) do;
_error_=0;
end;
otherwise do;
put 'ERR' 'OR: Unknown IO ';
stop;
end;
end;
end;
_iorc_=0; /* reset for next datastep iteration */
run;
%CHECK_ERROR;


Now I am looking forward to the next week to run in on fresh data and
maybe discover some weird data quality issues.


Met vriendelijke groet,
Maxim

http://www.4suc6.com
Share your goals!

No comments: