Friday, September 26, 2008

sql table join with a union of several tables

A join of a table with several other uniform tables can be done in a
single sql statement. In this example a table D (transactional data)
has to be joined with a table where products per transactions are
stored. It used to be a single table CTT_KPR. But recently a new line
of products was added as the separate table CTT_ZKKKPR. In order to
integrate data from the new table with the minimum efforts you can
use Union oprator and extend the existing join. The performance does
not change significantly. The code belows gives an idea how it can
look like.

......

AS D LEFT JOIN
( SELECT
ID_CTTMMT
,ID_MDW
,NUMTRA
,CODKPR
,CODMRK
,CODDVG
,DATPELEND
FROM KKC.CTT_KPR
UNION
SELECT
ID_CTTMMT
,ID_MDW
,NUMTRA
,CODKPR
,CODMRK
,CODDVG
,DATPELEND
FROM KKC.CTT_ZKKKPR
) as KPR
ON
D.ID_CTTMMT = KPR.ID_CTTMMT
AND D.NUMTRA = KPR.NUMTRA
AND D.ID_MDW = KPR.ID_MDW
.....

Friday, September 19, 2008

Datastep with modify and the second history table.

It is amaizing how simple SAS allows different update strategies of
your datawarehouse. So I decided to post in my blog about it.
I use modified update strategy of type 1 (Ralph Kimball). The records
are replaced but the history is kept in the separate table. May be it
is update of type 4 or 5, but I am not sure. Have to check.

The datset with insurance polises OVK_PRO is weekly refreshed with
new polises but also existing polises that have changed. For example
the end date of the polis. The modify statement uses a KEY= Option
and a SET Statement: An individual index based record lookup. My
problem was that I could not fully follow the changes. I update four
fields for the existing polises: DATING=DATING_IN DATEND=DATEND_IN
DATREG=DATREG_IN KLANT_SOORT=KLANT_SOORT_IN and set the update date.
Let's call these four fields - modify fields. So at the end I can see
which records are changed but not which modify fields exactly.
For several resons the changes of the end date of the polis are
intresting to track. However I do not want to keep the history in the
main table because it is alreary too big. So update of type 2 is not
an option. So I tried to shift into a hybrid update strategy: modify
statement (type 1) as usial but with a second dataset
wrsas.ovk_hist_stg for the history.

With few modifications of the original datastep it worked!

So here it is:
When the matching record is found and at least one of the four fields
is different the program comes to the point :

do;
output wrsas.ovk_hist_stg;
....

the copy of the original records is sent to ovk_hist_stg. After that
the fields are updated as usual. I had to change
replace wrsas.OVK_PRO;
Normaly replace works without dataset name. but if there are two sets
, you have to specify otherwise SAS will
complate that it cannot modify the second dataset which is obviously
open for output and not for modifications.
So this is the code:

OVK_PRO - main table
OVK_PRO_UPDATE data staging table with new or modified polises
ovk_hist_stg- the hisotorical table to track changes
OVK_KEY - composite key defined on OVK_PRO . (the fields in the key
must be present in OVK_PRO_UPDATE for the lookup )

%LET STEP=UPDATE OVK_PRO ;
data wrsas.OVK_PRO wrsas.ovk_hist_stg;
set OVK_PRO_UPDATE(
keep=PRODUCT_ID NUMOVK NUMVNRKPR NUMVNRPRD DATING DATEND DATREG KLANT_SOORT
rename=(DATING=DATING_IN DATEND=DATEND_IN DATREG=DATREG_IN
KLANT_SOORT=KLANT_SOORT_IN)
);
do while (_iorc_=0);
modify wrsas.OVK_PRO key=OVK_KEY;
select (_IORC_);
when (%SYSRC(_SOK)) do;
if (DATING ne DATING_IN
or DATEND ne DATEND_IN
or DATREG ne DATREG_IN
or KLANT_SOORT ne KLANT_SOORT_IN) then
do;
output wrsas.ovk_hist_stg;
DATING=DATING_IN;
DATEND=DATEND_IN;
DATREG=DATREG_IN;
KLANT_SOORT=KLANT_SOORT_IN;
UPDATED_WHEN=date();
replace wrsas.OVK_PRO;
end;
_iorc_=0;
end;
when (%SYSRC(_DSENOM)) do;
_error_=0;
end;
otherwise do;
put 'ERR' 'OR: Unknown IO ';
stop;
end;
end;
end;
_iorc_=0; /* reset for next datastep iteration */
run;
%CHECK_ERROR;


Now I am looking forward to the next week to run in on fresh data and
maybe discover some weird data quality issues.


Met vriendelijke groet,
Maxim

http://www.4suc6.com
Share your goals!