Grouping records with proc sql
I came across a weird thing in SAS Base while using proc sql to group records.
When you group records without using statistical functions gives
different results when you do use a statistical function. It seems to
me very strange. Example.
proc sql;
create table updt_pro as
select codmrk, coddvg, codkpr, codprd, count(*) as n
from ovk_pro_stg_vw
group by codmrk, coddvg, codkpr, codprd
;
quit;
gives me 367 records, and as expected the combinations of group
variables are unique.
the same code without count function gives 983450 records.
proc sql;
create table updt_pro as
select codmrk, coddvg, codkpr, codprd
from ovk_pro_stg_vw
group by codmrk, coddvg, codkpr, codprd
;
quit;
it is obligatory to use at least one aggregate statistical
function, otherwise SAS simply transform group by into order by.
(WARNING: A GROUP BY clause has been transformed into an ORDER BY
clause because neither the SELECT clause nor the optional HAVING
clause of the associated table-expression referenced a
summary function.
)
Tip: when you want to find unique combination of the group variables use select distinct:
proc sql;
create table updt_pro as
select distinct codmrk, coddvg, codkpr, codprdfrom ovk_pro_stg_vw
;
quit;
Maxim
No comments:
Post a Comment