Hi Everyone,
Good morning. I have a requirement like I need to remove duplicates for all columns except the last one which is always distinct. The last column values should come in a single row (but can be in multiple columns/can be concatenated into a single column) in the resultant set. The distinct values possible for the last columns are fixed which are S1,S2 & S3.
Please find below my input and the expected output.
Input:
--------
Col1 Col2 Col3 Col4
----- ----- ----- ------
M1 a b S1
M1 a b S2
M1 a b S3
M2 c d S1
M2 c d S2
M3 e f S1
I need the output in any of the two ways mentioned below
Output Way 1 :
-----------------
Col1 Col2 Col3 Col4 Col5 Col6
---- ----- ------ ----- ----- ------
M1 a b S1 S2 S3
M2 c d S1 S2
M3 e f S1
Output Way 2:
-------------------
Col1 Col2 Col3 Col4
---- ----- ------ -------------
M1 a b S1,S2,S3
M2 c d S1,S2
M3 e f S1
WITH TBL1(col1,col2,col3,col4)
AS
( SELECT 'M1','a','b','S1' FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 'M1','a','b','S2' FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 'M1','a','b','S3' FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 'M2','c','d','S1' FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 'M2','c','d','S2' FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 'M3','e','f','S1' FROM SYSIBM.SYSDUMMY1
),
t1(rowNum,col1,col2,col3, col4) AS
( SELECT rownumber() over(partition by col1,col2,col3 order by col4),
col1,col2,col3,col4
FROM tbl1 ),
t2(col1,col2,col3, list, cnt) AS
( SELECT col1,col2,col3, VARCHAR(col4, 50), 1
FROM t1
WHERE rowNum = 1
UNION ALL
SELECT t2.col1,t2.col2,t2.col3, t2.list || ', ' || t1.col4,
t2.cnt + 1
FROM t2, t1
WHERE t2.col1 = t1.col1 AND
t2.col2 = t1.col2 AND
t2.col3 = t1.col3 AND
t2.cnt + 1 = t1.rowNum )
SELECT col1,col2,col3, list
FROM t2
WHERE ( col1,col2,col3, cnt ) IN (
SELECT col1,col2,col3,
MAX(rowNum)
FROM t1
GROUP BY col1,col2,col3 )
ORDER BY 1
Method 2:
Code:
WITH TBL1(col1,col2,col3,col4)
AS
( SELECT 'M1','a','b','S1' FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 'M1','a','b','S2' FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 'M1','a','b','S3' FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 'M2','c','d','S1' FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 'M2','c','d','S2' FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 'M3','e','f','S1' FROM SYSIBM.SYSDUMMY1
)
select col1,col2,col3,
replace(
replace(
replace(
CAST(xmlserialize(XMLAGG(XMLELEMENT(NAME "x", col4) ) as CLOB(1000)) AS VARCHAR(1000))
, '</x><x>', ',')
, '<x>', '')
, '</x>', '') as ids
from TBl1
group by col1,col2,col3;
Input:
--------
Col1 Col2 Col3 Col4
----- ----- ----- ------
M1 a b S1
M1 a b S2
M1 a b S3
M2 c d S1
M2 c d S2
M3 e f S1
Code:
select main.col1,main.col2,main.col3,
case when main.S1 = 1 then 'S1' else ' ' end as col4,
case when main.S2 = 2 then 'S2' else ' ' end as col5,
case when main.S3 = 3 then 'S3' else ' ' end as col6
from
(select col1,col2,col3,
sum(case when col4 = 'S1' then 1 else 0 end) as S1,
sum(case when col4 = 'S2' then 2 else 0 end) as S2,
sum(case when col4 = 'S3' then 3 else 0 end) as S3
from table a
group by col1,col2,col3) main
select col1,col2,col3,
replace(max(case when col4 = 'S1' then 'Y' else ' ' end),'Y','S1') as col4,
replace(max(case when col4 = 'S2' then 'Y' else ' ' end),'Y','S2') as col5
replace(max(case when col4 = 'S3' then 'Y' else ' ' end),'Y','S3') as col6,
from table a
group by col1,col2,col3
All your methods work well. Bharat's XMLAGG method and Rohit's replace(max()) are very easy to use and are efficient too.
Comparatively XMLAGG method retrieves the result faster than replace(max()) method.
I appreciate your help. Thank you very much both of you .