This content has been marked as final. Show 7 replies
Whenever you have a question, please post a little sample data (CREATE TABLE and INSERT statements) and the results you want from that data. Explain how you get those results from that data.
If you can use commonly available tables, like those in the scott schema, to show your problem, then you don't need to post any sample data; just the results explanation.
You could use a WHERE clause to remove rows where ActualFGMaterialNum is NULL. Since the WHERE clause is applied after CONNECT BY is finished (and SYS_CONNECT_BY_PATH is formed), that probably means another sub-query. To make sure that rnum is consecutive, you'll have to be sure to compute rnum in the same sub-query where the NULLs are exclued, or at least not earlier than that.
A less efficient approach would be to replace multiple consecutive ';'s in the final output with a single ';'. In Oracle 10 (and up), that's fairly simple:
returns a copy of txt with all groups of consecutive ';'s (no matter how many there are) replaced by one ';'.
REGEXP_REPLACE ( txt , ';+' , ';' )
Do you really need to do two nested CONNECT BY queries? Whatever you're doing, I'll bet there's a simpler, more efficient way to do it.
Thank you for your note.
I am new to using sys_connect_by_path and have researched through a few posts on this forum to get a basic understanding.
The approach I used here is what I has picked up from one of the posts here which was trying to do a similar thing.
To your point, there may be a better way to do it, but I haven't been able to figure it out.
If you have any thoughts on how to improve it, I would be more than happy to learn.
The solution I am trying to come up with is to basically take multiple rows and concatenate their column values to show as a single string separated by semi colon. In the process I want to avoid duplicates.
user604168 wrote:A lot of people do that with just one CONNECT BY query; I don't know why you need two CONNECT BY queries, one nested inside the other. It depends on your data, the results you want from that data, and why you want those results (so that you'll get a solution that not only works on the sample data you post, but on any data you'll get in the future). That's why I asked you to post a little sample data (CREATE TABLE and INSERT statements), the results you want from that data, and an explanation of why you want those results from that data.
... The solution I am trying to come up with is to basically take multiple rows and concatenate their column values to show as a single string separated by semi colon. In the process I want to avoid duplicates.
If you have any thoughts on how to improve it, I would be more than happy to learn.What you posted may be a great solution for your problem.
There may be another way, simpler and more efficient, that gets the results you want.
As long as I don't know where you're starting from, and where you want to go, I can't give you very good directions.
Below are the details of create table / insert table
create table RepTest1 (ID NUMBER NOT NULL, PRIMARY KEY (ID)); create table RepTest2 (FID NUMBER NOT NULL, ID NUMBER, PRIMARY KEY (FID), CONSTRAINT fk_ID FOREIGN KEY (ID) REFERENCES RepTest1(ID)); create table RepTest3 (TID NUMBER NOT NULL, PGRD VARCHAR2(3) NULL, SPSU VARCHAR2(4) NULL, FID NUMBER, PRIMARY KEY (TID), CONSTRAINT fk_fid FOREIGN KEY (FID) REFERENCES RepTest2(FID));
Based on the above data, I want to concatenate all the RepTest3 columns which have the same FID for each unique ID in RepTest1. Here I also want to eliminate any duplicate values while concatenating .
INSERT INTO RepTest1 values (8229); INSERT INTO RepTest1 values (6614); INSERT INTO RepTest2 values (8249, 8229); INSERT INTO RepTest2 values (6634, 6614); INSERT INTO RepTest3 values (8519, 'K', '3A', 8249); INSERT INTO RepTest3 values (8520, 'K', '4A', 8249); INSERT INTO RepTest3 values (8521, 'B', '4A', 8249); INSERT INTO RepTest3 values (8522, 'B', '3A', 8249); INSERT INTO RepTest3 values (8523, 'K', '2A', 8249); INSERT INTO RepTest3 values (8525, 'K', '4A', 8249); INSERT INTO RepTest3 values (8526, 'B', '4A', 8249); INSERT INTO RepTest3 values (8977, 'K', '3A1', 8249); INSERT INTO RepTest3 values (8417, 'K', '2A', 8249); INSERT INTO RepTest3 values (8524, 'K', '3A', 8249); INSERT INTO RepTest3 values (7623, 'A', '00', 6634); INSERT INTO RepTest3 values (7624, '', '01', 6634); INSERT INTO RepTest3 values (7625, 'A', '02', 6634);
When I execute the below query:
The results are as follows:
SELECT ID, (trim(';' from (sys_connect_by_path(SPSU,';')))) SPSU, (trim(';' from (sys_connect_by_path(PGRD,';')))) PGRD FROM ( SELECT ID, CASE WHEN SYS_CONNECT_BY_PATH(PRIOR SPSU,';') || ',' LIKE '%' || SPSU || '%' THEN NULL ELSE SPSU END SPSU, CASE WHEN SYS_CONNECT_BY_PATH(PRIOR PGRD,';') || ',' LIKE '%' || PGRD || '%' THEN NULL ELSE PGRD END PGRD, rnum, tot FROM ( select rt1.ID AS ID , rt3.spsu AS spsu, rt3.pgrd AS pgrd, row_number() OVER (PARTITION BY rt1.id ORDER BY rt3.tid) rnum, COUNT(*) OVER (PARTITION BY rt1.id) tot FROM ADI.RepTest1 rt1 LEFT OUTER JOIN ADI.RepTest2 rt2 ON rt2.ID = rt1.Id LEFT OUTER JOIN ADI.RepTest3 rt3 ON rt3.FID = rt2.FID ) START WITH rnum=1 CONNECT BY PRIOR rnum = rnum-1 AND PRIOR ID = ID ) WHERE rnum=tot START WITH rnum =1 CONNECT BY PRIOR rnum = rnum-1 AND PRIOR ID = ID;
It returns the unique values removing the duplicates, but somehow introduces extra semicolons even when there is no null data.
_ID_ _SPSU_ _PGRD_ 6614 00;01;02 A 8229 2A;3A;4A;;;;;;;3A1 K;;;B
When I use Regexp_replace on the final value, it removes the extra semicolons, but I am not sure why it is adding the extra semicolons in the first place.
Pls let me know if I should provide any further detail.
I wouldn't use SYS_CONNECT_BY_PATH for this. I would use the user-defined aggregate fucntion STRAGG, which you can copy from
It's about 60 lines of PL/SQL code, but once you've installed it, you can use it anywhere, on any table.
Look how simple your query is using STRAGG!
SELECT r1.id , REPLACE (STRAGG (DISTINCT r3.spsu), ',', ';') AS spsu , REPLACE (STRAGG (DISTINCT r3.pgrd), ',', ';') AS pgrd FROM reptest1 r1 LEFT OUTER JOIN reptest2 r2 ON r1.id = r2.id LEFT OUTER JOIN reptest3 r3 ON r2.fid = r3.fid GROUP BY r1.id ORDER BY r1.id ;
Since STRAGG is set up once for ever, you don't get any choice about the delimiter. I installed it with ',' as the delimiter, but you could code it so ';' was the delimiter. As you can see, it's easy enough to use REPLACE if you don't want to use the standard delimiter.
` ID SPSU PGRD ---------- -------------------- -------------------- 6614 00;01;02 A 8229 2A;3A;3A1;4A B;K
If you wanted to use SYS_CONNECT_BY_PATH, I would unpivot the data so that there was one item (a single pgrd or spsu value) on a row, then use SYS_CONNECT_BY_PATH to get the delimited list.
This is more lines of code than what you had, but I think it will be faster because it is only doing one CONNECT BY query. This way works well if you need the output in a particular order.
WITH cntr AS ( SELECT 'PGRD' AS column_name FROM dual UNION ALL SELECT 'SPSU' FROM dual ) , unpivoted_data AS ( SELECT DISTINCT r1.id , c.column_name , CASE c.column_name WHEN 'PGRD' THEN r3.pgrd WHEN 'SPSU' THEN r3.spsu END AS txt , DENSE_RANK () OVER ( PARTITION BY r1.id , c.column_name ORDER BY CASE c.column_name WHEN 'PGRD' THEN r3.pgrd WHEN 'SPSU' THEN r3.spsu END ) AS r_num FROM reptest1 r1 JOIN reptest2 r2 ON r1.id = r2.id JOIN reptest3 r3 ON r2.fid = r3.fid CROSS JOIN cntr c ) , got_list_txt AS ( SELECT id , column_name , TRIM (BOTH ';' FROM SYS_CONNECT_BY_PATH (txt, ';')) AS list_txt FROM unpivoted_data WHERE CONNECT_BY_ISLEAF = 1 START WITH r_num = 1 CONNECT BY r_num = PRIOR r_num + 1 AND id = PRIOR id AND column_name = PRIOR column_name ) SELECT rt1.id , MIN (CASE WHEN column_name = 'SPSU' THEN lt.list_txt END) AS spsu , MIN (CASE WHEN column_name = 'PGRD' THEN lt.list_txt END) AS pgrd FROM reptest1 rt1 LEFT OUTER JOIN got_list_txt lt ON rt1.id = lt.id GROUP BY rt1.id ORDER BY rt1.id ;
Either way scales nicely if you need to add more delimited columns like spsu and pgrd.
The extra ';'s in your original query were caused because you weren't really excluding duplicates. Whether a value was a duplicate or not, it was causing an entry in the list, but code like this:
was checking for duplicates, and adding NULL (plus a ';' delimiter) to the list when the item was a duplicate. This is pretty inefficient compared to built-in techniques like DISTINCT, expecially since this involves both SYS_CONNECT_BY_PATH and LIKE, which are fairly expensive SQL tools.
... CASE WHEN SYS_CONNECT_BY_PATH(PRIOR SPSU,';') || ',' LIKE '%' || SPSU || '%' THEN NULL ELSE SPSU END SPSU,
Good job on the sample data! It looks like you changed the values to short, easy-to-recognize strings, such as '2A' and 'K'. Thanks! That's very helpful, and makes it easier to be sure the answer is correct.