Previous duplicate row removal - Reg

Hi Team,
Good morning.
I have the following sample data:
CREATE TABLE sample(
CHILD_NUMBER INTEGER NOT NULL PRIMARY KEY
,PARENT_NUMBER INTEGER
,KEY_VALUE VARCHAR(2) NOT NULL
,FLOW_NODE INTEGER NOT NULL
);
INSERT INTO sample(CHILD_NUMBER,PARENT_NUMBER,KEY_VALUE,FLOW_NODE) VALUES (95,NULL,'P1',1);
INSERT INTO sample(CHILD_NUMBER,PARENT_NUMBER,KEY_VALUE,FLOW_NODE) VALUES (96,95,'P1',2);
INSERT INTO sample(CHILD_NUMBER,PARENT_NUMBER,KEY_VALUE,FLOW_NODE) VALUES (96,95,'P1',1);
INSERT INTO sample(CHILD_NUMBER,PARENT_NUMBER,KEY_VALUE,FLOW_NODE) VALUES (98,NULL,'P2',1);
INSERT INTO sample(CHILD_NUMBER,PARENT_NUMBER,KEY_VALUE,FLOW_NODE) VALUES (99,98,'P2',2);
INSERT INTO sample(CHILD_NUMBER,PARENT_NUMBER,KEY_VALUE,FLOW_NODE) VALUES (99,98,'P2',1);
A duplicate child_number should be filtered out when it is already available with flow_node 2. Again, it should not be repeated for flow_node 1.
my expected output
Answers
-
Try this.
SELECT CHILD_NUMBER, PARENT_NUMBER, KEY_VALUE, FLOW_NODE
FROM (
    SELECT a.*
          ,ROW_NUMBER() OVER (PARTITION BY CHILD_NUMBER, PARENT_NUMBER, KEY_VALUE
                              ORDER BY FLOW_NODE DESC) ranking
    FROM sample a
)
WHERE ranking = 1
If you want to check for duplicates in the CHILD_NUMBER column only, use just that column in the PARTITION BY clause instead of the 3 columns in the query above.
BTW, if you have a primary key on the child_number column, it won't have duplicates.
Regards
Arun
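The single-column variant mentioned above can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for Oracle (SQLite 3.25 or later is needed for window functions), and it leaves out the PRIMARY KEY so that the duplicate child_number rows can be loaded. Only the PARTITION BY clause differs from the query in the answer.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Oracle.
# The PRIMARY KEY is omitted so the duplicate child_number rows load.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sample (
        child_number  INTEGER NOT NULL,
        parent_number INTEGER,
        key_value     VARCHAR(2) NOT NULL,
        flow_node     INTEGER NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?, ?)",
    [(95, None, "P1", 1), (96, 95, "P1", 2), (96, 95, "P1", 1),
     (98, None, "P2", 1), (99, 98, "P2", 2), (99, 98, "P2", 1)],
)

# For each child_number, keep only the row with the highest flow_node.
rows = conn.execute("""
    SELECT child_number, parent_number, key_value, flow_node
    FROM (
        SELECT a.*,
               ROW_NUMBER() OVER (PARTITION BY child_number
                                  ORDER BY flow_node DESC) AS ranking
        FROM sample a
    )
    WHERE ranking = 1
    ORDER BY child_number
""").fetchall()

for row in rows:
    print(row)
```

On the sample data this keeps one row per child_number: 95 and 98 with flow_node 1, and 96 and 99 with flow_node 2.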
-
Hi, @User_U1SBG
Thanks for posting the sample data; that's very helpful.
CREATE TABLE sample(
CHILD_NUMBER INTEGER NOT NULL PRIMARY KEY
...
INSERT INTO sample(CHILD_NUMBER,PARENT_NUMBER,KEY_VALUE,FLOW_NODE) VALUES (96,95,'P1',2);
INSERT INTO sample(CHILD_NUMBER,PARENT_NUMBER,KEY_VALUE,FLOW_NODE) VALUES (96,95,'P1',1);
If child_number is the primary key, then you can't insert two rows with child_number = 96. I had to remove the PRIMARY KEY constraint in order to make the INSERT statements work.
Again, it should not be repeated for flow_node 1
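A minimal sketch of that constraint problem, again assuming Python's sqlite3 module in place of Oracle: with the PRIMARY KEY left in, the second row for child_number 96 is rejected. SQLite reports it as an IntegrityError; Oracle would raise ORA-00001 (unique constraint violated) instead.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; the table keeps the PRIMARY KEY
# exactly as posted in the question.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sample (
        child_number  INTEGER NOT NULL PRIMARY KEY,
        parent_number INTEGER,
        key_value     VARCHAR(2) NOT NULL,
        flow_node     INTEGER NOT NULL
    )
""")
conn.execute("INSERT INTO sample VALUES (96, 95, 'P1', 2)")

try:
    # Same child_number as the row above, so the key is violated.
    conn.execute("INSERT INTO sample VALUES (96, 95, 'P1', 1)")
    second_insert_rejected = False
except sqlite3.IntegrityError as exc:
    second_insert_rejected = True
    print("rejected:", exc)

row_count = conn.execute("SELECT COUNT(*) FROM sample").fetchone()[0]
print("rows stored:", row_count)
```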
So, when the table contains two (or more) rows with the same combination of child_number, parent_number and key_value, the output should have only one row. That sounds like a job for GROUP BY, for example:
SELECT    child_number, parent_number, key_value
,         MAX (flow_node)  AS flow_node
FROM      sample
GROUP BY  child_number, parent_number, key_value
ORDER BY  child_number, parent_number, key_value  -- or whatever you want
;
Aggregate functions are usually more efficient than analytic functions, such as ROW_NUMBER.
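As a quick check that the two approaches in this thread agree, the sketch below runs the GROUP BY query and the ROW_NUMBER query against the sample data and compares the results. It assumes Python's sqlite3 module (SQLite 3.25 or later) rather than Oracle, with the PRIMARY KEY removed, so it says nothing about relative performance; that would have to be measured on the real table.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; PRIMARY KEY removed as described.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sample (
        child_number  INTEGER NOT NULL,
        parent_number INTEGER,
        key_value     VARCHAR(2) NOT NULL,
        flow_node     INTEGER NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?, ?)",
    [(95, None, "P1", 1), (96, 95, "P1", 2), (96, 95, "P1", 1),
     (98, None, "P2", 1), (99, 98, "P2", 2), (99, 98, "P2", 1)],
)

# Aggregate version: one row per group, carrying the highest flow_node.
group_by_rows = conn.execute("""
    SELECT child_number, parent_number, key_value, MAX(flow_node) AS flow_node
    FROM sample
    GROUP BY child_number, parent_number, key_value
    ORDER BY child_number, parent_number, key_value
""").fetchall()

# Analytic version: rank rows within each group, keep the top-ranked one.
row_number_rows = conn.execute("""
    SELECT child_number, parent_number, key_value, flow_node
    FROM (
        SELECT a.*,
               ROW_NUMBER() OVER (PARTITION BY child_number, parent_number, key_value
                                  ORDER BY flow_node DESC) AS ranking
        FROM sample a
    )
    WHERE ranking = 1
    ORDER BY child_number, parent_number, key_value
""").fetchall()

same_result = group_by_rows == row_number_rows
print(group_by_rows)
print("identical results:", same_result)
```

The two forms stop being interchangeable once the output needs columns that are neither grouped nor aggregated; in that case the ROW_NUMBER form returns the whole winning row, while the GROUP BY form needs an aggregate for each extra column.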