Forum Stats

  • 3,874,182 Users
  • 2,266,679 Discussions
  • 7,911,761 Comments

Discussions

How to get a record count for each file in a table?

User_DIL60
User_DIL60 Member Posts: 29 Green Ribbon

Good Morning Everyone,

Hope everyone is doing well. I need some help with a query where I am not getting the expected results. I am trying to write a query to fetch a result PASS/FAIL based on the record count WHERE INSERTED_DATE BETWEEN first day of the currrent month and last day of the current month. I have a situation where few files will be loaded every month in a table and I want to check record count for each file and if the record count is more than 2 for each file, then display PASS and even if one file is not loaded or have less than 2 records per file then display FAIL.

Create table and test data below.

create table ECS_MONTHLY_LOAD (

rec_id varchar2(50 byte),

file_name varchar2(25 byte),

inserted_date date);


INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('112233445566','86_ECS_MON_20221024.TXT','24-OCT-22');


INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('123456789','86_ECS_MON_20221024.TXT','24-OCT-22');


INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('987654321','86_ECS_MON_20221024.TXT','24-OCT-22');


INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('ABCD-EFGH-1234','76_ECS_MON_20221024.TXT','24-OCT-22');


INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('ZEDIC-9876-UJIE','76_ECS_MON_20221024.TXT','24-OCT-22');


INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('POIUYU-88768','96_ECS_MON_20221024.TXT','24-OCT-22');

Query I am using:

SELECT 

CASE WHEN COUNT(*) IS NULL OR COUNT(*)<2 THEN 'FAIL' ELSE 'PASS' END AS ECS_RECORD_COUNT_CHECK

FROM ECS_MONTHLY_LOAD 

WHERE 

INSERTED_DATE BETWEEN TRUNC((SYSDATE),'MONTH') AND LAST_DAY(SYSDATE);

I have only one record for file 96_ECS_MON_20221024.TXT and still I am getting output as PASS. If the record count per file is less than 2 or if any of the file '96_ECS_MON_20221024.TXT', '76_ECS_MON_20221024.TXT' or '86_ECS_MON_20221024.TXT' is not loaded then output should be FAIL. Date inside the file name changes every month. I think I am not checking record count for each file and not sure how I can do that. Any help on this please?

Best Answer

«1

Answers

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 42,746 Red Diamond
    edited Oct 24, 2022 2:59PM

    Hi, @User_DIL60

    Thanks for posting the sample data, but don't try to insert strings (such as '24-OCT-22') into a DATE column (such as inserted_date). Use DATE literals orTO_DATE with 4-digit years. Don't forget to post the exact results you want from the given sample data.

  • User_3ABCE
    User_3ABCE Member Posts: 192 Silver Badge

    BETWEEN TRUNC((SYSDATE),'MONTH') AND LAST_DAY(SYSDATE);

    You can skip the last day of month if INSERTED_DATE is not at 00:00:00.

    COUNT(*) IS NULL

    count(...) can not be null.

    If the record count per file is less than 2

    Your query counts the total number of records, not per file.

    if any of the file '96_ECS_MON_20221024.TXT', '76_ECS_MON_20221024.TXT' or '86_ECS_MON_20221024.TXT' is not loaded

    The list of files is predefined?

    select case (select count(count(*))
                 from ECS_MONTHLY_LOAD
                 where file_name in ('96_ECS_MON_20221024.TXT', '76_ECS_MON_20221024.TXT' or '86_ECS_MON_20221024.TXT')
                   and INSERTED_DATE >= trunc(sysdate,'mm') and INSERTED_DATE < last_day(trunc(sysdate))+1
                 group by file_name
                 having count(*) > 1)
             when 3 then 'PASS' else 'FAIL' end ECS_RECORD_COUNT_CHECK
    from dual;
    
  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 42,746 Red Diamond
    edited Oct 24, 2022 3:19PM

    Hi, @User_DIL60

    COUNT (*) tells you how many rows were selected from the table. If you want to know how many rows there were for each file_name, then you can use GROUP BY file_name. Perhaps you want something like this:

    SELECT    CASE
    	      WHEN  MIN (COUNT (*)) > 2
    	      THEN  'PASS'
    	      ELSE  'FAIL'
    	  END	AS ecs_record_count_check
    FROM	  ecs_monthly_load
    WHERE	  inserted_date  >=              TRUNC (inserted_date, 'MONTH')
    AND	  insertd_date   <   ADD_MONTHS (TRUNC (inserted_date, 'MONTH'), 1)
    GROUP BY  file_name
    ;
    

    Be careful using LAST_DAY. The hours minutes and seconds returned witll be the same as the hours, minutes and seconds of the argument, so if you say

    WHERE 

    INSERTED_DATE BETWEEN TRUNC((SYSDATE),'MONTH') AND LAST_DAY(SYSDATE);

    ann run the query at 9:00 AM on any day in October, 2022, then you will only include rows with inserted_date up to 9:00 AM on October 31.

    What results do you want if there are no rows at all in the table for the current month?

  • User_DIL60
    User_DIL60 Member Posts: 29 Green Ribbon

    Thank you Frank. The query should fetch data between first day of the current month to last day of the current month (12:00AM to whenever the job runs so we don't need the last second of the last day. Thank you very much for educating me on this as I didn't know this before.)

    The query should output FAIL if record count is less than 2 per file name and if any of the file is not present in the table which means file is not loaded at all for that month. if the record count is more than 2 per file and all 3 files are loaded in that month then output should be PASS.

  • User_DIL60
    User_DIL60 Member Posts: 29 Green Ribbon
    edited Oct 24, 2022 4:26PM

    Thank you for your help. Yes the list of file names are predefined but the date in the file name varies every month. If each file have more than 2 records and all 3 files should be present then PASS else FAIL.

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 42,746 Red Diamond

    Hi, @User_DIL60

     and all 3 files should be present 

    In your initial problem there was no mention of 3 files. Is this the same problem? Does it matter what the file names of those 3 files are? What if more than 3 files are present? Post some different sample data (with DATE values for the DATE column) where the query I posted earlier is not giving the results you want.

  • User_DIL60
    User_DIL60 Member Posts: 29 Green Ribbon

    Thanks Frank. I apologize. Yes all three files

    ('96_ECS_MON_20221024.TXT', '76_ECS_MON_20221024.TXT' ,'86_ECS_MON_20221024.TXT' where date is dynamic in the file name and changes every month)

    should be present in the table between the first day of the month and last day of the month. We will be receiving only 3 files with respective names and not more than 3 files within a month. All three files should present and all files should have at least 2 records then output is PASS else output will be FAIL.

    Here are the Insert scripts with updated to_date conversion.

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('112233445566','86_ECS_MON_20221024.TXT',TO_DATE('24-OCT-22','DD-MON-YY'));


    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('123456789','86_ECS_MON_20221024.TXT',TO_DATE('24-OCT-22','DD-MON-YY'));


    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('987654321','86_ECS_MON_20221024.TXT',TO_DATE('24-OCT-22','DD-MON-YY'));


    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('ABCD-EFGH-1234','76_ECS_MON_20221024.TXT',TO_DATE('24-OCT-22','DD-MON-YY'));


    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('ZEDIC-9876-UJIE','76_ECS_MON_20221024.TXT',TO_DATE('24-OCT-22','DD-MON-YY'));


    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('POIUYU-88768','96_ECS_MON_20221024.TXT',TO_DATE('24-OCT-22','DD-MON-YY'));

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 42,746 Red Diamond
    edited Oct 24, 2022 5:32PM

    Hi, @User_DIL60

    Once again, what is wrong with the query I posted earlier? With your new sample data, it produces these results:

    ECS_RECORD_COUNT_CHECK
    ----------------------
    FAIL
    

    Are you saying the correct result is 'PASS'? If so, why? You may need to add new conditions to the WHERE clause, and/or the CASE expression, but as long as I don't know exactly what you want, I can't say exactly what those changes should be. Once again, post some sample data where the query I posted earlier is not doing what you need. To make it easier to test, you might want to post several different months of data, so we can test several scenarios at the same time. For each month, explain if the query should produce 'PASS' or 'FAIL', and why.

  • User_DIL60
    User_DIL60 Member Posts: 29 Green Ribbon

    Thank you Frank. Nothing wrong with the query and result. I am just curious where we have specified the file names in the query to check if each file is present and each file has at least 2 rows per file. Below are some scenarios with sample data.

    scenario-1- FAIL( All 3 files are present but one file has only one record. We expect all 3 files have min 3 records)

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('112233445566','86_ECS_MON_20221024.TXT',TO_DATE('24-OCT-22','DD-MON-YY'));

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('123456789','86_ECS_MON_20221024.TXT',TO_DATE('24-OCT-22','DD-MON-YY'));

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('987654321','86_ECS_MON_20221024.TXT',TO_DATE('24-OCT-22','DD-MON-YY'));

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('ABCD-EFGH-1234','76_ECS_MON_20221024.TXT',TO_DATE('24-OCT-22','DD-MON-YY'));

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('ZEDIC-9876-UJIE','76_ECS_MON_20221024.TXT',TO_DATE('24-OCT-22','DD-MON-YY'));

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('POIUYU-88768','96_ECS_MON_20221024.TXT',TO_DATE('24-OCT-22','DD-MON-YY'));

    For the above data, we should see FAIL because we have only one record in 96_ECS_MON_20221024.TXT.

    scenario-2- PASS ( all 3 files are present and all 3 files have min 2 records)

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('112233445566','86_ECS_MON_20221024.TXT',TO_DATE('24-SEP-22','DD-MON-YY'));

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('123456789','86_ECS_MON_20221024.TXT',TO_DATE('24-SEP-22','DD-MON-YY'));

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('987654321','86_ECS_MON_20221024.TXT',TO_DATE('24-SEP-22','DD-MON-YY'));

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('ABCD-EFGH-1234','76_ECS_MON_20221024.TXT',TO_DATE('24-SEP-22','DD-MON-YY'));

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('ZEDIC-9876-UJIE','76_ECS_MON_20221024.TXT',TO_DATE('24-SEP-22','DD-MON-YY'));

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('POIUYU-88768','96_ECS_MON_20221024.TXT',TO_DATE('24-SEP-22','DD-MON-YY'));

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('P-88768','96_ECS_MON_20221024.TXT',TO_DATE('24-SEP-22','DD-MON-YY'));

    For the above we should expect PASS as every file has min 2 records and all 3 files are present.


    Scenario -3-FAIL (when one file data is not loaded. We need all 3 files and each file should have min 2 records)

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('112233445566','86_ECS_MON_20221024.TXT',TO_DATE('24-JUL-22','DD-MON-YY'));

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('123456789','86_ECS_MON_20221024.TXT',TO_DATE('24-JUL-22','DD-MON-YY'));

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('987654321','86_ECS_MON_20221024.TXT',TO_DATE('24-JUL-22','DD-MON-YY'));

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('ABCD-EFGH-1234','76_ECS_MON_20221024.TXT',TO_DATE('24-JUL-22','DD-MON-YY'));

    INSERT INTO ECS_MONTHLY_LOAD (REC_ID,FILE_NAME,INSERTED_DATE) VALUES ('ZEDIC-9876-UJIE','76_ECS_MON_20221024.TXT',TO_DATE('24-JUL-22','DD-MON-YY'));

    For the above data, we should see FAIL because we don't see any data for 96_ECS_MON_*.TXT

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 42,746 Red Diamond

    Hi, @User_DIL60

    So, you're only interested in three specific file_names. If there are any rows with other (or NULL) file_names, they should be ignored. To see if each of those three file names occur at least two times in the current month, you can do this:

    SELECT    CASE
    	      WHEN  MIN (COUNT (*))             >= 2
    	      AND   COUNT (DISTINCT file_name)  =  3
    	          THEN  'PASS'
    	      	  ELSE  'FAIL'
    	  END	AS ecs_record_count_check
    FROM	  ecs_monthly_load
    WHERE	  inserted_date  >=             TRUNC (SYSDATE, 'MONTH')
    AND	  inserted_date  <  ADD_MONTHS (TRUNC (SYSDATE, 'MONTH'), 1)
    AND       file_name  IN  ( '76_ECS_MON_20221024.TXT'
    	  	      	 , '86_ECS_MON_20221024.TXT'
    			 , '96_ECS_MON_20221024.TXT'
    	  	      	 )
    GROUP BY  file_name
    ;
    

    The following query tests it. Notice that the scalar sib-query is exactly the same as above, except that it uses run_date instead of SYSDATE:

    WITH    all_run_dates    AS
    (
    	SELECT DISTINCT  inserted_date  AS run_date
    	FROM   		 ecs_monthly_load
    )
    SELECT    run_date
    ,	  (
    		SELECT    CASE
    	      		      WHEN  MIN (COUNT (*))             >= 2
    	      		      AND   COUNT (DISTINCT file_name)  =  3
    	          	          THEN  'PASS'
    	      	  		  ELSE  'FAIL'
    	  		  END	AS ecs_record_count_check
    		FROM	  ecs_monthly_load
    		WHERE	  inserted_date  >=             TRUNC (run_date, 'MONTH')
    		AND	  inserted_date  <  ADD_MONTHS (TRUNC (run_date, 'MONTH'), 1)
    		AND       file_name  IN  ( '76_ECS_MON_20221024.TXT'
    	  	      	  	     	 , '86_ECS_MON_20221024.TXT'
    			 		 , '96_ECS_MON_20221024.TXT'
    	  	      	 		 )
    	        GROUP BY  file_name
    	  )			AS ecs_record_count_check
    FROM	  all_run_dates
    ORDER BY  run_date
    ;
    

    Output from your sample data:

    RUN_DATE    ECS_RECORD_COUNT_CHECK
    ----------- ----------------------
    24-Jul-2022 FAIL
    24-Sep-2022 PASS
    24-Oct-2022 FAIL
    


    User_DIL60