3 Replies Latest reply on Feb 11, 2007 8:24 AM by 800308

    counting Line of Code using regex

    807606
      The following code counts lines of code. The program has to ignore comments and blank lines. It works for comments but not for blank lines. Can anyone help me, please?

      code:

      public void LOCode()
      {
          try {
              System.out.println("Enter file name to be read: ");

              txtFileName = new String(keyboard.readLine());
              System.out.println("\n \n");
              reader = new BufferedReader(new FileReader(txtFileName));
              Pattern pattern = Pattern.compile("(?:/\\*(?:[^*]|(?:\\*+[^*/]))*\\*+/)|(?://.*)",Pattern.MULTILINE);
              Matcher m = pattern.matcher(txtFileName);

              boolean b = m.matches();
              String line = null;
              while ((line = reader.readLine()) != null)
              {
                  m.reset(line);
                  if (!(m.find()))
                  {
                      count = count + 1;
                  }
              }
              reader.close(); // close buffered reader!

              System.out.println("\nTotal Lines of code is: " + (count-1) + " \n");
          }
        • 1. Re: counting Line of Code using regex
          800351
          Pattern.compile("(?:/\\*(?:[^*]|(?:\\*+[^*/]))*\\*+/)|(?://.*)",Pattern.MULTILINE);
          Won't this also match an ordinary statement line that happens to contain a comment?

          The pattern also has no 'blank line' part.

          Also,
          while((line = reader.readLine()) !=null)
          For a single line of text, MULTILINE mode is irrelevant.
          Also, a comment that spans multiple lines can't be captured when the file is read one line at a time.

          My recommendation is:
          Write several simple, correct regex strings and compile a separate pattern from each.
          Don't try to do everything in a single complex regex!

          A much better approach would be a simple parser that uses a Stack.
          /* and */
          and,
          // and EOL
          could be handled quite naturally by stack.

          To detect a blank line, call String.trim() on the line and check whether the result is empty.
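To illustrate the "several simple patterns" suggestion, here is a minimal sketch, not a definitive implementation. It assumes the whole file has already been read into one String, so that block comments spanning several lines can be matched. The class name RegexLineCounter, the method countLines and the three pattern names are made up for this example. It knows nothing about string literals, so a "//" or "/*" inside quotes is treated as a comment, and code on both sides of a multi-line block comment ends up joined into one line.

```java
import java.util.regex.Pattern;

// Hypothetical helper: one small pattern per job instead of one complex regex.
class RegexLineCounter {
    // DOTALL lets '.' cross line breaks, so a block comment may span lines.
    private static final Pattern BLOCK_COMMENT =
            Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL);
    // Without DOTALL, '.' stops at the end of the line.
    private static final Pattern LINE_COMMENT = Pattern.compile("//.*");
    // A line that is empty or contains only whitespace.
    private static final Pattern BLANK_LINE = Pattern.compile("\\s*");

    static int countLines(String source) {
        // Step 1: strip block comments, step 2: strip line comments.
        String stripped = BLOCK_COMMENT.matcher(source).replaceAll("");
        stripped = LINE_COMMENT.matcher(stripped).replaceAll("");

        // Step 3: count every remaining line that is not blank.
        int count = 0;
        for (String line : stripped.split("\n")) {
            if (!BLANK_LINE.matcher(line).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "int a = 1; // trailing comment\n"
                + "\n"
                + "/* block\n"
                + "   comment */\n"
                + "   \n"
                + "int b = 2;\n";
        System.out.println(countLines(sample)); // prints 2
    }
}
```

Each pattern can be tested on its own, which is the main point of splitting them up.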
          • 2. Re: counting Line of Code using regex
            807606
            Thanks for your kind suggestion.

            Could you please give me one simple example
            using trim()?
            I would be grateful.
            • 3. Re: counting Line of Code using regex
              800308
              kala,
              /*
              ** incomment_sc.c
              **
              ** KRC 11/12/2001
              **
              ** ansi c
              **
              ** prints uncommented version of specified .sc code-file to stdout
              **
              ** WARNING: this thing screws up the line spacing!
              */
              
              #include <stdlib.h>
              #include <stdio.h>
              #include <string.h>
              #include <ctype.h>
              #include <time.h>
              
              #define SEEK_SET 0
              
              #define CARRIAGE_RETURN     13
              #define LINE_FEED           10
              #define DOUBLE_QUOTE        34
              #define BACK_SLASH          92
              
              #define boolean int
              #define false 0
              #define true 1
              
              
              #ifdef DEBUG
                   #define debug fprintf
              #else
                   #define debug
              #endif
              
              int main( int argc, char *argv[] )
              {
              
                   FILE        *f;             /*file*/
                   char        fname[215];     /*file name*/
                   long        fpos = -1;      /*file position*/
                   int         c = 0;          /*current char; int so it can hold EOF*/
                   int         p = -1;         /*previous char*/
                   int         i = 0;          /*index*/
                   boolean     inC = false;    /*in Comment (block comment)*/
                   boolean     inDQ = false;   /*in Double Quotes*/
              
              
                   /* do we have a filename */
                   if ( argc < 2)
                   {
                        fprintf(stderr, "usage: %s filename\n", argv[0]);
                        fprintf(stderr, "filename contains C code\n");
                        return(1);
                   }
              
                   /* open the file */
                   strncpy(fname, argv[1], sizeof(fname) - 1);
                   fname[sizeof(fname) - 1] = '\0';   /* last valid index, not one past the end */
                   if( (f = fopen(fname, "r")) == NULL)
                   {
                        fprintf(stderr, "%s: can't open %s for input\n", argv[0], fname);
                        return(1);
                   }
              
                   /* read the file to stdout without the comments */
                   while ( (c=fgetc(f)) != EOF )
                   {
                        if ( !inC && p!=BACK_SLASH && c==DOUBLE_QUOTE )
                        {
                             putchar(p);
                             inDQ = !inDQ;
                        }
                        else if ( !inDQ && inC && p=='*' && c=='/' )
                        {
                             inC = false;
                             c=fgetc(f);
                        }
                        else if ( !inDQ && !inC && p=='/' && c=='*' )
                        {
                             inC = true;
                             c=fgetc(f);
                        }
                        else if ( !inDQ && !inC && p=='/' && c=='/')
                        {
                             while ( ((c=fgetc(f)) != LINE_FEED) && (c != EOF) );
                             p=-1;
                        }
                        else if ( p == LINE_FEED )
                        {
                             printf("\n");
                        }
                        else if ( (p>0) && !inC )
                        {
                             putchar(p);
                        }
                        p=c;
                   }
              
                   if ( !inC )
                        putchar(p);
              
                   fclose(f);
                   return (0);
              }
              Forget the regexp, you'll go mad... I think this discussion is pertinent: http://discuss.fogcreek.com/joelonsoftware4/default.asp?cmd=show&ixPost=113898&ixReplies=13
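For the trim() example requested earlier in the thread, the same no-regex idea might look like this in Java. It is a rough sketch under stated assumptions, not a finished tool: the class name TrimLineCounter and both countCodeLines methods are invented for illustration, and unlike the C program above it does not track double quotes, so comment markers inside string literals are treated as comments. It reads line by line, remembers in a flag whether it is inside a block comment, collects whatever is left of each line, and uses trim() to decide whether that remainder is blank.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper: counts lines that still contain text after
// comments are skipped and the remainder is trimmed.
class TrimLineCounter {
    static int countCodeLines(BufferedReader reader) throws IOException {
        int count = 0;
        boolean inBlockComment = false; // survives from one line to the next
        String line;
        while ((line = reader.readLine()) != null) {
            StringBuilder code = new StringBuilder(); // non-comment text of this line
            int i = 0;
            while (i < line.length()) {
                if (inBlockComment) {
                    int end = line.indexOf("*/", i);
                    if (end < 0) {
                        break;              // comment continues on the next line
                    }
                    inBlockComment = false;
                    i = end + 2;            // resume after the closing */
                } else if (line.startsWith("//", i)) {
                    break;                  // rest of the line is a comment
                } else if (line.startsWith("/*", i)) {
                    inBlockComment = true;
                    i += 2;
                } else {
                    code.append(line.charAt(i));
                    i++;
                }
            }
            // trim() removes leading and trailing whitespace, so a blank
            // or whitespace-only remainder becomes the empty string.
            if (!code.toString().trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    // Convenience overload for trying the counter on an in-memory string.
    static int countCodeLines(String source) {
        try (BufferedReader reader = new BufferedReader(new StringReader(source))) {
            return countCodeLines(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String sample = "int a = 1; // trailing comment\n\n/* block\n   comment */\n"
                + "   \nint b = 2; /* inline */\n";
        System.out.println(countCodeLines(sample)); // prints 2
    }
}
```

To use it on a real file, pass new BufferedReader(new FileReader(fileName)) to the first method, as the original post already does.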