    String.split issue using regular expressions

      Hello All,

      First time posting, so please forgive me if I violate any etiquette....

      I have a standard csv text file that I am reading line by line. I thought each line was formatted as follows:

      FieldA, FieldB, FieldC, ......, FieldX

      and I was using str.split(",") to separate into tokens.

      However, I have found that some lines contain commas that are not supposed to be part of the parsing. For example, one line may look like

      FieldA, FieldB, "Field C has some commas, commas, and more commas in it", ...., FieldX

      Anytime that the line contains non-separating commas, the author is very careful to enclose the entire field containing the ignorable commas in double quotes. So, what I would like to do is to create a split expression that will split the line based on commas that are not inside of double quotes, but I have no idea how to do it. I have looked at the regex area of the tutorial and tried
      but it does not work.

      Any help appreciated!!
          For example, you can split on a plain "," and then join some of the fragments back into one string, starting at a fragment that begins with " and stopping at the fragment that ends with " :)
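A minimal sketch of that split-and-rejoin idea, in case it helps. The class and method names (SplitAndRejoin, splitAndRejoin, unquote) are made up for illustration, and the code assumes that quotes are balanced and that quoted fields contain no escaped ("") quotation marks:

```java
import java.util.ArrayList;
import java.util.List;

class SplitAndRejoin {
    // Splits on every comma, then glues back together the fragments that
    // belong to one double-quoted field.
    // Assumption: quotes are balanced and never escaped inside a field.
    static List<String> splitAndRejoin(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder pending = null; // non-null while inside a quoted field
        for (String fragment : line.split(",", -1)) {
            String trimmed = fragment.trim();
            if (pending == null) {
                boolean opens = trimmed.startsWith("\"");
                boolean closes = trimmed.length() > 1 && trimmed.endsWith("\"");
                if (opens && !closes) {
                    // Quoted field that continues past this comma.
                    pending = new StringBuilder(fragment);
                } else {
                    fields.add(unquote(trimmed));
                }
            } else {
                // Restore the comma that split() removed.
                pending.append(',').append(fragment);
                if (trimmed.endsWith("\"")) {
                    fields.add(unquote(pending.toString().trim()));
                    pending = null;
                }
            }
        }
        if (pending != null) {
            // Unbalanced quote: keep the leftover text as it is.
            fields.add(pending.toString().trim());
        }
        return fields;
    }

    // Removes one pair of surrounding double quotes, if present.
    static String unquote(String s) {
        boolean quoted = s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"");
        return quoted ? s.substring(1, s.length() - 1) : s;
    }

    public static void main(String[] args) {
        String line = "FieldA, FieldB, \"Field C has some commas, commas, and more commas in it\", , FieldX";
        for (String field : splitAndRejoin(line)) {
            System.out.println("[" + field + "]");
        }
    }
}
```

Passing -1 as the limit to split() keeps trailing empty fields, which a plain split(",") would silently drop.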
            The best way to parse CSV data is to use a dedicated tool, like the ones listed in this article. If you have to use regexes, or just want to learn how, a positive matching approach is preferable to split(). The following code, a modification of some sample code[1] that appears in The Book, assumes quoted fields in your data may contain escaped quotation marks in addition to commas, but may not contain line separators.
            import java.util.*;
            import java.util.regex.*;

            public class Test
            {
              public static void main(String... args)
              {
                String str =
                  "FieldA, FieldB, \"Field C with commas, commas, and more commas\", , FieldX";
                List<String> fields = parseCsvLine(str);
                int i = 0;
                for (String s : fields)
                  System.out.printf("%nField %d: [%s]%n", i++, s);
              }

              public static List<String> parseCsvLine(String line)
              {
                String regex =
                    "(?<=^|,)[ \t]*+"                 + // Optional leading whitespace,
                    "(?:"                             + // followed by either...
                    "\"([^\"]*+(?:\"\"[^\"]*+)*+)\""  + // ...a quoted field...
                    "|"                               + // ...or...
                    "([^\",]*+)"                      + // ...some non-quoted text,
                    ")[ \t]*+";                         // and optional trailing whitespace.
                // Create a matcher for CSV fields, using the regex above.
                Matcher mMain = Pattern.compile(regex).matcher(line);
                // Create a matcher for doubled double-quotes
                Matcher mQuote = Pattern.compile("\"\"").matcher("");
                List<String> result = new ArrayList<String>();
                while (mMain.find())
                {
                  // If field was not quoted, take it as it is; if it was quoted,
                  // unescape any embedded quotation marks.
                  String field = (mMain.start(2) != -1) ? mMain.group(2).trim()
                               : mQuote.reset(mMain.group(1)).replaceAll("\"");
                  result.add(field);
                }
                return result;
              }
            }
            [1] http://regex.info/listing.cgi?ed=3&p=401
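For comparison, here is a rough sketch of the split()-based approach the original question asked about: split only on commas that are followed by an even number of double quotes up to the end of the line. The names QuoteAwareSplit and splitCsvLine are made up for illustration, and the code assumes balanced quotes and no line separators inside fields. Because the lookahead rescans the rest of the line for every comma, this is likely to be slower on long lines than the matcher-based code above:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class QuoteAwareSplit {
    // A comma counts as a separator only if an even number of double quotes
    // follows it, i.e. the comma is not inside a quoted field.
    // Assumption: quotes in the line are balanced.
    private static final Pattern SEPARATOR =
        Pattern.compile(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        // Limit -1 keeps trailing empty fields.
        for (String raw : SEPARATOR.split(line, -1)) {
            String field = raw.trim();
            if (field.length() >= 2 && field.startsWith("\"") && field.endsWith("\"")) {
                // Drop the surrounding quotes and unescape doubled quotes.
                field = field.substring(1, field.length() - 1).replace("\"\"", "\"");
            }
            fields.add(field);
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "FieldA, FieldB, \"Field C with commas, commas, and more commas\", , FieldX";
        for (String field : splitCsvLine(line)) {
            System.out.println("[" + field + "]");
        }
    }
}
```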