5 Replies Latest reply: Jun 26, 2009 3:19 PM by 807588 RSS

    Splittin Huge Excel file into small Excel Files Using Java

    807588
      Hi Experts,

      I have to split an excel file containing more than 20000 Rows into 4-5 different excel workbooks each containing equal number of records.
      Kindly help me incase any one of you done done this in Java.

      My logic would be like, if suppose an excel contains 16000 records, the file should be split into 4 different files containing 4000 records each.
      The division should be equal & each excel should on average contain 3000-4000 records.
      In short my formula would be as below

      *(No of records in Master Excel File) / (No of Excel workbooks or Files) = 3000 to 4000 records per workbook.*


      Thanks in advance.

      Regards,
      Venky
        • 1. Re: Splittin Huge Excel file into small Excel Files Using Java
          JoachimSauer
          Ok, you have permission to proceed.

          Or with slightly less sarcasm: what is your question?
          • 2. Re: Splittin Huge Excel file into small Excel Files Using Java
            807588
            What I'm trying to ask is how to split an excel file containing 20000+ records into small separate Excel files containing 3000-4000 records each using Java.
            • 4. Re: Splittin Huge Excel file into small Excel Files Using Java
              jduprez
              I have to split an excel file
              What do you call an Excel file?
              If you mean Microsoft's proprietary .xls format, you'll need an API that knows how to read such files. I dont't know of any specific one but I know an excellent way to find one.
              I you mean a mere .csv file (comma-separated values, or often semi-colon-separated values), you may find an existing parsing API as well but it's rather easy to implement it yourself. Have you tried using class StreamTokenizer?
              I have to split an excel file containing more than 20000 Rows
              This is not "huge"; this is a reasonable size for a desktop environment (probably in the magnitude of a hundred MB of data) so if your app needs to run on a desktop, you can load the whole file or the whole parsed content in memory.
              If you need to parse bigger files, of if you have to run on more constrained environments, then it might not fit into memory, so you'd have to read and process the file in chuncks.
              I have to split an excel file containing more than 20000 Rows into 4-5 different excel workbooks each containing equal number of records.
              How are the records organized in the 4-5 files? If it's arbitrary, the easier would be to read lines one by one and stuff each new line in a different output file in turn. Otherwise, organizing the records would probably require to traverse all data twice (one to count them, one to write), not to mention other processing (sorting, filtering,...)
              Kindly help me incase any one of you done done this in Java.
              "it's all about the done-done, da-di-done-dah"
              The division should be equal & each excel should on average contain 3000-4000 records.
              Is is just me or is there a contradiction in this sentence? What is the exact text of the exercice? If it's anything other than an exercice or self-practicing, there may be better ways to handle that than a Java app.

              You'll get better chance to be helped if you demonstrate more efforts (proof-read your text, label your questions, explain how you intend to tackle the problem, describe what you tried and spell out what failed or where you're stuck). Then you'll find out these forums have plenty of helpful souls.

              Brgds,

              Edited by: jduprez on Jun 26, 2009 11:29 AM
              • 5. Re: Splittin Huge Excel file into small Excel Files Using Java
                807588
                if your file is .xls you can use apache poi utility to read the number of record first and then read by row indexes and write to different files.

                take look at the http://poi.apache.org/spreadsheet/quick-guide.html#Iterator