This discussion is archived
6 Replies. Latest reply: Jan 30, 2013 12:14 PM by jschellSomeoneStoleMyAlias

Why is STDOUT to STDIN so dang slow in Windows?

802457 Newbie
I have a JAR file that reads a specific file format and writes the data to STDOUT.

I have another JAR file that reads this data from STDIN and processes it.

This data has hundreds of millions of records, each of which is ~2K bytes long.

When I read this data from a text file, I can read it at a blazing speed, like several million records per minute.

However, when i run:

java -jar DataReader.jar | java -jar DataProcessor.jar (this is an example, not the EXACT names of the JARs...)

the throughput is MUCH MUCH MUCH lower. We're talking MAYBE 20-30K records per minute.

I use BufferedOutputStream in the class that writes to STDOUT AND for the class that reads from STDIN.

I have tried increasing the buffer size for the above classes in several increments, but it doesn't seem to change much.
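
For reference, here is a minimal sketch of the kind of buffering described above, with a buffered wrapper on both ends of the pipe. The class name `PipeCopy`, the `copy` helper, and the 64 KB buffer size are hypothetical, chosen only for illustration; they are not taken from the actual JARs. It assumes each side moves data in large blocks and flushes once at the end rather than once per record.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class PipeCopy {
    // Hypothetical buffer size; the right value should be found by measuring.
    static final int BUFFER_SIZE = 1 << 16;

    /** Copies everything from in to out in large blocks and returns the byte count. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] block = new byte[BUFFER_SIZE];
        long total = 0;
        int n;
        while ((n = in.read(block)) != -1) {
            out.write(block, 0, n);
            total += n;
        }
        // Flush once at the end, not after every record.
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Wrap the raw standard streams so reads and writes happen in big chunks.
        InputStream in = new BufferedInputStream(System.in, BUFFER_SIZE);
        OutputStream out = new BufferedOutputStream(System.out, BUFFER_SIZE);
        long bytes = copy(in, out);
        // Report on stderr so the data on stdout stays clean.
        System.err.println("copied " + bytes + " bytes");
    }
}
```

If the real code flushes or calls a print method per record, that would be one thing to check, though nothing in this post confirms that it does.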

Is there some sort of Achilles' heel that I'm missing that can cripple the speed of this operation, or is this just a naturally doomed process because of something in Windows?

What kind of throughput should I expect in this situation? What is fair/expected?

Any info is greatly appreciated!
  • 1. Re: Why is STDOUT to STDIN so dang slow in Windows?
    EJP Guru
    Windows piping could be slow. It sounds like you'd be better off having the first JAR write to a file and the second one read from it. I never found pipes all that fast on Unix either. It's noteworthy that although compilers could be written to use pipes, they never are: they use temp files between the passes.
  • 2. Re: Why is STDOUT to STDIN so dang slow in Windows?
    rp0428 Guru
    >
    I have a JAR file that reads a specific file format and writes the data to STDOUT.

    I have another JAR file that reads this data from STDIN and processes it.
    . . .
    I use BufferedOutputStream in the class that writes to STDOUT AND for the class that reads from STDIN.
    >
    For me the obvious question is why you are sending the data to the file system at all. Why not just pass it from the reader to the processor directly?

    If the processor is reading from STDIN the data must be in text format, so why not have the reader use a ByteArrayOutputStream and let the processor use that byte array as a ByteArrayInputStream?
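
A minimal sketch of that in-process handoff, assuming both pieces can run in one JVM. The class `InProcessHandoff` and the methods `produce` and `consume` are hypothetical stand-ins for whatever the two JARs actually do; the record format is made up for the example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class InProcessHandoff {
    // Hypothetical stand-in for the reader JAR: emits newline-terminated text records.
    static void produce(OutputStream out, int records) throws IOException {
        for (int i = 0; i < records; i++) {
            out.write(("record-" + i + "\n").getBytes(StandardCharsets.UTF_8));
        }
    }

    // Hypothetical stand-in for the processor JAR: counts the records it reads.
    static int consume(InputStream in) throws IOException {
        int count = 0;
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            while (reader.readLine() != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // The reader writes into memory instead of STDOUT...
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        produce(buffer, 1000);
        // ...and the processor reads the same bytes back instead of STDIN.
        int seen = consume(new ByteArrayInputStream(buffer.toByteArray()));
        System.out.println("processed " + seen + " records");
    }
}
```

Note that a single byte array holds everything in memory and cannot exceed roughly 2 GB, so for hundreds of millions of 2K records this would have to be done in batches rather than in one array.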

    There must be something you aren't telling us.
  • 3. Re: Why is STDOUT to STDIN so dang slow in Windows?
    jwenting Journeyer
    Both STDIN and STDOUT are buffered, so data won't be available on them until the buffers are full.
  • 4. Re: Why is STDOUT to STDIN so dang slow in Windows?
    BIJ001 Explorer
    It's noteworthy that although compilers could be written to use pipes, they never are
    gcc -pipe

    Use pipes rather than temporary files for communication between the various stages of compilation.
    This fails to work on some systems where the assembler is unable to read from a pipe;
    but the GNU assembler has no trouble.
  • 5. Re: Why is STDOUT to STDIN so dang slow in Windows?
    EJP Guru
    OK so now tell us that it's faster, and then why it's only an option, not the default.

    I went into all this in great depth many years ago when I was writing production compilers, and there is really no point. Pipes are nice for plugging occasional user command chains together, but if you're even slightly interested in performance there is absolutely no way you won't use a temp file in preference.
  • 6. Re: Why is STDOUT to STDIN so dang slow in Windows?
    jschellSomeoneStoleMyAlias Expert
    pedron wrote:
    What kind of throughput should I expect in this situation? What is fair/expected?
    An easy way to find that out is to write two apps that do nothing but read and write, and measure it.
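
Something along these lines would do as a rough sketch. The class name `PipeBench`, its `write`/`read` modes, and the buffer size are all made up for the example; the 2048-byte record size only approximates the ~2K records from the question. One mode writes dummy records to STDOUT, the other drains STDIN and reports how long it took.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

public class PipeBench {
    static final int RECORD_SIZE = 2048;    // roughly the ~2K records from the question
    static final int BUFFER_SIZE = 1 << 16; // hypothetical; vary it when measuring

    /** Writes the given number of dummy records and returns the bytes written. */
    static long writeRecords(OutputStream out, long records) throws IOException {
        byte[] record = new byte[RECORD_SIZE];
        Arrays.fill(record, (byte) 'x');
        for (long i = 0; i < records; i++) {
            out.write(record);
        }
        out.flush();
        return records * RECORD_SIZE;
    }

    /** Reads until end of stream, discarding the data, and returns the bytes read. */
    static long drain(InputStream in) throws IOException {
        byte[] block = new byte[BUFFER_SIZE];
        long total = 0;
        int n;
        while ((n = in.read(block)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0 && args[0].equals("write")) {
            // Writer side: record count is an optional second argument.
            long records = args.length > 1 ? Long.parseLong(args[1]) : 1_000_000L;
            writeRecords(new BufferedOutputStream(System.out, BUFFER_SIZE), records);
        } else {
            // Reader side: time how long it takes to drain STDIN.
            long start = System.nanoTime();
            long bytes = drain(new BufferedInputStream(System.in, BUFFER_SIZE));
            double seconds = (System.nanoTime() - start) / 1e9;
            System.err.printf("%d records (%d bytes) in %.2f s%n",
                    bytes / RECORD_SIZE, bytes, seconds);
        }
    }
}
```

Run it as `java PipeBench write 1000000 | java PipeBench read`, then compare against the same writer redirected to a file and the reader fed from that file. That gives a baseline for the pipe itself with none of the real parsing or processing in the way.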
