85

Java BufferedReader Is CPU Bound

 4 years ago
source link: https://www.tuicool.com/articles/MfUvAji
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

In an earlier post, I asked how fast the getline function in C++ could run through the lines in a text file. The answer was about 2 GB/s. That is slower than some of the best disk drives and network connections. If you take into account that software rarely only need to “just” access the lines, it is easy to build a system where text-file processing is processor-bound, as opposed to disk or network bound.

What about Java? In Java, the standard way to access lines in a text file is to use a BufferedReader . To avoid system calls, I create a large string containing many lines of text, and then I call a very simple processing function that merely records the length of the strings…

StringReader fr = new StringReader(data);
BufferedReader bf = new BufferedReader(fr);
bf.lines().forEach(s -> parseLine(s));

// elsewhere:
public void parseLine(String s) {
  volume += s.length();
}

The result is that Java is four times slower than C++, on the same system, for this benchmark:

BufferedReader.lines 0.5 GB/s

This is not the best that Java can do: Java can ingest data much faster. However, my results suggest that on modern systems, Java file parsing might be frequently processor-bound, as opposed to system bound. That is, you can buy much better disks and network cards, and your system won’t go any faster. Unless, of course, you have really good Java engineers.

Many firms probably just throw more hardware at the problem.

My source code is available .


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK