
Fastest way to read files in Java

Introduction:

In the Universal Media Server project, we recently ran benchmarks to find the fastest way to read files in Java, particularly large files such as HD movies. We tested four methods with an automated benchmark script:

  1. FileChannel using File input
  2. FileChannel using Path input
  3. DataInputStream using File input
  4. RandomAccessFile using File input
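
To make the four options concrete, here is a minimal sketch of how each one could read the first bytes of a file. The class and method names are hypothetical and this is not the actual benchmark code. In particular, opening the "File input" FileChannel through a FileInputStream is an assumption, since the exact call is not shown here.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, for illustration only.
class ReadMethods {

    // 1. FileChannel using File input (assumed to go through FileInputStream).
    static byte[] viaFileChannelFromFile(File file, int length) throws IOException {
        try (FileInputStream in = new FileInputStream(file);
             FileChannel channel = in.getChannel()) {
            return readFully(channel, length);
        }
    }

    // 2. FileChannel using Path input.
    static byte[] viaFileChannelFromPath(Path path, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            return readFully(channel, length);
        }
    }

    // 3. DataInputStream using File input.
    static byte[] viaDataInputStream(File file, int length) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            byte[] data = new byte[length];
            in.readFully(data); // throws EOFException if the file is too short
            return data;
        }
    }

    // 4. RandomAccessFile using File input.
    static byte[] viaRandomAccessFile(File file, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            byte[] data = new byte[length];
            raf.readFully(data); // throws EOFException if the file is too short
            return data;
        }
    }

    // Reads exactly `length` bytes from the channel's current position.
    private static byte[] readFully(FileChannel channel, int length) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(length);
        while (buffer.hasRemaining()) {
            if (channel.read(buffer) < 0) {
                throw new EOFException("File is shorter than " + length + " bytes");
            }
        }
        return buffer.array();
    }
}
```

All four variants return the same bytes for the same file, so any timing difference comes from how the file is opened and read rather than from what is read.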

We ran the tests on hard drives with different rotation speeds, on files ranging from 600MB to 22GB each, and with between 1 and 100 threads to see how the thread count affected the results.
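
A multi-threaded timing run of this kind can be sketched roughly as follows. This is not the actual benchmark script. `ReadBenchmark`, `FileTask`, `timeNanos` and `report` are hypothetical names, and using a fixed thread pool is an assumption about how the threads were managed.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical timing harness, for illustration only.
class ReadBenchmark {

    // The per-file work being measured, for example hashing one file.
    interface FileTask {
        void accept(Path file) throws Exception;
    }

    // Runs the task once per file on a fixed pool and returns the elapsed nanoseconds.
    static long timeNanos(List<Path> files, int threads, FileTask task) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            long start = System.nanoTime();
            for (Path file : files) {
                // Returning null makes this lambda a Callable, so checked exceptions are allowed.
                futures.add(pool.submit(() -> {
                    task.accept(file);
                    return null;
                }));
            }
            for (Future<?> future : futures) {
                future.get(); // waits for completion and rethrows any task failure
            }
            return System.nanoTime() - start;
        } finally {
            pool.shutdown();
        }
    }

    // Formats a summary line with the total in ms and the per-file average in ns.
    static String report(int fileCount, int threads, long elapsedNanos) {
        if (fileCount <= 0) {
            throw new IllegalArgumentException("fileCount must be positive");
        }
        return String.format(Locale.ROOT,
                "Benchmarking of hashing %d files using %d %s took %d ms (%d ns average per file)",
                fileCount, threads, threads == 1 ? "thread" : "threads",
                TimeUnit.NANOSECONDS.toMillis(elapsedNanos), elapsedNanos / fileCount);
    }
}
```

Note that the average is wall-clock time divided by the number of files, so with many threads it describes overall throughput, not the latency of a single read.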

Results:

Results varied between drives, but on average, for our use case, the two FileChannel methods performed best. We went with the second option, since Path belongs to the newer file API in Java (java.nio.file). DataInputStream and RandomAccessFile both produced significantly slow outliers, which had been causing problems on some hard drives.

My results:

FileChannel using File input:
Benchmarking of hashing 152000 files using 1 thread took 57277 ms (376824 ns average per file)
Benchmarking of hashing 152000 files using 100 threads took 20130 ms (132437 ns average per file)

FileChannel using Path input:
Benchmarking of hashing 152000 files using 1 thread took 56675 ms (372867 ns average per file)
Benchmarking of hashing 152000 files using 100 threads took 21373 ms (140615 ns average per file)

DataInputStream using File input:
Benchmarking of hashing 152000 files using 1 thread took 75716 ms (498133 ns average per file)
Benchmarking of hashing 152000 files using 100 threads took 330825 ms (2176486 ns average per file)

RandomAccessFile using File input:
Benchmarking of hashing 152000 files using 1 thread took 51090 ms (336121 ns average per file)
Benchmarking of hashing 152000 files using 100 threads took 326446 ms (2147671 ns average per file)

For other results and more details, check out the branch with the benchmarking code.

Also note that we were benchmarking a specific type of hashing, the one used by OpenSubtitles, which reads only the beginning and end of each file, so other read patterns may give different results.
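
For readers unfamiliar with that hash, here is a minimal sketch based on the publicly documented OpenSubtitles scheme as we understand it: the file size plus a 64-bit little-endian checksum of the first and last 64KB. It is not the Universal Media Server implementation. The class name is hypothetical, and the handling of files smaller than 64KB is an assumption.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of an OpenSubtitles-style hash, for illustration only.
class OpenSubtitlesHashSketch {
    private static final int CHUNK_SIZE = 64 * 1024;

    static String hash(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            long size = channel.size();
            // Assumption: for small files, both chunks shrink to the file size.
            int chunk = (int) Math.min(CHUNK_SIZE, size);
            long head = checksum(channel, 0, chunk);
            long tail = checksum(channel, Math.max(size - CHUNK_SIZE, 0), chunk);
            // Overflow wraps around, which is what a 64-bit checksum expects.
            return String.format("%016x", size + head + tail);
        }
    }

    // Sums the little-endian 64-bit words in the given region; trailing bytes are ignored.
    private static long checksum(FileChannel channel, long position, int length) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(length).order(ByteOrder.LITTLE_ENDIAN);
        while (buffer.hasRemaining()) {
            // Positional read: does not move the channel's own position.
            if (channel.read(buffer, position + buffer.position()) < 0) {
                break;
            }
        }
        buffer.flip();
        long sum = 0;
        while (buffer.remaining() >= Long.BYTES) {
            sum += buffer.getLong();
        }
        return sum;
    }
}
```

Because only two 64KB regions are touched per file, this workload is dominated by opening the file and seeking rather than by sequential throughput, which is one reason a full sequential read may rank the four methods differently.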

© 2023 Spirton
