System Caching vs No Caching
If I read a large file with multiple threads running concurrently, would reading with no buffer give a faster transfer speed, or would reading through the OS buffer be significantly better?
You shouldn't have more than one thread reading the same file at the same time. Read it with one thread and then pass the data to the others. That said, reading with a buffer will be faster, but the standard library already buffers for you: fread and ifstream go through a buffer, whereas calling the read function directly does not.
Note that the standard library's buffers are typically sized with the disk sector size in mind, which means they need fewer disk accesses than calling read directly with small requests.
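As a rough sketch of the "one reader, many workers" idea above: a single thread pulls the file in through ifstream's buffer, and the worker threads then split the in-memory data between them. The names read_file and parallel_byte_sum, the 64 KiB chunk size, and the byte-sum workload are all made up for illustration. It also assumes the file fits in memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <thread>
#include <vector>

// Read the whole file with ONE thread, going through ifstream's buffer.
// Returns an empty vector if the file cannot be opened.
std::vector<char> read_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::vector<char> data;
    char chunk[64 * 1024];  // chunk size is an arbitrary choice
    // The last read is usually short: it fails, but gcount() is still > 0.
    while (in.read(chunk, sizeof chunk) || in.gcount() > 0) {
        data.insert(data.end(), chunk, chunk + in.gcount());
    }
    return data;
}

// Each worker sums its own slice of the shared, read-only buffer.
std::uint64_t parallel_byte_sum(const std::vector<char>& data, unsigned workers) {
    assert(workers > 0);
    std::vector<std::uint64_t> partial(workers, 0);
    std::vector<std::thread> pool;
    const std::size_t slice = (data.size() + workers - 1) / workers;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&, w] {
            const std::size_t begin = std::min(data.size(), w * slice);
            const std::size_t end = std::min(data.size(), begin + slice);
            std::uint64_t sum = 0;
            for (std::size_t i = begin; i < end; ++i) {
                sum += static_cast<unsigned char>(data[i]);
            }
            partial[w] = sum;  // every worker writes only its own slot
        });
    }
    for (auto& t : pool) t.join();
    std::uint64_t total = 0;
    for (std::uint64_t s : partial) total += s;
    return total;
}
```

Calling parallel_byte_sum(read_file("big.bin"), 4) would read once and process on four threads. For files too large to hold in memory, the same idea would need a queue of chunks instead of one big vector.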
This strongly depends on the access pattern. Consider as a first example a video player, with an audio thread and a video thread. They both need sequential access to the file, at approximately the same position. However, once it's been read, the data isn't needed anymore. Therefore, you need a cache that reads ahead, but doesn't keep old data.
As a second example, consider a database application with file-based tables. Multiple threads may execute independent queries. Locality of reference differs between the different tables and indexes.
Clearly, the examples differ a lot. To some, that suggests that the application should manage the cache. Not really; the best approach is to tell the OS what you're going to do. It is in a much better position to make trade-offs, because it can see all traffic to the disk and can gauge global RAM pressure.
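One way to "tell the OS what you're going to do" on Linux and other POSIX systems is posix_fadvise. The sketch below is only an illustration: the AccessPattern enum and the helper names advise_access and drop_consumed are invented here, and the advice values are hints that the kernel is free to ignore. The video player would use the sequential hint and drop data it has already played; the database would use the random hint.

```cpp
#include <cassert>

#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper type: the two access patterns from the examples above.
enum class AccessPattern { Sequential, Random };

// Hint the expected access pattern for the whole file.
// Returns 0 on success or an error number, like posix_fadvise itself.
int advise_access(int fd, AccessPattern pattern) {
    assert(fd >= 0);
    const int advice = (pattern == AccessPattern::Sequential)
                           ? POSIX_FADV_SEQUENTIAL  // encourage read-ahead
                           : POSIX_FADV_RANDOM;     // read-ahead is unlikely to help
    return posix_fadvise(fd, 0, 0, advice);  // len 0 means "to end of file"
}

// Tell the OS that an already-consumed range will not be needed again,
// so its cached pages can be released early.
int drop_consumed(int fd, off_t offset, off_t len) {
    assert(fd >= 0);
    return posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED);
}
```

The application still issues ordinary reads; the hints only give the kernel's cache more information to work with.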
For this reason, memory-mapped files can be a high-performance way to read big files with multiple threads in a random-access way. Mapping the file gives the OS a good opportunity to balance I/O and memory.
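A minimal sketch of the memory-mapped approach, assuming a POSIX system (mmap is not part of standard C++). The function name mapped_byte_sum and the byte-sum workload are placeholders. Every thread reads straight from the shared mapping, and the OS pages data in and out as it sees fit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <thread>
#include <vector>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Map the file read-only and let several threads read from the mapping.
std::uint64_t mapped_byte_sum(const char* path, unsigned threads) {
    assert(threads > 0);
    const int fd = open(path, O_RDONLY);
    if (fd < 0) throw std::runtime_error("open failed");
    struct stat st {};
    if (fstat(fd, &st) != 0) {
        close(fd);
        throw std::runtime_error("fstat failed");
    }
    const std::size_t size = static_cast<std::size_t>(st.st_size);
    if (size == 0) {  // mmap rejects zero-length mappings
        close(fd);
        return 0;
    }
    void* addr = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (addr == MAP_FAILED) throw std::runtime_error("mmap failed");
    const auto* bytes = static_cast<const unsigned char*>(addr);

    std::vector<std::uint64_t> partial(threads, 0);
    std::vector<std::thread> pool;
    const std::size_t slice = (size + threads - 1) / threads;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&, t] {
            const std::size_t begin = std::min(size, t * slice);
            const std::size_t end = std::min(size, begin + slice);
            std::uint64_t sum = 0;
            for (std::size_t i = begin; i < end; ++i) sum += bytes[i];
            partial[t] = sum;  // each thread writes only its own slot
        });
    }
    for (auto& th : pool) th.join();
    munmap(addr, size);

    std::uint64_t total = 0;
    for (std::uint64_t s : partial) total += s;
    return total;
}
```

Here each thread walks its own slice, but nothing stops a thread from touching any offset in the mapping, which is what makes this convenient for random access.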