Question

When to use memory-mapped files?

I have an application that receives chunks of data over the network, and writes these to disk. Once all chunks have been received, they can be decoded/recombined into the single file they actually represent.

I'm wondering if it's useful to use memory-mapped files or not - first for writing the single chunks to disk, second for the single file into which all of them are decoded.

My own feeling is that it might be useful for the second case only, anyone got some ideas on this?

Edit: It's a C# app, and I'm only planning an x64 version. (So running into the 'largest contiguous free space' problem shouldn't be relevant.)

Solution

Memory-mapped files are beneficial for scenarios where a relatively small portion (view) of a considerably larger file needs to be accessed repeatedly.

In this scenario, the operating system can help optimize the overall memory usage and paging behavior of the application by paging in and out only the most recently used portions of the mapped file.

In addition, memory-mapped files can expose interesting features such as copy-on-write or serve as the basis of shared-memory.
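
For example, here is a minimal sketch of the shared-memory use (the map name, region size, and layout are invented for illustration; it assumes the .NET 4 MemoryMappedFile API discussed below):

    using System.IO.MemoryMappedFiles;

    class SharedMemorySketch
    {
        static void Producer()
        {
            // Create a small named shared-memory region (name is arbitrary here)
            // and write an int at offset 0. The mapping must stay alive while
            // the other process reads it.
            using (var mmf = MemoryMappedFile.CreateNew("MyAppSharedRegion", 4096))
            using (var accessor = mmf.CreateViewAccessor())
            {
                accessor.Write(0, 42);
                System.Threading.Thread.Sleep(10000);
            }
        }

        static void Consumer()
        {
            // A second process opens the same region by name and reads the value.
            using (var mmf = MemoryMappedFile.OpenExisting("MyAppSharedRegion"))
            using (var accessor = mmf.CreateViewAccessor())
            {
                int value = accessor.ReadInt32(0);
                System.Console.WriteLine(value);
            }
        }
    }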

For your scenario, memory-mapped files can help you assemble the file if the chunks arrive out of order. However, you would still need to know the final file size in advance.
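
As a rough sketch (assuming the .NET 4 System.IO.MemoryMappedFiles API mentioned below; the file name, total size, and chunk offsets are placeholders), writing a chunk directly into its final position could look like this:

    using System.IO;
    using System.IO.MemoryMappedFiles;

    class ChunkAssembler
    {
        // Assumes the total decoded size is known up front and each chunk
        // carries its offset within the final file.
        static void WriteChunk(string path, long totalSize, long chunkOffset, byte[] chunkData)
        {
            // Creating the mapping with a fixed capacity pre-sizes the backing file.
            using (var mmf = MemoryMappedFile.CreateFromFile(
                path, FileMode.OpenOrCreate, null, totalSize))
            using (var view = mmf.CreateViewStream(chunkOffset, chunkData.Length))
            {
                // The chunk lands at its final position regardless of arrival order.
                view.Write(chunkData, 0, chunkData.Length);
            }
        }
    }

In practice you would keep the mapping open for the lifetime of the transfer rather than re-creating it per chunk.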

Also, since you will access each file only once, to write a chunk, a performance advantage over explicitly implemented asynchronous I/O is unlikely; a memory-mapped file writer may, however, be easier and quicker to implement correctly.

In .NET 4, Microsoft added support for memory-mapped files and there are some comprehensive articles with sample code, e.g. http://blogs.msdn.com/salvapatuel/archive/2009/06/08/working-with-memory-mapped-files-in-net-4.aspx.

2009-12-07

Solution

Memory-mapped files are primarily used for Inter-Process Communication or I/O performance improvement.

In your case, are you trying to get better I/O performance?

Hate to point out the obvious, but Wikipedia gives a good rundown of the situation... http://en.wikipedia.org/wiki/Memory-mapped_file

Specifically...

The memory mapped approach has its cost in minor page faults - when a block of data is loaded in the page cache, but is not yet mapped into the process's virtual memory space. Depending on the circumstances, memory mapped file I/O can actually be substantially slower than standard file I/O.

It sounds like you're about to prematurely optimize for speed. Why not start with a regular file approach, and refactor to memory-mapped files later if needed?
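
For comparison, the regular-file version is already straightforward (a sketch only; the path, total size, and chunk offsets are placeholders, as in the question):

    using System.IO;

    class PlainFileWriter
    {
        // Assumes the final size and each chunk's offset are known.
        static void WriteChunk(string path, long totalSize, long chunkOffset, byte[] chunkData)
        {
            using (var fs = new FileStream(path, FileMode.OpenOrCreate, FileAccess.Write))
            {
                if (fs.Length < totalSize)
                    fs.SetLength(totalSize);          // pre-size the file once
                fs.Seek(chunkOffset, SeekOrigin.Begin);
                fs.Write(chunkData, 0, chunkData.Length);
            }
        }
    }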

2009-12-07