Memory and IOPS Deduplication
Virtuozzo provides memory and IOPS deduplication that helps save memory and IOPS on the Hardware Node and increases the maximum number of running Containers per Hardware Node.
Deduplication is provided by Virtuozzo File Cache, which consists of the pfcached daemon and a ploop image mounted to a directory on the Hardware Node. The file cache ploop contains copies of eligible files located inside Containers. To be eligible for caching, files in Containers must meet certain configurable requirements, such as being read in a certain number of Containers, being of a certain size, or being stored in certain directories in Containers.
When the kernel gets a request to read a file located in a Container ploop, it searches the file cache ploop for a copy of that file, using the SHA1 hash stored as the file's extended attribute. If a copy is found, it is read from the file cache ploop instead of the original file in the Container ploop. Otherwise, the original file in the Container ploop is read.
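The lookup described above can be illustrated with a small user-space model. This is only a sketch of the decision logic, not the kernel implementation: the class `FileCacheModel`, the dictionary key `sha1_xattr`, and the read counters are hypothetical names chosen for illustration, and the file cache ploop is modelled as an in-memory dictionary.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA1 digest of data as a hex string."""
    return hashlib.sha1(data).hexdigest()


class FileCacheModel:
    """Toy stand-in for the file cache ploop: maps SHA1 digest -> cached copy."""

    def __init__(self):
        self.copies = {}          # digest -> file content
        self.cache_reads = 0      # reads served from the cache copy
        self.container_reads = 0  # reads served from the Container's own file

    def add_copy(self, data: bytes) -> None:
        self.copies[sha1_hex(data)] = data

    def read(self, container_file: dict) -> bytes:
        # 'sha1_xattr' models the hash stored as an extended attribute;
        # it is None when the file carries no such attribute (assumption).
        digest = container_file.get("sha1_xattr")
        if digest is not None and digest in self.copies:
            self.cache_reads += 1
            return self.copies[digest]
        # No matching copy in the cache: fall back to the original file.
        self.container_reads += 1
        return container_file["data"]
```

In this model, two Containers holding byte-identical files share one digest, so both reads resolve to the same cached copy.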
To populate the file cache ploop with the most requested files, pfcached periodically obtains file read statistics for Containers from the kernel, analyzes them, and copies eligible files to the file cache ploop. If the file cache ploop is running out of space, the least recently used files are removed from it.
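The populate-and-evict cycle can be sketched as follows. This is a simplified model under stated assumptions, not pfcached's actual code: the thresholds (`min_containers`, `min_size`, `dirs`), the `ReadStat` record, and the byte-based capacity are hypothetical, and the size requirement is assumed here to be a minimum size.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadStat:
    sha1: str        # content hash of the file
    path: str        # path of the file inside the Container
    size: int        # file size in bytes
    containers: int  # number of distinct Containers that read the file


def is_eligible(stat, min_containers=2, min_size=4096,
                dirs=("/bin", "/lib", "/usr")):
    """Check the (assumed) eligibility rules: read count, size, directory."""
    in_dir = any(stat.path == d or stat.path.startswith(d + "/") for d in dirs)
    return stat.containers >= min_containers and stat.size >= min_size and in_dir


class LruCache:
    """Fixed-capacity cache that evicts the least recently used entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.entries = OrderedDict()  # sha1 -> size, least recently used first

    def touch(self, sha1):
        # Mark an entry as recently used.
        if sha1 in self.entries:
            self.entries.move_to_end(sha1)

    def add(self, stat):
        if stat.sha1 in self.entries:
            self.touch(stat.sha1)
            return
        if stat.size > self.capacity:
            return  # can never fit, skip it
        # Running out of space: drop least recently used entries first.
        while self.used + stat.size > self.capacity:
            _, size = self.entries.popitem(last=False)
            self.used -= size
        self.entries[stat.sha1] = stat.size
        self.used += stat.size


def populate(cache, stats):
    """Copy every eligible file from the read statistics into the cache."""
    for stat in stats:
        if is_eligible(stat):
            cache.add(stat)
```

An `OrderedDict` keeps the sketch short: moving an entry to the end on each use leaves the least recently used entry at the front, where eviction removes it.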
Virtuozzo File Cache offers the following benefits:
- Memory deduplication. Only a single file from the file cache ploop needs to be loaded to memory instead of loading multiple identical files located in multiple Containers.
- IOPS deduplication. Only a single file from the file cache ploop needs to be read instead of reading multiple identical files located in multiple Containers.
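The effect of both benefits can be estimated with simple arithmetic. The helper below is an idealized illustration only: the function name `read_cost` and its parameters are hypothetical, it assumes at least one Container, and it ignores real-world factors such as partial reads and cache eviction.

```python
def read_cost(num_containers, file_size_bytes, deduplicated):
    """Estimate disk reads and memory used when every Container reads one
    identical file (idealized model, assumes num_containers >= 1)."""
    # With deduplication a single shared copy is read and kept in memory;
    # without it each Container reads and caches its own copy.
    copies = 1 if deduplicated else num_containers
    return {"disk_reads": copies, "memory_bytes": copies * file_size_bytes}
```

For example, under this model 50 Containers reading the same 2 MiB file cost 50 reads and 100 MiB of memory without deduplication, versus 1 read and 2 MiB with it.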
If the Hardware Node has storage drives with different performance, e.g., SATA and SSD, the file cache ploop performs better if it is located on the fastest storage drive on the Node, e.g., SSD. In any case:
- If the Hardware Node memory is not overcommitted, the file cache mostly helps speed up Container start, during which most files are read. In this case, caches residing in memory are not cleaned often, so copies in the file cache ploop, once read during Container start, rarely need to be reread during Container operation.
- If the Hardware Node memory is overcommitted, Virtuozzo File Cache helps speed up both Container start and operation. In this case, caches residing in memory may be cleaned often, so files in the file cache ploop need to be reread just as often.
Virtuozzo File Cache can be managed with the pfcache utility described in the Virtuozzo 6 Command Line Reference Guide.