Linux page cache

Overview

Block device layer
Page cache
I/O scheduler


The page cache contains all file I/O data; only direct I/O bypasses the page cache.


The page cache helps Linux economize I/O

– Read requests can be served faster by adding a read-ahead quantity that depends on the historical behavior of the applications' file system accesses

– Write requests are delayed, so data in the page cache can be updated multiple times before being written to disk

– Write requests in the page cache can be merged into larger I/O requests
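
An application can also tell the kernel about its access pattern instead of leaving read-ahead to the historical guess. The following is a minimal sketch, assuming a Linux system where Python exposes os.posix_fadvise; the function name read_sequential and the chunk size are hypothetical choices for illustration, and the hint is advisory only.

```python
import os


def read_sequential(path, chunk=1 << 20):
    """Read a whole file after hinting that access will be sequential."""
    fd = os.open(path, os.O_RDONLY)
    try:
        if hasattr(os, "posix_fadvise"):
            try:
                # Advisory hint: the kernel may use a larger read-ahead window.
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
            except OSError:
                pass  # the hint is optional; fall back to default read-ahead
        parts = []
        while True:
            block = os.read(fd, chunk)
            if not block:
                break
            parts.append(block)
        return b"".join(parts)
    finally:
        os.close(fd)
```

The data returned is the same with or without the hint; only the way the page cache is filled may differ.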

But the page cache...

    – Requires Linux memory pages

    – Is not useful when the cached data is not reused

        Data is needed only once

        Application buffers the data itself

    – Cannot know which data the application really needs next; Linux only makes a guess

    – Has no alternative if the application cannot handle direct I/O


Consider using...

direct I/O:
    – bypasses the page cache
    – is a good choice whenever the application does not want Linux to economize I/O and/or the application itself buffers large amounts of file content
async I/O:
    – prevents the application from being blocked in the I/O system call until the I/O completes
    – allows Linux to merge reads when the page cache is used
    – can be combined with direct I/O
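
A direct I/O read can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the name read_direct and the 4096-byte BLOCK value are hypothetical, the real alignment requirement depends on the device and file system, and the sketch falls back to a normal buffered read where O_DIRECT is rejected.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the actual requirement is device dependent


def read_direct(path, size=BLOCK):
    """Read up to `size` bytes from the start of `path`, bypassing the page cache if possible."""
    buf = mmap.mmap(-1, size)  # anonymous mapping, so the buffer is page-aligned
    try:
        # Try O_DIRECT first, then a plain buffered open as a fallback.
        for flags in (os.O_RDONLY | getattr(os, "O_DIRECT", 0), os.O_RDONLY):
            try:
                fd = os.open(path, flags)
            except OSError:
                continue  # e.g. the file system does not support O_DIRECT
            try:
                n = os.readv(fd, [buf])
                return bytes(buf[:n])
            except OSError:
                continue  # e.g. alignment not accepted; try the next mode
            finally:
                os.close(fd)
        raise OSError("could not read " + path)
    finally:
        buf.close()
```

The aligned buffer is the main difference from an ordinary read: with O_DIRECT the buffer address, the file offset and the length generally all have to respect the alignment of the underlying storage.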

temporary files:
    – should not reside on real disks; a RAM disk or tmpfs allows the fastest access to these files
    – don't need to survive a crash, so don't place them on a journaling file system
file system:
    – use ext3 and select the appropriate journaling mode (journal, ordered, writeback)
    – turning off atime is only suitable if no application makes decisions based on the "last read" time; consider relatime instead
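
Placing temporary files on tmpfs can be sketched as below. It assumes that /dev/shm is a tmpfs mount, which is common on Linux distributions but not guaranteed; the helper names scratch_dir and roundtrip are hypothetical, and the sketch falls back to the default temporary directory when /dev/shm is not usable.

```python
import os
import tempfile


def scratch_dir(candidate="/dev/shm"):
    """Return `candidate` if it is a writable directory, else the default temp dir."""
    # Assumption: /dev/shm is backed by tmpfs on this system.
    if os.path.isdir(candidate) and os.access(candidate, os.W_OK | os.X_OK):
        return candidate
    return tempfile.gettempdir()


def roundtrip(data):
    """Write `data` to a temporary file in the scratch directory and read it back."""
    with tempfile.TemporaryFile(dir=scratch_dir()) as f:
        f.write(data)
        f.seek(0)
        return f.read()
```

The file is removed automatically when it is closed, which matches the point that such files do not need to survive a crash.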

Direct I/O versus Page cache


Direct I/O

    – Preferable if the application does its own caching

        Application knows best which data is needed again

        Application knows which data is most likely needed next

        Example: database management systems (DBMS)

    – Preferable if caching makes no sense

        Data is needed only once

        Backup and restore

Page cache

    – Optimizes re-reads and writes, but can be critical

        Data written to the page cache but not yet to disk can be lost, which is a problem if data loss cannot easily be handled

    – The only choice if the application cannot handle direct I/O

        Typical example is a file server
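
When the page cache is used and data loss is hard to handle, the application can ask the kernel to flush its delayed writes explicitly. The following is a minimal sketch using os.fsync; the function name durable_write is hypothetical, and the sketch covers the file contents only.

```python
import os


def durable_write(path, data):
    """Write `data` to `path` and wait until the kernel reports it as flushed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Without this call the data may sit in the page cache for a while.
        os.fsync(fd)
    finally:
        os.close(fd)
```

For a newly created file, the directory entry may need its own fsync on the containing directory before the file is guaranteed to be found after a crash.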

