F2FS

出典: フリー百科事典『ウィキペディア(Wikipedia)』

出典: フリー百科事典『ウィキペディア(Wikipedia)』
F2FS
開発者 サムスン電子モトローラ・モビリティファーウェイ
正式名 Flash-Friendly File System
導入 2012年12月20日 (Linux 3.8)
構造
ディレクトリ マルチレベルハッシュテーブル
領域管理 ビットマップ(空き容量)、テーブル
限度
最大ファイル サイズ 3.94 TiB
最大ファイル数 ボリュームサイズに依存
最大ファイル名長 255バイト
最大ボリューム サイズ 16 TiB
ファイル名の文字 NUL と '/' 以外の全ての文字
特徴
タイムスタンプ 変更、属性変更、アクセス
日付分解能 1ナノ秒
属性 POSIX拡張属性
パーミッション POSIX、ACL
透過的圧縮 なし
透過的暗号化 あり
対応OS Linux
テンプレートを表示

F2FS(エフツーエフエス、Flash-Friendly File System)は、サムスン電子の金載極(韓国語: 김재극英語: Jaegeuk Kim)によって開発されたLinux向けのフラッシュファイルシステムである[1]

NANDフラッシュメモリベースのストレージ[注釈 1]は、スマートフォン・タブレットからサーバまで、様々な種類のコンピュータで利用されている。F2FSはこれらのストレージの特性を考慮したファイルシステムとして設計が行われている[2]

F2FSはログ構造ファイルシステムを、NANDフラッシュメモリベースのストレージに適したものにして取り入れている。F2FSでは、従来のログ構造ファイルシステムの問題点であるガベージコレクションの負荷の高さなどを解消している[1]。NANDフラッシュメモリベースのストレージではフラッシュ変換レイヤ(英語: Flash Translation Layer、FTL)が実装されており、フラッシュメモリの特性[注釈 2]の隠蔽とHDDエミュレートを行い、SATAなどのインタフェースで利用ができるようにしている。F2FSはFTLの存在を前提として、NANDフラッシュメモリベースのストレージの性能を引き出すことができるように設計されている[1]

特徴[編集]

F2FSには、以下の機能が実装されている[3]

  • マルチヘッドログ
  • マルチレベルハッシュテーブル
  • 静的/動的なホットデータとコールドデータの分離
  • Adaptive logging scheme
  • Configurable operational units
  • 二重チェックポイント
  • 巻き戻しと前進復帰の対応
  • ヒープ形式のブロック割り当て
  • TRIM英語版/FITRIMのサポート
  • オンラインデフラグメンテーション
  • インライン拡張属性
  • インラインデータ
  • インラインディレクトリ
  • オフラインファイルシステムチェック
  • Atomic operations
  • ファイルシステム全体の暗号化
  • オフラインでのファイルシステムのサイズ変更
  • キャッシュ (コンピュータシステム)の定期的な書き込み
  • Extent cache
  • Move_file_range
  • Host-managed SMR
  • 複数のデバイスに対応
  • Large IO submission
  • Disk Quota (user/group)

また、以下の機能の実装が予定されている[3]

設計[編集]

ディスクレイアウト[編集]

F2FS divides the whole volume into a number of segments, each of which is fixed at 2 MB. A section is composed of consecutive segments, and a zone consists of a set of sections. By default, section and zone sizes are set to the same size, but users can easily modify the size with mkfs.

F2FS splits the entire volume into six areas, and all except the superblock area consist of multiple segments as described below.

Superblock (SB)
The SB is located at the beginning of the partition. There are two copies to avoid file-system corruption. It contains basic partition information and some default F2FS parameters.
Checkpoint (CP)
The CP contains file system information, bitmaps for valid NAT/SIT sets, orphan inode lists, and summary entries of current active segments.
Segment Information Table (SIT)
The SIT contains the valid block count and validity bitmap of all the Main Area blocks.
Node Address Table (NAT)
The NAT is an address table for the Main Area node blocks.
Segment Summary Area (SSA)
The SSA contains entries which contain the owner information of the Main Area data and node blocks.
Main Area
The main area contains file and directory data and their indices.

In order to avoid misalignment between file system and flash storage, F2FS aligns the start block address of the CP with the segment size. It also aligns the Main Area start block address with the zone size by reserving some segments in the SSA area.

メタデータ構造[編集]

F2FS uses the checkpoint scheme to maintain file system integrity. At mount time, F2FS first tries to find the last valid checkpoint data by scanning the CP area. In order to reduce the scanning time, F2FS uses only two copies of the CP. One of them always indicates the last valid data, which is called a shadow copy mechanism. In addition to the CP, the NAT and SIT also use the shadow copy mechanism. For file system consistency, each CP points to which NAT and SIT copies are valid.

配列構造[編集]

The key data structure is the "node". Similar to traditional file structures, F2FS has three types of nodes: inode, direct node, indirect node. F2FS assigns 4 KB to an inode block which contains 923 data block indices, two direct node pointers, two indirect node pointers, and one double indirect node pointer as described below. A direct node block contains 1018 data block indices, and an indirect node block contains 1018 node block indices. Thus, one inode block (i.e., a file) covers:

4 KB × (923 + 2×1018 + 2×1018
2 + 10183) = 3.94 TB

Note that all the node blocks are mapped by the NAT, which means that the location of each node is translated by the NAT. To mitigate the wandering tree problem, F2FS is able to cut off the propagation of node updates caused by leaf data writes.

ディレクトリ構造[編集]

A directory entry (dentry) occupies 11 bytes, which consists of the following attributes.

A directory entry structure
hash hash value of the file name
ino inode number
len the length of file name
type file type such as directory, symlink, etc.

A dentry block consists of 214 dentry slots and file names. A bitmap is used to represent whether each dentry is valid or not. A dentry block occupies 4 KB and has the following composition:

Dentry Block (4 K) = bitmap (27 bytes) + reserved (3 bytes) +
                      dentries (11 * 214 bytes) + file name (8 * 214 bytes)

F2FS implements multi-level hash tables for the directory structure. Each level has a hash table with a dedicated number of hash buckets as shown below. Note that "A(2B)" means a bucket includes 2 data blocks.

Term
A indicates bucket
B indicates block
N indicates MAX_DIR_HASH_DEPTH
level #0    A(2B)
level #1    A(2B) - A(2B)
level #2    A(2B) - A(2B) - A(2B) - A(2B)
    ...
level #N/2  A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
    ...
level #N    A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B)

When F2FS finds a file name in a directory, first a hash value of the file name is calculated. Then, F2FS scans the hash table in level #0 to find the dentry consisting of the file name and its inode number. If not found, F2FS scans the next hash table in level #1. In this way, F2FS scans hash tables in each level incrementally from 1 to N. In each level F2FS needs to scan only one bucket determined by the following equation, which shows O(log(# of files)) complexity.

 bucket number to scan in level #n = (hash value) % (# of buckets in level #n)

In the case of file creation, F2FS finds empty consecutive slots that cover the file name. F2FS searches the empty slots in the hash tables of whole levels from 1 to N in the same way as the lookup operation.

デフォルトのブロック割り当て[編集]

At runtime, F2FS manages six active logs inside the "Main Area:" Hot/Warm/Cold node and Hot/Warm/Cold data.

Block allocation policy
Hot node Contains direct node blocks of directories.
Warm node Contains direct node blocks except hot node blocks.
Cold node Contains indirect node blocks.
Hot data Contains dentry blocks.
Warm data Contains data blocks except hot and cold data blocks.
Cold data Contains multimedia data or migrated data blocks.

LFS has two schemes for free space management: threaded log and copy-and-compaction. The copy-and-compaction scheme which is known as cleaning, is well-suited for devices showing very good sequential write performance, since free segments are served all the time for writing new data. However, it suffers from cleaning overhead during high utilization. Conversely, the threaded log scheme suffers from random writes, but no cleaning process is needed. F2FS adopts a hybrid scheme where the copy-and-compaction scheme is adopted by default, but the policy is dynamically changed to the threaded log scheme according to the file system status.

In order to align F2FS with underlying flash-based storage, F2FS allocates a segment in a unit of a section. F2FS expects the section size to be the same as the garbage collection unit size in FTL. With respect to the mapping granularity in FTL, F2FS allocates each section of the active logs to as many different zones as possible. FTL can write the active log data into one allocation unit according to its mapping granularity.

消去処理[編集]

F2FS does cleaning both on demand, and in the background. On-demand cleaning is triggered when there are not enough free segments to serve VFS calls. The background cleaner is executed by a kernel thread, and triggers the cleaning job when the system is idle.

F2FS supports two victim selection policies: greedy, and cost-benefit algorithms. In the greedy algorithm, F2FS selects a victim segment having the smallest number of valid blocks. In the cost-benefit algorithm, F2FS selects a victim segment according to the segment age and the number of valid blocks in order to address the log block thrashing problem present in the greedy algorithm. F2FS uses the greedy algorithm for on-demand cleaning, the background cleaner uses the cost-benefit algorithm.

In order to identify whether the data in the victim segment are valid or not, F2FS manages a bitmap. Each bit represents the validity of a block, and the bitmap is composed of a bit stream covering whole blocks in the Main Area.

脚注[編集]

注釈[編集]

  1. ^ SDカードSSDなど
  2. ^ 書き込み回数の上限の存在など

出典[編集]

  1. ^ a b c Jaegeuk Kim (2012年10月5日). “f2fs: introduce flash-friendly file system”. 2018年4月28日閲覧。
  2. ^ Jaegeuk Kim (2016年12月3日). “start [F2FS Wiki]”. 2018年4月29日閲覧。
  3. ^ a b Jaegeuk Kim (2017年7月12日). “development [F2FS Wiki]”. 2018年4月29日閲覧。

関連項目[編集]