What: /sys/fs/lustre/version
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Shows the currently running Lustre version.
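
These attributes are plain newline-terminated text files. A minimal helper for reading and writing them might look like the sketch below; the `read_attr`/`write_attr` names are illustrative, not part of any Lustre API:

```python
from pathlib import Path

def read_attr(path):
    # sysfs attributes are newline-terminated text; strip the trailing newline
    return Path(path).read_text().strip()

def write_attr(path, value):
    # writes take the text form of the value
    Path(path).write_text(str(value))
```

On a node with the Lustre module loaded, `read_attr("/sys/fs/lustre/version")` would return the version string.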
What: /sys/fs/lustre/pinger
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Shows if the Lustre module has pinger support.
"on" means yes and "off" means no.

What: /sys/fs/lustre/health
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Shows the current system health state: "healthy", "NOT HEALTHY",
or "LBUG" if Lustre has experienced an internal assertion failure.
What: /sys/fs/lustre/jobid_name
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Currently running job "name" for this node, to be transferred
to Lustre servers for purposes of QoS and statistics gathering.
Writing into this file changes the name; reading returns the
current value.

What: /sys/fs/lustre/jobid_var
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Control file for the Lustre "jobstats" functionality; write a
value from the list below to change the mode:
disable - disable job name reporting to the servers (default)
procname_uid - form the job name from the currently running
command name and pid with a dot in between
nodelocal - use the jobid_name value from above.
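
Following the procname_uid description above, the job name is formed from the command name and pid joined by a dot. A sketch of that rule (the function name is illustrative, not the kernel code):

```python
def procname_uid_jobid(comm, pid):
    # "command name and pid with a dot in between"
    return "%s.%d" % (comm, pid)
```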
What: /sys/fs/lustre/timeout
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls the "lustre timeout" variable, also known as obd_timeout
in some older manuals. In the past obd_timeout was of paramount
importance as the timeout value used everywhere and from which
other timeouts were derived. These days it is much less
important, as network timeouts are mostly determined by
AT (adaptive timeouts).
Unit: seconds, default: 100

What: /sys/fs/lustre/max_dirty_mb
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls the total amount of dirty cache (in megabytes) allowed
across all mounted Lustre filesystems.
Since writeout of dirty pages in Lustre is somewhat expensive,
allowing too many dirty pages might lead to performance
degradation as the kernel desperately tries to find pages to
free or write out.
Default: 1/2 of RAM. Min value: 4, max value: 9/10 of RAM.
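
The stated default and bounds (default 1/2 of RAM, minimum 4 MB, maximum 9/10 of RAM) amount to a simple clamp. A sketch of the rule, not the actual kernel code:

```python
def clamp_max_dirty_mb(requested_mb, ram_mb):
    # minimum 4 MB, maximum 9/10 of RAM
    return max(4, min(requested_mb, ram_mb * 9 // 10))

def default_max_dirty_mb(ram_mb):
    # default is half of RAM, subject to the same bounds
    return clamp_max_dirty_mb(ram_mb // 2, ram_mb)
```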
What: /sys/fs/lustre/debug_peer_on_timeout
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls if LNet debug information should be printed when
an RPC timeout occurs.

What: /sys/fs/lustre/dump_on_timeout
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls if the Lustre debug log should be dumped when an RPC
timeout occurs. This is useful if your debug buffer typically
rolls over by the time you notice RPC timeouts.

What: /sys/fs/lustre/dump_on_eviction
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls if the Lustre debug log should be dumped when this
client is evicted from one of the servers.
This is useful if your debug buffer typically rolls over
by the time you notice the eviction event.

What: /sys/fs/lustre/at_min
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls the minimum adaptive timeout in seconds. If you
encounter a case where clients time out due to server-reported
processing time being too short, you might consider increasing
this value. One common case is when the underlying network has
unpredictably long delays.

What: /sys/fs/lustre/at_max
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls the maximum adaptive timeout in seconds. If the at_max
timeout is reached for an RPC, the RPC will time out.
Some genuinely slow network hardware might warrant increasing
this value.
Setting this value to 0 disables the Adaptive Timeouts
functionality, and the old-style obd_timeout value is used
instead.
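
Putting at_min and at_max together: the effective timeout tracks the observed service-time estimate, clamped between at_min and at_max, and at_max = 0 falls back to obd_timeout. A simplified model for illustration, not the actual AT algorithm:

```python
def effective_timeout(service_estimate, at_min, at_max, obd_timeout=100):
    # at_max == 0 disables adaptive timeouts entirely
    if at_max == 0:
        return obd_timeout
    # otherwise clamp the estimate to [at_min, at_max]
    return max(at_min, min(service_estimate, at_max))
```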
What: /sys/fs/lustre/at_extra
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls how much extra time, in seconds, to request for
unfinished requests still in processing. Normally a server-side
parameter, it is also used on the client for responses to
various LDLM ASTs that are handled by a special server thread
on the client. This is a way for the servers to ask the clients
not to time out a request that has reached the current servicing
time estimate and to give it some more time.

What: /sys/fs/lustre/at_early_margin
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls when to send the early reply for requests that are
about to time out, as an offset to the estimated service time,
in seconds.

What: /sys/fs/lustre/at_history
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls for how many seconds to remember the slowest events
encountered by the adaptive timeouts code.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/blocksize
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Biggest blocksize on an object storage server for this
filesystem.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/kbytestotal
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Shows the total number of kilobytes of space on this filesystem.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/kbytesfree
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Shows the total number of free kilobytes of space on this
filesystem.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/kbytesavail
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Shows the total number of free kilobytes of space on this
filesystem actually available for use (taking into account
per-client grants and filesystem reservations).
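
The relationship between kbytesfree and kbytesavail can be modeled as free space minus space already promised or reserved. The exact accounting is internal to Lustre, so the fields below are assumptions for illustration only:

```python
def kbytesavail(kbytesfree, granted_kb, reserved_kb):
    # assumed model: free space minus per-client grants and
    # filesystem reservations, never negative
    return max(0, kbytesfree - granted_kb - reserved_kb)
```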
What: /sys/fs/lustre/llite/<fsname>-<uuid>/filestotal
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Shows the total number of inodes on the filesystem.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/filesfree
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Shows the estimated number of free inodes on the filesystem.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/client_type
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Shows whether this filesystem considers this client to be
compute cluster-local or remote. Remote clients have
additional uid/gid converting logic applied.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/fstype
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Shows the filesystem type of the filesystem.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/uuid
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Shows this filesystem's superblock uuid.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/max_read_ahead_mb
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Sets the maximum number of megabytes of system memory to be
given to the read-ahead cache.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/max_read_ahead_per_file_mb
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Sets the maximum number of megabytes to read ahead for a single
file.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/max_read_ahead_whole_mb
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
For small reads, controls how many megabytes to actually request
from the server as the initial read-ahead.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/checksum_pages
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Enables or disables per-page checksums at the llite layer,
before the pages are actually given to the lower level for
network transfer.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/stats_track_pid
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Limits Lustre vfs operations gathering to just a single pid.
0 to track everything.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/stats_track_ppid
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Limits Lustre vfs operations gathering to just a single ppid.
0 to track everything.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/stats_track_gid
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Limits Lustre vfs operations gathering to just a single gid.
0 to track everything.
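
The three stats_track_* files share the same convention: 0 means track everything, any other value restricts gathering to that single id. A sketch of the filtering rule:

```python
def should_track(track_id, actual_id):
    # 0 means no filtering; otherwise only the matching id is tracked
    return track_id == 0 or track_id == actual_id
```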
What: /sys/fs/lustre/llite/<fsname>-<uuid>/statahead_max
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls the maximum number of statahead requests to send when
a sequential readdir+stat pattern is detected.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/statahead_agl
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls if AGL (async glimpse ahead: obtaining object
information from OSTs in parallel with the MDS during
statahead) should be enabled.
0 to disable, 1 to enable.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/lazystatfs
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls statfs(2) behaviour in the face of down servers.
If 0, always wait for all servers to come online;
if 1, ignore inactive servers.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/max_easize
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Shows the maximum number of bytes file striping data could
occupy in the current storage configuration.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/default_easize
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Shows the maximum observed file striping data seen by this
filesystem client instance.

What: /sys/fs/lustre/llite/<fsname>-<uuid>/xattr_cache
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls the client-side extended attributes cache.
1 to enable, 0 to disable.

What: /sys/fs/lustre/ldlm/cancel_unused_locks_before_replay
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls if the client should replay unused locks during
recovery. If a client tends to have a lot of unused locks in
the LRU, recovery times might become prolonged.
1 - just locally cancel unused locks (default)
0 - replay unused locks.

What: /sys/fs/lustre/ldlm/namespaces/<name>/resource_count
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Displays the number of lock resources (objects on which
individual locks are taken) currently allocated in this
namespace.

What: /sys/fs/lustre/ldlm/namespaces/<name>/lock_count
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Displays the number of locks allocated in this namespace.

What: /sys/fs/lustre/ldlm/namespaces/<name>/lru_size
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls and displays the LRU size limit for unused locks in
this namespace.
0 - LRU size is unlimited, controlled by server resources
positive number - number of locks to allow in the lock LRU list

What: /sys/fs/lustre/ldlm/namespaces/<name>/lock_unused_count
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Displays the number of locks currently sitting in the LRU list
of this namespace.

What: /sys/fs/lustre/ldlm/namespaces/<name>/lru_max_age
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Maximum number of milliseconds a lock can sit in the LRU list
before the client voluntarily cancels it as unused.
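
Combining lru_size and lru_max_age: an unused lock becomes a candidate for voluntary cancellation once it has aged past lru_max_age, or when the LRU holds more locks than lru_size allows (lru_size 0 meaning no client-side limit). A simplified decision sketch, not the actual ldlm code:

```python
def should_cancel_unused(age_ms, lru_count, lru_size, lru_max_age_ms):
    if age_ms > lru_max_age_ms:
        return True   # lock sat unused for too long
    if lru_size > 0 and lru_count > lru_size:
        return True   # LRU over its configured limit
    return False      # lru_size == 0: limit driven by server resources
```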
What: /sys/fs/lustre/ldlm/namespaces/<name>/early_lock_cancel
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls the "early lock cancellation" feature on this
namespace, if supported by the server.
When enabled, the client tries to preemptively cancel locks
that would be cancelled by various operations and bundles the
cancellation requests into the same RPC as the main operation,
which results in significant speedups due to reduced
lock-pingpong RPCs.
1 - enabled (default)
0 - disabled

What: /sys/fs/lustre/ldlm/namespaces/<name>/pool/granted
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Displays the number of granted locks in this namespace.

What: /sys/fs/lustre/ldlm/namespaces/<name>/pool/grant_rate
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Number of locks granted in this namespace during the last
time interval.

What: /sys/fs/lustre/ldlm/namespaces/<name>/pool/cancel_rate
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Number of lock cancellations in this namespace during the last
time interval.

What: /sys/fs/lustre/ldlm/namespaces/<name>/pool/grant_speed
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Calculated speed of lock granting (grant_rate - cancel_rate)
in this namespace.
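
grant_speed is simply the difference between the two rates above:

```python
def grant_speed(grant_rate, cancel_rate):
    # positive: the namespace is accumulating locks; negative: shedding them
    return grant_rate - cancel_rate
```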
What: /sys/fs/lustre/ldlm/namespaces/<name>/pool/grant_plan
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Estimated number of locks to be granted in the next time
interval in this namespace.

What: /sys/fs/lustre/ldlm/namespaces/<name>/pool/limit
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls the number of allowed locks in this pool.
When lru_size is 0, this is the effective limit.

What: /sys/fs/lustre/ldlm/namespaces/<name>/pool/lock_volume_factor
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Multiplier for all lock volume calculations above.
Default is 1. Increase it to make the client clean its lock
LRU list for this namespace more aggressively.

What: /sys/fs/lustre/ldlm/namespaces/<name>/pool/server_lock_volume
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Calculated server lock volume.

What: /sys/fs/lustre/ldlm/namespaces/<name>/pool/recalc_period
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls the length of time between recalculations of the
above values, in seconds.

What: /sys/fs/lustre/ldlm/services/ldlm_cbd/threads_min
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls the minimum number of ldlm callback threads to start.

What: /sys/fs/lustre/ldlm/services/ldlm_cbd/threads_max
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls the maximum number of ldlm callback threads to start.

What: /sys/fs/lustre/ldlm/services/ldlm_cbd/threads_started
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Shows the actual number of ldlm callback threads running.

What: /sys/fs/lustre/ldlm/services/ldlm_cbd/high_priority_ratio
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls what percentage of ldlm callback threads is dedicated
to "high priority" incoming requests.
What: /sys/fs/lustre/{obdtype}/{connection_name}/blocksize
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Blocksize on the backend filesystem for the service behind this
obd device (or the biggest blocksize for compound devices like
lov and lmv).

What: /sys/fs/lustre/{obdtype}/{connection_name}/kbytestotal
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Total number of kilobytes of space on the backend filesystem
for the service behind this obd (or the total amount for
compound devices like lov and lmv).

What: /sys/fs/lustre/{obdtype}/{connection_name}/kbytesfree
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Number of free kilobytes on the backend filesystem for the
service behind this obd (or the total amount for compound
devices like lov and lmv).

What: /sys/fs/lustre/{obdtype}/{connection_name}/kbytesavail
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Number of kilobytes of free space on the backend filesystem
for the service behind this obd (or the total amount for
compound devices like lov and lmv) that is actually available
for use (taking into account per-client and filesystem
reservations).

What: /sys/fs/lustre/{obdtype}/{connection_name}/filestotal
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Number of inodes on the backend filesystem for the service
behind this obd.

What: /sys/fs/lustre/{obdtype}/{connection_name}/filesfree
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Number of free inodes on the backend filesystem for the service
behind this obd.

What: /sys/fs/lustre/mdc/{connection_name}/max_pages_per_rpc
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Maximum number of readdir pages to fit into a single readdir
RPC.

What: /sys/fs/lustre/{mdc,osc}/{connection_name}/max_rpcs_in_flight
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Maximum number of parallel RPCs on the wire allowed on this
connection. Increasing this number can help on higher-latency
links, but risks overloading the server if too many clients
do the same.

What: /sys/fs/lustre/osc/{connection_name}/max_pages_per_rpc
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Maximum number of pages to fit into a single RPC.
Typically, bigger RPCs allow for better performance.
Default: however many pages are needed to form 1M of data
(256 pages on platforms with a 4K page size).
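
The stated default follows directly from the platform page size: 1M of data divided by the page size, e.g. 256 pages on 4K-page systems.

```python
def default_max_pages_per_rpc(page_size=4096):
    # pages needed to form 1M (2**20 bytes) of data
    return (1 << 20) // page_size
```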
What: /sys/fs/lustre/osc/{connection_name}/active
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls the accessibility of this connection. If set to 0,
all accesses fail immediately.

What: /sys/fs/lustre/osc/{connection_name}/checksums
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls whether to checksum bulk RPC data over the wire.
1: enable (default); 0: disable

What: /sys/fs/lustre/osc/{connection_name}/contention_seconds
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls for how long to consider a file contended once the
server has indicated it as such.
When a file is considered contended, all operations switch to
a synchronous lockless mode to avoid cache and lock pingpong.

What: /sys/fs/lustre/osc/{connection_name}/cur_dirty_bytes
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Displays how many dirty bytes are presently in the cache for
this target.

What: /sys/fs/lustre/osc/{connection_name}/cur_grant_bytes
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Shows how many bytes we have as a "dirty cache" grant from the
server. Writing a value smaller than the one shown releases
some grant back to the server.
The dirty cache grant is the way Lustre ensures that cached
successful writes on a client do not end up discarded by the
server later on due to lack of space.

What: /sys/fs/lustre/osc/{connection_name}/cur_lost_grant_bytes
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Shows how many granted bytes were released to the server due
to lack of write activity on this client.

What: /sys/fs/lustre/osc/{connection_name}/grant_shrink_interval
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Number of seconds with no write activity on this target before
dirty grant starts being released back to the server.

What: /sys/fs/lustre/osc/{connection_name}/destroys_in_flight
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Number of DESTROY RPCs currently in flight to this target.

What: /sys/fs/lustre/osc/{connection_name}/lockless_truncate
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls whether lockless truncate RPCs are allowed to this
target.
A lockless truncate causes the server to perform the locking,
which is beneficial if the truncate is not followed by a write.
1: enable; 0: disable (default)

What: /sys/fs/lustre/osc/{connection_name}/max_dirty_mb
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls how much dirty data this client can accumulate
for this target. This is orthogonal to the dirty grant and is
a hard limit even if the server would allow a bigger dirty
cache.
While allowing a higher dirty cache is beneficial for write
performance, flushing the write cache takes longer, and as
such the node might be more prone to OOMs.
Setting this value too low might prevent the client from
sending enough parallel WRITE RPCs.

What: /sys/fs/lustre/osc/{connection_name}/resend_count
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Controls how many times to try to resend RPCs to this target
that failed with a "recoverable" status, such as EAGAIN.
What: /sys/fs/lustre/lov/{connection_name}/numobd
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Number of OSC targets managed by this LOV instance.

What: /sys/fs/lustre/lov/{connection_name}/activeobd
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Number of OSC targets managed by this LOV instance that are
currently active.

What: /sys/fs/lustre/lmv/{connection_name}/numobd
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Number of MDC targets managed by this LMV instance.

What: /sys/fs/lustre/lmv/{connection_name}/activeobd
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Number of MDC targets managed by this LMV instance that are
currently active.

What: /sys/fs/lustre/lmv/{connection_name}/placement
Contact: "Oleg Drokin" <oleg.drokin@intel.com>
Determines the policy of inode placement in case of multiple
metadata servers:
CHAR - based on a hash of the file name used at creation time
NID - based on a hash of the creating client's network id.