===========================================
Fault injection capabilities infrastructure
===========================================

See also drivers/md/md-faulty.c and the "every_nth" module option for scsi_debug.

Available fault injection capabilities
--------------------------------------

- failslab

  injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)

- fail_page_alloc

  injects page allocation failures. (alloc_pages(), get_free_pages(), ...)

- fail_usercopy

  injects failures in user memory access functions. (copy_from_user(), get_user(), ...)

- fail_futex

  injects futex deadlock and uaddr fault errors.

- fail_make_request

  injects disk IO errors on devices permitted by setting
  /sys/block/<device>/make-it-fail or
  /sys/block/<device>/<partition>/make-it-fail. (submit_bio_noacct())

- fail_mmc_request

  injects MMC data errors on devices permitted by setting
  debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request

- fail_function

  injects error returns into specific functions, which are marked by the
  ALLOW_ERROR_INJECTION() macro, by setting debugfs entries
  under /sys/kernel/debug/fail_function. No boot option supported.

- NVMe fault injection

  injects an NVMe status code and retry flag on devices permitted by setting
  debugfs entries under /sys/kernel/debug/nvme*/fault_inject. The default
  status code is NVME_SC_INVALID_OPCODE with no retry. The status code and
  retry flag can be set via debugfs.

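As a minimal sketch of the per-device opt-in described for disk IO errors,
the helper below writes 1 or 0 to a device's make-it-fail file. The function
name ``set_make_it_fail`` and the ``SYSFS_BLOCK`` override variable are
illustrative assumptions, not part of the kernel interface; only the
/sys/block/<device>/make-it-fail path comes from the list above.

```shell
#!/bin/sh
# Hypothetical helper: permit or forbid disk IO error injection on one device.
# SYSFS_BLOCK is an assumed override so the function can be tried against a
# scratch directory; it defaults to the real sysfs location.
SYSFS_BLOCK=${SYSFS_BLOCK:-/sys/block}

set_make_it_fail()
{
	dev=$1		# e.g. sda, or sda/sda1 for a partition
	val=${2:-1}	# 1 permits injection, 0 forbids it
	file=$SYSFS_BLOCK/$dev/make-it-fail

	if [ ! -e "$file" ]; then
		echo "$file not found" >&2
		return 1
	fi
	printf '%s\n' "$val" > "$file"
}

# Example (as root): set_make_it_fail sda 1
```

Checking for the file first gives a clear message on kernels where the
capability is not available, instead of a bare redirection error.
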
Configure fault-injection capabilities behavior
-----------------------------------------------

The fault-inject-debugfs kernel module provides debugfs entries for runtime
configuration of fault-injection capabilities.

- /sys/kernel/debug/fail*/probability:

	likelihood of failure injection, in percent.

	Note that one-failure-per-hundred is a very high error rate
	for some testcases. Consider setting probability=100 and configuring
	/sys/kernel/debug/fail*/interval for such testcases.

- /sys/kernel/debug/fail*/interval:

	specifies the interval between failures, for calls to
	should_fail() that pass all the other tests.

	Note that if you enable this, by setting interval>1, you will
	probably want to set probability=100.

- /sys/kernel/debug/fail*/times:

	specifies how many times failures may happen at most. A value of -1
	means "no limit".

- /sys/kernel/debug/fail*/space:

	specifies an initial resource "budget", decremented by "size"
	on each call to should_fail(,size). Failure injection is
	suppressed until "space" reaches zero.

- /sys/kernel/debug/fail*/verbose:

	specifies the verbosity of the messages when failure is
	injected. '0' means no messages; '1' will print only a single
	log line per failure; '2' will print a call trace too -- useful
	to debug the problems revealed by fault injection.

- /sys/kernel/debug/fail*/task-filter:

	Format: { 'Y' | 'N' }

	A value of 'N' disables filtering by process (default).
	A value of 'Y' limits failures to only those processes for which
	/proc/<pid>/make-it-fail is set to 1.

- /sys/kernel/debug/fail*/require-start,
  /sys/kernel/debug/fail*/require-end,
  /sys/kernel/debug/fail*/reject-start,
  /sys/kernel/debug/fail*/reject-end:

	specifies the range of virtual addresses tested during
	stacktrace walking. Failure is injected only if some caller
	in the walked stacktrace lies within the required range, and
	none lies within the rejected range.
	Default required range is [0,ULONG_MAX) (whole of virtual address space).
	Default rejected range is [0,0).

- /sys/kernel/debug/fail*/stacktrace-depth:

	specifies the maximum stacktrace depth walked during search
	for a caller within [require-start,require-end) OR
	[reject-start,reject-end).

- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:

	Format: { 'Y' | 'N' }

	The default is 'N'. Setting it to 'Y' suppresses failure injection
	into highmem/user allocations.

- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:

	Format: { 'Y' | 'N' }

	The default is 'N'. Setting it to 'Y' injects failures
	only into non-sleeping allocations (GFP_ATOMIC allocations).

- /sys/kernel/debug/fail_page_alloc/min-order:

	specifies the minimum page allocation order for which failures
	are injected.

- /sys/kernel/debug/fail_futex/ignore-private:

	Format: { 'Y' | 'N' }

	The default is 'N'. Setting it to 'Y' disables failure injection
	when dealing with private (address space) futexes.

- /sys/kernel/debug/fail_function/inject:

	Format: { 'function-name' | '!function-name' | '' }

	specifies the target function of error injection by name.
	If the function name is prefixed with '!', the given function is
	removed from the injection list. If nothing is specified (''),
	the injection list is cleared.

- /sys/kernel/debug/fail_function/injectable:

	(read only) shows the error-injectable functions and which type of
	error values can be specified for each. The error type is one of
	the following:

	- NULL: retval must be 0.
	- ERRNO: retval must be -1 to -MAX_ERRNO (-4095).
	- ERR_NULL: retval must be 0 or -1 to -MAX_ERRNO (-4095).

- /sys/kernel/debug/fail_function/<function-name>/retval:

	specifies the "error" return value to inject into the given function.
	This file is created when the user specifies a new injection entry.
	Note that this file only accepts unsigned values. So, if you want to
	use a negative errno, it is better to use 'printf' instead of 'echo', e.g.::

		$ printf %#x -12 > retval

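The generic entries listed above (probability, interval, times, space,
verbose and so on) are plain files, so a capability can be configured with a
short loop. The following is only a sketch: the function name
``fault_configure`` and the ``FAULT_DEBUGFS`` override variable are
assumptions made for illustration, and the values in the usage comment are
arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: apply several "attribute=value" settings to one
# fault-injection capability directory under debugfs.
# FAULT_DEBUGFS is an assumed override for trying the function against a
# scratch directory; it defaults to the usual debugfs mount point.
FAULT_DEBUGFS=${FAULT_DEBUGFS:-/sys/kernel/debug}

fault_configure()
{
	failtype=$1	# e.g. failslab, fail_page_alloc
	shift
	for setting in "$@"; do
		attr=${setting%%=*}	# text before the first '='
		value=${setting#*=}	# text after the first '='
		printf '%s\n' "$value" > "$FAULT_DEBUGFS/$failtype/$attr" || return 1
	done
}

# Example (as root): fail every 10th eligible slab allocation, without limit,
# following the advice to pair interval>1 with probability=100:
#   fault_configure failslab probability=100 interval=10 times=-1 verbose=1
```

The helper stops at the first write that fails, so a misspelled attribute
name is reported instead of being skipped silently.
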
In order to inject faults while debugfs is not available (early boot time),
use the boot option::

	mmc_core.fail_request=<interval>,<probability>,<space>,<times>

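The boot parameter value is a comma-separated list in a fixed order, which
is easy to get wrong by hand. A small sketch that assembles it, assuming a
hypothetical helper name ``fault_boot_param``; the field order is taken from
the format shown above and the sample values are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: build a fault-injection boot parameter string.
# Field order is <interval>,<probability>,<space>,<times>.
fault_boot_param()
{
	name=$1; interval=$2; probability=$3; space=$4; times=$5
	printf '%s=%s,%s,%s,%s\n' "$name" "$interval" "$probability" "$space" "$times"
}

# Every 10th eligible MMC request fails, with no limit on the failure count.
# Prints: mmc_core.fail_request=10,100,0,-1
fault_boot_param mmc_core.fail_request 10 100 0 -1
```

The resulting string is meant to be appended to the kernel command line.
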
- /proc/<pid>/fail-nth,
  /proc/self/task/<tid>/fail-nth:

	Writing an integer N to this file makes the N-th call in the task fail.
	Reading from this file returns an integer value. A value of '0' indicates
	that the fault set up by a previous write to this file was injected.
	A positive integer N indicates that the fault has not yet been injected.
	Note that this file enables all types of faults (slab, futex, etc).
	This setting takes precedence over all other generic debugfs settings
	like probability, interval, times, etc. But per-capability settings
	(e.g. fail_futex/ignore-private) take precedence over it.

	This feature is intended for systematic testing of faults in a single
	system call. See an example below.

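For a quick experiment from the shell, fail-nth can be armed in a subshell
that then execs the command under test, so the setting stays with the same
task. This is a rough sketch only: the function name ``run_with_fail_nth``
and the ``FAIL_NTH_FILE`` override are assumptions for illustration, and the
injected fault may land in exec or process start-up rather than in the call
of interest, so it is less precise than arming fail-nth from inside the
program.

```shell
#!/bin/sh
# Hypothetical helper: arm fail-nth for a fresh task, then exec a command in
# that same task so the armed setting applies to it.
# FAIL_NTH_FILE is an assumed override for dry runs; by default the real
# per-task file of the subshell is used.
run_with_fail_nth()
{
	(
		n=$1
		shift
		# /proc/self here refers to the subshell, which exec replaces below.
		printf '%s\n' "$n" > "${FAIL_NTH_FILE:-/proc/self/fail-nth}" || exit 1
		exec "$@"
	)
}

# Example (on a kernel with fault injection enabled):
#   run_with_fail_nth 5 ls /tmp
```

The function returns the exit status of the command, which makes it usable
in a loop over increasing values of N.
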
How to add new fault injection capability
-----------------------------------------

- #include <linux/fault-inject.h>

- define the fault attributes

  DECLARE_FAULT_ATTR(name);

  Please see the definition of struct fault_attr in fault-inject.h
  for details.

- provide a way to configure fault attributes

  - boot option

    If you need to enable the fault injection capability from boot time, you can
    provide a boot option to configure it. There is a helper function for it:

      setup_fault_attr(attr, str);

  - debugfs entries

    failslab, fail_page_alloc, fail_usercopy, and fail_make_request use this way.
    Helper function:

      fault_create_debugfs_attr(name, parent, attr);

  - module parameters

    If the scope of the fault injection capability is limited to a
    single kernel module, it is better to provide module parameters to
    configure the fault attributes.

- add a hook to insert failures

  Upon should_fail() returning true, client code should inject a failure:

    should_fail(attr, size);

- Inject slab allocation failures into module init/exit code::

	#!/bin/bash

	FAILTYPE=failslab
	echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
	echo 10 > /sys/kernel/debug/$FAILTYPE/probability
	echo 100 > /sys/kernel/debug/$FAILTYPE/interval
	echo -1 > /sys/kernel/debug/$FAILTYPE/times
	echo 0 > /sys/kernel/debug/$FAILTYPE/space
	echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
	echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait

	# Mark the task as eligible for injection, then exec the command in it.
	faulty_system()
	{
		bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
	}

	if [ $# -eq 0 ]; then
		echo "Usage: $0 modulename [ modulename ... ]"
		exit 1
	fi

	for m in "$@"; do
		faulty_system modprobe $m
		faulty_system modprobe -r $m
	done

------------------------------------------------------------------------------

- Inject page allocation failures only for a specific module::

	#!/bin/bash

	FAILTYPE=fail_page_alloc
	module=$1

	if [ -z "$module" ]; then
		echo "Usage: $0 <modulename>"
		exit 1
	fi

	modprobe $module

	if [ ! -d /sys/module/$module/sections ]; then
		echo Module $module is not loaded
		exit 1
	fi

	# Restrict injection to callers located in the module's text.
	cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
	cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end

	echo N > /sys/kernel/debug/$FAILTYPE/task-filter
	echo 10 > /sys/kernel/debug/$FAILTYPE/probability
	echo 100 > /sys/kernel/debug/$FAILTYPE/interval
	echo -1 > /sys/kernel/debug/$FAILTYPE/times
	echo 0 > /sys/kernel/debug/$FAILTYPE/space
	echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
	echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
	echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
	echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth

	# Stop injecting when the script is interrupted or exits.
	trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT

	echo "Injecting errors into the module $module... (interrupt to stop)"
	sleep 1000000

------------------------------------------------------------------------------

- Inject an open_ctree error while mounting btrfs::

	#!/bin/bash

	rm -f testfile.img
	dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1
	DEVICE=$(losetup --show -f testfile.img)
	mkfs.btrfs -f $DEVICE
	mkdir -p tmpmnt

	FAILTYPE=fail_function
	FAILFUNC=open_ctree
	echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
	# retval only accepts unsigned values, so encode -12 with printf.
	printf %#x -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
	echo N > /sys/kernel/debug/$FAILTYPE/task-filter
	echo 100 > /sys/kernel/debug/$FAILTYPE/probability
	echo 0 > /sys/kernel/debug/$FAILTYPE/interval
	echo -1 > /sys/kernel/debug/$FAILTYPE/times
	echo 0 > /sys/kernel/debug/$FAILTYPE/space
	echo 1 > /sys/kernel/debug/$FAILTYPE/verbose

	# The mount is expected to fail because of the injected error.
	if mount -t btrfs $DEVICE tmpmnt; then
		echo "FAILED!"
		umount tmpmnt
	else
		echo "SUCCESS!"
	fi

	# Clear the injection list and clean up.
	echo > /sys/kernel/debug/$FAILTYPE/inject
	rmdir tmpmnt
	losetup -d $DEVICE
	rm testfile.img

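The printf idiom used for retval converts a negative errno into the unsigned
hexadecimal form that the file accepts. A minimal sketch of that conversion
on its own, assuming a hypothetical helper name ``errno_to_retval`` and a
shell whose printf works with 64-bit integers (as bash does on 64-bit
systems):

```shell
#!/bin/sh
# Hypothetical helper: print the unsigned hexadecimal form of a negated errno,
# suitable for writing to a fail_function retval file.
errno_to_retval()
{
	printf '%#x\n' "-$1"
}

# 12 is ENOMEM on Linux.
# Prints 0xfffffffffffffff4 when printf uses 64-bit integers.
errno_to_retval 12
```

Passing the positive errno number and negating it inside the helper keeps
the call site readable and avoids a leading '-' being taken for an option.
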
Tool to run command with failslab or fail_page_alloc
----------------------------------------------------

In order to make it easier to accomplish the tasks mentioned above, we can use
tools/testing/fault-injection/failcmd.sh. Please run the command
"./tools/testing/fault-injection/failcmd.sh --help" for more information and
see the following examples.

Run the command "make -C tools/testing/selftests/ run_tests" with slab
allocation failures injected::

	# ./tools/testing/fault-injection/failcmd.sh \
		-- make -C tools/testing/selftests/ run_tests

Same as above, but allow at most 100 failures instead of at most one::

	# ./tools/testing/fault-injection/failcmd.sh --times=100 \
		-- make -C tools/testing/selftests/ run_tests

Same as above, but inject page allocation failures instead of slab
allocation failures::

	# env FAILCMD_TYPE=fail_page_alloc \
		./tools/testing/fault-injection/failcmd.sh --times=100 \
		-- make -C tools/testing/selftests/ run_tests

Systematic faults using fail-nth
---------------------------------

The following code systematically injects a fault into the first, second,
third, and so on fault-capable call made during the socketpair() system
call::

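	#include <sys/types.h>
	#include <sys/stat.h>
	#include <sys/socket.h>
	#include <sys/syscall.h>
	#include <errno.h>
	#include <fcntl.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <unistd.h>

	int main(void)
	{
		int i, err, res, fail_nth, fds[2];
		char buf[128];

		system("echo N > /sys/kernel/debug/failslab/ignore-gfp-wait");
		sprintf(buf, "/proc/self/task/%ld/fail-nth", syscall(SYS_gettid));
		fail_nth = open(buf, O_RDWR);
		for (i = 1;; i++) {
			/* Arm the i-th fault, run the call, then read the state back. */
			sprintf(buf, "%d", i);
			write(fail_nth, buf, strlen(buf));
			res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
			err = errno;
			pread(fail_nth, buf, sizeof(buf), 0);
			printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
			       res, err);
			/* Non-zero means the fault was not injected: stop. */
			if (atoi(buf))
				break;
		}
		return 0;
	}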

An example output::

	1-th fault Y: res=-1/23
	2-th fault Y: res=-1/23
	3-th fault Y: res=-1/12
	4-th fault Y: res=-1/12
	5-th fault Y: res=-1/23
	6-th fault Y: res=-1/23
	7-th fault Y: res=-1/23
	8-th fault Y: res=-1/12
	9-th fault Y: res=-1/12
	10-th fault Y: res=-1/12
	11-th fault Y: res=-1/12
	12-th fault Y: res=-1/12
	13-th fault Y: res=-1/12
	14-th fault Y: res=-1/12
	15-th fault Y: res=-1/12
	16-th fault N: res=0/12

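Output of this form is easier to review when grouped by errno (on Linux, 12
is ENOMEM and 23 is ENFILE). The following sketch counts the injected faults
per errno value; the function name ``summarize_fail_nth`` is hypothetical
and the parsing assumes exactly the line format printed above.

```shell
#!/bin/sh
# Hypothetical helper: summarize lines of the form
#   "<n>-th fault <Y|N>: res=<res>/<errno>"
# read from standard input.
summarize_fail_nth()
{
	awk '
		# Count only lines where the fault was actually injected.
		$3 == "Y:" {
			injected++
			by_errno[substr($4, index($4, "/") + 1)]++
		}
		END {
			printf "injected=%d\n", injected
			for (e in by_errno)
				printf "errno %s: %d\n", e, by_errno[e]
		}
	' | sort
}

# Example: ./a.out | summarize_fail_nth
```

The final sort only makes the order of the lines stable, since awk does not
guarantee the iteration order of an array.
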