What:           /sys/bus/platform/devices/smpro-errmon.*/error_[core|mem|pcie|other]_[ce|ue]
KernelVersion:  6.1
Contact:        Quan Nguyen <quan@os.amperecomputing.com>
Description:
                (RO) Contains the 48-byte Ampere (Vendor-Specific) Error Record, printed
                in hex format according to the table below:

                +--------+---------------+--------------+------------------------------------------------------------+
                | Offset |     Field     | Size (bytes) |                     Description                            |
                +--------+---------------+--------------+------------------------------------------------------------+
                | 00     | Error Type    | 1            | See :ref:`the table below <smpro-error-types>` for details |
                +--------+---------------+--------------+------------------------------------------------------------+
                | 01     | Subtype       | 1            | See :ref:`the table below <smpro-error-types>` for details |
                +--------+---------------+--------------+------------------------------------------------------------+
                | 02     | Instance      | 2            | See :ref:`the table below <smpro-error-types>` for details |
                +--------+---------------+--------------+------------------------------------------------------------+
                | 04     | Error status  | 4            | See ARM RAS specification for details                      |
                +--------+---------------+--------------+------------------------------------------------------------+
                | 08     | Error Address | 8            | See ARM RAS specification for details                      |
                +--------+---------------+--------------+------------------------------------------------------------+
                | 16     | Error Misc 0  | 8            | See ARM RAS specification for details                      |
                +--------+---------------+--------------+------------------------------------------------------------+
                | 24     | Error Misc 1  | 8            | See ARM RAS specification for details                      |
                +--------+---------------+--------------+------------------------------------------------------------+
                | 32     | Error Misc 2  | 8            | See ARM RAS specification for details                      |
                +--------+---------------+--------------+------------------------------------------------------------+
                | 40     | Error Misc 3  | 8            | See ARM RAS specification for details                      |
                +--------+---------------+--------------+------------------------------------------------------------+

                The table below defines the error type values and their subtypes, subcomponents, and instances:

                .. _smpro-error-types:

                +-----------------+------------+----------+----------------+----------------------------------------+
                |   Error Group   | Error Type | Subtype  | Subcomponent   |               Instance                 |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | CPM (core)      | 0          | 0        | Snoop-Logic    | CPM #                                  |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | CPM (core)      | 0          | 2        | Armv8 Core 1   | CPM #                                  |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | MCU (mem)       | 1          | 1        | ERR1           | MCU # \| SLOT << 11                    |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | MCU (mem)       | 1          | 2        | ERR2           | MCU # \| SLOT << 11                    |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | MCU (mem)       | 1          | 3        | ERR3           | MCU #                                  |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | MCU (mem)       | 1          | 4        | ERR4           | MCU #                                  |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | MCU (mem)       | 1          | 5        | ERR5           | MCU #                                  |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | MCU (mem)       | 1          | 6        | ERR6           | MCU #                                  |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | MCU (mem)       | 1          | 7        | Link Error     | MCU #                                  |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | Mesh (other)    | 2          | 0        | Cross Point    | X \| (Y << 5) \| NS <<11               |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | Mesh (other)    | 2          | 1        | Home Node(IO)  | X \| (Y << 5) \| NS <<11               |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | Mesh (other)    | 2          | 2        | Home Node(Mem) | X \| (Y << 5) \| NS <<11 \| device<<12 |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | Mesh (other)    | 2          | 4        | CCIX Node      | X \| (Y << 5) \| NS <<11               |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | 2P Link (other) | 3          | 0        | N/A            | Altra 2P Link #                        |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | GIC (other)     | 5          | 0        | ERR0           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | GIC (other)     | 5          | 1        | ERR1           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | GIC (other)     | 5          | 2        | ERR2           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | GIC (other)     | 5          | 3        | ERR3           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | GIC (other)     | 5          | 4        | ERR4           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | GIC (other)     | 5          | 5        | ERR5           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | GIC (other)     | 5          | 6        | ERR6           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | GIC (other)     | 5          | 7        | ERR7           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | GIC (other)     | 5          | 8        | ERR8           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | GIC (other)     | 5          | 9        | ERR9           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | GIC (other)     | 5          | 10       | ERR10          | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | GIC (other)     | 5          | 11       | ERR11          | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | GIC (other)     | 5          | 12       | ERR12          | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | GIC (other)     | 5          | 13-21    | ERR13          | RC # + 1                               |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | SMMU (other)    | 6          | 100      | TCU            | RC #                                   |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | SMMU (other)    | 6          | 0        | TBU0           | RC #                                   |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | SMMU (other)    | 6          | 1        | TBU1           | RC #                                   |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | SMMU (other)    | 6          | 2        | TBU2           | RC #                                   |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | SMMU (other)    | 6          | 3        | TBU3           | RC #                                   |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | SMMU (other)    | 6          | 4        | TBU4           | RC #                                   |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | SMMU (other)    | 6          | 5        | TBU5           | RC #                                   |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | SMMU (other)    | 6          | 6        | TBU6           | RC #                                   |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | SMMU (other)    | 6          | 7        | TBU7           | RC #                                   |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | SMMU (other)    | 6          | 8        | TBU8           | RC #                                   |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | SMMU (other)    | 6          | 9        | TBU9           | RC #                                   |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | PCIe AER (pcie) | 7          | 0        | Root           | RC #                                   |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | PCIe AER (pcie) | 7          | 1        | Device         | RC #                                   |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | PCIe RC (pcie)  | 8          | 0        | RCA HB         | RC #                                   |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | PCIe RC (pcie)  | 8          | 1        | RCB HB         | RC #                                   |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | PCIe RC (pcie)  | 8          | 8        | RASDP          | RC #                                   |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | OCM (other)     | 9          | 0        | ERR0           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | OCM (other)     | 9          | 1        | ERR1           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | OCM (other)     | 9          | 2        | ERR2           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | SMpro (other)   | 10         | 0        | ERR0           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | SMpro (other)   | 10         | 1        | ERR1           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | SMpro (other)   | 10         | 2        | MPA_ERR        | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | PMpro (other)   | 11         | 0        | ERR0           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | PMpro (other)   | 11         | 1        | ERR1           | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
                | PMpro (other)   | 11         | 2        | MPA_ERR        | 0                                      |
                +-----------------+------------+----------+----------------+----------------------------------------+
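
The packed Instance expressions in the table above (``MCU # | SLOT << 11`` and ``X | (Y << 5) | NS << 11 | device << 12``) can be unpacked with shifts and masks. The sketch below is illustrative only: the function names are hypothetical, and the field widths are an assumption inferred from the shift amounts (each field is taken to end where the next one begins, within the 2-byte Instance field). Check them against the Altra Family RAS Supplement before relying on them.

```python
def decode_mcu_instance(instance: int) -> dict:
    """Split an MCU instance laid out as MCU # | SLOT << 11 (subtypes 1 and 2).

    Assumption: MCU # occupies bits 0-10 and SLOT the remaining upper bits.
    """
    return {"mcu": instance & 0x7FF, "slot": (instance >> 11) & 0x1F}


def decode_mesh_instance(instance: int, has_device: bool = False) -> dict:
    """Split a Mesh instance laid out as X | (Y << 5) | NS << 11 [| device << 12].

    Assumption: X is bits 0-4, Y is bits 5-10, NS is bit 11 and, for
    Home Node(Mem) only, device is bits 12-15.
    """
    fields = {
        "x": instance & 0x1F,
        "y": (instance >> 5) & 0x3F,
        "ns": (instance >> 11) & 0x1,
    }
    if has_device:
        fields["device"] = (instance >> 12) & 0xF
    return fields
```

For example, ``decode_mcu_instance(3 | (2 << 11))`` yields MCU 3 in slot 2 under these assumptions.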

                Example::

                 # cat error_other_ue
                 880807001e004010401040101500000001004010401040100c0000000000000000000000000000000000000000000000
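
A user-space reader could split such a hex dump into the fields listed in the record layout table. The following is a minimal sketch, not the driver's own logic: ``parse_error_record`` and ``ErrorRecord`` are hypothetical names, and little-endian byte order for the multi-byte fields is an assumption (this document does not state the byte order), so it is exposed as a parameter.

```python
import struct
from typing import NamedTuple, Tuple

RECORD_SIZE = 48  # bytes, per the record layout table


class ErrorRecord(NamedTuple):
    error_type: int
    subtype: int
    instance: int
    status: int
    address: int
    misc: Tuple[int, ...]  # Error Misc 0..3


def parse_error_record(hex_text: str, byte_order: str = "<") -> ErrorRecord:
    """Decode one record; byte_order "<" (little-endian) is an assumption."""
    raw = bytes.fromhex(hex_text.strip())
    if len(raw) != RECORD_SIZE:
        raise ValueError(f"expected {RECORD_SIZE} bytes, got {len(raw)}")
    # B=1 byte, H=2 bytes, I=4 bytes, Q=8 bytes: 1+1+2+4+5*8 = 48 bytes.
    # An explicit byte order disables struct's alignment padding.
    error_type, subtype, instance, status, address, *misc = struct.unpack(
        byte_order + "BBHI5Q", raw
    )
    return ErrorRecord(error_type, subtype, instance, status, address, tuple(misc))
```

Typical use would be ``parse_error_record(open(path).read())`` on one of the ``error_*`` files; pass ``byte_order=">"`` if the platform turns out to use big-endian fields.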

                The details of each sysfs entry are as follows:

                +-------------+---------------------------------------------------------+----------------------------------+
                |   Error     |                   Sysfs entry                           |   Description (when triggered)   |
                +-------------+---------------------------------------------------------+----------------------------------+
                | Core's CE   | /sys/bus/platform/devices/smpro-errmon.*/error_core_ce  | Core has CE error                |
                +-------------+---------------------------------------------------------+----------------------------------+
                | Core's UE   | /sys/bus/platform/devices/smpro-errmon.*/error_core_ue  | Core has UE error                |
                +-------------+---------------------------------------------------------+----------------------------------+
                | Memory's CE | /sys/bus/platform/devices/smpro-errmon.*/error_mem_ce   | Memory has CE error              |
                +-------------+---------------------------------------------------------+----------------------------------+
                | Memory's UE | /sys/bus/platform/devices/smpro-errmon.*/error_mem_ue   | Memory has UE error              |
                +-------------+---------------------------------------------------------+----------------------------------+
                | PCIe's CE   | /sys/bus/platform/devices/smpro-errmon.*/error_pcie_ce  | Any PCIe controller has CE error |
                +-------------+---------------------------------------------------------+----------------------------------+
                | PCIe's UE   | /sys/bus/platform/devices/smpro-errmon.*/error_pcie_ue  | Any PCIe controller has UE error |
                +-------------+---------------------------------------------------------+----------------------------------+
                | Other's CE  | /sys/bus/platform/devices/smpro-errmon.*/error_other_ce | Any other CE error               |
                +-------------+---------------------------------------------------------+----------------------------------+
                | Other's UE  | /sys/bus/platform/devices/smpro-errmon.*/error_other_ue | Any other UE error               |
                +-------------+---------------------------------------------------------+----------------------------------+

                where:

                  - UE: Uncorrectable Error
                  - CE: Correctable Error

                For details, see section `3.3 Ampere (Vendor-Specific) Error Record Formats,
                Altra Family RAS Supplement`.

What:           /sys/bus/platform/devices/smpro-errmon.*/overflow_[core|mem|pcie|other]_[ce|ue]
KernelVersion:  6.1
Contact:        Quan Nguyen <quan@os.amperecomputing.com>
Description:
                (RO) Returns the overflow status for each type of reported HW error:

                  - 0      : No overflow
                  - 1      : An overflow occurred and the oldest HW errors were dropped

                The details of each sysfs entry are as follows:

                +-------------+------------------------------------------------------------+---------------------------------------+
                |   Overflow  |                   Sysfs entry                              |             Description               |
                +-------------+------------------------------------------------------------+---------------------------------------+
                | Core's CE   | /sys/bus/platform/devices/smpro-errmon.*/overflow_core_ce  | Core CE error overflow                |
                +-------------+------------------------------------------------------------+---------------------------------------+
                | Core's UE   | /sys/bus/platform/devices/smpro-errmon.*/overflow_core_ue  | Core UE error overflow                |
                +-------------+------------------------------------------------------------+---------------------------------------+
                | Memory's CE | /sys/bus/platform/devices/smpro-errmon.*/overflow_mem_ce   | Memory CE error overflow              |
                +-------------+------------------------------------------------------------+---------------------------------------+
                | Memory's UE | /sys/bus/platform/devices/smpro-errmon.*/overflow_mem_ue   | Memory UE error overflow              |
                +-------------+------------------------------------------------------------+---------------------------------------+
                | PCIe's CE   | /sys/bus/platform/devices/smpro-errmon.*/overflow_pcie_ce  | Any PCIe controller CE error overflow |
                +-------------+------------------------------------------------------------+---------------------------------------+
                | PCIe's UE   | /sys/bus/platform/devices/smpro-errmon.*/overflow_pcie_ue  | Any PCIe controller UE error overflow |
                +-------------+------------------------------------------------------------+---------------------------------------+
                | Other's CE  | /sys/bus/platform/devices/smpro-errmon.*/overflow_other_ce | Any other CE error overflow           |
                +-------------+------------------------------------------------------------+---------------------------------------+
                | Other's UE  | /sys/bus/platform/devices/smpro-errmon.*/overflow_other_ue | Any other UE error overflow           |
                +-------------+------------------------------------------------------------+---------------------------------------+

                where:

                  - UE: Uncorrectable Error
                  - CE: Correctable Error
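
A monitoring script might collect all eight overflow flags in one pass. This is a rough sketch under stated assumptions: ``read_overflow_flags`` is a hypothetical helper, the caller supplies the resolved ``smpro-errmon.*`` device directory, and each file is assumed to contain the plain text ``0`` or ``1`` described above.

```python
from pathlib import Path

GROUPS = ("core", "mem", "pcie", "other")
SEVERITIES = ("ce", "ue")


def read_overflow_flags(device_dir) -> dict:
    """Return {attribute name: True if an overflow was reported}.

    device_dir is the resolved smpro-errmon.* directory (caller-supplied).
    Assumes each attribute reads back as the text "0" or "1".
    """
    flags = {}
    for group in GROUPS:
        for severity in SEVERITIES:
            name = f"overflow_{group}_{severity}"
            text = (Path(device_dir) / name).read_text().strip()
            flags[name] = int(text) == 1
    return flags
```

Any entry that comes back ``True`` indicates that the corresponding ``error_*`` file no longer reflects every error that occurred.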

What:           /sys/bus/platform/devices/smpro-errmon.*/[error|warn]_[smpro|pmpro]
KernelVersion:  6.1
Contact:        Quan Nguyen <quan@os.amperecomputing.com>
Description:
                (RO) Contains the internal firmware error/warning, printed in hex format.

                The details of each sysfs entry are as follows:

                +---------------+------------------------------------------------------+--------------------------+
                |   Error       |                   Sysfs entry                        |        Description       |
                +---------------+------------------------------------------------------+--------------------------+
                | SMpro error   | /sys/bus/platform/devices/smpro-errmon.*/error_smpro | System has SMpro error   |
                +---------------+------------------------------------------------------+--------------------------+
                | SMpro warning | /sys/bus/platform/devices/smpro-errmon.*/warn_smpro  | System has SMpro warning |
                +---------------+------------------------------------------------------+--------------------------+
                | PMpro error   | /sys/bus/platform/devices/smpro-errmon.*/error_pmpro | System has PMpro error   |
                +---------------+------------------------------------------------------+--------------------------+
                | PMpro warning | /sys/bus/platform/devices/smpro-errmon.*/warn_pmpro  | System has PMpro warning |
                +---------------+------------------------------------------------------+--------------------------+

                For details, see section `5.10 RAS Internal Error Register Definitions,
                Altra Family SoC BMC Interface Specification`.

What:           /sys/bus/platform/devices/smpro-errmon.*/event_[vrd_warn_fault|vrd_hot|dimm_hot|dimm_2x_refresh]
KernelVersion:  6.1 (event_[vrd_warn_fault|vrd_hot|dimm_hot]), 6.4 (event_dimm_2x_refresh)
Contact:        Quan Nguyen <quan@os.amperecomputing.com>
Description:
                (RO) Contains detailed information about VRD/DIMM warning/hot events,
                in hex format as below::

                    AAAA

                where:

                  - ``AAAA``: The event detail data

                The details of each sysfs entry are as follows:

                +---------------+----------------------------------------------------------------+----------------------+
                |   Event       |                        Sysfs entry                             |     Description      |
                +---------------+----------------------------------------------------------------+----------------------+
                | VRD HOT       | /sys/bus/platform/devices/smpro-errmon.*/event_vrd_hot         | VRD Hot              |
                +---------------+----------------------------------------------------------------+----------------------+
                | VR Warn/Fault | /sys/bus/platform/devices/smpro-errmon.*/event_vrd_warn_fault  | VR Warning or Fault  |
                +---------------+----------------------------------------------------------------+----------------------+
                | DIMM HOT      | /sys/bus/platform/devices/smpro-errmon.*/event_dimm_hot        | DIMM Hot             |
                +---------------+----------------------------------------------------------------+----------------------+
                | DIMM 2X       | /sys/bus/platform/devices/smpro-errmon.*/event_dimm_2x_refresh | DIMM 2x refresh rate |
                | REFRESH RATE  |                                                                | event in high temp   |
                +---------------+----------------------------------------------------------------+----------------------+

                For more details, see sections `5.7 GPI Status Registers and 5.9 Memory Error Register Definitions,
                Altra Family SoC BMC Interface Specification`.

What:           /sys/bus/platform/devices/smpro-errmon.*/event_dimm[0-15]_syndrome
KernelVersion:  6.4
Contact:        Quan Nguyen <quan@os.amperecomputing.com>
Description:
                (RO) Returns the 2-byte DIMM failure syndrome data for the DIMM in
                slot 0-15 if that DIMM failed to initialize.

                For more details, see section `5.11 Boot Stage Register Definitions,
                Altra Family SoC BMC Interface Specification`.

What:           /sys/bus/platform/devices/smpro-misc.*/boot_progress
KernelVersion:  6.1
Contact:        Quan Nguyen <quan@os.amperecomputing.com>
Description:
                (RO) Contains the boot stage information in hex, in the format below::

                    AABBCCCCCCCC

                where:

                  - ``AA``      : The boot stage

                    - 00: SMpro firmware booting
                    - 01: PMpro firmware booting
                    - 02: ATF BL1 firmware booting
                    - 03: DDR initialization
                    - 04: DDR training report status
                    - 05: ATF BL2 firmware booting
                    - 06: ATF BL31 firmware booting
                    - 07: ATF BL32 firmware booting
                    - 08: UEFI firmware booting
                    - 09: OS booting

                  - ``BB``      : Boot status

                    - 00: Not started
                    - 01: Started
                    - 02: Completed without error
                    - 03: Failed

                  - ``CCCCCCCC``: Boot status information defined for each boot stage

                For details, see section `5.11 Boot Stage Register Definitions`
                and section `6. Processor Boot Progress Codes, Altra Family SoC BMC
                Interface Specification`.
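
The ``AABBCCCCCCCC`` layout above can be split by character position. A minimal sketch follows; ``parse_boot_progress`` is a hypothetical helper, the lookup tables simply mirror the lists above, and the stage-specific ``CCCCCCCC`` field is returned as the integer value of its hex digits exactly as printed, with no further interpretation.

```python
BOOT_STAGES = {
    0x00: "SMpro firmware booting",
    0x01: "PMpro firmware booting",
    0x02: "ATF BL1 firmware booting",
    0x03: "DDR initialization",
    0x04: "DDR training report status",
    0x05: "ATF BL2 firmware booting",
    0x06: "ATF BL31 firmware booting",
    0x07: "ATF BL32 firmware booting",
    0x08: "UEFI firmware booting",
    0x09: "OS booting",
}

BOOT_STATUS = {
    0x00: "Not started",
    0x01: "Started",
    0x02: "Completed without error",
    0x03: "Failed",
}


def parse_boot_progress(hex_text: str) -> dict:
    """Split an AABBCCCCCCCC string into stage, status and info."""
    text = hex_text.strip()
    if len(text) != 12:
        raise ValueError(f"expected 12 hex digits, got {len(text)}")
    stage = int(text[0:2], 16)    # AA
    status = int(text[2:4], 16)   # BB
    info = int(text[4:12], 16)    # CCCCCCCC, meaning depends on the stage
    return {
        "stage": BOOT_STAGES.get(stage, f"unknown (0x{stage:02x})"),
        "status": BOOT_STATUS.get(status, f"unknown (0x{status:02x})"),
        "info": info,
    }
```

Unknown stage or status codes are reported rather than rejected, since the specification referenced above remains the authority on the full set of values.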

What:           /sys/bus/platform/devices/smpro-misc.*/soc_power_limit
KernelVersion:  6.1
Contact:        Quan Nguyen <quan@os.amperecomputing.com>
Description:
                (RW) Contains the desired SoC power limit in watts.
                Writes to this sysfs entry set the desired SoC power limit (W).
                Reads from this sysfs entry return the current SoC power limit (W).
                The valid range is:

                  - Minimum: 120 W
                  - Maximum: Socket TDP power
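
A caller may want to check the range before writing a new limit. The sketch below is illustrative: both helpers are hypothetical, the value is assumed to be exchanged as a decimal integer string, and the socket TDP must be supplied by the caller because this document does not say where it can be read.

```python
from pathlib import Path

MIN_LIMIT_W = 120  # documented minimum


def set_soc_power_limit(attr_path, watts: int, socket_tdp_w: int) -> None:
    """Write a new limit after checking it against the documented range.

    socket_tdp_w is caller-supplied; the decimal text format is an assumption.
    """
    if not MIN_LIMIT_W <= watts <= socket_tdp_w:
        raise ValueError(
            f"{watts} W is outside the range {MIN_LIMIT_W}..{socket_tdp_w} W"
        )
    Path(attr_path).write_text(f"{watts}\n")


def get_soc_power_limit(attr_path) -> int:
    """Read back the current limit in watts (assumes decimal text)."""
    return int(Path(attr_path).read_text().strip())
```

Writing normally requires root privileges, and reading the value back after a write is a cheap way to confirm that the request took effect.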
324                   - Minimum: 120 W
325                   - Maximum: Socket TDP power