.. SPDX-License-Identifier: GPL-2.0+

.. |u8| replace:: :c:type:`u8 <u8>`
.. |u16| replace:: :c:type:`u16 <u16>`
.. |TYPE| replace:: ``TYPE``
.. |LEN| replace:: ``LEN``
.. |SEQ| replace:: ``SEQ``
.. |SYN| replace:: ``SYN``
.. |NAK| replace:: ``NAK``
.. |ACK| replace:: ``ACK``
.. |DATA| replace:: ``DATA``
.. |DATA_SEQ| replace:: ``DATA_SEQ``
.. |DATA_NSQ| replace:: ``DATA_NSQ``
.. |TC| replace:: ``TC``
.. |TID| replace:: ``TID``
.. |SID| replace:: ``SID``
.. |IID| replace:: ``IID``
.. |RQID| replace:: ``RQID``
.. |CID| replace:: ``CID``

===========================
Surface Serial Hub Protocol
===========================

The Surface Serial Hub (SSH) is the central communication interface for the
embedded Surface Aggregator Module controller (SAM or EC), found on newer
Surface generations. We will refer to this protocol and interface as
SAM-over-SSH, as opposed to SAM-over-HID for the older generations.

On Surface devices with SAM-over-SSH, SAM is connected to the host via UART
and defined in ACPI as a device with ID ``MSHW0084``. On these devices,
significant functionality is provided via SAM, including access to battery
and power information and events, thermal read-outs and events, and much
more. For Surface Laptops, keyboard input is handled via HID directed
through SAM. On the Surface Laptop 3 and Surface Book 3, this also includes
touchpad input.

Note that the standard disclaimer for this subsystem also applies to this
document: All of this has been reverse-engineered and may thus be erroneous
and/or incomplete.

All CRCs used in the following are two-byte ``crc_ccitt_false(0xffff, ...)``.
All multi-byte values are little-endian, and there is no implicit padding
between values.
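
As a rough illustration of the checksum named above, the following bitwise
routine is a minimal sketch and not the kernel's table-driven implementation.
It assumes the usual CCITT-FALSE parameters (polynomial ``0x1021``, most
significant bit first, no final XOR); the name ``ssh_crc()`` is hypothetical.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Hypothetical helper: bitwise CRC-16/CCITT-FALSE.
     * Assumed parameters: polynomial 0x1021, MSB first, no final XOR.
     * The seed 0xffff is the one given in the text above.
     */
    uint16_t ssh_crc(const uint8_t *buf, size_t len)
    {
        uint16_t crc = 0xffff;
        size_t i;
        int bit;

        for (i = 0; i < len; i++) {
            /* Fold the next byte into the upper half of the register. */
            crc ^= (uint16_t)(buf[i] << 8);
            for (bit = 0; bit < 8; bit++) {
                if (crc & 0x8000)
                    crc = (uint16_t)((crc << 1) ^ 0x1021);
                else
                    crc = (uint16_t)(crc << 1);
            }
        }
        return crc;
    }

With no input bytes the routine simply returns the seed ``0xffff``.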


SSH Packet Protocol: Definitions
================================

The fundamental communication unit of the SSH protocol is a frame
(:c:type:`struct ssh_frame <ssh_frame>`). A frame consists of the following
fields, packed together and in order:

.. flat-table:: SSH Frame
   :widths: 1 1 4
   :header-rows: 1

   * - Field
     - Type
     - Description

   * - |TYPE|
     - |u8|
     - Type identifier of the frame.

   * - |LEN|
     - |u16|
     - Length of the payload associated with the frame.

   * - |SEQ|
     - |u8|
     - Sequence ID (see explanation below).

Each frame structure (|TYPE|, |LEN|, and |SEQ| fields) is followed by a CRC
over this structure, placed directly after the frame structure and before
the payload. The payload is followed by its own CRC (over all payload
bytes). If the payload is not present (i.e. the frame has ``LEN=0``), the
CRC of the payload is still present and will evaluate to ``0xffff``. The
|LEN| field does not include any of the CRCs; it equals the number of bytes
between the CRC of the frame and the CRC of the payload.
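
The layout described above can be sketched as a small decoder for the four
packed frame bytes, together with the resulting on-wire size. This is only an
illustration; ``struct frame_hdr``, ``frame_hdr_parse()`` and
``frame_total_size()`` are hypothetical names, not the kernel's definitions.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Illustrative mirror of the frame fields (hypothetical type). */
    struct frame_hdr {
        uint8_t type;
        uint16_t len;   /* payload length, excluding both CRCs */
        uint8_t seq;
    };

    #define FRAME_HDR_SIZE 4u   /* TYPE (1) + LEN (2) + SEQ (1), no padding */
    #define CRC_SIZE       2u

    /* Decode the four packed frame bytes; LEN is little-endian. */
    struct frame_hdr frame_hdr_parse(const uint8_t *buf)
    {
        struct frame_hdr hdr;

        hdr.type = buf[0];
        hdr.len = (uint16_t)(buf[1] | (buf[2] << 8));
        hdr.seq = buf[3];
        return hdr;
    }

    /* Bytes from the start of the frame to the end of the payload CRC. */
    size_t frame_total_size(const struct frame_hdr *hdr)
    {
        return FRAME_HDR_SIZE + CRC_SIZE + hdr->len + CRC_SIZE;
    }

Note that a frame with ``LEN=0`` still occupies eight bytes, since both CRCs
are always present.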

Additionally, the following fixed two-byte sequences are used:

.. flat-table:: SSH Byte Sequences
   :widths: 1 1 4
   :header-rows: 1

   * - Name
     - Value
     - Description

   * - |SYN|
     - ``[0xAA, 0x55]``
     - Synchronization bytes.

A message consists of |SYN|, followed by the frame (|TYPE|, |LEN|, |SEQ| and
CRC) and, if specified in the frame (i.e. ``LEN > 0``), the payload bytes,
followed finally, regardless of whether a payload is present, by the payload
CRC. The messages corresponding to an exchange are, in part, identified by
having the same sequence ID (|SEQ|), stored inside the frame (more on this in
the next section). The sequence ID is a wrapping counter.
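
To make the byte order of a complete message concrete, the following sketch
assembles one into a caller-provided buffer. It is an illustration under
stated assumptions: the function names are hypothetical, the CRC helper
assumes the CCITT-FALSE polynomial ``0x1021`` (MSB first), and the CRCs are
assumed to be stored little-endian like all other multi-byte values.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Hypothetical CRC helper (assumed: poly 0x1021, MSB first, seed 0xffff). */
    uint16_t ssh_crc(const uint8_t *buf, size_t len)
    {
        uint16_t crc = 0xffff;
        size_t i;
        int bit;

        for (i = 0; i < len; i++) {
            crc ^= (uint16_t)(buf[i] << 8);
            for (bit = 0; bit < 8; bit++)
                crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                     : (uint16_t)(crc << 1);
        }
        return crc;
    }

    static void put_le16(uint8_t *dst, uint16_t v)
    {
        dst[0] = (uint8_t)(v & 0xff);
        dst[1] = (uint8_t)(v >> 8);
    }

    /*
     * Hypothetical encoder: writes SYN, frame, CRC(F), payload, CRC(P).
     * Returns the number of bytes written, or 0 if 'out' is too small.
     */
    size_t ssh_msg_encode(uint8_t type, uint8_t seq, const uint8_t *payload,
                          uint16_t len, uint8_t *out, size_t cap)
    {
        size_t n = 0;

        if (cap < 10u + (size_t)len)
            return 0;

        out[n++] = 0xAA;                        /* SYN */
        out[n++] = 0x55;
        out[n++] = type;                        /* frame: TYPE, LEN, SEQ */
        put_le16(&out[n], len);
        n += 2;
        out[n++] = seq;
        put_le16(&out[n], ssh_crc(&out[2], 4)); /* CRC over the frame */
        n += 2;
        if (len)
            memcpy(&out[n], payload, len);      /* payload, if any */
        n += len;
        put_le16(&out[n], ssh_crc(&out[n - len], len)); /* payload CRC */
        n += 2;
        return n;
    }

For a frame without payload, the message is ten bytes long and ends in
``0xff 0xff``, the CRC over zero payload bytes.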

A frame can have the following types
(:c:type:`enum ssh_frame_type <ssh_frame_type>`):

.. flat-table:: SSH Frame Types
   :widths: 1 1 4
   :header-rows: 1

   * - Name
     - Value
     - Short Description

   * - |NAK|
     - ``0x04``
     - Sent on error in previously received message.

   * - |ACK|
     - ``0x40``
     - Sent to acknowledge receipt of a |DATA| frame.

   * - |DATA_SEQ|
     - ``0x80``
     - Sent to transfer data. Sequenced.

   * - |DATA_NSQ|
     - ``0x00``
     - Same as |DATA_SEQ|, but does not need to be ACKed.

Both |NAK|- and |ACK|-type frames are used to control flow of messages and
thus do not carry a payload. |DATA_SEQ|- and |DATA_NSQ|-type frames on the
other hand must carry a payload. The flow sequence and interaction of
different frame types will be described in more depth in the next section.
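
The payload rule stated above can be expressed as a small consistency check.
The following is a sketch only; the enumerator names and
``frame_len_valid()`` are hypothetical, while the numeric values are taken
from the table above.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Values from the frame type table; names are illustrative. */
    enum frame_type {
        FRAME_TYPE_DATA_NSQ = 0x00,
        FRAME_TYPE_NAK      = 0x04,
        FRAME_TYPE_ACK      = 0x40,
        FRAME_TYPE_DATA_SEQ = 0x80,
    };

    /* True if LEN is consistent with the given frame type. */
    bool frame_len_valid(uint8_t type, uint16_t len)
    {
        switch (type) {
        case FRAME_TYPE_NAK:
        case FRAME_TYPE_ACK:
            return len == 0;    /* control frames carry no payload */
        case FRAME_TYPE_DATA_SEQ:
        case FRAME_TYPE_DATA_NSQ:
            return len > 0;     /* data frames must carry a payload */
        default:
            return false;       /* not a known frame type */
        }
    }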


SSH Packet Protocol: Flow Sequence
==================================

Each exchange begins with |SYN|, followed by a |DATA_SEQ|- or
|DATA_NSQ|-type frame, followed by its CRC, payload, and payload CRC. In
case of a |DATA_NSQ|-type frame, the exchange is then finished. In case of a
|DATA_SEQ|-type frame, the receiving party has to acknowledge receipt of
the frame by responding with a message containing an |ACK|-type frame with
the same sequence ID as the |DATA| frame. In other words, the sequence ID of
the |ACK| frame specifies the |DATA| frame to be acknowledged. In case of an
error, e.g. an invalid CRC, the receiving party responds with a message
containing a |NAK|-type frame. As the sequence ID of the previous data
frame, for which an error is indicated via the |NAK| frame, cannot be relied
upon, the sequence ID of the |NAK| frame should not be used and is set to
zero. After receipt of a |NAK| frame, the sending party should re-send all
outstanding (non-ACKed) messages.

Sequence IDs are not synchronized between the two parties, meaning that they
are managed independently for each party. Identifying the messages
corresponding to a single exchange thus relies on the sequence ID as well as
the type of the message, and the context. Specifically, the sequence ID is
used to associate an ``ACK`` with its ``DATA_SEQ``-type frame, but not
``DATA_SEQ``- or ``DATA_NSQ``-type frames with other ``DATA``-type frames.

An example exchange might look like this:

::

    tx: -- SYN FRAME(D) CRC(F) PAYLOAD CRC(P) -----------------------------
    rx: ------------------------------------- SYN FRAME(A) CRC(F) CRC(P) --

where both frames have the same sequence ID (``SEQ``). Here, ``FRAME(D)``
indicates a |DATA_SEQ|-type frame, ``FRAME(A)`` an ``ACK``-type frame,
``CRC(F)`` the CRC over the previous frame, and ``CRC(P)`` the CRC over the
previous payload. In case of an error, the exchange would look like this:

::

    tx: -- SYN FRAME(D) CRC(F) PAYLOAD CRC(P) -----------------------------
    rx: ------------------------------------- SYN FRAME(N) CRC(F) CRC(P) --

upon which the sender should re-send the message. ``FRAME(N)`` indicates a
|NAK|-type frame. Note that the sequence ID of the |NAK|-type frame is fixed
to zero. For |DATA_NSQ|-type frames, the exchange looks the same with and
without an error:

::

    tx: -- SYN FRAME(DATA_NSQ) CRC(F) PAYLOAD CRC(P) ----------------------
    rx: -------------------------------------------------------------------

Here, an error can be detected, but not corrected or indicated to the
sending party. These exchanges are symmetric, i.e. switching ``rx`` and
``tx`` results again in a valid exchange. Currently, no longer exchanges are
known.
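
The three exchanges above boil down to a simple decision on the receiving
side, sketched below. The names are hypothetical, and the sketch assumes, as
a simplification, that the frame type can still be read when a CRC check
fails.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    enum reply_kind { REPLY_NONE, REPLY_ACK, REPLY_NAK };

    /* Hypothetical description of the frame to send back, if any. */
    struct reply {
        enum reply_kind kind;
        uint8_t seq;
    };

    /* Decide how to answer a received data frame. */
    struct reply reply_for(uint8_t type, uint8_t seq, bool crc_ok)
    {
        struct reply r = { REPLY_NONE, 0 };

        if (type == 0x80) {             /* DATA_SEQ */
            if (crc_ok) {
                r.kind = REPLY_ACK;
                r.seq = seq;            /* ACK carries the SEQ of the data frame */
            } else {
                r.kind = REPLY_NAK;
                r.seq = 0;              /* SEQ of a NAK is fixed to zero */
            }
        }
        /* DATA_NSQ (0x00): nothing is sent back, with or without an error. */
        return r;
    }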


Commands: Requests, Responses, and Events
=========================================

Commands are sent as payload inside a data frame. Currently, this is the
only known payload type of |DATA| frames, with a payload-type value of
``0x80`` (:c:type:`SSH_PLD_TYPE_CMD <ssh_payload_type>`).

The command-type payload (:c:type:`struct ssh_command <ssh_command>`)
consists of an eight-byte command structure, followed by optional,
variable-length command data. The length of this optional data is derived
from the frame payload length given in the corresponding frame, i.e. it is
``frame.len - sizeof(struct ssh_command)``. The command struct contains the
following fields, packed together and in order:

.. flat-table:: SSH Command
   :widths: 1 1 4
   :header-rows: 1

   * - Field
     - Type
     - Description

   * - |TYPE|
     - |u8|
     - Type of the payload. For commands always ``0x80``.

   * - |TC|
     - |u8|
     - Target category.

   * - |TID|
     - |u8|
     - Target ID for commands/messages.

   * - |SID|
     - |u8|
     - Source ID for commands/messages.

   * - |IID|
     - |u8|
     - Instance ID.

   * - |RQID|
     - |u16|
     - Request ID.

   * - |CID|
     - |u8|
     - Command ID.

The command struct and data, in general, do not contain any failure
detection mechanism (e.g. CRCs); this is done solely on the frame level.
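
The eight-byte command structure can be serialized as in the following
sketch. ``struct cmd``, ``cmd_encode()`` and ``cmd_data_len()`` are
hypothetical names used for illustration only; the field order and sizes
follow the table above.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    #define CMD_HDR_SIZE 8u     /* size of the packed command structure */

    /* Illustrative, unpacked view of the command fields. */
    struct cmd {
        uint8_t tc;
        uint8_t tid;
        uint8_t sid;
        uint8_t iid;
        uint16_t rqid;
        uint8_t cid;
    };

    /* Write the packed command structure; RQID is little-endian. */
    void cmd_encode(const struct cmd *c, uint8_t out[CMD_HDR_SIZE])
    {
        out[0] = 0x80;                      /* TYPE: command payload */
        out[1] = c->tc;
        out[2] = c->tid;
        out[3] = c->sid;
        out[4] = c->iid;
        out[5] = (uint8_t)(c->rqid & 0xff);
        out[6] = (uint8_t)(c->rqid >> 8);
        out[7] = c->cid;
    }

    /* Length of the optional command data for a given frame LEN. */
    size_t cmd_data_len(uint16_t frame_len)
    {
        return frame_len >= CMD_HDR_SIZE ? frame_len - CMD_HDR_SIZE : 0;
    }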

Command-type payloads are used by the host to send commands and requests to
the EC as well as by the EC to send responses and events back to the host.
We differentiate between requests (sent by the host), responses (sent by the
EC in response to a request), and events (sent by the EC without a preceding
request).

Commands and events are uniquely identified by their target category
(``TC``) and command ID (``CID``). The target category specifies a general
category for the command (e.g. system in general, vs. battery and AC, vs.
temperature, and so on), while the command ID specifies the command inside
that category. Only the combination of |TC| + |CID| is unique. Additionally,
commands have an instance ID (``IID``), which is used to differentiate
between different sub-devices. For example, ``TC=3`` ``CID=1`` is a
request to get the temperature on a thermal sensor, where |IID| specifies
the respective sensor. If the instance ID is not used, it should be set to
zero. If instance IDs are used, they, in general, start with a value of one,
whereas zero may be used for instance-independent queries, if applicable. A
response to a request should have the same target category, command ID, and
instance ID as the corresponding request.

Responses are matched to their corresponding request via the request ID
(``RQID``) field. This is a 16-bit wrapping counter similar to the sequence
ID on the frames. Note that the sequence IDs of the frames for a
request-response pair need not match; only the request ID has to match.
On the frame-protocol level, these are two separate exchanges, and may even
be separated, e.g. by an event being sent after the request but before the
response. Not all commands produce a response, and this is not detectable by
|TC| + |CID|. It is the responsibility of the issuing party to wait for a
response (or signal this to the communication framework, as is done in
SAN/ACPI via the ``SNC`` flag).

Events are identified by unique and reserved request IDs. These IDs should
not be used by the host when sending a new request. They are used on the
host to, first, detect events and, second, match them with a registered
event handler. Request IDs for events are chosen by the host and directed to
the EC when setting up and enabling an event source (via the
enable-event-source request). The EC then uses the specified request ID for
events sent from the respective source. Note that an event should still be
identified by its target category, command ID, and, if applicable, instance
ID, as a single event source can send multiple different event types. In
general, however, a single target category should map to a single reserved
event request ID.
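
One way a host could keep regular request IDs apart from the IDs reserved for
events is sketched below. The reserved range ``1`` to ``32`` is purely an
assumption for illustration, as this document does not specify which IDs are
reserved; all names are hypothetical.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /*
     * Assumption for illustration only: request IDs in this contiguous
     * range are reserved for events. The actual range is not given here.
     */
    #define EVENT_RQID_MIN 1u
    #define EVENT_RQID_MAX 32u

    /* True if an incoming command with this RQID should be treated as event. */
    bool rqid_is_event(uint16_t rqid)
    {
        return rqid >= EVENT_RQID_MIN && rqid <= EVENT_RQID_MAX;
    }

    /* Advance the wrapping request counter, skipping reserved event IDs. */
    uint16_t rqid_next(uint16_t prev)
    {
        uint16_t next = (uint16_t)(prev + 1);

        while (rqid_is_event(next))
            next = (uint16_t)(next + 1);
        return next;
    }

With these assumed limits, the counter wraps from ``0xffff`` to ``0`` and then
jumps over the reserved range on the following request.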

Furthermore, requests, responses, and events have an associated target ID
(``TID``) and source ID (``SID``). These two fields indicate where a message
originates from (``SID``) and what the intended target of the message is
(``TID``). Note that a response to a specific request therefore has the source
and target IDs swapped when compared to the original request (i.e. the request
target is the response source and the request source is the response target).
See :c:type:`enum ssh_request_id <ssh_request_id>` for possible values of
both.
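
Taken together, the matching rules for responses can be sketched as a single
predicate. ``struct cmd_hdr`` and ``is_response_to()`` are hypothetical names;
the checks mirror the text above (same |RQID|, |TC|, |CID| and |IID|, with
|TID| and |SID| swapped).

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Illustrative, unpacked view of the command fields. */
    struct cmd_hdr {
        uint8_t tc;
        uint8_t tid;
        uint8_t sid;
        uint8_t iid;
        uint16_t rqid;
        uint8_t cid;
    };

    /* True if 'rsp' looks like the response to the request 'req'. */
    bool is_response_to(const struct cmd_hdr *rsp, const struct cmd_hdr *req)
    {
        return rsp->rqid == req->rqid   /* primary match criterion */
            && rsp->tc == req->tc
            && rsp->cid == req->cid
            && rsp->iid == req->iid
            && rsp->tid == req->sid     /* target and source are swapped */
            && rsp->sid == req->tid;
    }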

Note that, even though requests and events should be uniquely identifiable by
target category and command ID alone, the EC may require specific target ID and
instance ID values to accept a command. A command that is accepted for
``TID=1``, for example, may not be accepted for ``TID=2`` and vice versa. While
this may not always hold in reality, you can think of different target/source
IDs indicating different physical ECs with potentially different feature sets.


Limitations and Observations
============================

The protocol can, in theory, handle up to ``U8_MAX`` frames in parallel,
with up to ``U16_MAX`` pending requests (neglecting request IDs reserved for
events). In practice, however, this is more limited. From our testing
(although via a Python, and thus user-space, program), it seems that the EC
can handle up to four requests (mostly) reliably in parallel. With five or
more requests in parallel, consistent discarding of commands (ACKed frame
but no command response) has been observed. For five simultaneous commands,
this reproducibly resulted in one command being dropped and four commands
being handled.

However, it has also been noted that, even with three requests in parallel,
occasional frame drops happen. Apart from this, with a limit of three
pending requests, no dropped commands (i.e. command being dropped but frame
carrying command being ACKed) have been observed. In any case, frames (and
possibly also commands) should be re-sent by the host if a certain timeout
is exceeded. This is done by the EC for frames with a timeout of one second,
up to two re-tries (i.e. three transmissions in total). The limit of
re-tries also applies to received NAKs, and, in a worst case scenario, can
lead to entire messages being dropped.

While this also seems to work fine for pending data frames as long as no
transmission failures occur, implementation and handling of these seems to
depend on the assumption that there is only one non-acknowledged data frame.
In particular, the detection of repeated frames relies on the last sequence
number. This means that, if a frame that has been successfully received by
the EC is sent again, e.g. due to the host not receiving an |ACK|, the EC
will only detect this if it has the sequence ID of the last frame received
by the EC. As an example: When sending two frames with ``SEQ=0`` and
``SEQ=1`` followed by a repetition of ``SEQ=0``, the EC will not detect the
second ``SEQ=0`` frame as a repetition, and will thus execute the command in
this frame each time it has been received, i.e. twice in this example. When
sending ``SEQ=0``, ``SEQ=1`` and then repeating ``SEQ=1``, the EC will detect
the second ``SEQ=1`` as a repetition of the first one and ignore it, thus
executing the contained command only once.
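
The two sequences from this example can be replayed against a toy model of the
observed behavior. This is a sketch of the observation only, not of the actual
EC firmware; all names are hypothetical.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Toy receiver that remembers only the SEQ of the last frame. */
    struct rx_state {
        bool have_last;
        uint8_t last_seq;
        unsigned int executed;  /* number of commands executed */
    };

    /* A frame counts as repetition only if it repeats the last SEQ. */
    void rx_frame(struct rx_state *st, uint8_t seq)
    {
        if (st->have_last && st->last_seq == seq)
            return;             /* repetition detected, command ignored */

        st->have_last = true;
        st->last_seq = seq;
        st->executed++;
    }

    /* Feed a list of sequence IDs and return how many commands were run. */
    unsigned int rx_run(const uint8_t *seqs, size_t n)
    {
        struct rx_state st = { false, 0, 0 };
        size_t i;

        for (i = 0; i < n; i++)
            rx_frame(&st, seqs[i]);
        return st.executed;
    }

In this model, the sequence ``0, 1, 0`` executes three commands (the repeated
``SEQ=0`` frame is run a second time), whereas ``0, 1, 1`` executes only two.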

In conclusion, this suggests a limit of at most one pending un-ACKed frame
(per party, effectively leading to synchronous communication regarding
frames) and at most three pending commands. The limit to synchronous frame
transfers seems to be consistent with behavior observed on Windows.