.. index:: value
   pair: binary format; value
.. _binary-value:

Values
------


.. index:: byte
   pair: binary format; byte
.. _binary-byte:

Bytes
~~~~~

:ref:`Bytes <syntax-byte>` encode themselves.

.. math::
   \begin{array}{llcll@{\qquad}l}
   \production{byte} & \Bbyte &::=&
     \hex{00} &\Rightarrow& \hex{00} \\ &&|&&
     \dots \\ &&|&
     \hex{FF} &\Rightarrow& \hex{FF} \\
   \end{array}


.. index:: integer, unsigned integer, signed integer, uninterpreted integer, LEB128, two's complement
   pair: binary format; integer
   pair: binary format; unsigned integer
   pair: binary format; signed integer
   pair: binary format; uninterpreted integer
.. _binary-sint:
.. _binary-uint:
.. _binary-int:

Integers
~~~~~~~~

All :ref:`integers <syntax-int>` are encoded using the |LEB128|_ variable-length integer encoding, in either unsigned or signed variant.

:ref:`Unsigned integers <syntax-uint>` are encoded in |UnsignedLEB128|_ format.
As an additional constraint, the total number of bytes encoding a value of type :math:`\uN` must not exceed :math:`\F{ceil}(N/7)` bytes.

.. math::
   \begin{array}{llclll@{\qquad}l}
   \production{unsigned integer} & \BuN &::=&
     n{:}\Bbyte &\Rightarrow& n & (\iff n < 2^7 \wedge n < 2^N) \\ &&|&
     n{:}\Bbyte~~m{:}\BuX{(N\B{-7})} &\Rightarrow&
       2^7\cdot m + (n-2^7) & (\iff n \geq 2^7 \wedge N > 7) \\
   \end{array}

:ref:`Signed integers <syntax-sint>` are encoded in |SignedLEB128|_ format, which uses a two's complement representation.
As an additional constraint, the total number of bytes encoding a value of type :math:`\sN` must not exceed :math:`\F{ceil}(N/7)` bytes.

.. math::
   \begin{array}{llclll@{\qquad}l}
   \production{signed integer} & \BsN &::=&
     n{:}\Bbyte &\Rightarrow& n & (\iff n < 2^6 \wedge n < 2^{N-1}) \\ &&|&
     n{:}\Bbyte &\Rightarrow& n-2^7 & (\iff 2^6 \leq n < 2^7 \wedge n \geq 2^7-2^{N-1}) \\ &&|&
     n{:}\Bbyte~~m{:}\BsX{(N\B{-7})} &\Rightarrow&
       2^7\cdot m + (n-2^7) & (\iff n \geq 2^7 \wedge N > 7) \\
   \end{array}

:ref:`Uninterpreted integers <syntax-int>` are encoded as signed integers.

.. math::
   \begin{array}{llclll@{\qquad\qquad}l}
   \production{uninterpreted integer} & \BiN &::=&
     n{:}\BsN &\Rightarrow& i & (\iff n = \signed_{\iN}(i))
   \end{array}

.. note::
   The side conditions :math:`N > 7` in the productions for non-terminal bytes of the :math:`\uX{}` and :math:`\sX{}` encodings restrict the encoding's length.
   However, "trailing zeros" are still allowed within these bounds.
   For example, :math:`\hex{03}` and :math:`\hex{83}~\hex{00}` are both well-formed encodings for the value :math:`3` as a |u8|.
   Similarly, either of :math:`\hex{7e}` and :math:`\hex{FE}~\hex{7F}` and :math:`\hex{FE}~\hex{FF}~\hex{7F}` are well-formed encodings of the value :math:`-2` as a |s16|.

   The side conditions on the value :math:`n` of terminal bytes further enforce that
   any unused bits in these bytes must be :math:`0` for positive values and :math:`1` for negative ones.
   For example, :math:`\hex{83}~\hex{10}` is malformed as a |u8| encoding.
   Similarly, both :math:`\hex{83}~\hex{3E}` and :math:`\hex{FF}~\hex{7B}` are malformed as |s8| encodings.


.. index:: floating-point number, little endian
   pair: binary format; floating-point number
.. _binary-float:

Floating-Point
~~~~~~~~~~~~~~

:ref:`Floating-point <syntax-float>` values are encoded directly by their |IEEE754|_ bit pattern in |LittleEndian|_ byte order:

.. math::
   \begin{array}{llclll@{\qquad\qquad}l}
   \production{floating-point value} & \BfN &::=&
     b^\ast{:\,}\Bbyte^{N/8} &\Rightarrow& \bytes_{\fN}^{-1}(b^\ast) \\
   \end{array}


.. index:: name, byte, Unicode, ! UTF-8
   pair: binary format; name
.. _binary-utf8:
.. _binary-name:

Names
~~~~~

:ref:`Names <syntax-name>` are encoded as a :ref:`vector <binary-vec>` of bytes containing the |Unicode|_ UTF-8 encoding of the name's code point sequence.

.. math::
   \begin{array}{llclllll}
   \production{name} & \Bname &::=&
     b^\ast{:}\Bvec(\Bbyte) &\Rightarrow& \name
       && (\iff \utf8(\name) = b^\ast) \\
   \end{array}

The auxiliary |utf8| function expressing this encoding is defined as follows:

.. math::
   \begin{array}{@{}lcl@{\qquad}l}
   \utf8(c^\ast) &=& (\utf8(c))^\ast \\[1ex]
   \utf8(c) &=& b &
     (\begin{array}[t]{@{}c@{~}l@{}}
      \iff & c < \unicode{80} \\
      \wedge & c = b) \\
      \end{array} \\
   \utf8(c) &=& b_1~b_2 &
     (\begin{array}[t]{@{}c@{~}l@{}}
      \iff & \unicode{80} \leq c < \unicode{800} \\
      \wedge & c = 2^6(b_1-\hex{C0})+(b_2-\hex{80})) \\
      \end{array} \\
   \utf8(c) &=& b_1~b_2~b_3 &
     (\begin{array}[t]{@{}c@{~}l@{}}
      \iff & \unicode{800} \leq c < \unicode{10000} \\
      \wedge & c = 2^{12}(b_1-\hex{C0})+2^6(b_2-\hex{80})+(b_3-\hex{80})) \\
      \end{array} \\
   \utf8(c) &=& b_1~b_2~b_3~b_4 &
     (\begin{array}[t]{@{}c@{~}l@{}}
      \iff & \unicode{10000} \leq c < \unicode{110000} \\
      \wedge & c = 2^{18}(b_1-\hex{C0})+2^{12}(b_2-\hex{80})+2^6(b_3-\hex{80})+(b_4-\hex{80})) \\
      \end{array} \\
   \end{array}
