.. index:: value
   pair: text format; value
.. _text-value:

Values
------

The grammar productions in this section define *lexical syntax*,
hence no :ref:`white space <text-space>` is allowed.


.. index:: integer, unsigned integer, signed integer, uninterpreted integer
   pair: text format; integer
   pair: text format; unsigned integer
   pair: text format; signed integer
   pair: text format; uninterpreted integer
.. _text-sign:
.. _text-digit:
.. _text-hexdigit:
.. _text-num:
.. _text-hexnum:
.. _text-sint:
.. _text-uint:
.. _text-int:

Integers
~~~~~~~~

All :ref:`integers <syntax-int>` can be written in either decimal or hexadecimal notation.
In both cases, digits can optionally be separated by underscores.

.. math::
   \begin{array}{llclll@{\qquad}l}
   \production{sign} & \Tsign &::=&
     \epsilon \Rightarrow {+} ~~|~~
     \text{+} \Rightarrow {+} ~~|~~
     \text{-} \Rightarrow {-} \\
   \production{decimal digit} & \Tdigit &::=&
     \text{0} \Rightarrow 0 ~~|~~ \dots ~~|~~ \text{9} \Rightarrow 9 \\
   \production{hexadecimal digit} & \Thexdigit &::=&
     d{:}\Tdigit \Rightarrow d \\ &&|&
     \text{A} \Rightarrow 10 ~~|~~ \dots ~~|~~ \text{F} \Rightarrow 15 \\ &&|&
     \text{a} \Rightarrow 10 ~~|~~ \dots ~~|~~ \text{f} \Rightarrow 15
   \\[1ex]
   \production{decimal number} & \Tnum &::=&
     d{:}\Tdigit &\Rightarrow& d \\ &&|&
     n{:}\Tnum~~\text{\_}^?~~d{:}\Tdigit &\Rightarrow& 10\cdot n + d \\
   \production{hexadecimal number} & \Thexnum &::=&
     h{:}\Thexdigit &\Rightarrow& h \\ &&|&
     n{:}\Thexnum~~\text{\_}^?~~h{:}\Thexdigit &\Rightarrow& 16\cdot n + h \\
   \end{array}

The allowed syntax for integer literals depends on size and signedness.
Moreover, their value must lie within the range of the respective type.

.. math::
   \begin{array}{llclll@{\qquad}l}
   \production{unsigned integer} & \TuN &::=&
     n{:}\Tnum &\Rightarrow& n & (\iff n < 2^N) \\ &&|&
     \text{0x}~~n{:}\Thexnum &\Rightarrow& n & (\iff n < 2^N) \\
   \production{signed integer} & \TsN &::=&
     {\pm}{:}\Tsign~~n{:}\Tnum &\Rightarrow& \pm n & (\iff -2^{N-1} \leq \pm n < 2^{N-1}) \\ &&|&
     {\pm}{:}\Tsign~~\text{0x}~~n{:}\Thexnum &\Rightarrow& \pm n & (\iff -2^{N-1} \leq \pm n < 2^{N-1}) \\
   \end{array}

:ref:`Uninterpreted integers <syntax-int>` can be written as either signed or unsigned, and are normalized to unsigned in the abstract syntax.

.. math::
   \begin{array}{llclll@{\qquad\qquad}l}
   \production{uninterpreted integers} & \TiN &::=&
     n{:}\TuN &\Rightarrow& n \\ &&|&
     i{:}\TsN &\Rightarrow& n & (\iff i = \signed(n)) \\
   \end{array}


.. index:: floating-point number
   pair: text format; floating-point number
.. _text-frac:
.. _text-hexfrac:
.. _text-hexfloat:
.. _text-float:

Floating-Point
~~~~~~~~~~~~~~

:ref:`Floating-point <syntax-float>` values can be represented in either decimal or hexadecimal notation.

.. math::
   \begin{array}{llclll@{\qquad\qquad}l}
   \production{decimal floating-point fraction} & \Tfrac &::=&
     \epsilon &\Rightarrow& 0 \\ &&|&
     d{:}\Tdigit~~q{:}\Tfrac &\Rightarrow& (d+q)/10 \\ &&|&
     d{:}\Tdigit~~\text{\_}~~p{:}\Tdigit~~q{:}\Tfrac &\Rightarrow& (d+(p+q)/10)/10 \\
   \production{hexadecimal floating-point fraction} & \Thexfrac &::=&
     \epsilon &\Rightarrow& 0 \\ &&|&
     h{:}\Thexdigit~~q{:}\Thexfrac &\Rightarrow& (h+q)/16 \\ &&|&
     h{:}\Thexdigit~~\text{\_}~~~~p{:}\Thexdigit~~q{:}\Thexfrac &\Rightarrow& (h+(p+q)/16)/16 \\
   \production{decimal floating-point number} & \Tfloat &::=&
     p{:}\Tnum~\text{.}~q{:}\Tfrac
       &\Rightarrow& p+q \\ &&|&
     p{:}\Tnum~(\text{E}~|~\text{e})~{\pm}{:}\Tsign~e{:}\Tnum
       &\Rightarrow& p\cdot 10^{\pm e} \\ &&|&
     p{:}\Tnum~\text{.}~q{:}\Tfrac~(\text{E}~|~\text{e})~{\pm}{:}\Tsign~e{:}\Tnum
       &\Rightarrow& (p+q)\cdot 10^{\pm e} \\
   \production{hexadecimal floating-point number} & \Thexfloat &::=&
     \text{0x}~p{:}\Thexnum~\text{.}~q{:}\Thexfrac
       &\Rightarrow& p+q \\ &&|&
     \text{0x}~p{:}\Thexnum~(\text{P}~|~\text{p})~{\pm}{:}\Tsign~e{:}\Tnum
       &\Rightarrow& p\cdot 2^{\pm e} \\ &&|&
     \text{0x}~p{:}\Thexnum~\text{.}~q{:}\Thexfrac~(\text{P}~|~\text{p})~{\pm}{:}\Tsign~e{:}\Tnum
       &\Rightarrow& (p+q)\cdot 2^{\pm e}
   \end{array}

The value of a literal must not lie outside the representable range of the corresponding |IEEE754|_ type
(that is, a numeric value must not overflow to :math:`\pm\mbox{infinity}`),
but it may be :ref:`rounded <aux-ieee>` to the nearest representable value.

.. note::
   Rounding can be prevented by using hexadecimal notation with no more significant bits than supported by the required type.

Floating-point values may also be written as constants for *infinity* or *canonical NaN* (*not a number*).
Furthermore, arbitrary NaN values may be expressed by providing an explicit payload value.

.. math::
   \begin{array}{llclll@{\qquad\qquad}l}
   \production{floating-point value} & \TfN &::=&
     {\pm}{:}\Tsign~z{:}\TfNmag &\Rightarrow& \pm z \\
   \production{floating-point magnitude} & \TfNmag &::=&
     z{:}\Tfloat &\Rightarrow& \ieee_N(z) & (\iff \ieee_N(z) \neq \pm \infty) \\ &&|&
     z{:}\Thexfloat &\Rightarrow& \ieee_N(z) & (\iff \ieee_N(z) \neq \pm \infty) \\ &&|&
     \text{inf} &\Rightarrow& \infty \\ &&|&
     \text{nan} &\Rightarrow& \NAN(2^{\significand(N)-1}) \\ &&|&
     \text{nan{:}0x}~n{:}\Thexnum &\Rightarrow& \NAN(n) & (\iff 1 \leq n < 2^{\significand(N)}) \\
   \end{array}


.. index:: ! string, byte, character, ASCII, Unicode, UTF-8
   pair: text format; byte
   pair: text format; string
.. _text-byte:
.. _text-string:

Strings
~~~~~~~

*Strings* denote sequences of bytes that can represent both textual and binary data.
They are enclosed in quotation marks
and may contain any character other than |ASCII|_ control characters, quotation marks (:math:`\text{"}`), or backslash (:math:`\text{\backslash}`),
except when expressed with an *escape sequence*.

.. math::
   \begin{array}{llclll@{\qquad\qquad}l}
   \production{string} & \Tstring &::=&
     \text{"}~(b^\ast{:}\Tstringelem)^\ast~\text{"}
       &\Rightarrow& \concat((b^\ast)^\ast)
       & (\iff |\concat((b^\ast)^\ast)| < 2^{32}) \\
   \production{string element} & \Tstringelem &::=&
     c{:}\Tstringchar &\Rightarrow& \utf8(c) \\ &&|&
     \text{\backslash}~n{:}\Thexdigit~m{:}\Thexdigit
       &\Rightarrow& 16\cdot n+m \\
   \end{array}

Each character in a string literal represents the byte sequence corresponding to its UTF-8 |Unicode|_ (Section 2.5) encoding,
except for hexadecimal escape sequences :math:`\textl\backslash hh\textr`, which represent raw bytes of the respective value.

.. math::
   \begin{array}{llclll@{\qquad\qquad}l}
   \production{string character} & \Tstringchar &::=&
     c{:}\Tchar &\Rightarrow& c \qquad
       & (\iff c \geq \unicode{20} \wedge c \neq \unicode{7F} \wedge c \neq \text{"} c \neq \text{\backslash}) \\ &&|&
     \text{\backslash t} &\Rightarrow& \unicode{09} \\ &&|&
     \text{\backslash n} &\Rightarrow& \unicode{0A} \\ &&|&
     \text{\backslash r} &\Rightarrow& \unicode{0D} \\ &&|&
     \text{\backslash{"}} &\Rightarrow& \unicode{22} \\ &&|&
     \text{\backslash{'}} &\Rightarrow& \unicode{27} \\ &&|&
     \text{\backslash\backslash} &\Rightarrow& \unicode{5C} \\ &&|&
     \text{\backslash u\{}~n{:}\Thexnum~\text{\}}
       &\Rightarrow& \unicode{(n)} & (\iff n < \hex{D800} \vee \hex{E000} \leq n < \hex{110000}) \\
   \end{array}


.. index:: name, byte, character, code point
   pair: text format; name
.. _text-name:

Names
~~~~~

:ref:`Names <syntax-name>` are strings denoting a literal character sequence. 
A name string must form a valid UTF-8 encoding as defined by |Unicode|_ (Section 2.5) that is interpreted as a string of Unicode code points.

.. math::
   \begin{array}{llclll@{\qquad}l}
   \production{name} & \Tname &::=&
     b^\ast{:}\Tstring &\Rightarrow& c^\ast & (\iff b^\ast = \utf8(c^\ast)) \\
   \end{array}

.. note::
   Presuming the source text is itself encoded correctly,
   strings that do not contain any uses of hexadecimal byte escapes are always valid names.


.. index:: ! identifiers
   pair: text format; identifiers
.. _text-idchar:
.. _text-id:

Identifiers
~~~~~~~~~~~

:ref:`Indices <syntax-index>` can be given in both numeric and symbolic form.
Symbolic *identifiers* that stand in lieu of indices start with :math:`\text{\$}`, followed by any sequence of printable |ASCII|_ characters that does not contain a space, quotation mark, comma, semicolon, or bracket.

.. math::
   \begin{array}{llclll@{\qquad}l}
   \production{identifier} & \Tid &::=&
     \text{\$}~\Tidchar^+ \\
   \production{identifier character} & \Tidchar &::=&
     \text{0} ~~|~~ \dots ~~|~~ \text{9} \\ &&|&
     \text{A} ~~|~~ \dots ~~|~~ \text{Z} \\ &&|&
     \text{a} ~~|~~ \dots ~~|~~ \text{z} \\ &&|&
     \text{!} ~~|~~
     \text{\#} ~~|~~
     \text{\$} ~~|~~
     \text{\%} ~~|~~
     \text{\&} ~~|~~
     \text{'} ~~|~~
     \text{*} ~~|~~
     \text{+} ~~|~~
     \text{-} ~~|~~
     \text{.} ~~|~~
     \text{/} \\ &&|&
     \text{:} ~~|~~
     \text{<} ~~|~~
     \text{=} ~~|~~
     \text{>} ~~|~~
     \text{?} ~~|~~
     \text{@} ~~|~~
     \text{\backslash} ~~|~~
     \text{\hat{~~}} ~~|~~
     \text{\_} ~~|~~
     \text{\grave{~~}} ~~|~~
     \text{|} ~~|~~
     \text{\tilde{~~}} \\
   \end{array}

.. _text-id-fresh:

Conventions
...........

The expansion rules of some abbreviations require insertion of a *fresh* identifier.
That may be any syntactically valid identifier that does not already occur in the given source text.
