doc-src/IsarRef/syntax.tex
 author wenzelm Sat Oct 30 20:13:16 1999 +0200 (1999-10-30) changeset 7981 5120a2a15d06 parent 7895 7c492d8bc8e3 child 8102 424f6e663977 permissions -rw-r--r--
tuned;
     1

     2 \chapter{Isar Syntax Primitives}

     3

     4 We give a complete reference of all basic syntactic entities underlying the

     5 Isabelle/Isar document syntax.  Actual theory and proof commands will be

     6 introduced later on.

     7

     8 \medskip

     9

    10 In order to get started with writing well-formed Isabelle/Isar documents, the

    11 most important aspect to be noted is the difference of \emph{inner} versus

    12 \emph{outer} syntax.  Inner syntax is that of Isabelle types and terms of the

    13 logic, while outer syntax is that of Isabelle/Isar theories (including

    14 proofs).  As a general rule, inner syntax entities may occur only as

    15 \emph{atomic entities} within outer syntax.  For example, the string

    16 \texttt{"x + y"} and identifier \texttt{z} are legal term specifications

    17 within a theory, while \texttt{x + y} is not.

    18

    19 \begin{warn}

    20   Note that classic Isabelle theories used to fake parts of the inner type

    21   syntax, with rather complicated rules when quotes may be omitted.  Despite

    22   the minor drawback of requiring quotes more often, the syntax of

    23   Isabelle/Isar is simpler and more robust in that respect.

    24 \end{warn}

    25

    26 \medskip

    27

    28 Another notable point is proper input termination.  Proof~General demands any

    29 command to be terminated by \texttt{;}''

    30 (semicolon)\index{semicolon}\index{*;}.  As far as plain Isabelle/Isar is

    31 concerned, commands may be directly run together, though.  In the presentation

    32 of Isabelle/Isar documents, semicolons are omitted in order to gain

    33 readability.

    34

    35

    36 \section{Lexical matters}\label{sec:lex-syntax}

    37

    38 The Isabelle/Isar outer syntax provides token classes as presented below.

    39 Note that some of these coincide (by full intention) with the inner lexical

    40 syntax as presented in \cite{isabelle-ref}.  These different levels of syntax

    41 should not be confused, though.

    42

    43 %FIXME keyword, command

    44 \begin{matharray}{rcl}

    45   ident & = & letter~quasiletter^* \\

    46   longident & = & ident\verb,.,ident~\dots~ident \\

    47   symident & = & sym^+ \\

    48   nat & = & digit^+ \\

    49   var & = & \verb,?,ident ~|~ \verb,?,ident\verb,.,nat \\

    50   typefree & = & \verb,',ident \\

    51   typevar & = & \verb,?,typefree ~|~ \verb,?,typefree\verb,.,nat \\

    52   string & = & \verb,", ~\dots~ \verb,", \\

    53   verbatim & = & \verb,{*, ~\dots~ \verb,*}, \\

    54 \end{matharray}

    55 \begin{matharray}{rcl}

    56   letter & = & \verb,a, ~|~ \dots ~|~ \verb,z, ~|~ \verb,A, ~|~ \dots ~|~ \verb,Z, \\

    57   digit & = & \verb,0, ~|~ \dots ~|~ \verb,9, \\

    58   quasiletter & = & letter ~|~ digit ~|~ \verb,_, ~|~ \verb,', \\

    59   sym & = & \verb,!, ~|~ \verb,#, ~|~ \verb,$, ~|~ \verb,%, ~|~ \verb,&, ~|~ %$

    60    \verb,*, ~|~ \verb,+, ~|~ \verb,-, ~|~ \verb,/, ~|~ \verb,:, ~|~

    61    \verb,<, ~|~ \verb,=, ~|~ \verb,>, ~|~ \verb,?, ~|~ \mathtt{\at} ~|~ \\

    62   & & \verb,^, ~|~ \verb,_, ~|~ \verb,, ~|~ \verb,|, ~|~ \verb,~, \\

    63 \end{matharray}

    64

    65 The syntax of \texttt{string} admits any characters, including newlines;

    66 \verb|"|'' (double-quote) and \verb|\|'' (backslash) have to be escaped by

    67 a backslash.  Note that ML-style control characters are \emph{not} supported.

    68 The body of \texttt{verbatim} may consist of any text not containing

    69 \verb|*}|''.

    70

    71 Comments take the form \texttt{(*~\dots~*)} and may be

    72 nested\footnote{Proof~General may get confused by nested comments.}, just as

    73 in ML. Note that these are \emph{source} comments only, which are stripped

    74 after lexical analysis of the input.  The Isar document syntax also provides

    75 \emph{formal comments} that are actually part of the text (see

    76 \S\ref{sec:comments}).

    77

    78

    79 \section{Common syntax entities}

    80

    81 Subsequently, we introduce several basic syntactic entities, such as names,

    82 terms, and theorem specifications, which have been factored out of the actual

    83 Isar language elements to be described later.

    84

    85 Note that some of the basic syntactic entities introduced below (e.g.\

    86 \railqtoken{name}) act much like tokens rather than plain nonterminals (e.g.\

    87 \railnonterm{sort}), especially for the sake of error messages.  E.g.\ syntax

    88 elements such as $\CONSTS$ referring to \railqtoken{name} or \railqtoken{type}

    89 would really report a missing name or type rather than any of the constituent

    90 primitive tokens such as \railtoken{ident} or \railtoken{string}.

    91

    92

    93 \subsection{Names}

    94

    95 Entity \railqtoken{name} usually refers to any name of types, constants,

    96 theorems etc.\ that are to be \emph{declared} or \emph{defined} (so qualified

    97 identifiers are excluded).  Quoted strings provide an escape for

    98 non-identifier names or those ruled out by outer syntax keywords (e.g.\

    99 \verb|"let"|).  Already existing objects are usually referenced by

   100 \railqtoken{nameref}.

   101

   102 \indexoutertoken{name}\indexoutertoken{parname}\indexoutertoken{nameref}

   103 \begin{rail}

   104   name: ident | symident | string

   105   ;

   106   parname: '(' name ')'

   107   ;

   108   nameref: name | longident

   109   ;

   110 \end{rail}

   111

   112

   113 \subsection{Comments}\label{sec:comments}

   114

   115 Large chunks of plain \railqtoken{text} are usually given

   116 \railtoken{verbatim}, i.e.\ enclosed in \verb|{*|~\dots~\verb|*}|.  For

   117 convenience, any of the smaller text units conforming to \railqtoken{nameref}

   118 are admitted as well.  Almost any of the Isar commands may be annotated by a

   119 marginal \railnonterm{comment} of the form \texttt{--} \railqtoken{text}.

   120 Note that the latter kind of comment is actually part of the language, while

   121 source level comments \verb|(*|~\dots~\verb|*)| are stripped at the lexical

   122 level.  A few commands such as $\PROOFNAME$ admit additional markup with a

   123 level of interest'': \texttt{\%} followed by an optional number $n$ (default

   124 $n = 1$) indicates that the respective part of the document becomes $n$ levels

   125 more obscure; \texttt{\%\%} means that interest drops by $\infty$ --- abandon

   126 every hope, who enter here.

   127

   128 \indexoutertoken{text}\indexouternonterm{comment}\indexouternonterm{interest}

   129 \begin{rail}

   130   text: verbatim | nameref

   131   ;

   132   comment: '--' text

   133   ;

   134   interest: percent nat? | ppercent

   135   ;

   136 \end{rail}

   137

   138

   139 \subsection{Type classes, sorts and arities}

   140

   141 The syntax of sorts and arities is given directly at the outer level.  Note

   142 that this is in contrast to types and terms (see \ref{sec:types-terms}).

   143

   144 \indexouternonterm{sort}\indexouternonterm{arity}\indexouternonterm{simplearity}

   145 \indexouternonterm{classdecl}

   146 \begin{rail}

   147   classdecl: name ('<' (nameref + ','))?

   148   ;

   149   sort: nameref | lbrace (nameref * ',') rbrace

   150   ;

   151   arity: ('(' (sort + ',') ')')? sort

   152   ;

   153   simplearity: ('(' (sort + ',') ')')? nameref

   154   ;

   155 \end{rail}

   156

   157

   158 \subsection{Types and terms}\label{sec:types-terms}

   159

   160 The actual inner Isabelle syntax, that of types and terms of the logic, is far

   161 too sophisticated in order to be modelled explicitly at the outer theory

   162 level.  Basically, any such entity has to be quoted here to turn it into a

   163 single token (the parsing and type-checking is performed internally later).

   164 For convenience, a slightly more liberal convention is adopted: quotes may be

   165 omitted for any type or term that is already \emph{atomic} at the outer level.

   166 For example, one may write just \texttt{x} instead of \texttt{"x"}.  Note that

   167 symbolic identifiers (e.g.\ \texttt{++}) are available as well, provided these

   168 are not superseded by commands or keywords (e.g.\ \texttt{+}).

   169

   170 \indexoutertoken{type}\indexoutertoken{term}\indexoutertoken{prop}

   171 \begin{rail}

   172   type: nameref | typefree | typevar

   173   ;

   174   term: nameref | var | nat

   175   ;

   176   prop: term

   177   ;

   178 \end{rail}

   179

   180 Type declarations and definitions usually refer to \railnonterm{typespec} on

   181 the left-hand side.  This models basic type constructor application at the

   182 outer syntax level.  Note that only plain postfix notation is available here,

   183 but no infixes.

   184

   185 \indexouternonterm{typespec}

   186 \begin{rail}

   187   typespec: (() | typefree | '(' ( typefree + ',' ) ')') name

   188   ;

   189 \end{rail}

   190

   191

   192 \subsection{Term patterns}\label{sec:term-pats}

   193

   194 Assumptions and goal statements usually admit casual binding of schematic term

   195 variables by giving (optional) patterns of the form $\ISS{p@1\;\dots}{p@n}$.

   196 There are separate versions available for \railqtoken{term}s and

   197 \railqtoken{prop}s.  The latter provides a $\CONCLNAME$ part with patterns

   198 referring the (atomic) conclusion of a rule.

   199

   200 \indexouternonterm{termpat}\indexouternonterm{proppat}

   201 \begin{rail}

   202   termpat: '(' ('is' term +) ')'

   203   ;

   204   proppat: '(' (('is' prop +) | 'concl' ('is' prop +) | ('is' prop +) 'concl' ('is' prop +)) ')'

   205   ;

   206 \end{rail}

   207

   208

   209 \subsection{Mixfix annotations}

   210

   211 Mixfix annotations specify concrete \emph{inner} syntax of Isabelle types and

   212 terms.  Some commands such as $\TYPES$ (see \S\ref{sec:types-pure}) admit

   213 infixes only, while $\CONSTS$ (see \S\ref{sec:consts}) and

   214 $\isarkeyword{syntax}$ (see \S\ref{sec:syn-trans}) support the full range of

   215 general mixfixes and binders.

   216

   217 \indexouternonterm{infix}\indexouternonterm{mixfix}

   218 \begin{rail}

   219   infix: '(' ('infixl' | 'infixr') string? nat ')'

   220   ;

   221   mixfix: infix | '(' string prios? nat? ')' | '(' 'binder' string prios? nat ')'

   222   ;

   223

   224   prios: '[' (nat + ',') ']'

   225   ;

   226 \end{rail}

   227

   228

   229 \subsection{Attributes and theorems}\label{sec:syn-att}

   230

   231 Attributes (and proof methods, see \S\ref{sec:syn-meth}) have their own

   232 semi-inner'' syntax, in the sense that input conforming to

   233 \railnonterm{args} below is parsed by the attribute a second time.  The

   234 attribute argument specifications may be any sequence of atomic entities

   235 (identifiers, strings etc.), or properly bracketed argument lists.  Below

   236 \railqtoken{atom} refers to any atomic entity, including any

   237 \railtoken{keyword} conforming to \railtoken{symident}.

   238

   239 \indexoutertoken{atom}\indexouternonterm{args}\indexouternonterm{attributes}

   240 \begin{rail}

   241   atom: nameref | typefree | typevar | var | nat | keyword

   242   ;

   243   arg: atom | '(' args ')' | '[' args ']' | lbrace args rbrace

   244   ;

   245   args: arg *

   246   ;

   247   attributes: '[' (nameref args * ',') ']'

   248   ;

   249 \end{rail}

   250

   251 Theorem specifications come in several flavors: \railnonterm{axmdecl} and

   252 \railnonterm{thmdecl} usually refer to axioms, assumptions or results of goal

   253 statements, while \railnonterm{thmdef} collects lists of existing theorems.

   254 Existing theorems are given by \railnonterm{thmref} and \railnonterm{thmrefs},

   255 the former requires an actual singleton result.  Any of these theorem

   256 specifications may include lists of attributes both on the left and right hand

   257 sides; attributes are applied to any immediately preceding theorem.  If names

   258 are omitted, the theorems are not stored within the theorem database of the

   259 theory or proof context; any given attributes are still applied, though.

   260

   261 \indexouternonterm{thmdecl}\indexouternonterm{axmdecl}

   262 \indexouternonterm{thmdef}\indexouternonterm{thmrefs}

   263 \begin{rail}

   264   axmdecl: name attributes? ':'

   265   ;

   266   thmdecl: thmname ':'

   267   ;

   268   thmdef: thmname '='

   269   ;

   270   thmref: nameref attributes?

   271   ;

   272   thmrefs: thmref +

   273   ;

   274

   275   thmname: name attributes | name | attributes

   276   ;

   277 \end{rail}

   278

   279

   280 \subsection{Proof methods}\label{sec:syn-meth}

   281

   282 Proof methods are either basic ones, or expressions composed of methods via

   283 \texttt{,}'' (sequential composition), \texttt{|}'' (alternative choices),

   284 \texttt{?}'' (try), \texttt{+}'' (repeat at least once).  In practice,

   285 proof methods are usually just a comma separated list of

   286 \railqtoken{nameref}~\railnonterm{args} specifications.  Note that parentheses

   287 may be dropped for single method specifications (with no arguments).

   288

   289 \indexouternonterm{method}

   290 \begin{rail}

   291   method: (nameref | '(' methods ')') (() | '?' | '+')

   292   ;

   293   methods: (nameref args | method) + (',' | '|')

   294   ;

   295 \end{rail}

   296

   297

   298 %%% Local Variables:

   299 %%% mode: latex

   300 %%% TeX-master: "isar-ref"

   301 %%% End:
`