doc-src/IsarRef/syntax.tex
 author wenzelm Sun May 21 14:33:46 2000 +0200 (2000-05-21) changeset 8896 c80aba8c1d5e parent 8690 48786b52c8d8 child 9051 887a15590f0e permissions -rw-r--r--
replaced {{ }} by { };
     1

     2 \chapter{Isar Syntax Primitives}

     3

     4 We give a complete reference of all basic syntactic entities underlying the

     5 Isabelle/Isar document syntax.  Actual theory and proof commands will be

     6 introduced later on.

     7

     8 \medskip

     9

    10 In order to get started with writing well-formed Isabelle/Isar documents, the

    11 most important aspect to be noted is the difference of \emph{inner} versus

    12 \emph{outer} syntax.  Inner syntax is that of Isabelle types and terms of the

    13 logic, while outer syntax is that of Isabelle/Isar theories (including

    14 proofs).  As a general rule, inner syntax entities may occur only as

    15 \emph{atomic entities} within outer syntax.  For example, the string

    16 \texttt{"x + y"} and identifier \texttt{z} are legal term specifications

    17 within a theory, while \texttt{x + y} is not.

    18

    19 \begin{warn}

    20   Note that classic Isabelle theories used to fake parts of the inner syntax

    21   of types, with rather complicated rules when quotes may be omitted.  Despite

    22   the minor drawback of requiring quotes more often, the syntax of

    23   Isabelle/Isar is much simpler and more robust in that respect.

    24 \end{warn}

    25

    26 \medskip

    27

    28 Another notable point is proper input termination.  Proof~General demands any

    29 command to be terminated by \texttt{;}''

    30 (semicolon)\index{semicolon}\index{*;}.  As far as plain Isabelle/Isar is

    31 concerned, commands may be directly run together, though.  In the presentation

    32 of Isabelle/Isar documents, semicolons are omitted in order to gain

    33 readability.

    34

    35

    36 \section{Lexical matters}\label{sec:lex-syntax}

    37

    38 The Isabelle/Isar outer syntax provides token classes as presented below.

    39 Note that some of these coincide (by full intention) with the inner lexical

    40 syntax as presented in \cite{isabelle-ref}.  These different levels of syntax

    41 should not be confused, though.

    42

    43 %FIXME keyword, command

    44 \begin{matharray}{rcl}

    45   ident & = & letter~quasiletter^* \\

    46   longident & = & ident\verb,.,ident~\dots~ident \\

    47   symident & = & sym^+ ~|~ symbol \\

    48   nat & = & digit^+ \\

    49   var & = & \verb,?,ident ~|~ \verb,?,ident\verb,.,nat \\

    50   typefree & = & \verb,',ident \\

    51   typevar & = & \verb,?,typefree ~|~ \verb,?,typefree\verb,.,nat \\

    52   string & = & \verb,", ~\dots~ \verb,", \\

    53   verbatim & = & \verb,{*, ~\dots~ \verb,*}, \\

    54 \end{matharray}

    55 \begin{matharray}{rcl}

    56   letter & = & \verb,a, ~|~ \dots ~|~ \verb,z, ~|~ \verb,A, ~|~ \dots ~|~ \verb,Z, \\

    57   digit & = & \verb,0, ~|~ \dots ~|~ \verb,9, \\

    58   quasiletter & = & letter ~|~ digit ~|~ \verb,_, ~|~ \verb,', \\

    59   sym & = & \verb,!, ~|~ \verb,#, ~|~ \verb,$, ~|~ \verb,%, ~|~ \verb,&, ~|~ %$

    60    \verb,*, ~|~ \verb,+, ~|~ \verb,-, ~|~ \verb,/, ~|~ \verb,:, ~|~

    61    \verb,<, ~|~ \verb,=, ~|~ \verb,>, ~|~ \verb,?, ~|~ \mathtt{\at} ~|~ \\

    62   & & \verb,^, ~|~ \verb,_, ~|~ \verb,, ~|~ \verb,|, ~|~ \verb,~, \\

    63   symbol & = & {\forall} ~|~ {\exists} ~|~ \dots

    64 \end{matharray}

    65

    66 The syntax of \texttt{string} admits any characters, including newlines;

    67 \verb|"|'' (double-quote) and \verb|\|'' (backslash) have to be escaped by

    68 a backslash.  Note that ML-style control characters are \emph{not} supported.

    69 The body of \texttt{verbatim} may consist of any text not containing

    70 \verb|*}|''.

    71

    72 Comments take the form \texttt{(*~\dots~*)} and may be

    73 nested\footnote{Proof~General may occasionally get confused by nested

    74   comments.}, just as in ML. Note that these are \emph{source} comments only,

    75 which are stripped after lexical analysis of the input.  The Isar document

    76 syntax also provides \emph{formal comments} that are actually part of the text

    77 (see \S\ref{sec:comments}).

    78

    79

    80 \section{Common syntax entities}

    81

    82 Subsequently, we introduce several basic syntactic entities, such as names,

    83 terms, and theorem specifications, which have been factored out of the actual

    84 Isar language elements to be described later.

    85

    86 Note that some of the basic syntactic entities introduced below (e.g.\

    87 \railqtoken{name}) act much like tokens rather than plain nonterminals (e.g.\

    88 \railnonterm{sort}), especially for the sake of error messages.  E.g.\ syntax

    89 elements such as $\CONSTS$ referring to \railqtoken{name} or \railqtoken{type}

    90 would really report a missing name or type rather than any of the constituent

    91 primitive tokens such as \railtoken{ident} or \railtoken{string}.

    92

    93

    94 \subsection{Names}

    95

    96 Entity \railqtoken{name} usually refers to any name of types, constants,

    97 theorems etc.\ that are to be \emph{declared} or \emph{defined} (so qualified

    98 identifiers are excluded here).  Quoted strings provide an escape for

    99 non-identifier names or those ruled out by outer syntax keywords (e.g.\

   100 \verb|"let"|).  Already existing objects are usually referenced by

   101 \railqtoken{nameref}.

   102

   103 \indexoutertoken{name}\indexoutertoken{parname}\indexoutertoken{nameref}

   104 \begin{rail}

   105   name: ident | symident | string | nat

   106   ;

   107   parname: '(' name ')'

   108   ;

   109   nameref: name | longident

   110   ;

   111 \end{rail}

   112

   113

   114 \subsection{Comments}\label{sec:comments}

   115

   116 Large chunks of plain \railqtoken{text} are usually given

   117 \railtoken{verbatim}, i.e.\ enclosed in \verb|{*|~\dots~\verb|*}|.  For

   118 convenience, any of the smaller text units conforming to \railqtoken{nameref}

   119 are admitted as well.  Almost any of the Isar commands may be annotated by

   120 marginal \railnonterm{comment} of the form \texttt{--} \railqtoken{text}.

   121 Note that the latter kind of comment is actually part of the language, while

   122 source level comments \verb|(*|~\dots~\verb|*)| are stripped at the lexical

   123 level.  A few commands such as $\PROOFNAME$ admit additional markup with a

   124 level of interest'': \texttt{\%} followed by an optional number $n$ (default

   125 $n = 1$) indicates that the respective part of the document becomes $n$ levels

   126 more obscure; \texttt{\%\%} means that interest drops by $\infty$ --- abandon

   127 every hope, who enter here.

   128

   129 \indexoutertoken{text}\indexouternonterm{comment}\indexouternonterm{interest}

   130 \begin{rail}

   131   text: verbatim | nameref

   132   ;

   133   comment: ('--' text +)

   134   ;

   135   interest: percent nat? | ppercent

   136   ;

   137 \end{rail}

   138

   139

   140 \subsection{Type classes, sorts and arities}

   141

   142 Classes are specified by plain names.  Sorts have a very simple inner syntax,

   143 which is either a single class name $c$ or a list $\{c@1, \dots, c@n\}$

   144 referring to the intersection of these classes.  The syntax of type arities is

   145 given directly at the outer level.

   146

   147 \indexouternonterm{sort}\indexouternonterm{arity}\indexouternonterm{simplearity}

   148 \indexouternonterm{classdecl}

   149 \begin{rail}

   150   classdecl: name ('<' (nameref + ','))?

   151   ;

   152   sort: nameref

   153   ;

   154   arity: ('(' (sort + ',') ')')? sort

   155   ;

   156   simplearity: ('(' (sort + ',') ')')? nameref

   157   ;

   158 \end{rail}

   159

   160

   161 \subsection{Types and terms}\label{sec:types-terms}

   162

   163 The actual inner Isabelle syntax, that of types and terms of the logic, is far

   164 too sophisticated in order to be modelled explicitly at the outer theory

   165 level.  Basically, any such entity has to be quoted to turn it into a single

   166 token (the parsing and type-checking is performed internally later).  For

   167 convenience, a slightly more liberal convention is adopted: quotes may be

   168 omitted for any type or term that is already \emph{atomic} at the outer level.

   169 For example, one may write just \texttt{x} instead of \texttt{"x"}.  Note that

   170 symbolic identifiers (e.g.\ \texttt{++} or $\forall$) are available as well,

   171 provided these are not superseded by commands or keywords (e.g.\ \texttt{+}).

   172

   173 \indexoutertoken{type}\indexoutertoken{term}\indexoutertoken{prop}

   174 \begin{rail}

   175   type: nameref | typefree | typevar

   176   ;

   177   term: nameref | var

   178   ;

   179   prop: term

   180   ;

   181 \end{rail}

   182

   183 Positional instantiations are indicated by giving a sequence of terms, or the

   184 placeholder $\_$'' (underscore), which means to skip a position.

   185

   186 \indexoutertoken{inst}\indexoutertoken{insts}

   187 \begin{rail}

   188   inst: underscore | term

   189   ;

   190   insts: (inst *)

   191   ;

   192 \end{rail}

   193

   194 Type declarations and definitions usually refer to \railnonterm{typespec} on

   195 the left-hand side.  This models basic type constructor application at the

   196 outer syntax level.  Note that only plain postfix notation is available here,

   197 but no infixes.

   198

   199 \indexouternonterm{typespec}

   200 \begin{rail}

   201   typespec: (() | typefree | '(' ( typefree + ',' ) ')') name

   202   ;

   203 \end{rail}

   204

   205

   206 \subsection{Term patterns}\label{sec:term-pats}

   207

   208 Assumptions and goal statements usually admit casual binding of schematic term

   209 variables by giving (optional) patterns of the form $\ISS{p@1\;\dots}{p@n}$.

   210 There are separate versions available for \railqtoken{term}s and

   211 \railqtoken{prop}s.  The latter provides a $\CONCLNAME$ part with patterns

   212 referring the (atomic) conclusion of a rule.

   213

   214 \indexouternonterm{termpat}\indexouternonterm{proppat}

   215 \begin{rail}

   216   termpat: '(' ('is' term +) ')'

   217   ;

   218   proppat: '(' (('is' prop +) | 'concl' ('is' prop +) | ('is' prop +) 'concl' ('is' prop +)) ')'

   219   ;

   220 \end{rail}

   221

   222

   223 \subsection{Mixfix annotations}

   224

   225 Mixfix annotations specify concrete \emph{inner} syntax of Isabelle types and

   226 terms (see also \cite{isabelle-ref}).  Some commands such as $\TYPES$ (see

   227 \S\ref{sec:types-pure}) admit infixes only, while $\CONSTS$ (see

   228 \S\ref{sec:consts}) and $\isarkeyword{syntax}$ (see \S\ref{sec:syn-trans})

   229 support the full range of general mixfixes and binders.

   230

   231 \indexouternonterm{infix}\indexouternonterm{mixfix}

   232 \begin{rail}

   233   infix: '(' ('infixl' | 'infixr') string? nat ')'

   234   ;

   235   mixfix: infix | '(' string prios? nat? ')' | '(' 'binder' string prios? nat ')'

   236   ;

   237

   238   prios: '[' (nat + ',') ']'

   239   ;

   240 \end{rail}

   241

   242

   243 \subsection{Attributes and theorems}\label{sec:syn-att}

   244

   245 Attributes (and proof methods, see \S\ref{sec:syn-meth}) have their own

   246 semi-inner'' syntax, in the sense that input conforming to

   247 \railnonterm{args} below is parsed by the attribute a second time.  The

   248 attribute argument specifications may be any sequence of atomic entities

   249 (identifiers, strings etc.), or properly bracketed argument lists.  Below

   250 \railqtoken{atom} refers to any atomic entity, including any

   251 \railtoken{keyword} conforming to \railtoken{symident}.

   252

   253 \indexoutertoken{atom}\indexouternonterm{args}\indexouternonterm{attributes}

   254 \begin{rail}

   255   atom: nameref | typefree | typevar | var | nat | keyword

   256   ;

   257   arg: atom | '(' args ')' | '[' args ']'

   258   ;

   259   args: arg *

   260   ;

   261   attributes: '[' (nameref args * ',') ']'

   262   ;

   263 \end{rail}

   264

   265 Theorem specifications come in several flavors: \railnonterm{axmdecl} and

   266 \railnonterm{thmdecl} usually refer to axioms, assumptions or results of goal

   267 statements, while \railnonterm{thmdef} collects lists of existing theorems.

   268 Existing theorems are given by \railnonterm{thmref} and \railnonterm{thmrefs},

   269 the former requires an actual singleton result.  Any of these theorem

   270 specifications may include lists of attributes both on the left and right hand

   271 sides; attributes are applied to any immediately preceding theorem.  If names

   272 are omitted, the theorems are not stored within the theorem database of the

   273 theory or proof context; any given attributes are still applied, though.

   274

   275 \indexouternonterm{thmdecl}\indexouternonterm{axmdecl}

   276 \indexouternonterm{thmdef}\indexouternonterm{thmrefs}

   277 \begin{rail}

   278   axmdecl: name attributes? ':'

   279   ;

   280   thmdecl: thmname ':'

   281   ;

   282   thmdef: thmname '='

   283   ;

   284   thmref: nameref attributes?

   285   ;

   286   thmrefs: thmref +

   287   ;

   288

   289   thmname: name attributes | name | attributes

   290   ;

   291 \end{rail}

   292

   293

   294 \subsection{Proof methods}\label{sec:syn-meth}

   295

   296 Proof methods are either basic ones, or expressions composed of methods via

   297 \texttt{,}'' (sequential composition), \texttt{|}'' (alternative choices),

   298 \texttt{?}'' (try), \texttt{+}'' (repeat at least once).  In practice,

   299 proof methods are usually just a comma separated list of

   300 \railqtoken{nameref}~\railnonterm{args} specifications.  Note that parentheses

   301 may be dropped for single method specifications (with no arguments).

   302

   303 \indexouternonterm{method}

   304 \begin{rail}

   305   method: (nameref | '(' methods ')') (() | '?' | '+')

   306   ;

   307   methods: (nameref args | method) + (',' | '|')

   308   ;

   309 \end{rail}

   310

   311 Proper use of Isar proof methods does \emph{not} involve goal addressing.

   312 Nevertheless, specifying goal ranges may occasionally come in handy in

   313 emulating tactic scripts.  Note that $[n-]$ refers to all goals, starting from

   314 $n$.  All goals may be specified by $[!]$, which is the same as $[1-]$.

   315

   316 \indexouternonterm{goalspec}

   317 \begin{rail}

   318   goalspec: '[' (nat '-' nat | nat '-' | nat | '!' ) ']'

   319   ;

   320 \end{rail}

   321

   322

   323 %%% Local Variables:

   324 %%% mode: latex

   325 %%% TeX-master: "isar-ref"

   326 %%% End:
`