1577 the status of constant versus variable is not yet know during |
1577 the status of constant versus variable is not yet know during |
1578 parsing. This is in contrast to print translations, where constants |
1578 parsing. This is in contrast to print translations, where constants |
1579 are explicitly known from the given term in its fully internal form. |
1579 are explicitly known from the given term in its fully internal form. |
1580 *} |
1580 *} |
1581 |
1581 |
|
1582 |
|
1583 subsection {* Built-in syntax transformations *} |
|
1584 |
|
1585 text {* |
|
1586 Here are some further details of the main syntax transformation |
|
1587 phases of \figref{fig:parse-print}. |
|
1588 *} |
|
1589 |
|
1590 |
|
1591 subsubsection {* Transforming parse trees to ASTs *} |
|
1592 |
|
1593 text {* The parse tree is the raw output of the parser. It is |
|
1594 transformed into an AST according to some basic scheme that may be |
|
1595 augmented by AST translation functions as explained in |
|
1596 \secref{sec:tr-funs}. |
|
1597 |
|
1598 The parse tree is constructed by nesting the right-hand sides of the |
|
1599 productions used to recognize the input. Such parse trees are |
|
1600 simply lists of tokens and constituent parse trees, the latter |
|
1601 representing the nonterminals of the productions. Ignoring AST |
|
1602 translation functions, parse trees are transformed to ASTs by |
|
1603 stripping out delimiters and copy productions, while retaining some |
|
1604 source position information from input tokens. |
|
1605 |
|
1606 The Pure syntax provides predefined AST translations to make the |
|
1607 basic @{text "\<lambda>"}-term structure more apparent within the |
|
1608 (first-order) AST representation, and thus facilitate the use of |
|
1609 @{command translations} (see also \secref{sec:syn-trans}). This |
|
1610 covers ordinary term application, type application, nested |
|
1611 abstraction, iterated meta implications and function types. The |
|
1612 effect is illustrated on some representative input strings is as |
|
1613 follows: |
|
1614 |
|
1615 \begin{center} |
|
1616 \begin{tabular}{ll} |
|
1617 input source & AST \\ |
|
1618 \hline |
|
1619 @{text "f x y z"} & @{verbatim "(f x y z)"} \\ |
|
1620 @{text "'a ty"} & @{verbatim "(ty 'a)"} \\ |
|
1621 @{text "('a, 'b)ty"} & @{verbatim "(ty 'a 'b)"} \\ |
|
1622 @{text "\<lambda>x y z. t"} & @{verbatim "(\"_abs\" x (\"_abs\" y (\"_abs\" z t)))"} \\ |
|
1623 @{text "\<lambda>x :: 'a. t"} & @{verbatim "(\"_abs\" (\"_constrain\" x 'a) t)"} \\ |
|
1624 @{text "\<lbrakk>P; Q; R\<rbrakk> \<Longrightarrow> S"} & @{verbatim "(\"==>\" P (\"==>\" Q (\"==>\" R S)))"} \\ |
|
1625 @{text "['a, 'b, 'c] \<Rightarrow> 'd"} & @{verbatim "(\"fun\" 'a (\"fun\" 'b (\"fun\" 'c 'd)))"} \\ |
|
1626 \end{tabular} |
|
1627 \end{center} |
|
1628 |
|
1629 Note that type and sort constraints may occur in further places --- |
|
1630 translations need to be ready to cope with them. The built-in |
|
1631 syntax transformation from parse trees to ASTs insert additional |
|
1632 constraints that represent source positions. |
|
1633 *} |
|
1634 |
|
1635 |
|
1636 subsubsection {* Transforming ASTs to terms *} |
|
1637 |
|
1638 text {* After application of macros (\secref{sec:syn-trans}), the AST |
|
1639 is transformed into a term. This term still lacks proper type |
|
1640 information, but it might contain some constraints consisting of |
|
1641 applications with head @{verbatim "_constrain"}, where the second |
|
1642 argument is a type encoded as a pre-term within the syntax. Type |
|
1643 inference later introduces correct types, or indicates type errors |
|
1644 in the input. |
|
1645 |
|
1646 Ignoring parse translations, ASTs are transformed to terms by |
|
1647 mapping AST constants to term constants, AST variables to term |
|
1648 variables or constants (according to the name space), and AST |
|
1649 applications to iterated term applications. |
|
1650 |
|
1651 The outcome is still a first-order term. Proper abstractions and |
|
1652 bound variables are introduced by parse translations associated with |
|
1653 certain syntax constants. Thus @{verbatim "(_abs x x)"} eventually |
|
1654 becomes a de-Bruijn term @{verbatim "Abs (\"x\", _, Bound 0)"}. |
|
1655 *} |
|
1656 |
|
1657 |
|
1658 subsubsection {* Printing of terms *} |
|
1659 |
|
1660 text {* The output phase is essentially the inverse of the input |
|
1661 phase. Terms are translated via abstract syntax trees into |
|
1662 pretty-printed text. |
|
1663 |
|
1664 Ignoring print translations, the transformation maps term constants, |
|
1665 variables and applications to the corresponding constructs on ASTs. |
|
1666 Abstractions are mapped to applications of the special constant |
|
1667 @{verbatim "_abs"} as seen before. Type constraints are represented |
|
1668 via special @{verbatim "_constrain"} forms, according to various |
|
1669 policies of type annotation determined elsewhere. Sort constraints |
|
1670 of type variables are handled in a similar fashion. |
|
1671 |
|
1672 After application of macros (\secref{sec:syn-trans}), the AST is |
|
1673 finally pretty-printed. The built-in print AST translations reverse |
|
1674 the corresponding parse AST translations. |
|
1675 |
|
1676 \medskip For the actual printing process, the priority grammar |
|
1677 (\secref{sec:priority-grammar}) plays a vital role: productions are |
|
1678 used as templates for pretty printing, with argument slots stemming |
|
1679 from nonterminals, and syntactic sugar stemming from literal tokens. |
|
1680 |
|
1681 Each AST application with constant head @{text "c"} and arguments |
|
1682 @{text "t\<^sub>1"}, \dots, @{text "t\<^sub>n"} (for @{text "n = 0"} the AST is |
|
1683 just the constant @{text "c"} itself) is printed according to the |
|
1684 first grammar production of result name @{text "c"}. The required |
|
1685 syntax priority of the argument slot is given by its nonterminal |
|
1686 @{text "A\<^sup>(\<^sup>p\<^sup>)"}. The argument @{text "t\<^sub>i"} that corresponds to the |
|
1687 position of @{text "A\<^sup>(\<^sup>p\<^sup>)"} is printed recursively, and then put in |
|
1688 parentheses \emph{if} its priority @{text "p"} requires this. The |
|
1689 resulting output is concatenated with the syntactic sugar according |
|
1690 to the grammar production. |
|
1691 |
|
1692 If an AST application @{text "(c x\<^sub>1 \<dots> x\<^sub>m)"} has more arguments than |
|
1693 the corresponding production, it is first split into @{text "((c x\<^sub>1 |
|
1694 \<dots> x\<^sub>n) x\<^sub>n\<^sub>+\<^sub>1 \<dots> x\<^sub>m)"} and then printed recursively as above. |
|
1695 |
|
1696 Applications with too few arguments or with non-constant head or |
|
1697 without a corresponding production are printed in prefix-form like |
|
1698 @{text "f t\<^sub>1 \<dots> t\<^sub>n"} for terms. |
|
1699 |
|
1700 Multiple productions associated with some name @{text "c"} are tried |
|
1701 in order of appearance within the grammar. An occurrence of some |
|
1702 AST variable @{text "x"} is printed as @{text "x"} outright. |
|
1703 |
|
1704 \medskip White space is \emph{not} inserted automatically. If |
|
1705 blanks (or breaks) are required to separate tokens, they need to be |
|
1706 specified in the mixfix declaration (\secref{sec:mixfix}). |
|
1707 *} |
|
1708 |
1582 end |
1709 end |