# HG changeset patch # User wenzelm # Date 1287694414 -3600 # Node ID 31dd361a30609a77c44e03e58d7214ef8491a689 # Parent 1206e88f1284ff9d082e82cf404c4f1efa4152c4 more on "Style and orthography"; diff -r 1206e88f1284 -r 31dd361a3060 doc-src/IsarImplementation/Thy/ML.thy --- a/doc-src/IsarImplementation/Thy/ML.thy Thu Oct 21 20:06:13 2010 +0100 +++ b/doc-src/IsarImplementation/Thy/ML.thy Thu Oct 21 21:53:34 2010 +0100 @@ -28,6 +28,177 @@ merely reflect snapshots that are never really up-to-date.} *} +section {* Style and orthography *} + +text {* The sources Isabelle/Isar implementation are optimized for + \emph{readability} and \emph{maintainability}. The main purpose of + the sources is to tell an informed reader what is really going on + and how things really work. This is a non-trivial aim, but it + greatly helps to follow a certain style for Isabelle/ML that has + emerged from long years of Isabelle system development.\footnote{See + also the very interesting style guide for OCaml + \url{http://caml.inria.fr/resources/doc/guides/guidelines.en.html} + which shares many of our means and ends.} + + The main principle behind any coding style is \emph{consistency}. + For a single author and a small project this merely means ``choose + your style and stick to it''. A complex project like Isabelle, with + long years of development and different contributors this requires + more and more standardization. A coding style that is changed every + few years or with every new contributor is no style at all, because + consistency is quickly lost. Global consistency is hard to achieve, + but one should always strife at least for local consistency of + modules and sub-systems, without deviating from some general + principles how to write Isabelle/ML. + + In a sense, a common coding style is like an \emph{orthography} for + the sources: it helps to read quickly over it and see through the + main points, without getting distracted by accidental presentation + of free-style text. +*} + + +subsection {* Basic naming conventions *} + +text {* Since ML is the primary medium to express the meaning of the + source text, naming of ML entities requires special care. + + FIXME +*} + + +subsection {* Header and sectioning *} + +text {* Isabelle source files have a certain standardized header that + follows ancient traditions back to the earliest versions of the + system by Larry Paulson (including precise spacing). E.g.\ see + @{"file" "~~/src/Pure/thm.ML"}. + + The header includes at least @{verbatim Title} and @{verbatim + Author} entries, followed by a prose description of the purpose of + the module. The latter can range from a single line to several + paragraphs of explanations. + + The rest of the file is divided into sections, subsections, + subsubsections, paragraphs etc.\ using some simple layout via ML + comments as follows. + +\begin{verbatim} +(*** section ***) + +(** subsection **) + +(* subsubsection *) + +(*short paragraph*) + +(* + long paragraph + more text +*) +\end{verbatim} + + As in regular typography, there is some extra space \emph{before} + section headings that are adjacent to plain text (not other headings + as in the example above). + + \medskip The precise wording of the prose text given in these + headings is chosen carefully to indicate the main theme of what is + coming next in the formal ML text. +*} + + +subsection {* General source layout *} + +text {* The general Isabelle/ML source layout imitates regular + type-setting to some extent, augmented by the requirements for + deeply nested expressions that are commonplace in functional + programming. + + \paragraph{Line length} is 80 characters according to ancient + standards, but we allow as much as 100 characters to acknowledge the + extra space requirements due to the relatively long fully-qualified + library references in Isabelle/ML. + + \paragraph{White-space} is used to emphasize the structure of + expressions, following mostly standard conventions for mathematical + typesetting, as can be seen in plain {\TeX} or {\LaTeX}. This + defines positioning of spaces for parentheses, punctuation, infixes + as illustrated here: + + \begin{verbatim} + val x = y + z * (a + b); + val pair = (a, b); + val record = {foo = 1, bar = 2}; + \end{verbatim} + + Lines are normally broken after an infix operator or punctuation + character. E.g.\ + + \begin{verbatim} + val x = + a + + b + + c; + + val tuple = + (a, + b, + c); + \end{verbatim} + + Some special infixes (e.g.\ @{verbatim "|>"}) work better at the + start of the line. For punctuation there are no such exceptions. + + Function application follows the tradition of @{text "\"}-calculus, + not informal mathematics. For example: @{verbatim "f a b"} for a + curried function, or @{verbatim "g (a, b)"} for a tupled function. + Note that the space between @{verbatim g} and the pair @{verbatim + "(a, b)"} follows the important principle of compositionality: the + layout of @{verbatim "g p"} does not change when @{verbatim "p"} is + refined to a concrete pair. + + \paragraph{Indentation} uses plain spaces, never hard + tabulators.\footnote{Tabulators were invented to move the carriage + of a type-writer to certain predefined positions. In software they + could be used as a primitive run-length compression of consecutive + spaces, but the precise result would depend on non-standardized + editor configuration.} + + Each level of logical nesting is indented by 2 spaces, sometimes 1; + very rarely 4, never 8. + + Indentation follows the simple ``logical form'' that only depends on + the nesting depth, not the accidental length of the text that + initiates a level of nesting. Example: + + \begin{verbatim} + (*right*) + if b then + expr1 + else + expr2 + + (*wrong*) + if b then + expr1 + else + expr2 + \end{verbatim} + + The second form has many problems: it assumes a fixed-width font + when viewing the sources, it uses more space on the line and makes + it hard to observe its strict length limit (working against + \emph{readability}), it requires extra editing to adapt the layout + to changes of the initial expression @{verbatim b} (working against + \emph{maintainability}) etc. + + \medskip Any kind of two-dimensional or tabular layouts, ASCII-art + with lines or boxes of stars etc. should be avoided for the same + reasons. +*} + + section {* SML embedded into Isabelle/Isar *} text {* ML and Isar are intertwined via an open-ended bootstrap