author wenzelm Thu, 21 Oct 2010 21:53:34 +0100 changeset 39878 31dd361a3060 parent 39877 1206e88f1284 child 39879 be8a7334e4ec
more on "Style and orthography";
--- a/doc-src/IsarImplementation/Thy/ML.thy	Thu Oct 21 20:06:13 2010 +0100
+++ b/doc-src/IsarImplementation/Thy/ML.thy	Thu Oct 21 21:53:34 2010 +0100
@@ -28,6 +28,177 @@
merely reflect snapshots that are never really up-to-date.}  *}

+section {* Style and orthography *}
+
+text {* The sources Isabelle/Isar implementation are optimized for
+  \emph{readability} and \emph{maintainability}.  The main purpose of
+  the sources is to tell an informed reader what is really going on
+  and how things really work.  This is a non-trivial aim, but it
+  greatly helps to follow a certain style for Isabelle/ML that has
+  emerged from long years of Isabelle system development.\footnote{See
+  also the very interesting style guide for OCaml
+  \url{http://caml.inria.fr/resources/doc/guides/guidelines.en.html}
+  which shares many of our means and ends.}
+
+  The main principle behind any coding style is \emph{consistency}.
+  For a single author and a small project this merely means choose
+  your style and stick to it''.  A complex project like Isabelle, with
+  long years of development and different contributors this requires
+  more and more standardization.  A coding style that is changed every
+  few years or with every new contributor is no style at all, because
+  consistency is quickly lost.  Global consistency is hard to achieve,
+  but one should always strife at least for local consistency of
+  modules and sub-systems, without deviating from some general
+  principles how to write Isabelle/ML.
+
+  In a sense, a common coding style is like an \emph{orthography} for
+  the sources: it helps to read quickly over it and see through the
+  main points, without getting distracted by accidental presentation
+  of free-style text.
+*}
+
+
+subsection {* Basic naming conventions *}
+
+text {* Since ML is the primary medium to express the meaning of the
+  source text, naming of ML entities requires special care.
+
+  FIXME
+*}
+
+
+subsection {* Header and sectioning *}
+
+text {* Isabelle source files have a certain standardized header that
+  follows ancient traditions back to the earliest versions of the
+  system by Larry Paulson (including precise spacing).  E.g.\ see
+  @{"file" "~~/src/Pure/thm.ML"}.
+
+  The header includes at least @{verbatim Title} and @{verbatim
+  Author} entries, followed by a prose description of the purpose of
+  the module.  The latter can range from a single line to several
+  paragraphs of explanations.
+
+  The rest of the file is divided into sections, subsections,
+  subsubsections, paragraphs etc.\ using some simple layout via ML
+
+\begin{verbatim}
+(*** section ***)
+
+(** subsection **)
+
+(* subsubsection *)
+
+(*short paragraph*)
+
+(*
+  long paragraph
+  more text
+*)
+\end{verbatim}
+
+  As in regular typography, there is some extra space \emph{before}
+  as in the example above).
+
+  \medskip The precise wording of the prose text given in these
+  headings is chosen carefully to indicate the main theme of what is
+  coming next in the formal ML text.
+*}
+
+
+subsection {* General source layout *}
+
+text {* The general Isabelle/ML source layout imitates regular
+  type-setting to some extent, augmented by the requirements for
+  deeply nested expressions that are commonplace in functional
+  programming.
+
+  \paragraph{Line length} is 80 characters according to ancient
+  standards, but we allow as much as 100 characters to acknowledge the
+  extra space requirements due to the relatively long fully-qualified
+  library references in Isabelle/ML.
+
+  \paragraph{White-space} is used to emphasize the structure of
+  expressions, following mostly standard conventions for mathematical
+  typesetting, as can be seen in plain {\TeX} or {\LaTeX}.  This
+  defines positioning of spaces for parentheses, punctuation, infixes
+  as illustrated here:
+
+  \begin{verbatim}
+  val x = y + z * (a + b);
+  val pair = (a, b);
+  val record = {foo = 1, bar = 2};
+  \end{verbatim}
+
+  Lines are normally broken after an infix operator or punctuation
+  character.  E.g.\
+
+  \begin{verbatim}
+  val x =
+    a +
+    b +
+    c;
+
+  val tuple =
+   (a,
+    b,
+    c);
+  \end{verbatim}
+
+  Some special infixes (e.g.\ @{verbatim "|>"}) work better at the
+  start of the line.  For punctuation there are no such exceptions.
+
+  Function application follows the tradition of @{text "\<lambda>"}-calculus,
+  not informal mathematics.  For example: @{verbatim "f a b"} for a
+  curried function, or @{verbatim "g (a, b)"} for a tupled function.
+  Note that the space between @{verbatim g} and the pair @{verbatim
+  "(a, b)"} follows the important principle of compositionality: the
+  layout of @{verbatim "g p"} does not change when @{verbatim "p"} is
+  refined to a concrete pair.
+
+  \paragraph{Indentation} uses plain spaces, never hard
+  tabulators.\footnote{Tabulators were invented to move the carriage
+  of a type-writer to certain predefined positions.  In software they
+  could be used as a primitive run-length compression of consecutive
+  spaces, but the precise result would depend on non-standardized
+  editor configuration.}
+
+  Each level of logical nesting is indented by 2 spaces, sometimes 1;
+  very rarely 4, never 8.
+
+  Indentation follows the simple logical form'' that only depends on
+  the nesting depth, not the accidental length of the text that
+  initiates a level of nesting.  Example:
+
+  \begin{verbatim}
+  (*right*)
+  if b then
+    expr1
+  else
+    expr2
+
+  (*wrong*)
+  if b then
+            expr1
+  else
+            expr2
+  \end{verbatim}
+
+  The second form has many problems: it assumes a fixed-width font
+  when viewing the sources, it uses more space on the line and makes
+  it hard to observe its strict length limit (working against
text {* ML and Isar are intertwined via an open-ended bootstrap