src/Doc/System/Server.thy
author wenzelm
Mon, 19 Mar 2018 19:24:45 +0100
changeset 67904 465f43a9f780
child 67917 d13b2dd20f5e
permissions -rw-r--r--
documentation for the Isabelle server;

(*:maxLineLen=78:*)

theory Server
imports Base
begin

chapter \<open>The Isabelle server\<close>

text \<open>
  An Isabelle session requires at least two processes, which are both rather
  heavy: Isabelle/Scala for the system infrastructure and Isabelle/ML for the
  logic session (e.g.\ HOL). In principle, these processes can be invoked
  directly on the command-line, e.g.\ via @{tool java}, @{tool scala}, @{tool
  process}, @{tool console}, but this approach is inadequate for reactive
  applications that require quick responses from the prover.

  In contrast, the Isabelle server exposes Isabelle/Scala as a
  ``terminate-stay-resident'' application that manages multiple logic
  \<^emph>\<open>sessions\<close> and concurrent tasks to use \<^emph>\<open>theories\<close>. This provides an
  analogous to @{ML Thy_Info.use_theories} in Isabelle/ML, but with full
  concurrency and Isabelle/PIDE markup.

  The client/server arrangement via TCP sockets also opens possibilities for
  remote Isabelle services that are accessed by local applications, e.g.\ via
  an SSH tunnel.
\<close>


section \<open>Command-line tools\<close>

subsection \<open>Server \label{sec:tool-server}\<close>

text \<open>
  The @{tool_def server} tool manages resident server processes:
  @{verbatim [display]
\<open>Usage: isabelle server [OPTIONS]

  Options are:
    -L FILE      logging on FILE
    -c           console interaction with specified server
    -l           list servers (alternative operation)
    -n NAME      explicit server name (default: isabelle)
    -p PORT      explicit server port
    -s           assume existing server, no implicit startup
    -x           exit specified server (alternative operation)

  Manage resident Isabelle servers.\<close>}

  The main operation of \<^verbatim>\<open>isabelle server\<close> is to ensure that a named server is
  running, either by finding an already running process (according to the
  central database file @{path "$ISABELLE_HOME_USER/servers.db"}) or by
  becoming itself a new server that accepts connections on a particular TCP
  socket. The server name and its address are printed as initial output line.
  If another server instance is already running, the current
  \<^verbatim>\<open>isabelle server\<close> process will terminate; otherwise, it keeps running as a
  new server process until an explicit \<^verbatim>\<open>shutdown\<close> command is received.
  Further details of the server socket protocol are explained in
  \secref{sec:server-protocol}.

  Other server management operations are invoked via options \<^verbatim>\<open>-l\<close> and \<^verbatim>\<open>-x\<close>
  (see below).

  \<^medskip>
  Option \<^verbatim>\<open>-n\<close> specifies an alternative server name: at most one process for
  each name may run, but each server instance supports multiple connections
  and logic sessions.

  \<^medskip>
  Option \<^verbatim>\<open>-p\<close> specifies an explicit TCP port for the server socket (which is
  always on \<^verbatim>\<open>localhost\<close>): the default is to let the operating system assign a
  free port number.

  \<^medskip>
  Option \<^verbatim>\<open>-s\<close> strictly assumes that the specified server process is already
  running, skipping the optional server startup phase.

  \<^medskip>
  Option \<^verbatim>\<open>-c\<close> connects the console in/out channels after the initial check
  for a suitable server process. Note that the @{tool client} tool
  (\secref{sec:tool-client}) provides a more convenient way to do this
  interactively, together with command-line editor support.

  \<^medskip>
  Option \<^verbatim>\<open>-l\<close> lists all active server processes with their connection
  details.

  \<^medskip>
  Option \<^verbatim>\<open>-x\<close> exits the specified server process by sending it a \<^verbatim>\<open>shutdown\<close>
  command.

  \<^medskip>
  Option \<^verbatim>\<open>-L\<close> specifies a log file for exceptional output of internal server
  and session operations.
\<close>


subsection \<open>Client \label{sec:tool-client}\<close>

text \<open>
  The @{tool_def client} provides console interaction for Isabelle servers:
  @{verbatim [display]
\<open>Usage: isabelle client [OPTIONS]

  Options are:
    -n NAME      explicit server name
    -p PORT      explicit server port

  Console interaction for Isabelle server (with line-editor).\<close>}

  This is a convenient wrapper to \<^verbatim>\<open>isabelle server -s -c\<close> for interactive
  experimentation, wrapped into @{setting ISABELLE_LINE_EDITOR} if available.
  The server name is sufficient for identification, as the client can
  determine the connection details from the local database of active servers.

  \<^medskip>
  Option \<^verbatim>\<open>-n\<close> specifies an explicit server name as in @{tool server}.

  \<^medskip>
  Option \<^verbatim>\<open>-p\<close> specifies an explicit server port as in @{tool server}.
\<close>


subsection \<open>Examples\<close>

text \<open>
  Ensure that a particular server instance is running in the background:
  @{verbatim [display] \<open>isabelle server -n test &\<close>}

  Here the first line of output presents the connection details:\<^footnote>\<open>This
  information may be used in an independent TCP client, while the
  Isabelle/Scala tool merely needs the server name to access the database of
  servers.\<close>
  @{verbatim [display] \<open>server "test" = 127.0.0.1:4711 (password "XYZ")\<close>}

  \<^medskip>
  List available server processes:
  @{verbatim [display] \<open>isabelle server -l\<close>}

  \<^medskip>
  Connect the command-line client to the above test server:
  @{verbatim [display] \<open>isabelle client -n test\<close>}

  Interaction now works on a line-by-line basis, with commands like \<^verbatim>\<open>help\<close> or
  \<^verbatim>\<open>echo\<close>. For example, some JSON values may be echoed like this:

  @{verbatim [display]
\<open>echo 42
echo [1, 2, 3]
echo {"a": "text", "b": true, "c": 42}\<close>}

  Closing the connection (via CTRL-D) leaves the server running: it is
  possible to reconnect again, and have multiple connections at the same time.

  \<^medskip>
  Finally, exit the named server on the command-line:
  @{verbatim [display] \<open>isabelle server -n test -x\<close>}
\<close>


section \<open>Protocol messages \label{sec:server-protocol}\<close>

text \<open>
  The Isabelle server listens on a regular TCP socket, using a line-oriented
  protocol of structured messages: input \<^emph>\<open>commands\<close> and output \<^emph>\<open>results\<close>
  (via \<^verbatim>\<open>OK\<close> or \<^verbatim>\<open>ERROR\<close>) are strictly alternating on the toplevel, but
  commands may also return a \<^emph>\<open>task\<close> identifier to indicate an ongoing
  asynchronous process that is joined later (via \<^verbatim>\<open>FINISHED\<close> or \<^verbatim>\<open>FAILED\<close>).
  Asynchronous \<^verbatim>\<open>NOTE\<close> messages may occur at any time: they are independent of
  the main command-reply protocol.

  For example, the synchronous \<^verbatim>\<open>echo\<close> command immediately returns its
  argument as \<^verbatim>\<open>OK\<close> result. In contrast, the asynchronous \<^verbatim>\<open>session_build\<close>
  command returns \<^verbatim>\<open>OK {"task":\<close>\<open>id\<close>\<^verbatim>\<open>}\<close> and continues in the background. It
  will eventually produce \<^verbatim>\<open>FINISHED {"task":\<close>\<open>id\<close>\<^verbatim>\<open>,\<close>\<open>\<dots>\<close>\<^verbatim>\<open>}\<close> or
  \<^verbatim>\<open>FAILED {"task":\<close>\<open>id\<close>\<^verbatim>\<open>,\<close>\<open>\<dots>\<close>\<^verbatim>\<open>}\<close> with the final result. Intermediately, it
  may emit asynchronous messages of the form \<^verbatim>\<open>NOTE {"task":\<close>\<open>id\<close>\<^verbatim>\<open>,\<close>\<open>\<dots>\<close>\<^verbatim>\<open>}\<close>
  to inform about its progress. Due to the explicit task identifier, the
  client can show these messages in the proper context, e.g.\ a GUI window for
  the session build job.

  \medskip Subsequently, the protocol message formats are described in further
  detail.
\<close>


subsection \<open>Byte messages\<close>

text \<open>
  The client-server connection is a raw byte-channel for bidirectional
  communication, but the Isabelle server always works with messages of a
  particular length. Messages are written as a single chunk that is flushed
  immediately.

  The message boundary is determined as follows:

    \<^item> A \<^emph>\<open>short message\<close> consists of a single line: it is a sequence of
    arbitrary bytes excluding CR (13) and LF (10), and terminated by CR-LF or
    just LF.

    \<^item> A \<^emph>\<open>long message\<close> starts with a single line as above, which consists
    only of decimal digits: that is interpreted as length of the subsequent
    block of arbitrary bytes. A final line-terminator may be included here,
    but is not required.

  Messages in JSON format (see below) always fit on a single line, due to
  escaping of newline characters within string literals. This is convenient
  for interactive experimentation, but it can impact performance for very long
  messages. If the message byte-length is given on the preceding line, the
  server can read the message efficiently as a single block.
\<close>


subsection \<open>Text messages\<close>

text \<open>
  Messages are read and written as byte streams (with byte lengths), but the
  content is always interpreted as plain text in terms of the UTF-8
  encoding.\<^footnote>\<open>See also the ``UTF-8 Everywhere Manifesto''
  \<^url>\<open>http://utf8everywhere.org\<close>.\<close>

  Note that line-endings and other formatting characters are invariant wrt.
  UTF-8 representation of text: thus implementations are free to determine the
  overall message structure before or after applying the text encoding.
\<close>


subsection \<open>Input and output messages\<close>

text \<open>
  Server input and output messages have a uniform format as follows:

    \<^item> \<open>name argument\<close> such that:

    \<^item> \<open>name\<close> is the longest prefix consisting of ASCII letters, digits,
    ``\<^verbatim>\<open>_\<close>'' (underscore), or ``\<^verbatim>\<open>.\<close>'' (dot),

    \<^item> the separator between \<open>name\<close> and \<open>argument\<close> is the longest possible
    sequence of ASCII blanks (it could be empty, e.g.\ when the argument
    starts with a quote or bracket),

    \<^item> \<open>argument\<close> is the rest of the message without the line terminator.

  \<^medskip>
  Input messages are sent from the client to the server. Here the \<open>name\<close>
  specifies a \<^emph>\<open>server command\<close>: the list of known commands may be
  retrieved via the \<^verbatim>\<open>help\<close> command.

  \<^medskip>
  Output messages are sent from the server to the client. Here the \<open>name\<close>
  specifies the \<^emph>\<open>server reply\<close>, which always has a specific meaning as
  follows:

    \<^item> synchronous results: \<^verbatim>\<open>OK\<close> or \<^verbatim>\<open>ERROR\<close>
    \<^item> asynchronous results: \<^verbatim>\<open>FINISHED\<close> or \<^verbatim>\<open>FAILED\<close>
    \<^item> intermediate notifications: \<^verbatim>\<open>NOTE\<close>

  \<^medskip>
  The \<open>argument\<close> format is uniform for both input and output messages:

    \<^item> empty argument (Scala type \<^verbatim>\<open>Unit\<close>)
    \<^item> XML element in YXML notation (Scala type \<^verbatim>\<open>XML.Elem\<close>)
    \<^item> JSON value (Scala type \<^verbatim>\<open>JSON.T\<close>)

  JSON values may consist of objects (records), arrays (lists), strings,
  numbers, bools, null.\<^footnote>\<open>See also the official specification
  \<^url>\<open>https://www.json.org\<close> and unofficial explorations ``Parsing JSON is a
  Minefield'' \<^url>\<open>http://seriot.ch/parsing_json.php\<close>.\<close> Since JSON requires
  explicit quotes and backslash-escapes to represent arbitrary text, the YXML
  notation for XML trees (\secref{sec:yxml-vs-xml}) works better
  for large messages with a lot of PIDE markup.

  Nonetheless, the most common commands use JSON by default: big chunks of
  text (theory sources etc.) are taken from the underlying file-system and
  results are pre-formatted for plain-text output, without PIDE markup
  information. This is a concession to simplicity: the server imitates the
  appearance of command-line tools on top of the Isabelle/PIDE infrastructure.
\<close>


subsection \<open>Initial password exchange\<close>

text \<open>
  Whenever a new client opens the server socket, the initial message needs to
  be its unique password. The server replies either with \<^verbatim>\<open>OK\<close> (and some
  information about the Isabelle version) or by silent disconnection of what
  is considered an illegal connection attempt.

  Server passwords are created as Universally Unique Identifier (UUID) in
  Isabelle/Scala and stored in a per-user database, with restricted
  file-system access only for the current user. The Isabelle/Scala server
  implementation is careful to expose the password only on private output
  channels, and not on a process command-line (which is accessible to other
  users, e.g.\ via the \<^verbatim>\<open>ps\<close> command).
\<close>


subsection \<open>Synchronous commands\<close>

text \<open>
  A \<^emph>\<open>synchronous command\<close> corresponds to regular function application in
  Isabelle/Scala, with single argument and result (regular or error). Both the
  argument and the result may consist of type \<^verbatim>\<open>Unit\<close>, \<^verbatim>\<open>XML.Elem\<close>, \<^verbatim>\<open>JSON.T\<close>.
  An error result typically consists of a JSON object with error message and
  potentially further fields (this resembles exceptions in Scala).

  These are the protocol exchanges for both cases of command execution:
  \begin{center}
  \begin{tabular}{rl}
  \<^bold>\<open>input:\<close> & \<open>command argument\<close> \\
  (a) regular \<^bold>\<open>output:\<close> & \<^verbatim>\<open>OK\<close> \<open>result\<close> \\
  (b) error \<^bold>\<open>output:\<close> & \<^verbatim>\<open>ERROR\<close> \<open>result\<close> \\
  \end{tabular}
  \end{center}
\<close>


subsection \<open>Asynchronous commands\<close>

text \<open>
  An \<^emph>\<open>asynchronous command\<close> corresponds to an ongoing process that finishes
  or fails eventually, while emitting arbitrary notifications intermediately.
  Formally, it starts as synchronous command with immediate result \<^verbatim>\<open>OK\<close> and
  the \<^verbatim>\<open>task\<close> identifier, or an immediate \<^verbatim>\<open>ERROR\<close> that indicates bad command
  syntax. For a running task, the termination is indicated later by
  \<^verbatim>\<open>FINISHED\<close> or \<^verbatim>\<open>FAILED\<close>, together with its ultimate result.

  These are the protocol exchanges for various cases of command task
  execution:

  \begin{center}
  \begin{tabular}{rl}
  \<^bold>\<open>input:\<close> & \<open>command argument\<close> \\
  immediate \<^bold>\<open>output:\<close> & \<^verbatim>\<open>OK {"task":\<close>\<open>id\<close>\<^verbatim>\<open>}\<close> \\
  intermediate \<^bold>\<open>output:\<close> & \<^verbatim>\<open>NOTE {"task":\<close>\<open>id\<close>\<^verbatim>\<open>,\<close>\<open>\<dots>\<close>\<^verbatim>\<open>}\<close> \\
  intermediate \<^bold>\<open>output:\<close> & \<^verbatim>\<open>NOTE {"task":\<close>\<open>id\<close>\<^verbatim>\<open>,\<close>\<open>\<dots>\<close>\<^verbatim>\<open>}\<close> \\
  (a) regular \<^bold>\<open>output:\<close> & \<^verbatim>\<open>FINISHED {"task":\<close>\<open>id\<close>\<^verbatim>\<open>,\<close>\<open>\<dots>\<close>\<^verbatim>\<open>}\<close> \\
  (b) error \<^bold>\<open>output:\<close> & \<^verbatim>\<open>FAILED {"task":\<close>\<open>id\<close>\<^verbatim>\<open>,\<close>\<open>\<dots>\<close>\<^verbatim>\<open>}\<close> \\[3ex]
  \<^bold>\<open>input:\<close> & \<open>command argument\<close> \\
  immediate \<^bold>\<open>output:\<close> & \<^verbatim>\<open>ERROR\<close>~\<open>\<dots>\<close> \\
  \end{tabular}
  \end{center}

  All asynchronous messages are decorated with the task identifier that was
  revealed in the immediate (synchronous) result. Thus it is possible to emit
  further asynchronous commands and dispatch the mingled stream of
  asynchronous messages properly.

  The synchronous command \<^verbatim>\<open>cancel\<close>~\<open>id\<close> tells the specified task to terminate
  prematurely: usually causing a \<^verbatim>\<open>FAILED\<close> result with error message
  \<^verbatim>\<open>Interrupt\<close>, but this is not guaranteed: the cancel event may come too
  late or the task may just ignore it.
\<close>

end