back to Unix format...
authorpaulson
Tue Apr 10 16:09:26 2001 +0200 (2001-04-10)
changeset 112487a696a130de2
parent 11247 a9c8861e84a9
child 11249 a0e3c67c1394
back to Unix format...
doc-src/TutorialI/Protocol/protocol.tex
     1.1 --- a/doc-src/TutorialI/Protocol/protocol.tex	Tue Apr 10 16:02:01 2001 +0200
     1.2 +++ b/doc-src/TutorialI/Protocol/protocol.tex	Tue Apr 10 16:09:26 2001 +0200
     1.3 @@ -1,1 +1,554 @@
     1.4 -% $Id$
\chapter{Case Study: Verifying a Cryptographic Protocol}
\label{chap:crypto}

%crypto primitives FIXME: GET RID OF MANY
\def\lbb{\mathopen{\{\kern-.30em|}}
\def\rbb{\mathclose{|\kern-.32em\}}}
\def\comp#1{\lbb#1\rbb}

Communications security is an ancient art.  Julius Caesar is said to have
encrypted his messages, shifting each letter three places along the
alphabet.  Mary Queen of Scots was convicted of treason after a cipher used
in her letters was broken.  Today's postal system incorporates security
features. The envelope provides a degree of
\emph{secrecy}.  The signature provides \emph{authenticity} (proof of
origin), as do departmental stamps and letterheads. 

Networks are vulnerable: messages pass through many computers, any of which
might be controlled an adversary, who thus can capture or redirect
messages.  People who wish to communicate securely over such a network can
use cryptography, but if they are to understand each other, they need to
follow a
\emph{protocol}: a pre-arranged sequence of message formats. 

Protocols can be attacked in many ways, even if encryption is unbreakable. 
A \emph{splicing attack} involves an adversary's sending a message composed
of parts of several old messages.  This fake message may have the correct
format, fooling an honest party.  The adversary might be able to masquerade
as somebody else, or he might obtain a secret key.

\emph{Nonces} help prevent splicing attacks. A typical nonce is a 20-byte
random number. Each message that requires a reply incorporates a nonce. The
reply must include a copy of that nonce, to prove that it is not a replay of
a past message.  Nonces must be cryptographically protected, since
otherwise an adversary could easily generate fakes. You should be starting
to see that protocol design is tricky!

Researchers are developing methods for proving the correctness of security
protocols.  The Needham-Schroeder public-key
protocol~\cite{needham-schroeder} has become a standard test case. 
Proposed in 1978, it was found to be defective nearly two decades
later~\cite{lowe-fdr}.  This toy protocol will be useful in demonstrating
how to verify protocols using Isabelle.


\section{The Needham-Schroeder Public-Key Protocol}\label{sec:ns-protocol}

This protocol uses public-key cryptography. Each person has a private key, known only to
himself, and a public key, known to everybody. If Alice wants to send Bob a secret message, she
encrypts it using Bob's public key (which everybody knows), and sends it to Bob. Only Bob has the
matching private key, which is needed in order to decrypt Alice's message.

The core of the Needham-Schroeder protocol consists of three messages:
\begin{alignat*}{2}
  &1.&\quad  A\to B  &: \comp{Na,A}\sb{Kb} \\
  &2.&\quad  B\to A  &: \comp{Na,Nb}\sb{Ka} \\
  &3.&\quad  A\to B  &: \comp{Nb}\sb{Kb}
\end{alignat*}
First, let's understand the notation. In the first message, Alice
sends Bob a message consisting of a nonce generated by Alice~($Na$)
paired  with Alice's name~($A$) and encrypted using Bob's public
key~($Kb$). In the second message, Bob sends Alice a message
consisting of $Na$ paired with a nonce generated by Bob~($Nb$), 
encrypted using Alice's public key~($Ka$). In the last message, Alice
returns $Nb$ to Bob, encrypted using his public key.

When Alice receives Message~2, she knows that Bob has acted on her
message, since only he could have decrypted
$\comp{Na,A}\sb{Kb}$ and extracted~$Na$.  That is precisely what
nonces are for.  Similarly, message~3 assures Bob that Alice is
active.  But the protocol was widely believed~\cite{ban89} to satisfy a
further property: that
$Na$ and~$Nb$ were secrets shared by Alice and Bob.  (Many
protocols generate such shared secrets, which can be used
to lessen the reliance on slow public-key operations.)  Lowe found this
claim to be false: if Alice runs the protocol with someone untrustworthy
(Charlie say), then he can start a new run with another agent (Bob say). 
Charlie uses Alice as an oracle, masquerading as
Alice to Bob~\cite{lowe-fdr}.
\begin{alignat*}{4}
  &1.&\quad  A\to C  &: \comp{Na,A}\sb{Kc}   & 
      \qquad 1'.&\quad  C\to B  &: \comp{Na,A}\sb{Kb} \\
  &2.&\quad  B\to A  &: \comp{Na,Nb}\sb{Ka} \\
  &3.&\quad  A\to C  &: \comp{Nb}\sb{Kc}  &
      3'.&\quad  A\to B  &: \comp{Nb}\sb{Kb}
\end{alignat*}
In messages~1 and~3, Charlie removes the encryption using his private
key and re-encrypts Alice's messages using Bob's public key. Bob is
left thinking he has run the protocol with Alice, which was not
Alice's intention, and Bob is unaware that the ``secret'' nonces are
known to Charlie.  This is a typical man-in-the-middle attack launched
by an insider.

Lowe also suggested a fix, namely to include Bob's name in message~2:
\begin{alignat*}{2}
  &1.&\quad  A\to B  &: \comp{Na,A}\sb{Kb} \\
  &2.&\quad  B\to A  &: \comp{Na,Nb,B}\sb{Ka} \\
  &3.&\quad  A\to B  &: \comp{Nb}\sb{Kb}
\end{alignat*}
If Charlie tries the same attack, Alice will receive the message
$\comp{Na,Nb,B}\sb{Ka}$ when she was expecting to receive
$\comp{Na,Nb,C}\sb{Ka}$.  She will abandon the run, and eventually so
will Bob.

In ground-breaking work, Lowe~\cite{lowe-fdr} showed how such attacks
could be found automatically using a model checker.  An alternative,
which we shall examine below, is to prove protocols correct.  Proofs
can be done under more realistic assumptions because our model does
not have to be finite.  The strategy is to formalize the operational
semantics of the system and to prove security properties using rule
induction.


\section{Agents and Messages}

All protocol specifications refer to a syntactic theory of messages. 
Datatype
\isa{agent} introduces the constant \isa{Server} (a trusted central
machine, needed for some protocols), an infinite population of ``friendly''
agents, and the~\isa{Spy}:
\begin{isabelle}
\isacommand{datatype}\ agent\ =\ Server\ |\ Friend\ nat\ |\ Spy
\end{isabelle}
%
Keys are just natural numbers.  Function \isa{invKey} maps a public key to
the matching private key, and vice versa:
\begin{isabelle}
\isacommand{types}\ key\ =\ nat\isanewline
\isacommand{consts}\ invKey\ ::\ "key=>key"
\end{isabelle}
Datatype
\isa{msg} introduces the message forms, which include agent names, nonces,
keys, compound messages, and encryptions.  
\begin{isabelle}
\isacommand{datatype}\isanewline
\ \ \ \ \ msg\ =\ Agent\ \ agent\ \ \ \ \ \isanewline
\ \ \ \ \ \ \ \ \ |\ Nonce\ \ nat\ \ \ \ \ \ \ \isanewline
\ \ \ \ \ \ \ \ \ |\ Key\ \ \ \ key\ \ \ \ \ \ \ \isanewline
\ \ \ \ \ \ \ \ \ |\ MPair\ \ msg\ msg\ \ \ \isanewline
\ \ \ \ \ \ \ \ \ |\ Crypt\ \ key\ msg\ \ \ \isanewline
\end{isabelle}
%
The notation $\comp{X\sb 1,\ldots X\sb{n-1},X\sb n}$
abbreviates
$\isa{MPair}\,X\sb 1\,\ldots\allowbreak(\isa{MPair}\,X\sb{n-1}\,X\sb n)$.

Since datatype constructors are injective, we have the theorem
\begin{isabelle}
Crypt K X = Crypt K' X' \isasymLongrightarrow\ K=K' \isasymand X=X'.
\end{isabelle}
A ciphertext can be decrypted using only one key and
can yield only one plaintext.  This assumption is realistic if encryption
includes some built-in redundancy.


\section{Modelling the Adversary}

The spy is part of the system and must be built into the model.  He is
a malicious user who does not have to follow the protocol.  He
watches the network and uses any keys he knows to decrypt messages,
perhaps learning additional keys and nonces.  He uses the items he has
accumulated to compose new messages, which he may send to anybody.  

Two functions enable us to formalize this behaviour: \isa{analz} and
\isa{synth}.  Each function maps a sets of messages to another set of
messages. The set \isa{analz H} formalizes what the adversary can learn
from the set of messages~$H$.  The closure properties of this set are
defined inductively.
\begin{isabelle}
\isacommand{consts}\ \ analz\ \ \ ::\ "msg\ set\ =>\ msg\ set"\isanewline
\isacommand{inductive}\ "analz\ H"\isanewline
\ \ \isakeyword{intros}\ \isanewline
\ \ \ \ Inj\ [intro,simp]\ :\ \ \ "X\ \isasymin \ H\
\isasymLongrightarrow\ X\
\isasymin
\ analz\ H"\isanewline
\ \ \ \ Fst:\ \ \ \ \ "\isacharbraceleft |X,Y|\isacharbraceright \ \isasymin \ analz\ H\
\isasymLongrightarrow\ X\ \isasymin \ analz\ H"\isanewline
\ \ \ \ Snd:\ \ \ \ \ "\isacharbraceleft |X,Y|\isacharbraceright \ \isasymin \ analz\ H\
\isasymLongrightarrow\ Y\ \isasymin \ analz\ H"\isanewline
\ \ \ \ Decrypt\ [dest]:\ \isanewline
\ \ \ \ \ \ \ \ \ \ \ \ \ "[|Crypt\ K\ X\ \isasymin \ analz\ H;\ Key(invKey\
K):\ analz\ H|]\isanewline
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \isasymLongrightarrow\ X\ \isasymin \ analz\ H"
\end{isabelle}
%
Note the \isa{Decrypt} rule: the spy can decrypt a
message encrypted with key~$K$ if he has the matching key,~$K^{-1}$. 
Properties proved by rule induction include the following:
\begin{isabelle}
G\isasymsubseteq H\ \isasymLongrightarrow\ analz(G)\ \isasymsubseteq\
analz(H)
\rulename{analz_mono}\isanewline
analz (analz H) = analz H
\rulename{analz_idem} 
\end{isabelle}

The set of fake messages that an intruder could invent
starting from~\isa{H} is \isa{synth(analz~H)}, where \isa{synth H}
formalizes what the adversary can build from the set of messages~$H$.  
\begin{isabelle}
\isacommand{consts}\ \ synth\ \ \ ::\ "msg\ set\ =>\ msg\ set"\isanewline
\isacommand{inductive}\ "synth\ H"\isanewline
\ \ \isakeyword{intros}\ \isanewline
\ \ \ \ Inj\ \ \ [intro]:\ "X\ \isasymin \ H\ \isasymLongrightarrow\
X\ \isasymin \ synth\ H"\isanewline
\ \ \ \ Agent\ [intro]:\ "Agent\ agt\ \isasymin \ synth\ H"\isanewline
\ \ \ \ MPair\ [intro]:\isanewline
\ \ \ \ \ \ \ \ \ \ \ \ \ "[|X\ \isasymin \ synth\ H;\ \ Y\ \isasymin
\ synth\ H|]\ \isasymLongrightarrow\
\isacharbraceleft |X,Y|\isacharbraceright \ \isasymin \ synth\ H"\isanewline
\ \ \ \ Crypt\ [intro]:\isanewline
\ \ \ \ \ \ \ \ \ \ \ \ \ "[|X\ \isasymin \ synth\ H;\ \ Key K\
\isasymin \ H|]\ \isasymLongrightarrow\ Crypt\ K\ X\
\isasymin \ synth\ H"
\end{isabelle}
The set includes all agent names.  Nonces and keys are assumed to be
unguessable, so none are included beyond those already in~$H$.   Two
elements of \isa{synth H} can be combined, and an element can be encrypted
using a key present in~$H$.

Like \isa{analz}, this set operator is monotone and idempotent.  It also
satisfies an interesting equation involving \isa{analz}:
\begin{isabelle}
analz (synth H)\ =\ analz H\ \isasymunion\ synth H
\rulename{analz_synth} 
\end{isabelle}
%
Rule inversion plays a major role in reasoning about \isa{synth}, through
declarations such as this one:
\begin{isabelle}
\isacommand{inductive\_cases}\ Nonce_synth\ [elim!]:\ "Nonce\ n\ \isasymin
\ synth\ H"
\end{isabelle}
%
The resulting elimination rule replaces every assumption of the form
\isa{Nonce\ n\ \isasymin \ synth\ H} by \isa{Nonce\ n\ 
\isasymin \ H}, expressing that a nonce cannot be guessed.  
%The theory also uses rule inversion with constructors \isa{Key}, \isa{Crypt}
%and \isa{MPair} (message pairing).

%
A third operator, \isa{parts}, is useful for stating correctness
properties.  The set
\isa{parts H} consists of the components of elements of~$H$.  This set
includes~\isa{H} and is closed under the projections from a compound
message to its immediate parts. 
Its definition resembles that of \isa{analz} except in the rule
corresponding to the constructor \isa{Crypt}: 
\begin{isabelle}
\ \ \ \ \ Crypt\ K\ X\ \isasymin \ parts\ H\ \isasymLongrightarrow\ X\
\isasymin \ parts\ H
\end{isabelle}
The body of an encrypted message is always regarded as part of it.  We can
use \isa{parts} to express general well-formedness properties of a protocol,
for example, that an uncompromised agent's private key will never be
included as a component of any message.


\section{Event Traces}\label{sec:events}

The system's behaviour is formalized as a set of traces of
\emph{events}.  The most important event, \isa{Says~A~B~X}, expresses the
attempt by~$A$ to send~$B$ the
  message~$X$.  A trace is simply a list, constructed in reverse
using~\isa{\#}.  

Sometimes the protocol requires an agent to generate a new nonce. The
probability that a 20-byte random number has appeared before is effectively
zero.  To formalize this important property, the set \isa{used evs}
denotes the set of all items mentioned in the trace~\isa{evs}.
The function \isa{used} has a straightforward
recursive definition.  Here is the case for \isa{Says} event:
\begin{isabelle}
\ \ \ \ \ used\ ((Says\ A\ B\ X)\ \#\ evs)\ =\ parts\ \isacharbraceleft
X\isacharbraceright \ \isasymunion\ (used\ evs)
\end{isabelle}

The function \isa{knows} formalizes an agent's knowledge.  Mostly we only
case about the spy's knowledge, and \isa{knows Spy evs} is the set of items
available to the spy in the trace~\isa{evs}.  Already in the empty trace,
the spy starts with some secrets at his disposal, such as the private keys
of compromised users.  After each \isa{Says} event, the spy learns the
message that was sent.  Combinations of functions express other important
concepts involving~\isa{evs}:
\begin{itemize}
\item \isa{analz (knows Spy evs)} is the items that the spy could
learn by decryption
\item \isa{synth (analz (knows Spy evs))} is the items that the spy
could generate
\end{itemize}

The function
\isa{pubK} maps agents to their public keys.  The function
\isa{priK} maps agents to their private keys, is defined in terms of
\isa{invKey} and \isa{pubK} by a translation.
\begin{isabelle}
\isacommand{consts}\ \,pubK\ \ \ \ ::\ "agent\ =>\ key"\isanewline
\isacommand{syntax}\ priK\ \ \ \ ::\ "agent\ =>\ key"\isanewline
\isacommand{translations}\ \ \ \ "priK\ x"\ \ ==\ "invKey(pubK\ x)"
\end{isabelle}
The set \isa{bad} consists of those agents whose private keys are known to
the spy.

Two axioms are asserted about the public-key cryptosystem. 
No two agents have the same public key, and no private key equals
any public key.
\begin{isabelle}
\isacommand{axioms}\isanewline
\ \ inj_pubK:\ \ \ \ \ \ \ \ "inj\ pubK"\isanewline
\ \ priK_neq_pubK:\ \ \ "priK\ A\ \isasymnoteq\ pubK\ B"
\end{isabelle}





\section{Modelling the Protocol}\label{sec:modelling}

Let us formalize the Needham-Schroeder public-key protocol, as corrected by
Lowe:
\begin{alignat*}{2}
  &1.&\quad  A\to B  &: \comp{Na,A}\sb{Kb} \\
  &2.&\quad  B\to A  &: \comp{Na,Nb,B}\sb{Ka} \\
  &3.&\quad  A\to B  &: \comp{Nb}\sb{Kb}
\end{alignat*}

Each protocol step is specified by a rule of an inductive definition.  An
event trace has type \isa{event list}, so we declare the constant
\isa{ns_public} to be a set of such traces.
\begin{isabelle}
\isacommand{consts}\ \ ns_public\ \ ::\ "event\ list\ set"
\end{isabelle}

\begin{figure}
\begin{isabelle}
\isacommand{inductive}\ ns_public\isanewline
\ \ \isakeyword{intros}\ \isanewline
\ \ \ \ \ \ \ \ \ \isanewline
\ \ \ Nil:\ \ "[]\ \isasymin \ ns_public"\isanewline
\isanewline
\ \ \ \ \ \ \ \ \ \isanewline
\ \ \ Fake:\ "\isasymlbrakk evsf\ \isasymin \ ns_public;\ \ X\ \isasymin \ synth\ (analz\ (spies\ evsf))\isasymrbrakk \isanewline
\ \ \ \ \ \ \ \ \ \ \isasymLongrightarrow \ Says\ Spy\ B\ X\ \ \#\ evsf\ \isasymin \ ns_public"\isanewline
\isanewline
\ \ \ \ \ \ \ \ \ \isanewline
\ \ \ NS1:\ \ "\isasymlbrakk evs1\ \isasymin \ ns_public;\ \ Nonce\ NA\ \isasymnotin \ used\ evs1\isasymrbrakk \isanewline
\ \ \ \ \ \ \ \ \ \ \isasymLongrightarrow \ Says\ A\ B\ (Crypt\ (pubK\ B)\ \isasymlbrace Nonce\ NA,\ Agent\ A\isasymrbrace )\isanewline
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \#\ evs1\ \ \isasymin \ \ ns_public"\isanewline
\isanewline
\ \ \ \ \ \ \ \ \ \isanewline
\ \ \ NS2:\ \ "\isasymlbrakk evs2\ \isasymin \ ns_public;\ \ Nonce\ NB\ \isasymnotin \ used\ evs2;\isanewline
\ \ \ \ \ \ \ \ \ \ \ Says\ A'\ B\ (Crypt\ (pubK\ B)\ \isasymlbrace Nonce\ NA,\ Agent\ A\isasymrbrace )\ \isasymin \ set\ evs2\isasymrbrakk \isanewline
\ \ \ \ \ \ \ \ \ \ \isasymLongrightarrow \ Says\ B\ A\ (Crypt\ (pubK\ A)\ \isasymlbrace Nonce\ NA,\ Nonce\ NB,\ Agent\ B\isasymrbrace )\isanewline
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \#\ evs2\ \ \isasymin \ \ ns_public"\isanewline
\isanewline
\ \ \ \ \ \ \ \ \ \isanewline
\ \ \ NS3:\ \ "\isasymlbrakk evs3\ \isasymin \ ns_public;\isanewline
\ \ \ \ \ \ \ \ \ \ \ Says\ A\ \ B\ (Crypt\ (pubK\ B)\ \isasymlbrace Nonce\ NA,\ Agent\ A\isasymrbrace )\ \isasymin \ set\ evs3;\isanewline
\ \ \ \ \ \ \ \ \ \ \ Says\ B'\ A\ (Crypt\ (pubK\ A)\ \isasymlbrace Nonce\ NA,\ Nonce\ NB,\ Agent\ B\isasymrbrace )\isanewline
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \isasymin \ set\ evs3\isasymrbrakk \isanewline
\ \ \ \ \ \ \ \ \ \ \isasymLongrightarrow \ Says\ A\ B\ (Crypt\ (pubK\ B)\ (Nonce\ NB))\ \#\ evs3\ \isasymin \
ns_public"
\end{isabelle}
\caption{An Inductive Protocol Definition}\label{fig:ns_public}
\end{figure}

Figure~\ref{fig:ns_public} presents the inductive definition.  The
\isa{Nil} rule introduces the empty trace.  The \isa{Fake} rule models the
adversary's sending a message built from components taken from past
traffic, expressed using the functions \isa{synth} and
\isa{analz}. 
The next three rules model how honest agents would perform the three
protocol steps.  

Here is a detailed explanation of rule \isa{NS2}.
A trace containing an event of the form
\begin{isabelle}
\ \ \ \ \ Says\ A'\ B\ (Crypt\ (pubK\ B)\ \isasymlbrace Nonce\
NA,\ Agent\ A\isasymrbrace)
\end{isabelle}
%
may be extended by an event of the form
\begin{isabelle}
\ \ \ \ \ Says\ B\ A\ (Crypt\ (pubK\ A)\
\isasymlbrace Nonce\ NA,\ Nonce\ NB,\ Agent\ B\isasymrbrace)
\end{isabelle}
where \isa{NB} is a fresh nonce: \isa{Nonce\ NB\ \isasymnotin \ used\ evs2}.
Writing the sender as \isa{A'} indicates that \isa{B} does not 
know who sent the message.  Calling the trace variable \isa{evs2} rather
than simply \isa{evs} helps us know where we are in a proof after many
case-splits: every subgoal mentioning \isa{evs2} involves message~2 of the
protocol.

Benefits of this approach are simplicity and clarity.  The semantic model
is set theory, proofs are by induction and the translation from the informal
notation to the inductive rules is straightforward. 


\section{Proving Elementary Properties}

Secrecy properties are hard to prove.  The conclusion of a typical secrecy
theorem is 
\isa{X\ \isasymnotin\ analz (knows Spy evs)}.  The difficulty arises from
having to reason about \isa{analz}, or less formally, showing that the spy
can never learn~\isa{X}.  Much easier is to prove that \isa{X} can never
occur at all.  Such \emph{regularity} properties are typically expressed
using \isa{parts} rather than \isa{analz}.

The following lemma states that \isa{A}'s private key is potentially
known to the spy if and only if \isa{A} belongs to the set \isa{bad} of
compromised agents.  The statement uses \isa{parts}: the very presence of
her private key in a message, whether protected by encryption or not, is
enough to confirm that \isa{A} is compromised.  The proof, like nearly all
protocol proofs, is by induction over traces.
\begin{isabelle}
\isacommand{lemma}\ Spy_see_priK\ [simp]:\ \isanewline
\ \ \ \ \ \ "evs\ \isasymin \ ns_public\isanewline
\ \ \ \ \ \ \ \isasymLongrightarrow \
(Key\ (priK\ A)\ \isasymin \ parts\ (spies\ evs))\ =\ (A\ \isasymin \
bad)"\isanewline
\isacommand{apply}\ (erule\ ns_public.induct,\ simp_all)
\end{isabelle}
%
The induction yields five subgoals, one for each rule in the definition of
\isa{ns_public}.  Informally, the idea is to prove that the protocol
property holds initially (rule \isa{Nil}), is preserved by each of the
legitimate protocol steps (rules \isa{NS1}--\isa{3}), and even is preserved
in the face of anything the spy can do (rule \isa{Fake}).  Simplification
leaves only the \isa{Fake} case (as indicated by the variable name
\isa{evsf}):
\begin{isabelle}
\ 1.\ \isasymAnd X\ evsf.\isanewline
\isaindent{\ 1.\ \ \ \ }\isasymlbrakk evsf\ \isasymin \ ns_public;\isanewline
\isaindent{\ 1.\ \ \ \ \ \ \ }(Key\ (priK\ A)\ \isasymin \ parts\ (knows\ Spy\ evsf))\ =\ (A\ \isasymin \ bad);\isanewline
\isaindent{\ 1.\ \ \ \ \ \ \ }X\ \isasymin \ synth\ (analz\ (knows\ Spy\ evsf))\isasymrbrakk \isanewline
\isaindent{\ 1.\ \ \ \ }\isasymLongrightarrow \ (Key\ (priK\ A)\ \isasymin \ parts\ (insert\ X\ (knows\ Spy\ evsf)))\ =\isanewline
\isaindent{\ 1.\ \ \ \ \isasymLongrightarrow \ }(A\ \isasymin \
bad)\isanewline
\isacommand{by}\ blast
\end{isabelle}
%
The \isa{Fake} case is proved automatically.  If
\isa{priK~A} is in the extended trace then either (1) it was already in the
original trace or (2) it was
generated by the spy, and so the spy must have known this key already. 
Either way, the induction hypothesis applies.  If the spy can tell himself
something, then he must have known it already.

\emph{Unicity} lemmas are regularity lemmas stating that specified items
can occur only once in a trace.  The following lemma states that a nonce
cannot be used both as $Na$ and as $Nb$ unless
it is known to the spy.  Intuitively, it holds because honest agents
always choose fresh values as nonces; only the spy might reuse a value,
and he doesn't know this particular value.  The proof script is three steps
long.
\begin{isabelle}
\isacommand{lemma}\ no_nonce_NS1_NS2\ [rule_format]:\ \isanewline
\ "evs\ \isasymin \ ns_public\ \isanewline
\ \ \isasymLongrightarrow \ Crypt\ (pubK\ C)\ \isasymlbrace NA',\ Nonce\ NA,\ Agent\ D\isasymrbrace \ \isasymin \ parts\ (spies\ evs)\ \isasymlongrightarrow \isanewline
\ \ \ \ \ \ Crypt\ (pubK\ B)\ \isasymlbrace Nonce\ NA,\ Agent\ A\isasymrbrace \ \isasymin \ parts\ (spies\ evs)\ \isasymlongrightarrow \ \ \isanewline
\ \ \ \ \ \ Nonce\ NA\ \isasymin \ analz\ (spies\ evs)"
\end{isabelle}

This unicity lemma states that, if \isa{NA} is secret, then its appearance
in any instance of message~1 determines the other components.  The proof is
another easy induction.
\begin{isabelle}
\isacommand{lemma}\ unique_NA:\ \isanewline
\ \ \ \ \ "\isasymlbrakk Crypt(pubK\ B)\ \ \isasymlbrace Nonce\ NA,\ Agent\ A\ \isasymrbrace \ \isasymin \ parts(spies\ evs);\ \ \isanewline
\ \ \ \ \ \ \ Crypt(pubK\ B')\ \isasymlbrace Nonce\ NA,\ Agent\ A'\isasymrbrace \ \isasymin \ parts(spies\ evs);\ \ \isanewline
\ \ \ \ \ \ \ Nonce\ NA\ \isasymnotin \ analz\ (spies\ evs);\ evs\ \isasymin \ ns_public\isasymrbrakk \isanewline
\ \ \ \ \ \ \isasymLongrightarrow \ A=A'\ \isasymand \ B=B'"
\end{isabelle}


\section{Proving Secrecy Theorems}\label{sec:secrecy}

The secrecy theorems for Bob (the second participant) are especially
important, since they fail for the original protocol.  This theorem states
that if Bob sends message~2 to Alice, and both agents are uncompromised,
then Bob's nonce will never reach the spy.
\begin{isabelle}
\isacommand{theorem}\ Spy_not_see_NB\ [dest]:\isanewline
\ "\isasymlbrakk Says\ B\ A\ (Crypt\ (pubK\ A)\ \isasymlbrace Nonce\ NA,\ Nonce\ NB,\ Agent\ B\isasymrbrace )\ \isasymin \ set\ evs;\isanewline
\ \ \ A\ \isasymnotin \ bad;\ \ B\ \isasymnotin \ bad;\ \ evs\ \isasymin \ ns_public\isasymrbrakk \isanewline
\ \ \isasymLongrightarrow \ Nonce\ NB\ \isasymnotin \ analz\ (spies\ evs)"
\end{isabelle}
%
To prove this, we must formulate the induction properly (one of the
assumptions mentions~\isa{evs}), apply induction, and simplify.
The proof states are too complicated to present in full.  
Let's just look
at the easiest subgoal, that for message~1:
\begin{isabelle}
\ 1.\ \isasymAnd Ba\ NAa\ evs1.\isanewline
\isaindent{\ 1.\ \ \ \ }\isasymlbrakk A\ \isasymnotin \ bad;\ B\ \isasymnotin \ bad;\ evs1\ \isasymin \ ns_public;\isanewline
\isaindent{\ 1.\ \ \ \ \ \ \ }Says\ B\ A\ (Crypt\ (pubK\ A)\ \isasymlbrace Nonce\ NA,\ Nonce\ NB,\ Agent\ B\isasymrbrace )\isanewline
\isaindent{\ 1.\ \ \ \ \ \ \ }\isasymin \ set\ evs1\ \isasymlongrightarrow \isanewline
\isaindent{\ 1.\ \ \ \ \ \ \ }Nonce\ NB\ \isasymnotin \ analz\ (knows\ Spy\ evs1);\isanewline
\isaindent{\ 1.\ \ \ \ \ \ \ }Nonce\ NAa\ \isasymnotin \ used\ evs1\isasymrbrakk \isanewline
\isaindent{\ 1.\ \ \ \ }\isasymLongrightarrow \ Ba\ \isasymin \ bad\ \isasymlongrightarrow \isanewline
\isaindent{\ 1.\ \ \ \ \isasymLongrightarrow \ }Says\ B\ A\ (Crypt\ (pubK\ A)\ \isasymlbrace Nonce\ NA,\ Nonce\ NB,\ Agent\ B\isasymrbrace )\isanewline
\isaindent{\ 1.\ \ \ \ \isasymLongrightarrow \ }\isasymin \ set\ evs1\ \isasymlongrightarrow \isanewline
\isaindent{\ 1.\ \ \ \ \isasymLongrightarrow \ }NB\ \isasymnoteq \ NAa
\end{isabelle}
This subgoal refers to another protocol run, in which \isa{B} has sent
message~1 to somebody called~\isa{Ba}.  Agent \isa{Ba} 
is compromised, and so the spy has learnt the nonce that was just sent,
which is called~\isa{NAa}.  We need to know that this nonce differs from the
one we care about, namely~\isa{NB}\@.  They do indeed differ: \isa{NB} was
sent earlier, while \isa{NAa} was chosen to be fresh (we have the assumption
\isa{Nonce\ NAa\ \isasymnotin \ used\ evs1}). 

Note that our reasoning concerned \isa{B}'s participation in another
run.  Agents may engage in several runs concurrently, and some attacks work
by interleaving the messages of two runs.  With model checking, this
possibility can cause a state-space explosion, and for us it
certainly complicates proofs.  The biggest subgoal concerns message~2.  It
splits into several cases, some of which are proved by unicity, others by
the induction hypothesis.  For all those complications, the proofs are
automatic by \isa{blast} with the theorem \isa{no_nonce_NS1_NS2}.

The remaining proofs are straightforward.  This theorem asserts that if
\isa{B} has sent an instance of message~2 to~\isa{A} and has received the
expected reply, then that reply really came from~\isa{A}.  The proof is a
simple induction.
\begin{isabelle}
\isacommand{theorem}\ B_trusts_NS3:\isanewline
\ "\isasymlbrakk Says\ B\ A\ \ (Crypt\ (pubK\ A)\ \isasymlbrace Nonce\ NA,\ Nonce\ NB,\ Agent\ B\isasymrbrace )\ \isasymin \ set\ evs;\isanewline
\ \ \ Says\ A'\ B\ (Crypt\ (pubK\ B)\ (Nonce\ NB))\ \isasymin \ set\ evs;\ \isanewline
\ \ \ A\ \isasymnotin \ bad;\ \ B\ \isasymnotin \ bad;\ \ evs\ \isasymin \ ns_public\isasymrbrakk \ \ \ \ \ \ \ \ \isanewline
\ \ \isasymLongrightarrow \ Says\ A\ B\ (Crypt\ (pubK\ B)\ (Nonce\ NB))\ \isasymin \ set\
evs"
\end{isabelle}

From similar assumptions, we can prove that \isa{A} started the protocol
run by sending an instance of message~1 involving the nonce~\isa{NA}.  For
this theorem, the conclusion is 
\begin{isabelle}
\ \ \ \ \ Says\ A\ B\ (Crypt\ (pubK\ B)\ \isasymlbrace Nonce\ NA,\ Agent\
A\isasymrbrace )\ \isasymin \ set\ evs
\end{isabelle}
%
Analogous theorems can be proved for~\isa{A}, stating that nonce~\isa{NA}
remains secret and that message~2 really originates with~\isa{B}.  Even the
flawed protocol establishes these properties for~\isa{A};
the flaw only harms the second participant.
Detailed information on this technique can be found
elsewhere~\cite{paulson-jcs}, including proofs of an Internet
protocol~\cite{paulson-tls}.


\endinput
     1.5 \ No newline at end of file
     1.6 +% $Id$
     1.7 +\chapter{Case Study: Verifying a Cryptographic Protocol}
     1.8 +\label{chap:crypto}
     1.9 +
    1.10 +%crypto primitives FIXME: GET RID OF MANY
    1.11 +\def\lbb{\mathopen{\{\kern-.30em|}}
    1.12 +\def\rbb{\mathclose{|\kern-.32em\}}}
    1.13 +\def\comp#1{\lbb#1\rbb}
    1.14 +
    1.15 +Communications security is an ancient art.  Julius Caesar is said to have
    1.16 +encrypted his messages, shifting each letter three places along the
    1.17 +alphabet.  Mary Queen of Scots was convicted of treason after a cipher used
    1.18 +in her letters was broken.  Today's postal system incorporates security
    1.19 +features. The envelope provides a degree of
    1.20 +\emph{secrecy}.  The signature provides \emph{authenticity} (proof of
    1.21 +origin), as do departmental stamps and letterheads. 
    1.22 +
    1.23 +Networks are vulnerable: messages pass through many computers, any of which
    1.24 +might be controlled an adversary, who thus can capture or redirect
    1.25 +messages.  People who wish to communicate securely over such a network can
    1.26 +use cryptography, but if they are to understand each other, they need to
    1.27 +follow a
    1.28 +\emph{protocol}: a pre-arranged sequence of message formats. 
    1.29 +
    1.30 +Protocols can be attacked in many ways, even if encryption is unbreakable. 
    1.31 +A \emph{splicing attack} involves an adversary's sending a message composed
    1.32 +of parts of several old messages.  This fake message may have the correct
    1.33 +format, fooling an honest party.  The adversary might be able to masquerade
    1.34 +as somebody else, or he might obtain a secret key.
    1.35 +
    1.36 +\emph{Nonces} help prevent splicing attacks. A typical nonce is a 20-byte
    1.37 +random number. Each message that requires a reply incorporates a nonce. The
    1.38 +reply must include a copy of that nonce, to prove that it is not a replay of
    1.39 +a past message.  Nonces must be cryptographically protected, since
    1.40 +otherwise an adversary could easily generate fakes. You should be starting
    1.41 +to see that protocol design is tricky!
    1.42 +
    1.43 +Researchers are developing methods for proving the correctness of security
    1.44 +protocols.  The Needham-Schroeder public-key
    1.45 +protocol~\cite{needham-schroeder} has become a standard test case. 
    1.46 +Proposed in 1978, it was found to be defective nearly two decades
    1.47 +later~\cite{lowe-fdr}.  This toy protocol will be useful in demonstrating
    1.48 +how to verify protocols using Isabelle.
    1.49 +
    1.50 +
    1.51 +\section{The Needham-Schroeder Public-Key Protocol}\label{sec:ns-protocol}
    1.52 +
    1.53 +This protocol uses public-key cryptography. Each person has a private key, known only to
    1.54 +himself, and a public key, known to everybody. If Alice wants to send Bob a secret message, she
    1.55 +encrypts it using Bob's public key (which everybody knows), and sends it to Bob. Only Bob has the
    1.56 +matching private key, which is needed in order to decrypt Alice's message.
    1.57 +
    1.58 +The core of the Needham-Schroeder protocol consists of three messages:
    1.59 +\begin{alignat*}{2}
    1.60 +  &1.&\quad  A\to B  &: \comp{Na,A}\sb{Kb} \\
    1.61 +  &2.&\quad  B\to A  &: \comp{Na,Nb}\sb{Ka} \\
    1.62 +  &3.&\quad  A\to B  &: \comp{Nb}\sb{Kb}
    1.63 +\end{alignat*}
    1.64 +First, let's understand the notation. In the first message, Alice
    1.65 +sends Bob a message consisting of a nonce generated by Alice~($Na$)
    1.66 +paired  with Alice's name~($A$) and encrypted using Bob's public
    1.67 +key~($Kb$). In the second message, Bob sends Alice a message
    1.68 +consisting of $Na$ paired with a nonce generated by Bob~($Nb$), 
    1.69 +encrypted using Alice's public key~($Ka$). In the last message, Alice
    1.70 +returns $Nb$ to Bob, encrypted using his public key.
    1.71 +
    1.72 +When Alice receives Message~2, she knows that Bob has acted on her
    1.73 +message, since only he could have decrypted
    1.74 +$\comp{Na,A}\sb{Kb}$ and extracted~$Na$.  That is precisely what
    1.75 +nonces are for.  Similarly, message~3 assures Bob that Alice is
    1.76 +active.  But the protocol was widely believed~\cite{ban89} to satisfy a
    1.77 +further property: that
    1.78 +$Na$ and~$Nb$ were secrets shared by Alice and Bob.  (Many
    1.79 +protocols generate such shared secrets, which can be used
    1.80 +to lessen the reliance on slow public-key operations.)  Lowe found this
    1.81 +claim to be false: if Alice runs the protocol with someone untrustworthy
    1.82 +(Charlie say), then he can start a new run with another agent (Bob say). 
    1.83 +Charlie uses Alice as an oracle, masquerading as
    1.84 +Alice to Bob~\cite{lowe-fdr}.
    1.85 +\begin{alignat*}{4}
    1.86 +  &1.&\quad  A\to C  &: \comp{Na,A}\sb{Kc}   & 
    1.87 +      \qquad 1'.&\quad  C\to B  &: \comp{Na,A}\sb{Kb} \\
    1.88 +  &2.&\quad  B\to A  &: \comp{Na,Nb}\sb{Ka} \\
    1.89 +  &3.&\quad  A\to C  &: \comp{Nb}\sb{Kc}  &
    1.90 +      3'.&\quad  A\to B  &: \comp{Nb}\sb{Kb}
    1.91 +\end{alignat*}
    1.92 +In messages~1 and~3, Charlie removes the encryption using his private
    1.93 +key and re-encrypts Alice's messages using Bob's public key. Bob is
    1.94 +left thinking he has run the protocol with Alice, which was not
    1.95 +Alice's intention, and Bob is unaware that the ``secret'' nonces are
    1.96 +known to Charlie.  This is a typical man-in-the-middle attack launched
    1.97 +by an insider.
    1.98 +
    1.99 +Lowe also suggested a fix, namely to include Bob's name in message~2:
   1.100 +\begin{alignat*}{2}
   1.101 +  &1.&\quad  A\to B  &: \comp{Na,A}\sb{Kb} \\
   1.102 +  &2.&\quad  B\to A  &: \comp{Na,Nb,B}\sb{Ka} \\
   1.103 +  &3.&\quad  A\to B  &: \comp{Nb}\sb{Kb}
   1.104 +\end{alignat*}
   1.105 +If Charlie tries the same attack, Alice will receive the message
   1.106 +$\comp{Na,Nb,B}\sb{Ka}$ when she was expecting to receive
   1.107 +$\comp{Na,Nb,C}\sb{Ka}$.  She will abandon the run, and eventually so
   1.108 +will Bob.
   1.109 +
   1.110 +In ground-breaking work, Lowe~\cite{lowe-fdr} showed how such attacks
   1.111 +could be found automatically using a model checker.  An alternative,
   1.112 +which we shall examine below, is to prove protocols correct.  Proofs
   1.113 +can be done under more realistic assumptions because our model does
   1.114 +not have to be finite.  The strategy is to formalize the operational
   1.115 +semantics of the system and to prove security properties using rule
   1.116 +induction.
   1.117 +
   1.118 +
   1.119 +\section{Agents and Messages}
   1.120 +
   1.121 +All protocol specifications refer to a syntactic theory of messages. 
   1.122 +Datatype
   1.123 +\isa{agent} introduces the constant \isa{Server} (a trusted central
   1.124 +machine, needed for some protocols), an infinite population of ``friendly''
   1.125 +agents, and the~\isa{Spy}:
   1.126 +\begin{isabelle}
   1.127 +\isacommand{datatype}\ agent\ =\ Server\ |\ Friend\ nat\ |\ Spy
   1.128 +\end{isabelle}
   1.129 +%
   1.130 +Keys are just natural numbers.  Function \isa{invKey} maps a public key to
   1.131 +the matching private key, and vice versa:
   1.132 +\begin{isabelle}
   1.133 +\isacommand{types}\ key\ =\ nat\isanewline
   1.134 +\isacommand{consts}\ invKey\ ::\ "key=>key"
   1.135 +\end{isabelle}
   1.136 +Datatype
   1.137 +\isa{msg} introduces the message forms, which include agent names, nonces,
   1.138 +keys, compound messages, and encryptions.  
   1.139 +\begin{isabelle}
   1.140 +\isacommand{datatype}\isanewline
   1.141 +\ \ \ \ \ msg\ =\ Agent\ \ agent\ \ \ \ \ \isanewline
   1.142 +\ \ \ \ \ \ \ \ \ |\ Nonce\ \ nat\ \ \ \ \ \ \ \isanewline
   1.143 +\ \ \ \ \ \ \ \ \ |\ Key\ \ \ \ key\ \ \ \ \ \ \ \isanewline
   1.144 +\ \ \ \ \ \ \ \ \ |\ MPair\ \ msg\ msg\ \ \ \isanewline
   1.145 +\ \ \ \ \ \ \ \ \ |\ Crypt\ \ key\ msg\ \ \ \isanewline
   1.146 +\end{isabelle}
   1.147 +%
   1.148 +The notation $\comp{X\sb 1,\ldots X\sb{n-1},X\sb n}$
   1.149 +abbreviates
   1.150 +$\isa{MPair}\,X\sb 1\,\ldots\allowbreak(\isa{MPair}\,X\sb{n-1}\,X\sb n)$.
   1.151 +
   1.152 +Since datatype constructors are injective, we have the theorem
   1.153 +\begin{isabelle}
   1.154 +Crypt K X = Crypt K' X' \isasymLongrightarrow\ K=K' \isasymand X=X'.
   1.155 +\end{isabelle}
   1.156 +A ciphertext can be decrypted using only one key and
   1.157 +can yield only one plaintext.  This assumption is realistic if encryption
   1.158 +includes some built-in redundancy.
   1.159 +
   1.160 +
   1.161 +\section{Modelling the Adversary}
   1.162 +
   1.163 +The spy is part of the system and must be built into the model.  He is
   1.164 +a malicious user who does not have to follow the protocol.  He
   1.165 +watches the network and uses any keys he knows to decrypt messages,
   1.166 +perhaps learning additional keys and nonces.  He uses the items he has
   1.167 +accumulated to compose new messages, which he may send to anybody.  
   1.168 +
   1.169 +Two functions enable us to formalize this behaviour: \isa{analz} and
   1.170 +\isa{synth}.  Each function maps a sets of messages to another set of
   1.171 +messages. The set \isa{analz H} formalizes what the adversary can learn
   1.172 +from the set of messages~$H$.  The closure properties of this set are
   1.173 +defined inductively.
   1.174 +\begin{isabelle}
   1.175 +\isacommand{consts}\ \ analz\ \ \ ::\ "msg\ set\ =>\ msg\ set"\isanewline
   1.176 +\isacommand{inductive}\ "analz\ H"\isanewline
   1.177 +\ \ \isakeyword{intros}\ \isanewline
   1.178 +\ \ \ \ Inj\ [intro,simp]\ :\ \ \ "X\ \isasymin \ H\
   1.179 +\isasymLongrightarrow\ X\
   1.180 +\isasymin
   1.181 +\ analz\ H"\isanewline
   1.182 +\ \ \ \ Fst:\ \ \ \ \ "\isacharbraceleft |X,Y|\isacharbraceright \ \isasymin \ analz\ H\
   1.183 +\isasymLongrightarrow\ X\ \isasymin \ analz\ H"\isanewline
   1.184 +\ \ \ \ Snd:\ \ \ \ \ "\isacharbraceleft |X,Y|\isacharbraceright \ \isasymin \ analz\ H\
   1.185 +\isasymLongrightarrow\ Y\ \isasymin \ analz\ H"\isanewline
   1.186 +\ \ \ \ Decrypt\ [dest]:\ \isanewline
   1.187 +\ \ \ \ \ \ \ \ \ \ \ \ \ "[|Crypt\ K\ X\ \isasymin \ analz\ H;\ Key(invKey\
   1.188 +K):\ analz\ H|]\isanewline
   1.189 +\ \ \ \ \ \ \ \ \ \ \ \ \ \ \isasymLongrightarrow\ X\ \isasymin \ analz\ H"
   1.190 +\end{isabelle}
   1.191 +%
   1.192 +Note the \isa{Decrypt} rule: the spy can decrypt a
   1.193 +message encrypted with key~$K$ if he has the matching key,~$K^{-1}$. 
   1.194 +Properties proved by rule induction include the following:
   1.195 +\begin{isabelle}
   1.196 +G\isasymsubseteq H\ \isasymLongrightarrow\ analz(G)\ \isasymsubseteq\
   1.197 +analz(H)
   1.198 +\rulename{analz_mono}\isanewline
   1.199 +analz (analz H) = analz H
   1.200 +\rulename{analz_idem} 
   1.201 +\end{isabelle}
   1.202 +
   1.203 +The set of fake messages that an intruder could invent
   1.204 +starting from~\isa{H} is \isa{synth(analz~H)}, where \isa{synth H}
   1.205 +formalizes what the adversary can build from the set of messages~$H$.  
   1.206 +\begin{isabelle}
   1.207 +\isacommand{consts}\ \ synth\ \ \ ::\ "msg\ set\ =>\ msg\ set"\isanewline
   1.208 +\isacommand{inductive}\ "synth\ H"\isanewline
   1.209 +\ \ \isakeyword{intros}\ \isanewline
   1.210 +\ \ \ \ Inj\ \ \ [intro]:\ "X\ \isasymin \ H\ \isasymLongrightarrow\
   1.211 +X\ \isasymin \ synth\ H"\isanewline
   1.212 +\ \ \ \ Agent\ [intro]:\ "Agent\ agt\ \isasymin \ synth\ H"\isanewline
   1.213 +\ \ \ \ MPair\ [intro]:\isanewline
   1.214 +\ \ \ \ \ \ \ \ \ \ \ \ \ "[|X\ \isasymin \ synth\ H;\ \ Y\ \isasymin
   1.215 +\ synth\ H|]\ \isasymLongrightarrow\
   1.216 +\isacharbraceleft |X,Y|\isacharbraceright \ \isasymin \ synth\ H"\isanewline
   1.217 +\ \ \ \ Crypt\ [intro]:\isanewline
   1.218 +\ \ \ \ \ \ \ \ \ \ \ \ \ "[|X\ \isasymin \ synth\ H;\ \ Key K\
   1.219 +\isasymin \ H|]\ \isasymLongrightarrow\ Crypt\ K\ X\
   1.220 +\isasymin \ synth\ H"
   1.221 +\end{isabelle}
   1.222 +The set includes all agent names.  Nonces and keys are assumed to be
   1.223 +unguessable, so none are included beyond those already in~$H$.   Two
   1.224 +elements of \isa{synth H} can be combined, and an element can be encrypted
   1.225 +using a key present in~$H$.
   1.226 +
   1.227 +Like \isa{analz}, this set operator is monotone and idempotent.  It also
   1.228 +satisfies an interesting equation involving \isa{analz}:
   1.229 +\begin{isabelle}
   1.230 +analz (synth H)\ =\ analz H\ \isasymunion\ synth H
   1.231 +\rulename{analz_synth} 
   1.232 +\end{isabelle}
   1.233 +%
   1.234 +Rule inversion plays a major role in reasoning about \isa{synth}, through
   1.235 +declarations such as this one:
   1.236 +\begin{isabelle}
   1.237 +\isacommand{inductive\_cases}\ Nonce_synth\ [elim!]:\ "Nonce\ n\ \isasymin
   1.238 +\ synth\ H"
   1.239 +\end{isabelle}
   1.240 +%
   1.241 +The resulting elimination rule replaces every assumption of the form
   1.242 +\isa{Nonce\ n\ \isasymin \ synth\ H} by \isa{Nonce\ n\ 
   1.243 +\isasymin \ H}, expressing that a nonce cannot be guessed.  
   1.244 +%The theory also uses rule inversion with constructors \isa{Key}, \isa{Crypt}
   1.245 +%and \isa{MPair} (message pairing).
   1.246 +
   1.247 +%
   1.248 +A third operator, \isa{parts}, is useful for stating correctness
   1.249 +properties.  The set
   1.250 +\isa{parts H} consists of the components of elements of~$H$.  This set
   1.251 +includes~\isa{H} and is closed under the projections from a compound
   1.252 +message to its immediate parts. 
   1.253 +Its definition resembles that of \isa{analz} except in the rule
   1.254 +corresponding to the constructor \isa{Crypt}: 
   1.255 +\begin{isabelle}
   1.256 +\ \ \ \ \ Crypt\ K\ X\ \isasymin \ parts\ H\ \isasymLongrightarrow\ X\
   1.257 +\isasymin \ parts\ H
   1.258 +\end{isabelle}
   1.259 +The body of an encrypted message is always regarded as part of it.  We can
   1.260 +use \isa{parts} to express general well-formedness properties of a protocol,
   1.261 +for example, that an uncompromised agent's private key will never be
   1.262 +included as a component of any message.
   1.263 +
   1.264 +
   1.265 +\section{Event Traces}\label{sec:events}
   1.266 +
   1.267 +The system's behaviour is formalized as a set of traces of
   1.268 +\emph{events}.  The most important event, \isa{Says~A~B~X}, expresses the
   1.269 +attempt by~$A$ to send~$B$ the
   1.270 +  message~$X$.  A trace is simply a list, constructed in reverse
   1.271 +using~\isa{\#}.  
   1.272 +
   1.273 +Sometimes the protocol requires an agent to generate a new nonce. The
   1.274 +probability that a 20-byte random number has appeared before is effectively
   1.275 +zero.  To formalize this important property, the set \isa{used evs}
   1.276 +denotes the set of all items mentioned in the trace~\isa{evs}.
   1.277 +The function \isa{used} has a straightforward
   1.278 +recursive definition.  Here is the case for \isa{Says} event:
   1.279 +\begin{isabelle}
   1.280 +\ \ \ \ \ used\ ((Says\ A\ B\ X)\ \#\ evs)\ =\ parts\ \isacharbraceleft
   1.281 +X\isacharbraceright \ \isasymunion\ (used\ evs)
   1.282 +\end{isabelle}
   1.283 +
   1.284 +The function \isa{knows} formalizes an agent's knowledge.  Mostly we only
   1.285 +case about the spy's knowledge, and \isa{knows Spy evs} is the set of items
   1.286 +available to the spy in the trace~\isa{evs}.  Already in the empty trace,
   1.287 +the spy starts with some secrets at his disposal, such as the private keys
   1.288 +of compromised users.  After each \isa{Says} event, the spy learns the
   1.289 +message that was sent.  Combinations of functions express other important
   1.290 +concepts involving~\isa{evs}:
   1.291 +\begin{itemize}
   1.292 +\item \isa{analz (knows Spy evs)} is the items that the spy could
   1.293 +learn by decryption
   1.294 +\item \isa{synth (analz (knows Spy evs))} is the items that the spy
   1.295 +could generate
   1.296 +\end{itemize}
   1.297 +
   1.298 +The function
   1.299 +\isa{pubK} maps agents to their public keys.  The function
   1.300 +\isa{priK} maps agents to their private keys, is defined in terms of
   1.301 +\isa{invKey} and \isa{pubK} by a translation.
   1.302 +\begin{isabelle}
   1.303 +\isacommand{consts}\ \,pubK\ \ \ \ ::\ "agent\ =>\ key"\isanewline
   1.304 +\isacommand{syntax}\ priK\ \ \ \ ::\ "agent\ =>\ key"\isanewline
   1.305 +\isacommand{translations}\ \ \ \ "priK\ x"\ \ ==\ "invKey(pubK\ x)"
   1.306 +\end{isabelle}
   1.307 +The set \isa{bad} consists of those agents whose private keys are known to
   1.308 +the spy.
   1.309 +
   1.310 +Two axioms are asserted about the public-key cryptosystem. 
   1.311 +No two agents have the same public key, and no private key equals
   1.312 +any public key.
   1.313 +\begin{isabelle}
   1.314 +\isacommand{axioms}\isanewline
   1.315 +\ \ inj_pubK:\ \ \ \ \ \ \ \ "inj\ pubK"\isanewline
   1.316 +\ \ priK_neq_pubK:\ \ \ "priK\ A\ \isasymnoteq\ pubK\ B"
   1.317 +\end{isabelle}
   1.318 +
   1.319 +
   1.320 +
   1.321 +
   1.322 +
   1.323 +\section{Modelling the Protocol}\label{sec:modelling}
   1.324 +
   1.325 +Let us formalize the Needham-Schroeder public-key protocol, as corrected by
   1.326 +Lowe:
   1.327 +\begin{alignat*}{2}
   1.328 +  &1.&\quad  A\to B  &: \comp{Na,A}\sb{Kb} \\
   1.329 +  &2.&\quad  B\to A  &: \comp{Na,Nb,B}\sb{Ka} \\
   1.330 +  &3.&\quad  A\to B  &: \comp{Nb}\sb{Kb}
   1.331 +\end{alignat*}
   1.332 +
   1.333 +Each protocol step is specified by a rule of an inductive definition.  An
   1.334 +event trace has type \isa{event list}, so we declare the constant
   1.335 +\isa{ns_public} to be a set of such traces.
   1.336 +\begin{isabelle}
   1.337 +\isacommand{consts}\ \ ns_public\ \ ::\ "event\ list\ set"
   1.338 +\end{isabelle}
   1.339 +
   1.340 +\begin{figure}
   1.341 +\begin{isabelle}
   1.342 +\isacommand{inductive}\ ns_public\isanewline
   1.343 +\ \ \isakeyword{intros}\ \isanewline
   1.344 +\ \ \ \ \ \ \ \ \ \isanewline
   1.345 +\ \ \ Nil:\ \ "[]\ \isasymin \ ns_public"\isanewline
   1.346 +\isanewline
   1.347 +\ \ \ \ \ \ \ \ \ \isanewline
   1.348 +\ \ \ Fake:\ "\isasymlbrakk evsf\ \isasymin \ ns_public;\ \ X\ \isasymin \ synth\ (analz\ (spies\ evsf))\isasymrbrakk \isanewline
   1.349 +\ \ \ \ \ \ \ \ \ \ \isasymLongrightarrow \ Says\ Spy\ B\ X\ \ \#\ evsf\ \isasymin \ ns_public"\isanewline
   1.350 +\isanewline
   1.351 +\ \ \ \ \ \ \ \ \ \isanewline
   1.352 +\ \ \ NS1:\ \ "\isasymlbrakk evs1\ \isasymin \ ns_public;\ \ Nonce\ NA\ \isasymnotin \ used\ evs1\isasymrbrakk \isanewline
   1.353 +\ \ \ \ \ \ \ \ \ \ \isasymLongrightarrow \ Says\ A\ B\ (Crypt\ (pubK\ B)\ \isasymlbrace Nonce\ NA,\ Agent\ A\isasymrbrace )\isanewline
   1.354 +\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \#\ evs1\ \ \isasymin \ \ ns_public"\isanewline
   1.355 +\isanewline
   1.356 +\ \ \ \ \ \ \ \ \ \isanewline
   1.357 +\ \ \ NS2:\ \ "\isasymlbrakk evs2\ \isasymin \ ns_public;\ \ Nonce\ NB\ \isasymnotin \ used\ evs2;\isanewline
   1.358 +\ \ \ \ \ \ \ \ \ \ \ Says\ A'\ B\ (Crypt\ (pubK\ B)\ \isasymlbrace Nonce\ NA,\ Agent\ A\isasymrbrace )\ \isasymin \ set\ evs2\isasymrbrakk \isanewline
   1.359 +\ \ \ \ \ \ \ \ \ \ \isasymLongrightarrow \ Says\ B\ A\ (Crypt\ (pubK\ A)\ \isasymlbrace Nonce\ NA,\ Nonce\ NB,\ Agent\ B\isasymrbrace )\isanewline
   1.360 +\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \#\ evs2\ \ \isasymin \ \ ns_public"\isanewline
   1.361 +\isanewline
   1.362 +\ \ \ \ \ \ \ \ \ \isanewline
   1.363 +\ \ \ NS3:\ \ "\isasymlbrakk evs3\ \isasymin \ ns_public;\isanewline
   1.364 +\ \ \ \ \ \ \ \ \ \ \ Says\ A\ \ B\ (Crypt\ (pubK\ B)\ \isasymlbrace Nonce\ NA,\ Agent\ A\isasymrbrace )\ \isasymin \ set\ evs3;\isanewline
   1.365 +\ \ \ \ \ \ \ \ \ \ \ Says\ B'\ A\ (Crypt\ (pubK\ A)\ \isasymlbrace Nonce\ NA,\ Nonce\ NB,\ Agent\ B\isasymrbrace )\isanewline
   1.366 +\ \ \ \ \ \ \ \ \ \ \ \ \ \ \isasymin \ set\ evs3\isasymrbrakk \isanewline
   1.367 +\ \ \ \ \ \ \ \ \ \ \isasymLongrightarrow \ Says\ A\ B\ (Crypt\ (pubK\ B)\ (Nonce\ NB))\ \#\ evs3\ \isasymin \
   1.368 +ns_public"
   1.369 +\end{isabelle}
   1.370 +\caption{An Inductive Protocol Definition}\label{fig:ns_public}
   1.371 +\end{figure}
   1.372 +
   1.373 +Figure~\ref{fig:ns_public} presents the inductive definition.  The
   1.374 +\isa{Nil} rule introduces the empty trace.  The \isa{Fake} rule models the
   1.375 +adversary's sending a message built from components taken from past
   1.376 +traffic, expressed using the functions \isa{synth} and
   1.377 +\isa{analz}. 
   1.378 +The next three rules model how honest agents would perform the three
   1.379 +protocol steps.  
   1.380 +
   1.381 +Here is a detailed explanation of rule \isa{NS2}.
   1.382 +A trace containing an event of the form
   1.383 +\begin{isabelle}
   1.384 +\ \ \ \ \ Says\ A'\ B\ (Crypt\ (pubK\ B)\ \isasymlbrace Nonce\
   1.385 +NA,\ Agent\ A\isasymrbrace)
   1.386 +\end{isabelle}
   1.387 +%
   1.388 +may be extended by an event of the form
   1.389 +\begin{isabelle}
   1.390 +\ \ \ \ \ Says\ B\ A\ (Crypt\ (pubK\ A)\
   1.391 +\isasymlbrace Nonce\ NA,\ Nonce\ NB,\ Agent\ B\isasymrbrace)
   1.392 +\end{isabelle}
   1.393 +where \isa{NB} is a fresh nonce: \isa{Nonce\ NB\ \isasymnotin \ used\ evs2}.
   1.394 +Writing the sender as \isa{A'} indicates that \isa{B} does not 
   1.395 +know who sent the message.  Calling the trace variable \isa{evs2} rather
   1.396 +than simply \isa{evs} helps us know where we are in a proof after many
   1.397 +case-splits: every subgoal mentioning \isa{evs2} involves message~2 of the
   1.398 +protocol.
   1.399 +
   1.400 +Benefits of this approach are simplicity and clarity.  The semantic model
   1.401 +is set theory, proofs are by induction and the translation from the informal
   1.402 +notation to the inductive rules is straightforward. 
   1.403 +
   1.404 +
   1.405 +\section{Proving Elementary Properties}
   1.406 +
   1.407 +Secrecy properties are hard to prove.  The conclusion of a typical secrecy
   1.408 +theorem is 
   1.409 +\isa{X\ \isasymnotin\ analz (knows Spy evs)}.  The difficulty arises from
   1.410 +having to reason about \isa{analz}, or less formally, showing that the spy
   1.411 +can never learn~\isa{X}.  Much easier is to prove that \isa{X} can never
   1.412 +occur at all.  Such \emph{regularity} properties are typically expressed
   1.413 +using \isa{parts} rather than \isa{analz}.
   1.414 +
   1.415 +The following lemma states that \isa{A}'s private key is potentially
   1.416 +known to the spy if and only if \isa{A} belongs to the set \isa{bad} of
   1.417 +compromised agents.  The statement uses \isa{parts}: the very presence of
   1.418 +her private key in a message, whether protected by encryption or not, is
   1.419 +enough to confirm that \isa{A} is compromised.  The proof, like nearly all
   1.420 +protocol proofs, is by induction over traces.
   1.421 +\begin{isabelle}
   1.422 +\isacommand{lemma}\ Spy_see_priK\ [simp]:\ \isanewline
   1.423 +\ \ \ \ \ \ "evs\ \isasymin \ ns_public\isanewline
   1.424 +\ \ \ \ \ \ \ \isasymLongrightarrow \
   1.425 +(Key\ (priK\ A)\ \isasymin \ parts\ (spies\ evs))\ =\ (A\ \isasymin \
   1.426 +bad)"\isanewline
   1.427 +\isacommand{apply}\ (erule\ ns_public.induct,\ simp_all)
   1.428 +\end{isabelle}
   1.429 +%
   1.430 +The induction yields five subgoals, one for each rule in the definition of
   1.431 +\isa{ns_public}.  Informally, the idea is to prove that the protocol
   1.432 +property holds initially (rule \isa{Nil}), is preserved by each of the
   1.433 +legitimate protocol steps (rules \isa{NS1}--\isa{3}), and even is preserved
   1.434 +in the face of anything the spy can do (rule \isa{Fake}).  Simplification
   1.435 +leaves only the \isa{Fake} case (as indicated by the variable name
   1.436 +\isa{evsf}):
   1.437 +\begin{isabelle}
   1.438 +\ 1.\ \isasymAnd X\ evsf.\isanewline
   1.439 +\isaindent{\ 1.\ \ \ \ }\isasymlbrakk evsf\ \isasymin \ ns_public;\isanewline
   1.440 +\isaindent{\ 1.\ \ \ \ \ \ \ }(Key\ (priK\ A)\ \isasymin \ parts\ (knows\ Spy\ evsf))\ =\ (A\ \isasymin \ bad);\isanewline
   1.441 +\isaindent{\ 1.\ \ \ \ \ \ \ }X\ \isasymin \ synth\ (analz\ (knows\ Spy\ evsf))\isasymrbrakk \isanewline
   1.442 +\isaindent{\ 1.\ \ \ \ }\isasymLongrightarrow \ (Key\ (priK\ A)\ \isasymin \ parts\ (insert\ X\ (knows\ Spy\ evsf)))\ =\isanewline
   1.443 +\isaindent{\ 1.\ \ \ \ \isasymLongrightarrow \ }(A\ \isasymin \
   1.444 +bad)\isanewline
   1.445 +\isacommand{by}\ blast
   1.446 +\end{isabelle}
   1.447 +%
   1.448 +The \isa{Fake} case is proved automatically.  If
   1.449 +\isa{priK~A} is in the extended trace then either (1) it was already in the
   1.450 +original trace or (2) it was
   1.451 +generated by the spy, and so the spy must have known this key already. 
   1.452 +Either way, the induction hypothesis applies.  If the spy can tell himself
   1.453 +something, then he must have known it already.
   1.454 +
   1.455 +\emph{Unicity} lemmas are regularity lemmas stating that specified items
   1.456 +can occur only once in a trace.  The following lemma states that a nonce
   1.457 +cannot be used both as $Na$ and as $Nb$ unless
   1.458 +it is known to the spy.  Intuitively, it holds because honest agents
   1.459 +always choose fresh values as nonces; only the spy might reuse a value,
   1.460 +and he doesn't know this particular value.  The proof script is three steps
   1.461 +long.
   1.462 +\begin{isabelle}
   1.463 +\isacommand{lemma}\ no_nonce_NS1_NS2\ [rule_format]:\ \isanewline
   1.464 +\ "evs\ \isasymin \ ns_public\ \isanewline
   1.465 +\ \ \isasymLongrightarrow \ Crypt\ (pubK\ C)\ \isasymlbrace NA',\ Nonce\ NA,\ Agent\ D\isasymrbrace \ \isasymin \ parts\ (spies\ evs)\ \isasymlongrightarrow \isanewline
   1.466 +\ \ \ \ \ \ Crypt\ (pubK\ B)\ \isasymlbrace Nonce\ NA,\ Agent\ A\isasymrbrace \ \isasymin \ parts\ (spies\ evs)\ \isasymlongrightarrow \ \ \isanewline
   1.467 +\ \ \ \ \ \ Nonce\ NA\ \isasymin \ analz\ (spies\ evs)"
   1.468 +\end{isabelle}
   1.469 +
   1.470 +This unicity lemma states that, if \isa{NA} is secret, then its appearance
   1.471 +in any instance of message~1 determines the other components.  The proof is
   1.472 +another easy induction.
   1.473 +\begin{isabelle}
   1.474 +\isacommand{lemma}\ unique_NA:\ \isanewline
   1.475 +\ \ \ \ \ "\isasymlbrakk Crypt(pubK\ B)\ \ \isasymlbrace Nonce\ NA,\ Agent\ A\ \isasymrbrace \ \isasymin \ parts(spies\ evs);\ \ \isanewline
   1.476 +\ \ \ \ \ \ \ Crypt(pubK\ B')\ \isasymlbrace Nonce\ NA,\ Agent\ A'\isasymrbrace \ \isasymin \ parts(spies\ evs);\ \ \isanewline
   1.477 +\ \ \ \ \ \ \ Nonce\ NA\ \isasymnotin \ analz\ (spies\ evs);\ evs\ \isasymin \ ns_public\isasymrbrakk \isanewline
   1.478 +\ \ \ \ \ \ \isasymLongrightarrow \ A=A'\ \isasymand \ B=B'"
   1.479 +\end{isabelle}
   1.480 +
   1.481 +
   1.482 +\section{Proving Secrecy Theorems}\label{sec:secrecy}
   1.483 +
   1.484 +The secrecy theorems for Bob (the second participant) are especially
   1.485 +important, since they fail for the original protocol.  This theorem states
   1.486 +that if Bob sends message~2 to Alice, and both agents are uncompromised,
   1.487 +then Bob's nonce will never reach the spy.
   1.488 +\begin{isabelle}
   1.489 +\isacommand{theorem}\ Spy_not_see_NB\ [dest]:\isanewline
   1.490 +\ "\isasymlbrakk Says\ B\ A\ (Crypt\ (pubK\ A)\ \isasymlbrace Nonce\ NA,\ Nonce\ NB,\ Agent\ B\isasymrbrace )\ \isasymin \ set\ evs;\isanewline
   1.491 +\ \ \ A\ \isasymnotin \ bad;\ \ B\ \isasymnotin \ bad;\ \ evs\ \isasymin \ ns_public\isasymrbrakk \isanewline
   1.492 +\ \ \isasymLongrightarrow \ Nonce\ NB\ \isasymnotin \ analz\ (spies\ evs)"
   1.493 +\end{isabelle}
   1.494 +%
   1.495 +To prove this, we must formulate the induction properly (one of the
   1.496 +assumptions mentions~\isa{evs}), apply induction, and simplify.
   1.497 +The proof states are too complicated to present in full.  
   1.498 +Let's just look
   1.499 +at the easiest subgoal, that for message~1:
   1.500 +\begin{isabelle}
   1.501 +\ 1.\ \isasymAnd Ba\ NAa\ evs1.\isanewline
   1.502 +\isaindent{\ 1.\ \ \ \ }\isasymlbrakk A\ \isasymnotin \ bad;\ B\ \isasymnotin \ bad;\ evs1\ \isasymin \ ns_public;\isanewline
   1.503 +\isaindent{\ 1.\ \ \ \ \ \ \ }Says\ B\ A\ (Crypt\ (pubK\ A)\ \isasymlbrace Nonce\ NA,\ Nonce\ NB,\ Agent\ B\isasymrbrace )\isanewline
   1.504 +\isaindent{\ 1.\ \ \ \ \ \ \ }\isasymin \ set\ evs1\ \isasymlongrightarrow \isanewline
   1.505 +\isaindent{\ 1.\ \ \ \ \ \ \ }Nonce\ NB\ \isasymnotin \ analz\ (knows\ Spy\ evs1);\isanewline
   1.506 +\isaindent{\ 1.\ \ \ \ \ \ \ }Nonce\ NAa\ \isasymnotin \ used\ evs1\isasymrbrakk \isanewline
   1.507 +\isaindent{\ 1.\ \ \ \ }\isasymLongrightarrow \ Ba\ \isasymin \ bad\ \isasymlongrightarrow \isanewline
   1.508 +\isaindent{\ 1.\ \ \ \ \isasymLongrightarrow \ }Says\ B\ A\ (Crypt\ (pubK\ A)\ \isasymlbrace Nonce\ NA,\ Nonce\ NB,\ Agent\ B\isasymrbrace )\isanewline
   1.509 +\isaindent{\ 1.\ \ \ \ \isasymLongrightarrow \ }\isasymin \ set\ evs1\ \isasymlongrightarrow \isanewline
   1.510 +\isaindent{\ 1.\ \ \ \ \isasymLongrightarrow \ }NB\ \isasymnoteq \ NAa
   1.511 +\end{isabelle}
   1.512 +This subgoal refers to another protocol run, in which \isa{B} has sent
   1.513 +message~1 to somebody called~\isa{Ba}.  Agent \isa{Ba} 
   1.514 +is compromised, and so the spy has learnt the nonce that was just sent,
   1.515 +which is called~\isa{NAa}.  We need to know that this nonce differs from the
   1.516 +one we care about, namely~\isa{NB}\@.  They do indeed differ: \isa{NB} was
   1.517 +sent earlier, while \isa{NAa} was chosen to be fresh (we have the assumption
   1.518 +\isa{Nonce\ NAa\ \isasymnotin \ used\ evs1}). 
   1.519 +
   1.520 +Note that our reasoning concerned \isa{B}'s participation in another
   1.521 +run.  Agents may engage in several runs concurrently, and some attacks work
   1.522 +by interleaving the messages of two runs.  With model checking, this
   1.523 +possibility can cause a state-space explosion, and for us it
   1.524 +certainly complicates proofs.  The biggest subgoal concerns message~2.  It
   1.525 +splits into several cases, some of which are proved by unicity, others by
   1.526 +the induction hypothesis.  For all those complications, the proofs are
   1.527 +automatic by \isa{blast} with the theorem \isa{no_nonce_NS1_NS2}.
   1.528 +
   1.529 +The remaining proofs are straightforward.  This theorem asserts that if
   1.530 +\isa{B} has sent an instance of message~2 to~\isa{A} and has received the
   1.531 +expected reply, then that reply really came from~\isa{A}.  The proof is a
   1.532 +simple induction.
   1.533 +\begin{isabelle}
   1.534 +\isacommand{theorem}\ B_trusts_NS3:\isanewline
   1.535 +\ "\isasymlbrakk Says\ B\ A\ \ (Crypt\ (pubK\ A)\ \isasymlbrace Nonce\ NA,\ Nonce\ NB,\ Agent\ B\isasymrbrace )\ \isasymin \ set\ evs;\isanewline
   1.536 +\ \ \ Says\ A'\ B\ (Crypt\ (pubK\ B)\ (Nonce\ NB))\ \isasymin \ set\ evs;\ \isanewline
   1.537 +\ \ \ A\ \isasymnotin \ bad;\ \ B\ \isasymnotin \ bad;\ \ evs\ \isasymin \ ns_public\isasymrbrakk \ \ \ \ \ \ \ \ \isanewline
   1.538 +\ \ \isasymLongrightarrow \ Says\ A\ B\ (Crypt\ (pubK\ B)\ (Nonce\ NB))\ \isasymin \ set\
   1.539 +evs"
   1.540 +\end{isabelle}
   1.541 +
   1.542 +From similar assumptions, we can prove that \isa{A} started the protocol
   1.543 +run by sending an instance of message~1 involving the nonce~\isa{NA}.  For
   1.544 +this theorem, the conclusion is 
   1.545 +\begin{isabelle}
   1.546 +\ \ \ \ \ Says\ A\ B\ (Crypt\ (pubK\ B)\ \isasymlbrace Nonce\ NA,\ Agent\
   1.547 +A\isasymrbrace )\ \isasymin \ set\ evs
   1.548 +\end{isabelle}
   1.549 +%
   1.550 +Analogous theorems can be proved for~\isa{A}, stating that nonce~\isa{NA}
   1.551 +remains secret and that message~2 really originates with~\isa{B}.  Even the
   1.552 +flawed protocol establishes these properties for~\isa{A};
   1.553 +the flaw only harms the second participant.
   1.554 +Detailed information on this technique can be found
   1.555 +elsewhere~\cite{paulson-jcs}, including proofs of an Internet
   1.556 +protocol~\cite{paulson-tls}.
   1.557 +
   1.558 +
   1.559 +\endinput