doc-src/TutorialI/Inductive/AB.thy
author blanchet
Tue, 17 Feb 2009 14:01:54 +0100
changeset 29956 62f931b257b7
parent 27167 a99747ccba87
permissions -rw-r--r--
Reintroduce set_interpreter for Collect and op :. I removed them by accident when removing old code that dealt with the "set" type. Incidentally, there is still some broken "set" code in Refute that should be fixed (see TODO in refute.ML).
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
17914
99ead7a7eb42 fix headers;
wenzelm
parents: 16585
diff changeset
     1
(*<*)theory AB imports Main begin(*>*)
10225
b9fd52525b69 *** empty log message ***
nipkow
parents: 10217
diff changeset
     2
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
     3
section{*Case Study: A Context Free Grammar*}
10217
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
     4
10242
028f54cd2cc9 *** empty log message ***
nipkow
parents: 10237
diff changeset
     5
text{*\label{sec:CFG}
11494
23a118849801 revisions and indexing
paulson
parents: 11310
diff changeset
     6
\index{grammars!defining inductively|(}%
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
     7
Grammars are nothing but shorthands for inductive definitions of nonterminals
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
     8
which represent sets of strings. For example, the production
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
     9
$A \to B c$ is short for
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    10
\[ w \in B \Longrightarrow wc \in A \]
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    11
This section demonstrates this idea with an example
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    12
due to Hopcroft and Ullman, a grammar for generating all words with an
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    13
equal number of $a$'s and~$b$'s:
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    14
\begin{eqnarray}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    15
S &\to& \epsilon \mid b A \mid a B \nonumber\\
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    16
A &\to& a S \mid b A A \nonumber\\
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    17
B &\to& b S \mid a B B \nonumber
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    18
\end{eqnarray}
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    19
At the end we say a few words about the relationship between
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    20
the original proof \cite[p.\ts81]{HopcroftUllman} and our formal version.
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    21
10287
9ab1671398a6 two spelling fixes
paulson
parents: 10283
diff changeset
    22
We start by fixing the alphabet, which consists only of @{term a}'s
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    23
and~@{term b}'s:
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    24
*}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    25
11705
ac8ca15c556c fixed numerals;
wenzelm
parents: 11494
diff changeset
    26
datatype alfa = a | b
10217
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
    27
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    28
text{*\noindent
10287
9ab1671398a6 two spelling fixes
paulson
parents: 10283
diff changeset
    29
For convenience we include the following easy lemmas as simplification rules:
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    30
*}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    31
11705
ac8ca15c556c fixed numerals;
wenzelm
parents: 11494
diff changeset
    32
lemma [simp]: "(x \<noteq> a) = (x = b) \<and> (x \<noteq> b) = (x = a)"
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    33
by (case_tac x, auto)
10217
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
    34
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    35
text{*\noindent
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    36
Words over this alphabet are of type @{typ"alfa list"}, and
23733
3f8ad7418e55 Adapted to new inductive definition package.
berghofe
parents: 23380
diff changeset
    37
the three nonterminals are declared as sets of such words.
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    38
The productions above are recast as a \emph{mutual} inductive
10242
028f54cd2cc9 *** empty log message ***
nipkow
parents: 10237
diff changeset
    39
definition\index{inductive definition!simultaneous}
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    40
of @{term S}, @{term A} and~@{term B}:
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    41
*}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    42
23733
3f8ad7418e55 Adapted to new inductive definition package.
berghofe
parents: 23380
diff changeset
    43
inductive_set
25330
15bf0f47a87d added inductive
nipkow
parents: 23733
diff changeset
    44
  S :: "alfa list set" and
15bf0f47a87d added inductive
nipkow
parents: 23733
diff changeset
    45
  A :: "alfa list set" and
15bf0f47a87d added inductive
nipkow
parents: 23733
diff changeset
    46
  B :: "alfa list set"
23733
3f8ad7418e55 Adapted to new inductive definition package.
berghofe
parents: 23380
diff changeset
    47
where
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    48
  "[] \<in> S"
23733
3f8ad7418e55 Adapted to new inductive definition package.
berghofe
parents: 23380
diff changeset
    49
| "w \<in> A \<Longrightarrow> b#w \<in> S"
3f8ad7418e55 Adapted to new inductive definition package.
berghofe
parents: 23380
diff changeset
    50
| "w \<in> B \<Longrightarrow> a#w \<in> S"
10217
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
    51
23733
3f8ad7418e55 Adapted to new inductive definition package.
berghofe
parents: 23380
diff changeset
    52
| "w \<in> S        \<Longrightarrow> a#w   \<in> A"
3f8ad7418e55 Adapted to new inductive definition package.
berghofe
parents: 23380
diff changeset
    53
| "\<lbrakk> v\<in>A; w\<in>A \<rbrakk> \<Longrightarrow> b#v@w \<in> A"
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    54
23733
3f8ad7418e55 Adapted to new inductive definition package.
berghofe
parents: 23380
diff changeset
    55
| "w \<in> S            \<Longrightarrow> b#w   \<in> B"
3f8ad7418e55 Adapted to new inductive definition package.
berghofe
parents: 23380
diff changeset
    56
| "\<lbrakk> v \<in> B; w \<in> B \<rbrakk> \<Longrightarrow> a#v@w \<in> B"
10217
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
    57
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    58
text{*\noindent
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    59
First we show that all words in @{term S} contain the same number of @{term
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    60
a}'s and @{term b}'s. Since the definition of @{term S} is by mutual
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    61
induction, so is the proof: we show at the same time that all words in
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    62
@{term A} contain one more @{term a} than @{term b} and all words in @{term
27167
nipkow
parents: 25330
diff changeset
    63
B} contain one more @{term b} than @{term a}.
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    64
*}
10217
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
    65
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    66
lemma correctness:
23380
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
    67
  "(w \<in> S \<longrightarrow> size[x\<leftarrow>w. x=a] = size[x\<leftarrow>w. x=b])     \<and>
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
    68
   (w \<in> A \<longrightarrow> size[x\<leftarrow>w. x=a] = size[x\<leftarrow>w. x=b] + 1) \<and>
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
    69
   (w \<in> B \<longrightarrow> size[x\<leftarrow>w. x=b] = size[x\<leftarrow>w. x=a] + 1)"
10217
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
    70
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    71
txt{*\noindent
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    72
These propositions are expressed with the help of the predefined @{term
23380
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
    73
filter} function on lists, which has the convenient syntax @{text"[x\<leftarrow>xs. P
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    74
x]"}, the list of all elements @{term x} in @{term xs} such that @{prop"P x"}
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    75
holds. Remember that on lists @{text size} and @{text length} are synonymous.
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    76
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    77
The proof itself is by rule induction and afterwards automatic:
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    78
*}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    79
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    80
by (rule S_A_B.induct, auto)
10217
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
    81
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    82
text{*\noindent
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    83
This may seem surprising at first, and is indeed an indication of the power
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    84
of inductive definitions. But it is also quite straightforward. For example,
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    85
consider the production $A \to b A A$: if $v,w \in A$ and the elements of $A$
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    86
contain one more $a$ than~$b$'s, then $bvw$ must again contain one more $a$
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    87
than~$b$'s.
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    88
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    89
As usual, the correctness of syntactic descriptions is easy, but completeness
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    90
is hard: does @{term S} contain \emph{all} words with an equal number of
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    91
@{term a}'s and @{term b}'s? It turns out that this proof requires the
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    92
following lemma: every string with two more @{term a}'s than @{term
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
    93
b}'s can be cut somewhere such that each half has one more @{term a} than
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    94
@{term b}. This is best seen by imagining counting the difference between the
10283
ff003e2b790c *** empty log message ***
nipkow
parents: 10242
diff changeset
    95
number of @{term a}'s and @{term b}'s starting at the left end of the
ff003e2b790c *** empty log message ***
nipkow
parents: 10242
diff changeset
    96
word. We start with 0 and end (at the right end) with 2. Since each move to the
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    97
right increases or decreases the difference by 1, we must have passed through
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    98
1 on our way from 0 to 2. Formally, we appeal to the following discrete
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
    99
intermediate value theorem @{thm[source]nat0_intermed_int_val}
16412
50eab0183aea *** empty log message ***
nipkow
parents: 12815
diff changeset
   100
@{thm[display,margin=60]nat0_intermed_int_val[no_vars]}
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   101
where @{term f} is of type @{typ"nat \<Rightarrow> int"}, @{typ int} are the integers,
11308
b28bbb153603 *** empty log message ***
nipkow
parents: 11257
diff changeset
   102
@{text"\<bar>.\<bar>"} is the absolute value function\footnote{See
b28bbb153603 *** empty log message ***
nipkow
parents: 11257
diff changeset
   103
Table~\ref{tab:ascii} in the Appendix for the correct \textsc{ascii}
11705
ac8ca15c556c fixed numerals;
wenzelm
parents: 11494
diff changeset
   104
syntax.}, and @{term"1::int"} is the integer 1 (see \S\ref{sec:numbers}).
10217
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
   105
11147
d848c6693185 *** empty log message ***
nipkow
parents: 10884
diff changeset
   106
First we show that our specific function, the difference between the
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   107
numbers of @{term a}'s and @{term b}'s, does indeed only change by 1 in every
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   108
move to the right. At this point we also start generalizing from @{term a}'s
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   109
and @{term b}'s to an arbitrary property @{term P}. Otherwise we would have
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   110
to prove the desired lemma twice, once as stated above and once with the
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   111
roles of @{term a}'s and @{term b}'s interchanged.
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   112
*}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   113
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   114
lemma step1: "\<forall>i < size w.
23380
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
   115
  \<bar>(int(size[x\<leftarrow>take (i+1) w. P x])-int(size[x\<leftarrow>take (i+1) w. \<not>P x]))
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
   116
   - (int(size[x\<leftarrow>take i w. P x])-int(size[x\<leftarrow>take i w. \<not>P x]))\<bar> \<le> 1"
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   117
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   118
txt{*\noindent
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   119
The lemma is a bit hard to read because of the coercion function
11147
d848c6693185 *** empty log message ***
nipkow
parents: 10884
diff changeset
   120
@{text"int :: nat \<Rightarrow> int"}. It is required because @{term size} returns
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
   121
a natural number, but subtraction on type~@{typ nat} will do the wrong thing.
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   122
Function @{term take} is predefined and @{term"take i xs"} is the prefix of
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
   123
length @{term i} of @{term xs}; below we also need @{term"drop i xs"}, which
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   124
is what remains after that prefix has been dropped from @{term xs}.
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   125
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   126
The proof is by induction on @{term w}, with a trivial base case, and a not
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   127
so trivial induction step. Since it is essentially just arithmetic, we do not
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   128
discuss it.
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   129
*}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   130
12332
aea72a834c85 *** empty log message ***
nipkow
parents: 11870
diff changeset
   131
apply(induct_tac w)
16585
02cf78f0afce replacing zabs_def by abs_if
paulson
parents: 16412
diff changeset
   132
apply(auto simp add: abs_if take_Cons split: nat.split)
02cf78f0afce replacing zabs_def by abs_if
paulson
parents: 16412
diff changeset
   133
done
10217
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
   134
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   135
text{*
11494
23a118849801 revisions and indexing
paulson
parents: 11310
diff changeset
   136
Finally we come to the above-mentioned lemma about cutting in half a word with two more elements of one sort than of the other sort:
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   137
*}
10217
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
   138
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
   139
lemma part1:
23380
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
   140
 "size[x\<leftarrow>w. P x] = size[x\<leftarrow>w. \<not>P x]+2 \<Longrightarrow>
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
   141
  \<exists>i\<le>size w. size[x\<leftarrow>take i w. P x] = size[x\<leftarrow>take i w. \<not>P x]+1"
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   142
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   143
txt{*\noindent
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
   144
This is proved by @{text force} with the help of the intermediate value theorem,
10608
620647438780 *** empty log message ***
nipkow
parents: 10520
diff changeset
   145
instantiated appropriately and with its first premise disposed of by lemma
620647438780 *** empty log message ***
nipkow
parents: 10520
diff changeset
   146
@{thm[source]step1}:
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   147
*}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   148
11870
181bd2050cf4 Numerals now work for the integers: the binary numerals for 0 and 1 rewrite
paulson
parents: 11705
diff changeset
   149
apply(insert nat0_intermed_int_val[OF step1, of "P" "w" "1"])
11705
ac8ca15c556c fixed numerals;
wenzelm
parents: 11494
diff changeset
   150
by force
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   151
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   152
text{*\noindent
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   153
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   154
Lemma @{thm[source]part1} tells us only about the prefix @{term"take i w"}.
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
   155
An easy lemma deals with the suffix @{term"drop i w"}:
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   156
*}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   157
10217
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
   158
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
   159
lemma part2:
23380
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
   160
  "\<lbrakk>size[x\<leftarrow>take i w @ drop i w. P x] =
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
   161
    size[x\<leftarrow>take i w @ drop i w. \<not>P x]+2;
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
   162
    size[x\<leftarrow>take i w. P x] = size[x\<leftarrow>take i w. \<not>P x]+1\<rbrakk>
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
   163
   \<Longrightarrow> size[x\<leftarrow>drop i w. P x] = size[x\<leftarrow>drop i w. \<not>P x]+1"
12815
wenzelm
parents: 12332
diff changeset
   164
by(simp del: append_take_drop_id)
10217
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
   165
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   166
text{*\noindent
11257
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   167
In the proof we have disabled the normally useful lemma
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
   168
\begin{isabelle}
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
   169
@{thm append_take_drop_id[no_vars]}
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
   170
\rulename{append_take_drop_id}
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
   171
\end{isabelle}
11257
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   172
to allow the simplifier to apply the following lemma instead:
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   173
@{text[display]"[x\<in>xs@ys. P x] = [x\<in>xs. P x] @ [x\<in>ys. P x]"}
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   174
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   175
To dispose of trivial cases automatically, the rules of the inductive
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   176
definition are declared simplification rules:
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   177
*}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   178
11705
ac8ca15c556c fixed numerals;
wenzelm
parents: 11494
diff changeset
   179
declare S_A_B.intros[simp]
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   180
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   181
text{*\noindent
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   182
This could have been done earlier but was not necessary so far.
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   183
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   184
The completeness theorem tells us that if a word has the same number of
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
   185
@{term a}'s and @{term b}'s, then it is in @{term S}, and similarly 
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
   186
for @{term A} and @{term B}:
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   187
*}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   188
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   189
theorem completeness:
23380
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
   190
  "(size[x\<leftarrow>w. x=a] = size[x\<leftarrow>w. x=b]     \<longrightarrow> w \<in> S) \<and>
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
   191
   (size[x\<leftarrow>w. x=a] = size[x\<leftarrow>w. x=b] + 1 \<longrightarrow> w \<in> A) \<and>
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
   192
   (size[x\<leftarrow>w. x=b] = size[x\<leftarrow>w. x=a] + 1 \<longrightarrow> w \<in> B)"
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   193
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   194
txt{*\noindent
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   195
The proof is by induction on @{term w}. Structural induction would fail here
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   196
because, as we can see from the grammar, we need to make bigger steps than
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   197
merely appending a single letter at the front. Hence we induct on the length
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   198
of @{term w}, using the induction rule @{thm[source]length_induct}:
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   199
*}
10217
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
   200
11705
ac8ca15c556c fixed numerals;
wenzelm
parents: 11494
diff changeset
   201
apply(induct_tac w rule: length_induct)
27167
nipkow
parents: 25330
diff changeset
   202
apply(rename_tac w)
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   203
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   204
txt{*\noindent
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   205
The @{text rule} parameter tells @{text induct_tac} explicitly which induction
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   206
rule to use. For details see \S\ref{sec:complete-ind} below.
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   207
In this case the result is that we may assume the lemma already
27167
nipkow
parents: 25330
diff changeset
   208
holds for all words shorter than @{term w}. Because the induction step renames
nipkow
parents: 25330
diff changeset
   209
the induction variable we rename it back to @{text w}.
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   210
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   211
The proof continues with a case distinction on @{term w},
11494
23a118849801 revisions and indexing
paulson
parents: 11310
diff changeset
   212
on whether @{term w} is empty or not.
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   213
*}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   214
11705
ac8ca15c556c fixed numerals;
wenzelm
parents: 11494
diff changeset
   215
apply(case_tac w)
ac8ca15c556c fixed numerals;
wenzelm
parents: 11494
diff changeset
   216
 apply(simp_all)
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   217
(*<*)apply(rename_tac x v)(*>*)
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   218
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   219
txt{*\noindent
11257
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   220
Simplification disposes of the base case and leaves only a conjunction
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   221
of two step cases to be proved:
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
   222
if @{prop"w = a#v"} and @{prop[display]"size[x\<in>v. x=a] = size[x\<in>v. x=b]+2"} then
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   223
@{prop"b#v \<in> A"}, and similarly for @{prop"w = b#v"}.
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   224
We only consider the first case in detail.
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   225
11257
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   226
After breaking the conjunction up into two cases, we can apply
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   227
@{thm[source]part1} to the assumption that @{term w} contains two more @{term
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   228
a}'s than @{term b}'s.
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   229
*}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   230
11257
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   231
apply(rule conjI)
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   232
 apply(clarify)
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   233
 apply(frule part1[of "\<lambda>x. x=a", simplified])
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   234
 apply(clarify)
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   235
txt{*\noindent
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   236
This yields an index @{prop"i \<le> length v"} such that
23380
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
   237
@{prop[display]"length [x\<leftarrow>take i v . x = a] = length [x\<leftarrow>take i v . x = b] + 1"}
11147
d848c6693185 *** empty log message ***
nipkow
parents: 10884
diff changeset
   238
With the help of @{thm[source]part2} it follows that
23380
15f7a6745cce fixed filter syntax
nipkow
parents: 17914
diff changeset
   239
@{prop[display]"length [x\<leftarrow>drop i v . x = a] = length [x\<leftarrow>drop i v . x = b] + 1"}
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   240
*}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   241
11257
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   242
 apply(drule part2[of "\<lambda>x. x=a", simplified])
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   243
  apply(assumption)
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   244
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   245
txt{*\noindent
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   246
Now it is time to decompose @{term v} in the conclusion @{prop"b#v \<in> A"}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   247
into @{term"take i v @ drop i v"},
11257
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   248
*}
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   249
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   250
 apply(rule_tac n1=i and t=v in subst[OF append_take_drop_id])
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   251
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   252
txt{*\noindent
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   253
(the variables @{term n1} and @{term t} are the result of composing the
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   254
theorems @{thm[source]subst} and @{thm[source]append_take_drop_id})
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   255
after which the appropriate rule of the grammar reduces the goal
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   256
to the two subgoals @{prop"take i v \<in> A"} and @{prop"drop i v \<in> A"}:
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   257
*}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   258
11257
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   259
 apply(rule S_A_B.intros)
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   260
11257
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   261
txt{*
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   262
Both subgoals follow from the induction hypothesis because both @{term"take i
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   263
v"} and @{term"drop i v"} are shorter than @{term w}:
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   264
*}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   265
11257
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   266
  apply(force simp add: min_less_iff_disj)
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   267
 apply(force split add: nat_diff_split)
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   268
11257
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   269
txt{*
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
   270
The case @{prop"w = b#v"} is proved analogously:
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   271
*}
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   272
11257
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   273
apply(clarify)
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   274
apply(frule part1[of "\<lambda>x. x=b", simplified])
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   275
apply(clarify)
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   276
apply(drule part2[of "\<lambda>x. x=b", simplified])
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   277
 apply(assumption)
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   278
apply(rule_tac n1=i and t=v in subst[OF append_take_drop_id])
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   279
apply(rule S_A_B.intros)
12815
wenzelm
parents: 12332
diff changeset
   280
 apply(force simp add: min_less_iff_disj)
wenzelm
parents: 12332
diff changeset
   281
by(force simp add: min_less_iff_disj split add: nat_diff_split)
10217
e61e7e1eacaf *** empty log message ***
nipkow
parents:
diff changeset
   282
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   283
text{*
10884
2995639c6a09 renaming of some files
paulson
parents: 10608
diff changeset
   284
We conclude this section with a comparison of our proof with 
11494
23a118849801 revisions and indexing
paulson
parents: 11310
diff changeset
   285
Hopcroft\index{Hopcroft, J. E.} and Ullman's\index{Ullman, J. D.}
23a118849801 revisions and indexing
paulson
parents: 11310
diff changeset
   286
\cite[p.\ts81]{HopcroftUllman}.
23a118849801 revisions and indexing
paulson
parents: 11310
diff changeset
   287
For a start, the textbook
11257
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   288
grammar, for no good reason, excludes the empty word, thus complicating
622331bbdb7f *** empty log message ***
nipkow
parents: 11147
diff changeset
   289
matters just a little bit: they have 8 instead of our 7 productions.
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   290
11147
d848c6693185 *** empty log message ***
nipkow
parents: 10884
diff changeset
   291
More importantly, the proof itself is different: rather than
d848c6693185 *** empty log message ***
nipkow
parents: 10884
diff changeset
   292
separating the two directions, they perform one induction on the
d848c6693185 *** empty log message ***
nipkow
parents: 10884
diff changeset
   293
length of a word. This deprives them of the beauty of rule induction,
d848c6693185 *** empty log message ***
nipkow
parents: 10884
diff changeset
   294
and in the easy direction (correctness) their reasoning is more
d848c6693185 *** empty log message ***
nipkow
parents: 10884
diff changeset
   295
detailed than our @{text auto}. For the hard part (completeness), they
d848c6693185 *** empty log message ***
nipkow
parents: 10884
diff changeset
   296
consider just one of the cases that our @{text simp_all} disposes of
d848c6693185 *** empty log message ***
nipkow
parents: 10884
diff changeset
   297
automatically. Then they conclude the proof by saying about the
d848c6693185 *** empty log message ***
nipkow
parents: 10884
diff changeset
   298
remaining cases: ``We do this in a manner similar to our method of
d848c6693185 *** empty log message ***
nipkow
parents: 10884
diff changeset
   299
proof for part (1); this part is left to the reader''. But this is
d848c6693185 *** empty log message ***
nipkow
parents: 10884
diff changeset
   300
precisely the part that requires the intermediate value theorem and
d848c6693185 *** empty log message ***
nipkow
parents: 10884
diff changeset
   301
thus is not at all similar to the other cases (which are automatic in
d848c6693185 *** empty log message ***
nipkow
parents: 10884
diff changeset
   302
Isabelle). The authors are at least cavalier about this point and may
d848c6693185 *** empty log message ***
nipkow
parents: 10884
diff changeset
   303
even have overlooked the slight difficulty lurking in the omitted
11494
23a118849801 revisions and indexing
paulson
parents: 11310
diff changeset
   304
cases.  Such errors are found in many pen-and-paper proofs when they
23a118849801 revisions and indexing
paulson
parents: 11310
diff changeset
   305
are scrutinized formally.%
23a118849801 revisions and indexing
paulson
parents: 11310
diff changeset
   306
\index{grammars!defining inductively|)}
23a118849801 revisions and indexing
paulson
parents: 11310
diff changeset
   307
*}
10236
7626cb4e1407 *** empty log message ***
nipkow
parents: 10225
diff changeset
   308
10225
b9fd52525b69 *** empty log message ***
nipkow
parents: 10217
diff changeset
   309
(*<*)end(*>*)