author  wenzelm 
Mon, 07 Oct 2013 21:24:44 +0200  
changeset 54313  da2e6282a4f5 
parent 48985  5386df44a037 
child 58620  7435b6a3f72e 
permissions  rwrr 
17914  1 
(*<*)theory AB imports Main begin(*>*) 
10225  2 

10884  3 
section{*Case Study: A Context Free Grammar*} 
10217  4 

10242  5 
text{*\label{sec:CFG} 
11494  6 
\index{grammars!defining inductively(}% 
10236  7 
Grammars are nothing but shorthands for inductive definitions of nonterminals 
8 
which represent sets of strings. For example, the production 

9 
$A \to B c$ is short for 

10 
\[ w \in B \Longrightarrow wc \in A \] 

10884  11 
This section demonstrates this idea with an example 
12 
due to Hopcroft and Ullman, a grammar for generating all words with an 

13 
equal number of $a$'s and~$b$'s: 

10236  14 
\begin{eqnarray} 
15 
S &\to& \epsilon \mid b A \mid a B \nonumber\\ 

16 
A &\to& a S \mid b A A \nonumber\\ 

17 
B &\to& b S \mid a B B \nonumber 

18 
\end{eqnarray} 

10884  19 
At the end we say a few words about the relationship between 
20 
the original proof \cite[p.\ts81]{HopcroftUllman} and our formal version. 

10236  21 

10287  22 
We start by fixing the alphabet, which consists only of @{term a}'s 
10884  23 
and~@{term b}'s: 
10236  24 
*} 
25 

11705  26 
datatype alfa = a  b 
10217  27 

10236  28 
text{*\noindent 
10287  29 
For convenience we include the following easy lemmas as simplification rules: 
10236  30 
*} 
31 

11705  32 
lemma [simp]: "(x \<noteq> a) = (x = b) \<and> (x \<noteq> b) = (x = a)" 
10884  33 
by (case_tac x, auto) 
10217  34 

10236  35 
text{*\noindent 
36 
Words over this alphabet are of type @{typ"alfa list"}, and 

23733  37 
the three nonterminals are declared as sets of such words. 
10884  38 
The productions above are recast as a \emph{mutual} inductive 
10242  39 
definition\index{inductive definition!simultaneous} 
10884  40 
of @{term S}, @{term A} and~@{term B}: 
10236  41 
*} 
42 

23733  43 
inductive_set 
25330  44 
S :: "alfa list set" and 
45 
A :: "alfa list set" and 

46 
B :: "alfa list set" 

23733  47 
where 
10236  48 
"[] \<in> S" 
23733  49 
 "w \<in> A \<Longrightarrow> b#w \<in> S" 
50 
 "w \<in> B \<Longrightarrow> a#w \<in> S" 

10217  51 

23733  52 
 "w \<in> S \<Longrightarrow> a#w \<in> A" 
53 
 "\<lbrakk> v\<in>A; w\<in>A \<rbrakk> \<Longrightarrow> b#v@w \<in> A" 

10236  54 

23733  55 
 "w \<in> S \<Longrightarrow> b#w \<in> B" 
56 
 "\<lbrakk> v \<in> B; w \<in> B \<rbrakk> \<Longrightarrow> a#v@w \<in> B" 

10217  57 

10236  58 
text{*\noindent 
59 
First we show that all words in @{term S} contain the same number of @{term 

10884  60 
a}'s and @{term b}'s. Since the definition of @{term S} is by mutual 
61 
induction, so is the proof: we show at the same time that all words in 

10236  62 
@{term A} contain one more @{term a} than @{term b} and all words in @{term 
27167  63 
B} contain one more @{term b} than @{term a}. 
10236  64 
*} 
10217  65 

10236  66 
lemma correctness: 
23380  67 
"(w \<in> S \<longrightarrow> size[x\<leftarrow>w. x=a] = size[x\<leftarrow>w. x=b]) \<and> 
68 
(w \<in> A \<longrightarrow> size[x\<leftarrow>w. x=a] = size[x\<leftarrow>w. x=b] + 1) \<and> 

69 
(w \<in> B \<longrightarrow> size[x\<leftarrow>w. x=b] = size[x\<leftarrow>w. x=a] + 1)" 

10217  70 

10236  71 
txt{*\noindent 
72 
These propositions are expressed with the help of the predefined @{term 

23380  73 
filter} function on lists, which has the convenient syntax @{text"[x\<leftarrow>xs. P 
10236  74 
x]"}, the list of all elements @{term x} in @{term xs} such that @{prop"P x"} 
10884  75 
holds. Remember that on lists @{text size} and @{text length} are synonymous. 
10236  76 

77 
The proof itself is by rule induction and afterwards automatic: 

78 
*} 

79 

10884  80 
by (rule S_A_B.induct, auto) 
10217  81 

10236  82 
text{*\noindent 
83 
This may seem surprising at first, and is indeed an indication of the power 

84 
of inductive definitions. But it is also quite straightforward. For example, 

85 
consider the production $A \to b A A$: if $v,w \in A$ and the elements of $A$ 

10884  86 
contain one more $a$ than~$b$'s, then $bvw$ must again contain one more $a$ 
87 
than~$b$'s. 

10236  88 

89 
As usual, the correctness of syntactic descriptions is easy, but completeness 

90 
is hard: does @{term S} contain \emph{all} words with an equal number of 

91 
@{term a}'s and @{term b}'s? It turns out that this proof requires the 

10884  92 
following lemma: every string with two more @{term a}'s than @{term 
93 
b}'s can be cut somewhere such that each half has one more @{term a} than 

10236  94 
@{term b}. This is best seen by imagining counting the difference between the 
10283  95 
number of @{term a}'s and @{term b}'s starting at the left end of the 
96 
word. We start with 0 and end (at the right end) with 2. Since each move to the 

10236  97 
right increases or decreases the difference by 1, we must have passed through 
98 
1 on our way from 0 to 2. Formally, we appeal to the following discrete 

99 
intermediate value theorem @{thm[source]nat0_intermed_int_val} 

16412  100 
@{thm[display,margin=60]nat0_intermed_int_val[no_vars]} 
10236  101 
where @{term f} is of type @{typ"nat \<Rightarrow> int"}, @{typ int} are the integers, 
11308  102 
@{text"\<bar>.\<bar>"} is the absolute value function\footnote{See 
103 
Table~\ref{tab:ascii} in the Appendix for the correct \textsc{ascii} 

11705  104 
syntax.}, and @{term"1::int"} is the integer 1 (see \S\ref{sec:numbers}). 
10217  105 

11147  106 
First we show that our specific function, the difference between the 
10236  107 
numbers of @{term a}'s and @{term b}'s, does indeed only change by 1 in every 
108 
move to the right. At this point we also start generalizing from @{term a}'s 

109 
and @{term b}'s to an arbitrary property @{term P}. Otherwise we would have 

110 
to prove the desired lemma twice, once as stated above and once with the 

111 
roles of @{term a}'s and @{term b}'s interchanged. 

112 
*} 

113 

114 
lemma step1: "\<forall>i < size w. 

23380  115 
\<bar>(int(size[x\<leftarrow>take (i+1) w. P x])int(size[x\<leftarrow>take (i+1) w. \<not>P x])) 
116 
 (int(size[x\<leftarrow>take i w. P x])int(size[x\<leftarrow>take i w. \<not>P x]))\<bar> \<le> 1" 

10236  117 

118 
txt{*\noindent 

119 
The lemma is a bit hard to read because of the coercion function 

11147  120 
@{text"int :: nat \<Rightarrow> int"}. It is required because @{term size} returns 
10884  121 
a natural number, but subtraction on type~@{typ nat} will do the wrong thing. 
10236  122 
Function @{term take} is predefined and @{term"take i xs"} is the prefix of 
10884  123 
length @{term i} of @{term xs}; below we also need @{term"drop i xs"}, which 
10236  124 
is what remains after that prefix has been dropped from @{term xs}. 
125 

126 
The proof is by induction on @{term w}, with a trivial base case, and a not 

127 
so trivial induction step. Since it is essentially just arithmetic, we do not 

128 
discuss it. 

129 
*} 

130 

12332  131 
apply(induct_tac w) 
16585  132 
apply(auto simp add: abs_if take_Cons split: nat.split) 
133 
done 

10217  134 

10236  135 
text{* 
11494  136 
Finally we come to the abovementioned lemma about cutting in half a word with two more elements of one sort than of the other sort: 
10236  137 
*} 
10217  138 

139 
lemma part1: 

23380  140 
"size[x\<leftarrow>w. P x] = size[x\<leftarrow>w. \<not>P x]+2 \<Longrightarrow> 
141 
\<exists>i\<le>size w. size[x\<leftarrow>take i w. P x] = size[x\<leftarrow>take i w. \<not>P x]+1" 

10236  142 

143 
txt{*\noindent 

10884  144 
This is proved by @{text force} with the help of the intermediate value theorem, 
10608  145 
instantiated appropriately and with its first premise disposed of by lemma 
146 
@{thm[source]step1}: 

10236  147 
*} 
148 

11870
181bd2050cf4
Numerals now work for the integers: the binary numerals for 0 and 1 rewrite
paulson
parents:
11705
diff
changeset

149 
apply(insert nat0_intermed_int_val[OF step1, of "P" "w" "1"]) 
11705  150 
by force 
10236  151 

152 
text{*\noindent 

153 

154 
Lemma @{thm[source]part1} tells us only about the prefix @{term"take i w"}. 

10884  155 
An easy lemma deals with the suffix @{term"drop i w"}: 
10236  156 
*} 
157 

10217  158 

159 
lemma part2: 

23380  160 
"\<lbrakk>size[x\<leftarrow>take i w @ drop i w. P x] = 
161 
size[x\<leftarrow>take i w @ drop i w. \<not>P x]+2; 

162 
size[x\<leftarrow>take i w. P x] = size[x\<leftarrow>take i w. \<not>P x]+1\<rbrakk> 

163 
\<Longrightarrow> size[x\<leftarrow>drop i w. P x] = size[x\<leftarrow>drop i w. \<not>P x]+1" 

12815  164 
by(simp del: append_take_drop_id) 
10217  165 

10236  166 
text{*\noindent 
11257  167 
In the proof we have disabled the normally useful lemma 
10884  168 
\begin{isabelle} 
169 
@{thm append_take_drop_id[no_vars]} 

170 
\rulename{append_take_drop_id} 

171 
\end{isabelle} 

11257  172 
to allow the simplifier to apply the following lemma instead: 
173 
@{text[display]"[x\<in>xs@ys. P x] = [x\<in>xs. P x] @ [x\<in>ys. P x]"} 

10236  174 

175 
To dispose of trivial cases automatically, the rules of the inductive 

176 
definition are declared simplification rules: 

177 
*} 

178 

11705  179 
declare S_A_B.intros[simp] 
10236  180 

181 
text{*\noindent 

182 
This could have been done earlier but was not necessary so far. 

183 

184 
The completeness theorem tells us that if a word has the same number of 

10884  185 
@{term a}'s and @{term b}'s, then it is in @{term S}, and similarly 
186 
for @{term A} and @{term B}: 

10236  187 
*} 
188 

189 
theorem completeness: 

23380  190 
"(size[x\<leftarrow>w. x=a] = size[x\<leftarrow>w. x=b] \<longrightarrow> w \<in> S) \<and> 
191 
(size[x\<leftarrow>w. x=a] = size[x\<leftarrow>w. x=b] + 1 \<longrightarrow> w \<in> A) \<and> 

192 
(size[x\<leftarrow>w. x=b] = size[x\<leftarrow>w. x=a] + 1 \<longrightarrow> w \<in> B)" 

10236  193 

194 
txt{*\noindent 

195 
The proof is by induction on @{term w}. Structural induction would fail here 

196 
because, as we can see from the grammar, we need to make bigger steps than 

197 
merely appending a single letter at the front. Hence we induct on the length 

198 
of @{term w}, using the induction rule @{thm[source]length_induct}: 

199 
*} 

10217  200 

11705  201 
apply(induct_tac w rule: length_induct) 
27167  202 
apply(rename_tac w) 
10236  203 

204 
txt{*\noindent 

205 
The @{text rule} parameter tells @{text induct_tac} explicitly which induction 

206 
rule to use. For details see \S\ref{sec:completeind} below. 

207 
In this case the result is that we may assume the lemma already 

27167  208 
holds for all words shorter than @{term w}. Because the induction step renames 
209 
the induction variable we rename it back to @{text w}. 

10236  210 

211 
The proof continues with a case distinction on @{term w}, 

11494  212 
on whether @{term w} is empty or not. 
10236  213 
*} 
214 

11705  215 
apply(case_tac w) 
216 
apply(simp_all) 

10236  217 
(*<*)apply(rename_tac x v)(*>*) 
218 

219 
txt{*\noindent 

11257  220 
Simplification disposes of the base case and leaves only a conjunction 
221 
of two step cases to be proved: 

10884  222 
if @{prop"w = a#v"} and @{prop[display]"size[x\<in>v. x=a] = size[x\<in>v. x=b]+2"} then 
10236  223 
@{prop"b#v \<in> A"}, and similarly for @{prop"w = b#v"}. 
224 
We only consider the first case in detail. 

225 

11257  226 
After breaking the conjunction up into two cases, we can apply 
10236  227 
@{thm[source]part1} to the assumption that @{term w} contains two more @{term 
228 
a}'s than @{term b}'s. 

229 
*} 

230 

11257  231 
apply(rule conjI) 
232 
apply(clarify) 

233 
apply(frule part1[of "\<lambda>x. x=a", simplified]) 

234 
apply(clarify) 

10236  235 
txt{*\noindent 
236 
This yields an index @{prop"i \<le> length v"} such that 

23380  237 
@{prop[display]"length [x\<leftarrow>take i v . x = a] = length [x\<leftarrow>take i v . x = b] + 1"} 
11147  238 
With the help of @{thm[source]part2} it follows that 
23380  239 
@{prop[display]"length [x\<leftarrow>drop i v . x = a] = length [x\<leftarrow>drop i v . x = b] + 1"} 
10236  240 
*} 
241 

11257  242 
apply(drule part2[of "\<lambda>x. x=a", simplified]) 
243 
apply(assumption) 

10236  244 

245 
txt{*\noindent 

246 
Now it is time to decompose @{term v} in the conclusion @{prop"b#v \<in> A"} 

247 
into @{term"take i v @ drop i v"}, 

11257  248 
*} 
249 

250 
apply(rule_tac n1=i and t=v in subst[OF append_take_drop_id]) 

251 

252 
txt{*\noindent 

253 
(the variables @{term n1} and @{term t} are the result of composing the 

254 
theorems @{thm[source]subst} and @{thm[source]append_take_drop_id}) 

10236  255 
after which the appropriate rule of the grammar reduces the goal 
256 
to the two subgoals @{prop"take i v \<in> A"} and @{prop"drop i v \<in> A"}: 

257 
*} 

258 

11257  259 
apply(rule S_A_B.intros) 
10236  260 

11257  261 
txt{* 
10236  262 
Both subgoals follow from the induction hypothesis because both @{term"take i 
263 
v"} and @{term"drop i v"} are shorter than @{term w}: 

264 
*} 

265 

11257  266 
apply(force simp add: min_less_iff_disj) 
267 
apply(force split add: nat_diff_split) 

10236  268 

11257  269 
txt{* 
10884  270 
The case @{prop"w = b#v"} is proved analogously: 
10236  271 
*} 
272 

11257  273 
apply(clarify) 
274 
apply(frule part1[of "\<lambda>x. x=b", simplified]) 

275 
apply(clarify) 

276 
apply(drule part2[of "\<lambda>x. x=b", simplified]) 

277 
apply(assumption) 

278 
apply(rule_tac n1=i and t=v in subst[OF append_take_drop_id]) 

279 
apply(rule S_A_B.intros) 

12815  280 
apply(force simp add: min_less_iff_disj) 
281 
by(force simp add: min_less_iff_disj split add: nat_diff_split) 

10217  282 

10236  283 
text{* 
10884  284 
We conclude this section with a comparison of our proof with 
11494  285 
Hopcroft\index{Hopcroft, J. E.} and Ullman's\index{Ullman, J. D.} 
286 
\cite[p.\ts81]{HopcroftUllman}. 

287 
For a start, the textbook 

11257  288 
grammar, for no good reason, excludes the empty word, thus complicating 
289 
matters just a little bit: they have 8 instead of our 7 productions. 

10236  290 

11147  291 
More importantly, the proof itself is different: rather than 
292 
separating the two directions, they perform one induction on the 

293 
length of a word. This deprives them of the beauty of rule induction, 

294 
and in the easy direction (correctness) their reasoning is more 

295 
detailed than our @{text auto}. For the hard part (completeness), they 

296 
consider just one of the cases that our @{text simp_all} disposes of 

297 
automatically. Then they conclude the proof by saying about the 

298 
remaining cases: ``We do this in a manner similar to our method of 

299 
proof for part (1); this part is left to the reader''. But this is 

300 
precisely the part that requires the intermediate value theorem and 

301 
thus is not at all similar to the other cases (which are automatic in 

302 
Isabelle). The authors are at least cavalier about this point and may 

303 
even have overlooked the slight difficulty lurking in the omitted 

11494  304 
cases. Such errors are found in many penandpaper proofs when they 
305 
are scrutinized formally.% 

306 
\index{grammars!defining inductively)} 

307 
*} 

10236  308 

10225  309 
(*<*)end(*>*) 