author | wenzelm |
Mon, 11 Aug 2014 22:29:48 +0200 | |
changeset 57901 | e1abca2527da |
parent 57105 | bf5ddf4ec64b |
child 57903 | ade8d65b212e |
permissions | -rw-r--r-- |
29315
b074c05f00ad
renamed ThyEdit (in thy_edit.ML) to ThySyntax (in thy_syntax.ML);
wenzelm
parents:
28454
diff
changeset
|
1 |
(* Title: Pure/Thy/thy_syntax.ML |
23726 | 2 |
Author: Makarius |
3 |
||
29315
b074c05f00ad
renamed ThyEdit (in thy_edit.ML) to ThySyntax (in thy_syntax.ML);
wenzelm
parents:
28454
diff
changeset
|
4 |
Superficial theory syntax: tokens and spans. |
23726 | 5 |
*) |
6 |
||
29315
b074c05f00ad
renamed ThyEdit (in thy_edit.ML) to ThySyntax (in thy_syntax.ML);
wenzelm
parents:
28454
diff
changeset
|
7 |
signature THY_SYNTAX = |
23726 | 8 |
sig |
36959
f5417836dbea
renamed structure OuterLex to Token and type token to Token.T, keeping legacy aliases for some time;
wenzelm
parents:
36950
diff
changeset
|
9 |
val parse_tokens: Scan.lexicon * Scan.lexicon -> Position.T -> string -> Token.T list |
57105 | 10 |
val reports_of_tokens: Token.T list -> bool * Position.report_text list |
40131
7cbebd636e79
explicitly qualify type Output.output, which is a slightly odd internal feature;
wenzelm
parents:
39507
diff
changeset
|
11 |
val present_token: Token.T -> Output.output |
57901
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
12 |
datatype span_kind = Command_Span of string * Position.T | Ignored_Span | Malformed_Span |
48867
e9beabf045ab
some support for inlining file content into outer syntax token language;
wenzelm
parents:
48865
diff
changeset
|
13 |
datatype span = Span of span_kind * Token.T list |
27842 | 14 |
val span_kind: span -> span_kind |
36959
f5417836dbea
renamed structure OuterLex to Token and type token to Token.T, keeping legacy aliases for some time;
wenzelm
parents:
36950
diff
changeset
|
15 |
val span_content: span -> Token.T list |
48865 | 16 |
val present_span: span -> Output.output |
46811
03a2dc9e0624
clarified command span: include trailing whitespace/comments and thus reduce number of ignored spans with associated transactions and states (factor 2);
wenzelm
parents:
45666
diff
changeset
|
17 |
val parse_spans: Token.T list -> span list |
54520
cee77d2e9582
release file errors at runtime: Command.eval instead of Command.read;
wenzelm
parents:
54451
diff
changeset
|
18 |
val resolve_files: (string -> Path.T * Position.T -> Token.file Exn.result list) -> span -> span |
51225 | 19 |
datatype 'a element = Element of 'a * ('a element list * 'a) option |
51321 | 20 |
val atom: 'a -> 'a element |
51225 | 21 |
val map_element: ('a -> 'b) -> 'a element -> 'b element |
22 |
val flat_element: 'a element -> 'a list |
|
51268
fcc4b89a600d
simplified Outer_Syntax.read_span: internalized Toplevel.is_ignored;
wenzelm
parents:
51267
diff
changeset
|
23 |
val last_element: 'a element -> 'a |
51225 | 24 |
val parse_elements: span list -> span element list |
23726 | 25 |
end; |
26 |
||
37216
3165bc303f66
modernized some structure names, keeping a few legacy aliases;
wenzelm
parents:
37197
diff
changeset
|
27 |
structure Thy_Syntax: THY_SYNTAX = |
23726 | 28 |
struct |
29 |
||
23803 | 30 |
(** tokens **) |
31 |
||
32 |
(* parse *) |
|
23726 | 33 |
|
40523
1050315f6ee2
simplified/robustified treatment of malformed symbols, which are now fully internalized (total Symbol.explode etc.);
wenzelm
parents:
40290
diff
changeset
|
34 |
fun parse_tokens lexs pos = |
46811
03a2dc9e0624
clarified command span: include trailing whitespace/comments and thus reduce number of ignored spans with associated transactions and states (factor 2);
wenzelm
parents:
45666
diff
changeset
|
35 |
Source.of_string #> |
03a2dc9e0624
clarified command span: include trailing whitespace/comments and thus reduce number of ignored spans with associated transactions and states (factor 2);
wenzelm
parents:
45666
diff
changeset
|
36 |
Symbol.source #> |
03a2dc9e0624
clarified command span: include trailing whitespace/comments and thus reduce number of ignored spans with associated transactions and states (factor 2);
wenzelm
parents:
45666
diff
changeset
|
37 |
Token.source {do_recover = SOME false} (K lexs) pos #> |
03a2dc9e0624
clarified command span: include trailing whitespace/comments and thus reduce number of ignored spans with associated transactions and states (factor 2);
wenzelm
parents:
45666
diff
changeset
|
38 |
Source.exhaust; |
23726 | 39 |
|
40 |
||
23803 | 41 |
(* present *) |
23726 | 42 |
|
43 |
local |
|
44 |
||
48749
c197b3c3e7fa
some attempts to keep malformed syntax errors focussed, without too much red spilled onto the document view;
wenzelm
parents:
46811
diff
changeset
|
45 |
fun reports_of_token tok = |
c197b3c3e7fa
some attempts to keep malformed syntax errors focussed, without too much red spilled onto the document view;
wenzelm
parents:
46811
diff
changeset
|
46 |
let |
55828
42ac3cfb89f6
clarified language markup: added "delimited" property;
wenzelm
parents:
55744
diff
changeset
|
47 |
val {text, pos, ...} = Token.source_position_of tok; |
48749
c197b3c3e7fa
some attempts to keep malformed syntax errors focussed, without too much red spilled onto the document view;
wenzelm
parents:
46811
diff
changeset
|
48 |
val malformed_symbols = |
55828
42ac3cfb89f6
clarified language markup: added "delimited" property;
wenzelm
parents:
55744
diff
changeset
|
49 |
Symbol_Pos.explode (text, pos) |
48749
c197b3c3e7fa
some attempts to keep malformed syntax errors focussed, without too much red spilled onto the document view;
wenzelm
parents:
46811
diff
changeset
|
50 |
|> map_filter (fn (sym, pos) => |
48756
1c843142758e
sneak message into "bad" markup as property -- to be displayed after YXML parsing;
wenzelm
parents:
48752
diff
changeset
|
51 |
if Symbol.is_malformed sym |
50201
c26369c9eda6
Isabelle-specific implementation of quasi-abstract markup elements -- back to module arrangement before d83797ef0d2d;
wenzelm
parents:
48878
diff
changeset
|
52 |
then SOME ((pos, Markup.bad), "Malformed symbolic character") else NONE); |
48749
c197b3c3e7fa
some attempts to keep malformed syntax errors focussed, without too much red spilled onto the document view;
wenzelm
parents:
46811
diff
changeset
|
53 |
val is_malformed = Token.is_error tok orelse not (null malformed_symbols); |
55915
607948c90bf0
suppress short abbreviations more uniformly, for outer and quasi-outer syntax;
wenzelm
parents:
55828
diff
changeset
|
54 |
val reports = Token.report tok :: Token.completion_report tok @ malformed_symbols; |
48749
c197b3c3e7fa
some attempts to keep malformed syntax errors focussed, without too much red spilled onto the document view;
wenzelm
parents:
46811
diff
changeset
|
55 |
in (is_malformed, reports) end; |
40528 | 56 |
|
23803 | 57 |
in |
58 |
||
48749
c197b3c3e7fa
some attempts to keep malformed syntax errors focussed, without too much red spilled onto the document view;
wenzelm
parents:
46811
diff
changeset
|
59 |
fun reports_of_tokens toks = |
c197b3c3e7fa
some attempts to keep malformed syntax errors focussed, without too much red spilled onto the document view;
wenzelm
parents:
46811
diff
changeset
|
60 |
let val results = map reports_of_token toks |
c197b3c3e7fa
some attempts to keep malformed syntax errors focussed, without too much red spilled onto the document view;
wenzelm
parents:
46811
diff
changeset
|
61 |
in (exists fst results, maps snd results) end; |
c197b3c3e7fa
some attempts to keep malformed syntax errors focussed, without too much red spilled onto the document view;
wenzelm
parents:
46811
diff
changeset
|
62 |
|
23726 | 63 |
fun present_token tok = |
55915
607948c90bf0
suppress short abbreviations more uniformly, for outer and quasi-outer syntax;
wenzelm
parents:
55828
diff
changeset
|
64 |
Markup.enclose (Token.markup tok) (Output.output (Token.unparse tok)); |
23726 | 65 |
|
23803 | 66 |
end; |
67 |
||
68 |
||
69 |
||
27665 | 70 |
(** spans **) |
71 |
||
57901
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
72 |
datatype span_kind = Command_Span of string * Position.T | Ignored_Span | Malformed_Span; |
36959
f5417836dbea
renamed structure OuterLex to Token and type token to Token.T, keeping legacy aliases for some time;
wenzelm
parents:
36950
diff
changeset
|
73 |
datatype span = Span of span_kind * Token.T list; |
23803 | 74 |
|
27842 | 75 |
fun span_kind (Span (k, _)) = k; |
76 |
fun span_content (Span (_, toks)) = toks; |
|
77 |
||
48865 | 78 |
val present_span = implode o map present_token o span_content; |
79 |
||
23803 | 80 |
|
81 |
(* parse *) |
|
23726 | 82 |
|
23803 | 83 |
local |
84 |
||
57901
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
85 |
fun ship span = |
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
86 |
let |
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
87 |
val kind = |
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
88 |
if not (null span) andalso Token.is_command (hd span) then |
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
89 |
Command_Span (Token.content_of (hd span), Token.pos_of (hd span)) |
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
90 |
else if forall Token.is_improper span then Ignored_Span |
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
91 |
else Malformed_Span; |
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
92 |
in cons (Span (kind, span)) end; |
23726 | 93 |
|
57901
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
94 |
fun flush (result, content, improper) = |
53843
88c6e630c15f
clarified command spans (again, see also 03a2dc9e0624): restrict to proper range to allow Isabelle.command_edit adding material monotonically without destroying the command (e.g. relevant for sendback from sledgehammer);
wenzelm
parents:
51423
diff
changeset
|
95 |
result |
57901
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
96 |
|> not (null content) ? ship (rev content) |
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
97 |
|> not (null improper) ? ship (rev improper); |
53843
88c6e630c15f
clarified command spans (again, see also 03a2dc9e0624): restrict to proper range to allow Isabelle.command_edit adding material monotonically without destroying the command (e.g. relevant for sendback from sledgehammer);
wenzelm
parents:
51423
diff
changeset
|
98 |
|
57901
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
99 |
fun parse tok (result, content, improper) = |
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
100 |
if Token.is_command tok then (flush (result, content, improper), [tok], []) |
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
101 |
else if Token.is_improper tok then (result, content, tok :: improper) |
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
102 |
else (result, tok :: (improper @ content), []); |
23726 | 103 |
|
104 |
in |
|
105 |
||
46811
03a2dc9e0624
clarified command span: include trailing whitespace/comments and thus reduce number of ignored spans with associated transactions and states (factor 2);
wenzelm
parents:
45666
diff
changeset
|
106 |
fun parse_spans toks = |
57901
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
107 |
fold parse toks ([], [], []) |> flush |> rev; |
23803 | 108 |
|
109 |
end; |
|
110 |
||
111 |
||
54451 | 112 |
(* inlined files *) |
113 |
||
114 |
local |
|
115 |
||
116 |
fun clean ((i1, t1) :: (i2, t2) :: toks) = |
|
117 |
if Token.keyword_with (fn s => s = "%" orelse s = "--") t1 then clean toks |
|
118 |
else (i1, t1) :: clean ((i2, t2) :: toks) |
|
119 |
| clean toks = toks; |
|
120 |
||
121 |
fun clean_tokens toks = |
|
122 |
((0 upto length toks - 1) ~~ toks) |
|
123 |
|> filter (fn (_, tok) => Token.is_proper tok) |
|
124 |
|> clean; |
|
125 |
||
55118
7df949045dc5
clarified approximative syntax of thy_load commands: first name after command keyword, after cleaning wrt. tags and cmts;
wenzelm
parents:
55033
diff
changeset
|
126 |
fun find_file ((_, tok) :: toks) = |
7df949045dc5
clarified approximative syntax of thy_load commands: first name after command keyword, after cleaning wrt. tags and cmts;
wenzelm
parents:
55033
diff
changeset
|
127 |
if Token.is_command tok then |
7df949045dc5
clarified approximative syntax of thy_load commands: first name after command keyword, after cleaning wrt. tags and cmts;
wenzelm
parents:
55033
diff
changeset
|
128 |
toks |> get_first (fn (i, tok) => |
7df949045dc5
clarified approximative syntax of thy_load commands: first name after command keyword, after cleaning wrt. tags and cmts;
wenzelm
parents:
55033
diff
changeset
|
129 |
if Token.is_name tok then |
55708 | 130 |
SOME (i, (Path.explode (Token.content_of tok), Token.pos_of tok)) |
131 |
handle ERROR msg => error (msg ^ Position.here (Token.pos_of tok)) |
|
55118
7df949045dc5
clarified approximative syntax of thy_load commands: first name after command keyword, after cleaning wrt. tags and cmts;
wenzelm
parents:
55033
diff
changeset
|
132 |
else NONE) |
7df949045dc5
clarified approximative syntax of thy_load commands: first name after command keyword, after cleaning wrt. tags and cmts;
wenzelm
parents:
55033
diff
changeset
|
133 |
else NONE |
7df949045dc5
clarified approximative syntax of thy_load commands: first name after command keyword, after cleaning wrt. tags and cmts;
wenzelm
parents:
55033
diff
changeset
|
134 |
| find_file [] = NONE; |
54451 | 135 |
|
136 |
in |
|
137 |
||
138 |
fun resolve_files read_files span = |
|
139 |
(case span of |
|
57901
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
140 |
Span (Command_Span (cmd, pos), toks) => |
54451 | 141 |
if Keyword.is_theory_load cmd then |
55118
7df949045dc5
clarified approximative syntax of thy_load commands: first name after command keyword, after cleaning wrt. tags and cmts;
wenzelm
parents:
55033
diff
changeset
|
142 |
(case find_file (clean_tokens toks) of |
54451 | 143 |
NONE => error ("Bad file argument of command " ^ quote cmd ^ Position.here pos) |
144 |
| SOME (i, path) => |
|
145 |
let |
|
146 |
val toks' = toks |> map_index (fn (j, tok) => |
|
147 |
if i = j then Token.put_files (read_files cmd path) tok |
|
148 |
else tok); |
|
57901
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
149 |
in Span (Command_Span (cmd, pos), toks') end) |
54451 | 150 |
else span |
151 |
| _ => span); |
|
152 |
||
153 |
end; |
|
154 |
||
155 |
||
28434 | 156 |
|
43621 | 157 |
(** specification elements: commands with optional proof **) |
158 |
||
51225 | 159 |
datatype 'a element = Element of 'a * ('a element list * 'a) option; |
160 |
||
161 |
fun element (a, b) = Element (a, SOME b); |
|
162 |
fun atom a = Element (a, NONE); |
|
43621 | 163 |
|
51225 | 164 |
fun map_element f (Element (a, NONE)) = Element (f a, NONE) |
165 |
| map_element f (Element (a, SOME (elems, b))) = |
|
166 |
Element (f a, SOME ((map o map_element) f elems, f b)); |
|
167 |
||
168 |
fun flat_element (Element (a, NONE)) = [a] |
|
169 |
| flat_element (Element (a, SOME (elems, b))) = a :: maps flat_element elems @ [b]; |
|
43621 | 170 |
|
51268
fcc4b89a600d
simplified Outer_Syntax.read_span: internalized Toplevel.is_ignored;
wenzelm
parents:
51267
diff
changeset
|
171 |
fun last_element (Element (a, NONE)) = a |
fcc4b89a600d
simplified Outer_Syntax.read_span: internalized Toplevel.is_ignored;
wenzelm
parents:
51267
diff
changeset
|
172 |
| last_element (Element (_, SOME (_, b))) = b; |
fcc4b89a600d
simplified Outer_Syntax.read_span: internalized Toplevel.is_ignored;
wenzelm
parents:
51267
diff
changeset
|
173 |
|
28434 | 174 |
|
175 |
(* scanning spans *) |
|
176 |
||
57901
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
177 |
val eof = Span (Command_Span ("", Position.none), []); |
28434 | 178 |
|
57901
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
179 |
fun is_eof (Span (Command_Span ("", _), _)) = true |
28434 | 180 |
| is_eof _ = false; |
181 |
||
182 |
val not_eof = not o is_eof; |
|
183 |
||
184 |
val stopper = Scan.stopper (K eof) is_eof; |
|
185 |
||
186 |
||
46811
03a2dc9e0624
clarified command span: include trailing whitespace/comments and thus reduce number of ignored spans with associated transactions and states (factor 2);
wenzelm
parents:
45666
diff
changeset
|
187 |
(* parse *) |
28434 | 188 |
|
189 |
local |
|
190 |
||
48878 | 191 |
fun command_with pred = |
57901
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
192 |
Scan.one (fn (Span (Command_Span (name, _), _)) => pred name | _ => false); |
28434 | 193 |
|
51225 | 194 |
val proof_atom = |
57901
e1abca2527da
more explicit type Span in Scala, according to ML version;
wenzelm
parents:
57105
diff
changeset
|
195 |
Scan.one (fn (Span (Command_Span (name, _), _)) => Keyword.is_proof_body name | _ => true) >> atom; |
28434 | 196 |
|
51225 | 197 |
fun proof_element x = (command_with Keyword.is_proof_goal -- proof_rest >> element || proof_atom) x |
198 |
and proof_rest x = (Scan.repeat proof_element -- command_with Keyword.is_qed) x; |
|
199 |
||
200 |
val other_element = |
|
201 |
command_with Keyword.is_theory_goal -- proof_rest >> element || |
|
202 |
Scan.one not_eof >> atom; |
|
28434 | 203 |
|
204 |
in |
|
205 |
||
46811
03a2dc9e0624
clarified command span: include trailing whitespace/comments and thus reduce number of ignored spans with associated transactions and states (factor 2);
wenzelm
parents:
45666
diff
changeset
|
206 |
val parse_elements = |
03a2dc9e0624
clarified command span: include trailing whitespace/comments and thus reduce number of ignored spans with associated transactions and states (factor 2);
wenzelm
parents:
45666
diff
changeset
|
207 |
Source.of_list #> |
51225 | 208 |
Source.source stopper (Scan.bulk other_element) NONE #> |
46811
03a2dc9e0624
clarified command span: include trailing whitespace/comments and thus reduce number of ignored spans with associated transactions and states (factor 2);
wenzelm
parents:
45666
diff
changeset
|
209 |
Source.exhaust; |
28434 | 210 |
|
23726 | 211 |
end; |
28434 | 212 |
|
213 |
end; |