diff --git a/CHANGES.md b/CHANGES.md index a26d6662c..0a717773e 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -7,6 +7,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/). ### Added +- CLI command `deindex` to remove constants from the index. +- Indexing of symbols from current development (as well as currently required files) and their deindexing when files are closed are now automatically supported. +- Added filtering of search results using regular expressions. +- Added support for basic Rocq syntax for writing search queries (fun, forall, exists, /\ and ~). +- Allow the `--require` flag to be used multiple times with the `search` and `websearch` commands. +- Ambiguity due to overloaded symbols is now solved by normalisation. +- Added streaming of results in command line search. +- Supporting `Plac` in rewriting rules. +- Fixed Stack_overflow exception due large number of search results. - Decimal numbers can now be qualified by a module path so that one can use the decimal notation with different types in the same file. ### Changed diff --git a/TODO.md b/TODO.md index a1364e94e..187cb0c9a 100644 --- a/TODO.md +++ b/TODO.md @@ -4,19 +4,89 @@ TODO Index and search ---------------- -* Too many results found? +* @abdelghani include in HOL2DK_indexing git sub-repos of + coq-hol-light-real-with-N and coq-hol-light + and automatize the generation of the file to output Coq sources -anywhere >= (Π x: _, (= _ V# V#)) -anywhere >= (Π x: _, (= _ x x)) +* @abdelghani + - (maybe?) allow an --add-examples=FNAME to include in the + generated webpage an HTML snippet (e.g. with examples or + additional infos for that instance) -* html tags in textual output :-( +* syntactic sugar for regular expressions / way to write a regular + expression - only query efficiently + (concl = _ | "regexpr") + +* normalize queries when given as commands in lambdapi + +* generate mappings from LP to V automatically with HOL2DK and find + what to do with manually written files; also right now there are + mappings that are lost and mappings that are confused in a many-to-one + relation + +Think about +----------- + +* what notations for Coq should websearch know about; + right now they are in: + - notation.lp file + problem: if you want to add a lot of them, you need to + require the definitions first (without requiring all the + library...) + - hard-coded lexing tokens in rocqLexer, e.g. /\ mapped to + the corresponding Unicode character * would it be more reasonable to save the normalization rules when the index is created and apply them as default when searching, in particular when searching as a lambdapi command? -* normalize queries when given as commands in lambdapi +* package management + +* update index / deindex (it should work at package level) + +* more external sources when showing query results, including Git repos + +* VS code integration: right now lambdapi works on the current development + mixed with remote, but different sets of rewriting rules would make sense; + should it instead only work with the current development and dispatch + queries via VS code to a remote websearch? + +* ranking + +Performance improvements +------------------------ + +* see if compressing the index yields a significant gain + +* currently in a query like 'concl = _' it builds an extremely large matching set + and THEN it filters out the justifications that have Concl Exact; maybe we + could save a lot of time anticipating the filtrage + +Documentation +------------- + +* document the Coq syntax: ~ /\ \/ -> = forall exists fun + +* fix the doc: --require can now be repeated + +* fix the doc: not only "anywhere" but also "type" can be paired + only with ">="; maybe make it explicit that to match exactly the + type of a constant one should use "spine =" + +* document new features, e.g. -sources (and find better + terminology), deindex + +* document require ... as Foo: using Foo.X in the query already + works (pure magic!); of course it does not work when using + regular expressions [ check before! ] + +* Misleading output for: -* alignments with same name ==> automatic preference? + anywhere >= (Π x: _, (= _ V# V#)) + anywhere >= (Π x: _, (= _ x x)) -* better pagination + Are there too many results? NO, it's ok, but the output is misleading. + The second form is equivalent + to the first that is equivalent to (_ -> (= _ V# V#)) that is what it is + found. However it shows the pattern saying that " (Π x: _, (= _ x x))" was + found, that is the misleading thing. diff --git a/doc/options.rst b/doc/options.rst index bab6c2c89..749ee2a04 100644 --- a/doc/options.rst +++ b/doc/options.rst @@ -11,6 +11,7 @@ The available commands are: * ``check``: check the correctness of input files. * ``decision-tree``: output the decision tree of a symbol as a Dot graph (see :doc:`dtrees`) +* ``deindex``: remove constants from the search index given a prefix path * ``export``: translate the input file to other formats. * ``help``: display the main help message. * ``index``: create an index of symbols and rules of input files. @@ -44,7 +45,7 @@ The ``index`` command generates the file ``~/.LPSearch.db`` if ``$HOME`` is defi **Remark on search:** -The command ``search`` takes as argument a query and runs it against the index file ``~/.LPSearch.db``. It is also possible to normalize terms in the query wrt some rules by using ``--rules`` options. It is advised to use the same set of rules previously used during indexing. It is also possible to pass via ``--require`` a file to be required and opened before performing the query, e.g. to specify implicit arguments for symbols. See :doc:`query_language` for the specification of the query language. +The command ``search`` takes as argument a query and runs it against the index file ``~/.LPSearch.db``. It is also possible to normalize terms in the query wrt some rules by using ``--rules`` options. It is advised to use the same set of rules previously used during indexing. It is also possible to pass via ``--require`` files to be required and opened before performing the query, e.g. to specify implicit arguments for symbols. See :doc:`query_language` for the specification of the query language. **Common flags:** @@ -88,6 +89,14 @@ decision-tree * ``--ghost`` print the decision tree of a ghost symbol. Ghost symbols are symbols used internally that cannot be used in the concrete syntax. +deindex +------- + +* ``--path``: indicates the prefix of symbols paths to be removed from the index. + The path must be dot (`.`) and is not checked for well formness (i.e. A.B matches A.BC). + For instance, `lambdapi deindex --path tests.OK.natural`` removes from the index all the symbols + whose path starts with `tests.OK.natural` like `tests.OK.natural.N`. + export ------ @@ -174,6 +183,9 @@ index * ``--db `` tells lambdapi to index symbols and rules in ```` instead of ``~/.LPSearch.db``. +* ``--source`` indicates the path to the file containing the mapping to additional sources (for instance, Rocq sources corresponding to indexed ones). + These sources will also be displayed by the websearch engine when showing the results. + install/uninstall ----------------- @@ -191,7 +203,7 @@ search * ``--rules `` tells lambdapi to normalize terms in the query using the rules given in the file ````. Several files can be specified by using several ``--rules`` options. In these files, symbols must be fully qualified but no ``require`` command is needed. Moreover, the rules do not need to preserve typing. On the other hand, right hand-side of rules must contain implicit arguments. It is advised to use the same set of rules previously used during indexing. -* ``--require `` requires and open ```` when starting the search engine. The file can be used for example to specify implicit arguments for symbols used in the queries. +* ``--require `` requires and opens ```` when starting the search engine. The file can be used for example to specify implicit arguments for symbols used in the queries. * ``--db `` tells lambdapi to search in ```` instead of ``~/.LPSearch.db``. diff --git a/doc/queries.rst b/doc/queries.rst index dd476fbb5..ba457ad99 100644 --- a/doc/queries.rst +++ b/doc/queries.rst @@ -110,7 +110,8 @@ beginning, the timeout is set to 2s. ------------------ Runs a query between double quotes against the index file -``~/.LPSearch.db``. See :doc:`query_language` for the query language +``~/.LPSearch.db`` updated with the assets defined in the file under development including the assets "imported" by the `require` command. +See :doc:`query_language` for the query language specification. :: diff --git a/doc/query_language.rst b/doc/query_language.rst index 1016f8e5b..6b59d2b7c 100644 --- a/doc/query_language.rst +++ b/doc/query_language.rst @@ -7,11 +7,11 @@ Queries can be expressed according to the following syntax: Q ::= B | Q,Q | Q;Q | Q|PATH B ::= WHERE HOW GENERALIZE? PATTERN - PATH ::= << string >> + PATH ::= << string >> | "<< regexp>>" WHERE ::= name | anywhere | rule | lhs | rhs | type | concl | hyp | spine HOW ::= > | = | >= | ≥ GENERALIZE ::= generalize - PATTERN ::= << term possibly containing placeholders _ (for terms) and V# (for variable occurrences >> + PATTERN ::= << term possibly containing placeholders _ (for terms) and V# (for variable occurrences) >> where @@ -19,10 +19,12 @@ where * parentheses can be used as usual to force a different precedence order * ``anywhere`` can be paired only with ``>=`` and ``name`` can be paired only with ``>=`` and no ``generalize`` * a pattern should be wrapped in parentheses, unless it is atomic (e.g. an identifier or a placeholder) +* a pattern is expressed using the Lambdapi terms syntax. Additionaly, the search engine allows the use of a basic Rocq syntax to express terms. + Specifically, ``fun``, ``forall``, ``exists``, ``/\`` and ``~`` are supported to express terms inside the pattern. The semantics of the query language is the following: -* a query ``Q`` is either a base query ``B``, the conjunction ``Q1,Q2`` of two queries ``Q1`` and ``Q2``, their disjunction ``Q1;Q2`` or the query ``Q|PATH`` that behaves as ``Q``, but only keeps the results whose path is a suffix of ``PATH`` (that must be a valid path prefix) +* a query ``Q`` is either a base query ``B``, the conjunction ``Q1,Q2`` of two queries ``Q1`` and ``Q2``, their disjunction ``Q1;Q2`` or the query ``Q|PATH`` that behaves as ``Q``, but only keeps the results whose path is a suffix of ``PATH`` (that must be a valid path prefix) or matches the regular expression between double quotes (``""``) * a base query ``name = ID`` looks for symbols with name ``ID`` in the library. The identifier ``ID`` must not be qualified. * a base query ``WHERE HOW GENERALIZE? PATTERN`` looks in the library for occurrences of the pattern ``PATTERN`` **up to normalization rules** and, if ``generalize`` is specified, also **up to generalization** of the pattern. The normalization rules are library specific and are employed during indexing. They can be used, for example, to remove the clutter associated to encodings, to align concepts by mapping symbols to cross-library standard ones, or to standardize the shape of statements to improve recall (e.g. replacing occurrence of ``x > y`` with ``y < x``). @@ -42,8 +44,6 @@ The semantics of the query language is the following: * ``=`` the pattern must match the whole position * ``>`` the pattern must match a strict subterm of the position -Note that for commodity, ``forall`` and ``->`` can be used inside the query instead of the Unicode characters ``Π`` and ``→`` respectively. - Examples: * ``hyp = (nat → bool) , hyp >= (list nat)`` @@ -53,5 +53,9 @@ Examples: in a module whose path is a suffix of ``math.arithmetics``. The query can return ``plus_O : ∀x: nat. plus x O = x`` where ``plus_O`` has fully qualified name ``math.arithmetics.addition.plus`` + * ``concl > plus | "*.arithmetics"`` + searches for theorems having an hypothesis containing ``plus`` and located + in a module whose path matches ``*.arithmetics``. The query + can return ``math.arithmetics.addition.plus`` and ``mathematics.arithmetics.addition.plus`` * ``name = nat ; name = NAT`` searches for symbols named either ``nat`` or ``NAT`` diff --git a/dune-project b/dune-project index 1c777b972..ea0e2d351 100644 --- a/dune-project +++ b/dune-project @@ -46,6 +46,7 @@ systems: Dedukti, Coq, HRS, CPF.") (odoc :with-doc) (lwt_ppx (>= 1)) (uri (>= 1.1)) + (dune-site (>= 3.15)) (ppx_inline_test (>= v0.17)) ) (sites (share server_resources)) diff --git a/lambdapi.opam b/lambdapi.opam index ba734125a..56957cae5 100644 --- a/lambdapi.opam +++ b/lambdapi.opam @@ -40,6 +40,7 @@ depends: [ "odoc" {with-doc} "lwt_ppx" {>= "1"} "uri" {>= "1.1"} + "dune-site" {>= "3.15"} "ppx_inline_test" ] build: [ diff --git a/ressources/description.html b/ressources/description.html new file mode 100644 index 000000000..f148cef5a --- /dev/null +++ b/ressources/description.html @@ -0,0 +1,117 @@ + + + + + + +
+ + +

+ The search button answers the query. Read the + query language specification to learn about the query language. +
The query language also uses the + Lambdapi terms syntax.
+ while supporting a subset of the Rocq syntax (specificaly, + terms using fun, forall, exists, /\ and ~ are allowed) + In particular, the following constructors can come handy for + writing queries:
+

+
    +
  • an anonymous functionλ (x:A) y z,t + mapping x, y + and z (of type A for x) to t.
  • +
  • a dependent product + forall (x:A) y z,T +
  • +
  • a non-dependent product A -> T + (syntactic sugar for forall x:A,T when + x does not occur in T)
  • +
+ +
+ + \ No newline at end of file diff --git a/ressources/dune b/ressources/dune new file mode 100644 index 000000000..3e6f6f399 --- /dev/null +++ b/ressources/dune @@ -0,0 +1,24 @@ +(library + (name ressources) + (public_name lambdapi.ressources) + (modules :standard) + (libraries lambdapi.parsing lambdapi.core dream unix dune-site) + (preprocess (pps lwt_ppx)) +) + +; (rule +; (targets websearch.ml) +; (deps websearch.eml.ml) +; (action (run dream_eml %{deps} --workspace %{workspace_root}))) + +(install + (section (site (lambdapi server_resources))) + (files + (description.html as default/description.html) + (lambdapi.ico as default/lambdapi.ico) + ) +) + +(generate_sites_module + (module mysites) + (sites lambdapi)) diff --git a/src/cli/lambdapi.ico b/ressources/lambdapi.ico similarity index 100% rename from src/cli/lambdapi.ico rename to ressources/lambdapi.ico diff --git a/src/cli/config.ml b/src/cli/config.ml index 14240aaab..5a6e8b9e3 100644 --- a/src/cli/config.ml +++ b/src/cli/config.ml @@ -109,7 +109,7 @@ let map_dir : (Path.t * string) list CLT.t = let path : Path.t Arg.conv = let parse (s: string) : (Path.t, [>`Msg of string]) result = try Ok(Parser.path_of_string s) - with Error.Fatal(_,s) -> Error(`Msg(s)) + with Error.Fatal(_,s, err_desc) -> Error(`Msg(s ^ "\n" ^ err_desc)) in let print fmt p = Path.pp fmt p in Arg.conv (parse, print) diff --git a/src/cli/dune b/src/cli/dune index 39add14f2..b6692f1c2 100644 --- a/src/cli/dune +++ b/src/cli/dune @@ -4,5 +4,4 @@ (modes byte native) (modules :standard) (libraries cmdliner lambdapi.lsp lambdapi.tool lambdapi.handle - lambdapi.export unix) -) + lambdapi.export unix ressources)) diff --git a/src/cli/lambdapi.ml b/src/cli/lambdapi.ml index a505e7429..eac1554a4 100644 --- a/src/cli/lambdapi.ml +++ b/src/cli/lambdapi.ml @@ -20,73 +20,42 @@ module CLT = Cmdliner.Term module LPSearchMain = struct -let sig_state_of_require = - function - None -> Core.Sig_state.dummy - | Some req -> - (* Search for a package from the current working directory. *) - Package.apply_config (Filename.concat (Sys.getcwd()) ".") ; - Core.Sig_state.of_sign - (Compile.compile (Parsing.Parser.path_of_string req)) +let sig_state_of_require l = + (* Search for a package from the current working directory. *) + Package.apply_config (Filename.concat (Sys.getcwd()) ".") ; + List.fold_left + (fun ss req -> + Handle.Command.handle Compile.compile ss + (Pos.none + (Parsing.Syntax.P_require + (Some false, [Pos.none (Parsing.Parser.path_of_string req)])))) + Core.Sig_state.dummy l let search_cmd cfg rules require s dbpath_opt = Config.init cfg; let run () = Tool.Indexing.load_rewriting_rules rules ; + Tool.Indexing.force_meta_rules_loading () ; let ss = sig_state_of_require require in let dbpath = Option.get Path.default_dbpath dbpath_opt in - out Format.std_formatter "%s@." - (Tool.Indexing.search_cmd_txt ss s ~dbpath) in + out Format.std_formatter "%a@." + (Tool.Indexing.search_cmd_txt ss ~dbpath) s in Error.handle_exceptions run let websearch_cmd cfg rules port require header_file dbpath_opt path_in_url = Config.init cfg; let run () = Tool.Indexing.load_rewriting_rules rules ; + Tool.Indexing.force_meta_rules_loading () ; let ss = sig_state_of_require require in let header = match header_file with | None -> - " - -

LambdaPi - Search Engine

-
-

- The search button answers the query. Read the - query language specification to learn about the query language. -
The query language also uses the - Lambdapi terms syntax.
with the possibility to use, - for commodity, - \"forall\" and \"->\" instead of \"Π\" and \"→\" respectively - (results are displayed with the Unicode symbols - \"Π\" and \"→\" though). - In particular, the following constructors can come handy for - writing queries:
-

-
    -
  • an anonymous functionλ (x:A) y z,t - mapping x, y - and z (of type A for x) to t.
  • -
  • a dependent product - forall (x:A) y z,T -
  • -
  • a non-dependent product A -> T - (syntactic sugar for forall x:A,T when - x does not occur in T)
  • -
-
- " + let themes_locations : string list = + Ressources.Mysites.Sites.server_resources in + let file = match themes_locations with + | [] -> assert false + | x :: _ -> x in + Lplib.String.string_of_file (file ^ "/default/description.html") | Some file -> Lplib.String.string_of_file file in let dbpath = Option.get Path.default_dbpath dbpath_opt in let path_in_url = match path_in_url with @@ -95,7 +64,7 @@ let websearch_cmd cfg rules port require header_file dbpath_opt path_in_url = Tool.Websearch.start ~header ss ~port ~dbpath ~path_in_url () in Error.handle_exceptions run -let index_cmd cfg add_only rules files dbpath_opt = +let index_cmd cfg add_only rules files source dbpath_opt = Config.init cfg; let run () = if not add_only then Tool.Indexing.empty (); @@ -104,10 +73,19 @@ let index_cmd cfg add_only rules files dbpath_opt = let time = Time.save () in let handle file = Console.reset_default (); - Time.restore time; + Tool.Indexing.preserving_index Time.restore time; Tool.Indexing.load_rewriting_rules rules; Tool.Indexing.index_sign (no_wrn Compile.compile_file file) in List.iter handle files; + Option.iter Tool.Indexing.parse_source_map source; + let dbpath = Option.get Path.default_dbpath dbpath_opt in + Tool.Indexing.dump ~dbpath () in + Error.handle_exceptions run + +let deindex_cmd cfg path dbpath_opt = + Config.init cfg; + let run () = + Tool.Indexing.deindex_path path ; let dbpath = Option.get Path.default_dbpath dbpath_opt in Tool.Indexing.dump ~dbpath () in Error.handle_exceptions run @@ -201,7 +179,10 @@ let export_cmd (cfg:Config.t) (output:output option) (encoding:string option) (** Running the LSP server. *) let lsp_server_cmd : Config.t -> bool -> string -> unit = fun cfg standard_lsp lsp_log_file -> - let run _ = Config.init cfg; Lsp.Lp_lsp.main standard_lsp lsp_log_file in + let run _ = + Config.init cfg; + Common.Console.lsp_mod := true ; + Lsp.Lp_lsp.main standard_lsp lsp_log_file in Error.handle_exceptions run (** Printing a decision tree. *) @@ -274,7 +255,7 @@ let qident : qident CLT.t = let qident : qident Arg.conv = let parse (s: string): (qident, [>`Msg of string]) result = try Ok(Parser.qident_of_string s) - with Fatal(_,s) -> Error(`Msg(s)) + with Fatal(_,s,_) -> Error(`Msg(s)) in let print fmt qid = Pretty.qident fmt (Pos.none qid) in Arg.conv (parse, print) @@ -473,21 +454,31 @@ let rules_arg : string list CLT.t = multiple times to fetch rules from multiple files." in Arg.(value & opt_all string [] & info ["rules"] ~docv:"FILENAME" ~doc) -let require_arg : string option CLT.t = +let require_arg : string list CLT.t = let doc = - "LP file to be required before starting the search engine." in - Arg.(value & opt (some string) None & info ["require"] ~docv:"PATH" ~doc) + "LP files to be required before starting the search engine." in + Arg.(value & opt_all string [] & info ["require"] ~docv:"PATH" ~doc) let custom_dbpath : string option CLT.t = let doc = "Path to the search DB file." in Arg.(value & opt (some string) None & info ["db"] ~docv:"PATH" ~doc) +let source_file : string option CLT.t = + let doc = + "Path to the mapping to additional sources." in + Arg.(value & opt (some string) None & info ["sources"] ~docv:"PATH" ~doc) + let path_in_url : string option CLT.t = let doc = "The path in the URL accepted by the server." in Arg.(value & opt (some string) None & info ["url"] ~docv:"String" ~doc) +let path_prefix : string CLT.t = + let doc = + "The prefix path of the constants to de-index." in + Arg.(required & pos 0 (some string) None & info [] ~docv:"PATH" ~doc) + let header_file_arg : string option CLT.t = let doc = "html file holding the header of the web page of the server." in @@ -498,7 +489,14 @@ let index_cmd = let doc = "Index the given files." in Cmd.v (Cmd.info "index" ~doc ~man:man_pkg_file) Cmdliner.Term.(const LPSearchMain.index_cmd $ Config.full - $ add_only_arg $ rules_arg $ files $ custom_dbpath) + $ add_only_arg $ rules_arg $ files $ source_file $ custom_dbpath) + +let deindex_cmd = + let doc = + "De-index all constants whose path is a suffix of the given path." in + Cmd.v (Cmd.info "deindex" ~doc ~man:man_pkg_file) + Cmdliner.Term.(const LPSearchMain.deindex_cmd $ Config.full + $ path_prefix $ custom_dbpath) let search_cmd = let doc = "Run a search query against the index." in @@ -522,7 +520,7 @@ let _ = [ check_cmd ; parse_cmd ; export_cmd ; lsp_server_cmd ; decision_tree_cmd ; help_cmd ; version_cmd ; Init.cmd ; Install.install_cmd ; Install.uninstall_cmd - ; index_cmd ; search_cmd ; websearch_cmd ] + ; index_cmd ; deindex_cmd ; search_cmd ; websearch_cmd ] in let doc = "A type-checker for the lambdapi-calculus modulo rewriting." in let sdocs = Manpage.s_common_options in diff --git a/src/common/console.ml b/src/common/console.ml index 32a2bdb6a..efd8b53f9 100644 --- a/src/common/console.ml +++ b/src/common/console.ml @@ -103,3 +103,8 @@ module State = struct in apply e end + +(** [lsp_mod] indicates whether we are executing the LSP server. + Constants and rules are indexed automatically only in LSP mode + and not in check mode *) +let lsp_mod = Stdlib.ref false diff --git a/src/common/error.ml b/src/common/error.ml index 40cad8fe5..3c024a1f4 100644 --- a/src/common/error.ml +++ b/src/common/error.ml @@ -36,8 +36,9 @@ let no_wrn : ('a -> 'b) -> 'a -> 'b = fun f x -> code position (e.g., errors related to command-line arguments parsing). In cases where positions are expected [Some None] may be used to indicate the abscence of a position. This may happen when terms are generated (e.g., by - a form of desugaring). *) -exception Fatal of Pos.popt option * string + a form of desugaring). The last argument is used to provide an optional + description of the error, displayed differently from the error itself. *) +exception Fatal of Pos.popt option * string * string (** [fatal_str fmt] may be called an arbitrary number of times to build up the error message of the [fatal] or [fatal_no_pos] functions prior to calling @@ -46,17 +47,22 @@ exception Fatal of Pos.popt option * string let fatal_msg : 'a outfmt -> 'a = fun fmt -> out Format.str_formatter fmt -(** [fatal popt fmt] raises the [Fatal(popt,msg)] exception, in which [msg] is - built from the format [fmt] (provided the necessary arguments). *) +(** [fatal popt fmt] raises the [Fatal(popt,msg,err_desc)] exception, in which + [msg] is built from the format [fmt] (provided the necessary arguments). + [err_desc] continues the error message and is printed in normal format + instead of red color*) + let fatal : Pos.popt -> ('a,'b) koutfmt -> 'a = fun pos fmt -> - let cont _ = raise (Fatal(Some(pos), Format.flush_str_formatter ())) in + let cont _ = raise (Fatal(Some(pos), Format.flush_str_formatter (), "")) in Format.kfprintf cont Format.str_formatter fmt (** [fatal_no_pos fmt] is similar to [fatal _ fmt], but it is used to raise an error that has no precise attached source code position. *) -let fatal_no_pos : ('a,'b) koutfmt -> 'a = fun fmt -> - let cont _ = raise (Fatal(None, Format.flush_str_formatter ())) in - Format.kfprintf cont Format.str_formatter fmt +let fatal_no_pos : ?err_desc:string -> ('a,'b) koutfmt -> 'a = + fun ?(err_desc="") fmt -> + let cont _ = + raise (Fatal(None, Format.flush_str_formatter (), err_desc)) in + Format.kfprintf cont Format.str_formatter fmt (** [handle_exceptions f] runs [f ()] in an exception handler and handles both expected and unexpected exceptions by displaying a graceful error message. @@ -64,15 +70,17 @@ let fatal_no_pos : ('a,'b) koutfmt -> 'a = fun fmt -> [1] (indicating failure). Hence, [handle_exceptions] should only be called by the main program logic, not by the internals. *) let handle_exceptions : (unit -> unit) -> unit = fun f -> - let exit_with : type a b. (a,b) koutfmt -> a = fun fmt -> + let exit_with : type a b. string -> (a,b) koutfmt -> a = fun err_desc fmt -> Color.update_with_color Format.err_formatter; - Format.kfprintf (fun _ -> exit 1) Format.err_formatter + Format.kfprintf (fun _ -> Color.update_with_color Format.err_formatter; + (Format.kfprintf (fun _ -> exit 1) + Format.err_formatter "%s" err_desc)) Format.err_formatter (Color.red (fmt ^^ "@.")) in try f () with - | Fatal(None, msg) -> exit_with "%s" msg - | Fatal(Some(p), msg) -> exit_with "[%a] %s" Pos.pp p msg + | Fatal(None, msg, cnt) -> exit_with cnt "%s" msg + | Fatal(Some(p), msg, cnt) -> exit_with cnt "[%a] %s" Pos.pp p msg | e -> - exit_with "Uncaught [%s].\n%s" + exit_with "" "Uncaught [%s].\n%s" (Printexc.to_string e) - (Printexc.get_backtrace()) + (Printexc.get_backtrace()) \ No newline at end of file diff --git a/src/common/pos.ml b/src/common/pos.ml index 7996894fe..b0cb32d47 100644 --- a/src/common/pos.ml +++ b/src/common/pos.ml @@ -168,28 +168,45 @@ let pp_lexing : (Lexing.position * Lexing.position) pp = -(** [print_file_contents escape sep delimiters pos] prints the contents of the - file at position [pos]. [sep] is the separator replacing each newline +(** [print_file_contents parse_file escape sep delimiters pos] prints the + contents of the file at position [pos]. The [parse_file] function + takes in input [pos.fname] (that in reality may be a filename or + a URI, e.g. when the text comes from LSP) and it returns both a stream + of lines provided by a function that raises End_of_file if the file + content is terminated, and a function to close the resources when + we are done with the stream. + [sep] is the separator replacing each newline (e.g. "
\n"). [delimiters] is a pair of delimiters used to wrap the "unknown location" message returned when the position does not refer to a - file. [escape] is used to escape the file contents.*) + file. [escape] is used to escape the file contents. + + The value -1 for end_col is to be interpreted as "at the end of line". *) let print_file_contents : - escape:(string -> string) -> delimiters:(string*string) -> popt pp = - fun ~escape ~delimiters:(db,de) ppf pos -> + parse_file:(string -> (unit -> string) * (unit -> unit)) -> + escape:(string -> string) -> + delimiters:(string*string) -> + complain_if_location_unknown:bool -> + popt pp = + fun ~parse_file ~escape ~delimiters:(db,de) ~complain_if_location_unknown + ppf pos -> match pos with | Some { fname=Some fname; start_line; start_col; end_line; end_col } -> (* WARNING: do not try to understand the following code! It's dangerous for your health! *) + let input_line,finish = parse_file fname in + let out = + Buffer.create + ((end_line - start_line) * 80 + + (if end_col = -1 then 80 else end_col) + 1) in + (* ignore the lines before the start_line *) - let ch = open_in fname in - let out = Buffer.create ((end_line - start_line) * 80 + end_col + 1) in for i = 0 to start_line - 2 do - ignore (input_line ch) + ignore (input_line ()) done ; (* skip the first start_col UTF8 codepoints of the start_line *) - let startl = input_line ch in + let startl = input_line () in assert (String.is_valid_utf_8 startl); let bytepos = ref 0 in for i = 0 to start_col - 1 do @@ -209,27 +226,39 @@ let print_file_contents : (* add the lines in between the start_line and the end_line *) for i = 0 to end_line - start_line - 2 do - Buffer.add_string out (escape (input_line ch)) ; + Buffer.add_string out (escape (input_line ())) ; Buffer.add_string out "\n" done ; (* identify what the end_line is and how many UTF8 codepoints to keep *) let endl,end_col = if start_line = end_line then - startstr, end_col - start_col - else input_line ch, end_col in + if end_col = -1 then + startstr, -1 + else + startstr, end_col - start_col + else input_line (), end_col in (* keep the first end_col UTF8 codepoints of the end_line *) assert (String.is_valid_utf_8 endl); let bytepos = ref 0 in - for i = 0 to end_col - 1 do - let uchar = String.get_utf_8_uchar endl !bytepos in - assert (Uchar.utf_decode_is_valid uchar) ; - bytepos := !bytepos + Uchar.utf_decode_length uchar - done ; + let i = ref 0 in + (try + while !i <= end_col -1 || end_col = -1 do + let uchar = String.get_utf_8_uchar endl !bytepos in + assert (Uchar.utf_decode_is_valid uchar) ; + bytepos := !bytepos + Uchar.utf_decode_length uchar ; + incr i + done + with + Invalid_argument _ -> () (* End of line reached *)) ; let str = String.sub endl 0 !bytepos in Buffer.add_string out (escape str) ; - close_in ch ; + finish () ; string ppf (Buffer.contents out) - | None | Some {fname=None} -> string ppf (db ^ "unknown location" ^ de) + | None | Some {fname=None} -> + if complain_if_location_unknown then + string ppf (db ^ "unknown location" ^ de) + else + string ppf "" diff --git a/src/core/sign.ml b/src/core/sign.ml index d60eb99a3..cdd64ccf3 100644 --- a/src/core/sign.ml +++ b/src/core/sign.ml @@ -227,6 +227,9 @@ let unlink : t -> unit = fun sign -> let f s cps = unlink_sym s; List.iter unlink_cp_pos cps in SymMap.iter f !(sign.sign_cp_pos) +let add_symbol_callback = Stdlib.ref (fun _ -> ()) +let add_rules_callback = Stdlib.ref (fun _ _ -> ()) + (** [add_symbol sign expo prop mstrat opaq name pos typ impl notation] adds in the signature [sign] a symbol with name [name], exposition [expo], property [prop], matching strategy [strat], opacity [opaq], type [typ], @@ -241,6 +244,7 @@ let add_symbol : t -> expo -> prop -> match_strat -> bool -> strloc -> (cleanup typ) (minimize_impl impl) in sign.sign_symbols := StrMap.add name.elt sym !(sign.sign_symbols); + if Stdlib.(!Common.Console.lsp_mod) then Stdlib.(!add_symbol_callback sym) ; sym (** [strip_private sign] removes private symbols from signature [sign]. *) @@ -336,23 +340,6 @@ let read = let open Stdlib in let r = ref Ghost.sign in fun n -> Debug.(record_time Reading (fun () -> r := read n)); !r -(** [add_rule sign s r] adds the new rule [r] to the symbol [s]. When the rule - does not correspond to a symbol of signature [sign], it is stored in its - dependencies. /!\ does not update the decision tree or the critical - pairs. *) -let add_rule : t -> sym_rule -> unit = fun sign (s,r) -> - s.sym_rules := !(s.sym_rules) @ [r]; - if s.sym_path <> sign.sign_path then - let d = try Path.Map.find s.sym_path !(sign.sign_deps) - with Not_found -> assert false in - let f = function - | None -> Some{rules=[r]; nota=None} - | Some sd -> Some{sd with rules=sd.rules@[r]} - in - let sm = StrMap.update s.sym_name f d.dep_symbols in - let d = {d with dep_symbols=sm} in - sign.sign_deps := Path.Map.add s.sym_path d !(sign.sign_deps) - (** [add_rules sign s rs] adds the new rules [rs] to the symbol [s]. When the rules do not correspond to a symbol of signature [sign], they are stored in its dependencies. /!\ does not update the decision tree or the critical @@ -360,6 +347,7 @@ let add_rule : t -> sym_rule -> unit = fun sign (s,r) -> let add_rules : t -> sym -> rule list -> unit = fun sign s rs -> s.sym_rules := !(s.sym_rules) @ rs; if s.sym_path <> sign.sign_path then + begin let d = try Path.Map.find s.sym_path !(sign.sign_deps) with Not_found -> assert false in let f = function @@ -369,6 +357,16 @@ let add_rules : t -> sym -> rule list -> unit = fun sign s rs -> let sm = StrMap.update s.sym_name f d.dep_symbols in let d = {d with dep_symbols=sm} in sign.sign_deps := Path.Map.add s.sym_path d !(sign.sign_deps) + end ; + if Stdlib.(!Common.Console.lsp_mod) then Stdlib.(!add_rules_callback s rs) + +(** [add_rule sign s r] adds the new rule [r] to the symbol [s]. When the rule + does not correspond to a symbol of signature [sign], it is stored in its + dependencies. /!\ does not update the decision tree or the critical + pairs. *) +let add_rule : t -> sym_rule -> unit = fun sign (s,r) -> + add_rules sign s [r] + (** [add_notation sign sym nota] changes the notation of [s] to [n] in the signature [sign]. *) diff --git a/src/handle/command.ml b/src/handle/command.ml index 617601e5c..4273170d4 100644 --- a/src/handle/command.ml +++ b/src/handle/command.ml @@ -625,9 +625,9 @@ let get_proof_data : compiler -> sig_state -> p_command -> cmd_output = ss with | Timeout as e -> raise e - | Fatal(Some(Some(_)),_) as e -> raise e - | Fatal(None ,m) -> fatal pos "%s" m - | Fatal(Some(None) ,m) -> fatal pos "%s" m + | Fatal(Some(Some(_)),_, _) as e -> raise e + | Fatal(None ,m, _) -> fatal pos "%s" m + | Fatal(Some(None) ,m, _) -> fatal pos "%s" m | e -> fatal pos "Uncaught exception: %s." (Printexc.to_string e) diff --git a/src/handle/compile.ml b/src/handle/compile.ml index b2f3ff365..a35bd3b1c 100644 --- a/src/handle/compile.ml +++ b/src/handle/compile.ml @@ -83,6 +83,7 @@ let rec compile_with : we need to explicitly update the decision tree of their symbols because it is not done in linking which normally follows loading. *) Ghost.iter (fun s -> Tree.update_dtree s []); + if Stdlib.(!Common.Console.lsp_mod) then Tool.Indexing.index_sign sign ; sign end diff --git a/src/handle/tactic.ml b/src/handle/tactic.ml index 0e44f3c6d..e08d7f43c 100644 --- a/src/handle/tactic.ml +++ b/src/handle/tactic.ml @@ -668,12 +668,12 @@ let rec handle : | P_tac_try tactic -> begin try handle ss sym_pos prv ps tactic - with Fatal(_, _s) -> ps + with Fatal(_, _s, _) -> ps end | P_tac_orelse(t1,t2) -> begin try handle ss sym_pos prv ps t1 - with Fatal(_, _s) -> handle ss sym_pos prv ps t2 + with Fatal(_, _s, _) -> handle ss sym_pos prv ps t2 end | P_tac_repeat t -> begin @@ -682,7 +682,7 @@ let rec handle : let ps = handle ss sym_pos prv ps t in if List.length ps.proof_goals < nb_goals then ps else handle ss sym_pos prv ps tac - with Fatal(_, _s) -> ps + with Fatal(_, _s, _) -> ps end | P_tac_and(t1,t2) -> let ps = handle ss sym_pos prv ps t1 in diff --git a/src/lplib/color.ml b/src/lplib/color.ml index 4d316926a..d4068d221 100644 --- a/src/lplib/color.ml +++ b/src/lplib/color.ml @@ -75,6 +75,8 @@ let blu fmt = colorize Blu fmt let mag fmt = colorize Mag fmt let cya fmt = colorize Cya fmt +let default (fmt:('a, 'b, 'c, 'd, 'e, 'f) format6) = fmt + (** [g_or_r cond fmt] colors the format [fmt] in green if [cond] is [true] and in red otherwise. *) let g_or_r cond = if cond then gre else red diff --git a/src/lsp/lp_doc.ml b/src/lsp/lp_doc.ml index 4db3c59ac..bbd7e2f88 100644 --- a/src/lsp/lp_doc.ml +++ b/src/lsp/lp_doc.ml @@ -144,7 +144,7 @@ let new_doc ~uri ~version ~text = assert(String.is_prefix "file://" uri); let path = String.sub uri 7 (String.length uri - 7) in Some(Pure.initial_state path), [] - with Error.Fatal(_pos, msg) -> + with Error.Fatal(_pos, msg, _) -> let loc : Pos.pos = { fname = Some(uri); diff --git a/src/lsp/lp_lsp.ml b/src/lsp/lp_lsp.ml index e6db1697d..2355d2dde 100644 --- a/src/lsp/lp_lsp.ml +++ b/src/lsp/lp_lsp.ml @@ -58,7 +58,7 @@ let do_check_text ofmt ~doc = let doc, diags = try Lp_doc.check_text ~doc - with Common.Error.Fatal(_pos, msg) -> + with Common.Error.Fatal(_pos, msg, _) -> let loc : Pos.pos = { fname = Some(doc.uri); diff --git a/src/parsing/dune b/src/parsing/dune index 1ac0824ba..4ed10159a 100644 --- a/src/parsing/dune +++ b/src/parsing/dune @@ -6,6 +6,10 @@ (libraries camlp-streams lambdapi.core menhirLib pratter sedlex sedlex.ppx lambdapi.common) (flags -w +3)) +; (menhir (flags --explain --external-tokens LpLexer) (modules lpParser)) + +; (menhir (flags --explain --external-tokens RocqLexer) (modules rocqParser)) + (ocamllex dkLexer) (menhir (flags --explain --external-tokens DkTokens) (modules dkParser)) diff --git a/src/parsing/lpParser.ml b/src/parsing/lpParser.ml index 46b89b6b3..5c3088d2f 100644 --- a/src/parsing/lpParser.ml +++ b/src/parsing/lpParser.ml @@ -119,14 +119,14 @@ let current_token() : token = let (t,_,_) = !the_current_token in t let current_pos() : position * position = let (_,p1,p2) = !the_current_token in (p1,p2) - +(* let new_parsing (entry:lexbuf -> 'a) (lb:lexbuf): 'a = let t = !the_current_token in let reset() = the_current_token := t in the_current_token := LpLexer.token lb; try let r = entry lb in begin reset(); r end with e -> begin reset(); raise e end - + *) let expected (msg:string) (tokens:token list): 'a = if msg <> "" then syntax_error (current_pos()) ("Expected: "^msg^".") else @@ -1570,30 +1570,45 @@ and asearch (lb:lexbuf): search = let t = term lb in QBase(QSearch(t,g,Some(QXhs(r,None)))) | UID "spine" -> + begin + consume_token lb; let r = relation lb in let g = generalize lb in let t = term lb in QBase(QSearch(t,g,Some(QType(Some(Spine r))))) + end | UID "concl" -> + begin + consume_token lb; let r = relation lb in let g = generalize lb in let t = term lb in QBase(QSearch(t,g,Some(QType(Some(Conclusion r))))) + end | UID "hyp" -> + begin + consume_token lb; let r = relation lb in let g = generalize lb in let t = term lb in QBase(QSearch(t,g,Some(QType(Some(Hypothesis r))))) + end | UID "lhs" -> + begin + consume_token lb; let r = relation lb in let g = generalize lb in let t = term lb in QBase(QSearch(t,g,Some(QXhs(r,Some Lhs)))) + end | UID "rhs" -> + begin + consume_token lb; let r = relation lb in let g = generalize lb in let t = term lb in QBase(QSearch(t,g,Some(QXhs(r,Some Rhs)))) + end | L_PAREN -> consume_token lb; let q = search lb in @@ -1617,7 +1632,7 @@ and ssearch (lb:lexbuf): search = let cq = csearch lb in match current_token() with | SEMICOLON -> - let cqs = list (prefix SEMICOLON csearch) lb in + let cqs = list (csearch) lb in List.fold_left (fun x cq -> QOp(x,Union,cq)) cq cqs | _ -> cq diff --git a/src/parsing/parser.ml b/src/parsing/parser.ml index 949117101..d8f1f232f 100644 --- a/src/parsing/parser.ml +++ b/src/parsing/parser.ml @@ -90,32 +90,38 @@ end open LpLexer open Sedlexing -(** Parsing lp syntax. *) -module Lp : -sig - include PARSER with type lexbuf := Sedlexing.lexbuf - - val parse_term_string: Lexing.position -> string -> Syntax.p_term - (** [parse_rwpatt_string p s] parses a term from string [s] assuming that - [s] starts at position [p]. *) - - val parse_rwpatt_string: Lexing.position -> string -> Syntax.p_rwpatt - (** [parse_rwpatt_string f s] parses a rewrite pattern specification from - string [s] assuming that [s] starts at position [p]. *) - - val parse_search_string: Lexing.position -> string -> Syntax.search - (** [parse_search_string f s] parses a query from string [s] assuming - that [s] starts at position [p]. *) - - end -= struct +module Aux(Lexer: + sig + type token + val the_current_token : (token * position * position) ref + val get_token : Sedlexing.lexbuf -> unit + -> token * Lexing.position * Lexing.position + (* val parsing : + (Sedlexing.lexbuf -> 'a) -> Sedlexing.lexbuf -> 'a *) + end)= +struct + +(* let current_token() : Lexer.token = let (t,_,_) = + !Lexer.the_current_token in t *) + +let current_pos() : position * position = + let (_,p1,p2) = !Lexer.the_current_token in (p1,p2) + +let new_parsing (entry:lexbuf -> 'a) (lb:lexbuf): 'a = + let t = !Lexer.the_current_token in + let reset() = Lexer.the_current_token := t in + Lexer.the_current_token := Lexer.get_token lb (); + try let r = entry lb in begin reset(); r end + with e -> begin reset(); raise e end let handle_error (icopt: in_channel option) (entry: lexbuf -> 'a) (lb: lexbuf): 'a option = try Some(entry lb) with | End_of_file -> Option.iter close_in icopt; None + | RocqLexer.SyntaxError{pos=None; _} | SyntaxError{pos=None; _} -> assert false + | RocqLexer.SyntaxError{pos=Some pos; elt} | SyntaxError{pos=Some pos; elt} -> parser_fatal pos "Syntax error. %s" elt @@ -143,8 +149,35 @@ sig let lb = Utf8.from_string s in set_position lb lexpos; set_filename lb lexpos.pos_fname; - Stream.next (parse_lexbuf None (LpParser.new_parsing entry) lb) + Stream.next (parse_lexbuf None (new_parsing entry) lb) +end + +(** Parsing lp syntax. *) +module Lp : +sig + include PARSER with type lexbuf := Sedlexing.lexbuf + val parse_term_string: Lexing.position -> string -> Syntax.p_term + (** [parse_rwpatt_string p s] parses a term from string [s] assuming that + [s] starts at position [p]. *) + + val parse_rwpatt_string: Lexing.position -> string -> Syntax.p_rwpatt + (** [parse_rwpatt_string f s] parses a rewrite pattern specification from + string [s] assuming that [s] starts at position [p]. *) + + val parse_search_string: Lexing.position -> string -> Syntax.search + (** [parse_search_string f s] parses a query from string [s] assuming + that [s] starts at position [p]. *) + + end += struct + + include Aux(struct + type token = LpLexer.token + let the_current_token = LpParser.the_current_token + let get_token x _ = LpLexer.token x + (* parsing = LpParser.new_parsing *) + end) (* exported functions *) let parse_term_string = parse_entry_string LpParser.term let parse_rwpatt_string = parse_entry_string LpParser.rwpatt @@ -157,6 +190,39 @@ sig end +module Rocq : +sig + val parse_search_string : + Lexing.position -> string -> Syntax.search + (* TODO update the next comment *) + (** [parse_search_query_string f s] returns a stream of parsed terms from + string [s] which comes from file [f] ([f] can be anything). *) +end += struct + + include Aux(struct + type token = RocqLexer.token + let the_current_token = RocqParser.the_current_token + let get_token x _ = RocqLexer.token x + end) + (* exported functions *) +(* + let parse = + MenhirLib.Convert.Simplified.traditional2revised + RocqParser.search_query_alone + let token lb = RocqLexer.token lb + let parse_lexbuf lb = parse (token lb) + let parse_search_string pos s = parse_entry_string parse_lexbuf pos s *) + + let parse_term_string = parse_entry_string RocqParser.term + let parse_rwpatt_string = parse_entry_string RocqParser.rwpatt + let parse_search_string = parse_entry_string RocqParser.search + (* let parse_search_string = + parse_entry_string LpParser.search *) + (* parse_string ~grammar_entry:RocqParser.search_query_alone *) +end + + include Lp open Error @@ -170,8 +236,11 @@ let path_of_string : string -> Path.t = fun s -> | QID p, _, _ -> List.rev p | _ -> fatal_no_pos "Syntax error: \"%s\" is not a path." s end - with SyntaxError _ -> - fatal_no_pos "Syntax error: \"%s\" is not a path." s + with + SyntaxError _ + | RocqLexer.SyntaxError _ -> + fatal_no_pos "Syntax error: \"%s\" is not a path." s + (** [qident_of_string s] converts the string [s] into a qident. *) let qident_of_string : string -> Core.Term.qident = fun s -> @@ -183,8 +252,10 @@ let qident_of_string : string -> Core.Term.qident = fun s -> | _ -> fatal_no_pos "Syntax error: \"%s\" is not a qualified identifier." s end - with SyntaxError _ -> - fatal_no_pos "Syntax error: \"%s\" is not a qualified identifier." s + with + | RocqLexer.SyntaxError _ + | SyntaxError _ -> + fatal_no_pos "Syntax error: \"%s\" is not a qualified identifier." s (** [parse_file fname] selects and runs the correct parser on file [fname], by looking at its extension. *) diff --git a/src/parsing/pretty.ml b/src/parsing/pretty.ml index 6f1173d4c..3883c7a83 100644 --- a/src/parsing/pretty.ml +++ b/src/parsing/pretty.ml @@ -266,7 +266,10 @@ let search_base ppf = function let op ppf o = string ppf (match o with Union -> "; " | Intersect -> ", ") -let filter ppf (Path s) = out ppf " | %s" s +let filter ppf f = + match f with + | Path s -> out ppf " | %s" s + | RegExp _ -> out ppf "(;%s;)@." "regular expression" (*FIXME?*) let rec search ppf = function | QBase b -> search_base ppf b diff --git a/src/parsing/rocqLexer.ml b/src/parsing/rocqLexer.ml new file mode 100644 index 000000000..533a90b24 --- /dev/null +++ b/src/parsing/rocqLexer.ml @@ -0,0 +1,233 @@ +(** Lexer for Rocq syntax, using Sedlex, a Utf8 lexer generator. *) + +open Lplib +open Sedlexing +open Common open Pos + +let remove_first : lexbuf -> string = fun lb -> + Utf8.sub_lexeme lb 1 (lexeme_length lb - 1) + +let remove_last : lexbuf -> string = fun lb -> + Utf8.sub_lexeme lb 0 (lexeme_length lb - 1) + +let remove_ends : lexbuf -> string = fun lb -> + Utf8.sub_lexeme lb 1 (lexeme_length lb - 2) + +exception SyntaxError of strloc + +let syntax_error : Lexing.position * Lexing.position -> string -> 'a = + fun pos msg -> raise (SyntaxError (Pos.make_pos pos msg)) + +let fail : lexbuf -> string -> 'a = fun lb msg -> + syntax_error (lexing_positions lb) msg + +let invalid_character : lexbuf -> 'a = fun lb -> + fail lb "Invalid character" + +(** Tokens. *) +type token = + (* end of file *) + | EOF + + (* keywords in alphabetical order *) + | GENERALIZE + | IN + | LET + | RULE + | TYPE_QUERY + | TYPE_TERM + + (* other tokens *) + | INT of string + (* | QINT of Path.t * string *) + | STRINGLIT of string + + (* symbols *) + | ARROW + | ASSIGN + | BACKQUOTE + | COMMA + | COLON + | DOT + | EXISTS + | FORALL + | FUN + | LAMBDA + | L_PAREN + | L_SQ_BRACKET + | PI + | R_PAREN + | R_SQ_BRACKET + | SEMICOLON + | THICKARROW + | UNDERSCORE + | VBAR + + (* identifiers *) + | UID of string + | UID_EXPL of string + | UID_META of int + | UID_PATT of string + | QID of Path.t (* in reverse order *) + | QID_EXPL of Path.t (* in reverse order *) + +(** Some regexp definitions. *) +let space = [%sedlex.regexp? Chars " \t\n\r"] +let digit = [%sedlex.regexp? '0' .. '9'] +let nat = [%sedlex.regexp? Plus digit] +let int = [%sedlex.regexp? nat | '-', nat] +let float = [%sedlex.regexp? int, '.', Plus digit] +let oneline_comment = [%sedlex.regexp? "//", Star (Compl ('\n' | '\r'))] +let string = [%sedlex.regexp? '"', Star (Compl '"'), '"'] + +(** Identifiers. + + There are two kinds of identifiers: regular identifiers and escaped + identifiers of the form ["{|...|}"]. + + Modulo those surrounding brackets, escaped identifiers allow to use as + identifiers keywords or filenames that are not regular identifiers. + + An escaped identifier denoting a filename or directory is unescaped before + accessing to it. Hence, the module ["{|a b|}"] refers to the file ["a b"]. + + Identifiers need to be normalized so that an escaped identifier, once + unescaped, is not regular. To this end, every identifier of the form + ["{|s|}"] with s regular, is understood as ["s"] (function + [remove_useless_escape] below). + + Finally, identifiers must not be empty, so that we can use the empty string + for the path of ghost signatures. *) + +(** Unqualified regular identifiers are any non-empty sequence of characters + not among: *) +let forbidden_letter = [%sedlex.regexp? Chars " ,;\r\t\n(){}[]:.`\"@$|/"] +let regid = [%sedlex.regexp? '/' | Plus (Compl forbidden_letter)] + +let is_regid : string -> bool = fun s -> + let lb = Utf8.from_string s in + match%sedlex lb with + | regid, eof -> true + | _ -> false + +(** Unqualified escaped identifiers are any non-empty sequence of characters + (except "|}") between "{|" and "|}". *) +let nobars = [%sedlex.regexp? Star (Compl '|')] +let escid = [%sedlex.regexp? + "{|", nobars, '|', Star ('|' | Compl (Chars "|}"), nobars, '|'), '}'] + +(** [escape s] converts a string [s] into an escaped identifier if it is not + regular. We do not check whether [s] contains ["|}"]. FIXME? *) +let escape s = if is_regid s then s else Escape.escape s + +(** [remove_useless_escape s] replaces escaped regular identifiers by their + unescape form. *) +let remove_useless_escape : string -> string = fun s -> + let s' = Escape.unescape s in if is_regid s' then s' else s + +(** Lexer. *) +let rec token lb = + match%sedlex lb with + + (* end of file *) + | eof -> EOF + + (* spaces *) + | space -> token lb + + (* comments *) + | oneline_comment -> token lb + | "/*" -> comment token 0 lb + + (* keywords *) + | "exists" -> EXISTS (* in Coq *) + | "forall" -> FORALL (* in Coq *) + | "fun" -> FUN (* in Coq *) + | "generalize" -> GENERALIZE + | "in" -> IN + | "let" -> LET + | "rule" -> RULE + | "type" -> TYPE_QUERY + | "TYPE" -> TYPE_TERM + + (* other tokens *) + | int -> INT(Utf8.lexeme lb) + | string -> STRINGLIT(Utf8.sub_lexeme lb 1 (lexeme_length lb - 2)) + + (* symbols *) + | 0x2254 (* ≔ *) -> ASSIGN + | 0x2192 (* → *) -> ARROW (* not in Coq! *) + | "->" -> ARROW (* in Coq *) + | "=>" -> THICKARROW (* in Coq *) + | '`' -> BACKQUOTE + | ',' -> COMMA + | ':' -> COLON + | '.' -> DOT + | 0x03bb (* λ *) -> LAMBDA (* not in Coq! *) + | '(' -> L_PAREN + | '[' -> L_SQ_BRACKET + | 0x03a0 (* Π *) -> PI + | ')' -> R_PAREN + | ']' -> R_SQ_BRACKET + | ';' -> SEMICOLON + | '|' -> VBAR + | '_' -> UNDERSCORE + + (* rocq identifiers *) + | "\\/" -> UID("∨") + | "/\\" -> UID("∧") + | "~" -> UID("¬") + + (* identifiers *) + | regid -> UID(Utf8.lexeme lb) + | escid -> UID(remove_useless_escape(Utf8.lexeme lb)) + | '@', regid -> UID_EXPL(remove_first lb) + | '@', escid -> UID_EXPL(remove_useless_escape(remove_first lb)) + | '?', nat -> UID_META(int_of_string(remove_first lb)) + | '$', regid -> UID_PATT(remove_first lb) + | '$', escid -> UID_PATT(remove_useless_escape(remove_first lb)) + | '$', nat -> UID_PATT(remove_first lb) + + | regid, '.' -> qid false [remove_last lb] lb + | escid, '.' -> qid false [remove_useless_escape(remove_last lb)] lb + | '@', regid, '.' -> qid true [remove_ends lb] lb + | '@', escid, '.' -> qid true [remove_useless_escape(remove_ends lb)] lb + + (* invalid character *) + | _ -> invalid_character lb + +and qid expl ids lb = + match%sedlex lb with + | oneline_comment -> qid expl ids lb + | "/*" -> comment (qid expl ids) 0 lb + (* | int -> QINT(List.rev ids, Utf8.lexeme lb) *) + | regid, '.' -> qid expl (remove_last lb :: ids) lb + | escid, '.' -> qid expl (remove_useless_escape(remove_last lb) :: ids) lb + | regid -> + if expl then QID_EXPL(Utf8.lexeme lb :: ids) + else QID(Utf8.lexeme lb :: ids) + | escid -> + if expl then QID_EXPL(remove_useless_escape (Utf8.lexeme lb) :: ids) + else QID(remove_useless_escape (Utf8.lexeme lb) :: ids) + | _ -> fail lb ("Invalid identifier: \"" ^ Utf8.lexeme lb ^ "\".") + +and comment next i lb = + match%sedlex lb with + | eof -> fail lb "Unterminated comment." + | "*/" -> if i=0 then next lb else comment next (i-1) lb + | "/*" -> comment next (i+1) lb + | any -> comment next i lb + | _ -> invalid_character lb + +(** [token lb] is a lexing function on [lb] that can be passed to a parser. *) +let token : lexbuf -> token * Lexing.position * Lexing.position = + fun lb -> try Sedlexing.with_tokenizer token lb () with + | MalFormed -> fail lb "Not Utf8 encoded file" + | InvalidCodepoint k -> + fail lb ("Invalid Utf8 code point " ^ string_of_int k) + +let dummy_token = (EOF, Lexing.dummy_pos, Lexing.dummy_pos) + +let token = + let r = ref dummy_token in fun lb -> + Debug.(record_time Lexing (fun () -> r := token lb)); !r diff --git a/src/parsing/rocqParser.ml b/src/parsing/rocqParser.ml new file mode 100644 index 000000000..7d544e41a --- /dev/null +++ b/src/parsing/rocqParser.ml @@ -0,0 +1,854 @@ +open Lplib +open Common open Pos open Logger +open Syntax +open Core +open RocqLexer +open Lexing +open Sedlexing + +let log = LpParser.log + +(* token management *) + +let string_of_token = function + | EOF -> "end of file" + | ARROW -> "→" + | ASSIGN -> "≔" + | BACKQUOTE -> "`" + | COLON -> ":" + | COMMA -> "," + | DOT -> "." + | GENERALIZE -> "generalize" + | EXISTS -> "exists" + | FORALL -> "forall" + | FUN -> "fun" + | THICKARROW -> "=>" + | IN -> "in" + | INT _ -> "integer" + | LAMBDA -> "λ" + | LET -> "let" + | L_PAREN -> "(" + | L_SQ_BRACKET -> "[" + | PI -> "Π" + | QID _ -> "qualified identifier" + | QID_EXPL _ -> "@-prefixed qualified identifier" + | RULE -> "rule" + | R_PAREN -> ")" + | R_SQ_BRACKET -> "]" + | SEMICOLON -> ";" + | STRINGLIT _ -> "string literal" + | TYPE_QUERY -> "type" + | TYPE_TERM -> "TYPE" + | UID _ -> "non-qualified identifier" + | UID_EXPL _ -> "@-prefixed non-qualified identifier" + | UID_META _ -> "?-prefixed metavariable number" + | UID_PATT _ -> "$-prefixed non-qualified identifier" + | UNDERSCORE -> "_" + | VBAR -> "|" + +let pp_token ppf t = Base.string ppf (string_of_token t) + +let the_current_token : (token * position * position) Stdlib.ref = + Stdlib.ref dummy_token + +let current_token() : token = let (t,_,_) = !the_current_token in t + +let current_pos() : position * position = + let (_,p1,p2) = !the_current_token in (p1,p2) +(* +let new_parsing (entry:lexbuf -> 'a) (lb:lexbuf): 'a = + let t = !the_current_token in + let reset() = the_current_token := t in + the_current_token := LpLexer.token lb; + try let r = entry lb in begin reset(); r end + with e -> begin reset(); raise e end + *) +let expected (msg:string) (tokens:token list): 'a = + if msg <> "" then syntax_error (current_pos()) ("Expected: "^msg^".") + else + match tokens with + | [] -> assert false + | t::ts -> + let soft = string_of_token in + syntax_error (current_pos()) + (List.fold_left (fun s t -> s^", "^soft t) ("Expected: "^soft t) ts + ^".") + +let consume_token (lb:lexbuf) : unit = + the_current_token := RocqLexer.token lb; + if log_enabled() then + let (t,p1,p2) = !the_current_token in + let p = locate (p1,p2) in + log "read new token %a %a" Pos.short (Some p) pp_token t + +(* building positions and terms *) + +let extend_pos (*s:string*) (lps:position * position): 'a -> 'a loc = + let p1 = fst lps and p2 = fst (current_pos()) in + let p2 = + if p2.pos_cnum > p2.pos_bol then + {p2 with pos_cnum = p2.pos_cnum - 1} + else p2 + in + (*if log_enabled() then + log "extend_pos %s %a -> %a" s Pos.pp_lexing lps Pos.pp_lexing lps2;*) + make_pos (p1,p2) + +let qid_of_path (lps: position * position): + string list -> (string list * string) loc = function + | [] -> assert false + | id::mp -> make_pos lps (List.rev mp, id) + +let make_abst (pos1:position) (ps:p_params list) (t:p_term) (pos2:position) + :p_term = + if ps = [] then t + else extend_pos (*__FUNCTION__*) (pos1,pos2) (P_Abst(ps,t)) + +let make_prod (pos1:position) (ps:p_params list) (t:p_term) (pos2:position) + :p_term = + if ps = [] then t + else extend_pos (*__FUNCTION__*) (pos1,pos2) (P_Prod(ps,t)) + +let ident_of_term pos1 {elt; _} = + match elt with + | P_Iden({elt=([], x); pos}, _) -> Pos.make pos x + | _ -> LpLexer.syntax_error pos1 "not an unqualified identifier." + +(* generic parsing functions *) + +let list (elt:lexbuf -> 'a) (lb:lexbuf): 'a list = + if log_enabled() then log "%s" __FUNCTION__; + let acc = ref [] in + (try while true do acc := elt lb :: !acc done with SyntaxError _ -> ()); + List.rev !acc + +let nelist (elt:lexbuf -> 'a) (lb:lexbuf): 'a list = + if log_enabled() then log "%s" __FUNCTION__; + let x = elt lb in + x :: list elt lb + +let consume (token:token) (lb:lexbuf): unit = + if current_token() = token then consume_token lb + else expected "" [token] + +let prefix (token:token) (elt:lexbuf -> 'a) (lb:lexbuf): 'a = + consume token lb; elt lb + +let alone (entry:lexbuf -> 'a) (lb:lexbuf): 'a = + let x = entry lb in if current_token() != EOF then expected "" [EOF] else x + +(* parsing functions *) + +let consume_STRINGLIT (lb:lexbuf): string = + match current_token() with + | STRINGLIT s -> + consume_token lb; + s + | _ -> + expected "" [STRINGLIT""] + + +let consume_INT (lb:lexbuf): string = + match current_token() with + | INT s -> + consume_token lb; + s + | _ -> + expected "" [INT""] + + +let qid (lb:lexbuf): (string list * string) loc = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | UID s -> + let pos1 = current_pos() in + consume_token lb; + make_pos pos1 ([], s) + | QID p -> + let pos1 = current_pos() in + consume_token lb; + qid_of_path pos1 p + | _ -> + expected "" [UID"";QID[]] + +let qid_expl (lb:lexbuf): (string list * string) loc = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | UID_EXPL s -> + let pos1 = current_pos() in + consume_token lb; + make_pos pos1 ([], s) + | QID_EXPL p -> + let pos1 = current_pos() in + consume_token lb; + qid_of_path pos1 p + | _ -> + expected "" [UID_EXPL"";QID_EXPL[]] + +let uid (lb:lexbuf): string loc = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | UID s -> + let pos1 = current_pos() in + consume_token lb; + make_pos pos1 s + | _ -> + expected "" [UID""] + +let param (lb:lexbuf): string loc option = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | UID s -> + let pos1 = current_pos() in + consume_token lb; + Some (make_pos pos1 s) + | UNDERSCORE -> + consume_token lb; + None + | _ -> + expected "non-qualified identifier or \"_\"" [UID"";UNDERSCORE] + +let int (lb:lexbuf): string = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | INT s -> + consume_token lb; + s + | _ -> + expected "integer" [INT""] + +let path (lb:lexbuf): string list loc = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + (*| UID s -> + let pos1 = current_pos() in + LpLexer.syntax_error pos1 "Unqualified identifier"*) + | QID p -> + let pos1 = current_pos() in + consume_token lb; + make_pos pos1 (List.rev p) + | _ -> + expected "" [QID[]] + +let rec term_id (lb:lexbuf): p_term = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | UID _ + | QID _ -> + let i = qid lb in + {i with elt=P_Iden(i, false)} + | UID_EXPL _ + | QID_EXPL _ -> + let i = qid_expl lb in + {i with elt=P_Iden(i, true)} + | _ -> + expected "" [UID"";QID[];UID_EXPL"";QID_EXPL[]] + +(* commands *) + +and rwpatt_content (lb:lexbuf): p_rwpatt = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + (* bterm *) + | BACKQUOTE + | PI + | LAMBDA + | LET + (* aterm *) + | UID _ + | QID _ + | UID_EXPL _ + | QID_EXPL _ + | UNDERSCORE + | TYPE_TERM + | UID_META _ + | UID_PATT _ + | L_PAREN + | L_SQ_BRACKET + | INT _ + (* | QINT _ *) + | STRINGLIT _ -> + let pos1 = current_pos() in + let t1 = term lb in + begin + match current_token() with + | IN -> + consume_token lb; + let t2 = term lb in + begin + match current_token() with + | IN -> + consume_token lb; + let t3 = term lb in + let x = ident_of_term pos1 t2 in + extend_pos (*__FUNCTION__*) pos1 + (Rw_TermInIdInTerm(t1,(x,t3))) + | _ -> + let x = ident_of_term pos1 t1 in + extend_pos (*__FUNCTION__*) pos1 (Rw_IdInTerm(x,t2)) + end + | _ -> + extend_pos (*__FUNCTION__*) pos1 (Rw_Term(t1)) + end + | IN -> + let pos1 = current_pos() in + consume_token lb; + let t1 = term lb in + begin + match current_token() with + | IN -> + consume_token lb; + let t2 = term lb in + let x = ident_of_term pos1 t1 in + extend_pos (*__FUNCTION__*) pos1 (Rw_InIdInTerm(x,t2)) + | _ -> + extend_pos (*__FUNCTION__*) pos1 (Rw_InTerm(t1)) + end + | _ -> + expected "term or keyword \"in\"" [] + +and rwpatt_bracket (lb:lexbuf): p_rwpatt = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | L_SQ_BRACKET -> + consume_token lb; + let p = rwpatt_content lb in + consume R_SQ_BRACKET lb; (*add info on opening bracket*) + p + | _ -> + expected "" [L_SQ_BRACKET] + +and rwpatt (lb:lexbuf): p_rwpatt = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | DOT -> + consume_token lb; + rwpatt_bracket lb + | _ -> + expected "" [DOT] + +(* terms *) + +and params (lb:lexbuf): p_params = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | L_PAREN -> + consume_token lb; + let ps = nelist param lb in + begin + match current_token() with + | COLON -> + consume_token lb; + let typ = term lb in + consume R_PAREN lb; + ps, Some typ, false + | R_PAREN -> + consume_token lb; + ps, None, false + | _ -> + expected "" [COLON;R_PAREN] + end + | L_SQ_BRACKET -> + consume_token lb; + let ps = nelist param lb in + begin + match current_token() with + | COLON -> + consume_token lb; + let typ = term lb in + consume R_SQ_BRACKET lb; + ps, Some typ, true + | R_SQ_BRACKET -> + consume_token lb; + ps, None, true + | _ -> + expected "" [COLON;R_SQ_BRACKET] + end + | _ -> + let x = param lb in + [x], None, false + +and fun_param_list (lb:lexbuf) = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | L_PAREN -> + consume_token lb; + let ps = nelist param lb in + begin + match current_token() with + | COLON -> + consume_token lb; + let typ = term lb in + consume R_PAREN lb; + ps, Some typ, false + | R_PAREN -> + consume_token lb; + ps, None, false + | _ -> + expected "" [COLON;R_PAREN] + end + | _ -> + let x = param lb in + [x], None, false + +and term (lb:lexbuf): p_term = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + (* bterm *) + | BACKQUOTE + | EXISTS + | FORALL + | FUN + | PI + | LAMBDA + | LET -> + bterm lb + (* aterm *) + | UID _ + | QID _ + | UID_EXPL _ + | QID_EXPL _ + | UNDERSCORE + | TYPE_TERM + | UID_META _ + | UID_PATT _ + | L_PAREN + | L_SQ_BRACKET + | INT _ + | STRINGLIT _ -> + let pos1 = current_pos() in + let h = aterm lb in + app pos1 h lb + | _ -> + expected "term" [] + +and app (pos1:position * position) (t: p_term) (lb:lexbuf): p_term = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + (* aterm *) + | UID _ + | QID _ + | UID_EXPL _ + | QID_EXPL _ + | UNDERSCORE + | TYPE_TERM + | UID_META _ + | UID_PATT _ + | L_PAREN + | L_SQ_BRACKET + | INT _ + | STRINGLIT _ -> + let u = aterm lb in + app pos1 (extend_pos (*__FUNCTION__*) pos1 (P_Appl(t,u))) lb + (* bterm *) + | BACKQUOTE + | PI + | LAMBDA + | LET -> + let u = bterm lb in + extend_pos (*__FUNCTION__*) pos1 (P_Appl(t,u)) + (* other cases *) + | ARROW -> + consume_token lb; + let u = term lb in + extend_pos (*__FUNCTION__*) pos1 (P_Arro(t,u)) + | _ -> + t + +and bterm (lb:lexbuf): p_term = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | BACKQUOTE -> + let pos1 = current_pos() in + consume_token lb; + let q = term_id lb in + let b = binder lb in + let b = extend_pos (*__FUNCTION__*) pos1 (P_Abst(fst b, snd b)) in + extend_pos (*__FUNCTION__*) pos1 (P_Appl(q, b)) + | EXISTS -> + let pos1 = current_pos() in + consume_token lb; + let b = rocqbinder lb COMMA in + let f = fun bin res -> + extend_pos pos1 (P_Appl( + extend_pos pos1 (P_Iden(extend_pos pos1 ([],"∃"), false)), + extend_pos pos1 (P_Abst([bin], res)))) in + (List.fold_right f (fst b) (snd b)) + | FORALL -> + (* { make_pos $sloc (P_Prod(fst b, snd b)) } *) + let pos1 = current_pos() in + consume_token lb; + let b = rocqbinder lb COMMA in + extend_pos (*__FUNCTION__*) pos1 (P_Prod(fst b, snd b)) + | PI -> + (* { make_pos $sloc (P_Prod(fst b, snd b)) } *) + let pos1 = current_pos() in + consume_token lb; + let b = binder lb in + extend_pos (*__FUNCTION__*) pos1 (P_Prod(fst b, snd b)) + | LAMBDA -> + let pos1 = current_pos() in + consume_token lb; + let b = binder lb in + extend_pos (*__FUNCTION__*) pos1 (P_Abst(fst b, snd b)) + | FUN -> + let pos1 = current_pos() in + consume_token lb; + let b = rocqbinder lb THICKARROW in + extend_pos (*__FUNCTION__*) pos1 (P_Abst(fst b, snd b)) + | LET -> + let pos1 = current_pos() in + consume_token lb; + let x = uid lb in + let a = list params lb in + begin + match current_token() with + | COLON -> + consume_token lb; + let b = Some (term lb) in + consume ASSIGN lb; + let t = term lb in + consume IN lb; + let u = term lb in + extend_pos (*__FUNCTION__*) pos1 (P_LLet(x, a, b, t, u)) + | _ -> + let b = None in + consume ASSIGN lb; + let t = term lb in + consume IN lb; + let u = term lb in + extend_pos (*__FUNCTION__*) pos1 (P_LLet(x, a, b, t, u)) + end + | _ -> + expected "" [BACKQUOTE;PI;LAMBDA;LET] + +and aterm (lb:lexbuf): p_term = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | UID _ + | QID _ + | UID_EXPL _ + | QID_EXPL _ -> + term_id lb + | UNDERSCORE -> + let pos1 = current_pos() in + consume_token lb; + make_pos pos1 P_Wild + | TYPE_TERM -> + let pos1 = current_pos() in + consume_token lb; + make_pos pos1 P_Type + | UID_META s -> + let pos1 = current_pos() in + consume_token lb; + let i = make_pos pos1 s in + begin + match current_token() with + | DOT -> + consume_token lb; + extend_pos (*__FUNCTION__*) pos1 + (P_Meta(i,Array.of_list (env lb))) + | _ -> + {i with elt=P_Meta(i,[||])} + end + | UID_PATT s -> + let pos1 = current_pos() in + consume_token lb; + let i = + if s = "_" then None else Some(make_pos pos1 s) in + begin + match current_token() with + | DOT -> + consume_token lb; + extend_pos (*__FUNCTION__*) pos1 + (P_Patt(i, Some(Array.of_list (env lb)))) + | _ -> + make_pos pos1 (P_Patt(i, None)) + end + | L_PAREN -> + let pos1 = current_pos() in + consume_token lb; + let t = term lb in + consume R_PAREN lb; + extend_pos (*__FUNCTION__*) pos1 (P_Wrap(t)) + | L_SQ_BRACKET -> + let pos1 = current_pos() in + consume_token lb; + let t = term lb in + consume R_SQ_BRACKET lb; + extend_pos (*__FUNCTION__*) pos1 (P_Expl(t)) + | INT n -> + let pos1 = current_pos() in + consume_token lb; + make_pos pos1 (P_NLit([],n)) + | STRINGLIT s -> + let pos1 = current_pos() in + consume_token lb; + make_pos pos1 (P_SLit s) + | _ -> + expected "identifier, \"_\", or term between parentheses or square \ + brackets" [] + +and env (lb:lexbuf): p_term list = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | L_SQ_BRACKET -> + consume_token lb; + begin + match current_token() with + | R_SQ_BRACKET -> + consume_token lb; + [] + | _ -> + let t = term lb in + let ts = list (prefix SEMICOLON term) lb in + consume R_SQ_BRACKET lb; + t::ts + end + | _ -> + expected "" [L_SQ_BRACKET] + +and binder (lb:lexbuf): p_params list * p_term = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | UID _ + | UNDERSCORE -> + let s = param lb in + begin + match current_token() with + | UID _ + | UNDERSCORE + | L_PAREN + | L_SQ_BRACKET -> + let ps = list params lb in + consume COMMA lb; + let p = [s], None, false in + p::ps, term lb + | COMMA -> + consume_token lb; + let p = [s], None, false in + [p], term lb + | COLON -> + consume_token lb; + let a = term lb in + consume COMMA lb; + let p = [s], Some a, false in + [p], term lb + | _ -> + expected "parameter list" + [UID"";UNDERSCORE;L_PAREN;L_SQ_BRACKET;COMMA] + end + | L_PAREN -> + let ps = nelist params lb in + begin + match current_token() with + | COMMA -> + consume_token lb; + ps, term lb + | _ -> + expected "" [COMMA] + end + | L_SQ_BRACKET -> + let ps = nelist params lb in + begin + match current_token() with + | COMMA -> + consume_token lb; + ps, term lb + | _ -> + expected "" [COMMA] + end + | _ -> + expected "" [UID"";UNDERSCORE;L_PAREN;L_SQ_BRACKET] + +and rocqbinder (lb:lexbuf) terminator : p_params list * p_term = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | UID _ + | UNDERSCORE -> + let s = param lb in + begin + match current_token() with + | UID _ + | UNDERSCORE + | L_PAREN -> + let ps = list params lb in + consume terminator lb; + let p = [s], None, false in + p::ps, term lb + | COLON -> + consume_token lb; + let a = term lb in + consume terminator lb; + let p = [s], Some a, false in + [p], term lb + (* | terminator *) + | _ -> + if current_token() = terminator then + begin + consume_token lb; + let p = [s], None, false in + [p], term lb + end + else + expected "parameter list" + [UID"";UNDERSCORE;L_PAREN;terminator] + end + | L_PAREN -> + let ps = nelist params lb in + begin + match current_token() with + | _ -> + if current_token() = terminator then + begin + consume_token lb; + ps, term lb + end + else + expected "" [terminator] + end + | _ -> + expected "" [UID"";UNDERSCORE;L_PAREN;] + + +(* search *) + +and generalize (lb:lexbuf): bool = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | GENERALIZE -> consume_token lb; true + | _ -> false + +and relation (lb:lexbuf): relation option = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | UID "=" -> consume_token lb; Some Exact + | UID ">" -> consume_token lb; Some Inside + | UID ("≥"|">=") -> consume_token lb; None + | _ -> expected "\">\", \"=\", \"≥\",\">=\"" [] + +and where (lb:lexbuf): bool * relation option = + if log_enabled() then log "%s" __FUNCTION__; + let r = relation lb in + let g = generalize lb in + g,r + +and asearch (lb:lexbuf): search = + if log_enabled() then log "%s" __FUNCTION__; + match current_token() with + | UID "name" -> + begin + consume_token lb; + match current_token() with + | UID "=" -> + consume_token lb; + QBase(QName (uid lb).elt) + | _ -> expected "\"=\"" [] + end + | TYPE_QUERY -> + begin + consume_token lb; + match current_token() with + | UID ("≥"|">=") -> + consume_token lb; + let g = generalize lb in + let t = term lb in + QBase(QSearch(t,g,Some(QType None))) + | _ -> expected "\"≥\",\">=\"" [] + end + | UID "anywhere" -> + begin + consume_token lb; + match current_token() with + | UID ("≥"|">=") -> + consume_token lb; + let g = generalize lb in + let t = term lb in + QBase(QSearch(t,g,None)) + | _ -> expected "\"≥\",\">=\"" [] + end + | RULE -> + consume_token lb; + let r = relation lb in + let g = generalize lb in + let t = term lb in + QBase(QSearch(t,g,Some(QXhs(r,None)))) + | UID "spine" -> + begin + consume_token lb; + let r = relation lb in + let g = generalize lb in + let t = term lb in + QBase(QSearch(t,g,Some(QType(Some(Spine r))))) + end + | UID "concl" -> + begin + consume_token lb; + let r = relation lb in + let g = generalize lb in + let t = term lb in + QBase(QSearch(t,g,Some(QType(Some(Conclusion r))))) + end + | UID "hyp" -> + begin + consume_token lb; + let r = relation lb in + let g = generalize lb in + let t = term lb in + QBase(QSearch(t,g,Some(QType(Some(Hypothesis r))))) + end + | UID "lhs" -> + begin + consume_token lb; + let r = relation lb in + let g = generalize lb in + let t = term lb in + QBase(QSearch(t,g,Some(QXhs(r,Some Lhs)))) + end + | UID "rhs" -> + begin + consume_token lb; + let r = relation lb in + let g = generalize lb in + let t = term lb in + QBase(QSearch(t,g,Some(QXhs(r,Some Rhs)))) + end + | L_PAREN -> + consume_token lb; + let q = search lb in + consume R_PAREN lb; + q + | _ -> + expected "name, anywhere, rule, lhs, rhs, type, concl, hyp, spine" [] + +and csearch (lb:lexbuf): search = + if log_enabled() then log "%s" __FUNCTION__; + let aq = asearch lb in + match current_token() with + | COMMA -> + let aqs = list (prefix COMMA asearch) lb in + List.fold_left (fun x aq -> QOp(x,Intersect,aq)) aq aqs + | _ -> + aq + +and ssearch (lb:lexbuf): search = + if log_enabled() then log "%s" __FUNCTION__; + let cq = csearch lb in + match current_token() with + | SEMICOLON -> + let cqs = list (prefix SEMICOLON csearch) lb in + List.fold_left (fun x cq -> QOp(x,Union,cq)) cq cqs + | _ -> + cq + +and search (lb:lexbuf): search = + (* expected "prbolem " []*) + if log_enabled() then log "%s" __FUNCTION__; + let q = ssearch lb in + let qids = list (prefix VBAR qid) lb in + let path_of_qid qid = + let p,n = qid.elt in + if p = [] then n + else Format.asprintf "%a.%a" Print.path p Print.uid n + in + List.fold_left (fun x qid -> QFilter(x,Path(path_of_qid qid))) q qids diff --git a/src/parsing/syntax.ml b/src/parsing/syntax.ml index a42972d9d..7815f7a69 100644 --- a/src/parsing/syntax.ml +++ b/src/parsing/syntax.ml @@ -224,6 +224,7 @@ type op = | Union type filter = | Path of string + | RegExp of Str.regexp type search = | QBase of search_base | QOp of search * op * search diff --git a/src/pure/dune b/src/pure/dune index 503baa2a3..58c045fb1 100644 --- a/src/pure/dune +++ b/src/pure/dune @@ -4,5 +4,5 @@ (modules :standard) (inline_tests (deps tests/foo.lp tests/lambdapi.pkg)) (preprocess (pps ppx_inline_test)) - (libraries camlp-streams lambdapi.handle lambdapi.core) + (libraries camlp-streams lambdapi.handle lambdapi.core lambdapi.tool) (flags -w +3)) diff --git a/src/pure/pure.ml b/src/pure/pure.ml index bb18ed3e0..161506651 100644 --- a/src/pure/pure.ml +++ b/src/pure/pure.ml @@ -65,11 +65,11 @@ let parse_command p : (Command.t, Pos.popt * string) Result.t = Result.Error (None, "EOF") | Some c -> Ok c - | exception Fatal(Some(Some(pos)), msg) -> + | exception Fatal(Some(Some(pos)), msg, "") -> Error(Some pos, msg) - | exception Fatal(Some(None) , _ ) -> + | exception Fatal(Some(None) , _ , "") -> assert false - | exception Fatal(None , _ ) -> + | exception Fatal(None , _ , "") -> assert false (** Exception raised by [parse_text] on error. *) @@ -77,6 +77,7 @@ let parse_command p : (Command.t, Pos.popt * string) Result.t = let parse_text : fname:string -> string -> Command.t list * (Pos.pos * string) option = fun ~fname s -> + Stdlib.(Tool.Indexing.lsp_input := (fname,s)) ; let parse_string = if Filename.check_suffix fname dk_src_extension then Parser.Dk.parse_string @@ -87,9 +88,10 @@ let parse_text : Stream.iter (fun c -> Stdlib.(cmds := c :: !cmds)) (parse_string fname s); List.rev Stdlib.(!cmds), None with - | Fatal(Some(Some(pos)), msg) -> List.rev Stdlib.(!cmds), Some(pos, msg) - | Fatal(Some(None) , _ ) -> assert false - | Fatal(None , _ ) -> assert false + | Fatal(Some(Some(pos)), msg, err_desc) -> + List.rev Stdlib.(!cmds), Some(pos, msg ^ "\n" ^ err_desc) + | Fatal(Some(None) , _ , _) -> assert false + | Fatal(None , _ , _) -> assert false let parse_file : fname:string -> Command.t list * (Pos.pos * string) option = @@ -104,9 +106,9 @@ let parse_file : Stream.iter (fun c -> Stdlib.(cmds := c :: !cmds)) (parse_file fname); List.rev Stdlib.(!cmds), None with - | Fatal(Some(Some(pos)), msg) -> List.rev Stdlib.(!cmds), Some(pos, msg) - | Fatal(Some(None) , _ ) -> assert false - | Fatal(None , _ ) -> assert false + | Fatal(Some(Some(pos)), msg, "") -> List.rev Stdlib.(!cmds), Some(pos, msg) + | Fatal(Some(None) , _ , "" ) -> assert false + | Fatal(None , _ , "" ) -> assert false type proof_finalizer = Sig_state.t -> Proof.proof_state -> Sig_state.t @@ -158,8 +160,8 @@ let handle_command : state -> Command.t -> command_result = (t, ss, d.pdata_state, d.pdata_finalize, d.pdata_prv, d.pdata_sym_pos) in Cmd_Proof(ps, d.pdata_proof, d.pdata_sym_pos, d.pdata_end_pos) - with Fatal(Some p,m) -> - Cmd_Error(Some p, m) + with Fatal(Some p,m, err_desc) -> + Cmd_Error(Some p, m ^ "\n" ^ err_desc) let handle_tactic : proof_state -> Tactic.t -> int -> tactic_result = fun (_, ss, ps, finalize, prv, sym_pos) tac n -> @@ -167,14 +169,14 @@ let handle_tactic : proof_state -> Tactic.t -> int -> tactic_result = let ps, qres = Handle.Tactic.handle ss sym_pos prv (ps, None) tac n in let qres = Option.map (fun f -> f ()) qres in Tac_OK((Time.save (), ss, ps, finalize, prv, sym_pos), qres) - with Fatal(Some p,m) -> - Tac_Error(Some p, Pos.popt_to_string p ^ " " ^ m) + with Fatal(Some p,m, err_desc) -> + Tac_Error(Some p, m ^ "\n" ^ err_desc) let end_proof : proof_state -> command_result = fun (_, ss, ps, finalize, _, _) -> try Cmd_OK((Time.save (), finalize ss ps), None) - with Fatal(Some p,m) -> - Cmd_Error(Some p, m) + with Fatal(Some p,m, err_descr) -> + Cmd_Error(Some p, m ^ "\n" ^ err_descr) let get_symbols : state -> Term.sym Extra.StrMap.t = fun (_, ss) -> ss.in_scope diff --git a/src/tool/dune b/src/tool/dune index 2be51b1cd..62eeda1cf 100644 --- a/src/tool/dune +++ b/src/tool/dune @@ -2,7 +2,7 @@ (name tool) (public_name lambdapi.tool) (modules :standard) - (libraries lambdapi.parsing lambdapi.core dream unix dune-site) + (libraries lambdapi.parsing lambdapi.core dream unix dune-site lambdapi.ressources) (preprocess (pps lwt_ppx))) @@ -10,14 +10,3 @@ (targets websearch.ml) (deps websearch.eml.ml) (action (run dream_eml %{deps} --workspace %{workspace_root}))) - -(install - (section (site (lambdapi server_resources))) - (files - (lambdapi.ico as default/lambdapi.ico) - ) -) - -(generate_sites_module - (module mysites) - (sites lambdapi)) diff --git a/src/tool/indexing.ml b/src/tool/indexing.ml index 0ea53c853..4574b22e9 100644 --- a/src/tool/indexing.ml +++ b/src/tool/indexing.ml @@ -1,10 +1,40 @@ open Core open Term open Common open Pos +open Timed + +let lsp_input = Stdlib.ref ("","") type sym_name = Common.Path.t * string +let string_of_path x = Format.asprintf "%a" Common.Path.pp x +let string_of_sym_name (p,n) = string_of_path p ^ "." ^ n + +let is_path_prefix patt p = + Lplib.String.is_prefix patt (string_of_path p) + +let re_matches_sym_name re sym_name = + try + ignore (Str.search_forward re (string_of_sym_name sym_name) 0) ; + true + with Not_found -> false + let name_of_sym s = (s.sym_path, s.sym_name) +(* Tail recursive implementation of List.append for + OCaml < 5.1 *) +let (@) l1 l2 = List.rev_append (List.rev l1) l2 + +let dump_to ~filename i = + let ch = open_out_bin filename in + Marshal.to_channel ch i [] ; + close_out ch + +let restore_from ~filename = + let ch = open_in_bin filename in + let i = Marshal.from_channel ch in + close_in ch ; + i + (* discrimination tree *) (* substitution trees would be best *) @@ -63,7 +93,13 @@ let rec node_of_stack t s v = (* Let-ins are expanded during indexing *) node_of_stack (subst bind bod) s v | Meta _ -> assert false - | Plac _ -> assert false (* not for meta-closed terms *) + | Plac _ -> + (* this may happen in a rewriting rule that uses a _ in the r.h.s. + that is NOT instantiated; the rule is meant to open a proof + obligation. + + Tentative implementation: a placeholder is seen as a variable *) + IRigid(IVar, index_of_stack s v) | Wild -> assert false (* used only by tactics and reduction *) | TRef _ -> assert false (* destroyed by unfold *) | Bvar _ -> assert false @@ -87,7 +123,14 @@ let rec match_rigid r term = | IAbst, Abst(t1,bind) -> let _, t2 = unbind bind in [t1;t2] | IProd, Prod(t1,bind) -> let _, t2 = unbind bind in [t1;t2] | _, LLet (_typ, bod, bind) -> match_rigid r (subst bind bod) - | _, (Meta _ | Plac _ | Wild | TRef _) -> assert false + | IVar, Plac _ -> + (* this may happen in a rewriting rule that uses a _ in the r.h.s. + that is NOT instantiated; the rule is meant to open a proof + obligation. + + Tentative implementation: a placeholder is seen as a variable *) + [] + | _, (Meta _ | Wild | TRef _) -> assert false | _, _ -> raise NoMatch (* match anything with a flexible term *) @@ -145,6 +188,38 @@ let insert_name (namemap,index) name v = | Some l -> l in Lplib.Extra.StrMap.add name (v::vs) namemap, index +let rec remove_index ~what = + function + | Leaf l -> + let l' = List.filter what l in + (match l' with + | [] -> Choice [] + | _::_ -> Leaf l') + | Choice l -> + Choice (List.filter_map (remove_node ~what) l) + +and remove_node ~what = + function + | IHOLE i -> + (match remove_index ~what i with + | Choice [] -> None + | i' -> Some (IHOLE i')) + | IRigid (r,i) -> + (match remove_index ~what i with + | Choice [] -> None + | i' -> Some (IRigid (r,i'))) + +let remove_from_name_index ~what i = + Lplib.Extra.StrMap.filter_map + (fun _ l -> + let f x = if what x then Some x else None in + match List.filter_map f l with + | [] -> None + | l' -> Some l') i + +let remove ~what (dbname, index) = + remove_from_name_index ~what dbname, remove_index ~what index + let rec search_index ~generalize index stack = match index,stack with | Leaf vs, [] -> vs @@ -174,17 +249,6 @@ let search ~generalize (_,index) term = let locate_name (namemap,_) name = match Lplib.Extra.StrMap.find_opt name namemap with None -> [] | Some l -> l -let dump_to ~filename i = - let ch = open_out_bin filename in - Marshal.to_channel ch i [] ; - close_out ch - -let restore_from ~filename = - let ch = open_in_bin filename in - let i = Marshal.from_channel ch in - close_in ch ; - i - end module DB = struct @@ -238,8 +302,111 @@ module DB = struct m) empty l end + module Sym_nameMap = + Map.Make(struct type t = sym_name let compare = compare end) + type answer = ((*generalized:*)bool * term * position) list + (* disk persistence *) + + let the_dbpath : string Stdlib.ref = Stdlib.ref Path.default_dbpath + + let rwpaths = Stdlib.ref [] + + let restore_from_disk () = + try restore_from ~filename:Stdlib.(!the_dbpath) + with Sys_error msg -> + Common.Error.wrn None "%s.\n\ + Type \"lambdapi index --help\" to learn how to create the index." msg ; + Sym_nameMap.empty, Index.empty + + (* The persistent database *) + let db : + ((string * string * int * int) Sym_nameMap.t * + (item * position list) Index.db) Lazy.t ref = + ref (lazy (restore_from_disk ())) + + let preserving_index f x = + let saved_db = !db in + let res = f x in + db := saved_db ; + res + + let empty () = db := lazy (Sym_nameMap.empty,Index.empty) + + let insert k v = + let sidx,idx = Lazy.force !db in + let db' = sidx, Index.insert idx k v in + db := lazy db' + + let insert_name k v = + let sidx,idx = Lazy.force !db in + let db' = sidx, Index.insert_name idx k v in + db := lazy db' + + let remove path = + let sidx,idx = Lazy.force !db in + let keep sym_path = not (is_path_prefix path sym_path) in + let idx = + Index.remove + ~what:(fun (((sym_path,_),_),_) -> keep sym_path ) idx in + let sidx = + Sym_nameMap.filter (fun (sym_path,_) _ -> keep sym_path) sidx in + db := lazy (sidx,idx) + + let set_of_list ~generalize k l = + (* rev_map is used because it is tail recursive *) + ItemSet.of_list + (List.rev_map + (fun (i,pos) -> + i, List.map (fun x -> generalize,k,x) pos) l) + + let search ~generalize k = + set_of_list ~generalize k + (Index.search ~generalize (snd (Lazy.force !db)) k) + + let dump ~dbpath () = + dump_to ~filename:dbpath (Lazy.force !db) + + let locate_name name = + let k = Term.mk_Wild (* dummy, unused *) in + set_of_list ~generalize:false k + (Index.locate_name (snd (Lazy.force !db)) name) + + let parse_source_map filename = + let sidx,idx = Lazy.force !db in + let sidx = ref sidx in + let ch = open_in filename in + (try + while true do + let line = input_line ch in + match String.split_on_char ' ' line with + | [fname; start_line; end_line; sourceid; lpid] -> + let rec sym_name_of = + function + | [] -> assert false + | [name] -> [],name + | hd::tl -> let path,name = sym_name_of tl in hd::path,name in + let lpid = sym_name_of (String.split_on_char '.' lpid) in + let start_line = int_of_string start_line in + let end_line = int_of_string end_line in + sidx := + Sym_nameMap.add lpid (sourceid,fname,start_line,end_line) !sidx + | _ -> + raise + (Common.Error.Fatal + (None,"wrong file format for source map file", "")) + done ; + with + | Failure _ as exn -> + close_in ch; + raise + (Common.Error.Fatal(None,"wrong file format for source map file: " ^ + Printexc.to_string exn, "")) + | End_of_file -> close_in ch) ; + db := lazy (!sidx,idx) + + type ho_pp = { run : 'a. 'a Lplib.Base.pp -> 'a Lplib.Base.pp } let identity_escaper : ho_pp = @@ -251,6 +418,18 @@ module DB = struct Format.pp_print_string fmt res } + let source_infos_of_sym_name sym_name = + match Sym_nameMap.find_opt sym_name (fst (Lazy.force !db)) with + | None -> None, None + | Some (sourceid, fname, start_line, end_line) -> + let start_col = 0 in + let end_col = -1 in (* to the end of line *) + (* FIX ME *) + let start_offset, end_offset = 0, 0 in + Some sourceid, + Some { fname=Some fname; start_line; start_col; end_line; + end_col; start_offset; end_offset } + let generic_pp_of_position_list ~escaper ~sep = Lplib.List.pp (fun ppf position -> @@ -270,108 +449,128 @@ module DB = struct pp_relation relation pp_side side)) sep + (* given a filename/URI it returns the stream of lines + and a function to close the resources *) + let parse_file fname = + if String.starts_with ~prefix:"file:///" fname then + (assert (fst Stdlib.(!lsp_input) = fname) ; + let text = snd Stdlib.(!lsp_input) in + let lines = ref (String.split_on_char '\n' text) in + (fun () -> + match !lines with + | [] -> raise End_of_file + | he::tl -> lines := tl ; he), + (fun () -> ())) + else + let ch = open_in fname in + (fun () -> input_line ch), (fun () -> close_in ch) + let generic_pp_of_item_list ~escape ~escaper ~separator ~sep ~delimiters ~lis:(lisb,lise) ~pres:(preb,pree) - ~bold:(boldb,bolde) ~code:(codeb,codee) fmt l - = + ~bold:(boldb,bolde) ~code:(codeb,codee) ?(colorizer=Lplib.Color.default) + fmt l = if l = [] then Lplib.Base.out fmt "Nothing found" else - Lplib.List.pp - (fun ppf (((p,n),pos),(positions : answer)) -> - Lplib.Base.out ppf "%s%a.%s%s%s@%s%s%a%s%s%s%a%s%s%s@." - lisb (escaper.run Core.Print.path) p boldb n bolde - (popt_to_string ~print_dirname:false pos) - separator (generic_pp_of_position_list ~escaper ~sep) positions - separator preb codeb - (Common.Pos.print_file_contents ~escape ~delimiters) - pos codee pree lise) + Lplib.List.pp (fun ppf (((p,n) as sym_name,pos),(positions : answer)) -> + let sourceid,sourcepos = source_infos_of_sym_name sym_name in + (* let n_pp = (Lplib.Color.red "%s") n in *) + Lplib.Base.out ppf + ("%s%a.%s" + ^^ + (colorizer "%s") + ^^ "%s@%s%s%a%s%s%s%a%s%a%s%a%s%s%s%s@.") + lisb (escaper.run Core.Print.path) p boldb n bolde + (popt_to_string ~print_dirname:false pos) + separator (generic_pp_of_position_list ~escaper ~sep) positions + separator preb codeb + (Common.Pos.print_file_contents ~parse_file ~escape ~delimiters + ~complain_if_location_unknown:true) pos + separator + (fun ppf opt -> + match opt with + | None -> Lplib.Base.string ppf "" + | Some sourceid -> + Lplib.Base.string ppf ("Translated to " ^ sourceid)) sourceid + separator + (Common.Pos.print_file_contents ~parse_file ~escape ~delimiters + ~complain_if_location_unknown:false) sourcepos + codee pree lise separator) "" fmt l let html_of_item_list = generic_pp_of_item_list ~escape:Dream.html_escape ~escaper:html_escaper ~separator:"
\n" ~sep:" and
\n" ~delimiters:("

","

") ~lis:("
  • ","
  • ") ~pres:("
    ","
    ") ~bold:("","") - ~code:("","") + ~code:("","") ~colorizer: Lplib.Color.default - let pp_item_list = + let pp_item_list fmt l = generic_pp_of_item_list ~escape:(fun x -> x) ~escaper:identity_escaper ~separator:"\n" ~sep:" and\n" ~delimiters:("","") - ~lis:("* ","") ~pres:("","") ~bold:("","") ~code:("","") + ~lis:("* ","") ~pres:("","") + ~bold:("","") + ~code:("","") + ~colorizer: + (if Stdlib.(!Common.Console.lsp_mod) || Unix.isatty Unix.stdout then + Lplib.Color.red + else + Lplib.Color.default) + fmt l let pp_results_list fmt l = pp_item_list fmt l let html_of_results_list from fmt l = Lplib.Base.out fmt "
      %a
    " from html_of_item_list l - (* disk persistence *) - - let the_dbpath : string ref = ref Path.default_dbpath - - let rwpaths = ref [] - - let restore_from_disk () = - try Index.restore_from ~filename:!the_dbpath - with Sys_error msg -> - Common.Error.wrn None "%s.\n\ - Type \"lambdapi index --help\" to learn how to create the index." msg ; - Index.empty - - let db : (item * position list) Index.db Lazy.t ref = - ref (lazy (restore_from_disk ())) - - let empty () = db := lazy Index.empty - - let insert k v = - let db' = Index.insert (Lazy.force !db) k v in - db := lazy db' - - let insert_name k v = - let db' = Index.insert_name (Lazy.force !db) k v in - db := lazy db' - - let set_of_list ~generalize k l = - ItemSet.of_list - (List.map - (fun (i,pos) -> - i, List.map (fun x -> generalize,k,x) pos) l) - - let search ~generalize k = - set_of_list ~generalize k - (Index.search ~generalize (Lazy.force !db) k) - - let dump ~dbpath () = - the_dbpath := dbpath; - Index.dump_to ~filename:dbpath (Lazy.force !db) - - let locate_name name = - let k = Term.mk_Wild (* dummy, unused *) in - set_of_list ~generalize:false k - (Index.locate_name (Lazy.force !db) name) - end exception Overloaded of string * DB.answer DB.ItemSet.t +let normalize_fun = ref (fun _ -> assert false) + +let mk_bogus_sym mp name pos = + Core.Term.create_sym mp Core.Term.Public Core.Term.Defin Core.Term.Sequen + false (Common.Pos.make pos name) None Core.Term.mk_Type [] + +let elim_duplicates_up_to_normalization res = + let resl = DB.ItemSet.bindings res in + let norm = + List.map + (fun ((((mp,name),sympos),l) as inp) -> + let s = mk_bogus_sym mp name sympos in + match !normalize_fun (Core.Term.mk_Symb s) with + | Symb {sym_path ; sym_name ; _} -> + (((sym_path,sym_name),sympos), l) + | _ -> inp) resl in + let res = List.sort (fun ((x,_),_) ((y,_),_) -> compare x y) norm in + let res = + let rec uniq = + function + | [] | [_] as l -> l + | ((x,_),_)::(((y,_),_)::_ as l) when x=y -> uniq l + | i::l -> i::uniq l in + uniq res in + DB.ItemSet.of_list res + let find_sym ~prt ~prv sig_state ({elt=(mp,name); pos} as s) = - let pos,mp = - match mp with - [] -> - let res = DB.locate_name name in - if DB.ItemSet.cardinal res > 1 then - raise (Overloaded (name,res)) ; - (match DB.ItemSet.choose_opt res with - | None -> Common.Error.fatal pos "Unknown symbol %s." name - | Some (((mp,_),sympos),[_,_,DB.Name]) -> sympos,mp - | Some _ -> assert false) (* locate only returns DB.Name*) - | _::_ -> None,mp - in try Core.Sig_state.find_sym ~prt ~prv sig_state s with Common.Error.Fatal _ -> - Core.Term.create_sym mp Core.Term.Public Core.Term.Defin Core.Term.Sequen - false (Common.Pos.make pos name) None Core.Term.mk_Type [] + let pos,mp,name = + match mp with + [] -> + let res_orig = DB.locate_name name in + let res = elim_duplicates_up_to_normalization res_orig in + if DB.ItemSet.cardinal res > 1 then + raise (Overloaded (name,res_orig)) ; + (match DB.ItemSet.choose_opt res with + | None -> Common.Error.fatal pos "Unknown symbol %s." name + | Some (((mp,name),sympos),[_,_,DB.Name]) -> sympos,mp,name + | Some _ -> assert false) (* locate only returns DB.Name*) + | _::_ -> None,mp,name + in + mk_bogus_sym mp name pos module QNameMap = Map.Make(struct type t = sym_name let compare = Stdlib.compare end) @@ -403,7 +602,7 @@ let load_meta_rules () = match elt with Parsing.Syntax.P_rules r -> rules := List.rev_append r !rules | _ -> ()) - cmdstream) !DB.rwpaths ; + cmdstream) Stdlib.(!DB.rwpaths) ; let rules = List.rev !rules in let handle_rule map r = let (s,r) = check_rule r in @@ -418,20 +617,25 @@ let load_meta_rules () = let meta_rules = lazy (load_meta_rules ()) +let force_meta_rules_loading () = ignore (Lazy.force meta_rules) + let normalize typ = let dtree sym = try QNameMap.find (name_of_sym sym) (Lazy.force meta_rules) with Not_found -> Core.Tree_type.empty_dtree in Core.Eval.snf ~dtree ~tags:[`NoExpand] [] typ +let _ = normalize_fun := normalize + let search_pterm ~generalize ~mok ss env pterm = let env = ("V#",(new_var "V#",Term.mk_Type,None))::env in + Dream.log "QUERY before scoping: %a@." Parsing.Pretty.term pterm ; let query = Parsing.Scope.scope_search_pattern ~find_sym ~mok ss env pterm in - Dream.log "QUERY before: %a" Core.Print.term query ; + Dream.log "QUERY before normalize: %a" Core.Print.term query ; let query = normalize query in - Dream.log "QUERY after: %a" Core.Print.term query ; + Dream.log "QUERY to be executed: %a" Core.Print.term query ; DB.search ~generalize query let rec is_flexible t = @@ -440,7 +644,14 @@ let rec is_flexible t = | Appl(t,_) -> is_flexible t | LLet(_,_,b) -> let _, t = unbind b in is_flexible t | Vari _ | Type | Kind | Symb _ | Prod _ | Abst _ -> false - | Meta _ | Plac _ | Wild | TRef _ | Bvar _ -> assert false + | Plac _ -> + (* this may happen in a rewriting rule that uses a _ in the r.h.s. + that is NOT instantiated; the rule is meant to open a proof + obligation + + Tentative implementation: a placeholder is seen as a variable *) + false + | Meta _ | Wild | TRef _ | Bvar _ -> assert false let enter = DB.(function @@ -470,6 +681,7 @@ let subterms_to_index ~is_spine t = | Vari _ | Type | Kind + | Plac _ | Symb _ -> [] | Abst(t,b) -> let _, t2 = unbind b in @@ -494,7 +706,7 @@ let subterms_to_index ~is_spine t = let _, t3 = unbind b in aux ~where:(enter where) t1 @ aux ~where:(enter where) t2 @ aux ~where:(enter where) t3 - | Meta _ | Plac _ | Wild | TRef _ -> assert false + | Meta _ | Wild | TRef _ -> assert false in aux ~where:(if is_spine then Spine Exact else Conclusion Exact) t let insert_rigid t v = @@ -530,6 +742,12 @@ let index_rule sym ({Core.Term.lhs=lhsargs ; rule_pos ; _} as rule) = let rhs = rule.rhs in let get_relation = function | DB.Conclusion r -> r | _ -> assert false in let filename = Option.get rule_pos.fname in + let filename = + if String.starts_with ~prefix:"file:///" filename then + let n = String.length "file://" in + String.sub filename n (String.length filename - n) + else + filename in let path = Library.path_of_file Parsing.LpLexer.escape filename in let rule_name = (path,Common.Pos.to_string ~print_fname:false rule_pos) in index_term_and_subterms ~is_spine:false lhs @@ -537,9 +755,19 @@ let index_rule sym ({Core.Term.lhs=lhsargs ; rule_pos ; _} as rule) = index_term_and_subterms ~is_spine:false rhs (fun where -> ((rule_name,Some rule_pos),[Xhs(get_relation where,Rhs)])) +let _ = + Stdlib.(Core.Sign.add_rules_callback := + fun sym rules -> List.iter (index_rule sym) rules) + let index_sym sym = let qname = name_of_sym sym in (* Name *) + if List.exists (fun ((sn,_),_) -> sn=qname) + (DB.ItemSet.bindings (DB.locate_name (snd qname))) + then + raise + (Common.Error.Fatal + (None,string_of_sym_name qname ^ " already indexed", "")); DB.insert_name (snd qname) ((qname,sym.sym_decl_pos),[Name]) ; (* Type + InType *) let typ = Timed.(!(sym.Core.Term.sym_type)) in @@ -550,8 +778,10 @@ let index_sym sym = (* Rules *) List.iter (index_rule sym) Timed.(!(sym.Core.Term.sym_rules)) +let _ = Stdlib.(Core.Sign.add_symbol_callback := index_sym) + let load_rewriting_rules rwrules = - DB.rwpaths := rwrules + Stdlib.(DB.rwpaths := rwrules) let index_sign sign = (*Console.set_flag "print_domains" true ; @@ -569,6 +799,8 @@ let index_sign sign = d.Sign.dep_symbols) deps +let deindex_path path = DB.remove path + (* let's flatten the interface *) include DB @@ -592,17 +824,18 @@ module QueryLanguage = struct | _, _ -> false let filter_constr constr _ positions = - match constr with - | QType wherep -> - List.exists - (function - | _,_,Type where -> match_where wherep where - | _ -> false) positions - | QXhs (insp,sidep) -> - List.exists - (function - | _,_,Xhs (ins,side) -> match_opt insp ins && match_opt sidep side - | _ -> false) positions + Option.map (fun x -> [x]) + (match constr with + | QType wherep -> + List.find_opt + (function + | _,_,Type where -> match_where wherep where + | _ -> false) positions + | QXhs (insp,sidep) -> + List.find_opt + (function + | _,_,Xhs (ins,side) -> match_opt insp ins && match_opt sidep side + | _ -> false) positions) let answer_base_query ~mok ss env = function @@ -611,7 +844,7 @@ module QueryLanguage = struct let res = search_pterm ~generalize ~mok ss env patt in (match constr with | None -> res - | Some constr -> ItemSet.filter (filter_constr constr) res) + | Some constr -> ItemSet.filter_map (filter_constr constr) res) let perform_op = function @@ -626,12 +859,11 @@ module QueryLanguage = struct (fun _ positions1 positions2 -> Some (positions1 @ positions2)) let filter set f = - let f ((p',_),_) _ = + let g ((p',_ as name),_) _ = match f with - | Path p -> - let string_of_path x = Format.asprintf "%a" Common.Path.pp x in - Lplib.String.is_prefix p (string_of_path p') in - ItemSet.filter f set + | Path p -> is_path_prefix p p' + | RegExp re -> re_matches_sym_name re name in + ItemSet.filter g set let answer_query ~mok ss env = let rec aux = @@ -649,37 +881,53 @@ include QueryLanguage module UserLevelQueries = struct - let search_cmd_gen ss ~from ~how_many ~fail ~pp_results ~tag:(hb,he) q = + let search_cmd_gen ss ~from ~how_many + ~(fail:?err_desc:string -> string -> string) + ~pp_results ~tag:(hb,he) q fmt s = try let mok _ = None in + let q = match q with + | None -> Parsing.Parser.Rocq.parse_search_string (lexing_opt None) s + | Some q -> q in let items = ItemSet.bindings (answer_query ~mok ss [] q) in let resultsno = List.length items in let _,items = Lplib.List.cut items from in let items,_ = Lplib.List.cut items how_many in - Format.asprintf "%sNumber of results: %d%s%a@." + Lplib.Base.out fmt "%sNumber of results: %d%s@.%a@." hb resultsno he pp_results items with | Stream.Failure -> - fail (Format.asprintf "Syntax error: a query was expected") - | Common.Error.Fatal(_,msg) -> - fail (Format.asprintf "Error: %s@." msg) + Lplib.Base.out fmt "%s" + (fail (Format.asprintf "Syntax error: a query was expected@.")) + | Common.Error.Fatal(_,msg, _) -> + Lplib.Base.out fmt "%s" (fail (Format.asprintf "Error: %s@." msg)) | Overloaded(name,res) -> - fail (Format.asprintf + Lplib.Base.out fmt "%s" (fail (Format.asprintf "Overloaded symbol %s. Please rewrite the query replacing %s \ - with a fully qualified identifier among the following: %a" - name name pp_results (ItemSet.bindings res)) + with a fully qualified identifier among the following:@." + name name) + ~err_desc:(Format.asprintf "%a@." pp_results (ItemSet.bindings res)) + ) | Stack_overflow -> - fail + Lplib.Base.out fmt "%s" (fail (Format.asprintf - "Error: too many results. Please refine your query.@." ) + "Error: too many results. Please refine your query.@." )) | exn -> - fail (Format.asprintf "Error: %s@." (Printexc.to_string exn)) + Lplib.Base.out fmt "%s" + (fail (Format.asprintf "Error: %s@." (Printexc.to_string exn))) + + let search_cmd_txt_string ss ~dbpath s = + Stdlib.(the_dbpath := dbpath); + Format.asprintf "%a" (search_cmd_gen ss ~from:0 ~how_many:999999 + ~fail:(fun ?err_desc x -> Common.Error.fatal_no_pos ?err_desc "%s" x) + ~pp_results:pp_results_list ~tag:("","") None) s let search_cmd_txt_query ss ~dbpath q = - the_dbpath := dbpath; - search_cmd_gen ss ~from:0 ~how_many:999999 - ~fail:(fun x -> Common.Error.fatal_no_pos "%s" x) - ~pp_results:pp_results_list ~tag:("","") q + Stdlib.(the_dbpath := dbpath); + Format.asprintf "%a" (search_cmd_gen ss ~from:0 ~how_many:999999 + ~fail:(fun ?err_desc x -> Common.Error.fatal_no_pos ?err_desc "%s" x) + ~pp_results:pp_results_list ~tag:("","") (Some q)) "" + (** [transform_ascii_to_unicode s] replaces all the occurences of ["->"] and ["forall"] with ["→"] and ["Π"] in the search query [s] *) @@ -696,21 +944,23 @@ module UserLevelQueries = struct assert (transform_ascii_to_unicode "forall.x, y" = "Π.x, y"); assert (transform_ascii_to_unicode "((forall x, y" = "((Π x, y") - let search_cmd_html ss ~from ~how_many ~dbpath s = - the_dbpath := dbpath; - search_cmd_gen ss ~from ~how_many - ~fail:(fun x -> "" ^ x ^ "") - ~pp_results:(html_of_results_list from) - ~tag:("

    "," "" ^ x ^ "" + ^ (Option.value err_desc ~default:"") + ) + ~pp_results:(html_of_results_list from) + ~tag:("

    "," unit val load_rewriting_rules: string list -> unit +val force_meta_rules_loading: unit -> unit val index_sign : Sign.t -> unit +val parse_source_map : string -> unit (* the name of the file *) +val deindex_path : string -> unit val dump : dbpath:string -> unit -> unit +val preserving_index : ('a -> 'b) -> 'a -> 'b + +(* set by lsp; used to print the query results *) +val lsp_input : ((*uri*)string * (*text*)string) ref + (* search command used by cli *) -val search_cmd_txt: sig_state -> dbpath:string -> string -> string val search_cmd_txt_query: sig_state -> dbpath:string -> search -> string (* search command used by websearch *) val search_cmd_html: - sig_state -> from:int -> how_many:int -> dbpath:string -> string -> string +sig_state -> from:int -> how_many:int -> string -> dbpath:string -> string + +val search_cmd_txt: sig_state -> dbpath:string -> string Lplib.Base.pp diff --git a/src/tool/lambdapi.ico b/src/tool/lambdapi.ico deleted file mode 100644 index b3c207a5d..000000000 Binary files a/src/tool/lambdapi.ico and /dev/null differ diff --git a/src/tool/websearch.eml.ml b/src/tool/websearch.eml.ml index 252aac823..ea2132c00 100644 --- a/src/tool/websearch.eml.ml +++ b/src/tool/websearch.eml.ml @@ -74,7 +74,7 @@ let show_form ~from ?(message="") ?output csrf_tag ~hide_description= -let themes_locations : string list = Mysites.Sites.server_resources +let themes_locations : string list = Ressources.Mysites.Sites.server_resources let start ~header ss ~port ~dbpath ~path_in_url () = (*Common.Logger.set_debug true "e" ;*) let interface = "0.0.0.0" in