
The NL2KR System

For a quick guide on the NL2KR-L and NL2KR-T systems, jump to sections 3.9 and 3.10.

1. Introduction and Motivation

Our approach to understanding natural language involves translating natural language text to formal statements in an appropriate knowledge representation language, so that a reasoning engine can reason with the translated knowledge and give a response, be it an answer, a clarifying question or an action. To translate natural language text to a formal statement, we propose to use the compositional method of Montague [1], where the translations (or meanings) of words are given as lambda calculus formulas, and the meanings of phrases and sentences are obtained by composing the meanings of their constituent words. The challenge is in coming up with appropriate lambda calculus expressions for each word, for several reasons: (a) the number of words may be huge; (b) the lambda calculus expressions (or meanings) of some words are too complicated for humans to come up with; and (c) the lambda calculus expressions for the words are specific to the target language, so compiling them is not a one-time affair like compiling traditional dictionaries. To address these challenges, we use an inverse lambda algorithm [2] that computes the meaning of a word/phrase G when the meaning of the word/phrase H and of the phrase GH (or HG) are known. The NL2KR system uses an initial lexicon, containing some words and their meanings, and a training corpus of sentences with their translations to learn new meanings of words. The system then uses the newly learned lexicon to translate new sentences. In this paper, we give an overview of the NL2KR system and examples of using it.

2. Overview

Shown below in Fig. 1 is the architecture of the NL2KR system. It has two sub-parts which depend on each other:
  1. NL2KR-L for learning
  2. NL2KR-T for translating
Fig. 1: Architecture of the NL2KR system: NL2KR-T on the left and NL2KR-L on the right
The NL2KR-L sub-part takes as input an initial lexicon, consisting of some words and their meanings in terms of λ-calculus expressions, and a set of training sentences with their target formal representations. It then uses a Combinatory Categorial Grammar (CCG) parser to construct the parse trees. Next, the learning sub-part of the system uses Inverse-λ and Generalization algorithms to learn meanings of newly encountered words that are not present in the initial lexicon, and adds them to the lexicon. A parameter learning method is then used to estimate a weight for each lexicon entry (a word, its syntactic category and its meaning) such that the joint probability of the training sentences getting translated to their given formal representations is maximized. The result of NL2KR-L is the final lexicon, which contains a larger set of words, their meanings and their weights. Once the training component finishes, the translation sub-part, NL2KR-T, uses this updated lexicon to translate sentences with the CCG parser. Given input sentences and the updated lexicon, NL2KR-T attempts to translate those sentences and determine their meanings. Since words can have multiple meanings, and thus multiple associated λ-calculus expressions, the weights assigned to the lexical entries help in deciding the more likely meaning of a word in the context of a sentence. In all examples, the symbol # is used to represent λ where necessary.
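As a rough illustration of how these weights come into play during translation, consider the sketch below (this is not NL2KR's actual implementation; the real system estimates weights jointly over whole parse trees, and the second meaning of "takes" here is a made-up rival entry for illustration):

```python
# Illustrative sketch only: a weighted lexicon maps each word to several
# (CCG category, meaning, weight) entries, and the highest-weight entry
# is preferred when translating.
lexicon = {
    "takes": [
        ("(S\\NP)/NP", "#w.#z.(w@#x.takes(z,x))", 0.8),
        ("(S\\NP)/NP", "#w.#z.(w@#x.needs(z,x))", 0.2),  # hypothetical rival meaning
    ],
}

def best_meaning(word):
    """Return the (category, meaning) of the highest-weight entry."""
    category, meaning, _ = max(lexicon[word], key=lambda entry: entry[2])
    return category, meaning

print(best_meaning("takes"))
```

In the full system, context matters: the weight of every entry used in a parse contributes to the probability of the whole translation, rather than each word being disambiguated in isolation.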

3. Using NL2KR

The latest version of the NL2KR system can be downloaded from the project website. It runs on Windows, Mac OS X, and Linux (64-bit).

3.1. Third party software used by NL2KR

The current version of NL2KR uses the following libraries and tools developed by others:
  • Stanford Log-linear Part-Of-Speech Tagger [4] (version 3.1.5)
  • Oracle Java (version ≥ 1.8)

3.2. Installation guide

Just unpack the package and run the .bat (Windows) or .sh (Mac OS X/Linux) scripts.

3.3. Files/folders in the package

Following is a brief list of important files/folders in the package:
  • NL2KR.jar: NL2KR’s classes packed in a jar file
  • Default configuration of the NL2KR system
  • Script to run the NL2KR main GUI
  • Script to run the NL2KR-L sub-part
  • Script to run the NL2KR-T sub-part
  • RunConfiguration: Example inputs of the preceding scripts
  • resources: Folder containing gringo, clasp and clingo
  • examples: Folder containing examples in various domains for NL2KR-L and NL2KR-T

3.4. Executing the scripts

The main script is the point of entry to all of the sub-parts of NL2KR. To execute any of the scripts, go to the NL2KR root directory and run
./<script name> <RunConfiguration file> [Optional params]
where script name is the name of one of the six scripts, RunConfiguration file contains the corresponding parameters for the script, and the optional parameters are passed to the Java Virtual Machine (JVM) that executes the module corresponding to the script. For learning or testing on a large dataset with NL2KR-L or NL2KR-T, it is recommended to give the JVM more memory and, if needed, disable the garbage-collection overhead limit (i.e., use -Xmx and -XX:-UseGCOverheadLimit).

3.5. Lambda application

To use the lambda application interface of the NL2KR GUI, the function and the argument of the lambda application need to be provided. For example, the following arguments specify that we need to apply the argument #x.x@mia to the function #y.#x.loves(x,y).
Function = #y.#x.loves(x,y) 
Argument = #x.x@mia
The output of the lambda application is function @ argument. According to lambda application, the y variable is replaced by #x.x@mia and the result is #x.loves(x,#x0.x0@mia), where x in the argument is renamed to x0. Running the lambda application script with the preceding configuration yields the output, and a visual representation of the lambda application:
Result= #x.loves(x,#x0.x0 @ mia)
Fig. 2: The lambda application GUI
Fig. 3: Graph representation of lambda application
However, in the following configuration, the argument cannot be applied to the function since there is no free variable in the function.
Function = loves(x,y) 
Argument = #x.x @ mia
The output thus would be:
Function = loves(x,y) 
Argument = #x.x @ mia 
Cannot apply the argument #x.x @ mia to the function loves(x, y)
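To make the application step concrete, the following is a minimal sketch (in Python, not part of NL2KR) of one-step beta reduction on #-style strings. It handles only the simple case with no variable capture; the actual NL2KR module also performs alpha-renaming (e.g. x to x0), as shown in the first example above.

```python
import re

def apply_lambda(function, argument):
    """Naive one-step beta reduction on '#var.body' strings. Handles
    only the simple case with no variable capture; the real NL2KR
    module also alpha-renames clashing variables (e.g. x to x0)."""
    m = re.match(r"#(\w+)\.(.*)", function)
    if m is None:
        # No lambda-bound variable to substitute into.
        raise ValueError(
            f"Cannot apply the argument {argument} to the function {function}")
    var, body = m.groups()
    # Replace every free occurrence of the bound variable with the argument.
    return re.sub(rf"\b{var}\b", argument, body)

print(apply_lambda("#y.#x.loves(x,y)", "mia"))  # -> #x.loves(x,mia)
# apply_lambda("loves(x,y)", "#x.x @ mia") would raise ValueError,
# mirroring the failure case above.
```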

3.6. Inverse application

Given two lambda expressions g and h, lambda application gives us f = g @ h or f = h @ g. Sometimes, however, we have only f and g and need to find h. The inverse application allows us to compute the lambda expression h such that f = g @ h or f = h @ g. More details about the inverse application can be found in [2]. For inverse application, we need to provide the parent (the lambda expression f, or the function output) and either the right child (r, the argument) or the left child (l, the function). Given one child, the module calculates the other child so that f = l @ r. In the first example below, since there does not exist a lambda expression r (right child) such that mia @ r = #x.loves(x,mia), the inverse algorithm returns the right child as null. However, when mia is the right child, inverse lambda returns l = #x1.#x.loves(x,x1), because #x1.#x.loves(x,x1) @ mia = #x.loves(x,mia). In the case that the CCG parse tree specifies that the meaning of mia must be in the left child, using #x.x@mia as the left child (via flipping) instead of mia will do the trick and give us the same expression as the right child: #x1.#x.loves(x,x1).

Example 1 Input:
Function Output = #x.loves(x,mia) 
Function = mia
Function_output = #x.loves(x,mia)
Nothing to compute.
Fig. 4: Failed inverse application
Example 2 Input:
Function Output = #x.loves(x,mia) 
Argument = mia
Function_output = #x.loves(x,mia)

Argument = mia
Function = #x2.#x.loves(x,x2)
Fig. 5: Successful inverse application
Fig. 6: Visual representation of inverse application
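An easy way to sanity-check an inverse-application result is simply to re-apply it. The sketch below (plain Python, not the inverse algorithm itself) encodes the computed left child as a closure and checks that applying it to mia rebuilds the body of f:

```python
# Re-application as a sanity check for inverse application (a sketch;
# NL2KR works on symbolic lambda terms, not Python closures). The
# example above computes the left child l = #x1.#x.loves(x,x1) from
# f = #x.loves(x,mia) and r = mia.
left_child = lambda x1: lambda x: f"loves({x},{x1})"  # #x1.#x.loves(x,x1)
right_child = "mia"

# l @ r yields #x.loves(x,mia); applying it to the variable name "x"
# exposes the body for comparison.
body = left_child(right_child)("x")
print(body)  # -> loves(x,mia)
```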

3.7. Generalization

The inverse lambda module is not always adequate for learning new meanings of words, particularly when we lack the word meanings needed to use the inverse lambda module at all. To address this, the NL2KR system includes a generalization module, which implements the generalization technique described in [2]. For example, if we want to find the meaning of the word plays with the CCG category (S\NP)∕NP using generalization, we can use the lexical entry (eats, (S\NP)∕NP, λy.λx.eats(x,y)) in the lexicon. Here, the category of eats is the same as that of plays, so the generalization module adds a new lexical entry (plays, (S\NP)∕NP, λy.λx.plays(x,y)) to the lexicon. The inputs of the generalization module are the file path of the lexicon (the existing dictionary) and the new word we want to generalize. Following is an illustration of the use of the generalization module. As inputs to the generalization GUI, we provide:
Lexicon  = ./examples/sample/dictionary.txt 
Word = Mia 
In this example, dictionary.txt contains:
Vincent     N               vincent 
Vincent     N               #x.vincent(x) 
takes       (S\NP)/NP       #w.#z.(w@#x.takes(z,x)) 
plane       N               #x.plane(x) 
boxer       N               #x.boxer(x) 
fights      S\NP            #x.fight(x)
In this case, the word Mia can be obtained by generalization from the words Vincent, plane and boxer, each of which has the category N (noun). The output of the generalization module is:
New lexical items learned through Generalization: 
Mia     N       #x.mia(x)
Mia     N       mia
Fig. 7: The generalization module GUI
We can also restrict generalization through the configuration file. For example, it can be configured to skip the generalization process for the NP and N categories, and for the words on, of, by and in.
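The generalization step on the dictionary above can be sketched as follows (an illustration only, not NL2KR's implementation; it reuses the meanings of same-category entries and swaps in the new word's lowercased predicate name):

```python
import re

def generalize(lexicon_entries, new_word, category):
    """Sketch of the generalization technique described in [2]: reuse
    the meaning templates of existing entries with the same CCG
    category, substituting the new word's (lowercased) predicate name
    for the old word's. `lexicon_entries` is a list of
    (word, category, meaning) triples."""
    learned = set()
    for word, cat, meaning in lexicon_entries:
        if cat != category:
            continue
        new_meaning = re.sub(rf"\b{re.escape(word.lower())}\b",
                             new_word.lower(), meaning)
        learned.add((new_word, category, new_meaning))
    return learned

dictionary = [
    ("Vincent", "N",          "vincent"),
    ("Vincent", "N",          "#x.vincent(x)"),
    ("takes",   "(S\\NP)/NP", "#w.#z.(w@#x.takes(z,x))"),
    ("plane",   "N",          "#x.plane(x)"),
    ("boxer",   "N",          "#x.boxer(x)"),
]
for entry in sorted(generalize(dictionary, "Mia", "N")):
    print(entry)
```

Running this reproduces the two lexical items learned for Mia in the example above, with duplicates from plane and boxer collapsing into one entry.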

3.8. CCG parser graph

The CCG parser graph visually displays the parse tree produced by the CCG parser for an input sentence. The inputs of the CCG parser module are the sentence (or a file containing multiple sentences) we want to parse, and an optional path to a file containing words and the additional categories we want to use for them. The parser parses the input sentence and outputs its CCG parse tree. Our CCG parser is based on ASPccgTk [3] with some modifications, such as using the Stanford Part-Of-Speech tagger [4] instead of the C&C Part-Of-Speech tagger [5] to improve accuracy. Following is an illustration of the use of the CCG parsing module. As inputs to the CCG parser GUI, we provide:
Syntax = ./examples/sample/syntax.txt
Sentence = Every boxer walks
In this example, syntax.txt contains:
takes       (S\NP)/NP 
Every       (S/(S\NP))/NP 
Some        (S/(S\NP))/NP 
walks       S\NP 
fights      S\NP 
loves       (S\NP)/NP
Fig. 8: The CCG parser module GUI
The output of the CCG parser is a parse tree structure of the sentence “Every boxer walks”.
Fig. 9: Graph representation of the CCG parse tree
In ASP format, the output of the CCG parse tree of the sentence “Every boxer walks” is as follows:
nl2kr_token(t1, "Every", "(S/(S\NP))/NP", 1). 
nl2kr_token(t2, "boxer", "NP", 2). 
nl2kr_token(t3, "walks", "S\NP", 3). 
nl2kr_token(t4, "Every boxer", "S/(S\NP)",1). 
nl2kr_token(t5, "Every boxer walks", "S", 1). 
nl2kr_child_left(t4, t1). 
nl2kr_child_right(t4, t2). 
nl2kr_child_left(t5, t4). 
nl2kr_child_right(t5, t3). 
The predicate nl2kr_valid_rootNode is used to specify the root node of the parse tree (which corresponds to the whole sentence). nl2kr_token is used to specify nodes in the tree, while nl2kr_child_left and nl2kr_child_right specify the left and right children of a node. The four arguments of nl2kr_token are, respectively: the node ID, the corresponding phrase, its CCG category and its starting position in the sentence.
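To see how these facts encode the tree, the sketch below (plain Python, independent of NL2KR) parses the facts above and rebuilds the parse tree, taking as root the node that never appears as a child:

```python
import re

# Rebuild the CCG parse tree from the nl2kr_token / nl2kr_child_* facts.
asp_facts = '''
nl2kr_token(t1, "Every", "(S/(S\\NP))/NP", 1).
nl2kr_token(t2, "boxer", "NP", 2).
nl2kr_token(t3, "walks", "S\\NP", 3).
nl2kr_token(t4, "Every boxer", "S/(S\\NP)", 1).
nl2kr_token(t5, "Every boxer walks", "S", 1).
nl2kr_child_left(t4, t1).
nl2kr_child_right(t4, t2).
nl2kr_child_left(t5, t4).
nl2kr_child_right(t5, t3).
'''

tokens = {m[1]: (m[2], m[3]) for m in re.finditer(
    r'nl2kr_token\((\w+), "([^"]*)", "([^"]*)"', asp_facts)}
children = {}
for parent, child in re.findall(r'nl2kr_child_(?:left|right)\((\w+), (\w+)\)',
                                asp_facts):
    children.setdefault(parent, []).append(child)

# The root is the only node that never appears as a child.
root = next(t for t in tokens
            if t not in {c for cs in children.values() for c in cs})

def tree_lines(node, depth=0):
    """Render the subtree at `node` as indented 'phrase [category]' lines."""
    phrase, category = tokens[node]
    lines = ["  " * depth + f"{phrase} [{category}]"]
    for child in children.get(node, []):
        lines.extend(tree_lines(child, depth + 1))
    return lines

print("\n".join(tree_lines(root)))
```

The printed tree matches the graph in Fig. 9: the sentence node at the top, with “Every boxer” and “walks” beneath it.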

3.9. NL2KR-L: the learning module of NL2KR

To run the learning module NL2KR-L we need to provide the initial lexicon file path, the override file for syntax categories (optional), the training data file, and the output dictionary file path (optional) in the RunConfiguration file of the NL2KR-L. Following is an example of inputs to the NL2KR-L GUI:
Training = ./examples/sample/train.txt 
Lexicon = ./examples/sample/dictionary.txt 
Syntax = ./examples/sample/syntax.txt 
Output = ./examples/sample/train_dictionary.txt
The preceding snippet specifies that the training data is in ./examples/sample/train.txt, the initial lexicon is in ./examples/sample/dictionary.txt, and the override syntax categories are in ./examples/sample/syntax.txt. The override syntax categories are used in the CCG parsing step, as shown in the previous subsection. If the output path is not specified, the output dictionary is saved as dictionary_train.out in the ./output folder. The training data file contains the training sentences and their formal representations, such as:
Some boxer walks      EX. (boxer(X) ^ walk(X)) 
John takes a plane    EX. (plane(X) ^ takes(john, X)) 
John walks            walk(john)
In the above, EX denotes ∃X. The initial lexicon contains words and their meanings that we already know:
John      N            john 
takes     (S\NP)/NP    #w. #z. (w@ #x. takes(z,x) ) 
plane     N            #x. plane(x) 
boxer     N            #x. boxer(x) 
fights    S\NP         #x. fight(x)
The NL2KR-L sub-part learns the new meanings of words over multiple iterations. It stops when it cannot learn any new word, and then provides an interactive GUI in which the user can assist the system by manually providing meanings for words. Below we present the usage of the system with example inputs.

When NL2KR-L attempts to learn using the training data and the initial lexicon, it first parses each of the training sentences and uses the meanings of the words it already knows from its lexicon to fill in the meanings in the parse trees. For the words whose meanings it does not know, it uses generalization and inverse lambda application to determine meanings, and adds those new meanings to the lexicon. A sentence is marked as complete if its computed meaning is equivalent to its expected meaning from the training data. As long as new words are learned, the training data is processed repeatedly, until no new words can be learned from it. If no more new words can be learned but not all sentences are marked as complete, the interactive GUI is presented. The GUI shows parse trees for each of the incomplete sentences and allows the user to let the system directly learn the meanings of words. With the example input, NL2KR-L was able to correctly learn the meanings of 2 of the 6 sentences. The remaining 4 sentences were not completely learned and are thus shown in the interactive GUI.
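The iterative procedure just described can be sketched as a fixed-point loop. The stand-in `process` function below merely pretends to learn one unknown word per sentence per pass; the real system parses each sentence and runs inverse-lambda and generalization:

```python
# Sketch of the NL2KR-L outer loop with toy stand-in functions.
def process(sentence, lexicon):
    """Toy stand-in for parsing + learning: a sentence counts as
    'complete' when every word has a meaning in the lexicon."""
    unknown = [w for w in sentence.split() if w not in lexicon]
    if not unknown:
        return True, {}
    # Pretend we learned the meaning of one unknown word.
    return False, {unknown[0]: f"meaning_of_{unknown[0]}"}

def learn(sentences, lexicon):
    while True:
        learned_something = False
        incomplete = []
        for sentence in sentences:
            complete, new_entries = process(sentence, lexicon)
            if new_entries:
                lexicon.update(new_entries)
                learned_something = True
            if not complete:
                incomplete.append(sentence)
        if not learned_something:
            # No progress on this pass: hand the rest to the user (GUI).
            return incomplete

lexicon = {"John": "john", "walks": "#x.walk(x)"}
print(learn(["John walks", "John takes a plane"], lexicon))  # -> []
print(sorted(lexicon))
```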
Fig. 10: The NL2KR-L interactive GUI
Fig. 10 shows the interface of the interactive GUI. On the top row are the learning process controls:
  • Previous and Next control which sentence to focus on and manually provide meanings for its words.
  • Retry Learning tells NL2KR-L to re-process the training data. Use this after you’ve manually provided meanings for some of the words in these sentences and want NL2KR-L to attempt automated learning again.
  • Exit Learning tells NL2KR-L to stop the learning process and write the new learned meanings to the output file.
In the left side are the controls specific to the current sentence:
  • Save Changes to Lexicon takes all the new meanings you’ve provided for any of the words in the current sentence and puts them in the lexicon. Note that this does not yet write the new meanings to the output file!
  • Export Changes to Clipboard takes all the new meanings you’ve provided for any of the words in the current sentence and puts them on the clipboard. This clipboard data can then be pasted into any lexicon file as needed.
  • Reset undoes any meanings you’ve provided for the current sentence.
  • Lexicon Changes shows the list of all the new meanings you’ve provided for any of the words in the current sentence. This is essentially a list of pending changes to the lexicon; clicking Save Changes to Lexicon will put the meanings in this list into the lexicon.
  • History shows all the changes to the parse tree on the right. This allows you to revert any changes you make by double-clicking on the time you want to revert to. This is helpful if you overwrite a meaning of a word with something that you didn’t want.
Finally, on the right side is the parse tree graph that is the main focus of the interactive GUI. Each node in the parse tree displays four pieces of information:
  • the word (e.g.: She watches TV)
  • the syntax category (e.g.: [S])
  • the expected meaning, also called the expected lambda expression or ELE (e.g.: watch(she,TV))
  • the current meaning, also called the current lambda expression or CLE (e.g.: null)
This parse tree graph is interactive and “live”. Clicking the red circle of a node pops up a dialog that allows you to change the meaning of that node. The dialog displays that node’s word, syntax, and CLE. You can change the CLE by typing in the box and pressing Update. If NL2KR-L was able to determine any possible meanings of that word in any of its parse trees, you can also select one of those meanings by expanding the CLE dropdown. Another way to change the CLE of the node is to press Use ELE, which will set the node’s CLE to its ELE. Finally, you can expand the Generalize dropdown to see what meanings could be determined by using the generalization algorithm.
Fig. 11: The node meaning dialog
This interface is “live” in that any changes you make to a node immediately propagate to other nodes; manually changing a node’s CLE through the node meaning dialog is not the only way a node’s CLE can change. In fact, there are 5 ways for a node N to be updated, corresponding to the 5 labels in the legend (besides Unchanged):
  1. Manual: click on N and manually select or enter its new current lambda, or generalize it
  2. Expected: if N is a leaf node and has an expected lambda, it will automatically use that as its current lambda
  3. Inverse: N’s current lambda will be updated to the inverse lambda computed using N’s parent’s lambda and sibling’s lambda (or not updated at all if the inverse could not be computed), if:
    • N has a current lambda, N’s parent and sibling both have lambdas, and N’s sibling is updated, OR
    • N has no current lambda, N’s parent and sibling both have lambdas, and either is updated
  4. Combined: if N does not have a current lambda, both of N’s children have lambdas, and either of those children are updated, then N’s lambda will be computed by combining the child lambdas. If the combination could not be computed, then N will not be updated
  5. Single Parent/Child: if N is a single child and its parent’s lambda is updated, or if N has a single child and that child’s lambda is updated, then N will use that lambda as its current lambda
Updating nodes also partially applies the learning algorithm in order to propagate expected lambdas downward. A sentence can be considered learned when every node’s current lambda matches its expected lambda. Expected lambdas can be propagated in the following ways:
  • Manually updating a node N will set its expected lambda to the new current lambda
  • An updated node N that has a sibling and a parent with an expected lambda will set the expected lambda of N’s sibling to the inverse computed using N’s current lambda and the parent’s expected lambda
  • A node N that has an updated expected lambda and a single child will set the expected lambda of that child to N’s expected lambda
For example, if we wanted to fully learn the sentence “She watches TV” as presented in Fig. 10, the process might be:
  • Manually update the node “She [NP]” to have the current lambda she. This will set its expected lambda to she. Because its parent has an expected lambda, watch(she,TV), the system computes the inverse using the current lambda she and the expected lambda watch(she,TV), obtaining #x.watch(x,TV). This becomes the new expected lambda of the node “watches TV [S\NP]”. See Fig. 12.
  • Manually update the node “watches [(S\NP)/NP]” to have the current lambda #x1.#x.watch(x,x1). This will set its expected lambda to equal that current lambda. Because the parent has an expected lambda, #x.watch(x,TV), the system computes the inverse using its current lambda and the parent’s expected lambda, obtaining TV. This becomes the new expected lambda of the node “TV [NP]”.
  • Because the node “TV [NP]” has a single child, its expected lambda TV also becomes the expected lambda of its child node, “TV [N]”. Because the node “TV [N]” is a leaf node, it will automatically use TV as its current lambda.
  • Because the node “TV [N]” was updated and has a single parent, the node “TV [NP]” also gets the current lambda of TV.
  • Because the nodes “watches [(S\NP)/NP]” and “TV [NP]” have current lambdas and their parent (“watches TV [S\NP]”) does not, the nodes combine their lambdas to obtain #x.watch(x,TV) and set their parent’s current lambda to this combination.
  • Because the nodes “She [NP]” and “watches TV [S\NP]” have current lambdas and their parent (“She watches TV [S]”) does not, the nodes combine their lambdas to obtain watch(she, TV) and set their parent’s current lambda to this combination.
  • All nodes in the tree now have current lambdas matching their expected lambdas, and the sentence is complete. See Fig. 13.
Fig. 12: Manually applying the meaning she to the node “She [NP]”
Fig. 13: The complete tree of the fully learned sentence
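The bottom-up combination steps in the walkthrough can be checked with plain Python closures standing in for the lambda expressions (a sketch only; NL2KR combines symbolic lambda terms, and the meaning assumed here for watches follows the same shape as that of loves in Section 3.6):

```python
# Closures standing in for the lambda expressions of the parse tree.
watches = lambda x1: lambda x: f"watch({x},{x1})"  # watches [(S\NP)/NP]
tv = "TV"                                          # TV [NP]
she = "she"                                        # She [NP]

watches_tv = watches(tv)    # combine: #x.watch(x,TV), i.e. watches TV [S\NP]
sentence = watches_tv(she)  # combine: the full sentence She watches TV [S]
print(sentence)  # -> watch(she,TV)
```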
Notice in Fig. 13 that the Lexicon Changes pane shows the new meanings of the words “She”, “watches” and “TV” that previously could not be figured out by NL2KR-L on its own. If you Export Changes to Clipboard, the following new lexicon entries will be in your clipboard:
TV         N            TV
watches    (S\NP)/NP    #x1.#x.watch(x,x1)
She        NP           she
You can manually paste these into your initial lexicon file in order to update it that way. Alternatively, you can Save Changes to Lexicon and then Retry Learning. NL2KR-L will re-process the training data and will correctly parse the sentence that we just manually taught it, will mark it as complete, and will subsequently only show the three remaining sentences in the interactive GUI. Once you’ve decided to stop the learning process, you can Exit Learning and NL2KR-L will output the new, final dictionary including all the new meanings of words.

3.10. NL2KR-T: the translation sub-part of NL2KR

Similar to NL2KR-L, in the GUI or the RunConfiguration file of NL2KR-T, we need to set the lexicon file path, the override file for syntax categories (optional), and the testing data file as given below.
Data = ./examples/sample/test.txt 
Lexicon = ./output/train_dictionary.txt
Syntax = ./examples/sample/syntax.txt
In this example, the testing data is in ./examples/sample/test.txt, the lexicon is in ./output/train_dictionary.txt, and the override syntax categories are in ./examples/sample/syntax.txt. The lexicon here is the learned lexicon output by NL2KR-L. The content of test.txt is
Mia likes sleep     likes(mia, sleep) 
John eats rice      eats(john, rice)
Note that the expected translation in “test.txt” is optional. Without it, the evaluation cannot determine automatically whether the translation was correct, but NL2KR-T will attempt a translation nonetheless. Running NL2KR-T with the inputs specified above, we have the following output:
Fig. 14: The translation output interface
We see that John eats rice is correctly translated to eats(john, rice) as expected.
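The evaluation step can be sketched as follows (an illustration, not NL2KR-T's actual code; `translate` is a hypothetical stand-in for the full CCG-parse-and-compose pipeline):

```python
# Check translations against the optional expected forms in test.txt.
def evaluate(test_lines, translate):
    """Each line is 'sentence<TAB>expected'; the expected part may be
    missing, in which case the line is translated but not scored."""
    correct = total = 0
    for line in test_lines:
        sentence, _, expected = line.partition("\t")
        result = translate(sentence)
        print(f"{sentence} -> {result}")
        if expected:  # the expected translation is optional
            total += 1
            correct += result.replace(" ", "") == expected.replace(" ", "")
    return correct, total

toy_translations = {"John eats rice": "eats(john, rice)"}
print(evaluate(["John eats rice\teats(john, rice)"], toy_translations.get))
# -> (1, 1)
```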

4. Future Work

In the future, we plan to make NL2KR more scalable and to add more features to the NL2KR system, such as:
  • Automatically constructing the initial lexicon
  • Using more knowledge, such as word sense, to select the correct meanings of words

5. References

  1. Montague, R.: English as a Formal Language. In Thomason, R.H., ed.: Formal Philosophy: Selected Papers of Richard Montague. Yale University Press, New Haven, London (1974) 188–222
  2. Baral, C., Dzifcak, J., Gonzalez, M.A., Zhou, J.: Using Inverse lambda and Generalization to Translate English to Formal Languages. CoRR abs/1108.3843 (2011)
  3. Lierler, Y., Schüller, P.: Parsing Combinatory Categorial Grammar via Planning in Answer Set Programming. In Erdem, E., Lee, J., Lierler, Y., Pearce, D., eds.: Correct Reasoning. Volume 7265 of Lecture Notes in Computer Science., Springer (2012) 436–453
  4. Toutanova, K., Klein, D., Manning, C.D., Singer, Y.: Feature-rich part-of-speech tagging with a cyclic dependency network. In: NAACL ’03: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, Morristown, NJ, USA, Association for Computational Linguistics (2003) 173–180
  5. Clark, S., Curran, J.R.: Parsing the WSJ using CCG and log-linear models. In: ACL ’04: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, Morristown, NJ, USA, Association for Computational Linguistics (2004) 103
