Cheonkam's Deep Learning Space: [Python Code - NLU] A Natural Language Understanding (NLU) Task: the Java Speech Grammar Format (JSGF) Development using Python

A Natural Language Understanding (NLU) Task: the Java Speech Grammar Format (JSGF) Development using Python

Code: https://github.com/ninackjeong/nlu-challenge

The following is a context-free grammar written in JSGF:

 #JSGF V1.0 utf-8 en;
 grammar music_play;

 public <music_play> =
	[can you] (play | put on) (<artist> | <song>);

 <artist> =
	 the beatles |
	 radio head |
	 lady gaga |
	 pink floyd;

 <song> =
	 comfortably numb |
	 paranoid android |
	 let it be |
	 hey jude;

This grammar generates utterances that express the desire or intent to play music. These utterances are then used as training data for statistical intent-recognition models. Using a custom parser, the grammar can generate tagged utterances such as the following:

[can you play]<unk> [the beatles]<artist>
[can you put on]<unk> [paranoid android]<song>
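The expansion step can be sketched in plain Python. This is a minimal illustration, not the custom parser from the challenge: the function name `expand_music_play` is hypothetical, and the grammar is hard-coded as Python lists rather than parsed from the JSGF file.

```python
from itertools import product

# Slot lexicons copied from the <artist> and <song> rules above.
ARTISTS = ["the beatles", "radio head", "lady gaga", "pink floyd"]
SONGS = ["comfortably numb", "paranoid android", "let it be", "hey jude"]


def expand_music_play():
    """Yield every tagged utterance licensed by <music_play> (hypothetical helper)."""
    prefixes = ["", "can you"]   # [can you] is optional
    verbs = ["play", "put on"]   # (play | put on)
    fillers = [(a, "artist") for a in ARTISTS] + [(s, "song") for s in SONGS]
    for prefix, verb, (phrase, tag) in product(prefixes, verbs, fillers):
        # Everything outside a slot is merged into one <unk> chunk.
        carrier = " ".join(word for word in (prefix, verb) if word)
        yield f"[{carrier}]<unk> [{phrase}]<{tag}>"


utterances = list(expand_music_play())
print(len(utterances))  # 2 prefixes * 2 verbs * 8 fillers = 32
print(utterances[0])
```

Every optional or alternative element doubles (or more) the number of outputs, which is why even this small grammar already yields 32 utterances.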

Task 1: Extend the English Grammar

Extend the English JSGF development grammar ("jsgf_eng_basic_ruls.txt" under "eng") so that it can cover at least the following utterances.

[i want to listen to]<unk> [jazz]<genre> [music]<unk>
[play me]<unk> [ummagumma]<album> [by]<unk> [pink floyd]<artist>
[put]<unk> [lady gaga]<artist> [on]<unk>
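One way to reason about the extension is to treat each target utterance as a template with named slots. The sketch below is an assumption about how such templates could be expanded, not the rules in the repository; the `SLOTS` and `TEMPLATES` names are hypothetical, and "rock" is an invented genre entry added only to show a second alternative.

```python
import re
from itertools import product

# Hypothetical slot lexicons. Only "jazz", "ummagumma" and the artists come from the post.
SLOTS = {
    "genre": ["jazz", "rock"],
    "album": ["ummagumma"],
    "artist": ["the beatles", "radio head", "lady gaga", "pink floyd"],
}

# One template per target utterance.
TEMPLATES = [
    "i want to listen to <genre> music",
    "play me <album> by <artist>",
    "put <artist> on",
]

SLOT_RE = re.compile(r"<(\w+)>")


def expand(template):
    """Yield tagged utterances for one template."""
    # Splitting on a capture group alternates literal text and slot names.
    parts = SLOT_RE.split(template)
    literals, names = parts[0::2], parts[1::2]
    for values in product(*(SLOTS[name] for name in names)):
        chunks = []
        for i, literal in enumerate(literals):
            if literal.strip():
                chunks.append(f"[{literal.strip()}]<unk>")
            if i < len(values):
                chunks.append(f"[{values[i]}]<{names[i]}>")
        yield " ".join(chunks)


generated = {utterance for template in TEMPLATES for utterance in expand(template)}
```

Checking that the three target utterances are members of `generated` confirms coverage. Note that "play me <album> by <artist>" pairs every album with every artist, which matters for the semantic mismatch discussed later in the post.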

Task 2: Localize the JSGF grammar in your language (Korean, here)

Localize the extended English grammar from the above task in Korean.

Considerations

Korean is an SOV language
Korean marks case with case markers, whereas English marks it syntactically (through word order)
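Both considerations can be illustrated with a small sketch. It is an assumption for illustration only: the grammar in the repository does not split case markers from words, the verb form 틀어줘 ("play it for me") is my own choice, and the helper names are hypothetical. The object marker is 을 after a syllable with a final consonant and 를 otherwise, which can be computed from the Unicode layout of precomposed Hangul syllables.

```python
HANGUL_BASE = 0xAC00  # first precomposed syllable, 가
HANGUL_LAST = 0xD7A3  # last precomposed syllable, 힣


def has_final_consonant(syllable):
    """True if a precomposed Hangul syllable ends in a consonant (batchim)."""
    code = ord(syllable)
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        raise ValueError(f"not a Hangul syllable: {syllable!r}")
    # Each initial+vowel pair has 28 final slots; slot 0 means "no final".
    return (code - HANGUL_BASE) % 28 != 0


def object_marker(noun):
    """Pick the accusative marker from the noun's last syllable."""
    return "을" if has_final_consonant(noun[-1]) else "를"


def korean_requests(obj, verb="틀어줘"):
    """Object-verb order, with and without the (optional) case marker."""
    return [f"{obj} {verb}", f"{obj}{object_marker(obj)} {verb}"]


print(korean_requests("핑크 플로이드"))
print(korean_requests("음악"))
```

The verb comes last in both variants, and each object yields two surface forms because the marker may be dropped. This is one reason a Korean grammar tends to need more alternatives than its English counterpart.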

Procedure

Install pyjsgf, which can construct JSGF grammars, compile them into strings or files, and find the grammar rules that match speech strings. Type the following in a terminal.

$ pip install pyjsgf

If you are installing pyjsgf in order to develop it, clone or download the repository, move to its root directory, and run the following.

$ pip install -e .

Prepare the example utterances in Korean
"kor_test_utters.txt" under "kor": a Korean version of the English test utterances, translated by me

Write the Korean grammar and test it
"jsgf_kor_extended_rules.txt" under "kor": the Korean grammar (Note: I did not split case markers and particles from words)
"test_kor_grammar.ipynb" under "kor": a script to check whether the test utterances are covered by the grammar

Generate Korean utterances, which include at least the test utterances
"jsgf_kor_extended_rules_generated_utterance.txt": utterances generated by the Korean rules using "kor_utters_generator.ipynb"
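The coverage check in the last two steps amounts to a set comparison between the test utterances and the generated ones. The sketch below is not the code in "test_kor_grammar.ipynb"; the function names are hypothetical, and the sample data is English, taken from the task description, because the contents of the Korean files are not reproduced in this post.

```python
import re

# Matches one tagged chunk such as "[lady gaga]<artist>" and captures the text.
TAG_RE = re.compile(r"\[([^\]]*)\]<\w+>")


def strip_tags(utterance):
    """Reduce a tagged utterance to plain text; untagged input is only normalized."""
    chunks = TAG_RE.findall(utterance)
    return " ".join(chunks) if chunks else " ".join(utterance.split())


def uncovered(test_utterances, generated_utterances):
    """Return the test utterances that the generated set does not contain."""
    generated = {strip_tags(u) for u in generated_utterances}
    return [u for u in test_utterances if strip_tags(u) not in generated]


generated = [
    "[put]<unk> [lady gaga]<artist> [on]<unk>",
    "[play me]<unk> [ummagumma]<album> [by]<unk> [pink floyd]<artist>",
]
tests = [
    "put lady gaga on",
    "[i want to listen to]<unk> [jazz]<genre> [music]<unk>",
]
print(uncovered(tests, generated))
```

An empty result means every test utterance is covered. Here the "jazz" utterance is reported because the sample generated list deliberately leaves it out.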

Think about the following:

Caveat: The following are just my own opinions.

Any possible issue, if you were asked to extend the grammar considerably? Can you think about how to overcome it?

Issue 1: This bottom-up approach is labor-intensive and error-prone, and it risks lowering model accuracy. (I am not talking about JSGF per se; I am talking about how I approached this problem, writing rules from the given utterances, and possibly about the way the given basic rules are written.)
Solution to Issue 1: Writing rules at a higher level may mitigate this. For instance, the utterances could be categorized by sentence type (e.g., imperatives).
Issue 2: This approach encodes only weak semantic relations.
Solution to Issue 2: One possible solution is to construct semantic rules as well. However, this may duplicate work. Thus, another way of incorporating them is needed, such as a kind of dependency parsing at multiple levels or a constraint-based approach.
Comments: The aforementioned problems can be more serious in morphologically rich languages, such as Korean. Specifically, Korean has a large number of particles, including case markers, but they are optional depending on context or situation. Topic pro-drop is in the same vein. In addition, the honorific systems of languages such as Korean and Japanese require modeling various conjugated forms and sentence types.

Which features of Korean complicate the localization? Can you think about how to work around these complications?

Issue 1: The major challenges were related to the different grammatical rules of Korean and English. The most obvious difference lies in word order, or headedness.
Solution to Issue 1: The order-related issue was not actually a major challenge, as it was easily solved by switching positions in the higher-level rules.
Issue 2: The major challenge was that Korean is a morphologically rich language, while syntax plays the pivotal role in English. Thus, Korean requires a greater variety of word forms to be considered.
Comments: All the aforementioned features of Korean are indeed challenging to model. However, a closer examination led me to conclude that the major problem in the extended grammar might not be due to language-specific features, but to the way the grammatical rules were written and/or work. To be specific, the extended grammar produced not just correct utterances but also semantically incorrect ones. The erroneous utterances were due to a mismatch between the song or album and the artist. For instance, Ummagumma is a Pink Floyd album, not a Beatles or Lady Gaga album. However, these errors were also found in the corresponding English sentences, such as "i 'd like to hear ummagumma by radiohead." Thus, this mismatch (the major error) was not due to language-specific features of Korean or English; rather, it is possibly related to the lack of firm semantic relations between artists and their songs or albums, which may not be handled gracefully without human intervention.
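The album and artist mismatch can be made concrete with a small sketch. It assumes a hand-written catalog that maps each work to its artist; the `CATALOG` dictionary and the `by_artist_utterances` helper are hypothetical and are not part of the grammar files in the repository.

```python
from itertools import product

# Work -> artist facts for the titles mentioned in this post.
CATALOG = {
    "ummagumma": "pink floyd",
    "comfortably numb": "pink floyd",
    "paranoid android": "radiohead",
    "let it be": "the beatles",
    "hey jude": "the beatles",
}
ARTISTS = ["the beatles", "radiohead", "lady gaga", "pink floyd"]


def by_artist_utterances(carrier="i 'd like to hear", constrained=True):
    """Yield '<work> by <artist>' utterances, optionally filtered by the catalog."""
    for work, artist in product(CATALOG, ARTISTS):
        if constrained and CATALOG[work] != artist:
            continue  # skip pairs such as "ummagumma by radiohead"
        yield f"{carrier} {work} by {artist}"


free = list(by_artist_utterances(constrained=False))
checked = list(by_artist_utterances(constrained=True))
print(len(free), len(checked))  # 20 unconstrained pairs, 5 consistent ones
```

A context-free rule that combines the two slots independently produces all 20 pairs. The constraint lives outside the grammar, which is the "human touch" mentioned above: someone has to supply and maintain the catalog.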

One last (linguistic) thought (also my opinion): Human languages are not context-free; rather, they lie between context-free and context-sensitive grammars in the Chomsky hierarchy. Although language-specific features of Korean make it difficult to build a model, the major challenge may be more related to how the computer parses human languages. Fortunately, human languages are similar in terms of deep structure. It may be impossible to build a perfect model, but if we abstract the models, parameterize language-specific features, add constraints on semantic relations or incorporate semantic information, and build processes to check both low-level and high-level grammars, a fancier (multilingual) model can be built.
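As a rough illustration of parameterizing language-specific features, the sketch below shares one abstract rule (a verb plus an object) across languages and leaves only the word order and the lexicon to per-language settings. The `LANGUAGES` table, the `verb_final` flag, and the Korean verb form 틀어줘 are all assumptions made for this example.

```python
# Hypothetical per-language parameters for one shared abstract rule.
LANGUAGES = {
    "eng": {"verb_final": False, "verbs": ["play", "put on"]},
    "kor": {"verb_final": True, "verbs": ["틀어줘"]},
}


def realize(lang, obj):
    """Linearize the abstract rule (verb, object) for the given language."""
    params = LANGUAGES[lang]
    for verb in params["verbs"]:
        # Head-final languages place the verb after its object.
        words = [obj, verb] if params["verb_final"] else [verb, obj]
        yield " ".join(words)


print(list(realize("eng", "pink floyd")))
print(list(realize("kor", "핑크 플로이드")))
```

Switching the order at this level is cheap, which matches the observation that word order was not the hard part of the localization. Morphology and semantic constraints would need additional parameters or checks on top of this.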

Sunday, December 4, 2022

