Consider the following sentences written in Klingon. For each sentence, the part of speech of each "word" has been given (for ease of translation, some prefixes/suffixes have been treated as words), along with a translation.

N        PRO   V    N          PRO
pa'Daq   ghaH  taH  tera'ngan  'e'
room (inside)  he   is   human   of
"The human is in the room."

V            N       V     N
ja'chuqmeH   rojHom  neH   jaghla'
in order to parley   truce   want   enemy commander
"The enemy commander wants a truce in order to parley."

N          V    N     CONJ  N     V    N
tera'ngan  qIp  puq   'ej   puq   qIp  tera'ngan
human      bit  child and   child bit  human
"The child bit the human, and the human bit the child."

The Klingon language is the language of the fictional Klingons in the Star Trek universe. It is a constructed language; its basic sound, along with a few words, was first devised by actor James Doohan (Scotty) for Star Trek: The Motion Picture, which marked the first time the language had been heard on screen. In all previous appearances, Klingons spoke in English.

NOTE: The apostrophe (') is considered to be part of the Klingon word. There is no need to remove it or treat it as separate punctuation.

Using these training sentences, we're going to build a Hidden Markov Model (HMM) to predict the part of speech of each word in an unknown sentence using the Viterbi algorithm.

Step 1: Creating the Emission probability table (emission.java or emission.py)

Create an emission probability table by computing, for all POS tags, the frequency with which each word occurs under each tag:

    Probability(word | tag) = Count(word, tag) / Count(tag)

We'll use a smoothing factor of 0.1 (as discussed in class) to make sure that no event is impossible: add this number to all of your observations. Sample table values for two parts of speech are shown below.

                CONJ  PRO
    'e'          0.1  1.1
    'ej          1.1  0.1
    ghaH         0.1  1.1
    ja'chuqmeH   0.1  0.1
    legh         0.1  0.1
    neH          0.1  0.1
    pa'Daq       0.1  0.1
    puq          0.1  0.1
    qIp          0.1  0.1
    rojHom       0.1  0.1
    taH          0.1  0.1
    tera'ngan    0.1  0.1
    yas          0.1  0.1

Step 2: Creating the Transition probability table (transition.java or transition.py)

Generate a transition probability table by calculating the transition frequencies from one POS tag to another. For each part of speech, total the number of times it transitioned to each other part of speech. After you've done this, compute the start and transition probabilities:

    Probability(tag_i | tag_i-1) = Count(tag_i-1, tag_i) / Count(tag_i-1)

Sample transition values for two rows of the table are shown below.

             NOUN  VERB  CONJ  PRO
    START     1.1   0.1   0.1  0.1
    NOUN      2.1   0.1   0.1  0.1

Step 3: Viterbi Decoding (decoding.java or decoding.py)

Now consider the following test sentence: "tera'ngan legh yas". Use the Viterbi algorithm (in slides) to generate the most likely sequence of POS tags for the given test sentence.

Data File: The training examples from above are already in the given text file, Klingon_Train.txt. Read in the text file to create the transition and emission probability tables. Each token is in the following format: word/POS. Each line is a new sentence, and each sentence consists of tokens separated by spaces. Python users: you can use NLTK to break the word/POS tokens into tuples.

Submission Guidelines: Submit your files on Canvas using the Programming Assignment 2 submission link.
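As a starting point, reading the training data can be sketched as below. This is a minimal sketch assuming the file layout described in the assignment (one sentence per line, tokens as word/POS); the function name read_tagged_corpus is our own, not something the assignment specifies.

```python
# Minimal sketch: read a word/POS training file (e.g. Klingon_Train.txt)
# into lists of (word, tag) tuples, one list per sentence.
# The helper name read_tagged_corpus is illustrative, not required.

def read_tagged_corpus(path):
    sentences = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            tokens = line.split()
            if not tokens:
                continue  # skip blank lines
            # rsplit on the last '/' so apostrophes inside Klingon
            # words (e.g. tera'ngan/N) stay intact
            sentences.append([tuple(tok.rsplit("/", 1)) for tok in tokens])
    return sentences
```

NLTK users can get the same per-token split with nltk.tag.str2tuple, which also separates a "word/TAG" string into a (word, tag) tuple.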
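Step 1 can be sketched as follows. This is one reasonable reading of the smoothing instruction: add 0.1 to every (word, tag) cell, then normalize each tag's row by its smoothed total to get probabilities. The input format (lists of (word, tag) tuples per sentence) and the function name are assumptions, not fixed by the assignment.

```python
from collections import Counter

SMOOTH = 0.1  # smoothing factor from the assignment

def emission_table(sentences):
    """Smoothed emission counts and probabilities P(word | tag).

    `sentences` is a list of sentences, each a list of (word, tag)
    tuples. Every (word, tag) cell gets +0.1 so no event has zero
    probability; each tag's probabilities sum to 1 over the vocabulary.
    """
    words = sorted({w for s in sentences for w, _ in s})
    tags = sorted({t for s in sentences for _, t in s})
    pairs = Counter((w, t) for s in sentences for w, t in s)
    # smoothed count for every cell of the word-by-tag table
    counts = {t: {w: pairs[(w, t)] + SMOOTH for w in words} for t in tags}
    # normalize each tag's row to turn counts into P(word | tag)
    probs = {}
    for t in tags:
        total = sum(counts[t].values())
        probs[t] = {w: counts[t][w] / total for w in words}
    return counts, probs
```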
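Step 2 can be sketched the same way, with a START pseudo-tag supplying the sentence-initial transitions. As above, the +0.1 smoothing and per-row normalization are one reasonable reading of the assignment, and the function name is illustrative.

```python
from collections import Counter

SMOOTH = 0.1  # smoothing factor from the assignment

def transition_table(sentences):
    """Smoothed transition counts and probabilities P(tag_i | tag_i-1).

    A "START" pseudo-tag is prepended to every sentence so the first
    word's tag counts as a transition out of START.
    """
    tags = sorted({t for s in sentences for _, t in s})
    bigrams = Counter()
    for s in sentences:
        seq = ["START"] + [t for _, t in s]
        for prev, cur in zip(seq, seq[1:]):
            bigrams[(prev, cur)] += 1
    rows = ["START"] + tags
    # smoothed count for every (previous tag, next tag) cell
    counts = {p: {c: bigrams[(p, c)] + SMOOTH for c in tags} for p in rows}
    # normalize each row to turn counts into P(tag_i | tag_i-1)
    probs = {p: {c: counts[p][c] / sum(counts[p].values()) for c in tags}
             for p in rows}
    return counts, probs
```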
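Step 3, the Viterbi decoder itself, can be sketched as below. It assumes transition probabilities trans[prev][tag] (with a "START" row) and emission probabilities emis[tag][word] as plain nested dicts, and that every test word has a row in the emission table (which the 0.1 smoothing guarantees); these representations and names are our assumptions, not the assignment's.

```python
def viterbi(words, tags, trans, emis):
    """Most likely tag sequence for `words` under an HMM.

    trans[prev][t] is P(t | prev), with a "START" row used for the
    first word; emis[t][w] is P(w | t).
    """
    # delta[i][t]: probability of the best tag sequence for
    # words[:i+1] ending in tag t; back[i][t]: the predecessor tag
    # that achieves it.
    delta = [{t: trans["START"][t] * emis[t][words[0]] for t in tags}]
    back = [{}]
    for w in words[1:]:
        prev, cur, ptr = delta[-1], {}, {}
        for t in tags:
            best = max(tags, key=lambda p: prev[p] * trans[p][t])
            cur[t] = prev[best] * trans[best][t] * emis[t][w]
            ptr[t] = best
        delta.append(cur)
        back.append(ptr)
    # Backtrace from the best final tag.
    path = [max(tags, key=lambda t: delta[-1][t])]
    for ptr in reversed(back[1:]):
        path.append(ptr[path[-1]])
    return path[::-1]
```

For longer sentences, summing log probabilities instead of multiplying raw probabilities avoids numeric underflow; with a three-word test sentence either works.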