Speech:Spring 2014 Colby Johnson Log



Week Ending February 4th, 2014
 * Task:

30Jan2014: My goal is to become familiar with the software and interface involved in running a Train. During this time I plan to download, install, and run any software needed to do so. This involves PuTTY and SSH-ing into Caesar.unh.edu.

02Feb2014: Today my task is to log in with my username and change the password, as well as learn how to set up an experiment and run a train.

03Feb2014: Complete a full dictionary for a first_5hr Train using the genTrans5.pl script and Eric's updateDict.pl script. Possibly attempt a Train.

04Feb2014: Attempt to build a language model for Experiment 0144

 * Results:

30Jan2014: I downloaded PuTTY and logged into Caesar as root. After reading the documentation I found that Trains cannot be run as root; doing so could result in anything from an error to crashing the system. The only progress possible was to SSH into Caesar and view the files.

02Feb2014: After getting through all the steps I have created Experiment 0144, but I encountered an error that I cannot solve. I get the error "WARNING: This word: UH was in the transcript file, but is not in the dictionary ( SO I DON'T HAVE ANY "ANY UH" RELATIVES THAT I AM OR OR UH ). Do cases match?" many times. I have been at it for about 3 hours now. I will look at it later to troubleshoot.

03Feb2014: Colby and I ran Eric's updateDict.pl without error. To our surprise, we were able to take it a step further and successfully completed Phase 3. This was a huge milestone in understanding the process of model building. Future tasks will involve updating the documentation with detailed steps for the experimental process we went through, as well as updating the CMU dictionary stored on the server.

04Feb2014: After logging on I saw that my train had run successfully and was pleased to see things going well. I went on to create a new experiment (Experiment 0147) to generate my language model. I ran the language model build and did not realize the effect it would have on Caesar. Forrest was working next to me and complained that everything had slowed to a crawl. Since then I have learned how to SSH into the machine dedicated to our team, and I will be working from there from now on. I also learned from David how to upload files to the server through FileZilla. With this I was able to upload a new dictionary file: cmudict.0.7a has about 4000 more words and corrections to some of the phonetics. I will be updating the documentation to describe the use of this new dictionary instead of the now obsolete one. The next step is to run a Decode!

 * Plan:

30Jan2014: SSH into Caesar
 1. Run Putty.exe
 2. Enter Caesar.unh.edu (port 22 and SSH should be selected by default)
 3. Log in as root (usernames did not exist at the time)
 4. Follow steps on Foss - Running a Train

02Feb2014: Log in and change username password.
 1. Log in using username (Wildcat username) and the temp password (same as username)
 2. Change the password by entering passwd, then entering the old password followed by the new password

Look around the file directories and get familiar with where things are located.
 1. Examine the Train directory structure
 2. Examine the Experiment directory structure

03Feb2014: Collaborated on all work with Colby Chenard.
 1. Create a Dictionary against our Transcription file that was generated using genTrans5.pl
 2. Using the Experiment 0166 add2.txt file and Eric's updateDict.pl, obtain a list of words missing from the Dictionary and add them to the created dictionary.
 3. Once we have a full dictionary for the first_5hr Train, attempt to run the Train.
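The missing-word check in step 2 can be sketched in Python. This is only an illustration and not what updateDict.pl actually does: the function names are hypothetical, the dictionary is assumed to be in CMU style (one "WORD PHONES" entry per line), and the transcript is assumed to use Sphinx-style <s> markers with an utterance id in parentheses. It also compares words in upper case, which the real tools may not do.

```python
import re


def load_dictionary_words(dict_lines):
    """Return the set of head words found in CMU-style dictionary lines."""
    words = set()
    for line in dict_lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and cmudict comment lines
        head = line.split()[0]
        # Alternate pronunciations look like WORD(2); drop the suffix.
        words.add(re.sub(r"\(\d+\)$", "", head).upper())
    return words


def missing_words(transcript_lines, dict_words):
    """Return the sorted transcript words that are not in the dictionary."""
    missing = set()
    for line in transcript_lines:
        for token in line.split():
            # Assumed format: skip <s>, </s> and the (utterance_id) field.
            if token.startswith("<") or token.startswith("("):
                continue
            if token.upper() not in dict_words:
                missing.add(token.upper())
    return sorted(missing)


# Illustrative sample data, not taken from the real corpus files.
dictionary = ["ANY  EH1 N IY0", "I  AY1", "HAVE  HH AE1 V", "HAVE(2)  HH AH0 V"]
transcript = ["<s> I HAVE ANY UH RELATIVES </s> (utt_0001)"]
print(missing_words(transcript, load_dictionary_words(dictionary)))
```

Each word reported here corresponds to one "was in the transcript file, but is not in the dictionary" warning, so the list can be used as the set of entries still to be added.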

04Feb2014: Create a language model with my successfully run train in Experiment 0144
 1. Create a new Experiment 0147 to create the LM
 2. Follow the documentation on Foss to generate a language model to get ready for the Decode

 * Concerns:

30Jan2014: Without usernames nothing was accomplished besides getting familiar with how to log into Caesar.unh.edu, as Trains cannot be run as root.

02Feb2014: The Train directory structure is incorrectly described on the Experiments page: "/root/speechtools/SphinxTrain-1.0/" should read "/root/tools/SphinxTrain-1.0/". Error messages are making me very discouraged.

03Feb2014: The updateDict.pl process is undocumented and therefore requires a bit of digging to learn.

04Feb2014: None. After the Train ran successfully, I was ready to move on. The LM documentation seems very straightforward and easy to follow. Eric did a terrific job!

Week Ending February 11, 2014
 * Task:

05Feb2014: Create a larger master Dictionary.

08Feb2014: Logged in (read logs)

10Feb2014: Read up on linking and how to run Decode

11Feb2014: Run a decode successfully.

 * Results:

05Feb2014: I finished the dictionary the way I wanted it. Experiment 0144 now has a larger master dictionary, which I will soon transfer into a more organized dictionary folder.

10Feb2014: I could not find out how to link files correctly from reading logs, but David ended up giving me a nice description of what to do. I should be running my Decode tomorrow.

11Feb2014: The Decode ran successfully, but only after an earlier attempt failed when the machine I was logged in from left the network.

 * Plan:

05Feb2014: Use the cmudict.0.7a dictionary and merge extra words into it
 1. Using the dictionary file generated for my Experiment 0144, merge extra words into it.
 2. I merged cmudict.0.7a into it, as well as the file that adds all the missing words for the first_5hr train.
 3. I had to automate the merge because there were a couple thousand words that had to go through it.
 4. It still took a few hours to go through. Once completed, the finished dictionary had about 136,000 words (133,000 in cmudict.0.7a)
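For readers who want to reproduce the 05Feb2014 dictionary merge, here is a minimal Python sketch of one way to combine two pronunciation dictionaries. It is not the automation that was actually used; the function name is hypothetical, and it assumes one "WORD PHONES" entry per line with the first entry for a word taking priority.

```python
def merge_dictionaries(base_lines, extra_lines):
    """Merge dictionary lines, keeping the first entry seen for each head word."""
    merged = {}
    for line in list(base_lines) + list(extra_lines):
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and cmudict comment lines
        head = line.split()[0]
        merged.setdefault(head, line)  # base entries win over extra entries
    return [merged[head] for head in sorted(merged)]


# Illustrative entries only; real input would be read from the .dic files.
base = ["HELLO  HH AH0 L OW1", "WORLD  W ER1 L D"]
extra = ["UH  AH1", "HELLO  HH EH0 L OW1"]
for entry in merge_dictionaries(base, extra):
    print(entry)
```

Because the merge is keyed on the head word, a set lookup replaces any line-by-line comparison, which should keep a merge of this size to seconds rather than hours.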

10Feb2014: Look through logs and learn how experiments are correctly run from the train to the decode. This will be valuable when providing insight to other students and giving advice on writing scripts to simplify the process.

11Feb2014: Link files and run decode
 1. I started by linking all the files from the Experiment where I ran the Train (0144) to the Experiment where I built my LM
 2. This was done by entering % ln -s /mnt/main/Exp//*. (while in the destination directory)
 3. Do the same for the LM in the original Experiment (0144) % ln -s /mnt/main/Exp//LM.
 4. Run the decode as described online.
 5. Worked great

 * Concerns:

05Feb2014: The updateDict.pl script only seems to work when you are in the experiment folder. This added a few more steps but worked out the same.
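The linking step from the 11Feb2014 plan (ln -s run from the destination directory) can also be sketched in Python. This is an illustration only: the function name and the sample directory names are hypothetical, and real experiment paths would replace the temporary folders used in the demo.

```python
import os
import tempfile
from pathlib import Path


def link_experiment_files(source_dir, dest_dir):
    """Symlink every entry of source_dir into dest_dir; skip names already present."""
    source, dest = Path(source_dir), Path(dest_dir)
    linked = []
    for entry in sorted(source.iterdir()):
        target = dest / entry.name
        if target.exists() or target.is_symlink():
            continue  # never overwrite something already in the destination
        os.symlink(entry.resolve(), target)
        linked.append(entry.name)
    return linked


# Demo with throwaway directories standing in for two experiment folders.
with tempfile.TemporaryDirectory() as root:
    train_exp = Path(root) / "train_exp"
    decode_exp = Path(root) / "decode_exp"
    train_exp.mkdir()
    decode_exp.mkdir()
    (train_exp / "etc").mkdir()
    (train_exp / "model_parameters").mkdir()
    print(link_experiment_files(train_exp, decode_exp))
```

Skipping existing names means the call can be repeated safely, for example once for the train experiment and once for the experiment that holds the LM.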

10Feb2014: I have no idea how to link files when two experiments have been used for the same decode. Hope to find some insight.

11Feb2014: I had attempted to make a master dictionary in this experiment folder after I found that updateDict.pl only works in the experiment folders. In the end, all I did was revert to an older version of my dictionary and run the decode.

Week Ending February 18, 2014
 * Task:

12Feb2014:
 * Run 10 different trains in 10 different experiments
 * Each train will have a senone value of 3000
 * Each train will have a varying density: 8, 16, 32, 64, 128
 * The trains will use one of two Corpus Switchboard training files:
 * /mnt/main/corpus/switchboard/first_5hr/train
 * /mnt/main/corpus/switchboard/first_5hr/test

13Feb2014:
 * Run Trains
 * Solve errors that were stopping me
 * Build LM (one for each Experiment, located in the same Experiment number)
 * Begin Decode Process

14Feb2014:
 * Verify that Decodes worked correctly
 * Score all decodes
 * Compare results
 * Run identical experiments w/ genTrans5 (supposedly will yield better results)

16Feb2014:
 * Finish Decodes
 * Collect and post data
 * Create graph to visually represent the data in a clear fashion
 * Point out obvious inefficiencies.

 * Results:

12Feb2014: I created 10 different experiments to reserve a block in the Experiment list. I followed up by giving each Experiment a page and a title, to serve as a placeholder on the documentation side as well.

13Feb2014: Not all the trains ran successfully. The ones that used first_5hr/test would not run. I decided to work on a solution to these later; I wanted to create some data to work with.

14Feb2014: The results show a nice curve. This will be graphically represented in the future once more data has been compiled. Notice how Experiment 0171 had a WER about 5 percentage points lower than 0161 (37.4% versus 42.3%). The only variable was the genTrans file.

16Feb2014: Data looks great! I plotted some nice curves, which should really give an idea of how this system works and where some obvious inefficiencies lie. I posted the data and notes at the bottom. GenTrans5 turned out to be better. Now the question is why?

Each Experiment was laid out as follows:

EXP 0161:
 * Transcription: first_5hr/train
 * Senone Value: 3000
 * Density: 8
 * Dictionary: /mnt/main/corpus/dist/custom/first_5hr_train_full.dic
 * Train Time: 1 Hr 3 Min
 * Decode Time: 5.53 Hours

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 4659  68616 | 61.0   20.2   18.8    3.3   42.3   80.8 |
 |=================================================================|
 |  Mean   | 58.2  857.7 | 61.8   20.0   18.1    3.6   41.8   81.8 |
 |  S.D.   | 22.1  330.0 | 10.6    5.9    5.5    2.5   10.5   10.6 |
 | Median  | 55.5  813.0 | 63.4   19.4   17.5    2.9   40.8   83.5 |
 `-----------------------------------------------------------------'

EXP 0163:
 * Transcription: first_5hr/train
 * Senone Value: 3000
 * Density: 16
 * Dictionary: /mnt/main/corpus/dist/custom/first_5hr_train_full.dic
 * Train Time: 1 Hr 17 Min
 * Decode Time: 4.77 Hours

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 4659  68616 | 68.5   14.9   16.5    2.2   33.6   71.8 |
 |=================================================================|
 |  Mean   | 58.2  857.7 | 69.5   14.5   16.0    2.4   32.9   72.4 |
 |  S.D.   | 22.1  330.0 | 11.5    6.1    5.9    1.9   11.2   10.9 |
 | Median  | 55.5  813.0 | 70.6   12.9   14.9    1.7   31.2   73.1 |
 `-----------------------------------------------------------------'

EXP 0165:
 * Transcription: first_5hr/train
 * Senone Value: 3000
 * Density: 32
 * Dictionary: /mnt/main/corpus/dist/custom/first_5hr_train_full.dic
 * Train Time: 2 Hr 5 Min
 * Decode Time: 5.25 Hours

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 4659  68616 | 73.0   12.5   14.5    1.5   28.5   61.6 |
 |=================================================================|
 |  Mean   | 58.2  857.7 | 74.0   12.0   14.0    1.7   27.6   62.1 |
 |  S.D.   | 22.1  330.0 | 11.9    6.3    6.3    1.3   11.6   11.3 |
 | Median  | 55.5  813.0 | 76.2   10.9   13.0    1.3   24.9   61.3 |
 `-----------------------------------------------------------------'

EXP 0167:
 * Transcription: first_5hr/train
 * Senone Value: 3000
 * Density: 64
 * Dictionary: /mnt/main/corpus/dist/custom/first_5hr_train_full.dic
 * Train Time: 3 Hr 2 Min
 * Decode Time: 5.68 Hours

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 4659  68616 | 70.2   10.0   19.7    1.3   31.1   58.0 |
 |=================================================================|
 |  Mean   | 58.2  857.7 | 71.3    9.7   18.9    1.4   30.0   58.5 |
 |  S.D.   | 22.1  330.0 | 12.5    4.5   10.1    0.8   12.2   12.7 |
 | Median  | 55.5  813.0 | 73.7    8.5   16.2    1.2   27.5   57.7 |
 `-----------------------------------------------------------------'

EXP 0169:
 * Transcription: first_5hr/train
 * Senone Value: 3000
 * Density: 128
 * Dictionary: /mnt/main/corpus/dist/custom/first_5hr_train_full.dic
 * Train Time: 3 Hr 31 Min
 * Decode Time: 4.41 Hours

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 4659  68616 | 54.2    7.5   38.3    1.0   46.9   69.2 |
 |=================================================================|
 |  Mean   | 58.2  857.7 | 55.0    7.5   37.5    1.0   46.1   69.3 |
 |  S.D.   | 22.1  330.0 | 12.6    2.5   14.2    0.7   12.3   12.8 |
 | Median  | 55.5  813.0 | 57.1    7.1   36.2    0.9   43.4   69.6 |
 `-----------------------------------------------------------------'

EXP 0171:
 * Transcription: first_5hr/train
 * Senone Value: 3000
 * Density: 8
 * Dictionary: /mnt/main/corpus/dist/custom/first_5hr_train_full.dic
 * Train Time: 1 Hr 2 Min
 * Decode Time: 5.94 Hours

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 4659  68616 | 79.5   15.2    5.3   16.9   37.4   94.6 |
 |=================================================================|
 |  Mean   | 58.2  857.7 | 79.5   15.5    5.1   18.4   38.9   95.2 |
 |  S.D.   | 22.1  330.0 |  6.4    5.1    2.2    8.5   11.6    5.4 |
 | Median  | 55.5  813.0 | 80.5   14.5    4.9   17.9   37.3   96.9 |
 `-----------------------------------------------------------------'

EXP 0173:
 * Transcription: first_5hr/train
 * Senone Value: 3000
 * Density: 16
 * Dictionary: /mnt/main/corpus/dist/custom/first_5hr_train_full.dic
 * Train Time: 1 Hr 26 Min
 * Decode Time: 15.02 Hours

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 4659  68616 | 86.1    9.5    4.4   14.6   28.5   91.1 |
 |=================================================================|
 |  Mean   | 58.2  857.7 | 85.9    9.8    4.3   16.1   30.2   92.0 |
 |  S.D.   | 22.1  330.0 |  4.6    3.5    1.9    7.9   10.4    7.3 |
 | Median  | 55.5  813.0 | 86.7    9.3    3.8   15.1   29.2   93.8 |
 `-----------------------------------------------------------------'

EXP 0174:
 * Transcription: first_5hr/train
 * Senone Value: 3000
 * Density: 32
 * Dictionary: /mnt/main/corpus/dist/custom/first_5hr_train_full.dic
 * Train Time: 1 Hr 56 Min
 * Decode Time: 20.68 Hours

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 4659  68616 | 91.1    5.4    3.5   12.0   20.9   83.6 |
 |=================================================================|
 |  Mean   | 58.2  857.7 | 90.9    5.7    3.4   13.2   22.3   85.1 |
 |  S.D.   | 22.1  330.0 |  3.3    2.3    1.8    6.9    8.5    9.5 |
 | Median  | 55.5  813.0 | 91.4    5.2    3.0   12.0   21.1   86.7 |
 `-----------------------------------------------------------------'

EXP 0175:
 * Transcription: first_5hr/train
 * Senone Value: 3000
 * Density: 64
 * Dictionary: /mnt/main/corpus/dist/custom/first_5hr_train_full.dic
 * Train Time: 3 Hr 30 Min
 * Decode Time: 23.5 Hours

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 4659  68616 | 93.8    3.3    2.9    8.8   15.0   71.5 |
 |=================================================================|
 |  Mean   | 58.2  857.7 | 93.7    3.4    2.9    9.7   16.0   73.2 |
 |  S.D.   | 22.1  330.0 |  2.7    1.4    2.0    5.4    6.5   11.0 |
 | Median  | 55.5  813.0 | 94.5    3.1    2.4    8.7   14.1   73.8 |
 `-----------------------------------------------------------------'

EXP 0176:
 * Transcription: first_5hr/train
 * Senone Value: 3000
 * Density: 128
 * Dictionary: /mnt/main/corpus/dist/custom/first_5hr_train_full.dic
 * Train Time: 5 Hr 8 Min
 * Decode Time: 23.35 Hours

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 4659  68616 | 87.6    4.7    7.8    4.9   17.3   65.8 |
 |=================================================================|
 |  Mean   | 58.2  857.7 | 87.6    4.7    7.7    5.2   17.7   67.5 |
 |  S.D.   | 22.1  330.0 |  4.4    1.2    4.1    2.6    4.8   11.3 |
 | Median  | 55.5  813.0 | 88.0    4.6    7.3    4.6   16.5   69.0 |
 `-----------------------------------------------------------------'

FINAL DATA ANALYSIS

16Feb2014: Graph representing WER with varying densities. Green and Orange depict genTrans6 and genTrans5, respectively. The senone value remained constant through all experiments (3000). The density values were as follows: 8, 16, 32, 64, 128. Future experiments could yield better data, but this raises the question: is the genTrans file the key to increased efficiency? With the senone value optimized for the data, we now have possible optimized density values. The biggest thing to note from this data is the increase in decode time as the density increased. For example, the Density: 8 Experiment took about 4-6 hours to decode, while the Density: 128 Experiment took about 20 hours. My conclusion is that a density value of 32 is optimal, although there was a significant decrease in error rate between densities 32 and 64. I will be running more small data collection tests as well as longer trains to ensure error does not increase over time.
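As a quick check on the numbers behind the graph, the Err column in each sclite summary is the sum of the Sub, Del and Ins percentages. The Python sketch below copies the "Sum/Avg" rows from the ten result listings above and compares the two sets by density; the variable and function names are hypothetical, and small differences of 0.1 come from rounding in the reported figures.

```python
# (Sub, Del, Ins, Err) from the "Sum/Avg" rows, keyed by density.
FIRST_SET = {  # Experiments 0161, 0163, 0165, 0167, 0169
    8: (20.2, 18.8, 3.3, 42.3),
    16: (14.9, 16.5, 2.2, 33.6),
    32: (12.5, 14.5, 1.5, 28.5),
    64: (10.0, 19.7, 1.3, 31.1),
    128: (7.5, 38.3, 1.0, 46.9),
}
GENTRANS5_SET = {  # Experiments 0171, 0173, 0174, 0175, 0176
    8: (15.2, 5.3, 16.9, 37.4),
    16: (9.5, 4.4, 14.6, 28.5),
    32: (5.4, 3.5, 12.0, 20.9),
    64: (3.3, 2.9, 8.8, 15.0),
    128: (4.7, 7.8, 4.9, 17.3),
}


def word_error_rate(sub, dele, ins):
    """WER in percent: substitutions + deletions + insertions."""
    return round(sub + dele + ins, 1)


def best_density(results):
    """Return the density with the lowest reported Err value."""
    return min(results, key=lambda density: results[density][3])


for density in sorted(FIRST_SET):
    first_err = FIRST_SET[density][3]
    gentrans5_err = GENTRANS5_SET[density][3]
    gain = round(first_err - gentrans5_err, 1)
    print(f"density {density:>3}: {first_err:5.1f} vs {gentrans5_err:5.1f} (difference {gain})")

print("lowest WER, first set:", best_density(FIRST_SET))
print("lowest WER, genTrans5 set:", best_density(GENTRANS5_SET))
```

Judged on error rate alone, the lowest WER falls at density 32 in the first set and at density 64 in the genTrans5 set; the preference for 32 stated above also weighs the much longer decode times at higher densities.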

The two Transcription files were chosen because of their differing lengths. The first_5hr/test file is roughly a 30 minute subset of the complete first_5hr/train. Because a senone value of 3000 is right around the recommended value for a 5 hour train, and Eric showed that there is still a minor gain from using larger senone values on smaller trains, we will be using 3000 for all. The Density was supposedly the key to achieving the optimal results. With this array of experiments to decode, we can expect to get a good plot of data to reflect upon.

I ran into several issues trying to run the Trains. Some were easy fixes, such as missing dictionary words. Others, however, I have not seen before and cannot quite decipher. I will look into these errors more tomorrow with an extra set of eyes.

 * Plan:

12Feb2014:
 * First generate all of the Experiments I will be using for the trains
 * Ideally, run them all at the same time.
 * Begin the process of running the trains one by one
 * Alternate Experiments to use different Train files
 * The goal of this method is to show the differing results with high senone values on shorter transcription files against recommended senone values on larger transcription files
 * Every two Experiments, double the density
 * This could make training take a long time

13Feb2014: Troubleshoot why trains will not run correctly. I was able to trace the issue to the dictionary: the training process would error out as soon as it got to a certain point. This was solved by using a prior dictionary known to be good. I copied this dictionary and placed it in an easy-to-find spot: /mnt/main/corpus/dist/custom/first_5hr_train_full.dic
 * One thing I noticed was that the generated phone list had some phones duplicated with a blue character at the end
 * These were later found to be derived from the dictionary that was created.
 * Looking further into it, cmudict.0.7a_custom.dic contained the same characters
 * This could have come from uploading the file after modifying it on my desktop (in Notepad)
 * The known-good dictionary contains all of the words needed for a first_5hr train
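One way to clean a dictionary with stray trailing characters is sketched below in Python. It assumes, without confirmation from the logs, that the highlighted character is an ASCII control character such as the carriage return a Windows editor adds to each line; the function name is hypothetical.

```python
def clean_dictionary_line(line):
    """Remove ASCII control characters (keeping tabs) and trailing whitespace."""
    kept = []
    for ch in line:
        is_control = ord(ch) < 32 or ord(ch) == 127
        if ch == "\t" or not is_control:
            kept.append(ch)
    return "".join(kept).rstrip()


# Illustrative lines: one with a carriage return, one with a ^W (0x17) byte.
raw_lines = ["AA  AA\r\n", "HELLO  HH AH0 L OW1\x17\n", "WORLD  W ER1 L D\n"]
cleaned = [clean_dictionary_line(line) for line in raw_lines]
print(cleaned)
```

Running every line through a filter like this before training would avoid the duplicated phones described above, since entries that differ only by an invisible character collapse to the same text.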

After fixing the Dictionary file I was able to get some of the trains running. These were only the ones that used the first_5hr/train Corpus Switchboard. The others I attempted to run used first_5hr/test, a 30 min subset of the full first_5hr train. I have given up on the subset trains for the time being and have focused on the full five hours of data. Since 5 Experiments ran, I already had decent data in the works.

I then built an LM in each of the same experiments, because I was already using quite a few; minimizing the number of Experiments will reduce confusion in the future. The LM builds went off without a hitch; the commands are very simple.

After all the LMs were successfully built I started running Decodes. I did run into a snag while decoding: because I was using non-default senone values, I needed to use run_decode2.pl, as the original script did not allow the senone value to be modified. run_decode2.pl is run as follows: cp -i /mnt/main/scripts/user/run_decode2.pl. ./run_decode2.pl   3000

I let the decodes run overnight, as I was expecting them to take quite a few hours. I want to thank the groups for allowing me to borrow their resources to lighten the load on any individual machine:
 * Idefix
 * Traubadix
 * Caesar
 * Methusalix

14Feb2014: I ran 0171 as an example of the benefit of genTrans5.pl. I had read that genTrans5.pl was providing better results for Eric's group, so I wanted to compare the two. Experiment 0171 is an exact replica of 0161 except that it uses genTrans5.pl. I will duplicate the rest of the experiments if the data proves to be worth it. (It did.) I ended up running the other 4 Experiments. These are all duplicates of their counterparts, apart from the genTrans file used in transcription generation. These are all in the process of Decoding.

16Feb2014: My plan today was just to collect and score all of the Decode data. The data collected proved to be very useful. I wanted to post it all on the wiki as soon as the decodes were all finished. Later in the day I plan to make a graph representing the data in a clear and easy to read format.

 * Concerns:

12Feb2014: My main concern is that I will be running too many trains for the server to handle. In this event I would balance the load across the two machines allotted for my group. If this is still not enough, I suppose they could take turns. Another concern was that some of my files would get a blue ^W character at the end of some lines. It immediately proved to be an issue with the Exp#.phone file, so I had to manually go through and delete all the lines that displayed that character.
 * This issue also appeared in my dictionary file. (Not sure of the cause)

I could not get any trains to run; I tried multiple variations across 5 different trains and could not achieve success with a single one. With tired eyes, I give up until tomorrow.

13Feb2014: The only concern I had was not being able to get the trains to run correctly. After much deliberation I learned that the dictionary file was formatted incorrectly. This was an easy fix once it was identified.

14Feb2014: No concerns, it is just going to take time.

16Feb2014: No concerns. Decodes were all successfully completed; I just had to collect the data.

Week Ending February 25, 2014
 * Task:

19Feb2014:
 * Help Colby C with:
 * Running Trains
 * Building Language Models
 * Running Decodes
 * Attempt a full train
 * 308hr/train
 * about 96 hours of data according to CMU

20Feb2014:
 * Collect Results
 * Score all of the successfully run decodes
 * Collect the times it took trains and decodes to run

24Feb2014: All work was done collaboratively with Colby C
 * Create a 100hr subset of the full data set
 * Learn about past sphinx training and decode parameters used
 * Attempt to run tests on the 10hr AMs using small data subset
 * Create graphs with Completed decode data

25Feb2014 All work was done collaboratively with Colby C and David M
 * Fix the experiments run last night
 * rerun with the appropriate decode setup
 * Brainstorm/ Implement transcription fixes to improve accuracy overall.
 * Use clean trans data (no [laughter] or [noise] etc)
 * Use filler dictionary instead of reg exing them out with genTrans

 * Results:

19Feb2014: Colby C and I managed to get the trains to run without a problem. After all of the AMs and LMs were created we attempted to run the Decodes. These ran fine, except that as soon as the terminal was closed or the user's network changed, the process would die. We remedied this by running the process in the background: nohup &

With this solution we were able to free up clutter on the desktop, as well as let decodes and trains run while we change networks or are idle.
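The same idea as the nohup fix can be sketched in Python for anyone scripting their runs. This is an assumption-laden illustration rather than the method used here: the function name is hypothetical, the command shown is a stand-in for a real train or decode script, and starting the child in its own session is similar in effect to nohup on POSIX systems but not identical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def launch_detached(command, log_path):
    """Start command in its own session, sending stdout and stderr to log_path."""
    log_file = open(log_path, "ab")
    try:
        process = subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,
            stdout=log_file,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # detach from the controlling terminal (POSIX)
        )
    finally:
        log_file.close()  # the child keeps its own copy of the file handle
    return process


# Demo: a short placeholder command stands in for a long-running decode.
with tempfile.TemporaryDirectory() as workdir:
    log = Path(workdir) / "decode.log"
    proc = launch_detached([sys.executable, "-c", "print('decode finished')"], log)
    proc.wait()
    print(log.read_text().strip())
```

Writing the output to a log file matters as much as detaching, since it is the only record of progress once the terminal that started the job is gone.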

EXP 0162:
 * Transcription: 10hr/train
 * Senone Value: 5000
 * Density: 8
 * Dictionary: /mnt/main/corpus/dist/custom/10hr.dic
 * Train Time: 2 Hours 57 Min
 * Decode Time: 17.18 Hours

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 8860  134108| 79.6   15.6    4.8   17.4   37.8   96.7 |
 |=================================================================|
 |  Mean   | 43.9  663.9 | 79.7   15.7    4.6   18.5   38.8   96.8 |
 |  S.D.   | 19.4  286.0 |  6.8    5.5    2.3    8.6   12.5    4.2 |
 | Median  | 40.0  595.5 | 80.5   14.9    4.3   17.6   37.8   98.1 |
 `-----------------------------------------------------------------'

EXP 0164:
 * GenTrans: genTrans5.pl
 * Transcription: 10hr/train
 * Senone Value: 5000
 * Density: 16
 * Dictionary: /mnt/main/corpus/dist/custom/10hr.dic
 * Train Time: 3 Hours 8 min
 * Decode Time: 74023 Sec (20.56 Hours)

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 8860  134108| 85.8   10.2    4.0   15.4   29.6   94.0 |
 |=================================================================|
 |  Mean   | 43.9  663.9 | 85.7   10.4    3.9   16.4   30.7   94.2 |
 |  S.D.   | 19.4  286.0 |  5.2    4.0    2.1    8.0   11.1    6.2 |
 | Median  | 40.0  595.5 | 86.2    9.9    3.4   15.4   29.0   95.7 |
 `-----------------------------------------------------------------'

EXP 0166:
 * GenTrans: genTrans5.pl
 * Transcription: 10hr/train
 * Senone Value: 5000
 * Density: 64
 * Dictionary: /mnt/main/corpus/dist/custom/10hr.dic
 * Train Time: 6 Hours 53 min
 * Decode Time: 104135 Sec (28.89 Hours)

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 8860  134108| 93.3    3.9    2.8   10.0   16.7   78.7 |
 |=================================================================|
 |  Mean   | 43.9  663.9 | 93.2    4.1    2.7   10.9   17.7   79.6 |
 |  S.D.   | 19.4  286.0 |  3.1    2.0    2.3    6.0    7.6   11.1 |
 | Median  | 40.0  595.5 | 93.9    3.6    2.1    9.8   16.4   81.3 |
 `-----------------------------------------------------------------'

EXP 0168:
 * Transcription: first_5hr/train
 * Senone Value: 5000
 * Density: 8
 * Dictionary: /mnt/main/corpus/dist/custom/first_5hr_train_full.dic
 * Train Time: 1 Hours 10 min
 * Decode Time: 6.88 Hours

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 4659  68616 | 82.2   12.8    5.0   16.1   33.9   94.5 |
 |=================================================================|
 |  Mean   | 58.2  857.7 | 82.0   13.1    4.8   17.7   35.6   94.9 |
 |  S.D.   | 22.1  330.0 |  5.6    4.3    2.1    8.4   11.4    5.5 |
 | Median  | 55.5  813.0 | 83.2   12.8    4.5   16.8   34.6   96.5 |
 `-----------------------------------------------------------------'

EXP 0170:
 * Transcription: first_5hr/train
 * Senone Value: 5000
 * Density: 16
 * Dictionary: /mnt/main/corpus/dist/custom/first_5hr_train_full.dic
 * Train Time: 1 Hours 37 min
 * Decode Time: 9.05

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 4659  68616 | 88.1    7.8    4.1   13.5   25.4   89.3 |
 |=================================================================|
 |  Mean   | 58.2  857.7 | 87.9    8.1    3.9   14.8   26.9   90.3 |
 |  S.D.   | 22.1  330.0 |  4.2    3.2    1.9    7.5    9.7    7.9 |
 | Median  | 55.5  813.0 | 88.6    7.4    3.5   13.9   25.6   92.7 |
 `-----------------------------------------------------------------'

EXP 0180:
 * Transcription: first_5hr/train
 * Senone Value: 5000
 * Density: 32
 * Dictionary: /mnt/main/corpus/dist/custom/first_5hr_train_full.dic
 * Train Time: 2 Hours 19 min
 * Decode Time: 8.33 Hours

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 4659  68616 | 91.8    4.6    3.6   10.7   19.0   78.9 |
 |=================================================================|
 |  Mean   | 58.2  857.7 | 91.7    4.8    3.5   11.8   20.0   80.6 |
 |  S.D.   | 22.1  330.0 |  3.2    2.0    2.2    6.1    7.3   10.2 |
 | Median  | 55.5  813.0 | 92.3    4.5    2.8   11.1   19.6   82.7 |
 `-----------------------------------------------------------------'

EXP 0181:
 * Transcription: 10hr/train
 * Senone Value: 5000
 * Density: 32
 * Dictionary: /mnt/main/corpus/dist/custom/10hr.dic
 * Train Time: 5 Hours 46 min
 * Decode Time: 87504 Sec (24.3 Hours)

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 8859  134105| 90.8    6.1    3.1   12.8   22.0   87.6 |
 |=================================================================|
 |  Mean   | 43.9  663.9 | 90.7    6.3    3.0   13.8   23.1   88.0 |
 |  S.D.   | 19.4  286.0 |  3.8    2.9    1.6    7.1    9.5    8.3 |
 | Median  | 40.0  595.5 | 91.5    5.8    2.7   13.0   21.6   88.9 |
 `-----------------------------------------------------------------'

EXP 0182:
 * Transcription: first_5hr/train
 * Senone Value: 5000
 * Density: 64
 * Dictionary: /mnt/main/corpus/dist/custom/first_5hr_train_full.dic
 * Train Time: 3 Hours 22 min
 * Decode Time: 9.09 Hours

 SYSTEM SUMMARY PERCENTAGES by SPEAKER
 ,-----------------------------------------------------------------.
 |                            hyp.trans                            |
 |-----------------------------------------------------------------|
 | SPKR    | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
 |=================================================================|
 | Sum/Avg | 4659  68616 | 85.7    6.1    8.2    7.4   21.6   78.1 |
 |=================================================================|
 |  Mean   | 58.2  857.7 | 85.7    6.1    8.2    8.0   22.3   79.8 |
 |  S.D.   | 22.1  330.0 |  5.4    1.4    5.0    4.1    6.5    9.5 |
 | Median  | 55.5  813.0 | 85.8    5.9    7.0    7.1   21.5   81.6 |
 `-----------------------------------------------------------------'

Results of the graph were as expected. Future graphs will show more comparison data.

I was afraid to attempt a 10hr train with new params, but in doing so we should see some pretty positive results. These should show how Sphinx 3 interacts with other steps.

The slow results of a 100hr train showed that we have a viable transcription file but not a full dictionary. This has become an issue that will be looked into a bit more. The next step is to complete the dictionary.

 * Plan:

Prepare several Experiments to have trains, LMs, and Decodes run. The goal is to set up 8 Experiments in total:
 * 4 will use first_5hr/train: 0168, 0170, 0180, 0182
 * 4 will use 10hr/train: 0162, 0164, 0166, 0181

All of these Experiments used the genTrans5.pl file to prepare the transcriptions. They were also all run with a senone value of 5000, which is based on, although a bit above, the recommended value. Each set of 4 used increasing density values of 8, 16, 32, 64. A density of 128 has been omitted, as it would only be necessary on very large amounts of training data.

20Feb2014:
 * Go into every experiment I have run this semester and collect data
 * Collect all parameters used for each Exp
 * Collect Train duration and Decode duration of all Exp
 * Display SClite scores if possible
 * start to put data in new Results page

24Feb2014 (Now we have a 100hr data set to train off of)
 * Build 100hr data set from the full data set
 * Create 100hr Dir
 * 100hr
 * 100hr/train
 * 100hr/train/trans
 * 100hr/train/wav
 * Copy 1/3 of the text to a new txt file
 * Upload to server
 * Run copySph.pl to make symbolic links to the SPH files needed
 * /mnt/main/scripts/user/copySph.pl
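
The "copy 1/3 of the text" step could be sketched as follows. This is only an illustration: it assumes the subset is the leading third of the transcript lines (the log does not say which third was taken), and the function names are hypothetical rather than existing project scripts.

```python
from fractions import Fraction

def take_leading_fraction(lines, fraction=Fraction(1, 3)):
    """Return the first `fraction` of the lines, rounding the count down."""
    count = int(len(lines) * fraction)
    return lines[:count]

def write_subset(src_path, dst_path, fraction=Fraction(1, 3)):
    """Copy the leading fraction of a transcript file to a new file."""
    with open(src_path, encoding="utf-8") as src:
        lines = src.readlines()
    subset = take_leading_fraction(lines, fraction)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(subset)
    return len(subset)
```

The resulting file would then be uploaded and passed to copySph.pl as described above.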

25Feb2014 Encountered a permissions error: we were unable to remove a directory. Switching networks also keeps killing our running processes unless they are started with nohup and &.
 * So the correct way to do this is, to go through the entire process of running the train, without actually running it.
 * Things we need:
 * Dictionary, feats and language model.
 * Then we run a decode as we normally would, but change the second parameter to the experiment # of the acoustic model that you would like to test against.
 * So we will decode against the 5hr/test data, as a sort of subset, while our acoustic model (the training data) was built off of the 10hr corpus.
 * Concerns:
 * Solved by using sudo command prior to the rm -rf
 * Solved by running processes in the background
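
For reference, the same idea as nohup with & can be sketched from Python: start the job in its own session and send its output to a log file, so that losing the terminal does not take the job down with it. This is a hedged sketch for POSIX systems, not the command we actually used; the function name and log path are hypothetical.

```python
import subprocess

def launch_detached(command, log_path):
    """Start `command` in a new session with stdout and stderr appended to a log file."""
    with open(log_path, "ab") as log:
        process = subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,   # no terminal input is available once detached
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,     # POSIX only: detach from the terminal's session
        )
    return process
```

A caller would pass the usual train or decode command as a list of arguments and check the log file later.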

20Feb2014: No Concerns, just collecting Data

24Feb2014 Training:
 * Do we need OOV (out of vocabulary) words in transcript or can they be removed
 * Find where inefficiencies lie in the training process

Decode:
 * Interpreting parameter names
 * Time...(parallelization)
 * Creating a decoding with smaller data sets

25Feb2014
 * The Future
 * One of my main concerns looking forward is optimization. Right now we are averaging about 15% to 30% error rates, and 15% is well over trained.


 * After some research Colby and I found that others before us have had much better results, some as low as 7%. With that in mind, I would really like to find out what we can do to improve our results.
 * I think we need to look at our dictionary and try to compile it better; however, there are so many variables to account for that we need to make the process less cumbersome.

Week Ending March 4, 2014
27Feb2014:
 * Task:
 * My Task today is to create links to some useful abstracts and documentation I have found.
 * This simple task is a conglomeration of my research up to this point

28Feb2014 All Work was done collaboratively with David and Colby C
 * Create a transcription with only clean data
 * This is different from genTrans, we want to extract only the utterances with clean data instead of removing single words.
 * No brackets
 * remove { } chars
 * remove _1 chars
 * remove - at the end of words
 * The clean data transcriptions were made for the full, first_5hr, last_5hr, and 10hr corpora
 * They are located in their respective locations under corpus/switchboard/ /clean/trans/train.trans
 * Create Experiments that house multiple trains run with varying parameters
 * Corpora
 * first_5hr/clean (0199)
 * last_5hr/clean (0200)
 * 10hr/clean (TBD)
 * full/clean (TBD)
 * Densities
 * 8
 * 16
 * 32
 * Senone Values
 * 3000
 * 5000
 * 7000

3March2014
 * Re-run all the Trains that failed over the weekend
 * Attempt a train using 2 CPUs
 * NPART = 2
 * QUEUE_TYPE = Queue::POSIX

4March2014
 * Decode and score all trains run in Exp 0199
 * Create graphs reflecting Data


 * Results:

27Feb2014:
 * This website provides a detailed look at the scripts that reside in Sphinx3. It describes parameter values in a good amount of detail
 * The Incomplete guide to Sphinx-3 Performance tuning (AM tuning NOT training)
 * A Decent look at the decode tuning parameters
 * A large collection of data gathered using WSJ corpora. Reflects similar results to our own (but with greater accuracy)
 * A very detailed look at phone frequencies, syllable duration, and other information regarding the Switchboard Corpus
 * FAQs about Sphinx speech recognition
 * Some discussion about Switchboard Corpus cross talk. Seems to be why first_5hr has less accurate results than last_5hr
 * Forum thread on Parallelization using the -npart Parameter

28Feb2014
 * We set up all the Experiment dirs with the clean datasets
 * We began to run trains on all of the created dirs.
 * Then we murdered Caesar by popping a fuse
 * We will resume training once Caesar is back up and running

3March2014 The trains all had to be rerun, as none of them finished before we popped a fuse. I restarted them and they have all completed successfully. This is one step closer to showing a relationship between different corpora and varying parameters. The next step will be to build the LMs and decode them all. This is where it will get time consuming: the trains took a couple of hours, but the decodes are going to take a day or two to finish. I was also successful in running a train that parallelizes across the local machine's resources. This cut training time in half! This is one step forward in our large-scale parallelization effort. Future plans are to run on multiple machines. I will be getting with Forrest to discuss his progress with Torque.

4March2014 As I expected, we achieved a lower error rate, but the hyp.trans file generated shows that the conversation overlap is still present. This seems to be our biggest hurdle. Exp 0200 will prove this true: it uses last_5hr/clean, which supposedly has little to no cross talk, so it should give us more accurate results (this is why Eric was getting better results and did not know why). There is little to no cross talk from conversation 3170 on. This has led me to create a 3170/clean subset (about 97 hours of data).

Results of 0199 were somewhat as expected but still low; the results are presented here. The xRT is the real-time factor. We want something around 1 and, as you can see, we are not very close. This can be mitigated later on, after we get more accurate results. In the meantime I want to find out why the times varied so greatly.
 * 0199 Speech:Exps 0199
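
As a quick reference for the xRT figure, the real-time factor is the time spent decoding divided by the duration of the audio that was decoded, so a value of 1 means decoding keeps pace with the speech. A minimal sketch with made-up numbers (the function name and the example durations are hypothetical, not taken from Exp 0199):

```python
def real_time_factor(decode_seconds, audio_seconds):
    """xRT: decode time divided by the duration of the decoded audio."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return decode_seconds / audio_seconds

# Hypothetical example: 5 hours of audio that took 18 hours to decode.
example_xrt = round(real_time_factor(18 * 3600, 5 * 3600), 2)
print(example_xrt)
```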


 * Plan:

27Feb2014: 1. Copy 2. Paste 3. Repeat

28Feb2014 David found a grep and sed combo that does exactly what we needed to remove the "dirty" data. This allowed us to push the results into a file, which I then copied into corpus/switchboard. These data sets will be used to see what accuracy training on clean data can yield. This will help determine where our inefficiencies lie, as we got high WERs in previous Experiments. Each Experiment that we run will be reserved for one corpus. Inside each Experiment we will run trains with varying densities (8, 16, 32) and Senone values (3000, 5000, 7000). This should give a nice picture of how our results change with different values. We are using first_5hr and last_5hr because in prior semesters last_5hr has been shown to yield more accurate results. A bit of research suggests that one cause could be that the conversations contained in last_5hr are said to have little to no cross talk, which is proving to be at least one cause of inefficiencies.
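
The actual grep and sed commands are not recorded here, so the following is only a sketch of the same filtering in Python. It assumes that any utterance containing square brackets is dropped entirely, while curly braces, _1 suffixes, and hyphens at the end of words are stripped from the utterances that remain; the function name is hypothetical.

```python
import re

def clean_utterances(lines):
    """Keep only bracket-free utterances and strip leftover markup from them."""
    cleaned = []
    for line in lines:
        if "[" in line or "]" in line:
            continue  # assumption: a bracketed token marks the whole utterance as dirty
        line = re.sub(r"[{}]", "", line)       # drop curly braces but keep the word inside
        line = re.sub(r"_1\b", "", line)       # drop the _1 suffix
        line = re.sub(r"-(?=\s|$)", "", line)  # drop a hyphen at the end of a word
        cleaned.append(line)
    return cleaned
```

Hyphens inside a word, such as in "uh-huh", are left alone because only a hyphen followed by whitespace or the end of the line is removed.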

3March The trains all just needed one command to be restarted. I ran them all in the background instead of opening so many Putty terminals, hoping to reduce a bit of the load on Caesar. The decodes are to come; I will begin them today and let them run overnight (assuming they take that long).

Using Experiment 0201 I have run a train mimicking Exp 0168. The difference lay in a few extra parameters in the CFG file. The new settings are as follows:
 * NPART = 2
 * QUEUE_TYPE = Queue::POSIX

This allows the trainer to utilize multiple CPUs on a local machine, thus cutting training time in half. I am speculating about what effect it will have on the decode. My guess is that it should at least reduce that time (maybe not by half). I hope to attempt this again using more than one machine in the future, assuming Forrest has made successful progress with Torque.

4March2014 Follow the typical decoding and scoring procedure as described on the Information page. Graphs are to be created showing relationships between the 9 trains and their results.
 * Corpus: first_5hr/Train
 * Senone: 5000
 * Density: 8
 * Concerns:

27Feb2014: I may have missed a few links that I have stumbled across in the past. From now on I will document these discoveries as I make them.

28Feb2014
 * One concern I have is that the results of last_5hr and first_5hr will not be much different, as we are removing a lot of the conversations due to "dirty" data. This could potentially remove most of the cross talk in the first_5hr train. The results should test this theory.
 * We killed Caesar by popping a fuse when we attempted so many trains at once. This issue has been resolved and should not happen again.
 * The first_5hr, last_5hr, and 10hr clean data sets may not be enough data to accurately train the AM.

3March2014 While it was very easy to parallelize on one machine, there is a large obstacle in the way of using multiple machines. It depends on whether Torque is set up correctly and whether someone is familiar with using distributed clusters. Time will tell.

4March2014
 * No idea why the xRT is so inconsistent in 0199
 * Averages show a steady increase, but it is not very consistent
 * How do we deal with cross talk?!?!?
 * problem is most prevalent in pre-3170 conversations

Week Ending March 18, 2014
Cool info on Switchboard Corpus

6March2014
 * Task:
 * Theorize about other causes of inefficiencies.
 * Get an add.txt file that accurately shows what words need to be added to run the 3170/clean transcripts
 * Do words containing 's get confused at decode time with words that do not contain 's but have the same Phonetic structure?
 * Do Dictionary words with duplicate phonetic structures get confused at decode time?

7March2014
 * Run Decodes on the 10hr Acoustic models we created earlier in the semester (Exps 0162, 0164, 0166, 0181) using the last_5hr clean.
 * I wanted to make sure that the results were similar to the test on trains that we ran at the time the trainings were done.
 * If the results varied too greatly it would mean that the amount of data we test on matters to gauge accurate results.
 * last_5hr/clean is a subset of 10hr (about 3.8 hours of data). This should show which is the better data to use.

16March2014
 * Upload a Switchboard dictionary
 * custom/switchboard.dic
 * Create genTrans8.pl
 * This will accommodate the new dictionary being used
 * run a train to test the new dictionary/genTrans combo

18March2014 Run a 100hr train using new dictionary and transcript file since we proved it to be an improvement

6March2014 Because of our questions we have decided to remove the 's altogether from the 3170/clean transcription as well as from the dictionary. There were 40,000+ words with 's, so this could yield at least a minor improvement in accuracy. Of course, doing so in the dictionary causes a lot of duplicate words to show up. For this reason we went through and counted how many there actually were. It turned out there were 15,000+, not including the originals. We both feel uneasy as to whether or not we should delete the duplicates.
 * Results:

7March2014 Here are the results of the decodes: Results. The results were slightly more accurate than those of the originals; however, the xRT was up. My theory is that the large LM I created for those Decodes is the cause. Here are the original Exps: 0162, 0164, 0181, 0166

16March2014 The results of the Experiment showed that using the newly generated genTrans8.pl file and the dictionary I discovered yields better results, even though this dictionary does not include lexical stresses in the phones. I have added the dictionary for everyone to use: /mnt/main/corpus/dist/switchboard.dic

The results of 0168: WER = 33.9
 * GenTrans: genTrans5.pl
 * Dictionary: 10hr.dic

The results of 0209: WER = 32.4
 * GenTrans: genTrans8.pl
 * Dictionary: switchboard.dic

18March2014 The training was successful. We now have our first 100hr acoustic model. The first_5hr data set is being used to test the model and check its accuracy against a smaller model that we are more familiar with. I will post the decode results here as well as in the experiment dir when the decode has completed successfully.

6March2014
 * Plan:
 * We began developing the add.txt file for the 3170/clean transcript when David and I had a few ideas.
 * Remove 's from Transcription
 * Remove 's from Dictionary
 * Remove Duplicates from dictionary
 * Generate the add.txt file to see what is missing
 * before this was done add.txt had 881 words - trying to simplify the manual labor
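
To get a feel for how many duplicate words the 's removal produces, the count can be sketched as below. It assumes dictionary entries of the form "WORD PH1 PH2 ..." separated by whitespace and ignores alternate-pronunciation markers such as WORD(2); both function names are hypothetical and this is not the script we actually ran.

```python
from collections import Counter

def strip_possessive(entry):
    """Split an entry into (headword, phones), removing a trailing 'S from the headword."""
    parts = entry.split(None, 1)
    word = parts[0]
    phones = parts[1].strip() if len(parts) > 1 else ""
    if word.upper().endswith("'S"):
        word = word[:-2]
    return word, phones

def duplicate_headwords(entries):
    """Return the headwords that occur more than once after stripping 'S."""
    counts = Counter(strip_possessive(entry)[0] for entry in entries if entry.strip())
    return sorted(word for word, n in counts.items() if n > 1)
```

Because the phone strings are returned separately, duplicate words can be removed while entries that merely share a pronunciation are kept.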

7March2014 I created one experiment that housed my directory structure for the 4 decodes: 0208 contained d8, d16, d32, and d64. From there I went about the typical Experiment creation process, making sure that I made the necessary changes to each of the CFG files and file names. I created one LM in the 0208/ dir. What I did differently was to create the LM using the full transcript data, since CMU suggests creating the LM out of a large amount of data. For this reason I had to modify the run_decode2.pl script. I gave it the ability to use a training directory, LM directory, decode directory, and file names separate from the exp number (for complex experiment directories containing multiple runs in one), and re-saved it as run_decode4.pl.

16March2014 My plan of attack was to first modify the dictionary to work for us. I converted everything to uppercase and replaced a couple of phones that exist exclusively in the dictionary and not in CMU's phone list (the one we are using).
 * Dictionary was found here
 * ax, el, and en were changed to AH, AH L, and AH N respectively
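
The two dictionary changes described above (uppercasing everything and mapping ax, el, and en onto phones from CMU's list) can be sketched like this. The entry format "word ph1 ph2 ..." is an assumption, and the function name is hypothetical.

```python
# Phones that are not in the CMU phone list, mapped to their replacements.
PHONE_MAP = {"AX": ["AH"], "EL": ["AH", "L"], "EN": ["AH", "N"]}

def normalize_entry(entry):
    """Uppercase a dictionary entry and replace AX, EL, and EN phones."""
    parts = entry.upper().split()
    if not parts:
        return ""
    word, phones = parts[0], parts[1:]
    mapped = []
    for phone in phones:
        mapped.extend(PHONE_MAP.get(phone, [phone]))
    return " ".join([word] + mapped)
```

Only the phone columns are remapped, so a headword that happens to be spelled "en" or "el" is left as it is apart from the uppercasing.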

The next step was to build a genTrans file that works. Because the dictionary provides pronunciations for the bracketed words, we simply need to comment out the existing regular expressions in genTrans that strip them, leaving them in the transcription file while maintaining the formatting that is currently performed.
 * genTrans8.pl will now work with the new dictionary

Now to attempt running the train
 * Read concerns section for my Troubleshooting
 * Read Results section for results of Train/Decode

18March2014 I set the train up the same way I would any other. Using a density of 32 and a senone value of 7000, we hope to score somewhere around 30% with the first_5hr data set. We can modify the density and senone values afterwards in order to achieve higher accuracy. I am also using an LM that was created from the full data set, because CMU claims it is better to use large amounts of data for the LM for accuracy purposes. I have proven this to be the case in prior Experiments.

6March2014 UPDATE: We decided to leave them and only remove the resulting duplicate words, not duplicate phone structures. The duplicate words were produced by removing 's. The Prune dictionary will handle the rest of our problems in the meantime.
 * Concerns:
 * We both feel uneasy as to whether or not we should delete the duplicates.

7March2014
 * This was the first time I used multiple dirs for decoding with only one LM.
 * Wasn't planning on modifying run_decode2.pl

16March2014 Running the Train:
 * The dictionary turned out to not contain [Laughter], [Noise], or [Vocalized-Noise]
 * I proceeded to add a regular expression that removes those from the resulting transcription file.

A Concern I had about the Dictionary was that it does not contain Lexical stresses. Removing them seemed to not only speed up training time slightly, but the new dictionary yielded better results

18March2014 My only concern was overcoming an error where my transcription file ended up with blue ^M new line characters in it. I vi'd into the file and used :g/^M/s/ to remove them. ^M was made using CTRL+V followed by CTRL+M
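
The ^M that vi shows is a carriage return, typically left behind when a file has been edited or saved with Windows line endings. As an alternative to the vi command, a small sketch that strips them from a file (function names are hypothetical; the file is handled as bytes so that no newline translation gets in the way):

```python
def strip_carriage_returns(data):
    """Remove every carriage return byte (shown as ^M in vi) from the data."""
    return data.replace(b"\r", b"")

def fix_file(path):
    """Rewrite the file at `path` in place without carriage returns."""
    with open(path, "rb") as handle:
        data = handle.read()
    with open(path, "wb") as handle:
        handle.write(strip_carriage_returns(data))
```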

Week Ending March 25, 2014
23March2014 Logged in Read Logs, Looked at past experiments results.

20March2014 My only task today was to score the two decodes I had running (Exp 0218 and 0227)
 * Task:
 * 0218:
 * first_5hr/train was used for the transcription data
 * 0227:
 * mini/train was used for the transcription data

24March2014 Today I worked with Colby Chenard to separate the Full transcript into Train, Dev, and Eval sets.

20March2014 The results are listed below. They suggest that the real-time factor increases not with the size of the decode data but with the size of the training data. This should not be the case. We should be getting more accurate results with more training data, or at worst only slightly less accurate ones. But these results are a lot less accurate than expected and took a lot longer than expected.
 * Results:
 * 0218
 * xRT = 3.61
 * WER: 43.0
 * 0227
 * xRT = 3.81
 * WER: 34.6

24March2014 The data was separated and the audio was symbolically linked into the wav folders of the respective directories. The transcription files do not have the ^M new line characters.

20March2014 Same Routine. Scored the decode following the instructions on the Information page.
 * Plan:

24March2014 Our plan is to prep our data to be trained on. We will be making a new directory structure called final. This is essentially a copy of full but with the last 333 lines missing; those lines have become our Eval set. The file structure is as follows:
 * Final (highest directory)
 * Train (This contains all except the last two hours of the full data set)
 * Trans
 * Wav
 * Dev (The last 304 lines of the final/train data set, 4 full conversations)
 * Trans
 * Wav
 * Eval (The last 333 sentences of the Switchboard corpus data set, 4 full conversations)
 * Trans
 * Wav
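
The split described above can be sketched as follows, assuming the transcript is one utterance per line: the last 333 lines of the full transcript become Eval, everything before them becomes final/train, and Dev is the last 304 lines of that train set. Whether the Dev lines should also be held out of training is not stated here, so this sketch leaves them in. The function name is hypothetical.

```python
def split_corpus(lines, eval_size=333, dev_size=304):
    """Return (train, dev, eval) where dev is the tail of train and eval is the tail of the corpus."""
    if eval_size <= 0 or dev_size <= 0:
        raise ValueError("set sizes must be positive")
    if eval_size + dev_size > len(lines):
        raise ValueError("corpus is too small for the requested split")
    eval_set = lines[-eval_size:]
    train = lines[:-eval_size]
    dev = train[-dev_size:]
    return train, dev, eval_set
```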

20March2014 Not sure why the real time factor and the error rate were so high. This needs to be determined and resolved before we continue on to using larger amounts of training data.
 * Concerns:

24March2014 I was worried that the blue ^M new line chars would show up but luckily they did not.

Week Ending April 1, 2014
3/27/2014
 * Task:
 * Look into MLLT
 * could potentially lower our WER

3/27/2014 I ran a train and a decode on a model using the parameters listed below. I then duplicated the experiment, and the results were far too similar. It didn't make sense. It turns out we need to edit the CFG file. After doing so I attempted it again. This time it dies during the decode because MLLT uses two Python add-ons, NumPy and SciPy. openSUSE has NumPy, but an out-of-date version, and it does not have SciPy. I attempted to install them in several different ways. It seems, however, that the lack of support for openSUSE is becoming an issue. The decision was made to switch to Rome, which runs Fedora, to do the same thing. This requires that we first run an Experiment on Fedora before installing dependencies. The Experiment is in the works.
 * Results:
 * Corpus: mini/train
 * Dictionary: switchboard.dic
 * Senones: 5000

3/27/2014 Run 2 Experiments with identical parameters, one using RunAll_CDMLLT.pl and one using RunAll.pl. Compare the results and see if there is any improvement. CMU says it could increase accuracy by up to 25%, as well as reduce decode time.

3/27/2014 MLLT requires NumPy and SciPy. It's proving to be a real pain to install such add-ons on openSUSE. Hopefully Fedora will cooperate.
 * Plan:
 * Concerns:

Error Log

Week Ending April 8, 2014

 * Task:


 * Results:


 * Plan:


 * Concerns:

Week Ending April 15, 2014

 * Task:


 * Results:


 * Plan:


 * Concerns:

Week Ending April 22, 2014

 * Task:


 * Results:


 * Plan:


 * Concerns:

Week Ending April 29, 2014

 * Task:


 * Results:


 * Plan:


 * Concerns:

Week Ending May 6, 2014

 * Task:


 * Results:


 * Plan:


 * Concerns: