Chapter 1. Introduction

Sam is a tool for creating and testing acoustic models. It can compile new speech models, use models created by Simon and produce models that Simon can use later on.

It is targeted at people who want more control over their acoustic model and provides much lower-level access to the build process. Sam is mainly geared towards speech professionals who want to improve and/or test their acoustic model.

For more information on the architecture of the Simon suite please see the Simon manual.

Background

This section will provide a bit of background information on the compilation and testing process.

Effective testing

One of Sam's major features is testing the generated acoustic models.

The basic testing procedure is to run recognition on samples whose transcription is already known and to compare the results. Sam also takes the confidence score of the recognition into account to measure how robust the created system is.
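To illustrate the general idea of this kind of scoring (this is not Sam's actual implementation, just a sketch under the assumption that each result consists of a reference transcription, a recognized hypothesis and a confidence value), a comparison could look roughly like this:

  # Hypothetical sketch: compare recognition output against known
  # transcriptions and track the average confidence of the recognizer.
  def score(results):
      """results: list of (reference, hypothesis, confidence) tuples."""
      correct = 0
      total_confidence = 0.0
      for reference, hypothesis, confidence in results:
          if hypothesis.strip().lower() == reference.strip().lower():
              correct += 1
          total_confidence += confidence
      accuracy = correct / len(results)
      avg_confidence = total_confidence / len(results)
      return accuracy, avg_confidence

  # Example with made-up data:
  accuracy, confidence = score([
      ("switch on the light", "switch on the light", 0.92),
      ("open the window",     "open the widow",      0.41),
  ])
  print("accuracy: %.2f, average confidence: %.2f" % (accuracy, confidence))

A high accuracy combined with a low average confidence would still point to a model that is not very robust, which is why both values are of interest.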

Due to the way acoustic models are created, both the recognition accuracy and confidence will be highly skewed when the same samples are used both for training and testing. This is called "in corpus" testing (the samples used for testing are also in your training corpus).

While in corpus testing might tell you whether the compilation process failed or produced subpar results, it won't tell you the "real" recognition rate of the created model. It is therefore recommended to do "out of corpus" testing: use different samples for training and for testing.

For out of corpus testing, simply split your prompts file into two parts: one used to compile the model and one used to test it. The test set doesn't need to be very large to produce representative results.
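A split like this can be done with a few lines of scripting. The following sketch assumes a prompts file with one "<sample id> <transcription>" entry per line; the 90/10 ratio and the file names are only examples, not anything Sam requires:

  # Hypothetical sketch: split a prompts file into a training set and a
  # smaller test set for out of corpus testing.
  import random

  with open("prompts") as f:
      lines = [line for line in f if line.strip()]

  random.shuffle(lines)
  split = int(len(lines) * 0.9)

  with open("prompts_train", "w") as f:
      f.writelines(lines[:split])   # used to compile the model
  with open("prompts_test", "w") as f:
      f.writelines(lines[split:])   # used only for testing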

If you don't have a lot of training data, you can also split the complete corpus into ten parts. Compile ten models, each time leaving out one part of the corpus. Then run ten individual tests (each using the part that was excluded during compilation) and average the results.
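The partitioning and averaging for this ten-fold procedure could be sketched as follows. This is only an outline: compile_and_test() is a placeholder for however you compile the model and run the test in your own setup, not a function that Sam provides.

  # Hypothetical sketch: ten-fold testing. Split the prompts into ten parts,
  # build each model on nine parts, test on the remaining part and average
  # the resulting recognition rates.
  def cross_validate(lines, compile_and_test, folds=10):
      parts = [lines[i::folds] for i in range(folds)]
      rates = []
      for i in range(folds):
          test_set = parts[i]
          training_set = [l for j, p in enumerate(parts) if j != i for l in p]
          rates.append(compile_and_test(training_set, test_set))
      return sum(rates) / len(rates)

This way every sample is used for testing exactly once while never being part of the model it is tested against, so even a small corpus yields a meaningful estimate of the real recognition rate.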