Chapter 2. Overview

Architecture

The main recognition architecture of Simon consists of three applications.

  • Simon

    This is the main graphical interface.

    It acts as a client to the Simond server.

  • Simond

    The recognition server.

  • KSimond

    A graphical front-end for Simond.

Together, these three components form a true client / server solution for recognition: one server (Simond) serves one or more clients (Simon, this application). KSimond is only a front-end for Simond. It adds no functionality to the system but provides a graphical way to interact with Simond.

In addition to Simon, Simond and KSimond, other, more specialized applications are also part of this integrated Simon distribution.

  • Sam

    Provides more in-depth control over your speech model and allows you to test the acoustic model.

  • SSC / SSCd

    These two applications make it easier to collect large numbers of speech samples from different people.

  • Afaras

    This simple utility allows users to quickly check large corpora of speech data for erroneous samples.

Please refer to the individual handbooks of those applications for more details.



Simon is used to create and maintain a representation of your pronunciation and language. This representation is then sent to the Simond server, which compiles it into a usable speech model.

Simon then records sound from the microphone and transmits it to the server, which runs the recognition on the received input stream. Simond sends the recognition result back to the client (Simon).

Simon then uses this recognition result to execute commands like opening programs, following links, etc.
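The round trip described above can be illustrated with a small toy model. This is only a sketch: the class and method names (ToyServer, ToyClient, compile_model, recognize and so on) are hypothetical and do not correspond to the real Simon code base or network protocol, and the "recognition" step simply treats the audio as text that is already transcribed.

```python
# Toy model of the Simon / Simond round trip.
# All names here are hypothetical and chosen for illustration only.
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Stands in for Simond: compiles models and runs recognition."""
    compiled: dict = field(default_factory=dict)

    def compile_model(self, user, vocabulary):
        # A real server builds a speech model; the toy version just
        # remembers which words this user's model knows.
        self.compiled[user] = {word.lower() for word in vocabulary}

    def recognize(self, user, audio):
        # Placeholder recognition: pretend the audio is already text.
        word = audio.strip().lower()
        return word if word in self.compiled.get(user, set()) else None


@dataclass
class ToyClient:
    """Stands in for Simon: owns the commands and talks to the server."""
    server: ToyServer
    user: str
    commands: dict = field(default_factory=dict)

    def add_command(self, trigger, action):
        self.commands[trigger.lower()] = action

    def sync_model(self):
        # Step 1: send the current representation to the server.
        self.server.compile_model(self.user, list(self.commands))

    def handle_audio(self, audio):
        # Step 2: transmit input, step 3: receive the recognition result.
        result = self.server.recognize(self.user, audio)
        if result is None:
            return None
        # Step 4: execute the command associated with the result.
        return self.commands[result]()


server = ToyServer()
client = ToyClient(server, "default")
client.add_command("Firefox", lambda: "launch firefox")
client.sync_model()
outcome = client.handle_audio(" firefox ")
```

The point of the sketch is the division of labour: the client holds the commands and the input device, while the server holds the compiled model and performs recognition.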

Simond identifies its connections by a user / password combination that is completely independent of the underlying operating system and its users. By default, a standard user is set up in both Simon and Simond, so the typical use case of one Simond server per Simon client works out of the box.

Every Simon client logs on to the server with a user / password combination which identifies a unique user and thus a unique speech model. Every user maintains their own speech model but may use it from different computers (different physical Simon instances) simply by accessing the same Simond server. One Simond instance can of course also serve multiple users.
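The relationship between users and speech models can be sketched as a small account registry. This is an illustration under stated assumptions, not a description of how Simond stores accounts: the ToyAccountRegistry class, its methods, and the choice of salted PBKDF2 password hashing are all hypothetical.

```python
# Hypothetical registry mapping server-side accounts to speech models.
# It is independent of operating system users, as in the text above.
import hashlib
import hmac
import os


class ToyAccountRegistry:
    def __init__(self):
        self._accounts = {}  # username -> (salt, password hash)
        self._models = {}    # username -> that user's speech model

    @staticmethod
    def _hash(password, salt):
        # Illustrative choice only; not a statement about Simond.
        return hashlib.pbkdf2_hmac(
            "sha256", password.encode("utf-8"), salt, 100_000
        )

    def add_user(self, username, password):
        salt = os.urandom(16)
        self._accounts[username] = (salt, self._hash(password, salt))
        self._models[username] = {"words": set()}

    def login(self, username, password):
        """Return the user's model, or None if the login fails."""
        entry = self._accounts.get(username)
        if entry is None:
            return None
        salt, expected = entry
        if not hmac.compare_digest(expected, self._hash(password, salt)):
            return None
        return self._models[username]


registry = ToyAccountRegistry()
registry.add_user("alice", "s3cret")
registry.add_user("bob", "hunter2")

# Two logins for the same user, for example from two computers,
# reach the same model; a different user gets a different one.
desktop = registry.login("alice", "s3cret")
laptop = registry.login("alice", "s3cret")
other = registry.login("bob", "hunter2")
```

Because the model is looked up by account rather than by machine, moving to another computer only requires logging on to the same server with the same credentials.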

If you want to open up the server to the Internet or serve multiple users from one server, you will have to configure Simond. Please see the Simond manual for details.