Many SUMMIT tools that compute quantities across many utterances understand the espec control usage, including:
Using an espec file, one can succinctly specify a particular set of data, expressing it in terms of unions, intersections, negations, etc., of other sets or individual data. Sets can be either built in to the corpus (e.g., <train>, <dev>, etc.), or user-defined in the espec file using define. Sets can be further constrained based on individual properties of the data; these are called "constraint sets". Once the set is chosen, a template mechanism is used to fill in and compute a shell-style control line.
The espec file recognizes a simple language for working with sets of data. These sets can be those defined in the corpus, or user-defined sets in the espec file. In all cases, the set in the espec file can be overridden by specifying -set on the command line.
To specify a set, there must be a `corpus:' line in the espec file which points to the corpus file you are working with. For example:
corpus: /usr/sls/summit/recognizers/corpora/voyager.corpus
Next, there must be a `set:' line which designates which set you are working with. This set can be expressed as a function of corpus-defined sets, and user-defined sets, using some basic set-algebraic operations:
define <my_set> = <abr> + <mbr> + <chs> - utt-s-abr-14
The right-hand side of this definition can again be any general set expression, and can refer to other espec-defined sets or corpus sets. Besides corpus-defined and espec-defined sets, you can also use constraint sets. A constraint set is created on the fly from a constraint on the data. For example, you could create the set of all data whose utterance_duration is greater than 5.0 seconds like this:
<[utterance_duration > 5.0]>
The expression inside the []'s is a general logic expression, and can consist of any number of terms connected with &&'s, ||'s, and !'s, and using ()'s to group terms:
<[gender == m && (type != read || orthography includes boston)]>
Each term in the formula must be of the form `property operator value', where property is a corpus-defined property, value is a string or number value, and the operator is one of the following:
corpus: voyager.corpus
define <train_read> = <train> * <[type == read]>
define <train_spont> = <train> * <[type == spontaneous]>
define <my_test> = (<dev> + <test> - <abr>) * <[type != read]>
set: (<train_spont> * <[gender == f]>) + <my_test>
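The espec example above can be mimicked with ordinary Python sets to see which data it would select. This is only a sketch of the semantics, not how the SUMMIT tools are implemented. It assumes that `+` means union, `*` means intersection, `-` means difference, and that `+` and `-` are applied left to right. The toy corpus, its utterance names, and the helper names `corpus_set` and `constraint` are all hypothetical.

```python
# Toy model of espec set algebra using Python sets.
# Assumptions: '+' is union, '*' is intersection, '-' is difference.
# The utterances and property values below are invented for illustration.
corpus = {
    "utt-a-1": {"type": "read", "gender": "f", "sets": {"train"}},
    "utt-a-2": {"type": "spontaneous", "gender": "f", "sets": {"train"}},
    "utt-b-1": {"type": "spontaneous", "gender": "m", "sets": {"train"}},
    "utt-c-1": {"type": "spontaneous", "gender": "m", "sets": {"dev"}},
    "utt-d-1": {"type": "read", "gender": "f", "sets": {"test"}},
    "utt-e-1": {"type": "spontaneous", "gender": "f", "sets": {"test", "abr"}},
}


def corpus_set(name):
    """Utterances belonging to a corpus-defined set such as <train>."""
    return {utt for utt, props in corpus.items() if name in props["sets"]}


def constraint(predicate):
    """Utterances whose properties satisfy a constraint such as <[type == read]>."""
    return {utt for utt, props in corpus.items() if predicate(props)}


# define <train_spont> = <train> * <[type == spontaneous]>
train_spont = corpus_set("train") & constraint(lambda p: p["type"] == "spontaneous")

# define <my_test> = (<dev> + <test> - <abr>) * <[type != read]>
my_test = ((corpus_set("dev") | corpus_set("test")) - corpus_set("abr")) & constraint(
    lambda p: p["type"] != "read"
)

# set: (<train_spont> * <[gender == f]>) + <my_test>
selected = (train_spont & constraint(lambda p: p["gender"] == "f")) | my_test

print(sorted(selected))  # ['utt-a-2', 'utt-c-1']
```

Under these assumptions the selected set holds the female spontaneous training utterance plus the one non-read dev/test utterance that is not in the hypothetical `abr` set.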
The above functionality is sufficient for most needs. However, there are some additional espec set "functions" which can be used to select random subsets of sets, and also to save sets out to a file in the espec format. The first of these functions is trim, and its usage is:
trim(set, num, random|direct)
Trim takes the specified set, trims away enough elements to leave exactly num elements, and returns the resulting set. With random, it yields a different set every time it is used; with direct, it yields the same set every time it is used (i.e., it is deterministic). Trim can be useful, for example, for randomly selecting a small subset of a training or development set to examine in detail with other tools (e.g., sapphire). An example is:
set: trim(<dev>, 25, random)
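The behavior of trim can be sketched in Python as follows. This is an illustrative model, not the SUMMIT implementation. In particular, the text only says that direct is deterministic, so keeping the first num elements in sorted order is an assumption. The function name `trim`, its `rng` parameter, and the utterance names are hypothetical.

```python
import random


def trim(items, num, mode, rng=None):
    """Return a subset of `items` with exactly `num` elements (or all of them
    if there are fewer than `num`). Hypothetical model of espec trim."""
    ordered = sorted(items)  # fix an ordering so "direct" is repeatable
    if num >= len(ordered):
        return set(ordered)
    if mode == "direct":
        # Assumption: "direct" keeps the first `num` elements in this ordering.
        return set(ordered[:num])
    if mode == "random":
        rng = rng or random.Random()
        return set(rng.sample(ordered, num))
    raise ValueError("mode must be 'random' or 'direct'")


# Invented stand-in for the corpus set <dev>.
dev = {"utt-%03d" % i for i in range(100)}

random_subset = trim(dev, 25, "random")  # may differ on every call
direct_subset = trim(dev, 25, "direct")  # identical on every call
print(len(random_subset), len(direct_subset))
```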
Trim is useful for choosing a single random subset of data. If, however, you'd like to choose several subsets which are orthogonal with respect to a particular partition of the data (usually speakers), you can use the function rand. Rand is useful for creating speaker-independent training and testing sets, for example. Its usage is this:
rand(set, partition-name, num)
Rand will choose a subset of the specified set whose number of elements is num or more. It builds the subset by choosing elements of the partition one at a time (for example, one speaker at a time for a speaker partition), stopping as soon as it has num or more elements. The partition name must be the name of a partition-set in the corpus, for example `by_speaker'. This could be used as follows:
corpus: voyager.corpus
define <train0> = rand(<all_data>, by_speaker, 2500)
define <test0> = <all_data> - <train0>
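The following Python sketch illustrates why rand returns num or more elements rather than exactly num: whole partition elements (here, speakers) are added until the target is reached, so no speaker is ever split between the two sets. It is a model under assumptions, not the SUMMIT implementation. The random order in which speakers are picked, the function signature, and the toy data are all hypothetical.

```python
import random


def rand(items, partition, num, rng=None):
    """Pick whole partition blocks at random until at least `num` items are chosen.

    `partition` maps each item to its block (e.g. utterance -> speaker).
    Hypothetical model of espec rand.
    """
    rng = rng or random.Random()
    blocks = {}
    for item in sorted(items):
        blocks.setdefault(partition[item], []).append(item)
    keys = sorted(blocks)
    rng.shuffle(keys)  # assumption: blocks are taken in random order
    chosen = set()
    for key in keys:
        if len(chosen) >= num:
            break
        chosen.update(blocks[key])
    return chosen


# Invented data: 10 speakers with 7 utterances each.
by_speaker = {"utt-s%d-%d" % (s, u): "s%d" % s for s in range(10) for u in range(7)}
all_data = set(by_speaker)

train0 = rand(all_data, by_speaker, 25)
test0 = all_data - train0

train_speakers = {by_speaker[utt] for utt in train0}
test_speakers = {by_speaker[utt] for utt in test0}
print(len(train0), len(test0), train_speakers & test_speakers)
```

With 7 utterances per speaker, a request for 25 yields 28 utterances (four complete speakers), and the training and test sets share no speakers.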
The final function is save, which will save a set to a file in a format that can later be included in an espec file. Its usage is:
save(set, file-name, set-name)
It will save the set to the file `file-name', naming it the set `set-name'. This file can then be included in the espec file and those sets used for future computation. This is useful for computing a random set and saving it, so you can reuse it later.
line: -in <apnet>
line: -ref "<orthography>"
line: -word_graph_out /t/summit/<speaker>/<tag>.wg.gz
Control lines are lines that start with the keyword "line:". They act simply as templates, such that any text surrounded by <>'s is taken to refer to an utterance property, unless the < is preceded by a backslash (\).
This template is instantiated for every element in the specified set of data. To translate from an espec file to the equivalent control file, use the tool ctl_from_espec.
The special property <tag> is not formally a corpus property, but behaves just like one. It will be expanded according to the name of each datum in the designated set. This variable, coupled with the <speaker> property in the corpus, is useful for composing filenames with the proper directory structure (as in the -word_graph_out key above). Any key which ends in out will be interpreted as a file-based output, and any tool loading a control file will make the appropriate directories as necessary.
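The template expansion described above can be sketched in Python. This is an illustration only and does not reproduce ctl_from_espec or the control file format. It assumes that an escaped `\<` produces a literal `<`, and that <tag> expands to the datum name. The function `expand`, the property values, and the pairing of the datum `utt-s-abr-14` with speaker `abr` are hypothetical.

```python
import re

# Matches either an escaped '<' (backslash followed by '<') or a '<name>' field.
_FIELD = re.compile(r"\\<|<([^<>]+)>")


def expand(template, tag, props):
    """Fill '<name>' fields in `template` from `props`; '<tag>' is the datum name."""

    def substitute(match):
        name = match.group(1)
        if name is None:
            # Escaped form: assumed to yield a literal '<' with no expansion.
            return "<"
        return tag if name == "tag" else str(props[name])

    return _FIELD.sub(substitute, template)


# Invented property values for a single datum.
props = {"speaker": "abr", "orthography": "show me boston"}

out_line = expand("-word_graph_out /t/summit/<speaker>/<tag>.wg.gz", "utt-s-abr-14", props)
ref_line = expand('-ref "<orthography>"', "utt-s-abr-14", props)
escaped = expand(r"-note \<literal> <speaker>", "utt-s-abr-14", props)

print(out_line)  # -word_graph_out /t/summit/abr/utt-s-abr-14.wg.gz
print(ref_line)  # -ref "show me boston"
print(escaped)   # -note <literal> abr
```

Applying `expand` once per element of the chosen set gives one filled-in line per datum, which is the sense in which the template is instantiated for every element.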