We use the term word lattice exclusively for both representations.

Figure 13.10 illustrates the general n-best/lattice search framework. The knowledge sources (KSs) that provide the most constraint at the least cost are used first to generate the n-best list or word lattice.

The n-best list or word lattice is then passed to the rescoring module, which uses the remaining KSs to select the optimal path. You should note that the n-best and word-lattice generators sometimes involve several phases of search mechanisms themselves. Therefore, the whole search framework in Figure 13.10 could involve several (more than two) phases of search.

[Figure 13.10 (block diagram): Speech Input → N-Best or Lattice Generator (KS Set 1) → N-Best List or Word Lattice → Rescoring (KS Set 2) → Results]

Figure 13.10 N-best/lattice search framework. The most discriminant and inexpensive knowledge sources (KSs 1) are used first to generate the n-best/lattice.

The remaining knowledge sources (KSs 2, usually expensive to apply) are used in the rescoring phase to pick the optimal solution [40].
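The control flow of this framework fits in a few lines. The sketch below is a minimal illustration of the two-pass idea only; the function names, signatures, and the modeling of knowledge sources as plain scoring callables are assumptions for exposition, not any toolkit's API.

```python
# A minimal sketch of the two-pass n-best/lattice framework of Figure 13.10.
# Knowledge sources are modeled as plain callables; all names and signatures
# here are illustrative assumptions.

def two_pass_search(speech, generate, rescore, n=100):
    """generate(speech, n) -> list of hypotheses (cheap KSs, pass 1);
    rescore(hypothesis, speech) -> accumulated cost (expensive KSs, pass 2)."""
    # Pass 1: inexpensive, discriminant knowledge sources (KS set 1) reduce
    # the search space to an n-best list (or word lattice).
    hypotheses = generate(speech, n)
    # Pass 2: expensive knowledge sources (KS set 2) rescore only the
    # survivors; the lowest-cost hypothesis is the final result.
    return min(hypotheses, key=lambda h: rescore(h, speech))
```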

Does the compact n-best or word-lattice representation impose constraints on the complexity of the acoustic and language models applied during successive rescoring modules? The word lattice can be expanded for higher-order language models and detailed context-dependent models, such as interword triphone models. For example, using a higher-order language model with a word lattice entails copying each word into the appropriate context of preceding words (in the trigram case, the two immediately preceding words). Using interword triphone models entails replacing the triphones for the beginning and ending phones of each word with the appropriate interword triphones. The expanded lattice can then be used with detailed acoustic and language models. For example, Murveit et al. [30] report that this approach achieves trigram search without exploring the enormous trigram search space.
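To make the word-copying step concrete, the following sketch expands a lattice so that every node fixes a two-word history, which is what exact trigram rescoring requires. The adjacency-dict lattice representation and all names are illustrative assumptions.

```python
# A sketch of trigram lattice expansion: each lattice word is copied once per
# distinct preceding word, so a trigram P(w | u, v) can be applied exactly on
# every expanded edge.  The adjacency-dict lattice format is an assumption.

def expand_for_trigram(succ, start="<s>"):
    """succ: dict mapping word -> list of successor words (a word lattice).
    Returns an expanded lattice whose nodes are (prev_word, word) pairs,
    so each node fixes the two-word history a trigram model needs."""
    expanded = {}
    stack = [(start, w) for w in succ.get(start, [])]
    seen = set()
    while stack:
        node = stack.pop()                     # node = (prev_word, word)
        if node in seen:
            continue
        seen.add(node)
        prev, word = node
        # Successors of `word` become (word, successor) nodes: copying `word`
        # in the context of `prev` is exactly the duplication described above.
        expanded[node] = [(word, nxt) for nxt in succ.get(word, [])]
        stack.extend(expanded[node])
    return expanded
```

Each word is duplicated once per distinct predecessor, so the expanded lattice grows roughly by the average in-degree of the original lattice, which is why expansion is tractable while the full trigram search space is not.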

The Exact N-best Algorithm

Stack decoding is a natural choice for generating n-best candidates because of its best-first principle. We can keep it generating results until it finds n complete paths; these n complete sentences form the n-best list. However, this algorithm usually cannot generate the n best candidates efficiently.

The efficient n-best algorithm for time-synchronous Viterbi search was first introduced by Schwartz and Chow [39]. It is a simple extension of time-synchronous Viterbi search. The fundamental idea is to maintain separate records for paths with distinct histories.

The history is defined as the whole word sequence up to the current time t and word w. This exact n-best algorithm is also called the sentence-dependent n-best algorithm. When two or more paths come to the same state at the same time, paths having the same history are merged and their probabilities are summed together; otherwise, only the n best paths are retained for each state. As commonly used in speech recognition, a typical HMM state has 2 or 3 predecessor states within the word HMM. Thus, for each time frame and each state, the n-best search algorithm needs to compare and merge 2 or 3 sets of n paths into n new paths.

At the end of the search, the n paths in the final state of the trellis are simply reordered to obtain the n best word sequences. This straightforward n-best algorithm can be proved to be admissible7 in normal circumstances [40]. The complexity of the algorithm is proportional to n, the number of paths kept at each state.
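The per-state bookkeeping just described can be sketched as follows. The path-record layout and function name are illustrative assumptions, not the authors' implementation.

```python
# A sketch of the per-state merge in the exact (sentence-dependent) n-best
# algorithm: path records arriving from the 2-3 predecessor states are merged
# by word history, probabilities of identical histories are summed, and only
# the n best distinct histories survive.  The data layout is an assumption.

def merge_paths(incoming, n):
    """incoming: list of (history, probability) records from all predecessor
    states, where history is a tuple of words.  Returns the n best records."""
    by_history = {}
    for history, prob in incoming:
        # Paths reaching the same state with the same word history are merged
        # and their probabilities summed.
        by_history[history] = by_history.get(history, 0.0) + prob
    # Otherwise, only the n best distinct-history paths are retained.
    best = sorted(by_history.items(), key=lambda kv: kv[1], reverse=True)
    return best[:n]
```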

Keeping n separate path records at every state is often too slow for practical systems.

Word-Dependent N-Best and Word-Lattice Algorithm

Since many of the different entries in the n-best list are just one-word variations of each other, as shown in Table 13.4, a more efficient algorithm can be derived from the normal 1-best Viterbi algorithm to generate the n best hypotheses. The algorithm runs just like the normal time-synchronous Viterbi algorithm for all within-word transitions.

However, for each time frame t and each word-ending state, the algorithm stores all the different words that can end at the current time t, together with their corresponding scores, in a traceback list. At the same time, the score of the best hypothesis at each grammar state is passed forward, as in the normal time-synchronous Viterbi search. This requires almost no extra computation beyond the normal time-synchronous Viterbi search.

At the end of the search, you can simply walk the stored traceback list to recover all the permutations of word sequences with their corresponding scores. If you use a simple threshold, the traceback can be implemented very efficiently so that it uncovers only the word sequences whose accumulated cost scores fall below the threshold. This algorithm is often referred to as the traceback-based n-best algorithm [29, 42] because of the traceback list it maintains.
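The threshold-pruned traceback might look like the sketch below. The record layout (word, start frame, segment cost) and the convention that a word's start frame equals the previous word's end frame are assumptions made for illustration.

```python
# A sketch of the traceback-based n-best enumeration.  traceback[t] holds the
# records stored at frame t during the Viterbi pass; the exact record layout
# (word, start_frame, segment_cost) is an illustrative assumption.

def enumerate_nbest(traceback, end_time, threshold):
    """Recover all word sequences whose accumulated cost is below threshold.
    traceback: dict frame -> list of (word, start_frame, segment_cost)."""
    results = []

    def expand(t, words, cost):
        if cost >= threshold:          # the threshold prunes costly expansions
            return
        if t <= 0:                     # reached the start of the utterance
            results.append((cost, list(reversed(words))))
            return
        # Every word ending at frame t is a possible last word of the prefix;
        # we assume its start frame is where the preceding word ended.
        for word, start, seg_cost in traceback.get(t, []):
            expand(start, words + [word], cost + seg_cost)

    expand(end_time, [], 0.0)
    return sorted(results)             # cheapest word sequences first
```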

However, there is a serious problem associated with this algorithm: it can easily miss some low-cost hypotheses. Figure 13.11 illustrates an example in which word wk can be preceded by two different words wi and wj in different time frames. Assuming path wi-wk has a lower cost than path wj-wk when both paths meet during the trellis search of wk, the path wj-wk will be pruned away. During traceback for finding the n best word sequences, there is only one best starting time for word wk, determined by the boundary between word wk and its best preceding word wi.

Even though path wj-wk might have a very low cost (say, only marginally higher than that of wi-wk), it could be completely overlooked, since the path has a different starting time for word wk.

7 One can show that in the worst case, when paths with different histories have nearly identical scores for each state, the search actually needs to keep all paths (> n) in order to guarantee absolute admissibility. In this worst case, the admissible algorithm is clearly exponential in the number of words in the utterance, since all permutations of word sequences for the whole sentence need to be kept.
