NNQuery NNClassification NIAM N-GramModels NF-SS NFS N

Academic year: 2021

NFS

▶ Storage Protocols

NF-SS

▶ Normal Form ORA-SS Schema Diagrams

N-Gram Models

DJOERD HIEMSTRA

University of Twente, AE Enschede, The Netherlands

Definition

In language modeling, n-gram models are probabilistic models of text that use some limited amount of history, or word dependencies, where n refers to the number of words that participate in the dependence relation.

Key Points

In automatic speech recognition, n-grams are important to model some of the structural usage of natural language, i.e., the model uses word dependencies to assign a higher probability to "how are you today" than to "are how today you," although both phrases contain the exact same words. If used in information retrieval, simple unigram language models (n-gram models with n = 1), i.e., models that do not use term dependencies, result in good quality retrieval in many studies. The use of bigram models (n-gram models with n = 2) would allow the system to model direct term dependencies, and treat the occurrence of "New York" differently from separate occurrences of "New" and "York," possibly improving retrieval performance. The use of trigram models would allow the system to find direct occurrences of "New York metro," etc. The following equations contain respectively (1) a unigram model, (2) a bigram model, and (3) a trigram model:

$$P(T_1, T_2, \ldots, T_n \mid D) = P(T_1 \mid D)\, P(T_2 \mid D) \cdots P(T_n \mid D) \tag{1}$$

$$P(T_1, T_2, \ldots, T_n \mid D) = P(T_1 \mid D)\, P(T_2 \mid T_1, D) \cdots P(T_n \mid T_{n-1}, D) \tag{2}$$

$$P(T_1, T_2, \ldots, T_n \mid D) = P(T_1 \mid D)\, P(T_2 \mid T_1, D)\, P(T_3 \mid T_1, T_2, D) \cdots P(T_n \mid T_{n-2}, T_{n-1}, D) \tag{3}$$
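As an illustration of equations (1) and (2), the following minimal sketch estimates the probabilities by plain relative frequencies (maximum likelihood) from a single tokenized document D. The function names `unigram_prob` and `bigram_prob`, the whitespace tokenization, and the one-sentence toy document are assumptions made for this example only; no smoothing is applied, so unseen word pairs receive probability zero.

```python
from collections import Counter


def unigram_prob(terms, doc):
    """Equation (1): product of P(T_i | D), estimated by relative frequency."""
    if not doc:
        return 0.0
    counts = Counter(doc)
    p = 1.0
    for term in terms:
        p *= counts[term] / len(doc)
    return p


def bigram_prob(terms, doc):
    """Equation (2): P(T_1 | D) times the product of P(T_i | T_{i-1}, D)."""
    if not doc:
        return 0.0
    if not terms:
        return 1.0
    unigrams = Counter(doc)
    bigrams = Counter(zip(doc, doc[1:]))
    # How often each term occurs as the first element of a bigram in D.
    history = Counter(doc[:-1])
    p = unigrams[terms[0]] / len(doc)
    for prev, cur in zip(terms, terms[1:]):
        if history[prev] == 0:
            return 0.0  # the history was never observed in D
        p *= bigrams[(prev, cur)] / history[prev]
    return p


doc = "how are you today".split()
ordered = "how are you today".split()
scrambled = "are how today you".split()

print(unigram_prob(ordered, doc), unigram_prob(scrambled, doc))  # 0.00390625 0.00390625
print(bigram_prob(ordered, doc), bigram_prob(scrambled, doc))    # 0.25 0.0
```

The unigram model cannot tell the two word orders apart, whereas the bigram model prefers the order it has observed. The zero assigned to the scrambled phrase also shows why unsmoothed estimates are rarely used as they are.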

The use of n-gram models increases the number of parameters to be estimated exponentially with n, so special care has to be taken to smooth the bigram or trigram probabilities. Several studies have shown small but significant improvements from using bigrams if smoothing parameters are properly tuned [2,3]. Improvements from the use of n-grams and other term dependencies seem to be larger on large data sets [1].
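To make the smoothing step concrete: with a vocabulary of V distinct words, a bigram model has on the order of V^2 conditional probabilities to estimate and a trigram model on the order of V^3, so most of them are never observed in a single document. The sketch below uses linear interpolation, one common smoothing technique and not necessarily the one used in the cited studies, to mix the document bigram estimate with a document unigram estimate and a collection unigram estimate. The function name `smoothed_bigram_prob`, the weights `lam_bi` and `lam_uni`, and the toy data are hypothetical; in practice the weights would be tuned on held-out data.

```python
from collections import Counter


def smoothed_bigram_prob(terms, doc, collection, lam_bi=0.3, lam_uni=0.4):
    """Bigram likelihood of `terms` given `doc`, smoothed by linear interpolation.

    Each factor is lam_bi * P(t | prev, D) + lam_uni * P(t | D) + lam_coll * P(t | C),
    where C is the whole collection and lam_coll = 1 - lam_bi - lam_uni.
    The default weights are illustrative, not tuned.
    """
    if not doc or not collection:
        raise ValueError("doc and collection must be non-empty token lists")
    lam_coll = 1.0 - lam_bi - lam_uni
    doc_uni = Counter(doc)
    coll_uni = Counter(collection)
    doc_bi = Counter(zip(doc, doc[1:]))
    doc_hist = Counter(doc[:-1])

    p = 1.0
    prev = None
    for term in terms:
        p_doc = doc_uni[term] / len(doc)
        p_coll = coll_uni[term] / len(collection)
        if prev is not None and doc_hist[prev] > 0:
            p_bi = doc_bi[(prev, term)] / doc_hist[prev]
        else:
            # No usable history: fall back to the document unigram estimate.
            p_bi = p_doc
        p *= lam_bi * p_bi + lam_uni * p_doc + lam_coll * p_coll
        prev = term
    return p


doc = "new york metro".split()
collection = doc + "york city new car".split()

in_order = smoothed_bigram_prob("new york".split(), doc, collection)
reversed_order = smoothed_bigram_prob("york new".split(), doc, collection)
print(in_order > reversed_order > 0.0)  # True
```

With these weights the phrase "new york" still scores higher than "york new", but the reversed order and terms missing from the document no longer receive probability zero.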

Cross-references

▶ Language Models

▶ Probability Smoothing

Recommended Reading

1. Metzler D. and Croft W.B. A Markov random field model for term dependencies. In Proc. 28th Annual Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, 2005, pp. 472–479.

2. Miller D.R.H., Leek T., and Schwartz R.M. A hidden Markov model information retrieval system. In Proc. 22nd Annual Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, 1999, pp. 214–221.

3. Song F. and Croft W.B. A general language model for information retrieval. In Proc. 22nd Annual Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, 1999, pp. 4–9.

NIAM

▶ Object-Role Modeling

NN Classification

▶ Nearest Neighbor Classification

NN Query

▶ Nearest Neighbor Query

▶ Nearest Neighbor Query in Spatio-temporal Databases
