Topic models are widely used to extract latent topics in texts, but the most regularly applied models are not well tuned for sparse documents. However, with the rising importance of social media platforms such as Twitter, extracting latent topics from short and sparse texts has become increasingly relevant. Tweets are relatively short, which creates challenges when using standard topic models relying on the inherent assumption that texts are composed as mixtures of latent topics (Mazarura and De Waal 2016). We compare the performance of the most widely used Latent Dirichlet Allocation (LDA) topic model with the Gibbs Sampler Dirichlet Multinomial Model (GSDMM) and the Gamma Poisson Mixture Model (GPM), which are specifically designed for sparse data and hence presumably more suitable for Twitter data than the LDA, using the Pseudo-Document Simulation method.

For evaluating and comparing topic models, standard approaches such as the likelihood-based perplexity metric, coherence scores and top words are insufficient. Chang et al. (2009) show that the perplexity metric is negatively correlated with measures that are based on human evaluation. Lau et al. (2014) propose coherence scores for automatic topic model evaluation and show that they correlate with the human evaluation of topics. Coherence scores have since been widely used for the evaluation and comparison of topic models. In a recent publication, however, Hoyle et al. (2021) provided a detailed critique of coherence scores, showing that high coherence scores do not necessarily correspond to people's ratings of topic quality. The interpretation of top words, i.e. the words with the highest probability in a topic, is an alternative to automatic topic model evaluation. However, this approach relies on subjective human interpretation and is costly and time intensive. We provide a detailed discussion of the shortcomings of the coherence scores and of the evaluation by top words in Sect. We propose the simulation of pseudo-documents as a new evaluation method to compare LDA, GSDMM and GPM, and we contrast the results of our method with standard evaluation approaches.

The LDA is a generative process, which assumes that each document in a corpus is generated by a mixture of topics (Blei et al. 2003). Mazarura and De Waal (2016) show that the LDA model may not perform well when handling short and sparse text data, such as tweets, since these are often concerned with just one specific topic, which affects the validity of the LDA's main assumption (Alvarez-Melis and Saveski 2016).
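For readers unfamiliar with the model, the mixture assumption can be written out explicitly. The following is the standard LDA generative process as introduced by Blei et al. (2003); the notation (K topics, vocabulary of size V, Dirichlet hyperparameters α and β) is the usual textbook convention rather than anything specific to the paper discussed here.

```latex
% Standard LDA generative process (Blei et al. 2003):
% K topics, vocabulary of size V, Dirichlet hyperparameters \alpha, \beta.
\begin{align*}
\phi_k   &\sim \operatorname{Dirichlet}(\beta)       && \text{for each topic } k = 1,\dots,K \\
\theta_d &\sim \operatorname{Dirichlet}(\alpha)      && \text{for each document } d \\
z_{d,n}  &\sim \operatorname{Multinomial}(\theta_d)  && \text{for each word position } n \text{ in } d \\
w_{d,n}  &\sim \operatorname{Multinomial}(\phi_{z_{d,n}})
\end{align*}
```

It is exactly the document-level mixture θ_d that becomes hard to justify for tweets: a document of a dozen words rarely carries more than one topic, which is the single-topic-per-document assumption GSDMM and GPM build in instead.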
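To make the standard evaluation pipeline criticised above concrete, here is a minimal sketch of how top words and a coherence score are typically obtained with gensim. The toy corpus, the number of topics and all hyperparameters are illustrative placeholders, not values from the paper.

```python
# Sketch: fit an LDA model, then evaluate it via top words and a C_v
# coherence score. Toy corpus and hyperparameters are placeholders.
from gensim.corpora import Dictionary
from gensim.models import LdaModel, CoherenceModel

texts = [
    ["vaccine", "dose", "immunity", "trial"],
    ["election", "vote", "poll", "campaign"],
    ["vaccine", "trial", "efficacy"],
    ["vote", "ballot", "election"],
]  # tokenised pseudo-tweets

dictionary = Dictionary(texts)
corpus = [dictionary.doc2bow(doc) for doc in texts]

lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
               passes=10, random_state=0)

# Top words: the highest-probability words per topic, which a human
# would then have to inspect and interpret.
for k in range(lda.num_topics):
    print(k, lda.show_topic(k, topn=4))

# C_v coherence, one of the scores whose reliability Hoyle et al. (2021)
# call into question.
cm = CoherenceModel(model=lda, texts=texts, dictionary=dictionary,
                    coherence="c_v")
print("C_v coherence:", cm.get_coherence())
```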
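The pseudo-document simulation method itself is specified in the paper; purely to illustrate the underlying idea, the sketch below generates short documents from K known topic-word distributions, with one topic per document, so that a fitted model's output can be scored against known ground truth. All distributions, sizes and function names here are invented for the example and should not be read as the authors' exact procedure.

```python
# Illustrative sketch only: simulate short pseudo-documents from K known
# topics (one topic per document, mirroring the GSDMM/GPM assumption).
# All distributions and sizes are invented placeholders.
import numpy as np

rng = np.random.default_rng(0)
K, V = 3, 50                                   # topics, vocabulary size
phi = rng.dirichlet(np.full(V, 0.1), size=K)   # known topic-word distributions

def simulate_pseudo_docs(n_docs=200, doc_len=10):
    docs, labels = [], []
    for _ in range(n_docs):
        k = rng.integers(K)                    # one topic per short document
        words = rng.choice(V, size=doc_len, p=phi[k])
        docs.append(words.tolist())
        labels.append(k)
    return docs, labels

docs, true_topics = simulate_pseudo_docs()
# A fitted LDA/GSDMM/GPM model can now be evaluated by how well its
# inferred topic assignments recover `true_topics`.
```

Because the generating topics are known, model comparison no longer depends on coherence scores or on a human reading top-word lists, which is precisely the advantage the simulation-based evaluation is meant to provide.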