Topic Modeling and Clustering Case Study
We look in detail at some of the most common clustering and topic modeling techniques in NLP, applying them to a philosophical text to see how they perform. The focus is on the Python sklearn and gensim libraries, which support K-Means, hierarchical, and non-negative matrix factorization (NMF) clustering techniques as well as Latent Semantic Analysis (LSA), which applies Singular Value Decomposition (SVD) to a term-document matrix, and Latent Dirichlet Allocation (LDA). The techniques are compared and contrasted, and their output is analyzed against a single, comprehensive philosophy text (Theology Reconsidered, for those who are familiar with the author’s work).
The presentation and associated materials are available on my YouTube channel here.