-1 votes
1 answer
43 views

Unsupervised Topic Modeling for Short Event Descriptions

I have a dataset of approximately 750 lines containing quite short texts (less than 150 words each). These are all event descriptions related to a single broad topic (which I cannot specify for ...
Arthur GONAY
0 votes
0 answers
40 views

MiniBatchKMeans BERTopic not returning topics for half of data

I am trying to topic model a dataset of tweets. I have around 50 million tweets. Unfortunately, such a large dataset will not fit in RAM (even 128 GB) due to the embeddings. Therefore, I have been working on ...
Matthieu B
0 votes
0 answers
30 views

Calculating Topic Correlations or Co-occurrences for keyATM

I have been working extensively with the keyATM package; unfortunately, it offers no way to calculate topic correlations and co-occurrences once the model has been fitted. I already ...
dpaltra22
0 votes
1 answer
83 views

Correct topics from LDA Sequence Model in Gensim

Python's Gensim package offers a dynamic topic model called LdaSeqModel(). I have run into the same problem as in this issue from the Gensim mailing list (which has not been solved). The problem is ...
hyco
  • 221
1 vote
1 answer
64 views

Inspect all probabilities of BERTopic model

Say I build a BERTopic model using from bertopic import BERTopic topic_model = BERTopic(n_gram_range=(1, 1), nr_topics=20) topics, probs = topic_model.fit_transform(docs) Inspecting probs gives me ...
coolhand
  • 2,109
0 votes
0 answers
30 views

Importing util library failed

I am trying to install and use the BERTopic model with pip install bertopic. Here is my code: from bertopic import BERTopic topic_model = BERTopic.load("MaartenGr/BERTopic_Wikipedia&...
0 votes
0 answers
61 views

Unhashable type when calling HuggingFace topic model `topic_labels_` function

If I try to follow the topic modeling tutorial at: https://huggingface.co/docs/hub/en/bertopic The first few lines give me an error: from bertopic import BERTopic topic_model = BERTopic.load("...
coolhand
  • 2,109
0 votes
0 answers
24 views

PackagesNotFound error even when verified packages as installed

I am trying to follow this tutorial for BERT topic modeling: https://jpcompartir.github.io/BertopicR/ library(reticulate) reticulate::install_miniconda() library(BertopicR) BertopicR::...
coolhand
  • 2,109
0 votes
0 answers
49 views

Topic modelling outputs are gender biased?

Has anyone had this issue? My topic modelling output seems to be heavily dominated by responses from male respondents. The volume of responses across three different questions is over 800 in each ...
GrBrn
  • 3
0 votes
1 answer
51 views

Stopwords problem in text data preprocessing in Python

I want to do topic modeling in Python. To remove stopwords, I combined my own stop word list, a stop word list I found on GitHub, and nltk's stop word list. However, when I examined the ...
deniz
  • 11
0 votes
0 answers
37 views

Cannot find AIC/BIC of my topic modelling after using "lda.collapsed.gibbs.sampler" in LDA package

I have used "lda.collapsed.gibbs.sampler" for my topic modelling and LDA visualisation, and now I want to determine which number of topics (K) best fits my data. I then tried to use AIC/...
Pang kalok
4 votes
1 answer
341 views

Topic modelling many documents with low memory overhead

I've been working on a topic modelling project using BERTopic 0.16.3, and the preliminary results were promising. However, as the project progressed and the requirements became apparent, I ran into a ...
Bbrk24
  • 973
0 votes
1 answer
38 views

How to extract terms and probabilities from tmResult$terms in topic modeling?

I would like to create separate word clouds for each of my 8 topics in an LDA model. I extracted the top 40 words for each of the 8 topics, giving an object of length 320 that contains the top words and their occurrence probabilities. I ...
NoaMi
  • 41
0 votes
1 answer
88 views

How is coherence score calculated in Mallet?

I understand how the diagnostics output shows the coherence values for each topic, but my values range between -150 and -600, and other posts that I have seen where Mallet was used show coherence ...
Glorifier
0 votes
0 answers
53 views

Understanding and improving coherence values using Mallet

I am attempting to run an LDA topic model using Mallet. My corpus consists of user comments from news websites. It's a relatively small corpus with approx. 614k words. The first approach I took was to ...
Glorifier
