All Questions
Tagged with topic-modeling java
33 questions
0 votes, 1 answer, 399 views
Mallet Topic Modeling in Java
What I find very hard with machine learning tutorials, books, and articles is that when a model is explained (even with code), the code only goes as far as training (and/or testing) the model. Then it stops. I ...
1 vote, 1 answer, 110 views
Mallet DMR: negative probability for feature-based topic distribution?
I've created a DMR topic model (via the Java API) which calculates the topic distribution based on the publication year of the documents.
The resulting distribution is a bit confusing, because there are ...
0 votes, 1 answer, 47 views
Java Mallet LDA keyword distributions
I have used the Mallet Java API for topic modelling with LDA. The API produces the following results:
topic : keyword1 (count), keyword2 (count)
For example
topic 0 : file (12423), test (3123) ...
topic 1 : ...
1 vote, 1 answer, 400 views
Use log likelihood to compare different Mallet topic models?
I'm trying to find out whether it's possible, and if so what the best way is, to programmatically compare different topic models created with Mallet to determine the "best"-fitting model for a given corpus.
...
1 vote, 1 answer, 111 views
Mallet outputting either topic weight 0.0 or 1.0 and nothing in between
So I created a little program using Mallet's API, following this example in the developer's guide. However, I do not understand the final weight output.
While the program is running it is outputting ...
1 vote, 0 answers, 242 views
MALLET Unable to restore instance list
I am trying to train a MALLET topic model that has been created using import-file, but I am presented with an error stating that MALLET was unable to restore the instance list. Additionally, I ...
0 votes, 1 answer, 118 views
Different topic distributions for the same data with Mallet topic modeling
I am using Mallet topic modeling and I have trained a model. Right after the training, I print the topic distribution for one of the documents of the training set and save it. Then, I try the same ...
3 votes, 2 answers, 590 views
Mallet Topic Modelling API - How to decide number of intervals needed or best for optimization?
Sorry, I'm quite a beginner in the field of NLP. As the title says, what is the best optimization interval in the Mallet API? I was also wondering whether it depends on or is related to the number of ...
0 votes, 1 answer, 99 views
Getting instances and topic sequences of all documents in Mallet
I'm working on topic modeling with the Mallet library. My data set is at the path filePath, and csvIterator seems to read the data correctly, because model.getData() has about 27000 rows, which matches the size of my data set.
I wrote a ...
1 vote, 1 answer, 89 views
Create a customized Pattern for my data set in Mallet
I'm using Mallet 2.0.7 in Java to mine tweets.
According to the documentation, for topic modeling I have to read the data set using CsvIterator.
Reader fileReader = new InputStreamReader(new ...
3 votes, 1 answer, 400 views
Error in Mallet Java
I want to do topic modelling, so I ran the command below:
bin\mallet train-topics --input web.mallet --output-state output-file.gz
It tells me: Topic modeling currently only supports feature ...
2 votes, 1 answer, 580 views
Extracting keywords from relevant topics using a trained MALLET Topic model
I'm attempting to use MALLET's TopicInferencer to infer keywords from arbitrary text using a trained model. So far my overall approach is as follows.
Train a ParallelTopicModel with a large set of ...
2 votes, 1 answer, 943 views
Strange perplexity values of LDA model trained with MALLET
I have trained an LDA model with MALLET on parts of the Stack Overflow data dump and did a 70/30 split for training and test data.
But the perplexity values are strange, because they are lower for ...
2 votes, 2 answers, 3k views
Java out of memory: increase heap space?
This seems to be a common issue; however, the existing solutions didn't work for me.
I am trying to perform topic modeling in R with the help of the mallet package.
The corpus consists of forum ...
0 votes, 1 answer, 89 views
Change order of columns in topic distribution file in MALLET
MALLET generates a tab-separated file with the topic distribution of each document by using the --output-doc-topics parameter while training the topic model. It kind of looks like this:
doc# ...