deleted 9 characters in body

When I try to run the following code on arrays with more than 10k elements, it takes hours, and I don't know how to make it more efficient.

Any ideas?

Updated data file.
kibs

Link to example data:

The data file is an xlsx file. It has three columns: label (the cluster label), feature_1 and feature_2.

To build C from the file and get the functions working, something like this should do:

import pandas as pd
import numpy as np

# Load the example spreadsheet
df = pd.read_excel('example_data.xlsx')
# One cluster per label value; each row becomes a [feature_1, feature_2] list
c1 = np.asanyarray(df[df['labels'] == 0].apply(lambda row: ([row['feature_1'], row['feature_2']]), axis=1))
c2 = np.asanyarray(df[df['labels'] == 1].apply(lambda row: ([row['feature_1'], row['feature_2']]), axis=1))
# C is the list of clusters passed to H_score
C = [c1, c2]
H_score(C)
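
For larger files, the row-wise apply above can itself get slow. A minimal sketch of building C with a groupby instead, assuming the label column is named `labels` as in the snippet (the description calls it label) and that the functions accept each cluster as a 2-D float array rather than an array of lists. The helper name `clusters_from_frame` and the small `demo` frame are hypothetical stand-ins, not part of the original code or data:

```python
import numpy as np
import pandas as pd


def clusters_from_frame(df, label_col="labels",
                        feature_cols=("feature_1", "feature_2")):
    """Return one 2-D float array per cluster label, ordered by label."""
    cols = list(feature_cols)
    # groupby sorts the group keys by default, so clusters come out in label order
    return [group[cols].to_numpy(dtype=float)
            for _, group in df.groupby(label_col)]


# Hypothetical stand-in for the contents of example_data.xlsx
demo = pd.DataFrame({
    "labels": [0, 1, 0, 1, 1],
    "feature_1": [0.1, 0.2, 0.3, 0.4, 0.5],
    "feature_2": [1.0, 2.0, 3.0, 4.0, 5.0],
})
C = clusters_from_frame(demo)
```

With the real file, `clusters_from_frame(pd.read_excel('example_data.xlsx'))` would take the place of the c1/c2 lines and also handles more than two clusters.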
Uploaded example data

Link to example data:

The data file is a Python 2.7 pickle object. It holds one array, C, which is an array of clusters (in this case len(C) = 2, i.e. two clusters). Each cluster is an array of arrays holding the vectorized representation of its observations.

Thank you!

edited tags; edited title
200_success