I have created input data for machine learning as a CSV file. Each row pairs an input array with a label. Example:

[[55:32:1:23:41:243:255:11:182:192:231:201],"play"]

[[23:222:225],"talk"]

[[23:132:215:111:29:192],"talk"]

[55:32:1:23:41:243:255:11:182:192:231:201] play
[23:222:225] talk

I tried to train a model using the following code:

import tensorflow as tf
import numpy as np
np.set_printoptions(precision=3, suppress=True)
import pandas as pd
from tensorflow.keras import layers
from tensorflow.keras.layers.experimental import preprocessing
import io

data = pd.read_csv('./newTest4.csv',  header=None)
data_features=data.copy()
data_labels=data_features.pop(0)
data_features=np.array(data_features)
data_labels=np.array(data_labels)
data_labels 


data_model=tf.keras.Sequential ([
layers.Dense(64),
layers.Dense(1)
])
data_model.compile(loss=tf.losses.MeanSquaredError(),optimizer=tf.optimizers.Adam())


data_model.fit(data_features,data_labels,epochs=100)

But the output was:

UnimplementedError:  Cast string to float is not supported
     [[node mean_squared_error/Cast (defined at <ipython-input-18-ce25e735eaa4>:1) ]] [Op:__inference_train_function_1561]
Function call stack:
train_function
    You cannot use strings as inputs to ML or DL models. You have to encode the strings into vectors of numbers, using one of the available techniques, and then use those vectors as inputs to your model.
    Commented Jun 6, 2021 at 6:59

2 Answers

The model needs a numeric target that it can predict. If you have a fixed set of strings that you want to predict, you have to map each unique string to a binary variable.

An example is a 2-dimensional vector where the first dimension represents "play" and the second dimension represents "talk".

Your data then looks like this:

[[55:32:1:23:41:243:255:11:182:192:231:201],[1,0]] # "play", no "talk"

[[23:222:225], [0,1]] # no "play", "talk"

Now, the model can learn to predict whether the output is [1,0] (play) or [0,1] (talk).

This representation is called one-hot encoding; you can read about it in this blog post.
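The mapping described above can be sketched with NumPy. This is a minimal illustration, not the asker's actual pipeline: the `labels` array is hypothetical sample data mirroring the "play"/"talk" rows, and the column order follows the sorted class names.

```python
import numpy as np

# Hypothetical labels, mirroring the "play"/"talk" rows in the question.
labels = np.array(["play", "talk", "talk"])

# classes: sorted unique label strings.
# codes: for each label, its integer index into classes.
classes, codes = np.unique(labels, return_inverse=True)

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with codes yields one one-hot row per label.
one_hot = np.eye(len(classes), dtype=np.float32)[codes]

print(classes)
print(one_hot)
```

With targets shaped like this, the model's last layer would presumably need one unit per class (two here) rather than `Dense(1)`, since the output shape has to match the one-hot vectors.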

    Doesn't this work better for categorical data? I have different arrays representing the same activity ("talk", for example). Won't that keep this from working properly?
    – Kok129
    Commented Jun 7, 2021 at 11:55

You cannot train a model on categories represented as strings. Instead, encode each string as a unique integer value.

There is a blog post about how to encode categorical data: see "3 Ways to Encode Categorical Variables for Deep Learning".
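As a rough sketch of the integer encoding suggested in this answer, plain Python dictionaries are enough. The `labels` list is hypothetical sample data, and the names `label_to_id` and `id_to_label` are illustrative helpers, not part of any library.

```python
# Hypothetical labels, mirroring the "play"/"talk" rows in the question.
labels = ["play", "talk", "talk"]

# Assign each unique string an integer, in sorted order so the mapping is stable.
label_to_id = {name: i for i, name in enumerate(sorted(set(labels)))}

# Reverse mapping, useful for turning predictions back into strings.
id_to_label = {i: name for name, i in label_to_id.items()}

# Replace every label string with its integer id.
encoded = [label_to_id[name] for name in labels]

print(label_to_id)
print(encoded)
```

Integer class ids like these are typically paired with a loss such as `tf.keras.losses.SparseCategoricalCrossentropy` rather than mean squared error, because the ids are category indices and carry no numeric ordering.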
