I am trying to use a neural network to predict the price of houses. Here is what the top of the dataset looks like:
Price   Beds  SqFt  Built  Garage  FullBaths  HalfBaths  LotSqFt
485000  3     2336  2004   2       2.0        1.0        2178.0
430000  4     2106  2005   2       2.0        1.0        2178.0
445000  3     1410  1999   1       2.0        0.0        3049.0
...
Here is how I am coding the neural network (using Keras in Python):
dataset = df.values
X = dataset[:,1:8]
Y = dataset[:,0]
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
min_max_scaler = preprocessing.MinMaxScaler()
X_scale = min_max_scaler.fit_transform(X)
X_scale
X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(X_scale, Y, test_size=0.3)
X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.5)
print(X_train.shape, X_val.shape, X_test.shape, Y_train.shape, Y_val.shape, Y_test.shape)
from keras.models import Sequential
from keras.layers import Dense
model = Sequential(
    Dense(32, activation='relu', input_shape=(7,)),
    Dense(1, activation='relu'))
model.compile(optimizer='sgd',
              loss='mse',
              metrics=['mean_squared_error'])
hist = model.fit(X_train, Y_train,
                 batch_size=32, epochs=100,
                 validation_data=(X_val, Y_val))  # Error here!
model.evaluate(X_test, Y_test)[1]  # Same error here!
I get the same error on both the hist = model.fit(...) line and the model.evaluate(...) line. Here is the error information:
TypeError Traceback (most recent call last)
<ipython-input-19-522714a352ba> in <module>
----> 1 hist = model.fit(X_train, Y_train,
2 batch_size=32, epochs=100,
3 validation_data=(X_val, Y_val))
~/opt/anaconda3/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py in _method_wrapper(self, *args, **kwargs)
106 def _method_wrapper(self, *args, **kwargs):
107 if not self._in_multi_worker_mode(): # pylint: disable=protected-access
--> 108 return method(self, *args, **kwargs)
109
110 # Running inside `run_distribute_coordinator` already.
...
TypeError: in user code:
...
name_scope += '/'
TypeError: unsupported operand type(s) for +=: 'Dense' and 'str'
I'm not sure why this is happening, because when I run df.dtypes on my original dataframe, all of the columns are either integers or floats.
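A note on the traceback: the final message names 'Dense' and 'str', not a numeric dtype, so the data columns may not be involved at all. One possible reading is that a Dense object ended up where a string name was expected. Sequential takes its layers as the first argument and a name as the second, so layers passed as separate positional arguments (rather than in a list) could land in the name slot. The sketch below reproduces that mechanism in plain Python. ToySequential, ToyDense, and try_scope are hypothetical stand-ins written for this illustration and are not the Keras classes.

```python
class ToyDense:
    """Hypothetical placeholder for a layer object (not keras.layers.Dense)."""

    def __init__(self, units):
        self.units = units


class ToySequential:
    """Hypothetical stand-in (not keras.Sequential).

    It only mimics a constructor whose positional parameters are
    (layers, name), which is the assumption being illustrated.
    """

    def __init__(self, layers=None, name=None):
        self.layers = layers
        self.name = name

    def name_scope(self):
        scope = self.name
        scope += '/'  # only works if the name is a str
        return scope


def try_scope(model):
    """Return the scope string, or the TypeError message if it fails."""
    try:
        return model.name_scope()
    except TypeError as exc:
        return f"TypeError: {exc}"


# Layers passed as separate positional arguments: the second one
# is bound to `name`, so string concatenation fails later.
bad = ToySequential(ToyDense(32), ToyDense(1))

# Layers wrapped in a list: `name` stays a string.
# In Keras terms this would correspond to Sequential([Dense(...), Dense(...)]).
good = ToySequential([ToyDense(32), ToyDense(1)], name="toy")

bad_result = try_scope(bad)
good_result = try_scope(good)
print(bad_result)
print(good_result)
```

If this reading is right, wrapping the two Dense layers in a list in the original Sequential(...) call would be the first thing to try. It would also explain why the error only appears at fit and evaluate, when the name is first used to build a scope, rather than when the model is constructed.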