Data scraping and preprocessing are done. Now I’ll fit the data with a simple BP (backpropagation) neural network, built with TensorFlow 2.1.0.


Input/Output Selection

Since we’re just testing the BP neural network, let’s keep inputs and outputs simple.

INPUT: place_id, house_type, house_area, house_towards

OUTPUT: house_price

The four inputs are:

  1. Area ID (I’ve numbered each area, e.g., Andingmen: 1)
  2. House type (also numbered, e.g., Board-style: 0)
  3. House area (in square meters)
  4. House direction/towards (also numbered, e.g., South: 0)

The output is the house transaction price (in 10,000 yuan).
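The numbering of the categorical inputs described above can be sketched roughly as follows. This is only an illustration, not the preprocessing code actually used: the helper build_id_map, the sample rows, and the names "Chaoyangmen", "Tower-style", and "North" are all hypothetical, and starting area IDs at 1 is an assumption based on the "Andingmen: 1" example.

```python
def build_id_map(values, start=0):
    """Assign consecutive integer IDs to distinct values, in first-seen order."""
    id_map = {}
    for value in values:
        if value not in id_map:
            id_map[value] = start + len(id_map)
    return id_map


# Hypothetical raw records: (area name, house type, house area in m^2, direction)
raw_rows = [
    ("Andingmen", "Board-style", 58.3, "South"),
    ("Andingmen", "Tower-style", 71.0, "North"),
    ("Chaoyangmen", "Board-style", 45.5, "South"),
]

# Area IDs start at 1 here (assumption); types and directions start at 0.
place_ids = build_id_map((row[0] for row in raw_rows), start=1)
type_ids = build_id_map(row[1] for row in raw_rows)
toward_ids = build_id_map(row[3] for row in raw_rows)

# Each record becomes (place_id, house_type, house_area, house_towards).
encoded = [
    (place_ids[place], type_ids[kind], area, toward_ids[direction])
    for place, kind, area, direction in raw_rows
]
print(encoded)
```

The integer IDs carry no order or magnitude, which is why they are expanded with one-hot encoding before being fed to the network.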


Data Processing

After flattening the dataset, the input data X_data becomes an array of shape (150731, 310), where 310 = 4 + 8 + 297 + 1. I one-hot encoded house type, house direction, and area ID, then concatenated the results: 4 columns for the 4 house types, 8 columns for the 8 directions, and 297 columns for the 297 areas. The last column is the house area in square meters.
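To make the 310-column layout concrete, here is a minimal NumPy sketch of the encoding. The helper names one_hot and build_features and the three sample rows are hypothetical, and the sketch assumes every ID is zero-based (so an area numbering that starts at 1 would need 1 subtracted first).

```python
import numpy as np

N_TYPES, N_DIRECTIONS, N_PLACES = 4, 8, 297  # category counts from the text


def one_hot(ids, depth):
    """Return a (len(ids), depth) float32 matrix with a single 1 per row."""
    ids = np.asarray(ids, dtype=np.int64)
    out = np.zeros((ids.shape[0], depth), dtype=np.float32)
    out[np.arange(ids.shape[0]), ids] = 1.0
    return out


def build_features(place_id, house_type, house_area, house_towards):
    """Concatenate the blocks in the order 4 + 8 + 297 + 1 = 310 columns."""
    area = np.asarray(house_area, dtype=np.float32).reshape(-1, 1)
    return np.concatenate(
        [
            one_hot(house_type, N_TYPES),
            one_hot(house_towards, N_DIRECTIONS),
            one_hot(place_id, N_PLACES),  # assumes zero-based area IDs
            area,                         # raw house area in square meters
        ],
        axis=1,
    )


# Three made-up rows, only to check the resulting shape.
X_demo = build_features(
    place_id=[0, 296, 17],
    house_type=[0, 3, 1],
    house_area=[58.3, 120.0, 45.5],
    house_towards=[0, 7, 2],
)
print(X_demo.shape)  # (3, 310)
```

Note that the last column is on a very different scale from the 0/1 columns; rescaling it is one of the things "better preprocessing" could mean here.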


Building the Model

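from tensorflow import keras
from tensorflow.keras import layers

# Fully connected network: 310 input features -> 128 -> 64 -> 32 -> 1 output.
# The last layer has no activation because this is a regression problem.
model = keras.Sequential([
    layers.Dense(128, activation='relu', input_shape=(310,)),
    layers.Dense(64, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(1)
])

# Minimize mean squared error; also report mean absolute error,
# which is in the same unit as house_price (10,000 yuan).
model.compile(optimizer='adam',
              loss='mse',
              metrics=['mae'])

# X_train has shape (n_samples, 310); y_train holds the matching prices.
# validation_split=0.2 holds out the last 20% of the rows (taken before
# shuffling) for validation.
model.fit(X_train, y_train, epochs=50, batch_size=32, validation_split=0.2)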
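The training call above uses X_train and y_train without showing where they come from. One possible way to produce them is a shuffled hold-out split, sketched below with NumPy. The function shuffle_split, the 20% test fraction, and the tiny demo arrays are assumptions for illustration only. Shuffling first matters because Keras takes the validation_split rows from the end of the arrays before any shuffling, so data sorted by area would give a skewed validation set.

```python
import numpy as np


def shuffle_split(X, y, test_fraction=0.2, seed=0):
    """Shuffle rows with a fixed seed, then hold out a fraction as a test set."""
    X = np.asarray(X)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]


# Tiny stand-in arrays; in the post these would be X_data and the prices.
X_demo = np.arange(20, dtype=np.float32).reshape(10, 2)
y_demo = np.arange(10, dtype=np.float32)

X_train, y_train, X_test, y_test = shuffle_split(X_demo, y_demo)
print(X_train.shape, X_test.shape)
```

After training, model.evaluate(X_test, y_test) would report the loss and MAE on rows the network never saw.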

Summary

This is a simple neural network for housing price prediction. In production, you’d need:

  • More features
  • Better preprocessing
  • Cross-validation
  • Model optimization
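As one example from the list above, k-fold cross-validation could be sketched like this. The helpers k_fold_indices and cross_validate are hypothetical, and mean_baseline_mae is only a stand-in scorer; in practice the scorer would build and fit a fresh Keras model on each fold and return its validation MAE.

```python
import numpy as np


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k shuffled, disjoint folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, folds[i]


def cross_validate(X, y, fit_and_score, k=5):
    """Run fit_and_score on every fold; return the mean and std of the scores."""
    scores = [
        fit_and_score(X[tr], y[tr], X[va], y[va])
        for tr, va in k_fold_indices(len(X), k)
    ]
    return float(np.mean(scores)), float(np.std(scores))


def mean_baseline_mae(X_tr, y_tr, X_va, y_va):
    """Stand-in scorer: predict the mean training price, return validation MAE."""
    return float(np.mean(np.abs(y_va - y_tr.mean())))


# Made-up data, only to show the call pattern.
X_demo = np.zeros((10, 2), dtype=np.float32)
y_demo = np.linspace(100.0, 1000.0, 10)
mean_mae, std_mae = cross_validate(X_demo, y_demo, mean_baseline_mae, k=5)
print(mean_mae, std_mae)
```

A baseline like this is also a useful reference point: the network is only worth keeping if its cross-validated MAE is clearly lower.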