Turning a Keras Model into an Estimator

Google’s TensorFlow engine has a unique way of solving problems, allowing us to solve machine learning problems very efficiently. Nowadays, machine learning is used in almost all areas of life and work, with famous applications in computer vision, speech recognition, language translation, healthcare, and many more.

This article is an excerpt from the book, Machine Learning Using TensorFlow Cookbook by Alexia Audevart, Konrad Banachewicz, and Luca Massaron – A comprehensive cookbook for data scientists and ML engineers to master TensorFlow and create powerful machine learning algorithms, with valuable insights on Keras, Boosted Trees, Tabular Data, Transformers, Reinforcement Learning and more. 

We usually work out our linear regression models using specific Estimators from the tf.estimator module. This has clear advantages because our model is mostly run automatically and we can easily deploy it in a scalable way on the cloud (such as Google Cloud Platform) and on different kinds of servers (CPU-, GPU-, and TPU-based).

However, Estimators may not offer the flexibility in model architecture that our data problem requires, a flexibility that the modular Keras approach does provide.

In this article, we will remedy this by showing how we can transform Keras models into Estimators and thus take advantage of both the Estimators API and Keras versatility at the same time.

Getting ready 

We will use the same Boston Housing dataset as in the previous recipe, while also making use of the make_input_fn function. As before, we need our core packages to be imported: 

import tensorflow as tf  
import numpy as np 
import pandas as pd
import tensorflow_datasets as tfds 
tfds.disable_progress_bar() 

We will also need to import the Keras module from TensorFlow. 

import tensorflow.keras as keras 

Importing tf.keras as keras will also allow you to easily reuse any previous script that you wrote using the standalone Keras package.  

How to do it… 

Our first step will be to redefine the function creating the feature columns. We now have to specify an input to our Keras model, something that was not necessary with native Estimators since they just need a tf.feature_column function mapping the feature:

def define_feature_columns_layers(data_df, categorical_cols, numeric_cols): 
    feature_columns = [] 
    feature_layer_inputs = {}      

    for feature_name in numeric_cols: 
        feature_columns.append(tf.feature_column.numeric_column(feature_name, dtype=tf.float32)) 
        feature_layer_inputs[feature_name] = tf.keras.Input(shape=(1,), name=feature_name)          

    for feature_name in categorical_cols: 
        vocabulary = data_df[feature_name].unique() 
        cat = tf.feature_column.categorical_column_with_vocabulary_list(feature_name, vocabulary) 
        cat_one_hot = tf.feature_column.indicator_column(cat) 
        feature_columns.append(cat_one_hot) 
        feature_layer_inputs[feature_name] = tf.keras.Input(shape=(1,), name=feature_name, dtype=tf.int32)      

    return feature_columns, feature_layer_inputs 

The same goes for interactions. Here, too, we need to define the input that will be used by our Keras model (in this case, one-hot encoding):

def create_interactions(interactions_list, buckets=5): 
    feature_columns = []      

    for (a, b) in interactions_list: 
        crossed_feature = tf.feature_column.crossed_column([a, b], hash_bucket_size=buckets) 
        crossed_feature_one_hot = tf.feature_column.indicator_column(crossed_feature) 
        feature_columns.append(crossed_feature_one_hot)          

    return feature_columns 
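
To make the idea of a crossed column more concrete, here is a minimal pure-Python sketch of hashing a pair of feature values into a fixed number of buckets and one-hot encoding the result. It is an illustration only: the helper names cross_bucket and one_hot are hypothetical, and the MD5-based hash is a stand-in, so the bucket assignments will not match those TensorFlow computes internally.

```python
import hashlib

def cross_bucket(values, num_buckets=5):
    """Map a tuple of feature values to a bucket index in [0, num_buckets)."""
    # Join the values into one key; MD5 is only a deterministic stand-in
    # for the hash that TensorFlow applies internally (an assumption here).
    key = "_X_".join(str(v) for v in values).encode("utf-8")
    digest = hashlib.md5(key).hexdigest()
    return int(digest, 16) % num_buckets

def one_hot(index, depth):
    """Return a list of length `depth` with a single 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(depth)]

# Hypothetical pair of feature values, not taken from the dataset.
bucket = cross_bucket(("6", "12"), num_buckets=5)
encoded = one_hot(bucket, depth=5)
print(bucket, encoded)
```

With only five buckets, distinct pairs of values will often land in the same bucket; a larger hash_bucket_size reduces such collisions at the cost of more weights to learn.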

After preparing the necessary inputs, we can proceed to the model itself. The inputs will be collected in a feature layer that will pass the data to a BatchNormalization layer, which will automatically standardize it. After that, the data will be directed to the output node, which will produce the numeric output.

def create_linreg(feature_columns, feature_layer_inputs, optimizer):   

    feature_layer = keras.layers.DenseFeatures(feature_columns) 
    feature_layer_outputs = feature_layer(feature_layer_inputs) 
    norm = keras.layers.BatchNormalization()(feature_layer_outputs) 
    outputs = keras.layers.Dense(1, kernel_initializer='normal', activation='linear')(norm)      

    model = keras.Model(inputs=[v for v in feature_layer_inputs.values()], outputs=outputs) 
    model.compile(optimizer=optimizer, loss='mean_squared_error') 
    return model 
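
What the BatchNormalization layer does to each training batch can be sketched in NumPy as follows. This is a simplified illustration, not the Keras implementation: it omits the moving averages used at inference time and fixes the learnable scale (gamma) and shift (beta) at their initial values of 1 and 0. The function name batch_standardize is hypothetical; the epsilon of 0.001 mirrors the Keras default.

```python
import numpy as np

def batch_standardize(x, gamma=1.0, beta=0.0, epsilon=1e-3):
    """Standardize each column of a batch, then apply scale and shift."""
    mean = x.mean(axis=0)   # per-feature mean over the batch
    var = x.var(axis=0)     # per-feature (population) variance
    x_hat = (x - mean) / np.sqrt(var + epsilon)
    return gamma * x_hat + beta

# Two features on very different scales.
batch = np.array([[1.0, 100.0],
                  [2.0, 300.0],
                  [3.0, 500.0]])
normalized = batch_standardize(batch)
print(normalized.mean(axis=0))  # approximately zero for each feature
print(normalized.std(axis=0))   # close to one for each feature
```

Because both features end up on a comparable scale, the weights of the following Dense layer become comparable in magnitude as well.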

At this point, with all the necessary functions defined, we can run them:

categorical_cols = ['CHAS', 'RAD']
numeric_cols = ['CRIM', 'ZN', 'INDUS', 'NOX', 'RM', 'AGE', 'DIS', 'TAX', 'PTRATIO', 'B', 'LSTAT']
feature_columns, feature_layer_inputs = define_feature_columns_layers(data, categorical_cols, numeric_cols)
interactions_columns = create_interactions([['RM', 'LSTAT']])

feature_columns += interactions_columns   

optimizer = keras.optimizers.Ftrl(learning_rate=0.02) 
model = create_linreg(feature_columns, feature_layer_inputs, optimizer) 

We have now obtained a working Keras model. We can convert it into an Estimator using the model_to_estimator function. This requires a temporary directory for the Estimator’s outputs:

import tempfile   

def canned_keras(model): 
    model_dir = tempfile.mkdtemp() 
    keras_estimator = tf.keras.estimator.model_to_estimator( 
        keras_model=model, model_dir=model_dir) 
    return keras_estimator 
estimator = canned_keras(model) 

Having canned the Keras model into an Estimator, we can proceed as before to train the model and evaluate the results. 

train_input_fn = make_input_fn(train, y_train, num_epochs=1400) 
test_input_fn = make_input_fn(test, y_test, num_epochs=1, shuffle=False)   

estimator.train(train_input_fn) 
result = estimator.evaluate(test_input_fn)   

print(result) 

When we plot the fitting process using TensorBoard, we will observe how the training trajectory is quite similar to the one obtained by previous Estimators: 

 Figure: Canned Keras linear Estimator training  

Canned Keras Estimators are indeed a quick and robust way to bind together the flexibility of user-defined solutions by Keras and the high-performance training and deployment from Estimators. 

How it works… 

The model_to_estimator function is not a wrapper of your Keras model. Instead, it parses your model and transforms it into a static TensorFlow graph, allowing distributed training and scaling for your model. 

There’s more… 

One great advantage of using linear models is being able to explore their weights and get an idea of which features are producing the result we obtained. Given that the inputs are standardized by the batch normalization layer, each coefficient will tell us how strongly that feature affects the result relative to the others (the coefficients are comparable in terms of absolute value) and whether it is adding to or subtracting from the result (given a positive or negative sign):

weights = estimator.get_variable_value('layer_with_weights-1/kernel/.ATTRIBUTES/VARIABLE_VALUE') 
print(weights) 

However, if we extract the weights from our model, we will find that we cannot easily interpret them: they have no labels, and their dimensionality differs from that of the original features because the tf.feature_column functions have applied different transformations.
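
As a rough check on that dimensionality, the number of rows in the weight matrix can be worked out from the feature definitions: one per numeric column, one per vocabulary entry of each one-hot encoded categorical column, and one per hash bucket of each crossed column. The sketch below uses a hypothetical helper, expected_weight_count, and assumes that CHAS has 2 distinct values and RAD has 9, which should be checked against your own data.

```python
def expected_weight_count(n_numeric, vocab_sizes, bucket_sizes):
    """Count the inputs reaching the final Dense layer.

    n_numeric:    number of numeric columns (one weight each)
    vocab_sizes:  vocabulary size of each one-hot encoded categorical column
    bucket_sizes: hash_bucket_size of each one-hot encoded crossed column
    """
    return n_numeric + sum(vocab_sizes) + sum(bucket_sizes)

# 11 numeric columns, CHAS and RAD (assumed 2 and 9 levels), one 5-bucket cross.
n_weights = expected_weight_count(11, vocab_sizes=[2, 9], bucket_sizes=[5])
print(n_weights)  # 27
```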

We need a function that can extract the correct labels from our feature columns, as we mapped them prior to feeding them to our canned Estimator:

def extract_labels(feature_columns): 
    labels = list() 
    for col in feature_columns: 
        col_config = col.get_config() 
        if 'key' in col_config: 
            labels.append(col_config['key']) 
        elif 'categorical_column' in col_config: 
            if col_config['categorical_column']['class_name']=='VocabularyListCategoricalColumn': 
                key = col_config['categorical_column']['config']['key']
                for item in col_config['categorical_column']['config']['vocabulary_list']:
                    labels.append(key+'_val='+str(item))
            elif col_config['categorical_column']['class_name']=='CrossedColumn': 
                keys = col_config['categorical_column']['config']['keys'] 
                for bucket in range(col_config['categorical_column']['config']['hash_bucket_size']): 
                    labels.append('x'.join(keys)+'_bkt_'+str(bucket)) 
    return labels 

This function only works with TensorFlow version 2.2 or later because in earlier TF 2.x versions the get_config method was not present in tf.feature_column objects.

Now we can extract all the labels and meaningfully match each weight in the output to its respective feature: 

labels = extract_labels(feature_columns)   

for label, weight in zip(labels, weights): 
    print(f"{label:15s} : {weight[0]:+.2f}") 

Once you have the weights, you can easily get the contribution of each feature to the result by observing the sign and the magnitude of each coefficient. The scale of a feature can, however, influence the magnitude unless you have previously standardized the features by subtracting the mean and dividing by the standard deviation.
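
The effect of feature scale on coefficient magnitude can be seen in a small NumPy experiment. The data below is synthetic and all names are illustrative; ordinary least squares via np.linalg.lstsq stands in for the trained linear model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x1 = rng.normal(0.0, 1.0, n)     # feature on a unit scale
x2 = rng.normal(0.0, 100.0, n)   # feature on a scale 100 times larger
y = 2.0 * x1 + 0.05 * x2         # noiseless target, for clarity

def fit_ols(features, target):
    """Least-squares coefficients, fitted with an intercept column."""
    design = np.column_stack([features, np.ones(len(target))])
    coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coefs[:-1]  # drop the intercept

raw = np.column_stack([x1, x2])
standardized = (raw - raw.mean(axis=0)) / raw.std(axis=0)

raw_coefs = fit_ols(raw, y)
std_coefs = fit_ols(standardized, y)
print(raw_coefs)  # x2 looks negligible next to x1
print(std_coefs)  # after standardization, x2 has the larger effect
```

On the raw features, the coefficient of x2 is small only because x2 takes large values. Once both features are standardized, each coefficient equals the raw coefficient multiplied by the feature's standard deviation, so the magnitudes become directly comparable.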

Summary of Keras Models

In this article, we saw through practical code examples how Keras models can be turned into Estimators, giving our model architecture much-needed flexibility while keeping the benefits of the Estimators API.

About the Authors

Alexia Audevart, also a Google Developer Expert in machine learning, is the founder of datactik. She is a data scientist and helps her clients solve business problems by making their applications smarter. Her first book is a collaboration on artificial intelligence and neuroscience. 

Konrad Banachewicz holds a PhD in statistics from Vrije Universiteit Amsterdam. He is a lead data scientist at eBay and a Kaggle Grandmaster. He worked in a variety of financial institutions on a wide array of quantitative data analysis problems. In the process, he became an expert on the entire lifetime of a data product cycle. 

Luca Massaron is a Google Developer Expert in machine learning with more than a decade of experience in data science. He is also the author of several best-selling books on AI and a Kaggle master who reached number 7 for his performance in data science competitions. 

ODSC Community

The Open Data Science community is passionate and diverse, and we always welcome contributions from data science professionals! All of the articles under this profile are from our community, with individual authors mentioned in the text itself.
