SQL Server 2016 Machine Learning Services: A Work Around for Single Value Outputs of the Raw data type in R

By | 2017-12-04T17:46:18+00:00 December 4th, 2017|Advanced Analytics, Data Science, SQL Server|0 Comments

SQL server 2016 (and 2017) enables SQL developers to write in database analytics in the form of stored procedures. The ideal flow of events is described in the below visual. Please note that the stored procedures in the visual insert results directly into the tables.

 

The key step that I would like to highlight in the diagram is step three. When we train a machine learning model we save it as a varbinary(max) in a SQL Server table. To save it as a binary data type we must convert the model to type raw in R prior to outputting it to SQL. However, when a raw data type is returned from R inside SQL Server 2016, the stored procedure can only return a single value. This makes the above flow of events not possible.

The diagram above does not work to store a machine learning model because we would like to track the metadata surrounding it. Our model storage table will likely contain models from all types of machine learning solutions we deploy in a SQL Server database, therefore, the metadata surrounding the model allows us to track and see the evolution of our solution over time. For example, we may want to know the date the model was created, metrics about performance (accuracy, precision, etc.), and the name of the model.

An alternative flow that I suggest to customers as a work around to this issue is below:

 

The above diagram simply inserts one step to the flow of events. Previously, the train model stored procedure had only a varbinary(max) output value containing the model. Now we will still serialize the model in our R script, however, we output the model from a stored procedure as a varchar(max). This allows us to output more than a single value so that we can save that information in a table. The added stored procedure executes the train model stored procedure, which returns a result set, then the original stored procedure casts the varchar(max) to varbinary(max), and inserts the result set into a SQL table to be used later in time.

This is a very general overview of how to work around the binary output from stored procedures. If you have any questions about how to implement this technically, or if you have other questions about SQL Server 2016 Machine Learning Services, please contact clientservices@nwcadence.com.

About the Author:

Leave A Comment