SQL Server 2016 is the most advanced and feature-packed release of SQL Server ever. From SQL Server 2005 through SQL Server 2014, there have been some very nice improvements and the product keeps getting better every release. However, SQL Server 2016 doesn’t just bring a new game, it brings a whole new playing field. Here are my top 10 most compelling reasons to upgrade to SQL Server 2016:
- Performance. All things being equal SQL Server 2016 offers an average of 40% better performance over SQL Server 2014. Keep in mind that I am talking about averages. Your mileage may vary, but given several different environments I witnessed anywhere from 10 to 70 percent improvement in database engine performance.
- PolyBase. At this point, we have all heard the buzz phrase “Big Data” and frankly, most of us are tired of hearing it. “Big Data” is really just “Data”. It’s just not the data that we commonly think of as easily accessible for analytics. Instead of nice structured relational data that is easy to query and analyze, it is messy unstructured and semi-structured data like web logs, twitter feeds and other miscellaneous data feeds. However, there is a great deal of useful data we can gather and analyze in such semi and un-structured data sources. The data analysis industry has been hard at work developing a system where that analysis is possible and there have been a number of breakthroughs. Probably the most significant is Hadoop. I won’t go in to Hadoop here, but the significant factor is that Hadoop enables to store a very large amount of loosely structured data in the Hadoop Distributed File System (HDFS) and use some very cool new tools to dig through and analyze that data. What Hadoop doesn’t allow us to do is naturally join that data with structured relational data to gain additional insights from joining structured and loosely structured data. This is the part of the story where PolyBase emerges.
PolyBase provides the mechanism and infrastructure to efficiently join structured and loosely structured data. It does this by connecting a SQL Server instance to a Hadoop cluster or Azure Blob Storage in such a way that we can use T-SQL to query across our structured relational store and the less structured data housed in Hadoop or Azure. There is no need to learn another language like HiveQL. We can start analyzing the data now, with the tools we already understand. PolyBase not only enables direct access to semi-structured data, it provides the mechanism to quickly move that data in to SQL Server or move the data from SQL Server into Hadoop or Azure Blob Storage reducing our reliance on complex ETL processes to move data from one platform to the other.
- R Services. R Services could easily have made up half of the new cool features, but that would be cheating so I will lump them all together. R Services is Microsoft’s foray in to pure data analytics. Since R is the world’s most powerful programming language for statistical computing, machine learning and data visualization, Microsoft’s release of an entire set of R services is huge. With the integration of R into SQL Server 2016, developing and deploying R inside one platform is made easy. OK, maybe not easy, but at least easier.R-Services are made up of Microsoft R Server, Microsoft R Client, Microsoft R Open and SQL Server R Services.Microsoft R Server is Microsoft’s implementation of the Open Source analytics platform for R. Microsoft’s implementation has some nice performance enhancements that aren’t necessarily available with the open source versions. However, it is still based on Open Source R and is compatible with packages and scripts written for other platforms.Microsoft R Client is Microsoft’s client-side data science tool. R Client is built on top of Microsoft R Open which means any package developed with it can be used on other Open Source tools and visa-versa. In addition, R Client along with R Server provides access to ScaleR technologies (previously RevoScaleR, created by Revolution Analytics which Microsoft acquired last year). ScaleR offers significant performance advantages with its proprietary functions that enable R to benefit from parallelization and remote computing. However, an important thing to keep in mind is that ScaleR requires the Enterprise license of SQL Server.Microsoft R Open is an enhanced open source distribution of the R platform. It is still open source which makes it compatible with all existing R packages, but it does deliver some performance enhancements. It is free to download and use. It’s a cost effective alternative to R Server. However, R Server, especially in the Enterprise edition, with its ScaleR support, provides significant performance improvements.SQL Server R Services is essentially “in-database” support of R. It runs alongside SQL Server and allows the execution of R scripts using a new T-SQL stored procedure along with several new extensions. These new objects enable the creation of traditional T-SQL programming objects like stored procedures, functions and views that can call on the power of the R analytics engine.
- R Services and PolyBase together. OK, I am cheating, but this deserves its own mention since it opens up a whole new world of possibilities. The ability to link multiple data sources of various source types, extract and analyze that data with R while using traditional T-SQL objects through the PolyBase interface extends both R and PolyBase possibilities.
- Stretch Database. Besides the new data analytics features, this is one of my favorites. I have been responsible for managing very large databases with huge tables. The challenge was always managing ease of access with maintainability and storage requirements. Every time I went to the operational folks asking for another terabyte of storage, I got the same “are you kidding me?” reaction. Telling them “storage is cheap” could result in a tar and feathering. The Stretch Database feature provides a mechanism to move less-accessed data to Azure. The data is securely moved into an Azure SQL Database where its availability is completely transparent to the client. When the tables that are so configured are queried, the database engine knows which rows are local and which rows are in Azure. Since it is more than just remote storage, the query optimizer takes advantage of the Azure SQL Databases processing power and passes the execution plan for the Azure-hosted data. You do, however, pay for the Azure Blob storage, which, generally speaking, is much cheaper than enterprise storage.
- JSON Support. SQL Server 2016 adds native support for Java Script Object Notation (JSON). Over 10 years ago Microsoft added native support for XML in SQL Server 2005. Now they have added a very similar support structure for JSON. Using the FOR JSON clause T-SQL can be used to create JSON output from querying database objects. Likewise, raw JSON can be shredded and translated to relational row and column format.
- Always Encrypted. A couple of interesting security and regulatory challenges to securing data in a database is protecting data from the evil administrator (or just curious one) and securing data in transit. Previously SQL Server encryption was essentially managed by the DBA. Encryption keys were created and managed by the admin configuring SQL Server across the entire server. Additionally, encrypted data was decrypted for transit, which required organizations to employ additional encryption and security in the transmission of data. To work around these issues many organizations would employ third party solutions to manage and protect sensitive data adding another layer of complexity and administration. With SQL Server 2016’s Always Encrypted feature, the data encryption is handled by the client. This means that encrypted data stored in the database is always encrypted. The DBA or system administrator has no ability to peruse the encrypted data. The same goes for any individual with read access. Without the client key, the data is a collection of seemingly random characters. The client key is managed, well, by the client. The actual keys to encrypt and decrypt data is stored in an external key store outside of SQL Server.
- Temporal Data Support. Have you ever looked at a table and wondered what the data looked like before it was updated? How about comparing old values with new values? Without some fairly complex schema changes and development effort, auditing a table’s data lineage is difficult. That’s where temporal data support comes in. After creating or altering a table to add temporal support all modifications to that table are tracked. This enables the querying of data for a particular time. I thought this feature needed just a bit more explanation so here is a very simple example:
I created a table called inventory to track the history of individual parts.
CREATE TABLE dbo.Inventory ( PartId int NOT NULL PRIMARY KEY CLUSTERED , PartName nvarchar(100) NOT NULL , PartDescription varchar(MAX) NOT NULL , Category varchar(100) NOT NULL , UPC bigint NOT NULL , MFPrice money NOT NULL , ValidFrom datetime2 (2) GENERATED ALWAYS AS ROW START , ValidTo datetime2 (2) GENERATED ALWAYS AS ROW END , PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo) ) WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.InventoryHistory));
Over a period of time the manufacturers price changes. In order to analyze the price fluctuations, I query the table:
SELECT PartName ,Category ,MFPrice ,CONVERT(varchar(14),ValidFrom,111) AS UpdateDate ,CONVERT(varchar(14),ValidTo,111) AS ValidTo FROM Inventory FOR SYSTEM_TIME ALL WHERE PartId = 987463 ORDER BY ValidFrom
Any rows coordinating with the PartId of 9874346, including updates, inserts and deletes will be returned.
PartName Category MFPrice UpdateDate ValidTo RedWidget Wagons 99.99 2016/08/01 2016/09/05 RedWidget Wagons 109.99 2016/09/05 2016/09/12 RedWidget Wagons 119.99 2016/09/12 2016/09/19 RedWidget Wagons 129.99 2016/09/19 2016/09/26 RedWidget Wagons 139.99 2016/09/26 2016/10/03 RedWidget Wagons 149.99 2016/10/03 2016/10/10 RedWidget Wagons 159.99 2016/10/10 2016/10/17 RedWidget Wagons 169.99 2016/10/17 2016/10/20
The earliest record shows when the row was initially added. The last row, since the ValidTo value is in the past shows when the record was deleted. Everything else is updates. Very Cool.
- Reporting Services. For a variety of reasons, I have never been a big fan of SQL Server Reporting Services – until now. After remaining virtually unchanged for the last dozen years or so it has finally been modernized and looks great. SSRS now includes support for Excel Reports, Power BI and even allows for the creation and deployment of mobile reports. It includes an HTML 5 rendering engine and a variety of new slick controls and chart types. An added bonus is the removal of the clunky Active X controls it has used in the past. The new reporting web portal that replaces the previous report manager has options for creating dashboards with KPIs sourced either manually or from defined data sources, the aforementioned mobile reports and paginated reports. This is a giant leap forward for reporting services.
- Everything Else. I am really cheating on number 10, but it was too difficult to choose so I am going with all the other added features and improvements. There are almost too many to mention. Some stand outs are Row Level Security, dynamic data masking, Availability Group support in Standard Edition, huge improvements in the ColumnStore and In-Memory feature as well as a new Query Store feature that records query execution plans over time. There have also been significant improvements in the Analysis Services and Integration Services engines that I didn’t even mention. The top most of those in my current position is the is the improvements in automated deployment possibilities, but there are plenty more that are very exciting to the Business Intelligence community.
So, there you have it. Having worked with SQL server since the SQL Server 6.5, I have never been more excited about a SQL Server release. The advances that have been made in the relational engine, analysis engine, integration services, reporting services and data science features in a single platform are groundbreaking.