73

Improve SQL Server Scalar UDF Performance in SQL Server 2019 with Scalar UDF Inl...

 5 years ago
source link: https://www.tuicool.com/articles/hit/v6Rre2I
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

By:Aaron Bertrand |   Last Updated: 2018-11-12   |  |   Related Tips:More > SQL Server 2019

Problem

User-defined functions have been an Achilles heel my entire career. Seemingly simple routines to apply formatting or perform a lookup caused inexplicable performance degradation, and often the evidence was obscured, or made excruciatingly worse by observer overhead. In this tip, I'm going to discuss the three types of functions in SQL Server, and what the next version does to address scalar UDF performance specifically.

Solution

There are three types of user-defined functions in SQL Server:

  1. Scalar user-defined functions (UDFs) - these are bad, since they have to be called for every row, and there is a lot of overhead there, which can lead to sub-optimal performance. They also are costed minimally and inhibit parallelism. The pain caused by scalar UDFs became easier to find when they added sys.dm_exec_function_stats in SQL Server 2016, if you knew to look for it. However, in all of this time they haven't done much else to improve performance here ( with one exception ).
  2. Multi-statement table-valued functions (TVFs) - these are also bad, since they have a fixed cardinality estimate (1 or 100, depending on version), which can lead to sub-optimal execution plans. The pain caused by multi-statement TVFs has been eased a little bit by interleaved execution , introduced in SQL Server 2017. Essentially, this pauses execution, determines cardinality, and adjusts optimizations, accordingly.
  3. Inline TVFs - no problems here. The logic is "inlined" into the query, ensuring fewer issues with runtime performance or cardinality estimations. You will notice when you review the execution plan of a query that references an inline TVF, the function is nowhere to be found. This is a key point in how scalar UDFs will be handled going forward as well.

Last night, I installed SQL Server 2019 CTP 2.1, and restored a copy of AdventureWorks. I ran the following query:

DBCC FREEPROCCACHE;
GO
SELECT TOP (100) SalesOrderID, [Status] = dbo.ufnGetSalesOrderStatusText([Status]) 
  FROM Sales.SalesOrderHeader
  ORDER BY SalesOrderID DESC;
GO
SELECT execution_count, total_elapsed_time
  FROM sys.dm_exec_function_stats 
  WHERE [object_id] = OBJECT_ID(N'dbo.ufnGetSalesOrderStatusText');

I got the following plan:

2yieEzj.png!web

Which included CPU and duration allocated to the user-defined function, as well as information about the function:

m6zemu2.png!web

And the DMV query yielded 10 executions with an elapsed time of 26 microseconds. The time in this case is negligible; the important point is that the DMV records every execution.

Then I remembered that, when I've restored an older database to a new version of SQL Server, I should change the compatibility level of the database to match the new version. This allows me to take advantage of any new optimizations:

ALTER DATABASE AdventureWorks SET COMPATIBILITY_LEVEL = 150;

I tried my query again, and "noticed" that it ran a bit quicker (I didn't really notice, but work with me here). I checked the plan, and it seemed more complex – and your gut instinct is probably to think that this must be worse:

67Zjiyj.png!web

But this is actually better and, if this query had been a candidate for parallelism, it could have now gone parallel, too.

Looking deeper, I noticed that the query against sys.dm_exec_functions now came back empty, and looking at the XML, no time was allocated to any UDF, and in fact there wasn't even a <UserDefinedFunction> node anymore:

JrEFruv.png!web

This is, coincidentally, how you determine that your UDF was inlined – it is not present in the XML. You can also tell when it hasn't happened, with a new Extended Event that fires when the optimizer encounters a UDF that it can't inline: tsql_scalar_udf_not_inlineable .

If a function did not inline, it should be easy to determine why. First, you can check if a function is inlineable in the first place, by looking at the new column is_inlineable in sys.sql_modules . This will be 1 for any function that *might* be inlined. It is important to note that this does not mean the function will always be inlined. It must not only conform to the requirements laid out in the official documentation , but also must pass other checks by the optimizer. These include things like complexity, level of nesting or recursion, and presence in a GROUP BY clause, as well as compatibility level, database scoped configuration settings, and hints. Basically, a lot of stars must align in order to make inlining happen. On the plus side, when it can happen, it will happen automatically – you don't have to go change your functions, or recompile the queries or modules that call them.

Unlike older versions of SQL Server, where you had to know esoteric trace flags to enable/disable certain optimizer features, scalar UDF inlining can be turned on or off in a variety of ways:

TSQL_SCALAR_UDF_INLINING
WITH INLINE = ON | OFF
OPTION (USE HINT('DISABLE_TSQL_SCALAR_UDF_INLINING'))

There was a trace flag in earlier versions of the SQL Server 2019 CTP to enable this functionality (which was disabled by default), but that flag is no longer necessary.

Summary

SQL Server 2019 provides a new mechanism to vastly improve the performance of scalar user-defined functions. The change is transparent, provided you are running in the current compatibility level and your function conforms to all of the requirements. This is simply one more "it just runs faster" win you'll enjoy when you upgrade. Ideally you will be working toward removing scalar UDFs, but this is one way you can restore decent performance without that work.

Next Steps

Read on for related tips and other resources:

Last Updated: 2018-11-12

J7JRjaZ.png!web

q2qQNb.png!web

About the author

NzEvEzv.jpg!web Aaron Bertrand (@AaronBertrand) is a Product Manager at SentryOne, with industry experience dating back to Classic ASP and SQL Server 6.5. He is editor-in-chief of the performance-related blog, SQLPerformance.com, and serves as a community moderator for the Database Administrators Stack Exchange.

View all my tips

Related Resources


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK