T-SQL Tuesday #137 – Using Notebooks Everyday
source link: https://glennsqlperformance.com/2021/04/13/t-sql-tuesday-137-using-notebooks-everyday/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
This month’s T-SQL Tuesday is being hosted by Steve Jones (blog|twitter). The subject is how are you using Jupyter notebooks every day. As Steve describes it, “I want you to write about how you have used, or would like to use, a Jupyter notebook“. This post will be my contribution for T-SQL Tuesday #137 – Using Notebooks Everyday.
This is a pretty focused subject area, that still gives everyone plenty of possibilities for blog topics. Being more focused on one topic might be bad, since it may discourage some people from writing a post. I hope that is not the case. Just writing a useful blog post, and getting quality links back and forth from other blogs in the SQL Server community is good for everyone who participates!
BTW, reading and then commenting on the T-SQL Tuesday blog posts of other participants helps both you and the other blogger. They get views and comments on their blog, and you get backlinks to your blog. T-SQL Tuesday also helps encourage people to blog more often, which is good.
In my case, I am going to talk about the Jupyter notebook version of my SQL Server 2019 Diagnostic Information Queries.
T-SQL Tuesday #137 – Using Notebooks Everyday
I have to admit that I do not use Jupyter notebooks or Azure Data Studio (ADS) everyday. Last August, I made separate Jupyter notebook versions of my SQL Server Diagnostic Information Queries. There was a separate version for SQL Server 2012 through SQL Server 2019, along with one for Azure SQL Database. This was after a number of requests from people in the community.
Creating these notebooks was a pretty decent amount of work. Luckily, this was right around the time that Azure Data Studio was making it much easier to edit and format markdown for the text blocks. Since then, Azure Data Studio is even easier to use for editing and formatting. Even more fortuitous was the fact that Julie Koesmarno (@MsSQLGirl) volunteered to greatly improve my formatting!
Unfortunately, there has not been as much interest in my Jupyter notebooks as I hoped for. There are probably a number of reasons for this.
Azure Data Studio Is Targeted at Developers
Azure Data Studio is currently a better tool for database developers rather than traditional database administrators. While Azure Data Studio continues to improve and it has a regular, rapid release cycle, it does not have full feature parity with SQL Server Management Studio (SSMS) for many common administrative tasks.
Experienced DBAs often have years of experience with SSMS, and it serves their daily needs pretty well. Getting many DBAs to completely switch to ADS will be difficult, but I don’t think that is Microsoft’s objective.
Developers Are Not Interested in Diagnostic Queries
This is obviously a sweeping generalization. It is probably more accurate to state that many database developers are not even aware of my SQL Server Diagnostic Queries. Even if they have heard of them, they may not realize how useful they can be for a developer. One key example is being able to see what cached queries and stored procedures are the most expensive for an instance or database.
In many shops, developers probably won’t have VIEW SERVER STATE permission on their Production databases, so they won’t even be able to run many of my queries. That is something that can easily be fixed in many cases.
Azure Data Studio Used to Have Performance Issues
I did a half-day session for the PASS Virtual Summit 2020 called “Dr. DMV with Jupyter Notebooks”. Because of the virtual nature of that event I had to record the session in advance, then edit it and upload it. This session is embedded below.
Dr. DMV with Jupyter Notebooks
Recording the session in advance was extra work, but it actually turned out to be a blessing in disguise. I was using ADS 1.22, which was the latest version available at the time. When I started recording the session, I discovered (to my horror) that ADS was nearly unusable with my SQL Server 2019 Diagnostic Information Queries Jupyter notebook loaded.
I was connected to a local instance of SQL Server 2019, and running individual queries and scrolling around in the editor and code cells was extremely slow and inconsistent. ADS would freeze up for multiple seconds doing nearly anything. If I had been running those queries live, it would have been a very bad experience for me and for my audience. It would have also made ADS look very bad.
Since I was recording, I was able to make a separate recording for each query, and edit out the freezes and delays. I would not have had that luxury if it was a live presentation.
BTW, my workstation at the time was an AMD Ryzen 9 3950X with 64GB of DDR4-3600 RAM and a GeForce RTX 2080 Super GPU, with my OS running on a 480GB Intel Optane 900P SSD. That is very fast client hardware, faster than most DBAs will probably have. I saw similar bad performance on other machines at the time, so it was not just some weird problem with my machine.
Azure Data Studio Performs Much Better Now!
ADS 1.26 included a raft of performance improvements according to the release notes. I did some testing on my current workstation this morning using the latest ADS 1.27 release. I was very pleasantly suprised by the performance.
My current workstation is an AMD Ryzen 5950X with 64GB of DDR4-3600 RAM and a GeForce RTX 2080 Super GPU, with my OS running on a 500GB Samsung 980 Pro PCIe 4.0 NVMe SSD. It will load my SQL Server 2019 Diagnostic Information Queries Jupyter notebook in about 7.5 seconds, which is a huge improvement.
I can also run all of the code cells in the notebook and get all of the results back in about two minutes and 45 seconds. This makes it feasible to run the entire notebook and have the saved results of the queries included in the saved notebook. Having this ability will make Jupyter notebooks much more interesting to me.
Don’t Use The Bundled Version of ADS!
Starting with SSMS 18.7, Microsoft began including a bundled version of the System Installer version of ADS that was installed when you installed SSMS whether you wanted it or not. I wasn’t a big fan of that, and I would have preferred that you could easily opt out of the bundled installation in the SSMS GUI.
This was not because I was anti-ADS. It was because SSMS is on a slower release cycle than ADS, and because of that, the bundled version of ADS that you get with each SSMS release is unlikely to be the latest version (depending on when you install it). Right now, with SSMS 18.8, you will get ADS 1.25.1, which is pretty old.
It is also likely that you may have two different versions of ADS installed. One may be the System installer version that SSMS installed for you, and the other may be a User installer version that you downloaded and installed yourself. Just look in the “Programs and Features” applet to see if this is the case. You can also check your ADS version in the Help, About dialog. Ideally, you want to be on ADS 1.26 or newer.
If you have multiple, possibly different versions of ADS installed, my advice is to uninstall them and then do a fresh User install of whatever the latest version is. You can always find the latest version here.
I hope that this post has been interesting and useful! I also want to thank Steve for hosting this month.
If you have any questions about this post, please ask me here in the comments or on Twitter. I am pretty active on Twitter as GlennAlanBerry. Thanks for reading!
Aggregate valuable and interesting links.
Joyk means Joy of geeK