It’s Time For Our Annual Anniversary Sale!

Company News
0

I first registered BrentOzar.com way back in May 2001.

Over the years, some things in databases have changed a lot, but some things still remain the same.

Every year, people get handed applications and databases, and they’re told, “The back end is Microsoft SQL Server. Figure it out and make it work – faster.”

Every year, people struggle. They try to learn as much as they can via Google, free blog posts and webcasts, and their peers. They put the pieces together as best they can, but eventually, they hit a wall.

And that’s where my classes come in.

Fundamentals, Yearly
$195Save $200
Fundamentals + Mastering, Yearly
$995Save $300

The sale ends May 31. To get these deals, you have to check out online through our e-commerce site by clicking the buttons above. During the checkout process, you’ll be offered the choice of a credit card or buy now pay later.

Can we pay via bank transfer, check, or purchase order? Yes, but only for 10 or more seats for the same package, and payment must be received before the sale ends. Email us at Help@BrentOzar.com with the package you want to buy and the number of seats, and we can generate a quote to include with your check or wire. Make your check payable to Brent Ozar Unlimited and mail it to 9450 SW Gemini Drive, ECM #45779, Beaverton, OR 97008. Your payment must be received before we activate your training, and must be received before the sale ends. Payments received after the sale ends will not be honored. We do not accept POs as payment unless they are also accompanied with a check. For a W9 form: https://downloads.brentozar.com/w9.pdf

Can we send you a form to fill out? No, to keep costs low during these sales, we don’t do any manual paperwork. To get these awesome prices, you’ll need to check out through the site and use the automatically generated PDF invoice/receipt that gets sent to you via email about 15-30 minutes after your purchase finishes. If you absolutely need us to fill out paperwork or generate a quote, we’d be happy to do it at our regular (non-sale) prices – email us at Help@BrentOzar.com.


I’m Getting Antsy for SQL Server vNext.

SQL Server 2025
9 Comments

Historically, Microsoft publicly announces the next version of SQL Server about a year before it ships. For example:

This was pretty consistent with past releases: about a year before the release, Microsoft would go on record with the features included with the next version, and some companies would be able to start using it in production to prepare for the eventual release.

<checks calendar>

Well, it’s been about 2.5 years since SQL Server 2022’s announcement, so we’re looking at at least a 3-year release cycle. We might be coming up on the announcements for SQL Server 2025.

They could either announce the next version’s features at Microsoft Build on May 21-24 or at the PASS Summit on Nov 4-8. I’ll be watching the Build session catalog as Microsoft updates it because when Microsoft announces a new product, the related sessions are hidden until after the product is officially announced. It’s free to attend Build online, so if they announce 2025 during a keynote, I’ll tune in to the related general sessions.

I personally love a 3-year release cadence, and I don’t want new versions any more frequently than that. People have a hard time upgrading, and frankly, Microsoft has a hard time shipping that quickly. Just release every 3 years, and make sure the product is actually ready at the time of release. That’s good enough.

I’ve heard some folks say, “Maybe Microsoft is done with SQL Server because they want everyone to move to Azure.” Well, I’m sure they do, but they also acknowledge the reality that many companies host their infrastructure in Amazon and Google. Those clients need SQL Server licensing, because you can’t practically run Azure SQL DB in other clouds.

Microsoft also has to keep releasing new versions of SQL Server because they need to keep cashing your sweet, sweet Software Assurance checks. While SA does come with other benefits, like free licensing for a couple/few passive standby servers for HA and DR, those benefits aren’t priceless. If Microsoft didn’t ship a new version for, say, 5 years, I know some companies that would simply stop paying Software Assurance. Even if these companies don’t actually upgrade that often, they pay for the ability to know that they could upgrade – which means Microsoft has to keep bringing out versions & features.

What might Microsoft add in 2025? Clearly, AI is the buzzword du jour, and Microsoft’s Joe Sack has already demoed some of the Copilot things they’re working on in Azure SQL DB. Copilot integration requires internet connectivity, something that’s easier to integrate in cloud databases like Azure SQL DB. If you try to do large language model work on-premises, you’ll quickly learn that it’s extremely CPU and GPU intensive, and that doesn’t really make sense for a product like SQL Server that’s licensed by the CPU core. I’ll be curious to see how they package AI services to a crowd that sometimes screams in horror at the idea of giving their database servers access to the Internet.

Would you be okay giving your SQL Servers access to the Internet? And what AI-related features would you love to see included?


Who’s Changing the Table? Answers and Discussion

Your challenge for this week was to find out who keeps mangling the contents of the AboutMe column in the Stack Overflow database.

Conceptually, there are a lot of ways we can track when data changes: Change Tracking, Change Data Capture, temporal tables, auditing, and I’m sure I’m missing more. But for me, there are a couple of key concerns when we need to track specific changes in a high-throughput environment:

  • I need to capture the login, app name, and host name that made the change
  • I need to capture a subset of table columns, not all
  • I need to capture a subset of the before & after changes, not all

For example, in this week’s query challenge, the Users table has a lot of really hot columns that change constantly, like LastAccessDate, Reputation, DownVotes, UpVotes, and Views. I don’t want to log those changes at all, and I don’t want my logging to slow down the updates of those columns.

Furthermore, I probably don’t even want to capture the entire before or after values of the AboutMe column, either. It’s NVARCHAR(MAX), which can be huge. Depending on what “mangling” means, I might only need to grab the first 100 characters, or the length of the before/after changes. That’ll reduce how much I store, too, important in high-transaction environments that push the data to several Availability Group replicas.

Let’s say here’s the data I decided to gather:

You could refine that further by capturing less data if you knew exactly how the data was being mangled. For example, if the mangler always sets the rows to null or to short strings, you wouldn’t have to gather the before/after contents – just the length would be enough.

You could also get a little fancier by only storing the “before” data, but I kept it simple and just logged both before & after here for simplicity’s sake. (Spoiler alert: the next Query Exercise at the end of this post is going to be a related challenge.)

To populate this data, I love triggers. Here’s the one I came up with after a little testing:

The actual query plan on an update is nice and quick. A few things about the code that may not be intuitively obvious:

  • I didn’t use the UPDATE() function because it doesn’t handle multi-row changes accurately
  • The SET NOCOUNT ON stops the trigger from reporting back the number of rows affected by the insert, which can break apps that weren’t expecting to see multiple messages back about how many rows just got inserted/updated
  • The COALESCEs in the WHERE are to handle situations where someone sets the AboutMe to null (or changes it from null to populated)
  • The COALESCEs in the SELECT are to properly set the length columns to 0 when the AboutMe is null
  • I only fire the trigger on updates, not inserts or deletes, because the business request was about mangling existing AboutMe data

If you knew the exact kind of mangling that was happening, you could refine the WHERE clause even further, looking for specific data patterns.

Here’s what the output table contents look like after changing a few rows:

Nice and simple, and makes for really easy investigations. Just make sure to drop the trigger and the change table when you’re done! I had a really hearty laugh at one client when I returned a year later and they still had both in place.

Other Solutions

Tom aka Zikato aka StraightforwardSQL commented with a pointer to an awesome blog post he’s written on this very topic! He compared & contrasted a few different solutions, and then ended up with a hybrid solution involving a trigger, XE, and Query Store.

Connor O’Shea wrote an extended events session that filters query text based on its literal contents. I get a little nervous about that kind of thing, because they often miss weirdo situations (like synonyms and aliases), and they’re a little tricky to debug. For example, his first iteration also caught selects – something that would be terrible on a production system. It’s a good starting point though.

Erik Darling used temporal tables to capture the changes, and I especially love his security disclaimer. I don’t have any experience with login impersonation either – it’s funny, but my contract actually prohibits any security work whatsoever. I hate security work.

ChatGPT is helpful for tasks like this, even for a starting point if you don’t end up actually using the code. I asked ChatGPT to do this for me, and it came up with different results, but a good starting point nonetheless.

Your Next Query Challenge:
Getting Data Out of a Fancier Trigger

The above process works great if we’re willing to store both the before AND after values, but what if we’re dealing with a really high-throughput system with a ton of changes per second, and we want to avoid storing the “after” values each time? The table & triggers would look like this:

Now, when we want to get data out to see the before & after values, it gets a little trickier. Say we run 3 update statements, and then check the value of the change table:

The resulting query only shows the before data, but not the change that was actually affected during that update statement:

For example, notice that the last update above set the contents to Update Part 3, but that value doesn’t show in the screenshot. That’s your challenge: I want you to write a SELECT query that reproduces the full before & after data for each change, like the earlier query from the blog post:

Share your answer in the comments, and feel free to put your queries in a Github Gist, and include that link in your comments. I’ll check back in next week with answers & thoughts. Have fun!


SQL ConstantCare® Population Report: Spring 2024

The short story: SQL Server 2019 appears poised to swallow the SQL Server market altogether, hahaha.

The long story: ever wonder how fast people are adopting new versions of SQL Server, or what’s “normal” out there for SQL Server adoption rates? Let’s find out in the winter 2023 version of our SQL ConstantCare® population report.

Out of the thousands of monitored SQL Servers, SQL Server 2019 is now at 49% of the market! That’s the highest percentage we’ve seen for any version in the 3 years that we’ve been doing this analysis. Here’s how adoption is trending over time, with the most recent data at the right:

SQL Server 2019 still continues to grow while everything else shrinks, with the exception of 2022 treading water:

  • SQL Server 2022: 7%, up from 5% last quarter
  • SQL Server 2019: 49%, holding pretty steady
  • SQL Server 2017: 15%, holding steady
  • SQL Server 2016: 18%, down from 22%
  • SQL Server 2014: 6%, steady – and goes out of support in just 2 months!
  • SQL Server 2012 & prior: 4%
  • Azure SQL DB and Managed Instances: 1%
  • New metric this month: 4% of the SQL Servers here are Amazon RDS. They’re included in the SQL Server 2019/2017/2016 metrics above, since they’re technically part of those crowds.

How big are typical SQL Servers?

Let’s group the market together roughly into quarters:

For a long time in the SQL Server space, people have used the term VLDB to denote a very large database, and we’ve usually marked 1TB as the town borders of VLDBville. Today, given that about 1/4 of all SQL Servers host 1TB or more, and given how fast modern storage is able to back up 1TB databases, that 1TB threshold is less meaningful.

That large number of small servers means the CPU distribution is also fairly small:

As is the memory target distribution:

Although if you slice & dice by data size, then that changes the CPU & memory numbers differently. The number of CPU cores people typically use for >3TB SQL Servers is very different than the core count for, say, 50-250GB servers. And now if you’ll excuse me, I gotta do a whole bunch of slicing and dicing because I share that data with clients when we’re analyzing their SQL Servers during a SQL Critical Care®.


[Video] Office Hours: Ask Me Anything About Azure and Microsoft Databases

Videos
3 Comments

Back at home in the office, time to settle in with a nice caffeine-free Diet Coke and go through your top-voted questions from https://pollgab.com/room/brento. Why caffeine-free? Because I slug multiple coffees first thing in the morning when I wake up (usually around 3am-4am), and by the time I stream with y’all, I don’t need any more go juice.

Here’s what we covered:

  • 00:00 Start
  • 02:30 MyTeaGotCold: Has your opinion of Lock Pages in Memory changed over the past 10 years?
  • 03:48 MustangKirby: How can I check what data or pages are in cache? I woke up last night wondering if data I’m writing takes up cache memory.
  • 05:50 DBADoug: Why is SELECT INTO faster than INSERT INTO (of the same exact schema) with no indexes? When testing I notice about 9x more time on the “Table Insert” block in the plan) when the table is pre-defined.
  • 06:30 John R: Does SQL Server 2014+ have stored proc execution plans by user/spid? We saw 1 stored proc get slow for just 1 user (timing out after 30 sec, but 300ms for other users). Once stats were updated, stored proc was fast for that user again. I
  • 07:31 DemandingBrentsAttention: Does order of “include” index columns matter, like it does for key columns?
  • 08:40 Newtopg: Hi Brent , are there any tools You recommend for MSSQL to PostgreSQL migration ? Any open source tool that does both schema and data migration ? We are testing Pgloader and wanted your opinion on if there is any popular tool you recommend free or paid . Thank you
  • 09:36 GrumpyOldMan: I have a query that is causing a lot of blocking. SQL is recommending an index, which I agree should be used. Problem is, index already exists, exact index that SQL is recommending, but not being used. What’s up with this?
  • 10:35 bamdba: PowerShell for SQL Server? Love T-SQL & its effectiveness, but surprised by PowerShell’s popularity. Am I missing automation & scripting benefits?
  • 11:30 Sergio B: Can the tools sp_blitz and especially sp_BlitzWho capture the statements executed with sp_executesql?
  • 11:45 Chris Blain: I recently rebuilt indexes with compression. If I take a full backup, and logs, and restore to a brand new instance, will the restored database’s indexes have compression applied, or will I have to manually run the “alter index ” again on each one to apply data compression?
  • 12:35 Chris Blain: We have a database that is a single partition, some of the tables specify TEXTIMAGE_ON [PRIMARY]. this is stopping applying data_compression to the indexes. is it needed? if it was removed, could it cause issues for the underlying table, could I then apply index compression?
  • 13:13 SickieServer: Have you ever encountered databases (that maybe have been in service for decades) where the queries, while potentially fixable, are so numerous, and so bad, that the only practical solution is for the organisation to license more cores and buy more RAM?
  • 16:30 Gustav: What’s your opinion of the new regex support in Azure SQL DB? Will we see this flow down to canned SQL Server?
  • 18:33 I’m a potato ?: What’s different from Hong Kongs Databases to the rest of the world?
  • 23:06 Miles: Hi Brent,while tuning queries, which ones to be tuned first? is it high logical reads queries or high cpu queries or high duration queries? which one to be focused first?

Who’s Hiring in the Microsoft Data Platform Community? May 2024 Edition

Who's Hiring
7 Comments

Is your company hiring for a database position as of May 2024? Do you wanna work with the kinds of people who read this blog? Let’s set up some rapid networking here.

If your company is hiring, leave a comment. The rules:

  • Your comment must include the job title, and either a link to the full job description, or the text of it. It doesn’t have to be a SQL Server DBA job, but it does have to be related to databases. (We get a pretty broad readership here – it can be any database.)
  • An email address to send resumes, or a link to the application process – if I were you, I’d put an email address because you may want to know that applicants are readers here, because they might be more qualified than the applicants you regularly get.
  • Please state the location and include REMOTE and/or VISA when that sort of candidate is welcome. When remote work is not an option, include ONSITE.
  • Please only post if you personally are part of the hiring company—no recruiting firms or job boards. Only one post per company. If it isn’t a household name, please explain what your company does.
  • Commenters: please don’t reply to job posts to complain about something. It’s off topic here.
  • Readers: please only email if you are personally interested in the job.

If your comment isn’t relevant or smells fishy, I’ll delete it. If you have questions about why your comment got deleted, or how to maximize the effectiveness of your comment, contact me.

Each month, I publish a new post in the Who’s Hiring category here so y’all can get the latest opportunities.


Query Exercise: Who’s Changing the Table?

Query Exercises
27 Comments

For this week’s Query Exercise, your challenge is to find out who keeps messing up the rows in the Users table.

Take any size version of the Stack Overflow database, and the Users table looks like this:

People are complaining that from time to time, the contents of the AboutMe column in some – but not all – of the Users rows are getting mangled. We’re not sure if it’s an app bug, an ETL problem, or someone goofing around in T-SQL, and we need you to find out.

At the same time, this is also a busy production database, and we want to minimize impacts to the end users. The site still needs to go fast.

For this exercise, I’m not expecting a specific “right” or “wrong” answer – instead, for this one, you’re probably going to have a good time sharing your answer in the comments, and comparing your answer to that of others. Feel free to put your queries in a Github Gist, and include that link in your comments. I’ll check back in next week with the approach I usually use with clients. Have fun!


Join Me at the PASS Data Community Summit in November!

#SQLPass
4 Comments

On November 4 in Seattle, I’m presenting a new pre-conference workshop!

Tuning T-SQL for SQL Server 2019 and 2022

You’ve been working with SQL Server for a few years, and you’re comfortable writing queries and reading execution plans.

Your company is now using SQL Server 2019 or 2022 in production, and you want your queries to go as fast as possible. You want to know what changes to make to your existing queries in order to speed them up dramatically. You don’t want a “what’s new” session – you want practical information you can use to identify T-SQL that used to work fine in older versions, but now needs attention in newer versions.

In this 1-day session, Brent Ozar will use the same practical before-and-after techniques that he uses in his Query Challenges blog series in order to demonstrate what parts of your skills need to change as you modernize your databases.

You should be comfortable using SSMS to write multi-page queries, functions, and stored procedures. You should be comfortable identifying common query plan operators like index seeks & scans, key lookups, sorts, and parallelism, and comparing plans.

In this session, you’ll:

  • Learn what kinds T-SQL should be rewritten to aim for batch mode
  • Understand the effects of cardinality estimation feedback & parallelism feedback, and how to improve them
  • Discover new monitoring potential in query plan DMVs to troubleshoot bad plans and blocking

Attendees will get one year free access to my Fundamentals of Query Tuning and Mastering Query Tuning classes, both of which will be updated with content from the pre-con.

Register here, and see you in Seattle!


[Video] Database Answers in Hong Kong

Videos
1 Comment

As my time in Hong Kong came to an end, I sat inside on a foggy morning and hit your top-voted questions from https://pollgab.com/room/brento.

Here’s what we covered:

  • 00:00 Start
  • 01:32 MyTeaGotCold: Are there any signs of brain drain from SQL Server to Postgres? It seems that every SQL Server guru agrees that Postgres is better, even if SQL Server pays them more.
  • 02:33 Frozt: Do you have a checklist on decommissioning an SQL Server?
  • 03:06 Mike: Hi Brent, you mentioned that you are going to update Recorded Classes in 2024. Is this still true ? If yes, which ones you want to update, and when to expect that ?
  • 03:38 Mike: We know Read Committed is default isolation level, and locks are taken on statement level. What happens when you issue Begin Transaction, and then run many SQL statements before Commit ? Isolation level changes during the explicit transaction ? If yes, to which one ?
  • 04:49 RoJo: if a nightly CheckDB fails, how do I approach recovery or validating data? Would it be certain pages / tables? Must I simply roll back to last known good CheckDB ?
  • 05:33 Eugene Meidinger: If I wanted to run the Stack Overflow database as a homelab on a laptop, what specs would you recommend?
  • 06:22 Justin M: Are you a Biphasic sleeper? What is your optimal nap time and length?
  • 07:10 DataBarbieAdministrator: Could you create a blog series on well-known SQL Server Best Practices that are now outdated, explaining why and any alternatives? That would be useful and absolutely interesting! Thanks for your great job for the community!
  • 08:06 Fermin: Hi Brent, just want to know what do you think about the career path future for DBAs. Are we moving to a modern Full Stack or Data Engineer? Thanks.
  • 08:48 corelevel: Hey Brent! In your opinion, what is the minimum table size (in pages) to start adding indexes to it?
  • 09:48 ChompingBits: How many of your customers are using Kerberos for authentication vs SPNs on NTLMv2 or lower? The latter is a pain to configure, and often leads to users/devs running scripts from the SQL server when one more servers are connected to in the query.
  • 10:04 Jonathan: What do you recommend for the number of databases/size on a single disk? I’ve always just ballparked around keeping 2TB per drive and spinning up a new drive when that one fills, but lately we’ve had smaller dbs that have a lot of I/O and larger dbs with less users, etc. Thanks!
  • 11:27 Miles: How important archiving data is? We see history tables growing from GBs to TBs.What problems we might run into if we don’t archive the old data. App team hesitant due to potential data requests from other teams. Any advise on balancing data retention needs with perf optimization?
  • 12:47 Miles: When there’s blocking, app team asks what lock caused blocking. Is it okay to collect lock info or is it overwhelming. sp_whoisactive runs very slow with get_locks =1. Are we doing right thing collecting lock info or should be focusing on something else like tuning head blocker?
  • 13:37 SteveE: Hi Brent, Is there ever a use case to have a Non Clustered index created on the clustering key, perhaps when the table is a wide table?
  • 14:34 Gustav: What’s your opinion of copilot for Azure SQL DB?
  • 15:35 Steve E: Hi Brent We have a query which runs daily that inserts a recursive CTE into a table. Our monitoring tool shows the plan changes each time as it has a different plan handle but all the plans have the same plan hash. What would cause the difference in the plan handle?
  • 16:39 ganesh: Hello Sir, Using excel source in ssis , error : there are XXX bytes of physical memory with XXXX bytes free. system reports 98% memory load.tried autobuffer in data flow ,rows per batch at destination, reduce columns size from navchar255-nvarchar 25 -30 .how to avoid error
  • 16:58 Jim Johnston: I’m thru half of the Brent Ozar Season pass bundle, but looking for guidance with excessive context switching in SQL server. If its in the bundle, can I get a link to where can find it to review this?
  • 17:53 Nintenbob: Do you have any experience/thoughts on using SQL Server 2019’s UTF-8 collation support for varchar as an alternative to nvarchar? At the surface level it seems a clear improvement for most uses that need unicode, but not sure if there are hidden downsides.
  • 18:46 Moctezuma DBA: I was recently told index rebuild removes the 14-byte version control pointer. When index rebuild makes sense, should we set FILLFACTOR less than 100% to make room for those values and then mitigate page splits for that 14-byte pointer addition at each row?
  • 19:23 ProdDBA: Storage for big data, what options are there for maintaining large databases over time without archiving? Splitting the large databases onto their own drives? Just buying more storage? Ideas on how to ask others in big data how they manage their database storage?

Troubleshooting Mysterious Blocking Caused By sp_getapplock

I’m kinda weird. I get excited when I’m troubleshooting a SQL Server problem, and I keep hitting walls.

I’ll give you an example. A client came to me because they were struggling with sporadic performance problems in Azure SQL DB, and nothing seemed to make sense:

  • sp_BlitzFirst @SinceStartup = 1 showed very clearly that their top wait, by a long shot, was blocking. Hundreds of hours of it in a week.
  • sp_BlitzIndex showed the “Aggressive Indexes” warning on a single table, but… only tens of minutes of locking, nowhere near the level the database was seeing overall.
  • sp_BlitzCache @SortOrder = ‘duration’ showed a couple queries with the “Long Running, Low CPU” warning, and they did indeed have blocking involved, but … pretty minor stuff. Plus, their plan cache was nearly useless due to a ton of unparameterized queries pouring through constantly, overwhelming Azure SQL DB’s limited plan cache.
  • sp_Blitz wasn’t reporting any deadlocks, either. (sp_BlitzLock doesn’t work in Azure SQL DB at the moment because Microsoft’s no longer running the default system health XE session up there. They turned that off in order to save money on hosting costs, and passed the savings on to… wait… hmm)
  • As a last-ditch hail-Mary, I ran sp_BlitzWho repeatedly, trying to catch the blocking happening in action. No dice – the odds that I’d catch it live weren’t great anyway.

During a bio break, I unloaded the dishwasher (it’s what I do) and made a mental list of things that would cause blocking, but not on indexes, and not show up in the plan cache. And that’s when it hit me, just as I was unloading the wine glasses. (Don’t judge me: I drink a lot, and I don’t have time for hand-washing.)

sp_getapplock lets developers use SQL Server’s locking mechanisms for their own purposes, unrelated to tables. Let’s say you have an ETL process, and you want to make sure that it can only be run from one session at a time. You’d start the process by running:

That’s kinda like saying BEGIN TRAN, but for entire processes. If someone else tries to run the same query and grab a lock from their session, they’ll get blocked. Here’s what it looks like from sp_BlitzWho:

Session 63 grabbed the lock first, and session 70 is trying to get it, but they’re waiting on LCK_M_X because 63 has an eXclusive lock on it. Note that 63’s query_text shows sp_getapplock – that’s where the troubleshooting Eureka moment hits.

When session 63 is done, it can release the lock like this:

If you work like I do, troubleshooting things after they’ve happened, without the ability to set things up in advance of the problem, this is extraordinarily hard to track down.

Once you know about the problem, you can monitor blocked processes in Azure SQL DB (shout out to Etienne Lopes for a well-written, start-to-finish tutorial, and you should subscribe to his blog.) Etienne uses a file target there, but I used a ring buffer target because I didn’t have access to the client’s Azure file systems. The ring buffer target is way easier to set up:

The problem is that Azure SQL DB’s blocked process threshold is 20 seconds, and can’t be changed. <sigh> For high-volume, short-duration blocking, you’ll be better off rapidly running sp_BlitzWho and logging it to table:

Once you’ve identified that there’s a blocking chain led by sp_getapplock, the database side of the tuning work is done. There’s no database-side “tuning” on blocking like this: it’s up to the developers to use conventional transaction-based tuning processes, but in whatever app code is calling sp_getapplock. The app is essentially running its own transaction, so the developers need to:

  • Minimize the setup work done inside the transaction: do all set-up work like fetching config data ahead of time, before the transaction starts
  • Minimize post-processing work inside the transaction: do logging later, after releasing the lock
  • Minimize row-by-row processing: do work in sets, touching each table as few times as possible, not row-by-row inserts/updates/deletes
  • Minimize round trips between the app server and database server: ideally, prepare everything as a package and send it off to a stored procedure to do all the committing quickly, eliminating all round trips (and then rip out sp_getapplock while you’re at it)

With those tips, sp_getapplock can be completely fine. It’s not like sp_getapplock is inherently bad, any more than BEGIN TRAN is bad. It’s just that detecting its long transactions is way harder.


It’s almost time for Fundamentals Week! Have you signed up yet?

Conferences and Classes
0

The Monday after next, I’ll be getting together with dozens of smart people who are a lot like you. You’ve attended some of my free online streams, read my blog posts, and you use the First Responder Kit. You enjoy my laid-back, humorous approach to sharing what I’ve learned over the years.

You love live classes. You’ve tried watching recordings or stepping through demos yourself, but…you just can’t block out the time, and you can’t stay motivated while you’re watching a recording. You like the fast-paced energy of seeing me live, asking me questions, and seeing me respond to your chats.

You hate free conferences that feel like Zoom meetings. You’ve tried attending a few online events, but they felt just like sitting through one meeting after another. They didn’t dive deeply into any one subject – they just hopped around from one topic to another, and many of those topics just weren’t relevant to you. They were good for free stuff, but…

You’re ready for Fundamentals Week! You’ll spend the week with me learning about indexes, query tuning, columnstore, how I use the First Responder Kit, and more.

The conference dates are May 6-10, 2024, and classes are 10AM-6PM Eastern, 7AM-3PM Pacific. Here’s what we’ll cover:

Register now for $1,595. To keep prices low, this price does not include the class recordings. If you want those, during checkout, you’ll see an offer to add lifetime access to my Recorded Class Season Pass Fundamentals for just $395. See you in class!


[Video] Office Hours: Hong Kong Harbor

Videos
2 Comments

I recently went to Shanghai & Hong Kong, and stopped to take your questions from https://pollgab.com/room/brento while sitting next to the Hong Kong harbor.

Here’s what we covered:

  • 00:00 Start
  • 01:18 Jason G – RN & Accidental DBA: Would you elaborate on DB Owner implications? sp_Blitz help recommends using the SA account, but the articles referenced by Andreas Wolter advocate for using low privileged accounts which are DB specific. Which would you recommend and why?
  • 02:44 MyTeaGotCold: Can you name any good relational databases that aren’t built around SQL? It’s strange that a system so old is still the best.
  • 03:58 John: Hello Brent. Is SQL Server 2022 ready for production/prime time without any more huge bugs/issues? Asking because the last blog post on that particular topic was in 2023. Thanks.
  • 04:57 Chicago Joe: Is there a trend to move database access to API-only? I'm asking because we're moving to the next version of our ERP, and our CIO has told database developers that the only access to the new database will be through a Web API. The database is still on-prem in the next version, too.
  • 06:23 J. Fisher: Hey Brent, Are you able to comment/explain SQL Server “Native” Geography/Geometry datatypes, other CLR stuff, and how they can use and exhaust “App Domain” memory… leading to “Unloading due to memory pressure”… Can’t “afford” to keep adding memory.
  • 08:19 Steve E: Hi Brent, Is there a way to assess overall reads per table across a workload in an attempt to see which tables we might want to focus our index tuning efforts to? Eg if the Posts table has 90% of the overall workload reads, we would probably want to start our index tuning there.
  • 09:29 neil: Dev thinks “Azure” will solve all their problems. (They don’t understand we’re already SQL Server on Azure VMs.) They’re committing all the same mistakes that created disasters on-prem. What surprises are they in for?
  • 11:12 Dream catcher: What time do you like to go-to bed and wake up? Do you nap after lunch?
  • 12:17 ChompingBits: What do you think is nominally the difference between ADF and SSMS? ADF has query plans and access to many of the admin tools and reports in SSMS. How long do you think Microsoft will continue to offer both tools?
  • 12:45 gringomalbec: Hi Brent, we realize you recommend not using Linked Servers to connect to other MS SQL Servers, but my friend asks if you find it OK to use Linked Servers to download data from sources other than MS SQL Server that can’t be connected to directly in SSMS using the Database Engine?

How to Have a Conversation About Software Licensing

Licensing
4 Comments

As a consultant, I sometimes see SQL Servers with high CPU core counts relative to their workload. For example, the situation that inspired this post was seeing half a dozen SQL Servers with ~50 cores each, hundreds of cores total – yet they consistently had just 1-5% CPU usage.

Here’s what I said to management:

Right now, you’re running SQL Server on 280 cores. SQL Server Enterprise Edition licensing is about $7,000 per core, so you’ve installed $1,960,000 worth of licensing. If an employee or former employee goes to the BSA and reports you, they may get a pretty sizable cash reward. I’m not the licensing police, and I’m not going to ask if you actually paid for that licensing. I’m going to phrase my question very carefully: would you like help reducing your licensing footprint?

Every now and then, the managers on the call say the company actually is licensed for that many cores, and it usually involves a funny story about someone getting a really good deal. For example, one of my clients got an amazing licensing agreement with Microsoft where they paid for very few cores initially, and they only had to “true up” their licensing every 5 years. Immediately after signing the agreement, they installed dozens of large Enterprise Edition servers. They knew they wouldn’t have to pay for those servers for another 5 years, and by then, they expected their company to be acquired by a much larger one. Good times.

But most of the time, the managers get really wide eyes, and they start stammering and making excuses. On this particular call, the manager immediately started telling the employees, “Don’t get any ideas about reporting us,” and laughed nervously.

I’m completely chill about it – after all, it’s not like their company owes ME for the licensing, and I’m not about to be the one who calls the BSA. Besides, the terms & conditions for BSA rewards are hilariously bad, making it pretty damn unlikely that anybody ever gets a reward. I do like mentioning that, though, because it gives managers an incentive to fix the problem quickly.

The reason I care, as a consultant, is that often:

  • Managers didn’t realize how much money is at stake.
  • Once they know, their performance problems become a much bigger issue (because now they’re concerned about cutting horsepower).
  • Once they know, the project often changes scope – because now we’re probably going to build new SQL Servers with dramatically lower core counts, and migrate over to those.

When we build those new SQL Servers, we can make other corrective changes that will help their particular bottlenecks. For example, if the workload is memory-starved, we can throw much more memory in the new boxes – because management now understands that the cost of a terabyte of memory is nothing compared to a $2M licensing bill.

This usually starts a whole new discussion with management, and the next step usually involves discussing my 7 rules for SQL Server licensing.

So far, there have only been a couple of times in my career when the client’s management decides that they’re going to keep pirating SQL Server. As a consultant, I don’t wanna go anywhere near companies like that – because when the licensing police inevitably come calling, the company will say, “Well, Brent Ozar worked on these systems, and he told us that it was completely fine.” That’s why I’m out as soon as I find out that the company is purposely pirating the software, and on my way out the door, I leave a very clear paper trail behind me.


Announcing Offline Access to Your BrentOzar.com Courses

Company News
2 Comments

So you’ve got an iPhone or iPad or Vision Pro, and you bought one of my training classes or the Fundamentals or Mastering passes, and you wanna watch the classes while you’re offline. Perhaps you wanna learn on the train ride in to work, or perhaps on the weekends, you like to completely disconnect from the world and retreat into a bunker lined with a Faraday cage. I won’t judge.

Go into the app store and get the Teachable online courses app:

Teachable app

Log in, and you’ll see your list of courses:

Your courses

Click on a course, click on a lesson, and at the top right of each lesson, there’s a download button:

Download button

Click the download button, and … well, you’re an IT professional, and you can figure out what happens next. When you want to clear out your downloads, go into the app’s settings.

I’ve asked Teachable to enable a download button at the course level so you can download entire courses rather than individual modules. I can’t promise a delivery date on that, obviously, but they’re aware of the request.

Enjoy!


[Video] Office Hours: Career & Data Q&A

Videos
2 Comments

A few career and security questions found their way into the queue at https://pollgab.com/room/brento for this episode:

Here’s what we covered:

  • 00:00 Start
  • 02:10 Ethan: What percent of your constant care shops require encryption to connect with SQL Server? What are your thoughts on mandatory encryption?
  • 03:24 Miles: Hi Brent, we have an app that spawns multiple SPIDs and multiple transactions. While troubleshooting blocking, I initially thought to trace one SPID and all SQL statements for that SPID, but it turned out multiple SPIDs are involved. How do you troubleshoot issues when multiple SPIDs and databases are involved?
  • 05:13 Miles: Hi Brent, do MSDTC transactions get translated to the serializable isolation level? When I check the additional_info column output in sp_WhoIsActive, it shows ReadCommitted. Also, even when I’m using RCSI, it shows the isolation level as ReadCommitted. Am I missing anything?
  • 05:58 Ignacio: Do you have a recommended formula for required number of DBAs to number of SQL Server VMs managed?
  • 06:45 Karthik: What’s your opinion of ChatGPT’s ability to analyze a pasted query execution plan?
  • 07:53 DBA_JR: Hello Brent, why does SQL Server allocate space in tempdb for running a query? And when does this allocation occur? Upon enabling Read Committed Snapshot Isolation (RCSI), why does the tempdb size continue to increase? My database contains 3TB of data.
  • 08:54 Ben Grimm: What are your thoughts on making the domain user for the SQL account a local admin on the VM?
  • 09:09 Pedro Gonzalez: Hi Brent, I’m one of the junior DBA admins in a mid size company, and I want to be like you. Where do I start? I like to read documentation, so they can hire me for that.
  • 10:58 MyTeaGotCold: I find temp tables so valuable to performance tuning that my go-to solution for tuning functions is to convert them to stored procedures. Is this a common problem?
  • 12:17 Steve E: Hi Brent, You often say 5 Non clustered indexes per table is a good rule of thumb, does this advice change when working with very wide data warehouse tables which have more than five columns users often search on?
  • 13:10 Ignacio: Does sp_Blitz check for instant file initialization enablement?
  • 14:10 Sean C: Hi Brent, do you have any experience and/or resources you could point my friend to regarding unit testing for SQLServer procs, etc? My friend has some idea of what unit tests are, but is having trouble visualizing how this is accomplished in TSQL; he’s pretty dumb.
  • 15:16 Atherol: Why does the four day work week work well for the U.K. but not the U.S.A.?
  • 16:01 Ethan: Will strict TLS 1.3 encryption in SSMS20 cause users to think twice about upgrading from SSMS19?
  • 16:30 Karthik: How do you keep your mind from racing at bedtime?
  • 17:55 Ignacio: How does Aurora PostgreSQL failover speed compare with Azure SQL DB / M.I. / V.M. failover speeds?

Your Teammates Need the Fundamentals.

Conferences and Classes
0

You’ve been reading my site for a while, watching videos, keeping up with the best SQL Server techniques. However, you’ve got a lot of other folks on your team who don’t have as much free time as you, or they’re just getting started on their SQL Server performance tuning journey.

Forward this post to them, because they need to know about Fundamentals Week, May 6-10.


Hey you! You love live classes. You’ve tried watching recordings or stepping through demos yourself, but…you just can’t block out the time, and you can’t stay motivated while you’re watching a recording. You like the fast-paced energy of seeing me live, asking me questions, and seeing me respond to your chats.

You hate free conferences that feel like Zoom meetings. You’ve tried attending a few online events, but they felt just like sitting through one meeting after another. They didn’t dive deeply into any one subject – they just hopped around from one topic to another, and many of those topics just weren’t relevant to you. They were good for free stuff, but…

You’re ready for Fundamentals Week! You’ll spend the week with me learning about indexes, query tuning, columnstore, how I use the First Responder Kit, and more.

The conference dates are May 6-10, 2024, and classes are 10AM-6PM Eastern, 7AM-3PM Pacific. Here’s what we’ll cover:

Register now for $1,595. To keep prices low, this price does not include the class recordings. If you want those, during checkout, you’ll see an offer to add lifetime access to my Recorded Class Season Pass Fundamentals for just $395. See you in class!


Using ChatGPT for T-SQL Code Reviews

T-SQL
23 Comments

For this month’s T-SQL Tuesday, Pinal Dave asked us if AI has helped us with our SQL Server jobs. For me, there’s been one instant, clear win: code reviews.

I usually keep a browser tab open with ChatGPT 4, and I paste this in as a starting point:

You are an experienced, diligent database developer who specializes in Microsoft SQL Server. Review the below stored procedure and decide whether it has any security or performance risks.

Then I paste in the code right below that. Here’s an example output using bad dynamic T-SQL. ChatGPT 4 cut straight to the point, identified the risk of SQL injection, and even rewrote the query for me in a less-vulnerable way, plus pointed out the performance risks of using LIKE for string searches.

In milliseconds.

This particular example is a really short piece of code, and even here, the advice isn’t perfect. ChatGPT renamed my @StringToExecute variable, and I’ve experienced tons of similar issues with client code. It’ll introduce subtle changes that are unrelated to the problem it’s trying to solve, but… for a blazing-fast code review, I’ll take it.
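If you’ve never seen the kind of rewrite ChatGPT suggests here, it’s the classic before-and-after for dynamic search strings. This is a generic sketch rather than the exact code from my screenshot – @SearchTerm, dbo.Users, and DisplayName are made-up names – but it shows the shape of the fix:

```sql
-- BEFORE (vulnerable): user input concatenated straight into the string,
-- so a crafted @SearchTerm can inject its own SQL.
DECLARE @SearchTerm NVARCHAR(100) = N'brent';  -- imagine this came from the app
DECLARE @StringToExecute NVARCHAR(4000);

SET @StringToExecute = N'SELECT * FROM dbo.Users WHERE DisplayName LIKE N''%'
    + @SearchTerm + N'%'';';
EXEC (@StringToExecute);

-- AFTER (safer): the input travels as a parameter via sp_executesql,
-- so it can only ever be a search value, never extra SQL.
SET @StringToExecute = N'SELECT * FROM dbo.Users '
    + N'WHERE DisplayName LIKE N''%'' + @search + N''%'';';
EXEC sp_executesql @StringToExecute,
     N'@search NVARCHAR(100)',
     @search = @SearchTerm;
```

Note the leading-wildcard LIKE is still there in both versions – parameterization fixes the injection risk, not the scan.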

The more complex your code gets, the more vague ChatGPT’s advice becomes. For example, I took a query from my Mastering Query Tuning class’s deadlock module and asked ChatGPT why this query is getting deadlock errors. ChatGPT’s answer was really long, but mostly devoid of value. There’s one tiny good nugget in there, but I feel really bad for someone tasked with reading ChatGPT’s wall of text and extracting the tiny nugget that will actually solve the problem. I even tried a very directed, suggestive prompt for ChatGPT to lead the horse to water, and it still didn’t drink. It’s not bad advice, per se – it’s just really wordy and misdirected.

Still, though, I love using this as a very, very, very fast sanity check on code. I also like doing it while sharing screens with client staff, because:

  • I can get code recommendations faster than they can explain what the code is doing
  • The client’s tech staff learns a new technique to get peer review on their own code
  • The client’s management staff sees firsthand that 2024-level AI is a helpful tool, but not an end-all solution, because a lot of the advice requires human interpretation (and a lot of it is garbage)

Is it a perfect tool? No, but it’s very productive. Like Mitch Ratcliffe said, “A computer lets you make more mistakes faster than any other invention with the possible exceptions of handguns and tequila.”


[Video] Office Hours: Just Three Questions

Videos
3 Comments

Sometimes y’all post questions at https://pollgab.com/room/brento that require extended answers. Today, I’m hitting just 3 questions:

Here’s what we covered:

  • 00:00 Start
  • 00:44 MyTeaGotCold: If storage is no issue, do I need to worry about the size of Query Store? It’s going to hit 10% of the size of my largest database. I have no interest in changing QUERY_CAPTURE_MODE away from ALL or my retention period.
  • 04:12 DB-Ay?: For a high activity table, is it worth updating stats at a higher percentage just for auto-update to resample at a lower rate within an hour? I’m not able to persist the value and auto-update must remain enabled. I think my only option is to exclude it from the custom stats job.
  • 08:39 David E: I took over for an idiot who set up every database to have 4 transaction log files per database. I have tried everything I know to remove the files, but it errors out every time. I’ve never had to deal with this before in my career. I have tried some things with no success.

Who’s Hiring in the Microsoft Data Platform Community? April 2024 Edition

Who's Hiring
6 Comments

Is your company hiring for a database position as of April 2024? Do you wanna work with the kinds of people who read this blog? Let’s set up some rapid networking here.

If your company is hiring, leave a comment. The rules:

  • Your comment must include the job title, and either a link to the full job description, or the text of it. It doesn’t have to be a SQL Server DBA job, but it does have to be related to databases. (We get a pretty broad readership here – it can be any database.)
  • An email address to send resumes to, or a link to the application process – if I were you, I’d include an email address, because you’ll want to know which applicants are readers here; they might be more qualified than the applicants you regularly get.
  • Please state the location and include REMOTE and/or VISA when that sort of candidate is welcome. When remote work is not an option, include ONSITE.
  • Please only post if you personally are part of the hiring company—no recruiting firms or job boards. Only one post per company. If it isn’t a household name, please explain what your company does.
  • Commenters: please don’t reply to job posts to complain about something. It’s off topic here.
  • Readers: please only email if you are personally interested in the job.

If your comment isn’t relevant or smells fishy, I’ll delete it. If you have questions about why your comment got deleted, or how to maximize the effectiveness of your comment, contact me.

Each month, I publish a new post in the Who’s Hiring category here so y’all can get the latest opportunities.


Announcing Fundamentals Week, May 6-10

Conferences and Classes
3 Comments

You’ve attended some of my free online streams, read my blog posts, and you use the First Responder Kit. You enjoy my laid-back, humorous approach to sharing what I’ve learned over the years.

You love live classes. You’ve tried watching recordings or stepping through demos yourself, but…you just can’t block out the time, and you can’t stay motivated while you’re watching a recording. You like the fast-paced energy of seeing me live, asking me questions, and seeing me respond to your chats.

You hate free conferences that feel like Zoom meetings. You’ve tried attending a few online events, but they felt just like sitting through one meeting after another. They didn’t dive deeply into any one subject – they just hopped around from one topic to another, and many of those topics just weren’t relevant to you. They were good for free stuff, but…

You’re ready for Fundamentals Week! You’ll spend the week with me learning about indexes, query tuning, columnstore, how I use the First Responder Kit, and more.

The conference dates are May 6-10, 2024, and classes are 10AM-6PM Eastern, 7AM-3PM Pacific. Here’s what we’ll cover:

Register now for $1,595. To keep prices low, this price does not include the class recordings. If you want those, during checkout, you’ll see an offer to add lifetime access to my Recorded Class Season Pass Fundamentals for just $395. See you in class!