Jeff's SQL Server WeblogJeff's SQL Server WeblogJeff's very popular Microsoft SQL Server weblog. Advice, ideas, techniques and more for beginners and experts alike, with a sense of humor as well. Articles
How to calculate Median in SQL Server
2010-08-30 06:00:00 Nothing earth-shattering here, I was just helping out a colleague with this so I thought I'd post up the example I gave him. -- sample table: create table People ( Person varchar(1) primary key, City varchar(10), Age int ) go -- with some sample data: insert into People select 'A','Boston',23 union all -- odd # select 'B','Boston',43 union all select 'C','Boston',29 union all select 'D','Chicago',15 union all -- single # select 'E','NY',12 union all -- even # select 'F','NY',55 union all select 'G','NY',57 union all select 'H','NY',61 go -- here's our query, showing median age per city: select city, AVG(age) as MedianAge from ( select City, Person, Age, ROW_NUMBER() over (partition by City order by Age ASC) as AgeRank, COUNT(*) over (partition by City) as CityCount from People ) x where x.AgeRank in (x.CityCount/2+1, (x.CityCount+1)/2) group by x.City go -- clean... More About: Server
An interesting take on "Stored Procedures" (link)
2009-11-16 19:42:00 Another great post SQL-related post over at TheDailyWTF regarding a, umm, "creative" use of "Stored Procedures": http://thedailywtf.com/Articles/For-the-E ase-of-Maintenance.aspx More About: Humor , Interesting , Links , Link
Programming is like dreaming?
2009-11-02 20:52:00 This is from March, so it's a little old, but I just stumbled upon it and found it a bit interesting. I never thought of programming this way before, but it does make sense: Programming is like a dream More About: Links , Dreaming
Processing an OLAP cube with a T-SQL Stored Procedure
2008-10-27 20:26:00 Here's a simple SQL Server stored procedure that you can call to process an OLAP cube using T-SQL. The parameters should be self-explanatory. To me, this is a little easier and more flexible than processing using DTS packages. create procedure ProcessCube @Database varchar(100), @Cube varchar(100), @Partition varchar(100) = null, -- If NULL, process the entire Cube @Server varchar(100) = 'localhost' as /* Author: Jeff Smith Version: 10/27/2008 */ /* variables used to store object handles */ declare @o_svr int, @o_db int, @o_cube int, @o_part int, @o_mds int declare @hr int /* different cube processing options. This SP uses "default" */ declare @PROCESS_DEFAULT int declare @PROCESS_FULL int declare @PROCESS_REFRESH_DATA int set @PROCESS_DEFAULT = 0 set @PROCESS_FULL = 1 set @PROCESS_REFRESH_DATA = 2 -- create a DSO.Server object: exec @hr = sp_OACreate 'DSO.Server', @o_svr out if @hr <> 0 begin print 'Error at...
I'm back ... with some news!
2008-10-16 22:04:00 I apologize for not posting any new content in quite some time, but now I am back and will soon start posting on a (hopefully!) regular basis once again. The reason for my hiatus was a pretty good one, though: I recently became a father with the birth of my son Benjamin on October 2, 2008! He is doing great and already knows that he should always do his formatting at the client and never within the database. He's a natural! As if that wasn't enough, I also found out recently I have been named as a 2009 SQL Server MVP! Now I can finally get a decent seat at a restaurant by pulling the "do you know who I am?" routine. In all seriousness, though, I am very honored and excited by the award and I intend to live up to it in 2009 by continuing to update this blog and help users out in the SQLTeam forums. Thank you to everyone who reads this blog and participates in the discussions, and I promise more updates are coming soon. If you have any specific topics or ideas for a blog po... More About: News , Back
Foreign Key Constraints: SET NULL and SET DEFAULT
2008-08-13 15:56:00 Most people know about cascading updates and deletes, but did you know there are two other foreign key constraint options you can use to maintain referential integrity? Read all about them in my latest article over at SQLTeam.com. These features, introduced with SQL Server 2005, haven't got a lot of publicity, but they can be very useful. I just used the SET NULL option recently for the first time (inspiring me to put together an article on it) and it works great. More About: Foreign , Null
Database Column Names != Report Headings
2008-08-06 17:43:00 Always remember that the column names returned in a result set do not have to be the same as what you eventually output at your presentation layer. For example, suppose you have stored procedure that accepts a @CurrentYear parameter and returns a sales variance between the current year and the previous year for each customer. I often see programmers struggling with writing dynamic SQL to produce output like this: CustomerID 2008 Total 2007 Total Variance ---------- ---------- ---------- -------- ABC $100 $50 $50 DEF $200 $250 -$50 That is, the names of the columns vary based on the data; that is not a good way to return data from your database! A much better result set to return is simply this: CustomerID CurrentYear PrevYear Variance ---------- ---------- --------- -------- ABC $100 $50 $50 DEF $200 $250 -$50 Notice that with that set of column... More About: Report , Database , Column , Names
How To Calculate the Number of Week Days Between two Dates
2008-07-31 18:39:00 If the start date and end date are both week days, then the total number of week days in between is simply: (total difference in days) - (total difference in weeks) * 2 or DateDiff(dd, @start, @end) - DateDiff(ww, @start, @end)*2 ... since the DateDiff() function with weeks returns the number of week "boundaries" that are crossed; i.e., the number of weekends. If you have a table of holidays, then you can simply subtract them out as well: DateDiff(dd, @start, @end) - DateDiff(ww, @start, @end)*2 - (select count(*) from holidays where holiday_date between @start and @end) Now, what if the start day or the end day is on a weekend? In that case, you need to define what to do in those situations in your requirements. For example, if the start date is Sunday, Nov 20th, and the end day is Monday, Nov 21st -- how many week days are between those dates? There's no universal correct answer; it could be 0, or 1, or perhaps even "undefined" (null) depending on your needs. More About: Dates , Week , Days , Calculate , Number
Convert input explicitly at your client; don't rely on the database to "fig
2008-07-24 15:12:00 A common mistake beginners make when working with SQL is trying to format their output at the database layer, as opposed to simply doing this at the presentation layer (i.e., client application, reporting tool, web page, etc). I've covered that quite a bit in various blog posts, but I've only touched upon another similar issue which I feel is equally as important and also commonly mishandled. In the SqlTeam forums, I often see code that accepts DateTime input in the form of a string value (say, from a TextBox on a web form) and uploads that value to the database written like this: SqlCommand c = new SqlCommand(); c.CommandText = "insert into SomeTable (DateCol) values ('" + txtDate.Text + "')"; c.ExecuteNonQuery(); Now, I think that hopefully even most beginners will agree that this is bad code. The primary issue, of course, is SQL Injection. Avoiding SQL Injection is very easy to do using Parameters. So, let's say that you rewrite this code using parameters like this: S... More About: Database , Convert , Client , Input
The MailBag --- Super-Sized Edition! String Parsing, Crosstabs, SQL Inject
2008-07-16 22:32:00 OK, boys and girls, it's time for the mailbag! There's lots of stuff to cover, so let's get to it! --- Greg E writes: Hello Jeff, I just found your blog and wanted to know if you could point me in the right direction or possibly toss me a solution. I am looking at a badly formed telelphone number column in a MS SQL Server db. Entries contain '(555) 555-1212' or '555.555.1212, etc. Do you know how I would go about stripping out unwanted characters from the telephone number? Thanks for the brain cycles. Greg -- A simple UDF should do the trick for you. For example, something like this: create function NumbersOnly(@txt varchar(1000)) returns varchar(100) as begin declare @i int declare @ret varchar(100) select @i = 1, @ret = '' while (@i <= len(@txt)) select @ret = @ret + case when substring(@txt,@i,1) like '[0-9]' then substring(@txt,@i,1) else '' end, ... More About: Super , String , Edition
The Golden Rule of Data Manipulation
2008-06-11 17:48:00 Introduction There is a very simple rule when it comes to storing (and returning) data, which I see violated all the time, making life so much more complicated for everyone involved. In case you haven't noticed, that's a common theme I discuss here on this blog -- different ways programmers make life more difficult for themselves, instead of simply following good practices and doing things the easy way. This is yet another example of that situation. The "Golden Rule of Data Manipulation " is a simple, but important rule that you should always follow when designing a database, writing database code, or really writing any application code at all for that matter: "It is always easier and more flexible to combine data elements rather than to break them apart" In other words: Concatenation is easy. Parsing is hard. Often, very hard -- or even impossible depending on the data. Problems with Parsing It is amazing how often I see people struggling with "difficult SQL problems" s... More About: The Golden Rule
The Joy of Blog Feedback
2008-06-06 21:43:00 Introduction I have been writing my little blog here for some time now, and my favorite part of doing this is of course the feedback. It's always great to hear from the readers, to have mistakes corrected, to debate various topics and techniques, and to learn a lot about SQL and the various topics I discuss here. At this point, I have received over 1,700 comments over the years, and while all of them are truly appreciated, I have noticed that unfortunately many of the, uh, less helpful comments do seem to consistently fall neatly into various categories. Let's take a look at an example of a simple, typical blog post and some of the responses that often come back. If you write a blog of your own, or often read the feedback from other blogs, many of these may seem familiar to you. A Typical Blog Post Today, I have a simple tip for beginner SQL programmers. When writing a SELECT, you can add a WHERE clause to filter the results that are returned. For example, to only r... More About: Feedback
The Truth about "Cursor Busting" in SQL
2008-06-05 16:56:00 Let's say you are called in to troubleshoot a stored procedure that is performing poorly. You dive in to investigate and this is what you find: create procedure ProcessProducts as declare @Products cursor, @ProductID int set @Products = cursor for select ProductID from Products order by ProductID open @Products fetch next from @Products into @ProductID while (@@FETCH_STATUS=0) begin fetch next from @Products into @ProductID exec DoSomething @ProductID end deallocate @Products Ah ha! A cursor! It seems we have identified the bottleneck: Clearly, the performance problems are because the code is not doing things in a set-based manner, but rather by processing rows one at a time using a dreaded cursor. This cursor is opening up the Products table, looping through the rows one at a time, and calling the "DoSomething" stored procedure for each ProductID. As we all know, cursors are not the way to go when writing SQL code; ... More About: Truth , Cursor , The Truth
Log Buffer #98
2008-05-23 15:09:00 Hello and welcome to the 98th edition of Log Buffer . My name is Jeff Smith and I will hosting this week's exciting episode. If, for some reason, you are not completely satisfied with this edition, simply write in and complain to Dave over at The Pythian Group and you will receive Log Buffer #99 absolutely free! Now that is a guarantee you can feel good about. OK, let's get to work. I have only limited exposure to both PostgreSQL and MySQL, but I have often wondered why MySQL is so popular while it seems that PostgreSQL has the superior features. Over at Xaprb, they attempt to answer that very question. Be sure to read the comments from that post, and check out the big discussion from that article over at reddit as well. The theory I like the best? MySQL is easier to pronounce! (How do you pronounce "PostgreSQL" anyway?) Speaking of MySQL, Sheeri Cabral points out that MySQL's website certainly doesn't do the product any favors, and there's also a good discuss...
Implementing "Interfaces" in SQL
2008-05-19 15:06:00 My latest article has just been published over at SQLTeam: Implementing Table Interfaces When I wrote a Table Inheritance article a few months back, that was one was pretty standard and straight-forward. As I was writing it, I thought it would be an interesting challenge to figure out a way to implement table interfaces as well, where different tables don't inherit from the same base class, but they still "implement" the same relations. That definitely was not as easy, and the end result isn't as clean and direct, but I hope this at least provides some ideas and at the very least it should provoke interesting comments and alternative approaches.
Need an Answer? Actually, No ... You Need a Question
2008-05-13 17:04:00 Welcome! The reason you were directed here is because you need assistance, and I am here to help. I am not, however, here to provide you with any answers! You see, it looks like the assistance you need is not finding an answer; it is rather that you need assistance finding a question. As you know, there are all kinds of questions. Question s that test memory recall. Questions that test logic skills. Brain-teasers and mathematical questions and so on. But there is one requirement that all good questions must have in common before they can be answered: A proper question MUST provide ALL of the information necessary in order for an answer to be given. In other words, if you omit important information from a question, it doesn't matter how simple or easy that question is: It suddenly becomes very difficult, or even impossible, to answer. For example, consider the following question: "Am I wearing a hat?" Seems pretty easy, right? No logic, no memorization, no trivia,... More About: Answer
GROUP BY ALL
2008-05-05 18:25:00 Here's an obscure piece of SQL you may not be aware of: The "ALL" option when using a GROUP BY. Consider the following table: Create table Sales ( SaleID int identity not null primary key, CustomerID int, ProductID int, SaleDate datetime, Qty int, Amount money ) insert into Sales (CustomerID, ProductID, SaleDate, Qty, Amount) select 1,1,'2008-01-01',12,400 union all select 1,2,'2008-02-25',6,2300 union all select 1,1,'2008-03-02',23,610 union all select 2,4,'2008-01-04',1,75 union all select 2,2,'2008-02-18',52,5200 union all select 3,2,'2008-03-09',99,2300 union all select 3,1,'2008-04-19',3,4890 union all select 3,1,'2008-04-21',74,2840 SaleID CustomerID ProductID SaleDate Qty Amount ----------- ----------- ----------- ----------------------- ----------- --------------------- 9 1 1 2008-01-01 00:00:00.000 12 400.00 10 1 2 2008-02-25 00:00... More About: Group
UNPIVOT: Normalizing data on the fly
2008-04-23 16:33:00 Everyone seems to want to "pivot" or "cross tab" data, but knowing how to do the opposite is equally important. In fact, I would argue that the skill of "unpivoting" data is more useful and more important and more relevant to a SQL programmer, since pivoting results in denormalized data, while unpivoting can transform non-normalized data into a normalized result set. We all know that there's lots of bad databases designs out there, so this can be a handy technique to know. Of course, even a well designed, fully normalized database can still benefit from "unpivoting" from time to time, so let's take a look at some common situations and some of the options we have to handle this at our disposal. We will focus on some traditional SQL techniques to do this, and then take a close look at the UNPIVOT operator that was introduced with SQL Server 2005. Example #1: A Bad database design Let's start with a commonly bad table design, in which someone has decided to relate a client ... More About: Data
Minimize a DropDownList's ViewState
2008-04-17 18:21:00 Let's say you have a very large DropDownList with lots of values and text. We need to maintain ViewState in this DropDownList so that we can retrieve the selected value on a post back. Of course, this means that now the ViewState contains the data for every single value in the list, both values and text included. Even though the page itself may be fairly simple and lightweight, the result of having this simple DropDownList on the page is that the page size is quite large and the amount of data passed back and forth on a postback is very large as well. If you have an efficient database, and/or if are caching the data anyway, you might not mind re-loading the list items each time the page posts back to eliminate the need for the page itself to hold all of this ViewState data. However, if you turn off ViewState on the DropDownList, you will notice that it now does not remember the selected value on post backs. The solution is very simple -- just manually set the DropDownList...
But it's not missing. It's just ... invisible, that's all.
2008-03-19 20:51:00 Microsoft Access has a pretty handy boolean property that you can set for any "text" column in your tables: AllowZeroLength True means that empty strings ('') are allowed in the column, False means that they are not. This is actually quite nice, because by using this along with the Required (i.e., disallow NULLS) property, you can ensure that your column has an actual, non-Null, non-Empty String value without the need for any additional constraints. SQL Server does not have this property, but we can easily achieve the same effect by using a CHECK constraint: create table foo ( column1 varchar(100) not null check (column1 <> '') ) This enforces that column1 is not only never null, but it also cannot be an empty string. For example: insert into foo (column1) values ('') Msg 547, Level 16, State 0, Line 2 The INSERT statement conflicted with the CHECK constraint "CK__foo__column1__6E8B6712". The conflict occurred in database "PlayGround", table "dbo.foo... More About: Missing , Invisible
There's a much easier way!
2008-03-13 19:51:00 Let's say you are struggling on a programming project. Your code is growing exponentially and becoming more convoluted by the day, and it is clearly out of control. You're getting run-time errors, compile-time errors, wrong output, no output, endless loops, your machine is overheating, and perhaps you are starting to feel like you might be a little over your head. Maybe this is the first time you are using a new programming language, or it's the first time you've worked with a particular database, or maybe it's your first complicated project overall in any technology and you're getting hopelessly lost. So, it's come to this: it's time to swallow your pride, and ask for help. Maybe you decide to visit a programming forum online, or maybe you have coworkers or colleagues or an old college professor that you can contact for help. Either way, it is time to get to the bottom of this. You have deadlines that you are getting dangerously close to missing. People are start...
Simple DataViewReader class that implements IDataReader for a DataView
2008-02-28 16:34:00 I previously wrote that the .NET 2.0 DataTableReader class is really handy, but unfortunately there is no DataViewReader class. Thus, the only way to use the IDataReader interface with a sorted/filtered DataView was to first use the ToRows() method of the view to create a brand new DataTable, and then call CreateDataReader() on that new table. This is not the most efficient process when all you want to do is enumerate a DataView. So, here's a solution: a simple, efficient, mostly-complete DataViewReader class. Just create a new instance of it passing in the source DataView and off you go. This will be much more efficient than creating a brand new DataTable from the DataView, since we just use the default enumerator of the DataView and do not need to copy/create any more things in memory. Most methods are implemented but a few aren't, feel free to complete the class if you need the missing functionality. It's pretty simple and straightforward. If you find any bugs or ... More About: Simple , Class
Exporting data to a remote server with SQL Express
2008-02-27 21:04:00 I recently helped a friend out who only had access to SQL Server Management Studio Express , and he needed to copy a database locally from his PC to his remote web hosting company. Normally, the process is a simple backup/restore, but his hosting company does not allow restoring databases. Luckily, however, the company does allow direct access to his hosted database via client tools such as SSMS. Unluckily, SSMS Express does not include any tools that allow you to export data to a remote server. So, here's what we did: First, we scripted out the entire database, including all tables, indexes, constraints, views, functions, procs, etc, using SSMS Express. This took some time to execute, but it was easy to do. Then, we ran that script remotely on the hosted database to create the environment. Easy enough. The final step was potentially the hardest part: Now we had to copy all of the data in all of the tables from the local database to the remote database. I thought about gen... More About: Data , Remote
DataTable, DataView and CreateDataReader
2008-02-22 18:30:00 Here's something I was not aware of in .NET 2.0+ that I recently discovered. There is a new DataTableReader class that implements IDataReader so that you can loop through an in-memory DataTable using the same interface that you would to loop through a SqlDataReader. You can create this object by using the CreateDataReader() method of a DataTable. This is great because now you can create a generic ReadFromDatabase() method for your classes that accepts an IDataReader parameter and you can create those objects from pretty much any data source, including DataTables, which wasn't as easy to do previously. There is one catch: There is no CreateDataReader() method for the DataView class, meaning if you want to use a DataTableReader to loop through a DataTable that is sorted or filtered, you are out of luck. This was pretty disappointing -- in fact, I would think that CreateDataReader(), if only implemented in one place, makes more sense to implement in the DataView class rather tha...
Does SQL Server Short-Circuit?
2008-02-22 17:04:00 I got an email recently regarding one of my early blog posts from the olden days: Steve Kass wrote about your post:"there is no guarantee that WHERE <filter 1> OR <filter 2> will be optimized so that the filters are evaluated in the order typed". I am not certain that optimization changes the priority of the expressions, but I do not think so. We can force evaluation so that it is done in a certain order by enclosing the first expression to evaluate in parentesis, since enclosed in parentesis expressions are evaluated first, like this: ((<filter 1>) OR <filter 2>) AND ((<filter 3>) OR <filter 4>) In this case filter 1 is evaluated before than filter 2 and filter 3 is evaluated before than filter 4. Thanks in advance In that original post, I claimed that using efficient OR logic will not only make your code simpler and cleaner, but al... More About: Server , Circuit , Short
Top N Percent per Group
2008-02-21 19:05:00 Here's a good question in the feedback from my post about using the T-SQL 2005 features to return the Top N per Group of a result set: Sani writes: What about Top n Percent per Group??? I would greatly appreciate an input on that as well. That's a good question, and also easily solvable, by using a combination of rank() and partitioned aggregate functions (also a new SQL Server 2005 feature). Simply calculate the rank() of each row in the group, and also the count(*) of all rows in the group. Multiply the count(*) by the percentage of rows you want returned and filter so that your rank() per group is less than that. Here's an example. Suppose we want to return the newest 10% of all products per region, where the newest product has the latest "AddedDate" column. We can write it like this: with ProductsByRegion as ( select Region, Product, AddedDate, rank() over (partition by Region order by AddedDate desc) as AddedRank, count(*) over (partition by Region) a...
An INT primary key .... yet not an Identity?
2008-02-21 18:08:00 Ah, this is not an anti-identity rant, don't worry! Though, in a round-a-bout sort of way, it is yet another argument against always blindly using them -- but not in the way you might expect. There is a simple rule I'd like to propose, let me know what you think: "If your client code, SQL code, or configuration files reference the primary key column of a table to determine any application logic, that primary key column should not be an identity." Now, I am not saying that primary key column can't be a meaningless integer. I am just saying it should not be an auto-generated meaningless integer. Working with an inherited web application (in "classic" ASP .... yuck), there were lots of tables like this: groupID, groupName 1, Administrator 2, User 3, Finance User 5, Account Manager 8, ReadOnly and there was lots of code written like this: if groupID = 1 then show delete button if groupID =3 or groupID=5 and groupID != 8 then show the create new project button ... repeated... More About: Primary , Identity
Implementing Table Inheritance with SQL Server
2008-02-21 17:05:00 I have a new article up at SQLTeam: Implementing Table Inheritance with Sql Server It discusses the situation where you have multiple entities that are distinct, yet they have many attributes or relations in common. There is an easy way to simplify your database design and your code if you use the concept of a "base table" for that common data, which is very similar to the concept of Inheritance in Object-Oriented programming. I recently had to do this for a client that tracks Contributions to their foundation; there is a base set of data that all Contributions have, such as who donated, how they paid, the date a tax letter was sent, etc. But for certain contribution types we need additional information, such as if they bought an Auction Item, sponsored an Event, and so on. Thus, we set up a "base table" of Contributions and sub-tables for the different types, and set up a constraint to ensure that the right subtable is used depending on the type. It sounds complicated and l...
On RIGHT OUTER JOINS ...
More articles from this author:2008-02-13 17:22:00 Because I feel pretty strongly about this and the entire focus of my blog is writing clear, clean and efficient SQL, I thought I'd repeat my response from a SQLTeam forum question here. smithje asks this, regarding OUTER JOINS: Left/Right, does it matter. Is one better than the other? A few years ago consultant on a project for our company advised me to always write my queries to use Left joins. He had worked on the project to convert the original database application to MS SQL when Microsoft took it over. He claimed the design of the query engine handled Left joins more effeciently than right. I converted several queries that processed large datasets to Left join only and got quicker results. I have used Left exclusively since then. Has this concept ever been tested or written about? My response: Hi -- they are technically the same, but it is always clearer to use LEFT OUTER JOINS. I strongly recommend to never use RIGHT OUTER JOINS. When you write a SQL statement, you should... 1, 2, 3, 4 |



