1

How to do an Impossible Join with String Split

 2 years ago
source link: https://www.codeproject.com/Articles/5326621/How-to-do-an-Impossible-Join-with-String-Split
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

In this article, I’ll show you how to use STRING_SPLIT() to join two tables. I call this the impossible join.

I’ve come across this situation several times and always when dealing with data imported from two different sources. In most cases, it has to do with users inputting data inconsistently.

Let me show you.

Table of Contents

Watch the Video

Don’t forget to check out the video. I go over the examples and provide more background for you to better understand it all!

The Impossible Join

In this example, I have two tables, Project and ProjectStatus. I would like to join ProjectStatus to Project to display the Name, ProjectStatus, and AccountNumber in the same result.

If you look at ProjectStatus, ProjectName is a coder’s dream. The codes look great. The Project table, on the other hand, is a hot mess. The codes are buried within the column value.

There is no way to do a direct join from one to the other. In fact, only Project 9108 successfully joins.

The Impossible Join Approach

The way we make the impossible join work is by splitting the ProjectName into pieces. This way, we don’t have to join on “F1432 plastic” but can look to join on the parts “F1432” or “plastic.”

As you can see from the example, “F1432” is part of the good ProjectName and will readily join between the two tables.

The main steps are:

  1. Split the hot mess column into pieces.
  2. Join into this separate piece to make a match.
  3. If a match is made, then join the two tables together.

So now that we have an idea how this is happening, let’s make it happen. To do so, we’ll first learn about STRING_SPLIT().

How STRING_SPLIT() Works

STRING_SPLIT() is used to break up a column into separate parts. It is a special type of function which returns a table as its result. It is similar to a Table Valued Function.

You can learn more about STRING_SPLIT() in this article I wrote, but the general idea is it takes the column you want to split, and the character, such as space that separates the values (aka words).

So STRING_SPLIT(‘F1432 plastic’, ‘ ‘) returns a table with two rows:

  • F1432
  • Plastic

In our example, let's look at what happens when you use string split in a query against Project:

Here is the tokenized table with account numbers:

Copy Code
select value ProjectName, AccountNumber
from Project
    cross apply string_split(ProjectName, ' ')

There are a couple of things to notice here. First, since STRING_SPLIT() returns a table, we can use it in our query. You’ll see I’m using CROSS APPLY to join the STRING_SPLIT() result to the Project. I do so I can return a result showing each ProjectName value alongside the AccountNumber.

We now use this result in a final join to the ProjectStatus table to obtain our final result.

Learn More: Check out SQL Joins – The Ultimate Guide

Using STRING_SPLIT() with INNER JOIN

At this point, that hard part is over. We have an intermediate result which includes the project “codes” as well as account numbers.

That last thing to do is relate this to the Project table.

I pull this all together in the following example:

Copy Code
Select s.ProjectName, s.ProjectStatus, p.AccountNumber
from ProjectStatus s
    left join (
        select value ProjectName, AccountNumber
        from Project cross apply string_split(ProjectName, ' ')
    ) p on s.ProjectName = p.ProjectName

Conclusion

IF you find yourself needed to join to a “code” embedded within another string, then you may find that STRING_SPLIT() allows you to get to the “code” to make the join successful.

The post How to do an Impossible Join with String Split appeared first on Essential SQL.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK