Tutorial :How to find rows in SQL that start with the same string (similar rows)?



Question:

I have a table with primary keys that look like this:

FIRSTKEY~ABC

SECONDKEY~DEF

FIRSTKEY~DEF

I want to write a SELECT statement that strips off the segment following the tilde and returns all rows that are duplicates after the post-tilde segment is gone. That is,

SELECT ...

Gives me:

FIRSTKEY~ABC

FIRSTKEY~DEF

As "duplicates".

I already have the bit to strip off the end segment using SUBSTRING:

SELECT SUBSTRING(COLUMN, 0, CHARINDEX('~', COLUMN)) FROM TABLE

This is on SQL Server.


Solution:1

The first solution given will identify the key prefixes; extend it just a bit to get the table rows beginning with those keys:

SELECT *   FROM TABLE  WHERE SUBSTRING(COLUMN, 0, CHARINDEX('~', COLUMN)) IN  (      SELECT SUBSTRING(COLUMN, 0, CHARINDEX('~', COLUMN)) FROM TABLE       GROUP BY SUBSTRING(COLUMN, 0, CHARINDEX('~', COLUMN))      HAVING COUNT(*) > 1  )  

Alternately, you could use a join between a temp table containing the prefixes and the original table - if the number of prefixes becomes very large, using a "where in" can become very expensive.


Solution:2

Give this a shot

SELECT SUBSTRING(COLUMN, 0, CHARINDEX('~', COLUMN)), COUNT(*) FROM TABLE   GROUP BY SUBSTRING(COLUMN, 0, CHARINDEX('~', COLUMN))  HAVING COUNT(*) > 1  

Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Previous
Next Post »