Grouping From & To continuous dates in SQL Server

Ashokraja V

StartDate EndData EmpCode 1/1/2014 1/31/2014 12345 2/1/2014 2/28/2014 12345 3/3/2014 3/25/2014 12345 4/1/2014 4/30/2014 12345 5/1/2014 5/31/2014 12345 7/1/2014 7/31/2014 12345 I need the below output from above table structure using SQL Query. Output: StartDate EndData EmpCode 1/1/2014 2/28/2014 12345 3/3/2014 3/25/2014 12345 4/1/2014 5/31/2014 12345 7/1/2014 7/31/2014 12345

Mycroft Holmes

And you want us to decipher the logic of what the output is, oh goody a challenge to start the year off with. It looks like row 1 & 3 are 2 months and the others are 1 month. Why! Help us to help you, tell us what you are trying to achieve and we might be able to help.

Never underestimate the power of human stupidity RAH

Ashokraja V

Need to group all the continuous periods. That's why output with 4 rows. StartDate EndData EmpCode 1/1/2014 1/31/2014 12345 2/1/2014 2/28/2014 12345 need to merger as 1/1/2014 to 2/28/2014. If no continuous dates, display as it is.

Mycroft Holmes

Ooh that's nasty, I presume you mean consecutive MONTHS not dates! I don't see a simple way to do that in a straight query so I would probably throw it into a cursor and process each record setting a GroupingNo and then query based on the GroupingNo. Someone with CTE skills should be able to give you a more elegant solution.

Never underestimate the power of human stupidity RAH

Richard Deeming

There might be a better way to do this, but the obvious solution would be:

WITH cteStartingPoints As
(
-- Find the rows with no row ending on the previous day:
SELECT
A.EmpCode,
A.StartDate,
ROW_NUMBER() OVER (ORDER BY A.StartDate) As RN
FROM
YourTable As A
WHERE
Not Exists
(
SELECT 1
FROM YourTable As B
WHERE B.EmpCode = A.EmpCode
And B.EndDate = DateAdd(day, -1, A.StartDate)
)
),
cteEndingPoints As
(
-- Find the rows with no row starting on the next day:
SELECT
A.EmpCode,
A.EndDate,
ROW_NUMBER() OVER (ORDER BY A.StartDate) As RN
FROM
YourTable As A
WHERE
Not Exists
(
SELECT 1
FROM YourTable As B
WHERE B.EmpCode = A.EmpCode
And B.StartDate = DateAdd(day, 1, A.EndDate)
)
)
SELECT
S.EmpCode,
S.StartDate,
E.EndDate
FROM
cteStartingPoints As S
INNER JOIN cteEndingPoints As E
ON E.EmpCode = S.EmpCode
And E.RN = S.RN
;

http://sqlfiddle.com/#!3/c1331/2[^]

"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer

Mycroft Holmes

Richard Deeming wrote:

but the obvious solution would be

Only if you like CTEs, deserves 5

Never underestimate the power of human stupidity RAH

Umer Akram

Try this

Declare @tableA table ( startdate datetime, Enddate datetime, ID int)

insert into @tableA
select '2014-01-01', '2014-01-31', 12345 ID union all
select '2014-02-01', '2014-02-28', 12345 union all
select '2014-03-03', '2014-03-25', 12345 union all
select '2014-04-01', '2014-04-30', 12345 union all
select '2014-05-01', '2014-05-31', 12345 union all
select '2014-07-01', '2014-07-31', 12345

select ID, min(cdate) as startdate, max(cdate) as enddate
from (
select startdate as cdate, MONTH(startdate) as mon, ID
from @tableA
union all
select enddate as cdate, Month(enddate) as mon, ID
from @tableA
) Z

Group by ID, mon

hope it helps.

Ashokraja V

Thanks, Code works fine. But i have 50 million records in my table. So process is getting slow. Is there any other ways to improve performance? Thanks in advance...

Mycroft Holmes

Run the execution plan and see if it recommends any indexes, look for high cost bottle necks. You also need to make sure you want to work with all the 50m records, can you filter out some that are not relevant to your query.

Never underestimate the power of human stupidity RAH

Jorgen Andersson

Yes, it should be slow if you read all 50 million records. There are four full table scans for two NOT EXISTS and two sorts. And on top of that you have two conditions with Functions in them. You could of course use a computed column index to solve that. But that would depend on which version of SQL Server you are using. So, what version of SQL Server are you using? Do you need all data in one go or is it enough to filter out one user at a time?

Wrong is evil and must be defeated. - Jeff Ello

Ashokraja V

Thanks for the information. Im using SQL Server 2008 R2. And also i need to work with limited set of records from the 50 million records. So im planning to take the required records to seperate temp table and planing to execute the query.