How to calculate the number of times a route was travelled using SQL?
I need to determine how many times a certain route was travelled per vehicle on given a certain date range, but the GPS management software built on top of the database doesn't have this functionality.
The database contains several tables that store GPS, route, and location data. A route is made up of several locations and a sequence number. A location is a set of upper and lower bound latitudinal/longitudinal values attached to a name. A vehicle uploads its GPS location data several times a minute to the server, which updates the GPS table.
eg. A certain route "FOO" may consist of locations "Warehouse", "School", "Stadium" and "Park" in order. "BAR" is essentially the same route, but in reverse (locations may be found on several different routes).
A simple select returns this type of information from the GPS data table (vehicle id, location, datetime):
34 Warehouse 2011-03-26 18:17:50.000
34 Warehouse 2011-03-26 18:18:30.000
34 Warehouse 2011-03-26 18:19:05.000
34 School 2011-03-26 18:21:34.000
34 School 2011-03-26 18:21:59.000
34 S开发者_运维百科chool 2011-03-26 18:22:42.000
34 School 2011-03-26 18:23:55.000
34 Stadium 2011-03-26 18:24:20.000
34 Stadium 2011-03-26 18:24:47.000
34 Park 2011-03-26 18:25:30.000
34 Park 2011-03-26 18:26:50.000
34 Warehouse 2011-03-26 18:28:50.000
etc.
It's clear from manual inspection that vehicle 34 travelled route "FOO" at least once that day. How can I use SQL to determine the total number times that route was travelled on that day, per vehicle?
I get the feeling I'm going to have to use some sort of control-break structure but I'm hoping there's an easier way. This is the SQL used to retrieve the info from the GPS table.
SELECT
v.VehicleID,
ml.LocationName
gps.Time,
FROM dbo.Routes r
INNER JOIN dbo.RoutePoints rp
ON r.RouteId = rp.RouteId
INNER JOIN dbo.MapLocations ml
ON rp.LocationId = ml.LocationId
INNER JOIN dbo.GPSData gps
ON ml.LowerRightLatitude < gps.Latitude AND ml.UpperLeftLatitude > gps.Latitude
AND ml.UpperLeftLongitude < gps.Longitude AND ml.LowerRightLongitude > gps.Longitude
INNER JOIN dbo.Vehicles v
ON gps.VehicleID = v.VehicleID
WHERE r.Desc = @routename
AND gps.Time BETWEEN @startTime AND @endTime
ORDER BY v.VehicleId, gps.Time
EDIT: Backtracking is not considered part of the same route. Warehouse, School, Stadium, Park, Warehouse, School, Stadium, Park. is only two trips on route FOO. If there's a deviation to a location that isn't part of a known route eg. Warehouse, School, Stadium, Pub, Park, then that location can be ignored. (ie. it will still be considered to be route "FOO")
DECLARE @route
TABLE
(
route INT NOT NULL,
step INT NOT NULL,
destination INT NOT NULL,
PRIMARY KEY (route, step)
)
INSERT
INTO @route
VALUES
(1, 1, 1),
(1, 2, 2),
(1, 3, 3),
(1, 4, 4),
(2, 1, 3),
(2, 2, 4)
DECLARE @gps
TABLE
(
vehicle INT NOT NULL,
destination INT NOT NULL,
ts DATETIME NOT NULL
)
INSERT
INTO @gps
VALUES
(1, 1, '2011-03-30 00:00:00'),
(1, 2, '2011-03-30 00:00:01'),
(1, 1, '2011-03-30 00:00:02'),
(1, 3, '2011-03-30 00:00:03'),
(1, 3, '2011-03-30 00:00:04'),
(1, 3, '2011-03-30 00:00:05'),
(1, 4, '2011-03-30 00:00:06'),
(1, 1, '2011-03-30 00:00:07'),
(1, 3, '2011-03-30 00:00:08'),
(1, 4, '2011-03-30 00:00:09'),
(1, 1, '2011-03-30 00:00:10'),
(1, 2, '2011-03-30 00:00:11'),
(1, 2, '2011-03-30 00:00:12'),
(1, 3, '2011-03-30 00:00:13'),
(1, 3, '2011-03-30 00:00:14'),
(1, 4, '2011-03-30 00:00:15'),
(1, 3, '2011-03-30 00:00:16'),
(1, 4, '2011-03-30 00:00:17')
;
WITH iteration (vehicle, destination, ts, route, edge, step, cnt) AS
(
SELECT vehicle, destination, ts, route, 1, step, cnt
FROM (
SELECT g.vehicle, r.destination, ts, route, step, cnt,
ROW_NUMBER() OVER (PARTITION BY route, vehicle ORDER BY ts) rn
FROM (
SELECT *, COUNT(*) OVER (PARTITION BY route) cnt
FROM @route
) r
JOIN @gps g
ON g.destination = r.destination
WHERE r.step = 1
) q
WHERE rn = 1
UNION ALL
SELECT vehicle, destination, ts, route, edge, step, cnt
FROM (
SELECT i.vehicle, r.destination, g.ts, i.route, edge + 1 AS edge, r.step, cnt,
ROW_NUMBER() OVER (PARTITION BY i.route, g.vehicle ORDER BY g.ts) rn
FROM iteration i
JOIN @route r
ON r.route = i.route
AND r.step = (i.step % cnt) + 1
JOIN @gps g
ON g.vehicle = i.vehicle
AND g.destination = r.destination
AND g.ts > i.ts
) q
WHERE rn = 1
)
SELECT route, vehicle, MAX(edge / cnt)
FROM iteration
GROUP BY
route, vehicle
Here we have two routes: (1, 2, 3, 4)
and (3, 4)
The vehicle made 2 trips on route (1, 2, 3, 4)
and 4 trips on route (3, 4)
.
It is important that each route has the steps numbered starting from 1
and without gaps (though if it's not the case, you can easily work around it using an additional CTE
with ROW_NUMBER()
)
So, I'm posting my attempt at this question for three reasons...
- To Make Quassnoi's answer look even more elegant
- Because I spent about two hours on it and I don't want to throw it away
- On the rare possibility that you're using something earlier than SQL 2005, this might be useful
Here, there are three routes...
Foo: Locations 1, 2, 3, 4
Bar: Locations 4, 3, 2, 1
Quicky: Locations 2, 3
This query, while long, is giving me the correct results for the test data included:
CREATE TABLE #Routes
(
RouteID INT,
RouteName VARCHAR(50),
)
CREATE TABLE #RoutePoints
(
RouteID INT,
LocationID INT,
SequenceNumber INT
)
CREATE TABLE #MapLocations
(
LocationID INT,
LocationName VARCHAR(50)
)
CREATE TABLE #GPSData
(
VehicleID INT,
LocationID INT,
Time DATETIME
)
INSERT INTO #Routes (RouteID, RouteName) VALUES (1, 'Foo')
INSERT INTO #Routes (RouteID, RouteName) VALUES (2, 'Bar')
INSERT INTO #Routes (RouteID, RouteName) VALUES (3, 'Quicky')
INSERT INTO #MapLocations (LocationID, LocationName) VALUES (1, 'Warehouse')
INSERT INTO #MapLocations (LocationID, LocationName) VALUES (2, 'School')
INSERT INTO #MapLocations (LocationID, LocationName) VALUES (3, 'Stadium')
INSERT INTO #MapLocations (LocationID, LocationName) VALUES (4, 'Park')
INSERT INTO #RoutePoints (RouteID, LocationID, SequenceNumber) VALUES (1, 1, 1)
INSERT INTO #RoutePoints (RouteID, LocationID, SequenceNumber) VALUES (1, 2, 2)
INSERT INTO #RoutePoints (RouteID, LocationID, SequenceNumber) VALUES (1, 3, 3)
INSERT INTO #RoutePoints (RouteID, LocationID, SequenceNumber) VALUES (1, 4, 4)
INSERT INTO #RoutePoints (RouteID, LocationID, SequenceNumber) VALUES (2, 4, 1)
INSERT INTO #RoutePoints (RouteID, LocationID, SequenceNumber) VALUES (2, 3, 2)
INSERT INTO #RoutePoints (RouteID, LocationID, SequenceNumber) VALUES (2, 2, 3)
INSERT INTO #RoutePoints (RouteID, LocationID, SequenceNumber) VALUES (2, 1, 4)
INSERT INTO #RoutePoints (RouteID, LocationID, SequenceNumber) VALUES (3, 2, 1)
INSERT INTO #RoutePoints (RouteID, LocationID, SequenceNumber) VALUES (3, 3, 2)
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 1, '2011-03-26 12:17:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 1, '2011-03-26 12:18:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 1, '2011-03-26 12:19:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 2, '2011-03-26 12:20:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 2, '2011-03-26 12:21:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 2, '2011-03-26 12:22:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 2, '2011-03-26 12:23:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 3, '2011-03-26 12:24:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 3, '2011-03-26 12:25:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 1, '2011-03-26 18:17:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 1, '2011-03-26 18:18:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 1, '2011-03-26 18:19:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 2, '2011-03-26 18:20:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 2, '2011-03-26 18:21:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 2, '2011-03-26 18:22:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 2, '2011-03-26 18:23:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 3, '2011-03-26 18:24:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 3, '2011-03-26 18:25:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 4, '2011-03-26 18:26:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 4, '2011-03-26 18:27:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 1, '2011-03-26 18:28:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 1, '2011-03-26 19:17:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 1, '2011-03-26 19:18:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 1, '2011-03-26 19:19:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 2, '2011-03-26 19:20:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 2, '2011-03-26 19:21:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 2, '2011-03-26 19:22:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 2, '2011-03-26 19:23:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 3, '2011-03-26 19:24:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 3, '2011-03-26 19:25:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 4, '2011-03-26 19:26:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 4, '2011-03-26 19:27:50.000')
INSERT INTO #GPSData (VehicleID, LocationID, Time) VALUES (1, 1, '2011-03-26 19:28:50.000')
CREATE TABLE #GPSRoute
(
RouteID INT,
VehicleID INT,
LocationID INT,
SequenceNumber INT,
Time DATETIME,
StepsInRoute INT
)
INSERT INTO #GPSRoute
(
RouteID,
VehicleID,
LocationID,
SequenceNumber,
Time,
StepsInRoute
)
SELECT
r.RouteID,
gps.VehicleID,
rp.LocationID,
rp.SequenceNumber,
gps.Time,
(
SELECT COUNT(*)
FROM #RoutePoints
WHERE RouteID = r.RouteID
)
FROM
#Routes r JOIN
#RoutePoints rp ON r.RouteID = rp.RouteID JOIN
#GPSData gps ON gps.LocationID = rp.LocationID
SELECT
r.RouteID,
r.RouteName,
previousRouteStep.VehicleID,
COUNT(*) / NULLIF(previousRouteStep.StepsInRoute - 1, 0) AS TimesRouteCompleted
FROM
#Routes r JOIN
#GPSRoute previousRouteStep ON r.RouteID = previousRouteStep.RouteID JOIN
#GPSRoute nextRouteStep ON
previousRouteStep.RouteID = nextRouteStep.RouteID AND
previousRouteStep.VehicleID = nextRouteStep.VehicleID AND
previousRouteStep.SequenceNumber + 1 = nextRouteStep.SequenceNumber AND
previousRouteStep.Time < nextRouteStep.Time AND
-- Only include the step if it is followed by the subsequent step.
nextRouteStep.Time = (
SELECT MIN(Time)
FROM #GPSRoute
WHERE
RouteID = nextRouteStep.RouteID AND
VehicleID = nextRouteStep.VehicleID AND
sequenceNumber = nextRouteStep.SequenceNumber AND
Time > previousRouteStep.Time
) AND
-- Only include the step if it is the latest step in the sequence that meets the criteria.
previousRouteStep.Time = (
SELECT MAX(Time)
FROM #GPSRoute
WHERE
RouteID = previousRouteStep.RouteID AND
VehicleID = previousRouteStep.VehicleID AND
sequenceNumber = previousRouteStep.sequenceNumber AND
Time < nextRouteStep.Time
)
WHERE
-- This step is only valid if every preceding step has been done.
NOT EXISTS (
SELECT 1
FROM
#RoutePoints rp LEFT JOIN
#GPSRoute gpsr ON
gpsr.RouteID = rp.RouteID AND
gpsr.LocationID = rp.LocationID AND
gpsr.SequenceNumber = rp.SequenceNumber AND
gpsr.Time < previousRouteStep.Time AND
gpsr.VehicleID = previousRouteStep.VehicleID
WHERE
rp.SequenceNumber < previousRouteStep.SequenceNumber AND
gpsr.RouteID IS NULL AND
rp.RouteID = previousRouteStep.RouteID
)
GROUP BY
r.RouteID,
r.RouteName,
previousRouteStep.VehicleID,
previousRouteStep.StepsInRoute
精彩评论