pgtfs_csa
pgtfs_csa
— Perform the connection scan algorithm to identify the shortest arrival time between an origin and a destination, given a departure time and a GTFS network.
Availability
Version 0.0.1
Official function
Version 0.0.2
Fixed incorrect arrival times
Version 0.0.3
Added minimize_transfers boolean flag (which defaults to false)
Description
The algorithm scans through a sequence of transit stops and connections, exploring potential routes from an origin to a destination. Starting from an origin, the algorithm iteratively scans connections departing from that stop and moves on to its neighbor if a transfer is possible. This is done recursively until the destination is reached, or there are no more feasible connections.
Signature
pgtfs_csa(
origin TEXT,
destination TEXT,
departure_time DOUBLE PRECISION,
network TEXT,
minimize_transfers BOOLEAN
)
RETURNS TABLE (
stop_id TEXT,
stop_sequence INT,
arrival_time DOUBLE PRECISION,
trip_id TEXT
)
Input columns
Name |
Type |
Description |
---|---|---|
origin |
|
Id of the origin |
destination |
|
Id of the destination |
departure_time |
|
Epoch for departure time |
network |
SQL query to select the GTFS network (see below for definition) |
|
minimize_transfers |
|
Boolean flack on whether to minimize number of transfers (defaults to false) |
Result columns
Name |
Type |
Description |
---|---|---|
stop_id |
|
Id of the stop |
stop_sequence |
|
Position of the stop within the route |
arrival_time |
|
Epoch for arrival time |
trip_id |
|
Id of the trip for the stop sequence |
Network SQL
The network query should return the following data
Name |
Type |
Description |
---|---|---|
trip_id |
|
Id of the trip |
stop_id |
|
Id of the stop |
arrival_time |
|
Arrival time at the stop |
departure_time |
|
Departure time from the stop |
stop_sequence |
|
Stop sequence within trip |
Example query
select stop_id, stop_sequence, to_timestamp(arrival_time), trip_id from pgtfs_csa(
'1', -- origin
'6', -- destination
extract (epoch from '2024-01-01 11:55:00+00'::timestamptz)::double precision, -- expected departure time
$bd$
select
trip_id,
stop_id,
extract (epoch from arrival_time)::double precision,
extract (epoch from departure_time)::double precision,
stop_sequence::int
from (
values
('trip-1','1',NULL::timestamptz, '2024-01-01 12:00:00+00'::timestamptz,1),
('trip-1','2','2024-01-01 12:10:00+00','2024-01-01 12:15:00+00',2),
('trip-1','3','2024-01-01 12:20:00+00','2024-01-01 12:25:00+00',3),
('trip-1','4','2024-01-01 12:30:00+00',NULL,4),
('trip-2','3',NULL, '2024-01-01 12:45:00+00',1),
('trip-2','5','2024-01-01 12:50:00+00','2024-01-01 12:55:00+00',2),
('trip-2','6','2024-01-01 13:00:00+00',NULL,3)
)
as data (trip_id, stop_id, arrival_time, departure_time, stop_sequence)
$bd$
);
Which produces a result similar to
stop_id | stop_sequence | to_timestamp | trip_id
--------+---------------+------------------------+---------
1 | 0 | 2024-01-01 11:55:00+00 | trip-1
2 | 1 | 2024-01-01 12:10:00+00 | trip-1
3 | 2 | 2024-01-01 12:20:00+00 | trip-1
5 | 3 | 2024-01-01 12:50:00+00 | trip-2
6 | 4 | 2024-01-01 13:00:00+00 | trip-2
Minimizing transfers on route
The default version of CSA does not have any mechanism that allows for minimizing the number of transfers on a route.
For example, the following query .. code:
select stop_id, stop_sequence, to_timestamp(arrival_time), trip_id from pgtfs_csa(
'1', -- origin
'4', -- destination
extract (epoch from '2024-01-01 11:55:00+00'::timestamptz)::double precision, -- expected departure time
$bd$
select
trip_id,
stop_id,
extract (epoch from arrival_time)::double precision,
extract (epoch from departure_time)::double precision,
stop_sequence::int
from (
values
('trip-1','1',NULL::timestamptz, '2024-01-01 12:00:00+00'::timestamptz,1),
('trip-1','2','2024-01-01 12:10:00+00','2024-01-01 12:15:00+00',2),
('trip-1','3','2024-01-01 12:20:00+00','2024-01-01 12:25:00+00',3),
('trip-1','4','2024-01-01 12:30:00+00',NULL,4),
('trip-2','1',NULL, '2024-01-01 12:00:00+00',1),
('trip-2','2','2024-01-01 12:10:00+00','2024-01-01 12:15:00+00',2),
('trip-2','4','2024-01-01 12:30:00+00',NULL,3)
)
as data (trip_id, stop_id, arrival_time, departure_time, stop_sequence)
$bd$
);
Produces a result spanning two different trips
stop_id | stop_sequence | to_timestamp | trip_id
--------+---------------+------------------------+---------
1 | 0 | 2024-01-01 12:55:00+01 | trip-1
2 | 1 | 2024-01-01 13:10:00+01 | trip-1
4 | 2 | 2024-01-01 13:30:00+01 | trip-2
In this case, we can utilize the minimize_transfers variable and set it to TRUE as in the query below
select stop_id, stop_sequence, to_timestamp(arrival_time), trip_id from pgtfs_csa(
'1', -- origin
'4', -- destination
extract (epoch from '2024-01-01 11:55:00+00'::timestamptz)::double precision, -- expected departure time
$bd$
select
trip_id,
stop_id,
extract (epoch from arrival_time)::double precision,
extract (epoch from departure_time)::double precision,
stop_sequence::int
from (
values
('trip-1','1',NULL::timestamptz, '2024-01-01 12:00:00+00'::timestamptz,1),
('trip-1','2','2024-01-01 12:10:00+00','2024-01-01 12:15:00+00',2),
('trip-1','3','2024-01-01 12:20:00+00','2024-01-01 12:25:00+00',3),
('trip-1','4','2024-01-01 12:30:00+00',NULL,4),
('trip-2','1',NULL, '2024-01-01 12:00:00+00',1),
('trip-2','2','2024-01-01 12:10:00+00','2024-01-01 12:15:00+00',2),
('trip-2','4','2024-01-01 12:30:00+00',NULL,3)
)
as data (trip_id, stop_id, arrival_time, departure_time, stop_sequence)
$bd$,
TRUE
);
This produces a result spanning one trip
stop_id | stop_sequence | to_timestamp | trip_id
--------+---------------+------------------------+---------
1 | 0 | 2024-01-01 12:55:00+01 | trip-2
2 | 1 | 2024-01-01 13:10:00+01 | trip-2
4 | 2 | 2024-01-01 13:30:00+01 | trip-2
Note that this is not an exhaustive solution that works on all cases, and more advanced algorithms are better suited if minimizing transfers is essential.