pgtfs_csa

pgtfs_csa — Perform the connection scan algorithm to identify the shortest arrival time between an origin and a destination, given a departure time and a GTFS network.

Availability

  • Version 0.0.1

    • Official function

  • Version 0.0.2

    • Fixed incorrect arrival times

  • Version 0.0.3

    • Added minimize_transfers boolean flag (which defaults to false)

Description

The algorithm scans through a sequence of transit stops and connections, exploring potential routes from an origin to a destination. Starting from an origin, the algorithm iteratively scans connections departing from that stop and moves on to its neighbor if a transfer is possible. This is done recursively until the destination is reached, or there are no more feasible connections.

Signature

pgtfs_csa(
 origin TEXT,
 destination TEXT,
 departure_time DOUBLE PRECISION,
 network TEXT,
 minimize_transfers BOOLEAN
 )

RETURNS TABLE (
 stop_id TEXT,
 stop_sequence INT,
 arrival_time DOUBLE PRECISION,
 trip_id TEXT
 )

Input columns

Name

Type

Description

origin

TEXT

Id of the origin

destination

TEXT

Id of the destination

departure_time

DOUBLE PRECISION

Epoch for departure time

network

Network SQL

SQL query to select the GTFS network (see below for definition)

minimize_transfers

BOOLEAN

Boolean flack on whether to minimize number of transfers (defaults to false)

Result columns

Name

Type

Description

stop_id

TEXT

Id of the stop

stop_sequence

TEXT

Position of the stop within the route

arrival_time

DOUBLE PRECISION

Epoch for arrival time

trip_id

TEXT

Id of the trip for the stop sequence

Network SQL

The network query should return the following data

Name

Type

Description

trip_id

TEXT

Id of the trip

stop_id

TEXT

Id of the stop

arrival_time

DOUBLE PRECISION

Arrival time at the stop

departure_time

DOUBLE PRECISION

Departure time from the stop

stop_sequence

INT

Stop sequence within trip

Example query

select stop_id, stop_sequence, to_timestamp(arrival_time), trip_id from pgtfs_csa(
    '1', -- origin
    '6', -- destination
    extract (epoch from '2024-01-01 11:55:00+00'::timestamptz)::double precision, -- expected departure time
    $bd$
    select
        trip_id,
        stop_id,
        extract (epoch from arrival_time)::double precision,
        extract (epoch from departure_time)::double precision,
        stop_sequence::int
        from (
            values
            ('trip-1','1',NULL::timestamptz, '2024-01-01 12:00:00+00'::timestamptz,1),
            ('trip-1','2','2024-01-01 12:10:00+00','2024-01-01 12:15:00+00',2),
            ('trip-1','3','2024-01-01 12:20:00+00','2024-01-01 12:25:00+00',3),
            ('trip-1','4','2024-01-01 12:30:00+00',NULL,4),
            ('trip-2','3',NULL, '2024-01-01 12:45:00+00',1),
            ('trip-2','5','2024-01-01 12:50:00+00','2024-01-01 12:55:00+00',2),
            ('trip-2','6','2024-01-01 13:00:00+00',NULL,3)
        )
        as data (trip_id, stop_id, arrival_time, departure_time, stop_sequence)
    $bd$
);

Which produces a result similar to

stop_id | stop_sequence |      to_timestamp      | trip_id
--------+---------------+------------------------+---------
1       |             0 | 2024-01-01 11:55:00+00 | trip-1
2       |             1 | 2024-01-01 12:10:00+00 | trip-1
3       |             2 | 2024-01-01 12:20:00+00 | trip-1
5       |             3 | 2024-01-01 12:50:00+00 | trip-2
6       |             4 | 2024-01-01 13:00:00+00 | trip-2

Minimizing transfers on route

The default version of CSA does not have any mechanism that allows for minimizing the number of transfers on a route.

For example, the following query .. code:

select stop_id, stop_sequence, to_timestamp(arrival_time), trip_id from pgtfs_csa(
    '1', -- origin
    '4', -- destination
    extract (epoch from '2024-01-01 11:55:00+00'::timestamptz)::double precision, -- expected departure time
    $bd$
    select
        trip_id,
        stop_id,
        extract (epoch from arrival_time)::double precision,
        extract (epoch from departure_time)::double precision,
        stop_sequence::int
        from (
            values
            ('trip-1','1',NULL::timestamptz, '2024-01-01 12:00:00+00'::timestamptz,1),
            ('trip-1','2','2024-01-01 12:10:00+00','2024-01-01 12:15:00+00',2),
            ('trip-1','3','2024-01-01 12:20:00+00','2024-01-01 12:25:00+00',3),
            ('trip-1','4','2024-01-01 12:30:00+00',NULL,4),
            ('trip-2','1',NULL, '2024-01-01 12:00:00+00',1),
            ('trip-2','2','2024-01-01 12:10:00+00','2024-01-01 12:15:00+00',2),
            ('trip-2','4','2024-01-01 12:30:00+00',NULL,3)
        )
        as data (trip_id, stop_id, arrival_time, departure_time, stop_sequence)
    $bd$
);

Produces a result spanning two different trips

stop_id | stop_sequence |      to_timestamp      | trip_id
--------+---------------+------------------------+---------
1       |             0 | 2024-01-01 12:55:00+01 | trip-1
2       |             1 | 2024-01-01 13:10:00+01 | trip-1
4       |             2 | 2024-01-01 13:30:00+01 | trip-2

In this case, we can utilize the minimize_transfers variable and set it to TRUE as in the query below

select stop_id, stop_sequence, to_timestamp(arrival_time), trip_id from pgtfs_csa(
    '1', -- origin
    '4', -- destination
    extract (epoch from '2024-01-01 11:55:00+00'::timestamptz)::double precision, -- expected departure time
    $bd$
    select
        trip_id,
        stop_id,
        extract (epoch from arrival_time)::double precision,
        extract (epoch from departure_time)::double precision,
        stop_sequence::int
        from (
            values
            ('trip-1','1',NULL::timestamptz, '2024-01-01 12:00:00+00'::timestamptz,1),
            ('trip-1','2','2024-01-01 12:10:00+00','2024-01-01 12:15:00+00',2),
            ('trip-1','3','2024-01-01 12:20:00+00','2024-01-01 12:25:00+00',3),
            ('trip-1','4','2024-01-01 12:30:00+00',NULL,4),
            ('trip-2','1',NULL, '2024-01-01 12:00:00+00',1),
            ('trip-2','2','2024-01-01 12:10:00+00','2024-01-01 12:15:00+00',2),
            ('trip-2','4','2024-01-01 12:30:00+00',NULL,3)
        )
        as data (trip_id, stop_id, arrival_time, departure_time, stop_sequence)
    $bd$,
    TRUE
);

This produces a result spanning one trip

stop_id | stop_sequence |      to_timestamp      | trip_id
--------+---------------+------------------------+---------
1       |             0 | 2024-01-01 12:55:00+01 | trip-2
2       |             1 | 2024-01-01 13:10:00+01 | trip-2
4       |             2 | 2024-01-01 13:30:00+01 | trip-2

Note that this is not an exhaustive solution that works on all cases, and more advanced algorithms are better suited if minimizing transfers is essential.

See also