In today’s data-driven world, high connectivity translates into massive amounts of data being stored. In urban mobility, Intelligent Transportation Systems have set a new paradigm for all stakeholders, on both the demand and the supply side. However, the big data revolution of recent years has also created a need to interpret and work with that data correctly and accurately.

In this regard, Molière’s Mobility Data Marketplace (MDM) has the potential to bring together transit operators and data analysts in a protected environment, where both can share data and knowledge to improve transportation systems.

As announced in previous articles, one of Molière’s use cases, led by the Polytechnic University of Catalonia (UPC), is producing a bus service prediction algorithm which, if applied to data made available in the MDM, will help transit agencies better design and plan their routes and, consequently, optimise the performance of their networks.

The original idea behind the use case was to demonstrate that simple AI algorithms could discover non-obvious relationships between modal choice and the factors that affect transport, such as the time of day or week at which the trip takes place, the trip distance, the weather forecast, the area of origin or destination, or the price. Whilst these relationships have been studied in the past, these AI algorithms could be personalised for different types of end-users, which positions Molière as an advanced solution to the issues the industry is currently experiencing.

Focusing on fixed bus networks, these algorithms could then predict route demand as a function of external factors and of how these affect travel time. This would also make real-time service information more reliable, improving user experience and supporting the modal shift from single-occupancy car use to public transport.

To achieve this goal through the development of the bus service prediction algorithm, the UPC first had to establish geolocation data as the foundation of its work: without reliable, accurate geolocation data as the starting point, bus routes and the impact of external factors on them may not be properly managed or designed. This is because uncertainty in the observed coordinates, and therefore in the many variables derived from them, can play an integral role in scheduling and other important issues.
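To illustrate how coordinate uncertainty carries over into derived variables, the following minimal Python sketch estimates a bus's speed from two position fixes and compares it with the estimate obtained when the second fix is slightly off. The coordinates, the 60-second interval and the function names are hypothetical and chosen only for illustration; this is not the UPC's method.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def speed_kmh(p_start, p_end, seconds):
    """Average speed in km/h between two (lat, lon) fixes taken `seconds` apart."""
    return haversine_m(*p_start, *p_end) / seconds * 3.6


# Hypothetical fixes taken 60 seconds apart (illustrative values only).
true_start = (41.3870, 2.1700)
true_end = (41.3900, 2.1700)    # about 334 m north of the start
noisy_end = (41.3902, 2.1700)   # same fix with roughly 22 m of position error

true_speed = speed_kmh(true_start, true_end, 60)
noisy_speed = speed_kmh(true_start, noisy_end, 60)
relative_error = abs(noisy_speed - true_speed) / true_speed

print(f"true: {true_speed:.1f} km/h, noisy: {noisy_speed:.1f} km/h, "
      f"relative error: {relative_error:.1%}")
```

In this toy case, a position error of a few tens of metres shifts the estimated speed by several percent, and the same error would propagate into any travel-time or schedule figure computed from it.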

Considering all of the above, the algorithm has been developed in two phases: first, data pre-processing (dimensionality reduction, data cleaning and data integration), followed by modelling (exploratory data analysis (EDA), model training and model validation).
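The two phases can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the UPC's implementation: the records, the field names (`hour`, `lat`, `lon`, `travel_min`), the weather lookup and the group-mean "model" are all hypothetical, dimensionality reduction is reduced to simple column selection, and the EDA step is left out for brevity.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: one row per observed bus trip (illustrative values).
trips = [
    {"hour": 8, "lat": 41.39, "lon": 2.17, "travel_min": 14.0},
    {"hour": 8, "lat": 41.39, "lon": 2.17, "travel_min": 16.0},
    {"hour": 8, "lat": None, "lon": 2.17, "travel_min": 15.0},   # missing fix
    {"hour": 14, "lat": 41.39, "lon": 2.17, "travel_min": 10.0},
    {"hour": 14, "lat": 41.39, "lon": 2.17, "travel_min": 11.0},
    {"hour": 14, "lat": 95.0, "lon": 2.17, "travel_min": 9.0},   # impossible latitude
]
weather = {8: "rain", 14: "dry"}  # hypothetical hourly weather observations


# Phase 1: pre-processing
def clean(rows):
    """Data cleaning: drop rows with missing or out-of-range coordinates."""
    return [
        r for r in rows
        if r["lat"] is not None and r["lon"] is not None
        and -90 <= r["lat"] <= 90 and -180 <= r["lon"] <= 180
    ]


def integrate(rows, weather_by_hour):
    """Data integration: attach the weather observed in each trip's hour."""
    return [{**r, "weather": weather_by_hour.get(r["hour"], "unknown")} for r in rows]


def reduce_dims(rows, keep=("hour", "weather", "travel_min")):
    """Crude stand-in for dimensionality reduction: keep selected columns only."""
    return [{k: r[k] for k in keep} for r in rows]


# Phase 2: modelling
def train(rows):
    """Model training: mean travel time per (hour, weather) group."""
    groups = defaultdict(list)
    for r in rows:
        groups[(r["hour"], r["weather"])].append(r["travel_min"])
    return {key: mean(values) for key, values in groups.items()}


def validate(model, rows):
    """Model validation: mean absolute error on held-out rows with a known group."""
    errors = [
        abs(model[(r["hour"], r["weather"])] - r["travel_min"])
        for r in rows if (r["hour"], r["weather"]) in model
    ]
    return mean(errors)


prepared = reduce_dims(integrate(clean(trips), weather))
train_rows, test_rows = prepared[::2], prepared[1::2]  # simple alternating hold-out split
model = train(train_rows)
mae = validate(model, test_rows)
print(model, mae)
```

A real pipeline would replace the group-mean model with a proper learning algorithm and a more careful validation scheme; the point here is only the order of the steps named above.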

In later stages, the models will be applied to real-world data within the MDM, enabling the use of highly accurate, reliable geolocation data as supplied by Galileo.

Achieving the end goal of the bus route prediction tool would mean that bus operators and city authorities could plan time-bound decisions on key aspects of public transport network operation. Examples include reliably adjusting the fleet size to specific demand (e.g. relating longer stop times to peak demand or to high levels of congestion, depending on other elements that could be correlated), or enabling the temporary use of bus lanes by other modes in specific areas or at specific times of the day or week. This would end the treatment of public space as a rigid element of our cities and optimise it for all.