How does Google Maps compute ETA?


Clarifying Questions:

  1. Car, public transport, walk, bike -> let's focus on car
  2. Include real-time traffic? Leave now or leave at another specified date/time? -> include traffic, leave now
  3. Scope: initial calculation or adaptive calculation until the destination is reached? -> let's start with initial and then add the adaptive
  4. Any other customization options to consider (avoid tolls, highways, weather etc.)? -> ignore

Framing the answer. Data needed:

  1. Route from A to B (divided into discrete road segments, each with a segment id) with turn-by-turn instructions
  2. Typical driving speeds on the route from A to B by segment (based on historical data) and by time-segment (let's assume 15-minute time-segments, so 96 per day)
  3. Real-time traffic data (current moving speed of cars on the route between A and B) by segment & time-segment
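To make the time-segment idea concrete, here is a minimal Python sketch of mapping a timestamp to one of the 96 daily 15-minute buckets. The names `BUCKET_MINUTES` and `time_bucket` are hypothetical, chosen only for illustration.

```python
from datetime import datetime

BUCKET_MINUTES = 15                          # bucket width assumed in the answer
BUCKETS_PER_DAY = 24 * 60 // BUCKET_MINUTES  # 96 buckets per day


def time_bucket(ts: datetime) -> int:
    """Return the 0-based index of the 15-minute bucket that contains ts."""
    minutes_since_midnight = ts.hour * 60 + ts.minute
    return minutes_since_midnight // BUCKET_MINUTES
```

Historical and real-time speeds could then be keyed by (segment id, bucket index).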

Calculation: Sum (segment distance / segment speed) for all segments between A and B

Let's now consider each of the data points needed:

  • Segments: the critical unit for tagging information. Given that Maps can identify the small section of road from my house, the segment length is likely to be <50m; for simplicity, assume 25m.
  • Route from A to B divided by segments: this requires roadway data and a service to calculate the turn-by-turn instructions between A and B. Map data is easily available from external sources, but it can also be derived from historical trip data stored within Google (users running GPS & Maps).
  • Typical driving speeds between A and B by segment: broken down by segment id, this data can be fetched from the historical trip data of all Google Maps users who have driven that segment during this time-segment. To limit the search query, we can time-bound this to the past 30 days.
  • Real-time traffic data by segment and time-segment: this will be retrieved from currently stored information (the current time-segment's avg moving speed of cars using Google Maps, for each segment). This can be saved in the local device cache, to be used for the adaptive compute.

Initial ETA: Current timestamp + Sum (25m / avg speed for current time-segment) for all segments between A and B
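Since the travel time over a segment is its length divided by the speed on it, the initial ETA could be sketched as below. This is only an illustration under assumptions: `initial_eta` and the (length, speed) tuple layout are hypothetical, with lengths in metres and speeds in metres per second.

```python
from datetime import datetime, timedelta


def initial_eta(now, segments):
    """Return the arrival time for a route.

    segments: list of (length_m, avg_speed_mps) tuples, one per road segment.
    """
    total_seconds = 0.0
    for length_m, speed_mps in segments:
        if speed_mps <= 0:
            raise ValueError("segment speed must be positive")
        total_seconds += length_m / speed_mps  # time = distance / speed
    return now + timedelta(seconds=total_seconds)
```

For example, four 25m segments at 5 m/s each add 20 seconds to the current timestamp.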

To calculate the adaptive ETA: the user presses start and the ETA begins as the initial ETA. Let's say there are n segments between A and B (S1, S2, S3, ..., Sn). As the user crosses from one segment to the next, Maps sends the remaining segment ids up to B and checks for updated avg speeds for the current time-segment on all segments between the current segment and B. Using the cached data plus any updates, the ETA is adapted as:

Adaptive ETA (after S1 is passed): Current timestamp + Sum (25m / avg speed for current time-segment) for all segments from S2 to B

The avg speed within the current time-segment could be constantly changing as new users and new data points are added.
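A rough sketch of that adaptive step, assuming a fixed 25m segment length and a locally cached dictionary of speeds per segment id (the function name `remaining_seconds` and its parameters are hypothetical):

```python
def remaining_seconds(route, passed_count, speeds_mps, segment_length_m=25.0):
    """Seconds left on the route after `passed_count` segments are behind us.

    route: ordered list of segment ids from A to B.
    speeds_mps: dict of segment id -> latest avg speed (m/s) for the
        current time-segment, e.g. a local cache refreshed from the server.
    """
    remaining = route[passed_count:]  # segments still ahead of the user
    return sum(segment_length_m / speeds_mps[seg] for seg in remaining)
```

Each time the user crosses into a new segment, the client would refresh `speeds_mps` for the remaining ids and call this again; adding the result to the current timestamp gives the adaptive ETA.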

This focuses mainly on the client POV; I haven't discussed the data that goes to the server and is used for the average value calculations.

A few edge cases:

  • What if a route from A to B has a missing segment? Segments can be classified as street, highway, tollway, community road, dirt road etc. This way, even if a segment is identified as missing tagged data, it at least exists with a type and can be handled with predicted speeds.
  • What if a segment has no tagged data (speed etc.)? Ideally, Google’s car would have gone to these segments to collect initial sample data. Otherwise, based on the previous segment's type, we can assign default speed values.
  • Average vs median: median speeds are better, to avoid being skewed by outliers. The sample size also needs to be considered: if a segment does not have many samples (let's say <10), then use the predicted speed rather than the calculated median.
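The median-with-fallback rule can be sketched as follows. The threshold of 10 samples comes from the answer; the default speeds per road type are made-up placeholder values, and `segment_speed` is a hypothetical name.

```python
from statistics import median

MIN_SAMPLES = 10  # below this, the observed median is not trusted

# Placeholder default speeds (m/s) per road type; illustrative values only.
DEFAULT_SPEED_MPS = {"street": 11.0, "highway": 27.0, "dirt road": 6.0}


def segment_speed(samples_mps, road_type):
    """Median of observed speeds, or a per-road-type default if data is sparse."""
    if len(samples_mps) < MIN_SAMPLES:
        return DEFAULT_SPEED_MPS[road_type]
    return median(samples_mps)
```

With samples such as nine cars at 10 m/s, one at 12 m/s and one outlier at 100 m/s, the median stays at 10 m/s, whereas the mean would be pulled up to about 18 m/s.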

First, I’d like to clarify the scope of this question. By ETA, I’m going to assume we’re focusing on the time to reach a destination, and that you’re traveling from your current location to Point B. (Similar logic could be applied for calculating any Point A to Point B.) I’m also going to assume for simplicity that we are focusing only on driving, and not on the other ETAs of walking, biking, etc., though again similar logic probably would work there.

Given that, let’s brainstorm some of the factors that affect ETA.

  • Routes
  • Alternate routes
  • Traffic conditions
  • Weather conditions
  • Time it took other drivers in the past
  • Speed limit on different routes

To calculate ETA, what likely happens is that Google first gets a rough estimate based on historical data (e.g., other drivers who have traveled a similar path from close to point A to close to point B). Then Google can adjust based on current traffic and weather conditions.

It’s likely that Google also does this estimation and adjustment by taking parts of the route and calculating the ETA on each section. The problem then decomposes into smaller sub-problems. For example, if you broke down the route from point A to point B into 10 chunks, you could see how long it would take to traverse each chunk, and then sum it up to get total ETA. This likely is how Google does it because then they can also offer alternate routes by combining the fastest “chunks” to reach your final destination.

Google likely also re-calculates every so often to account for stops that you’ve made along the way or new traffic data. That is why your ETA can change even if you are still en route.

I would like to begin by first asking some clarifying questions.

  1. Google Maps provides travel directions for a variety of methods of transportation – car, walking, bike, public transportation. Would you like to focus on one of these, perhaps the car? Yes, we can just focus on the car.
  2. Is there a particular platform I should discuss? Mobile, Desktop, Watch? I would start with mobile, as almost all mobile devices have a GPS unit and are usually along for the ride. Let’s focus on the mobile app.
  3. Lastly, there are two main use cases of ETA that come to mind. One is prior travel (i.e. planning a trip), and the second is during the travel (i.e. navigation). Is there one use case you would want me to focus on? Let’s go over both.

Ok, so let’s talk about Google’s mission which is to “organize the world’s information and make it universally accessible and useful.” Google Maps seems to be a manifestation of this regarding the physical realm – Where is something? What’s around me? How do I get there?

It’s the last question – “How do I get there?” – in which the ETA function plays a vital role. The purpose of the ETA calculation is to provide the user an estimated time of arrival based upon a route.

Does this align with your understanding? Yes.

Perfect, give me a few moments to structure my thoughts.

Before continuing, one thing that came to mind was the infrastructure and system design necessary to scale this solution. For the sake of this exercise, I would like to focus instead on the core algorithm to compute the ETA and not the system design. However, this can be something we dive into later.

Also, would it be safe to assume that a route has already been selected, or do you want me to discuss how to select and calculate the route? Let’s assume a route has already been selected.

Would that be ok? Yes.

Ok, my approach here would be the following: (1) use cases, (2) available inputs, and (3) using inputs.

Moving onto use cases, as we mentioned earlier, there were two main use cases we discussed.

  1. ETA when NOT en route
  2. ETA when en route

I would like to start with the first use case, as this seems to be the simpler of the two and can be used as a foundation for the latter.

ETA when NOT en route

  • The user has a particular address in mind. There are several ways the user could have obtained the address (search result, typed in, via link), but that shouldn’t be relevant to the purpose of ETA calculation.
  • The user would like to understand how long it would take him/her to get there.
  • We have a few subcases to this
    • 1A) User wants to know the current ETA (i.e. if they left now)
    • 1B) User wants to know a future ETA
    • 1C) User wants to know a past ETA
    • Use case 1C seems to be an edge case, let’s exclude that one for now.

ETA when NOT en route – Inputs

  • User’s current position
  • Destination position
  • Route
  • Current traffic patterns (leveraging Google Maps, Waze, or other asset/partnerships)
  • Predicted traffic patterns
  • Current traffic issues (i.e. accident, construction)
  • Future traffic issues (i.e. scheduled road closure)
  • Historical trips and ETAs

This list is pretty substantial. Unless you want me to consider others, can I take a few moments to think about how I would leverage this information?

ETA when NOT en route – Using Inputs

I want to start with a simple way of calculating the ETA, and then maybe can think of a more complex one.

  1. Split the route into parts (maybe by road and mile). For each part of the route, we should know the distance to be traveled and the current traffic pattern (or the speed limit, if current traffic is not available), enabling us to calculate the duration of each part. Adding up the durations of the preceding parts also gives us the estimated time at which the driver reaches each segment (to be used later).
  2. Check for known issues. Known issues may already be factored into current traffic patterns, but maybe there are parts of the route where Google doesn’t have enough data. So we can limit this check to those segments where the traffic patterns are unknown.
  3. Check for predicted issues. The later the driver is expected to reach a segment, the less reliable the current traffic pattern is for that segment. After a certain threshold, Google should calculate a predicted traffic pattern. A simple example would be a driver initially calculating the ETA before rush hour, but by the time the driver would make it onto a highway, they would be in traffic during rush hour. I want to dig into this a little more, does that work for you? Yes.
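The three steps above can be sketched as a time-dependent sum: each segment's speed is looked up for the time at which the driver is expected to enter it, not for the departure time. This is an assumption-laden illustration; `eta_seconds` and the `speed_at` callback are hypothetical, and the time bucket is evaluated only once per segment, at entry.

```python
BUCKET_SECONDS = 15 * 60  # 15-minute time buckets, 96 per day


def eta_seconds(segments, depart_s, speed_at):
    """Total travel time in seconds with per-segment, time-aware speeds.

    segments: ordered list of (segment_id, length_m).
    depart_s: departure time in seconds since midnight.
    speed_at: hypothetical callback (segment_id, bucket) -> speed in m/s,
        returning current or predicted speed for that time bucket.
    """
    t = depart_s
    for seg_id, length_m in segments:
        bucket = int(t // BUCKET_SECONDS) % 96  # bucket at segment entry
        t += length_m / speed_at(seg_id, bucket)
    return t - depart_s
```

In the rush-hour example, a driver leaving at 16:59 would have the first segment priced at pre-rush-hour speed and the later segments priced at the slower 17:00 bucket.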

Predicted Issues

This piece of the puzzle seems to be the most subjective, and a real opportunity to provide an edge/moat against other players in the market. Machine learning could be valuable here, as it allows the algorithm to assess multiple data points efficiently, as well as improve over time.

I could see adding additional external data points that could help improve the accuracy of the predictions. Some examples could be:

  1. Weather
  2. Major events (sports, concerts, conferences)
  3. Airports and travel volume

We would want to measure the accuracy of the predicted time against the actual time. With the right system in place, these feedback signals should also improve the machine learning algos.
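One simple way to measure this, sketched here with hypothetical function names, is to compare predicted and actual trip durations over completed trips. The mean absolute error shows how far off predictions are on average, and the mean signed error shows whether they are systematically too short or too long.

```python
def mean_absolute_error(predicted_s, actual_s):
    """Average size of the prediction error, in seconds."""
    if len(predicted_s) != len(actual_s) or not predicted_s:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - a) for p, a in zip(predicted_s, actual_s)) / len(predicted_s)


def mean_signed_error(predicted_s, actual_s):
    """Average of (predicted - actual); negative means ETAs were too optimistic."""
    if len(predicted_s) != len(actual_s) or not predicted_s:
        raise ValueError("need two non-empty lists of equal length")
    return sum(p - a for p, a in zip(predicted_s, actual_s)) / len(predicted_s)
```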

Any questions here before I continue? No.

I want to be sensitive to time. For next steps, I can do a similar exercise on how ETA works when en route, I can summarize my thinking, or I can go in a different direction. What would you prefer? Can you briefly go high level over how ETA would be different en route? And summarize your thoughts.

Ok, let me think about the ETA while en route use case, and the differences here. Need a minute.

Let me briefly talk about some of the differences the ETA while en route use case has vs while NOT en route.

  1. User’s current position changes
  2. ETA needs to be continuously updated
  3. UX and communication differences
  4. Parallel calculations on alternative routes

Besides constantly updating the ETA calculation (which can use a similar process as we described earlier), I think a major difference here is what happens when we find a more optimized route. I imagine that, in addition to the ETA calculation for the current route, ETAs should be calculated in the background for other potential routes that the user can take at that moment in time.

Based on these calculations, we may need to communicate updates to the user. Examples could be – (1) updated ETA time of current route; (2) route change based upon more recent information.
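As a sketch of that decision, the app could suggest a route change only when the best alternative saves more than some threshold. The 120-second threshold and the name `best_alternative` are assumptions made for illustration.

```python
def best_alternative(current_eta_s, alternatives, min_saving_s=120):
    """Return the name of an alternative route worth suggesting, or None.

    alternatives: dict of route name -> remaining travel time in seconds.
    min_saving_s: assumed minimum saving before interrupting the user.
    """
    if not alternatives:
        return None
    name, eta_s = min(alternatives.items(), key=lambda kv: kv[1])
    if current_eta_s - eta_s >= min_saving_s:
        return name
    return None
```

A threshold like this avoids bothering the driver with route changes that save only a few seconds.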

To summarize, the question posed was how Google Maps calculates ETA, and we narrowed this to the mobile app with driving as the method of transportation. We identified two use cases (en route, and not en route) and focused on the simpler of the two, which would provide a foundation for the more complex use case.

To calculate the ETA while not en route, we divided up the route into different segments and spoke about the different data points necessary to calculate the ETA as well as how we can leverage ML to provide better calculations.

Even though we isolated just the driving use case, similar principles can be used when thinking about other forms of transportation.

There are a couple of scenarios in which ETA is calculated: planning a future journey and planning a current journey. Can we limit the scope to the current journey? The learnings will likely be transferable. User inputs to ETA include origin, destination, and route chosen. Let’s assume first that the user has chosen an origin, a destination, and a route. We can come back to these assumptions later.

The way I’ll structure this answer is: (1) what are the variables that need to be taken into account, (2) what data does Google have, (3) what qualities does the solution require, (4) combining #’s 1-3 to formulate the best technical solution:

(1) Variables that can be taken into account, and (2) the data Google has for each (listed as variable – data source):

  • Average speed of current traffic – phones running maps provide GPS & speed data
  • Obstacles (collisions, construction, events) – crowd-sourced from Waze and perhaps third party API
  • Weather conditions can be used as an external regressor for time series prediction – connection with internal weather tools
  • Historical speed of traffic (for time series modeling/prediction) – phones running maps provide GPS & speed data
  • Historical similar trip time (possibly can be used in ML models, but increases complexity) – historical trips logged, with routes taken and time to complete

(3) The solution needs to be fast (local compute, cache data when possible), accurate (GPS used), reactive & live (continuously pinging the server).

(4) Based on user input, Maps knows the time of departure, route, and destination. Models run in the Maps app, taking into account the variables & data listed above, and the output is the ETA. Below, I will go into possible models in the Maps app.

  • Generating a baseline ETA (quick user feedback): the first thing the app should do is generate a bare-bones model based on cached data on historical speeds of traffic on major roadways, time of day, weather conditions, and presence/absence of obstacles. Similar trips (same origin and destination during similar time of day and weather) can also be taken into account here. The output should be an ETA and a statistical range (such as a confidence interval). All these data should be stored locally to optimize compute time. Nothing live is taken into account in the baseline model.
  • Adjusting the ETA based on current speed of traffic: once the baseline ETA is generated based mostly on historical data, the app should adjust the ETA based on current speed of traffic. A model can take the speeds and locations of other cars on the same route (outliers should be removed) to calculate the amount of time a car traveling at the current average speed on the roads will take to arrive at the destination. Check how far the updated ETA is from the baseline ETA and notify the user if it’s longer than expected.
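A minimal sketch of the adjustment step, under stated assumptions: outliers are dropped with a crude rule (keep speeds within 50% of the median), and the user is notified when the live ETA exceeds the baseline by more than 10%. Both thresholds and both function names are hypothetical.

```python
from statistics import median


def live_eta_seconds(distance_m, observed_speeds_mps):
    """Travel time at the current average speed, after dropping outliers."""
    med = median(observed_speeds_mps)
    # Crude outlier filter (assumption): keep speeds within 50% of the median.
    kept = [s for s in observed_speeds_mps if 0.5 * med <= s <= 1.5 * med]
    avg = sum(kept) / len(kept) if kept else med
    return distance_m / avg


def should_notify(baseline_s, live_s, tolerance=0.10):
    """True if the live ETA is more than `tolerance` longer than the baseline."""
    return live_s > baseline_s * (1 + tolerance)
```

For instance, with observed speeds of 10, 10, 10, 10 and 40 m/s over 6000m, the 40 m/s reading is dropped and the live estimate is 600 seconds.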

This solution provides a method of calculating ETA for one route. Now, the real Maps app provides multiple route suggestions, chosen from among all the possible routes. ETAs are taken into account for these suggestions, so there must be a cache of historical baseline ETAs, given current time/weather/obstacles.

This solution goes through the relevant variables, how Google has data for these variables, and the design of a model to meet specific user needs. Some things I could have considered more include (i) how the live data is stored in Google data centers and what type of data store to use, and (ii) how it technically reaches the Maps app and is ingested into the model.