This blog post explores Google's authentication flow using nothing but the browser's inspect tool. By tracing the requests the browser makes, we try to infer how Google manages the login process.
Imagine we're not logged into our Google account and try to open one of its products, such as Docs or Forms. We'll follow the flow step by step by watching the network tab.
As planned, let's start by trying to access one of Google's products, Docs.
When we try to open Docs without being logged into any Google account, the browser is automatically redirected to Google's login page for authentication.
How does this happen? Let us observe the network calls made when Docs is accessed.
In the network tab, we can see a series of endpoint calls made to check whether the user is logged into the browser with a Google account.
The endpoints called during this process are:
document/?
As observed, this endpoint is part of the docs.google.com service. Judging from its response, its purpose appears to be to redirect the request to another endpoint called ServiceLogin, which belongs to the accounts.google.com service. The redirect is signalled by the HTTP status code 302, with the target URL carried in the response's Location header.
ServiceLogin
Presumably, this endpoint verifies the user's session. If the user is logged in, it proceeds to the next step, which we will see later in the logged-in scenario. In our current situation the user is not logged in, so the response redirects to a new endpoint named InteractiveLogin, which also belongs to accounts.google.com. The exact purpose of this endpoint is unclear, but it appears to be responsible for sending the user on to the Google login page.
The InteractiveLogin endpoint redirects to an endpoint called identifier, which serves the Google login page as the response.
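To make the chain above concrete, here is a minimal sketch in Python that replays it offline. Nothing here talks to Google: the URLs are simplified placeholders, the table of responses is hypothetical, and only the 302 on the document endpoint was actually stated above (the other status codes are assumed).

```python
# Hypothetical model of the redirect chain seen in the network tab.
# Each URL maps to (status code, Location header or None).
# Paths are simplified placeholders, not Google's real URLs.
OBSERVED_RESPONSES = {
    "https://docs.google.com/document/": (302, "https://accounts.google.com/ServiceLogin"),
    # The next two status codes are assumed to be 302 as well.
    "https://accounts.google.com/ServiceLogin": (302, "https://accounts.google.com/InteractiveLogin"),
    "https://accounts.google.com/InteractiveLogin": (302, "https://accounts.google.com/identifier"),
    # The login page itself is served with a normal response.
    "https://accounts.google.com/identifier": (200, None),
}

REDIRECT_CODES = {301, 302, 303, 307, 308}


def follow_redirects(start_url, responses, max_hops=10):
    """Return every URL visited, following redirects the way a browser would."""
    visited = [start_url]
    url = start_url
    for _ in range(max_hops):
        status, location = responses[url]
        if status not in REDIRECT_CODES or location is None:
            return visited  # final page reached
        url = location
        visited.append(url)
    raise RuntimeError("too many redirects")


if __name__ == "__main__":
    for hop in follow_redirects("https://docs.google.com/document/", OBSERVED_RESPONSES):
        print(hop)
```

Running it prints the four hops in order, ending at the identifier page, which matches the sequence we saw in the network tab.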
After entering our credentials and completing the normal authentication process, we typically encounter a welcome page. This serves as the final step before gaining access to the product, as depicted in the image below.
After choosing to continue, you will be directed to the Google Accounts page. From there, you can access any desired product from the available apps, signifying the successful completion of the authentication process.
Is that everything this blog covers? Absolutely not!
Let us now try to access another product, Forms, from the Google apps menu.
Keep in mind that we have successfully authenticated (logged in) with one of our Google accounts.
Because we logged in during the previous steps, we can now access the application directly.
As before, a series of endpoints is called when we open the application. This time, however, the sequence differs slightly from the one in the first section. Let's look at the network calls and analyze specific aspects.
forms/?
As observed, this endpoint is also part of the docs.google.com service. As assumed earlier, its purpose is to redirect requests to another endpoint called ServiceLogin, which belongs to the accounts.google.com service.
This time, the ServiceLogin endpoint redirects to a new endpoint called SetOSID. This endpoint likely handles the setting of crucial cookies for future use before redirecting to the forms server at docs.google.com/forms to access the form application.
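The branching we have seen so far can be summarized in a few lines of Python. This is only a sketch of our guess: the function name service_login is made up, and treating the presence of an "SID" cookie as the session check is an assumption, not something the network tab confirms.

```python
def service_login(cookies, continue_url):
    """Return the assumed next hops after ServiceLogin.

    `cookies` is a dict of cookie name -> value sent with the request.
    Using "SID" as the session marker is an assumption for illustration.
    """
    if cookies.get("SID"):
        # Logged in: set the app-specific cookies, then return to the app.
        return ["SetOSID", continue_url]
    # Not logged in: go through the interactive login pages.
    return ["InteractiveLogin", "identifier"]


if __name__ == "__main__":
    forms = "https://docs.google.com/forms/"
    print(service_login({}, forms))
    print(service_login({"SID": "some-session-value"}, forms))
```

The first call models the Docs attempt from the earlier section, the second models the Forms attempt after logging in.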
If we try to access any of the other applications, the same mechanism runs. Once the necessary steps are complete, we eventually reach the application we were trying to open.
The question that prompted me to write this blog is: how do we finally reach the application after passing through various servers along the way? The answer lies in a simple mechanism: the URL of the originally requested application is passed to every subsequent server in the chain as a query parameter named "continue" (see the section highlighted in yellow in the image above).
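Here is a small Python sketch of how such a "continue" parameter can be attached to a login URL and read back at the other end. The helper names build_login_url and extract_continue are made up for illustration, and the ServiceLogin URL is shortened; only the parameter name comes from what we observed.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Simplified placeholder for the login endpoint seen in the network tab.
LOGIN_ENDPOINT = "https://accounts.google.com/ServiceLogin"


def build_login_url(destination):
    """Append the original destination as a URL-encoded 'continue' parameter."""
    return LOGIN_ENDPOINT + "?" + urlencode({"continue": destination})


def extract_continue(url):
    """Read the 'continue' parameter back out, or return None if it is absent."""
    values = parse_qs(urlsplit(url).query).get("continue")
    return values[0] if values else None


if __name__ == "__main__":
    login_url = build_login_url("https://docs.google.com/forms/")
    print(login_url)
    print(extract_continue(login_url))
```

Because every hop carries the destination in its own URL, none of the servers in the chain has to remember where the user came from. A real implementation would most likely also check that the destination is a trusted address before redirecting to it; that check is left out of this sketch.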
For all applications, the process proceeds in the same manner, ultimately granting access or redirecting back to the sign-in process if you are not authenticated.
We now have a working picture of Google's authentication flow. It is not comprehensive, but it is more than we knew before. But is this the end of the blog? Ha! Definitely not!
How do all other application servers recognize that the user is authenticated? As most of you have probably guessed, it's through cookies.
The crucial aspect here is the manner in which cookies are managed, as they play a vital role in ensuring a seamless login experience across various applications.
Cookies
As widely understood, once cookies have been set in the browser, they are automatically included in subsequent requests to the servers they are scoped to.
The manner in which cookies are attached plays a significant role. There are two strategies for setting cookies: Domain Specific and Generic.
The screenshot below shows the Cookies section of the browser's Application tab. It displays several columns, including "Name" (the key of the cookie), "Value" (the actual value of the cookie), and "Domain" (the domain associated with the cookie). While the other properties are self-explanatory, the Domain column is of particular importance.
For instance, consider the cookie "SID", which has ".google.com" in the Domain column. This indicates that the browser sends the cookie to any subdomain under "google.com," such as "docs.google.com," "accounts.google.com," and "xyz.google.com." Setting the cookie this way makes it available across all subdomains of the parent domain.
On the other hand, let's examine the "COMPASS" cookie, with a Domain value of "docs.google.com." This signifies that the cookie is exclusively accessible to the "docs.google.com" server.
In essence, "SID" is a Generic Cookie, while "COMPASS" is a Domain-Specific Cookie.
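The difference between the two kinds of cookies can be sketched as a simple matching rule in Python. This is a simplified illustration, not the browser's full algorithm: the Cookie class and the host_only flag are names invented for this post, and marking COMPASS as bound to one exact host is our assumption based on the Domain column.

```python
from dataclasses import dataclass


@dataclass
class Cookie:
    name: str
    domain: str      # value shown in the Domain column
    host_only: bool  # True if the cookie is bound to one exact host (assumed)


def is_sent_to(cookie, host):
    """Simplified check: would the browser attach this cookie for `host`?"""
    domain = cookie.domain.lstrip(".").lower()
    host = host.lower()
    if cookie.host_only:
        return host == domain
    # Generic cookie: the domain itself and every subdomain match.
    return host == domain or host.endswith("." + domain)


JAR = [
    Cookie("SID", ".google.com", host_only=False),        # Generic
    Cookie("COMPASS", "docs.google.com", host_only=True),  # Domain-Specific
]


def cookies_for(host, jar):
    """Names of the cookies that would accompany a request to `host`."""
    return [c.name for c in jar if is_sent_to(c, host)]


if __name__ == "__main__":
    for host in ("docs.google.com", "accounts.google.com", "example.com"):
        print(host, cookies_for(host, JAR))
```

With this rule, a request to docs.google.com carries both cookies, a request to accounts.google.com carries only SID, and an unrelated host gets neither.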
This strategy simplifies the authentication mechanism, allowing for a single login to access multiple applications.
Based on these observations, let's deduce the assumed flow for an unauthenticated user as depicted in the images below.
The authenticated version follows:
The process we've been learning and encountering is essentially a mechanism that leads to the implementation of Single Sign-On (SSO).
This is one way of implementing SSO.
Conclusion
This blog is an exploration of concepts previously unknown to us, based on assumptions drawn from observations made with the tools at hand. It's important to note that the conclusions drawn here are not guaranteed to represent the exact processes followed by real-world organizations like Google. In reality, implementing systems such as these involves a much deeper level of complexity and detail. This exploration serves as an opportunity to learn and expand our understanding. Keep learning and exploring!