Google will cache all content (and remove it after a certain period of time if it isn't requested anymore); some others, like Netflix, will specifically cache popular content.
But it's the same idea really: brand new content may be slightly slower than it would be from the cache.
Here's some documentation from a few years ago; it still works more or less the same:
quote:
The explosion of broadband access and rich multimedia content continually increases the demand on service provider networks. Google Global Cache (GGC) allows you to serve Google content, primarily video, from the edge of your own network. This eases congestion on your network and lessens traffic on peering and transit links. GGC saves you money while improving the experience of your users.
System Overview
Without GGC, every user request for the latest YouTube video causes a unique copy of that video to transit your network, from Google to your user. With GGC, only the first copy of the video makes the transit. When another user requests that same video, Google serves it from your GGC node.
GGC Features
- Reduced traffic through your network
Cache hit rates vary with the usage pattern of your users, but typical performance is close to 75%.
- Faster response, transparent to users
Google transparently serves your users’ requests from caches inside your network.
- Easy to set up
Installation requires a rack, a laptop, a copy of our CD, and a connection to your network. Once the servers have been initially configured and are reachable, Google will do the rest of the work and monitoring remotely.
- Robust
The node has multiple levels of redundancy. If the GGC node is unavailable for any reason, user requests will be sent transparently to Google.
How GGC Works
When a user requests a piece of content – for example, a video, web page, or image – Google systems determine if that content can be served from the GGC node inside your network and if the user is authorized to access the GGC node.
If the GGC node already has the requested content in its local cache, it will serve the content directly to the end user, improving the user experience and saving transit expense.
If the content is not stored on the GGC node, the node will retrieve it from Google, serve it to the user, and store it for future requests.
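To make the hit/miss behaviour above concrete, here is a minimal read-through cache sketch in Python. It is only an illustration of the general technique, not Google's implementation: the class name, the `fetch_from_origin` callback, and the fake origin are all assumptions, and authorization, eviction, and expiry are left out.

```python
from typing import Callable, Dict, List


class ReadThroughCache:
    """Toy read-through cache: fill on the first request, serve locally afterwards."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        # Hypothetical stand-in for "retrieve it from Google".
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        if key in self._store:
            # Cache hit: serve directly from local storage.
            self.hits += 1
            return self._store[key]
        # Cache miss: fetch from the origin, keep a copy, then serve it.
        self.misses += 1
        content = self._fetch(key)
        self._store[key] = content
        return content


# Fake origin that records how often it is contacted.
origin_calls: List[str] = []


def fake_origin(key: str) -> bytes:
    origin_calls.append(key)
    return f"content of {key}".encode()


cache = ReadThroughCache(fake_origin)
first = cache.get("video-123")   # miss: goes to the origin
second = cache.get("video-123")  # hit: served from the local store
```

After the two requests, the origin has been contacted once and the second copy never crossed the transit link.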
Request Flow Diagram
Diagram Key
1. A user follows a link to a Google-hosted video or other content. The computer generates a DNS request for the IP address of the content host.
2. Your DNS resolver queries Google DNS for the IP address of the content host.
3. Google DNS knows that you have Google Global Cache, so it replies with the IP address of the node. It knows this because you have advertised the IP addresses of your DNS resolver to the node (via BGP) and Google has loaded that information into its DNS system.
4. Your DNS returns the IP address of the cache node to the user.
5. The user’s computer now sends the content request to the received IP address, which routes to your GGC node.
6. The node validates that the user should be served from this node. It does this by comparing the user’s IP address to the list of IP blocks advertised to the node via BGP. If the address is not valid for the requested content, the user is redirected to a cache on the Google network.
7. If the content is not already on the GGC node, the node requests the content from Google and caches it.
8. Once the GGC node has the content, it serves it to the user. The content is retained on the node so that the next request can be served without pulling the content from Google.
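Steps 3 and 6 both come down to checking an IP address against the prefixes advertised over BGP. The sketch below shows that check with Python's standard `ipaddress` module. The prefixes, the node and fallback addresses (taken from the reserved documentation ranges), and the function names are all made up for illustration; the real systems are certainly more involved.

```python
import ipaddress

# Hypothetical prefixes an ISP advertises to the node via BGP
# (covering both end users and DNS resolvers).
ADVERTISED_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

NODE_IP = "203.0.113.10"     # hypothetical address of the cache node
GOOGLE_IP = "203.0.113.200"  # hypothetical address on the Google network


def is_advertised(address: str) -> bool:
    """Return True if the address falls inside any advertised prefix."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in ADVERTISED_PREFIXES)


def resolve_content_host(resolver_ip: str) -> str:
    """Step 3: answer with the node's IP only for known resolvers."""
    return NODE_IP if is_advertised(resolver_ip) else GOOGLE_IP


def handle_request(client_ip: str) -> str:
    """Step 6: serve advertised users locally, redirect everyone else."""
    return "serve-from-node" if is_advertised(client_ip) else "redirect-to-google"
```

For example, `resolve_content_host("192.0.2.53")` returns the node address, while `handle_request("203.0.113.99")` returns `"redirect-to-google"` because that client is outside the advertised prefixes.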
Maintenance and Support
The GGC system is designed with multiple levels of redundancy. Content and user requests are spread across all available servers, so if a server failure occurs, another server in the GGC node can immediately respond to the user request. If a server is unable to respond to a request, it will redirect the user back to Google.
Google monitoring will detect failures and attempt to resolve issues remotely. In the event that the failure cannot be repaired remotely, we will contact your technical contact to schedule the next step. This next step could include hardware diagnosis, hardware replacement, or software reinstallation. If a hardware swap is required, Google will arrange for the RMA.
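The documentation does not say how requests are spread across servers, so the following is purely a guess at one possible scheme: hash the content key over the servers that are currently healthy, and fall back to Google when none are. The function name, the health map, and the `"google"` sentinel are all hypothetical.

```python
import hashlib
from typing import Dict


def pick_server(content_key: str, server_health: Dict[str, bool]) -> str:
    """Pick a healthy server for a content key, or fall back to Google."""
    healthy = sorted(name for name, is_up in server_health.items() if is_up)
    if not healthy:
        # No server in the node can respond: send the user back to Google.
        return "google"
    # Stable hash, so the same key maps to the same server between calls.
    digest = hashlib.sha256(content_key.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(healthy)
    return healthy[index]
```

With simple modulo hashing like this, many keys move to a different server whenever the healthy set changes; consistent hashing is a common way to limit that, though nothing in the quoted text indicates which approach is used.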
Frequently Asked Questions
Q: What is the expected hit rate of the video cache?
A: Cache hit rate will vary based on the traffic and usage pattern of your users. Typical cache hit rate is between 70% and 90%.
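As a rough back-of-the-envelope illustration of what those hit rates mean for transit links, assuming the hit rate is measured in bytes rather than in requests (the documentation does not say which), only the missed fraction of demand has to be pulled from Google. The function name and the 10 Gbit/s demand figure are made up.

```python
def transit_traffic_gbps(demand_gbps: float, hit_rate: float) -> float:
    """Traffic still crossing peering/transit links, given a byte hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Only cache misses are fetched from Google over the transit link.
    return demand_gbps * (1.0 - hit_rate)


# Hypothetical 10 Gbit/s of user demand for cacheable content.
low = transit_traffic_gbps(10.0, 0.70)   # about 3 Gbit/s still transits
high = transit_traffic_gbps(10.0, 0.90)  # about 1 Gbit/s still transits
```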
Q: Will the users be required to make any changes to take advantage of GGC? Could this generate additional call volume on our technical support lines?
A: The system will be transparent to users. If the GGC node is unavailable for any reason, user requests will be sent directly to Google as they are today.
Q: How will Google send content to the GGC Node?
A: The local cache is filled on a read-through basis when content is requested by the end user. No content is pre-loaded.
Q: What Google services will be supported by the GGC node?
A: Typically, most of the traffic served from the GGC node is large content such as video streams and file downloads. Other web services, such as Google search and maps, are proxied and cached as well. Web services are dynamically added and removed from cache nodes based on capacity and end-user performance improvement.
Q: Who owns the GGC node?
A: Google will retain ownership of the hardware and software that makes up the node. Google will be responsible for all maintenance, support, and shipping costs related to the server equipment.
Q: Will other ISPs’ customers be sent to the GGC node on our network?
A: Requests from any user who can access your DNS resolvers may be sent to the node. The node will redirect requests from users outside of the prefixes you are advertising. For this reason, access control lists limiting IP ranges that can reach the cache are not permitted.
If you provide service to downstream ASNs, please ensure their prefixes and resolvers are provided via the BGP feed.
Q: We provide transit services for other ISPs. Will their end users use the GGC node?
A: They can, and probably should, if your network is their primary path to Google. Ensure their user and DNS resolver IP ranges are included in the BGP feed.