Overview
Developers using grpc-web often need to implement their own caching mechanism. Common approaches include in-memory caching and service workers. The primary use case of gRPCExpress is to eliminate this overhead by providing an out-of-the-box caching solution.
Data Flow of a Call
All grpc-web method calls are initiated as streams. At the heart of gRPCExpress is its ability to intercept these calls: every time an application issues a call, the library steps in before the call reaches any other interceptor and decides whether to serve a cached response or let the call proceed to the server.
The library maintains an in-memory cache in which responses are stored as serialized buffers.
When a call is intercepted, gRPCExpress checks this cache to see if a relevant buffer exists:
- If found, the buffer is deserialized using its associated deserialization function, transforming it back into a usable response format. Frequently accessed data can therefore be served from memory without a redundant network call.
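The lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not gRPCExpress's actual internals: the `BufferCache` class, its method names, the string cache keys, and the JSON-based `encode`/`decode` stand-ins for protobuf serialization are all hypothetical.

```typescript
// Hypothetical shapes for illustration; not gRPCExpress's actual internals.
type Deserializer<T> = (bytes: Uint8Array) => T;

class BufferCache {
  // Serialized responses, keyed by a string that identifies the call.
  private buffers = new Map<string, Uint8Array>();
  // Deserialization functions, kept in a separate map from the buffers.
  private deserializers = new Map<string, Deserializer<unknown>>();

  set<T>(key: string, buffer: Uint8Array, deserialize: Deserializer<T>): void {
    this.buffers.set(key, buffer);
    this.deserializers.set(key, deserialize);
  }

  // Returns the deserialized response on a hit, or undefined on a miss.
  get<T>(key: string): T | undefined {
    const buffer = this.buffers.get(key);
    const deserialize = this.deserializers.get(key);
    if (buffer === undefined || deserialize === undefined) return undefined;
    return deserialize(buffer) as T;
  }
}

// Stand-in for protobuf (de)serialization: JSON encoded as UTF-8 bytes.
const encode = (value: unknown): Uint8Array =>
  new TextEncoder().encode(JSON.stringify(value));
const decode = (bytes: Uint8Array): unknown =>
  JSON.parse(new TextDecoder().decode(bytes));

const lookupCache = new BufferCache();
lookupCache.set("getStock:AAPL", encode({ symbol: "AAPL", price: 100 }), decode);

// Hit: the stored buffer is turned back into a usable object.
const hit = lookupCache.get<{ symbol: string; price: number }>("getStock:AAPL");
// Miss: nothing was stored under this key.
const miss = lookupCache.get("getStock:MSFT");
```

Keeping the deserializers in their own map mirrors the description above, where each buffer has an associated function that restores it to its original format.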
Every cache hit not only retrieves data but also plays a role in cache management:
- The frequency associated with that specific cache item is updated. This helps the library track how often certain data is accessed.
- At the same time, the cost associated with the cache item is recalculated. This cost, determined by both size and frequency, is integral to the library’s cost-aware algorithm. When the cache needs to be trimmed, this algorithm removes high-cost items first.
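The bookkeeping on each hit, and the trimming it feeds into, might look roughly like the sketch below. The text above does not give the actual cost formula, so `computeCost` (size divided by frequency plus one) is an assumption chosen only so that large, rarely used items score highest. `CostAwareCache`, `CostEntry`, and the byte-based `maxSize` limit are likewise hypothetical names.

```typescript
// Hypothetical entry shape and cost formula, for illustration only.
interface CostEntry {
  buffer: Uint8Array;
  frequency: number; // number of cache hits so far
  cost: number; // higher cost means the item is evicted sooner
}

// Assumed formula: large, rarely used buffers cost the most.
const computeCost = (size: number, frequency: number): number =>
  size / (frequency + 1);

class CostAwareCache {
  private entries = new Map<string, CostEntry>();
  private totalSize = 0;
  private maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  set(key: string, buffer: Uint8Array): void {
    const existing = this.entries.get(key);
    if (existing) this.totalSize -= existing.buffer.byteLength;
    const cost = computeCost(buffer.byteLength, 0);
    this.entries.set(key, { buffer, frequency: 0, cost });
    this.totalSize += buffer.byteLength;
    this.trim();
  }

  // A hit returns the buffer and also updates frequency and cost.
  get(key: string): Uint8Array | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    entry.frequency += 1;
    entry.cost = computeCost(entry.buffer.byteLength, entry.frequency);
    return entry.buffer;
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }

  // Remove the highest-cost items until the cache fits within maxSize.
  private trim(): void {
    while (this.totalSize > this.maxSize && this.entries.size > 0) {
      let worstKey = "";
      let worstCost = -Infinity;
      this.entries.forEach((entry, key) => {
        if (entry.cost > worstCost) {
          worstKey = key;
          worstCost = entry.cost;
        }
      });
      const worst = this.entries.get(worstKey);
      if (!worst) return;
      this.totalSize -= worst.buffer.byteLength;
      this.entries.delete(worstKey);
    }
  }
}

const trimmed = new CostAwareCache(100); // limit of 100 bytes
trimmed.set("a", new Uint8Array(40));
trimmed.get("a");
trimmed.get("a"); // "a" is now frequently used, so its cost drops
trimmed.set("b", new Uint8Array(50)); // never read, so its cost stays high
trimmed.set("c", new Uint8Array(30)); // total would be 120 bytes, so trim runs
```

In this run, "b" is the item that gets removed: it is the largest buffer and has never been read, so it carries the highest cost under the assumed formula.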
If the in-memory cache does not contain the needed response, the library allows the grpc-web method call to proceed to the server:
- Once the server responds, the library serializes the response into a buffer.
- This buffer is then stored in the in-memory cache for future use. The deserialization function specific to this buffer is stored separately alongside it. This ensures that when the cached response is needed, it can be accurately and efficiently converted back to its original format.
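Putting the miss path together with the hit path, the overall flow can be sketched as a wrapper around a unary call. This is an illustration under assumptions rather than the library's real interception code: `callWithCache`, `UnaryCall`, and the two module-level maps are hypothetical, the `Greeting` class is a toy stand-in for a generated protobuf message (it only mimics the `serializeBinary` / `deserializeBinary` pair), and error handling is left out.

```typescript
// Minimal stand-in for a protobuf message that can serialize itself.
interface BinaryMessage {
  serializeBinary(): Uint8Array;
}

type Deserialize<T> = (bytes: Uint8Array) => T;
// Hypothetical callback-style unary call; errors are omitted for brevity.
type UnaryCall<T> = (callback: (response: T) => void) => void;

// Buffers and their deserialization functions are stored separately.
const responseBuffers = new Map<string, Uint8Array>();
const responseDeserializers = new Map<string, Deserialize<unknown>>();

function callWithCache<T extends BinaryMessage>(
  key: string,
  callServer: UnaryCall<T>,
  deserialize: Deserialize<T>,
  callback: (response: T) => void
): void {
  const cached = responseBuffers.get(key);
  const storedDeserialize = responseDeserializers.get(key);
  if (cached !== undefined && storedDeserialize !== undefined) {
    // Cache hit: rebuild the response from the stored buffer.
    callback(storedDeserialize(cached) as T);
    return;
  }
  // Cache miss: let the call proceed, then store the serialized response.
  callServer((response) => {
    responseBuffers.set(key, response.serializeBinary());
    responseDeserializers.set(key, deserialize);
    callback(response);
  });
}

// Toy message type used only for this example.
class Greeting implements BinaryMessage {
  text: string;
  constructor(text: string) {
    this.text = text;
  }
  serializeBinary(): Uint8Array {
    return new TextEncoder().encode(this.text);
  }
  static deserializeBinary(bytes: Uint8Array): Greeting {
    return new Greeting(new TextDecoder().decode(bytes));
  }
}

let serverCalls = 0;
const fakeServer: UnaryCall<Greeting> = (cb) => {
  serverCalls += 1;
  cb(new Greeting("hello"));
};

const received: string[] = [];
// First call misses and reaches the fake server; the second is served from cache.
callWithCache("sayHello", fakeServer, Greeting.deserializeBinary, (r) => received.push(r.text));
callWithCache("sayHello", fakeServer, Greeting.deserializeBinary, (r) => received.push(r.text));
```

Both callers receive an equivalent `Greeting`, but the fake server is invoked only once; the second response is reconstructed from the stored buffer by the stored deserialization function.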