reflecting on media types
after my work this week coding proper caching support for resources with multiple media types, i've come to view things a bit differently. first, my initial model allowed the programmer to create URI lists (one for immediate and one for background work) that are given the HEAD, Cache-Control:no-cache treatment whenever the specific resource is updated (POST, PUT, DELETE). i needed to modify things a bit when i added media types (declare the media types for a resource, then use that as a sub-loop for working through the URI lists), but it was not all that difficult. i'm getting a bit of a nagging feeling that i'm creating some combinatorial problems for cache updates (imagine a resource that has 10 dependent URIs and five different media types: that's 50 refreshes), but i still think it's manageable.
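to make the sub-loop concrete, here's a rough sketch in python (not the language of my actual code). the function name, the dictionary shape, and the use of an Accept header to pick the media type are all assumptions for illustration; the only parts taken from the real design are the HEAD method, the Cache-Control:no-cache header, and the URI-list-times-media-types loop.

```python
from itertools import product

def refresh_requests(dependent_uris, media_types):
    """Build one cache-refresh request per (URI, media type) pair.

    Hypothetical helper: it only describes the requests, it does not send them.
    """
    return [
        {
            "method": "HEAD",
            "uri": uri,
            # Assumption: the media type is selected via the Accept header.
            "headers": {"Accept": media_type, "Cache-Control": "no-cache"},
        }
        for uri, media_type in product(dependent_uris, media_types)
    ]

# The worrying case from the text: 10 dependent URIs and five media types.
uris = [f"/items/{n}" for n in range(10)]
types = ["text/html", "text/xml", "text/plain", "application/json", "application/atom+xml"]
requests_to_send = refresh_requests(uris, types)
print(len(requests_to_send))  # 50
```

the growth is a straight product of the two lists, so every extra media type costs one more request per dependent URI.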
but a real killer came up last night - cache updates against a URI that holds a resource with a different media type than the updated resource. update a text/xml resource and you need to refresh the cache of a text/html resource. i added an override feature to the URI lists, but that's really icky.
so i have a solution...
when the resource is defined in code, make the list of supported media types a custom attribute of the class. then, when it comes time to trundle through the URI lists, pick up the class that belongs to that URI (using the existing UriPattern custom attribute) and pull in the media types for that URI. then the cache update routine will be sure to update the proper potential items, without hard-coding it into the instance class that was updated. not bad, right?
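a minimal sketch of that lookup, again in python as a stand-in: a class decorator plays the role of the custom attributes, and a module-level list plays the role of reflection over the resource classes. the decorator, the registry, the class names, and the URI patterns are all hypothetical; only the idea (find the class by its URI pattern, then read its media types) comes from the plan above.

```python
import re

RESOURCES = []  # hypothetical registry, filled in by the decorator below

def resource(uri_pattern, media_types):
    """Stand-in for the UriPattern and media-type custom attributes."""
    def register(cls):
        cls.uri_pattern = re.compile(uri_pattern)
        cls.media_types = tuple(media_types)
        RESOURCES.append(cls)
        return cls
    return register

@resource(r"^/orders/\d+$", ["text/xml"])
class OrderResource:
    pass

@resource(r"^/orders/\d+/summary$", ["text/html", "text/xml"])
class OrderSummaryResource:
    pass

def media_types_for(uri):
    """Return the media types declared by the class whose pattern matches the URI."""
    for cls in RESOURCES:
        if cls.uri_pattern.match(uri):
            return cls.media_types
    return ()

def refresh_plan(dependent_uris):
    """Pair each dependent URI with its own media types, not the updated resource's."""
    return [(uri, mt) for uri in dependent_uris for mt in media_types_for(uri)]

# Updating the text/xml order still refreshes the text/html summary.
print(refresh_plan(["/orders/7/summary"]))
```

the point of the design is that the updated class never needs to know which media types its dependents serve; each dependent URI answers that for itself.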
there are a few other issues that are haunting me on this. first, this pattern works by forcing *into* cache all media types for a resource at the URI. again, 10 URIs and five media types means 50 updated cache entries from one PUT. is that scalable? is it better to just *drop* the cache items and let requests force the items back in as needed?
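for comparison, here's a small python sketch of the *drop* alternative: an update just removes the entries for a URI, and the next request for each media type puts it back. the class, its method names, and the render callable are hypothetical, and a plain dictionary stands in for the real cache.

```python
class LazyCache:
    """Toy cache keyed by (uri, media_type) that refills on demand."""

    def __init__(self, render):
        self._render = render   # hypothetical callable: (uri, media_type) -> body
        self._entries = {}
        self.renders = 0        # counts how often a representation was rebuilt

    def get(self, uri, media_type):
        key = (uri, media_type)
        if key not in self._entries:
            self._entries[key] = self._render(uri, media_type)
            self.renders += 1
        return self._entries[key]

    def invalidate(self, uri):
        """Drop every media type cached for this URI; rebuild nothing now."""
        for key in [k for k in self._entries if k[0] == uri]:
            del self._entries[key]

cache = LazyCache(lambda uri, media_type: f"{uri} as {media_type}")
cache.get("/orders/7", "text/html")
cache.get("/orders/7", "text/xml")
cache.invalidate("/orders/7")        # a PUT would trigger this
cache.get("/orders/7", "text/html")  # only the requested type is rebuilt
print(cache.renders)  # 3
```

the trade-off is that the first request after an update pays the rendering cost, but media types nobody asks for are never rebuilt.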
second, as things scale up (more URIs, more updates, more users...) the time it takes to complete the cache update (immediate and background) will increase. i currently do the immediate update before yielding the thread to the caller. i do the background work by spawning a new thread to run in the background. i suspect that this will have to change. most likely, the cache updates will need to be handled by a separate process (outside the web app) altogether. maybe queue up the requests and have a service work through the queue.
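the queue idea might look something like this python sketch. to keep it runnable in one file, an in-process queue and a single worker thread stand in for the out-of-process queue and service; all the names are made up, and the worker just records the URI where the real one would issue the HEAD, no-cache request.

```python
import queue
import threading

refresh_queue = queue.Queue()
refreshed = []  # stands in for "cache entry was refreshed"

def refresh_worker():
    """Drain the queue one URI at a time until a None sentinel arrives."""
    while True:
        uri = refresh_queue.get()
        if uri is None:
            refresh_queue.task_done()
            break
        refreshed.append(uri)  # placeholder for the real refresh call
        refresh_queue.task_done()

def on_resource_updated(dependent_uris):
    """Called by the web app: enqueue the work and return right away."""
    for uri in dependent_uris:
        refresh_queue.put(uri)

worker = threading.Thread(target=refresh_worker, daemon=True)
worker.start()

on_resource_updated(["/orders/7", "/orders/7/summary"])
refresh_queue.join()     # wait here only so the demo can print a result
refresh_queue.put(None)  # tell the worker to stop
worker.join()
print(refreshed)
```

the web request only pays for the enqueue; how fast the queue drains becomes the service's problem, and it can be scaled separately.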
third, the sheer size of the cache will eventually result in things dropping out due to space. and that can create an endless cycle of calls to update the cache - driving the system to its knees. i'll need to be prepared to use a third-party caching engine (memcached comes to mind) to handle the larger workload.
finally, i think i might be in for some 'race conditions' with some of this caching. if two people update two different resources that share dependent URIs, we could end up with cache updates flooding the system again. i might have to consider some kind of locking/semaphore pattern to avoid that.
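one possible shape for that locking, sketched in python: keep a lock-protected set of URIs that are already waiting for a refresh, so a second update that touches the same dependent URI doesn't schedule it again. this is just one option, not something i've built; the class and method names are hypothetical.

```python
import threading

class PendingRefreshes:
    """Tracks URIs that already have a refresh scheduled."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = set()

    def request(self, uris):
        """Mark URIs as pending and return only the ones that were not already waiting."""
        fresh = []
        with self._lock:
            for uri in uris:
                if uri not in self._pending:
                    self._pending.add(uri)
                    fresh.append(uri)
        return fresh

    def done(self, uri):
        """Call once the refresh for this URI has finished."""
        with self._lock:
            self._pending.discard(uri)

pending = PendingRefreshes()
first = pending.request(["/orders/7", "/reports/daily"])
second = pending.request(["/orders/8", "/reports/daily"])  # shares one dependent URI
print(first)   # ['/orders/7', '/reports/daily']
print(second)  # ['/orders/8']
```

an update that lands while a refresh is already in flight could still leave a stale entry, so a real version would need to decide whether to re-queue in that case.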
at least these last items are longer-term issues. nothing immediate. right now i need to add reflection support for the media types, and that will solve the vital issue.