why i use a proxy for my cloud data (SSDS)

2008-10-14 @ 21:24

over the last several months, i've been lucky enough to be allowed beta access to Microsoft's SQL Server Data Services (SSDS) platform. it's excellent.

i like the approach they are taking for connectivity (native support for SOAP and HTTP/REST interfaces); the storage model (Authority [data-center], Container [server], Entity [storage item]); and their data modeling (XML documents plus binary files).

early on i decided to implement a proxy server to 'front' the SSDS servers. i get lots of questions about this, since it seems counter-productive to 'tunnel' requests for distributed data through a single server. but in fact, i couldn't support my online samples unless i used a proxy, and i don't think any real, large-scale application could either. even if you think you *can* skip using a proxy, i don't think you should. here's why.

security

SSDS (currently) offers no user-level account management. you get a single account (user/password) w/ complete rights to add/edit/delete all content. first, that means no public internet app can read or write without these credentials. second, it means you can't enforce read-only access to the data. finally, you'd need to distribute the account information with every app install (in text, in config, or compiled into code), where anyone determined enough can dig it out.
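
to make the security point concrete, here's a minimal sketch (in python) of the check a proxy could run before forwarding anything. the user names, the roles, and the `authorize` / `upstream_headers` helpers are all made up for illustration, and i'm assuming the single SSDS account goes upstream as an HTTP basic auth header. the point is only that the real credentials live on the proxy and never ship with a client.

```python
import base64

# placeholder values; the one real SSDS account is stored only on the proxy
SSDS_ACCOUNT = ("my-ssds-user", "my-ssds-password")

# hypothetical per-user roles kept by the proxy; SSDS knows nothing about them
ROLES = {
    "alice": "editor",  # may read and write
    "guest": "reader",  # read-only
}

# HTTP methods each role may send through the proxy
ALLOWED = {
    "editor": {"GET", "HEAD", "POST", "PUT", "DELETE"},
    "reader": {"GET", "HEAD"},
}


def authorize(user, method):
    """return True if this proxy user may send this HTTP method upstream."""
    role = ROLES.get(user)
    return role is not None and method.upper() in ALLOWED[role]


def upstream_headers(user, method):
    """build the headers for the upstream call, or refuse the request."""
    if not authorize(user, method):
        raise PermissionError(f"{user!r} may not {method.upper()}")
    # assumption: the upstream service accepts HTTP basic auth
    token = base64.b64encode(":".join(SSDS_ACCOUNT).encode()).decode()
    return {"Authorization": "Basic " + token}
```

with something like this in place, a public read-only client only ever presents its own proxy identity (here, "guest"), and a write attempt from it is refused before it gets anywhere near the data.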

integrity

SSDS requires a unique record id for every entity in your container, and it does not generate these id values for you (no auto-increment field, etc.); client apps do that. if you distribute an application throughout your organization, region, country, etc., you need to make sure each client always uses a unique id. that's easy if you are the only one writing code for the data storage. but what if someone else wants to write a client to interact w/ your data?
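
one way a proxy can take this job away from the clients is to mint every id itself. the sketch below is only an illustration: the `kind-<uuid>` pattern and the `new_entity_id` / `is_valid_id` names are my own assumptions, not anything SSDS defines.

```python
import re
import uuid

# hypothetical id pattern: a lowercase kind, a dash, then 32 hex characters
ID_PATTERN = re.compile(r"[a-z]+-[0-9a-f]{32}")


def new_entity_id(kind):
    """mint a new id on the proxy so no client has to invent one."""
    if not re.fullmatch(r"[a-z]+", kind):
        raise ValueError("kind must be lowercase ascii letters")
    # uuid4 is random, so collisions are not a practical concern
    return f"{kind}-{uuid.uuid4().hex}"


def is_valid_id(entity_id):
    """check that an id follows the pattern before it is stored."""
    return ID_PATTERN.fullmatch(entity_id) is not None
```

since every write passes through the proxy, a third-party client never needs to know the pattern; it just posts the data and gets the new id back.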

availability

while SSDS does a great job distributing the data, it does not provide any intermediate caching or data compression. that means every client requesting data will *always* get a fresh copy of the data directly from the origin server. even if the client already has a valid current copy of that data. and any data sent is not compressed to reduce bandwidth. common browsers have support for both these optimizations built-in.
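
both optimizations are standard HTTP, so a proxy can add them without SSDS knowing. here's a rough sketch of the idea, assuming the proxy already holds the entity body as bytes; the `respond` function and its return shape are hypothetical, and the header handling is deliberately simplified (exact-match `If-None-Match`, case-sensitive header names).

```python
import gzip
import hashlib


def respond(body, request_headers):
    """return (status, headers, payload) for a GET of `body`."""
    # weak validator, so it stays valid whether or not the payload is gzipped
    etag = 'W/"' + hashlib.sha256(body).hexdigest() + '"'
    headers = {"ETag": etag, "Vary": "Accept-Encoding"}

    # caching: the client already has the current copy, so send no body
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""

    # compression: only when the client says it can decode gzip
    if "gzip" in request_headers.get("Accept-Encoding", ""):
        headers["Content-Encoding"] = "gzip"
        return 200, headers, gzip.compress(body)

    return 200, headers, body
```

a browser that sends back the `ETag` it got earlier receives an empty 304 instead of the whole entity, and one that advertises gzip gets the smaller payload.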

so, proxy then

using a proxy allows me to provide user-level access control; maintain data integrity by ensuring new entity id values follow a pre-designed pattern; and add compression and caching support to reduce bandwidth and response times for my client applications. this also gives me options for building client applications. i can easily host an Ajax-based application in the same domain as my data proxy. i can't use Ajax directly against SSDS domains, since browsers block cross-domain Ajax requests.

futures

i expect some of these issues to be resolved in future releases of SSDS. for example, the caching and compression items are relatively easy to implement. but some might be outside the bounds of SSDS. for example, user-level access might be provided by a completely different service - an identity proxy maybe?

whenever the feature set of SSDS grows, i can adjust my proxy server accordingly. often w/ little or no impact on clients (i've done this already at least once). btw, i'm looking forward to new, cool stuff from them very soon.
