Tradeoffs When Minimizing The Impact Of Changes

Silver Bullet: From http://www.tejasthumpcycles.com/Parts/LeversGripsctrls/Silver_Bullet/Silver_Bullet_Shift_Brake.jpg In a recent post I talked about how good design attempts to minimize the impact of changes to a system, often through Design Patterns.

When used appropriately, Design Patterns are a great tool for building a great design, but there is an important caveat to keep in mind anytime you apply a pattern. A Design Pattern might minimize the impact of one kind of change at the expense of amplifying another type of change.

What do I mean by this? One common pattern is the Abstract Factory pattern which is often manifested in .NET code via the Provider Model pattern. The Provider Model abstracts access to an underlying resource by providing a fixed API to the resource. This does a bang up job of insulating the consumer of the provider when changing the underlying resource.

The MembershipProvider is one such implementation of the provider model pattern. The consumer of the MembershipProvider API doesn’t need to change if the SqlMembershipProvider is swapped in favor of the ActiveDirectoryMembershipProvider. This is one way that the provider pattern attempts to minimize the impact of changes. It insulates against changes to the underlying data store.

However there is a hidden tradeoff with this pattern. Suppose the API itself changes often. Then, the impact of a single API change is multiplied across every concrete implementation of the provider. In the case of the MembershipProvider, this is pretty much a non-issue because the likelihood of changing the API is very small.

But the same cannot be said of the data access layer for software such as a blog (or similar software). A common approach is to implement a BlogDataProvider to encapsulate all data access so that the blog software can make use of multiple databases. The basic line of thought is that we can simply implement a concrete provider for each database we wish to support. So we might implement a SqlBlogDataProvider, a MySqlBlogDataProvider, a FireBirdBlogDataProvider, and so on.

This sounds great in theory, but it breaks down in practice because unlike the API to the MembershipProvider, the API for a BlogDatabaseProvider is going to change quite often. Pretty much every new feature one can think of often needs a backing data store.

Everytime we add a new column to a table to support a feature, the impact of that change is multiplied by the number of providers we support. I discussed this in the past in my post entitled Where the Provider Model Falls Short.

Every Design Pattern comes with inherent tradeoffs that we must be aware of. There is no silver bullet.

The key here when looking to apply patterns is to not follow a script for applying patterns blindly. Look at what changes often (in this case the database schema) and figure out how to minimize the impact of that change. In the above scenario, one option is to simply punt the work of supporting multiple databases to someone else in a more generic fashion.

For example, using something like NHibernate or Subsonic in this situation might mean that a schema change only requires changing one bit of code. Then NHibernate or Subsonic is responsible for making sure that the code works against its list of supported databases.

One might object to these approaches because they feel these approaches cannot possibly query every database they support as efficiently as database specific SQL as one would do in a database specific provider. But I think the 80/20 rule applies here. Let the dynamic query engine get you 80% of the way, and use a provider just for the areas that need it.

So again, this is not an indictment of the provider model. The provider model is extremely useful when used appropriately. As I mentioned, the Membership Provider is a great example. But if you really need to support multiple databases AND your database schema is succeptible to a lot of churn, then another pattern may be in order.

What others have said

Requesting Gravatar... Damien Guard Nov 16, 2006 10:01 AM
# re: Tradeoffs When Minimizing The Impact Of Changes
There is a middle-ground between the two whereby you just don't put any db-specific code into your product unless absolutely necessary.

You can use the .NET DbProviderFactories, DbProviderFactory and the associated IDb interfaces to isolate your own code from the implementation details with some caveats and limitations (parameter naming, SQL syntax, etc).

These are more undelying problems with ADO.NET's failure to provide a true provider-independent API.

[)amien
Requesting Gravatar... Rick Strahl Nov 16, 2006 2:06 PM
# re: Tradeoffs When Minimizing The Impact Of Changes
Phil, maybe a BlogProvider isn't the right point to abstract. Ultimately it should be the data layer that has to be abstracted not the business layer if you want to handle different data sources.

In the end it's the difference between application level and system level code. Patterns rarely make a lot of sense at the application level, but they are frequently useful at the system level.
Requesting Gravatar... Haacked Nov 16, 2006 2:18 PM
# re: Tradeoffs When Minimizing The Impact Of Changes
Rick, I probably wasn't clear, but that's exactly what I'm talking about. My example was BlogDataProvider. The idea behind a provider there is so that onstensibly, the application is shielded from the specific implementation of the database.

But in practice, this is problematic. Look at .TEXT and Subtext, which have been around a while. Are there any providers implemented for other databases? What about DotNetNuke? Providers for other databases are extremely rare.

If you have four concrete providers, and you add a column, you have four places (at least) to make that change. The providers have to know about the new column, right?

But if you use something like a dynamic query engine, you only make that change once for all databases that the query engine supports.
Requesting Gravatar... Haacked Nov 16, 2006 2:24 PM
# re: Tradeoffs When Minimizing The Impact Of Changes
P.S. Just to be clear, I'm not saying patterns are not appropriate here. I'm saying to choose the right pattern. In my view, for this particular scenario, the "Active Query" pattern might be better than the "Provider Pattern".
Requesting Gravatar... Rick Strahl Nov 16, 2006 7:05 PM
# re: Tradeoffs When Minimizing The Impact Of Changes
Phil, I think we're talking about different things here.

The way I see database abstraction, that happens much lower down the stack. In my app framework I have a data layer that has versions for several different database servers (Sql, VistaDb, Oracle, MySql, FoxPro and generic OleDb) and these co-exist without any changes ever being made from the application layer.

Application specific code - business logic etc., which is where a changed field has an effect is higher up the chain at the business object/entity level. And there you do have to map between the database and the fields exposed either manually or via code generation. But that should have zero effect on the data access layer.

I know .Text doesn't quite go that route <g> - it has a DAL that also is a business layer and if you do that then this sort of database abstraction is out the window.

Abstraction where you have separate data and business layers may still require database specific bracketing at times, although I find that it's rare. In most of my applications I can switch between any of the data providers without any code changes, provided the database structures are compatible (which is usually the biggest pain).

Reality is that this is overrated though. I have two products that are vertical and I get 2 or 3 requests a year for running on anything but SQL Server and most of the time even these requests are not really pressing <s>...

I doubt there's a real serious need to run an ASP.NET based Blogging application on Oracle <g>...
Requesting Gravatar... Haacked Nov 16, 2006 8:12 PM
# re: Tradeoffs When Minimizing The Impact Of Changes
I'm not sure I totally understand yet. Is your data access layer a dynamic query engine? Given that some databases support stored procs and others don't, I'm not sure how a data access layer is completely insulated from the business layer unless it is a dynamic query engine or code generated inline sql (ala Subsonic, NHibernate, etc...).

For example, suppose I have the biz method

Person GetPersonById(int id);

And I have the abstract base class DataAccessProviderBase with the method:

IDataReader GetPersonById(int id);

That is more resilient to change than if the class returned a fully loaded Person object. However, if I add a column to Person, then I have to update the stored procs for the database engines that support stored procs. And update the inline SQL for those that don't. That's still a lot of changes.

With a dynamic query engine, I make the change once. To the code that generates the query.

Phil
Requesting Gravatar... David Hayden Nov 18, 2006 10:20 AM
# re: Tradeoffs When Minimizing The Impact Of Changes
I believe Rick is saying that if you make a change to the database schema and have to change code in the database provider for each database your application supports, then your database provider is operating too high in the application stack.

Ideally, you should only have to change the dynamic query or stored procedure once in the application and then each database provider will appropriately handle the call to its database using the proper change.

I believe the DAL in .TEXT (SubText) is operating perhaps a bit high in the stack and would require code/sql changes for each database when a schema change occurs.

If using dynamic SQL hardcoded in the DAL, you also have to change all the queries to compensate for the additional column / schema change. With solutions like SubSonic or NHibernate, you don't have to make as many or any changes because the schema is grabbed in real-time or read from a mapping file. I think that is where Phil is coming from.

I prefer the real-time schema discovery or mapping files as well in situations where you can expect an evolving / volatile schema. Saves you from an avalanche of changes in many cases.
Requesting Gravatar... Haacked Nov 19, 2006 10:26 AM
# re: Tradeoffs When Minimizing The Impact Of Changes
Thanks David. In the case of a dynamic query engine, then I agree with you. In fact, that is the point that I am making with this point.

But in the point of a stored procedure, I disagree. For example, SQL Compact doesn't support stored procedures. The stored procedure syntax is different among the various database engines. So if change the schema AND are using stored procedures, you've just created multiple changes.

Now Rick makes the great point (which I've also made before) that you almost never need to swap out another database. However, my discussion is focusing on the use case of Subtext, in which case we have requests for running it on MySQL, and in the future, SQL Compact.
Requesting Gravatar... Rick Strahl Nov 19, 2006 1:24 PM
# re: Tradeoffs When Minimizing The Impact Of Changes
Phil, I don't if 'dynamic query engine' is the right word. I call this layer a Data Access Layer, but I think we're defining this differently <s>... if you look at .Text (and I suppose SubText still uses that base engine, right?) what you have is combined data and business layer. The 'DAL' handles both business logic and data access (hardcoded against Sql Server in one place.

In a more tiered architecture you split out the data access into a separate layer, where the data access is basically generic. It's essentially an abstraction layer around SqlClient, OleDbClient, OracleClient etc. which provides additional functionality on top of the base data providers provided by .NET although it looks similar. For example, in my framework there are methods like Execute() (retrieves a DataSet), ExecuteTable(), ExecuteReader(), ExecuteNonQuery() etc. with many overloads that handle various combinations of SQL string/Commands, variable parameters etc more easily than writing raw ADO.NET code. IOW, in my business layer I almost never touch ADO.NET objects directly (except when the results come back like a datatable or reader... It's one more level of abstraction. In .NET 1.1 this was a crucial requirement because .NET lacked a database factory but that's been fixed in 2.0. However, the data layer still provides much value because it's much easier to create query code than with ADO.NET native code. The business object layer ontop of that then provides additional abstraction for many Crud operations, like loading a DataRow or Entity etc.

Now your point about database abstraction and differences in engines is a valid one. Obviously if you're using stored procs and another provider doesn't supporting it is a problem, but you neeed to work around that with bracketing. The point though is that this would mostly happen in one place in your business object that talkes to the DAL. There's only one place where the change happens and hopefully there won't be many places where bracketing is required - specifically CRUD operations should be handled in such a way that there are no changes between backends.

There are many different ways to build this sort of thing from entity frameworks, to homegrown like mine where DAL and busniess framework interact which each other heavily. What I've built for my self works very well and is super easy to use and maintain and I call it an enitity hybrid implementation (uses a mixture of entity and ADO.NET objects) which seems to me the best tradeoff in performance and usability.


Requesting Gravatar... Rick Strahl Nov 19, 2006 1:35 PM
# re: Tradeoffs When Minimizing The Impact Of Changes
A few more thoughts here re: Dynamic Query Engine. If you use Stored Procs for your data access there's a good chance you're tying yourself to a specific backend anyway, so switching between backends immediately becomes much more difficult because at the very least you'll have to migrate each of the stored procs. However beyond that - with proper abstraction the DAL should be able to negotiate the stored proc calls from business layer in such a way that there are no code changes from the application...

So yes in what you refer to - I use a dynamic query engine for that flexibility (and the fact that I hate writing TSQL code <s>), although there are certainly stored procs in my applications where they are truly required for complex multi-query operations.


Some of that pain can be eleviated though with Code Generation/Templates. Things like nHibernate have templates that can be used to generate the appropriate stored procedure abstractions for different backends I believe.

But like I said it all depends on whether different backends really matter. I doubt that it makes sense running SubText on Oracle or MySql although I'm sure you'd have a few takers.
Requesting Gravatar... Haacked Nov 19, 2006 3:09 PM
# re: Tradeoffs When Minimizing The Impact Of Changes
Thanks Rick, I think I see the disconnect better. Actually, .TEXT and consequently Subtext, does separate data access from the business object layer. It does so via a the provider model. In .TEXT, there were to levels of abstractions, the ObjectProvider and the DataProvider. The DataProvider returned IDataReaders and the ObjectProvider returned fully hydrated objects. Thus the client app (the blog engine), only has to deal with the ObjectProvider, never knowing how the underlying data is stored.

For Subtext, multiple backends will matter. At some point, we will want to support SQL Compact which doesn't support stored procs.

So at that point, we can either choose to have 2 providers, one for Sql 2000/2005 which can still use stored procs and one for Sql Compact, or we can simply stop using stored procs altogether, which works for both database engines.

My point in this whole article is if we take the second approach, and focus on using a dynamic query engine, I don't have to add providers for every new database engine we wish to support, I get MySql support for free, assuming the dynamic query engine (such as NHibernate) supports that db.

What do you have to say?

(will show your gravatar)
Please add 5 and 5 and type the answer here: