
Sam Gentile writes about how he feels that developers don’t get the distributed paradigm when developing in .NET and that the documentation and literature doesn’t help any.

I’m not sure I agree with his premise, however, that it is normally a good idea to distribute the tiers onto different hardware. In fact, my usual approach is to encourage multiple tiers to reside on the same front-end box in such a way that you can duplicate those boxes for redundancy and scaling using network load balancing on the front end.

This doesn’t bring us back to client/server systems: n-tier architectures are logical software designs and not necessarily tied to physical deployment designs. Client/server was all about long lived connections to the database and that was what didn’t scale. N-tier is all about get in quick, get what you want, and get out. It’s also about separating presentation from business logic from data storage. None of that implies multiple hardware layers.

Sam claims that deploying the middle tier on separate boxes usually gives far better performance relative to hardware costs. I’m not sure that my experience supports that conclusion. It might be true for very high-transaction systems, but on a smaller scale the communication costs between the layers can add significant latency.

In recent times, the occasions where I have supported deploying the “middle tier” on separate hardware have been in what we’re learning to call Service Oriented situations, where the functionality has been exposed either with remoting or, more favourably, with a web service. In general, this has been for one of two reasons: security, where it is possible to put an additional firewall between the front-end external presentation hardware and the underlying internal service (thanks to avoiding the blunderbuss DCOM approach to firewalls); and deploying subsystems (such as search engines) whose key functionality might be extended, scaled, upgraded, or otherwise changed completely independently of the rest of the system.
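
For illustration only, this is roughly what exposing such a subsystem as a web service looked like: a minimal ASMX sketch, where `SearchService` and the product-id search method are made-up names standing in for whatever the internal engine actually provides.

```csharp
using System.Web.Services;

// Hypothetical sketch: an internal search subsystem exposed as an ASMX web
// service so that the front-end presentation boxes only reach it over HTTP,
// and an additional firewall can sit between them and this internal service.
[WebService(Namespace = "http://example.com/search")]
public class SearchService : WebService
{
    [WebMethod(Description = "Returns matching product ids for a search phrase.")]
    public int[] Search(string phrase)
    {
        // In a real system this would delegate to the internal search engine,
        // which can be scaled, upgraded, or replaced independently of the
        // front-end web servers.
        return new int[0];
    }
}
```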

What I have found to be typical in data-driven distributed systems is that scalability is more of a problem on the back-end database than on the front-end web servers. It is relatively cheap, in development, maintenance, and hardware costs, to build a system where you can add an extra web server to the cluster if you need a bit more horsepower there. What is much more challenging is designing your application so that you can scale the database: while front-end hardware is cheap, scaling your database up gets increasingly expensive for decreasing returns, and scaling out is something you really need to plan for up front and may require compromises on things like referential integrity.

I’m not a big fan of distributed transactions through the DTC using COM+/ES in systems that often end up using only one resource manager at a time (say, a SQL Server or Oracle database). Back when I was working with COM+/DNA/VB6 it made life much easier and was a price worth paying, but with .NET and the managed data providers giving tighter access to the database I don’t think that is always the case. This is the foundation underlying my declarative SQL transactions code and code generator, where I support ES-like attributes but rely on SQL Server transactions. I recognise that for large systems where you’ve had to partition and scale out your data the DTC is necessary, but I’ve worked on a fair number of transactional e-commerce systems that needed front-end scalability yet still hit a single database server at the back.
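
To make the trade-off concrete, here is a minimal hand-written sketch of keeping a transaction local to SQL Server when only one resource manager is involved; it is not my generator’s output, and the connection string, table, and column names are hypothetical.

```csharp
using System.Data.SqlClient;

public static class OrderData
{
    // A local SqlTransaction on a single connection gives atomicity across both
    // statements without enlisting the DTC, since only one resource manager
    // (one SQL Server database) is involved.
    public static void PlaceOrder(string connectionString, int customerId, decimal total)
    {
        using (SqlConnection connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (SqlTransaction transaction = connection.BeginTransaction())
            {
                try
                {
                    SqlCommand insertOrder = new SqlCommand(
                        "INSERT INTO Orders (CustomerId, Total) VALUES (@customerId, @total)",
                        connection, transaction);
                    insertOrder.Parameters.AddWithValue("@customerId", customerId);
                    insertOrder.Parameters.AddWithValue("@total", total);
                    insertOrder.ExecuteNonQuery();

                    SqlCommand updateCustomer = new SqlCommand(
                        "UPDATE Customers SET LastOrderTotal = @total WHERE CustomerId = @customerId",
                        connection, transaction);
                    updateCustomer.Parameters.AddWithValue("@customerId", customerId);
                    updateCustomer.Parameters.AddWithValue("@total", total);
                    updateCustomer.ExecuteNonQuery();

                    // Commit the local SQL Server transaction; no two-phase commit needed.
                    transaction.Commit();
                }
                catch
                {
                    transaction.Rollback();
                    throw;
                }
            }
        }
    }
}
```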

Sam’s starting point was looking at data access options with ADO.NET, and he comments on Rob Howard’s advice to use DataReaders unless you absolutely need DataSets. I’m in agreement with Sam in finding any reason not to use DataSets; however, I was firmly in the pass-DataReaders camp, that being the fastest way to get data to your output. Latterly I’ve had mixed feelings about this. It is true that you can’t pass a DataReader across a boundary, but exploding the data into object form only to repeat the process as you bind it to output controls in ASP.NET also seems like anathema, and not entirely distant from the criticism of the bloated DataSet. In some cases I’ve compromised on separation and passed a DataReader up to the presentation layer; in cases where I know I need (or am likely to need) to cross a boundary, I’ve transferred the data into “behaviour-less” (state-only) objects. These are readily serialised for transmission by remoting or a web service, and you can still use data binding for presentation.
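
As a rough illustration of the two options, here is a sketch of a data access class that can either hand the open DataReader straight to the page or materialise the rows into serialisable, state-only objects; the `ProductSummary` type, connection string, and Products table are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

// "Behaviour-less" state-only object: readily serialised for remoting or a web
// service, and still usable as a data binding source in ASP.NET.
[Serializable]
public class ProductSummary
{
    public int ProductId { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public class ProductData
{
    // Fastest path: hand the open reader straight to the presentation layer.
    // CommandBehavior.CloseConnection closes the connection when the caller
    // closes the reader, so the page can bind to it and then dispose it.
    public static SqlDataReader GetProductsReader(string connectionString)
    {
        SqlConnection connection = new SqlConnection(connectionString);
        connection.Open();
        SqlCommand command = new SqlCommand(
            "SELECT ProductId, Name, Price FROM Products", connection);
        return command.ExecuteReader(CommandBehavior.CloseConnection);
    }

    // Boundary-friendly path: materialise the rows into state-only objects
    // that can cross a remoting or web service boundary.
    public static List<ProductSummary> GetProducts(string connectionString)
    {
        List<ProductSummary> products = new List<ProductSummary>();
        using (SqlDataReader reader = GetProductsReader(connectionString))
        {
            while (reader.Read())
            {
                ProductSummary product = new ProductSummary();
                product.ProductId = reader.GetInt32(0);
                product.Name = reader.GetString(1);
                product.Price = reader.GetDecimal(2);
                products.Add(product);
            }
        }
        return products;
    }
}
```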

In conclusion, I am saying that using data binding and DataReaders in the presentation layer doesn’t necessarily mean there isn’t a careful separation of presentation and business logic, and it doesn’t have to mean we’re heading back down the road to monolithic client/server applications. The logical and physical architectures of distributed systems don’t necessarily have to match.
