Tuesday, July 14, 2009

Spot the bottleneck

This article (http://www.internetretailing.net/news/waitrose-picks-precise-to-manage-website-performance-and-availability) is quite interesting for a few reasons:

  1. Did they really need a contract with a third party to work out that some SQL statements were causing performance issues?
  2. The fact that improving SQL performance is their primary mechanism for performance gains.
  3. IBM provided the platform.

It would be very interesting to get a view into their existing architecture – to see how much tuning they have done so far.

(PS – I know performance is difficult – you only need to look at the website I currently work with/for/on to see that I don’t have all the answers; my argument is that if they have the money to employ someone specifically for the task that suggests they have exhausted all possible internal solutions. There is surely nothing more dispiriting for a development team than to be held to account for a performance issue and then denied the resources to solve it, only to see a third party parachuted in and given all the assistance they require.)

Update:

The other thing that I wanted to mention was the issue of scalability – we’re being asked to scale 1,000% in three months – should I be scared? An increase of 35% seems trivial – why can’t they scale out to cover that increase?

Friday, July 03, 2009

Tipping Point?

The mercury has now popped out of the top - when Computerworld starts picking up on these things we can now assume they have gone mainstream: http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9135086

To quote ((c) Computerworld.com)
The movement's chief champions [...] learned to get by at their cash-strapped startups without Oracle by building their own data storage solutions, emulating those being built by Google and Amazon.

Now that their open source data stores manage hundreds of terabytes or even petabytes of data for thriving Web 2.0 and cloud computing vendors, switching back is neither technically, economically or even ideologically feasible.

Tuesday, June 30, 2009

Yet more bad news for RDBMS enthusiasts

The temperature's rising, and it's surely only a matter of time before normalisation is wiped from the developer's best-practice lexicon. Another well-written article killing the myth here - http://www.roadtofailure.com/2009/06/19/social-media-kills-the-rdbms/

As previously stated here - the sheer scale of internet applications has exposed the shortcomings of traditional databases in all but the most severe environments (banks?).

My favourite section of the article is the list of things that the author will not be missing with his new solution:

[Quote (c) Bradford Stephens, Road To Failure blog]
WHAT WE’RE SCRAPPING:

* Transactions. Our data is written in from a Hadoop cluster in large batches. If something fails, we’ll just grab the HDFS block and try again.
* Joins. Nothing is more evil than normalization when you need to shard data across multiple servers. If we need to search on 15 primary fields, we’re fine with copying our data set 15 times, with each field a primary key for its table.
* Backup and Complex Replication. All of our data is imported from HDFS. If high-availability is a must, we can simply use Zookeeper to keep track of what nodes die, and then bring up a new one and feed it the data needed in ~ 60 seconds. With scales of hundreds of millions of documents, no one will miss a few hundred thousand for that brief period of time.
* Consistency. If our users are analyzing millions of documents, they’re not going to care if there’s 15,000 unique Authors, or 15,001.

Agreed - if you're a financial institution, the difference between 15,000,000,000 and 15,000,000,001 is important, but for the rest of us, it just isn't.
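To make the "no joins" point from the quoted list concrete, here is a minimal sketch of the idea of copying the data set once per searchable field. This is my own illustration, not the author's code: the function name build_indexes, the sample documents and the field names are all assumptions, and plain in-memory dictionaries stand in for the sharded tables.

```python
from collections import defaultdict


def build_indexes(documents, fields):
    """Build one 'table' per searchable field, each holding a full copy of every document.

    Hypothetical helper: a dict of dicts stands in for one sharded table per field.
    """
    tables = {field: defaultdict(list) for field in fields}
    for doc in documents:
        for field in fields:
            # Store a copy, so each table is self-contained and needs no join.
            tables[field][doc[field]].append(dict(doc))
    return tables


# Made-up sample data, purely for illustration.
documents = [
    {"id": 1, "author": "alice", "topic": "hadoop"},
    {"id": 2, "author": "bob", "topic": "hadoop"},
    {"id": 3, "author": "alice", "topic": "graphs"},
]

tables = build_indexes(documents, ["author", "topic"])

# Each lookup is a single key read against the table built for that field.
by_author = [d["id"] for d in tables["author"]["alice"]]
by_topic = [d["id"] for d in tables["topic"]["hadoop"]]
```

The trade-off is the one the quote accepts: storage grows with the number of searchable fields, and in return every query becomes a single key lookup that can be routed to one server.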

Tuesday, June 23, 2009

Graph Databases

Something that has really resonated with me over recent weeks is the concept of the graph database. I’ve spent most of my professional career railing against RDBMS software and the frustration of database cost/scale/performance, and although graph databases (or key-value databases) won’t solve the database dilemma, it’s very encouraging to find such a vibrant community of experts trying to tackle these issues.

Here is a great presentation which introduces the concepts - http://markorodriguez.com/Lectures_files/risk-symposium2009.pdf

Monday, June 08, 2009

RDBMS, RIP?

Many years ago I wrote about the death of the ACID transaction and the rise of the compensating transaction in loosely-coupled systems (here). I've never liked databases, and their sensitive, delicate demeanour, so I'm particularly pleased to read more and more about the rise of massively scalable (and robust) "databases" based on denormalised key-value pairs - Cassandra (Facebook), BigTable (Google) & Dynamo (Amazon) to name but three.

I know none of these is exactly new, but I think the ideas behind them are being shared within the broader community these days, and that can only be a good thing.
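For anyone who hasn't met the compensating transaction mentioned above, here is a rough sketch of the pattern as I understand it: each step is paired with an "undo" action, and if a later step fails the completed steps are undone in reverse order instead of relying on a database rollback. The function run_with_compensation and the stock/payment steps are hypothetical names invented for this example, not taken from any of the systems named.

```python
def run_with_compensation(steps):
    """Run (action, compensate) pairs in order.

    If an action fails, call the compensations of the steps that already
    succeeded, most recent first, and report failure.
    """
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


log = []


def reserve_stock():
    log.append("reserve stock")


def release_stock():
    log.append("release stock")


def charge_card():
    # Simulated failure in a remote, loosely-coupled service.
    raise RuntimeError("payment service unavailable")


def refund_card():
    log.append("refund card")


ok = run_with_compensation([
    (reserve_stock, release_stock),
    (charge_card, refund_card),
])
```

Here the card charge fails, so the stock reservation is released and the refund is never called, because the charge never completed. Unlike an ACID rollback, the intermediate state was briefly visible to other parts of the system, which is exactly the compromise being accepted.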

Tuesday, June 02, 2009

Google Wave

We’ve been using Basecamp to collaborate on our team for the past year, and one of the things that it highlights is how incredibly poor an experience email provides. A typical email thread (say 10 replies) can encompass a number of people who are cc’d in on (or dropped from) any single message, making retrospective auditing of a decision very, very complex. (In fact it’s impossible if you weren’t on the critical email in the chain.)

Having got used to Basecamp, we now use its Messages function in preference to email precisely because it provides a single conversational thread where anyone can see the decisions being made in chronological order, irrespective of when they joined.

One of the features of Basecamp we haven’t really got comfortable with is the Chat feature, which is functionally equivalent to the Messages, but in ‘real’ time – i.e. it’s a better IM, where Messages are a better email.

Google Wave seems like a better Chat and a better Messages function, combined in one. I have no idea if it’s a ‘killer app’, and some of the initial press has suggested it’s just too ambitious, too complicated for non-technical users, but I for one applaud Google’s ambition in at least addressing the problem. Email is well past its sell-by date, and I think that for tech-savvy power users Wave (or an equivalent) could become a de facto communication medium.

Wednesday, May 27, 2009

Where does innovation come from?

Somewhere in my previous post I stated that in the online economy innovation comes from the bottom not the top, something that I thought at the time was fairly uncontroversial.

Last week I attended an IT seminar, and two things struck me. First, I really don't work in IT - although I don't know what the alternative is; does the VP product development at Google put "IT Consultant" on their passport? Probably not, but what else is there? Anything else seems a little pretentious.

Second, my comment about innovation was quite controversial. The assembled crowd (mostly CIOs / IT Directors) nodded in agreement when the panel suggested that innovation was a luxury in the current climate, and that all that really mattered today was business value as measured by cost reductions and efficiency gains.

But if your business is technology, how can you not innovate? I was astounded at the assumption that innovation was something that could be turned off. I may be very fortunate in my current job but my role is essentially controlling the unstoppable flood of innovation from our development team, and directing it towards some appropriate business objective. Turning it off would be unthinkable, if even possible without losing the team itself.

Someone at the seminar gleefully announced that the fabled Google 20% time was all but gone now, and even the mighty search giant had succumbed to market forces. Well, possibly, according to the HR team, but I'll bet a lot of money that the innovation continues unabated, 20% time or not. Google's scheme was more about encouraging what goes on anyway, with or without formal recognition.

So. I hereby declare that I no longer work in IT, and furthermore that I will never work in IT again (given the choice of course). Where I work innovation is endemic, and comes from the youngest, keenest, coolest people in the room and not the oldies in the comfortable chairs. The Internet (capital 'I') has matured to the point where it now represents an industry sector in its own right, and that's where I intend to stay...