I flew back from Orlando to London last night and came straight to QCon, the practitioner-heavy architecture conference. I will check out a few sessions before calling it a day. The first one I am going to is a presentation about clustering from Ari Zilka, CTO of Terracotta [Disclosure: client].
Terracotta is JVM-level clustering, which allows you to scale apps across multiple nodes. Terracotta not only works transparently, that is, without significant code changes, but also plays very nicely with crucial enterprise Java technologies such as Spring and Hibernate. Terracotta also scales Apache, rather than needing a JEE server. The software is all about clustering Plain Old Java Objects (POJOs).
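To make "clustering POJOs" concrete, here is the kind of class the pitch is about. It is a minimal sketch with made-up names (ShoppingCart, add, items), and nothing in it is Terracotta-specific: no special interfaces, no serialization code, just ordinary Java synchronization. The claim, as presented, is that clustering gets applied to a class like this from the outside rather than being written into it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical POJO: plain fields, plain synchronized methods, no clustering API.
final class ShoppingCart {
    private final List<String> items = new ArrayList<>();

    // Ordinary Java monitor lock; nothing here knows about other nodes.
    synchronized void add(String sku) {
        items.add(sku);
    }

    // Returns a read-only snapshot so callers cannot mutate internal state.
    synchronized List<String> items() {
        return Collections.unmodifiableList(new ArrayList<>(items));
    }
}

final class ShoppingCartDemo {
    public static void main(String[] args) {
        ShoppingCart cart = new ShoppingCart();
        cart.add("sku-1");
        cart.add("sku-2");
        System.out.println(cart.items());
    }
}
```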
Sounds too good to be true, right? Well, don’t take my word for it. Terracotta is open source and freely downloadable. Go download it and see if it works. No need to pay unless you want to. That’s right: it’s the pay-at-the-point-of-value model again, similar to another RedMonk client, MuleSource.
I am sitting here next to a guy who works at a large investment bank, and he says it’s an interesting and ground-breaking approach.
“Terracotta does for distribution of state what the garbage collector did for managing memory”.
Which seems like a good transition to Ari’s pitch. As usual, Ari starts at the beginning, with the fact that he built the clustering architecture for Walmart.com. The story is a really good one because it concerns the realities of top-down vs bottom-up architectural engineering.
Ari’s boss, the CIO at Walmart at the time, advocated the Web 1.0 approach of Oracle as the state manager. “We started like everyone else: stateless + load balanced + Oracle (24 CPUs, 25 GB), plus Veritas and an EMC SAN. $5m to support the web site.”
Sounds like 1999, right?
But Ari’s approach evolved to using the database as little as possible.
“You don’t need a database” – we used to get attacked for saying that. We want to support the application you want to build. You know when you need a database, say to store a customer record. But why use one if you don’t need to?
Over time it became clear that the way to go was to fight the State Monster, one battle at a time, moving as much state off of Oracle as possible.
- Scalability – avoid bottlenecks
- Availability – write to disk
- Simplicity – no copy-on-read/copy-on-write semantics
The CIO had said “Put everything in Oracle. Oracle is our friend.”
Ari talked through the different approaches to scale, and where the bottlenecks emerge.
- Stateless load-balanced architecture – bottleneck on the DB
- In-memory session replication – bottleneck on the CPU
- Clustered DB cache – bottleneck on memory and the DB
- WebLogic is the best clustering for web apps (buddy clustering etc.); nonetheless the CPU is the bottleneck
- memcached – bottleneck on the server
- JMS-based replicated cache – network as the bottleneck
So go with JVM-level clustering. With Terracotta:
- all reads come from cache
- all writes are deltas only
- writes go to disk in log-forward fashion (no disk seek)
- statistics and heuristics (greedy locks)
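The first three points can be sketched in a few lines of Java. This is an illustration under stated assumptions, not Terracotta's implementation or API: DeltaStore, Delta and their methods are hypothetical names, and an in-memory list stands in for the sequential on-disk log. It shows reads served from an in-memory cache, writes recorded as field-level deltas rather than whole objects, and an append-only log that can be replayed in order. The greedy-lock heuristics are not attempted here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical field-level change: only the changed field is recorded, not the whole object.
final class Delta {
    final String objectId;
    final String field;
    final String newValue;

    Delta(String objectId, String field, String newValue) {
        this.objectId = objectId;
        this.field = field;
        this.newValue = newValue;
    }
}

// Hypothetical store: appends deltas to a log and serves reads from an in-memory cache.
final class DeltaStore {
    // Stand-in for a sequential on-disk log; entries are only ever appended.
    private final List<Delta> log = new ArrayList<>();
    private final Map<String, Map<String, String>> cache = new HashMap<>();

    synchronized void write(String objectId, String field, String newValue) {
        log.add(new Delta(objectId, field, newValue)); // append, never rewrite earlier entries
        cache.computeIfAbsent(objectId, k -> new HashMap<>()).put(field, newValue);
    }

    // Reads never touch the log; they are answered from the cache.
    synchronized String read(String objectId, String field) {
        Map<String, String> fields = cache.get(objectId);
        return fields == null ? null : fields.get(field);
    }

    synchronized int logSize() {
        return log.size();
    }

    // Rebuilds state by replaying the log in order, as a recovering node might.
    synchronized Map<String, Map<String, String>> replay() {
        Map<String, Map<String, String>> rebuilt = new HashMap<>();
        for (Delta d : log) {
            rebuilt.computeIfAbsent(d.objectId, k -> new HashMap<>()).put(d.field, d.newValue);
        }
        return rebuilt;
    }
}

final class DeltaStoreDemo {
    public static void main(String[] args) {
        DeltaStore store = new DeltaStore();
        store.write("cart-1", "qty", "1");
        store.write("cart-1", "qty", "2"); // second delta appended; the first stays in the log
        System.out.println(store.read("cart-1", "qty"));
        System.out.println(store.logSize());
    }
}
```

The trade-off in a design like this is that the log grows with every write, so a real system would need some form of compaction; that is left out of the sketch.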
My take:
Terracotta calls its approach network attached memory. Zilka talked through a case study in which a customer had reduced their database load by 50%. To me that seems like Terracotta’s big opportunity to drive sales: to go after the database in the same way Oracle has gone after other middleware and storage technologies, reducing reliance on other technologies. Oracle database is incredibly powerful, but it’s pretty expensive. I think the ROI argument for Terracotta is to offer to reduce database load and improve performance.
Terracotta as database optimisation.