By Désiré Athow
The ever-multiplying number of ways we consume data is having a profound effect on the database market.
As such, companies are scrambling to keep up with the changes taking place and adjust their data storage operations to fit the needs of their business and their customers.
Adrian Carr, VP of database specialist MarkLogic, spoke to us about how new technologies like Not Only SQL, or NoSQL, are now becoming essential due to the flexibility they offer when looking after data.
TechRadar Pro: Can you explain the shift that’s taking place in the database segment right now?
Adrian Carr: Not so long ago we all subscribed to the theory that you could store anything in a relational database. But now we realise that, although it is technically possible, it’s nothing like ideal.
Although you can now store rich content such as documents and social media in relational databases, it’s not easy to then do anything with the data. Even basic functions such as search are problematic. Relational databases simply don’t perform well unless they are given beautifully structured data and are expensive to boot.
As 80 percent of the world’s data is unstructured and more suited to NoSQL (Not Only SQL) databases, these technologies have matured, resulting in the surge of interest we are witnessing in this space.
TRP: How do you predict the database market will evolve?
AC: Rather than widgets and one-project wonders, I believe that platforms control the destiny of most computing. In the NoSQL database world, this means an integrated platform that incorporates not just the database but also the search engine and application services.
With 50 or so NoSQL players jostling for position, we are seeing the first stages of consolidation – such as IBM buying Cloudant – and segmentation.
I predict that all the relational database incumbents will have to make a move into this space as it would take too long for them to build whole new engines themselves. They have been adding extensions to their product but it’s like fitting a round peg in a square hole and it certainly doesn’t scale.
Part of the challenge for the incumbents is a fear of undermining the huge revenues they receive from their relational database business by launching lower cost NoSQL offerings.
TRP: What impact is big data having on how we architect the data centre?
AC: Big data has helped to change people’s mindsets and appreciate the value of the terabytes of data being created and stored.
However, it requires a different approach to building databases and applications. Up to now developers have built out a database to power data centre applications. But for every application you have to load the data from wherever it lives into the application-specific database.
With multiple applications, pretty soon, you have hundreds of data stores with data duplicated all over the place. One day you wake up screaming when you realise the problem is worse than duplication. You have also lost context, security and data provenance.
TRP: Your company discusses the ‘data-centred’ data centre – what is it and how is it different from a traditional data centre architecture?
AC: The data-centred approach is to focus on the data, its use, and its governance through its lifecycle as the primary consideration. By bringing the applications to the data, instead of taking the data to the applications, you can minimise data duplication, resulting in consistent data integrity, more flexible application development, portability to users’ devices, and scalability.
Some of these databases can get pretty large and having them in one place makes life simpler when it comes to managing enterprise fundamentals such as security, disaster recovery, tiered storage and archiving.
TRP: What are the primary considerations for enterprises that are evaluating and deploying NoSQL?
AC: I believe every enterprise checklist should start with security and compliance. The next point depends on the value of your data because valuable data requires ACID compliance.
For those businesses looking at virtualisation, cloud and private cloud, there will be a requirement for scalability across variable and elastic infrastructures. And those operations that run 24 x 7 will have to evaluate data backup, recovery, and business continuity needs.
The unloved stepchild in this world is the administrative toolset. The database tools need to easily integrate into existing system management solutions and afford a level of customisation while supporting development and testing.
TRP: How does Hadoop fit into the data-centred data centre?
AC: One trend in the market is the growing understanding of the importance of Hadoop as a file system. Hadoop by itself is not an application environment – you have to build your application on top of it.
It is a key technology for data-centred data centres, enabling businesses to implement a full-tiered storage architecture so they can store significant data volumes at a low cost, using commodity hardware, and then run the database natively on the Hadoop distributed file system.
I think of Hadoop as the new shared storage infrastructure for big data but it’s important to note that it needs to be complemented with high-performance storage technologies and analytics for real-time search, discovery and analysis.
We can already see this in the alliances being formed between Hadoop companies and those owning or building the database functionality on top.
TRP: Does semantics technology play a role yet?
AC: Semantics has been through the hype cycle since the incredible Tim Berners-Lee first brought it to our attention a dozen or so years ago.
Now the technology has caught up with the hype and organisations are finding they can make better-informed decisions, reduce risk, and convey more accurate information by combining documents, data and RDF triples (also known as linked data) in a single architecture. The BBC used semantics to populate its Olympics websites.
Until now, triple stores have been separated from the data source itself, with context getting lost in the process, but newer technologies such as ours are solving this problem by storing the triples, documents and data in the same database.
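As a rough illustration of keeping triples and their source documents together, the Python sketch below uses a toy in-memory store. It is not MarkLogic's API or any vendor's implementation: the `Store` class, its methods, and the sample documents are all hypothetical and invented for this example. The point is only that a query over triples can return the originating document, so context is not lost.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Toy in-memory store (hypothetical, for illustration only)."""
    documents: dict = field(default_factory=dict)  # doc_id -> document text
    triples: set = field(default_factory=set)      # (subject, predicate, object, doc_id)

    def add_document(self, doc_id, text, triples=()):
        # Store the document and link each triple to the document it came from.
        self.documents[doc_id] = text
        for s, p, o in triples:
            self.triples.add((s, p, o, doc_id))

    def match(self, s=None, p=None, o=None):
        # Return (triple, source text) pairs; None acts as a wildcard.
        results = []
        for ts, tp, to, doc_id in sorted(self.triples):
            if s in (None, ts) and p in (None, tp) and o in (None, to):
                results.append(((ts, tp, to), self.documents[doc_id]))
        return results


# Made-up sample data.
store = Store()
store.add_document("doc1", "Alice won gold in the 100m final.",
                   [("Alice", "won", "gold")])
store.add_document("doc2", "Bob won silver in the 100m final.",
                   [("Bob", "won", "silver")])

hits = store.match(p="won", o="gold")
for triple, source_text in hits:
    print(triple, "<-", source_text)
```

Because each triple carries the identifier of its document, a single lookup returns both the fact and the text it was drawn from, rather than requiring a second system to recover the context.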
TRP: Can NoSQL databases ever meet the stringent requirements of enterprises such as financial services institutions and government?
AC: Although the large relational database players are loath to concede their dominance in the enterprise database market to more nimble NoSQL vendors, the reality is that NoSQL technology is already present in enterprises such as financial institutions and government agencies running production applications and operational transactions.
Many of the NoSQL vendors currently lack these types of enterprise capabilities, but they have stated that they intend to add enterprise features to their future roadmaps. Remember how in the mid-1990s Oracle started on a sharp upward trajectory once it supported true transactional consistency? I predict the same growth in the NoSQL space.