Choosing an appropriate database architecture is key to a successful implementation. In many cases, the decision will be driven by the data itself, and how predictable – or not – the incoming data will be, something our data engineers deal with on a daily basis.
While SQL is a relational database technology that stores its data in tables, NoSQL (standing for Not Only SQL) is non-relational, not table based, and, in many ways, more flexible. But that doesn’t always make it the most appropriate option.
What are SQL databases best used for?
It’s easy to visualise an SQL (Structured Query Language) database as a series of tables, with rows for each record, and columns for every data point. They’ve been used for years for storing records with highly repeatable and predictable variables, like customer lists, product inventories, and financial transactions. As such, they are commonly used in legacy systems.
What are NoSQL databases best used for?
NoSQL data is less easy to visualise, since it can take several forms. It could, for example, be a collection of records, which each record made up of a range of unique fields. It’s possible that no two records contain exactly the same range or number of fields, and if they do, the data these relate to may be duplicated, rather than pointing to a single source. Merit’s own data experts believe that “while SQL Databases are highly structured, when it comes to plug and play data analysis for large volumes – NoSQL Databases provide the fluidity to manage Business Intelligence given their semi-structured nature.”
Simpler user queries with NoSQL
While the NoSQL model may appear to make inefficient use of available space, particularly in direct comparison to SQL, queries can more often be satisfied with just one call, saving any process employed by the user several round trips to the server.
For example, an online store may build JSON-based ‘documents’ for each customer, holding payment and delivery details, past orders, and so on. Rather than linking past orders to master product names, descriptions and SKUs in a separate table, they can be stored in the same JSON data as every other data point on the customer record. So, where two or three customers have each bought the same item, it will be listed in each of their documents.
In an SQL database, a similar result could be achieved by using a product ID to reference a distinct product record stored in a separate table.
MongoDB, a NoSQL database built to evolve and scale
MongoDB is one of the most popular NoSQL database platforms. Its developers explain that, “non-relational databases often perform faster because a query doesn’t have to view several tables in order to deliver an answer, as relational datasets often do.
Non-relational databases are therefore ideal for storing data that may be changed frequently or for applications that handle many different kinds of data. They can support rapidly developing applications requiring a dynamic database able to change quickly and to accommodate large amounts of complex, unstructured data.”
NoSQL databases built for recommendation engines like Google
Further, the structure of the database itself – as in the case of a Graph database – may in effect generate discrete data points of its own. For example, nodes for individual entities and assets may be linked by edges that describe the relationship in a way that would be impossible to emulate in an SQL database.
This model is traditionally used in recommendation engines, allowing organisations to present related content based on a user’s past actions. It’s no coincidence that some of the lead developers – and biggest users – of NoSQL databases are companies like Facebook, Amazon, and Google.
SQL Database architecture and design
“SQL Databases have evolved into a long-standing power tool given the fact that the data is well defined and structured.” according to one of Merit’s Senior Service Delivery Managers. “This is effectively a universal language regardless of changing DB Admins – the data in tables itself holds in good stead and remains self-governing as a system. This makes things easy to manage in terms of Data Governance from an IT perspective.”
“SQL databases are better for multi-row transactions, while NoSQL is better for unstructured data like documents or JSON,” says Integrate. On the other hand, “SQL databases are also commonly used for legacy systems that were built around a relational structure.”
SQL databases are highly structural. It is therefore essential that considerable thought be put into the design at their outset, at a point when the use case might still be evolving. Developers must anticipate the kind of queries that will be asked of a relational database and, should this change over time, considerable work may be required to re-engineer the database to accommodate the organisation’s needs.
NoSQL Database architecture and design
As NoSQL databases don’t rely on the data being organised in a rigid table structure (or series of tables), database administrators have considerable latitude throughout the design phase, in the knowledge that unexpected formats and structures can more easily be accommodated at a later date.
Moreover, NoSQL data can be nested, or organised in arrays, with specific data points containing sub-points. Taken as a whole, the nested document gives an overview of the data but, when considered as individual parts, its constituent fields may allow for more granular querying and analysis.
However, “the advantages of NoSQL don’t come without a cost,” says Serdar Yegulalp at InfoWorld. “NoSQL systems don’t generally provide the same level of data consistency as SQL databases. In fact, while SQL databases have traditionally sacrificed performance and scalability for the ACID [Atomicity, Consistency, Isolation, Durability] properties behind reliable transactions, NoSQL databases have largely ditched those ACID guarantees for speed and scalability.”
Choosing an appropriate database model
The choice of database technology will have a fundamental impact on the way that the host architecture is designed. As Integrate explains, “SQL databases are vertically scalable in most situations. You’re able to increase the load on a single server by adding more CPU, RAM, or SSD capacity. NoSQL databases are horizontally scalable. You’re able to handle higher traffic by sharding, which adds more servers to your NoSQL database. Horizontal scaling has a greater overall capacity than vertical scaling, making NoSQL databases the preferred choice for large and frequently changing data sets.”
However, says Microsoft, “sharding dramatically increases operational overhead by spitting data across many pieces that cannot easily communicate. It can be costly and time consuming to manage.”
Thus, a decision on whether to use SQL or NoSQL will never be a case of pure preference. Different data types are suited to specific jobs, and must be stored in an architecture that supports both the content and the intended end-use.
That’s not to say that it’s impossible to migrate from SQL to NoSQL if the needs of the business evolve, but making the right decision at the design stage is more likely to deliver optimal results from day one.
Related Case Studies
A Unified Data Management Platform for Processing Sports Deals
A global intelligence service provider was facing challenge with lack of a centralised data management system which led to duplication of data, increased effort and the risk of manual errors.
Bespoke Data Engineering Solution for High Volume Salesforce Data Migration
A global market leader in credit risk and ratings needed a data engineering solution for Salesforce data migration.