Tuesday, 25 April 2017

SQL vs NOSQL Database

In the world of database technology, there are two main types of databases: SQL and NoSQL—or, relational databases and non-relational databases. The difference speaks to how they’re built, the type of information they store, and how they store it. Relational databases are structured, like phone books that store phone numbers and addresses. Non-relational databases are document-oriented and distributed, like file folders that hold everything from a person’s address and phone number to their Facebook likes and online shopping preferences.
We call them SQL and NoSQL, referring to whether or not they’re written solely in structured query language (SQL). In this article, we’ll explore what SQL is, how it makes these databases different, and how each type structures the data it holds so you can easily determine which type is right for you.

Image for SQL vs. NoSQL Databases: What’s the Difference?
In the world of database technology, there are two main types of databases: SQL and NoSQL—or, relational databases and non-relational databases. The difference speaks to how they’re built, the type of information they store, and how they store it. Relational databases are structured, like phone books that store phone numbers and addresses. Non-relational databases are document-oriented and distributed, like file folders that hold everything from a person’s address and phone number to their Facebook likes and online shopping preferences.
We call them SQL and NoSQL, referring to whether or not they’re written solely in structured query language (SQL). In this article, we’ll explore what SQL is, how it makes these databases different, and how each type structures the data it holds so you can easily determine which type is right for you.
relational database vs document-oriented database

SQL: RELATIONAL DATABASES

First, let’s take a look at one of the main features that separates these two systems: the way they structure data. A relational database—or, an SQL database, named for the language it’s written in, Structured Query Language (SQL)—is the more rigid, structured way of storing data, like a phone book. Developed by IBM in the 1970s, a relational database consists of two or more tables with columns and rows. Each row represents an entry, and each column sorts a very specific type of information, like a name, address, and phone number. The relationship between tables and field types is called a schema. In a relational database, the schema must be clearly defined before any information can be added.
For a relational database to be effective, the data you’re storing in it has to be structured in a very organized way. A well-designed schema minimizes data redundancy and prevents tables from becoming out-of-sync, a critical feature for many businesses, especially those that record financial transactions. A poorly designed schema can result in organizational headaches due to its rigidity. For example, a column designed to store U.S. phone numbers might require 10 digits because that’s the standard for phone numbers in the U.S. This has the advantage of rejecting any invalid values (for example, if a number is missing an area code). However, if you need to change the schema (for instance, if you need to include an international phone number entry with more than 10 digits), then the entire database needs to be edited. Key takeaway: excellent organization results in a compromise in flexibility with a relational database.
Structured Query Language (SQL) is a programming language used by database architects to design relational databases. In an SQL database like MySQL, Sybase, Oracle, or IBM DM2, SQL executes queries, retrieves data, and edits data by updating, deleting, or creating new records. SQL is a lightweight, declarative language that does a lot of heavy lifting for the relational database, acting like a database’s version of a server-side script. One particular advantage of SQL is its simple-yet-powerful JOIN clause, which allows developers to retrieve related data stored across multiple tables with a single command.
Another reason SQL databases remain popular is that they fit naturally into many venerable software stacks, including LAMP and Ruby-based stacks. These databases are well understood and widely supported, which can be a major advantage if you run into problems.

Popular SQL databases and RDBMS’s

  • MySQL—the most popular open-source database, excellent for CMS sites and blogs.
  • Oracle—an object-relational DBMS written in the C++ language. If you have the budget, this is a full-service option with great customer service and reliability. Oracle has also released an Oracle NoSQL database.
  • IMB DB2—a family of database server products from IBM that are built to handle advanced “big data” analytics.
  • Sybase—a relational model database server product for businesses primarily used on the Unix OS, which was the first enterprise-level DBMS for Linux.
  • MS SQL Server—a Microsoft-developed RDBMS for enterprise-level databases that supports both SQL and NoSQL architectures.
  • Microsoft Azure—a cloud computing platform that supports any operating system, and lets you store, compute, and scale data in one place. A recent survey even put it ahead of Amazon Web Services and Google Cloud Storage for corporate data storage.
  • MariaDB—an enhanced, drop-in version of MySQL.
  • PostgreSQL—an enterprise-level, object-relational DBMS that uses procedural languages like Perl and Python, in addition to SQL-level code.

NOSQL DATABASES: NON-RELATIONAL & DISTRIBUTED DATA

If your data requirements aren’t clear at the outset or if you’re dealing with massive amounts of unstructured data, you may not have the luxury of developing a relational database with clearly defined schema. Enter non-relational databases, which offer much greater flexibility than their traditional counterparts. Think of non-relational databases more like file folders, assembling related information of all types. If a WordPress blog used a NoSQL database, each file could store data for a blog post: social likes, photos, text, metrics, links, and more.
Unstructured data from the web can include sensor data, social sharing, personal settings, photos, location-based information, online activity, usage metrics, and more. Trying to store, process, and analyze all of this unstructured data led to the development of schema-less alternatives to SQL. Taken together, these alternatives are referred to as NoSQL, meaning “Not only SQL.” While the term NoSQL encompasses a broad range of alternatives to relational databases, what they have in common is that they allow you to treat data more flexibly.
How do NoSQL databases work? Instead of tables, NoSQL databases are document-oriented. This way, non-structured data (such as articles, photos, social media data, videos, or content within a blog post) can be stored in a single document that can be easily found but isn’t necessarily categorized into fields like a relational database does. It’s more intuitive, but note that storing data in bulk like this requires extra processing effort and more storage than highly organized SQL data. That’s why Hadoop, an open-source computing and data analysis platform capable of processing huge amounts of data in the cloud, is so popular in conjunction with NoSQL database stacks.
NoSQL databases offer another major advantage, particularly to app developers: ease of access. Relational databases have a fraught relationship with applications written in object-oriented programming languages like Java, PHP, and Python. NoSQL databases are often able to sidestep this problem through APIs, which allow developers to execute queries without having to learn SQL or understand the underlying architecture of their database system.

Common Types of NoSQL Databases

  1. Key-value model—the least complex NoSQL option, which stores data in a schema-less way that consists of indexed keys and values. Examples: Cassandra, Azure, LevelDB, and Riak.
  2. Column store—or, wide-column store, which stores data tables as columns rather than rows. It’s more than just an inverted table—sectioning out columns allows for excellent scalability and high performance. Examples: HBase, BigTable, HyperTable.
  3. Document database—taking the key-value concept and adding more complexity, each document in this type of database has its own data, and its own unique key, which is used to retrieve it. It’s a great option for storing, retrieving and managing data that’s document-oriented but still somewhat structured. Examples: MongoDB, CouchDB.
  4. Graph database—have data that’s interconnected and best represented as a graph? This method is capable of lots of complexity. Examples: Polyglot, Neo4J.

Popular NoSQL Databases

  • MongoDB—the most popular NoSQL system, especially among startups. A document-oriented database with JSON-like documents in dynamic schemas instead of relational tables that’s used on the back end of sites like Craigslist, eBay, Foursquare. It’s open-source, so it’s free, with good customer service. Read more in Should You Use MongoDB? A Look at the Leading NoSQL Database.
  • Apache’s CouchDB—a true DB for the web, it uses the JSON data exchange format to store its documents; JavaScript for indexing, combining and transforming documents; and, HTTP for its API.
  • HBase—another Apache project, developed as a part of Hadoop, this open-source, non-relational “column store” NoSQL DB is written in Java, and provides BigTable-like capabilities.
  • Oracle NoSQL—Oracle’s entry into the NoSQL category.
  • Apache’s Cassandra DB—born at Facebook, Cassandra is a distributed database that’s great at handling massive amounts of structured data. Anticipate a growing application? Cassandra is excellent at scaling up. Examples: Instagram, Comcast, Apple, and Spotify.
  • Riak—an open-source key-value store database written in Erlang. It has fault-tolerance replication and automatic data distribution built in for excellent performance.
    What database solution is right for you?

Reasons to use a SQL database

When it comes to database technology, there’s no one-size-fits-all solution. That’s why many businesses rely on both relational and nonrelational databases for different tasks. Even as NoSQL databases gain popularity for their speed and scalability, there are still situations where a highly structured SQL database may be preferable. Here are a few reasons you might choose an SQL database:
  1. You need to ensure ACID compliancy (Atomicity, Consistency, Isolation, Durability). ACID compliancy reduces anomalies and protects the integrity of your database by prescribing exactly how transactions interact with the database. Generally, NoSQL databases sacrifice ACID compliancy for flexibility and processing speed, but for many e-commerce and financial applications, an ACID-compliant database remains the preferred option.
  2. Your data is structured and unchanging. If your business is not experiencing massive growth that would require more servers and you’re only working with data that’s consistent, then there may be no reason to use a system designed to support a variety of data types and high traffic volume.

Reasons to use a NoSQL database

When all of the other components of your server-side application are designed to be fast and seamless, NoSQL databases prevent data from being the bottleneck. Big data is the real NoSQL motivator here, doing things that traditional relational databases cannot. It’s driving the popularity of NoSQL databases like MongoDB, CouchDB, Cassandra, and HBase.
  1. Storing large volumes of data that often have little to no structure. A NoSQL database sets no limits on the types of data you can store together, and allows you to add different new types as your needs change. With document-based databases, you can store data in one place without having to define what “types” of data those are in advance.
  2. Making the most of cloud computing and storage. Cloud-based storage is an excellent cost-saving solution, but requires data to be easily spread across multiple servers to scale up. Using commodity (affordable, smaller) hardware on-site or in the cloud saves you the hassle of additional software, and NoSQL databases like Cassandra are designed to be scaled across multiple data centers out of the box without a lot of headaches.
  3. Rapid development. If you’re developing within two-week Agile sprints, cranking out quick iterations, or needing to make frequent updates to the data structure without a lot of downtime between versions, a relational database will slow you down. NoSQL data doesn’t need to be prepped ahead of time.

SQL vs. NoSQL- Which to Use?

The idea that SQL and NoSQL are in direct opposition and competition with each other is flawed one, not in the least because many companies opt to use them concurrently. As with all of the technologies I’ve previously discussed, there really isn’t a ‘one-system-fits-all’ approach; choosing the right technology hinges on the use case. If your data needs are changing rapidly, you need high throughput to handle viral growth, or your data is growing fast and you need to be able to scale out quickly and efficiently, maybe NoSQL is for you. But if the data you have isn’t changing in structure and you’re experiencing moderate, manageable growth, your needs may be best met by SQL technologies. Certainly, SQL is not dead yet.

The Benefits of NoSQL

When compared to relational databases, NoSQL databases are more scalable and provide superior performance, and their data model addresses several issues that the relational model is not designed to address:
  • Large volumes of rapidly changing structured, semi-structured, and unstructured data
  • Agile sprints, quick schema iteration, and frequent code pushes
  • Object-oriented programming that is easy to use and flexible
  • Geographically distributed scale-out architecture instead of expensive, monolithic architecture

NoSQL vs. SQL Summary

SQL DatabasesNOSQL Databases
TypesOne type (SQL database) with minor variationsMany different types including key-value stores, document databases, wide-column stores, and graph databases
Development HistoryDeveloped in 1970s to deal with first wave of data storage applicationsDeveloped in late 2000s to deal with limitations of SQL databases, especially scalability, multi-structured data, geo-distribution and agile development sprints
ExamplesMySQL, Postgres, Microsoft SQL Server, Oracle DatabaseMongoDB, Cassandra, HBase, Neo4j
Data Storage ModelIndividual records (e.g., 'employees') are stored as rows in tables, with each column storing a specific piece of data about that record (e.g., 'manager,' 'date hired,' etc.), much like a spreadsheet. Related data is stored in separate tables, and then joined together when more complex queries are executed. For example, 'offices' might be stored in one table, and 'employees' in another. When a user wants to find the work address of an employee, the database engine joins the 'employee' and 'office' tables together to get all the information necessary.Varies based on database type. For example, key-value stores function similarly to SQL databases, but have only two columns ('key' and 'value'), with more complex information sometimes stored as BLOBs within the 'value' columns. Document databases do away with the table-and-row model altogether, storing all relevant data together in single 'document' in JSON, XML, or another format, which can nest values hierarchically.
SchemasStructure and data types are fixed in advance. To store information about a new data item, the entire database must be altered, during which time the database must be taken offline.Typically dynamic, with some enforcing data validation rules. Applications can add new fields on the fly, and unlike SQL table rows, dissimilar data can be stored together as necessary. For some databases (e.g., wide-column stores), it is somewhat more challenging to add new fields dynamically.
ScalingVertically, meaning a single server must be made increasingly powerful in order to deal with increased demand. It is possible to spread SQL databases over many servers, but significant additional engineering is generally required, and core relational features such as JOINs, referential integrity and transactions are typically lost.Horizontally, meaning that to add capacity, a database administrator can simply add more commodity servers or cloud instances. The database automatically spreads data across servers as necessary.
Development ModelMix of open-source (e.g., Postgres, MySQL) and closed source (e.g., Oracle Database)Open-source
Supports TransactionsYes, updates can be configured to complete entirely or not at allIn certain circumstances and at certain levels (e.g., document level vs. database level)
Data ManipulationSpecific language using Select, Insert, and Update statements, e.g. SELECT fields FROM table WHERE…Through object-oriented APIs
ConsistencyCan be configured for strong consistencyDepends on product. Some provide strong consistency (e.g., MongoDB, with tunable consistency for reads) whereas others offer eventual consistency (e.g., Cassandra).

0 comments:

Post a Comment