Business Analytics with In-Memory Databases
Abstract
Business intelligence (BI) and data warehouse vendors are increasingly turning to in-memory technology in place of traditional disk-based storage to speed up implementations and extend self-service capabilities.
For years, creating customer data queries and building business intelligence reports has been a prolonged activity, because the information needed must be pulled from operational systems and then managed in separate analytical data warehouse systems that can accept the queries. Now, however, true ‘in-memory analytics’ has arrived: a technology that allows operational data to be held in a single database that can handle all the day-to-day customer transactions and updates, as well as analytical requests, in virtually real time.
Starting Questions
Successful Business Analytics project implementations start by asking the right questions. Here are a few that should be on your short list:
- How do I manage and maintain the performance of my existing reports as data volumes keep growing?
- What is a cost-effective alternative to data warehouses that offers the ability to analyze very large data sets but is much simpler to set up and administer?
- What can I do today to support near-real-time reporting requirements without relying heavily on IT departments?
- How can I demonstrate value to my company by extending real-time ad-hoc query capabilities to high-volume transaction functions such as Financial Services?
- How do I minimize administration overhead while still providing a transparent reporting environment to end users?
The purpose of this article is to put both BI technologies, in-memory and disk-based, in perspective, explain the differences between them, and finally explain, in simple terms, why disk-based BI technology is not on its way to extinction, as well as the prerequisites for considering an in-memory database BI solution.
But before we get to that, let us understand the differences between disk-based and in-memory databases.
Disk-based and In-memory Databases
Whether a database is disk-based or in-memory refers to where the data resides while it is actively being queried by an application: with disk-based databases, the data is queried while stored on disk; with in-memory databases, the data being queried is first loaded into RAM (Random Access Memory).
Disk-based databases are engineered to efficiently query data residing on the hard drive. At a very basic level, these databases assume that the entire data cannot fit inside the relatively small amount of RAM available and therefore must have very efficient disk reads in order for queries to be returned within a reasonable time frame. On the other hand, in-memory databases work under the opposite assumption that the data can fit entirely inside the RAM. The engineers of in-memory databases benefit from utilizing the fastest storage system a computer has (RAM), but have much less of it at their disposal.
The fundamental trade-off between disk-based and in-memory technologies is slower reads with practically unlimited amounts of data versus faster reads with limited amounts of data. These are two critical considerations for business intelligence applications, as it is important both to have fast query response times and to have access to as much data as possible.
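The distinction can be sketched with SQLite, which supports both storage modes; the table name and sample rows below are invented for illustration, and SQLite here is simply a convenient stand-in for the general concept rather than one of the enterprise products discussed in this article:

```python
import sqlite3

# A disk-based database stores and reads its data from a file, e.g.:
#   disk_db = sqlite3.connect("sales.db")
# An in-memory database keeps the entire data set in RAM:
mem_db = sqlite3.connect(":memory:")

# Load sample rows into the in-memory database (hypothetical data).
mem_db.execute("CREATE TABLE sales (region TEXT, amount REAL)")
mem_db.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 45.5)],
)

# The same SQL works in either mode; only the storage location differs.
total = mem_db.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(total)  # 245.5
```

The query itself is identical in both modes; what changes is whether each read touches the disk or is served entirely from RAM.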
Fast analysis, better insight and rapid deployment with minimal IT involvement!
What is it?
As the name suggests, the key difference between conventional BI tools and in-memory products is that the former query data on disk while the latter query data in random access memory (RAM). When a user runs a query against a typical data warehouse, the query normally goes to a database that reads the information from multiple tables stored on a server’s hard disk. With a server-based in-memory database, all information is initially loaded into memory, and users then query and interact with the data loaded into the machine’s memory.
BI with in-memory databases may sound like caching, a common approach to speeding up query performance, but in-memory databases do not suffer from the same limitations. Caches are typically subsets of data, stored on and retrieved from disk (though some may load into RAM). The key difference is that cached data is usually predefined and very specific, often to an individual query; with an in-memory database, the data available for analysis is potentially as large as an entire data mart.
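The contrast can be sketched in a few lines of Python; the dictionary cache and the sales table below are hypothetical, with SQLite's in-memory mode standing in for the general idea:

```python
import sqlite3

# A typical query cache holds precomputed answers to specific, predefined
# queries; only those exact queries benefit from it.
report_cache = {"total_sales_north": 165.5}

# An in-memory database holds the full data set, so any ad-hoc query,
# anticipated or not, can be answered from RAM.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (region TEXT, amount REAL)")
db.executemany("INSERT INTO sales VALUES (?, ?)",
               [("North", 120.0), ("North", 45.5), ("South", 80.0)])

# The cached question is answerable either way...
north = db.execute(
    "SELECT SUM(amount) FROM sales WHERE region = 'North'").fetchone()[0]
assert north == report_cache["total_sales_north"]

# ...but a brand-new question misses the cache entirely; with the data
# already in memory it is answered just as quickly as the cached one.
south = db.execute(
    "SELECT SUM(amount) FROM sales WHERE region = 'South'").fetchone()[0]
print(south)  # 80.0
```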
An in-memory database is designed specifically to take advantage of the immense amount of addressable memory now available with the latest 64-bit operating systems. In-memory technology uses the multiple gigabytes of memory space available in 64-bit servers as its data store. In-memory analysis is designed to improve the overall performance of a BI system as perceived by users, especially for complex queries that take a long time to process in the database, or for very large databases where all queries are hampered by the database size. An in-memory database allows data to be analyzed at both an aggregate and a detailed level without the time-consuming and costly step of developing ETL processes and data warehouses or building multidimensional OLAP cubes. Since data is kept in memory, the response time of any calculation is lightning fast, even on extremely large data sets analyzed by multiple concurrent users.
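As a small sketch of aggregate-level versus detail-level analysis over the same in-memory data (the orders table and its figures are made up, and SQLite again stands in for the concept), note that no ETL step or pre-built cube is involved; both views are computed on the fly:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (region TEXT, product TEXT, amount REAL)")
db.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    ("North", "Widget", 100.0),
    ("North", "Gadget", 50.0),
    ("South", "Widget", 75.0),
])

# Aggregate level: regional totals computed on demand, no pre-built cube.
summary = db.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(summary)  # [('North', 150.0), ('South', 75.0)]

# Detail level: drill straight down to the underlying rows of one region.
detail = db.execute(
    "SELECT product, amount FROM orders WHERE region = 'North' ORDER BY product"
).fetchall()
print(detail)  # [('Gadget', 50.0), ('Widget', 100.0)]
```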
This kind of immediate, interactive analysis is particularly important when people are trying to discover unknown patterns or identify new opportunities.
Who is it for? | Know your challenges | Finding the right mix
Summary
Business Analytics with an in-memory database provides companies with a faster, more flexible, and arguably lower-cost way of accessing and processing information, allowing users to get answers to business questions in seconds rather than hours. By virtue of its high-performance architecture, in-memory technology has the potential to help midsize organizations become more informed, more agile, and quicker to respond to changing market conditions.
In addition, advances in technology and the falling cost of memory and CPUs make this type of technology more attractive than ever before. Matching the appropriate architectural approach with the kind of business analytics solutions needed by a midsize company has the potential to deliver benefits such as reduced time to insight, greater agility, increased self-service, and lower overall IT demands.
Srikanth Chintamaneni is a manager in the Information Management service line of Deloitte Consulting India Pvt. Ltd. He has over 13 years of experience in providing consulting services involving data warehouse and content management solutions in the Health care, Commercial & Consumer Finance, and Industrial Products industry segments. His capabilities support services involving data profiling, data modeling, report design, and end-to-end data warehouse implementations.