OLAP, data mining ? what does it all mean?
Printer Friendly 
Business intelligence and analytics technologies have evolved their own terminology. Just so we're all on the same page, here's a short tour of some VisualSmart terms, and why they are important to front-line decision-makers.
What is OLAP?
OLAP (On-Line Analytical
Processing) is a technology that provides solutions to three fundamental reporting issues:
Slow running reports. Some methods for calculating total sales for all products might take a very long time. Moreover, each report might keep calculating the same value all over again. To resolve this problem, OLAP can store all possible totals in summary (aggregation) tables. The aggregations are usually re-calculated daily or hourly. This allows decision-makers to act on real-time data.
Report Explosion. The number of possible static reports that can be created from a database may be too large. Take, for example, a simple sales table with three columns: product, date and dollars. Suppose just 10 products are sold and there is just 10 days worth of data. This results in 29 possible reports:
- Total sales
- Total sales by product
- Total sales by product for each day - 10 reports
- Total sales by date
- Total sales by date for each product - 10 reports
- Total sales by date (rows) and product (columns)
- Total sales by date (columns) and product (rows)
- Total sales by date (rows) and product (rows)
- Total sales by product (rows) and date (rows)
- Total sales by date (columns) and product (columns)
- Total sales by product (columns) and date (columns)
If one additional column is added to the table, say "customer name"—the number of possible reports would increase exponentially. Because an average organization has many more than three or four columns associated with its sales, the number of possible reports can approach or exceed trillions
easily overwhelming a static reporting tool. But OLAP uses a 'data cube,' which supports real-time, on-demand, ad-hoc queries. Your report query accesses only the pertinent data, returning refined results each time.
Statistical calculations. These types of calculations are not typically supported by a relational database. For example, SQL (Structured Query Language) requires users to run multiple queries to get the mean, median, or the standard deviation. Because OLAP has pre-calculated totals, this model supports a wide range of statistical functionality that is quickly calculated as a single step.
What is a data warehouse?
Data warehousing is a common approach to creating a reporting database that provides the following advantages:
Centralized data storage. All organizational data is stored in one place. The data is sourced from diverse operational systems such as accounting, human resource,s and payroll. When operational records are imported into the data warehouse, the system makes sure that all information can be linked to the rest of the data. Now users can submit queries with results that combine data from different sources.
Historical data is captured. Sometimes it is important to have a historical snapshot of data.The operational-type database is updated constantly to reflect the latest changes and may not store the old data. The data warehouse will store all the old and new data.
What is data mining?
Data mining is the automated extraction of hidden predictive information from databases. Data mining does two things well: it identifies hidden patterns in data; and it can be used to predict the future based on historical patterns. No technology can predict the future with absolute certainty, and the past is not always an accurate indicator of the future. But when viewed together, historical and current data can be the best indicator of future conditions, and is the closest thing to a crystal ball in business today.
|