Wednesday, November 12, 2014

WebFOCUS Hyperstage for BI Columnar Database

For a client, I researched Information Builders' columnar database offering called Hyperstage. If you are a WebFOCUS customer, Hyperstage may be a cost-effective way to improve BI performance without special hardware, design, or administration.

This is a WebFOCUS-integrated database based upon technology from InfoBright (previously known as Brighthouse). Information Builders calls Hyperstage a "load-and-go," high-performance, highly scalable database that is simple to implement, use, and maintain.

Like other database adapters, Hyperstage is an add-on product integrated into the WebFOCUS Reporting Server, uses standard metadata, and is administered through the server console.

While relational databases store data organized into two-dimensional tables containing rows and columns, Hyperstage is designed to store together serialized values of each particular column. Rather than retrieve entire rows, a columnar database only reads the necessary columns.

This approach has proven useful for BI systems which need quick access to small subsets of data. Columnar databases in generate are optimized for analytic queries which ask questions about the data (e.g., aggregations and trends) rather than displaying records from the data.

Hyperstage organizes columns into "Data Packs" which are compressed to reduce the physical size of the database. While columnar data compressions of 10:1 ratios are common, an IB resource says one client obtained a 45:1 compression rate.

For faster retrieval, Hyperstage caches the Data Packs into memory. Hyperstage optimizes data access using an in-memory metadata layer known as the "Database Knowledge Grid." This feature helps eliminate the need for typical performance improvement techniques such as indexes, partitions, and materialized views.

Hyperstage uses an open-source relational database (either MySQL or PostgreSQL) and adds its own layers for improved data storage and query optimization. The WebFOCUS Hyperstage adapter will transform BI requests into optimized SQL queries. Queries that cannot be optimized by Hyperstage are handled by the relational database engine.

While the WebFOCUS language does not currently support a "HOLD FORMAT HYPERSTAGE" option, we should be able to easily replicate this feature with a small custom utility.

For example, many of my clients have nightly batch processing to load special BI databases for fast dashboard access. For Hyperstage, we might simply take some of those existing flat import files and use them as the input to a Hyperstage bulk load process.

Using the WebFOCUS language, we can develop a simple procedure to read the flat file's metadata and generate two SQL routines:

• CREATE command to initialize a Hyperstage structure (using ENGINE=BRIGHTHOUSE)
• LOAD DATA INFILE to bulk-load the Hyperstage structure

You can download my prototype of this WebFOCUS utility from my GitHub account.

Do you see places where Hyperstage might be of use within your organization? Let me know if I can be of service. 

Friday, February 21, 2014

Integrating WebFOCUS with Python

The Python open-source object-oriented programming language is popular in today's workforce. It's been around for a couple of decades, but demand for talent is now growing rapidly.

While job postings for WebFOCUS professionals have always been low and steady, the number of jobs with Python has dramatically taken off over the last several years. Below is an Indeed job trend showing the relative growth of jobs for Python and WebFOCUS.  

It is never good to be limited to just a single instrument in your personal toolbox. If you are a WebFOCUS professional, you should learn additional popular technologies--one of them perhaps being Python. 

Many companies have started to use Python for statistical analysis and predictive analytics; there are even contentious arguments on the web debating if Python is better than the R statistical programming language (for example, see this blog). 

Recently, I posted a four-part article on how to integrate WebFOCUS with the R statistical programming language. Guess what? These techniques work equally well with the Python language. 

See these past blog postings:

Part I: Overview
Part II: How WebFOCUS calls R 
Part III: How WebFOCUS generates dynamic R scripts
Part IV: Closer Look at the R Scripts inside WebFOCUS

Are you seeing demand for Python in the marketplace? Leave a comment to give us your insight. 

About Me

My photo

I am a project-based consultant, helping data-intensive firms use agile methods and automation tools to replace legacy reporting and bring in modern BI/Analytics to leverage Social, Cloud, Mobile, Big Data, Visualizations, and Predictive Analytics. For several world-class vendors, I led services teams specializing in providing software implementation and custom application development. Based on scores of successful engagements, I have assembled proven methodologies and automated software tools.

During twenty years of technical consulting, I have been blessed to work with smart people from some of the world's most respected organizations, including: FedEx, Procter & Gamble, Nationwide, The Wendy's Company, The Kroger Co., JPMorgan Chase, MasterCard, Bank of America Merrill Lynch, Siemens, American Express, and others.

I was educated at Valparaiso University and the University of Cincinnati, graduating summa cum laude. In 1990, I joined Information Builders, the vendor of WebFOCUS BI and iWay enterprise integration products, and for over a dozen years served in branch leadership roles. For several years, I also led technical teams within Cincom Systems' ERP software product group and the custom software services arm of Xerox.

Since 2007, I have provided enterprise BI services such as: strategic advice; architecture, design, and software application development of intelligence systems (interactive dashboards and mobile); data warehousing; and automated modernization of legacy reporting.