The dominant noise of the information economy is the outburst of high-volume, high-velocity data in varying file formats, which camouflages a good amount of the business intelligence flowing through business channels. These data streams combine clickstream data, social media posts, video surveillance feeds, sensor data, transactional data, and more. Nicknamed big data and supported by the Hadoop platform, this approach has emerged as the way to grind through such data streams and extract intelligence for effective decisions that increase the top line. Through its research incubator, HTC has been experimenting with Hadoop platforms and variants of distributed storage to fine-tune the platform elements for optimal returns.
Big data systems are the talk of boardrooms, where data-driven decisions are demanded to justify investments and to answer objections. Data-driven decisions call for data analytics, which is the prime force behind them. The core job of data analytics is to interpret data sets and help organizations gain insight into, for example, customer spending patterns. Analytics equips a business with an essential edge: using collected data, it can make better, more informed decisions and improve its services.
The new customer-centric, data-driven business gravitates around data products such as recommender and filtering systems, alongside predictive systems, all driven primarily by large data sets. The recommender system is one of the most common big data applications used to understand and automate the analysis of data streams. These systems are common in the finance and insurance verticals, where they recommend investment opportunities or sales strategies. They are integral parts of online shops, where they create a good user experience and let consumers browse customized product offerings seamlessly. They connect consumers with products they might purchase by correlating product content with expressed opinions.
Recommender systems are blueprinted and custom-designed on top of fundamental algorithms. Most blueprints center on analyzing vast amounts of historical choices and/or purchase patterns of existing customer profiles, and they suggest products based on those past choices. A consumer profile is usually a composite data set that includes demographic information and answers to a well-structured questionnaire. Such profiles are associated with matching products through well-designed customer interaction models. This approach is called collaborative filtering: the system recommends products based on the statistically derived purchase patterns of similarly profiled customers, using supervised or unsupervised techniques. Collaborative filtering is typically built around neighborhood methods and latent factor models, which examine the affinity between consumer profiles and the interdependencies between products to infer new user-item associations. "Crowd-sourced intelligence" is used as an equivalent term: it generalizes users' interactions during purchase activity and recommends products to other profiles based on the crowd-sourced responses. The approach is evident in the active presence of leading brands on social network forums, where they invite customers to share their experience with the product portfolio.
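The neighborhood method mentioned above can be illustrated with a minimal sketch. It is not HTC's implementation; the toy ratings, the user and item names, and the choice of cosine similarity over co-rated items are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
RATINGS = {
    "alice": {"laptop": 5, "phone": 3, "camera": 4},
    "bob":   {"laptop": 4, "phone": 3, "camera": 5, "tablet": 4},
    "carol": {"laptop": 1, "phone": 5, "tablet": 2},
}

def cosine_similarity(a, b):
    """Cosine similarity between two users over the items both have rated
    (a simplification; other similarity measures are equally plausible)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def predict(ratings, user, item):
    """Similarity-weighted average of the neighbours' ratings for `item`."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    return num / den if den else None

def recommend(ratings, user, top_n=3):
    """Rank items the user has not rated by predicted rating, best first."""
    seen = set(ratings[user])
    candidates = {i for r in ratings.values() for i in r} - seen
    scored = [(item, predict(ratings, user, item)) for item in candidates]
    scored = [(item, score) for item, score in scored if score is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]

if __name__ == "__main__":
    print(recommend(RATINGS, "alice"))
```

In this toy data, alice's predicted rating for the tablet leans toward bob's rating rather than carol's, because her past ratings resemble bob's more closely. A latent factor model would instead learn low-dimensional user and item vectors from the same rating table.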
A different blueprint is built around content-based filtering, where the system uses a detailed profile of the individual user built from previous purchases, likes, searches, tweets, and blogs. Such data collection is essential to profile building, as is evident from the social media feedback and rating systems deployed by various online products and from the free web space provided by organizations.
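Content-based filtering can be sketched in a few lines, assuming each product is described by a handful of tags and the user profile is simply the sum of the tags of items the user has already interacted with. The catalogue, the tags, and the function names below are hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical catalogue: item -> descriptive tags.
CATALOGUE = {
    "trail shoes":   ["running", "outdoor", "footwear"],
    "yoga mat":      ["fitness", "indoor"],
    "rain jacket":   ["outdoor", "clothing"],
    "running socks": ["running", "footwear"],
}

def build_profile(history, catalogue):
    """Sum the tag counts of everything the user has interacted with."""
    profile = Counter()
    for item in history:
        profile.update(catalogue[item])
    return profile

def cosine(profile, tags):
    """Cosine similarity between a tag-count profile and an item's tags."""
    item_vec = Counter(tags)
    dot = sum(profile[t] * item_vec[t] for t in item_vec)
    norm_p = sqrt(sum(v * v for v in profile.values()))
    norm_i = sqrt(sum(v * v for v in item_vec.values()))
    return dot / (norm_p * norm_i) if norm_p and norm_i else 0.0

def recommend_by_content(history, catalogue, top_n=2):
    """Score unseen items against the user's own profile, best first."""
    profile = build_profile(history, catalogue)
    scored = [(item, cosine(profile, tags))
              for item, tags in catalogue.items() if item not in history]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]

if __name__ == "__main__":
    print(recommend_by_content(["trail shoes"], CATALOGUE))
```

Unlike the collaborative approach, this sketch never looks at other users: the recommendation depends only on the overlap between the item descriptions and the individual's own history.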
Customers characteristically consult their peers before making decisions and purchases. Such decisions are mostly driven by network peers and are centered on group dynamics and the opinions expressed on social sites. Hence a recommender engine built around collaborative filtering over collated and summarized social data sets influences customers' purchase behavior and leads to business. These recommender engines can be generalized to customer management systems, hospitals, doctors, movies, insurance plans, transport systems, banking systems, educational programs, tourist programs, and so on. For example, a recommender system can recommend potential customers from CRM data; recommend specific aspects of lead generation practices; recommend products and solutions in inbound and outbound call management; or compare and recommend hospitals and doctors. Call analytics, for instance, combines the basic profile of the end user, categorized by time of call, call duration, highlighted issues, call repetition rate, and the geographical spread of the call pattern, with specific pattern identification routines. Mobile-centric recommender systems have started to surface, driven by the huge volume of targeted mobile apps, adding a new dimension.
When social sites are web-scraped, the collected data is put through a cleaning process before it is pushed through the recommender system. The system relies on statistical designs to address the fuzziness and vagueness inherent in the collated data sets. This vagueness arises because end users' product ratings are relative and depend on the environment in which the product was deployed and/or purchased. The recommender system is expected to rank the relative relevance of entities against predefined yardsticks.
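One simple way to picture the cleaning step and the handling of relative ratings is sketched below. It assumes scraped records arrive as dictionaries with hypothetical `user`, `item`, and `rating` fields and a 1 to 5 rating scale; mean-centering per user is one common statistical treatment, not necessarily the one used here.

```python
from collections import defaultdict
from statistics import mean

def clean(records, low=1, high=5):
    """Drop malformed rows and keep the last rating per (user, item) pair."""
    latest = {}
    for rec in records:
        user = (rec.get("user") or "").strip().lower()
        item = (rec.get("item") or "").strip().lower()
        try:
            rating = float(rec.get("rating"))
        except (TypeError, ValueError):
            continue  # rating is missing or not numeric
        if user and item and low <= rating <= high:
            latest[(user, item)] = rating
    return latest

def mean_center(ratings):
    """Subtract each user's average rating, so that a habitually generous
    rater and a habitually harsh rater become comparable."""
    by_user = defaultdict(list)
    for (user, _), rating in ratings.items():
        by_user[user].append(rating)
    averages = {user: mean(values) for user, values in by_user.items()}
    return {key: rating - averages[key[0]] for key, rating in ratings.items()}

if __name__ == "__main__":
    scraped = [
        {"user": " Ann ", "item": "Phone", "rating": "5"},
        {"user": "ann", "item": "phone", "rating": 4},   # later duplicate wins
        {"user": "ann", "item": "case", "rating": 2},
        {"user": "bob", "item": "phone", "rating": "n/a"},  # dropped
    ]
    print(mean_center(clean(scraped)))
```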
Machine learning techniques form the backbone of recommender systems. Ranking variants based on machine learning are referred to as learning-to-rank (LTR) systems and operate on LTR algorithms. Common LTR algorithms include AdaRank, which aims to directly optimize an arbitrary listwise information retrieval (IR) metric such as Mean Average Precision (MAP), ERR, or NDCG, and RankSVM, an SVM-based technique that formulates the ranking task as a binary classification problem over pairs of items. A variety of other algorithms exist for designing a recommender from the ground up. Common rank evaluation measures include Mean Average Precision (MAP), Expected Reciprocal Rank (ERR), Discounted Cumulative Gain (DCG), and Normalized Discounted Cumulative Gain (NDCG). As a common information retrieval measure, NDCG is used to assess the effectiveness of a recommender system based on the graded relevance of the entities at each rank position. It is usually measured in the range 0.0 to 1.0, where 1.0 represents the ideal ranking of the entities, and it is designed for ranking tasks with more than one relevance level. Root Mean Squared Error (RMSE) is another popular metric, used to evaluate the accuracy of predicted ratings. Reciprocal Rank (RR), another frequently cited metric, is suitable for navigational needs, where the position of the first relevant result matters most.
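To make the evaluation measures concrete, here is a minimal sketch of DCG, NDCG, RR, and RMSE. It assumes the exponential gain form of DCG, (2^rel - 1) / log2(rank + 1); a linear-gain variant is also in common use, so treat the exact numbers as dependent on that choice. The function names are illustrative only.

```python
from math import log2, sqrt

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of graded relevances
    (exponential gain, log2 position discount; an assumed formulation)."""
    return sum((2 ** rel - 1) / log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal else 0.0

def reciprocal_rank(relevances):
    """1 / rank of the first relevant result, or 0.0 if none is relevant."""
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0

def rmse(predicted, actual):
    """Root mean squared error between predicted and observed ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return sqrt(squared / len(predicted))

if __name__ == "__main__":
    ranking = [2, 3, 0, 1]  # graded relevance of the items, in ranked order
    print(ndcg(ranking), reciprocal_rank(ranking))
    print(rmse([3.5, 4.0], [4.0, 3.0]))
```

A ranking that is already in descending order of relevance scores an NDCG of 1.0; any other ordering of the same items scores lower. RMSE, by contrast, ignores order entirely and only measures how far the predicted ratings are from the observed ones.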
Recommender engines are establishing domain-level leadership. They have emerged as an alternative to word-of-mouth feedback, offering feedback anywhere and permitting individual mobility coupled with a spectrum of choices. HTC has been experimenting with variations of recommender systems to dive deep into core analytics. Some of the experimental work was built around provider quality, doctor recommendation, marketing and CRM call return on investment, and text analytics, using an open source technology stack and in-house customized stacks. A couple of interesting prototypes have been built around call management analytics, fraud analytics, sales force analytics, and provider quality analytics.