Monday, July 25, 2016

Zyme: Emergence and Evolution of Channel Data Management Software

Prior to the official launch of the new version of Zyme’s solution, I had the opportunity to chat with and be briefed by Ashish Shete, VP of Products and Engineering at Zyme, about version 3.0 of what Zyme describes as its channel data management (CDM) solution platform.
This conversation was noteworthy from both the software product and industry perspectives. In particular, the solution is relevant to an industry that needs software and technology solutions to help control, streamline, and improve the management of a fascinating and complex ecosystem called the distribution channel.
Zyme aims to increase the efficiency of this ecosystem through its CDM platform.

The distribution channel: a hidden monster
According to the United Nations Conference on Trade and Development (UNCTAD):
Driven by favorable policies, technological innovation and business models bringing down the costs of cross-border transactions, international trade in goods and services added about 20 trillion US$ during the last 25 years, going from about 4 trillion US$ in 1990 to about 24 trillion US$ in 2014.
Global business is now “business as usual,” the norm for a global economy. As manufacturers and service providers put goods on the market that are worth trillions of dollars, a huge infrastructure of distributors, resellers, retailers, and value-added resellers (VARs)—what we call the channel—is responsible for selling and moving them around the globe.
As more goods and services reach new markets and new trade and commercialization models are created, the channel becomes an increasingly complex ecosystem that moves an immense flow of goods from many different places (see Figure 1 below).

Figure 1. A simple version of the channel (Image courtesy of Zyme)

As a result, manufacturers and service providers face challenges in handling the increasing volume and diversity of data coming from the channel while still maintaining visibility into the channel and garnering insight into when and how their products and services are being sold and moved within it.

Simply managing this data is typically a complex and cumbersome task. This is because the data collected from the channel originates from different sources, and comes in different formats (text files, spreadsheets, via Open Database Connectivity (ODBC) connectors to third-party systems, etc.) and diverse structures (plain text, XML files, etc.). The challenge then is to find the most efficient way to collect, clean, organize, and consolidate this variety of data in order to gain visibility and insight from all these data points.
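To make that consolidation challenge concrete, here is a minimal Python sketch of normalizing reseller sales records that arrive as both CSV and XML into a single totals-per-SKU view. The field names and feed layouts are hypothetical, and a real CDM pipeline would add validation, enrichment, and many more formats; this is only the shape of the problem:

```python
import csv
import io
import xml.etree.ElementTree as ET

def records_from_csv(text):
    """Parse reseller point-of-sale rows from a CSV export."""
    return [
        {"sku": row["sku"].strip().upper(), "qty": int(row["qty"])}
        for row in csv.DictReader(io.StringIO(text))
    ]

def records_from_xml(text):
    """Parse the same kind of rows from an XML feed."""
    root = ET.fromstring(text)
    return [
        {"sku": e.findtext("sku").strip().upper(), "qty": int(e.findtext("qty"))}
        for e in root.findall("sale")
    ]

def consolidate(*sources):
    """Merge per-source records into one total-per-SKU view."""
    totals = {}
    for records in sources:
        for rec in records:
            totals[rec["sku"]] = totals.get(rec["sku"], 0) + rec["qty"]
    return totals

# Two toy feeds reporting the same products in different formats and casings:
csv_feed = "sku,qty\nab-1,5\nCD-2,3\n"
xml_feed = "<sales><sale><sku>AB-1</sku><qty>2</qty></sale></sales>"

totals = consolidate(records_from_csv(csv_feed), records_from_xml(xml_feed))
print(totals)  # {'AB-1': 7, 'CD-2': 3}
```

Even this toy version shows where the effort goes: every new partner format needs its own parser, and identifiers must be cleaned before records from different sources can be matched.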
Companies like Zyme offer CDM software solutions as a concrete means to address this challenge. But what is a CDM solution? Well, in the words of Zyme, CDM is:
a discipline concerned with the acquisition and use of data originating from the channel. It enables companies to significantly grow their business by offering transformative insights into the way business is conducted in the channel.
In other words, a CDM solution offers a series of tools that enable customers or users to efficiently manage the data coming from the channel. This includes the following:

  • Integration with third-party systems
  • Automated data collection
  • Data enrichment functionality
  • Advanced analytics and reporting capabilities

Zyme aims to achieve complete channel visibility through its cloud CDM platform, which collects the raw data originating from partners and pushes it to Zyme’s proprietary technologies and content libraries, which transform it into usable data for intelligence gathering. Once the data is ready, it can be processed and consumed for analysis and visualization through dashboards and/or other third-party analytics systems.

The channel data management market has enormous potential for growth and evolution. And a company such as Zyme, with its combination of expertise and innovative technology, is constantly developing this segment of the data management market.

Proof of this is the consistent growth of Zyme, which accounts for more than 70% of market share. The company expects to process more than $175 billion in channel revenues and more than 1 billion transactions this year, thanks to big customers including Microsoft, VMware, and GE, to name just a few.

Figure 2. Zyme Screencap (Courtesy of Zyme)

Zyme adds power with version 3.0 
On June 30th, Zyme announced the release of version 3.0 of its CDM solution. The release keeps with its mission to expand the platform and provide channel visibility to global enterprises. Zyme’s new version has been enriched with several improvements, three of which are core to the new direction of the company:

  • The addition of zymeEcommerceSM to the platform. This new e-commerce offering will give companies more visibility into online shelf space. This new solution can keep track of metrics such as competitors’ product positioning, pricing, and customer perception across e-commerce channels—and consequently delivers market intelligence.
  • The addition of the new zymeIncentives solution. This solution allows companies to perform incentives management, automatically calculating and validating rebates and credits earned by partners based on Zyme’s existing decision-grade data. The solution can also communicate and facilitate incentive payments to channel partners quickly and seamlessly.
  • Zyme’s approach to the Internet of Things (IoT), called zymeCDMSM. This enhances Zyme’s existing functionality with capabilities for tracking connected devices down to individual serial numbers in real time. This in turn improves visibility into product movement, such as letting a manufacturer map a product’s complete route to a customer, with the ultimate goal of closing the loop between manufacturers and end users.

In regard to the new version, Chandran Sankaran, Zyme’s CEO, commented:
The Zyme cloud platform 3.0 makes our proprietary technologies and comprehensive content libraries, including more than 1.5 million channel partners and the largest directory of products and retailers, available to customers through a modern, scalable, SaaS platform. Global enterprises have immediate access to complete, accurate and timely data from resellers and distributors to unlock the enormous value that had previously been trapped in the channel due to inefficient and outdated reporting systems and processes.
On the other hand, on the customer side, Kevin Nusky, Director of Marketing and Sales Operations at Schneider Electric’s IT Business Unit had the following to say about Zyme’s new release:
More than 65 percent of our sales go through a distribution system, so we can't make informed business decisions without accurate data from channel partners. Zyme delivers unprecedented partner reporting accuracy, which has led to improved inventory management, reduced rebate overpayments, increased revenue through better partner development and accelerated channel growth and success.
Building on its core mission of delivering channel visibility to global enterprises, the company offers a targeted solution. Zyme’s cloud platform 3.0 aims to empower companies to obtain maximum value from channel sales.

Zyme in a blue sea
It appears that Zyme has found in channel data management a market with huge potential, where competitors are scarce and users are willing to consider these new types of software offerings. In this market, the IoT could empower companies like Zyme with the tools to improve the mechanisms driving complete channel visibility for their customers.
As with many other types of enterprise software applications, Zyme’s success will depend on how efficiently it can integrate with the existing software stack (customer relationship management [CRM], enterprise resource planning [ERP], and other systems) to ensure data management agility and timeliness, as well as accurate visibility and natural interactivity with other business operations. Zyme appears to be on the right path to achieving these goals.


Wednesday, June 15, 2016

An Interview with Dataiku's CEO: Florian Douetteau

As an increasing number of organizations look for ways to take their analytics platforms to the next level, many are seriously considering the incorporation of new advanced analytics disciplines. This includes hiring data science specialists and adopting solutions that can deliver improved data analysis and insights. As a consequence, new companies and offerings keep emerging in this area.

Dataiku is one of this new breed of companies. With its Data Science Studio (DSS) solution, Dataiku aims to offer a full data science solution for both experienced and inexperienced data science users.

I recently had the chance to interview Florian Douetteau, Dataiku’s CEO, and pick his brain about the data management industry and, of course, his company and software solution.

A brief bio of Florian

In 2000, at age 20, he dropped out of math courses at the prestigious École Normale Supérieure and decided to look for the largest dataset he could find, and the hardest related problem he could solve.

That’s how he started working at Exalead, a search engine company that at the time was developing technologies in web mining, search, natural language processing (NLP), and distributed computing. At Exalead, Florian rose to VP of Product and R&D. He stayed with the company until it was acquired in 2010 by Dassault Systèmes for $150M (a pretty large amount by French standards).

In 2010, as the data deluge was pouring into new seas, Florian moved into the social gaming and online advertising industry, where machine learning was already being applied to petabytes of data. Between 2010 and 2013 he held several positions as a consultant and CTO.

In 2013, Florian, along with three other co-founders, created Dataiku with the goal of making advanced data technologies accessible to companies that are not digital giants. Since then, one of Florian’s main goals as CEO of Dataiku has been to democratize access to data science.

You can watch the video or listen to the podcast in which Florian shares some of his views on the fast evolution of data science, analytics, and big data, and of course, his data science software solution.

Of course, please feel free to let us know your comments and questions.

Monday, April 18, 2016

Altiscale Delivers Improved Insight and Hindsight to Its Data Cloud Portfolio

Logo courtesy of Altiscale

Let me just say right off the bat that I consider Altiscale to be a really nice alternative to Big Data providers such as Hortonworks, Cloudera, or MapR. The Palo Alto, California–based company offers a full cloud-based Big Data platform via the Altiscale Data Cloud offering. In my view, Altiscale has dramatically increased the appeal of its portfolio with the launch of the Altiscale Insight Cloud and a partnership with Tableau, which will bring enhanced versatility and power to Altiscale’s set of Big Data services.

The new Altiscale Insight Cloud

On March 15th, Altiscale released its new Altiscale Insight Cloud solution. In the words of Altiscale, this is a “self-service analytics solution for Big Data.” Altiscale Insight Cloud aims to equip business analysts and information workers with the necessary tools for querying, analyzing, and getting answers from Big Data repositories using the tools that they are familiar with, such as Microsoft Excel and Tableau.

According to the California-based company, this new offering enables Altiscale to provide its customers with a robust self-service tool and an accessible, easy-to-query data lake infrastructure. As such, companies will be able to avoid much of the complex and difficult preparation involved in providing users with easy and fast access to Big Data sources.

To achieve simplicity and agility, Altiscale relies on having a converged architecture, so that on the one hand it can minimize the need for data movement and replication, especially across Big Data sources, and on the other hand, it can eliminate the need for separate relational data stores in order to reduce organizational costs and management efforts.

According to Raymie Stata, chief executive officer (CEO) and founder of Altiscale, the Insight Cloud:

Solves the challenge of bringing Big Data to a broader range of users, so that enterprises can quickly develop new offerings, better target customers, and respond to shifting market or operational conditions. It’s a faster and easier way to get from Big Data infrastructure to insights that drive real business value.

Altiscale considers that its Insight Cloud will be able to replace many more complex and expensive alternatives, allowing organizations to get their hands on Big Data broadly and quickly, without heavy information technology (IT) involvement. As such, Altiscale Insight Cloud will have a significant impact on the speed and facility with which organizations will be able to access and analyze Big Data sources.

As a high-performance, self-service analytics solution, some of the core features of the Altiscale Insight Cloud include:

  • interactive Structured Query Language (SQL) queries,
  • dynamic visualizations,
  • real-time dashboards, and 
  • other reporting and analytics capabilities.

The big news is that with its Insight Cloud offering, Altiscale will be delivering not only a reliable Big Data platform, but also an extension to its infrastructure that can simplify the connection between Big Data and the end user, which is currently a complex, slow, and expensive process for many organizations. This can also significantly reduce the need for expensive, proprietary solutions—not to mention that this new offering can give many business analysts easier and faster access to an organization’s existing Hadoop data lake.

Of course, organizations interested in this offering will need to consider a number of things, including Altiscale’s ability to perform data preparation, cleaning, and profiling to ensure high-quality data. But without a doubt, this is a wise step from Altiscale: providing its customers with the next logical step in the Big Data infrastructure, which is the ability to perform fast and efficient analysis.

Altiscale and Tableau: Business intelligent partnership?

Within a few short weeks of the Altiscale Insight Cloud launch, Altiscale announced a partnership with data discovery and visualization powerhouse Tableau. The partnership with Tableau will, according to both vendors:

make it easier for business analysts, IT professionals, and data scientists to access, analyze, and visualize the massive volumes of data available in Hadoop.

Additionally, according to Dan Kogan, director of product marketing at Tableau:

Altiscale shares our mission to help people see and understand their data. Partnerships with leading Hadoop and Spark providers such as Altiscale help us to bring rich visual analytics to anyone within the enterprise looking to derive value from data.

Users can now connect Tableau to the Altiscale Insight Cloud directly via Open Database Connectivity (ODBC), the standard application programming interface (API) for accessing database management systems (DBMSs). Once connected, Altiscale Insight Cloud will enable users to create visualizations and perform analysis much as they would with any other database.
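For readers curious what querying over such a connection looks like programmatically, here is a minimal sketch. Tableau handles the ODBC connection itself, so this is purely illustrative: Python’s built-in sqlite3 stands in for an ODBC driver to keep the example self-contained, and the table and column names are hypothetical.

```python
import sqlite3

# In-memory stand-in for a remote data source reachable over ODBC;
# with a real DSN you would use something like pyodbc.connect("DSN=...") instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pos_sales (region TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO pos_sales VALUES (?, ?)",
    [("EMEA", 120), ("EMEA", 80), ("APAC", 50)],
)

# The kind of aggregate query a Tableau worksheet would issue over the connection:
rows = conn.execute(
    "SELECT region, SUM(units) FROM pos_sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('APAC', 50), ('EMEA', 200)]
```

The point is that ODBC makes the remote Big Data source look like an ordinary SQL database to the client tool, which is exactly what lets Tableau treat the Insight Cloud like any other database.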

Users will be able to use Tableau’s easy drag-and-drop features to filter and analyze data and derive insights, creating visualizations that can later be published to Tableau Server. Additionally, a noteworthy feature allows users to reuse intermediate solutions provided by Altiscale partners, so that data can first be aggregated and cataloged prior to creating visualizations with Tableau, adding extra flexibility and power to the Altiscale-Tableau connection.

Of course, the first thing that stands out from this partnership is the opportunity for thousands of users on both ends, and from different disciplines, to use an appealing and easy-to-use tool such as Tableau on the one hand, and on the other, to easily crack the data coming from large and complex data repositories residing in Hadoop.

This partnership shows how Big Data, analytics, and business intelligence (BI) providers are moving industry-wide to narrow the functional gaps between Big Data sources and their availability for analysis, while widening the number of options for incorporating Big Data within enterprise analytics strategies.

While such a partnership is not at all surprising, it is relevant to the continuous evolution and maturity of new enterprise BI and analytics platforms.

But what do you think? Of course, I look forward to hearing your comments and suggestions. Drop me a line, and I’ll respond as soon as possible.


Wednesday, March 30, 2016

Hortonworks’s New Vision for Connected Data Platforms

Courtesy of Hortonworks
On March 1, I had the opportunity to attend this year’s Hortonworks Analyst Summit in San Francisco, where Hortonworks announced several product enhancements and new versions and a new definition for its strategy going forward.

Hortonworks seems to be making a serious attempt to take over the data management space while maintaining a commitment to open source, and especially to the Apache Foundation. Thus, as Hortonworks keeps gaining momentum, it is also consolidating its corporate strategy and bringing a new balance to its message (combining both technology and business).

By reinforcing alliances, and at the same time moving further towards the business mainstream with a more concise messaging around enterprise readiness, Hortonworks is declaring itself ready to win the battle for the big data management space.

The big question is whether the company’s strategy will be effective enough to succeed at this goal, especially in a market already overpopulated and fiercely defended by big software providers.

Digesting Hortonworks’s Announcements
The announcements at the Hortonworks Analyst Summit included news on both the product and partner fronts. With regard to products, Hortonworks announced new versions of both the Hortonworks Data Platform (HDP) and Hortonworks DataFlow (HDF).

HDP—New Release, New Cycle
Alongside specific features to improve performance and reinforce ease of use, the latest release, HDP 2.4 (figure 1), includes Spark 1.6, the latest generation of Apache’s large-scale data processing framework, along with Ambari 2.2, Apache’s project for making Hadoop management easier and more efficient.

The inclusion of Ambari seems to be key to providing a solid, centralized management and monitoring tool for Hadoop clusters.

Figure 1. Hortonworks emphasizes enterprise readiness for its HDP version
(Image courtesy of Hortonworks)

Another key announcement with regard to HDP is a new release cycle, which aims to provide users with a consistent product featuring core stability. Under the new cycle, core HDP services such as HDFS, YARN, and MapReduce, as well as Apache ZooKeeper, will be released yearly, aligned with the version of Apache Hadoop in the “ODPi Core,” currently 2.7.1. This can provide standardization and ensure a stable software base for mission-critical workloads.

On the flip side, the extended services that run on top of the Hadoop core, including Spark, Hive, HBase, Ambari, and others, will be released continually throughout the year so that these projects stay up to date.

Last but not least, HDP’s new version also comes with the new SmartSense 1.2, Hortonworks’s issue resolution application, featuring automatic scheduling and uploading, as well as over 250 new recommendations and guidelines.

Growing NiFi to an Enterprise Level
Along with HDP, Hortonworks also announced version 1.2 of HDF, Hortonworks’s offering for managing data in motion by collecting, manipulating, and curating data in real time. The new version includes new streaming analytics capabilities for Apache NiFi, which powers HDF at its core, and support for Apache Storm and Apache Kafka (figure 2).

Another noteworthy feature coming to HDF is its support for integration with Kerberos, a feature which will enable and ease management of centralized authentication across the platform and other applications. According to Hortonworks, HDF 1.2 will be available to customers in Q1 of 2016.

Figure 2. Improved security and control added to Hortonworks new HDF version
(Image courtesy of Hortonworks)

Hortonworks Adds New Partners to its List
The third announcement from Hortonworks at the conference was a partnership with Hewlett Packard Labs, the central research organization of Hewlett Packard Enterprise (HPE).

The collaboration is mainly a joint effort to enhance the performance and capabilities of Apache Spark. According to Hortonworks and HPE, it will focus on the development and analysis of a new class of analytic workloads that benefit from using large pools of shared memory.

Scott Gnau, Hortonworks’s chief technology officer, said with regard to the collaboration agreement:

This collaboration indicates our mutual support of and commitment to the growing Spark community and its solutions. We will continue to focus on the integration of Spark into broad data architectures supported by Apache YARN as well as enhancements for performance and functionality and better access points for applications like Apache Zeppelin.

According to both companies, this collaboration has already generated interesting results which include more efficient memory usage and increased performance as well as faster sorting and in-memory computations for improving Spark’s performance.

The results of this collaboration will be contributed back to the Apache Spark community as new technology, with beneficial impacts for this important piece of the Apache Hadoop ecosystem.

Commenting on the new collaboration, Martin Fink, executive vice president and chief technology officer of HPE and a Hortonworks board member, said:

We’re hoping to enable the Spark community to derive insight more rapidly from much larger data sets without having to change a single line of code. We’re very pleased to be able to work with Hortonworks to broaden the range of challenges that Spark can address.

Additionally, Hortonworks signed a partnership with Impetus Technologies, Inc., another solution provider built on open source technology. The agreement includes collaboration around StreamAnalytix™, an application that provides tools for rapid, low-code development of real-time analytics applications using Storm and Spark. Both companies aim for HDF and StreamAnalytix used together to give businesses a complete and stable platform for the efficient development and delivery of real-time analytics applications.

But The Real News Is …
Hortonworks is rapidly evolving its vision of data management and integration, and this was, in my opinion, the biggest news of the analyst event. Hortonworks’s strategy is to integrate the management of both data at rest (data residing in HDP) and data in motion (data HDF collects and curates in real time), as being able to manage both can power actionable intelligence. It is in this context that Hortonworks is working to increase integration between the two platforms.

Hortonworks is now taking a new go-to-market approach to increase the quality and enterprise readiness of its platforms. Its marketing message is changing too, along with work to ensure that ease of use removes barriers to end-user adoption. The Hadoop-based company now sees the need to take a step further and convince businesses that open source does more than just do the job; it is in fact becoming the quintessential tool for any important data management initiative—and, of course, that Hortonworks is the best vendor for the job. Along these lines, Hortonworks is taking steps to provide Spark with enterprise-ready governance, security, and operations to ensure readiness for rapid enterprise integration, to be gained through the inclusion of Apache Ambari and other Apache projects.

One additional yet important aspect of this strategy is Hortonworks’s work on enterprise readiness, especially issue tracking (figure 3), monitoring for mission-critical workloads, and security reinforcement.

Figure 3. SmartSense 1.2 includes more than 250 recommendations
(Image courtesy of Hortonworks)

It will be interesting to see how this new strategy works for Hortonworks, especially within the big data market where there is extremely fierce competition and where many other vendors are pushing extremely hard to get a piece of the pie, including important partners of Hortonworks.

Taking its data management strategy to a new level is indeed bringing many opportunities for Hortonworks, but these are not without challenges as the company moves into the broader enterprise footprint of the data management industry.

What do you think about Hortonworks’s new strategy in data management? If you have any comments, please drop me a line below and I’ll respond as soon as I can.


Tuesday, March 8, 2016

Creating a Global Dashboard: The GDELT Project

There is probably no bigger dream for a data geek like myself than creating the ultimate data dashboard or scorecard of the world. One that summarizes and enables the analysis of all the data in the world.

Well, for those of you who have also dreamt about this, Kalev H. Leetaru, a senior fellow at the George Washington University Center for Cyber & Homeland Security, has tapped into your dreams—and is working on something in this realm. Leetaru, whom some have called “The Wizard of Big Data,” is developing a platform for monitoring and better understanding how human society works.

The project, called the Global Database of Events, Language, and Tone, or simply The GDELT Project, is an ambitious endeavor created to “crack” the social numbers of the world, with the aim of improving our understanding of human society.

As described by the folks at GDELT:

The GDELT Project came from a desire to better understand global human society and especially the connection between communicative discourse and physical societal-scale behavior. The vision of the GDELT Project is to codify the entire planet into a computable format using all available open information sources that provides a new platform for understanding the global world.

To do this, The GDELT Project has collected information dating back to 1979 and keeps updating it regularly, so its catalogs are always fresh. According to GDELT, the project already has more than a quarter billion event records in more than 300 categories. It also keeps up to date a massive network diagram that connects each individual with all existing entities and events in the world, such as locations, organizations, themes, emotions, and other data.

Information is gathered from many sources, including Google, Google Ideas, Google News, the Internet Archive, and BBC Monitoring, among many others.

So what makes The GDELT Project so interesting?

Well, it’s a perfect opportunity for data aficionados to lay their hands on social data from around the world in three different ways:

  1. Using GDELT Analysis Service, a free cloud-based offering that includes tools and services to visualize, explore, and export the data.
  2. Using the complete dataset available via Google’s BigQuery service.
  3. Downloading data in CSV format.

This allows different types of users to get their hands on the data in the way that suits them best. So, for immediate consumption and analysis, users can go with the first option. Users with more specific requirements or with complex projects can use the data provided by the second or third option.
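As a rough illustration of option 3, here is a small Python sketch that parses GDELT-style tab-delimited rows and computes a simple per-day aggregate. The three columns used here (date, CAMEO event code, Goldstein score) are a simplified subset of the real multi-column schema, and the sample values are made up:

```python
import csv
import io

# Tiny in-memory stand-in for a downloaded GDELT daily extract
# (real files are tab-delimited, headerless, and far wider than this).
sample = (
    "20160301\t043\t2.8\n"
    "20160301\t190\t-10.0\n"
    "20160302\t043\t2.8\n"
)

events = [
    {"date": d, "code": c, "goldstein": float(g)}
    for d, c, g in csv.reader(io.StringIO(sample), delimiter="\t")
]

# Average Goldstein (cooperation/conflict) score per day:
days = {}
for e in events:
    days.setdefault(e["date"], []).append(e["goldstein"])
avg = {d: sum(scores) / len(scores) for d, scores in days.items()}
print(avg)  # {'20160301': -3.6, '20160302': 2.8}
```

The same few lines scale conceptually to the full dataset: download a daily file, parse the tab-delimited rows, and aggregate whatever dimension interests you.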

Whichever way you choose to access the worldwide data, this could be a great opportunity for you, my dear data junkie, to explore and embark upon a data deluge journey for a new school, entrepreneurial, or just playtime project.

This is just a brief intro to a really cool project. I’ll update you on major advancements of The GDELT Project as they come along.

In the meantime, I would encourage you to have a look at this nice 20-minute video about The GDELT Project.

As always, you can also drop me a line below. Enjoy.


