The Mormon church is moving forward with its plan to arm missionaries with iPad minis and broaden their proselytizing to social media. A test program that began last fall with 6,500 missionaries serving …
SALT LAKE CITY (AP) — The Mormon church is expanding a program that gives missionaries iPad minis and broadens their proselytizing to social media.
This blog post was authored by: Murshed Zaman, AzureCAT PM and Sumin Mohanan, DS SDET
With the advent of SQL Server Parallel Data Warehouse (the MPP version of SQL Server) V2 AU1 (Appliance Update 1), PDW got a new name: the Analytics Platform System Appliance or APS. The name changed with the addition of Microsoft’s Windows distribution of Hadoop (HDInsight or HDI) and PDW sharing the same communication fabric in one appliance. Customers can buy an APS appliance with PDW or with PDW and HDI in configurable combinations.
Used in current versions of PDW, PolyBase is a technology that allows PDW users to query HDFS data. SQL users can quickly get results from Hadoop data without learning Java or C#.
Features of PolyBase include:
- Schematization of Hadoop data in PDW as external tables
- Querying Hadoop data
- Querying Hadoop data and joining with PDW tables
- High speed export and archival of PDW data into Hadoop
- Creating persisted tables in PDW from Hadoop data
In V2 AU1, PolyBase improvements include:
- Predicate push-down for queries in Hadoop as Map/Reduce jobs
- Statistics on Hadoop data in PDW
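To make the idea of predicate push-down concrete, here is a minimal Python sketch. It is not PolyBase code: the sample rows and the function names `scan_without_pushdown` and `scan_with_pushdown` are assumptions made up for illustration. It only shows why evaluating a filter at the source reduces the number of rows that have to be moved into the warehouse.

```python
# Hypothetical stand-in for rows stored on the Hadoop side.
HADOOP_ROWS = [
    {"id": 1, "region": "West", "amount": 10},
    {"id": 2, "region": "East", "amount": 25},
    {"id": 3, "region": "West", "amount": 40},
    {"id": 4, "region": "South", "amount": 5},
]


def scan_without_pushdown(rows, predicate):
    """Move every row first, then filter on the warehouse side."""
    transferred = list(rows)  # all rows cross the wire
    result = [row for row in transferred if predicate(row)]
    return result, len(transferred)


def scan_with_pushdown(rows, predicate):
    """Filter at the source so that only matching rows are moved."""
    transferred = [row for row in rows if predicate(row)]
    return transferred, len(transferred)


def is_west(row):
    return row["region"] == "West"


rows_a, moved_a = scan_without_pushdown(HADOOP_ROWS, is_west)
rows_b, moved_b = scan_with_pushdown(HADOOP_ROWS, is_west)
print(moved_a, moved_b)
```

Both calls return the same rows; the difference is only in how many rows were transferred before the filter was applied.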
Another new feature introduced in PDW V2AU1 is the capability to query data that resides in Microsoft Azure Storage Accounts. Just like HDFS data, PDW can place a schema on data in Microsoft Azure Storage Accounts and move data from PDW to Azure and back.
With these new features and improvements, APS has become a first-class citizen in analytics for any type of data. Any company that has Big Data requirements and wants a highly scalable, scale-out data warehouse appliance can use APS.
Here are four cases that illustrate how different industries are leveraging APS:
One: Retail brand vs. Name brand
Retail companies that use PDW may also want to harvest and curate data from their social analytics sites. This data provides insight into their products and helps them understand customer behavior. Using APS, a company can offer the right promotion at the right time to the right demographic. The data also lets companies find brand recommendations coming from a friend, relative, or trusted support group, which can be much more effective than marketing literature alone. By monitoring and profiling social media, these companies can also gain a competitive advantage.
Today’s empowered shoppers want personalized offers that appeal to their emotional needs. Using social media, retailers offer promotions that are tailored to individuals through real-time analytics. The process starts by ranking blogs, forums, Twitter feeds, and Facebook posts against predetermined KPIs revealed in these posts and conversations. Retail organizations analyze the data and use it to profile shoppers and personalize future marketing campaigns. Measurable sales data reveals the effectiveness of each campaign, and the whole process starts again with the insight gained.
In this example, PDW houses the relational sales data and Hadoop houses the social sentiment data. PDW with a built-in HDI region gives the company the tools to analyze both data sources in a timely manner, so it can react and make changes.
Retail store APS diagram:
Two: Computer Component Manufacturing
Companies that generate massive amounts of electronic test data can get valuable insights from APS. Test data is usually a good candidate for Hadoop because of its key-value structure (JSON or XML).
One example in this space is a computer component manufacturer. Because of the volume, velocity, and variety of this data (e.g., Sort/Class), a conventional ETL process can be very resource-intensive. Using APS, companies can gain insight from their data by putting the semi-structured (key-value pair) data into an HDI region and other complementary structured data sources (e.g., Wafer Electrical Test) into PDW. With the PolyBase query feature, these two types of data can easily be combined and evaluated for success/failure rates.
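The combination described above can be sketched in a few lines of Python. This is an illustration only, not APS or PolyBase code: the record fields (`unit_id`, `lot`, `result`), the `LOT_TABLE` lookup, and the function `failure_rate_by_fab` are all hypothetical. The sketch parses semi-structured JSON test records, matches each one to a row of structured reference data, and computes a failure rate per group.

```python
import json
from collections import defaultdict

# Hypothetical semi-structured test records, as they might sit in Hadoop.
RAW_TEST_LOG = [
    '{"unit_id": "U1", "lot": "A", "result": "pass"}',
    '{"unit_id": "U2", "lot": "A", "result": "fail"}',
    '{"unit_id": "U3", "lot": "B", "result": "pass"}',
    '{"unit_id": "U4", "lot": "B", "result": "pass"}',
]

# Hypothetical structured reference data, as it might sit in a relational table.
LOT_TABLE = {"A": {"fab": "Fab1"}, "B": {"fab": "Fab2"}}


def failure_rate_by_fab(raw_lines, lot_table):
    """Join parsed test records to lot data and return failures / total per fab."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for line in raw_lines:
        record = json.loads(line)
        lot = lot_table.get(record["lot"])
        if lot is None:
            continue  # no matching structured row, as in an inner join
        fab = lot["fab"]
        totals[fab] += 1
        if record["result"] == "fail":
            failures[fab] += 1
    return {fab: failures[fab] / totals[fab] for fab in totals}


rates = failure_rate_by_fab(RAW_TEST_LOG, LOT_TABLE)
print(rates)
```

In the appliance, the equivalent join would be expressed as a query over a PDW table and an external table rather than as a loop in application code.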
Computer Component Manufacturing Diagram:
Three: Game Analytic Platform for online game vendors
PDW with HDI regions can offer online game companies a complete solution for deriving insights from their data. MMORPGs (Massively Multiplayer Online Role-Playing Games) are good examples of where APS can deliver value. Game engines produce large volumes of transactional data (events such as which avatar was killed in the current active game) and a lot of semi-structured data, such as activity logs containing chat data and historical logs. APS is well suited to loading the transactional data into the PDW workload and the semi-structured data into the HDI region. The data can then be used to derive insights such as:
- Customer retention -- Discovering when to give customers offers and incentives to keep them in the game
- Improving game experience -- Discovering where customers are spending more time in the game, and improving in-game experience
- Detecting fraudulent gaming activities
Currently, these companies deal with multiple solutions and products to achieve this goal. APS provides a single solution to power both their transactional and non-transactional analytics.
Four: Clickstream analysis of product websites for targeted advertising
In the past, a relational database system was sufficient to satisfy the data requirements of a medium-scale production website. Ever-increasing competition and advancements in technology have changed the way in which websites interact with customers. Apart from storing data that customers explicitly provide to the company, sites now record how customers interact with the website. For example, when a registered user browses a particular car model, additional targeted advertisements and offers can be sent to that user.
This scenario can be captured using collected clickstream data and the Hadoop ecosystem. APS serves as a complete solution for these companies by offering the PDW workload to store and analyze transactional data, combined with an HDI region to derive insights from the clickstream data.
This solution also applies to third-party companies that specialize in targeted advertising campaigns for their clients.
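As a rough illustration of the kind of insight clickstream data can yield, the Python sketch below picks the product each user has viewed most often, provided the user viewed it at least a minimum number of times. The event layout, the sample data, the `min_views` threshold, and the function `top_interest` are all assumptions made for this example; a real pipeline would run over far larger logs in Hadoop.

```python
from collections import Counter

# Hypothetical clickstream events: (user_id, product_page)
CLICKS = [
    ("u1", "sedan-x"), ("u1", "sedan-x"), ("u1", "suv-y"),
    ("u2", "suv-y"), ("u2", "suv-y"), ("u2", "suv-y"),
    ("u3", "sedan-x"),
]


def top_interest(clicks, min_views=2):
    """Return {user: most-viewed product} for users with enough repeat views."""
    counts = Counter(clicks)  # counts each (user, product) pair
    best = {}
    for (user, product), views in counts.items():
        current_best_views = best.get(user, (None, 0))[1]
        if views >= min_views and views > current_best_views:
            best[user] = (product, views)
    return {user: product for user, (product, _) in best.items()}


targets = top_interest(CLICKS)
print(targets)
```

The resulting mapping could then be joined with the registered-user data held in the relational store to decide which offers to send.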
While “Big Data” is a hot topic, customers very often ask us which use cases actually apply to them and how they can derive new business value from “Big Data.” We hope these use cases highlight how various industries can leverage their data to mine insights that deliver business value, and showcase how traditional data warehouse capabilities work together with Hadoop.
Visit the Microsoft Analytics Platform System page to learn more.
This blog post will detail how APS gives users the ability to:
- Leverage Power Query, Power Pivot, and Power Map at massive scale
- Iteratively query APS, adding BI on the fly
- Combine data seamlessly from PDW, HDI, and Azure using PolyBase
The Microsoft Analytics Platform System (APS) is a powerful scale-out data warehouse solution for aggregating data across a variety of platforms. The posts “Architecture of the Microsoft Analytics Platform System” and “PolyBase in APS -- Yet another SQL over Hadoop solution?” defined the base architecture of the platform. Here we’ll build on this knowledge to see how APS becomes a key element of your BI story at massive scale.
Let’s start with a business case. Penelope is a data analyst at a US-based restaurant chain with hundreds of locations across the world. She is looking to use the power of the Microsoft BI stack to get insight into the business, both in real time and in aggregate form for the last quarter. With the integration of APS with the Microsoft BI stack, she is able to extend her analysis beyond simple querying. Penelope is able to use the MOLAP data model in SQL Server Analysis Services (SSAS) as a front end to the massive querying capabilities of APS. Using the combined tools, she is able to:
- Quickly access data in stored aggregations that are compressed and optimized for analysis
- Easily update these aggregations based on structured and unstructured data sets
- Transparently access data through Excel’s front-end
Using Excel, Penelope has quick access to all of the aggregations she has stored in SSAS with analysis tools like Power Query, Power Pivot, and Power Map. Using Power Map, Penelope is able to plot the growth of restaurants across America, and sees that lagging sales in two regions, the West Coast and Mid-Atlantic, are affecting the company as a whole.
After Penelope discovers that sales are disproportionately low on the West Coast and in the Mid-Atlantic regions, she can use the speed of APS’ Massively Parallel Processing (MPP) architecture to iteratively query the database, create additional MOLAP cubes on the fly, and focus on issues driving down sales with speed and precision using Microsoft’s BI stack. By isolating the regions in question, Penelope sees that sales are predominantly being affected by two states: California and Connecticut. Drilling down further, she uses Power Chart and Power Pivot to break down sales by menu item in the two states, and sees that the items with low sales in those regions are completely different.
Querying relational data stored in APS can get to the root of an issue, but PolyBase also makes it simple to take advantage of the world of unstructured data, bringing additional insight from sources such as sensors or social media sites. In this way, Penelope is able to incorporate the text of tweets relating to menu items into her analysis. She can use PolyBase’s predicate pushdown ability to filter tweets by geographic region and by mentions of the low-selling items in those regions, honing her analysis. As a result, she discovers that there are two separate issues at play. In California she sees customers complaining about the lack of gluten-free options at restaurants, and in Connecticut she sees that many diners find the food to be too spicy.
So how did Penelope use the power of APS to pull structured data such as Point of Sale (POS), inventory and ordering history, and website traffic, together with social sentiment, into a cohesive, actionable model? By using a stack that combines the might of APS with the low time to insight of Excel. Let’s break down the major components:
- Microsoft Analytics Platform System (APS)
- Microsoft HDInsight
- Microsoft SQL Server Analysis Services (SSAS)
- Microsoft Excel with Power Query, Power Pivot and Power Map
Loading Data in APS and Hadoop
Any analytics team is able to quickly load data into APS from many relational data sources using SSIS. By synchronizing the data flow between their production inventory and POS systems, APS is able to accurately capture and store trillions of transactional rows from within the company. By leveraging the massive scale of APS (up to 6 PB of storage), Penelope doesn’t have to create the data aggregates up front. Instead she can define them later.
Concurrently, her team uses an HDInsight Hadoop cluster running in Microsoft Azure to aggregate all of the individual tweets and posts about the company alongside its menus, locations, public accounts, customer comments, and sentiment. By storing this data in HDInsight, the company is able to use the elastic scale of the Azure cloud and continually update records with real-time sentiment from many social media sites. With PolyBase, Penelope is able to join transactional data with the external tables containing social sentiment data using standard T-SQL constructs.
Creating the External Tables
Using the power of PolyBase, the development team can create external tables in APS connected to the HDInsight instance running in Azure. In two such tables, Tweets and WordCloud, Twitter data is easily collected and aggregated in HDFS. Here, the Tweets table is raw data with an additional sentiment value, and the WordCloud table is an aggregate of all words used in posts about the company.
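To illustrate the shape of a query that joins transactional data with sentiment data, here is a small, self-contained Python sketch that uses an in-memory SQLite database as a local stand-in. It is not APS, PolyBase, or T-SQL: SQLite has no external tables, and although the table names Orders and Tweets come from this scenario, every column name and data value below is an assumption invented for the example.

```python
import sqlite3

# In-memory SQLite database used purely as a local stand-in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Orders (state TEXT, menu_item TEXT, amount REAL);
    CREATE TABLE Tweets (state TEXT, menu_item TEXT, sentiment REAL);

    INSERT INTO Orders VALUES
        ('CA', 'pasta', 120.0), ('CA', 'salad', 300.0), ('CT', 'curry', 90.0);
    INSERT INTO Tweets VALUES
        ('CA', 'pasta', -0.6), ('CA', 'pasta', -0.4),
        ('CT', 'curry', -0.8), ('CA', 'salad', 0.5);
    """
)

# Aggregate each side first, then join on state and menu item and keep
# only the items whose average sentiment is negative.
query = """
    SELECT o.state, o.menu_item, o.total_sales, t.avg_sentiment
    FROM (SELECT state, menu_item, SUM(amount) AS total_sales
          FROM Orders GROUP BY state, menu_item) AS o
    JOIN (SELECT state, menu_item, AVG(sentiment) AS avg_sentiment
          FROM Tweets GROUP BY state, menu_item) AS t
      ON o.state = t.state AND o.menu_item = t.menu_item
    WHERE t.avg_sentiment < 0
    ORDER BY o.state, o.menu_item
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Aggregating each table before joining avoids multiplying order rows by tweet rows, which would otherwise inflate the sales totals.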
Connecting APS and SSAS to Excel
Within Excel, Penelope has the ability to choose how she would like to access the data. At first she uses the aggregations that are available to her via SSAS (typical sales aggregates like menu item purchases, inventory, etc.) through Power Query.
But how does Penelope access the social sentiment data directly from APS? Simple: by using the same data connection tab, Penelope can connect directly to APS and pull in the sentiment data using PolyBase.
Once the process is complete, tables pulled into Excel, as well as their relationships, are shown as data connections.
Once the data connection is created, Penelope is able to create a report using Power Pivot with structured data from the Orders table and the unstructured social sentiment data from HDInsight in Azure.
With both data sets combined in Excel, Penelope can then create a Power Map of the sales data layered with the social sentiment. By diving into the details, she can clearly see issues with sentiment from customers in Connecticut and California.
To learn more about APS, please visit http://www.microsoft.com/aps.
Drew DiPalma – Program Manager – Microsoft APS
Drew is a Program Manager working on Microsoft Analytics Platform System. His work on the team has covered many areas, including MPP architecture, analytics, and telemetry. Prior to starting with Microsoft, he studied Computer Science and Mathematics at Pomona College in Claremont, CA.
By Sophie Knight TOKYO (Reuters) -- Japan Display Inc, the world’s biggest smartphone LCD maker, got a third of its revenue from Apple Inc in the year to March, growing more reliant on the iPhone even as it seeks to bump up orders from fast-growing Chinese smartphone makers. The display maker, which supplies screens for the iPhone along with Sharp Corp and LG Display Co Ltd, is expected to further increase sales to Apple this year with the release of the iPhone 6, which supply chain sources say will sport larger panels than previous generations. Apple squeezes its parts suppliers hard on price, resulting in narrow margins, the sources say, while a high reliance on its product cycle can cause large swings in quarterly profits.