Operational Data Finds New Use Beyond the Factory Floor

A survey of industrial companies shows growing interest in applying big data analytics and advanced visualization to operational data.

Index

Introduction
Summary
Survey Participants
Results
Conclusion
Preparing PI System Data for Operational Analytics

Introduction

Big data analytics is attracting a great deal of media attention. The world of information technology is seeing a growing number of analytic tool vendors and a rapid rise in the demand for data scientists. Big data analytics has found its calling in areas such as consumer marketing, where buyers leave breadcrumb clues about what they are likely to purchase next. The industrial world has different types of problems, mostly focused on operations, but these companies are employing similar statistical disciplines for description, prediction, and optimization.

We surveyed a sample of PI System users in North America and Europe to understand what operational analytic initiatives they are planning to address with their PI System data. OSIsoft's PI System is a trusted source for storing and managing operational data in the world's largest industrial markets, with users at 1,000 leading utilities and 95% of the largest oil and gas companies.

Summary

Sixty-three percent of the respondents, representing 68% of the surveyed companies, are performing operational analytics today or will be in the next 18 months.

Thirty-nine percent of the respondents will be moving PI System data outside of the PI System into a data warehouse or data lake to create some form of operations analytics. Another 32% of the respondents will be using business intelligence visualization platforms like Tableau or Spotfire to visualize PI data.

Forty-eight percent of the respondents are using three to five data sets to generate insights from operational analytics. An additional 22% of respondents are using six or more data sets.

The top five reasons for operational analytic initiatives are improving overall business performance, gaining process efficiency, maintaining asset health, better quality tracking, and increasing energy savings.

Survey Participants

Who Responded

[Bar chart: share of respondents by industry group: Utilities, Oil & Gas, Chemicals, Pharmaceuticals, Metals or Mining, Pulp & Paper, Food & Beverage, Water or Wastewater, OSIsoft Partner, Other.]
Figure 1. Survey respondents distributed across 10 industry segments.

A total of 698 respondents from 388 distinct companies and 10 broad industry groups responded to the survey. Utilities (31%), Oil & Gas (17%), and Chemicals (14%) supplied the greatest number of respondents.

Survey Results

Industrial Companies Planning to Use Operational Analytics

[Bar chart: "Does your company have plans for operations analytics initiatives, if so when?" Answers (Now, Next 6 months, 7 to 18 months, Beyond 18 months, Don't have any) shown as % of companies and % of respondents.]
Figure 2. Present engagement and future plans for operational analytics

OSIsoft customers are interested in operational analytics. Figure 2 shows that 68% of companies are now using or planning to use operational analytics within the next 18 months. That total comprises 33% of companies that are using operational analytics now, 15% that plan to within six months, and 20% that plan to in the next seven to 18 months.

Creating Operational Insights Requires Multiple Data Sources

Performing operational analytics typically requires multiple data sets generated by different systems. Figure 3 shows that the largest percentage of respondents (54%) reported using three to five data sources. Moreover, 12% indicated they use six to 10 data sources to provide relevant insights, and 10% use 11 or more data sources.

Multiple data sources must be time-aligned and matched for effective analysis. Acting as a unifying data layer, the PI System aggregates and organizes operational data by time, asset type, asset location, and a variety of other parameters.

[Bar chart: "How many data sources are typically brought together for your operational analytics?" 1 to 2: 25%; 3 to 5: 54%; 6 to 10: 12%; 11 or more: 10%.]
Figure 3. Number of data sources used in operational analytics
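To make the time-alignment requirement concrete, here is a minimal sketch of an "as-of" match: each sensor reading is paired with the most recent record from a second source, provided that record is recent enough. The function name, the sample values, and the 30-minute tolerance are illustrative assumptions; this is not how the PI System itself performs alignment.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def align_asof(readings, events, tolerance):
    """Attach to each reading the latest event at or before its timestamp.

    readings, events: lists of (timestamp, value), each sorted by timestamp.
    tolerance: maximum allowed age of the matched event (hypothetical rule).
    """
    event_times = [ts for ts, _ in events]
    aligned = []
    for ts, value in readings:
        # Index of the last event whose timestamp is <= the reading's timestamp.
        i = bisect_right(event_times, ts) - 1
        if i >= 0 and ts - event_times[i] <= tolerance:
            aligned.append((ts, value, events[i][1]))
        else:
            aligned.append((ts, value, None))  # no sufficiently recent event
    return aligned


# Illustrative data only: temperature readings and one maintenance record.
t0 = datetime(2016, 1, 1, 8, 0)
readings = [(t0 + timedelta(minutes=m), v)
            for m, v in [(0, 71.2), (5, 71.9), (10, 73.4), (90, 70.1)]]
events = [(t0 + timedelta(minutes=4), "bearing inspected")]

aligned = align_asof(readings, events, tolerance=timedelta(minutes=30))
for row in aligned:
    print(row)
```

The binary search keeps the match efficient when both sources are already sorted by time, which is the usual case for historian data.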

Data Usage Differs by Industry and Individual Role

While the largest group of respondents is using three to five data sets for operational analytics, many are using six or more, hinting at the growing use of a broader set of data sources for deeper insight. A greater number of data sets makes it necessary to aggregate PI System data with other enterprise data in storage mediums such as data warehouses or data lakes, and to pre-process that data into an analytics-ready record. Undoubtedly, many enterprises are performing a great deal of master data management and data engineering.

Figure 4 covers the respondents who did not indicate that they lack operational analytics and who answered the question about the number of data sets they typically use. The figure categorizes these 371 respondents by industry. The utilities industry has the largest number of respondents (133), with 9% of the total (32) using one or two data sets, 20% (75) using three to five data sets, 4% (14) using six to 10, and 3% (12) using 11+ data sets.

[Table: "How many data sources are typically brought together for your operational analytics?" Results by industry (n=371). Rows: Utilities, Oil & Gas, Chemicals, Pharmaceuticals, Metals or Mining, Food & Beverage, Pulp & Paper, Water or Wastewater, Other Industries, Grand Total. Columns: 1 to 2, 3 to 5, 6 to 10, 11+, Total Respondents.]
Figure 4. Number of respondents by industry and number of data sets used
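The utilities percentages quoted above are shares of all 371 respondents, not of the 133 utilities respondents. A short check, using only the counts stated in the text (the variable names are ours), reproduces them:

```python
# Counts for the utilities industry as quoted in the text.
total_respondents = 371
utilities = {"1 to 2": 32, "3 to 5": 75, "6 to 10": 14, "11+": 12}

# The four buckets should add up to the utilities subtotal of 133.
utilities_total = sum(utilities.values())

# Each share is expressed against all respondents, rounded to a whole percent.
shares = {bucket: round(100 * count / total_respondents)
          for bucket, count in utilities.items()}

print(utilities_total)
print(shares)
```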

Out of the 371 respondents, it's clear that between three and five data sets are used by the majority, with the Operations Supervisor role typically using only one or two data sets. Conversely, the operations or plant manager respondents said that they use the largest number of data sets, with 17% using 11+ data sets and 14% using six to 10 data sets.

Most analytics data projects follow a process that moves them out of the realm of data science and into a repeatable, semi-automated data workflow. Tools that can simplify and automate the enterprise's data analytics workflow should be welcomed.

[Stacked bar chart: "How many data sources are typically brought together for your operational analytics?" Results by job role. Roles: Automation/Process Controls Engineer, Consultant/Solution Provider, IT (Information Technology), Maintenance Manager or Engineer, Manufacturing Process Engineer, Operations or Plant Manager, Operations Supervisor, PI System Administrator, Plant Engineer, System Integrator, Other. Axis: percent of respondents by job role.]
Figure 5. Percentage of total respondents by job role and number of data sets used

Biggest Obstacle to Performing Operational Analytics

[Bar chart: "What are the most difficult obstacles to performing analytics projects using IT and OT data sets?" Obstacles: properly cleansing, aligning, matching and aggregating data sets; organizational boundaries, misalignment or lack of skills; identifying the most relevant data sets; identifying the right questions to ask of the data.]
Figure 6. Most difficult obstacles to performing analytics that integrate IT and OT data sets

According to Figure 6, 67% of respondents say that cleaning, organizing and aligning data is the most challenging obstacle. Given that some of the respondents were using six data sets or more, the process of preparing data can be arduous and fraught with potential for error.

This finding is in line with other studies. According to an InfoWorld article dated March 24th, 2016, "Hottest job? Data scientists say they're still mostly digital janitors," 60% of a data scientist's job consists of cleaning and organizing data. Panning for gold in operational analytics data is a difficult process. Tools that can automate the process of cleansing and organizing data will save time as well as improve consistency and repeatability.
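As an illustration of the kind of cleansing step that lends itself to automation, the sketch below sorts raw samples, drops duplicate timestamps and out-of-range values, and reports gaps in the remaining series. The function name, the valid range, and the gap threshold are hypothetical choices made for this example; real rules depend on the instrument and the process.

```python
from datetime import datetime, timedelta


def cleanse(samples, low, high, max_gap):
    """Return (clean, gaps) for a list of (timestamp, value) samples.

    Keeps the first valid reading per timestamp, discards missing or
    out-of-range values, and records gaps longer than max_gap.
    All thresholds are assumptions supplied by the caller.
    """
    clean, gaps, seen = [], [], set()
    # Sort by timestamp only, so missing (None) values never get compared.
    for ts, value in sorted(samples, key=lambda s: s[0]):
        if ts in seen:
            continue  # duplicate timestamp
        if value is None or not (low <= value <= high):
            continue  # missing or implausible reading
        if clean and ts - clean[-1][0] > max_gap:
            gaps.append((clean[-1][0], ts))
        seen.add(ts)
        clean.append((ts, value))
    return clean, gaps


# Illustrative data only.
t0 = datetime(2016, 1, 1, 8, 0)
samples = [
    (t0, 50.0),
    (t0, 50.0),                             # duplicate
    (t0 + timedelta(minutes=1), -999.0),    # sentinel / bad value
    (t0 + timedelta(minutes=2), 51.0),
    (t0 + timedelta(minutes=30), 52.0),     # follows a long gap
]

clean, gaps = cleanse(samples, low=0.0, high=200.0,
                      max_gap=timedelta(minutes=10))
print(clean)
print(gaps)
```

Encoding the rules in one function is what makes the step repeatable: the same thresholds are applied every time the data set is refreshed.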

PI System Data to be Used in Data Warehouses, Data Lakes, and Visualization Platforms

[Bar chart: "Plans for using PI data with visualization platforms like Tableau?" Answers: Now, Within 6 months, 7 to 18 months, No plans, Don't know.]
Figure 7. Respondents planning to use PI System data in business intelligence (BI) tools

In Figure 7, 32% of the respondents said they are using or are planning to use PI System data in BI tools. This indicates a significant interest in using operational data from the PI System in new areas, including visualization platforms such as Tableau and Spotfire.

[Bar chart: "Do you have plans to use PI System data in data warehouse or big data initiatives?" Answers: Now, Within 6 months, 7 to 18 months, No plans, Don't know, Other.]
Figure 8. Plans to use PI System data in data warehouses or big data initiatives

In Figure 8, 39% of the respondents are using or plan to use operational data from the PI System in data warehouses or in big data initiatives within the next 18 months. Data warehouses contain preorganized data that enables fast reporting on a known set of business questions. Data lakes contain both structured and unstructured data and often require a data scientist to make sense of what's there. Data lakes often act as a catch-all for data that needs metadata to perform better ad hoc analysis.

Analysis in data warehouses or data lakes is not typically performed in real time. Instead, these batch analytics integrate historic information from a variety of systems, such as equipment maintenance histories, employee work schedules, energy sources, production data, supply chain data and a lot more. Converting operational data into an analytics-friendly format is often challenging because its primary use is on the plant floor, where real-time decisions are made about production processes. Rolling the data up to aggregate business systems requires the format to be more universal and time-aligned.
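One simple way to picture that roll-up is to bucket irregular plant-floor readings into fixed hourly intervals, so they line up with business records that are kept per hour or per shift. The following is a minimal sketch under that assumption; the function name and data are illustrative, and it uses a plain arithmetic mean rather than a time-weighted average.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_average(samples):
    """Average (timestamp, value) samples within each clock hour.

    Returns a dict keyed by the start of the hour. A plain mean is used
    here for brevity; it treats every sample as equally weighted.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


# Illustrative data only: three irregularly timed flow readings.
samples = [
    (datetime(2016, 1, 1, 8, 5), 10.0),
    (datetime(2016, 1, 1, 8, 35), 20.0),
    (datetime(2016, 1, 1, 9, 10), 30.0),
]

hourly = hourly_average(samples)
for hour, avg in hourly.items():
    print(hour.isoformat(), avg)
```

For unevenly sampled signals a time-weighted average is usually more representative, since a plain mean over-weights periods with dense sampling.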

Motivation for Operational Analytics

[Bar chart: "Why invest in operational analytics?" Categories: Improve Overall Business, Process Efficiency, Asset Health, Quality Tracking, Energy Savings, Regulatory Safety or Compliance, Security, Other Reasons.]
Figure 9. Leading reasons to invest in operational analytics

Of the 375 respondents answering the question, PI System customers are investing in operational analytics for five main reasons: boosting overall business performance (290, or 77%), gaining process efficiency (286, or 76%), maintaining asset health (201, or 54%), improving quality tracking (174, or 46%), and increasing energy savings (153, or 41%). All of these are well-known business impacts of the PI System. This confirms OSIsoft's observation that these reasons are basic driving forces in process industries and fundamental to helping customers get the most value out of the PI System.

[Bar chart: "For your analytics, visualization and reporting needs, which data sources are you combining with your PI System data?" Categories: Equipment Maintenance, Energy Source, Vibration, Weather, Supply Chain, GIS, PI Data only, Employee Schedules, Other.]
Figure 10. Most popular data sources combined with PI data

Of the 425 respondents who answered this question, equipment maintenance (cited by 266, or 63%) was the leading data source used in conjunction with PI System data. This was followed by energy source information (173, or 41%) and vibration (159, or 37%) to round out the top three. Seventy-two respondents said that they don't use PI System data with data from other systems. This suggests that condition-based maintenance and energy usage are primary use cases for operational analytics and PI System data.

Data Visualization Tools Used with PI Data

While data visualization tools used by PI System customers include a range of options, the OSIsoft community still uses a fair number of customer-developed visualization applications.

Figure 11. Microsoft Excel, Power BI, Custom Development and Tableau are the predominant visualization means being used in conjunction with the PI System today

While we did not ask a question to identify a trend, we expect visualization tools to mirror the growth of other enterprise software categories, resulting in the widespread adoption of market-leading platforms at the expense of these earlier custom visualizations.

Top Analytics Technology Vendors in Use Today

Figure 12 question: Which of the following technologies does your operating group USE TODAY for data warehousing, analytics and visualization?
Figure 13 question: Which of the following technologies is your operating group planning to INVEST MORE significantly in over the next 8-12 months?

Technology (# respondents)                 Use today    Invest more
Microsoft databases and their BI tools     346          228
Oracle databases and their BI tools        241          121
Other                                      112          143
SAP Hana                                   104          100
GE Proficy, Predix, or Smart Signal        68           43
Open source tools like R/R Studio          60           -
Tableau                                    -            -
Hadoop, Spark and its ecosystem            49           57
Tibco/Spotfire                             -            -
SAS Analytics                              -            -

Figure 12. Top technologies in use for data warehousing
Figure 13. Future technology investment

Microsoft was the top selected vendor in the installed base of business intelligence and data warehouse technologies, with 346 of the 584 respondents (59%) using Microsoft business intelligence and data warehouse technologies today, followed by Oracle (241, or 41%), Other (112, or 19%), and SAP Hana (104, or 18%).

The influence of the previous installed base is also evident. The top four vendors are likely to be the future top four. However, investment in Hadoop seems to be picking up, with 49 respondents saying they are using Hadoop today and 57 saying that they will be investing in the technology in the next 12 months, as shown in Figure

Conclusion

There are certainly challenges with getting the analytic record in shape for performing multi-set operational analytics. This survey points out that there is complexity in matching and aligning the data sets typically used for operational analytics. Customers are at different stages of their quest to find operational insights; some can get there using only the data that already exists in the PI System, while others are blending 11 or more data sets to get the necessary insights. There is some variation in the number of data sets used by industry and by job role.

Microsoft, both now and in the foreseeable future, will be a dominant technology vendor within the OSIsoft customer base. Hadoop should show some growth in the next 12 months, most likely for the creation of data lakes, a low-cost repository for both structured and unstructured data.

Preparing PI System Data for Operational Analytics

OSIsoft has recognized the increasing interest in taking PI System data out of its traditional environment. We have designed the PI Integrator for Business Analytics to simplify and programmatically deal with the complexities of this machine-generated, time-series data. PI System customers, such as CEMEX, have reduced the analytic data workflow from months to minutes while improving the accuracy of their analyses. You can watch or read how CEMEX has benefited from the PI Integrator for Business Analytics.

To learn about OSIsoft's PI Integrator for Business Analytics visit:

All companies, products and brands mentioned are trademarks of their respective trademark owners. Copyright 2016 OSIsoft, LLC 777 Davis Street, San Leandro, CA SEN