How to Apply Big Data for Fab Operation Efficiency
Ian Huh, Ph.D.
SEMICON WEST 2016, July 14, 2016
1 Today's Speaker
Ian Huh, Ph.D.
Now: Division Head and SVP, Enterprise Solution Business Division. In charge of the big data solution metatron and other solution businesses: cloud, surveillance, etc.
Former experience: Mobile AP Marketing, Samsung Electronics; Consumer Business Division, Samsung Electronics; Flash Memory Marketing, Samsung Electronics; Global Marketing Operations, Samsung Electronics
2 Agenda
What makes big data possible
Telco and big data
How to implement big data in the semiconductor industry
Conclusion
3 Newer Business Paradigm
Managing the data generated by numerous transactions will be the core of competitiveness.
4 Smart manufacturing: We need data to be smart
What are the most important technologies for Smart Manufacturing? (% of respondents, n=166)
Mobile tech. & apps: 75
Big Data analytics: 68
Advanced robotics: 64
IoT/CPS: 62
RFID/Location tech.: 59
Digital Manufacturing: 54
Additive Manufacturing/3D printers: 51
2D barcodes: 46
Cloud computing: 43
Social tech.: 30
Augmented reality: 23
Source: SCM World-MESA International survey (MESA: Manufacturing Enterprise Solutions Association)
Copyright 2016. SK telecom. All rights reserved
5 Big Data: Is it really helpful?
Common objections:
"Big data is a marketing term."
"It is not a new kind of analysis, just a better one."
"We don't need all the data, just a few samples."
"We have already done all types of analytics."
6 Technology: Disruptive technologies are changing the landscape
Intelligence (machine learning): more data beats better algorithms
Rule based (deductive): algorithm -> hypothesis -> observation -> confirmation
Machine learning (inductive): data -> pattern -> tentative hypothesis -> algorithm
Connectivity (network for IoT): cheap network cost
Cheap sensors, ~$10 each: humidity, temperature, motion, location, etc.
Network, ~$2 / month: LoRa, LTE-M
Big Data (distributed computing): computational power, cheap storage and computing
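The contrast between the rule-based (deductive) and machine-learning (inductive) paths can be illustrated with a minimal sketch. The temperature readings, the fixed 80-degree limit, and the function names below are all hypothetical and are not taken from the talk; the "learning" step is deliberately the simplest possible one, a midpoint between class means.

```python
from statistics import mean

# Hypothetical sensor readings: (temperature in deg C, label), 0 = normal, 1 = faulty.
readings = [(61.0, 0), (63.5, 0), (64.2, 0), (66.0, 0),
            (71.5, 1), (73.0, 1), (74.8, 1)]


def rule_based(temp, limit=80.0):
    """Deductive: the limit is fixed up front by an engineer (assumed value)."""
    return int(temp > limit)


def learn_threshold(samples):
    """Inductive: derive the limit from labelled data (midpoint of class means)."""
    normal = [t for t, label in samples if label == 0]
    faulty = [t for t, label in samples if label == 1]
    return (mean(normal) + mean(faulty)) / 2


learned_limit = learn_threshold(readings)


def learned(temp):
    """Classify with the threshold that was inferred from the observations."""
    return int(temp > learned_limit)


# A 72-degree reading passes the hand-written rule but is flagged by the learned one.
print(rule_based(72.0), learned(72.0), round(learned_limit, 2))
```

With more labelled observations the learned limit moves with the data, whereas the fixed rule stays where it was written.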
7 Many data: Diversified data makes real insight
Given N items of data X and M items of data Y, the total value depends on how the data are combined:
Value as individual data: $f(X_1) + f(X_2) + f(X_3) + \dots + f(X_N) + f(Y_1) + f(Y_2) + f(Y_3) + \dots + f(Y_M)$
Value as integrated data: $f(X_1, X_2, X_3, \dots, X_N) + f(Y_1, Y_2, Y_3, \dots, Y_M)$
Value as mashed-up data: $f(X_1, Y_1, X_2, Y_2, \dots, X_N, Y_M)$
[Chart: total value compared across Individual, Integrated, and Mashed-Up data]
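As a small illustration of why mashed-up data can carry value that neither source has alone, the sketch below joins two hypothetical sources by lot ID and computes a correlation that cannot be seen in either source by itself. The lot IDs, values, and the `pearson` helper are assumptions made for this example only.

```python
from statistics import mean

# Hypothetical source X: chamber temperature per lot (deg C).
temperature = {"lot1": 60.0, "lot2": 62.0, "lot3": 64.0, "lot4": 66.0}
# Hypothetical source Y: yield per lot (%).
yield_pct = {"lot1": 98.0, "lot2": 97.0, "lot3": 95.0, "lot4": 94.0}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Individual view: each source only yields its own summary statistics.
print(mean(temperature.values()), mean(yield_pct.values()))

# Mashed-up view: join on the shared key, then analyse X and Y together.
lots = sorted(temperature.keys() & yield_pct.keys())
r = pearson([temperature[lot] for lot in lots], [yield_pct[lot] for lot in lots])
print(round(r, 3))  # strongly negative for this made-up data
```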
8 SK telecom and big data: The largest MNO in Korea, leading technology evolution
Key numbers (consolidated basis, as of 2015 year-end):
Date of foundation: March 1984
Sales: USD 14.9 billion
Operating profit: USD 1.5 billion
Total assets: USD 24.9 billion
Market cap.: USD 14.5 billion
Market share: 51.5%
Technology milestones: world's first 5G network demonstration (MWC 2016); LTE Advanced (2013); 5.76 Mbps HSUPA (2007); handset-based HSDPA (2006); Satellite DMB (2005); WCDMA R4 (2003); CDMA 2000 1x/EV-DO (2000); CDMA (1996)
9 SK telecom and big data: Dealing with a massive amount of data
Environment: a massive amount of data (250 TB per day) of various types (traffic, VoC, etc.)
Analysis challenges: fraud detection, real-time network monitoring
10 Our solution: We built a solution by ourselves
Management know-how: Hadoop cluster of 1,500+ servers, integrating data from every source
Internal R&D: 150+ R&D members; big data, storage, video, and IoT experts, including an Apache top-level project member, a natural language understanding expert, and an automatic speech recognition expert
Solution components: visualization; advanced analysis functions (DNN, ML, etc.); sub-second processing engine; big data storage; automatic data collector and pattern detector; RESTful API; security; ONM (operation and management)
11 Internal use case: Analyzing mobile network data
Source data: radio access data; core network data; Internet data
Analysis/prediction:
Radio environment analysis
Abnormality and performance degradation analysis
Storage of the complete data set
Live aggregation
Main analysis algorithms of SKT (frequency interference, time series prediction, etc.)
Base station status visualization and real-time monitoring
Station/region-based optimization
Efficient E2E operation
Automatic recognition and mapping of call processing and traffic
QoE assurance
12 Actual case in the semiconductor industry: We will focus on operation efficiency
Possible analysis areas in semiconductor manufacturing: R&D, Product Planning, Supply, Operation, Marketing, Delivery
Operation: competitive edge; direct impact on revenue; the most complex area with the most data
13 Implementing Big Data: Challenges. The as-is architecture can't scale out anymore
Data sources: process, measurement, test
Challenge / Current solution / Problem
Complexity of analysis / Build a well-tuned, sophisticated model / Low usage rate
Massive amount of data / Compress data, use recent data only / Loss of data value
Unstructured data / New file system (Hadoop, etc.) / Still using the same analysis
Lack of storage / Buy new storage / Exponential cost increase, capacity limitation
14 Implementing Big Data: Solution. Change of the entire process
Stages: Aggregation -> Storage/Ingestion -> Analysis -> Visualization
Auto pattern recognition
No loss of data
Faster ingestion through parallel processing
Make summary during ingestion
Raw data ingestion
All existing analysis reconstructed based on parallel algorithms
Implement new types of analysis: machine learning, deep neural network
Real-time cause analysis
Automatic response to data schema
Drill down to find root cause
Micro-scale analysis
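The combination of raw data ingestion, parallel processing, and "make summary during ingestion" can be sketched as follows. This is only an illustration under stated assumptions: a Python thread pool stands in for a distributed ingestion engine, an in-memory list stands in for the data lake, and the names `Summary`, `ingest_chunk`, and `ingest` are hypothetical rather than part of the presented solution.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Summary:
    """Running statistics that can be merged across chunks."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def merge(self, other: "Summary") -> "Summary":
        return Summary(
            self.count + other.count,
            self.total + other.total,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )


def ingest_chunk(chunk):
    """Keep the raw values and summarise them in the same pass."""
    raw = list(chunk)
    return raw, Summary(len(raw), sum(raw), min(raw), max(raw))


def ingest(values, chunk_size=3, workers=4):
    """Split the input into chunks, ingest them in parallel, merge the summaries."""
    chunks = [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]
    store, summary = [], Summary()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so the raw store keeps its order.
        for raw, part in pool.map(ingest_chunk, chunks):
            store.extend(raw)
            summary = summary.merge(part)
    return store, summary


store, summary = ingest([1.0, 5.0, 2.0, 8.0, 3.0, 9.0, 4.0])
print(len(store), summary)
```

Because the raw values are stored unchanged, later analyses are not limited to what the summary happened to capture, while the merged summary is available as soon as ingestion finishes.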
15 Case: Data Lake Infra. One single infrastructure
Sources (front-end and back-end processes): MES and FDC, source, response, test, defect, probe, and yield data, all stored in one big data lake
Results:
Diverse data analysis using data from many different machines
Data can be analyzed over longer periods
No data redundancy
Efficient data governance
16 Case: Root Cause Analysis. Parallel algorithm
1. Implemented parallel/distributed processing on traditional algorithms
2. Used new types of analysis
Results:
10 times faster analysis
Faults can be detected in real time
Anomaly patterns can be found
Site managers can run analyses based on their own needs
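A minimal sketch of running a traditional algorithm in parallel is shown below: a plain z-score anomaly check is applied to each sensor channel independently, so the channels can be processed concurrently. The channel names, values, and the 2-sigma limit are hypothetical, and a thread pool is used only as a stand-in for a distributed engine; the talk does not specify which algorithms were parallelized.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, pstdev

# Hypothetical sensor traces per channel.
channels = {
    "pressure": [1.0, 1.1, 0.9, 1.0, 1.0, 1.1, 0.9, 5.0],
    "rf_power": [300.0] * 8,
}


def find_anomalies(item, z_limit=2.0):
    """Return (channel name, indices whose z-score exceeds z_limit)."""
    name, values = item
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return name, []  # a constant trace has no outliers
    return name, [i for i, v in enumerate(values) if abs(v - mu) / sigma > z_limit]


# Each channel is independent, so the same function is mapped over all of them.
with ThreadPoolExecutor() as pool:
    anomalies = dict(pool.map(find_anomalies, channels.items()))

print(anomalies)
```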
17 Case: Micro Quality Analysis and Package Data Analysis. Doing complex analysis in real time
Problem: "We have many wafer inspection tools, but it takes too long to inspect one wafer image. Wafer image processing time and cost are hurdles to implementing a new macro defect inspection system to ramp up yield." (Quality control manager)
Solution: chunk ingestion; visualization in distributed in-memory grids
Results: 100x faster data ingestion; zooming in/out and rotating/overlapping of defect patterns without visual problems
18 Predictive Maintenance System: Enhance productivity management
Why: Unplanned maintenance has a serious impact on productivity. Implementing a big data system gave a chance to improve PdM accuracy.
How: data -> ensemble of analyses -> root cause -> prediction model
Results: Quick decisions by site managers. Can be extended to analyze and alert on failures in real time.
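The "ensemble of analyses" idea can be sketched as a simple majority vote over several independent indicators. Everything below is hypothetical: the three indicator functions, their limits, and the field names are invented for illustration and do not describe the actual prediction model.

```python
def vibration_model(history):
    """Flags a tool whose latest vibration reading is high (assumed limit)."""
    return history["vibration"][-1] > 4.0


def drift_model(history):
    """Flags a tool whose temperature has drifted upward over the window."""
    return history["temperature"][-1] - history["temperature"][0] > 5.0


def runtime_model(history):
    """Flags a tool that has run long since its last service."""
    return history["hours_since_service"] > 1000


MODELS = (vibration_model, drift_model, runtime_model)


def predict_failure(history, min_votes=2):
    """Ensemble decision: alert when at least min_votes models agree."""
    votes = sum(model(history) for model in MODELS)
    return votes >= min_votes, votes


at_risk = {"vibration": [2.1, 3.0, 4.5],
           "temperature": [60.0, 63.0, 66.5],
           "hours_since_service": 400}
healthy = {"vibration": [2.0, 2.1, 2.0],
           "temperature": [60.0, 60.5, 61.0],
           "hours_since_service": 1200}

print(predict_failure(at_risk))   # two indicators agree
print(predict_failure(healthy))   # only one indicator fires
```

Requiring agreement between indicators trades a little sensitivity for fewer false alarms; which indicators voted also hints at the likely root cause.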
19 Conclusion: Distributed processing and machine learning technology can maximize the value of data
Data generation: much data from process, test, source, and probe
Data store & processing: no storage limitation, sub-second processing, cost effective (due to distributed storage and processing)
Data analysis: inductive learning (so-called supervised learning) with much data; distributed static algorithms (due to machine learning and distributed algorithms)
Insights and action: near-real-time decisions by machine or human (due to accurate results)
20 Conclusion: We are just at the beginning
"I do not know what I may appear to the world, but to myself I seem to have been only like a boy playing on the seashore, and diverting myself in now and then finding a smoother pebble or a prettier shell than ordinary, whilst the great ocean of truth lay all undiscovered before me." (Isaac Newton)