Data Analytics for Semiconductor Manufacturing 2016 The MathWorks, Inc. 1
Competitive Advantage What do we mean by Data Analytics? Analytics uses data to drive decision making, rather than gut feel or intuition. Analytics uses data to derive insights which drives business actions and therefore produces business outcomes. Analytics Maturity Prescriptive Predictive What should happen? What will happen? Diagnostic Descriptive Why did it happen? What happened? Level of Sophistication Copyright of Shell Global solutions 2
Data Analytics 3.0 Optimized workflow and decision making with prescriptive analytics In-Memory processing with streaming data Smart factory Enterprise system Integration Big data (Structured + Unstructured) IoT (Internet of Things) 3Vs (Volume, Variety, Velocity) + Value Machine learning and Deep learning AAA 93.68% 5.55% 0.59% 0.18% 0.00% 0.00% 0.00% 0.00% AA 2.44% 92.60% 4.03% 0.73% 0.15% 0.00% 0.00% 0.06% A 0.14% 4.18% 91.02% 3.90% 0.60% 0.08% 0.00% 0.08% BBB 0.03% 0.23% 7.49% 87.86% 3.78% 0.39% 0.06% 0.16% BB 0.03% 0.12% 0.73% 8.27% 86.74% 3.28% 0.18% 0.64% B 0.00% 0.00% 0.11% 0.82% 9.64% 85.37% 2.41% 1.64% CCC 0.00% 0.00% 0.00% 0.37% 1.84% 6.24% 81.88% 9.67% D 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00% AAA AA A BBB BB B CCC D Structured Data set based Descriptive Analysis Decision making from Historical Data 3
Data Analytics for Semiconductor Manufacturing Process Data is being logged everywhere Data comes from many disparate sources Get insight from aggregated data and Photo lithography Etching Thin Film Deposition make actionable decisions to Improvement Yield Oxidation Metallization Detecting error Reduction Manufacturing cost Process optimize Prognostics Wafer Actionable Insight EDS equipment health/maintenance 4
Workflow of Analytics for Integrated Wafer Inspection system Data Aggregator / Reducer and /Automated / Enterprise System 3. Integrate to Enterprise System Automation, Business 1. Big Data Access Big/Diverse Data Sources Process, feature data Business and Transactional 2. Analytics Machine Learning Data Exploration Analytics Modeling & Evaluation Integration 5
Big Data Variety Various Data Sources from complex process image, sensor, numeric, text, etc. 1. Big Data Access Big/Diverse Data Sources Value Rapid Data exploration from enormous number of manufacturing equipments Scalable Data processing with scalable HW Velocity Volume Any collection of data sets so large and complex that it becomes difficult to process using traditional data processing applications. (Wikipedia) Any collection of data sets so large that it becomes difficult to process using traditional MATLAB functions, which assume all of the data is in memory. (MATLAB) Identify the value from uncertainties Statistics, Clustering, Classification, Regression Ease of deployment Support common development environment (C/C++,.Net, Java, Excel, Hadoop) 6
Reading Big Data in MATLAB 1. Big Data Access Big/Diverse Data Sources Researchers Techniques for Big Data in MATLAB Load, Analyze, Discard parfor, datastore, MapReduce Distributed Memory SPMD and distributed arrays out-of-memory in-memory Embarrassingly Parallel Complexity Non- Partitionable 7
Why use MATLAB for Machine Learning? 2. Analytics Machine Learning 4 1 2 Support Vectors 3 2 Complete environment for Machine Learning (prototyping to production) 1 0-1 Verified and trusted machine learning algorithms Classification Learner app that lets you perform common machine learning tasks interactively Tightly integrated with Signal Processing, Image Processing, Computer Vision and other workflows. -2-3 -2-1 0 1 2 3 1.5 group1 group2 group3 1 Open and flexible architecture enables a customized workflow group5 group6 x2 group4 group7 0.5 group8 0-0.5-0.4-0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 x1 8
2. Analytics Machine Learning ASML Develops Virtual Metrology Technology for Semiconductor Manufacturing with Machine Learning Challenge Apply machine learning techniques to improve overlay metrology in semiconductor manufacturing Solution Use MATLAB to create and train a neural network that predicts overlay metrology from alignment metrology Results Industry leadership established Potential manufacturing improvements identified Maintenance overhead minimized Link to user story Cutaway of a TWINSCAN and Track as wafers receive alignment and overlay metrology As a process engineer I had no experience with neural networks or machine learning. I worked through the MATLAB examples to find the best machine learning functions for generating virtual metrology. I couldn t have done this in C or Python it would ve taken too long to find, validate, and integrate the right packages. Emil Schmitt-Weaver ASML 9
Machine Learning Workflow 2. Analytics Machine Learning Iterate LOAD DATA UNSUPERVISED LEARNING SUPERVISED LEARNING MODEL Step 1: Training GMM AUTO- ENCODER PCA PREPROCESSING KMEANS CLASSIFICATION TRAINING REGRESSION NEW DATA MODEL PREDICTION Step 2: Evaluation 10
Overview of Machine Learning in MATLAB 2. Analytics Machine Learning K-means. Fuzzy K-means Hierarchical Clustering Neural Network Gaussian Mixture Hidden Markov Model Unsupervised Learning Regression Decision Trees Ensemble Methods Neural Network Non-linear Reg. (GLM, Logistic) Linear Regression Supervised Learning Classification K-means. Fuzzy K-means Hierarchical Naive Bayes, Nearest Neighbors Gaussian Mixture Hidden Markov Model 11
Integrate Analytics Application to Enterprise System 3. Integrate to Enterprise System Automation, Business Typical Workflow Process Control System /Reporting System Separated Storage; Local Drive Network Drive Database Business Units Business Business Units Units Analytics Group - Head of BU - Head -of Head BU of BU - Domain experts - Domain - Head experts of Analytics - (Data scientists) - (Data -scientists) Domain experts - Data scientists IT - IT Mgr. - Enterprise Architect - Programmers Re-coding Performance Yield rate Efficiency Increase Decrease Time consumption Operation overhead Error Pitfall 12
Deployment for Analytics Integration 3. Integrate to Enterprise System Automation, Business Standalone Application MATLAB Compiler Excel Add-in Hadoop MATLAB C/C ++ MATLAB Compiler SDK Java.NET MATLAB Production Server Integrated into production IT environments without recoding or creating custom infrastructure Packaged as compatible components with a wide range of systems (Java,.NET, Excel, Python & C/C++) Shared as standalone applications or run from: web, database servers, desktop, enterprise applications, etc. 13
Solutions for Web/Enterprise Application Servers MATLAB Production Server 3. Integrate to Enterprise System Automation, Business Most efficient path for creating enterprise applications Deploy MATLAB programs into pr oduction Manage multiple MATLAB programs and versions Update programs without server resta rts Reliably service large numbers of con current requests Integrate with web, database, and application servers 14
Integrate Analytics Application to Enterprise System Suggested Workflow 3. Integrate to Enterprise System Automation, Business Out of Memory Data Access for Massive Data (Hadoop, DB) Request Broker Data from manufacturing process - Wafer - Oxidation - Photo Lithography - Etching - Overlay Metrology - Scanner / SEM image - Fab/Probe/Assembly/Test Yields - Economic market data - etc Prototyping Analytics solution MATLAB Production Server Analytics Integration Without Re-coding Test deployment in R&D part Excel Add-in Hadoop C/C++ Java.NET Standalone Application Enterprise System/ Process Automation Results/ Reporting system (C# and Java) - Process Optimization - Yield Improvement - Rapid processing - Increase efficiency - IP management 15
Summary : MATLAB Data analytics in Semiconductor Industry 1. Big Data Access Big/Diverse Data Sources 2. Analytics Machine Learning 3. Integrate to Enterprise System Automation, Business Data Exploration/ Visualization Scale Up Computation Analytics / A.I System Integration 4 3 1 2 Support Vectors 2 1 0-1 -2-3 -2-1 0 1 2 3 Excel Add-in Hadoop Standalone Application C/C++ Java.NET Spark Big data Access Datastore, MapReduce HDFS support DB, No SQL support MATLAB Database toolbox Multicore Desktop Cluster / Cloud Computing GPU computing Task Parallel / Data Parallel Parallel Computing toolbox MATLAB Distributed Computing Server Machine Learning/ Deep Learning Convolutional Neural Network Artificial Neural Network Linear/Non-Linear Regression Statistics and Machine learning (Global)Optimization Neural Network toolbox Stand-alone Application Multi-language target library Hadoop integration Web service MATLAB Compiler MATLAB compiler SDK MATLAB Production Server 16