Neural network based modeling of HfO2 thin film characteristics using Latin Hypercube Sampling

Expert Systems with Applications 32 (2007)

Kyoung Eun Kweon a, Jung Hwan Lee a, Young-Don Ko a, Min-Chang Jeong b, Jae-Min Myoung b, Ilgu Yun a,*

a Semiconductor Engineering Laboratory, Department of Electrical and Electronics Engineering, Yonsei University, 134 Shinchon-Dong, Seodaemun-Gu, Seoul 120-749, Republic of Korea
b Information and Electronic Materials Research Laboratory, Department of Materials Science and Engineering, Yonsei University, 134 Shinchon-Dong, Seodaemun-Gu, Seoul 120-749, Republic of Korea

Abstract

In this paper, neural network based modeling of the electrical characteristics of HfO2 thin films grown by metal organic molecular beam epitaxy (MOMBE) is investigated. The accumulation capacitance and the hysteresis index are extracted as the main responses for examining the characteristics of the HfO2 dielectric films. The input process parameters were extracted by analyzing the process conditions and the characterization of the films. X-ray diffraction was used to analyze how the film characteristics vary with the process conditions. To build the process model, a neural network trained with the error back-propagation algorithm was used, with its initial weights and biases selected by the Latin Hypercube Sampling method. This modeling methodology allows the process recipes to be optimized and the manufacturability to be improved.

© 2005 Elsevier Ltd. All rights reserved.

Keywords: HfO2; Process modeling; Neural networks; Latin Hypercube Sampling

1. Introduction

The industrial demand for highly integrated and multifunctional circuits leads to increasing circuit density and to scaling down the size of semiconductor devices.
According to the technology roadmap of the Semiconductor Industry Association (SIA) (Semiconductor Industry Association, 2000), the gate oxide thickness will be reduced to less than 1 nm for 0.05-μm metal-oxide-semiconductor field-effect transistors (MOSFETs) in the near future. At this scale, MOSFETs cannot work properly because of physical limits such as excessive gate tunneling leakage and degraded gate oxide reliability (Wilk, Wallace, & Anthony, 2001). Therefore, high-k dielectric materials such as Al2O3, ZrO2, and HfO2 have attracted great attention as candidates to replace current gate oxides such as SiO2 (Cho et al., 2003; Cho, Wang, Sha, & Chang, 2002; Gusev et al., 2000; Lee, Kang, Nieh, Qi, & Lee, 2000; Lee, Kang, et al., 2000; Qi et al., 2000; Zhu, Li, & Liu, 2004). Among these candidates, HfO2 has emerged as one of the most promising dielectric materials owing to its large band-gap energy, high dielectric constant, and high breakdown field.

Neural networks have been studied and successfully applied in semiconductor manufacturing for process modeling, for example of molecular beam epitaxy and plasma-enhanced chemical vapor deposition processes (Han, Ceiler, Bidstrup, Kohl, & May, 1994; Lee, Ko et al., 2000, 2000). In this paper, the electrical properties of HfO2 thin films, namely the accumulation capacitance (C_acc) and the hysteresis index, are investigated via a neural network model trained with the error back-propagation algorithm. The accumulation capacitance (C_acc) is defined as the capacitance in the strong accumulation region, and the hysteresis index is defined as the width of the hysteresis loop generated by the bi-directional voltage sweep. Latin Hypercube Sampling (LHS) was used to generate the weights and biases of the neural networks, and the modeling results were verified using statistical analysis.

* Corresponding author. E-mail address: iyun@yonsei.ac.kr (I. Yun). doi:.16/j.eswa

2. Experiments

HfO2 thin films were grown on a p-type Si (0) substrate whose native oxide was chemically removed with a (50:1) H2O:HF solution prior to growth by MOMBE. Hafnium-tetra-butoxide [Hf(O·t-C4H9)4] was chosen as the MO precursor because it has an appropriate vapor pressure and a relatively low decomposition temperature. High-purity (99.999%) oxygen gas was used as the oxidant. Hf-t-butoxide was introduced into the main chamber through a bubbling cylinder, using Ar as the carrier gas. The bubbler was maintained at a constant temperature to supply a constant vapor pressure of the Hf source. The apparatus is shown schematically in Fig. 1. High-purity Ar carrier gas passed through the bubbler containing the Hf source. The gas line from the bubbler to the nozzle was heated to the same temperature. The mixture of Ar and metal-organic gases, heated at the tip of the nozzle, flows into the main chamber. The introduced Hf source decomposed into Hf and ligand parts when it reached the substrate, which was maintained at high temperature, and the Hf ions combined with O2 gas supplied from another nozzle. The base pressure and working pressure were 9 and 7 Torr, respectively. The HfO2 films grown by MOMBE were annealed at 700 °C for min in N2 ambient. The process conditions are summarized in Table 1. Au dots were deposited to evaluate the electrical properties of the grown HfO2 samples. A stainless shadow mask was used to make regular Au dots, and the hole diameter in the mask was 0. mm.
The choice of electrode metal and the accurate definition of the electrode area influence the analysis of the electrical properties of HfO2.

Table 1
Summary of process conditions

Process variables        Range
Substrate temperature    °C
Bubbler temperature      130 °C (fixed)
Nozzle temperature       70 °C (fixed)
Base pressure            9 Torr
Working pressure         7 Torr
Gas flow (Ar)            3-5 sccm
Gas flow (O2)            3-5 sccm
Growth time              30 min

Fig. 1. The schematic of the MOMBE system. (Labeled components: turbo molecular pump, substrate heater, substrate holder, shutter, main chamber, loadlock chamber, Ar and O2 mass flow controllers, view port, leak valve, nozzle, bubbler with inlet and outlet valves, pressure gauge, and heater.)

3. Modeling scheme

3.1. Design of experiments

In order to characterize the high-k dielectric properties, the input factors are extracted from the controllable process variables of the MOMBE equipment. These factors are the substrate temperature (T_sub), the Ar gas flow (Ar), and the O2 gas flow (O2). In general, a factorial design assigns two levels to each factor, called high and low. A full factorial design includes every possible high (+)/low (-) combination of the input factors. To account for curvature effects, a two-level design with center points is used (Montgomery, Keats, Perry, Thompson, & Messina, 2000). The full factorial design matrix with one center point is summarized in Table 2.

3.2. Latin Hypercube Sampling

Latin Hypercube Sampling (LHS) is used in this study to select randomized values for the weights and the
biases, which are the parameters of the neural networks. The LHS method is a stratified sampling technique in which the distribution of each random variable is divided into intervals of equal probability. The LHS method generates a sample of size N for each of the n variables: the range of each variable is partitioned into N nonoverlapping intervals, each with probability 1/N, and one value is randomly selected from within each interval (Swidzinski & Chang, 2000). Unlike simple random sampling, the LHS method covers the full sampling range by closely matching each marginal distribution. The distributions of the sampled values for the two methods are illustrated in Fig. 2. The 0 samples were generated in the range (-0.5, 0.5). The values sampled by the LHS method are more uniformly distributed than those obtained by simple random sampling. Therefore, unbiased random values of the weights and biases for the neural networks were selected via the LHS method.

Table 2
Factorial design matrix

Run    T_sub [°C]    Ar [sccm]    O2 [sccm]    Remark

Remark entries: Full factorial design; Center point

The neural networks in this work were trained with the error back-propagation (BP) algorithm. Error BP neural networks consist of several layers of neurons that receive, process, and transmit critical information regarding the relationships between the input parameters and the corresponding responses. In general, the weight update of the BP algorithm is defined as follows (Chen, 1996):

$$ w_{ijk}(n+1) = w_{ijk}(n) + \eta\,\Delta w_{ijk}(n) \qquad (1) $$

where w_ijk is the connection strength between the jth neuron in layer (k - 1) and the ith neuron in layer k, Δw_ijk is the calculated change in that weight that reduces the error function of the network, and η is the learning rate. This algorithm has been shown to be very effective in learning arbitrary nonlinear mappings between noisy sets of input and output factors.
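
As a rough illustration of how the two ideas above fit together, the following Python/NumPy sketch draws the initial weights and biases by one-dimensional LHS on (-0.5, 0.5) and then applies the update of Eq. (1), with Δw taken as the negative gradient of a squared-error function. It is only a sketch under stated assumptions: the hidden-layer size, learning rate, iteration count, and the synthetic responses are chosen for the example and are not values from this study, the function names (latin_hypercube, init_network, bp_step, train_demo) are hypothetical, all parameters are treated as N samples of a single uniform variable, and the momentum term is omitted for brevity.

```python
import itertools
import numpy as np


def latin_hypercube(n_samples, low=-0.5, high=0.5, rng=None):
    """Return n_samples values in (low, high), one per equal-probability stratum."""
    rng = np.random.default_rng() if rng is None else rng
    # Stratum k is [k/N, (k+1)/N); draw one uniform point inside each stratum.
    u = (np.arange(n_samples) + rng.random(n_samples)) / n_samples
    rng.shuffle(u)  # decouple a parameter's position in the network from its stratum
    return low + (high - low) * u


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_network(n_in, n_hidden, n_out, rng):
    """LHS-initialised weight matrices; the last column of each holds the biases."""
    n1 = n_hidden * (n_in + 1)
    n2 = n_out * (n_hidden + 1)
    values = latin_hypercube(n1 + n2, rng=rng)
    w1 = values[:n1].reshape(n_hidden, n_in + 1)
    w2 = values[n1:].reshape(n_out, n_hidden + 1)
    return w1, w2


def forward(w1, w2, x):
    """Forward pass through one sigmoid hidden layer and a sigmoid output layer."""
    ones = np.ones((x.shape[0], 1))
    xb = np.hstack([x, ones])      # inputs plus constant bias input
    h = sigmoid(xb @ w1.T)
    hb = np.hstack([h, ones])      # hidden activations plus constant bias input
    return xb, h, hb, sigmoid(hb @ w2.T)


def bp_step(w1, w2, x, y, eta):
    """One batch update w(n+1) = w(n) + eta * dw(n), with dw = -dE/dw."""
    xb, h, hb, out = forward(w1, w2, x)
    err = out - y
    loss = 0.5 * np.mean(np.sum(err ** 2, axis=1))  # error before this update
    delta_out = err * out * (1.0 - out)
    delta_hid = (delta_out @ w2[:, :-1]) * h * (1.0 - h)
    dw2 = -(delta_out.T @ hb) / x.shape[0]
    dw1 = -(delta_hid.T @ xb) / x.shape[0]
    return w1 + eta * dw1, w2 + eta * dw2, loss


def train_demo(seed=0, n_hidden=5, eta=0.5, n_iter=2000):
    """Train on nine coded runs (2^3 factorial corners plus one center point)."""
    rng = np.random.default_rng(seed)
    corners = np.array(list(itertools.product([0.0, 1.0], repeat=3)))
    x = np.vstack([corners, [[0.5, 0.5, 0.5]]])
    # Synthetic responses scaled to (0, 1); these are NOT measured data.
    y = np.column_stack([0.2 + 0.6 * x[:, 0], 0.8 - 0.5 * x[:, 1]])
    w1, w2 = init_network(3, n_hidden, 2, rng)
    losses = []
    for _ in range(n_iter):
        w1, w2, loss = bp_step(w1, w2, x, y, eta)
        losses.append(loss)
    return losses


if __name__ == "__main__":
    losses = train_demo()
    print(f"loss: first={losses[0]:.4f}, last={losses[-1]:.4f}")
```

Because each of the N strata contributes exactly one value, a histogram of the LHS draws with N equal-width bins is flat by construction, which is the property contrasted with simple random sampling in Fig. 2. A library routine such as SciPy's scipy.stats.qmc.LatinHypercube could be substituted for the hand-written sampler.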

Fig. 2. Two different distributions of the sampling values: the simple random sampling and LHS. (Histograms of frequency vs. value.)

3.3. Neural networks

Neural networks are utilized to model the nonlinear relationship between inputs and outputs in semiconductor process modeling. The networks consist of three layers: the input layer, the hidden layer, and the output layer. They are composed of simple processing units called neurons, the interconnections between them, and the weights assigned to those interconnections (May, 1994). Each neuron computes the weighted sum of its inputs, filtered by a nonlinear sigmoid transfer function.

Fig. 3. Typical feed-forward neural networks. (Inputs x_1 ... x_m in the input layer, neurons h_1 ... h_o in the hidden layer(s), and responses y_1 ... y_n in the output layer, connected by weights W.)

The schematic of general feed-forward neural networks is shown in Fig. 3. The neural network parameters used in this study are summarized in Table 3. These networks were trained on nine experimental runs. Two trials were used as testing data in order to verify the fitness of the NNet outputs for the
results of the training data. The root mean square errors (RMSEs) of the training for C_acc and the hysteresis are 0.76 and 0.03, respectively. The RMSEs for the testing are 0.76 and 0.03, respectively.

Table 3
Summary of the neural network parameters

NNets parameters
NNet structure
NNet learning rate
NNet momentum          0.04

4. Results and discussion

The neural network model results and the residual plots for C_acc and the hysteresis are illustrated in Figs. 4 and 5, where the squares represent the training data and the triangles represent the testing data used for prediction. The predicted responses show good agreement with the measured responses. The residuals for all responses are randomly distributed, with no special patterns or features, indicating that the results satisfy the statistical assumptions for the residuals (Mayers & Montgomery, 1995).

Fig. 4. The neural network modeling results for C_acc: the measured vs. the predicted values and the residual plot. (Axes: C_acc network outputs [pF] vs. C_acc experimental data [pF]; residuals vs. run order.)

Fig. 5. The neural network modeling results for the hysteresis: the measured vs. the predicted values and the residual plot. (Axes: hysteresis network outputs [V] vs. hysteresis experimental data [V]; residuals vs. run order.)

The statistical significances of the three input factors are listed in Table 4 at the significance level α = 0.05. For the accumulation capacitance (C_acc), T_sub and Ar are the significant factors, while Ar and O2 are the significant factors for the hysteresis index.

Table 4
Statistical significance level

Factor    C_acc    Hysteresis
T_sub
Ar
O2

The response surface plots of the accumulation capacitance are shown in Fig. 6 for O2 fixed at 4 sccm and for T_sub fixed at 500 °C, respectively. The accumulation capacitance is proportional to the dielectric constant and inversely proportional to the equivalent oxide thickness
(EOT). With increasing T_sub, the fully decomposed Hf source [Hf(O·t-C4H9)4] creates hydrocarbon-rich conditions. The incorporated hydrocarbons limit the crystallite size and cause the tetragonal phase to dominate in the film. It was found that a small O2/Ar ratio causes a hydrocarbon-rich plasma and limits the crystal size, and that the accumulation capacitance (C_acc) increases as a result (Kim et al., 2004). As the substrate temperature (T_sub) is increased, the oxide thickness decreases and C_acc increases. In the 2θ XRD scans shown in Fig. 7, the tetragonal phase peak, t(1 1 1), is observed. The presence of the tetragonal phase means that the crystallite size is limited and small, because the tetragonal phase can be stabilized only in very small crystallites (Garvie, 1978; Garvie & Gross, 1985). As shown in Fig. 7, the intensity of the tetragonal phase increases as the substrate temperature is increased from 450 °C to 550 °C. This can be interpreted as the tetragonal phase contributing to the reduction of the oxide thickness (Kim et al., 2004).

Fig. 6. The response surface plots for C_acc: O2 = 4 sccm and T_sub = 500 °C.

Fig. 7. The 2θ XRD scan: T_sub = 450 °C, Ar = 5 sccm, and O2 = 5 sccm; and T_sub = 550 °C, Ar = 5 sccm, and O2 = 5 sccm. (Intensity in arbitrary units vs. 2θ; labeled peaks: Au, m(-1 1 1), t(1 1 1), m(1 1 1), and m(2 0 0).)

The response surface plots of the hysteresis are shown in Fig. 8 for T_sub fixed at 500 °C and for O2 fixed at 4 sccm, respectively. As shown in Fig. 8, the hysteresis, which is proportional to the interfacial trap density (D_it), increases with decreasing O2/Ar ratio, because the formation of a superior oxide interface decreases D_it as the O2/Ar ratio increases (Wilk et al., 2001). During the growth of HfO2 on a Si substrate, Hf is deposited and reacts with the oxygen:

$$ \mathrm{HfO_2^{bulk} + Hf + O_2 \rightarrow HfO_2^{bulk}} \qquad (1) $$

However, when there is not enough oxygen to react with Hf, an oxygen vacancy (V_O) is created:

$$ \mathrm{HfO_2^{bulk} + Hf \rightarrow HfO_2^{bulk} + V_O} \qquad (2) $$

As shown in Fig. 8, sufficient oxygen vacancy mobility decomposes the interfacial SiO2 layer and creates an interfacial silicate. It was found that these decomposition reactions take place actively when T_sub is lower than 500 °C (Copel & Reuter, 2003). In addition, charges are trapped by the oxygen vacancies as the voltage is swept bidirectionally. The interfacial trap density and the hysteresis index increase because of the trapped charges. As T_sub increases from 450 °C to 550 °C, the decomposition does
not happen actively and the hysteresis index decreases. Based on this analysis, the modeling results show good agreement with the physical mechanism.

Fig. 8. The response surface plots for the hysteresis: T_sub = 500 °C and O2 = 4 sccm.

5. Conclusion

The electrical characteristics of HfO2 thin films were investigated via an error BP neural network model using Latin Hypercube Sampling, and neural network models correlating the process conditions with the electrical characteristics were developed. The Latin Hypercube Sampling method was used to generate the weights and biases randomly, with equal probability, within a specific interval. The results show that the neural network model can explain the overall effects of varying process conditions on the responses, in accordance with the physical mechanisms. The methodology allows the electrical properties to be predicted from the process conditions and can improve manufacturability.

Acknowledgement

This work was supported by the Brain Korea 21 Project in 2005.

References

Chen, C. H. (1996). Fuzzy logic and neural network handbook. McGraw-Hill.
Cho, B.-O., Chang, J. P., Min, J.-H., Moon, S. H., Kim, Y. W., & Levin, I. (2003). Material characteristics of electrically tunable zirconium oxide thin films. Journal of Applied Physics, 93(1).
Cho, B.-O., Wang, J., Sha, L., & Chang, J. P. (2002). Tuning the electrical properties of zirconium oxide thin films. Applied Physics Letters, 80(6).
Copel, M., & Reuter, M. C. (2003). Decomposition of interfacial SiO2 during HfO2 deposition. Applied Physics Letters, 83(16).
Garvie, R. C. (1978). Stabilization of the tetragonal structure in zirconia microcrystals. Journal of Physical Chemistry, 8().
Garvie, R. C., & Gross, M. F. (1985). Intrinsic size dependence of the phase transformation temperature in zirconia microcrystals. Journal of Material Science, 1(4).
Gusev, E. P., Copel, M., Cartier, E., Baumvol, I. J. R., Krug, C., & Gribelyuk, M. A. (2000). High-resolution depth profiling in ultrathin Al2O3 films on Si. Applied Physics Letters, 76().
Han, S., Ceiler, M., Bidstrup, S., Kohl, P., & May, G. (1994). Modeling the properties of PECVD silicon dioxide films using optimized back propagation neural networks. IEEE Transactions on Components, Packaging, and Manufacturing Technology Part A, 17().
Kim, M.-S., Ko, Y.-D., Hong, J.-H., Jeong, M.-C., Myoung, J.-M., & Yun, I. (2004). Characteristics and processing effects of ZrO2 thin films grown by metal-organic molecular beam epitaxy. Applied Surface Science, 7.
Lee, B. H., Kang, L., Nieh, R., Qi, W.-J., & Lee, J. C. (2000). Thermal stability and electrical characteristics of ultrathin hafnium oxide gate dielectric reoxidized with rapid thermal annealing. Applied Physics Letters, 76(14).
Lee, K. K., Brown, T., Dagnall, G., Bicknell-Tassius, R., Brown, A., & May, G. (2000). Using neural networks to construct models of the molecular beam epitaxy process. IEEE Transactions on Semiconductor Manufacturing, 13(1).
May, G. (1994). Manufacturing ICs the neural way. IEEE Spectrum.
Mayers, R. H., & Montgomery, D. C. (1995). Response surface methodology. New York: Wiley.
Montgomery, D. C., Keats, J. B., Perry, L. A., Thompson, J. R., & Messina, W. S. (2000). Using statistically designed experiments for process development and improvement: an application in electronics manufacturing. Robotics and Computer Integrated Manufacturing, 16.
Qi, W.-J., Nieh, R., Lee, B. H., Kang, L., Jeon, Y., & Lee, J. C. (2000). Electrical and reliability characteristics of ZrO2 deposited directly on Si for gate dielectric application. Applied Physics Letters, 77(0).
Semiconductor Industry Association (SIA) (2000). The national technology roadmap for semiconductors.
Swidzinski, J. F., & Chang, K. (2000). Nonlinear statistical modeling and yield estimation technique for use in Monte Carlo simulations. IEEE Transactions on Microwave Theory and Techniques, 48(1).
Wilk, G. D., Wallace, R. M., & Anthony, J. M. (2001). High-k gate dielectrics: current status and materials properties considerations. Journal of Applied Physics, 89().
Zhu, J., Li, Y. R., & Liu, Z. G. (2004). Fabrication and characterization of pulsed laser deposited HfO2 films for high-k gate dielectric applications. Journal of Physics D: Applied Physics, 37.