Electro-active ternary copolymer design using genetic algorithm


Indian Journal of Chemistry, Vol. 50A, January 2011, pp

Avneet Kaur & A K Bakhshi*
Department of Chemistry, University of Delhi, Delhi, India
akbakhshi2000@yahoo.com

Received 22 September 2010; revised and accepted 20 December 2010

A genetic algorithm has been applied to obtain the optimal solution for a copolymer with minimum band gap and maximum delocalization. Results are presented for a case study of ternary copolymers belonging to the type II staggered class. A comparative study of the results obtained using two alternative definitions of the fitness function has also been made. It is seen that the ionization potential of the homopolymers is the determining factor in the percentage ratios of the optimum solution. The results obtained by the genetic algorithm constitute an informative set which the designer of conducting polymers can use for a more informed decision.

Keywords: Theoretical chemistry, Polymers, Conducting polymers, Copolymers, Electronic properties, Genetic algorithm

Low band gap conjugated polymers have attracted attention because of their expected good intrinsic electrical conductivity [1] and nonlinear optical properties [2,3]. Over the past decade, much effort has been devoted to the design of new organic conjugated materials which have very low band gaps without the need for doping [4,5]. The conduction properties of an undoped polymer are related to its electronic properties such as the band gap (E_g) and band widths. The band gap of a polymer is a measure of its ability to show intrinsic conductivity: its value determines whether thermal excitations can lead to appreciable conductivity.
Band widths, which are a measure of the extent of delocalization in the system, can be qualitatively correlated with the mobility of charge carriers in the band; i.e., a polymer with a relatively smaller band gap and larger band widths will be a better intrinsic conductor of electricity. The band gap of simple conjugated organic polymers can be tuned by modifying the nature of the repeat unit and changing the substituents [6]. Copolymerization is a method of designing low band gap polymers with improved functional properties that cannot be obtained with the corresponding homopolymers. The properties of copolymers can be modified by varying either the ratio of the constituents or the manner in which they are chemically attached.

It is therefore worthwhile to know what ratio of constituents makes a copolymer highly conducting, with minimum band gap and maximum delocalization. This task involves searching through a large number of possibilities, which could take months of trial-and-error experiments. A novel and interesting approach to such problems is the use of a genetic algorithm (GA). A GA is a search procedure based on the mechanics of natural evolution. As a stochastic heuristic it does not strictly guarantee the global optimum, but in practice it can locate an optimal or near-optimal ratio of constituents in a reasonable time. This algorithm is increasingly being used by researchers to solve a variety of search and optimization problems. To gauge the dependability and efficiency of the algorithm, we have investigated the class of type II staggered copolymers (copolymers in which the top of the valence band of one component lies within the band gap of the other and the bottom of the conduction band of the second lies in the band gap of the first). The systems studied are modeled on real systems of copolymers [4,5].

Methodology

The algorithm

The genetic algorithm starts with a group of randomly generated initial solutions. Each solution can be represented as a bit-string (a sequence of zeros and ones) of specified length.
Each string is called a chromosome, genes are parts of a chromosome, and a population is the group of chromosomes used in one GA iteration (generation). Once the first population is generated, the fitness (how good the proposed solution is) of each individual (chromosome) is calculated through an evaluation function. The next populations (offspring generated from parents) are composed using the elitism option and crossover among chromosomes. These steps are repeated until the convergence criterion is reached.

GA parameters for the polymer problem

Consider a ternary copolymer with three different constituent units A, B and C (AxByCz; x + y + z = 100), where the relative concentrations of A, B and C determine the overall conductivity of the copolymer. The objective is to find the percentages of A, B and C that give the most conducting polymer, i.e., the one with minimum band gap and maximum electronic delocalization. We have imposed the restriction that none of the components may have zero concentration, i.e., x, y and z cannot be zero. The percentage x of homopolymer A in the copolymer can be represented as a bit-string of specified length (chromosome). The GA adopted by us considers a population of five chromosomes, each one encoding a different solution to the optimization problem [15,16]. The robustness of the optimum solution was checked by varying the population size. In addition, the optimum solution obtained from the genetic algorithm with a population size of 5 for a binary copolymer [19] was checked against the solution obtained by systematic search. The two results matched, indicating that the population size used in the present study gives good results. Since the maximum value of a chromosome on decoding can be 99, the length of each chromosome is seven bits (six bits reach only 63, whereas seven bits reach 127). From a given generation, a new population is generated by keeping the fittest chromosome (elitism) and by single point crossover [17,18] between chromosomes selected with probability proportional to their fitness.
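The encoding, elitism and fitness-proportional selection described here can be illustrated with a minimal Python sketch. It is not the authors' Fortran 77 program. The crossover cut is placed so that the last three bits are exchanged, as the paper specifies for its 7-bit chromosomes; the names `encode`, `decode`, `is_valid`, `next_generation` and `toy_fitness` are hypothetical, the chromosome is assumed to hold a single percentage, and the rule of replacing an out-of-range child by its parent is an assumption of this sketch rather than something stated in the paper.

```python
import random

N_BITS = 7   # 2**7 = 128 values, enough to hold percentages up to 99
CUT = 4      # cut after the 4th bit, so the last three bits are swapped


def encode(percent):
    """Integer percentage -> 7-character bit string."""
    return format(percent, "0{}b".format(N_BITS))


def decode(chromosome):
    """Bit string -> integer percentage."""
    return int(chromosome, 2)


def is_valid(chromosome):
    """Assumed validity rule: the decoded percentage must lie in 1..99."""
    return 1 <= decode(chromosome) <= 99


def crossover(parent_a, parent_b, cut=CUT):
    """Single-point crossover: the two parents exchange their tails."""
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]


def next_generation(population, fitness, rng):
    """Build a new population by elitism plus roulette-wheel crossover."""
    scores = [fitness(c) for c in population]
    elite = population[scores.index(max(scores))]
    offspring = [elite]  # elitism: the fittest chromosome survives unchanged
    while len(offspring) < len(population):
        # Fitness-proportional selection; assumes all scores are positive.
        mother, father = rng.choices(population, weights=scores, k=2)
        for child, parent in zip(crossover(mother, father), (mother, father)):
            # Assumption of this sketch: an out-of-range child is replaced
            # by the parent that supplied its leading bits.
            offspring.append(child if is_valid(child) else parent)
    return offspring[:len(population)]


def toy_fitness(chromosome):
    """Hypothetical stand-in for the real fitness: peaks at 40 %."""
    return 1.0 / (1.0 + abs(decode(chromosome) - 40))


rng = random.Random(0)
population = [encode(v) for v in (10, 25, 40, 70, 90)]
for _ in range(20):
    population = next_generation(population, toy_fitness, rng)
print(decode(population[0]), [decode(c) for c in population])
```

Because the elite chromosome is always copied into the next generation, the best fitness in this sketch can never decrease from one generation to the next.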
Single point crossover was implemented taking the fifth bit position (out of 7 bits) as the crossover point, i.e., the last three bits of the two parent chromosomes were swapped.

Constructing fitness function

The fitness function, f(x, y), is a mathematical formula used to calculate the efficiency of a chromosome. There are alternative ways to define f(x, y), but the fitness function must necessarily include two variables (band gap and IPN). In this paper, we have considered two alternative definitions of the fitness function (Eqs 1 & 2).

(a)

$$ f_1(x, y) = \frac{1}{(1/\rho)\, E_g(x, y) + \mathrm{IPN}(x, y)} \tag{1} $$

where ρ is the energy difference between the highest-lying lowest unoccupied molecular orbital (LUMO) and the lowest-lying highest occupied molecular orbital (HOMO) among the constituent homopolymers. Here, the fitness function f1(x, y) has been defined by attributing the same statistical weight to the band gap and the inverse participation number (IPN) of the copolymer. As the two quantities have different ranges (the band gap varies from 0.0 to ρ, and the IPN from 0.0 to 1.0), we have used the scale constant (1/ρ) to put them on the same footing.

(b)

$$ f_2(x, y) = -\log_e \{ E_g(x, y) \cdot \mathrm{IPN}(x, y) \} \tag{2} $$

Here, the fitness function f2(x, y) has been defined by taking the product of the band gap and the IPN. As the band gap and IPN vary over many orders of magnitude, the logarithm is numerically convenient because it smooths the function and avoids large fluctuations.

The band gap and IPN values can be found by considering a copolymer chain of N units whose energy states are obtained by solving the Hückel Hamiltonian (in the tight binding approximation) (Eq. 3),

$$ H = \sum_{i=1}^{N} \alpha_i \, |i\rangle\langle i| + \sum_{i,j\,(i \neq j)}^{N} \beta_{i,j} \, |i\rangle\langle j| \tag{3} $$

where the α's and β's are the usual Coulomb and hopping integrals. The α and β values were obtained from the corresponding model band structures of the homopolymers [19]. Solving the corresponding determinant using the negative factor counting (NFC) method gave the value of the band gap.
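The two fitness definitions and the eigenvalue-counting idea behind NFC can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the Hamiltonian is taken to be tridiagonal (nearest-neighbour hopping, one orbital per site, at least two sites), the function and argument names (`fitness_f1`, `fitness_f2`, `count_states_below`, `kth_level`, `band_gap`, `n_occupied`) are hypothetical, and the bisection search for individual levels is a convenience added here, not a step described in the paper.

```python
import math


def fitness_f1(e_gap, ipn, rho):
    """Eq. 1: equal weight to the scaled gap (E_g / rho) and the IPN."""
    return 1.0 / (e_gap / rho + ipn)


def fitness_f2(e_gap, ipn):
    """Eq. 2: negative natural logarithm of the product E_g * IPN."""
    return -math.log(e_gap * ipn)


def count_states_below(energy, alphas, betas):
    """Negative factor counting for a tridiagonal Hueckel matrix.

    alphas: on-site (Coulomb) terms, length N
    betas:  nearest-neighbour hopping terms, length N - 1
    Returns the number of eigenvalues lying below `energy`.
    """
    tiny = 1e-12  # replaces an exact zero factor to avoid division by zero
    eps = alphas[0] - energy
    if eps == 0.0:
        eps = tiny
    negatives = 1 if eps < 0 else 0
    for alpha, beta in zip(alphas[1:], betas):
        eps = alpha - energy - beta * beta / eps
        if eps == 0.0:
            eps = tiny
        if eps < 0:
            negatives += 1
    return negatives


def kth_level(k, alphas, betas, tol=1e-9):
    """k-th lowest eigenvalue (k = 1, 2, ...) located by bisection."""
    reach = 2.0 * max(abs(b) for b in betas) + 1.0
    lo = min(alphas) - reach  # below the whole spectrum
    hi = max(alphas) + reach  # above the whole spectrum
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_states_below(mid, alphas, betas) >= k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def band_gap(alphas, betas, n_occupied):
    """Gap between the highest occupied and lowest unoccupied levels."""
    homo = kth_level(n_occupied, alphas, betas)
    lumo = kth_level(n_occupied + 1, alphas, betas)
    return lumo - homo


# Two-site toy chain: levels at alpha - |beta| and alpha + |beta|.
gap = band_gap([0.0, 0.0], [-1.0], n_occupied=1)
print(gap, fitness_f1(gap, ipn=0.5, rho=4.0), fitness_f2(gap, ipn=0.5))
```

Counting sign changes in this way never requires a full diagonalization, which is why the approach scales well to long disordered chains.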
The delocalization parameter is defined by the IPN, I_j, of the HOMO, which can be obtained through the use of the inverse iteration method (Eq. 4),

$$ I_j = \frac{\sum_{r=1}^{N} |C_{jr}|^4}{\left( \sum_{r=1}^{N} |C_{jr}|^2 \right)^2} \tag{4} $$

where C_jr are the LCAO expansion coefficients. We have earlier carried out a detailed analysis [20] by varying the seed (sequence generation parameter) to search for optimum structures. It was observed that the optimized solutions are nearly seed independent, i.e., the randomness of the sequence does not affect the solution with the best fitness. The optimized solutions at different seeds vary by 1-2 %. Also, if the polymeric chains are long enough, the relative concentration of the different units is more important than their specific position in the chains, considering that they are randomly formed.

Results and Discussion

As already mentioned, our aim is to predict the relative percentage concentrations (x, y and z) of components A, B and C in a copolymer which define the most conductive polymer. The band alignments along with the band widths of homopolymers (A)x, (B)x and (C)x are given in Fig. 1. For system 1a, the band gaps of all three homopolymers are different, whereas for systems 1b and 1c, the band gaps of two homopolymers are the same. The widths of both conduction and valence bands of the homopolymers are different in all the systems. In the present calculation of the density of electronic states, we have consistently used a chain length of 300 units and an energy grid size of eV. The α and β values used in the Hückel determinant are given in Table 1. A detailed analysis has been done by varying the fitness function to search for structures with minimum gap between HOMO and LUMO and maximum delocalization of the HOMO. The algorithm was written in the Fortran 77 programming language.

Optimum solution for the copolymers

In the GA (Fig. 2), a population of five randomly generated chromosomes with seven genes each was created initially. The fitness of each chromosome was evaluated using the fitness function (Eqs 1, 2). The chromosome with the highest fitness was carried forward to the next generation. In the next generation, chromosomes were generated by single point crossover of two chromosomes at a time.
Finally, when the best fit chromosome value remained the same for 15 generations, the simulation was stopped to obtain the optimized solution. As given in Table 2 for system 1a, the solution obtained was A3B1C96 with a band gap value of eV and the IPN value of using both the fitness functions. For system 1b, we obtained A15B84C1 as the solution using fitness function f1 and A3B96C1 using f2. Similarly, for system 1c, the solutions obtained were A1B76C23 and A1B94C5 using f1 and f2, respectively.

Fig. 1 Band alignments of constituent homopolymers. [(a) system 1a; (b) system 1b; (c) system 1c].

Table 1 Values of alpha (α_i) and beta (β_ii) used in the determinant for the constituent homopolymers. [Subscript i represents the component (A, B or C)]

System      Homopolymer   Valence band              Conduction band
                          α_i (eV)    β_ii (eV)     α_i (eV)    β_ii (eV)
System 1a   (A)x
            (B)x
            (C)x
System 1b   (A)x
            (B)x
            (C)x
System 1c   (A)x
            (B)x
            (C)x

Table 2 Results of GA using two fitness functions (FF)

System   FF used   GA solution   E_g (eV)   IPN   Fitness value
1a       f1        A3B1C96
         f2        A3B1C96
1b       f1        A15B84C1
         f2        A3B96C1
1c       f1        A1B76C23
         f2        A1B94C5

Solutions obtained from both fitness functions satisfy the required conditions for conducting structures, i.e., low band gap and high delocalization. The optimal solution obtained from the GA depends on the way the fitness function is defined. We observe that function f1 gives a better result with regard to lower band gap, whereas function f2 gives a better result with respect to higher delocalization.

Fig. 2 Overview of steps involved in obtaining optimum solution.

DOS studies

Figure 3 gives the density of states (DOS) plot of the valence band of the most conducting copolymer (the proposed GA solution). The DOS distribution of the copolymer consists of broad regions of allowed energy states with gaps in between. For system 1a, most of the valence energy states are localized in the region from -7.0 to eV. For systems 1b and 1c, the valence energy states are contained in the regions -9.0 to eV and -6.5 to eV, respectively. The negative of the top of the valence band corresponds to the ionization potential (IP) of the copolymer. It can be seen that the IP of the copolymer is almost equal to the lowest IP among the constituent homopolymers.

Fig. 3 DOS curves of valence bands of copolymers. [GA solution: 1a to 1c; Energy in eV and number of states N in relative units].

From the results of these systems the following generalizations (independent of the fitness function used) may be made: the highest percentage of the homopolymer with the highest HOMO energy gives a more conducting copolymer, i.e., the ionization potential plays a major role in this case. Subsequently, the homopolymer with a lower band gap will have the next higher percentage in the optimum solution. If, however, the band gaps of the two homopolymers are comparable ((A)x and (C)x in system 1c) and their band widths are different, then a higher percentage of the larger band width homopolymer ((B)x) gives a more conducting copolymer, i.e., band width becomes the dominating factor. This is because the greater the band width, the higher the delocalization and, correspondingly, the smaller the IPN. Fortunately, in the present study of the polymers, the two parameters, band gap and IPN, do not conflict, in the sense that improving one does not make the other worse. Also, the fitness functions chosen in the present study take care of the difference in the range of the two parameters. Therefore, there was no need to invoke Pareto-optimal improvement [21].

Fig. 4 Plot of fitness values versus number of generations.

Evaluation of convergence criteria

The convergence of the GA has been studied using a plot (Fig. 4) where the x axis represents the number of generations and the y axis represents the fitness value (the fitness function taken for these plots is f1). For each generation, the overall best fitness value as well as the average fitness value was noted. The average fitness is the sum of all the fitness values of a particular generation divided by the population size. Both sets of values were then plotted against the number of generations. The plot shows the evolution of the corresponding fitness measure. Because of elitism, the best fitness sequence never decreases from one generation to the next. The average fitness plot shows a lot of variation, indicating that no two populations are identical. For system 1a, starting from a value of , the average fitness showed high variation and was at the end of 16 generations. The maximum average fitness value obtained was , close to the best fitness value of . The best fit chromosome was obtained in generation one itself; therefore, the algorithm converged after 16 generations (the first generation plus the 15 unchanged generations required by the stopping criterion) and the best fitness value remained invariant. For systems 1b and 1c, the number of generations required for finding the optimum solution was 30 and 34, respectively. For these systems, the best fitness increased rapidly at first, remained constant for about 4 generations, then increased further to the maximum fitness and remained constant after that.

Conclusions

In the GA, very good solutions are generated by probing less than 3% of the solution space, as on average 30 generations of five individuals each are required to find an optimum solution. In all the systems studied herein, it is observed that the homopolymer with the highest HOMO energy constitutes the maximum percentage in the most conducting copolymer. Further, the band gap of the copolymer lies close to that of the lowest band gap homopolymer. The results show that both the fitness functions presented are able to give an optimum solution to the problem of designing conducting polymers.
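The fraction of the solution space probed can be checked with a short calculation. The paper does not state how the space was counted, so the sketch below assumes it consists of all integer compositions with x, y, z ≥ 1 and x + y + z = 100; the helper name `n_compositions` and both ways of counting evaluated chromosomes are assumptions made here for illustration.

```python
from math import comb


def n_compositions(total=100, parts=3):
    """Ways to write `total` as an ordered sum of `parts` integers >= 1."""
    return comb(total - 1, parts - 1)


space = n_compositions()  # 4851 compositions for x + y + z = 100
generations, pop_size = 30, 5

# Upper bound: every chromosome of every generation is counted.
all_chromosomes = generations * pop_size

# Alternative count: the elite carried over by elitism is not counted again.
new_chromosomes = pop_size + (generations - 1) * (pop_size - 1)

print(space, all_chromosomes / space, new_chromosomes / space)
```

Under these assumptions the 150 chromosomes amount to roughly 3 % of the space, and the figure drops to about 2.5 % if the elite chromosome carried over in each generation is not counted again.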
Sometimes the copolymer generated may not have the desired stability or functionality even though it has the lowest band gap. A particular homopolymer may not contribute to reducing the band gap and yet be required to improve the functionality of the copolymer. In such cases, since the only parameters considered in evaluating the fitness of an individual are band gap and IPN, the percentage concentration of that homopolymer in the optimized solution is 1, the minimum allowed. For such cases, the optimized solution may be improved by introducing chemical and physical restraints. This can be done simply by adding new terms to the evaluation function that give undesired structures a low fitness, or through restraints (discarding) in the generation of the polymeric structures. We will be extending this study by including more real applications to demonstrate the importance of these investigations on the model systems. The results obtained by the GA constitute an informative set, which the designer of conducting polymers can use for a more informed decision.
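The restraint idea mentioned above, extra terms in the evaluation function or outright discarding of undesired structures, might be sketched as a wrapper around any fitness function f(x, y). Everything in this example is hypothetical: the wrapper `with_restraints`, the placeholder `base_fitness`, the penalty on low B content and the rule that discards compositions without C are illustrative choices, and clipping the result at zero is an assumption made so that the value stays usable as a roulette-wheel weight.

```python
def with_restraints(fitness, penalties=(), forbidden=()):
    """Wrap f(x, y) with soft penalty terms and hard (discard) rules.

    penalties: functions (x, y, z) -> non-negative amount to subtract
    forbidden: functions (x, y, z) -> True if the structure is rejected
    """
    def restrained(x, y):
        z = 100 - x - y
        if any(rule(x, y, z) for rule in forbidden):
            return 0.0  # hard restraint: the structure is discarded
        score = fitness(x, y) - sum(term(x, y, z) for term in penalties)
        return max(score, 0.0)  # assumption: clip so fitness stays >= 0
    return restrained


def base_fitness(x, y):
    """Placeholder for f1 or f2; a constant keeps the example simple."""
    return 1.0


def too_little_b(x, y, z):
    """Hypothetical soft restraint: penalize less than 10 % of unit B."""
    return 0.05 * max(0, 10 - y)


def missing_c(x, y, z):
    """Hypothetical hard restraint: every unit must be present."""
    return z < 1


restrained = with_restraints(base_fitness, [too_little_b], [missing_c])
print(restrained(3, 1), restrained(3, 10), restrained(60, 40))
```

A soft penalty lowers the ranking of a structure without removing it from the search, whereas a hard rule removes it entirely; which of the two is appropriate depends on how strict the chemical or physical requirement is.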
References

[1] Schwartz M, Berry R J, Dudis D S & Yeates A T, J Mol Struct (Theochem), 859 (2008).
[2] Lanzi M & Paganin L, European Polym J, 45 (2009).
[3] Nie G, Qu L, Xu J & Zhang S, Electrochim Acta, 53 (2008).
[4] Bakhshi A K, Deepika & Ladik J, Solid State Comm, 101 (1997).
[5] Bakhshi A K, Ago H, Yoshizawa K, Tanaka K & Yamabe T, J Chem Phys, 104 (1996).
[6] Salzner U, Pickup P G, Poirier R A & Lagowski J B, J Phys Chem A, 102 (1998).
[7] Adewumia A O & Ali M M, Math Comp Modell, 51 (2010).
[8] Giro R, Caldas M J & Galvão D S, Int J Quant Chem, 103 (2005).
[9] Savitha P & Sathyanarayana D N, Synth Metals, 145 (2004).
[10] Ocampo C, Alemán C, Oliver R, Arnedillo M, Ruiz O & Estrany F, Polym Int, 56 (2007).
[11] Kaur A & Bakhshi A K, Chem Phys, 369 (2010).
[12] Zhang Y, Xu J L, Yuan Z H, Xu H & Yu Q, Bioresource Technol, 101 (2010).
[13] Waheda M E & Ibrahim W Z, Nucl Eng Design, 240 (2010).
[14] Jurasovic K & Kusek M, Neurocomputing, 73 (2010).
[15] Haupt R L, Practical Genetic Algorithms, 2nd Edn, (Wiley-IEEE, USA).
[16] Melanie M, An Introduction to Genetic Algorithms Complex Adaptive Systems, (MIT Press, USA).
[17] Giro R, Cyrillo M & Galvão D S, Mater Res, 6 (2003).
[18] Giro R, Cyrillo M & Galvão D S, Chem Phys Lett, 366 (2002).
[19] Kaur A & Bakhshi A K, Indian J Chem, 48A (2009).
[20] Kaur A, Wazir M, Garg A & Bakhshi A K, Indian J Chem, 48A (2009).
[21] Coello C A C & Becerra R L, Mater Manufact Proc, 24 (2009) 119.