CHAPTER 3 FUZZY LOGIC BASED FRAMEWORK FOR SOFTWARE COST ESTIMATION


The Fuzzy Logic System is one of the main components of soft computing, the field of computer science that deals with imprecision, uncertainty, and approximation to achieve practical, robust, and low-cost solutions. It can mimic the ability of the human mind to handle approximate reasoning rather than exact reasoning [1]. Whereas a Neural Network is primarily concerned with learning ability and Probabilistic Reasoning deals with uncertainty, the Fuzzy Logic methodology introduced by Prof. Lotfi Zadeh in 1965 is a useful tool for dealing with imprecision, uncertainty, and complexity in problems that are difficult to solve quantitatively [2].

The three main processes of a fuzzy logic system are described in Section 3.1. Section 3.2 describes the proposed fuzzy framework for software cost estimation. Section 3.3 discusses the results and Section 3.4 concludes the chapter.

3.1 FUZZY LOGIC SYSTEM

The Fuzzy Logic System [3] deals with fuzzy parameters, which capture imprecision and uncertainty, by mapping a given input to an output using the computing framework called the Fuzzy Inference System (FIS) [4, 5]. This framework consists of three main processes: the Fuzzification Process, the Inference Process based on fuzzy rules, and the Defuzzification Process. Fig. 3.1 is a diagram of the fuzzy inference system with the three main processes.

Fig. 3.1: Fuzzy Inference System

Fuzzification Process

The Fuzzification Process consists of a fuzzifier that transforms a crisp input into a fuzzy set of values based on its membership function (MF). A fuzzy set is a mathematical model of vague qualitative or quantitative data, frequently expressed in natural language. The membership function is a curve that maps each input to a membership value between 0 and 1. The fuzzification process allows the input to the system to be expressed in linguistic terms. The most commonly used membership functions in a fuzzy system are triangular, trapezoidal, and Gaussian.

The triangular membership function is specified by a triplet (a, b, c) as follows:

$$\mathrm{Triangle}(x: a, b, c) = \begin{cases} 0, & x < a \\ \dfrac{x-a}{b-a}, & a \le x \le b \\ \dfrac{c-x}{c-b}, & b \le x \le c \\ 0, & x > c \end{cases} \tag{3.1}$$

The parameters a and c locate the feet of the triangle and the parameter b locates the peak, as shown in Fig. 3.2.

Fig. 3.2: A triangular MF specified by (3, 6, 8)

The Gaussian membership function is specified by two parameters (m, σ) as follows:

$$\mathrm{Gaussian}(x: m, \sigma) = \exp\left(-\frac{(x-m)^2}{\sigma^2}\right) \tag{3.2}$$

The parameter m is the position of the centre of the peak, and σ is the standard deviation, which controls the width of the "bell". A Gaussian MF is shown in Fig. 3.3.

Fig. 3.3: A Gaussian MF specified by (2, 5)

The trapezoidal curve is a function of a vector, x, and depends on four scalar parameters a, b, c, and d, as given by (3.3) and represented by Fig. 3.4.

$$\mathrm{Trapezoidal}(x: a, b, c, d) = \begin{cases} 0, & x \le a \\ \dfrac{x-a}{b-a}, & a \le x \le b \\ 1, & b \le x \le c \\ \dfrac{d-x}{d-c}, & c \le x \le d \\ 0, & d \le x \end{cases} \tag{3.3}$$

Fig. 3.4: A Trapezoidal MF specified by (1, 5, 7, 8)

Inference Process

The inference process uses the fuzzy inference engine to map the input from the fuzzification process to the output, based on expert knowledge or rules. The role of fuzzy rules in the inference process is to capture imprecise modes of reasoning and to produce the fuzzy output from the fuzzy input. A fuzzy rule, also known as a fuzzy IF-THEN rule, is generally expressed as follows [6]:

IF (x is A) AND (y is B) THEN (z is Z)

where x, y, z are the variables and A, B, Z are linguistic values in the universe of discourse. This rule can be divided into two parts: the IF part, referred to as the antecedent or premise, which contains the fuzzy description of the measured input values, and the

THEN part, referred to as the consequent or conclusion, which defines a possible fuzzy output for every corresponding input. The inference process creates the fuzzy output by aggregating the outputs of several fuzzy rules. Fig. 3.5 illustrates the inference process, which involves three fuzzy rules over three variables: service, food, and tip.

Fig. 3.5: Fuzzy Inference Process

Defuzzification Process

The defuzzification process [7] translates the aggregate fuzzy output from the inference process into a quantifiable result or crisp output. The most popular defuzzification method is the centroid calculation, which returns the center of the area

under the curve. Other methods include bisector, middle of maximum (the average of the maximum values of the output set), largest of maximum, and smallest of maximum. Fig. 3.6 shows the defuzzification process using the centroid calculation method.

Fig. 3.6: Defuzzification Process

3.2 PROPOSED FUZZY FRAMEWORK

A new fuzzy logic based framework is developed using the built-in Fuzzy Inference System (FIS) of the Fuzzy Logic Toolbox [8] of MATLAB 7 to handle the imprecision and uncertainty present in COCOMO, the most widely used model for software cost estimation [9, 10]. The first realization of the fuzziness of several aspects of COCOMO was that of Fei and Liu [11], called F-COCOMO. Fuzziness is considered in COCOMO because the boundaries between the rating levels of some factors that strongly influence development cost are vague and indistinct. Fuzzy logic based cost estimation allows inputs and outputs to be represented linguistically, which contributes to more accurate effort estimation [12]. Many researchers have used triangular, trapezoidal and Gaussian MFs extensively [13, 14, 15, 16] to model the COCOMO parameters, but none of them has used the PI membership function. A fuzzy framework using the PI membership function is therefore proposed for software cost estimation with the COCOMO model. This technique is able

to handle the transition of the project attributes more smoothly and has produced commendable results compared with the earlier techniques. The proposed framework is compared with the fuzzy frameworks developed using Triangular and Gaussian membership functions.

The PI membership function is specified by four parameters (a, b, c, d) as follows:

$$\mathrm{Pi}(x: a, b, c, d) = \begin{cases} 0, & x \le a \\ 2\left(\dfrac{x-a}{b-a}\right)^2, & a \le x \le \dfrac{a+b}{2} \\ 1 - 2\left(\dfrac{x-b}{b-a}\right)^2, & \dfrac{a+b}{2} \le x \le b \\ 1, & b \le x \le c \\ 1 - 2\left(\dfrac{x-c}{d-c}\right)^2, & c \le x \le \dfrac{c+d}{2} \\ 2\left(\dfrac{x-d}{d-c}\right)^2, & \dfrac{c+d}{2} \le x \le d \\ 0, & x \ge d \end{cases} \tag{3.4}$$

The parameters a and d locate the feet of the curve, while b and c locate its shoulders, as shown in Fig. 3.7.

Fig. 3.7: A PI MF specified by (1, 4, 5, 10)

In this framework all the parameters of the COCOMO model, i.e. Size, Mode, and the 15 cost drivers, are fuzzified using a Fuzzy Inference System (FIS). The fuzzy framework is implemented in the following steps:

Step 1. Choice of membership functions

In this approach the PI membership function is chosen to model the COCOMO parameters; the triangular and Gaussian MFs are also used so that they can be compared with the PI fuzzy framework.

Step 2. Fuzzification of nominal effort

A Fuzzy Inference System (FIS) is developed to calculate the nominal effort, i.e. the effort without cost drivers, as shown in Fig. 3.8. The inputs to this system are MODE and SIZE. The output is the fuzzy nominal effort. The input variables SIZE and MODE, represented as PI MFs, are shown in Fig. 3.9 and Fig. 3.10 respectively. The output variable NOMINAL EFFORT, represented as a PI MF, is shown in Fig. 3.11. FIS are developed in the same way using triangular and Gaussian MFs.

Fig. 3.8: Fuzzy Inference System for Nominal Effort
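The membership functions considered in Step 1 can be written directly from Eqs. (3.1), (3.3) and (3.4). The following is a minimal Python sketch, not the MATLAB toolbox implementation used in this chapter; the function names `triangle`, `trapezoid` and `pi_mf` are illustrative, and the parameters are assumed to satisfy a < b < c for the triangle and a < b <= c < d for the other two.

```python
def triangle(x, a, b, c):
    """Triangular MF, Eq. (3.1): feet at a and c, peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoid(x, a, b, c, d):
    """Trapezoidal MF, Eq. (3.3): feet at a and d, shoulders at b and c."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def pi_mf(x, a, b, c, d):
    """PI MF, Eq. (3.4): quadratic spline edges, flat top between b and c."""
    if x <= a or x >= d:
        return 0.0
    if x <= (a + b) / 2:
        return 2 * ((x - a) / (b - a)) ** 2      # lower half of rising edge
    if x <= b:
        return 1 - 2 * ((x - b) / (b - a)) ** 2  # upper half of rising edge
    if x <= c:
        return 1.0                               # flat top
    if x <= (c + d) / 2:
        return 1 - 2 * ((x - c) / (d - c)) ** 2  # upper half of falling edge
    return 2 * ((x - d) / (d - c)) ** 2          # lower half of falling edge
```

With the parameters of Fig. 3.7, `pi_mf(2.5, 1, 4, 5, 10)` and `pi_mf(7.5, 1, 4, 5, 10)` both evaluate to 0.5, the crossover points of the two edges. Unlike the triangular MF, the PI curve has no corner at its feet or shoulders, which is the smoother transition the framework relies on.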

Fig. 3.9: Input Variable SIZE represented as PI MF

Fig. 3.10: Input Variable MODE represented as PI MF

Fig. 3.11: Output Variable EFFORT represented as PI MF

The fuzzy rules formed for size, mode and effort are as follows:

If size is s1 and mode is organic then effort is e11
If size is s1 and mode is semi-detached then effort is e12
If size is s1 and mode is embedded then effort is e13
If size is s2 and mode is organic then effort is e21
If size is s2 and mode is semi-detached then effort is e22
If size is s2 and mode is embedded then effort is e23
...
If size is s11 and mode is organic then effort is e11_1
If size is s11 and mode is semi-detached then effort is e11_2
If size is s11 and mode is embedded then effort is e11_3

Step 3. Fuzzification of Cost Drivers

Each cost driver is fuzzified using a separate FIS, and each is fuzzified with all three MFs, i.e. PI MF, Triangular MF and Gaussian MF. Thus a total of 15 × 3 = 45 FIS are developed for the cost drivers. Sample fuzzification

of VEXP based on Table 3.1 and Table 3.2 is shown in Fig. 3.12 and Fig. 3.13 using the triangular MF (TMF). The same is implemented using Gaussian and PI MFs.

Table 3.1: VEXP Cost Driver Range Specified in Months
Very low | Low | Nominal | High

Table 3.2: VEXP Effort Multiplier Range Definition
Very low | Low | Nominal | High

Fig. 3.12: Antecedent MFs for VEXP represented as TMF

Fig. 3.13: Consequent MFs for VEXP represented as TMF

The rules obtained from Fig. 3.12 and Fig. 3.13 are:

If vexpa (vexp antecedent) is vlow (very low) then vexpc (vexp consequent) is incsig (increased significantly)
If vexpa is low then vexpc is inc (increasing)
If vexpa is nom (nominal) then vexpc is uc (unchanged)
If vexpa is high then vexpc is dec (decreasing)

Sample fuzzification of TIME based on Table 3.3 and Table 3.4 is shown in Fig. 3.14 and Fig. 3.15 using the PI MF. The same is implemented using triangular and Gaussian MFs.

Table 3.3: TIME Cost Driver Range in Terms of Percentage
Nominal | High | Very High | Extra High
<=

Table 3.4: TIME Effort Multiplier Range Definition
Nominal | High | Very High | Extra High

Fig. 3.14: Antecedent MFs for TIME represented as PI MF

Fig. 3.15: Consequent MFs for TIME represented as PI MF

The rules obtained from Fig. 3.14 and Fig. 3.15 are:

If timea (time antecedent) is nom (nominal) then timec (time consequent) is uc (unchanged)
If timea is high then timec is inc (increasing)
If timea is vhigh (very high) then timec is incsig (increasing significantly)
If timea is ehigh (extra high) then timec is incdras (increasing drastically)

Sample fuzzification of PCAP based on Table 3.5 and Table 3.6 is shown in Fig. 3.16 and Fig. 3.17 using the Gaussian MF. The same is implemented using triangular and PI MFs.

Table 3.5: PCAP Cost Driver Range Defined in Terms of Percentile
Very Low | Low | Nominal | High | Very High

Table 3.6: PCAP Effort Multiplier Range Definition
Very low | Low | Nominal | High | Very High

Fig. 3.16: Antecedent MFs for PCAP represented as Gaussian MF
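Each cost-driver FIS in Step 3 has one input, one output and one rule per rating level, as in the VEXP rule base (vlow gives incsig, low gives inc, nom gives uc, high gives dec). The sketch below shows how such a system could be evaluated in Python. It assumes min implication, max aggregation and centroid defuzzification over a sampled output range; only the centroid method is named in this chapter, the other choices are assumptions. All set boundaries in `ANTECEDENT` and `CONSEQUENT` are illustrative placeholders, not the values of Table 3.1 and Table 3.2, and the names `tri` and `infer` are hypothetical.

```python
def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Placeholder antecedent sets: experience in months (assumed values).
ANTECEDENT = {
    "vlow": (-3.0, 1.0, 4.0),
    "low": (1.0, 4.0, 12.0),
    "nom": (4.0, 12.0, 36.0),
    "high": (12.0, 36.0, 60.0),
}

# Placeholder consequent sets: effort multiplier (assumed values).
CONSEQUENT = {
    "incsig": (1.10, 1.21, 1.32),
    "inc": (1.00, 1.10, 1.21),
    "uc": (0.90, 1.00, 1.10),
    "dec": (0.80, 0.90, 1.00),
}

# One rule per rating level, as in the VEXP rule base.
RULES = [("vlow", "incsig"), ("low", "inc"), ("nom", "uc"), ("high", "dec")]


def infer(x, lo=0.80, hi=1.32, steps=2000):
    """Return the crisp effort multiplier for a crisp driver value x."""
    # Fuzzification: firing strength of each rule's antecedent.
    strength = {out: tri(x, *ANTECEDENT[inp]) for inp, out in RULES}
    num = den = 0.0
    for i in range(steps + 1):
        z = lo + (hi - lo) * i / steps
        # Clip each consequent at its rule strength (min), then aggregate (max).
        mu = max(min(w, tri(z, *CONSEQUENT[out])) for out, w in strength.items())
        num += z * mu
        den += mu
    if den == 0.0:
        raise ValueError("input lies outside every antecedent set")
    return num / den  # centroid of the aggregated output set
```

With these placeholder sets, an input at the peak of `nom` fires only the `uc` rule and the centroid is close to 1.0, so the driver leaves the effort unchanged. Inputs between two peaks fire two rules at once and give a multiplier between the two corresponding centroids, which is how the FIS avoids the step change of a crisp rating table.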

Fig. 3.17: Consequent MFs for PCAP represented as Gaussian MF

The rules obtained from Fig. 3.16 and Fig. 3.17 are:

If pcapa (pcap antecedent) is vlow (very low) then pcapc (pcap consequent) is incsig (increasing significantly)
If pcapa is low then pcapc is inc (increasing)
If pcapa is nom (nominal) then pcapc is uc (unchanged)
If pcapa is high then pcapc is dec (decreasing)
If pcapa is vhigh (very high) then pcapc is decsig (decreasing significantly)

Step 4. Estimated effort calculation by integrating components

The estimated effort is obtained by multiplying the nominal effort obtained in Step 2 by the effort adjustment factor (EAF) obtained in Step 3, where the EAF is the product of the effort multipliers corresponding to the individual cost drivers.
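The arithmetic of Step 4 can be sketched in a few lines of Python. The helper names are hypothetical. The crisp `nominal_effort` function is included only as the non-fuzzy baseline and uses the Intermediate COCOMO 81 coefficients published by Boehm [10], which are not restated in this chapter; in the fuzzy framework the nominal effort would instead come from the FIS of Step 2.

```python
from math import prod

# Intermediate COCOMO 81 coefficients (a, b) per mode, taken from Boehm's
# published model; an assumption here, since this chapter does not list them.
COEFFS = {
    "organic": (3.2, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (2.8, 1.20),
}


def nominal_effort(kloc, mode):
    """Crisp nominal effort in person-months: a * KLOC ** b."""
    a, b = COEFFS[mode]
    return a * kloc ** b


def effort_adjustment_factor(multipliers):
    """EAF: product of the effort multipliers of all cost drivers."""
    return prod(multipliers)


def estimated_effort(nominal, multipliers):
    """Step 4: nominal effort scaled by the EAF."""
    return nominal * effort_adjustment_factor(multipliers)
```

For example, `estimated_effort(100.0, [1.1, 0.9, 1.0])` scales a nominal effort of 100 person-months by an EAF of 0.99. The multipliers passed in would be the defuzzified outputs of the 15 cost-driver FIS.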

3.3 RESULTS

Experiments are carried out on some of the original projects from the COCOMO81 dataset [9]. The efforts estimated using the mathematical equations of the COCOMO model and using the fuzzy framework of COCOMO with PI MF, Triangular MF and Gaussian MF are tabulated and compared in Table 3.7. The evaluation consists of comparing the estimated effort with the actual effort. Of the many evaluation criteria for software effort estimation, we applied the most frequently used one, the Magnitude of Relative Error (MRE), calculated as discussed in Section 2.4.

Analysis of the results shows that effort estimation using the PI membership function gives more accurate results for most projects than the triangular and Gaussian membership functions. The effort obtained by fuzzifying the size, mode and all 15 cost drivers using PI MF yields the better estimate. For example, the MRE calculated for project ID (P.ID) 6 using COCOMO equations, triangular MF, Gaussian MF and the PI MF is 21.6, 20.33, 75 and respectively. This shows a reduction in the relative error, so the proposed technique of modelling COCOMO parameters is more suitable for effort estimation.
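For reference, the MRE comparison can be reproduced with a short Python helper. This sketch uses the standard definition, the absolute difference between actual and estimated effort divided by the actual effort and expressed as a percentage, which is assumed to match Section 2.4; the function name `mre` is illustrative.

```python
def mre(actual, estimated):
    """Magnitude of Relative Error as a percentage of the actual effort."""
    if actual <= 0:
        raise ValueError("actual effort must be positive")
    return abs(actual - estimated) / actual * 100.0
```

Under this definition an estimate of 80 person-months against an actual effort of 100 gives an MRE of 20, and a lower MRE for the same project indicates the more accurate model.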

Table 3.7: Effort comparison using COCOMO model, Triangular MF, PI MF and Gaussian MF
P.ID | Mode | Size | EAF | Actual Effort | COCOMO Effort | TMF EAF | TMF Effort | PIMF EAF | PIMF Effort | GAUSSMF EAF | GAUSSMF Effort

3.4 CONCLUSIONS

This chapter proposed a fuzzy framework for cost estimation using the PI membership function (MF). It is built on the COCOMO model, and all the parameters of COCOMO, i.e. size, mode, cost drivers and effort, are fuzzified using PI, Triangular and Gaussian membership functions. The experiments showed that the new approach using PI MF performs better than TMF (triangular membership function), Gaussian MF and the Intermediate COCOMO model used in earlier research on software cost estimation.

REFERENCES

[1] Zadeh L A: Fuzzy Logic, Neural Networks, and Soft Computing, Communications of the ACM, Vol. 37, No. 3, 1994, pp
[2] Zadeh L A: Fuzzy Sets, Information and Control, Elsevier, Vol. 8, 1965, pp
[3] Sivanandam S N and Deepa S N: Principles of Soft Computing, Wiley, India,
[4] Jang J-S R, Sun C T and Mizutani E: Neuro-Fuzzy and Soft Computing, A Computational Approach to Learning and Machine Intelligence, Prentice-Hall, Inc., 1st Edition,
[5] Shin Y C and Xu C: Intelligent Systems: Modeling, Optimization, and Control, CRC Press, Taylor and Francis Group, 2009, pp
[6] Sumathi S and Surekha P: Computational Intelligence Paradigms: Theory and Applications using MATLAB, CRC Press,
[7]
[8]
[9] promise.site.uottawa.ca/SERepository/datasets-page.html
[10] Boehm B W: Software Engineering Economics, Prentice Hall, Englewood Cliffs, NJ,
[11] Fei Z and Liu X: F-COCOMO - Fuzzy Constructive Cost Model in Software Engineering, Proceedings of IEEE International Conference on Fuzzy Systems, IEEE Press, San Diego, 8-12 March 1992, pp
[12] Idri A and Abran A: COCOMO Cost Model using Fuzzy Logic, 7th International Conference on Fuzzy Theory and Technology, Atlantic City, New Jersey, 27 Feb.-3 March 2000, pp
[13] Ch. Satyananda Reddy and KVSVN Raju: Improving the Accuracy of Effort Estimation through Fuzzy Set Representation of Size, Journal of Computer Science, Vol. 5, No. 6, 2009, pp

[14] Ch. Satyananda Reddy and KVSVN Raju: An Improved Fuzzy Approach for COCOMO's Effort Estimation using Gaussian Membership Function, Journal of Software, Vol. 4, No. 5, 2009, pp
[15] Kazemifard M, Zaeri A, Ghasem-Aghaee N, Nematbakhsh M A and Mardukhi F: Fuzzy Emotional COCOMO-II Software Cost Estimation (FECSCE) using Multi-Agent Systems, Applied Soft Computing, Elsevier, Vol. 11, No. 2, 2011, pp
[16] Saliu M O and Ahmed M: Soft Computing based Effort Prediction Systems - A Survey, in: E. Damiani, L. C. Jain (Eds), Computational Intelligence in Software Engineering, Springer-Verlag, July 2004, ISBN