Measuring Perceived Software Quality


M Xenos and D Christodoulakis
Department of Computer Engineering and Informatics, University of Patras, Rion 26500, Greece
e-mail: xenos@cti.gr

This paper presents a method for measuring customers' perception of software quality. We argue that, although the importance of the perceived product quality is recognised world wide, there does not exist a rigorous method for measuring customer perception of product quality. This paper presents a method expanded to measure not only end users' perception of the product, but also company employees' perception of the quality of the internal deliverables produced within the company. Additionally, in this paper we present examples of the method's application on a set of projects, in parallel with internal measurements, using a set of commonly used product metrics. Subsequently, we compare measurement results that are derived from customer perception measurements to results that are derived from internal measurements, and we discuss the advantages and disadvantages of each method.

Keywords: software quality assurance, software metrics, user satisfaction measurements

Many definitions of software quality have been published, which in general agree on what quality means, and their agreement can be enshrined by the phrase "satisfaction of customer requirements". In simple language, software must do what the customer expects it to do. The customer plays an important role in software quality. The international standards ISO9000 [1,2], IEEE [3] and Baldrige [4,5] place emphasis on customer perceived quality and expect that customer satisfaction be strongly linked to all functions of a business. Within the scope of a company's quality assurance program, however, customers are not only the end users of the product, but also the employees that use the results available at the end of each phase of the software life cycle. Therefore, implementation teams are the customers that use deliverables produced by the design team and, in turn, implementation teams produce deliverables for their customers, which are the testing and maintenance teams.
Throughout this paper the term customer will be used in such a broad manner, including the internal company teams acting as customers as well as the end users of the product. This producer-customer relationship requires a method in order to measure the customer perception of quality. The ability to initially measure and eventually control customer perceived quality is a major success factor in software business. Despite the indications derived from internal measurement, the end user is the ultimate judge of product quality. In cases of disagreement between internal measurement and end user perceived quality, the best company choice is to conform with end user opinion. Furthermore, modern software companies need to measure not only the perception of product quality by end users, but also the perception of company employees of the quality of internal deliverables. As previously explained, such employees operate as customers who receive the deliverables which other employees produce.

Quality assurance teams in modern software companies measure product quality by applying a method which relates internal measurable quantities with external quality characteristics. Many examples of such metrics and their interpretation can be found in software measurements literature. For example, function points [6] are used in order to estimate product cost, cyclomatic complexity [7] is used in order to estimate software complexity and maintainability, Halstead's [8] Effort Estimator is used in order to identify required effort and time, etc. The great majority of quality assurance teams follow a standard methodology that guides them on how to organise and perform measurements and on how to relate measurement results to the product quality characteristics they aim to control. Although they recognise the importance of measuring customer perceived quality, the surveys that measure it are not performed with a similarly rigorous approach. Furthermore, companies rarely measure the quality of internal deliverables by handling them as products and the employees receiving them as customers. Therefore, a method is needed allowing software companies to rigorously measure the customer perception of product quality.
Such measurements could also be used for internal deliverables quality assessment, as well as for evaluating the performance of measuring procedures and calibrating internal product metrics used by the quality assurance team.

This paper presents a method aiding rigorous organisation of customer perceived quality measurements. This method is applicable to systems having a sufficient number of customers, adequate to produce a volume of responses suitable for analysis. The method consists of techniques offering increasing reliability with a similar increase in cost. Additionally, examples from surveys applying this method are presented. Measurements using this method and measurements using a set of internal product metrics on the same projects are compared and correlation results are discussed.

Xenos M. & Christodoulakis D., Measuring perceived software quality, Pre-print version of the paper published in Information and Software Technology Journal, Butterworth Publications, Vol. 39, Issue 6, pp , June 1997.

Product quality measures

According to Fenton [10], a measure is an empirical objective assignment of a number (or symbol) to an entity to characterise a specific attribute. Therefore, surveys of customer opinion cannot be considered as measurements, since they are based on non objective assignments that vary according to user judgement. Jones [11] recognises the need to measure customers' opinions and distinguishes such measures (he calls them "soft data measurements") from other measurements which can be quantified with no subjectivity (he calls them "hard data measurements"). Although such hard data measurements are objective and therefore legal measures, they do not actually measure product quality characteristics directly, but instead they measure internal quantities which attempt to relate to these characteristics. Unfortunately, this relation is not always successful.

McCall [12], in a classical paper, proposed a three level hierarchy model for product quality measurements. The first level consists of quality characteristics (called "factors"), the second level consists of criteria decomposing the higher level factors and the third level, the lowest level, consists of metrics being used to measure the criteria. Due to these factors-criteria-metrics levels, this model is also called the FCM model. McCall proposed that low level metrics should be mapped to a set of questions that would be used in order to measure each criterion. The same year, Boehm [13] also proposed a model based on a similar approach. This model was probably the basis for the international standard ISO9126 [14], which was proposed many years later.
The basic idea of this model is that low level metrics should be used instead of questions, in order to objectively measure attributes that are related to higher level characteristics. The problem with all these models is their inability to combine all metrics in order to provide a global measure that will actually estimate software quality. Conte [15] describes such a virtual global metric which he calls NWSC, Normalised Weighted Score. This virtual measure combines all metrics, by summing them according to weights selected by the customer. Unfortunately, such a super metric that will combine all measurements in order to provide a normalised score indicating the quality of a product cannot exist when measuring hard data. Such a metric, though, can be achieved when measuring customer perceived quality using surveys. In the following section, a method that provides perceived quality measurements using such a metric based on customer perceived quality surveys is presented.

The method

Surveys are a valuable tool for a quality assurance team. Modern software companies have recognised the importance of surveys and they are using both internal and external satisfaction surveys to measure everything from aspects of the employee's workplace [16] to external customer satisfaction. As argued by Kaplan [17], surveys allow focusing on just the issues of interest, since they offer complete control on the questions being asked. Furthermore, surveys are quantifiable and therefore are not only indicators in themselves, but also allow the application of more sophisticated analysis techniques appropriate to organisations with higher levels of quality maturity. In our studies, conducted in order to measure perceived customer quality, we have used the method of mail surveys. Although surveys are valuable tools, they have to deal with four main problems:

- subjectivity of measurements,
- difficulty of statistically analysing results,
- lack of a weighing technique,
- frequency of errors.

Handling the problems

Subjectivity of measurements. The simple truth is that subjectivity of measurements will remain a problem, regardless of the measurements methodology.
However, the adoption and application of simple rules when planning the survey and designing the questionnaire will improve the quality of the measurements. The quality engineer who is setting up a mail survey using questionnaires must follow guidelines [18] on how to structure the questionnaire formally in order to minimise the subjectivity due to various interpretations of questions or choice levels. A synopsis of the guidelines we propose [19] is as follows:

- An introductory note should describe the aim of the questionnaire and the first question must be highly related to this aim.
- The vocabulary and phrasing must be clear and easy to understand.
- Explanations must be precise and brief. Furthermore, possible choices must be carefully selected and tested in order to cover all possible answers.
- The questions should be attractive to the users and the size of the questionnaire must be kept short.
- The questionnaire should be well structured and the questions must follow a logical order without references to previous questions.
- Questions with pre defined answers should be used instead of open questions, where possible.
- Questions should be objective, to avoid leading to a specific answer (called "halo effect") or affecting user judgement.
- Concepts such as probability, which may confuse the user, should be avoided.

If these guidelines are followed during survey design, the customers will reply guided by simple rules on how to make their selection, choosing from predefined choices or, even better, selecting on choice bars. Therefore, problems of misunderstanding, or of choosing an inappropriate answer because of subjective judgement, will be minimised, resulting in measurements with increased objectivity.

Statistical analysis. As Montgomery [20] states: "Statistical methods play a vital role in quality improvement". But the statistical method that will be used is related to the type of information contained in the measurement results. According to Yeh [21], typical measurements can be classified into one of four standard measurement scales [22]:

- Nominal scale
- Ordinal scale
- Interval scale
- Ratio scale

The problem with survey measurements is that survey data based on ordinal scale cannot be statistically analysed using formal statistical methods. This is a common problem when using questionnaires with multiple choice. No information regarding the distance between two neighbouring choices can be obtained. The solution is to use choice bars, when possible, or to provide specific instructions which will explain that choices are in interval scale. In this case, the interviewees must fully understand that the multiple choices are at equal distance from each other.

Weighing customer opinions. In many cases (especially when measuring internal perceived quality) it is not correct to weigh all user opinions equally. Averaging survey data does not take into account the significance of each user's opinion. Therefore, there is a need for techniques that will evaluate users' opinions according to their qualifications. The proposed techniques, as presented in the following section, take into account user qualifications and weigh user opinion based on their qualifications.

Identifying and preventing errors. Due to the nature of surveys, incorrect responses will occur. Such incorrect responses are responses not representing user opinion and henceforth will be called errors. In our surveys, we have measured a significant number of such errors caused by various factors that might seem extreme, but do occur. Such reasons are:

- The user did not answer the questionnaire himself/herself, but gave it to someone else who was inadequate to respond.
- The user answered the questionnaire very carelessly and marked randomly when he/she was confused, or just did not bother to read the instructions.
- The user started to answer with enthusiasm, but lost interest somewhere in the middle of the questionnaire and just made some random choices in order to finish it.
- The user responded with enthusiasm throughout the questionnaire but misunderstood some questions and unintentionally provided some wrong responses.

Such errors can be prevented by following the simple rules presented previously, but cannot be eliminated. It is a challenge to design techniques that will detect such errors in order to handle answered questionnaires containing a large number of errors. Unfortunately, what Pressman [23] said about software testing also applies here: such techniques cannot ensure the absence of errors, but they can only show that errors are present. Two of the proposed techniques presented in the following section are used in order to detect such errors.

The techniques

The main aim of this paper is to present a rigorous approach to perceived product quality measurements that will help a quality manager to include such measurements in the company's quality assessment program. Such measurements will be carefully structured surveys to produce measurement results with a minimum degree of subjectivity, easy to analyse, respecting customer qualifications and as error free as possible. The techniques proposed in order to measure the customers' perception of software quality are:

- QWCO
- QWCO_S
- QWCO_DS

These techniques are ordered with increasing reliability and increasing cost. The quality manager must select the appropriate technique according to his/her needs and apply it. QWCO (Qualifications Weighed Customer Opinion) is measured using the formula shown in equation (1), QWCO_S (Qualifications Weighed Customer Opinion with Safeguards) is measured using the formula shown in equation (2) and QWCO_DS (Qualifications Weighed Customer Opinion with Double Safeguards) is measured using the formula shown in equation (3).

$$\mathrm{QWCO} = \frac{\sum_{i=1}^{n} (O_i \cdot E_i)}{\sum_{i=1}^{n} E_i} \qquad (1)$$

$$\mathrm{QWCO}_S = \frac{\sum_{i=1}^{n} \left( O_i \cdot E_i \cdot \frac{S_i}{S_T} \right)}{\sum_{i=1}^{n} \left( E_i \cdot \frac{S_i}{S_T} \right)} \qquad (2)$$

$$\mathrm{QWCO}_{DS} = \frac{\sum_{i=1}^{n} \left( O_i \cdot E_i \cdot \frac{S_i}{S_T} \cdot P_i \right)}{\sum_{i=1}^{n} \left( E_i \cdot \frac{S_i}{S_T} \cdot P_i \right)} \qquad (3)$$

The prime aim of all these techniques is to weigh customers' opinions according to their qualifications. In order to achieve this, O_i measures the normalised score of customer i opinion, E_i measures the qualifications of customer i, and n is the number of customers interviewed. Therefore, each customer contributes to the average according to his/her qualifications.

The QWCO technique, although it weighs customer opinions according to their qualifications, does not handle errors. In order to detect errors, we have proposed and used a number of safeguards embedded into the questionnaires, as shown in equation (2) representing the QWCO_S technique.
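As a rough illustration of the three techniques, the following Python sketch computes QWCO, QWCO_S and QWCO_DS as qualification-weighted averages, assuming that reading of equations (1) to (3). The `Response` class, its field names and the sample values are hypothetical and are not taken from the surveys reported here.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """One customer's questionnaire result (hypothetical field names)."""
    opinion: float                 # O_i, normalised opinion score in [0, 1]
    qualification: float           # E_i, qualification score
    safeguards_passed: int = 0     # S_i, safeguards answered without error
    qualification_ok: bool = True  # P_i = 1 when no error was found in the qualification part


def qwco(responses):
    """Qualification-weighted mean of the opinions (assumed form of equation 1)."""
    total_weight = sum(r.qualification for r in responses)
    return sum(r.opinion * r.qualification for r in responses) / total_weight


def qwco_s(responses, total_safeguards):
    """As qwco, with each weight scaled by S_i / S_T (assumed form of equation 2).

    Raises ZeroDivisionError if every response has zero weight.
    """
    weights = [r.qualification * r.safeguards_passed / total_safeguards
               for r in responses]
    weighted_sum = sum(r.opinion * w for r, w in zip(responses, weights))
    return weighted_sum / sum(weights)


def qwco_ds(responses, total_safeguards):
    """As qwco_s, but responses with P_i = 0 contribute nothing (assumed form of equation 3)."""
    kept = [r for r in responses if r.qualification_ok]
    return qwco_s(kept, total_safeguards)


# Hypothetical sample: three customers, S_T = 4 safeguards in the questionnaire.
S_T = 4
sample = [
    Response(opinion=0.8, qualification=1.0, safeguards_passed=4),
    Response(opinion=0.4, qualification=0.5, safeguards_passed=2),
    Response(opinion=0.9, qualification=0.25, safeguards_passed=4,
             qualification_ok=False),
]

print(round(qwco(sample), 4))          # plain qualification weighting
print(round(qwco_s(sample, S_T), 4))   # careless answers count for less
print(round(qwco_ds(sample, S_T), 4))  # unreliable qualifications are rejected
```

In this sample the second customer failed half of the safeguards, so that opinion carries half of its qualification weight under QWCO_S, and the third customer is excluded altogether under QWCO_DS.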

A safeguard is defined as a question placed inside the questionnaire so as to measure the correctness of responses. Therefore, safeguards are not questions aiming to measure customer perceived quality, but control questions aiming to detect errors. In equation (2), S_i is the number of safeguards that customer i has replied correctly to, and S_T is the total number of safeguards. Since the use of the QWCO_S technique implies the use of at least one safeguard in the questionnaire, division by S_T is always valid.

Finally, the QWCO_DS technique, as shown in equation (3), uses the safeguards not only in order to detect errors when measuring customers' opinions, but also in order to detect errors when measuring customers' qualifications. In equation (3), the value of P_i can be 0 or 1. The value of P_i is zero if even a single error has been detected when measuring the qualifications of customer i. P_i is set to 1 only if the safeguards have not detected any errors while measuring the qualifications of customer i. This approach results in the rejection of a customer's responses if errors were detected while measuring his/her qualifications. The reasoning for this approach is based on the following concept: a customer who is unreliable when answering questions regarding his/her qualifications cannot contribute to the overall perceived software quality by having his/her opinion weighed according to such fake qualifications.

Measuring customers' qualifications. In order to measure customer qualifications we have presented [24] and applied a technique that allows the collection of data not only for the opinion O_i of customer i on the perceived software quality, but also for the qualifications of customer i. Each customer is requested to fill in a set of questions requiring information on three different aspects of his/her qualifications: personal background, syntactic knowledge and semantic knowledge. Personal background is the collection of all customer qualifications which are not related to computer applications or the actual product itself.
Syntactic knowledge is the knowledge of existing computer applications and the familiarity with the use of computers in general. According to the nature of the measured product, syntactic knowledge questions can be customised in order to inquire about knowledge of specific applications related to the product. Semantic knowledge measures how well the customer knows the semantics of the problem automated by the product, meaning how well the customer knows the process which the software aims to facilitate. Naturally, when measuring customer qualifications, we measure not only the knowledge, but also the years of experience of each customer. After experiments, we have assigned as default weights 0.2 for the personal background, 0.4 for the syntactic and 0.4 for the semantic knowledge. This means that personal background contributes 20% of the overall customer qualifications, syntactic knowledge contributes 40% and semantic knowledge contributes the remaining 40%. The quality manager could modify these values according to the specific problem characteristics.

Measuring customers' opinions. In order to measure customer opinion, we use questionnaires based on the ISO9126 international standard. Each of the six main factors of the standard (functionality, reliability, usability, efficiency, maintainability and portability) has been decomposed into a number of criteria, in a similar manner to that in McCall's model, and finally into a set of questions. The actual size of the final questionnaire, the weight of each factor and the specific criteria that will be included in order to measure each factor vary according to the specific project requirements and depend on those quality factors on which the quality plan of the project has focused. An example of such a question, being used in order to measure the criterion "time behaviour" which will be used for the estimation of the factor "efficiency" for a database client program, is shown in figure 1.

Please rate the program's performance regarding time behaviour:
9-10  Response time equals or exceeds the requirements under all conditions and resource usage.
6-8   Response time equals or exceeds the requirements in most conditions with minor limitations when resource availability is decreased. Even with these limitations the program can be used without modifications.
3-5   Response time is decreased below the acceptable limits in many situations. The system can be used but with many limitations.
0-2   Response time is decreased below the acceptable limits so often that the program cannot be used.

Figure 1. Example question from O_i measurements

In the example question illustrated in figure 1, the user is prompted to rate the program with an integer from 0 to 10. Instructions given in the description at the beginning of the questionnaire state that all responses are in interval scale and that 10 rates a perfectly satisfactory program, whereas 0 rates a completely useless program. The specific guidelines for each question are given in order to guide the user in selecting the appropriate response.

Using safeguards. For controlling the errors, in our surveys we have used three different types of safeguards embedded into the questionnaires measuring perceived product quality:

- Control questions
- Repeated questions phrased differently
- Repeated questions offering different types of responses

Control questions are questions which can be answered only by one particular response. Any other response is an indication of error. Repeated questions phrased differently are questions with exactly the same meaning, but rephrased. These questions are placed into different areas within the questionnaire and have exactly the same choices as candidate answers. The selection of two different choices, no matter how close they are to each other, is considered an error. Repeated questions offering different types of responses are questions with exactly the same phrasing but with entirely different types of offered responses. In our surveys we have used such questions with multiple choice or ratio request in their first appearance and with a choice bar in their second appearance. Naturally, the second appearance has not been placed near the first. A different response to these same questions, no matter how close the two responses are, is considered an error. An example of a safeguard (repeated questions offering different types of responses) is shown in the example questions illustrated in figures 2 and 3. These two questions were placed in two entirely different parts of the questionnaire.

Please rate the program's performance regarding ease of use:
9-10  The program can be used without any training. It attracts the user and provides a perfect working environment. On line help is always available on any item and under any conditions.
7-8   The program can be used with minor prior training. On line help is almost always available.
4-6   The program can be used after a training period. On line help is generally, but not always, available and on many occasions the user has to request external assistance.
1-3   The program can be used only after prior extensive training. On line help is not provided or is totally ineffective.
0     The program is so difficult to use that it cannot be used at all.

Figure 2. First part of a safeguard

Safeguards can be used not only to preserve the integrity of answers during the survey, but also to measure and control the effectiveness of the questionnaire structure. We have used safeguards in the early stages of survey design, before finalising the structure of the questionnaire, in a small pilot survey with a limited number of interviewees. The purpose of this pilot survey has been to use the safeguards in order to measure the average number of errors produced when using some alternative questionnaires. These questionnaires contain the same or similar questions with alternative structures and choice types. The questionnaire which produces the minimum measured number of errors is the one selected for the final survey.
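The safeguard check for one answered questionnaire might be sketched as follows. This is only an illustration: the function name, the question identifiers and the assumption that both appearances of a repeated question are recorded on the same 0 to 10 scale are ours, not part of the method's definition.

```python
def count_passed_safeguards(answers, control_questions, repeated_pairs):
    """Return (S_i, S_T) for one answered questionnaire.

    answers           -- dict: question id -> value chosen by the customer
    control_questions -- dict: question id -> the only acceptable value
    repeated_pairs    -- list of (first_id, second_id) for repeated questions,
                         whether rephrased or offering another response type
    """
    passed = 0

    # A control question is passed only by its one acceptable response.
    for question_id, expected in control_questions.items():
        if answers.get(question_id) == expected:
            passed += 1

    # A repeated question is passed only when both appearances agree exactly;
    # any difference counts as an error, however small it is.
    for first_id, second_id in repeated_pairs:
        first = answers.get(first_id)
        second = answers.get(second_id)
        if first is not None and first == second:
            passed += 1

    total = len(control_questions) + len(repeated_pairs)
    return passed, total


# Hypothetical questionnaire with one control question and two repeated pairs.
answers = {"c1": "no", "q3": 8, "q17": 8, "q5": 6, "q22": 7}
control_questions = {"c1": "no"}
repeated_pairs = [("q3", "q17"), ("q5", "q22")]

s_i, s_t = count_passed_safeguards(answers, control_questions, repeated_pairs)
print(s_i, s_t)  # safeguards passed, safeguards in total
```

The pair ("q5", "q22") fails even though the two answers differ by a single point, which mirrors the rule that the distance between the two responses is irrelevant.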
Using this method we have managed to significantly reduce the measured number of errors and, therefore, to improve the overall quality of the questionnaire. Our purpose has been to be able to detect errors, but also to use the experience from this error detection phase to succeed in error prevention.

Please rate the program considering how easy it is to use. Answer by circling the response on the choice bar (select 0 if the program is so difficult to use that it cannot be used at all, and select 10 if the program is very easy to use in a way that attracts the user and provides a perfect working environment).

Figure 3. Second part of the safeguard

Table 1 presents the error rates, detected by safeguards, in pilot phases from four different surveys (rows A, B, C and D). Two to four alternative questionnaire structures have been produced (columns Q1 to Q4) for each of the above cases, and were applied to a limited volume of interviewees. The errors measured using the safeguards on the alternative questionnaires are ordered with the worst error rate in the first column and the best in the last column. As shown in table 1 (case C), the average number of errors (detected by the safeguards) was reduced from 11.29% to 0.98%. Such a reduction, in an actual survey using 1000 interviewees, indicates that without using either safeguards or the pilot phase, the final survey answer sheets would have a 10% higher error rate than by using the QWCO_S technique. Such a high error rate affects the integrity of the survey's findings and introduces a significant risk in the decision making based on the questionnaire results.

Table 1. Detected error rates in various surveys

     Q1       Q2      Q3      Q4
A    3.44%    .37%    .05%
B    6.55%    .33%    .08%    0.88%
C    11.29%   2.33%   0.98%
D    3.22%    .20%

Another issue, in using the QWCO_S and the QWCO_DS techniques, is to decide on the number of safeguards that will be used within the questionnaire. Using a great number of safeguards will cause side effects, such as increasing the size of the questionnaire and therefore causing more errors. This might result in a paradox: using safeguards to detect errors that were caused by excessive use of safeguards. Naturally, as in any real life situation, exaggerations cannot offer acceptable solutions. The manager who is responsible for the survey and the questionnaire design must decide on the number of safeguards to be used, with respect to the overall questionnaire size. In our surveys we have used a number of safeguards ranging from 5% to 10% of the questions. This number varies according to the actual questionnaire size. (In small questionnaires the percentage of safeguards used is higher than in large questionnaires.)

Human Aspects. The use of the QWCO_DS technique could bring up the human aspects [25] related to customer perceived software quality measurements. The use of safeguards when measuring customer qualifications could be noticed and misunderstood by the customers. This could be a major problem when such measurements are used in internal company surveys. The detection of the safeguards embedded within the questionnaire might be interpreted by the employees as an attempt to measure their qualifications which will eventually affect their career chances. This might lower the employees' morale or change their attitude towards the company. Therefore, the choice of using the QWCO_DS technique on an internal company survey is a difficult choice that the quality manager must make, taking into account all related human dimensions.
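The pilot-phase selection among alternative questionnaire structures can be illustrated with a short Python sketch. It assumes that the error rate of a variant is the fraction of safeguard checks that failed in the pilot, which is our reading rather than a definition given in the text; the variant names and counts are hypothetical.

```python
def error_rates(pilot_results):
    """Map each questionnaire variant to its detected error rate.

    pilot_results -- dict: variant name -> (errors detected by safeguards,
                                            safeguard checks performed)
    """
    return {name: errors / checks
            for name, (errors, checks) in pilot_results.items()}


def select_questionnaire(pilot_results):
    """Pick the variant with the lowest detected error rate."""
    rates = error_rates(pilot_results)
    best = min(rates, key=rates.get)
    return best, rates


# Hypothetical pilot: three alternative structures, 200 safeguard checks each.
pilot = {
    "Q1": (24, 200),
    "Q2": (6, 200),
    "Q3": (2, 200),
}

best, rates = select_questionnaire(pilot)
for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {rate:.2%}")  # worst first, as in the layout of table 1
print("selected for the final survey:", best)
```

Sorting from the worst to the best rate reproduces the column order used for the alternative questionnaires, with the last entry being the structure carried forward.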

Applications of the Method

The method has been applied within the scope of our measurements program aiming to measure customer perceived quality. We have used the method in case studies [26] on a number of software projects, in parallel with internal software quality measurements. For example, in the case study presented in this section, we have used an automated methodology [27] based on the Athena [28] measurement environment facilitating internal software quality measurements, in order to measure internal software quality characteristics. In this case study, we have used surveys based on the QWCO_S technique in order to measure the customers' perception of quality.

Internal measurements were based on a triptych of commonly used internal metrics (Halstead, McCabe and Tsai [29]), which were completely automated and therefore inexpensive. The problem with internal quality measurements is that they measure internal software characteristics and not the desired external quality factors. The interpretation of the internal metrics, in order to estimate these factors, is difficult and not always successful. External measurements (customer perceived quality measurements), for the purpose of this case study, were based on surveys. This kind of measurement has a higher cost level than internal measurements, but offers results on customers' perception of the desirable external quality characteristics.

Table 2 shows the normalised measurement results of the 46 software products measured using the QWCO_S technique, in ascending order (worst measurements first). These projects were the measured samples for the presented case study. The measurements of customer perception of quality were based on a total of 55 responded questionnaires with proper product evaluation from various users. Table 3 shows the internal measurement results for the same 46 projects in the same order as table 2.
The internal measurement results are normalised and derived using a combination formula for the metrics implemented into the internal measurements program. This combination metrics formula (CMF) does not measure a physical quantity of the product, but combines all metric results. Its sole purpose is to provide a collective mechanism for comparison, as shown in equation (4).

Table 2. QWCO_S measurements

Table 3. Normalised CMF measurements

The metrics that were chosen to participate in the CMF are the weighed average language level (λ_wa), the essential size ratio (R), the weighed average cyclomatic complexity (V_wa) and the data structure complexity metric (T). As an indication of the language level for the entire project, the weighed average language level, which is shown in equation (5), was used in order to measure the contribution of the language level of every routine to the overall project language level. A project is a collection of routines created by various programmers. Each one of these routines has a different language level and contributes to the project language level according to the routine's size.

$$\mathrm{CMF} = \lambda_{wa} \cdot R \cdot V_{wa} \cdot T \qquad (4)$$

$$\lambda_{wa} = \frac{\sum_i (N_i \cdot \lambda_i)}{\sum_i N_i} \qquad (5)$$

The use of the essential size ratio R, which is measured as shown in equation (6), is justified by the analyses [30,31] indicating that N̂ measures the optimal module size without any code impurities. Therefore, R provides an indication of the proper, or not, use of the programming language.

$$R = \frac{\hat{N}}{N} \qquad (6)$$

In a similar manner to λ_wa, the cyclomatic complexity weighed average (V_wa) is the ratio of 10 to the weighed average of McCabe's metric for each routine. The number 10 is the highest acceptable complexity proposed by McCabe. The formula to measure V_wa is shown in equation (7). Finally, in order to measure the data structures complexity T, the highest polynomial exponent (T_ex) from the derived data structure polynomials was used, as shown in equation (8). We must emphasise again that proper analysis is based on the individual results of all metrics, but since the complete set of results for all the metrics used is not easy to present in a paper, we use CMF, which provides a way to combine all metrics in an easy to present manner.

$$V_{wa} = \frac{10}{\sum_i (N_i \cdot Vg_i) \,/\, \sum_i N_i} \qquad (7)$$

$$T = \frac{1}{T_{ex} + 1} \qquad (8)$$

The correlation between the measurement results of tables 2 and 3 was measured to be 70.44%. Such correlation shows that the internal measures used do not completely conform to the customer perceived quality measurements. The scatter plot of figure 4 illustrates this correlation by representing the QWCO_S measurements on the horizontal axis and the normalised CMF results on the vertical axis. The diagonal line is the correlation line, representing the line where all points would be if the two measurement methods were 100% correlated. The points which are marked below this correlation line represent projects that, although they have not satisfied internal measurement standards, have achieved higher than expected scores in customer perceived quality measurements. The points marked above the correlation line represent projects that have satisfied internal measurement standards, but have not achieved equally high scores in customer perceived quality measurements.

Figure 4. Scatter plot of 46 projects (horizontal axis: QWCO_S, vertical axis: CMF, both from 0.0 to 1.0)

As one can see in the scatter plot, there are almost no projects which fail the internal quality measurements and achieve high scores in customer perceived measurements. Very few points are below the correlation line and no points are at a great distance below this line. On the contrary, many points are not only just over the correlation line, but further above. We must emphasise the fact that all the individual measurements for each metric used produced similar scatter plots. The CMF is used only to serve as a collective formula for presentation purposes.

This can lead us to the conclusion that, although internal metrics can detect programs that might get low scores in terms of customers' perception of their quality, satisfaction of internal measurement does not guarantee achievement of high scores in customer perceived quality measurements. Keeping in mind that customer perception of quality is a success measure for software companies, we can conclude that internal metrics offer a great means for detecting programs that might cause low customer perceived quality measurements. Naturally, this is not the only purpose of internal measurements. However, since internal measurements cannot fully detect programs that will have low customer perceived quality measurements, the use of surveys is required in order to measure the actual customer perceived quality and in order to test and calibrate the internal measurement procedures.
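Two of the calculations described in this section lend themselves to a short Python sketch: the size-weighted averages behind λ_wa and V_wa, and the correlation between the survey and internal results. The sketch assumes that routine size N is the weight and that the reported correlation is a Pearson coefficient; neither the function names nor the sample numbers come from the case study.

```python
from math import sqrt

MAX_ACCEPTABLE_COMPLEXITY = 10  # McCabe's proposed upper limit, as cited in the text


def size_weighted_average(values, sizes):
    """Average of per-routine values, each weighted by the routine's size N."""
    return sum(v * n for v, n in zip(values, sizes)) / sum(sizes)


def weighted_cyclomatic_score(cyclomatic, sizes):
    """Ratio of 10 to the size-weighted average cyclomatic complexity (V_wa)."""
    return MAX_ACCEPTABLE_COMPLEXITY / size_weighted_average(cyclomatic, sizes)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical project with two routines of size 100 and 300.
sizes = [100, 300]
language_levels = [1.0, 2.0]
cyclomatic = [4, 12]

lambda_wa = size_weighted_average(language_levels, sizes)
v_wa = weighted_cyclomatic_score(cyclomatic, sizes)
print(lambda_wa, v_wa)

# Hypothetical normalised scores for three projects (survey vs internal).
survey_scores = [0.2, 0.5, 0.9]
internal_scores = [0.4, 0.5, 1.0]
print(round(pearson(survey_scores, internal_scores), 4))
```

The larger routine dominates both averages, which is the intended effect of weighting by size; a project whose weighted average complexity equals McCabe's limit of 10 receives a V_wa of exactly 1.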
Conclusion

This paper presents a method which focuses on the definition of software quality as "satisfaction of customer requirements". This method fits into any quality assurance framework and especially into those based on ISO9000, IEEE, or Baldrige. As any method, it has advantages and disadvantages. The disadvantages are cost in deploying the techniques, error rates, subjectivity of the answers and human factors involved with surveys and qualification measurements. This paper offers solutions in the form of techniques and guidelines in order to overcome these disadvantages. The quality manager must weigh the priorities for each specific case and decide which one of the proposed techniques will be used. The main advantages of the method presented in this paper are: a) it conforms with the definition of quality, b) it fits into almost every quality assurance framework, c) it is quantifiable; it measures directly external quality factors and can be subject to more sophisticated analysis techniques, appropriate to organisations with higher levels of quality maturity, d) it is always applicable and does not depend on programming languages or tools and e) it offers interaction with customers, thus providing confidence that the company respects their opinion.

The application of this method in parallel with software quality measurements based on internal metrics proves that the method can actually be used in parallel with such measurements. The aim of this method is not to substitute internal metrics, but to offer an alternative solution. This solution, emphasising that quality focuses on customer requirements, should be used in parallel with current practices in order to aid in calibrating metrics, controlling measurement results, and providing confidence to both the company and the users.

Acknowledgements

The authors would like to thank the anonymous referees for their careful review and suggestions and Ms. M. Stamison Atmatzidi, EFL Instructor at the Dep. of Computer Engineering and Informatics, for the proofreading and corrections of the English.
References

1 ISO, Quality Management and Quality Assurance Standards, International Standard, ISO/IEC 9001: 1991
2 Ince, D, ISO 9001 and Software Quality Assurance, Quality Forum, McGraw Hill
3 IEEE, Standard for a Software Quality Metrics Methodology, P-1061/D20, IEEE Press, New York
4 Brown, M G, Baldrige Award Winning Quality: How to Interpret the Malcolm Baldrige Award Criteria, Milwaukee, WI: ASQC Quality Press, 1991
5 Steeples, M M, The Corporate Guide to the Malcolm Baldrige National Quality Award, WI: ASQC Quality Press
6 Albrecht, A J, Measuring application development productivity, Proc. of IBM Applic. Dev. Joint SHARE/GUIDE Symposium, Monterey, CA
7 McCabe, T J, A complexity measure, IEEE Trans. Soft. Eng. SE-2(4)
8 Halstead, M H, Elements of Software Science, Elsevier North Holland
9 Kent, R, Marketing Research in Action, Routledge, London
10 Fenton, N E, Software Metrics: A Rigorous Approach, Chapman & Hall, 1992
11 Jones, C, Applied Software Measurement: Assuring Productivity and Quality, McGraw Hill, 1991
12 McCall, J A, Richards, P K, and Walters, G F, Factors in Software Quality, Vols I, II, III, US Rome Air Development Center Reports NTIS AD/A , 05, 055
13 Boehm, B W, Brown, J R, Kaspar, J R, Lipow, M, McCleod, G J, and Merrit, M J, Characteristics of Software Quality, North Holland
14 ISO, Information technology - Evaluation of software - Quality characteristics and guides for their use, International Standard, ISO/IEC 9126: 1991
15 Conte, S D, Dunsmore, H E, and Shen, V Y, Software Engineering Metrics and Models, Benjamin Cummings
16 Edvardsson, B, Thomasson, B, and Ovretveit, J, Quality of Service. Making it Really Work, McGraw Hill
17 Kaplan, C, Clark, R, and Tang, V, Secrets of Software Quality, McGraw Hill
18 Lahlou, S, Van der Maijden, R, Messu, M, Poquet, G, and Prakke, F, A Guideline for Survey Techniques in Evaluation of Research, Brussels, ESSC EEC-EAEC
19 Xenos, M, and Christodoulakis, D, Evaluating Software Quality by the Use of User Satisfaction Measurements, 4th Software Quality Conference, SET, University of Abertay, Dundee, pp. 8-88
20 Montgomery, D C, Introduction to Statistical Quality Control, second edition, John Wiley & Sons, 1991
21 Yeh, H T, Software Process Quality, McGraw Hill
22 Stevens, S S, On the Theory of Scales of Measurement, Science, 103: 677
23 Pressman, R S, Software Engineering. A Practitioner's Approach, 3rd edition, McGraw Hill
24 Xenos, M, and Christodoulakis, D, Software Quality: The User's Point of View, in Software Quality and Productivity, Chapman & Hall
25 Thomas, B, The Human Dimension of Quality, McGraw Hill
26 Xenos, M, Stavrinoudis, D, and Christodoulakis, D, The Correlation Between Developer-oriented and User-oriented Software Quality Measurements (A Case Study), 5th European Conference on Software Quality, EOQ-SC, Dublin
27 Xenos, M, and Christodoulakis, D, An Applicable Methodology to Automate Software Quality Measurements, IEEE Software Testing and Quality Assurance International Conference, New Delhi, pp. 2-25
28 Tsalidis, C, Christodoulakis, D, and Maritsas, D, Athena: A Software Measurement and Metrics Environment, Software Maintenance Research and Practice
29 Tsai, W T, Lopez, M A, Rodriguez, V, and Volovik, D, An Approach to Measuring Data Structure Complexity, Compsac86
30 Fitzsimmons, A, and Love, T, A Review and Evaluation of Software Science, Computing Surveys, Vol. 10, No 1
31 Christensen, K, Fitsos, G P, and Smith, C P, A Perspective on Software Science, IBM Syst. Journal, Vol. 20, No 4, 1986

Xenos M. & Christodoulakis D., Measuring perceived software quality, Pre-print version of the paper published in Information and Software Technology Journal, Butterworth Publications, Vol. 39, Issue 6, June