PhD Program in Transportation. Transport Demand Modeling

1 PhD Program in Transportation
Transport Demand Modeling
João de Abreu e Silva
Factor Analysis
PhD in Transportation / Transport Demand Modelling 1/38

2 Factor Analysis: Definition and Purpose
An exploratory technique aimed at defining the underlying structure among a group of interrelated variables.
Factor analysis allows the construction of a measurement scale for the factors that control the original variables.
It is also used to reduce the number of variables in other multivariate methods.
It uses the correlations between variables to estimate the common factors.

3 Factor Analysis: Definition and Purpose
If two variables are correlated, the association results from the fact that both share a common characteristic that cannot be directly observed (a latent common factor).
[Diagram: a common latent factor linked to the observed variables X1 and X2.]
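The idea that a correlation can arise from a shared latent factor can be illustrated with a small simulation. This is only a sketch under assumed values: the loadings 0.8 and 0.7, the sample size and the random seed are hypothetical and do not come from the slides.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n = 100_000                      # hypothetical sample size

# Hypothetical loadings of X1 and X2 on the common latent factor.
lambda_1, lambda_2 = 0.8, 0.7

factor = rng.standard_normal(n)  # the unobserved common factor
# Specific parts are scaled so that each observed variable has unit variance.
x1 = lambda_1 * factor + np.sqrt(1 - lambda_1**2) * rng.standard_normal(n)
x2 = lambda_2 * factor + np.sqrt(1 - lambda_2**2) * rng.standard_normal(n)

observed_corr = np.corrcoef(x1, x2)[0, 1]
print(f"observed correlation: {observed_corr:.3f}")
print(f"product of loadings:  {lambda_1 * lambda_2:.3f}")
```

In this setup X1 and X2 have no direct link; their correlation comes only from the shared factor and is close to the product of the two loadings.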

4 Factor Analysis: Definition and Purpose
The general purpose of factor analysis is to condense the information contained in a group of variables into a smaller set of composite dimensions while losing as little information as possible, that is, to search for the fundamental dimensions that underlie the original variables.
R-type factor analysis analyses the correlation matrix of the variables.
Q-type factor analysis analyses the correlation matrix of the individual respondents, based on their characteristics.

5 Factor Analysis: Definition and Purpose
Factor analysis is an interdependence technique: all variables are considered simultaneously, with no distinction between dependent and independent variables.
The variate (or factor) is a linear composite of variables. It is formed to maximize the explanation of the entire set of variables.

6 Matrix notation
In matrix notation the factor analysis model is $Z = \Lambda f + \eta$, where:
$Z$ is the vector of $p$ standardized variables
$f$ is the vector of common factors ($\mu = 0$ and $\Sigma = I$)
$\eta$ is the vector of specific factors ($\mu = 0$ and $\Sigma = \Psi$)
$\Lambda$ is the matrix of factor loadings
$P$ is the correlation matrix
Assuming that $f$ and $\eta$ are independent and that $\Psi$ is diagonal, $P = \Lambda \Lambda' + \Psi$.
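Under the usual common factor model with independent common and specific factors, the correlation matrix implied by the model is the loading matrix times its transpose plus a diagonal matrix of specific variances. The following sketch shows this for a hypothetical loading matrix (4 variables, 2 factors); the numbers are illustrative assumptions, not values from the slides.

```python
import numpy as np

# Hypothetical loadings of p = 4 standardized variables on m = 2 common factors.
loadings = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
])

# Variance of each variable explained by the common factors.
common_variance = (loadings**2).sum(axis=1)

# For standardized variables the specific variances complete each variance to 1,
# so the specific-variance matrix is diagonal with entries 1 - common variance.
psi = np.diag(1.0 - common_variance)

# Correlation matrix implied by the model: loadings @ loadings' + psi.
implied_corr = loadings @ loadings.T + psi

print(np.round(implied_corr, 2))
```

The off-diagonal entries depend only on the loadings, which is why the observed correlations carry the information needed to estimate the common factors.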

7 Uses of factor analysis
Data summarization: the goal is to define a small number of factors that adequately represent the original set of variables. Each variable is treated as a dependent variable that is a function of an underlying, latent set of factors. Each variable is predicted by all the factors (and indirectly by all the other variables).
Data reduction: identify representative variables from a much larger set of variables for use in subsequent analysis, or create a new, much smaller set of variables to replace them. Data reduction also relies on factor loadings, but uses them as the basis for identifying variables or making estimates of the factors for subsequent analysis.

8 Variable selection
Factor analysis always produces factors, so special care should be taken against the "garbage in, garbage out" phenomenon.
The quality and meaning of the derived factors reflect the conceptual underpinnings of the variables considered for the factor analysis.
Factor analysis can be used to introduce a smaller number of new variables into other statistical techniques, using either representative variables or the factor scores.

9 Variable selection
Nonmetric variables can be problematic; it is prudent to avoid them or to replace them with dummy variables. If all the variables included in the factor analysis are dummy variables, however, other methods should be used.
Since factor analysis aims to find patterns among groups of variables, factors with only one variable do not make sense.

10 Assumptions of factor analysis
Sample size: fewer than 50 observations is unacceptable; preferably at least 5 observations for each variable.
The assumptions of factor analysis are more conceptual than statistical (which does not mean that the statistical assumptions need not be met).
There is an underlying structure in the data (correlation between variables does not ensure that one exists).
The sample should be homogeneous with respect to the underlying factor structure (e.g. beware of variables that differ between men and women).

11 Statistical assumptions
Departures from normality (which matters mostly when significance tests are applied), homoscedasticity and linearity diminish the observed correlations.
Some multicollinearity is desirable.
When the correlations among variables are small (< 0.3) or all equal, factor analysis is not appropriate.
Partial correlation: the correlation that remains unexplained when the effects of the other variables are taken into account. If partial correlations are high (> 0.7), factor analysis is not appropriate.
Anti-image correlation matrix: the negative values of the partial correlations. Large values indicate that the variables are independent.

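12 Adequacy tests
The values on the diagonal of the anti-image correlation matrix are the Measures of Sampling Adequacy (MSA). They can be interpreted in a way similar to the KMO.
KMO (Kaiser-Meyer-Olkin): a measure of homogeneity that compares the simple correlations with the partial correlations observed between variables:
$KMO = \frac{\sum_{i \neq j} r_{ij}^2}{\sum_{i \neq j} r_{ij}^2 + \sum_{i \neq j} a_{ij}^2}$
where $r_{ij}$ is the simple correlation and $a_{ij}$ the partial correlation between variables $i$ and $j$.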

13 Adequacy tests
KMO value      Recommendation relative to factor analysis
]0.9; 1.0]     Excellent
]0.8; 0.9]     Good
]0.7; 0.8]     Average
]0.6; 0.7]     Mediocre
]0.5; 0.6]     Bad but still acceptable
≤ 0.5          Unacceptable
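The KMO measure and the per-variable MSA values can be computed directly from a correlation matrix, taking the partial correlations from its inverse. The sketch below assumes the standard KMO definition (squared simple correlations against squared partial correlations); the function names `kmo` and `kmo_label` and the example matrix are hypothetical.

```python
import numpy as np

def kmo(corr):
    """Return (overall KMO, per-variable MSA) for a correlation matrix."""
    corr = np.asarray(corr, dtype=float)
    inv = np.linalg.inv(corr)
    # Partial correlation of each pair given all other variables.
    scale = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(scale, scale)
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.where(off_diag, corr**2, 0.0)      # squared simple correlations
    a2 = np.where(off_diag, partial**2, 0.0)   # squared partial correlations
    overall = r2.sum() / (r2.sum() + a2.sum())
    msa = r2.sum(axis=0) / (r2.sum(axis=0) + a2.sum(axis=0))
    return overall, msa

def kmo_label(value):
    """Map a KMO value to the recommendation table (half-open intervals)."""
    bands = [(0.9, "Excellent"), (0.8, "Good"), (0.7, "Average"),
             (0.6, "Mediocre"), (0.5, "Bad but still acceptable")]
    for lower, label in bands:
        if value > lower:
            return label
    return "Unacceptable"

# Hypothetical example: three variables, all pairwise correlations equal to 0.5.
corr = np.array([[1.0, 0.5, 0.5],
                 [0.5, 1.0, 0.5],
                 [0.5, 0.5, 1.0]])
overall, msa = kmo(corr)
print(round(overall, 3), kmo_label(overall), np.round(msa, 3))
```

Large partial correlations inflate the denominator and push the KMO towards zero, which is why high partial correlations argue against factor analysis.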

14 Adequacy tests
Bartlett test of sphericity: a statistical test for the presence of correlations among variables. It tests whether the correlation matrix is significantly different from the identity matrix.
It is sensitive to sample size (with larger samples it becomes more sensitive in detecting correlations).
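As a rough illustration of the Bartlett test, the sketch below uses the usual chi-square approximation, statistic = -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom. The function name, the example matrix and the sample sizes are hypothetical.

```python
import numpy as np

def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom) for Bartlett's test."""
    corr = np.asarray(corr, dtype=float)
    p = corr.shape[0]
    # Log-determinant of the correlation matrix (zero for the identity matrix).
    _, log_det = np.linalg.slogdet(corr)
    statistic = -(n_obs - 1 - (2 * p + 5) / 6.0) * log_det
    dof = p * (p - 1) // 2
    return statistic, dof

# Hypothetical example: three variables with pairwise correlations of 0.5.
corr = np.array([[1.0, 0.5, 0.5],
                 [0.5, 1.0, 0.5],
                 [0.5, 0.5, 1.0]])

for n_obs in (100, 200):
    statistic, dof = bartlett_sphericity(corr, n_obs)
    print(f"n = {n_obs}: chi-square = {statistic:.2f}, df = {dof}")
```

The statistic is compared with a chi-square distribution with `dof` degrees of freedom (for instance with `scipy.stats.chi2.sf`, if SciPy is available). For the same correlation matrix the statistic grows with the sample size, which reflects the sensitivity to sample size noted above.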

15 Extraction methods
Principal components analysis considers the total variance and derives factors that contain small proportions of unique variance (appropriate when data reduction is the primary concern).
Common factor analysis considers only the common or shared variance; error and specific variance are not of interest (appropriate when the objective is to identify the latent dimensions or constructs, in well-specified theoretical applications).

16 Factor extraction
The number of factors to be extracted:
Latent root criterion: any individual factor should account for the variance of at least one single variable, i.e. latent root or eigenvalue > 1.
A priori criterion: define a priori the number of factors to be extracted (testing a hypothesis about the number of factors).
Percentage of variance criterion: achieve a specified cumulative percentage of total variance. Usual value in the natural sciences: 95%; in the social sciences: > 60% (not uncommon).
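The latent root and percentage of variance criteria can both be read off the eigenvalues of the correlation matrix. A minimal sketch, assuming a hypothetical correlation matrix made of two unrelated pairs of variables; the function name `count_factors` and the 60% target are illustrative choices.

```python
import numpy as np

def count_factors(corr, target_share=0.60):
    """Return (eigenvalues, n by latent root, n by cumulative variance share)."""
    eigenvalues = np.sort(np.linalg.eigvalsh(np.asarray(corr, dtype=float)))[::-1]
    # Latent root criterion: keep factors with eigenvalue greater than 1.
    n_latent_root = int(np.sum(eigenvalues > 1.0))
    # Percentage of variance criterion: first factor count reaching the target.
    cumulative_share = np.cumsum(eigenvalues) / eigenvalues.sum()
    n_share = int(np.argmax(cumulative_share >= target_share)) + 1
    return eigenvalues, n_latent_root, n_share

# Hypothetical correlation matrix: two pairs of variables, correlated 0.6
# within each pair and uncorrelated across pairs.
corr = np.array([[1.0, 0.6, 0.0, 0.0],
                 [0.6, 1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0, 0.6],
                 [0.0, 0.0, 0.6, 1.0]])

eigenvalues, n_latent_root, n_share = count_factors(corr)
print(np.round(eigenvalues, 2), n_latent_root, n_share)
```

Plotting the sorted eigenvalues against their rank gives the latent root plot used by the scree test.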

17 Factor extraction
Scree test: in the latent root plot, look for the point where there is an inflexion. The point where the curve begins to straighten corresponds to the maximum number of factors.
Parsimony: it is important to have the most representative and parsimonious set of factors.

18 Factor rotation
Most of the time, rotation improves the interpretation of the factors.
From a mathematical point of view the extracted factors are not unique: the factor axes can be rotated without changing the data structure.
The ultimate effect of rotation is to redistribute the variance from earlier factors to later ones, in order to achieve a simpler and more meaningful factor pattern.

19 Factor rotation
Orthogonal factor rotation: preserves orthogonality (the factors remain uncorrelated). It is the most widely used.
Oblique factor rotation: the factors are not restricted to being orthogonal. It also provides information about the extent to which the factors are correlated.

20 Rotation methods
Varimax: obtains a factor structure in which each original variable is strongly associated with only one factor (the associations with the other factors are much weaker). It gives a clearer separation of the factors. It simplifies the columns of the factor matrix.
Quartimax: obtains a factor structure in which all variables have strong weights on one factor (the general factor), and each variable has strong loadings on one other factor (a common factor) and small loadings on the remaining factors. It assumes that the data structure can be explained by one general factor and one or more common factors. It simplifies the rows of the factor matrix.

21 Rotation methods
Equimax: a compromise between Varimax and Quartimax. It is not frequently used.
Oblique rotation methods: similar to the orthogonal rotations, except that they allow correlations between the factors (e.g. Oblimin in SPSS). Care must be taken in the analysis: non-orthogonality can be another way for the solution to become specific to the sample and not generalizable. They are used when the goal is to obtain several theoretically meaningful factors.
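To make the Varimax idea concrete, here is a sketch of one common iterative implementation of the raw Varimax criterion, based on repeated singular value decompositions. It is an illustration only, not the routine used by SPSS or by the course; the function name, the tolerances and the example loadings are assumptions.

```python
import numpy as np

def varimax(loadings, max_iter=500, tol=1e-8):
    """Return (rotated loadings, orthogonal transformation matrix)."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = L @ rotation
        # Direction in which the raw Varimax criterion increases.
        target = rotated**3 - rotated * (rotated**2).sum(axis=0) / p
        u, s, vt = np.linalg.svd(L.T @ target)
        rotation = u @ vt                  # closest orthogonal matrix
        if s.sum() < criterion * (1 + tol):
            break                          # no meaningful improvement left
        criterion = s.sum()
    return L @ rotation, rotation

# Hypothetical simple structure, deliberately blurred by a 30 degree rotation.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.0, 0.9], [0.0, 0.6]])
angle = np.deg2rad(30)
mix = np.array([[np.cos(angle), np.sin(angle)],
                [-np.sin(angle), np.cos(angle)]])
unrotated = simple @ mix

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))
```

Because the transformation matrix is orthogonal, the rotation changes how the variance is distributed across factors but leaves the variance explained for each variable unchanged. The rotated loadings are the unrotated loadings multiplied by the transformation matrix.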

22 Practical significance
The factor loading is the correlation between a variable and a factor:
Loadings in the range of ±0.3 to ±0.4 are considered to meet the minimal level for interpretation of the structure.
Loadings of ±0.5 or greater are considered practically significant.
Loadings exceeding ±0.7 are indicative of a well-defined structure.
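These rules of thumb can be applied mechanically to a loading matrix. The helper below is a hypothetical sketch; treating each threshold as inclusive is an assumption, since the slide does not say how boundary values are handled.

```python
def loading_level(loading):
    """Classify a factor loading by its absolute size (thresholds inclusive)."""
    size = abs(loading)
    if size >= 0.7:
        return "well-defined structure"
    if size >= 0.5:
        return "practically significant"
    if size >= 0.3:
        return "minimal level for interpretation"
    return "below the minimal level"

# Hypothetical loadings of one variable on three factors.
for value in (-0.82, 0.55, 0.12):
    print(f"{value:+.2f}: {loading_level(value)}")
```

The sign of a loading only gives the direction of the association, so the classification uses its absolute value.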

23 Practical significance

24 Example 1

25 Example 1

26 Example 1

27 Example 1

28 Example 1

29 Example 1

30 Example 1

31 Example 1
The KMO value means that the recommendation for factor analysis is "Average".

32 Example 1
MSA: all variables are at least acceptable.

33 Example 1
Communalities represent the amount of variance of each variable accounted for by the factor solution.
At least half of the variance of each variable should be accounted for.
This means that variables with communalities smaller than 0.5 should be excluded.
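For an orthogonal solution, the communality of a variable is the sum of its squared loadings across the retained factors. A short sketch with hypothetical variable names and loadings (not the values of Example 1), flagging variables below the 0.5 threshold:

```python
import numpy as np

variables = ["x1", "x2", "x3", "x4"]   # hypothetical variable names
loadings = np.array([                  # hypothetical loadings on two factors
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
])

# Communality: sum of squared loadings of each variable (row-wise).
communalities = (loadings**2).sum(axis=1)

# Candidates for exclusion: less than half of their variance is explained.
low = [name for name, h2 in zip(variables, communalities) if h2 < 0.5]

# Share of total variance explained by the retained factors.
explained_share = communalities.sum() / len(variables)

print(np.round(communalities, 2), low, round(explained_share, 2))
```

Averaging the communalities gives the share of total variance explained by the retained factors, which is the figure reported as total variance explained.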

34 Example 1
Total variance explained is 73%. Is it good?

35 Example 1

36 Example 1

37 Example 1
This is the matrix that transforms the unrotated solution into the rotated component matrix (by matrix multiplication).

38 Recommended Readings
Hair, Joseph P. et al. (1995) Multivariate Data Analysis with Readings, Fourth Edition, Prentice Hall, Chapter 2
Maroco, João (2003) Análise Estatística com utilização do SPSS, Ed. Sílabo, Chapter 10