Course on Data Analysis and Interpretation P Presented by B. Unmar. Sponsored by GGSU PART 1



Data Collection Methods

Data collection is an important aspect of any type of research study. Inaccurate data collection can impact the results of a study and ultimately lead to invalid results.

Data collection methods for impact evaluation vary along a continuum. At one end of this continuum are quantitative methods and at the other end are qualitative methods for data collection.

Census. A census is a study that obtains data from every member of a population. In most studies, a census is not practical because of the cost and/or time required. (Example: Housing & Population Census)

Sample survey. A sample survey is a study that obtains data from a subset of a population in order to estimate population attributes. (Examples: HBS and CMPHS)
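The difference between a census and a sample survey can be illustrated with a small sketch. The population below is made-up data (hypothetical household sizes), and the sample size of 200 is an arbitrary assumption; the point is only that a random subset gives an estimate of the population value that a census would measure exactly.

```python
import random
from statistics import mean

# Hypothetical population: household sizes for 10,000 households (made-up data).
rng = random.Random(0)  # fixed seed so the sketch is reproducible
population = [rng.randint(1, 8) for _ in range(10_000)]

# Census: obtain data from every member of the population.
census_mean = mean(population)

# Sample survey: obtain data from a random subset and use it as an estimate.
sample = rng.sample(population, 200)
sample_mean = mean(sample)

print(f"Census mean household size: {census_mean:.2f}")
print(f"Sample estimate (n=200):    {sample_mean:.2f}")
```

The sample estimate will usually be close to, but not exactly equal to, the census value; the gap is the sampling error.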

Primary Data Collection Methods
- Face-to-Face Interview
- Telephone Interview
Secondary
- Internet

Date: 24 November 2010, Training Course in Basic Statistics

Information can be collected in statistics using qualitative or quantitative data.

Qualitative data, such as the eye colour of a group of individuals, cannot be manipulated arithmetically. They are labels that indicate the category or class into which an individual, object, or process falls. They are called categorical variables.

Quantitative data sets consist of measures that take numerical values for which descriptions such as means and standard deviations are meaningful. They can be put into an order and further divided into two groups: discrete data or continuous data.

Data are called "primary type" data if the analyst has been involved in collecting the data relevant to his/her investigation. Otherwise, they are called "secondary type" data.

Data come in the forms of Nominal, Ordinal, Interval, and Ratio (remember the French word NOIR for the colour black).

Nominal: items are usually categorical, in that they belong to a definable category, such as 'employees'.

Ordinal: items are set into some kind of order by their position on the scale. We cannot do arithmetic with ordinal numbers; they show sequence only. Example: the first, third and fifth person in a race.

Interval: data are measured along a scale on which adjacent positions are equidistant from one another. Interval data cannot be multiplied or divided. Example: level of happiness, rated from 1 to 10.

Ratio: numbers can be compared as multiples of one another. Thus one person can be twice as tall as another person. Importantly, the number zero also has meaning. Example: a person's weight.

Measures of central tendency (average):
1. Mean
2. Median
3. Mode

WHY represent a set of data by means of a single number which, in its way, is descriptive of the entire set? To compare different sets of data (over time and space).

Examples
- Distribution of population over time
- Distribution of household size of poor and non-poor households
- Output of industrial groups over time

Arithmetic mean versus Geometric mean

The arithmetic mean of a set of values is the sum of the values divided by their number.

The geometric mean of n numbers is the nth root of the product of these numbers. The geometric mean is relevant when the data to be averaged are in terms of growth rates, indices, etc., i.e. a percentage change over time for several time periods.

TRY: AVERAGE GDP GROWTH RATE FOR THE PERIOD ( )
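As a minimal sketch of why the geometric mean suits growth rates, the example below averages four hypothetical annual growth rates (not real GDP figures). It assumes the usual approach of converting each rate r into a growth factor 1 + r, taking the geometric mean of the factors, and subtracting 1.

```python
from math import prod

def arithmetic_mean(values):
    """Sum of the values divided by their number."""
    return sum(values) / len(values)

def geometric_mean(values):
    """nth root of the product of n positive values."""
    return prod(values) ** (1 / len(values))

# Hypothetical annual growth rates: +10%, -10%, +10%, -10%.
growth_rates = [0.10, -0.10, 0.10, -0.10]
growth_factors = [1 + r for r in growth_rates]

avg_arith = arithmetic_mean(growth_rates)      # suggests no change overall
avg_geo = geometric_mean(growth_factors) - 1   # reflects the compounded change

print(f"Arithmetic mean of rates: {avg_arith:.4%}")
print(f"Geometric average rate:   {avg_geo:.4%}")
```

The arithmetic mean suggests zero average growth, yet the series actually ends below its starting level (1.1 × 0.9 × 1.1 × 0.9 = 0.9801), which only the geometric average captures.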

Median
The median is defined as the value of the middle term, or the mean of the values of the two middle terms, when the data are arranged in ascending or descending order of magnitude.

Mode
The mode is defined as the value which occurs with the highest frequency.

Measure of variation (dispersion):
- Range
- Quartile deviation
- Mean deviation
- Standard deviation (most common method used to measure variation)

A measure of variation indicates the degree of scatter of the different values about the central value.

Square Root of Variance = SD
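The measures above can be sketched with Python's standard `statistics` module. The data set is hypothetical, the population forms of the variance and standard deviation are used, and the quartiles follow the module's default ("exclusive") method; other textbooks compute quartiles slightly differently, so that figure in particular should be read as one convention among several.

```python
from statistics import mean, median, mode, pstdev, pvariance, quantiles

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values

centre = mean(data)
middle = median(data)        # mean of the two middle terms (even count)
most_common = mode(data)     # value with the highest frequency

data_range = max(data) - min(data)
q1, _, q3 = quantiles(data, n=4)   # quartiles, default 'exclusive' method
quartile_deviation = (q3 - q1) / 2
mean_deviation = mean([abs(x - centre) for x in data])
variance = pvariance(data)   # population variance
sd = pstdev(data)            # square root of the variance

print(f"mean={centre}, median={middle}, mode={most_common}")
print(f"range={data_range}, quartile deviation={quartile_deviation}")
print(f"mean deviation={mean_deviation}, variance={variance}, SD={sd}")
```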

Statistical Inference

We attempt to extrapolate the findings of the study to the population from which the sample was drawn.

Assumptions are:
- Sample is representative of population
- Sample is randomly drawn from population
- Variability of sample and population are similar

Steps in Statistical Inference
- Generating NULL and ALTERNATIVE hypotheses
- Type I and Type II Error
- Testing the hypothesis using appropriate statistical tests
- Obtaining p value
- Concluding from the p value

Type I error: The probability of falsely concluding that there is a difference when actually there is no difference. Conventionally this is set at 5% (α = 0.05) or lower. In medicine the probability of a false positive conclusion should be kept low.
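The steps above can be sketched with a two-sided z-test for a single mean. This is only one possible test, chosen here for simplicity; it assumes the population standard deviation is known, and all the numbers (sample mean, hypothesised mean, sigma, n) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

ALPHA = 0.05  # conventional Type I error level

def two_sided_z_test(sample_mean, null_mean, sigma, n):
    """Return the z statistic and two-sided p value for H0: mean == null_mean."""
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# H0: population mean is 50; H1: population mean is not 50 (hypothetical figures).
z, p_value = two_sided_z_test(sample_mean=52.0, null_mean=50.0, sigma=10.0, n=100)
reject_null = p_value < ALPHA

print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0 at alpha={ALPHA}: {reject_null}")
```

Here the p value falls just below 0.05, so the null hypothesis would be rejected at the 5% level; with a stricter level such as 0.01 it would not be.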

Confidence interval

95% CI = sample mean ± 1.96 × standard error

The CI tells you that if the study were repeated many times, about 95 out of 100 of the intervals calculated in this way would contain the true population mean.
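A minimal sketch of a 95% confidence interval for a mean, using the normal multiplier 1.96 with hypothetical data. It assumes the standard error is estimated as the sample standard deviation divided by the square root of n; for a sample this small a t multiplier would normally be preferred, so the result is illustrative only.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(data, level=0.95):
    """Normal-approximation confidence interval for the mean of `data`."""
    n = len(data)
    centre = mean(data)
    standard_error = stdev(data) / sqrt(n)        # sample SD / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)     # about 1.96 for 95%
    return centre - z * standard_error, centre + z * standard_error

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
lower, upper = confidence_interval(data)

print(f"Sample mean: {mean(data)}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```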

Bivariate analysis: Consider the following cross table

TABLE 1

                     Job satisfaction
Absenteeism          Yes      No      Row marginals
Yes
No
Column marginals

(a) Using the data from Table 1, calculate the row and column percentages separately.
(b) Describe briefly what the row and column percentages would emphasize.

Training Course in Basic Statistics, Date: 10 December 2010

The question of whether to use row or column percentages depends in part on what aspects of the data one wants to highlight. It is sometimes suggested that the decision depends on whether the independent variable is across the top or along the side of the table: if the former, column percentages should be used; if the latter, row percentages should be employed. Typically, the independent variable will go across the table, in which case column percentages should be used. However, this suggestion implies that there is a straightforward means of identifying the independent and dependent variables, but this is not always the case and great caution should be exercised in making such an inference.
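Row and column percentages for a two-by-two cross table can be sketched as follows. The counts are hypothetical and are not the figures from Table 1; rows are taken to be absenteeism (Yes, No) and columns job satisfaction (Yes, No).

```python
def row_percentages(table):
    """Each cell as a percentage of its row total."""
    return [[100 * cell / sum(row) for cell in row] for row in table]

def column_percentages(table):
    """Each cell as a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[100 * cell / col_totals[j] for j, cell in enumerate(row)]
            for row in table]

# Hypothetical counts: rows = absenteeism (Yes, No),
# columns = job satisfaction (Yes, No).
counts = [[30, 10],
          [20, 40]]

row_pct = row_percentages(counts)
col_pct = column_percentages(counts)

print("Row percentages:   ", row_pct)
print("Column percentages:", col_pct)
```

With these counts, the row percentages show that 75% of absentees are satisfied with their job, while the column percentages show that 60% of satisfied employees are absentees. The two views answer different questions about the same table.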

Economic Statistics

Economic statistics is a branch of applied statistics focusing on the collection, processing, compilation and dissemination of statistics concerning the economy of a region, a country or a group of countries. Economic statistics provide the empirical data needed in economic research (econometrics) and they are the basis for decision and economic policy making. In Mauritius, official economic data are produced and disseminated by CSO and Bank of Mauritius.

Economic indicators

An economic indicator (or business indicator) is a statistic about the economy. Economic indicators allow analysis of economic performance and predictions of future performance. Economic indicators include various indices, earnings reports, and economic summaries, such as unemployment, Consumer Price Index (CPI), industrial production, Gross Domestic Product (GDP), retail sales, stock market prices, and money supply changes.

Why Economic Data?

Good economic data is a precondition to effective macroeconomic management. With the complexity of modern economies and the lags inherent in macroeconomic policy instruments, a country must have the capacity to promptly identify any adverse trends in its economy and to apply the appropriate corrective measure. This cannot be done without economic data that is complete, accurate and timely.

Increasingly, the availability of good economic data is coming to be seen by international markets as an indicator of a country that is a promising destination for foreign investment. International investors are aware that good economic data is necessary for a country to effectively manage its affairs and, other things being equal, will tend to avoid countries that do not publish such data.

The public availability of reliable and up-to-date economic data also reassures international investors by allowing them to monitor economic developments and to manage their investment risk.

Social statistics is the use of statistical measurement systems to study human behaviour in a social environment. This can be accomplished through polling a particular group of people, evaluating a particular subset of data obtained about a group of people, or by observation and statistical analysis of a set of data that relates to people and their behaviours.

Social indicators are defined as statistical measures relating to major areas of social concern and/or individual well-being. Examples of social indicators are projections, forecasts, outlook statements, time-series statistics, and extrapolations related to topics such as population, housing, social security, income, education, and health.

In Mauritius our main social indicators cover the following areas:
- Population and vital statistics
- Health
- Education
- Crime
- Social Security
- Environment

In Mauritius, official social indicators are produced and disseminated by CSO and Ministry of Health & Quality of Life.

Key Uses of Social Indicators

- Description: to inform citizens and policy makers about the circumstances of their society, to track trends and patterns, and to identify areas of concern as well as positive outcomes.
- Monitoring: to track outcomes that may or may not require policy intervention of some kind. Most people are familiar with using indicators for the purpose of monitoring in the public health field.
- Setting goals: to establish quantifiable thresholds to be met within a specific time period.
- Increasing accountability: to achieve positive or improved outcomes.
- Reflective practice: to inform practices of communities and individual programs on an ongoing basis.

Date: 17 November 2010, Training Course in Basic Statistics

The Proper Use of Social Indicators

Social indicators can be helpful tools for policy makers, practitioners, and the public, but using them correctly requires attention to a number of issues:

- Social indicators need to be measured for the appropriate population. For example, if a policy focuses on services for low-income children, then the outcomes should be measured for low-income children, not middle-class or all children.
- Social indicators need to be measured at the appropriate geographic level. Looking just at trends on the national level may obscure how a policy is affecting individuals in their own states and home communities.
- Social indicators need to be well conceptualised. That is, social indicators need to accurately reflect the concept that they are intended to capture.

QUESTIONS AND ANSWERS