ReCap. Numerical methods substitute repeated calculations for a single analytic solution.

Similar documents
Chapter 12. Sample Surveys. Copyright 2010 Pearson Education, Inc.

POPULATIONS USING CAPTURE TECHNIQUES

AP Statistics Scope & Sequence

Target Populations, Sampling Frames, and Coverage Error. Professor Ron Fricker Naval Postgraduate School Monterey, California

Bias Associated with Sampling Interval in Removal Method for Fish Population Estimates

Chapter 8 Interpreting Uncertainty for Human Health Risk Assessment

2 Population dynamics modelling of culturebased fisheries

Non-parametric optimal design in dose finding studies

Distinguish between different types of numerical data and different data collection processes.

COMPUTER SIMULATIONS AND PROBLEMS

The natural world around us is filled with many types of wildlife. Everything

Lab 4. MARK-RECAPTURE SAMPLING

Genetic drift. 1. The Nature of Genetic Drift

B. Statistical Considerations

Machine learning applications in genomics: practical issues & challenges. Yuzhen Ye School of Informatics and Computing, Indiana University

SAMPLING A MOBILE POPULATION: THE MARK-RECAPTURE METHOD

Week 1 Tuesday Hr 2 (Review 1) - Samples and Populations - Descriptive and Inferential Statistics - Normal and T distributions

How Much Do We Know About Savings Attributable to a Program?

Lecture 10: Introduction to Genetic Drift. September 28, 2012

Progress on Objective 3-5: Monitoring Designs, Trade Offs, and Recommendations

MONITORING CHANGES IN EXOTIC VEGETATION

SIMULATION. Simulation. advantages (cont.) Advantages

Department of Economics, University of Michigan, Ann Arbor, MI

Glossary of terms for harvest strategies, management procedures and management strategy evaluation

Chapter 3: Evolutionary genetics of natural populations

MATHEMATICAL MODELLING

Attempt any FOUR questions. All questions are of equal marks.

METHODOLOGIES FOR SAMPLING OF CONSIGNMENTS

ESTIMATING GENETIC VARIABILITY WITH RESTRICTION ENDONUCLEASES RICHARD R. HUDSON1

DRAFT NON-BINDING BEST PRACTICES EXAMPLES TO ILLUSTRATE THE APPLICATION OF SAMPLING GUIDELINES. A. Purpose of the document

BIOSTATISTICS AND MEDICAL INFORMATICS (B M I)

Metaheuristics. Approximate. Metaheuristics used for. Math programming LP, IP, NLP, DP. Heuristics

Spreadsheets in Education (ejsie)

thebiotutor.com A2 Biology Unit 4 Populations & Ecosystems

12-1. TIMBER ESTIMATION 291

CHAPTER 1: Introduction to Data Analysis and Decision Making

Statistics 712: Applied Statistical Decision Theory Spring 1999 Syllabus

Finding your instructor

MARK2052: Marketing Research

On The Use of an Almost Unbiased Ratio Estimator in the Two- Phase Sampling Scheme

PopGen1: Introduction to population genetics

LECTURE 2 SINGLE VARIABLE OPTIMIZATION

GENE FLOW AND POPULATION STRUCTURE

Crowe Critical Appraisal Tool (CCAT) User Guide

Statistics, Data Analysis, and Decision Modeling

BUSS1020. Quantitative Business Analysis. Lecture Notes

1. The characteristics of populations are shaped by the interactions between individuals and their environment

1. The characteristics of populations are shaped by the interactions between individuals and their environment

Measurement Systems Analysis

Application of Statistical Sampling to Audit and Control

TOOL #57. ANALYTICAL METHODS TO COMPARE OPTIONS OR ASSESS

TEACHER VERSION FISHERY HARVESTING: ATLANTIC COD

Contact: Version: 2.0 Date: March 2018

ChIP-seq and RNA-seq. Farhat Habib

Personal interview questionnaire suitable when open ended questions are asked

Data Mining and Applications in Genomics

Recent Developments in Assessing and Mitigating Nonresponse Bias

Understanding UPP. Alternative to Market Definition, B.E. Journal of Theoretical Economics, forthcoming.

Optimization Prof. Debjani Chakraborty Department of Mathematics Indian Institute of Technology, Kharagpur

Using environmental data to inform spatial stock assessment assumptions in Stock Synthesis

LECTURE - 2 INTRODUCTION DR. SHALABH DEPARTMENT OF MATHEMATICS AND STATISTICS INDIAN INSTITUTE OF TECHNOLOGY KANPUR

Asian Research Journal of Business Management Issue 4 (Vol.4)2017 Issn: Probability and Non Probability Sampling

Stock Assessment Form Demersal species Reference year: Reporting year:

Supplemental Digital Content. A new severity of illness scale using a subset of APACHE data elements shows comparable predictive accuracy

Fisheries Genetics for Stock Identification NWWAC 1st March 2017, Paris

IMPLEMENTATION OF AN OPTIMIZATION TECHNIQUE: GENETIC ALGORITHM

Introduction to Business Research 3

Comparison of Efficient Seasonal Indexes

Quantitative Tools for the Monitoring of Threatened Species: A Marbled Murrelet Case Study

Glossary of Research Terms

Training Needs Analysis

Introduction to Sample Surveys

BSAI Crab Rationalization EDR Audits

Data Science in a pricing process

The Travelling Salesman Problem and its Application in Logistics

Understanding Fish Stock Assessment. Gary Shepherd Northeast Fisheries Science Center Woods Hole, MA

Chapter 19. Confidence Intervals for Proportions. Copyright 2012, 2008, 2005 Pearson Education, Inc.

Framework for undertaking eradication programs on insular populations of vertebrate pests

Course Information. Introduction to Algorithms in Computational Biology Lecture 1. Relations to Some Other Courses

Independently Assessed and Endorsed by NASBA, the official board that regulates accounting and financial education in the United States of America.

(31) Business Statistics

More-Advanced Statistical Sampling Concepts for Tests of Controls and Tests of Balances

Population Genetics. Chapter 16

DECISION MAKING. Chapter - 4. B. H. Gardi College of Engineering & Technology, RAJKOT Department of Master of Computer Application

Selecting a Sample Size

Why do we need statistics to study genetics and evolution?

Statistical Sampling in Healthcare Audits and Investigations

PARALLEL LINE AND MACHINE JOB SCHEDULING USING GENETIC ALGORITHM

ISE 320 Quality Control and Industrial Statistics CHAPTER 06 MEASUREMENT SYSTEM ANALYSIS (MSA) Engineering College, Hail University, Saudi Arabia

Introduction to Algorithms in Computational Biology Lecture 1

Business Quantitative Analysis [QU1] Examination Blueprint

EOH 2504, Principals of Environmental Exposure: Lecture 5, Strategies for and Design of Exposure Assessment Studies

Identify Risks. 3. Emergent Identification: There should be provision to identify risks at any time during the project.

BSAI Crab Rationalization EDR Audits

Identifying the best customers: Descriptive, predictive and look-alike profiling Received (in revised form): 8th April, 1999

Counting species. Types of ecological diversity. Diversity can be partitioned geographically

PRE-READING 4 TYPES OF MODEL STRUCTURES

Syllabus for BIOS 101, SPRING 2013

Population Structure and Gene Flow. COMP Fall 2010 Luay Nakhleh, Rice University

FUNDAMENTALS OF QUALITY CONTROL AND IMPROVEMENT. Fourth Edition. AMITAVA MITRA Auburn University College of Business Auburn, Alabama.

Transcription:

Lecture Notes in Quantitative Biology Estimating Population Size Chapter 21 (from 6 December 1990) ReCap Numerical methods Determining population size Reasons Good practices Methods Censusing Complete enumeration Sampling Mark-recapture revised 1995, + 26 November 1997 used in 1995, skipped in 1996 ReCap. Numerical methods substitute repeated calculations for a single analytic solution. They are used with increasing frequency to solve problems in quantitative biology, as the price of computing power drops, and the accessibility increases. Examples: Monte Carlo estimates of p-values Finding an optimum (maximum) value E.g. "mystery statistic" (the mean, minimizes squared deviations) E.g. Finding "best" cladistic tree Computing fluid flows, using Newton's laws Computing genetic combinations Rate of evolution by assembly of subunits Numerical methods provide an answer with little in the way of complex mathematics. They provide an answer to the problem at hand, not to a general sets of problems. (Mathematical analysis to arrive at answers to sets of problems) Today: Estimating population size. Emphasis has been on model based statistics, particularly models used in experimental analysis (GLM). Today brief look at a different approach, used in survey design. Sometimes called design based. Finite rather than infinite populations. Wrap-up 1

Reasons for Determining Population Size Goal is to determine an important quantity, the numbers of members of a populations. Obviously important in human populations: expected requirements for resources. Use of natural resources. Building schools and hospitals. Also important in exploited populations, such as codfish or trees. What is stock size? Goal is to set harvest at level that maintains stock size by setting harvest at the renewal rate. Analogy of withdrawing interest from bank account, leaving capital intact. Important in natural populations. Rate of evolution depends on population size. If rate of appearance of new genetic material through mutation is 10!6 then takes one generation to produce new gene in population of a million, it takes on average 100 generations in population of 10,000. Knowledge of population size also important in very small populations, at verge of extinction. Techniques can be classified as enumeration, sampling, mark-recapture Good Practices. 1. Define the population limits in space (e.g. all sea lions in the N. Pacific). Define quantity, state type of measurements unit (N = sea lions in N. Pacific) (N = sea lions per km of coast) Define units used in sampling (e.g. quadrats, colonies). If on ratio scale, state units (e.g. sea lions per colony). A list of all possible units is called the frame. This is the spatial range and resolution of the quantity N. 2. State model, assumptions, and methods. This can often be done by citation to Seber 1984 3. Test assumptions. If this is not possible, evaluate assumptions. 4. Compare different methods, if possible, because different methods differ in their sensitivity to some assumptions. 5. Check indices (widely used, inexpensive) against absolute estimate (drain pond once). 6. Use accessory information to improve estimates. 2

Good Practices (continued) 7. Provide an estimate of uncertainty, or range of possible estimates. Avoid giving misleading sense of certainty by reporting a single number. 1. Theoretical, based on behaviour of organism. 2. Statistical distribution, based on fit to similar data or situation. 3. 4. Taylor's Power Law Var = Mean a 1< a< 3 Bootstrap 8. Power (Type II error). Is this study doomed to failure? Can enough samples and information be gathered to obtain an answer of sufficient accuracy and precision, within logistic limits? Pilot studies. Use of preliminary results to design better study. Comprehensive list of methods. (cf Seber) I Censusing A Complete enumeration. B Sampling 1. Relative measures, to evaluate differences. Indices. 2. Absolute a. Quadrat (1) Size and shape (2) Number of quadrats (3) Location - random or stratified random b. Strip transect c. Line tranect. Estimate detection function. d. Distance methods (1) from random point (2) from random individuals II Mark-recapture A. Petersen. Closed population, using single mark. B. Closed population, using multiple mark. C. Open populations. Require estimate of immigration/emigration. D. Catch Effort. Individuals "marked" by removal. E. Change in ratio methods. 3

I Censusing 30 minute tour. Basic idea of each one: Type of data, assumptions, formal model Computational route. Advantages/disadvantages. A Complete enumeration. Example: People, trees. colonies (sea lions). Evaluation: How complete is the enumeration? All individuals detectable? Immigration or emigration? Easier to defend. Just prove that no member of the population was missed. Human populations. Seems easy, just count up the number of individuals in some registry, such as social insurance numbers. But existing registry are for other purposes, and some are incomplete. E.g. SIN number registry incomplete for children. So a registry is created each census by sending out forms. Estimates of subpopulations not allowed, until very recently. Wild populations. No registry. So complete enumeration is expensive. Usually more practical in small populations, e.g. California Condor I. Censussing B. Sampling. Produces estimate $N. Harder to defend More practical because cost is lower. Usually not possible to cover population range, so samples used to estimate. 1. Relative measures, to evaluate differences. Used to examine trends through time, from place to place. Catchability assumed constant, but unit of efffort cannot by used calculate absolute density, numbers/area. Example: Catch per unit effort, indices of abundance. Evaluation: are samples comparable? (though not representative). 4

I. Censussing A. Sampling 2. Absolute. Can calculate absolute density. Example: scallops per dredge haul, trees per hectare. Evaluation: Are samples representative and random? a Quadrat (1) Size and shape. Choice decided by logistics. (2) Number of quadrats. " " " (3) Location - random or stratified random to be representative, to make inferences. Notation: N = Population size $N = estimate of N A = area of population L 2 D = N/A L -2 a = quadrat area Typically L -2 but other units of effort possible for example are swept per unit time s = number of quadrats S' = A/s = number of quadrats possible p = sa/a = proportion of area sampled by s quadrats pa = area sampled n = Ex i = number collected ) x = average per quadrat. This is an estimate of N/A if random encounter: $N = n/p = ) x A/a = ) x S' Example: n = 30 p = 10% then $N = 30/0.1 = 300 5

I. Censusing A. Sampling 2. Absolute a. quadrat Report uncertainty of estimate. Several possible methods. To set confidence limits, we need the distribution of possible outcomes. Could use bootstrap methods to compute limits from distribution generated by randomization, if we have several samples. Could use a probability model, the binomial distribution. If binomial sampling then Var[ $N] = $N@q/p = $N@(1!p)/p Confidence limits: Think of $N as number of trials, p as probability of success on each trial, giving us a binomial distribution with k = 300 and p = 0.1 Cumulative distribution function, go from prob to outcome. Use spreadsheet or package (e.g. Minitab) to make calculations. MTB > invcdf.5; SUBC> binomial 300.1. K P(X LESS OR = K) K P(X LESS OR = K) 29 0.4719 30 0.5484 MTB > invcdf.025; SUBC> binomial 300.1. K P(X LESS OR = K) K P(X LESS OR = K) 19 0.0171 20 0.0287 MTB > invcdf.975; SUBC> binomial 300.1. K P(X LESS OR = K) K P(X LESS OR = K) 40 0.9746 41 0.9834 The 95% confidence limits around the observed count of 30 are 20 and 40. The 95% confidence limits around the estimate $N = 30/0.1 = 300 are L = 20/0.1 = 200 U = 40/0.1 = 400 2. Absolute. b Strip transect. Same as quadrat. c. Line transect. Estimate detection function. To correct for differential detection. d. Distance methods (1) from random point (2) from random individuals 6

II Mark-recapture A. Petersen. Closed population, single mark. Based on idea that probability of capture initially = probability of recapture Probability = Probability initial capture recapture n initial capture n marked SSSSSSSS = SSSSSSSS N n second capture Hence: $N = n initial capture SSSSSSSS @ n second capture n marked The estimate $N is obtained by boosting the original number captured n initial capture upward by the ratio of n second capture /n marked Advantages: Quick, requires only one type of mark, provides a good estimate if assumptions met. Disadvantages. Assumptions often not met. B. Closed population multiple mark Use two or more marks to obtain more information Two tags at same time to evaluate tag loss different marks should give same estimate tags can be in different proportions, depending on cost Two more tags in sequence. One tag at time 1 Next tag at time 2 Still another tag at time 3 More information allows mortality and recruitment also to be estimated. These effects can then be removed to obtain less biased estimate of N Usually better to carry out over short period, to reduce bias due to mortality and recruitment. Pick times with low mortality and recruitment. Advantages: mortality and tag loss can be taken into account Disadvantage: more work, more chance of tag effects. 7

C. Open populations. Immigration/emigration has to be measured, to remove biassing effect on estimate of N This can be done with accessory information on rates of movement. This can also be accomplished by using multiple marks (sequential) in a well designed study. D. Catch/Effort methods. Individuals are captured and removed. Ie. "marked" Regress catch per unit effort against trial. Intercept ("trial 0") is estimate of population size. E. Change in ratio methods. Probability changes. Application: How would you determine the number of newly settled codfish? You can ask me anything you like. I'll tell you, if it is known. From this decide on methods that can be used. 8