Sampling (statistics)

From Wikipedia, the free encyclopedia (Sundanese edition)


Sampling is the part of statistical practice concerned with selecting the individuals to be studied, in the expectation that they will yield knowledge about a population, especially for the purposes of statistical inference. In particular, results from probability theory and statistical theory are used to guide practice.

The sampling process has five stages:

  • Define the population of concern
  • Specify a sampling frame, the set of items or events that it is possible to measure
  • Specify a sampling method for selecting items or events from the frame
  • Take the sample and collect the data
  • Review the sampling process

Population definition

Successful statistical practice is based on focused problem definition. Typically, we seek to take action on some population, for example when a batch of material from production must be released to the customer or sentenced for scrap or rework. Alternatively, we seek knowledge about the cause system of which the population is an outcome, for example when a researcher performs an experiment on rats with the intention of gaining insights into biochemistry that can be applied for the benefit of humans. In the latter case, the population of concern can be difficult to specify, as it is in the case of measuring some physical characteristic such as the electrical conductivity of copper.

However, in all cases, time spent in making the population of concern precise is well spent, because it often raises many issues, ambiguities and questions that would otherwise have been overlooked at this stage.

Sampling frame

In the most straightforward case, such as the sentencing of a batch of material from production (acceptance sampling by lots), it is possible to identify and measure every single item in the population and to include any one of them in our sample. However, in the more general case this is not possible. There is no way to identify all rats in the set of all rats. There is no way to identify every voter at a forthcoming election (in advance of the election).

Such imprecisely defined populations are not amenable to sampling in any of the ways described below, to which statistical theory could be applied.

As a remedy, we seek a sampling frame which has the property that we can identify every single element and include any in our sample. For example, in an electoral poll, possible sampling frames include:

  • Electoral register
  • Telephone directory
  • Shoppers in Anytown High Street on the Monday afternoon before the election

The sampling frame must be representative of the population and this is a question outside the scope of statistical theory demanding the judgement of experts in the particular subject matter being studied. All the above frames omit some people who will vote at the next election and contain some people who will not. People not in the frame have no prospect of being sampled. Statistical theory tells us about the uncertainties in extrapolating from a sample to the frame. In extrapolating from frame to population its role is motivational and suggestive.
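
The mismatch between frame and population described above can be sketched with plain sets. This is only an illustration: the names are hypothetical, and in practice the set of people who will actually vote is not known in advance, which is why judging the frame calls for subject-matter expertise rather than computation.

```python
# Hypothetical identifiers, invented for illustration only.
population = {"ana", "ben", "cat", "dov", "eve"}  # people who will actually vote
frame = {"ben", "cat", "dov", "fay"}              # e.g. names in a telephone directory


def coverage(population, frame):
    """Compare a sampling frame with the population of concern."""
    return {
        "undercoverage": population - frame,  # in the population, can never be sampled
        "overcoverage": frame - population,   # can be sampled, but not of interest
        "covered": population & frame,        # reachable members of the population
    }


report = coverage(population, frame)
print("never sampled:", sorted(report["undercoverage"]))
print("sampled but irrelevant:", sorted(report["overcoverage"]))
```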

In defining the frame, practical, economic, ethical and technical issues need to be addressed. The need to obtain timely results may prevent extending the frame far into the future.

The difficulties can be extreme when the population and frame are disjoint. This is a particular problem in forecasting, where inferences about the future are made from historical data. In 1703, when Jacob Bernoulli proposed to Gottfried Leibniz the possibility of using historical mortality data to predict the probability of early death of a living man, Leibniz recognised the problem in replying:

Nature has established patterns originating in the return of events but only for the most part. New illnesses flood the human race, so that no matter how many experiments you have done on corpses, you have not thereby imposed a limit on the nature of events so that in the future they could not vary.

Having established the frame, there are a number of ways of organising it to improve efficiency and effectiveness.

Simple sampling

In this case, all elements of the frame are treated equally and it is not subdivided or partitioned. One of the sampling methods below is applied to the whole frame.

Stratified sampling

Where the population embraces a number of distinct categories, the frame can be organised by these categories into separate strata or demographics. One of the sampling methods below is then applied to each stratum separately, maintaining the same balance in numbers as exists in the population and resulting in an improvement in precision.
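
A minimal sketch of this idea, assuming proportional allocation (each stratum contributes in proportion to its size) and simple random selection within each stratum. The function name `stratified_sample`, the `stratum_of` callback and the urban/rural frame are hypothetical choices made for the example.

```python
import random
from collections import defaultdict


def stratified_sample(frame, stratum_of, n, seed=None):
    """Draw about n items, allocating to each stratum in proportion to its size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in frame:
        strata[stratum_of(item)].append(item)
    sample = []
    for members in strata.values():
        # Proportional allocation; rounding means the total may differ slightly from n.
        k = round(n * len(members) / len(frame))
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample


# Hypothetical frame: 60 urban and 40 rural units.
frame = [("urban", i) for i in range(60)] + [("rural", i) for i in range(40)]
sample = stratified_sample(frame, stratum_of=lambda item: item[0], n=10, seed=1)
print(sample)
```

With this frame, a sample of 10 contains 6 urban and 4 rural units, mirroring the 60:40 balance of the frame.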

Cluster sampling

Where items in the population are clustered, sampling can reflect this to minimise costs. For example, in a national survey by personal interview, many people will be remotely located and costly to reach. Cluster sampling locates the frame in areas of concentrated habitation.
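
One common way to realise this, shown here only as a sketch, is one-stage cluster sampling: choose a few clusters at random and survey every unit inside them, so that fieldwork is concentrated in a small number of places. The village names and the `cluster_sample` helper are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Pick n_clusters cluster names at random and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units


# Hypothetical frame organised by place of residence.
clusters = {
    "village_a": ["a1", "a2", "a3"],
    "village_b": ["b1", "b2"],
    "village_c": ["c1", "c2", "c3", "c4"],
    "village_d": ["d1"],
}
chosen, units = cluster_sample(clusters, n_clusters=2, seed=0)
print(chosen, units)
```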

Multistage sampling

...

Sampling method

Within any of the types of frame identified above, a variety of sampling methods can be employed, individually or in combination.

Random sampling

In random sampling, every combination of a given number of items from the frame, or stratum, has an equal probability of being selected. Random selection protects against selection bias, so the sample is representative of the frame on average, although any single sample may still differ from it by chance. It is infeasible in many practical situations. It is a type of probability sampling.
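
As a minimal sketch, Python's standard `random.Random.sample` draws without replacement so that every subset of the requested size is equally likely. The frame of 200 labelled items and the helper name `simple_random_sample` are assumptions made for the example.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct items; every subset of size n is equally likely."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(list(frame), n)


# Hypothetical frame of 200 identifiable items.
frame = [f"item_{i:03d}" for i in range(200)]
sample = simple_random_sample(frame, n=20, seed=42)
print(sample)
```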

Systematic sampling

Selecting (say) every tenth name from the telephone directory is an example of systematic sampling. Though simple to implement, periodicities and other structure in the ordering of the frame can lead to bias in results. When the starting point is chosen at random it is a type of probability sampling; with a fixed starting point it is a type of nonprobability sampling.
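
The sketch below takes every tenth entry from a hypothetical directory, starting at a random offset. The random start is an addition not mentioned in the telephone-directory example; it gives each entry the same chance of selection, but it does not remove bias caused by a periodic ordering of the frame.

```python
import random


def systematic_sample(frame, step, seed=None):
    """Take every step-th item, starting from a random offset in [0, step)."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return frame[start::step]


# Hypothetical directory of 95 names, already in listing order.
directory = [f"name_{i:03d}" for i in range(95)]
sample = systematic_sample(directory, step=10, seed=7)
print(sample)
```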

Mechanical sampling

Mechanical sampling occurs typically in sampling solids, liquids and gases, using devices such as grabs, scoops, thief probes, the COLIWASA (composite liquid waste sampler) and riffle splitters.

Mechanical sampling is not random and is a type of nonprobability sampling. Care is needed in ensuring that the sample is representative of the frame. Much of the theory in this area was developed by Pierre Gy.

Convenience sampling

Sometimes called grab sampling, this is the method of choosing items arbitrarily and in an unstructured manner from the frame. Though almost impossible to treat rigorously, it is the method most commonly employed in many practical situations.

Sample size

Where the frame and population are identical, statistical theory yields exact recommendations on sample size. However, where it is not straightforward to define a frame representative of the population, it is more important to understand the cause system of which the population is an outcome and to ensure that all sources of variation are embraced in the frame. A large number of observations is of no value if major sources of variation are neglected in the study.
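
As an illustration of the kind of recommendation theory can give in the first case, the sketch below uses a common textbook formula for estimating a mean by simple random sampling: n0 = (z·σ/E)², optionally reduced by a finite population correction n = n0 / (1 + n0/N). It assumes the standard deviation σ is known or well estimated and that the normal approximation holds; the function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95, population_size=None):
    """Sample size so that the mean is estimated to within +/- margin."""
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n0 = (z * sigma / margin) ** 2
    if population_size is not None:
        # Finite population correction for sampling without replacement.
        n0 = n0 / (1 + n0 / population_size)
    return math.ceil(n0)


# Hypothetical inputs: sigma = 10 units, tolerable error = 2 units.
print(sample_size_for_mean(sigma=10, margin=2))
print(sample_size_for_mean(sigma=10, margin=2, population_size=500))
```

The formula says nothing about sources of variation that the frame leaves out, which is the more important concern raised above.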

Sampling and data collection

Good data collection involves:

  • Following the defined sampling process
  • Keeping the data in time order
  • Noting comments and other contextual events
  • Recording non-responses

Review of sampling process

After sampling, a review should be held of the exact process followed in sampling, rather than that intended, in order to study any effects that any divergences might have on subsequent analysis. A particular problem is that of non-responses.

Non-responses

In survey sampling, many of the individuals identified as part of the sample may be unwilling to participate or impossible to contact. In this case, there is a risk that differences between (say) the willing and the unwilling will bias the conclusions. This is often addressed by follow-up studies, which make repeated attempts to contact the unresponsive and to characterise how they resemble and differ from the rest of the frame.
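
A small sketch of how a follow-up study can quantify this risk. It assumes, hypothetically, that the follow-up eventually obtains answers from everyone who did not respond at first; the scores are invented for illustration. The bias of the respondents-only mean equals the non-response share multiplied by the difference between the two group means.

```python
from statistics import mean


def nonresponse_summary(respondents, nonrespondents):
    """Return the response rate and the bias of the respondents-only mean."""
    n_r, n_nr = len(respondents), len(nonrespondents)
    response_rate = n_r / (n_r + n_nr)
    full_mean = mean(respondents + nonrespondents)
    bias = mean(respondents) - full_mean
    return response_rate, bias


initial = [4, 5, 5, 4, 5, 4]  # hypothetical scores from first-wave respondents
follow_up = [2, 3, 2, 3]      # hypothetical scores obtained only after follow-up
rate, bias = nonresponse_summary(initial, follow_up)
print(f"response rate = {rate:.0%}, bias of respondents-only mean = {bias:+.2f}")
```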

Bibliography

  • Cochran, W G (1977) Sampling Techniques
  • Deming, W E (1975) On probability as a basis for action, The American Statistician, 29(4), pp146-152
  • Gy, P (1992) Sampling of Heterogeneous and Dynamic Material Systems: Theories of Heterogeneity, Sampling and Homogenizing

References

  • Brown, K.W., Cozby, P.C., Kee, D.W., & Worden, P.E. (1999). Research Methods in Human Development, 2d ed. Mountain View, CA : Mayfield. ISBN 1-55934-875-5
  • Bartlett, J. E., II, Kotrlik, J. W., & Higgins, C. (2001). Organizational research: Determining appropriate sample size for survey research. Information Technology, Learning, and Performance Journal, 19(1) 43-50.
  • Chambers, R L, and Skinner, C J (editors) (2003), Analysis of Survey Data, Wiley, ISBN 0-471-89987-9
  • Cochran, W G (1977) Sampling Techniques, Wiley, ISBN 0-471-16240-X
  • Deming, W E (1975) On probability as a basis for action, The American Statistician, 29(4), pp146-152.
  • Flyvbjerg, B (2006) "Five Misunderstandings About Case Study Research." Qualitative Inquiry, vol. 12, no. 2, April 2006, pp. 219-245.
  • Gy, P (1992) Sampling of Heterogeneous and Dynamic Material Systems: Theories of Heterogeneity, Sampling and Homogenizing
  • Kish, L (1995) Survey Sampling, Wiley, ISBN 0-471-10949-5
  • Korn, E L, and Graubard, B I (1999) Analysis of Health Surveys, Wiley, ISBN 0-471-13773-1
  • Lohr, S L (1999) Sampling: Design and Analysis, Duxbury, ISBN 0-534-35361-4
  • Särndal, Swensson, and Wretman (1992), Model Assisted Survey Sampling, Springer-Verlag, ISBN 0-387-40620-4
  • Stuart, Alan (1962) Basic Ideas of Scientific Sampling, Hafner Publishing Company, New York
  • ASTM E105 Standard Practice for Probability Sampling Of Materials
  • ASTM E122 Standard Practice for Calculating Sample Size to Estimate, With a Specified Tolerable Error, the Average for Characteristic of a Lot or Process
  • ASTM E141 Standard Practice for Acceptance of Evidence Based on the Results of Probability Sampling
  • ASTM E1402 Standard Terminology Relating to Sampling
  • ASTM E1994 Standard Practice for Use of Process Oriented AOQL and LTPD Sampling Plans
  • ASTM E2234 Standard Practice for Sampling a Stream of Product by Attributes Indexed by AQL