Definition of sampling:
“In research terms, a sample is a group of people, objects, or items that are taken from a larger population for measurement. The sample should be representative of the population to ensure that we can generalize the findings from the research sample to the population as a whole”.
Types of sampling:
There are two major sampling types: nonprobabilistic and probabilistic.
“In the context of nonprobabilistic sampling, the likelihood of selecting some individuals from the target population is null. This type of sampling does not render a representative sample; therefore, the observed results are usually not generalizable to the target population”. Still, unrepresentative samples may be useful for some specific research objectives and may help answer particular research questions, as well as contribute to the generation of new hypotheses.
- The different types of nonprobabilistic sampling are detailed below:
- Convenience sampling:
“The participants are consecutively selected in order of appearance according to their convenient accessibility (also known as consecutive sampling).”
The sampling process comes to an end when the total number of participants (sample saturation) and/or the time limit (time saturation) is reached. Randomized clinical trials are usually based on convenience sampling. After sampling, participants are usually randomly allocated to the intervention or control group (randomization). Although randomization is a probabilistic process to obtain two comparable groups (treatment and control), the samples used in these studies are generally not representative of the target population.
- Purposive sampling:
“This is used when a diverse sample is necessary or the opinion of experts in a particular field is the topic of interest.”
This technique was used in the study by Roubille et al, in which recommendations for the treatment of comorbidities in patients with rheumatoid arthritis, psoriasis, and psoriatic arthritis were made based on the opinion of a group of experts.
- Quota sampling:
“According to this sampling technique, the population is first classified by characteristics such as gender, age, etc. Subsequently, sampling units are selected to complete each quota.”
For example, in the study by Larkin et al., the combination of vemurafenib and cobimetinib versus placebo was tested in patients with locally advanced melanoma, stage IIIC or IV, with BRAF mutation. The study recruited 495 patients from 135 health centers located in several countries. In this type of study, each center has a “quota” of patients.
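The quota procedure quoted above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `quota_sample`, the field name `sex`, and the candidate records are all made up, and it assumes candidates are considered in the order in which they become accessible.

```python
from collections import defaultdict

def quota_sample(candidates, key, quotas):
    """Fill each quota with the first candidates encountered in that category.

    candidates: iterable of dicts, in the order they become available
    key:        name of the field used to classify candidates (e.g. "sex")
    quotas:     dict mapping category -> number of units wanted
    """
    selected = defaultdict(list)
    for person in candidates:
        category = person[key]
        # Accept the candidate only while the quota for the category is open.
        if len(selected[category]) < quotas.get(category, 0):
            selected[category].append(person)
        # Stop early once every quota is complete.
        if all(len(selected[c]) == n for c, n in quotas.items()):
            break
    return dict(selected)

# Made-up candidates, listed in order of accessibility.
people = [
    {"id": 1, "sex": "F"}, {"id": 2, "sex": "F"}, {"id": 3, "sex": "M"},
    {"id": 4, "sex": "F"}, {"id": 5, "sex": "M"}, {"id": 6, "sex": "M"},
]
sample = quota_sample(people, key="sex", quotas={"F": 2, "M": 2})
```

Because units are taken as they come rather than drawn at random, the result stays nonprobabilistic even though the quotas balance the categories.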
- “Snowball” sampling:
“In this case, the researcher selects an initial group of individuals. Then, these participants indicate other potential members with similar characteristics to take part in the study.”
This is frequently used in studies investigating special populations, for example, those including illicit drug users, as was the case of the study by Gonçalves et al, which assessed 27 users of cocaine and crack in combination with marijuana.
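As a rough illustration of the snowball idea, the sketch below treats referrals as a simple lookup table and recruits wave by wave until a target size is reached. The names `snowball_sample` and `referrals`, and the participant labels, are hypothetical and are not taken from the cited study.

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Recruit the initial group first, then the people they indicate."""
    recruited = []
    seen = set()
    queue = deque(seeds)
    while queue and len(recruited) < max_size:
        person = queue.popleft()
        if person in seen:
            continue  # already recruited through another referral
        seen.add(person)
        recruited.append(person)
        # Each recruited participant indicates further potential members.
        queue.extend(referrals.get(person, []))
    return recruited

# Made-up referral network: who indicates whom.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["B", "E"], "D": [], "E": ["A"]}
sample = snowball_sample(seeds=["A"], referrals=referrals, max_size=4)
```

The sample can only reach people connected to the initial group, which is one reason the method does not yield a representative sample.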
“In the context of probabilistic sampling, all units of the target population have a nonzero probability to take part in the study.”
If all participants are equally likely to be selected in the study, equiprobability sampling is being used, and the probability of being selected by the research team may be expressed by the formula P = 1/N, where P equals the probability of taking part in the study and N corresponds to the size of the target population. The main types of probabilistic sampling are described below.
- Simple random sampling:
“In this case, we have a full list of sample units or participants (sample basis), and we randomly select individuals using a table of random numbers”.
An example is a study by Pimenta et al, in which the authors obtained a listing from the Health Department of all elderly enrolled in the Family Health Strategy and, by simple random sampling, selected a sample of 449 participants.
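A minimal sketch of simple random sampling, assuming the sampling frame is available as a Python list. The frame of 1,000 identifiers and the function name `simple_random_sample` are made up for illustration; Python's pseudo-random generator stands in for the table of random numbers.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: identifiers 1..1000 standing in for a full list.
frame = list(range(1, 1001))
sample = simple_random_sample(frame, n=50, seed=42)

# A single draw selects a given unit with probability 1/N; a sample of
# n distinct units includes a given unit with probability n/N.
single_draw_probability = 1 / len(frame)
inclusion_probability = len(sample) / len(frame)
```

Fixing the seed makes the draw reproducible, which helps when the selection has to be documented or audited.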
- Systematic random sampling:
“In this case, participants are selected from fixed intervals previously defined from a ranked list of participants”.
For example, in the study of Kelbore et al, children who were assisted at the Pediatric Dermatology Service were selected to evaluate factors associated with atopic dermatitis, always selecting every second child in order of consultation.
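Selection at fixed intervals can be sketched as list slicing. The function name `systematic_sample` and the list of ten children are hypothetical; the random start is an assumption about common practice and is not described in the cited study.

```python
import random

def systematic_sample(ordered_units, interval, start=None, seed=None):
    """Select every `interval`-th unit from an ordered list.

    If `start` is not given, a random starting position within the first
    interval is drawn (an assumption of this sketch).
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    if start is None:
        start = random.Random(seed).randrange(interval)
    return ordered_units[start::interval]

# Hypothetical consultation order: children numbered 1..10.
children = list(range(1, 11))
# start=1 is the zero-based position of the second child.
every_second = systematic_sample(children, interval=2, start=1)
```

With an interval of 2 and a start at the second position, the sketch returns the 2nd, 4th, 6th, 8th and 10th child.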
- Stratified sampling:
“In this type of sampling, the target population is first divided into separate sections. Then, samples are selected within each section, either through simple or systematic sampling.
The total number of individuals to be selected in each section can be fixed or proportional to the size of each section. Each individual may be equally likely to be selected to participate in the study.”
However, the fixed method usually involves the use of sampling weights in the statistical analysis (inverse of the probability of selection or 1/P).
An example is a study conducted in South Australia to investigate factors associated with vitamin D deficiency in preschool children. Using the national census as the sample frame, households were randomly selected in each stratum and all children in the age group of interest identified in the selected houses were investigated.
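The difference between a fixed and a proportional number of units per section, and the sampling weight 1/P mentioned above, can be illustrated as follows. This is a sketch under assumptions: the strata "urban" and "rural", their sizes, and the function names are invented, and simple random sampling is used inside each stratum.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample inside each stratum.

    strata:        dict mapping stratum name -> list of units
    n_per_stratum: dict mapping stratum name -> number of units to draw
    Returns (stratum, unit, weight) tuples, where weight = 1/P and
    P = n_h / N_h is the selection probability inside stratum h.
    """
    rng = random.Random(seed)
    rows = []
    for name, units in strata.items():
        n_h = n_per_stratum[name]
        probability = n_h / len(units)
        for unit in rng.sample(units, n_h):
            rows.append((name, unit, 1 / probability))
    return rows

def proportional_allocation(strata, n_total):
    """Split n_total across strata in proportion to stratum size.

    Rounding means the parts may not add up to n_total in every case.
    """
    population = sum(len(units) for units in strata.values())
    return {name: round(n_total * len(units) / population)
            for name, units in strata.items()}

# Made-up strata of unequal size (800 urban units, 200 rural units).
strata = {"urban": list(range(800)), "rural": list(range(800, 1000))}

fixed = stratified_sample(strata, {"urban": 50, "rural": 50}, seed=1)
proportional = stratified_sample(strata, proportional_allocation(strata, 100), seed=1)
```

With the fixed allocation, each urban unit carries a weight of 16 and each rural unit a weight of 4, so the weights are needed in the analysis; with the proportional allocation every unit has the same weight of 10.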
- Cluster sampling:
In this type of probabilistic sampling, groups such as health facilities, schools, etc., are sampled. In the above-mentioned study, the selection of households is an example of cluster sampling.
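A small sketch of cluster sampling, assuming that households are the clusters and that every child in a selected household is included. The household and child labels and the function name `cluster_sample` are made up.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and include every member of each."""
    rng = random.Random(seed)
    # Sorting the cluster names gives a stable list to draw from.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Made-up households (clusters) and the children living in each.
households = {
    "house_1": ["child_a", "child_b"],
    "house_2": ["child_c"],
    "house_3": ["child_d", "child_e", "child_f"],
    "house_4": [],
}
selected = cluster_sample(households, n_clusters=2, seed=0)
children = [child for members in selected.values() for child in members]
```

The random draw happens at the level of the household, not the child, so the number of children obtained is not known in advance.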
- Complex or multi-stage sampling:
“This probabilistic sampling method combines different strategies in the selection of the sample units”.
An example is the study of Duquia et al. to assess the prevalence and factors associated with the use of sunscreen in adults. The sampling process included two stages.
Using the 2000 Brazilian demographic census as a sampling frame, all 404 census tracts from Pelotas (Southern Brazil) were listed in ascending order of family income. A sample of 120 tracts was systematically selected (first sampling stage units).
In the second stage, 12 households in each of these census tracts (second sampling stage units) were systematically drawn. All adult residents in these households were included in the study (third sampling stage units). All these stages have to be considered in the statistical analysis to provide correct estimates.
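The two systematic stages described above can be sketched as follows. Only the counts (404 tracts, 120 selected, 12 households per tract) come from the text; the tract data are randomly generated placeholders, the function names are hypothetical, and the fractional-interval rule used here is one common way to handle a non-integer interval (404/120), not necessarily the one the authors used.

```python
import random

def systematic_indices(population_size, n, rng):
    """Pick n positions at a fixed (possibly fractional) interval."""
    step = population_size / n
    start = rng.random() * step  # random start inside the first interval
    # The min() guards against a floating-point overshoot at the last position.
    return [min(int(start + i * step), population_size - 1) for i in range(n)]

def two_stage_sample(tracts, n_tracts, n_households, seed=None):
    """Stage 1: systematic sample of tracts ordered by income.
    Stage 2: systematic sample of households inside each selected tract."""
    rng = random.Random(seed)
    ordered = sorted(tracts, key=lambda t: t["income"])
    sample = []
    for i in systematic_indices(len(ordered), n_tracts, rng):
        tract = ordered[i]
        homes = tract["households"]
        for j in systematic_indices(len(homes), n_households, rng):
            sample.append((tract["id"], homes[j]))
    return sample

# Placeholder frame: 404 tracts, each with 60 numbered households.
gen = random.Random(0)
tracts = [{"id": t, "income": gen.random(), "households": list(range(60))}
          for t in range(404)]
sample = two_stage_sample(tracts, n_tracts=120, n_households=12, seed=1)
```

Ordering the tracts by income before the systematic draw spreads the selected tracts across the whole income range, which is why the list is sorted first.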
Frequently, sample sizes are increased by 10% to compensate for potential nonresponses (refusals/losses). Let us imagine that in a study to assess the prevalence of premalignant skin lesions there is a higher percentage of nonrespondents among men (10%) than among women (1%).
If the higher percentage of nonresponse occurs because these men are not at home during the scheduled visits, and these participants are more likely to be exposed to the sun, the number of skin lesions will be underestimated.
For this reason, it is strongly recommended to collect and describe some basic characteristics of nonrespondents (sex, age, etc.) so they can be compared to the respondents to evaluate whether the results may have been affected by this systematic error.
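The two practical steps mentioned above, inflating the sample size by the expected nonresponse and comparing nonresponse across basic characteristics, can be sketched as follows. The required sample size of 400 and the fieldwork log are invented to mirror the example (10% of men and 1% of women lost); the function names are hypothetical.

```python
def inflate_sample_size(n_required, nonresponse_percent=10):
    """Increase the calculated sample size by the expected nonresponse (%)."""
    # Integer arithmetic avoids floating-point surprises; the result is rounded up.
    numerator = n_required * (100 + nonresponse_percent)
    return -(-numerator // 100)

def nonresponse_rate_by_group(invited, key="sex"):
    """Share of nonrespondents in each group of invited individuals."""
    totals, missing = {}, {}
    for person in invited:
        group = person[key]
        totals[group] = totals.get(group, 0) + 1
        if not person["responded"]:
            missing[group] = missing.get(group, 0) + 1
    return {g: missing.get(g, 0) / totals[g] for g in totals}

n_to_invite = inflate_sample_size(400)

# Made-up fieldwork log: 20 of 200 men and 2 of 200 women did not respond.
invited = (
    [{"sex": "M", "responded": i >= 20} for i in range(200)]
    + [{"sex": "F", "responded": i >= 2} for i in range(200)]
)
rates = nonresponse_rate_by_group(invited)
```

A gap between groups like the one in `rates` is a signal to check whether nonrespondents differ from respondents in ways related to the outcome.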
Often, in study protocols, refusal to participate or sign the informed consent is considered an “exclusion criterion”. However, this is not correct, as these individuals are eligible for the study and need to be reported as “nonrespondents”.
Sampling method according to the type of study:
In general, clinical trials aim to obtain a homogeneous sample that is not necessarily representative of any target population.
Clinical trials often recruit those participants who are most likely to benefit from the intervention. Thus, the stricter criteria for inclusion and exclusion of subjects in clinical trials often make it difficult to locate participants: after verification of the eligibility criteria, just one out of ten possible candidates will enter the study.
Therefore, the results of clinical trials usually cannot be generalized to the entire population of patients with the disease, but only to those with characteristics similar to the sample included in the study. These peculiarities of clinical trials justify the necessity of conducting multicenter and/or global studies to accelerate the recruitment rate and to reach, in a shorter time, the number of patients required for the study.
In observational studies, in turn, building a solid sampling plan is important because of the great heterogeneity usually observed in the target population.
This heterogeneity also has to be reflected in the sample. A cross-sectional population-based study aiming to assess disease estimates or identify risk factors often uses complex probabilistic sampling, because the sample representativeness is crucial.
However, in a case-control study, we face the challenge of selecting two different samples for the same study. One sample is formed by the cases, which are identified based on the diagnosis of the disease of interest.
The other consists of controls, which need to be representative of the population that originated the cases. Improper selection of control individuals may introduce selection bias in the results. Thus, the concern with representativeness in this type of study is established based on the relationship between cases and controls (comparability).
In cohort studies, individuals are recruited based on the exposure (exposed and unexposed subjects), and they are followed over time to evaluate the occurrence of the outcome of interest.
At baseline, the sample can be selected from a representative sample (population-based cohort studies) or a non-representative sample. However, in the successive follow-ups of the cohort members, study participants must be a representative sample of those included at baseline. In this type of study, losses over time may cause follow-up bias.
How can we select research participants?
During the planning phase, you thought about which community members would best be able to provide the information you want, or if you are looking at issues within your organization or department, which staff members can provide the information.
As well as these questions, there are many other decisions you will need to make when selecting your participants. This section provides you with a list of issues that you will need to consider before making the final decisions regarding study participants.
- Method of selection (sampling):
There are many methods for selecting your participants, and the type of sampling will depend on how you will use the information. Focus group results cannot usually be used to describe how an entire population would respond to the same questions, so the type of sampling used in studies designed to describe whole populations is not necessary.
The common (and simplest) method for selecting participants for focus groups is called “purposive” or “convenience” sampling. This means that you select those members of the community who you think will provide you with the best information. It need not be a random selection; indeed, a random sample may be foolish.
For example, if you are investigating why leprosy patients do not always present for medication, it would seem more “convenient” and more useful to select those patients, relatives, and staff involved in the leprosy program. A random sample of the whole community may not provide you with a single person with leprosy!
- Who can provide the best information?
Do not forget to think carefully about all aspects of the problem and be creative when deciding who can provide you with the best information. People in positions of power and authority, or with technical skills, are not necessarily the best people to talk to if you are interested in community attitudes and beliefs.
Sometimes less obvious people can be extremely useful. Try to think of all the members of your community that could have some knowledge or influence on the problem.
If you do not understand the community well enough to know who can be of most use, do not be afraid to ask local health staff, local leaders, or simply members of the community that you have access to. Never rely only on your ideas about a problem, particularly where you are studying people’s attitudes and beliefs. You could be on the wrong track completely by viewing things from your own experiences.
- What will the composition be in each focus group?
As focus groups are discussions among people with similar characteristics, it is important to ensure that participants in any one group have something in common with each other. The reason for this is simple. People talk more openly if they are in a group of people who share the same background or experiences. For example, suppose you are interested in sexual practices in a project concerned with community education to prevent HIV/AIDS.
A group that included both young single women and older married women might not be very successful; the young women may feel obliged to discuss “acceptable” practices rather than their true range of experiences and behaviors. Participants with different backgrounds and experiences can restrict the openness of discussion within the group. Given this, you need to think about the status of participants in the community, their socio-economic status, educational background, religion, sex, age, and so on, considering which characteristics might most influence a free and natural discussion.
- How many groups are necessary?
In general, once the focus groups cease to provide you with new information, you do not need to conduct any more sessions. Sometimes this may occur after only two or three sessions with each grouping of participants; sometimes you may need to run six, seven, or more before you are satisfied. If this is the first time your team has used focus groups, then you also need to allow for a few practice sessions that may not provide you with the quality of information you require.
You should group “types” of people together. This is probably obvious, but worth mentioning. Say, for example, in a study of leprosy, you have identified as target groups for focus group discussions local health workers, traditional healers, adult patients, caretakers of young people with leprosy, and other members of families with leprosy patients. It would be most appropriate to conduct focus groups separately for each group.
However, do not get too complicated in your selection process. This is a very easy mistake to make! In the above example, you already have identified five separate groups of participants.
If you now decided that sex, education and residence might all inhibit discussion, and so decided to interview women and men separately, to interview those with and without formal education, and to interview rural and urban dwellers separately, and you aim to hold three focus groups for each group of participants, you’d end up with 120 focus groups! Use your common sense about the criteria for selection.
Ask yourself some basic questions. Will separating leprosy patients according to education, for example, really provide you with more clues to understanding their presentation for therapy?
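The figure of 120 focus groups in the leprosy example follows from multiplying the selection criteria. The short sketch below enumerates the groupings; the category labels are paraphrased from the text, and the variable names are only illustrative.

```python
from itertools import product

participant_types = [
    "local health workers",
    "traditional healers",
    "adult patients",
    "caretakers of young people with leprosy",
    "other family members",
]
sex = ["women", "men"]
education = ["formal education", "no formal education"]
residence = ["rural", "urban"]
sessions_per_grouping = 3

# Every combination of participant type, sex, education and residence.
groupings = list(product(participant_types, sex, education, residence))
total_focus_groups = len(groupings) * sessions_per_grouping
```

Here `len(groupings)` is 5 x 2 x 2 x 2 = 40 and `total_focus_groups` is 120; dropping a single two-level criterion halves the workload, which is a quick way to see the cost of each extra criterion.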
- How many participants do we want to select?
After deciding who it is you want to include in the project, you need to decide how many people you will want to contact for each session. Focus groups work well with around four to twelve people.
Groups with more than eight can be difficult to control, but the decision on how many you want in each group will depend on how your particular community groups together and conducts discussions in natural community settings.
If you have decided on eight participants for each group, it is still advisable to invite ten people, in case some do not arrive at the session. Be careful though not to over-recruit. In many communities, it would not be acceptable to turn away participants who had already arrived.
- How do we contact the participants?
This will depend, again, on the community with which you are working. Simply observe the local custom in your area. This usually involves contacting local leaders first, providing an explanation of the study, and gaining permission to work in that village or location.
It could also involve meeting with local health workers. Provided you approach such people appropriately, they will usually be happy to help you to locate individuals for the focus groups.
How much notice you give the participants of the focus groups will vary according to the logistics of gaining access to the community. It is ideal to notify the participants the week before and then provide a reminder the day before.
In many situations, this is not possible, and in some cases, participants have been successfully recruited one hour before the session! You need to consider your participants’ daily routine and take into account the ease or difficulty for them to attend the session. They are making a sacrifice to assist you, and this should always be recognized and allowed for.
When the participants are contacted for the first time, they should be provided with information about the study (without actually discussing the focus group questions or directly stating the aim of the study, as this may reduce the quality of the session), about why they have been selected, and how the results will be used.
For example, you might introduce a study on perceptions of disease in an area with a high prevalence of schistosomiasis by explaining that you are interested in the health problems of the study community, that to understand these you need to talk to as many people as possible, that you hope to learn from their own experience of health and illness, and that the information you gather will be used to help formulate plans to try to ensure better health.
At this time you will also need to check whether anything will need to be provided to help the participant attend, like child care or transport. Personal contact by the project team is strongly recommended as this can show the participants that their contribution is considered important.
How can we make a sampling plan in research?
“A sampling plan is a term widely used in research studies that provides an outline based on which research is conducted. It tells which category is to be surveyed, what the sample size should be, and how the respondents should be chosen out of the population.”
- Three major decisions when making a sampling plan:
The sampling plan is a base from which the research starts and includes the following three major decisions:
- The first and foremost decision in a sampling plan is the sampling unit, i.e. which category of the population is to be surveyed. E.g., in the case of the banking industry, should the sampling unit consist of current account holders, savings account holders, or both? Should it include male or female account holders? Once these decisions are made, the sampling frame is designed to give everyone in the target population an equal chance of being sampled.
- The second decision in the sampling plan is determining the size of the sample, i.e. how many objects in the sample are to be surveyed. Generally, “the larger the sample size, the more is the reliability” and therefore, researchers try to cover as many units as possible.
- The final decision that completes the sampling plan is selecting the sampling procedure, i.e. which method can be used such that every object in the population has an equal chance of being selected. Generally, researchers use probability sampling to determine the objects to be chosen, as it yields a sample that represents the population more accurately.
Define population and sample:
“A population is the entire group that you want to draw conclusions about”.
Populations are used when your research question requires, or when you have access to, data from every member of the population.
Usually, it is only straightforward to collect data from a whole population when it is small, accessible, and cooperative.
“A sample is the specific group that you will collect data from. The size of the sample is always less than the total size of the population.”
What do you understand by sample and population in research?
In research, a population doesn’t always refer to people. It can mean a group containing elements of anything you want to study, such as objects, events, organizations, countries, species, organisms, etc. The sample is a smaller portion of that population that suits your research problem and is intended to represent the whole population.