Research methodology vocabulary — quantitative, qualitative, and statistical

C1 academic English assumes a working vocabulary of research methodology. Whether you are reading a journal article, writing a methods section, listening to a research talk, or analyzing data presented in a memo, the language is dense with specialized terms that have specific meanings — and small mistranslations from Russian can produce serious misreadings. Variable, sample, correlation, significance all have technical definitions in research English that differ from their casual or Russian equivalents.

This lesson covers the vocabulary in four blocks: the basic architecture of empirical research (quantitative vs qualitative, sample vs population), the vocabulary of experiments (control, treatment, randomization, blinding), the language of variables and measurement (independent, dependent, confounding, operationalization), and the statistical vocabulary that appears constantly in modern academic and business writing (significance, p-value, confidence interval, effect size). The goal is recognition and productive use — at C1 you should be able to read a methods section without struggling and write your own at undergraduate-thesis level.

Note that we focus on US conventions, including American Psychological Association style for statistics reporting, which is the dominant convention in social sciences, business, and most applied fields in 2026.

Quantitative vs qualitative research

The first architectural distinction in research design is between quantitative and qualitative approaches.

Quantitative research

Quantitative research uses numerical data and statistical analysis to test hypotheses or estimate parameters. Typical methods:

Surveys with structured questionnaires
Experiments with random assignment
Observational studies with measurable outcomes
Secondary analysis of existing datasets

Quantitative results take the form of numbers, percentages, statistical tests, and confidence intervals. The language emphasizes precision, replicability, and generalization.

Qualitative research

Qualitative research uses non-numerical data — words, images, observations — to understand meaning, process, and context. Typical methods:

Interviews (structured, semi-structured, or open-ended)
Focus groups
Ethnographic observation
Document analysis and discourse analysis
Case studies

Qualitative results take the form of themes, narratives, and frameworks. The language emphasizes depth, context, and the interpretation of meaning.

Mixed methods

Mixed-methods research combines both approaches and is increasingly common in business research, public health, and applied social science. A typical mixed-methods design might use a quantitative survey to identify patterns and qualitative interviews to interpret why those patterns occur.

Vocabulary distinction

Quantitative	Qualitative
measure, quantify, estimate	explore, describe, interpret
sample, dataset, observation	participant, informant, case
hypothesis, prediction	research question, inquiry
variable, factor	theme, category, code
significance, effect size	salience, prominence
generalize, infer	transfer, illuminate

A Russian-speaker error: using quantitative and qualitative interchangeably or treating qualitative as a synonym for high-quality. Qualitative in research English refers specifically to non-numerical methods, not to quality assessment.

Sample and population

A population is the complete set of cases the research is about. A sample is the subset of the population actually studied. The relationship between sample and population governs whether results can be generalized.

Key terms

Population: the full group of interest (all US adults, all Fortune 500 companies, all hospital admissions in 2025)
Sample: the subset studied (1,000 US adults surveyed by Gallup)
Sampling frame: the list from which the sample is drawn (registered voters, company directory)
Sample size (denoted n): the number of cases in the sample
Representativeness: how well the sample reflects the population
Generalizability: whether sample results extend to the population
External validity: closely related — whether results extend to other contexts

Sampling strategies

Random sampling: every member of the population has an equal chance of selection
Stratified sampling: random sampling within subgroups (strata) to ensure representation
Cluster sampling: random selection of groups (clusters) rather than individuals
Convenience sampling: selecting whoever is available (weak generalizability; common in pilot studies)
Snowball sampling: participants recruit other participants (used for hard-to-reach populations)
Purposive sampling: deliberate selection based on criteria (qualitative research)
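
To make the contrast between the first two strategies concrete, here is a minimal Python sketch. The sampling frame (900 urban and 100 rural respondents) and the function names are hypothetical, chosen only for illustration. It shows why stratified sampling is used to guarantee that a small subgroup is represented.

```python
import random
from collections import defaultdict

def simple_random_sample(population, n, seed=0):
    """Draw n members so that every member has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_sample(population, stratum_of, n_per_stratum, seed=0):
    """Draw a simple random sample of fixed size inside each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for name in sorted(strata):
        sample.extend(rng.sample(strata[name], n_per_stratum))
    return sample

# Hypothetical sampling frame: 900 urban and 100 rural respondents.
frame = [("urban", i) for i in range(900)] + [("rural", i) for i in range(100)]

srs = simple_random_sample(frame, 50)
strat = stratified_sample(frame, lambda member: member[0], 25)

rural_in_srs = sum(1 for region, _ in srs if region == "rural")
rural_in_strat = sum(1 for region, _ in strat if region == "rural")
print(rural_in_srs, rural_in_strat)  # the stratified count is always 25
```

In the simple random sample the number of rural respondents varies from draw to draw (about 5 out of 50 on average); the stratified design fixes it in advance.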

Sample size language

A sample of 1,200 adults (specific number)
A nationally representative sample (claims representativeness)
An adequately powered sample (statistical-power language)
A convenience sample of 50 students (acknowledges limitations)

Experimental design vocabulary

Experiments aim to establish causal relationships by manipulating variables and measuring outcomes.

Core concepts

Treatment group (or experimental group): receives the intervention
Control group: does not receive the intervention; provides a baseline
Random assignment: participants assigned to groups by chance (key to causal inference)
Intervention: the treatment being tested
Outcome (or dependent variable): what is measured
Baseline: measurement before the intervention
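
Random assignment, as defined in the list above, can be sketched in a few lines of Python. The participant IDs and the function name are hypothetical; real trials typically use dedicated randomization procedures, so treat this only as an illustration of the idea that chance, not the researcher, decides group membership.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

participants = [f"P{i:03d}" for i in range(1, 241)]  # hypothetical IDs P001..P240
groups = randomly_assign(participants, seed=42)
print(len(groups["treatment"]), len(groups["control"]))  # 120 120
```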

Blinding

Single-blind: participants do not know which group they are in
Double-blind: neither participants nor researchers know
Triple-blind: data analysts are also blinded
Open-label: no blinding

Blinding reduces bias from expectations. Pharmaceutical trials use double-blinding wherever possible.

Types of experiments

Randomized controlled trial (RCT): the gold standard for causal inference
Quasi-experiment: experimental setup without true random assignment
Natural experiment: real-world events that approximate random assignment (a policy change in one state but not another)
Field experiment: experiment conducted in a real-world setting rather than a lab
A/B test: experimental comparison of two versions (common in tech and marketing)

Sample language

Participants were randomly assigned to either the treatment group, which received the new training program, or the control group, which continued with the existing curriculum. The study was single-blind: trainers knew the assignment, but participants did not.

Variables — the building blocks

A variable is anything that varies and can be measured or categorized. Research design hinges on identifying variables correctly.

Types of variables

Independent variable (IV): the variable manipulated or treated as the cause
Dependent variable (DV): the outcome being measured
Moderator variable: a variable that changes the strength or direction of the IV-DV relationship
Mediator variable: a variable that explains how the IV affects the DV
Control variable: a variable held constant to isolate the effect of the IV
Confounding variable (or confounder): a variable, often unmeasured, that influences both the IV and the DV and so distorts the apparent relationship between them

Example: studying whether exercise reduces depression.

IV: exercise
DV: depression symptoms
Moderator: age (effect may be larger for older adults)
Mediator: sleep quality (exercise improves sleep, which improves mood)
Confounder: pre-existing health condition (sicker people exercise less and are also more depressed)
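
The confounder in this example can be illustrated with a small simulation. The sketch below is built on invented assumptions: the coefficients (0.7), the noise levels, and the variable names are made up, and exercise is given no direct effect on depression at all. Even so, the two variables end up negatively correlated, purely because illness drives both.

```python
import math
import random
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

rng = random.Random(1)
exercise, depression = [], []
for _ in range(2000):
    illness = rng.gauss(0, 1)  # confounder: severity of a pre-existing condition
    # Sicker people exercise less...
    exercise.append(-0.7 * illness + rng.gauss(0, 1))
    # ...and report more symptoms. Exercise itself does not appear in this line.
    depression.append(0.7 * illness + rng.gauss(0, 1))

r = pearson_r(exercise, depression)
print(f"r = {r:.2f}")  # clearly negative despite no causal link
```

A reader who saw only the correlation might conclude that exercise reduces depression; in this simulated world that conclusion would be wrong.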

Measurement levels

Nominal: categories with no order (gender, country)
Ordinal: ordered categories without equal intervals (education level: high school, college, graduate)
Interval: ordered with equal intervals but no true zero (temperature in Fahrenheit)
Ratio: ordered, equal intervals, true zero (income, age, distance)

The measurement level determines which statistical tests are appropriate.

Operationalization

Operationalization is the process of defining how a concept is measured. Concepts like happiness, engagement, intelligence must be operationalized before they can be studied quantitatively.

We operationalized engagement as the number of weekly logins to the platform.

That definition is necessary for measurement but inevitably narrows the concept. A C1 reader of research knows to ask: is this operationalization a reasonable proxy for the underlying concept?

Statistical vocabulary — the C1 core

Statistical reporting appears constantly in academic and business writing. You do not need to do statistics at C1; you need to read and interpret them correctly.

Significance

Statistical significance means that the observed result is unlikely to have occurred by chance alone, given a chosen threshold. The conventional threshold in most fields is 5% — written as alpha equals 0.05 or the 5% level.

A result is described as significant when the p-value is below the threshold. Not significant means the data do not rule out chance.

Common phrases:

The effect was statistically significant at the 5% level.
The difference was not significant (p equals 0.12).
Significance was reached only after controlling for income.
The result remained significant in robustness checks.

The p-value

The p-value is the probability of observing a result at least as extreme as the one obtained, assuming the null hypothesis is true. Note the language carefully: the p-value is not the probability that the null hypothesis is true, nor the probability that the research hypothesis is wrong. Both are common misreadings.

Conventional reporting (APA style), given here as it is read aloud:

p equals 0.03 (specific value; written p = .03)
p less than 0.001 (very small; written p < .001)
p greater than 0.05 (not significant; written p > .05)

Important: APA style writes these with symbols and drops the zero before the decimal point, because a p-value cannot exceed 1. In flowing, non-technical prose you may spell the comparison out, as in p less than 0.05; in formal reports, use the symbolic notation.
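
One way to see what the p-value definition means is a permutation test, sketched below in Python. This is an illustration, not the method used in any particular study: the stress scores are invented, and the function name is hypothetical. The code shuffles the group labels many times (which is what the null hypothesis of no difference implies) and counts how often a difference at least as large as the observed one arises by chance.

```python
import random
import statistics

def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel cases at random, as if groups did not matter
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add 1 to numerator and denominator so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Hypothetical stress scores (lower is better).
treatment = [18, 21, 17, 19, 16, 20, 15, 18, 17, 19]
control = [22, 24, 21, 25, 20, 23, 22, 26, 21, 24]

p = permutation_p_value(treatment, control)
print(f"p = {p:.4f}")
```

Here almost no random relabeling reproduces a gap as large as the observed one, so the p-value is small and the difference would be reported as significant at the 5% level.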

Confidence interval

A confidence interval gives a range of plausible values for the estimated parameter, with a stated confidence level (usually 95%).

The estimated effect was a 12% increase, with a 95% confidence interval of 8% to 16%.

Interpretation: if the study were repeated many times, 95% of the constructed intervals would contain the true value. Note that 95% confident the true value lies in this interval is a slight misinterpretation that statisticians flag but practitioners use anyway.
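
The repeated-study interpretation can be checked by simulation. The Python sketch below assumes a hypothetical population (true mean 50, standard deviation 10) and uses a normal-approximation interval, which is one common choice among several. It runs many simulated studies and reports the share of intervals that contain the true mean.

```python
import random
import statistics

Z_95 = statistics.NormalDist().inv_cdf(0.975)  # about 1.96

def mean_ci_95(sample):
    """Normal-approximation 95% confidence interval for a mean."""
    m = statistics.fmean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5
    return m - Z_95 * se, m + Z_95 * se

def coverage(true_mean=50.0, sd=10.0, n=100, studies=2000, seed=0):
    """Share of repeated studies whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(studies):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        low, high = mean_ci_95(sample)
        if low <= true_mean <= high:
            hits += 1
    return hits / studies

print(round(coverage(), 3))  # should land close to 0.95
```

Any single interval either contains the true value or does not; the 95% describes the long-run success rate of the procedure.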

Effect size

Statistical significance tells you whether an effect can be distinguished from chance; effect size tells you how big it is. A study with a huge sample can detect a tiny effect as significant, yet the effect may not matter in practice.

Common effect size measures:

Cohen’s d: standardized mean difference (small: 0.2, medium: 0.5, large: 0.8)
Pearson’s r: correlation coefficient (range: minus 1 to plus 1)
Odds ratio: used for binary outcomes
R-squared: proportion of variance explained

Modern best practice in journals is to report effect size alongside significance — significance without effect size can mislead.
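
For readers who want to see where a value of Cohen's d comes from, here is a minimal Python sketch using the pooled standard deviation, which is one standard way to compute it. The data are invented, and the label "negligible" for values below 0.2 is an assumption added here; the benchmarks 0.2, 0.5, and 0.8 are the ones listed above.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd

def label(d):
    """Map |d| onto the conventional small / medium / large benchmarks."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # hypothetical label for values under 0.2

group_a = [2, 4, 6, 8, 10]  # invented scores, mean 6
group_b = [1, 3, 5, 7, 9]   # invented scores, mean 5

d = cohens_d(group_a, group_b)
print(f"d = {d:.2f} ({label(d)})")  # d = 0.32 (small)
```

The groups differ by one point on average, but because scores within each group vary widely, the standardized difference is small.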

Common statistical tests

t-test: compares means of two groups
ANOVA (analysis of variance): compares means across three or more groups
Chi-squared test: tests association between categorical variables
Regression (linear, logistic, multiple): models relationships between variables
Correlation: measures strength and direction of a linear relationship

You do not need to compute these at C1. You do need to recognize them when they appear in methods sections.

Causal language — the discipline of careful claims

A key C1 skill is calibrating causal language to the strength of the design.

Design strength	Allowable language
Randomized controlled trial	causes, leads to, produces (with appropriate hedging)
Quasi-experiment	appears to cause, is associated with, predicts
Observational study	is associated with, correlates with, is linked to
Cross-sectional survey	is correlated with, co-occurs with
Anecdote or case study	may suggest, raises the possibility that

The famous warning: correlation does not imply causation. Cross-sectional and observational studies can identify associations but cannot, on their own, establish that one variable causes another.

A C1 academic writer reading a press release that says eating broccoli causes longevity will check the underlying study — and almost always find that the study showed an association, not causation.

Reliability and validity

Two key methodological concepts that are often confused.

Reliability: consistency of measurement (same result on repeated measurement)
Validity: accuracy of measurement (measures what it claims to measure)

A scale that always reports your weight as 5 kg too heavy is reliable (consistent) but not valid (inaccurate). A scale that randomly varies but averages to the correct weight is valid on average but not reliable.
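
The two scales can be simulated to show how bias and spread separate validity from reliability. In the Python sketch below the true weight (70 kg) and the size of the random error (standard deviation 3 kg) are invented for illustration.

```python
import random
import statistics

TRUE_WEIGHT = 70.0  # kg, hypothetical

rng = random.Random(0)
# Scale 1: always 5 kg too heavy, with no random variation.
biased_scale = [TRUE_WEIGHT + 5.0 for _ in range(1000)]
# Scale 2: random error around the true weight, with no systematic bias.
noisy_scale = [TRUE_WEIGHT + rng.gauss(0, 3.0) for _ in range(1000)]

for name, readings in [("biased", biased_scale), ("noisy", noisy_scale)]:
    bias = statistics.fmean(readings) - TRUE_WEIGHT  # validity: distance from the truth
    spread = statistics.pstdev(readings)             # reliability: consistency of readings
    print(f"{name}: bias = {bias:.2f} kg, spread = {spread:.2f} kg")
```

The first scale shows a bias of 5 kg and a spread of zero (reliable, not valid); the second shows a bias near zero and a spread near 3 kg (valid on average, not reliable).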

Types of validity:

Internal validity: confidence that the observed effect is due to the IV, not confounders
External validity (or generalizability): whether results extend to other contexts
Construct validity: whether the measure captures the underlying concept
Statistical conclusion validity: whether statistical inferences are appropriate

Phrase bank — research methodology

Describing methods:

We conducted a randomized controlled trial with [N] participants.
Participants were randomly assigned to one of three conditions.
The study used a within-subjects design with two measurement points.
We employed a mixed-methods design combining surveys and interviews.

Describing results:

The treatment produced a significant reduction in [outcome], with an effect size of [d].
The 95% confidence interval ranged from [low] to [high].
The difference between groups was significant at the 1% level.
No significant effect was observed for [variable].

Acknowledging limitations:

The study has several limitations that warrant caution in interpretation.
Generalizability is limited by the convenience sample.
We cannot rule out unmeasured confounding.
Replication in a larger sample is warranted.

Interpreting findings:

The findings are consistent with the hypothesis that…
These results suggest that…
Taken together, the evidence points to…
The pattern is consistent across both samples.

Full model — research methodology paragraph

The paragraph below describes the methodology of a hypothetical workplace-intervention study. Methodology vocabulary is in italics.

We conducted a randomized controlled trial to evaluate the effect of a four-week mindfulness training program on self-reported workplace stress. Participants were 240 employees drawn from three large organizations in the financial services sector, recruited through internal communications. Random assignment placed 120 employees in the treatment group, which received the training, and 120 in the control group, which continued with their normal routine. The study was single-blind: training facilitators knew the assignment, but participants in the control group were blinded to whether they were receiving an active or inactive condition. The primary outcome was the Perceived Stress Scale score, measured at baseline, immediately post-intervention, and at three-month follow-up. Secondary outcomes included self-reported sleep quality and a measure of work engagement. We operationalized program adherence as completion of at least 75% of training sessions. Analyses used intent-to-treat with mixed-effects regression controlling for baseline score, gender, and job tenure. Statistical significance was set at p less than 0.05 and effect sizes were reported as Cohen’s d. The trial was pre-registered on the Open Science Framework, and the analysis plan was specified in advance.

Word count: about 180. This paragraph reads as a standard methodology section in a social-science journal. A C1 reader should be able to parse every italicized term confidently.

Knowledge check

A press release states: 'A new study shows that drinking three cups of coffee per day causes a 15% reduction in heart disease, with a p-value of 0.03.' What three methodological questions should a C1 reader immediately ask, and why does the press release language likely overstate the actual finding?

Answer

Three questions. First, was the study a randomized controlled trial or an observational study? The word *causes* in the press release is almost certainly inappropriate if this was observational research (most large coffee studies are observational, following large populations over time). Observational data can show association but cannot, on their own, establish causation. The accurate wording would be *is associated with* or *predicts*. Second, how precise is the estimate, and what is the confidence interval around the 15% figure? A p-value of 0.03 tells us the result is unlikely to be due to chance alone, but it says little about the precision of the estimate. A 15% reduction with a 95% confidence interval of 14% to 16% would be a precise finding; a 15% reduction with an interval of 1% to 29% leaves open everything from a trivial effect to a very large one, and a p-value only just below 0.05 points toward the wider kind of interval. Third, what confounders were controlled for? People who drink three cups of coffee per day differ systematically from those who do not: by income, by education, by exercise habits, by smoking. Without rigorous control for confounders, the apparent coffee-heart effect could reflect any of these other factors. The press release language overstates because it uses causal language (*causes*) where the data likely support only associational language, and because it omits the confidence interval and any discussion of confounders, both of which a rigorous reader needs to evaluate the claim.

Common Russian-speaker mistakes

Confusing significant with important. In research English, statistically significant is a technical term about probability, not a judgment about importance. The effect was significant but small is a coherent sentence, not a contradiction.
Treating correlation as causation. Russian связь covers both correlation and causation; English research vocabulary distinguishes them strictly. Use is associated with for correlation; reserve causes for experimental designs.
Mistranslating исследование. Research (general), study (specific piece), investigation (formal inquiry), trial (experimental study, often clinical) — context dictates the choice.
Reading qualitative as high-quality. Russian качественный can carry a quality judgment. In English research vocabulary, qualitative refers to non-numerical methods, not to high-quality work.
Confusing sample with example. Sample in research is the studied subset of a population, not an illustrative case: a sample of 500 participants, not a sample of how to do something.
Writing the p-value with the wrong direction. p less than 0.05 means significant; p greater than 0.05 means not significant. Learners sometimes reverse the two. Memorize: a smaller p means stronger evidence against the null hypothesis, not a larger effect.
Underspecifying methods. Russian academic writing sometimes describes methodology in general terms; American journals require precise specification of sample size, recruitment method, randomization procedure, blinding, and analysis plan. Methods sections should be replicable.

Summary

Quantitative research uses numerical data; qualitative uses non-numerical data; mixed methods combines both.
The sample is the studied subset; the population is the full group of interest; representativeness governs generalization.
Experimental design uses treatment and control groups with random assignment; the gold standard is the RCT.
Variables include independent, dependent, moderator, mediator, control, and confounding; operationalization defines measurement.
Statistical significance is a probability claim, not an importance claim; report effect size alongside significance.
Causal language must match design strength — causes for RCTs, is associated with for observational data.
Reliability is consistency; validity is accuracy; both are required for credible measurement.


This concludes M13. You now have the vocabulary, structural moves, and conventions for academic and business English at the C1 standard. Next module: Discourse and real speech at C1.