Key Concepts: The Language of Language Learning
Understanding the terminology and concepts used in linguistics and language education can significantly enhance your journey to becoming a polyglot. This knowledge base provides precise definitions of key terms, explores fundamental linguistic concepts, and explains the frameworks used to describe language proficiency. Whether you're a beginner seeking to understand the landscape of language learning or an advanced learner looking to deepen your theoretical knowledge, this glossary and conceptual overview will serve as a valuable reference.
Language Proficiency Scales
Measuring language proficiency requires standardized frameworks that describe what learners can do at different levels of competence. Several major scales exist, with the Common European Framework of Reference for Languages (CEFR) being the most widely used internationally.
Common European Framework of Reference (CEFR)
The CEFR, developed by the Council of Europe between 1989 and 1996, divides language proficiency into six levels organized into three broad categories: Basic User (A1 and A2), Independent User (B1 and B2), and Proficient User (C1 and C2).
A1 (Beginner): Can understand and use familiar everyday expressions and very basic phrases. Can introduce themselves and others and ask and answer simple questions about personal details. Can interact in a simple way provided the other person talks slowly and clearly.
A2 (Elementary): Can understand sentences and frequently used expressions related to areas of immediate relevance (personal and family information, shopping, local geography, employment). Can communicate in simple and routine tasks requiring direct exchange of information on familiar matters.
B1 (Intermediate): Can understand the main points of clear standard input on familiar matters regularly encountered in work, school, leisure, etc. Can deal with most situations likely to arise while traveling. Can produce simple connected text on familiar topics. Can describe experiences, events, dreams, hopes, and ambitions and briefly give reasons and explanations for opinions and plans.
B2 (Upper Intermediate): Can understand the main ideas of complex text on both concrete and abstract topics, including technical discussions in their field of specialization. Can interact with a degree of fluency and spontaneity that makes regular interaction with native speakers quite possible. Can produce clear, detailed text on a wide range of subjects and explain a viewpoint on a topical issue.
C1 (Advanced): Can understand a wide range of demanding, longer texts and recognize implicit meaning. Can express ideas fluently and spontaneously without much obvious searching for expressions. Can use language flexibly and effectively for social, academic, and professional purposes. Can produce clear, well-structured, detailed text on complex subjects.
C2 (Mastery): Can understand with ease virtually everything heard or read. Can summarize information from different spoken and written sources, reconstructing arguments and accounts in a coherent presentation. Can express themselves spontaneously, very fluently and precisely, differentiating finer shades of meaning even in more complex situations.
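For learners who track their progress in a spreadsheet or script, the six levels and three categories above can be sketched as a small lookup. This is only an illustrative sketch: the names `CEFRLevel`, `category`, and `meets` are hypothetical and not part of any official CEFR tooling.

```python
from enum import Enum


class CEFRLevel(Enum):
    """The six CEFR levels, in ascending order of proficiency."""
    A1 = 1
    A2 = 2
    B1 = 3
    B2 = 4
    C1 = 5
    C2 = 6


# Broad user categories as described above, keyed by the level's letter.
CATEGORY_BY_LETTER = {
    "A": "Basic User",
    "B": "Independent User",
    "C": "Proficient User",
}


def category(level: CEFRLevel) -> str:
    """Return the broad CEFR category for a level."""
    return CATEGORY_BY_LETTER[level.name[0]]


def meets(actual: CEFRLevel, required: CEFRLevel) -> bool:
    """Return True if `actual` is at or above `required`."""
    return actual.value >= required.value
```

For example, `category(CEFRLevel.B2)` gives "Independent User", and `meets(CEFRLevel.C1, CEFRLevel.B2)` is true because C1 sits above B2 on the scale.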
Other Proficiency Scales
The American Council on the Teaching of Foreign Languages (ACTFL) has developed its own proficiency guidelines, which include levels such as Novice (Low, Mid, High), Intermediate (Low, Mid, High), Advanced (Low, Mid, High), Superior, and Distinguished. These guidelines focus on functional language ability—what learners can actually do with the language—rather than discrete points of grammar or vocabulary.
The Interagency Language Roundtable (ILR) scale, used by the United States government, ranges from Level 0 (no proficiency) to Level 5 (functionally native). This scale is particularly relevant for professional linguists and translators working in government contexts.
Language Families and Classification
Languages are classified into families based on shared ancestry. Understanding language families can help polyglots leverage similarities between related languages and set realistic expectations for learning difficulty.
Indo-European Family
The Indo-European language family is the world's largest language family by number of native speakers, encompassing most languages of Europe, Iran, and northern India. Major branches include:
- Germanic: English, German, Dutch, Swedish, Norwegian, Danish, Icelandic, Afrikaans, Yiddish
- Romance (Italic): Spanish, French, Italian, Portuguese, Romanian, Catalan
- Slavic: Russian, Polish, Czech, Ukrainian, Bulgarian, Serbian, Croatian
- Indo-Iranian: Hindi, Urdu, Bengali, Persian (Farsi), Punjabi, Marathi
- Celtic: Irish, Scottish Gaelic, Welsh, Breton
- Hellenic: Greek
- Baltic: Lithuanian, Latvian
Speakers of one Indo-European language typically find it easier to learn other languages within the same branch due to shared vocabulary, grammatical structures, and cognates. For example, a Spanish speaker learning Italian will encounter thousands of cognates and similar verb conjugation patterns.
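The branch list above can be treated as a simple lookup table when planning which language to study next. The sketch below is illustrative only: it copies the (non-exhaustive) branch membership from the list above, and the names `INDO_EUROPEAN_BRANCHES`, `branch_of`, and `same_branch` are hypothetical.

```python
from typing import Optional

# Branch membership copied from the list above; not exhaustive.
INDO_EUROPEAN_BRANCHES = {
    "Germanic": ["English", "German", "Dutch", "Swedish", "Norwegian",
                 "Danish", "Icelandic", "Afrikaans", "Yiddish"],
    "Romance": ["Spanish", "French", "Italian", "Portuguese",
                "Romanian", "Catalan"],
    "Slavic": ["Russian", "Polish", "Czech", "Ukrainian", "Bulgarian",
               "Serbian", "Croatian"],
    "Indo-Iranian": ["Hindi", "Urdu", "Bengali", "Persian", "Punjabi",
                     "Marathi"],
    "Celtic": ["Irish", "Scottish Gaelic", "Welsh", "Breton"],
    "Hellenic": ["Greek"],
    "Baltic": ["Lithuanian", "Latvian"],
}


def branch_of(language: str) -> Optional[str]:
    """Return the Indo-European branch of a language, or None if unlisted."""
    for branch, languages in INDO_EUROPEAN_BRANCHES.items():
        if language in languages:
            return branch
    return None


def same_branch(a: str, b: str) -> bool:
    """Return True if both languages are listed under the same branch."""
    branch_a = branch_of(a)
    return branch_a is not None and branch_a == branch_of(b)
```

With this table, `same_branch("Spanish", "Italian")` is true, while `same_branch("Spanish", "German")` is false: both are Indo-European, but they belong to different branches.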
Other Major Language Families
Sino-Tibetan: Includes Chinese (Mandarin, Cantonese, and other varieties), Burmese, and Tibetan. Mandarin Chinese alone has over 900 million native speakers, making Sino-Tibetan the world's second-largest language family by native speakers.
Afro-Asiatic: Includes Arabic (with its many dialects), Hebrew, Amharic, Hausa, and Somali. Arabic, a Central Semitic language, is particularly significant as the liturgical language of Islam and a major language of international diplomacy and media.
Niger-Congo: The largest language family by number of distinct languages, primarily spoken in sub-Saharan Africa. Includes Swahili (a major lingua franca in East Africa), Yoruba, Zulu, and Wolof.
Dravidian: Languages of southern India and parts of Sri Lanka, including Tamil, Telugu, Kannada, and Malayalam. These languages are not related to the Indo-Aryan languages of northern India.
Turkic: A language family spoken across Central Asia and parts of Eastern Europe and Western Asia, including Turkish, Azerbaijani, Uzbek, Kazakh, and Uyghur.
Austronesian: One of the world's largest language families by geographic spread, extending from Madagascar to Easter Island. Includes Malay/Indonesian, Tagalog, Javanese, Malagasy, and Maori.
Language Isolates
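Some languages have no demonstrated relationship to any other language. These "language isolates" include Basque (spoken in northern Spain and southwestern France), Korean (usually treated as an isolate, though some linguists group it with Jeju in a small Koreanic family), and Sumerian (extinct, formerly spoken in ancient Mesopotamia). Learners of an isolate cannot rely on inherited cognates or grammatical structures shared with related languages, although loanwords may still offer some familiar vocabulary.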
Linguistic Concepts for Language Learners
Cognates and False Friends
Cognates are words in different languages that share a common etymological origin and typically have similar meanings. English "mother," German "Mutter," and Spanish "madre" are cognates, all descended from Proto-Indo-European *méh₂tēr. Recognizing cognates can rapidly expand vocabulary when learning related languages.
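False friends are words that look or sound similar in two languages but have different meanings. English "actual" means "real" or "existing," while Spanish "actual" means "current" or "present." English "gift" is a present, while German "Gift" means "poison." The term is often used interchangeably with "false cognates," but strictly speaking false cognates are similar-looking words with no shared origin, whereas many false friends, including both pairs here, are true cognates whose meanings have drifted apart. Awareness of false friends helps prevent misunderstandings and errors.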
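One practical way to keep track of such pairs is a personal lookup table. The sketch below is a minimal illustration that uses only the two example pairs given above; the names `GLOSSES`, `looks_alike`, and `is_false_friend` are hypothetical, and the similarity check (identical spelling, ignoring case) is a deliberately crude assumption.

```python
from typing import Tuple

# Glosses taken from the examples above; keys are (language code, word).
GLOSSES = {
    ("en", "actual"): "real, existing",
    ("es", "actual"): "current, present",
    ("en", "gift"): "present",
    ("de", "Gift"): "poison",
}


def looks_alike(word_a: str, word_b: str) -> bool:
    """Crude similarity check: identical spelling, ignoring case."""
    return word_a.casefold() == word_b.casefold()


def is_false_friend(entry_a: Tuple[str, str], entry_b: Tuple[str, str]) -> bool:
    """Return True if two (language, word) entries look alike
    but are glossed differently in GLOSSES."""
    word_a, word_b = entry_a[1], entry_b[1]
    return looks_alike(word_a, word_b) and GLOSSES[entry_a] != GLOSSES[entry_b]
```

Here `is_false_friend(("en", "gift"), ("de", "Gift"))` is true: the spellings match once case is ignored, but the recorded meanings differ.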
Linguistic Interference
Linguistic interference (also called language transfer) occurs when features of one language affect the production or comprehension of another. Transfer can be positive (facilitating learning) when structures are similar, or negative (causing errors) when structures differ; the term "interference" is most often reserved for the negative case.
Common manifestations of interference include:
- Phonological interference: Applying sounds from one's native language to the target language
- Grammatical interference: Using word orders or grammatical structures from another language
- Lexical interference: Borrowing words or using calques (literal translations) from another language
- Semantic interference: Extending word meanings based on another language's usage
Polyglots often experience interference between their non-native languages as well as from their native language. This phenomenon, sometimes called "language confusion," is normal and typically decreases as proficiency in all languages increases.
Code-Switching
Code-switching is the practice of alternating between two or more languages or language varieties in a single conversation. Contrary to early beliefs that code-switching reflected linguistic deficiency, research has established that it is a normal and sophisticated linguistic behavior among multilingual speakers.
Code-switching serves various functions: filling lexical gaps, expressing group identity, quoting someone, adding emphasis, or clarifying meaning. Polyglots often code-switch unconsciously when speaking with other multilingual individuals who share their languages.
The Critical Period Hypothesis
The Critical Period Hypothesis (CPH) proposes that there is a biologically determined window during which language acquisition must occur for native-like mastery to be possible. First proposed by neurologist Wilder Penfield and later developed by linguist Eric Lenneberg, the hypothesis suggests that this period ends around puberty, after which language acquisition becomes more difficult and complete mastery unlikely.
Evidence supporting the CPH includes:
- The difficulty adults experience in achieving native-like accents in second languages
- Cases of feral or severely isolated children who, deprived of language input during childhood, failed to fully acquire language even after intensive intervention
- Brain plasticity research showing decreased neuroplasticity after puberty
- Studies of deaf individuals who acquire sign language at different ages
However, the CPH remains controversial. Research by David Birdsong, Lydia White, and others has demonstrated that some adults can achieve native-like grammatical competence in second languages. The hypothesis has been refined to distinguish between different aspects of language: phonology (the sound system) may indeed have a critical period, while syntax (grammar) and semantics (meaning) may be more accessible to adult learners.
A sensitive period (as opposed to a critical period) interpretation suggests that language learning is easier and more efficient during childhood but remains possible throughout life. This perspective aligns with the experiences of many adult polyglots who have achieved high levels of proficiency in multiple languages.
Metalinguistic Awareness
Metalinguistic awareness refers to the ability to reflect on and manipulate the structural features of language. It involves thinking about language as an object of study rather than simply using it for communication.
Developing metalinguistic awareness provides several advantages for polyglots:
- Explicit learning: The ability to analyze and understand grammatical rules accelerates acquisition
- Transfer recognition: Awareness of similarities between languages allows strategic leveraging of prior knowledge
- Error analysis: Understanding why errors occur helps prevent them and facilitates self-correction
- Learning strategy development: Conscious awareness of how language works informs study methods
Research suggests that bilingual and multilingual individuals often display higher metalinguistic awareness than monolinguals. Each language learned provides additional data points for understanding how languages can work, enhancing the ability to analyze and learn subsequent languages.
Register and Code
Register refers to the variety of language used in a particular social setting or for a particular purpose. The same speaker uses different registers when giving a formal presentation, chatting with friends, writing an academic paper, or sending a text message. Registers differ in vocabulary, grammar, and level of formality.
Understanding register is crucial for language learners because native speakers expect appropriate register use in different contexts. Using overly formal language in casual settings or overly casual language in formal settings marks the speaker as non-native or socially unaware.
Code refers to a distinct language or language variety. In sociolinguistics, code-switching involves moving between codes. Diglossia—a situation where two distinct varieties of a language (often a "high" formal variety and a "low" colloquial variety) are used for different purposes—is common in many speech communities.
Fluency and Accuracy
Language proficiency encompasses both fluency and accuracy, though these dimensions are somewhat independent. Fluency refers to the smoothness and flow of speech—the ability to produce language without excessive pauses or hesitations. Accuracy refers to correctness—producing language that conforms to the grammatical and phonological rules of the target language.
Early stages of language learning often require trade-offs between fluency and accuracy. Focusing on accuracy can produce halting, careful speech. Focusing on fluency may result in more errors but also more natural communication. Effective language development eventually integrates both dimensions.
The concept of automaticity is relevant here: when language use becomes automatic, accurate production requires little conscious attention, allowing fluency to emerge. Achieving automaticity requires extensive practice and exposure.
Additional Terminology
Comprehensible Input: Language that learners can understand through context, even if they don't know every word. Essential for acquisition according to Krashen's Input Hypothesis.
Interlanguage: The linguistic system that second language learners develop as they acquire a target language. It is a rule-governed system in its own right, combining features of the native language, features of the target language, and features found in neither.
Fossilization: The process by which certain non-target forms become permanent in a learner's interlanguage, persisting despite continued exposure to the target language and instruction.
Input Hypothesis: Stephen Krashen's theory that language acquisition occurs when learners receive comprehensible input slightly beyond their current level (i+1).
Language Acquisition Device (LAD): Noam Chomsky's theoretical construct proposing that humans are born with an innate capacity for language acquisition.
Noticing Hypothesis: Richard Schmidt's theory that learners must consciously notice linguistic features in input for those features to be acquired.
Output Hypothesis: Merrill Swain's theory that producing language (output) plays an important role in language acquisition, complementing input.
Subtractive Bilingualism: A situation where learning a second language leads to loss of the first language, often in contexts of language shift or assimilation.
Additive Bilingualism: A situation where learning a second language adds to one's linguistic repertoire without replacing the first language.
For more on how these concepts apply to practical language learning strategies, see our Methods & Techniques page.