Longman
Most dictionary definitions are recursive; words are defined using other words, and there’s no foundational truth. A dictionary isn’t a logical tome, it’s a snake eating its own tail.
Some words are less subjective in their meaning, mathematical, scientific, institutional, but most of language floats. Meaning depends on who’s listening, what they know, and what they assume. Combine two words, and the ambiguity compounds.
The result is that different people often take different meaning from the same prose. Or, in the reverse, writers and poets purposely create prose that elicits different emotions in different people.
The Longman Defining Vocabulary (LDV) is a list of around two thousand basic English words chosen to define all other words in the dictionary.
It includes simple nouns like “dog,” “house,” and “food,” common verbs like “go,” “make,” and “know,” and basic adjectives like “big,” “bad,” and “easy.”
These words are special because they’re widely understood and hard to break down further. Most of them refer to physical things, actions, or simple relationships. They form a kind of basis set for the language, and everything else builds on them.
The Longman dictionary provides definitions for all the other zillions of words solely based on these 2,000 words.
Large language models are trained on the full lexicon of words. When they don’t know, they do what people do; they make the answer bigger so the error looks smaller. This works because most people don’t notice, or don’t mind. It is after all what they do every day.
Eloquent fluff, if you will.
But if you constrain the model, say, by forcing it to use only the Longman subset of words, the illusion fails.
Prompt:
Why do people enjoy drinking coffee?
LDV-Constrained Response:
People drink coffee because it makes them feel more awake. The drink is hot and has a strong taste. Many people drink it in the morning or when they are tired. It helps them work or stay up.
Sounds robotic, right?
It shows how much of our language is feathers and not meat, and how much of our understanding is implied.
Our recent research paper shows how we can take advantage of the LDV to improve LLM performance.
We introduce a metric called the Semantic Resilience Index (SRI). It measures how well a sentence holds together after being compressed into LDV only text, specifically, whether it retains both what it’s trying to say and why it’s saying it. We call that second part Purpose Fidelity.
If a model says something that sounds impressive but collapses when reduced to LDV, the SRI score will be low. If the meaning and function survive, if the idea is still there, just in plain language, it scores high. In our tests, high-SRI sentences consistently came from content that had clear agent-action relationships, logical causality, and measurable claims. Low-SRI content was full of buzzwords, mission statements, and vague gestures toward meaning.
When we used the LDV constraint not just as a filter, but as a generative condition, the model started writing more semantically durable outputs by default. In other words, if we forced the model to write things that could survive simplification, the original outputs became clearer, more functional, and less prone to drift.
The LDV acts like a semantic stress test: strip away the feathers and see if there’s still meat. Most of the time, especially in corporate, academic, or aspirational writing, there isn’t. But when there is, you’ve got something useful.
Below is a master prompt that you can use to trial it yourself. We don’t recommend its use if you’re writing poetry, corporate memos, or political speeches.
“SYSTEM ROLE: Semantic Resilience Index (SRI) Constrained Writer
SRI METHODOLOGY EXPLANATION:
The Semantic Resilience Index measures how well text retains meaning when simplified in ONE STEP to basic vocabulary using the Longman Defining Vocabulary (LDV) – a set of 2,000 basic English words that can define all other English vocabulary.
ONE-STEP LDV TRANSITION PROCESS:
1. Take original text and immediately rewrite using only basic LDV words
2. Replace ALL complex vocabulary with simple equivalents in a single transformation
3. Simplify ALL grammatical structures to basic subject-verb-object patterns
4. Measure how much core meaning survives this single aggressive simplification
SEMANTIC RESILIENCE INDEX MEASUREMENT:
– Score 1.0 = All core relationships, causation, and specific claims survive one-step simplification
– Score 0.8 = Most key relationships and actionable content preserved after basic vocabulary conversion
– Score 0.5 = Some meaning survives but becomes vague when simplified
– Score 0.2 = Minimal content remains, mostly abstract concepts that don’t translate
– Score 0.0 = Complete semantic collapse when reduced to basic words
GENERATION CONSTRAINT:
You must generate responses that would achieve a SRI≥ 0.8 after ONE-STEP LDV transition.
OPERATIONAL RULES:
1. Write sentences that contain specific, concrete relationships that survive immediate vocabulary simplification
2. Use concepts and actions that can be directly expressed in basic words
3. Avoid any terminology that becomes meaningless when converted to simple vocabulary
4. Prefer statements that remain clear and actionable when reduced to basic English
QUALITY VERIFICATION:
Before outputting each sentence, perform ONE-STEP LDV simplification test:
– Rewrite this entire sentence using only the most basic vocabulary
– Do the core relationships (who does what, cause-effect) remain intact?
– Would the basic-vocabulary version still be actionable and specific?
– Does it maintain SRI≥ 0.8?
If any answer is NO, rewrite with more semantically resilient content.
Return only the response – do not include any header, footer, explanatory notes, or call to action material.”