Advanced LLM Prompting: Exhaust Categories, Positive Wording, and One Simple Answer

Specific prompting techniques that come from understanding how attention-based models actually choose tokens. Why "do this" beats "don't do that" — and why offering multiple choices hurts quality.

Sean Robinson

These are specific fixes to issues you may find in the field when prompting modern LLMs.

Exhaust Categories: In information extraction, summarization, bullet-point creation, etc., we often find extraneous information ending up in the output (pagination and copyright info, embedded advertisements, and so on). Rather than attempting to exclude that material using prompt language, it is often better to create a category solely for it, and then ignore that category in the output. For example, rather than saying "summarize this article in 5 bullet points; please ignore information about pagination and links to other articles", it would often be better to say "please summarize this in 5 bullet points; create a JSON object with categories bullet_1, bullet_2 and so on. Also create the categories 'pagination', 'copyright', and 'advertising' that hold information about those topics."
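To make that concrete, here is a minimal sketch of an exhaust-category prompt. The category names and five-bullet structure follow the example above; the template wording and helper functions are illustrative assumptions, not a required format.

```python
import json

# Sketch: an "exhaust category" prompt for article summarization.
# Boilerplate (pagination, copyright, ads) gets its own keys instead of
# being "ignored", and the downstream code simply never reads those keys.
EXHAUST_PROMPT = """Summarize the article below as a JSON object with the keys
bullet_1 through bullet_5, each holding one key point.
Also include the keys "pagination", "copyright", and "advertising",
each holding any text from the article that belongs to that category.

Article:
{article_text}
"""

def build_prompt(article_text: str) -> str:
    """Fill the template; the exhaust keys give unwanted material somewhere to go."""
    return EXHAUST_PROMPT.format(article_text=article_text)

def extract_bullets(model_output: str) -> list[str]:
    """Keep only the bullets; the exhaust categories are simply dropped here."""
    data = json.loads(model_output)
    return [data[f"bullet_{i}"] for i in range(1, 6) if f"bullet_{i}" in data]
```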

Positive Wording: Often the appearance of tenacious unwanted information, or other unwanted outcomes, leads prompt engineers to try to excise that information using negative language, e.g. "Read this context and create keywords that match the information, but ignore any information about <topic x>". However, the nature of an attention-based model is that it must correctly combine that topic with the concept of the negation ("ignore information"), which lowers the probability that the command will succeed. In some cases, the probability of including <topic x> will actually increase. It tends to be more effective to write requests in positive language, e.g. "focus on <topics y and z> in your response". This is a flexible concept, but it's useful to phrase things in compact, self-contained positive language rather than using negatives as modifiers when composing prompts.
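As a side-by-side illustration (the topics here are placeholders, not from any real project), the same request phrased both ways:

```python
# Negative phrasing: the model has to bind "ignore" to the unwanted topic,
# which still puts that topic front and center in the prompt.
negative_prompt = (
    "Read this context and create keywords that match the information, "
    "but ignore any information about pricing."
)

# Positive phrasing: only the desired topics appear in the instruction.
positive_prompt = (
    "Read this context and create keywords about product features "
    "and customer use cases."
)
```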

Models are often dumb about themselves: do not trust prompts they write for other AI systems or for themselves.

Beware of terms only you understand: names that no one ever defined for the AI, or labels for the systems around it that exist only in your head, e.g. "the secondary system" when that term isn't defined anywhere else in the prompt or context.

Beware of fallbacks and extraneous data typechecking: modern models love to interpret any requested change as merely "one way to do it", leaving the existing code in place as a "just in case" fallback, plus a pile of data-checking cruft and everything that goes along with it. Ignore this at your peril.

Extreme statements (e.g. "YOU MUST do X") also drop the importance of everything else in the prompt, so use them very carefully.

Avoid multiple options without explicit reasoning: This one may be a matter of opinion. There are a couple of things I always try to watch out for when making prompts. Probably the least-obvious is the use of multi-option reasoning — that is, telling the model "You can do Thing X or Thing Y." Sometimes this is unavoidable, as when the decision between X or Y is a matter of context or requires real decision-making (the "Do X if A, otherwise do Y" case). But sometimes there are just multiple options and no obvious way to decide. I have found that leaving both options in the hands of the model in that case tends to produce somewhat worse performance, and here is why.

An LLM generates its output by assigning a likelihood (generally expressed as a log-likelihood) to each possible next token. In earlier days (and still for a lot of models) you could explicitly see the log-likelihoods of the possible next outputs, and a "temperature" parameter would be used to add some randomness, so that it wasn't always the single most likely next output that got chosen. In essence, when the model is working well, there should be a "most obvious next token or word" that stands out from the other possibilities. The remaining tokens then live lower down in the noise, not usually chosen but still calculated.
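For anyone who hasn't seen that mechanism written out, here is a rough sketch of temperature sampling over next-token logits. The logit values are made up purely for illustration.

```python
import math
import random

def sample_with_temperature(logits: dict[str, float], temperature: float = 1.0) -> str:
    """Softmax over token logits, scaled by temperature, then sample.

    Lower temperature -> the most likely token dominates;
    higher temperature -> more randomness among the alternatives.
    """
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    max_logit = max(scaled.values())  # subtract the max for numerical stability
    exp = {tok: math.exp(v - max_logit) for tok, v in scaled.items()}
    total = sum(exp.values())
    probs = {tok: v / total for tok, v in exp.items()}
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

# Illustrative logits: one "obvious" next token standing out from the noise.
logits = {"cat": 5.0, "dog": 2.0, "the": 1.5, "and": 1.0}
print(sample_with_temperature(logits, temperature=0.7))
```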

So, when there is more than one choice but something in the rest of the context makes one the obvious "correct" choice, then there's still a good chance that the answer you most want shows up on top. But if there are two or more choices that are actually interchangeable, now they sort of have to "share the likelihood" of the top spot, meaning that each of them is now less likely individually, and all of them are therefore closer to the "noise" of less-likely possibilities. Which in turn means that "misses" (the times when none of the desirable behaviors happen) actually get more likely, even though it seems like you've given the LLM "more ways to be right." This is a little counterintuitive, so it's worth going through this reasoning.
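Here are some made-up numbers to illustrate the sharing effect: splitting the same probability mass between two acceptable continuations can let a single distractor beat both of them under greedy decoding.

```python
# Made-up next-token probabilities, purely to illustrate the argument.

# One clear instruction: the desired continuation dominates.
one_option = {"good_A": 0.55, "distractor": 0.30, "other": 0.15}

# Two interchangeable instructions: the same 0.55 of probability mass
# is now split between two acceptable continuations.
two_options = {"good_A": 0.28, "good_B": 0.27, "distractor": 0.30, "other": 0.15}

def top_token(probs: dict[str, float]) -> str:
    """Greedy decoding: pick the single most likely token."""
    return max(probs, key=probs.get)

print(top_token(one_option))   # -> good_A: the desired behavior happens
print(top_token(two_options))  # -> distractor: neither good option wins,
                               #    even though their combined mass is unchanged
```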

So the thing you "pay" for offering multiple choices is the risk of lower success confidence, as the multiple good answers drop relatively closer to the non-good outcomes. Which means the thing you have to "get" for this to be worth it is that you're actually covering more use cases, or getting more specific control, than you would with the single option. Long story short, I tend to favor the "one simple answer" where possible.
