Why AI Is Making Grants Worse, Not Better — And What It Means for Your Voice

There is a paradox sitting in the middle of grant writing right now, and most investigators have not yet seen it.

In February 2026, Nature reported on a Northwestern preprint that examined what happens when scientists use large language models to draft federal grant proposals. The researchers had access to something rare: confidential NIH and NSF submissions from two large R1 universities between 2021 and 2025 — funded, unfunded, and pending. They paired that with publicly available grant abstracts from a federal database. Then they measured how AI-assisted proposals compared to non-AI proposals on the dimension that matters most to scientific progress: distinctiveness.

The paper has not yet been peer-reviewed. Its findings should be treated as provisional. I am writing about it anyway because the pattern it identifies is consistent with what reviewers have been saying anecdotally for two years, and because the questions it raises about voice and originality deserve attention before peer review closes the loop twelve months from now.

The findings, if they hold up under peer review, are uncomfortable.

At the NIH, high AI involvement was linked to a 4% jump in the likelihood of a successful application, compared with low AI involvement. It was also associated with a 5% increase in the number of papers published about the funded work.

So AI helps you win. A little.

Here is the part that should stop you.

The jump in publications consisted mostly of articles that were not highly cited. And across both agencies, the applications with the highest AI scores were more likely than lower-scoring applications to resemble work that had already won funding.

AI-assisted proposals are getting funded at slightly higher rates. They are also, the data suggest, producing science nobody remembers. Those two findings are not separate. They are the same finding said two ways.

What's actually happening

The Northwestern team did something clever. They took thousands of public NIH and NSF grant abstracts from 2021 — written before ChatGPT existed — and asked an AI model to rewrite them. Comparing the human and AI versions of the same text taught them the telltale signs that separate human writing from machine-generated text. They then scored grant applications from the two universities on how much of the writing showed AI-associated patterns.
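For readers who want to see the mechanics, here is a minimal sketch of that detection logic in Python. To be clear, this is not the Northwestern team's actual pipeline; the classifier choice (TF-IDF features with logistic regression) and the function names are illustrative assumptions, included only to make the paired-training idea concrete.

```python
# Minimal sketch of the paired-training idea described above.
# NOT the Northwestern pipeline; model choice and names are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_ai_detector(human_abstracts, ai_rewrites):
    """Learn the textual patterns that separate pre-ChatGPT human
    abstracts from LLM rewrites of those same abstracts."""
    texts = list(human_abstracts) + list(ai_rewrites)
    labels = [0] * len(human_abstracts) + [1] * len(ai_rewrites)
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), min_df=2),
        LogisticRegression(max_iter=1000),
    )
    model.fit(texts, labels)
    return model

def ai_involvement_score(model, application_text):
    """Score one application: estimated probability that its prose
    shows AI-associated patterns (higher = more machine-like)."""
    return model.predict_proba([application_text])[0][1]
```

The study's actual scoring is more sophisticated than a bag-of-words probability, but the logic is the same: learn what machine-smoothed prose looks like from matched human and AI versions of identical content, then measure how much of an application resembles it.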

What they found was not mass AI ghostwriting. Those who did use AI tools generated roughly 10–15% of their text with them. We are not talking about proposals written wholesale by ChatGPT. We are talking about proposals where a scientist wrote their own draft and then asked a model to sharpen it — to fix the awkward sentence, strengthen the aims, improve the flow.

That is the exact use case most advisors have been telling investigators is fine.

And it is the use case that is quietly pulling the field toward sameness.

"In using LLMs as an editorial tool, one is basically asking them to 'sound more like a funded proposal,'" says Cassidy Sugimoto, a science-policy researcher at Georgia Tech. "It appears they are performing well on this task."

She is right. That is exactly what is happening. The question is whether you want that to be happening to your grant.

The NSF clue

Here is the finding almost nobody is talking about. At the NSF, the authors detected no benefit to using AI tools.

Think about what that means.

If AI were making proposals genuinely clearer, more rigorous, or easier to understand, that benefit would show up at both agencies. It does not. It shows up at NIH specifically. Why?

Because NIH review rewards pattern-matching to what has worked before. Reviewers scan quickly. Familiarity reads as competence. The shape of an already-funded proposal activates the review heuristics that say "I have seen this work." AI is exceptionally good at generating that shape. NSF reviewers, for whatever cultural or structural reasons, appear to weight that shape less.

So the AI advantage at NIH is not an advantage of clarity. It is an advantage of resemblance. You are not winning because your science is better explained. You are winning because your proposal has been smoothed toward the existing pile.

That is a short-term win. It is also why the science that follows produces non-hit papers. The proposal was never distinctive to begin with.

The counter-argument that has to be faced

Sugimoto raises the most serious objection to everything I am about to say. She disagrees that LLMs are fundamentally changing which ideas get funded; in her view, researchers are using them to communicate more effectively the ideas they have already developed. It is possible that a professional editor's review would improve an application's chances to the same degree.

This deserves a real answer.

A professional editor asks you questions. A professional editor tells you when your logic does not work. A professional editor says "I do not understand this paragraph — what are you trying to say?" and forces you to articulate the thought more clearly in your own language. The editing process teaches you to think.

A language model does not do any of that. It takes your draft and returns a version that sounds more like the corpus it was trained on. You do not become a clearer thinker. You become a more fluent imitator of already-published work. The next time you sit down to write, you are no better at it. The model has replaced the learning, not accelerated it.

This matters for careers, not just for grants. A scientist who develops the ability to translate their thinking clearly onto the page can write the next grant, and the one after that. A scientist who learns to prompt an LLM well has built a skill that belongs to the model, not to them.

What reviewers can already feel

The detection conversation usually focuses on tools — algorithms that scan for AI-generated text, watermarking, statistical fingerprints. Those exist. NIH says it is using them.

But the more honest detection happens in the reviewer's body, before any tool gets involved. Reviewers have been reading grant prose for decades. They know what human thinking sounds like when a person has wrestled an idea into clarity. They know what grant prose sounds like when it has been smoothed by a machine. The tells are not the obvious ones — em dashes, certain transition phrases, "delve" and "tapestry." The tells are deeper. They are the absence of friction. The absence of the small infelicities that mark a real person doing real thinking. The absence of the one weird sentence that only this scientist would have written.

When a reviewer reads a proposal and thinks "this could have been written by anyone," that reaction does not need to be conscious. It only needs to be present. And once it is present, the application is no longer in the pile that gets fought for.

This is the second translation gap from the Lost in Translation framework — the gap between your words and their meaning. AI does not help you cross it. AI rewrites both sides into a third language that belongs to neither you nor the reviewer.

The voice problem is now a strategy problem

For years I have been teaching that voice is a craft issue. You write better when you sound like yourself. You write worse when you perform. The Coffee Conversation exercise in Module 1 of Lost in Translation exists for exactly this reason: before you can translate your science for reviewers, you have to translate it for yourself, in your own words, without performance.

After the Northwestern data, voice is no longer just a craft issue.

Voice is now a strategy issue.

"We could be on a path towards homogeneity," says Misha Teplitskiy, a science, technology and policy researcher at the University of Michigan. He is being polite. The path is the data. AI-assisted proposals cluster toward the center of each funding portfolio. The science that emerges is less distinctive. The publications are less cited. The work, at scale, becomes interchangeable.

But here is what that trend actually creates: scarcity of distinctiveness. When everyone's proposal sounds like everyone else's proposal, the one that does not — the one written by a specific person who actually thought specific thoughts — will start to stand out. Reviewers who are tired of reading the same proposal in slightly different shapes will start hungering for the one that sounds different. Not gimmicky. Not performative. Just unmistakably written by a human being who had something to say.

That person is you. Before you sanded yourself down.

What to actually do

This is the part where most blog posts about AI tell you either to embrace it carefully or avoid it entirely. Both positions miss the point.

The question is not whether you use AI. The question is whether you let AI do the part of the work that only you can do.

Use AI to format a reference list. Use it to check for typos. Use it to compare your specific aims against the parent announcement language. Use it to summarize a long methods section so you can audit your own structure. None of that is the original work of the application. NIH's policy, NOT-OD-25-132, permits limited assistance for reducing administrative burden.

Do not use AI to write your significance section. Do not use AI to articulate why your science matters. Do not use AI to strengthen your voice. Do not use AI to "polish" your aims. The moment you do that, you have outsourced the part of the application that was supposed to be evidence that you are the scientist to do this work. NIH's policy is also clear on this: NIH will not consider applications that are either substantially developed by AI, or contain sections substantially developed by AI, to be original ideas of applicants.

Here is a test. Take any paragraph you are about to submit and ask yourself: if I had not used AI on this, would the sentence be different? If yes — even slightly — then the AI is in the part of the application where it should not be. Take it out. Write it again, badly, in your own voice. Then make it better, still in your own voice.

The proposal you produce that way will probably not look like already-funded work.

That is the entire point.

A closing thought

The investigators who will succeed in the next five years are the ones who refuse to let their voice be smoothed into the average. Not because voice is precious. Because voice is now competitive infrastructure. Reviewers are about to be saturated with applications that all sound like each other. The applications that sound like a specific human being will become the ones that get remembered, advocated for, and funded — not because reviewers will say "I love the voice," but because they will say "I remember that one."

The translation gap between your thinking and your writing has always been where grants succeed or fail. AI does not close that gap. It widens it, then hides the widening behind prose that sounds professional.

Close it yourself. Write the sentence only you would write.

Sources

Kozlov, M. "Grant proposals drafted with AI help more likely to win NIH funding." Nature, 12 February 2026. https://www.nature.com/articles/d41586-026-00369-3

Qian, Y., Wen, Z., Furnas, A. C., Bai, Y., Shao, E., & Wang, D. (2026). "The Rise of Large Language Models and the Direction and Impact of US Federal Research Funding." arXiv preprint (not peer-reviewed). https://arxiv.org/abs/2601.15485

NIH Notice NOT-OD-25-132: "Supporting Fairness and Originality in NIH Research Applications." https://grants.nih.gov/grants/guide/notice-files/NOT-OD-25-132.html

Lisa Carter-Bawa, PhD, MPH, APRN, ANP-C, FAAN, FSBM is a behavioral scientist, NIH-funded investigator, and creator of the Lost In Translation © Grantsmanship Curriculum. She is teaching a free webinar on May 8: "Why Good Science Gets Lost." Register at www.lisacarterbawa.com/may8.
