Try and get LLMs to return content matching the language of the original content by dkotter · Pull Request #357 · WordPress/ai

dkotter · 2026-03-31T20:50:23Z

What?

Update some of our system instructions to prompt the LLM to return content in the same language as the original content they were given.

Why?

At the moment, an LLM may return content in a different language from your original content. For example, if a post is written in French and you generate a title, the title may come back in English (though in my testing, it was always returned in French).

Ideally, the LLM matches the language of the content it was given. This supports the widest range of use cases, such as multilingual sites.

How?

Updates the system instructions for any Ability that generates user-facing text to tell the LLM to match the language of the content they were given.

For alt text generation, we may not have other content (for example, when generating alt text directly in the Media Library), so we pass the site locale to use as the default.

Worth noting that it's still up to the LLM to follow these instructions, so there are almost certainly edge cases where this won't work, particularly with a smaller model.
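As a rough illustration of the approach described in this section, here is a minimal sketch of appending a language rule to a system instruction. The helper name `append_language_rule()` and the rule wording are hypothetical and are not taken from this PR's diff; in WordPress the fallback locale would typically come from `get_locale()`, which is passed in as a plain string here so the sketch runs standalone.

```php
<?php
/**
 * Hypothetical helper (not from the PR): appends a language-matching rule
 * to an existing system instruction.
 *
 * @param string      $base_instruction Existing system instruction text.
 * @param string|null $fallback_locale  Locale to use when no content is given, e.g. 'pl_PL'.
 * @return string The instruction with the language rule appended.
 */
function append_language_rule( string $base_instruction, ?string $fallback_locale = null ): string {
	$rule = 'Respond in the same language as the content you were given.';

	if ( null !== $fallback_locale && '' !== $fallback_locale ) {
		// Only alt text generation needs this branch, since it may have no content.
		$rule .= sprintf(
			' If no content is provided, respond in the language of the locale "%s".',
			$fallback_locale
		);
	}

	return rtrim( $base_instruction ) . "\n\n" . $rule;
}

// Text abilities (title, excerpt, summary): content is always present.
echo append_language_rule( 'Generate a concise title for the post.' ), "\n\n";

// Alt text: pass the site locale as the default (illustrative value).
echo append_language_rule( 'Describe the image for use as alt text.', 'pl_PL' ), "\n";
```

Keeping the rule in one helper is only a design suggestion; whether the model actually follows it still depends on the model.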

Use of AI Tools

None

Testing Instructions

1. Check out this PR
2. Create some content in a language other than English
3. Turn on some Experiments (Title Generation, Excerpt Generation, Summarization, Alt Text Generation)
4. Run these Experiments and ensure the output returned matches the language of the content
5. Change your site locale to something other than English
6. Generate alt text for an image in the Media Library and ensure it matches the site locale

github-actions · 2026-03-31T20:50:33Z

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: dkotter <dkotter@git.wordpress.org>
Co-authored-by: gziolo <gziolo@git.wordpress.org>
Co-authored-by: afercia <afercia@git.wordpress.org>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

codecov · 2026-03-31T20:55:52Z

Codecov Report

❌ Patch coverage is 66.66667% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.76%. Comparing base (cde564d) to head (d0548d2).

Files with missing lines	Patch %	Lines
includes/Abilities/Image/Alt_Text_Generation.php	0.00%	1 Missing ⚠️
...es/Abilities/Image/alt-text-system-instruction.php	80.00%	1 Missing ⚠️

Additional details and impacted files

@@              Coverage Diff              @@
##             develop     #357      +/-   ##
=============================================
+ Coverage      57.70%   57.76%   +0.06%     
  Complexity       617      617              
=============================================
  Files             46       46              
  Lines           3173     3180       +7     
=============================================
+ Hits            1831     1837       +6     
- Misses          1342     1343       +1

Flag	Coverage Δ
unit	`57.76% <66.66%> (+0.06%)`	⬆️

gziolo · 2026-04-01T08:18:21Z

While reviewing this PR, I noticed a pre-existing bug in load_system_instruction_from_file(): it uses require_once for explicit filenames, which means the second call to get_system_instruction() with the same file silently returns an empty string (since require_once returns true instead of the file's return value on subsequent calls).

This doesn't block this PR, but it could affect alt text generation if called multiple times in a single request (e.g., batch processing images).

Opened #358 with a failing test and one-line fix (require_once → require).
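The `require_once` behaviour described above can be reproduced in isolation. This is a standalone sketch using a temporary file, not the plugin's actual `load_system_instruction_from_file()` implementation; the file contents and variable names are made up for the demonstration.

```php
<?php
// Write a throwaway file that returns a string, the way an instruction file would.
$file = tempnam( sys_get_temp_dir(), 'instruction_' );
file_put_contents( $file, "<?php\nreturn 'You are a helpful assistant.';\n" );

$first  = require_once $file; // First load: evaluates the file and yields its return value.
$second = require_once $file; // Already loaded: the file is skipped and PHP yields true.
$third  = require $file;      // require always re-evaluates the file.

var_dump( $first );  // The string returned by the file.
var_dump( $second ); // bool(true)
var_dump( $third );  // The string again.

unlink( $file );
```

A loader that only accepts string results would treat the `true` from the second call as "no instruction", which matches the empty-string symptom reported here.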

gziolo

I tested this with a post written in Polish.

English site locale + Polish content:

Feature	Returned language	Expected language
Excerpt	Polish ✅	Polish
Summary	English ❌	Polish
Alt text	English ❌	Polish (from content) or English (from locale)

Polish site locale + Polish content:

Feature	Returned language	Expected language
Excerpt	Polish ✅	Polish
Summary	Polish ✅	Polish
Alt text	English ❌	Polish

Excerpt generation consistently respects content language, which is great. Summary only matched Polish when the site locale was also set to Polish — suggesting the LLM may not reliably follow the instruction when the default locale is English. Alt text returned English in all cases, even with Polish locale — the model seems to disregard the locale instruction entirely when working from an image alone.

(Page was refreshed before each action to ensure clean state.)

As an aside, it might not be trivial to instruct the LLM about the language for the image alt text, as it largely depends on the context. In the Media Library modal, it should probably always default to the site's locale. In a post, however, you would need to pass some content so the LLM can infer the language from it.
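One way the context split suggested above could look is sketched below. The function `alt_text_language_hint()`, its parameters, and the 500-character excerpt limit are all hypothetical; nothing here is taken from the plugin's code, and the site locale is passed in as a string rather than read from WordPress.

```php
<?php
/**
 * Hypothetical helper (not from the plugin): chooses the language hint for alt text.
 *
 * @param string $post_content Surrounding post content; empty in the Media Library.
 * @param string $site_locale  Site locale, e.g. 'pl_PL'.
 * @return string Instruction sentence telling the model which language to use.
 */
function alt_text_language_hint( string $post_content, string $site_locale ): string {
	$content = trim( strip_tags( $post_content ) );

	if ( '' === $content ) {
		// No surrounding content (Media Library): fall back to the site locale.
		return sprintf( 'Write the alt text in the language of the locale "%s".', $site_locale );
	}

	// Inside a post: pass an excerpt so the model can infer the language itself.
	// mb_substr avoids splitting multibyte characters; substr is a fallback
	// that may cut one in half if the mbstring extension is missing.
	$excerpt = function_exists( 'mb_substr' )
		? mb_substr( $content, 0, 500 )
		: substr( $content, 0, 500 );

	return "Write the alt text in the same language as this content:\n" . $excerpt;
}

echo alt_text_language_hint( '', 'pl_PL' ), "\n";
echo alt_text_language_hint( '<p>To jest wpis po polsku.</p>', 'en_US' ), "\n";
```

This only controls what the model is told; as the test results above show, the model may still ignore the hint.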

dkotter added 4 commits March 31, 2026 14:19

Update Title Generation system instructions to guide the LLM to return content matching the language given

f30ed17

Update Summarization system instructions to guide the LLM to return content matching the language given

74b175c

Update Excerpt Generation system instructions to guide the LLM to return content matching the language given

85e6d92

For alt text generation, pass the site locale to our system instructions and update those instructions to prompt the LLM to return content matching either the language of the context or the site language

a37f75d

dkotter added this to the 0.7.0 milestone Mar 31, 2026

dkotter self-assigned this Mar 31, 2026

This was referenced Mar 31, 2026

Abilities that generate content that will be exposed on the front end should use the Site Language locale #351

Open

Alt text generation should generate text in the same locale of the Site Language setting #349

Open

Fix lint error

ffa4782

dkotter linked an issue Mar 31, 2026 that may be closed by this pull request

Abilities that generate content that will be exposed on the front end should use the Site Language locale #351

Open

6 tasks

Fix lint error

d0548d2

dkotter requested a review from jeffpaul March 31, 2026 21:17

gziolo reviewed Apr 1, 2026

dkotter wants to merge 6 commits into WordPress:develop from dkotter:feature/match-content-language