I recently created a multilingual llms.txt structure for one of my international B2B clients. Their website has 27 language versions, which turned this experiment into a scalability challenge.
How do you set up llms.txt for an international website that has several country or language versions? There’s not a lot of guidance available for this topic, so I decided to figure it out and share my findings here. Most existing llms.txt examples focus on single-language websites and say little about challenges related to international structure or scale.
The toughest part was that you can’t simply translate an llms.txt file, as it contains URLs that differ between language versions. This piece describes in detail an AI-supported, automated workflow that addresses this issue.
A word on intent and assumptions: Although llms.txt is not a widely accepted standard yet and there is little public information on how AI systems fetch or interpret such files, it is a topic with growing relevance and relatively low cost of experimentation. Even if llms.txt turns out to be a temporary trend, the workflows and checks established as part of this experiment are valuable beyond this specific use case.
This article is aimed at teams who are responsible for international websites and who want to experiment with complex llms.txt file structures without ending up with an unmanageable manual process.
TL;DR
If you’re in a hurry, here’s a summary of the approach:
- Scalable structure: One llms.txt “index” file that links to 27 language-specific llms.txt files (one per website version).
- Clear separation of purposes: The index file contains general, language-agnostic information about the business and the international website structure, while each language-specific llms.txt file focusses mainly on the content of that website version.
- The English llms.txt file as a template for all others: Create an llms.txt file for one website version first (in this case, English) and use it as the structural and editorial template for all other languages.
- The real challenge – URL mapping instead of simple translation: The most complex part of the process is creating an llms.txt file for each language version, replacing the English URLs with the correct selection of URLs for that version.
- Automation makes this manageable: An AI-supported workflow can handle URL mapping, selective translation, and file generation at scale, turning what would be a significant amount of manual work into a repeatable process.
If this sounds relevant to your setup, the rest of this article walks you through the process, using a real-world, 27-language website as an example.
International llms.txt structure
When I first worked on international llms.txt setups, it quickly became clear that creating one single llms.txt file for all language versions does not scale well. Instead, I use a structure with a main llms.txt file that’s similar to an XML sitemap index and that references individual llms.txt files for all website versions.
On a different international website, I was able to confirm (by checking Cloudflare logs) that AI bots also request llms.txt files that are placed in subdirectories and linked from an llms.txt index file.
The goal was to create a structure that’s easy to manage and process, and that’s also aligned with the international and multilingual structure of the website itself:
- One llms.txt index file in the root directory: This file links to the language-specific llms.txt files for each website version. The content of the index file is written in English (as this is the main global language of the business).
- 27 language-version llms.txt files, one per website version: Each file is written in the language of its corresponding website version and placed in that version’s directory. Each language-specific llms.txt file also links to all other llms.txt files at the end of its content.
Unfortunately, my client has strict compliance and impartiality rules, so I cannot link to their llms.txt here, but the URL structure for the llms.txt files looks like this (domain name redacted):
- https://www.***.com/llms.txt (llms.txt index file in the domain’s root directory that links to all other llms.txt files)
- https://www.***.com/en/llms.txt (English language version llms.txt file)
- https://www.***.com/fr/llms.txt (French language version llms.txt file)
- https://www.***.com/de/llms.txt (German language version llms.txt file)
- https://www.***.com/ja/llms.txt (Japanese language version llms.txt file)
- https://www.***.com/fa/llms.txt (Persian language version llms.txt file)
- https://www.***.com/id/llms.txt (Indonesian language version llms.txt file)
- etc. (remaining files omitted for brevity)
You can also implement this structure across different domains, if your website uses a multi-ccTLD approach instead of a global gTLD.
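This layout is simple enough to generate programmatically. Here is a minimal sketch; the domain and language codes are illustrative placeholders, not the client’s actual setup:

```python
# Hypothetical sketch: derive the full llms.txt URL layout from a base
# domain and a list of language directory codes. Domain and codes are
# placeholders, not the client's actual values.
def llms_urls(base: str, langs: list[str]) -> list[str]:
    urls = [f"{base}/llms.txt"]  # index file in the root directory
    urls += [f"{base}/{lang}/llms.txt" for lang in langs]  # one per version
    return urls

print(llms_urls("https://www.example.com", ["en", "fr", "de", "ja"]))
```

For a multi-ccTLD setup, the same idea applies, except that `base` would vary per language instead of the directory.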
Next, let’s talk about how I went about creating the content for all of those llms.txt files.
The llms.txt index file
The main purpose of the llms.txt index file is to provide general, fact-driven information about the business and to explain the international and multilingual website structure, while linking to the language-specific llms.txt files.
I created the llms.txt index file with AI-assisted writing, based on the suggestions described at https://llmstxt.org/. Here’s the content of the file, partly redacted for anonymity and brevity:
# *** (company name / redacted for anonymity)
> *** (company description / redacted for anonymity)
## About this llms.txt file
***.com is a multilingual website. You are currently viewing the llms.txt file for the ENTIRE WEBSITE (including all language versions). This llms.txt file only provides an overview of the multilingual website structure, but it does not include any details about the content of the different language versions.
Below, you can find a list of all llms.txt files, one for each language version, which contain details about the key content on each website version. Each llms.txt file contains information about the corresponding website version in the language of that version.
## Multilingual website structure
- [llms.txt for website version in English](https://www.***.com/en/llms.txt): Global website version for all English speaking users.
- [llms.txt for website version in Dutch](https://www.***.com/nl/llms.txt): Global website version for all Dutch speaking users.
- [llms.txt for website version in Bosnian](https://www.***.com/bs/llms.txt): Global website version for all Bosnian speaking users.
- [llms.txt for website version in French](https://www.***.com/fr/llms.txt): Global website version for all French speaking users.
- [llms.txt for website version in Greek](https://www.***.com/el/llms.txt): Global website version for all Greek speaking users.
- [llms.txt for website version in Italian](https://www.***.com/it/llms.txt): Global website version for all Italian speaking users.
- *** (remaining website versions removed for brevity)
The company description in the llms.txt index file should ideally be created together with the business’s PR and communications department, in order to describe the company in the exact way that it wants to be represented, while focussing on factual information and avoiding marketing language.
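The language list itself is a natural candidate for templating. As a sketch, the “Multilingual website structure” section could be rendered from a code-to-name mapping; the domain and the link-description wording below mirror the redacted example and are assumptions:

```python
# Hypothetical sketch: render the "Multilingual website structure" section
# from a {directory code: language name} mapping. Domain and description
# wording mirror the redacted example above; treat both as assumptions.
def structure_section(base: str, languages: dict[str, str]) -> str:
    lines = ["## Multilingual website structure"]
    for code, name in languages.items():
        lines.append(
            f"- [llms.txt for website version in {name}]({base}/{code}/llms.txt): "
            f"Global website version for all {name} speaking users."
        )
    return "\n".join(lines)
```

Generating this section from one central language list keeps the index file and the 27 language files consistent when a language version is added or removed.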
The llms.txt file of the English website version as a template for all others
After completing the llms.txt index file, the next step was to create the llms.txt file for the English website version, which would be used as a template for all other language versions.
The reason for choosing English as the default language is strictly organisational: English is the main global language of communication for the business and the English website version also serves as the reference version for all other languages.
The English llms.txt file contains the same general information about the company and the website as the llms.txt index file and it also lists all other llms.txt files at the end.
Additionally, and first and foremost, it includes a curated selection of important pages on the English website version, along with a brief description of each page.
At the same time, it has to serve as a template for all other language versions, meaning that the llms.txt files for all other language versions will be created based on the content of the English llms.txt file.
Here’s the content of the English llms.txt file, again redacted for anonymity and shortened for brevity:
# *** (company name / redacted for anonymity)
> *** (company description / redacted for anonymity)
## About this llms.txt file
***.com is a multilingual website. You are currently viewing the llms.txt file for the GLOBAL ENGLISH WEBSITE VERSION. This llms.txt file only provides an overview of the most important content on this website version, but it does not include any details about the content of the other language versions.
At the end of this file, you can find a list of all llms.txt files, one for each language version, which contain details about the key content on each website version. Each llms.txt file contains information about the corresponding website version in the language of that version.
## Most important product & service pages on the current website version
- [***](https://www.***.com/en/***/***): *** (description of the page)
- [***](https://www.***.com/en/***/***): *** (description of the page)
- [***](https://www.***.com/en/***/***): *** (description of the page)
- *** (remaining product & services pages removed for brevity)
## Other important pages on the current website version
- [***](https://www.***.com/en/about/***/***): *** (description of the page)
- [About ***](https://www.***.com/en/about): *** (description of the page)
- [***](https://www.***.com/en/about/***/***): *** (description of the page)
- *** (remaining pages removed for brevity)
## Multilingual website structure
- [llms.txt for website version in English](https://www.***.com/en/llms.txt): Global website version for all English speaking users.
- [llms.txt for website version in Dutch](https://www.***.com/nl/llms.txt): Global website version for all Dutch speaking users.
- [llms.txt for website version in Bosnian](https://www.***.com/bs/llms.txt): Global website version for all Bosnian speaking users.
- [llms.txt for website version in French](https://www.***.com/fr/llms.txt): Global website version for all French speaking users.
- [llms.txt for website version in Greek](https://www.***.com/el/llms.txt): Global website version for all Greek speaking users.
- [llms.txt for website version in Italian](https://www.***.com/it/llms.txt): Global website version for all Italian speaking users.
- *** (remaining website versions removed for brevity)
The selection of pages to be included in the llms.txt file was done manually, based on performance data and business objectives, and the descriptions of all pages were created with AI-assisted writing and then checked and revised by the company’s global PR and communications team.
Even when working on AI-assisted, automated workflows, I strongly recommend that you do not skip human feedback loops.
The main challenge: “Translating” the English version into all languages
Up to this point, everything has been conceptually straightforward. The most complicated part of the process is the next step: creating one llms.txt file for each language version of the website.
The challenging bit is that the URLs included in the English template file can’t simply be translated. They have to be replaced with the correct URLs for each version, and, to complicate things further, not all URLs that exist on the English version also exist on every other language version.
At this stage, manual work is no longer a viable option. With dozens of language versions and hundreds of URLs involved, this step has to be automated.
International URL mapping
To get a mapping of all available international URL versions, I crawled the URLs from the English llms.txt file with Screaming Frog in list mode and then exported the hreflang annotations. This gave me a spreadsheet with all English URLs in the first column and the other available language versions of each URL in the following columns.
If you want to replicate this process and you do not have a clean or complete hreflang setup to lean on, you will want to find another way to create or export a complete international URL mapping for your website versions.
My hreflang annotation spreadsheet was messy, as it did not contain one column per language. Instead, for each English URL in the first column, the following columns simply contained all other language versions of the same URL in the order that they showed up in the hreflang annotations in the source code. I needed a step in the process that would help clean up this data.
Taking this requirement into account, I then created an AI workflow in Promptmate that is able to execute the entire process for each of the 27 target languages. The reason why I chose this approach is that if I find a working solution for creating an llms.txt file “translation” for one language, I want to be able to easily repeat it for all other languages.
Automated AI workflow for creating llms.txt language versions
My workflow consists of three main steps: URL mapping cleanup, text-only translation, and URL-aware transformation.
Step 1: Cleaning up the international URL mapping
The first step is to clean up the URL mapping spreadsheet and return only the URLs for the current target language.
Within Promptmate, I selected Claude Sonnet 4.5 for this task and used the following prompt (more about the variable “[[Input fields::Target language]]” below). It contains raw input data, and the point of this step is to extract just what we need from it:
From the following URL mapping, please return only the English URLs from the column "Address" and the corresponding URLs that you might find in any of the following columns. You can recognise [[Input fields::Target language]] URLs by checking the hreflang values in the column before each URL and the language directory of the URL itself. If there is no [[Input fields::Target language]] URL for an English URL, please return "NO [[Input fields::Target language]] URL available". The output should be a table with two columns (English URL and [[Input fields::Target language]] URL).
Address Title 1 Occurrences HTML hreflang 1 HTML hreflang 1 URL HTML hreflang 2 HTML hreflang 2 URL HTML hreflang 3 HTML hreflang 3 URL HTML hreflang 4 HTML hreflang 4 URL HTML hreflang 5 HTML hreflang 5 URL HTML hreflang 6 HTML hreflang 6 URL HTML hreflang 7 HTML hreflang 7 URL HTML hreflang 8 HTML hreflang 8 URL HTML hreflang 9 HTML hreflang 9 URL HTML hreflang 10 HTML hreflang 10 URL HTML hreflang 11 HTML hreflang 11 URL HTML hreflang 12 HTML hreflang 12 URL HTML hreflang 13 HTML hreflang 13 URL HTML hreflang 14 HTML hreflang 14 URL HTML hreflang 15 HTML hreflang 15 URL HTML hreflang 16 HTML hreflang 16 URL HTML hreflang 17 HTML hreflang 17 URL HTML hreflang 18 HTML hreflang 18 URL HTML hreflang 19 HTML hreflang 19 URL HTML hreflang 20 HTML hreflang 20 URL HTML hreflang 21 HTML hreflang 21 URL HTML hreflang 22 HTML hreflang 22 URL HTML hreflang 23 HTML hreflang 23 URL HTML hreflang 24 HTML hreflang 24 URL HTML hreflang 25 HTML hreflang 25 URL HTML hreflang 26 HTML hreflang 26 URL HTML hreflang 27 HTML hreflang 27 URL
https://www.***.com/en/*** *** 27
en https://www.***.com/en/***
nl https://www.***.com/nl/***
bs https://www.***.com/bs/***
fr https://www.***.com/fr/***
el https://www.***.com/el/***
it https://www.***.com/it/***
hr https://www.***.com/hr/***
pl https://www.***.com/pl/***
sr https://www.***.com/sr/***
sk https://www.***.com/sk/***
tr https://www.***.com/tr/***
hu https://www.***.com/hu/***
uk https://www.***.com/uk/***
pt https://www.***.com/pt/***
es https://www.***.com/es/***
zh-CN https://***.cn/***
id https://www.***.com/id/***
ja https://www.***.com/ja/***
ko https://www.***.com/ko/***
zh https://www.***.com/zh/***
vi https://www.***.com/vi/***
de https://www.***.com/de/***
bg https://www.***.com/bg/***
ro https://www.***.com/ro/***
fa https://www.***.com/fa/***
sl https://www.***.com/sl/***
cs https://www.***.com/cs/***
*** (remaining lines redacted for brevity)
You can see that the data (copied and pasted into the prompt from a Screaming Frog CSV export) is slightly messy, but Claude handled this well. The output is a clean two-column table with only English URLs and the corresponding target language URLs.
The Promptmate variable [[Input fields::Target language]] allows us to run this process several times for different target languages. It also gives us the option to run it just once for many target languages simultaneously, which is what I did after validating the first few languages individually.
Step 2: Translating content without URLs
The second step of the automated Promptmate workflow is more straightforward and it does not use the result of the previous prompt yet. It consists of a prompt to translate the sections of the English llms.txt file that do not contain any URLs to replace (again, using Claude Sonnet 4.5).
I decided not to include the content of this prompt here, as it is a simple translation prompt that would not add much value. Let’s focus on the third, more interesting, step instead.
Step 3: From simple translation to URL mapping & replacement
This is the step where simple translation is no longer enough: it combines translation with URL replacement and filtering.
The prompt used in this step needs to handle three challenges:
- Translation of the parts that are translatable.
- Replacement of English URLs with the correct URLs from the target language version.
- Omission of URLs that do not exist on the target language version.
Again, I selected Claude Sonnet 4.5 within Promptmate and added a step to the automated workflow that executes the following prompt (in addition to the “[[Input fields::Target language]]” variable, there is now a “[[Prompt result::Prompt1]]” variable that uses the URL mapping cleanup output from the first prompt).
The output of the prompt is the section of the llms.txt file for the target language version that contains the most important URLs of that language version, along with a description of each page, written in the target language:
Please translate the content that I provide at the end of this prompt to [[Input fields::Target language]].
The content to be translated consists of two sections (1. ## Most important product & service pages on the current website version, 2. ## Other important pages on the current website version). These sections contain URLs, which should not be translated, but replaced. Before I provide the content to be translated, I will include a mapping of the URLs. For each URL in the table, please find the corresponding [[Input fields::Target language]] URL. Please do not under any circumstances translate URLs. If you cannot find a matching [[Input fields::Target language]] URL, please leave out the entire line in the output.
Please return the result in markdown format within a code box.
URL mapping:
[[Prompt result::Prompt1]]
Here is the content to translate (please translate all content from here until "END OF CONTENT TO BE TRANSLATED"):
## Most important product & service pages on the current website version
*** (content of the English llms.txt file omitted for brevity)
## Other important pages on the current website version
*** (content of the English llms.txt file omitted for brevity)
END OF CONTENT TO BE TRANSLATED
I ran the first few languages one by one, and as the results were satisfying, I then ran the remaining 20+ languages in bulk.
Promptmate gave me a CSV with the output for all languages (CSV is the standard output format for bulk operations in Promptmate), which I could then upload to ChatGPT 5.1 to transform into downloadable llms.txt files for each language version.
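This final transformation can also be done without a model. A minimal sketch, assuming the bulk CSV has “language” and “output” columns (the column names are my assumption, not Promptmate’s documented format):

```python
import csv
import pathlib

# Hypothetical sketch: split a bulk-run CSV into one llms.txt file per
# language directory. The "language" and "output" column names are
# assumptions about the CSV layout, not Promptmate's documented format.
def write_llms_files(csv_path: str, out_dir: str) -> list[str]:
    written = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            path = pathlib.Path(out_dir, row["language"], "llms.txt")
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(row["output"], encoding="utf-8")
            written.append(str(path))
    return written
```

The output directory layout then mirrors the `/{lang}/llms.txt` URL structure, so the files can be deployed as-is.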
Quality-checking the AI-generated output
Human quality checks are a critical part of any AI-assisted workflow. After generating all 27 llms.txt language versions, I added two validation steps to ensure both technical correctness and linguistic quality:
- First, I re-crawled all URLs included in the final language versions with Screaming Frog to make sure that Claude and ChatGPT had not hallucinated any URLs in the process and that the hreflang-based mapping had not produced any broken or invalid URLs.
- The resulting llms.txt files in all languages were then checked by native speakers from the business’s different country teams.
I did not find any URL errors in the final files, which confirms the reliability of the workflow at scale. The feedback from the native speakers was also positive; there were only some minor stylistic revisions, which were very welcome, as they further improved the quality of the result.
Next steps & outlook
It is still too early to say whether this specific implementation will have any measurable impact, or whether the overall effort will prove to be worthwhile in the long run. On other projects where similar llms.txt structures have been in place for some time, I have observed bots belonging to Claude, OpenAI, and Perplexity fetching these files on a regular basis.
Considering that a setup like this can be implemented with limited effort when supported by AI-assisted and automated workflows, I think it is a reasonable experiment to run.
Even if llms.txt never becomes a widely adopted or formalised standard, the work involved in creating it is not wasted. The processes established along the way, from structuring information and mapping international URLs to building scalable automation and quality checks, are transferable and useful well beyond this specific use case.
Looking ahead, the main challenge will be maintaining and evaluating the setup over time. A logical next step is to introduce basic monitoring that regularly checks the llms.txt files and verifies that all referenced URLs are still valid.
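Such a monitoring check can be sketched with only the standard library. Note that some servers answer HEAD requests with 405, in which case a production version would fall back to GET; everything here is an illustrative assumption, not a finished monitor:

```python
import re
import urllib.request

# Hypothetical monitoring sketch: pull every URL out of an llms.txt file
# and report the ones that no longer answer with HTTP 200. Some servers
# reject HEAD (405); a production check would fall back to GET.
URL_RE = re.compile(r"https?://[^\s)\"]+")

def extract_urls(llms_txt: str) -> list[str]:
    return URL_RE.findall(llms_txt)

def broken_urls(llms_txt: str, timeout: float = 10.0) -> list[str]:
    broken = []
    for url in extract_urls(llms_txt):
        try:
            req = urllib.request.Request(url, method="HEAD")
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                if resp.status != 200:
                    broken.append(url)
        except Exception:
            broken.append(url)
    return broken
```

Run against all 28 files on a schedule, this would catch both removed pages and structural changes to the language versions.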
Beyond that, it would be useful to better understand how often these files are accessed and whether their presence has any observable effect on how the business appears in AI-generated answers and conversations.
What do you think about international and multilingual llms.txt setups?
If you have any questions or would like to share your experience with international or multilingual llms.txt setups, I invite you to use the comment feature or get in touch with me directly via LinkedIn or other channels.