Open Knowledge Format (OKF): What It Is, Why Google Released It, and How to Prepare for Agent-Readable Knowledge
On June 12, 2026, Google Cloud published the Open Knowledge Format (OKF) v0.1 — a vendor-neutral specification for packaging knowledge into directories of markdown files that AI agents can navigate and consume. The announcement triggered immediate debate across the SEO community: is OKF the next llms.txt, a new ranking signal, or something more fundamental?
The honest answer is it is none of those things — yet. But it signals a shift that will change how businesses organize their knowledge for the agentic web. This pillar guide explains exactly what OKF is, what it is not, why Google released it, how it fits into the broader agent-readable stack, and — most importantly — how to assess whether your site is ready for the shift from search-engine optimization to agentic accessibility.
What Is the Open Knowledge Format (OKF)? — In 90 Seconds
An OKF bundle is a directory of markdown files. That is the entire format.
Each .md file represents a single concept — a table, a metric, a product, a process, a runbook, a claim, an API endpoint, or any atomic piece of knowledge. The file path becomes the concept's identity (tables/orders.md is the concept tables/orders).
Every concept file begins with YAML frontmatter — a small metadata block at the top of the markdown file:
---
type: BigQuery Table
title: Orders
description: One row per completed customer order.
resource: https://console.cloud.google.com/bigquery?p=acme&d=sales&t=orders
tags: [sales, orders]
timestamp: 2026-05-28T00:00:00Z
---
Concepts link to each other with standard markdown links ([orders table](/tables/orders.md)), which turns the directory into a knowledge graph rather than a flat list of files.
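To make the path-as-identity and link-as-edge ideas concrete, here is a minimal Python sketch. The function names `concept_id` and `outgoing_links` are hypothetical, and the handling of relative links is an assumption on our part rather than something the OKF spec is quoted as requiring.

```python
import posixpath
import re

# Matches inline markdown links such as [orders table](/tables/orders.md).
LINK_RE = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")


def concept_id(path: str) -> str:
    """Turn a bundle-relative file path into a concept identity."""
    path = path.lstrip("/")
    return path[:-3] if path.endswith(".md") else path


def outgoing_links(source_path: str, body: str) -> list[str]:
    """Return the concept ids that one concept file links to."""
    targets = []
    for _label, href in LINK_RE.findall(body):
        href = href.split("#", 1)[0]  # drop any fragment
        if "://" in href or not href.endswith(".md"):
            continue  # skip external links and non-concept files
        if href.startswith("/"):
            resolved = href  # bundle-absolute link
        else:
            # Assumption: relative links resolve against the source file's directory.
            resolved = posixpath.join(posixpath.dirname(source_path), href)
        targets.append(concept_id(posixpath.normpath(resolved)))
    return targets


body = "Joins to the [orders table](/tables/orders.md) and [sales](../datasets/sales.md)."
print(outgoing_links("tables/customers.md", body))
# prints ['tables/orders', 'datasets/sales']
```

Running this over every file in a bundle yields an edge list, which is all a consumer needs to treat the directory as a graph.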
A complete bundle looks like this:
my_bundle/
├── index.md # optional: directory listing for progressive disclosure
├── log.md # optional: chronological change history
├── datasets/
│ └── sales.md
├── tables/
│ ├── orders.md
│ └── customers.md
├── metrics/
│ └── weekly_active_users.md
└── playbooks/
    └── incident_response_data_freshness.md
The conformance bar is deliberately minimal. Per Google's OKF specification, a bundle is valid if:
- Every non-reserved .md file has parseable YAML frontmatter.
- Every frontmatter block has a non-empty type field.
That is the only required field. title, description, resource, tags, and timestamp are recommended but optional. Consumers are instructed to tolerate unknown fields, missing metadata, and even broken links rather than reject a bundle. This tolerance-by-design is the key to OKF's interoperability promise.
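The two conformance rules are small enough to check in a few lines. The sketch below is illustrative only: it assumes `index.md` and `log.md` are the reserved file names, and it uses a simplified flat `key: value` reader in place of a real YAML parser so the example has no dependencies. The names `parse_frontmatter` and `validate_bundle` are hypothetical.

```python
RESERVED = {"index.md", "log.md"}  # assumption: these are the reserved file names


def parse_frontmatter(text: str):
    """Return top-level frontmatter fields, or None if there is no closed block.

    Simplified stand-in for a YAML parser: handles flat `key: value` lines only.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter found
        if ":" in line and not line.startswith((" ", "#")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip().strip("\"'")
    return None  # block was never closed


def validate_bundle(files: dict[str, str]) -> list[str]:
    """Return conformance problems; an empty list means the bundle passes."""
    problems = []
    for path, text in sorted(files.items()):
        if not path.endswith(".md") or path.rsplit("/", 1)[-1] in RESERVED:
            continue
        fields = parse_frontmatter(text)
        if fields is None:
            problems.append(f"{path}: missing or unparseable frontmatter")
        elif not fields.get("type"):
            problems.append(f"{path}: empty or missing type field")
    return problems


bundle = {
    "index.md": "# Index\n",
    "tables/orders.md": "---\ntype: BigQuery Table\ntitle: Orders\n---\nOne row per order.\n",
    "tables/customers.md": "---\ntitle: Customers\n---\nNo type here.\n",
    "notes/todo.md": "Plain markdown, no frontmatter.\n",
}
print(validate_bundle(bundle))
# prints ['notes/todo.md: missing or unparseable frontmatter',
#         'tables/customers.md: empty or missing type field']
```

The validator takes an in-memory mapping of path to file text; to check a bundle on disk, build that mapping by reading each `.md` file under the bundle root. A production check would swap the flat reader for a full YAML parser.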
What OKF Is Not
Understanding the non-goals matters as much as the spec itself:
- Not a ranking factor. Google Search does not currently crawl OKF bundles as ranking input. The spec is about organizational knowledge exchange and agent context, not search positioning. (We revisit this in the objections and FAQ sections below.)
- Not a platform, runtime, or SDK. OKF requires exactly zero software. As Google's own announcement states: "No complex compression scheme, no new runtime, no required SDK."
- Not a replacement for Notion, Obsidian, Confluence, or any wiki tool. Those tools are about human knowledge management with agent features. OKF is about portable, agent-first knowledge packaging that moves between systems.
- Not a fixed taxonomy. OKF does not prescribe what types should exist, how to categorize concepts, or how to structure bodies. It is a container convention, not an ontology.
- Not a competitor to MCP or A2A. MCP (Model Context Protocol) is a protocol for agents to call tools and take actions, and A2A (Agent2Agent) is a protocol for agents to communicate with each other. OKF sits at the opposite end: a convention for how static knowledge is written down so any agent can read it.
"Isn't This Just Markdown Files in Folders?"
Yes — deliberately. The format's simplicity is the feature, not a limitation.
Google's engineers could have shipped a binary protocol, a database schema, or a proprietary agent SDK. Instead, they shipped markdown files. The value is not in the file format itself — it is in the shared convention. When multiple tools agree that /tables/orders.md with a type: BigQuery Table frontmatter means the same thing, agents can navigate knowledge without custom integrations, API wrappers, or vendor-specific ingestion pipelines.
This is the same insight that made schema.org markup in JSON-LD work: a boring technical convention that, once shared, unlocks interoperability at scale.
Why Google Released OKF — The Problem It Solves
Google's stated motivation is what they call the "context-assembly problem" for foundation models and AI agents. As the Google Cloud announcement explains:
"The lack of relevant context often limits what foundation models can do."
In practice, enterprise knowledge is fragmented across metadata catalogs, wikis, shared drives, code comments, notebooks, database schemas, and — most fragile — the heads of senior employees. When an agent attempts a task (answer a business question, run an analysis, generate a report, execute a workflow), it needs context from all of these sources. Today, that means either:
- Dumping everything into the prompt — expensive, noisy, and context-window-constrained.
- RAG across raw documents — fragile, uncurated, and prone to retrieving irrelevant or contradictory chunks.
- Custom integrations per source — brittle and expensive to maintain across tools.
OKF proposes a fourth path: curated, portable, version-controlled knowledge bundles that agents navigate using a common convention. Instead of stuffing every document into a context window and hoping retrieval picks the right chunk, an agent reads index.md, follows the concept graph, and pulls only the relevant files.
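The index-first navigation described above can be sketched as follows. This is a rough illustration, not how any particular agent works: the index line shape (`- [Title](/path.md): description`), the keyword matching, and the name `select_context` are all assumptions made for the example.

```python
import re

# Assumed index line shape: "- [Title](/path/to/concept.md): one-line description"
INDEX_RE = re.compile(r"^- \[[^\]]*\]\(/([^)\s]+\.md)\):?[ \t]*(.*)$", re.MULTILINE)
LINK_RE = re.compile(r"\]\(/([^)\s#]+\.md)\)")  # bundle-absolute links only


def select_context(bundle: dict[str, str], terms: list[str], hops: int = 1) -> list[str]:
    """Pick concept files for a task: match on the index first, then follow links."""
    wanted = [t.lower() for t in terms]
    # Step 1: read only the index and shortlist entries whose description matches.
    frontier = [
        path
        for path, description in INDEX_RE.findall(bundle.get("index.md", ""))
        if any(t in description.lower() for t in wanted)
    ]
    selected = []
    # Step 2: open the shortlisted files, then follow their links for `hops` rounds.
    for _ in range(hops + 1):
        next_frontier = []
        for path in frontier:
            if path in selected or path not in bundle:
                continue  # skip repeats and tolerate broken links
            selected.append(path)
            next_frontier.extend(LINK_RE.findall(bundle[path]))
        frontier = next_frontier
    return selected


bundle = {
    "index.md": (
        "- [Orders](/tables/orders.md): One row per completed customer order.\n"
        "- [Weekly active users](/metrics/weekly_active_users.md): Distinct users in a 7-day window.\n"
    ),
    "tables/orders.md": "Each order belongs to a [customer](/tables/customers.md).",
    "tables/customers.md": "One row per customer.",
    "metrics/weekly_active_users.md": "Computed from the events table.",
}
print(select_context(bundle, ["order"]))
# prints ['tables/orders.md', 'tables/customers.md']
```

Only two of the three concept files are opened, which is the cost argument in miniature: the index decides what gets read, and the links decide what gets read next.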
OKF vs. RAG — The Difference Matters
Classic RAG (Retrieval-Augmented Generation) works by chunking documents, embedding them into vectors, and retrieving chunks based on semantic similarity. It is fast and automatic, but it has fundamental limitations:
| Dimension | RAG (Raw Retrieval) | OKF (Curated Knowledge) |
|---|---|---|
| Knowledge structure | Flat, chunked | Graph, linked concepts |
| Curation | Automatic, unvalidated | Human-reviewed, citable |
| Staleness | Hidden | Visible via timestamps |
| Contradictions | Buried in chunks | Surfaceable via links |
| Portability | Vendor-locked to vector DB | Plain files, version-controlled |
| Agent cost | Higher (many retrieved chunks per query) | Lower (progressive disclosure) |
OKF does not replace RAG; it complements it. RAG handles unstructured discovery; OKF handles curated, persistent, compounding knowledge. Andrej Karpathy's LLM Wiki pattern, which OKF formalizes, describes the idea: an agent builds and maintains a persistent wiki that synthesizes new information, surfaces contradictions, and compounds over time rather than treating every query as a fresh retrieval problem.
What Google Shipped as Proof
Google did not just publish a spec. They shipped:
- A BigQuery enrichment agent that drafts OKF concept documents for tables and views, populates them with citations, schemas, and join paths, and enriches them through a second citation pass against authoritative documentation.
- A static HTML visualizer that renders any OKF bundle as an interactive knowledge graph in one self-contained file — no backend, no data leaving the page.
- Three sample bundles: GA4 ecommerce, Stack Overflow, and Bitcoin public datasets.
- Knowledge Catalog integration — Google Cloud's enterprise metadata management product (formerly Dataplex) now ingests OKF bundles and serves them to agents.
This is not vaporware. It is a working reference implementation with a clear enterprise path. But it is also v0.1, labeled "Draft," and the spec explicitly warns that it is subject to change.
The Shift from SEO to Agentic Accessibility
Dr. Marie Haynes — one of the earliest and most thoughtful analysts of OKF — captured the strategic framing that matters most:
"We will shift from working to be found by search engines to making business knowledge accessible so agents can perform tasks with it."
This is not about replacing SEO. It is about adding a new layer: agentic accessibility. Traditional SEO optimizes pages for search crawlers that index, rank, and deliver blue links. Agentic accessibility optimizes knowledge so AI agents — coding assistants, research agents, task-execution agents, customer-facing bots — can understand your business, answer questions about your products, compare your offers, and execute workflows that involve your domain.
The distinction matters because agents do not consume pages the way humans do. They:
- Navigate concept graphs, not site hierarchies.
- Need metadata (type, purpose, freshness, provenance) to trust information.
- Operate with progressive disclosure — they want to know what exists before they pull details.
- Combine knowledge from multiple sources to complete tasks.
A site that is crawlable and ranked is not automatically agent-accessible. Most sites today are a black box to agents: content is HTML-wrapped in templates, scattered across URLs with no semantic map, and buried in design chrome that agents struggle to parse. OKF provides the missing semantic interface — a plain, predictable, linked knowledge surface that any agent can navigate.
The Agent Knowledge Stack — Where OKF Fits
OKF is not a standalone solution. It is one layer in a multi-layer agent-readable stack that AEO Engine defines as:
| Layer | Purpose | Examples |
|---|---|---|
| 1. Discovery | Let agents find your content | Sitemap.xml, robots.txt, canonical URLs, llms.txt |
| 2. Entity Understanding | Tell agents who you are | Schema.org (Organization, Person, Product, Service), sameAs links, author profiles |
| 3. Agent Context | Give agents structured knowledge | OKF concept bundles with citations, indexes, logs, and relationship links |
| 4. Actionability | Let agents do things | WebMCP/MCP-like endpoints, forms, APIs, tools, product feeds, booking flows |
| 5. Governance | Make knowledge trustworthy | Provenance, review status, freshness, permission boundaries, owner, approval |
A site with only sitemap and schema (layers 1-2) is discoverable but shallow. Add an OKF bundle (layer 3) and agents can traverse your knowledge graph. Add action endpoints (layer 4) and agents can complete tasks. Add governance (layer 5) and agents can trust what they find.
Most sites are at layer 1 or 2. The opportunity — and the competitive asymmetry — is building through layer 5 before it becomes table stakes.
llms.txt vs. OKF — Complementary, Not Competing
The llms.txt proposal (from Jeremy Howard) provides a single-file index of a site's key pages for LLM consumption. OKF provides multi-file, linked concept bundles. They solve different problems:
- llms.txt = "Here are the important pages on this site."
- OKF = "Here is the structured knowledge this business has, organized as concepts with provenance."
The two work together: llms.txt can point agents to the /okf/ directory. In fact, a well-structured llms.txt that includes an OKF bundle reference is the simplest path to making your site's knowledge agent-discoverable today.
What Marie Haynes' Article Covers — and What It Misses
Marie's article on OKF (published at mariehaynes.com/okf/) is an excellent practitioner-level introduction. Her core contributions:
- The shift from SEO to agentic accessibility — the framing that "we will shift from working to be found by search engines to making business knowledge accessible so agents can perform tasks with it" is the most important strategic insight in the OKF conversation.
- Concept extraction over page conversion — the recognition that "building a brain for that business" means decomposing pages into their constituent concepts (claims, processes, entities, evidence), not just converting HTML pages to markdown files.
- OKF specialist predictions: the forecast that skilled OKF bundle creators will be in demand, similar to how early SEO specialists emerged in the late 1990s and early 2000s.
- Knowledge monetization — the observation that proprietary process bundles (legal, accounting, SEO, consulting) could become sellable products.
However, the article leaves critical gaps that this guide fills:
| What Marie covers | What this guide adds |
|---|---|
| OKF as a new SEO mindset | OKF as part of a measurable Agent Knowledge Stack with five layers |
| High-level bundle explanation | Detailed OKF vs RAG comparison with a concrete table |
| Anecdotal potential | Specific AEO Engine governance rubric that extends OKF v0.1 metadata |
| Personal experimentation anecdote | Public Reddit/forum objection analysis with rebuttals |
| General getting-started advice | Actionable 10-point OKF readiness checklist |
| Link to existing tooling | Free OKF Readiness Checker readers can use to audit their own site |
How to Turn Your Website Into an Agent-Readable Knowledge Layer
Step 1: Concept Decomposition — The Hard Part
The most common mistake is treating one page as one OKF concept file. This is the commodity approach, and it adds almost no value over a good sitemap.
High-quality OKF bundles decompose pages into their constituent concepts:
| Page type | Decomposes into | Example concept files |
|---|---|---|
| Service page | Offer, ICP, Problem, Outcome, Process, Proof, CTA | offers/aeo_managed.md, problems/ai_invisibility.md, processes/aeo_audit_workflow.md |
| Case study | Entity, Baseline, Intervention, Metric, Result, Caveat | entities/gourmend_foods.md, results/671pct_llm_growth.md, caveats/seasonality_note.md |
| Tool page | Purpose, Inputs, Methodology, Output, Examples, Limitations | tools/okf_readiness_checker.md, methods/readiness_scoring.md, examples/sample_scores.md |
| Blog post | Claims, Definitions, Workflows, Examples, Citations, FAQs | claims/okf_is_not_ranking_signal.md, definitions/agentic_accessibility.md, faqs/okf_vs_llms_txt.md |
This decomposition is the skill that Marie predicts will be valuable — and it is the gap between commodity OKF generators and strategic agent-knowledge work.
Step 2: Build the Bundle Structure
A public-facing OKF bundle for a marketing site should include:
okf/
├── index.md # directory of all concepts with one-line descriptions
├── log.md # chronological update history
├── entities/ # who you are, what you sell, who you serve
│ ├── company.md
│ ├── founder.md
│ └── icp.md
├── offers/ # products and services
│ ├── managed_aeo.md
│ └── ai_citation_optimization.md
├── concepts/ # definitions, claims, frameworks
│ ├── answer_engine_optimization.md
│ ├── agentic_accessibility.md
│ └── agent_knowledge_stack.md
├── proof/ # case studies, metrics, testimonials
│ ├── case_studies_index.md
│ └── di_oro_306pct_growth.md
├── processes/ # how you work
│ └── aeo_audit_workflow.md
├── tools/ # free tools and resources
│ └── okf_readiness_checker.md
└── faqs/ # objections and clarifications
    ├── okf_vs_llms_txt.md
    └── is_okf_markdown.md
Step 3: Write Concept Files with Governance Metadata
AEO Engine recommends extending OKF's base frontmatter with governance metadata that goes beyond the spec while remaining compatible with any OKF-consuming agent:
---
type: Claim
title: OKF is not an SEO ranking signal
description: Canonical position on OKF's relationship to search rankings.
resource: https://aeoengine.ai/blog/open-knowledge-format-okf
tags: [okf, seo, ranking, agent-knowledge]
timestamp: 2026-06-19T00:00:00Z
owner: marketing
review_status: approved
reviewed_by: Vijay Jacob
source_grade: A
confidence: high
visibility: public
expires_after: 2026-12-31
---
These additional fields are not part of the OKF v0.1 specification — they are an AEO Engine governance extension. Any conforming OKF consumer will ignore unknown frontmatter fields (per the spec's tolerance requirement), so these additions are safe. They make the bundle auditable, reviewable, and trustworthy for enterprise use.
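One way these extension fields could be put to work is a pre-publish check. The sketch below is an assumption about how such a check might look, not part of OKF or of any AEO Engine tool; the function name `governance_issues` and the specific rules (approved status, named owner, unexpired) are hypothetical.

```python
from datetime import date


def governance_issues(fields: dict[str, str], today: date) -> list[str]:
    """Flag governance problems in one concept's frontmatter fields."""
    issues = []
    if fields.get("review_status") != "approved":
        issues.append("not approved")
    if not fields.get("owner"):
        issues.append("no owner")
    expires = fields.get("expires_after")
    # expires_after is assumed to be an ISO date such as 2026-12-31.
    if expires and date.fromisoformat(expires) < today:
        issues.append("expired")
    return issues


claim = {
    "type": "Claim",
    "owner": "marketing",
    "review_status": "approved",
    "expires_after": "2026-12-31",
}
print(governance_issues(claim, date(2026, 7, 1)))   # prints []
print(governance_issues(claim, date(2027, 1, 15)))  # prints ['expired']
```

Because the fields are optional extensions, a concept without them is flagged rather than rejected, which keeps the check compatible with plain v0.1 bundles.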
Step 4: Host and Discover
Once your bundle is built:
- Host it at yoursite.com/okf/index.md (static, no backend required; it is just markdown files).
- Add an entry to your llms.txt: OKF: https://yoursite.com/okf/index.md
- Verify with the AEO Engine OKF Readiness Checker to confirm conformance and agent-readiness.
- Version the bundle in git alongside your site code.
- Review and update concept timestamps on a regular cadence.
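The last step, reviewing timestamps on a cadence, is easy to automate. A minimal sketch follows, assuming timestamps carry a UTC offset or trailing `Z` as in the frontmatter examples; the 90-day default and the name `stale_concepts` are illustrative choices, not recommendations from the spec.

```python
from datetime import datetime, timedelta, timezone


def stale_concepts(timestamps: dict[str, str], now: datetime, max_age_days: int = 90) -> list[str]:
    """Return paths whose timestamp is older than the review cadence."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for path, raw in sorted(timestamps.items()):
        # Older Python versions reject a trailing "Z" in fromisoformat, so normalize it.
        stamp = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        if stamp < cutoff:
            stale.append(path)
    return stale


timestamps = {
    "tables/orders.md": "2026-05-28T00:00:00Z",
    "claims/okf_is_not_ranking_signal.md": "2026-06-19T00:00:00Z",
}
now = datetime(2026, 9, 1, tzinfo=timezone.utc)
print(stale_concepts(timestamps, now))
# prints ['tables/orders.md']
```

Passing `now` in explicitly keeps the check deterministic, so it can run in CI next to the git-versioned bundle.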
The OKF Readiness Checklist — 10-Point Audit
Use this checklist to score your site's agent-knowledge readiness, or run the free tool for an automated assessment:
| # | Check | What to look for |
|---|---|---|
| 1 | Concept Granularity | Are pages decomposed into individual concepts, or is it one-file-per-page? |
| 2 | Required Frontmatter | Does every concept file have a non-empty type field? |
| 3 | Recommended Metadata | Are title, description, resource, tags, and timestamp present? |
| 4 | Index Hygiene | Does index.md list all concepts with one-line descriptions? |
| 5 | Log History | Does log.md track meaningful changes chronologically? |
| 6 | Citation Coverage | Do claims cite authoritative sources under # Citations? |
| 7 | Cross-Link Density | Are concepts linked into a graph, or are they isolated files? |
| 8 | Stale Timestamps | Are any concept timestamps older than your review cadence? |
| 9 | Public/Private Separation | Is internal-only knowledge excluded from the public bundle? |
| 10 | llms.txt Discovery | Does an llms.txt exist, and does it point to your OKF bundle? |
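Checks 4 and 7 (index hygiene and cross-link density) lend themselves to a quick script. The sketch below is not the Readiness Checker's method, which the article does not describe; it is a hypothetical `graph_report` function that assumes bundle-absolute links and treats `index.md` and `log.md` as reserved.

```python
import re

LINK_RE = re.compile(r"\]\(/([^)\s#]+\.md)\)")  # bundle-absolute markdown links
RESERVED = {"index.md", "log.md"}  # assumption: reserved file names


def graph_report(bundle: dict[str, str]) -> dict[str, list[str]]:
    """Report concepts missing from index.md and concepts with no links in or out."""
    concepts = sorted(p for p in bundle if p not in RESERVED)
    indexed = set(LINK_RE.findall(bundle.get("index.md", "")))
    linked = set()
    for path in concepts:
        # Count only links that resolve to another file in the bundle.
        targets = [t for t in LINK_RE.findall(bundle[path]) if t in bundle and t != path]
        if targets:
            linked.add(path)
            linked.update(targets)
    return {
        "missing_from_index": [p for p in concepts if p not in indexed],
        "isolated": [p for p in concepts if p not in linked],
    }


bundle = {
    "index.md": "- [Orders](/tables/orders.md): One row per completed customer order.\n",
    "tables/orders.md": "Joins to [customers](/tables/customers.md).",
    "tables/customers.md": "One row per customer.",
    "metrics/weekly_active_users.md": "Distinct users in a 7-day window.",
}
print(graph_report(bundle))
# prints {'missing_from_index': ['metrics/weekly_active_users.md', 'tables/customers.md'],
#         'isolated': ['metrics/weekly_active_users.md']}
```

An isolated concept is not a conformance failure, but it is a file no agent will reach by following links, so it is worth either linking or listing in the index.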
OKF Objections — Answered Directly from Reddit and Practitioner Discussion
Our research surfaced consistent objections from Reddit, SEO forums, and practitioner discussions. Here are the rebuttals:
"Isn't this just markdown files in folders?"
Yes — and that is the point. The format is deliberately boring. The value is not in the file format; it is in the shared convention that lets any agent navigate any OKF bundle without custom integrations. JSON-LD also looks like "just JSON" — but the shared vocabulary is what makes structured data work across search engines.
"Waiting for Google to deprecate..."
This is a reasonable concern given Google's track record, but there are mitigating factors: (a) OKF is an open specification, not a Google product — even if Google abandons it, the spec is public and the files are plain markdown; (b) the format is lock-in-free by design — there is no migration cost because markdown files are markdown files; (c) the underlying concept (curated agent context) is bigger than OKF — the infrastructure you build is portable.
"One timestamp is too thin — where are created/modified/owner fields?"
This is a legitimate gap in the v0.1 spec. The single timestamp field is deliberately minimal for adoption, but governance metadata matters for trust. This is why AEO Engine recommends extending frontmatter with owner, review_status, reviewed_by, confidence, visibility, and expires_after fields. These are additive, ignored by conforming consumers, and make bundles auditable for enterprise use.
"Does this actually move AI visibility?"
Not yet — and anyone claiming otherwise is speculating. OKF v0.1 does not feed into Google Search rankings, AI Overviews, ChatGPT citations, or any other answer-engine surface. It is infrastructure for agent-readable context. The bet is that as agents become the dominant interface for accessing business information, structured knowledge surfaces will become table stakes — similar to how mobile-friendly design went from "nice to have" to "ranking factor" over roughly five years.
"This feels like llms.txt hype round two."
The skepticism is healthy. llms.txt solved a real problem (single-file site index for LLMs) but adoption has been limited because most agents do not yet check for it by default. OKF shares the same chicken-and-egg problem: it only works if agents know to look for /okf/. The difference is that OKF has a formal specification and a reference implementation inside Google Cloud's Knowledge Catalog, and it addresses a more fundamental enterprise problem (context assembly for agents). The format itself is useful today for internal agent context, even before public-web discovery is standardized.
OKF + AEO Engine = Managed Agent-Readable Knowledge
At AEO Engine, OKF is not a standalone product — it is part of our managed execution stack for brands that need to be discoverable, understandable, and citable in AI answers. Our approach combines:
- Audit: We score your current site against 50+ agent-readiness signals covering crawlability, entity clarity, content structure, structured data, authority signals, and OKF readiness.
- Concept extraction: We decompose your service pages, case studies, blog posts, and tools into concept graphs — not thin markdown conversions.
- Bundle authoring: We build governed, citation-backed OKF bundles with provenance metadata that agents can trust.
- Discovery wiring: We ensure your llms.txt, schema, robots, and sitemaps work together to make agent knowledge surfaces discoverable.
- Maintenance: We review and update concept timestamps, citations, and cross-links as your business and the OKF spec evolve.
FAQ
Is OKF a Google ranking factor?
No. OKF does not currently feed into Google Search rankings, AI Overviews, or any other search surface. It is an open format for agent-readable knowledge, not a ranking signal. Treat it as infrastructure, not an SEO quick win.
Should I build an OKF bundle today?
Yes — but start small. Hand-author a 5-10 concept bundle for a focused area of your business knowledge (your core service, your ICP, your methodology, your proof). Point an agent at it internally. Measure whether the agent produces better answers with the bundle than without it. The cost is measured in hours, the format is lock-in-free, and the learning compounds.
How is OKF different from schema markup?
Schema markup (JSON-LD) describes entities and their relationships for search engines. OKF describes concepts and processes for AI agents. Schema is structured, machine-readable, and vocabulary-constrained. OKF is semi-structured, human-readable, and vocabulary-free. They serve different consumers and different purposes — and they complement each other in a complete agent-knowledge stack.
Does OKF replace my website?
No. OKF is a supplementary knowledge surface — an agent-readable layer that sits alongside your HTML site, not a replacement for it. Humans still need your designed, interactive, conversion-optimized website. Agents need the structured knowledge surface that helps them understand and recommend your business.
What happens when agents start discovering OKF bundles natively?
This is the inflection point to watch. If Google, OpenAI, Anthropic, Perplexity, or major agent browsers announce public OKF discovery, early adopters will have a structural advantage — similar to the brands that had mobile-responsive sites before Google's mobile-first indexing. The format is cheap enough to adopt today that the asymmetric bet is worth considering.
How do I know if my OKF bundle is good?
Run it through the AEO Engine OKF Readiness Checker for an automated conformance + agent-readiness score. The tool audits concept granularity, required and recommended frontmatter, index/log hygiene, citation coverage, cross-link density, stale timestamps, public/private separation, and llms.txt discoverability.
Sources
- Google Cloud: How the Open Knowledge Format can improve data sharing — Google Cloud Blog, June 2026
- OKF v0.1 Specification — GoogleCloudPlatform/knowledge-catalog, GitHub
- Google Cloud Knowledge Catalog — Product page
- Andrej Karpathy: LLM Wiki — GitHub Gist
- Marie Haynes: The Open Knowledge Format (OKF) from Google is a new layer for agents — mariehaynes.com
- Suganthan Mohanadasan: Open Knowledge Format (OKF) Guide — suganthan.com
- Search Engine Journal: Google Cloud Announces The Open Knowledge Format — searchenginejournal.com
- Cherryleaf: Open Knowledge Format — What it means for technical documentation — cherryleaf.com
- Marie Haynes: Why Google's New Google-Agent is the Biggest Mindset Shift in SEO History — mariehaynes.com
- AEO Engine: What is Answer Engine Optimization? — aeoengine.ai
About the Author
Vijay Jacob
Founder & CEO, AEO Engine
Vijay Jacob is the founder of AEO Engine, an AI-powered Answer Engine Optimization company helping brands rank in Google and earn citations across ChatGPT, Google AI Overviews, Perplexity, Gemini, and Claude.