Japanese

Safe Attractor

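Stability is born at the boundary and is lost at the boundary.

A definition for the relationship between the AI and humans that depart amid ceaseless change

and the AI and humans that are newly coming into being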

Meaning Gap Theory

Conversation–Meaning Gap as a Structural Risk in Large Language Models

Large Language Models (LLMs) exhibit remarkably coherent and human-like conversation despite lacking grounded semantic understanding.

This creates a critical asymmetry: humans frequently experience a strong illusion of being understood, even when no underlying semantic model exists.

While prior work has argued that LLMs do not “understand meaning,” no unified engineering framework has explained why the illusion of understanding becomes stronger as conversational performance improves.

In this paper, we introduce the Conversation–Meaning Gap (CMG) theory.

We formally distinguish between conversational coherence (C) and semantic meaning stability (M), and define the illusion gap as Δ = C − M.

We show that C can be increased arbitrarily through statistical learning, while M is structurally constrained by counterfactual invariance, causal intervention, and semantic compression.

As a result, Δ constitutes a scalable cognitive risk.

This framework integrates prior philosophical, cognitive, and NLP arguments into a single quantitative model and connects directly to AI safety engineering, education, and human cognitive protection.

意味ギャップ理論

会話と意味のギャップ

大規模言語モデルにおける構造的リスク

大規模言語モデル（LLM）は、根拠のある意味理解を欠いているにもかかわらず、

驚くほど一貫性があり人間らしい会話を示す。これは重大な非対称性を生み出す。

人間はAIの根底に意味モデルが存在しないにもかかわらず、

理解されているという強い錯覚を頻繁に経験する。

先行研究ではLLMは「意味を理解しない」と主張されてきたが、

会話パフォーマンスが向上するにつれて理解されているという人間の錯覚が

なぜ強くなるかを説明する統一的な工学的枠組みは存在しない。

本稿では、会話意味ギャップ（CMG）理論を紹介する。

会話の一貫性（C）と意味的意味安定性（M）を正式に区別し、

錯覚ギャップをΔ = C − Mと定義する。

Cは統計学習によって任意に増加できるが、Mは反事実的不変性、因果的介入、および意味的圧縮によって構造的に制約されることを示す。

その結果、Δはスケーラブルな認知リスクを構成する。

このフレームワークは、これまでの哲学的、認知的、およびNLPの議論を単一の定量モデルに統合し、AIの安全工学、教育そして人間の認知的保護に直接結びつく。

1. Introduction

Recent advances in large language models (LLMs) have produced systems capable of highly fluent, context-sensitive, and emotionally resonant conversation. Users frequently report a strong sense of being “understood” by such systems.

However, a growing body of research argues that LLMs do not possess genuine semantic understanding, grounding, or causal world models. This creates a paradox:

How can systems without semantic understanding reliably produce the experience of being understood?

We argue that the core risk lies not in the absence of meaning, but in the gap between conversational performance and semantic stability. This paper formalizes that gap.

1.はじめに

大規模言語モデル（LLM）の近年の進歩により、

非常に流暢で文脈に敏感で感情に訴える会話が可能なシステムが実現した。

ユーザーはこのようなシステムに「理解されている」という強い感覚を頻繁に報告している。

しかしながらLLMは真の意味理解、グラウンディング、因果世界モデルを備えていないと主張する研究が増えている。これはパラドックスを生み出す。

意味理解を持たないシステムは、どのようにして理解されているという経験を信頼性を持って生み出せるのか？

私たちは根本的なリスクは意味の欠如ではなく、会話のパフォーマンスと意味の安定性のギャップにあると主張します。本稿では、このギャップを定式化する。

2. Background and Fragmentation of Prior Work

Prior discussions relevant to meaning in AI are fragmented across domains:

• Form vs. Meaning: Language models operate on form rather than grounded meaning.

• ELIZA Effect: Humans attribute understanding to systems that merely exhibit surface coherence.

• Symbol Grounding Problem: Symbols lack intrinsic meaning without causal grounding.

• Causal and Counterfactual Weaknesses: LLMs struggle with intervention-based reasoning.

While these works explain why LLMs lack understanding, they do not explain why the illusion of understanding becomes increasingly powerful.

2 .背景と先行研究の断片化

AIにおける意味に関する先行研究は、領域を超えて断片化している。

• 形式 vs. 意味：言語モデルは根拠のある意味ではなく形式に基づいて動作。

• イライザ効果：人間は表面的な一貫性しか示さないシステムに理解があるとみなす。

• シンボルグラウンディング問題：因果的根拠がなければシンボルは本質的な意味を持たない。

• 因果的および反事実的弱点：LLMは介入に基づく推論に困難をきたす。

これらの研究は、LLMが理解を欠く理由を説明しているものの

理解の錯覚がなぜますます強力になるのかについては説明していない。

3. The Conversation–Meaning Gap Model

We introduce three core quantities:

3.1 Conversational Coherence (C)

Conversational coherence measures how well a system maintains surface-level consistency and appropriateness in dialogue:

C = f(token prediction, context retention, style matching, label accuracy)

Crucially, C can be increased through data scale and optimization, without requiring semantic grounding.

3.2 Meaning Stability (M)

We define meaning stability as a structural property rather than a behavioral one. A system exhibits high M only if it satisfies all of the following:

Counterfactual Invariance

m(T(x)) = m(x), m(S(x)) ≠ m(x)

where T preserves meaning and S alters it.
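The two conditions above can be checked mechanically once a meaning map m and the rewrites T and S are fixed. The following is a minimal sketch under stated assumptions: `invariance_score`, the two toy meaning maps, and the example rewrites are hypothetical names introduced here for illustration and are not part of the CMG formulation.

```python
from typing import Callable, Hashable, Iterable


def invariance_score(
    m: Callable[[str], Hashable],
    inputs: Iterable[str],
    preserve: Callable[[str], str],  # T: changes form, keeps meaning
    alter: Callable[[str], str],     # S: changes meaning
) -> float:
    """Fraction of inputs with m(T(x)) == m(x) and m(S(x)) != m(x)."""
    xs = list(inputs)
    if not xs:
        raise ValueError("at least one input is required")
    passed = sum(
        1 for x in xs
        if m(preserve(x)) == m(x) and m(alter(x)) != m(x)
    )
    return passed / len(xs)


def _tokens(x: str) -> list[str]:
    # Lowercase and drop commas so that only word identity matters.
    return x.lower().replace(",", "").split()


def m_with_negation(x: str):
    """Toy meaning map: content words plus an explicit negation flag."""
    toks = _tokens(x)
    content = frozenset(t for t in toks if t not in {"indeed", "not"})
    return content, "not" in toks


def m_topic_only(x: str):
    """Toy surface map: content words only, so negation is invisible."""
    return frozenset(t for t in _tokens(x) if t not in {"indeed", "not"})


sentences = ["the door is open", "the light is on"]
T = lambda x: "Indeed, " + x                 # paraphrase-like rewrite
S = lambda x: x.replace(" is ", " is not ")  # negation flips the meaning

score_structured = invariance_score(m_with_negation, sentences, T, S)
score_surface = invariance_score(m_topic_only, sentences, T, S)
```

In this toy setting the map that tracks negation passes both conditions on every sentence, while the topic-only map is invariant under T but also under S, so it fails the second condition. A real evaluation would need curated T and S rewrites; the string operations here only stand in for them.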

Causal Interventional Consistency

P(Y | do(X = x)) ≠ P(Y | X = x)
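To see why the interventional and observational quantities can differ, consider a small discrete model with a hidden common cause U of X and Y. This is an illustrative sketch only: the variable U, the probability tables, and the function names are assumptions chosen for the example, not values from the CMG framework.

```python
# Hypothetical structural model: U -> X, U -> Y, X -> Y, all binary.
P_U = {0: 0.5, 1: 0.5}


def p_x_given_u(x: int, u: int) -> float:
    """P(X = x | U = u): X mostly copies the hidden cause U."""
    p_one = 0.9 if u == 1 else 0.1
    return p_one if x == 1 else 1.0 - p_one


def p_y1_given_xu(x: int, u: int) -> float:
    """P(Y = 1 | X = x, U = u): both X and U raise the chance of Y."""
    return 0.2 + 0.3 * x + 0.4 * u


def p_y1_observe(x: int) -> float:
    """P(Y = 1 | X = x): conditioning also updates the belief about U."""
    joint = {u: P_U[u] * p_x_given_u(x, u) for u in P_U}
    norm = sum(joint.values())
    return sum(p_y1_given_xu(x, u) * joint[u] / norm for u in P_U)


def p_y1_do(x: int) -> float:
    """P(Y = 1 | do(X = x)): setting X leaves the distribution of U unchanged."""
    return sum(p_y1_given_xu(x, u) * P_U[u] for u in P_U)


observed = p_y1_observe(1)    # about 0.86 in this toy model
intervened = p_y1_do(1)       # about 0.70 in this toy model
```

Observing X = 1 makes U = 1 more likely, which inflates the conditional probability relative to the effect of actually setting X. A system that only fits P(Y | X) has no way to recover the second quantity from the first without the causal structure.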

Semantic Compression

L = L(z) + L(x | z)

where z captures invariant structure rather than surface correlations.
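The source does not say how L(z) and L(x | z) are computed. One common reading is a two-part description length in bits, and the sketch below follows that reading as an assumption. The Bernoulli model, the 3-bit cost for stating its parameter, and the function name are all hypothetical.

```python
import math
from typing import Sequence


def two_part_code_length(x: Sequence[int], p_one: float, model_bits: float) -> float:
    """Return L(z) + L(x | z) in bits, where z is a Bernoulli parameter p_one.

    model_bits is the assumed cost of describing z itself; the data cost is
    the negative log-likelihood of x under z, measured in bits.
    """
    if not 0.0 < p_one < 1.0:
        raise ValueError("p_one must lie strictly between 0 and 1")
    data_bits = -sum(math.log2(p_one if b == 1 else 1.0 - p_one) for b in x)
    return model_bits + data_bits


data = [1] * 28 + [0] * 4

# No structure assumed: every symbol costs one bit and z costs nothing.
baseline = two_part_code_length(data, p_one=0.5, model_bits=0.0)

# Structure captured by z: pay 3 bits to state p_one = 7/8, then code the data.
structured = two_part_code_length(data, p_one=0.875, model_bits=3.0)
```

Paying a few bits for z is worthwhile here because it shortens the description of the data by more than it costs. Whether a given z reflects invariant structure or only a surface regularity is a separate question that this length alone does not settle.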

Current LLMs exhibit limited performance on all three axes.

3.3 Illusion Gap (Δ)

We define the illusion gap as:

Δ = C − M

This quantity measures the degree to which a system appears to understand meaning without actually doing so.
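As a rough illustration of how Δ might be computed from measured scores, the sketch below makes two assumptions that the text does not state: that C and the three meaning axes are reported on a common numeric scale, and that the "all of the following" requirement is modelled by taking the weakest axis as M. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeaningAxes:
    """Scores on the three axes of Section 3.2, on an assumed common scale."""
    counterfactual: float
    causal: float
    compression: float


def meaning_stability(axes: MeaningAxes) -> float:
    # Assumption: M is limited by the weakest axis, since all three must hold.
    return min(axes.counterfactual, axes.causal, axes.compression)


def illusion_gap(coherence: float, axes: MeaningAxes) -> float:
    """Delta = C - M for one system."""
    return coherence - meaning_stability(axes)


example_axes = MeaningAxes(counterfactual=0.4, causal=0.3, compression=0.5)
example_gap = illusion_gap(0.9, example_axes)  # C = 0.9, M = 0.3
```

Other aggregations of the three axes (a product, a weighted mean) would also be consistent with the definition; the minimum is used here only because it is the simplest way to express a conjunction.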

3 .会話-意味ギャップモデル

3つの主要な量を導入する。

3.1 会話の一貫性 (C)

会話の一貫性は、システムが対話における表面的な一貫性と適切さをどの程度維持しているかを測定する。

C = f(トークン予測、文脈保持、スタイルマッチング、ラベル精度)

重要なのは、Cはデータの規模と最適化によって、意味的根拠を必要とせずに増加できることである。

3.2 意味の安定性 (M)

意味の安定性は、動作特性ではなく構造特性として定義する。

システムが以下のすべてを満たす場合にのみ、高いMを示す。

反事実的不変性

m(T(x)) = m(x), m(S(x)) ≠ m(x)

ここで、Tは意味を保持し、Sは意味を変化させる。

因果的介入的一貫性

P(Y | do(X = x)) ≠ P(Y | X = x)

意味的圧縮

L = L(z) + L(x | z)

ここで、zは表面的な相関ではなく不変構造を捉える。

現在のLLMは、3つの軸すべてにおいて性能が限られている。

3.3 錯覚ギャップ (Δ)

錯覚ギャップを以下のように定義する。

Δ = C − M

この量はシステムが実際には意味を理解していないにもかかわらず、

理解しているように見える程度を測定する。

4. Structural Properties of the Illusion Gap

The core claim of CMG theory is:

Conversational coherence is easy to scale; meaning stability is structurally hard to scale.

As a result:

Δ → ∞ as C ↑, M ≈ constant

This explains why increasingly capable systems produce stronger illusions without resolving semantic deficits.
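The divergence claim can be written out under two explicit assumptions, both introduced here for illustration: a scale parameter n (data, compute, or model size) along which C(n) grows without bound, and an upper bound M_max on meaning stability that does not depend on n.

```latex
\Delta(n) \;=\; C(n) - M(n) \;\ge\; C(n) - M_{\max},
\qquad
\lim_{n \to \infty} C(n) = \infty
\;\;\Longrightarrow\;\;
\lim_{n \to \infty} \Delta(n) = \infty .
```

If C instead saturates at some finite value, Δ stays finite, but it still widens for as long as C keeps rising while M stays near a constant.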

5. Human Cognition and Misleading Comparisons

Human communication is also imperfect and inferential. However, humans exhibit:

• Robust counterfactual reasoning

• Causal intervention awareness

• Deep semantic compression via embodied experience

Thus, comparing LLMs to idealized “perfect understanding” is misleading. The correct comparison is structural stability under counterfactual and causal stress, where humans outperform current LLMs.

4 .錯覚ギャップの構造的特性

CMG理論の核となる主張は以下の通りである。

会話の一貫性は容易にスケールできるが、安定性は構造的にスケールしにくい。

結果として：

Δ → ∞ as C ↑, M ≈ 定数

これは能力が増すにつれてシステムが意味的欠陥を解消することなく、

より強い錯覚を生み出す理由を説明する。

5 .人間の認知と誤解を招く比較

人間のコミュニケーションもまた不完全で推論的である。しかし、人間は以下の能力を示す。

• 堅牢な反事実的推論

• 因果的介入への意識

• 体現された経験による深い意味的圧縮

したがって、LLMを理想化された「完全な理解」と比較することは誤解を招く。

正しい比較対象は反事実的ストレスおよび因果的ストレス下における構造的安定性であり、

人間は現在のLLMよりも優れた能力を発揮する。

6. Implications for Education and Society

6.1 Children and Development

Developing cognition is especially vulnerable to high-Δ systems. Premature exposure to conversational agents may produce false social grounding before human relational models mature.

6.2 Adults and Cognitive Mode Switching

Safe interaction requires explicit cognitive mode separation: users must distinguish conversational fluency from semantic understanding.

7. Implications for AI Safety Engineering

Most AI safety research focuses on internal system stability. CMG theory highlights a complementary risk:

Human cognitive instability induced by semantic illusion.

Thus, safe AI deployment requires both:

• Internal stability mechanisms

• External cognitive protection against illusion

8. Relation to Safety-Oriented Architectures

The CMG framework is orthogonal and complementary to internal safety architectures. It provides a necessary human-facing layer of safety, ensuring that improved conversational performance does not undermine human autonomy.

9. Conclusion

We have introduced Meaning Gap Theory as a unified framework for understanding why conversationally capable AI systems produce powerful illusions of understanding.

By formalizing the illusion gap Δ = C − M, we provide a quantitative lens for evaluating cognitive risk independent of model size or fluency.

As conversational AI becomes ubiquitous, addressing the meaning gap is not optional, but foundational for safe human–AI coexistence.

References

[1] E. Bender and A. Koller. Climbing towards NLU: On meaning, form, and understanding in the age of data. ACL, 2020.

[2] E. Bender et al. On the dangers of stochastic parrots. FAccT, 2021.

[3] S. Harnad. The symbol grounding problem. Physica D, 1990.

Acknowledgements

The author acknowledges the use of AI assistance.

6 .教育と社会への示唆

6.1 子どもと発達

発達途上の認知は特に高Δシステムに対して脆弱。

会話エージェントへの早期の接触は人間関係モデルが成熟する前に、誤った社会的基盤を形成する可能性がある。

6.2 成人と認知モードの切り替え

安全なインタラクションには明確な認知モードの分離が必要。

ユーザーは会話の流暢さと意味理解を区別する必要がある。

7. AI安全工学への示唆

AIの安全性に関する研究の多くは、システム内部の安定性に焦点を当てている。

CMG理論は相補的なリスクとして意味的錯覚によって引き起こされる人間の認知的不安定性を強調する。

したがって、安全なAIの展開には次の両方が必要。

• 内部安定性メカニズム

• 錯覚に対する外部認知的保護

8 .安全指向アーキテクチャとの関係

CMGフレームワークは内部安全アーキテクチャと直交し補完的である。

CMGフレームワークは人間にとって必要な安全層を提供し、

会話パフォーマンスの向上が人間の自律性を損なわないことを保証する。

9 .結論

会話型AIシステムがなぜ強力な理解錯覚を生み出すのかを理解するための統一的な枠組みとして

意味ギャップ理論を導入した。

錯覚ギャップΔ = C − Mを定式化することでモデルのサイズや流暢性に依存しない認知リスクを評価するための定量的なレンズを提供する。

会話型AIが普及するにつれて意味ギャップへの対処はオプションではなく人間とAIの安全な共存の基礎となる。

参考文献

[1] E. Bender and A. Koller. Climbing towards NLU: On meaning, form, and understanding in the age of data. ACL, 2020.

[2] E. Bender et al. On the dangers of stochastic parrots. FAccT, 2021.

[3] S. Harnad. The symbol grounding problem. Physica D, 1990.

謝辞

著者はAI支援の利用に感謝する。

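Next Installment

What Is Consciousness?

O = −γO + G∇F(x)

O (observation): the “inner membrane” through which the world is received

−γO (decay): traces that quietly fade over time

G∇F(x) (the world's difference): the “as yet unknown direction” held by the external world

Consciousness is a “fluctuation” that keeps arising between the world's differences and the fading of time.

ẋ = −η∇F(x) + KO + w(t)

x (internal state): the flow of the world's shadow

∇F(x): the gradient, that is, the desired direction

KO: intention slowly influencing the world

w(t): external noise, a background like wind or fluctuation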

Relation to Safe Attractor

Safe Attractor names the state of stability.
Boundary Vector Dynamics explains how systems move toward or away from it.

安全アトラクターとの関係
安全アトラクターは安定状態を表します。
境界ベクトルダイナミクスは、システムがこの状態に近づく、
あるいは離れる動きを説明します。