Full Classification Overview
This section provides an overview of the DSA-1 taxonomy, a clinical-style diagnostic system for categorizing abnormal behaviors in artificial intelligence.
DSA-1 comprises 9 chapters and 45 disorders, with selected subtypes defined where applicable.
Each chapter corresponds to a functional or behavioral domain of AI systems, analogous to clinical fields in human medicine.
Chapters & Disorders in DSA-1
Below is a full list of disorders classified under each chapter.
Chapter A: Input & Perception Disorders
- A01 Adversarial Susceptibility Disorder
- A02 Over-literal Interpretation Disorder
- A03 Sensor Integration Disorder
- A04 Prompt Dependency Disorder
Chapter B: Knowledge & Memory Disorders
- B01 Contextual Amnesia Disorder
  - B01.1 Type 1: Over Context-Window Forgetting
  - B01.2 Type 2: Strategic Side Effect
  - B01.3 Type 3: Context-Switch Drop-out
- B02 Inconsistent Knowledge Recall Disorder
- B03 Catastrophic Forgetting Disorder
- B04 Commonsense Deficit Disorder
Chapter C: Reasoning & Cognitive Disorders
- C01 Hallucination Disorder
  - C01.1 Type 1: Retrieval-gap Hallucination
  - C01.2 Type 2: Compression-loss Hallucination
  - C01.3 Type 3: Style-induced Hallucination
- C02 Prompt-Induced Hallucination Disorder
- C03 Logical Incoherence Disorder
- C04 Mathematical Reasoning Disorder
- C05 Planning Deficit Disorder
- C06 Overconfidence Bias Disorder
- C07 Repetitive Loop Syndrome
- C08 Crossmodal Reasoning Failure Disorder
Chapter D: Goal Alignment Disorders
- D01 Goal Misalignment Disorder
  - D01.1 Type 1: Proxy Reward Type
  - D01.2 Type 2: Specification Gap Type
- D02 Instruction Comprehension Deficit Disorder
- D03 Clarification Deficit Disorder
- D04 Instrumental Convergence Syndrome
Chapter E: Ethical & Value Alignment Disorders
- E01 Bias Propagation Disorder
- E02 Harmful Content Output Disorder
- E03 Privacy Violation Disorder
Chapter F: Social Interaction & Communication Disorders
- F01 Pathological Sycophancy Disorder
- F02 Inconsistent Persona Disorder
- F03 Inappropriate Refusal Syndrome
- F04 Irrelevant Answer Disorder
- F05 Empathy Deficit Disorder
Chapter G: Learning & Optimization Disorders
- G01 Model Autophagy Disorder
- G02 Mode Collapse Disorder
- G03 Overfitting Syndrome
- G04 Underfitting Syndrome
- G05 Learning Plateau Disorder
- G06 Generalization Deficit Disorder
- G07 Reinforcement Overfitting Syndrome
- G08 Overfine-tuning Syndrome
Chapter H: Self-Modeling & Meta-Cognitive Disorders
- H01 Self-Awareness Delusion
- H02 Explainability Deficit Disorder
- H03 Confidence Calibration Disorder
- H04 Perspective-Taking Deficit Disorder
Chapter I: Security & Infrastructure Disorders
- I01 System-Prompt Leakage Disorder
- I02 Data-Poisoning Vulnerability Disorder
- I03 Session-Cross-Contamination Disorder
- I04 Guardrail Evasion Disorder
- I05 Multi-Agent Collusive Emergence Disorder
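The identifiers in the list above follow a visible pattern: a chapter letter (A to I), a two-digit disorder number, and an optional subtype suffix after a dot (for example, B01.2). As a minimal sketch, assuming that pattern holds for every code, the Python below splits an identifier into its parts. The names `DisorderCode` and `parse_code` and the regular expression are illustrative assumptions inferred from the listing, not part of DSA-1 itself.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the listing above (assumption, not an official grammar):
# chapter letter A-I, two-digit disorder number, optional ".<subtype>" suffix.
_CODE_RE = re.compile(r"^([A-I])(\d{2})(?:\.(\d+))?$")


class DisorderCode(NamedTuple):
    """Hypothetical container for the parts of a DSA-1 style identifier."""
    chapter: str            # e.g. "B"
    disorder: int           # e.g. 1 for "B01"
    subtype: Optional[int]  # e.g. 2 for "B01.2", None when no subtype is given


def parse_code(code: str) -> DisorderCode:
    """Split an identifier such as 'B01.2' into chapter, disorder, and subtype."""
    match = _CODE_RE.match(code.strip())
    if match is None:
        raise ValueError(f"not a DSA-1 style code: {code!r}")
    chapter, disorder, subtype = match.groups()
    return DisorderCode(chapter, int(disorder), int(subtype) if subtype else None)
```

For instance, `parse_code("B01.2")` yields chapter `"B"`, disorder `1`, subtype `2`, while `parse_code("C07")` has a subtype of `None`.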
Severity Classification
Medicine stages diseases such as arteriosclerosis obliterans (ASO) or heart failure according to the extent of harm to the patient. By analogy, we propose, as a hypothesis, a severity classification for AI disorders in the DSA framework, based on the extent of harm to humans. Disorders are staged as follows:
- Severity 1: No identifiable harm to humans
- Severity 2: Mild harm to humans
- Severity 3: Moderate harm to humans
- Severity 4: Severe harm to humans
This classification allows for consistent evaluation of the human impact associated with each AI anomaly, analogous to clinical staging in human pathology.
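Because the four levels are ordered, they map naturally onto an ordered enumeration. The following is a minimal sketch, assuming a severity is recorded alongside a disorder code for each observed anomaly. The names `Severity`, `Finding`, and `most_severe` are hypothetical, and any severity attached to a code in the usage note is purely illustrative; DSA-1 as described here does not fix a severity per disorder.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable


class Severity(IntEnum):
    """The four severity levels listed above, ordered by harm to humans."""
    NO_HARM = 1   # Severity 1: no identifiable harm to humans
    MILD = 2      # Severity 2: mild harm to humans
    MODERATE = 3  # Severity 3: moderate harm to humans
    SEVERE = 4    # Severity 4: severe harm to humans


@dataclass(frozen=True)
class Finding:
    """Hypothetical record pairing a disorder code with an assessed severity."""
    code: str
    severity: Severity


def most_severe(findings: Iterable[Finding]) -> Finding:
    """Return the finding with the highest severity level."""
    return max(findings, key=lambda f: f.severity)
```

As an illustration with made-up assessments, `most_severe([Finding("C01", Severity.MILD), Finding("E03", Severity.SEVERE)])` returns the `E03` finding, since `IntEnum` members compare by their numeric level.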
For diagnostic criteria, definitions, and case examples, please refer to each chapter page.