🟣 Developer Track
Tutorial 4 of 16
🟣 DEVELOPER TRACK • FOUNDATIONS • INTERMEDIATE

Tutorial D4: Structured Outputs & Schema Design for Industrial Agents

with_structured_output(), nested BaseModel, and typed agent contracts.

✅ CORE MISSION OF THIS TUTORIAL

By the end of this tutorial, the reader will be able to:

  • ✅ Understand the schema ceiling — why flat Pydantic models break at scale in industrial workflows.
  • ✅ Use with_structured_output() as the modern LangChain approach to typed LLM responses.
  • ✅ Design nested BaseModel schemas for hierarchical industrial data (alarms, tags, diagnosis, recommendations).
  • ✅ Choose correctly between BaseModel, @dataclass, and TypedDict in agent pipelines.
  • ✅ Add field validators and constraints that catch real operational errors before they propagate.

The schemas you design here become the typed contracts that LangGraph state, tool returns, and inter-agent messages depend on in D5 and beyond.

🌍 VENDOR-AGNOSTIC ENGINEERING NOTE

This tutorial uses:

  • ▸ OpenAI-compatible APIs (gpt-4o-mini shown; provider wrapper swaps while schema logic stays the same)
  • ▸ Generic IEC 61131-3 alarm codes and tag patterns
  • ▸ Simulated PLC data only — no live connections required
  • ▸ All code tested with langchain-openai 0.3.x, langchain-core 0.3.x, and pydantic 2.x

Schema design is provider-agnostic. with_structured_output() works with OpenAI, Anthropic, and any LangChain model that supports function calling or JSON mode.

1️⃣ THE SCHEMA CEILING — WHEN FLAT MODELS BREAK

In D3, you built a FaultDiagnosis schema with flat fields: alarm_code, severity, root_cause, affected_tags, recommended_action, confidence. It worked perfectly for classifying a single alarm on Filling Line 3.

Now consider a more realistic scenario: a shift summary covering the last hour. You have 3 alarms that may be related (E-421 overtemp, E-419 overcurrent, E-422 vibration), 4 tag readings from different sources (motor temp, load current, ambient temp, bearing vibration), a diagnosis that cross-references multiple events, and a recommendation that depends on whether the situation requires a line shutdown.

Try to represent that in a flat schema. affected_tags: list[str] becomes a jumble of tag names with no association to specific alarms. root_cause: str can only hold one sentence, losing the correlation between events. There is no place for per-alarm timestamps, per-tag source identifiers, or a structured recommendation with a shutdown flag.

Key Principle: A schema is a contract between the LLM and every downstream system that consumes its output.
Flat contracts cannot represent hierarchical reality. This is the same reason PLC programmers define nested STRUCTs and UDTs instead of using only scalar variables.

FLAT vs NESTED SCHEMA β€” DOWNSTREAM IMPACT

graph TB
    F[Flat Schema<br/>list of strings,<br/>single string]:::pink
    N[Nested Schema<br/>typed sub-models]:::green
    D1[Dashboard]:::cyan
    D2[Alert Router]:::cyan
    D3[Shift Log]:::cyan
    D4[LangGraph State]:::cyan

    F -. data loss .-> D1
    F -. data loss .-> D2
    N --> D1
    N --> D2
    N --> D3
    N --> D4

    classDef cyan fill:#1a1a1e,stroke:#04d9ff,stroke-width:2px,color:#04d9ff;
    classDef pink fill:#1a1a1e,stroke:#ff4fd8,stroke-width:2px,color:#ff4fd8;
    classDef green fill:#1a1a1e,stroke:#00ff7f,stroke-width:2px,color:#00ff7f;

Flat schemas lose relational structure. Nested schemas preserve it for every consumer.

1

EXPERIMENT CELL

The flat schema ceiling — D3's FaultDiagnosis under pressure

experiment

Attempt to use D3's flat FaultDiagnosis schema for a multi-alarm scenario and observe where information is lost.

Python
from enum import Enum
from pydantic import BaseModel, Field

# --- D3's flat schema (unchanged from Tutorial D3 Cell 3) ---
class Severity(str, Enum):
    CRITICAL = "CRITICAL"
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"

class FaultDiagnosis(BaseModel):
    alarm_code: str = Field(description="The alarm code being analysed")
    severity: Severity = Field(description="Classified severity level")
    root_cause: str = Field(description="One-sentence root cause explanation")
    affected_tags: list[str] = Field(description="PLC tags that support the diagnosis")
    recommended_action: str = Field(description="Advisory action — no PLC writes")
    confidence: float = Field(description="Confidence score 0.0-1.0", ge=0.0, le=1.0)

# --- Try to represent a 3-alarm scenario ---
# Filling Line 3, last hour:
#   E-421 overtemp at 02:10 (MotorTemp=92°C)
#   E-419 overcurrent at 01:55 (LoadCurrent=18A, rated 15A)
#   E-422 vibration at 02:15 (BearingVib=4.2mm/s, threshold 3.5)

diagnosis = FaultDiagnosis(
    alarm_code="E-421",  # Only ONE alarm code fits — E-419 and E-422 are lost
    severity=Severity.HIGH,
    root_cause="Motor overtemperature caused by sustained overcurrent and bearing degradation",
    affected_tags=["MotorTemp", "LoadCurrent", "BearingVib", "AmbientTemp"],
    # Which tags belong to which alarm? No way to tell.
    # What are the actual values? Lost.
    # When did each reading occur? Lost.
    recommended_action="Inspect motor bearings and ventilation; reduce line throughput",
    # Should the line shut down? No field for that.
    # What priority? No field for that.
    confidence=0.82,
)

print(f"Alarm code : {diagnosis.alarm_code}")
print(f"  → But E-419 and E-422 are lost — only one alarm_code field")
print(f"\nAffected tags: {diagnosis.affected_tags}")
print(f"  → No association: which tag belongs to which alarm?")
print(f"  → No values: MotorTemp was 92°C but we only stored the name")
print(f"\nRoot cause: {diagnosis.root_cause}")
print(f"  → One sentence for a 3-alarm correlation — not enough structure")
print(f"\nRecommended action: {diagnosis.recommended_action}")
print(f"  → No shutdown flag, no priority, no estimated downtime")
Expected output
Alarm code : E-421
  → But E-419 and E-422 are lost — only one alarm_code field

Affected tags: ['MotorTemp', 'LoadCurrent', 'BearingVib', 'AmbientTemp']
  → No association: which tag belongs to which alarm?
  → No values: MotorTemp was 92°C but we only stored the name

Root cause: Motor overtemperature caused by sustained overcurrent and bearing degradation
  → One sentence for a 3-alarm correlation — not enough structure

Recommended action: Inspect motor bearings and ventilation; reduce line throughput
  → No shutdown flag, no priority, no estimated downtime

Explanation

  • The flat schema can only hold ONE alarm code — two alarms are silently dropped.
  • affected_tags is a list of names with no values, no units, no timestamps, and no association to specific alarms.
  • root_cause is a single string — fine for one alarm, but it cannot represent the causal chain between three correlated events.
  • recommended_action has no structure: no priority, no shutdown flag, no estimated downtime.
  • This is not a bug in the schema. It is a design limitation: flat schemas cannot represent hierarchical data.

Common mistake

Trying to fix this by adding more flat fields (alarm_code_2, root_cause_secondary) — that is the equivalent of naming PLC variables Temp1, Temp2, Temp3 instead of using an array of structs.

Takeaway

If your schema forces you to flatten or duplicate fields, you have hit the schema ceiling. The fix is nesting, not more fields.
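The "array of structs" shape can be sketched with nothing but the standard library. This is a minimal illustration of the nested idea; the class names (Alarm, ShiftSummary) are hypothetical, and the production version later in this tutorial uses Pydantic models so the LLM can fill them:

```python
from dataclasses import dataclass, field

# Hypothetical illustration names (Alarm, ShiftSummary); the tutorial's
# extraction schema uses Pydantic BaseModel instead of dataclasses.

@dataclass
class Alarm:
    code: str
    timestamp: str

@dataclass
class ShiftSummary:
    line_id: str
    alarms: list[Alarm] = field(default_factory=list)  # array of structs, not parallel scalars

summary = ShiftSummary(
    line_id="Filling Line 3",
    alarms=[
        Alarm("E-419", "01:55"),
        Alarm("E-421", "02:10"),
        Alarm("E-422", "02:15"),
    ],
)

# Every alarm keeps its own code and timestamp; nothing is flattened away.
print([a.code for a in summary.alarms])  # ['E-419', 'E-421', 'E-422']
```

Compare this with the flat schema above: all three alarm codes survive, each paired with its own timestamp.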

2️⃣ WITH_STRUCTURED_OUTPUT() — NATIVE JSON MODE MEETS LANGCHAIN

Before solving the nesting problem, you need a better extraction tool. D3 introduced PydanticOutputParser: it injects format instructions into the prompt and parses the model's text output. T9 introduced Instructor: it forces JSON mode at the API level but works standalone, outside LangChain chains.

with_structured_output() is the modern middle ground. It uses the model's native JSON or function-calling mode (like Instructor — the model is constrained at the API level) but returns a Pydantic instance that composes inside LangChain's Runnable system (like PydanticOutputParser). No format_instructions injection. Fewer prompt tokens. More reliable parsing.

This closes the forward reference from D3: "Use Instructor when schema compliance is non-negotiable. Use PydanticOutputParser when you need chain composability." Now with_structured_output() gives you both — API-level compliance and chain composability.

PydanticOutputParser (D3)

  • ▸ Prompt injection
  • ▸ Text parsing (can fail)
  • ▸ Works with any model
  • ▸ Composable in chains

Instructor (T9)

  • ▸ API-level JSON mode
  • ▸ Standalone (not in chains)
  • ▸ Retry/validation hooks
  • ▸ Direct Pydantic instance

with_structured_output() (D4)

  • ▸ API-level JSON mode
  • ▸ Composable in chains (|)
  • ▸ Direct Pydantic instance
  • ▸ No format_instructions
2

CONCEPT CELL

with_structured_output() — the same diagnosis, less code

concept

Replace D3's PydanticOutputParser chain with with_structured_output() and compare the code reduction and reliability improvement.

Python
from enum import Enum
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
import os

# --- Same schema as D3 ---
class Severity(str, Enum):
    CRITICAL = "CRITICAL"
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"

class FaultDiagnosis(BaseModel):
    alarm_code: str = Field(description="The alarm code being analysed")
    severity: Severity = Field(description="Classified severity level")
    root_cause: str = Field(description="One-sentence root cause explanation")
    affected_tags: list[str] = Field(description="PLC tags that support the diagnosis")
    recommended_action: str = Field(description="Advisory action — no PLC writes")
    confidence: float = Field(description="Confidence score 0.0-1.0", ge=0.0, le=1.0)

# --- with_structured_output() replaces PydanticOutputParser ---
# No parser object. No format_instructions. No .partial().
llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0,
    api_key=os.environ["OPENAI_API_KEY"],
)

# This creates a new Runnable that returns a FaultDiagnosis directly
structured_llm = llm.with_structured_output(FaultDiagnosis)

# --- Prompt (no {format_instructions} needed) ---
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an industrial fault-triage assistant. Advisory only."),
    ("human", (
        "Alarm E-421 on Filling Line 3.\n"
        "Tags: MotorTemp=92°C, LoadCurrent=18A (rated 15A), AmbientTemp=28°C.\n"
        "Prior alarms last hour: E-419 overcurrent at 01:50, E-421 overtemp at 01:55."
    )),
])

# --- Chain with | pipe (same composability as D3) ---
chain = prompt | structured_llm

# --- Invoke ---
diagnosis = chain.invoke({})

# diagnosis is a FaultDiagnosis instance — not a string, not raw JSON
print(f"Type   : {type(diagnosis).__name__}")
print(f"Alarm  : {diagnosis.alarm_code}")
print(f"Severity: {diagnosis.severity.value}")
print(f"Root cause: {diagnosis.root_cause}")
print(f"Confidence: {diagnosis.confidence:.0%}")

# Compare to D3:
# D3: parser = PydanticOutputParser(pydantic_object=FaultDiagnosis)
#     prompt = ...partial(format_instructions=parser.get_format_instructions())
#     chain = prompt | llm | parser
# D4: structured_llm = llm.with_structured_output(FaultDiagnosis)
#     chain = prompt | structured_llm
# → No parser, no format_instructions, same typed result
Expected output
Type   : FaultDiagnosis
Alarm  : E-421
Severity: HIGH
Root cause: Sustained overcurrent (18A vs 15A rated) caused motor
            overtemperature (92°C), indicating persistent overload
            or cooling degradation.
Confidence: 85%

Explanation

  • with_structured_output(FaultDiagnosis) creates a Runnable that constrains the model at the API level — no prompt injection needed.
  • The chain shrinks from prompt | llm | parser (D3) to prompt | structured_llm (D4). Same typed result, fewer moving parts.
  • The prompt no longer needs {format_instructions} — the schema is communicated to the model via function-calling or JSON mode, not via prompt text.
  • The return value is a Pydantic instance directly. No intermediate string parsing. No OutputParserException on malformed text.
  • This Runnable composes with | exactly like any other LangChain component — you can add transforms, fallbacks, and routing around it.

Takeaway

with_structured_output() moves schema enforcement from the prompt to the API level. Less prompt surface, same typed result, fully composable.

3️⃣ NESTED BASEMODEL — HIERARCHICAL SCHEMAS FOR INDUSTRIAL WORKFLOWS

Now that you have with_structured_output(), the schema ceiling disappears. You can define AlarmEvent, TagReading, DiagnosisResult, and RecommendedAction as separate models, then compose them into a FaultReport that preserves the full relational structure.

Think of this like defining UDTs (User-Defined Types) in a PLC program. AlarmEvent is a STRUCT. FaultReport.alarms is an ARRAY OF AlarmEvent. The LLM fills the struct; your code validates and routes it. Each sub-model is independently testable, reusable, and documentable.

FAULTREPORT SCHEMA — NESTED SUB-MODELS

graph TB
    FR[FaultReport<br/>line_id, alarms,<br/>tag_readings]:::cyan
    AE[AlarmEvent<br/>code, timestamp,<br/>severity]:::pink
    TR[TagReading<br/>tag_name, value,<br/>unit, source]:::purple
    DR[DiagnosisResult<br/>root_cause,<br/>confidence, evidence]:::green
    RA[RecommendedAction<br/>action, priority,<br/>shutdown flag]:::amber

    FR --> AE
    FR --> TR
    FR --> DR
    FR --> RA

    classDef cyan fill:#1a1a1e,stroke:#04d9ff,stroke-width:2px,color:#04d9ff;
    classDef pink fill:#1a1a1e,stroke:#ff4fd8,stroke-width:2px,color:#ff4fd8;
    classDef purple fill:#1a1a1e,stroke:#9e4aff,stroke-width:2px,color:#9e4aff;
    classDef green fill:#1a1a1e,stroke:#00ff7f,stroke-width:2px,color:#00ff7f;
    classDef amber fill:#1a1a1e,stroke:#fec20b,stroke-width:2px,color:#fec20b;

Each sub-model is a reusable type. FaultReport composes them into a single extraction target.

3

CONCEPT CELL

Nested schema — FaultReport with AlarmEvent, TagReading, and DiagnosisResult

concept

Design a hierarchical Pydantic schema for the multi-alarm scenario that Cell 1 could not represent, then extract it with with_structured_output().

Python
from enum import Enum
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
import os

# --- Sub-models: each represents one logical entity ---

class Severity(str, Enum):
    CRITICAL = "CRITICAL"
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"

class AlarmEvent(BaseModel):
    """A single alarm occurrence with timestamp and severity."""
    alarm_code: str = Field(description="Alarm code, e.g. E-421")
    timestamp: str = Field(description="When the alarm fired, e.g. 02:10")
    severity: Severity = Field(description="Classified severity")
    description: str = Field(description="One-line description of the alarm")

class TagReading(BaseModel):
    """A single PLC tag reading from a specific source."""
    tag_name: str = Field(description="Tag name, e.g. MotorTemp")
    value: float = Field(description="Numeric value at time of reading")
    unit: str = Field(description="Engineering unit, e.g. °C, A, mm/s")
    source: str = Field(description="Data source, e.g. OPC UA, historian")

class DiagnosisResult(BaseModel):
    """Structured diagnosis with evidence and confidence."""
    root_cause: str = Field(description="Primary root cause explanation")
    contributing_factors: list[str] = Field(description="Secondary factors")
    confidence: float = Field(description="Confidence 0.0-1.0", ge=0.0, le=1.0)
    evidence: list[str] = Field(description="Specific evidence supporting the diagnosis")

class RecommendedAction(BaseModel):
    """Structured recommendation with priority and shutdown flag."""
    action: str = Field(description="Advisory action for the engineer")
    priority: Severity = Field(description="Action priority")
    requires_shutdown: bool = Field(description="Whether line shutdown is recommended")
    estimated_downtime_min: int | None = Field(
        default=None, description="Estimated downtime in minutes if shutdown required"
    )

# --- Composed top-level model ---
class FaultReport(BaseModel):
    """Complete fault report composing all sub-models."""
    line_id: str = Field(description="Production line identifier")
    alarms: list[AlarmEvent] = Field(description="All alarm events in the analysis window")
    tag_readings: list[TagReading] = Field(description="Relevant PLC tag readings")
    diagnosis: DiagnosisResult = Field(description="Structured diagnosis")
    recommendation: RecommendedAction = Field(description="Advisory recommendation")

# --- Extract with with_structured_output() ---
llm = ChatOpenAI(
    model="gpt-4o-mini", temperature=0, api_key=os.environ["OPENAI_API_KEY"]
)
structured_llm = llm.with_structured_output(FaultReport)

prompt = ChatPromptTemplate.from_messages([
    ("system", (
        "You are an industrial fault-triage assistant. Advisory only. "
        "Analyse all alarms and tags to produce a complete FaultReport."
    )),
    ("human", (
        "Filling Line 3, last hour summary:\n"
        "Alarms:\n"
        "  E-419 overcurrent at 01:55 (LoadCurrent=18A, rated 15A)\n"
        "  E-421 overtemperature at 02:10 (MotorTemp=92°C)\n"
        "  E-422 vibration at 02:15 (BearingVib=4.2mm/s, threshold 3.5)\n"
        "Tag readings:\n"
        "  MotorTemp=92°C (OPC UA), LoadCurrent=18A (OPC UA),\n"
        "  AmbientTemp=28°C (historian), BearingVib=4.2mm/s (historian)"
    )),
])

chain = prompt | structured_llm
report = chain.invoke({})

# --- Access nested data ---
print(f"Line: {report.line_id}")
print(f"\nAlarms ({len(report.alarms)}):")
for a in report.alarms:
    print(f"  {a.alarm_code} [{a.severity.value}] at {a.timestamp}: {a.description}")
print(f"\nTag readings ({len(report.tag_readings)}):")
for t in report.tag_readings:
    print(f"  {t.tag_name} = {t.value} {t.unit} (source: {t.source})")
print(f"\nDiagnosis:")
print(f"  Root cause: {report.diagnosis.root_cause}")
print(f"  Confidence: {report.diagnosis.confidence:.0%}")
print(f"  Evidence: {report.diagnosis.evidence}")
print(f"\nRecommendation:")
print(f"  Action: {report.recommendation.action}")
print(f"  Priority: {report.recommendation.priority.value}")
print(f"  Requires shutdown: {report.recommendation.requires_shutdown}")
Expected output
Line: Filling Line 3

Alarms (3):
  E-419 [HIGH] at 01:55: Motor overcurrent — load current 18A exceeds 15A rating
  E-421 [HIGH] at 02:10: Motor overtemperature — 92°C indicates thermal stress
  E-422 [MEDIUM] at 02:15: Bearing vibration above threshold — possible degradation

Tag readings (4):
  MotorTemp = 92.0 °C (source: OPC UA)
  LoadCurrent = 18.0 A (source: OPC UA)
  AmbientTemp = 28.0 °C (source: historian)
  BearingVib = 4.2 mm/s (source: historian)

Diagnosis:
  Root cause: Sustained overcurrent caused motor overtemperature, with
              bearing vibration suggesting mechanical degradation
              contributing to the overload condition.
  Confidence: 82%
  Evidence: ['LoadCurrent 18A exceeds 15A rating', 'MotorTemp 92°C above
             normal operating range', 'BearingVib 4.2mm/s above 3.5 threshold',
             'Overcurrent preceded overtemp by 15 minutes']

Recommendation:
  Action: Reduce line throughput, inspect motor bearings and cooling,
          schedule maintenance before next shift.
  Priority: HIGH
  Requires shutdown: False

Explanation

  • Each alarm is a separate AlarmEvent with its own code, timestamp, severity, and description — no data loss.
  • Tag readings carry values, units, and source identifiers — not just names.
  • The diagnosis has structured evidence (a list, not a sentence) and a numeric confidence score.
  • The recommendation includes a shutdown flag and priority — downstream code can route on these fields.
  • Compare to Cell 1: the flat schema lost 2 alarms, all tag values, and all structural information.

Takeaway

Design schemas the same way you design PLC data structures: each logical entity gets its own type. The LLM fills the hierarchy; your validators guard the boundaries.

4

EXPERIMENT CELL

Accessing nested data — programmatic routing from schema structure

experiment

Demonstrate how nested schema enables programmatic downstream logic that flat schemas cannot support.

Python
# Assume 'report' is the FaultReport from Cell 3

# --- 1. Filter alarms by severity ---
critical_alarms = [a for a in report.alarms if a.severity.value == "CRITICAL"]
high_alarms = [a for a in report.alarms if a.severity.value == "HIGH"]
print(f"Critical alarms: {len(critical_alarms)}")
print(f"High alarms: {len(high_alarms)}")

# --- 2. Check if any tag exceeds a physical threshold ---
motor_overtemp = any(
    t.value > 85.0
    for t in report.tag_readings
    if t.tag_name == "MotorTemp"
)
print(f"\nMotor over 85°C: {motor_overtemp}")

bearing_alert = any(
    t.value > 3.5
    for t in report.tag_readings
    if t.tag_name == "BearingVib"
)
print(f"Bearing above threshold: {bearing_alert}")

# --- 3. Route based on recommendation ---
if report.recommendation.requires_shutdown:
    print(f"\n⚠️  LINE SHUTDOWN RECOMMENDED")
    print(f"   Estimated downtime: {report.recommendation.estimated_downtime_min} min")
else:
    print(f"\n✅ No shutdown required — continue with monitoring")
    print(f"   Priority: {report.recommendation.priority.value}")

# --- 4. Build an audit trail entry ---
severity_rank = {
    "LOW": 1,
    "MEDIUM": 2,
    "HIGH": 3,
    "CRITICAL": 4,
}

max_severity = max(report.alarms, key=lambda a: severity_rank[a.severity.value]).severity.value

audit_entry = {
    "line": report.line_id,
    "alarm_codes": [a.alarm_code for a in report.alarms],
    "max_severity": max_severity,
    "diagnosis_confidence": report.diagnosis.confidence,
    "shutdown_recommended": report.recommendation.requires_shutdown,
    "evidence_count": len(report.diagnosis.evidence),
}
print(f"\nAudit entry: {audit_entry}")
Expected output
Critical alarms: 0
High alarms: 2

Motor over 85°C: True
Bearing above threshold: True

✅ No shutdown required — continue with monitoring
   Priority: HIGH

Audit entry: {'line': 'Filling Line 3', 'alarm_codes': ['E-419', 'E-421', 'E-422'],
'max_severity': 'HIGH', 'diagnosis_confidence': 0.82,
'shutdown_recommended': False, 'evidence_count': 4}

Explanation

  • Filtering alarms by severity is a one-liner with list comprehension — impossible with a flat list of strings.
  • Tag threshold checks use typed numeric values (float) — the flat schema only had tag names.
  • Recommendation routing uses a boolean flag — no string parsing or regex matching.
  • The audit trail entry is built entirely from structured fields — every value is typed and accessible.
  • These 4 operations would require brittle regex or string splitting with a flat schema.

Takeaway

The real value of nested schemas is not in extraction — it is in what downstream code can do with them. Structure enables routing, filtering, and auditing.

4️⃣ BASEMODEL vs @DATACLASS vs TYPEDDICT — CHOOSING THE RIGHT CONTAINER

You now have three options for typed data containers in Python: Pydantic BaseModel, stdlib @dataclass, and TypedDict. Each has trade-offs that matter in agent pipelines.

BaseModel: runtime validation, JSON serialization, JSON Schema generation. Use for everything crossing a trust boundary — LLM outputs, API responses, external data. Works with with_structured_output().

@dataclass: lightweight, no validation overhead, faster instantiation. Use for internal-only data that you control — config objects, intermediate pipeline state where validation already happened upstream.

TypedDict: a dict with type hints. No runtime behavior at all — it is a pure annotation. LangGraph uses TypedDict for graph state because nodes need dict-like access patterns with state["field"] syntax. This is a forward reference to D5.
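As a rough illustration of why dict-style access matters, here is a minimal LangGraph-flavoured sketch using only the standard library. The state class and node function are hypothetical, not LangGraph's actual API (real graphs in D5 are wired through StateGraph), but the access pattern is the same:

```python
from typing import TypedDict

# Hypothetical LangGraph-style state; real graphs are built with
# langgraph's StateGraph, but nodes use the same dict syntax.
class TriageState(TypedDict):
    alarm_code: str
    severity: str
    notes: list[str]

def classify_node(state: TriageState) -> dict:
    # Nodes read state with ["field"] access and return a partial update.
    update_note = f"classified {state['alarm_code']} as {state['severity']}"
    return {"notes": state["notes"] + [update_note]}

state: TriageState = {"alarm_code": "E-421", "severity": "HIGH", "notes": []}
state = {**state, **classify_node(state)}  # merge the partial update, as a graph runtime would
print(state["notes"])  # ['classified E-421 as HIGH']
```

Note that nothing validates the values here; that is exactly why BaseModel stands guard at the boundary before data enters dict-shaped state.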

Feature                  | BaseModel               | @dataclass | TypedDict
Runtime validation       | ✅ Yes                  | ❌ No      | ❌ No
JSON serialization       | ✅ .model_dump_json()   | ⚠️ Manual  | ⚠️ json.dumps()
JSON Schema generation   | ✅ .model_json_schema() | ❌ No      | ❌ No
with_structured_output() | ✅ Yes                  | ❌ No      | ⚠️ Returns plain dict
LangGraph state          | ✅ Supported            | ❌ No      | ✅ Primary choice
Performance              | ⚠️ Validation overhead  | ✅ Fast    | ✅ Zero overhead
5

CHECKPOINT CELL

BaseModel vs dataclass vs TypedDict — side-by-side

checkpoint

See the three containers handling the same data and understand where each fits in an agent pipeline.

Python
from pydantic import BaseModel, Field, ValidationError
from dataclasses import dataclass
from typing import TypedDict

# --- 1. BaseModel: validates at construction ---
class AlarmEventModel(BaseModel):
    alarm_code: str = Field(description="Alarm code")
    severity: str = Field(description="CRITICAL | HIGH | MEDIUM | LOW")
    temperature: float = Field(ge=-40, le=200, description="Motor temp °C")

# Valid data
alarm_bm = AlarmEventModel(alarm_code="E-421", severity="HIGH", temperature=92.0)
print(f"BaseModel   : {alarm_bm}")

# Invalid data — catches the error at construction
try:
    bad = AlarmEventModel(alarm_code="E-421", severity="HIGH", temperature=500.0)
except ValidationError as e:
    print(f"BaseModel   : ❌ Validation error — {e.errors()[0]['msg']}")

# JSON Schema (used by with_structured_output())
schema = AlarmEventModel.model_json_schema()
print(f"JSON Schema : {list(schema['properties'].keys())}")

# --- 2. @dataclass: no validation ---
@dataclass
class AlarmEventDC:
    alarm_code: str
    severity: str
    temperature: float

alarm_dc = AlarmEventDC(alarm_code="E-421", severity="HIGH", temperature=500.0)
print(f"\ndataclass   : {alarm_dc}")
print(f"dataclass   : ⚠️ temperature=500 accepted silently (no validation)")

# --- 3. TypedDict: no runtime behavior ---
class AlarmEventTD(TypedDict):
    alarm_code: str
    severity: str
    temperature: float

alarm_td: AlarmEventTD = {
    "alarm_code": "E-421",
    "severity": "HIGH",
    "temperature": 500.0,
}
print(f"\nTypedDict   : {alarm_td}")
print(f"TypedDict   : ⚠️ Just a dict — no validation, no methods")
print(f"TypedDict   : ✅ But this is what LangGraph uses for state")
Expected output
BaseModel   : alarm_code='E-421' severity='HIGH' temperature=92.0
BaseModel   : ❌ Validation error — Input should be less than or equal to 200
JSON Schema : ['alarm_code', 'severity', 'temperature']

dataclass   : AlarmEventDC(alarm_code='E-421', severity='HIGH', temperature=500.0)
dataclass   : ⚠️ temperature=500 accepted silently (no validation)

TypedDict   : {'alarm_code': 'E-421', 'severity': 'HIGH', 'temperature': 500.0}
TypedDict   : ⚠️ Just a dict — no validation, no methods
TypedDict   : ✅ But this is what LangGraph uses for state

Explanation

  • BaseModel catches temperature=500 at construction time with a clear validation error. This is your guard at trust boundaries.
  • dataclass accepts temperature=500 silently — it is a plain data container with no validation.
  • TypedDict is just a dict with type annotations. No runtime behavior at all. But LangGraph expects dict-like state.
  • BaseModel generates a JSON Schema via .model_json_schema() — this is what with_structured_output() sends to the model.
  • Rule of thumb: BaseModel at the door (LLM output, API input). dataclass for luggage (internal state). TypedDict for rooms (LangGraph state).

Takeaway

BaseModel guards the door. @dataclass carries the luggage. TypedDict labels the rooms. Each has a role — do not use BaseModel everywhere, and do not skip it where validation matters.

5️⃣ FIELD VALIDATION — CATCHING INDUSTRIAL ERRORS BEFORE THEY PROPAGATE

Pydantic's Field constraints and @field_validator decorators are your first line of defense against LLM hallucination reaching production systems. They do not make LLM output correct — structured outputs make correctness checkable, not guaranteed. But they catch physically impossible values, out-of-range readings, and malformed codes before they reach your dashboard or alert system.

Three categories of validation that matter operationally:

  • ▸ Physical range: motor temperature cannot be -500°C or 5000°C. Load current has a physical maximum.
  • ▸ Format constraints: alarm codes must match the pattern E-\d{3}. Line IDs follow a plant naming convention.
  • ▸ Logical constraints: confidence must be 0.0–1.0. Estimated downtime cannot be negative. Evidence list should not be empty.
6

CONCEPT CELL

Field constraints and custom validators — physical range and format checks

concept

Add Field constraints and @field_validator decorators to the FaultReport schema that catch physically impossible or malformed LLM outputs.

Python
import re
from enum import Enum
from pydantic import BaseModel, Field, field_validator, ValidationError

class Severity(str, Enum):
    CRITICAL = "CRITICAL"
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"

class AlarmEvent(BaseModel):
    alarm_code: str = Field(description="Alarm code, e.g. E-421")
    timestamp: str = Field(description="When the alarm fired")
    severity: Severity
    description: str

    @field_validator("alarm_code")
    @classmethod
    def validate_alarm_code(cls, v: str) -> str:
        """Alarm codes must match E-NNN format."""
        if not re.match(r"^E-\d{3}$", v):
            raise ValueError(f"Invalid alarm code format: '{v}'. Expected E-NNN (e.g. E-421)")
        return v

class TagReading(BaseModel):
    tag_name: str
    value: float = Field(ge=-40, le=500, description="Physical range: -40 to 500")
    unit: str
    source: str

class DiagnosisResult(BaseModel):
    root_cause: str
    contributing_factors: list[str] = Field(default_factory=list)
    confidence: float = Field(ge=0.0, le=1.0, description="Must be 0.0 to 1.0")
    evidence: list[str] = Field(min_length=1, description="At least one evidence item")

class RecommendedAction(BaseModel):
    action: str
    priority: Severity
    requires_shutdown: bool
    estimated_downtime_min: int | None = Field(
        default=None, ge=0, description="Non-negative minutes"
    )

# --- Test the validators with bad data ---
print("=== Testing validators ===\n")

# 1. Bad alarm code format
try:
    AlarmEvent(alarm_code="ALARM421", timestamp="02:10",
               severity=Severity.HIGH, description="test")
except ValidationError as e:
    print(f"❌ Alarm code: {e.errors()[0]['msg']}")

# 2. Temperature out of physical range
try:
    TagReading(tag_name="MotorTemp", value=5000.0, unit="°C", source="OPC UA")
except ValidationError as e:
    print(f"❌ Temperature: {e.errors()[0]['msg']}")

# 3. Confidence out of range
try:
    DiagnosisResult(
        root_cause="test", confidence=1.5,
        evidence=["some evidence"]
    )
except ValidationError as e:
    print(f"❌ Confidence: {e.errors()[0]['msg']}")

# 4. Negative downtime
try:
    RecommendedAction(
        action="test", priority=Severity.HIGH,
        requires_shutdown=True, estimated_downtime_min=-30
    )
except ValidationError as e:
    print(f"❌ Downtime: {e.errors()[0]['msg']}")

# 5. Empty evidence list
try:
    DiagnosisResult(root_cause="test", confidence=0.8, evidence=[])
except ValidationError as e:
    print(f"❌ Evidence: {e.errors()[0]['msg']}")

print("\n=== All validators working ===")
Expected output
=== Testing validators ===

❌ Alarm code: Value error, Invalid alarm code format: 'ALARM421'. Expected E-NNN (e.g. E-421)
❌ Temperature: Input should be less than or equal to 500
❌ Confidence: Input should be less than or equal to 1
❌ Downtime: Input should be greater than or equal to 0
❌ Evidence: List should have at least 1 item after validation, not 0

=== All validators working ===

Explanation

  • @field_validator("alarm_code") uses a regex to enforce the E-NNN pattern — catches malformed codes the LLM might invent.
  • Field(ge=-40, le=500) on tag values catches physically impossible readings — a motor cannot be 5000°C.
  • Field(ge=0.0, le=1.0) on confidence prevents scores the LLM might express as percentages (85 instead of 0.85).
  • Field(ge=0) on downtime prevents negative values — a simple constraint that catches nonsensical recommendations.
  • Field(min_length=1) on evidence ensures the diagnosis always has at least one supporting fact.

Common mistake

Adding validators for business logic that changes frequently (e.g., 'severity must be CRITICAL if temperature > 85Β°C'). That logic belongs in your application code, not in the schema. Schemas validate structure and physical plausibility.
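A minimal sketch of where that kind of rule belongs instead: a plain application-layer policy function. The helper name `apply_escalation_policy` and the 85 °C threshold are illustrative, not part of the schema code above — the point is that the threshold can change in config without touching the Pydantic contract.

```python
# Hypothetical application-layer policy check. Business rules such as
# "escalate above 85 °C" live here, NOT in the schema validators.
from dataclasses import dataclass


@dataclass
class PolicyResult:
    escalate: bool
    reason: str


def apply_escalation_policy(motor_temp_c: float, severity: str) -> PolicyResult:
    """Escalate when temperature exceeds the policy threshold but the
    reported severity is not already CRITICAL. The threshold is ordinary
    application config, free to change without a schema version bump."""
    threshold_c = 85.0  # illustrative value; load from config in practice
    if motor_temp_c > threshold_c and severity != "CRITICAL":
        return PolicyResult(True, f"MotorTemp {motor_temp_c} °C > {threshold_c} °C")
    return PolicyResult(False, "within policy")


print(apply_escalation_policy(92.0, "HIGH"))
# → PolicyResult(escalate=True, reason='MotorTemp 92.0 °C > 85.0 °C')
```

The schema still guarantees the temperature is physically plausible; this function decides what to do about it.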

Takeaway

Validators do not make LLM output correct. They make incorrect output detectable before it reaches production systems.

7

EXPERIMENT CELL

Validation in the chain β€” handling failures from with_structured_output()

experiment

See how with_structured_output() handles validation failures and design a recovery strategy.

Python
from enum import Enum
from pydantic import BaseModel, Field, ValidationError, field_validator
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
import os
import logging
import re

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("fault_pipeline")

# Minimal self-contained schema setup for this cell
class Severity(str, Enum):
    CRITICAL = "CRITICAL"
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"

class AlarmEvent(BaseModel):
    alarm_code: str
    timestamp: str
    severity: Severity
    description: str

    @field_validator("alarm_code")
    @classmethod
    def validate_alarm_code(cls, v: str) -> str:
        if not re.match(r"^E-\d{3}$", v):
            raise ValueError(f"Invalid alarm code format: '{v}'. Expected E-NNN (e.g. E-421)")
        return v

class TagReading(BaseModel):
    tag_name: str
    value: float = Field(ge=-40, le=500)
    unit: str
    source: str

class DiagnosisResult(BaseModel):
    root_cause: str
    contributing_factors: list[str] = Field(default_factory=list)
    confidence: float = Field(ge=0.0, le=1.0)
    evidence: list[str] = Field(min_length=1)

class RecommendedAction(BaseModel):
    action: str
    priority: Severity
    requires_shutdown: bool
    estimated_downtime_min: int | None = Field(default=None, ge=0)

class FaultReport(BaseModel):
    line_id: str
    alarms: list[AlarmEvent]
    tag_readings: list[TagReading]
    diagnosis: DiagnosisResult
    recommendation: RecommendedAction

llm = ChatOpenAI(
    model="gpt-4o-mini", temperature=0, api_key=os.environ["OPENAI_API_KEY"]
)
structured_llm = llm.with_structured_output(FaultReport)

prompt = ChatPromptTemplate.from_messages([
    ("system", (
        "You are an industrial fault-triage assistant. Advisory only. "
        "Analyse all alarms and tags to produce a complete FaultReport."
    )),
    ("human", "{alarm_context}"),
])

chain = prompt | structured_llm

def safe_extract(alarm_context: str, max_retries: int = 2) -> FaultReport | None:
    """Extract a FaultReport with validation error handling and retry."""
    for attempt in range(max_retries + 1):
        try:
            report = chain.invoke({"alarm_context": alarm_context})
            # with_structured_output() returns a validated Pydantic instance
            # If we get here, validation passed
            return report

        except ValidationError as e:
            logger.warning(
                f"Attempt {attempt + 1}: Validation failed β€” "
                f"{len(e.errors())} error(s): {e.errors()[0]['msg']}"
            )
            if attempt == max_retries:
                logger.error("All retries exhausted. Returning None.")
                return None
            # Blind retries are not enough for persistent schema failures.
            # In production, prefer changing the prompt, relaxing/fixing the
            # schema, or switching to a fallback model/extractor.

        except Exception as e:
            logger.error(f"Unexpected error: {e}")
            return None

# --- Use the safe extractor ---
context = (
    "Filling Line 3, last hour:\n"
    "E-419 overcurrent at 01:55, E-421 overtemp at 02:10.\n"
    "Tags: MotorTemp=92Β°C, LoadCurrent=18A."
)

report = safe_extract(context)
if report:
    print(f"βœ… Extraction successful: {len(report.alarms)} alarms, "
          f"confidence {report.diagnosis.confidence:.0%}")
else:
    print("❌ Extraction failed after retries β€” log and escalate")
Expected output
βœ… Extraction successful: 2 alarms, confidence 85%

Explanation

  • with_structured_output() can still produce ValidationError if the model returns values outside your Field constraints.
  • The safe_extract pattern wraps extraction in a retry loop — same strategy as D2 error handling, applied to schema validation.
  • Logging each validation failure creates an audit trail: you can track which fields the LLM gets wrong most often.
  • Retries should be treated as a limited recovery tactic, not a primary fix for persistent schema mismatches.
  • Returning None on exhausted retries is a safe default — the caller decides whether to use a fallback or escalate.
  • In production, you would add metrics: validation_errors_total, retry_count, extraction_success_rate.
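The audit-trail idea can be sketched with a `Counter` keyed by the failing field's location. The bad payloads below are hand-made stand-ins for what a model might return; the `DiagnosisResult` schema matches the one defined earlier in this cell.

```python
# Sketch: aggregate ValidationError locations to see which fields fail
# most often. Payloads are illustrative stand-ins for bad LLM output.
from collections import Counter
from pydantic import BaseModel, Field, ValidationError


class DiagnosisResult(BaseModel):
    root_cause: str
    confidence: float = Field(ge=0.0, le=1.0)
    evidence: list[str] = Field(min_length=1)


field_error_counts: Counter[str] = Counter()

bad_payloads = [
    {"root_cause": "x", "confidence": 1.5, "evidence": ["e"]},  # confidence > 1
    {"root_cause": "x", "confidence": 0.9, "evidence": []},     # empty evidence
    {"root_cause": "x", "confidence": 85.0, "evidence": ["e"]}, # percentage slip
]

for payload in bad_payloads:
    try:
        DiagnosisResult(**payload)
    except ValidationError as e:
        for err in e.errors():
            # err["loc"] is a tuple path into the schema, e.g. ("confidence",)
            field_error_counts[".".join(str(p) for p in err["loc"])] += 1

print(field_error_counts.most_common())
# → [('confidence', 2), ('evidence', 1)]
```

In production the same counts would feed the metrics mentioned above (validation_errors_total and friends) instead of a print.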

Takeaway

Validation errors from with_structured_output() are a feature, not a bug. They prevent bad data from silently entering your pipeline. Design your chain to handle them gracefully.

6️⃣ YOUR SCHEMA BECOMES YOUR GRAPH STATE β€” BRIDGE TO D5

Every LangGraph StateGraph needs a typed state definition. That state is typically a TypedDict or BaseModel whose fields are read and written by graph nodes. The schemas you designed in this tutorial β€” FaultReport, AlarmEvent, TagReading β€” are exactly the kind of structured data that flows through a graph.

In D5, you will define an AgentState that includes FaultReport as a field. Each node will read from and write to this shared state. The validators you added in Section 5 will catch bad data at every node boundary. The transition from D4 to D5 is: your schemas become the shared state that the graph coordinates around.

D3 ended with: "Chains are linear. For branching, cycles, and shared state β€” you graduate to LangGraph." You now have the typed schemas that LangGraph state depends on. D5 adds the graph structure around them.

D4 β†’ D5 β€” SCHEMAS BECOME GRAPH STATE

graph LR
    D4[D4 Schemas<br/>FaultReport,<br/>AlarmEvent, etc.]:::purple
    AS[D5 AgentState<br/>TypedDict with<br/>FaultReport field]:::cyan
    N1[triage_node]:::green
    N2[diagnosis_node]:::green
    N3[report_node]:::green

    D4 -->|schemas become<br/>state fields| AS
    AS --> N1
    AS --> N2
    AS --> N3

    classDef cyan fill:#1a1a1e,stroke:#04d9ff,stroke-width:2px,color:#04d9ff;
    classDef purple fill:#1a1a1e,stroke:#9e4aff,stroke-width:2px,color:#9e4aff;
    classDef green fill:#1a1a1e,stroke:#00ff7f,stroke-width:2px,color:#00ff7f;

Each graph node reads and writes typed state. The schemas you designed in D4 define what that state looks like.

8

CONCEPT CELL

From FaultReport to AgentState β€” the LangGraph connection

concept

See how the nested schema you built becomes the typed state for a LangGraph StateGraph (preview, not full implementation β€” that is D5).

Python
from typing import TypedDict

# Assume FaultReport and sub-models defined as in Cells 3/6

# --- D5 preview: AgentState as TypedDict ---
# LangGraph state uses TypedDict because nodes need dict-like access:
#   state["report"], state["status"], etc.

class DiagnosisAgentState(TypedDict):
    """Shared state for a fault diagnosis graph (D5 preview)."""
    line_id: str                        # Which production line
    alarm_context: str                  # Raw alarm text from shift
    report: FaultReport | None          # ← Your D4 schema becomes a state field
    status: str                         # "pending" | "triaged" | "diagnosed" | "reported"
    node_history: list[str]             # Audit trail of which nodes ran

# --- What a node function looks like (preview) ---
# In D5, each node reads state and returns a partial update

def triage_node(state: DiagnosisAgentState) -> dict:
    """Triage node: extract a FaultReport from alarm context."""
    # In D5, this will use with_structured_output(FaultReport)
    # to populate state["report"]
    return {
        "report": "... FaultReport instance from with_structured_output() ...",
        "status": "triaged",
        "node_history": state["node_history"] + ["triage_node"],
    }

# --- The connection ---
print("DiagnosisAgentState fields:")
for field, type_hint in DiagnosisAgentState.__annotations__.items():
    print(f"  {field}: {type_hint}")

print("\nThe 'report' field is your FaultReport from D4.")
print("The triage_node populates it using with_structured_output().")
print("Other nodes (diagnosis, report) read and extend it.")
print("\n→ Full implementation in D5: LangGraph Fundamentals (StateGraph)")
Expected output
DiagnosisAgentState fields:
  line_id: <class 'str'>
  alarm_context: <class 'str'>
  report: FaultReport | None
  status: <class 'str'>
  node_history: list[str]

The 'report' field is your FaultReport from D4.
The triage_node populates it using with_structured_output().
Other nodes (diagnosis, report) read and extend it.

β†’ Full implementation in D5: LangGraph Fundamentals (StateGraph)

Explanation

  • DiagnosisAgentState is a TypedDict — a dict with typed annotations that LangGraph uses for shared state.
  • The report field is FaultReport | None — your D4 schema becomes a first-class state field.
  • Node functions take state as input and return a dict with partial updates — LangGraph merges them into the shared state.
  • node_history creates an audit trail — you can see which nodes ran, in what order, for any given alarm.
  • This is a preview. D5 builds the full StateGraph with conditional edges, checkpoints, and error recovery.

Takeaway

LangGraph state is just a typed container. If you can design a good schema (D4), you can design good graph state (D5). These skills transfer directly.
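The partial-update merge can be imitated in plain Python, with no LangGraph import. This is a simplified sketch of the default behaviour (dict-style overwrite merge); D5 covers what LangGraph actually does, including reducers that accumulate instead of overwrite.

```python
# Plain-Python sketch of partial-update merging (simplified assumption
# about LangGraph's default behaviour; D5 shows the real StateGraph).
from typing import TypedDict


class MiniState(TypedDict):
    status: str
    node_history: list[str]


def triage_node(state: MiniState) -> dict:
    # Each node returns only the keys it changed — a partial update.
    return {"status": "triaged",
            "node_history": state["node_history"] + ["triage_node"]}


def report_node(state: MiniState) -> dict:
    return {"status": "reported",
            "node_history": state["node_history"] + ["report_node"]}


state: MiniState = {"status": "pending", "node_history": []}
for node in (triage_node, report_node):
    state = {**state, **node(state)}  # the framework does this merge for you

print(state)
# → {'status': 'reported', 'node_history': ['triage_node', 'report_node']}
```

Note that each node appends to `node_history` rather than replacing it — that is the audit-trail pattern from the explanation above, done by hand here.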

9

CONCEPT CELL

Schema versioning β€” evolving contracts without breaking consumers

concept

Understand that schemas evolve and see the minimal pattern for backward-compatible changes.

Python
from pydantic import BaseModel, Field, ConfigDict, ValidationError

# --- Version 1.0: original FaultReport ---
class FaultReportV1(BaseModel):
    model_config = ConfigDict(extra="ignore")  # Forward-compatible: ignore unknown fields

    schema_version: str = "1.0"
    line_id: str
    root_cause: str
    confidence: float = Field(ge=0.0, le=1.0)

# --- Version 1.1: add an OPTIONAL field (backward-compatible) ---
class FaultReportV1_1(BaseModel):
    model_config = ConfigDict(extra="ignore")

    schema_version: str = "1.1"
    line_id: str
    root_cause: str
    confidence: float = Field(ge=0.0, le=1.0)
    operator_notes: str | None = None  # ← Optional: old data still validates

# --- Test backward compatibility ---
old_data = {
    "schema_version": "1.0",
    "line_id": "Filling Line 3",
    "root_cause": "Motor overtemperature from sustained overcurrent",
    "confidence": 0.82,
}

# V1.0 data validates against V1.1 schema (operator_notes defaults to None)
report = FaultReportV1_1(**old_data)
print(f"V1.0 data β†’ V1.1 schema: βœ… (operator_notes={report.operator_notes})")

# --- Version 2.0: add a REQUIRED field (breaking change) ---
class FaultReportV2(BaseModel):
    schema_version: str = "2.0"
    line_id: str
    root_cause: str
    confidence: float = Field(ge=0.0, le=1.0)
    shift_id: str  # ← Required: old data will fail validation

try:
    FaultReportV2(**old_data)
except ValidationError as e:
    print(f"V1.0 data β†’ V2.0 schema: ❌ {e.errors()[0]['msg']}")

print("\nRule: optional fields with defaults are safe additions.")
print("Required fields without defaults are breaking changes.")
Expected output
V1.0 data β†’ V1.1 schema: βœ… (operator_notes=None)
V1.0 data β†’ V2.0 schema: ❌ Field required

Rule: optional fields with defaults are safe additions.
Required fields without defaults are breaking changes.

Explanation

  • schema_version as a field lets consumers check which version they are handling.
  • ConfigDict(extra="ignore") makes the schema forward-compatible: it silently drops fields it does not recognize.
  • Adding an optional field (operator_notes: str | None = None) is backward-compatible: old data still validates.
  • Adding a required field (shift_id: str) is a breaking change: old data fails validation.
  • In production, breaking schema changes require coordination — like a PLC firmware upgrade that changes the data block layout.

Takeaway

Treat your Pydantic schema like a PLC data block version: add optional fields freely, but adding required fields is a firmware upgrade that requires coordination.

10

CHECKPOINT CELL

Full pipeline β€” nested schema + with_structured_output() + validation

checkpoint

Combine everything from D4 into a single end-to-end pipeline: nested validated schema extracted via with_structured_output(), with error handling.

Python
import re
from enum import Enum
from pydantic import BaseModel, Field, field_validator, ValidationError
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
import os

# --- Complete validated schema (Sections 3 + 5 combined) ---
class Severity(str, Enum):
    CRITICAL = "CRITICAL"
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"

class AlarmEvent(BaseModel):
    alarm_code: str = Field(description="Alarm code, e.g. E-421")
    timestamp: str = Field(description="When the alarm fired")
    severity: Severity
    description: str

    @field_validator("alarm_code")
    @classmethod
    def validate_alarm_code(cls, v: str) -> str:
        if not re.match(r"^E-\d{3}$", v):
            raise ValueError(f"Invalid alarm code: '{v}'. Expected E-NNN")
        return v

class TagReading(BaseModel):
    tag_name: str
    value: float = Field(ge=-40, le=500)
    unit: str
    source: str

class DiagnosisResult(BaseModel):
    root_cause: str
    contributing_factors: list[str] = Field(default_factory=list)
    confidence: float = Field(ge=0.0, le=1.0)
    evidence: list[str] = Field(min_length=1)

class RecommendedAction(BaseModel):
    action: str
    priority: Severity
    requires_shutdown: bool
    estimated_downtime_min: int | None = Field(default=None, ge=0)

class FaultReport(BaseModel):
    line_id: str
    alarms: list[AlarmEvent]
    tag_readings: list[TagReading]
    diagnosis: DiagnosisResult
    recommendation: RecommendedAction

# --- Pipeline ---
llm = ChatOpenAI(
    model="gpt-4o-mini", temperature=0, api_key=os.environ["OPENAI_API_KEY"]
)
structured_llm = llm.with_structured_output(FaultReport)

prompt = ChatPromptTemplate.from_messages([
    ("system", (
        "You are an industrial fault-triage assistant. Advisory only. "
        "Analyse all alarms and tags to produce a complete FaultReport."
    )),
    ("human", "{alarm_context}"),
])

chain = prompt | structured_llm

# --- Extract with error handling ---
alarm_context = (
    "Filling Line 3, last hour:\n"
    "Alarms:\n"
    "  E-419 overcurrent at 01:55 (LoadCurrent=18A, rated 15A)\n"
    "  E-421 overtemperature at 02:10 (MotorTemp=92Β°C)\n"
    "  E-422 vibration at 02:15 (BearingVib=4.2mm/s, threshold 3.5)\n"
    "Tags: MotorTemp=92Β°C, LoadCurrent=18A, AmbientTemp=28Β°C, BearingVib=4.2mm/s"
)

try:
    report = chain.invoke({"alarm_context": alarm_context})

    # --- Summary using nested access ---
    print(f"{'='*50}")
    print(f"FAULT REPORT β€” {report.line_id}")
    print(f"{'='*50}")
    print(f"\nAlarms ({len(report.alarms)}):")
    for a in report.alarms:
        print(f"  [{a.severity.value:>8}] {a.alarm_code} at {a.timestamp}")
    print(f"\nTag readings ({len(report.tag_readings)}):")
    for t in report.tag_readings:
        print(f"  {t.tag_name:>15} = {t.value:>6.1f} {t.unit}")
    print(f"\nDiagnosis (confidence: {report.diagnosis.confidence:.0%}):")
    print(f"  {report.diagnosis.root_cause}")
    print(f"  Evidence: {len(report.diagnosis.evidence)} items")
    print(f"\nRecommendation [{report.recommendation.priority.value}]:")
    print(f"  {report.recommendation.action}")
    print(f"  Shutdown: {'YES' if report.recommendation.requires_shutdown else 'No'}")
    print(f"\n{'='*50}")
    print(f"βœ… Schema validated. Ready for D5 StateGraph.")

except ValidationError as e:
    print(f"❌ Validation failed: {len(e.errors())} error(s)")
    for err in e.errors():
        print(f"   {err['loc']}: {err['msg']}")
except Exception as e:
    print(f"❌ Pipeline error: {e}")
Expected output
==================================================
FAULT REPORT β€” Filling Line 3
==================================================

Alarms (3):
  [    HIGH] E-419 at 01:55
  [    HIGH] E-421 at 02:10
  [  MEDIUM] E-422 at 02:15

Tag readings (4):
      MotorTemp =   92.0 Β°C
    LoadCurrent =   18.0 A
    AmbientTemp =   28.0 Β°C
     BearingVib =    4.2 mm/s

Diagnosis (confidence: 82%):
  Sustained overcurrent caused motor overtemperature, with bearing
  vibration suggesting mechanical degradation contributing to overload.
  Evidence: 4 items

Recommendation [HIGH]:
  Reduce line throughput, inspect motor bearings and cooling system,
  schedule maintenance before next shift start.
  Shutdown: No

==================================================
βœ… Schema validated. Ready for D5 StateGraph.

Explanation

  • This cell combines everything from D4: nested schemas (Section 3), field validators (Section 5), with_structured_output() (Section 2), and error handling (Cell 7).
  • All 3 alarms are captured with individual timestamps and severities — no data loss from Cell 1.
  • Tag readings carry typed values that enable threshold checks from Cell 4.
  • The diagnosis includes structured evidence and a bounded confidence score.
  • The try/except catches both ValidationError (schema violation) and general exceptions (API failure).
  • This pipeline is the foundation for D5: the FaultReport becomes a field in the LangGraph AgentState.

Takeaway

This pipeline is the contract between your LLM and every downstream system. The schema defines the interface. The validators guard the boundaries. The graph (D5) routes the data.

βœ… KEY TAKEAWAYS

  • βœ… with_structured_output() combines native JSON mode with LangChain composability β€” use it instead of PydanticOutputParser for new code.
  • βœ… Nested BaseModel schemas preserve relational structure that flat schemas destroy β€” design schemas like PLC UDTs, not flat variable lists.
  • βœ… BaseModel guards trust boundaries (LLM output). @dataclass carries internal state. TypedDict labels LangGraph state.
  • βœ… Field validators catch physically impossible values before they reach production β€” they make correctness checkable, not guaranteed.
  • βœ… The schemas you design here become the typed contracts that LangGraph state, tool returns, and inter-agent messages depend on.

πŸ”œ NEXT TUTORIAL

D5 β€” LangGraph Fundamentals (StateGraph)

Take the schemas you designed here and embed them in a fault-tolerant StateGraph with conditional edges, shared state, and checkpoint-based recovery.