🟢 TECHNICIAN TRACK • BEGINNER

Tutorial #9: Structured Outputs for PLC Data Extraction

Deterministic, Machine-Readable JSON

✅ CORE MISSION OF THIS TUTORIAL

By the end of this tutorial, the reader will be able to:

✅ Understand why free-text AI outputs are unreliable as input to software systems
✅ Design strict output schemas for PLC analysis
✅ Force AI agents to return machine-readable JSON
✅ Validate and parse structured outputs safely
✅ Prepare agents for integration into larger systems

This tutorial marks the transition from analysis for humans to data for systems.

⚠️ SAFETY BOUNDARY REMINDER

The code in this tutorial performs analysis only.

It must never be connected to:

Live PLCs
Production deployment pipelines
Safety-rated controllers
Motion or power systems

> All outputs are advisory-only and always require explicit human approval before any real-world action.

🌍 VENDOR-AGNOSTIC ENGINEERING NOTE

This tutorial uses:

▸ Generic IEC 61131-3 Structured Text (ST)
▸ Python-only processing
▸ JSON schemas
▸ No PLC runtimes or SDKs

Applicable to all IEC-compliant environments.

1️⃣ WHY FREE-TEXT IS NOT ENOUGH

Humans like prose.
Systems do not.

Free-Text AI Outputs:

❌ Are Ambiguous

Wording varies unpredictably

❌ Change Unpredictably

No stable format

❌ Are Hard to Parse

No fixed fields to extract

❌ Break Downstream Logic

Can't automate processing

For agents to cooperate with software, structure is mandatory.
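
To make the parsing problem concrete, here is a minimal sketch. The two free-text answers and the helper `inputs_from_text` are hypothetical, invented for illustration; they are not real model outputs. The point is only that a pattern written for one phrasing silently returns nothing for another, while the JSON form parses the same way every time.

```python
import json
import re

# Two hypothetical free-text answers describing the same logic.
answer_a = "The inputs are StartButton and StopButton."
answer_b = "This logic reads two buttons: StartButton, then StopButton."

def inputs_from_text(text: str) -> list:
    """Brittle extraction: only works for one exact phrasing."""
    match = re.search(r"The inputs are (\w+) and (\w+)\.", text)
    return list(match.groups()) if match else []

# The same facts expressed as JSON need no phrasing-specific pattern.
answer_json = '{"inputs": ["StartButton", "StopButton"]}'

print(inputs_from_text(answer_a))         # ['StartButton', 'StopButton']
print(inputs_from_text(answer_b))         # []
print(json.loads(answer_json)["inputs"])  # ['StartButton', 'StopButton']
```

The empty list for the second answer is the dangerous case: nothing crashes, the data is simply lost.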

2️⃣ DEFINING A STRUCTURED EXTRACTION TARGET

We want to extract the following from PLC logic:

📥 Inputs

Button signals, sensors

📤 Outputs

Motor states, actuators

🔄 Internal Variables

State flags, counters

🔍 Detected Behaviors

Logic patterns identified

⚠️ Potential Issues

Edge cases, warnings (stored under the "notes" key in the schema)

✅ Must Be JSON

Strict structure required

3️⃣ REFERENCE PLC LOGIC & OUTPUT SCHEMA

Reference PLC Logic

Structured Text (IEC 61131-3)

IF StartButton THEN
    MotorRunning := TRUE;
END_IF;

IF StopButton THEN
    MotorRunning := FALSE;
END_IF;

Target Output Schema

JSON

{
  "inputs": [],
  "outputs": [],
  "internal_variables": [],
  "behaviors": [],
  "notes": []
}

Any response with missing keys, extra keys, or text outside the JSON object is invalid.
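
One possible way to carry this schema into Python is a small typed container that refuses anything outside the five keys. This is a sketch, not part of the tutorial's experiments: the class name `PlcExtraction` and its `from_dict` method are assumptions chosen for illustration.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class PlcExtraction:
    """Typed mirror of the target schema (hypothetical class name)."""
    inputs: list
    outputs: list
    internal_variables: list
    behaviors: list
    notes: list

    @classmethod
    def from_dict(cls, data: object) -> "PlcExtraction":
        """Build an instance, rejecting missing or unexpected keys."""
        if not isinstance(data, dict):
            raise ValueError("top-level JSON value must be an object")
        expected = {f.name for f in fields(cls)}
        missing = expected - set(data)
        extra = set(data) - expected
        if missing or extra:
            raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
        return cls(**data)

record = PlcExtraction.from_dict({
    "inputs": ["StartButton", "StopButton"],
    "outputs": ["MotorRunning"],
    "internal_variables": [],
    "behaviors": [],
    "notes": [],
})
print(record.outputs)  # ['MotorRunning']
```

Downstream code can then rely on attribute access such as `record.inputs` instead of re-checking dictionary keys at every use.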

Structured Output Flow

Mermaid

graph LR
    A[PLC Code] --> B[Agent Analysis]
    B --> C[JSON Schema]
    C --> D[Validation]
    D --> E[Structured Data]
    E --> F[Downstream Systems]

    style A fill:#1a1a1e,stroke:#04d9ff,stroke-width:2px,color:#fff
    style B fill:#1a1a1e,stroke:#9e4aff,stroke-width:2px,color:#fff
    style C fill:#1a1a1e,stroke:#9e4aff,stroke-width:2px,color:#fff
    style D fill:#1a1a1e,stroke:#fec20b,stroke-width:2px,color:#fec20b
    style E fill:#1a1a1e,stroke:#00ff7f,stroke-width:2px,color:#00ff7f
    style F fill:#1a1a1e,stroke:#04d9ff,stroke-width:2px,color:#fff

Validation catches malformed output before it reaches downstream processing.
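
The flow above can be sketched offline in a few lines. To keep the example runnable without an API key, the model call is replaced by `fake_agent`, a hypothetical stand-in that returns a canned JSON string; `run_pipeline` and `REQUIRED_KEYS` are likewise illustrative names, not part of any library.

```python
import json
from typing import Callable

REQUIRED_KEYS = {"inputs", "outputs", "internal_variables", "behaviors", "notes"}

def fake_agent(plc_code: str) -> str:
    """Stand-in for the model call; returns a canned JSON string."""
    return json.dumps({
        "inputs": ["StartButton", "StopButton"],
        "outputs": ["MotorRunning"],
        "internal_variables": [],
        "behaviors": [],
        "notes": [],
    })

def run_pipeline(plc_code: str, agent: Callable[[str], str]) -> dict:
    raw = agent(plc_code)    # PLC Code -> Agent Analysis
    data = json.loads(raw)   # raw text -> JSON value
    # Validation gate: nothing passes downstream unless the keys match exactly.
    if not isinstance(data, dict) or set(data) != REQUIRED_KEYS:
        raise ValueError("agent output does not match the schema")
    return data              # Structured Data for downstream systems

result = run_pipeline("IF StartButton THEN MotorRunning := TRUE; END_IF;", fake_agent)
print(result["outputs"])  # ['MotorRunning']
```

Passing the agent in as a callable keeps the validation gate testable on its own, independent of any model or network.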

4️⃣ PRACTICAL EXPERIMENTS

🧪 Experiment 1: Unstructured vs Structured Output

Objective

Observe the difference between free-text and structured output.

Python Code

Python

from openai import OpenAI
import json

client = OpenAI()

code = """
IF StartButton THEN
    MotorRunning := TRUE;
END_IF;

IF StopButton THEN
    MotorRunning := FALSE;
END_IF;
"""

# --- Unstructured prompt ---

prompt_unstructured = f"""
Analyze the following PLC logic and describe what it does:

{code}
"""

resp_text = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": prompt_unstructured}
    ]
)

print("UNSTRUCTURED OUTPUT:\n")
print(resp_text.choices[0].message.content)

# --- Structured prompt ---

prompt_structured = f"""
You are a PLC analysis agent.

Return ONLY valid JSON matching this schema exactly:

{{
  "inputs": [],
  "outputs": [],
  "internal_variables": [],
  "behaviors": [],
  "notes": []
}}

PLC LOGIC:
{code}
"""

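resp_json = client.chat.completions.create(
    model="gpt-4o-mini",
    # JSON mode asks the API to return a syntactically valid JSON object
    response_format={"type": "json_object"},
    messages=[
        {"role": "user", "content": prompt_structured}
    ]
)

print("\nSTRUCTURED OUTPUT:\n")
try:
    parsed = json.loads(resp_json.choices[0].message.content)
    print(json.dumps(parsed, indent=2))
except json.JSONDecodeError as e:
    # A reply that is not parseable JSON is reported instead of crashing
    print("❌ Invalid JSON:", e)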

Expected Output

UNSTRUCTURED OUTPUT:

This PLC logic implements a basic start/stop control for a motor...
(output varies with each run)

STRUCTURED OUTPUT:

{
  "inputs": ["StartButton", "StopButton"],
  "outputs": ["MotorRunning"],
  "internal_variables": [],
  "behaviors": [
    "MotorRunning set TRUE when StartButton is TRUE",
    "MotorRunning set FALSE when StopButton is TRUE"
  ],
  "notes": [
    "No fault handling present",
    "MotorRunning retains last state when no buttons are pressed"
  ]
}

Interpretation

▸ ❌ Free text is expressive but unstable
▸ ✅ Structured output is predictable and automatable
▸ ✅ JSON enables reliable parsing
▸ Cost: ~$0.03 | Runtime: 2-3 seconds (approximate; both depend on current model pricing and network latency)
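
One practical wrinkle when relying on prompt instructions alone: a model may wrap its JSON in a Markdown code fence, which makes `json.loads` fail on the raw text. The helper below is a minimal sketch for that case; `extract_json_text` is a hypothetical name, and the fence string is built from backtick characters only so the example itself stays readable.

```python
import json
import re

FENCE = "`" * 3  # three backticks, the Markdown code-fence marker

# Matches an optional "json" tag after the opening fence, then the payload.
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)

def extract_json_text(raw: str) -> str:
    """Return the payload inside a code fence, or the stripped input if none."""
    match = FENCED_BLOCK.search(raw)
    return match.group(1) if match else raw.strip()

fenced_reply = FENCE + 'json\n{"inputs": ["StartButton"]}\n' + FENCE
plain_reply = '  {"inputs": ["StartButton"]}\n'

print(json.loads(extract_json_text(fenced_reply)))  # {'inputs': ['StartButton']}
print(json.loads(extract_json_text(plain_reply)))   # {'inputs': ['StartButton']}
```

This only unwraps the text; the result still has to pass JSON parsing and schema validation before use.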

🧪 Experiment 2: Enforcing Schema Compliance

Objective

Reject invalid outputs and ensure reliability.

Python Code

Python

from openai import OpenAI
import json

client = OpenAI()

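def validate_schema(data: object) -> bool:
    """Return True only if data is a JSON object with exactly the schema's keys."""
    required_keys = {
        "inputs",
        "outputs",
        "internal_variables",
        "behaviors",
        "notes"
    }
    # json.loads can also return a list, string, or number, so check the type first
    return isinstance(data, dict) and set(data.keys()) == required_keys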

code = """
IF StartButton THEN
    MotorRunning := TRUE;
END_IF;

IF StopButton THEN
    MotorRunning := FALSE;
END_IF;
"""

prompt_structured = f"""
You are a PLC analysis agent.

Return ONLY valid JSON matching this schema exactly:

{{
  "inputs": [],
  "outputs": [],
  "internal_variables": [],
  "behaviors": [],
  "notes": []
}}

PLC LOGIC:
{code}
"""

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": prompt_structured}
    ]
)

try:
    data = json.loads(response.choices[0].message.content)
    if validate_schema(data):
        print("✅ Schema valid:", json.dumps(data, indent=2))
    else:
        print("❌ Schema violation: Missing or extra keys")
except json.JSONDecodeError as e:
    print("❌ Invalid JSON:", e)
except Exception as e:
    print("❌ Validation error:", e)

Expected Output

✅ Schema valid: {
  "inputs": ["StartButton", "StopButton"],
  "outputs": ["MotorRunning"],
  "internal_variables": [],
  "behaviors": [
    "MotorRunning set TRUE when StartButton is TRUE",
    "MotorRunning set FALSE when StopButton is TRUE"
  ],
  "notes": [
    "No fault handling present"
  ]
}

Interpretation

▸ ✅ AI outputs must be validated like any other input
▸ ✅ Never trust raw responses
▸ ✅ Structured output enables safe composition
▸ ✅ Validation catches malformed data
▸ Cost: ~$0.02 | Runtime: <2 seconds (approximate; both depend on current model pricing and network latency)
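
Checking key names alone still lets through values of the wrong type, for example `"inputs": "StartButton"` instead of a list. A stricter check could look like the sketch below. It assumes every field should be a list of strings, which matches the expected outputs shown in this tutorial but is not stated as a rule anywhere; `find_schema_errors` is a hypothetical helper name.

```python
REQUIRED_KEYS = ("inputs", "outputs", "internal_variables", "behaviors", "notes")

def find_schema_errors(data: object) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    errors = []
    for key in REQUIRED_KEYS:
        if key not in data:
            errors.append(f"missing key: {key}")
        elif not isinstance(data[key], list):
            errors.append(f"{key} must be a list")
        elif not all(isinstance(item, str) for item in data[key]):
            errors.append(f"{key} must contain only strings")
    for key in data:
        if key not in REQUIRED_KEYS:
            errors.append(f"unexpected key: {key}")
    return errors

bad = {
    "inputs": "StartButton",   # string instead of list
    "outputs": [1],            # list, but not of strings
    "internal_variables": [],
    "behaviors": [],
    "extra": [],               # not in the schema; "notes" is missing
}
for problem in find_schema_errors(bad):
    print(problem)
```

Returning every problem at once, rather than a single boolean, makes rejected outputs easier to log and review.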

🔒 EXPLICIT OPERATIONAL PROHIBITIONS

❌ Never:

❌ Use structured output to drive control logic directly
❌ Skip validation of schema compliance
❌ Allow partial schemas or missing fields
❌ Treat AI output as ground truth without verification

✅ KEY TAKEAWAYS

✅ Structured output is mandatory for system integration
✅ JSON schemas act as contracts between agent and system
✅ Validation is non-negotiable
✅ This enables multi-agent and pipeline designs
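
One way to act on these takeaways is to treat a failed validation as a reason to ask again a bounded number of times, and then give up explicitly rather than pass bad data along. The sketch below assumes that policy; `get_valid_output` and the stubbed replies are hypothetical, and the agent is any zero-argument callable returning the model's raw text.

```python
import json
from typing import Callable, Optional

REQUIRED_KEYS = {"inputs", "outputs", "internal_variables", "behaviors", "notes"}

def get_valid_output(agent: Callable[[], str], max_attempts: int = 3) -> Optional[dict]:
    """Call the agent until its output validates, or return None after max_attempts."""
    for _ in range(max_attempts):
        raw = agent()
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not JSON at all: try again
        if isinstance(data, dict) and set(data) == REQUIRED_KEYS:
            return data
    return None  # caller must handle the failure; nothing unvalidated leaks out

# Stub agent: the first reply is prose, the second is schema-compliant JSON.
replies = iter([
    "Sure! Here is the analysis...",
    json.dumps({key: [] for key in REQUIRED_KEYS}),
])
result = get_valid_output(lambda: next(replies))
print(result is not None)  # True
```

A `None` result should be surfaced to a human reviewer, in keeping with the advisory-only boundary of this tutorial.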

🔜 NEXT TUTORIAL

#10 — Fault Diagnosis Agents from Clean Alarm Logs (Capstone)

Combine reasoning + tools + structure into a complete advisory agent.

🧭 ENGINEERING POSTURE

This tutorial enforced:

▸ Determinism over creativity
▸ Validation over trust
▸ Data contracts over prose
▸ Systems thinking over demos