Tutorial #9: Structured Outputs for PLC Data Extraction
Deterministic, Machine-Readable JSON
CORE MISSION OF THIS TUTORIAL
By the end of this tutorial, the reader will be able to:
- Understand why free-text AI outputs are unreliable for systems
- Design strict output schemas for PLC analysis
- Force AI agents to return machine-readable JSON
- Validate and parse structured outputs safely
- Prepare agents for integration into larger systems
This tutorial marks the transition from analysis for humans to data for systems.
SAFETY BOUNDARY REMINDER
The code in this tutorial performs analysis only.
It must never be connected to:
- Live PLCs
- Production deployment pipelines
- Safety-rated controllers
- Motion or power systems
> All outputs are advisory-only and always require explicit human approval before any real-world action.
VENDOR-AGNOSTIC ENGINEERING NOTE
This tutorial uses:
- Generic IEC 61131-3 Structured Text (ST)
- Python-only processing
- JSON schemas
- No PLC runtimes or SDKs
The approach applies to any IEC 61131-3-compliant environment.
1. WHY FREE-TEXT IS NOT ENOUGH
Humans like prose.
Systems do not.
Free-text AI outputs:
- Are ambiguous: wording varies unpredictably
- Change unpredictably: there is no stable format
- Are hard to parse: there is no structured extraction
- Break downstream logic: processing cannot be automated
For agents to cooperate with software, structure is mandatory.
2. DEFINING A STRUCTURED EXTRACTION TARGET
We want to extract the following from PLC logic:
- Inputs: button signals, sensors
- Outputs: motor states, actuators
- Internal variables: state flags, counters
- Detected behaviors: logic patterns identified
- Potential issues: edge cases, warnings
The result must be JSON with a strict, fixed structure.
3. REFERENCE PLC LOGIC & OUTPUT SCHEMA
Reference PLC Logic
IF StartButton THEN
    MotorRunning := TRUE;
END_IF;
IF StopButton THEN
    MotorRunning := FALSE;
END_IF;
Target Output Schema
{
  "inputs": [],
  "outputs": [],
  "internal_variables": [],
  "behaviors": [],
  "notes": []
}
Anything outside this schema is invalid.
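One way to make the "anything outside this schema is invalid" rule concrete in Python is to mirror the schema as a frozen dataclass. The sketch below is illustrative only: the class name `PlcExtraction` and its `from_dict` helper are hypothetical and not part of the tutorial's code. It relies on the fact that a dataclass constructor raises `TypeError` when a keyword argument is missing or unexpected; it does not check the types of the values.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PlcExtraction:
    """Hypothetical container mirroring the five-key target schema."""
    inputs: list
    outputs: list
    internal_variables: list
    behaviors: list
    notes: list

    @classmethod
    def from_dict(cls, data: dict) -> "PlcExtraction":
        # cls(**data) raises TypeError on a missing or extra key,
        # which is the "outside the schema" case.
        return cls(**data)


ok = PlcExtraction.from_dict({
    "inputs": ["StartButton", "StopButton"],
    "outputs": ["MotorRunning"],
    "internal_variables": [],
    "behaviors": [],
    "notes": [],
})
print([f.name for f in fields(ok)])

try:
    # Missing four keys and carrying one unknown key: rejected.
    PlcExtraction.from_dict({"inputs": [], "summary": "free text"})
except TypeError as exc:
    print("rejected:", exc)
```

Because the instance is frozen, downstream code cannot silently add fields after validation.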
Structured Output Flow
graph LR
    A[PLC Code] --> B[Agent Analysis]
    B --> C[JSON Schema]
    C --> D[Validation]
    D --> E[Structured Data]
    E --> F[Downstream Systems]
Validation ensures reliability before downstream processing.
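The flow above can be sketched as a small Python pipeline. To keep the example runnable offline, the agent step is replaced by a stub (`fake_agent`, a hypothetical name); in the experiments that follow, that step is the model call. The function name `run_pipeline` is likewise an assumption made for illustration.

```python
import json

REQUIRED_KEYS = {"inputs", "outputs", "internal_variables", "behaviors", "notes"}


def fake_agent(plc_code: str) -> str:
    """Stand-in for the agent analysis step; returns a canned JSON string."""
    return json.dumps({
        "inputs": ["StartButton", "StopButton"],
        "outputs": ["MotorRunning"],
        "internal_variables": [],
        "behaviors": ["MotorRunning set TRUE when StartButton is TRUE"],
        "notes": [],
    })


def run_pipeline(plc_code: str, agent=fake_agent):
    """PLC code -> agent -> JSON parse -> validation -> structured data or None."""
    raw = agent(plc_code)
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # not JSON: stop before anything downstream sees it
    if not isinstance(data, dict) or set(data) != REQUIRED_KEYS:
        return None  # wrong shape: also rejected
    return data


result = run_pipeline("IF StartButton THEN MotorRunning := TRUE; END_IF;")
print(result["outputs"])

# A free-text reply never reaches the downstream stage.
print(run_pipeline("...", agent=lambda code: "The motor starts when..."))
```

Returning `None` on rejection keeps the hand-off explicit: downstream code receives either validated data or nothing.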
4. PRACTICAL EXPERIMENTS
Experiment 1: Unstructured vs Structured Output
Objective
Observe the difference between free-text and structured output.
Python Code
from openai import OpenAI
import json

client = OpenAI()  # reads OPENAI_API_KEY from the environment

code = """
IF StartButton THEN
    MotorRunning := TRUE;
END_IF;
IF StopButton THEN
    MotorRunning := FALSE;
END_IF;
"""

# --- Unstructured prompt ---
prompt_unstructured = f"""
Analyze the following PLC logic and describe what it does:
{code}
"""

resp_text = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": prompt_unstructured}
    ]
)
print("UNSTRUCTURED OUTPUT:\n")
print(resp_text.choices[0].message.content)

# --- Structured prompt ---
# Doubled braces ({{ and }}) produce literal braces inside an f-string.
prompt_structured = f"""
You are a PLC analysis agent.
Return ONLY valid JSON matching this schema exactly:
{{
  "inputs": [],
  "outputs": [],
  "internal_variables": [],
  "behaviors": [],
  "notes": []
}}
PLC LOGIC:
{code}
"""

resp_json = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": prompt_structured}
    ]
)
# json.loads raises json.JSONDecodeError if the reply is not pure JSON.
parsed = json.loads(resp_json.choices[0].message.content)
print("\nSTRUCTURED OUTPUT:\n")
print(json.dumps(parsed, indent=2))
Expected Output
UNSTRUCTURED OUTPUT:
This PLC logic implements a basic start/stop control for a motor...
(output varies with each run)
STRUCTURED OUTPUT:
{
  "inputs": ["StartButton", "StopButton"],
  "outputs": ["MotorRunning"],
  "internal_variables": [],
  "behaviors": [
    "MotorRunning set TRUE when StartButton is TRUE",
    "MotorRunning set FALSE when StopButton is TRUE"
  ],
  "notes": [
    "No fault handling present",
    "MotorRunning retains last state when no buttons are pressed"
  ]
}
Interpretation
- Free text is expressive but unstable
- Structured output is predictable and automatable
- JSON enables reliable parsing
- Cost: ~$0.03 | Runtime: 2-3 seconds
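In practice, a model asked for "ONLY valid JSON" may still wrap its reply in a Markdown code fence or add a sentence around it, in which case the bare `json.loads` call in Experiment 1 raises an exception. The helper below is a minimal, hypothetical sketch (the name `extract_json_object` is not from the tutorial) that pulls out the outermost `{...}` span before parsing. It is a heuristic rather than a guarantee, and the result still needs schema validation afterwards.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the listing readable


def extract_json_object(text: str) -> dict:
    """Parse the outermost {...} span in text; raise ValueError if that fails."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in response")
    try:
        return json.loads(text[start:end + 1])
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc


# Simulated reply: prose, then a fenced JSON object.
wrapped = "Here is the result:\n" + FENCE + 'json\n{"inputs": ["StartButton"]}\n' + FENCE
print(extract_json_object(wrapped))
```

Raising a single exception type (`ValueError`) for both failure modes lets the caller handle "no usable JSON" in one place.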
Experiment 2: Enforcing Schema Compliance
Objective
Reject invalid outputs and ensure reliability.
Python Code
from openai import OpenAI
import json

client = OpenAI()

def validate_schema(data) -> bool:
    """Return True only if data is a dict with exactly the expected keys."""
    required_keys = {
        "inputs",
        "outputs",
        "internal_variables",
        "behaviors",
        "notes"
    }
    # A non-dict (for example a JSON list) is rejected rather than raising.
    return isinstance(data, dict) and set(data.keys()) == required_keys

code = """
IF StartButton THEN
    MotorRunning := TRUE;
END_IF;
IF StopButton THEN
    MotorRunning := FALSE;
END_IF;
"""

prompt_structured = f"""
You are a PLC analysis agent.
Return ONLY valid JSON matching this schema exactly:
{{
  "inputs": [],
  "outputs": [],
  "internal_variables": [],
  "behaviors": [],
  "notes": []
}}
PLC LOGIC:
{code}
"""

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": prompt_structured}
    ]
)

try:
    data = json.loads(response.choices[0].message.content)
    if validate_schema(data):
        print("Schema valid:", json.dumps(data, indent=2))
    else:
        print("Schema violation: missing or extra keys")
except json.JSONDecodeError as e:
    print("Invalid JSON:", e)
except Exception as e:
    print("Validation error:", e)
Expected Output
Schema valid: {
  "inputs": ["StartButton", "StopButton"],
  "outputs": ["MotorRunning"],
  "internal_variables": [],
  "behaviors": [
    "MotorRunning set TRUE when StartButton is TRUE",
    "MotorRunning set FALSE when StopButton is TRUE"
  ],
  "notes": [
    "No fault handling present"
  ]
}
Interpretation
- AI outputs must be validated like any other input
- Never trust raw responses
- Structured output enables safe composition
- Validation catches malformed data
- Cost: ~$0.02 | Runtime: <2 seconds
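The `validate_schema` check above compares key names only, so a response such as `{"inputs": "StartButton", ...}` would still pass. A possible extension is sketched below, under the assumption that every field should be a list of strings (as in the expected output). It returns a list of human-readable problems instead of a bare boolean. The function name `schema_errors` is hypothetical.

```python
REQUIRED_KEYS = ("inputs", "outputs", "internal_variables", "behaviors", "notes")


def schema_errors(data) -> list:
    """Return a list of problems; an empty list means the data passed."""
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    errors = []
    for key in REQUIRED_KEYS:
        if key not in data:
            errors.append(f"missing key: {key}")
        elif not isinstance(data[key], list):
            errors.append(f"{key} is not a list")
        elif not all(isinstance(item, str) for item in data[key]):
            errors.append(f"{key} contains non-string items")
    # Keys the schema does not define are reported as well.
    for key in sorted(set(data) - set(REQUIRED_KEYS)):
        errors.append(f"unexpected key: {key}")
    return errors


problems = schema_errors({
    "inputs": "StartButton",      # wrong type: string instead of list
    "outputs": ["MotorRunning"],
    "behaviors": [],
    "notes": [],
    "summary": "extra field",     # not part of the schema
})
print(problems)
```

Returning the individual problems makes rejected responses easier to log and, if desired, to feed back into a retry prompt.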
EXPLICIT OPERATIONAL PROHIBITIONS
Never:
- Use structured output to drive control logic directly
- Skip validation of schema compliance
- Allow partial schemas or missing fields
- Treat AI output as ground truth without verification
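One inexpensive form of verification is to confirm that every variable name the agent reports actually occurs in the PLC source. The sketch below is an illustration under stated assumptions: the helper name `unknown_identifiers` is hypothetical, the regular expression only approximates IEC 61131-3 identifiers, and the input is assumed to have already passed schema validation. It catches invented names, not incorrect classifications.

```python
import re

ST_CODE = """
IF StartButton THEN
    MotorRunning := TRUE;
END_IF;
IF StopButton THEN
    MotorRunning := FALSE;
END_IF;
"""


def unknown_identifiers(data: dict, plc_code: str) -> list:
    """List reported names that never appear as identifiers in the source."""
    # IEC 61131-3 identifiers are case-insensitive, so compare in upper case.
    present = {tok.upper() for tok in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", plc_code)}
    reported = []
    for key in ("inputs", "outputs", "internal_variables"):
        reported.extend(data.get(key, []))
    return [name for name in reported if name.upper() not in present]


claimed = {
    "inputs": ["StartButton", "StopButton", "EmergencyStop"],
    "outputs": ["MotorRunning"],
    "internal_variables": [],
}
print(unknown_identifiers(claimed, ST_CODE))
```

A non-empty result is a reason to reject the response or route it to a human reviewer.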
KEY TAKEAWAYS
- Structured output is mandatory for system integration
- JSON schemas act as contracts between agent and system
- Validation is non-negotiable
- This enables multi-agent and pipeline designs
NEXT TUTORIAL
#10: Fault Diagnosis Agents from Clean Alarm Logs (Capstone)
Combine reasoning + tools + structure into a complete advisory agent.
ENGINEERING POSTURE
This tutorial enforced:
- Determinism over creativity
- Validation over trust
- Data contracts over prose
- Systems thinking over demos