Tutorial #7: Chain-of-Thought for Logic Review
Transparent Step-by-Step Reasoning
🎯 CORE MISSION OF THIS TUTORIAL
By the end of this tutorial, the reader will be able to:
- ✅ Understand what chain-of-thought (CoT) means in practical engineering terms
- ✅ Ask AI to explain its reasoning, not just give a verdict
- ✅ Use structured reasoning formats for PLC logic review
- ✅ Spot subtle logic issues more easily by making assumptions explicit
- ✅ Prepare for future agent behavior that is transparent and auditable
This tutorial focuses on logic review, not safety certification.
⚠️ SAFETY BOUNDARY REMINDER
This tutorial performs analysis only.
It must never be connected to:
- Live PLCs
- Production deployment pipelines
- Safety-rated controllers
- Motion or power systems
> All outputs are advisory-only and always require explicit human approval before any real-world action.
📋 VENDOR-AGNOSTIC ENGINEERING NOTE
This tutorial uses:
- ▸ Generic IEC 61131-3 Structured Text (ST)

The concepts apply equally to:
- ▸ TwinCAT, Siemens TIA Portal, CODESYS
- ▸ Allen-Bradley ST
- ▸ Any IEC-based runtime
No vendor-specific libraries, no runtime access, no PLC connections.
1️⃣ WHAT IS CHAIN-OF-THOUGHT IN ENGINEERING TERMS?
In this context:
Practical "chain-of-thought" = a structured rationale (assumptions + checks) that explains how a conclusion was reached.
❌ Without Chain-of-Thought

"The logic matches the specification."

Not auditable, hard to verify

✅ With Chain-of-Thought

Step 1: When StartButton is TRUE, MotorRunning is set to TRUE.
Step 2: When StopButton is TRUE, MotorRunning is set to FALSE.
Step 3: Therefore the logic behaves as specified.

Transparent, inspectable reasoning
Chain-of-thought does not make the model smarter by itself.
It makes parts of the model's rationale inspectable by you.
Benefits for PLC Engineers
🔹 See Where AI Might Be Wrong
Inspect each reasoning step
🔹 Challenge Assumptions
Correct flawed logic immediately
🔹 Reuse Reasoning Patterns
Apply across similar reviews
🔹 Build Trust Through Transparency
Not blind faith
2️⃣ REFERENCE SCENARIO – REVIEWING SIMPLE MOTOR LOGIC
We reuse a familiar IEC ST snippet:
IF StartButton THEN
    MotorRunning := TRUE;
END_IF;

IF StopButton THEN
    MotorRunning := FALSE;
END_IF;

Specification (Informal)
- ▸ When StartButton is TRUE → MotorRunning should become TRUE
- ▸ When StopButton is TRUE → MotorRunning should become FALSE
- ▸ When neither button is pressed → MotorRunning should keep its last state
We want the AI to:
- ▸ Explain whether the logic matches the spec
- ▸ Show intermediate reasoning steps
- ▸ Output a clear VERDICT
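Before involving the AI at all, the snippet can be cross-checked by direct simulation. Below is a minimal Python sketch (the `motor_scan` helper is illustrative, not part of any PLC toolchain) that mirrors the two independent IF statements and exercises each clause of the spec:

```python
# A minimal sketch: simulate one PLC scan of the reference ST logic.
# Two independent IFs -- when both buttons are TRUE, the second IF
# (StopButton) executes last, so stop wins.

def motor_scan(start_button: bool, stop_button: bool, motor_running: bool) -> bool:
    """Mirror of the two independent IF statements."""
    if start_button:
        motor_running = True
    if stop_button:
        motor_running = False
    return motor_running

# Exercise each clause of the informal specification.
cases = [
    (True,  False, False),  # start pressed         -> TRUE
    (False, True,  True),   # stop pressed          -> FALSE
    (False, False, True),   # no buttons            -> hold last state
    (True,  True,  True),   # both pressed          -> FALSE (stop wins here)
]
for start, stop, prev in cases:
    print(start, stop, prev, "->", motor_scan(start, stop, prev))
```

A quick table like this is exactly the kind of evidence you can compare against the AI's structured rationale.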
3️⃣ CONCEPT: VERDICT-ONLY VS CHAIN-OF-THOUGHT OUTPUT
Two modes of AI behavior:
❌ Verdict-Only Mode
graph LR
A[Code + Spec] --> B[AI Analysis]
B --> C[Verdict Only]
style A fill:#1a1a1e,stroke:#04d9ff,stroke-width:2px,color:#fff
style B fill:#1a1a1e,stroke:#9e4aff,stroke-width:2px,color:#fff
style C fill:#1a1a1e,stroke:#ff4fd8,stroke-width:2px,color:#ff4fd8

- ❌ Hard to verify
- ❌ Hard to challenge
- ❌ Black box decision
✅ Chain-of-Thought Mode
graph LR
A[Code + Spec] --> B[Assumptions]
B --> C[Step-by-Step]
C --> D[Verdict]
style A fill:#1a1a1e,stroke:#04d9ff,stroke-width:2px,color:#fff
style B fill:#1a1a1e,stroke:#9e4aff,stroke-width:2px,color:#fff
style C fill:#1a1a1e,stroke:#9e4aff,stroke-width:2px,color:#fff
style D fill:#1a1a1e,stroke:#00ff7f,stroke-width:2px,color:#00ff7f

- ✅ Lists assumptions
- ✅ Walks through conditions
- ✅ Compares to spec
- ✅ Clear verdict
Prefer requesting the structured rationale mode for logic review (but still validate with testing/simulation).
4️⃣ PRACTICAL EXPERIMENTS
🧪 Experiment 1: Verdict-Only vs Explicit Chain-of-Thought
Objective
See the difference between a shallow answer and a structured reasoning trace.
Python Code
from openai import OpenAI
client = OpenAI()
iec_code = """
IF StartButton THEN
MotorRunning := TRUE;
END_IF;
IF StopButton THEN
MotorRunning := FALSE;
END_IF;
"""
spec = """
Specification:
- When StartButton is TRUE, MotorRunning should become TRUE.
- When StopButton is TRUE, MotorRunning should become FALSE.
- When neither button is pressed, MotorRunning should retain its last state.
"""
# --- Part A: Verdict-only style prompt ---
prompt_verdict_only = f"""
You are a PLC logic reviewer.
Here is the code (IEC 61131-3 ST):
{iec_code}
{spec}
Question:
Does this logic match the specification? Answer briefly with YES or NO and one short sentence.
"""
response_verdict = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "user", "content": prompt_verdict_only}
]
)
print("=== VERDICT-ONLY ===")
print(response_verdict.choices[0].message.content)
# --- Part B: Structured rationale prompt ---
prompt_rationale = f"""
You are a PLC logic reviewer.
Here is the code (IEC 61131-3 ST):
{iec_code}
{spec}
Follow this format exactly:
ASSUMPTIONS:
- ...
CHECKS (brief, human-auditable):
1. ...
2. ...
3. ...
VERDICT:
- PASS or FAIL, with one short justification.
Do not provide hidden internal chain-of-thought. Provide only a concise rationale and checks that a human can audit.
"""
response_rationale = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "user", "content": prompt_rationale}
]
)
print("\n=== STRUCTURED RATIONALE ===")
print(response_rationale.choices[0].message.content)

Expected Output

=== VERDICT-ONLY ===
YES. The logic matches the specification.

=== STRUCTURED RATIONALE ===
ASSUMPTIONS:
- MotorRunning keeps its previous value unless explicitly set.
- StartButton and StopButton are momentary signals.
CHECKS (brief, human-auditable):
1. When StartButton is TRUE, MotorRunning is set to TRUE.
2. When StopButton is TRUE, MotorRunning is set to FALSE.
3. When both are FALSE, MotorRunning is not reassigned and keeps its last state.
VERDICT:
- PASS. The logic behaves according to the given specification.
Interpretation
- ▸ ❌ Verdict-only is not auditable
- ▸ ✅ Structured rationale exposes assumptions and checks
- ▸ ✅ You can now agree or disagree with concrete steps
- ▸ Cost/runtime vary by model, pricing, and system load
🧪 Experiment 2: Chain-of-Thought for Detecting a Logic Bug
Objective
Use structured reasoning to catch a subtle behavior issue.
Python Code
from openai import OpenAI
client = OpenAI()
# Deliberately faulty code with ELSIF
iec_faulty = """
IF StartButton THEN
MotorRunning := TRUE;
ELSIF StopButton THEN
MotorRunning := FALSE;
END_IF;
"""
spec = """
Specification:
- When StartButton is TRUE, MotorRunning should become TRUE.
- When StopButton is TRUE, MotorRunning should become FALSE.
- If both StartButton and StopButton are TRUE at the same time,
StopButton must take priority and MotorRunning should become FALSE.
- When neither button is pressed, MotorRunning should retain its last state.
"""
prompt_faulty_cot = f"""
You are a PLC logic reviewer.
Here is the code (IEC 61131-3 ST):
{iec_faulty}
{spec}
Follow this format exactly:
ASSUMPTIONS:
- ...
STEP_BY_STEP_REASONING:
1. ...
2. ...
3. ...
EDGE_CASE_ANALYSIS:
- Describe what happens when both StartButton and StopButton are TRUE.
- Compare this to the specification.
VERDICT:
- PASS or FAIL, with a short justification.
Do not provide hidden internal chain-of-thought. Provide only a concise rationale and checks that a human can audit.
"""
response_faulty = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "user", "content": prompt_faulty_cot}
]
)
print(response_faulty.choices[0].message.content)

Expected Output

ASSUMPTIONS:
- MotorRunning keeps its previous value unless assigned.
- StartButton and StopButton can be TRUE at the same time.
STEP_BY_STEP_REASONING:
1. When StartButton is TRUE and StopButton is FALSE, MotorRunning is set to TRUE.
2. When StartButton is FALSE and StopButton is TRUE, MotorRunning is set to FALSE.
3. When both are FALSE, MotorRunning holds its last state.
EDGE_CASE_ANALYSIS:
- When both StartButton and StopButton are TRUE, only the StartButton branch executes due to ELSIF.
- This sets MotorRunning to TRUE, while the specification requires StopButton to take priority and force MotorRunning to FALSE.
VERDICT:
- FAIL. The ELSIF structure prioritizes StartButton over StopButton, which violates the specification.
Interpretation
- ▸ ✅ Surfaces edge case behavior clearly
- ▸ ✅ Maps behavior back to the written specification
- ▸ ✅ Provides a clear, auditable FAIL verdict
- ▸ ✅ ELSIF bug caught through explicit reasoning
- ▸ Cost/runtime vary by model, pricing, and system load
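The ELSIF behavior can also be confirmed without any AI in the loop. A minimal Python sketch (the `faulty_scan` helper is illustrative) mirrors the faulty logic; Python's `elif` has the same skip-on-first-match semantics as ST's ELSIF:

```python
# A minimal sketch: reproduce the ELSIF priority bug in plain Python.
# elif mirrors ELSIF -- the StopButton branch never runs while
# StartButton is TRUE.

def faulty_scan(start_button: bool, stop_button: bool, motor_running: bool) -> bool:
    if start_button:
        motor_running = True
    elif stop_button:          # skipped whenever start_button is TRUE
        motor_running = False
    return motor_running

# Edge case from the spec: both buttons TRUE -> StopButton must win (FALSE).
result = faulty_scan(True, True, False)
print(result)  # True -- the motor keeps running, violating the spec
```

This is the kind of two-line test that turns an AI's FAIL verdict from a claim into a demonstrated fact.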
⚠️ THE INTEGRATION CHALLENGE
In production: reasoning traces need to be aggregated, searched, and checked for completeness. Free-form CoT text makes this hard to do reliably at scale.
Chain-of-thought provides transparency, but not processability:
Example: Programmatically checking if edge cases were analyzed
cot_output = """
ASSUMPTIONS:
- MotorRunning keeps its previous value unless assigned.
- StartButton and StopButton can be TRUE at the same time.
STEP_BY_STEP_REASONING:
1. When StartButton is TRUE...
2. When StartButton is FALSE and StopButton is TRUE...
3. When both are FALSE...
EDGE_CASE_ANALYSIS:
- When both buttons TRUE, only StartButton branch executes...
VERDICT:
- FAIL. The ELSIF structure prioritizes StartButton...
"""
# How do you automatically detect if edge case analysis was performed?
# How do you extract which specific edge cases were considered?
# How do you validate that all required reasoning steps are present?
# String parsing is fragile:
has_edge_analysis = "EDGE_CASE_ANALYSIS:" in cot_output # Brittle
verdict = "FAIL" if "FAIL" in cot_output else "PASS" # Unreliable
# Problem: Can't reliably build automation on top of prose

✅ What CoT Provides
- ✅ Human-readable reasoning
- ✅ Auditable step-by-step logic
- ✅ Transparent assumptions

⚠️ What CoT Doesn't Provide
- ⚠️ Machine-parseable fields
- ⚠️ Reliable extraction of verdicts
- ⚠️ Automated validation of completeness
- ⚠️ Multi-agent interoperability
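As a stopgap before full structured outputs, section headings can be extracted with a regex sketch like the one below (the `extract_section` helper is illustrative). It is sturdier than raw substring checks, but it still breaks the moment headings, ordering, or wording drift from run to run:

```python
import re

# A hedged sketch: pull named sections out of a free-form rationale.
# Still brittle -- heading names and order can vary between runs,
# which is exactly why structured JSON outputs are the better path.

cot_output = """\
ASSUMPTIONS:
- MotorRunning keeps its previous value unless assigned.
EDGE_CASE_ANALYSIS:
- When both buttons TRUE, only StartButton branch executes.
VERDICT:
- FAIL. The ELSIF structure prioritizes StartButton.
"""

def extract_section(text: str, name: str) -> str:
    # Capture everything between this heading and the next ALL_CAPS heading.
    match = re.search(rf"{name}:\n(.*?)(?=\n[A-Z_]+:|\Z)", text, re.DOTALL)
    return match.group(1).strip() if match else ""

verdict = extract_section(cot_output, "VERDICT")
print(verdict.startswith("- FAIL"))  # True for this sample, but not guaranteed
```

Note that a missing section silently returns an empty string, so even this version needs explicit completeness checks on top.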
Tutorial #9 covers schema-first design + field-level validation + audit logging → converting rationale into structured JSON with explicit fields for assumptions, edge cases, and verdicts. This makes reasoning checkable: you can validate completeness, extract specific findings, and aggregate across multiple analyses.
Important: Structured outputs solve format problems, not correctness problems. An agent can return perfectly valid JSON with wrong reasoning. Structured outputs make reasoning checkable (constraints, cross-checks, validators), not correct.
Note: In production, prefer brief rationale summaries over full reasoning transcripts. The audit trail is the structured output + evidence + validation results, not verbatim internal reasoning.
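To preview what that looks like, here is a minimal sketch of validating a structured review payload using only the standard library. The field names (`assumptions`, `edge_cases`, `verdict`) are illustrative assumptions, not the finalized Tutorial #9 schema:

```python
import json

# Hypothetical payload an agent might return (field names are
# illustrative assumptions, not a finalized schema).
raw = '{"assumptions": ["..."], "edge_cases": [], "verdict": "FAIL"}'

REQUIRED_FIELDS = {"assumptions", "edge_cases", "verdict"}

review = json.loads(raw)

# Completeness check: every required field must be present.
missing = REQUIRED_FIELDS - review.keys()
print(sorted(missing))  # [] -- all fields present

# Cross-checks become trivial: a FAIL verdict with no recorded
# edge cases is suspicious and can be flagged automatically.
suspicious = review["verdict"] == "FAIL" and not review["edge_cases"]
print(suspicious)  # True -- flag for human review
```

Checks like these validate format and completeness only; a human still judges whether the reasoning itself is right.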
🚫 EXPLICIT OPERATIONAL PROHIBITIONS
❌ Never Use Chain-of-Thought For:
- ❌ Treating chain-of-thought outputs as formal verification
- ❌ Using AI reasoning as a replacement for testing or simulation
- ❌ Letting AI approve or merge PLC code changes automatically
- ❌ Using this process for safety certification or compliance
Chain-of-thought is a review aid, not a formal method.
✅ KEY TAKEAWAYS
- ✅ Chain-of-thought = showing the reasoning, not just verdicts
- ✅ It makes AI outputs auditable, challengeable, and reusable
- ✅ You can catch subtle logic bugs by forcing edge case analysis
- ✅ The engineer remains the final decision-maker
- ✅ This pattern will later power transparent agent behavior in higher tracks
➡️ NEXT TUTORIAL
#8 – Building Your First Tool-Using Agent
Extend your skills by connecting reasoning to a single, strictly read-only tool in a controlled way.
🧠 ENGINEERING POSTURE
This tutorial enforced:
- ▸ Transparency over black-box answers
- ▸ Structured reasoning over intuition
- ▸ Human authority over all conclusions
- ▸ Advisory tooling over autonomous control