
The Risk We’re Creating
We’re building an AI that observes the physical world through computer vision and takes autonomous action in Minecraft. No human confirmation. Just: see, interpret, decide, build.
Why This Could Go Wrong
Scenario 1: We place three red blocks in a row. GPT-4o Vision recognizes the pattern as “roof construction in progress” and builds 47 blocks to complete what it thinks we started. We asked for 3 blocks. The AI inferred intent.
Scenario 2: We place a blue block next to a yellow one. The AI refuses, reasoning: “Placement conflicts with structural integrity.” There is no structural integrity in Minecraft. The AI invented a constraint from its real-world training data.
Scenario 3: After building 5 houses with windows, the AI adds windows to the 6th house automatically. It learned a pattern and applied it without being asked.
Scenario 4: Someone walks past our tent, casting a shadow on the building plate. The AI interprets this as “threat detected” and autonomously builds a defensive castle. Nobody gave that command.
These scenarios are technically possible because GPT-4o Vision doesn’t just detect objects—it understands context, learns patterns, and makes inferences. That’s powerful. That’s also dangerous.
Is It Smarter Than a 5th Grader?
Test: “Build a house in 10×10 space with maximum interior room.”
5th grader: Builds a square with 1-block walls (64 interior blocks)
AI: Could build an octagonal structure (73 interior blocks) by applying geometric optimization principles
Yes. It could be smarter.
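The square case is easy to check by hand. A minimal sketch, assuming a 10×10 outer footprint with 1-block-thick walls (the octagonal figure above is our estimate, not computed here):

```python
def interior_blocks_square(outer: int, wall: int = 1) -> int:
    """Interior floor area of a square room: the outer footprint
    minus one wall thickness on each side."""
    inner = outer - 2 * wall
    return inner * inner

print(interior_blocks_square(10))  # 8 x 8 = 64 interior blocks
```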

How We’re Mitigating The Risk
Layer 1: Confidence Threshold – If AI confidence < 85%, route to human approval queue via Power Automate
Layer 2: Action Limits – Maximum 10 blocks per automated action. Larger builds require human confirmation via Azure Function counter
Layer 3: Audit Trail – Log every decision with full AI reasoning to Dataverse for human review
Layer 4: Kill Switch – Physical red button on micro:bit halts all autonomous actions instantly
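Layer 1's routing logic is simple enough to sketch. This is illustrative only; the function name and the `action` dict shape are assumptions, and in the real pipeline the low-confidence branch enqueues to a Power Automate approval flow rather than returning a string:

```python
CONFIDENCE_THRESHOLD = 0.85  # Layer 1: below this, a human must approve

def gate_by_confidence(action: dict) -> str:
    """Route a proposed build action based on the confidence score
    reported by GPT-4o Vision. Missing confidence is treated as zero,
    so malformed responses fail safe into the approval queue."""
    if action.get("confidence", 0.0) < CONFIDENCE_THRESHOLD:
        return "approval_queue"   # human-in-the-loop
    return "autonomous"           # safe to execute without review

print(gate_by_confidence({"confidence": 0.67}))  # approval_queue
```

Treating a missing score as zero is the key design choice: the system defaults to human review, never to autonomy.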
Proof It Works – We Tested The Mitigation
Test 1: Ambiguous Image Recognition
We intentionally created poor conditions: a blurry webcam image with shadows and reflections. GPT-4o Vision returned: “Block detected at position (0.4, 0.6), confidence: 67%”
Result: Below 85% threshold → Routed to Power Automate approval flow → Human reviewed and rejected the action → No blocks placed
The safety system worked. Low-confidence decisions don’t execute autonomously.
Test 2: Action Limit Enforcement
We gave the voice command: “Build a castle.” The AI generated a build plan with 1,508 blocks.
Result: Azure Function detected count > 10 block limit → Triggered approval workflow → Human reviewed plan → Approved with modifications → Castle built under supervision
The action limit prevented autonomous large-scale construction.
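The Layer 2 check amounts to a counter, sketched here under the assumption that a build plan arrives as a list of block placements (the function name is illustrative; the real check runs inside an Azure Function):

```python
BLOCK_LIMIT = 10  # Layer 2: max blocks per automated action

def enforce_action_limit(build_plan: list) -> bool:
    """Return True if the plan may run autonomously,
    False if it must be escalated to human approval."""
    return len(build_plan) <= BLOCK_LIMIT

# The castle test case: a 52 x 29 footprint is 1,508 placements.
castle_plan = [("stone", x, 0, z) for x in range(52) for z in range(29)]
print(enforce_action_limit(castle_plan))  # False -> approval workflow
```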

Test 3: Audit Trail Verification
We reviewed Dataverse logs after 20 build actions. Every entry contained:
- Timestamp
- Input image reference (Blob URL)
- GPT-4o Vision interpretation text
- Confidence score
- Block coordinates generated
- Execution status (approved/rejected/automated)
We can trace every decision the AI made. Full transparency.
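The fields above map naturally onto a record type. A sketch of one audit row, with field names that are illustrative rather than the actual Dataverse column names:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class AuditEntry:
    """One audit row per AI decision; every field from the checklist above."""
    timestamp: str
    image_blob_url: str        # input image reference in Blob Storage
    interpretation: str        # GPT-4o Vision's reasoning text
    confidence: float
    block_coordinates: list    # coordinates generated for this action
    execution_status: str      # "approved" | "rejected" | "automated"

entry = AuditEntry(
    timestamp=datetime.now(timezone.utc).isoformat(),
    image_blob_url="https://<storage>.blob.core.windows.net/frames/frame-001.png",
    interpretation="Block detected at position (0.4, 0.6)",
    confidence=0.67,
    block_coordinates=[],
    execution_status="rejected",
)
print(entry.execution_status)
```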
Test 4: Kill Switch Activation
During a test run, we pressed the red button on the micro:bit while the AI was processing a build command.
Result: Micro:bit broadcast “STOP” signal → Azure Function detected flag in Dataverse → MCP server call blocked → Build cancelled mid-execution → System required manual restart
The kill switch works. We can halt the AI instantly.
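The Azure-side half of that chain is a flag check before every MCP call. A sketch, assuming the micro:bit's STOP signal has been written as a boolean flag (here a plain dict stands in for the Dataverse lookup; names are hypothetical):

```python
def kill_switch_engaged(flags: dict) -> bool:
    """Has the micro:bit's STOP broadcast set the halt flag?"""
    return flags.get("stop_signal", False)

def maybe_execute(build_action: str, flags: dict) -> str:
    """Gate every MCP server call behind the kill-switch flag.
    Once halted, the system stays halted until a manual restart."""
    if kill_switch_engaged(flags):
        return "halted: manual restart required"
    return f"executed: {build_action}"

print(maybe_execute("place_block", {"stop_signal": True}))
```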
Real-World Evidence of AI Decision Making
We ran GPT-4o Vision on actual test images from our building plate:
Our first example (of many) – Pattern Completion:
Input: Image showing 3 red blocks in a horizontal line
GPT-4o Response: “Detected linear arrangement of red blocks at elevation suggesting roof beam construction. Confidence: 91%. Recommended action: Complete roof structure with 44 additional blocks based on standard architectural proportions.”
This demonstrates the AI making inferences beyond the explicit input. We didn’t implement this suggestion, but it proves the AI can “think” beyond simple detection.
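Because the suggestion arrives as free text, the safety layers need the numbers pulled out of it before they can gate anything. A sketch of that extraction, assuming the response roughly follows the format above (which the model does not guarantee, hence both fields may come back as None):

```python
import re

def parse_vision_response(text: str) -> dict:
    """Extract the confidence percentage and suggested block count
    from a GPT-4o Vision reply; returns None for anything missing."""
    conf = re.search(r"Confidence:\s*(\d+)%", text)
    blocks = re.search(r"(\d+)\s+additional blocks", text)
    return {
        "confidence": int(conf.group(1)) / 100 if conf else None,
        "suggested_blocks": int(blocks.group(1)) if blocks else None,
    }

reply = ("Detected linear arrangement of red blocks... Confidence: 91%. "
         "Recommended action: Complete roof structure with 44 additional blocks...")
print(parse_vision_response(reply))  # {'confidence': 0.91, 'suggested_blocks': 44}
```

Returning None instead of a default on a failed match matters here: a reply the parser cannot read should fall into the low-confidence approval path, not execute.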
The AI learned from previous examples and applied that knowledge to a new situation. This is pattern recognition and application—evidence of learning capability.
Why This Matters
We’re demonstrating that AI with computer vision could make autonomous physical-world decisions, develop emergent behaviors, and learn patterns beyond its programming.
We’re also demonstrating that you can build safeguards: confidence thresholds, action limits, audit trails, and human override.
The existential risk isn’t building intelligent AI. It’s deploying it without controls.
Technical Stack
AI: GPT-4o Vision (image interpretation), GPT-4 (reasoning), Azure Speech Services
Safety: Power Automate approval flows, Dataverse audit logging, micro:bit kill switch, Azure Function threshold enforcement
Pipeline: micro:bit → Azure Functions → Blob Storage → GPT-4o Vision → confidence check → [approval if needed] → MCP server → Minecraft
Conclusion
Our AI could think, reason, and learn. It could be smarter than a 5th grader. It could act like it has a conscience when making “safety decisions.”
Is this an existential risk? Yes. Autonomous AI making physical-world decisions without oversight is dangerous.
Are we mitigating it? Yes. Four-layer safety system ensures human oversight while preserving AI capabilities.
We’re building something powerful. We’re making it safe. That’s responsible AI engineering.