The appXapiIntegratorAgent (https://iwant2study.org/lookangejss/appXapiIntegratorAgent/) is a Node.js server that:
- Accepts a ZIP of interactive HTML/JS (EJS simulations, P5.js, quizzes)
- Statically analyzes the code with regex (`lib/analyzer.js`)
- Sends that analysis to Gemini/OpenAI with an optional "custom instructions" textarea
- AI generates JavaScript xAPI tracking code (one-shot, no feedback loop)
- Injects the code into the HTML (`lib/injector.js`)
- Returns the ZIP — never verified in a real browser
The question: Is this approach fundamentally flawed? Yes. Here is why, and what the correct approach is.
## The Fundamental Flaw: Static Analysis ≠ Runtime Reality

### Flaw 1 — Regex analysis cannot capture runtime behavior
`lib/analyzer.js` detects content via regex:

```
hasGameState = /gameState|game\.state|state\s*=|Score|Points/i.test(jsContent)
hasScore = /score|points|result|answered|correct/i.test(jsContent)
```

EJS/EJSS physics simulations, P5.js sketches, and Claude-generated interactives have:
- Dynamic DOM built at runtime — not present in source HTML
- State machines that only reveal structure when running
- Canvas rendering — no queryable DOM elements at all
- Event systems that only fire when the simulation actually runs
The AI is told "this content has a score" but it cannot know:
- Which JS variable actually holds the live score
- When/how the score updates
- What events fire on correct/wrong answers
- What the simulation's internal API looks like
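The ambiguity is easy to demonstrate. In this hypothetical snippet, the analyzer's regex matches a static UI label while the real score lives in a variable whose name never matches — so "this content has a score" tells the AI nothing about which variable to read at runtime:

```javascript
const src = `
  var highScoreLabel = "High Score";  // static UI text, never updated
  let _s = 0;                         // the actual live score; name never matches
  function onCorrect() { _s += 1; }
`;

// The analyzer's check fires...
console.log(/score|points|result|answered|correct/i.test(src)); // true
// ...but it matched "highScoreLabel", not the state the simulation updates.
```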
### Flaw 2 — Generic guesses in the injected code
The timeline/quiz injection code (`lib/injector.js`) tries common guesses:

```
readDomNumberById('score') ?? readDomNumberById('points') ??
readDomNumberById('correctCount') ?? readDomNumberById('correct') ?? ...
```
For any specific simulation, most of these will find nothing. The code silently returns `null` and sends `score: 0`.
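A plausible sketch of that failure mode follows. `readDomNumberById`'s real implementation is not shown in the post; this stub assumes it reads an element's text and parses a number, and the DOM is stubbed to model a canvas-only simulation where none of the guessed ids exist:

```javascript
// Stubbed DOM: the simulation renders to canvas, so no guessed id is ever found
const document = { getElementById: () => null };

function readDomNumberById(id) {
  const el = document.getElementById(id);
  if (!el) return null;                    // silent failure: no error, no log
  const n = parseFloat(el.textContent);
  return Number.isFinite(n) ? n : null;
}

const score =
  readDomNumberById('score') ?? readDomNumberById('points') ??
  readDomNumberById('correctCount') ?? readDomNumberById('correct') ?? 0;

console.log(score); // 0: reported as the learner's score although nothing was read
```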
### Flaw 3 — Custom Instructions compensate for missing runtime knowledge
When users add custom instructions like "track the velocity slider" or "send score when level 3 completes", the AI still must **guess**:
- What is the slider's DOM id/class at runtime?
- What events does it fire?
- How does "level completion" manifest in code?
- What are valid value ranges?
None of this can be determined without running the simulation. Custom instructions are the user trying to describe runtime behavior they observed themselves — indirect and error-prone.
### Flaw 4 — No verification step exists
After injection, the ZIP is returned immediately. There is:
- No browser launch
- No check that `window.storeState()` is actually called
- No verification the payload contains meaningful data
- No way to detect that the tracking code silently failed
---
## Why This Is a Dialogic Problem Requiring Trial and Error
xAPI injection is **inherently iterative**. The correct workflow is:
```
Inject code → Run in browser → Observe what fires →
Check xAPI payloads → Fix issues → Repeat
```
This requires a feedback loop with an actual browser. The current architecture makes a **single one-shot API call** and returns — no loop, no verification, no correction.
Claude Code / OpenAI Codex operating agentically (with bash access, file writes, and browser tools) is the right paradigm because they CAN execute this loop:
- Run commands, see output
- Write code, test it, see errors
- Fix and re-test
The current app calls the AI like a function: `input → output`. The AI has no ability to observe the simulation, try something, see if it worked, and fix it.
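The missing loop can be sketched as follows. The function names (`generate`, `inject`, `verify`) are stand-ins for the AI call, the file write, and the Playwright run; none of them exist in the current app:

```javascript
// Hypothetical sketch of the feedback loop the one-shot architecture lacks
async function agentLoop({ generate, inject, verify }, maxIterations = 3) {
  const observations = [];
  for (let attempt = 1; attempt <= maxIterations; attempt++) {
    const trackingCode = await generate(observations); // AI writes tracking code
    await inject(trackingCode);                        // put it into the HTML
    const result = await verify();                     // run in a browser, check payloads
    if (result.ok) return { verified: true, attempt };
    observations.push(result.failure);                 // feed the failure back to the AI
  }
  return { verified: false, attempt: maxIterations };
}
```

Each failed verification becomes new input to the next generation step, which is exactly the "fix and re-test" capability an agentic Claude Code / Codex session has and a single API call does not.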
---
## The Correct Approach: Playwright-Based Agent Loop
### New Architecture
```
Upload ZIP
→ Extract to temp dir
→ Serve locally (express static server on random port)
→ Playwright opens the simulation in a real browser
→ Phase 1: AI OBSERVES (screenshot, a11y tree, console output)
→ Phase 2: AI INSTRUMENTS (injects monitoring code to spy on events/state)
→ Phase 3: AI INTERACTS (clicks buttons, answers questions, runs simulation)
→ Phase 4: AI READS RESULTS (what events fired, what state changed, what DOM appeared)
→ Phase 5: AI GENERATES xAPI code based on ACTUAL observed behavior
→ Inject generated code into HTML
→ Phase 6: AI VERIFIES (reload, interact again, confirm storeState() called with right payload)
→ If wrong: AI FIXES and re-verifies (up to N iterations)
   → Return verified ZIP
```

### What Changes
New dependency:
- `playwright` (specifically `chromium`) — headless browser for simulation testing

New files to create:
- `lib/browserAgent.js` — Playwright controller: serve, open, observe, interact, verify
- `lib/playwrightVerifier.js` — intercepts `window.storeState()` calls to verify payloads

Files to modify:
- `routes/upload.js` — add Playwright verification branch for AI Agent mode
- `lib/geminiAgent.js` — pass actual runtime observations (not just static analysis) to the AI
- `package.json` — add the `playwright` dependency
### The Verification Pattern
The verifier intercepts `window.storeState`:

```
// Injected by Playwright before page loads
window.__xapiCapture = [];
window.storeState = function(payload) {
  window.__xapiCapture.push(payload);
};
```

After simulation interaction, read back:

```
const captured = await page.evaluate(() => window.__xapiCapture);
// Verify: score is a number, max is set, payload is non-trivial
```

If `captured` is empty or the payload is `{score: 0, max: null}`, the AI knows the injection failed and must revise.
### AI Agent Mode Enhancement
Instead of sending static regex analysis to Gemini, send runtime-observed data:
- Actual DOM snapshot after simulation loads
- Console log output
- What events fired when the AI clicked interactive elements
- What `window.gameState` or similar global objects contain
This gives the AI real information, not guesses.
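One plausible shape for that observation object (an assumption for illustration, not the app's actual schema) is:

```javascript
// Hypothetical runtime-observation payload replacing the regex flags in the prompt
const runtimeObservations = {
  domSnapshot: '<div id="scoreDisplay">3 / 5</div>',  // DOM after load + interaction
  consoleLogs: ['model loaded', 'level 1 complete'],
  eventsFired: [
    { selector: '#checkBtn', type: 'click', effect: '#scoreDisplay text changed' },
  ],
  globals: { gameState: { level: 1, score: 3, max: 5 } }, // live values, not guesses
};

// The prompt can now name the exact variable holding the live score:
console.log(runtimeObservations.globals.gameState.score); // 3
```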
## Summary of the Problem vs. the Correct Approach
| Aspect | Current (Flawed) | Correct (Playwright-Based) |
|---|---|---|
| Analysis | Regex on static source | AI observes live running simulation |
| Code generation | One-shot AI guess | AI generates based on actual runtime data |
| Verification | None | Playwright confirms `storeState()` is called with a valid payload |
| Iteration | None | AI can fix and re-verify (up to N cycles) |
| Custom instructions | User guesses runtime behavior | AI observes runtime behavior directly |
| Result | Often silently sends `score: 0, max: null` | Verified working xAPI tracking |
## Critical Files for Implementation
- `lib/analyzer.js` — replace regex-only analysis with Playwright-observed data
- `lib/geminiAgent.js` — update prompt to accept runtime observations, not just static analysis
- `lib/injector.js` — no fundamental changes needed; the injection mechanism is fine
- `routes/upload.js` — add Playwright agent loop for AI Agent mode
- `package.json` — add `playwright`
- NEW: `lib/browserAgent.js` — Playwright observation + interaction controller
- NEW: `lib/playwrightVerifier.js` — `window.storeState` interception and payload validation
## Verification Plan
After implementing the Playwright-based agent:

- Start the local server with `npm start`
- Upload an EJS simulation ZIP in AI Agent mode
- Watch server logs — should show: serve → Playwright open → AI observe → AI interact → inject → verify → confirmed
- Check the downloaded ZIP: open `index.html` in a browser, interact, and confirm `window.storeState()` receives real data (not `{score: 0}`)
- Test edge cases: canvas-only simulations (no DOM score), P5.js simulations, quiz simulations