
Wednesday, March 11, 2026

appXapiIntegratorAgent is now rebranded as a helper app working together with Claude Code or OpenAI Codex

The appXapiIntegratorAgent (https://iwant2study.org/lookangejss/appXapiIntegratorAgent/) is a Node.js server that:

  1. Accepts a ZIP of interactive HTML/JS (EJS simulations, P5.js, quizzes)
  2. Statically analyzes the code with regex (lib/analyzer.js)
  3. Sends that analysis to Gemini/OpenAI AI with an optional "custom instructions" textarea
  4. AI generates JavaScript xAPI tracking code (one-shot, no feedback loop)
  5. Injects the code into HTML (lib/injector.js)
  6. Returns the ZIP — never verified in a real browser

The question: Is this approach fundamentally flawed? Yes. Here is why, and what the correct approach is.


## The Fundamental Flaw: Static Analysis ≠ Runtime Reality

### Flaw 1 — Regex analysis cannot capture runtime behavior

lib/analyzer.js detects content via regex:

```js
hasGameState = /gameState|game\.state|state\s*=|Score|Points/i.test(jsContent);
hasScore = /score|points|result|answered|correct/i.test(jsContent);
```

EJS/EJSS physics simulations, P5.js, and Claude-generated interactives have:

  • Dynamic DOM built at runtime — not present in source HTML
  • State machines that only reveal structure when running
  • Canvas rendering — no queryable DOM elements at all
  • Event systems that only fire when the simulation actually runs

The AI is told "this content has a score" but it cannot know:

  • Which JS variable actually holds the live score
  • When/how the score updates
  • What events fire on correct/wrong answers
  • What the simulation's internal API looks like
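To see why the regex signal is hollow, consider a hypothetical simulation (invented here for illustration) whose score lives entirely inside a closure: the analyzer's pattern matches, yet there is no global variable or DOM element the injected tracker could ever read.

```js
// Hypothetical simulation: the word "score" appears in the source, so
// the analyzer's regex fires, but the live value is trapped in a
// closure — no global variable, no DOM element, nothing to read back.
function makeSimulation() {
  let score = 0;                         // invisible outside this closure
  return {
    onCorrectAnswer() { score += 10; },  // the only way score changes
    render(ctx) { /* draws score onto a <canvas>; no DOM text node */ }
  };
}

const hasScore = /score|points|result|answered|correct/i.test(makeSimulation.toString());
console.log(hasScore);                // true — the regex is satisfied
console.log(typeof globalThis.score); // "undefined" — yet nothing is readable
```

Static analysis reports "this content has a score" and is technically right, but everything the tracker actually needs is unreachable from outside the running closure.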

### Flaw 2 — Generic guesses in the injected code

The timeline/quiz injection code (lib/injector.js) tries common guesses:

```js
readDomNumberById('score') ?? readDomNumberById('points') ??
readDomNumberById('correctCount') ?? readDomNumberById('correct') ?? ...
```
For any specific simulation, most of these will find nothing. The code silently returns `null` and sends `score: 0`.
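A plausible sketch of such a helper (the real `lib/injector.js` implementation may differ) makes the failure mode concrete: a missing id yields `null`, the `??` chain falls through every guess, and the tracker quietly reports zero.

```js
// Plausible sketch of the injector's readDomNumberById helper (the
// real lib/injector.js may differ): find the element, parse the first
// number from its text, and return null when nothing matches.
function readDomNumberById(id) {
  const el = typeof document !== 'undefined' ? document.getElementById(id) : null;
  if (!el) return null;                               // silent fallthrough
  const m = (el.textContent || '').match(/-?\d+(\.\d+)?/);
  return m ? Number(m[0]) : null;
}

// In a page without any of the guessed ids, every call returns null,
// so the chain collapses to the final fallback and score: 0 is sent.
const score = readDomNumberById('score') ?? readDomNumberById('points') ?? 0;
console.log(score); // 0
```

Nothing in this chain throws or logs, which is exactly why the failure goes unnoticed.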

### Flaw 3 — Custom Instructions compensate for missing runtime knowledge

When users add custom instructions like "track the velocity slider" or "send score when level 3 completes", the AI still must **guess**:
- What is the slider's DOM id/class at runtime?
- What events does it fire?
- How does "level completion" manifest in code?
- What are valid value ranges?

None of this can be determined without running the simulation. Custom instructions are the user trying to describe runtime behavior they observed themselves — indirect and error-prone.

### Flaw 4 — No verification step exists

After injection, the ZIP is returned immediately. There is:
- No browser launch
- No check that `window.storeState()` is actually called
- No verification the payload contains meaningful data
- No way to detect that the tracking code silently failed

---

## Why This Is a Dialogic Problem Requiring Trial and Error

xAPI injection is **inherently iterative**. The correct workflow is:
```
Inject code → Run in browser → Observe what fires →
Check xAPI payloads → Fix issues → Repeat
```

This requires a feedback loop with an actual browser. The current architecture makes a **single one-shot API call** and returns — no loop, no verification, no correction.

Claude Code / OpenAI Codex operating agentically (with bash access, file writes, and browser tools) is the right paradigm because they CAN execute this loop:
- Run commands, see output
- Write code, test it, see errors
- Fix and re-test

The current app calls the AI like a function: `input → output`. The AI has no ability to observe the simulation, try something, see if it worked, and fix it.
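The contrast can be sketched as a retry loop; `generateTracking` and `verifyInBrowser` below are hypothetical stand-ins for the AI call and a real-browser check, not functions from the current codebase.

```js
// Sketch of the feedback loop the one-shot call lacks. The two
// callbacks are hypothetical: generateTracking(feedback) stands in for
// the AI call, verifyInBrowser(code) for a Playwright verification run.
async function injectWithVerification(generateTracking, verifyInBrowser, maxIterations = 3) {
  let feedback = null;
  for (let attempt = 1; attempt <= maxIterations; attempt++) {
    const code = await generateTracking(feedback); // AI sees the prior failure
    const result = await verifyInBrowser(code);    // observe real behavior
    if (result.ok) return { code, attempt };       // verified — done
    feedback = result.error;                       // feed the failure back in
  }
  throw new Error(`tracking still failing after ${maxIterations} attempts`);
}
```

Seen this way, the current architecture is this loop with `maxIterations` fixed at 1 and the feedback discarded.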

---

## The Correct Approach: Playwright-Based Agent Loop

### New Architecture
```
Upload ZIP
  → Extract to temp dir
  → Serve locally (express static server on random port)
  → Playwright opens the simulation in a real browser
  → Phase 1: AI OBSERVES (screenshot, a11y tree, console output)
  → Phase 2: AI INSTRUMENTS (injects monitoring code to spy on events/state)
  → Phase 3: AI INTERACTS (clicks buttons, answers questions, runs simulation)
  → Phase 4: AI READS RESULTS (what events fired, what state changed, what DOM appeared)
  → Phase 5: AI GENERATES xAPI code based on ACTUAL observed behavior
  → Inject generated code into HTML
  → Phase 6: AI VERIFIES (reload, interact again, confirm storeState() called with right payload)
  → If wrong: AI FIXES and re-verifies (up to N iterations)
  → Return verified ZIP
```

### What Changes

New dependency:

  • playwright (specifically chromium) — headless browser for simulation testing

New files to create:

  • lib/browserAgent.js — Playwright controller: serve, open, observe, interact, verify
  • lib/playwrightVerifier.js — Intercepts window.storeState() calls to verify payloads

Files to modify:

  • routes/upload.js — Add Playwright verification branch for AI Agent mode
  • lib/geminiAgent.js — Pass actual runtime observations (not just static analysis) to AI
  • package.json — Add playwright dependency

### The Verification Pattern

The verifier intercepts window.storeState:

```js
// Injected by Playwright before page loads
window.__xapiCapture = [];
window.storeState = function (payload) {
  window.__xapiCapture.push(payload);
};
```

After simulation interaction, read back:

```js
const captured = await page.evaluate(() => window.__xapiCapture);
// Verify: score is a number, max is set, payload is non-trivial
```

If `captured` is empty or the payload is `{score: 0, max: null}`, the AI knows the injection failed and must revise.
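The validity check itself can be plain JavaScript. A sketch, with function and field names illustrative rather than the real module's API, mirroring the `{score, max}` payload shape above:

```js
// Sketch of the payload check the verifier could apply to the captured
// calls (names illustrative): empty captures and {score: 0, max: null}
// style payloads both count as failures the AI must go back and fix.
function isMeaningfulPayload(p) {
  return p != null && typeof p.score === 'number' && p.max != null;
}

function verifyCapture(captured) {
  if (!Array.isArray(captured) || captured.length === 0) {
    return { ok: false, error: 'storeState() was never called' };
  }
  if (!captured.some(isMeaningfulPayload)) {
    return { ok: false, error: 'only trivial payloads captured' };
  }
  return { ok: true };
}

console.log(verifyCapture([]).ok);                        // false
console.log(verifyCapture([{ score: 0, max: null }]).ok); // false
console.log(verifyCapture([{ score: 7, max: 10 }]).ok);   // true
```

The `error` string doubles as the feedback fed to the AI on the next iteration.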

### AI Agent Mode Enhancement

Instead of sending static regex analysis to Gemini, send runtime-observed data:

  • Actual DOM snapshot after simulation loads
  • Console log output
  • What events fired when the AI clicked interactive elements
  • What window.gameState or similar global objects contain

This gives the AI real information, not guesses.
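As a rough shape, those observations could be flattened into a single prompt context; all field names here are hypothetical, and the real `lib/geminiAgent.js` prompt format may differ.

```js
// Illustrative shape of a runtime-observation payload for the model,
// replacing the regex booleans (all field names are hypothetical).
function buildRuntimeContext({ domSnapshot, consoleLogs, firedEvents, globals }) {
  return [
    '## DOM after simulation load',
    domSnapshot,
    '## Console output',
    consoleLogs.join('\n'),
    '## Events observed during interaction',
    firedEvents.join(', '),
    '## Global state objects',
    JSON.stringify(globals, null, 2)
  ].join('\n\n');
}

const ctx = buildRuntimeContext({
  domSnapshot: '<div id="scoreDisplay">Score: 30</div>',
  consoleLogs: ['level 1 complete'],
  firedEvents: ['click #checkBtn', 'change #velocitySlider'],
  globals: { gameState: { score: 30, level: 2 } }
});
console.log(ctx.includes('scoreDisplay')); // true
```

With this context, "which variable holds the live score" stops being a guess and becomes something the model can read off directly.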


## Summary of the Problem vs. Correct Approach

| Aspect | Current (Flawed) | Correct (Playwright-Based) |
| --- | --- | --- |
| Analysis | Regex on static source | AI observes live running simulation |
| Code generation | One-shot AI guess | AI generates based on actual runtime data |
| Verification | None | Playwright confirms `storeState()` called with valid payload |
| Iteration | None | AI can fix and re-verify (up to N cycles) |
| Custom instructions | User guesses runtime behavior | AI observes runtime behavior directly |
| Result | Often sends `score: 0, max: null` silently | Verified working xAPI tracking |

## Critical Files for Implementation

  • lib/analyzer.js — replace regex-only analysis with Playwright-observed data
  • lib/geminiAgent.js — update prompt to accept runtime observations, not just static analysis
  • lib/injector.js — no fundamental changes needed; injection mechanism is fine
  • routes/upload.js — add Playwright agent loop for AI Agent mode
  • package.json — add playwright
  • NEW: lib/browserAgent.js — Playwright observation + interaction controller
  • NEW: lib/playwrightVerifier.js — window.storeState interception and payload validation

## Verification Plan

After implementing Playwright-based agent:

  1. Start local server with npm start
  2. Upload an EJS simulation ZIP in AI Agent mode
  3. Watch server logs — should show: serve → Playwright open → AI observe → AI interact → inject → verify → confirmed
  4. Check downloaded ZIP: open index.html in browser, interact, confirm window.storeState() receives real data (not {score: 0})
  5. Test edge cases: canvas-only simulations (no DOM score), P5.js simulations, quiz simulations
