The appXapiIntegratorAgent (https://iwant2study.org/lookangejss/appXapiIntegratorAgent/) is a Node.js server that:
- Accepts a ZIP of interactive HTML/JS (EJS simulations, P5.js, quizzes)
- Statically analyzes the code with regex (`lib/analyzer.js`)
- Sends that analysis to Gemini/OpenAI with an optional "custom instructions" textarea
- AI generates JavaScript xAPI tracking code (one-shot, no feedback loop)
- Injects the code into the HTML (`lib/injector.js`)
- Returns the ZIP — never verified in a real browser
The question: Is this approach fundamentally flawed? Yes. Here is why, and what the correct approach is.
## The Fundamental Flaw: Static Analysis ≠ Runtime Reality

### Flaw 1 — Regex analysis cannot capture runtime behavior
`lib/analyzer.js` detects content via regex:

```
hasGameState = /gameState|game\.state|state\s*=|Score|Points/i.test(jsContent)
hasScore = /score|points|result|answered|correct/i.test(jsContent)
```

EJS/EJSS physics simulations, P5.js sketches, and Claude-generated interactives have:
- Dynamic DOM built at runtime — not present in source HTML
- State machines that only reveal structure when running
- Canvas rendering — no queryable DOM elements at all
- Event systems that only fire when the simulation actually runs
The AI is told "this content has a score" but it cannot know:
- Which JS variable actually holds the live score
- When/how the score updates
- What events fire on correct/wrong answers
- What the simulation's internal API looks like
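The ambiguity is easy to demonstrate. In this hypothetical snippet, the analyzer's regex matches a static UI label while the real score lives in a variable whose name never matches — so "this content has a score" tells the AI nothing about which variable to read at runtime:

```javascript
const src = `
  var highScoreLabel = "High Score";  // static UI text, never updated
  let _s = 0;                         // the actual live score; name never matches
  function onCorrect() { _s += 1; }
`;

// The analyzer's check fires...
console.log(/score|points|result|answered|correct/i.test(src)); // true
// ...but it matched "highScoreLabel", not the state the simulation updates.
```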
### Flaw 2 — Generic guesses in the injected code
The timeline/quiz injection code (`lib/injector.js`) tries common guesses:

```
readDomNumberById('score') ?? readDomNumberById('points') ??
readDomNumberById('correctCount') ?? readDomNumberById('correct') ?? ...
```
For any specific simulation, most of these will find nothing. The code silently returns `null` and sends `score: 0`.
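A plausible sketch of that failure mode follows. `readDomNumberById`'s real implementation is not shown in the post; this stub assumes it reads an element's text and parses a number, and the DOM is stubbed to model a canvas-only simulation where none of the guessed ids exist:

```javascript
// Stubbed DOM: the simulation renders to canvas, so no guessed id is ever found
const document = { getElementById: () => null };

function readDomNumberById(id) {
  const el = document.getElementById(id);
  if (!el) return null;                    // silent failure: no error, no log
  const n = parseFloat(el.textContent);
  return Number.isFinite(n) ? n : null;
}

const score =
  readDomNumberById('score') ?? readDomNumberById('points') ??
  readDomNumberById('correctCount') ?? readDomNumberById('correct') ?? 0;

console.log(score); // 0: reported as the learner's score although nothing was read
```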
### Flaw 3 — Custom Instructions compensate for missing runtime knowledge
When users add custom instructions like "track the velocity slider" or "send score when level 3 completes", the AI still must **guess**:
- What is the slider's DOM id/class at runtime?
- What events does it fire?
- How does "level completion" manifest in code?
- What are valid value ranges?
None of this can be determined without running the simulation. Custom instructions are the user trying to describe runtime behavior they observed themselves — indirect and error-prone.
### Flaw 4 — No verification step exists
After injection, the ZIP is returned immediately. There is:
- No browser launch
- No check that `window.storeState()` is actually called
- No verification the payload contains meaningful data
- No way to detect that the tracking code silently failed
---
## Why This Is a Dialogic Problem Requiring Trial and Error
xAPI injection is **inherently iterative**. The correct workflow is:
```
Inject code → Run in browser → Observe what fires →
Check xAPI payloads → Fix issues → Repeat
```
This requires a feedback loop with an actual browser. The current architecture makes a **single one-shot API call** and returns — no loop, no verification, no correction.
Claude Code / OpenAI Codex operating agentically (with bash access, file writes, and browser tools) is the right paradigm because they CAN execute this loop:
- Run commands, see output
- Write code, test it, see errors
- Fix and re-test
The current app calls the AI like a function: `input → output`. The AI has no ability to observe the simulation, try something, see if it worked, and fix it.
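The missing loop can be sketched as follows. The function names (`generate`, `inject`, `verify`) are stand-ins for the AI call, the file write, and the Playwright run; none of them exist in the current app:

```javascript
// Hypothetical sketch of the feedback loop the one-shot architecture lacks
async function agentLoop({ generate, inject, verify }, maxIterations = 3) {
  const observations = [];
  for (let attempt = 1; attempt <= maxIterations; attempt++) {
    const trackingCode = await generate(observations); // AI writes tracking code
    await inject(trackingCode);                        // put it into the HTML
    const result = await verify();                     // run in a browser, check payloads
    if (result.ok) return { verified: true, attempt };
    observations.push(result.failure);                 // feed the failure back to the AI
  }
  return { verified: false, attempt: maxIterations };
}
```

Each failed verification becomes new input to the next generation step, which is exactly the "fix and re-test" capability an agentic Claude Code / Codex session has and a single API call does not.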
---
## The Correct Approach: Playwright-Based Agent Loop
### New Architecture
```
Upload ZIP
→ Extract to temp dir
→ Serve locally (express static server on random port)
→ Playwright opens the simulation in a real browser
→ Phase 1: AI OBSERVES (screenshot, a11y tree, console output)
→ Phase 2: AI INSTRUMENTS (injects monitoring code to spy on events/state)
→ Phase 3: AI INTERACTS (clicks buttons, answers questions, runs simulation)
→ Phase 4: AI READS RESULTS (what events fired, what state changed, what DOM appeared)
→ Phase 5: AI GENERATES xAPI code based on ACTUAL observed behavior
→ Inject generated code into HTML
→ Phase 6: AI VERIFIES (reload, interact again, confirm storeState() called with right payload)
→ If wrong: AI FIXES and re-verifies (up to N iterations)
   → Return verified ZIP
```

### What Changes
New dependency:
- `playwright` (specifically `chromium`) — headless browser for simulation testing

New files to create:
- `lib/browserAgent.js` — Playwright controller: serve, open, observe, interact, verify
- `lib/playwrightVerifier.js` — intercepts `window.storeState()` calls to verify payloads

Files to modify:
- `routes/upload.js` — add Playwright verification branch for AI Agent mode
- `lib/geminiAgent.js` — pass actual runtime observations (not just static analysis) to the AI
- `package.json` — add the `playwright` dependency
### The Verification Pattern
The verifier intercepts `window.storeState`:

```
// Injected by Playwright before page loads
window.__xapiCapture = [];
window.storeState = function(payload) {
  window.__xapiCapture.push(payload);
};
```

After simulation interaction, read back:

```
const captured = await page.evaluate(() => window.__xapiCapture);
// Verify: score is a number, max is set, payload is non-trivial
```

If `captured` is empty or the payload is `{score: 0, max: null}`, the AI knows the injection failed and must revise.
### AI Agent Mode Enhancement
Instead of sending static regex analysis to Gemini, send runtime-observed data:
- Actual DOM snapshot after simulation loads
- Console log output
- What events fired when the AI clicked interactive elements
- What `window.gameState` or similar global objects contain
This gives the AI real information, not guesses.
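One plausible shape for that observation object (an assumption for illustration, not the app's actual schema) is:

```javascript
// Hypothetical runtime-observation payload replacing the regex flags in the prompt
const runtimeObservations = {
  domSnapshot: '<div id="scoreDisplay">3 / 5</div>',  // DOM after load + interaction
  consoleLogs: ['model loaded', 'level 1 complete'],
  eventsFired: [
    { selector: '#checkBtn', type: 'click', effect: '#scoreDisplay text changed' },
  ],
  globals: { gameState: { level: 1, score: 3, max: 5 } }, // live values, not guesses
};

// The prompt can now name the exact variable holding the live score:
console.log(runtimeObservations.globals.gameState.score); // 3
```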
## Summary of the Problem vs. the Correct Approach
| Aspect | Current (Flawed) | Correct (Playwright-Based) |
|---|---|---|
| Analysis | Regex on static source | AI observes live running simulation |
| Code generation | One-shot AI guess | AI generates based on actual runtime data |
| Verification | None | Playwright confirms `storeState()` is called with a valid payload |
| Iteration | None | AI can fix and re-verify (up to N cycles) |
| Custom instructions | User guesses runtime behavior | AI observes runtime behavior directly |
| Result | Often silently sends `score: 0, max: null` | Verified working xAPI tracking |
## Critical Files for Implementation
- `lib/analyzer.js` — replace regex-only analysis with Playwright-observed data
- `lib/geminiAgent.js` — update prompt to accept runtime observations, not just static analysis
- `lib/injector.js` — no fundamental changes needed; the injection mechanism is fine
- `routes/upload.js` — add Playwright agent loop for AI Agent mode
- `package.json` — add `playwright`
- NEW: `lib/browserAgent.js` — Playwright observation + interaction controller
- NEW: `lib/playwrightVerifier.js` — `window.storeState` interception and payload validation
## Verification Plan
After implementing the Playwright-based agent:

- Start the local server with `npm start`
- Upload an EJS simulation ZIP in AI Agent mode
- Watch server logs — should show: serve → Playwright open → AI observe → AI interact → inject → verify → confirmed
- Check the downloaded ZIP: open `index.html` in a browser, interact, and confirm `window.storeState()` receives real data (not `{score: 0}`)
- Test edge cases: canvas-only simulations (no DOM score), P5.js simulations, quiz simulations