
Tuesday, April 14, 2026

Analysis of MathType conversion support for QTI files

 

Context

File: 2026 Sec 1 G3 Math post WA1 practice.docx

Request: Can equations from this file appear in editable form in QTI ZIP and SLS?


What's in the DOCX

Format | Count | Storage | Complexity
OLE MathType (Equation.DSMT4) | 19 | Binary .bin in word/embeddings/ | High (proprietary binary)
OMML (<m:oMath>) | 4 | XML in document.xml | Trivial (all just minus signs)

Each OLE equation has a WMF (Windows Metafile) preview image stored alongside it in word/media/imageN.wmf. These are small (416–1794 bytes), low-resolution previews. Browsers cannot display .wmf files natively.
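
The counts above can be reproduced with a short inventory script. This is a minimal sketch, not the tool's actual code: the function name count_equations is hypothetical, and it assumes the DOCX keeps its main body in word/document.xml and marks OLE objects with a ProgID attribute on <o:OLEObject>.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Standard namespaces used in WordprocessingML documents.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
M = "http://schemas.openxmlformats.org/officeDocument/2006/math"
O = "urn:schemas-microsoft-com:office:office"


def count_equations(docx_bytes):
    """Count OLE objects (grouped by ProgID) and OMML equations in a DOCX.

    Hypothetical helper for illustration; only word/document.xml is inspected.
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    ole_counts = {}
    for obj in root.iter(f"{{{W}}}object"):
        for ole in obj.iter(f"{{{O}}}OLEObject"):
            prog_id = ole.get("ProgID", "unknown")
            ole_counts[prog_id] = ole_counts.get(prog_id, 0) + 1

    omml_count = sum(1 for _ in root.iter(f"{{{M}}}oMath"))
    return {"ole": ole_counts, "omml": omml_count}
```

Run against the file in question, a result with 19 under "Equation.DSMT4" and 4 under "omml" would match the table above.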


What the Current Code Already Does

In docx_to_qti.html (lines 1171–1175 and 1263–1265), a previous AI added:

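javascript
// Runs that contain an embedded <w:object> (the MathType OLE equations)
// are replaced with a placeholder; the equation content itself is dropped.
if (run.getElementsByTagNameNS(W,'object').length) {
  parts.push('$$\\text{[equation]}$$');
  continue;
}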

OLE objects are detected and replaced with the literal text $$\text{[equation]}$$. The 4 OMML elements are silently ignored (no handling at all).

Result: All equations are lost from the output — only the placeholder text appears.


Approach Comparison

Approach A — Extract WMF Previews as Images

Idea: Each OLE equation already has a WMF preview linked via <v:imagedata r:id="..."/>. Extract the WMF → convert to PNG → include as an image token in the QTI ZIP.

Coverage | 19/19 OLE equations (all complex ones)
Editable in SLS? | ❌ No: displayed as an image, students cannot interact
Effort | High: WMF is a 1990s Windows binary vector format; browsers cannot render it
Blocker | Needs server-side conversion (LibreOffice, Inkscape, or ImageMagick) or a JS WMF renderer (none are reliable/maintained)
Feasibility | ⚠️ Medium: viable only with a Python preprocessing step on the server

What would be needed:

  • Python script using subprocess + LibreOffice headless: soffice --headless --convert-to png olePreview.wmf
  • Or use wand (ImageMagick Python binding) to convert WMF → PNG
  • Replace $$\text{[equation]}$$ with actual image tokens in iterBlockItems
  • This requires the Python backend (app.py) to handle equation pre-processing
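
A minimal sketch of the LibreOffice route, assuming soffice is on the server's PATH. The function names are hypothetical; --outdir is added to the command shown above so the PNG lands in a known folder, and whether LibreOffice renders MathType's WMF previews at usable quality is untested here.

```python
import subprocess
from pathlib import Path


def build_convert_command(wmf_path, out_dir, soffice="soffice"):
    """Build the headless LibreOffice command that converts one WMF to PNG."""
    return [
        soffice,
        "--headless",
        "--convert-to", "png",
        "--outdir", str(out_dir),
        str(wmf_path),
    ]


def convert_wmf_to_png(wmf_path, out_dir, timeout=60):
    """Run the conversion and return the path of the PNG that was written.

    Assumption: LibreOffice names the output after the input file's stem.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        build_convert_command(wmf_path, out_dir),
        check=True,       # raise if soffice exits with an error
        timeout=timeout,  # avoid hanging the server on a stuck conversion
    )
    png_path = out_dir / (Path(wmf_path).stem + ".png")
    if not png_path.exists():
        raise RuntimeError(f"conversion produced no output for {wmf_path}")
    return png_path
```

Keeping the command builder separate from the subprocess call makes the command easy to log and to check without LibreOffice installed.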

Approach B — OMML → MathML (browser-feasible, but trivial equations only)

Idea: Parse <m:oMath> elements and convert to MathML using Microsoft's published OMML-to-MathML XSLT stylesheet. MathML can be embedded directly in QTI 2.1.

Coverage | 4/4 OMML equations, but all 4 are just minus signs (-)
Editable in SLS? | ⚠️ Unknown: SLS QTI MathML support is unverified
Effort | Medium: the XSLT is available; the browser can parse it with DOMParser and apply it with XSLTProcessor
Blocker | Solves none of the 19 complex equations; SLS MathML support unconfirmed
Feasibility | ✅ High technically, but near-zero practical impact for this document
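
For equations as trivial as these four, the conversion can be illustrated without the XSLT. The sketch below is not Microsoft's stylesheet and is not a general converter: the hypothetical simple_omml_to_mathml handles only flat text runs (<m:r>/<m:t>) and returns None for anything structural, such as fractions or radicals.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

M = "http://schemas.openxmlformats.org/officeDocument/2006/math"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Numbers (with optional decimal part), single letters, or any other symbol.
TOKEN_RE = re.compile(r"\d+(?:\.\d+)?|[^\W\d_]|\S")


def simple_omml_to_mathml(omath_xml):
    """Convert a flat <m:oMath> (text runs only) to a MathML string.

    Returns None when the equation contains anything other than <m:r> runs,
    so the caller can fall back to a full converter.
    """
    omath = ET.fromstring(omath_xml)
    parts = []
    for child in omath:
        if child.tag != f"{{{M}}}r":
            return None  # structural element: out of scope for this sketch
        text = "".join(t.text or "" for t in child.findall(f"{{{M}}}t"))
        for token in TOKEN_RE.findall(text):
            if token[0].isdigit():
                tag = "mn"   # number
            elif token.isalpha():
                tag = "mi"   # identifier
            else:
                tag = "mo"   # operator, e.g. the minus sign
            parts.append(f"<{tag}>{escape(token)}</{tag}>")
    body = "".join(parts)
    return f'<math xmlns="{MATHML_NS}">{body}</math>'
```

A lone minus sign comes out as a single <mo> element, which is all this document would need from Approach B.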

Approach C — Parse OLE Binary (MTEF) → MathML/LaTeX

Idea: The .bin files contain MathType's proprietary MTEF (MathType Equation Format) binary. Parse MTEF directly → convert to LaTeX/MathML.

Coverage | 19/19 OLE equations
Editable in SLS? | ✅ Yes, if SLS supports LaTeX rendering
Effort | 🔴 Extremely High: weeks to months
Blocker | MTEF is partially documented but complex; no maintained open-source JS/Python parser
Feasibility | ❌ Not recommended
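
To show where the difficulty starts: each .bin is normally an OLE compound file, a container with its own internal directory, and the MTEF payload sits in a stream inside it. The sketch below only confirms the container type by its 8-byte signature. The function name is hypothetical. Reading the stream itself would need a compound-file reader, and its name (commonly described as "Equation Native") is an assumption to verify, not something checked here.

```python
import io
import zipfile

# Signature at the start of every OLE compound file.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def list_ole_embeddings(docx_bytes):
    """Return the word/embeddings/*.bin entries that are OLE compound files."""
    found = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        for name in sorted(zf.namelist()):
            if name.startswith("word/embeddings/") and name.endswith(".bin"):
                with zf.open(name) as fh:
                    if fh.read(8) == OLE_MAGIC:
                        found.append(name)
    return found
```

Everything past this check (walking the compound file, then decoding MTEF records into LaTeX or MathML) is the part estimated above at weeks to months.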

Approach D — Server-side Preprocessing Pipeline (Recommended if pursuing this)

Idea: Add a Python preprocessing step that converts the DOCX before the browser tool processes it:

  1. Open DOCX, find each <w:object> with Equation.DSMT4
  2. Extract the linked WMF preview via its relationship ID
  3. Convert WMF → PNG using LibreOffice headless or ImageMagick
  4. Replace the OLE object in the XML with a <w:drawing> containing the PNG
  5. Save a modified DOCX → pass to existing browser converter

Coverage | 19/19 OLE equations
Editable in SLS? | ❌ No: images only
Effort | Medium (3–5 days): Python script + server integration
Feasibility | ✅ Viable, but adds a server dependency to what is currently a pure browser tool
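
Steps 1 and 2 of the pipeline can be sketched as follows. This is an illustrative outline rather than the planned converter.py code: find_equation_previews is a hypothetical name, and it assumes each <w:object> carries both an <o:OLEObject> and a <v:imagedata> whose r:id resolves through word/_rels/document.xml.rels. Steps 3 to 5 (conversion, XML rewrite, re-zipping) are not shown.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
O = "urn:schemas-microsoft-com:office:office"
V = "urn:schemas-microsoft-com:vml"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"


def find_equation_previews(docx_bytes, prog_id="Equation.DSMT4"):
    """Return ZIP paths of the preview images linked to matching OLE objects."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        doc = ET.fromstring(zf.read("word/document.xml"))
        rels = ET.fromstring(zf.read("word/_rels/document.xml.rels"))

    # Map relationship IDs (rId7, ...) to their targets (media/image3.wmf, ...).
    targets = {
        rel.get("Id"): rel.get("Target")
        for rel in rels.iter(f"{{{PKG_REL}}}Relationship")
    }

    previews = []
    for obj in doc.iter(f"{{{W}}}object"):
        ole = obj.find(f".//{{{O}}}OLEObject")
        image = obj.find(f".//{{{V}}}imagedata")
        if ole is None or image is None or ole.get("ProgID") != prog_id:
            continue
        target = targets.get(image.get(f"{{{R}}}id"))
        if target:
            # Targets are relative to word/ unless they start with "/".
            path = posixpath.normpath(posixpath.join("word", target))
            previews.append(path.lstrip("/"))
    return previews
```

Each returned path can be read from the ZIP, written to a temporary file, and passed to the WMF → PNG conversion in step 3.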

Summary Recommendation

Approach | Editable? | Browser-only? | Effort | Practical?
Current (placeholder text) | ❌ | ✅ | Done | Minimal fidelity
A: WMF → PNG | ❌ | ❌ (needs server) | High | Good fidelity, not editable
B: OMML → MathML | ⚠️ | ✅ | Medium | Only 4 trivial equations
C: MTEF → LaTeX | ✅ | ❌ (no maintained parser) | Extreme | Not feasible now
D: Server preprocessing | ❌ | ❌ (needs server) | Medium | Best trade-off if server available

Bottom line:

  • Producing editable math equations in QTI/SLS from legacy MathType OLE objects is not currently feasible in a browser-only tool
  • If the Python backend (app.py) is extended, server-side WMF→PNG conversion (Approach D) gives the best practical outcome — equations appear as images, which SLS can display
  • The cleanest long-term solution is for teachers to re-author equations in Word's native OMML math editor (Insert > Equation) instead of the legacy MathType plugin — OMML can be parsed and converted to MathML in-browser

Files Involved (if implementing Approach D)

  • docx_to_qti.html: change the OLE handler to support image tokens from the pre-processed DOCX
  • converter.py: add WMF→PNG preprocessing using LibreOffice/ImageMagick
  • app.py: expose a preprocessing endpoint or integrate it into the existing DOCX upload flow
  • requirements.txt: add wand or Pillow; LibreOffice headless (or ImageMagick) is a system dependency and must be installed on the server separately
