Context
File: 2026 Sec 1 G3 Math post WA1 practice.docx
Request: Can equations from this file appear in editable form in QTI ZIP and SLS?
What's in the DOCX
| Format | Count | Storage | Complexity |
|---|---|---|---|
| OLE MathType (Equation.DSMT4) | 19 | Binary .bin in word/embeddings/ | High: proprietary binary |
| OMML (<m:oMath>) | 4 | XML in document.xml | Trivial: all just minus signs |
Each OLE equation has a WMF (Windows Metafile) preview image stored alongside it
in word/media/imageN.wmf. These are small (416–1794 bytes), low-resolution previews.
Browsers cannot display .wmf files natively.
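Counts like the ones in the table can be checked with a short script. The following is a minimal sketch, not the method used to produce the table: it assumes the equations live in word/document.xml only (headers, footers, and footnotes are ignored), and the function name count_equations is hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML, Office Math, and legacy Office (OLE) namespaces.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
M = "http://schemas.openxmlformats.org/officeDocument/2006/math"
O = "urn:schemas-microsoft-com:office:office"


def count_equations(docx_bytes):
    """Count MathType OLE objects and OMML equations in word/document.xml.

    Hypothetical helper: only the main document part is inspected.
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    ole = 0
    for obj in root.iter(f"{{{W}}}object"):
        for ole_obj in obj.iter(f"{{{O}}}OLEObject"):
            # ProgID is an unqualified attribute on <o:OLEObject>.
            if ole_obj.get("ProgID", "").startswith("Equation.DSMT"):
                ole += 1

    # Each <m:oMath> element is one native Word equation.
    omml = sum(1 for _ in root.iter(f"{{{M}}}oMath"))
    return {"ole_mathtype": ole, "omml": omml}
```

Calling count_equations(open(path, "rb").read()) on the practice paper should report the two formats separately, which makes it easy to see how much of a given document each approach below could cover.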
What the Current Code Already Does
In docx_to_qti.html (lines 1171–1175 and 1263–1265), a previous AI added:

    if (run.getElementsByTagNameNS(W,'object').length) {
      parts.push('$$\\text{[equation]}$$');
      continue;
    }

OLE objects are detected and replaced with the literal text $$\text{[equation]}$$.
The 4 OMML elements are silently ignored (no handling at all).
Result: All equations are lost from the output — only the placeholder text appears.
Approach Comparison
Approach A — Extract WMF Previews as Images
Idea: Each OLE equation already has a WMF preview linked via <v:imagedata r:id="..."/>.
Extract the WMF → convert to PNG → include as an image token in the QTI ZIP.
| Aspect | Assessment |
|---|---|
| Coverage | 19/19 OLE equations (all complex ones) |
| Editable in SLS? | ❌ No: displayed as an image, so students cannot interact with it |
| Effort | High: WMF is a 1990s Windows binary vector format that browsers cannot render |
| Blocker | Needs server-side conversion (LibreOffice, Inkscape, or ImageMagick) or a JS WMF renderer (none are reliable or maintained) |
| Feasibility | ⚠️ Medium: viable only with a Python preprocessing step on the server |
What would be needed:
- Python script using subprocess + LibreOffice headless: soffice --headless --convert-to png olePreview.wmf
- Or use wand (ImageMagick Python binding) to convert WMF → PNG
- Replace $$\text{[equation]}$$ with actual image tokens in iterBlockItems
- This requires the Python backend (app.py) to handle equation pre-processing
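The LibreOffice route in the first bullet could look roughly like the sketch below. It assumes soffice is installed and on PATH on the server, and that LibreOffice's WMF import renders these MathType previews acceptably, which has not been verified here. Both function names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_convert_command(wmf_path, out_dir):
    """Build the LibreOffice headless command that converts one WMF to PNG."""
    return [
        "soffice",
        "--headless",
        "--convert-to", "png",
        "--outdir", out_dir,
        wmf_path,
    ]


def convert_wmf_to_png(wmf_path, out_dir):
    """Run the conversion and return the path where the PNG is expected.

    Assumes `soffice` is on PATH; raises CalledProcessError if it exits non-zero.
    """
    subprocess.run(build_convert_command(wmf_path, out_dir), check=True, timeout=60)
    # LibreOffice names the output file after the input file's stem.
    return Path(out_dir) / (Path(wmf_path).stem + ".png")
```

Keeping the command builder separate from the subprocess call makes the command easy to inspect and test without LibreOffice installed.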
Approach B — OMML → MathML (browser-feasible, but trivial equations only)
Idea: Parse <m:oMath> elements and convert to MathML using Microsoft's published
OMML-to-MathML XSLT stylesheet. MathML can be embedded directly in QTI 2.1.
| Aspect | Assessment |
|---|---|
| Coverage | 4/4 OMML equations, but all 4 are just minus signs (-) |
| Editable in SLS? | ⚠️ Unknown: SLS QTI MathML support is unverified |
| Effort | Medium: the XSLT is available, and the browser can parse it with DOMParser and apply it with XSLTProcessor |
| Blocker | Solves none of the 19 complex equations; SLS MathML support unconfirmed |
| Feasibility | ✅ High technically, but near-zero practical impact for this document |
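Because all four OMML equations here are single minus signs, the trivial case can be illustrated without the XSLT at all. The sketch below is a deliberate simplification and not the OMML-to-MathML stylesheet approach described above: it handles only an oMath element made of plain text runs containing a single operator, and returns None for anything else. The function name and the OPERATORS table are hypothetical.

```python
import xml.etree.ElementTree as ET

M = "http://schemas.openxmlformats.org/officeDocument/2006/math"
MATHML = "http://www.w3.org/1998/Math/MathML"

# Single-character operators this sketch knows how to emit.
# An ASCII hyphen is mapped to the Unicode minus sign (U+2212).
OPERATORS = {"-": "\u2212", "\u2212": "\u2212", "+": "+", "=": "="}


def simple_omml_to_mathml(omath_xml):
    """Convert a text-only <m:oMath> to MathML, or return None if it is not trivial."""
    root = ET.fromstring(omath_xml)
    # Only handle equations whose direct children are all plain runs (<m:r>).
    if any(child.tag != f"{{{M}}}r" for child in root):
        return None
    text = "".join(t.text or "" for t in root.iter(f"{{{M}}}t")).strip()
    if text not in OPERATORS:
        return None
    return f'<math xmlns="{MATHML}"><mo>{OPERATORS[text]}</mo></math>'
```

Anything with fractions, radicals, or scripts falls through to None, which is where a full stylesheet-based conversion would be needed.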
Approach C — Parse OLE Binary (MTEF) → MathML/LaTeX
Idea: The .bin files contain MathType's proprietary MTEF (MathType Equation Format)
binary. Parse MTEF directly → convert to LaTeX/MathML.
| Aspect | Assessment |
|---|---|
| Coverage | 19/19 OLE equations |
| Editable in SLS? | ✅ Yes, provided SLS renders LaTeX |
| Effort | 🔴 Extremely high: weeks to months |
| Blocker | MTEF is partially documented but complex; no maintained open-source JS/Python parser |
| Feasibility | ❌ Not recommended |
Approach D — Server-side Preprocessing Pipeline (Recommended if pursuing this)
Idea: Add a Python preprocessing step that converts the DOCX before the browser tool processes it:
- Open DOCX, find each <w:object> with Equation.DSMT4
- Extract the linked WMF preview via its relationship ID
- Convert WMF → PNG using LibreOffice headless or ImageMagick
- Replace the OLE object in the XML with a <w:drawing> containing the PNG
- Save a modified DOCX → pass to existing browser converter
| Aspect | Assessment |
|---|---|
| Coverage | 19/19 OLE equations |
| Editable in SLS? | ❌ No: images only |
| Effort | Medium (3–5 days): Python script + server integration |
| Feasibility | ✅ Viable, but adds a server dependency to what is currently a pure browser tool |
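The first two steps of this pipeline (finding each MathType object and resolving its preview through the relationship ID) can be sketched as follows. This is a minimal illustration under a few assumptions: relationships are read from word/_rels/document.xml.rels, targets are relative to the word/ folder, and only the main document part is scanned. The function name find_equation_previews is hypothetical.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
V = "urn:schemas-microsoft-com:vml"
O = "urn:schemas-microsoft-com:office:office"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"


def find_equation_previews(docx_bytes):
    """Return ZIP paths of the preview images for MathType OLE objects, in document order."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        doc = ET.fromstring(zf.read("word/document.xml"))
        rels = ET.fromstring(zf.read("word/_rels/document.xml.rels"))

    # Map relationship IDs (e.g. "rId8") to their targets (e.g. "media/image1.wmf").
    targets = {
        rel.get("Id"): rel.get("Target")
        for rel in rels.iter(f"{{{PKG_REL}}}Relationship")
    }

    previews = []
    for obj in doc.iter(f"{{{W}}}object"):
        ole = obj.find(f".//{{{O}}}OLEObject")
        image = obj.find(f".//{{{V}}}imagedata")
        if ole is None or image is None:
            continue
        if not ole.get("ProgID", "").startswith("Equation.DSMT"):
            continue
        target = targets.get(image.get(f"{{{R}}}id"))
        if target:
            # Assumption: targets are relative to the word/ folder.
            previews.append(posixpath.normpath(posixpath.join("word", target)))
    return previews
```

The returned paths can be read straight out of the ZIP and handed to whichever WMF → PNG converter is chosen; swapping each OLE object for a <w:drawing> is the larger remaining piece of work and is not shown here.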
Summary Recommendation
| Approach | Editable? | Browser-only? | Effort | Practical? |
|---|---|---|---|---|
| Current (placeholder text) | ❌ | ✅ | Done | Minimal fidelity |
| A: WMF → PNG | ❌ | ❌ (needs server) | High | Good fidelity, not editable |
| B: OMML → MathML | ⚠️ | ✅ | Medium | Only 4 trivial equations |
| C: MTEF → LaTeX | ✅ | ❌ | Extreme | Not feasible now |
| D: Server preprocessing | ❌ | ❌ | Medium | Best trade-off if server available |
Bottom line:
- Producing editable math equations in QTI/SLS from legacy MathType OLE objects is not currently feasible in a browser-only tool
- If the Python backend (app.py) is extended, server-side WMF→PNG conversion (Approach D) gives the best practical outcome: equations appear as images, which SLS can display
- The cleanest long-term solution is for teachers to re-author equations in Word's native OMML math editor (Insert > Equation) instead of the legacy MathType plugin, since OMML can be parsed and converted to MathML in-browser
Files Involved (if implementing Approach D)
- docx_to_qti.html: change OLE handler to support image tokens from pre-processed DOCX
- converter.py: add WMF→PNG preprocessing using LibreOffice/ImageMagick
- app.py: expose preprocessing endpoint or integrate into existing DOCX upload flow
- requirements.txt: add wand or Pillow + LibreOffice headless dependency