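I exported a draw.io diagram from the local draw.io application as a PNG. The XML is hidden inside this PNG file, probably in a "tEXt" chunk. I'm trying to "borrow" the parsePng implementation from the draw.io JS and convert it to Python. The XML is supposed to be hidden in zTXt, but I only see tEXt (https://www.diagrams.net/blog/xml-in-png).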
import png
filename="./image3.png"
im=png.Reader(filename)
ihdr, text, *rest = im.chunks()
chunk_type, chunk_bytes = text
vals = chunk_bytes.decode("utf-8").split("".join(map(chr, [0])))
print(vals)
These are the available chunks:
python test.py
b'IHDR' 13
b'tEXt' 1031
b'IDAT' 4709
b'IEND' 0
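The positional unpacking above (`ihdr, text, *rest`) only works while the text chunk happens to be the second chunk. As a rough sketch, the chunk could instead be looked up by type and keyword using only the standard library. The helper names (`iter_png_chunks`, `find_text_chunk`, `make_chunk`) and the tiny in-memory demo file are illustrative assumptions, not part of draw.io or pypng; the zTXt branch follows the PNG chunk layout (keyword, NUL, one compression-method byte, zlib data).

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def iter_png_chunks(data):
    """Yield (chunk_type, chunk_body) for each chunk in raw PNG bytes."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        yield chunk_type, data[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 (length) + 4 (type) + body + 4 (CRC)

def find_text_chunk(data, keyword=b"mxfile"):
    """Return the text stored under `keyword` in a tEXt or zTXt chunk, or None."""
    for chunk_type, body in iter_png_chunks(data):
        key, _, rest = body.partition(b"\x00")
        if key != keyword:
            continue
        if chunk_type == b"tEXt":
            return rest.decode("latin-1")
        if chunk_type == b"zTXt":
            # rest[0] is the compression method byte; the remainder is zlib data.
            return zlib.decompress(rest[1:]).decode("latin-1")
    return None

def make_chunk(chunk_type, body):
    """Build one PNG chunk (hypothetical helper, used only to create demo data)."""
    crc = zlib.crc32(chunk_type + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)

# Demo bytes with the same chunk order as the listing above, minus IDAT.
# This is enough for chunk parsing but is not a displayable image.
demo_png = (PNG_SIGNATURE
            + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
            + make_chunk(b"tEXt", b"mxfile\x00%3Cmxfile%3E%3C%2Fmxfile%3E")
            + make_chunk(b"IEND", b""))

print([(t, len(b)) for t, b in iter_png_chunks(demo_png)])
print(find_text_chunk(demo_png))
```

For a real file, the bytes would come from `open("./image3.png", "rb").read()` instead of `demo_png`.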
This is the output I currently get (I assume the XML is hidden somewhere in this string, percent-encoded, but I can't get at it):
['mxfile', '%3Cmxfile%20host%3D%22Electron%22%20modified%3D%222021-11-15T10%3A44%3A54.487Z%22%20agent%3D%225.0%20(Macintosh%3B%20Intel%20Mac%20OS%20X%2011_6_1)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20draw.io%2F14.5.1%20Chrome%2F89.0.4389.82%20Electron%2F12.0.1%20Safari%2F537.36%22%20etag%3D%22S6Lk2QkhAN9aeDDzQv4n%22%20version%3D%2214.5.1%22%20type%3D%22device%22%3E%3Cdiagram%20id%3D%223ZARfinUemRlELbDbWll%22%20name%3D%22Page-1%22%3EtZTBcoIwEEC%2FhmNngFihV6ltZ6rtgUPPGVghncAycRHo1zdIEClq9eAJ8rJhsy9LLBZk9aviRbrGGKTl2nFtsWfLdR3HnutHS5qOeMzrQKJEbIIGEIofMNA2tBQxbEeBhChJFGMYYZ5DRCPGlcJqHLZBOc5a8AQmIIy4nNIvEVPaUd%2F1Bv4GIkn7zM78qZvJeB9sKtmmPMbqCLGlxQKFSN1bVgcgW3m9l27dy5nZw8YU5HTNgk%2B22WTOqi4X4cc6iJj37rsP7qz7zI7L0lRsdktNrwBibcQMUVGKCeZcLge6UFjmMbR5bD0aYlaIhYaOht9A1Jjj5SWhRill0sx2OdtEZ4szaIuliuBSRaYA4ioBuhTIDoeguxcwA1KNXqhAchK78U64aaPkEDeY1i9G9i3i3Yl4kaDSxJkcwKC3dVWlgiAs%2BN5Cpf%2B6Uyp3oAjqyzKnpZsF7NG0bPNnXA1%2FgNO3dXrU%2FXP7XrbYOVsn2lVKfTnA%2F6bGWu%2FgbeZf6c2%2F3ZseDlfHfu7oAmbLXw%3D%3D%3C%2Fdiagram%3E%3C%2Fmxfile%3E']
The output I'd like to get (at least what is inside the root tag):
<?xml version="1.0" encoding="UTF-8"?>
<mxfile host="Electron" modified="2021-11-15T12:30:17.738Z" agent="5.0 (Macintosh; Intel Mac OS X 11_6_1) AppleWebKit/537.36 (KHTML, like Gecko) draw.io/14.5.1 Chrome/89.0.4389.82 Electron/12.0.1 Safari/537.36" etag="f7nqQOQ3-W-PKNeU6aKq" version="14.5.1" type="device">
<diagram id="3ZARfinUemRlELbDbWll" name="Page-1">
<mxGraphModel dx="1106" dy="737" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="827" pageHeight="1169" math="0" shadow="0">
<root>
<mxCell id="0" />
<mxCell id="1" parent="0" />
<mxCell id="O3ffm1LxuBSNMCc37K82-24" value="" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;" edge="1" parent="1" source="O3ffm1LxuBSNMCc37K82-22" target="O3ffm1LxuBSNMCc37K82-23">
<mxGeometry relative="1" as="geometry" />
</mxCell>
<mxCell id="O3ffm1LxuBSNMCc37K82-22" value="igor 1" style="rounded=1;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="350" y="350" width="120" height="60" as="geometry" />
</mxCell>
<mxCell id="O3ffm1LxuBSNMCc37K82-23" value="igor 2" style="ellipse;whiteSpace=wrap;html=1;rounded=1;" vertex="1" parent="1">
<mxGeometry x="350" y="480" width="120" height="80" as="geometry" />
</mxCell>
</root>
</mxGraphModel>
</diagram>
</mxfile>
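The XML above represents a draw.io diagram in which a rounded rectangle labeled "igor 1" is connected by an edge to an ellipse labeled "igor 2".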
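Note: this may be a duplicate of an existing question, but that one never received a correct answer (How to programmatically extract XML data from a draw.io PNG).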
Answered 2021-11-15 14:11:44
The output you are getting now is URI-encoded. Decoding it yields this:
<mxfile host="Electron" modified="2021-11-15T10:44:54.487Z" agent="5.0 (Macintosh; Intel Mac OS X 11_6_1) AppleWebKit/537.36 (KHTML, like Gecko) draw.io/14.5.1 Chrome/89.0.4389.82 Electron/12.0.1 Safari/537.36" etag="S6Lk2QkhAN9aeDDzQv4n" version="14.5.1" type="device"><diagram id="3ZARfinUemRlELbDbWll" name="Page-1">tZTBcoIwEEC/hmNngFihV6ltZ6rtgUPPGVghncAycRHo1zdIEClq9eAJ8rJhsy9LLBZk9aviRbrGGKTl2nFtsWfLdR3HnutHS5qOeMzrQKJEbIIGEIofMNA2tBQxbEeBhChJFGMYYZ5DRCPGlcJqHLZBOc5a8AQmIIy4nNIvEVPaUd/1Bv4GIkn7zM78qZvJeB9sKtmmPMbqCLGlxQKFSN1bVgcgW3m9l27dy5nZw8YU5HTNgk+22WTOqi4X4cc6iJj37rsP7qz7zI7L0lRsdktNrwBibcQMUVGKCeZcLge6UFjmMbR5bD0aYlaIhYaOht9A1Jjj5SWhRill0sx2OdtEZ4szaIuliuBSRaYA4ioBuhTIDoeguxcwA1KNXqhAchK78U64aaPkEDeY1i9G9i3i3Yl4kaDSxJkcwKC3dVWlgiAs+N5Cpf+6Uyp3oAjqyzKnpZsF7NG0bPNnXA1/gNO3dXrU/XP7XrbYOVsn2lVKfTnA/6bGWu/gbeZf6c2/3ZseDlfHfu7oAmbLXw==</diagram></mxfile>
We can see that the data is contained in the diagram tag. Thanks to a handy tool on draw.io, we can see that this data is compressed using pako (a JavaScript port of zlib).
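In Python terms, the relevant detail is that the payload seems to be raw deflate data, that is, without the usual zlib header and checksum. The sketch below builds a small payload with the inverse steps (URI-encode, raw-deflate, Base64) and decodes it again. The helper names and the sample XML are made up for illustration, and the encoding order is an assumption inferred from the decoding steps, not taken from the draw.io source.

```python
import base64
import zlib
from urllib.parse import quote, unquote

def deflate_raw(data):
    """Compress without the zlib header/checksum (negative wbits means raw deflate)."""
    compressor = zlib.compressobj(wbits=-15)
    return compressor.compress(data) + compressor.flush()

def inflate_raw(data):
    """Inverse of deflate_raw."""
    return zlib.decompress(data, wbits=-15)

xml = '<mxGraphModel dx="1106" dy="737"><root/></mxGraphModel>'

# Assumed encoding order: URI-encode, raw-deflate, then Base64.
uri_encoded = quote(xml, safe="~()*!.'").encode("utf-8")
payload = base64.b64encode(deflate_raw(uri_encoded)).decode("ascii")

# Decoding reverses each step.
restored = unquote(inflate_raw(base64.b64decode(payload)).decode("utf-8"))
print(restored == xml)

# A plain zlib.decompress expects a zlib header and rejects raw deflate data.
try:
    zlib.decompress(base64.b64decode(payload))
except zlib.error as exc:
    print("decoding with default wbits failed:", exc)
```

This is why the decompression has to be set up with `-15` rather than the default window setting.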
Thankfully, another Stack Overflow user has already written a Python equivalent of the pako methods. Using it, we can extend your program to get the diagram's XML:
from urllib.parse import quote, unquote
import xml.etree.ElementTree as ET
import zlib
import base64

def js_encode_uri_component(data):
    return quote(data, safe='~()*!.\'')

def js_decode_uri_component(data):
    return unquote(data)

def js_string_to_byte(data):
    return bytes(data, 'iso-8859-1')

def js_bytes_to_string(data):
    return data.decode('iso-8859-1')

def js_btoa(data):
    return base64.b64encode(data)

def js_atob(data):
    return base64.b64decode(data)

def pako_inflate_raw(data):
    # wbits=-15 selects raw deflate (no zlib header or checksum)
    decompress = zlib.decompressobj(-15)
    decompressed_data = decompress.decompress(data)
    decompressed_data += decompress.flush()
    return decompressed_data

original_data = '%3Cmxfile%20host%3D%22Electron%22%20modified%3D%222021-11-15T10%3A44%3A54.487Z%22%20agent%3D%225.0%20(Macintosh%3B%20Intel%20Mac%20OS%20X%2011_6_1)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20draw.io%2F14.5.1%20Chrome%2F89.0.4389.82%20Electron%2F12.0.1%20Safari%2F537.36%22%20etag%3D%22S6Lk2QkhAN9aeDDzQv4n%22%20version%3D%2214.5.1%22%20type%3D%22device%22%3E%3Cdiagram%20id%3D%223ZARfinUemRlELbDbWll%22%20name%3D%22Page-1%22%3EtZTBcoIwEEC%2FhmNngFihV6ltZ6rtgUPPGVghncAycRHo1zdIEClq9eAJ8rJhsy9LLBZk9aviRbrGGKTl2nFtsWfLdR3HnutHS5qOeMzrQKJEbIIGEIofMNA2tBQxbEeBhChJFGMYYZ5DRCPGlcJqHLZBOc5a8AQmIIy4nNIvEVPaUd%2F1Bv4GIkn7zM78qZvJeB9sKtmmPMbqCLGlxQKFSN1bVgcgW3m9l27dy5nZw8YU5HTNgk%2B22WTOqi4X4cc6iJj37rsP7qz7zI7L0lRsdktNrwBibcQMUVGKCeZcLge6UFjmMbR5bD0aYlaIhYaOht9A1Jjj5SWhRill0sx2OdtEZ4szaIuliuBSRaYA4ioBuhTIDoeguxcwA1KNXqhAchK78U64aaPkEDeY1i9G9i3i3Yl4kaDSxJkcwKC3dVWlgiAs%2BN5Cpf%2B6Uyp3oAjqyzKnpZsF7NG0bPNnXA1%2FgNO3dXrU%2FXP7XrbYOVsn2lVKfTnA%2F6bGWu%2FgbeZf6c2%2F3ZseDlfHfu7oAmbLXw%3D%3D%3C%2Fdiagram%3E%3C%2Fmxfile%3E'
uri_decoded_data = js_decode_uri_component(original_data)
## Extract diagram data from resulting XML
root = ET.fromstring(uri_decoded_data)
diagram_data = root[0].text
## Decode Base64
diagram_data = js_atob(diagram_data)
decompressed_diagram_data = pako_inflate_raw(diagram_data)
## Turn decompressed data into a usable string
string_diagram_data = js_bytes_to_string(decompressed_diagram_data)
string_diagram_data = js_decode_uri_component(string_diagram_data)
print(string_diagram_data)
Output (formatted):
<mxGraphModel dx="1106" dy="737" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="827" pageHeight="1169" math="0" shadow="0">
<root>
<mxCell id="0"/>
<mxCell id="1" parent="0"/>
<mxCell id="O3ffm1LxuBSNMCc37K82-24" value="" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;" edge="1" parent="1" source="O3ffm1LxuBSNMCc37K82-22" target="O3ffm1LxuBSNMCc37K82-23">
<mxGeometry relative="1" as="geometry"/>
</mxCell>
<mxCell id="O3ffm1LxuBSNMCc37K82-22" value="igor 1" style="rounded=1;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="350" y="350" width="120" height="60" as="geometry"/>
</mxCell>
<mxCell id="O3ffm1LxuBSNMCc37K82-23" value="igor 2" style="ellipse;whiteSpace=wrap;html=1;rounded=1;" vertex="1" parent="1">
<mxGeometry x="350" y="480" width="120" height="80" as="geometry"/>
</mxCell>
</root>
</mxGraphModel>
https://stackoverflow.com/questions/69974526