Import and Export
This document describes the import and export capabilities of s3dgraphy, the core Python library for managing Extended Matrix graphs.
Note
s3dgraphy version 0.1.31 supports:
Import: GraphML, XLSX (with JSON mapping), SQLite/pyArchInit (with JSON mapping)
Export: JSON, GraphML
Import System
s3dgraphy provides a flexible import system that can read data from multiple formats and convert them into the Extended Matrix graph structure.
GraphML Import
GraphML is the primary interchange format for Extended Matrix graphs. The GraphML importer reads graph structure, nodes, edges, and attributes.
Basic GraphML Import
from s3dgraphy.importer import GraphMLImporter
from s3dgraphy import Graph
# Create a new graph
graph = Graph("pompeii_excavation")
# Create importer and parse GraphML file
importer = GraphMLImporter("excavation_data.graphml")
graph = importer.parse()
# Check import results
print(f"Imported {len(graph.nodes)} nodes")
print(f"Imported {len(graph.edges)} edges")
# Check for warnings
if graph.warnings:
print("\nImport warnings:")
for warning in graph.warnings:
print(f" - {warning}")
GraphML Import Features
The GraphML importer includes:
Automatic node type detection from GraphML attributes and visual properties
Edge type mapping from visual line styles to semantic types:
solid line →
is_after/is_beforedouble line →
has_same_timedotted line →
changed_fromdashed line →
has_data_provenancedashed-dotted line →
contrasts_with
Container group node support: group nodes with specific background colours are converted to regular stratigraphic nodes with
is_part_ofedges:#9B3333(dark red) → US container#D86400(orange) → USD container#B19F61(gold) → VSF container
Comment node skipping: yEd annotation nodes with yellow fill colours (
#FFCC00,#FFFF00,#FFFF99) are automatically detected and skippedAttribute preservation for all node and edge properties
Warning system for incomplete or malformed data
Support for multilingual content (name, description fields)
Placeholder date detection (XX values) with warnings
# Import with detailed validation
importer = GraphMLImporter("site_data.graphml")
graph = importer.parse()
# Validate imported data
print("\n=== Import Summary ===")
print(f"Total nodes: {len(graph.nodes)}")
print(f"Total edges: {len(graph.edges)}")
# Count nodes by type
from collections import Counter
node_types = Counter(node.node_type for node in graph.nodes)
print("\nNodes by type:")
for node_type, count in node_types.items():
print(f" {node_type}: {count}")
# Count edges by type
edge_types = Counter(edge.edge_type for edge in graph.edges)
print("\nEdges by type:")
for edge_type, count in edge_types.items():
print(f" {edge_type}: {count}")
UUID Slipback System
The GraphML importer implements a slipback mechanism that automatically writes UUIDs back to the source GraphML file after import. This enables persistent identity across multiple edit-import cycles.
How it works:
First import: Parser generates UUIDs for all nodes and edges
Slipback: UUIDs are immediately written to custom
EMIDfields in the GraphML fileSubsequent imports: Parser reuses existing EMIDs instead of generating new UUIDs
Edit cycle: Users can edit the GraphML in yEd and reimport while preserving node/edge identity
Custom fields:
EMID(Extended Matrix ID): Stores the UUID for nodes and edgesURI(Uniform Resource Identifier): Reserved for future linking capabilities
from s3dgraphy.importer import GraphMLImporter
# First import - generates and writes UUIDs
importer = GraphMLImporter("site_data.graphml")
graph = importer.parse()
# Console output: "Generated new UUID: abc123... for node n1"
# Console output: "[GraphML Slipback] SUCCESS: Updated 50 nodes and 30 edges"
# Edit the GraphML file in yEd (rename nodes, add edges, etc.)
# ...
# Second import - reuses existing UUIDs
importer2 = GraphMLImporter("site_data.graphml")
graph2 = importer2.parse()
# Console output: "Reusing existing EMID as node ID: abc123... for node n1"
Duplicate EMID validation:
The system includes automatic detection of duplicate EMIDs, which can occur when duplicating nodes in yEd (Ctrl+D):
[GraphML Parser] Reusing existing EMID as node ID: aaaa-bbbb-cccc for node n1
⚠️ WARNING: Duplicate EMID detected! EMID aaaa-bbbb-cccc... is already used.
This usually happens when duplicating nodes in yEd (Ctrl+D).
Generating NEW UUID for node n2 to avoid conflicts.
When a duplicate is detected:
The first occurrence (by document order) keeps the original EMID
The second occurrence receives a new UUID
A warning message alerts the user
The GraphML file is updated via slipback with the new UUID
Note
The system cannot distinguish which node is the “original” vs “duplicated” - it only knows that two nodes have the same EMID. The first node encountered in the GraphML file (document order) is considered the original.
Current limitations:
Edge recreation: When you delete and recreate an edge in yEd, yEd does NOT preserve the EMID field. The recreated edge will receive a new UUID, losing its history.
No semantic awareness: The system uses document order, not semantic information (e.g., node names like “USM100” vs “USM101”) to identify duplicates.
yEd ID reuse: When adding new nodes in yEd, yEd may reuse GraphML node IDs (like
n5) from deleted nodes, which can cause confusion (though UUIDs remain unique).
Recommended workflows:
Operation |
Safe in yEd? |
Notes |
|---|---|---|
Rename nodes |
✓ Yes |
EMID preserved |
Add/remove nodes |
✓ Yes |
New nodes get new UUIDs |
Add/remove edges |
✓ Yes |
New edges get new UUIDs |
Duplicate nodes (Ctrl+D) |
⚠ Caution |
Auto-detected, new UUID generated |
Recreate edges |
✗ No |
Loses edge history (new UUID) |
Modify node properties |
✓ Yes |
EMID preserved |
For advanced editing scenarios (especially edge recreation), consider developing a custom Extended Matrix editor that properly manages EMID fields.
XLSX Import with JSON Mapping
s3dgraphy can import data from Excel files using JSON mapping configurations. This is useful for importing structured archaeological data from spreadsheets.
Mapped XLSX Import
from s3dgraphy.importer import MappedXLSXImporter
from s3dgraphy import Graph
# Create graph
graph = Graph("xlsx_import")
# Create importer with mapping file
importer = MappedXLSXImporter(
filepath="stratigraphic_units.xlsx",
mapping_name="emdb_basic", # Name of mapping in registry
graph=graph
)
# Parse and import
graph = importer.parse()
print(f"Imported {len(graph.nodes)} nodes from XLSX")
# Display any warnings
importer.display_warnings()
Mapping System
The mapping system uses JSON configuration files to define how Excel columns map to graph node attributes.
Mapping file structure:
{
"mapping_name": "emdb_basic",
"description": "Basic EMdb format for stratigraphic units",
"version": "1.0",
"format_type": "xlsx",
"table_settings": {
"sheet_name": "US",
"header_row": 0
},
"column_mappings": {
"US": {
"is_id": true,
"required": true,
"node_attribute": "node_id"
},
"Definition": {
"node_attribute": "description"
},
"Chronology": {
"node_attribute": "dating"
},
"Material": {
"node_attribute": "material"
}
},
"node_settings": {
"default_node_type": "US"
}
}
How column matching works:
Column names are normalized (uppercase, underscores replace spaces/dashes)
JSON mapping columns are matched to Excel columns after normalization
Unmatched columns generate warnings but don’t stop import
At minimum, the ID column must be found
# Example: Excel has columns "US Number", "Description", "Date"
# Mapping has "US", "Definition", "Chronology"
# After normalization: "US_NUMBER", "DESCRIPTION", "DATE"
# Matches: "US" -> "US_NUMBER" ✓, "Definition" -> "DESCRIPTION" ✓
# No match: "Chronology" (generates warning)
Custom Mapping Creation
You can create custom mapping files for your specific data formats:
# Custom mapping for site-specific format
custom_mapping = {
"mapping_name": "mysite_format",
"description": "Custom format for My Excavation Site",
"version": "1.0",
"format_type": "xlsx",
"table_settings": {
"sheet_name": "Stratigraphic Units",
"header_row": 0
},
"column_mappings": {
"Unit_ID": {
"is_id": true,
"required": true,
"node_attribute": "node_id"
},
"Unit_Type": {
"node_attribute": "node_type",
"required": true
},
"Description_English": {
"node_attribute": "description"
},
"Excavation_Area": {
"node_attribute": "area"
},
"Excavator_Name": {
"node_attribute": "excavator"
}
},
"node_settings": {
"default_node_type": "US",
"create_properties": true
}
}
# Save to JSON file
import json
with open('mappings/mysite_format.json', 'w') as f:
json.dump(custom_mapping, f, indent=2)
# Register and use
from s3dgraphy.mappings import mapping_registry
mapping_registry.register_mapping('mysite_format', custom_mapping)
SQLite/pyArchInit Import
s3dgraphy can import data from SQLite databases, with specialized support for pyArchInit database format.
PyArchInit Database Import
from s3dgraphy.importer import PyArchInitImporter
from s3dgraphy import Graph
# Create graph
graph = Graph("pyarchinit_import")
# Create importer with mapping
importer = PyArchInitImporter(
filepath="excavation.db",
mapping_name="pyarchinit_us_table", # Predefined mapping
graph=graph
)
# Parse database
graph = importer.parse()
print(f"Imported {len(graph.nodes)} nodes from database")
importer.display_warnings()
PyArchInit mapping example:
{
"mapping_name": "pyarchinit_us_table",
"description": "PyArchInit US table mapping",
"version": "1.0",
"format_type": "sqlite",
"table_settings": {
"table_name": "us_table",
"id_column": "sito||'_'||area||'_'||us"
},
"column_mappings": {
"sito": {
"node_attribute": "site"
},
"area": {
"node_attribute": "area"
},
"us": {
"is_id": true,
"node_attribute": "node_id"
},
"d_stratigrafica": {
"node_attribute": "description"
},
"interpretazione": {
"node_attribute": "interpretation"
}
},
"node_settings": {
"default_node_type": "US",
"id_format": "{site}_{area}_{us}"
}
}
Import Factory Function
The create_importer factory function provides a unified interface for all import formats:
from s3dgraphy.importer import create_importer
from s3dgraphy import Graph
# GraphML import
importer = create_importer(
filepath='data.graphml',
format_type='graphml'
)
# XLSX import with mapping
importer = create_importer(
filepath='data.xlsx',
format_type='xlsx',
mapping_name='emdb_basic'
)
# SQLite import
importer = create_importer(
filepath='excavation.db',
format_type='sqlite',
mapping_name='pyarchinit_us_table'
)
# Parse with any importer
graph = importer.parse()
Export System
JSON Export
s3dgraphy exports graphs to JSON format, which is used for web visualization platforms like Heriverse and ATON.
Basic JSON Export
from s3dgraphy.exporter import JSONExporter
from s3dgraphy import get_graph
# Get graph to export
graph = get_graph("pompeii_excavation")
# Create exporter
exporter = JSONExporter("output/project.json")
# Export single graph
exporter.export_graphs([graph.graph_id])
print(f"Exported graph to project.json")
Export All Graphs
from s3dgraphy.exporter import JSONExporter
from s3dgraphy import get_all_graph_ids
# Export all loaded graphs
exporter = JSONExporter("output/all_graphs.json")
exporter.export_graphs() # No arguments = export all
# Or specify multiple graphs
graph_ids = get_all_graph_ids()
exporter.export_graphs(graph_ids)
JSON Export Structure
The exported JSON has this structure:
{
"version": "1.5",
"graphs": {
"pompeii_house_vii": {
"name": "House VII Excavation",
"description": "2024 excavation campaign",
"defaults": {
"license": "CC-BY-NC-ND",
"authors": ["AUTH.001", "AUTH.002"],
"embargo_until": null,
"panorama": "panorama/defsky.jpg"
},
"nodes": {
"US": [
{
"type": "US",
"name": "US001",
"description": "Mosaic floor",
"data": {
"material": "tesserae",
"dating": "1st century CE"
}
}
],
"DOC": [
{
"type": "DOC",
"name": "DOC001",
"description": "Floor photograph",
"data": {}
}
]
},
"edges": {
"is_before": [
{
"id": "edge_001",
"from": "US002",
"to": "US001"
}
],
"has_documentation": [
{
"id": "edge_002",
"from": "US001",
"to": "DOC001"
}
]
}
}
}
}
Convenience Export Function
from s3dgraphy.exporter import export_to_json
# Simple one-line export
export_to_json("output/graphs.json") # Exports all graphs
# Export specific graphs
export_to_json("output/subset.json", ["graph_1", "graph_2"])
Usage in EM-tools Heriverse Exporter
This is a real-world example of how s3dgraphy’s JSONExporter is used in EM-tools for Blender to export projects to Heriverse format:
# From EM-tools exporter_heriverse.py
from s3dgraphy.exporter.json_exporter import JSONExporter
from s3dgraphy import get_graph, get_all_graph_ids
def export_heriverse_project(context):
"""Export complete Heriverse project with 3D models and graph data"""
# Step 1: Update graph with current Blender scene data
# This syncs any changes made in Blender back to the graph
update_graph_with_scene_data(update_all_graphs=True, context=context)
# Step 2: Export JSON using JSONExporter
json_path = os.path.join(project_path, "project.json")
print(f"Exporting JSON to: {json_path}")
# Create exporter
exporter = JSONExporter(json_path)
# Export all graphs (or only publishable ones)
exporter.export_graphs()
print("JSON export completed successfully")
# The exported JSON is then used by Heriverse web platform
# to display the 3D models with their graph relationships
Complete Heriverse export workflow:
User works in Blender with 3D models and EM graph
Models are linked to graph nodes (US, USV, DOC, etc.)
Export operator exports:
3D models (glTF format) to
/modelsfolderProxy models to
/proxiesfolderDocumentation files to
/doscofolderGraph data via JSONExporter to
project.json
Heriverse platform reads
project.jsonto:Display graph structure
Link 3D models to nodes
Show temporal relationships (epochs)
Display paradata chains
Manage documentation
Import/Export Best Practices
Data Validation
Always validate imported data:
def validate_import(graph):
"""Validate imported graph data"""
issues = []
# Check for orphaned nodes
node_ids = {n.node_id for n in graph.nodes}
for edge in graph.edges:
if edge.edge_source not in node_ids:
issues.append(f"Edge {edge.edge_id} references missing source {edge.edge_source}")
if edge.edge_target not in node_ids:
issues.append(f"Edge {edge.edge_id} references missing target {edge.edge_target}")
# Check required attributes
for node in graph.nodes:
if not hasattr(node, 'name') or not node.name:
issues.append(f"Node {node.node_id} missing name")
# Report issues
if issues:
print("Validation issues found:")
for issue in issues:
print(f" - {issue}")
else:
print("✓ Graph validation passed")
return len(issues) == 0
Error Handling
Wrap import/export operations in proper error handling:
def safe_import(filepath, format_type, **kwargs):
"""Import with comprehensive error handling"""
try:
importer = create_importer(
filepath=filepath,
format_type=format_type,
**kwargs
)
graph = importer.parse()
# Check warnings
if graph.warnings:
print(f"Import completed with {len(graph.warnings)} warnings")
for warning in graph.warnings:
print(f" ⚠ {warning}")
# Validate
if validate_import(graph):
print("✓ Import successful and validated")
return graph
else:
print("✗ Import completed but validation failed")
return graph # Return anyway, let user decide
except FileNotFoundError:
print(f"✗ Error: File not found: {filepath}")
return None
except Exception as e:
print(f"✗ Import failed: {str(e)}")
import traceback
traceback.print_exc()
return None
Performance Considerations
For large datasets:
# Use graph indices for efficient queries
graph = importer.parse()
# Indices are automatically built
# Access via graph.indices for O(1) lookups
# Get all US nodes efficiently
us_nodes = graph.get_nodes_by_type("US")
# Find edges by source efficiently
edges_from_us001 = [
e for e in graph.edges
if e.edge_source == "US001"
]
GraphML Export
s3dgraphy can export graphs back to GraphML format, enabling full round-trip editing workflows with yEd and other GraphML-compatible tools.
Basic GraphML Export
from s3dgraphy.exporter.graphml import GraphMLExporter
# Create exporter with the graph to export
exporter = GraphMLExporter(graph)
# Export to file
exporter.export("output/site_data.graphml")
GraphML Export Features
The GraphML exporter preserves:
Node types and visual properties (shapes, colours, borders)
Edge types with correct line styles (solid, dotted, dashed, double, dashed-dotted)
Container group nodes with correct background colours (US:
#9B3333, USD:#D86400, VSF:#B19F61)Epoch swimlanes and activity groups
Paradata node groups
Canvas layout with node positions
Round-Trip Workflow
The combination of GraphML import (with UUID slipback) and GraphML export enables a complete round-trip:
Author a graph in yEd
Import into s3dgraphy (UUIDs written back to the file)
Process, validate, or enrich the graph programmatically
Export back to GraphML for further editing in yEd
from s3dgraphy import GraphMLImporter
from s3dgraphy.exporter.graphml import GraphMLExporter
# Import
importer = GraphMLImporter("site_data.graphml")
graph = importer.parse()
# ... process graph ...
# Export back
exporter = GraphMLExporter(graph)
exporter.export("site_data_processed.graphml")
Future Export Formats
Under consideration for future releases:
GeoJSON export for GIS integration
RDF/TTL export for semantic web (CIDOC-CRM compliance)
Neo4j export for graph database integration
See Also
JSON Configuration Files - JSON configuration files documentation
Mapping System Guide - Detailed mapping system guide
Integration with EM-tools for Blender - Integration with EM-tools for Blender
s3dgraphy Classes Reference - Complete API reference