SPARK Challenge 2026 — Technical Documentation
Technical details of the event-based spacecraft pose estimation pipeline developed for the SPARK Challenge 2026 (AI4Space @ CVPR 2026, Stream 2).
1. Event Camera Data & the SPADES Dataset
Unlike conventional frame-based cameras, event cameras (neuromorphic sensors) output asynchronous per-pixel brightness changes. Each event is a tuple (x, y, t, p) encoding the pixel coordinates, the timestamp, and the polarity of the brightness change (positive or negative).
The SPADES dataset (SPAcecraft Pose Estimation Dataset using Event Sensing) provides event streams of a spacecraft model under varied lighting and motion profiles, focusing on the Proba-2 satellite. It is split into two categories:
- Synthetic Dataset: Generated using Unreal Engine (UE) with dynamic backgrounds (animated Sun and Earth), and the ICNS event simulator using Blender to model the neuromorphic sensor. It contains 300 trajectories with 179,400 ground-truth pose labels.
- Real Dataset: Collected at the Zero-G Laboratory at the SnT, University of Luxembourg, using a scaled mockup of the Proba-2 satellite and a Prophesee Metavision EVK4-HD event vision sensor. It contains 32 trajectories with 16,900 pose labels.
2. Event Frame Encoding
To improve training efficiency and robustness, raw event streams were converted into 2-channel event frames (accumulated polarity surfaces):
```python
import numpy as np

# Event frame parameters
# Output: tensor of shape [C=2, H, W]
#   channel 0: sum of negative (off) events
#   channel 1: sum of positive (on) events
def events_to_event_frame(events, H, W):
    """Convert an event stream to a 2-channel event frame.

    `events` holds per-event arrays under the fields 'x', 'y', 'p'.
    """
    frame = np.zeros((2, H, W), dtype=np.float32)
    # Accumulate polarities
    for i in range(len(events['x'])):
        x, y = events['x'][i], events['y'][i]
        p = events['p'][i]  # 0 (off) or 1 (on)
        frame[p, y, x] += 1.0
    # Log transformation for dynamic range compression
    frame = np.log1p(frame)
    return frame
```
This encoding significantly reduces the input dimensionality compared to voxel grids, allowing for deeper backbones while maintaining the critical spatial features of the spacecraft model.
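The per-event Python loop above is easy to read but slow for dense streams. A vectorized equivalent (a sketch, assuming `events` exposes NumPy arrays under the keys `'x'`, `'y'`, `'p'`) can use `np.add.at`, which accumulates correctly even when the same pixel index appears multiple times:

```python
import numpy as np

def events_to_event_frame_vectorized(events, H, W):
    """Vectorized 2-channel event frame accumulation."""
    frame = np.zeros((2, H, W), dtype=np.float32)
    x = np.asarray(events['x'], dtype=np.int64)
    y = np.asarray(events['y'], dtype=np.int64)
    p = np.asarray(events['p'], dtype=np.int64)  # 0 (off) or 1 (on)
    # Unbuffered in-place add: repeated (p, y, x) indices each contribute
    np.add.at(frame, (p, y, x), 1.0)
    return np.log1p(frame)
```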
3. Model Architecture
The final solution utilized a modular PoseNet architecture supporting high-performance backbones like EfficientNet-B3/B4:
```python
import torch.nn as nn

class PoseNet(nn.Module):
    def __init__(self, backbone="efficientnet-b3", in_channels=2):
        super().__init__()
        # Pretrained EfficientNet with a modified input layer (2-channel event frames)
        self.backbone = PretrainedEfficientNet(
            backbone_type=backbone,
            in_channels=in_channels,
            feature_dim=512,
            pretrained=True,
        )
        # Modular pose head for translation and rotation
        self.pose_head = PoseHead(
            feature_dim=512,
            hidden_dim=256,
            dropout=0.1,
        )

    def forward(self, x):
        features = self.backbone(x)
        translation, quaternion = self.pose_head(features)
        return translation, quaternion
```
Integration with Feature Pyramid Networks (FPN) further enhanced the model's ability to detect the spacecraft across varying distances and scales.
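The `PoseHead` module referenced above is not shown in this document; a minimal sketch of what such a head could look like, assuming two small MLP branches that regress a 3D translation and a normalized unit quaternion (the exact layer layout is an assumption, not the competition code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoseHead(nn.Module):
    """Hypothetical pose head: separate MLP branches for translation and rotation."""

    def __init__(self, feature_dim=512, hidden_dim=256, dropout=0.1):
        super().__init__()

        def branch(out_dim):
            return nn.Sequential(
                nn.Linear(feature_dim, hidden_dim),
                nn.ReLU(inplace=True),
                nn.Dropout(dropout),
                nn.Linear(hidden_dim, out_dim),
            )

        self.translation_branch = branch(3)  # (tx, ty, tz)
        self.rotation_branch = branch(4)     # raw quaternion

    def forward(self, features):
        translation = self.translation_branch(features)
        # Normalize so the rotation output is always a valid unit quaternion
        quaternion = F.normalize(self.rotation_branch(features), dim=-1)
        return translation, quaternion
```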
4. Loss Functions
We transitioned to a more robust loss formulation that decouples the optimization of translation and rotation:
```python
import torch
import torch.nn.functional as F

# Translation: SmoothL1 is more robust to outliers in spacecraft trajectories
translation_loss = F.smooth_l1_loss(t_pred, t_gt)

# Rotation: geodesic loss on the rotation manifold SO(3)
def geodesic_loss(q_pred, q_gt):
    # |dot| handles the quaternion double cover (q and -q encode the same rotation)
    dot = torch.abs(torch.sum(q_pred * q_gt, dim=-1))
    dot = torch.clamp(dot, -1.0 + 1e-7, 1.0 - 1e-7)
    return torch.mean(torch.acos(dot))

total_loss = translation_loss + lambda_rot * geodesic_loss(q_pred, q_gt)
```
For some versions, we also explored a 6D continuous rotation representation, which maps raw network outputs to orthogonal rotation matrices, providing a smoother gradient flow during training.
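The standard construction for such a 6D representation (Zhou et al., CVPR 2019) applies Gram-Schmidt orthogonalization to two raw 3-vectors and completes the frame with a cross product; a sketch of that mapping, with function name chosen here for illustration:

```python
import torch
import torch.nn.functional as F

def rotation_6d_to_matrix(x6):
    """Map raw 6D network outputs [..., 6] to rotation matrices [..., 3, 3].

    Gram-Schmidt on the two 3-vector halves, then a cross product for the
    third row, yielding an orthogonal matrix with determinant +1.
    """
    a1, a2 = x6[..., :3], x6[..., 3:]
    b1 = F.normalize(a1, dim=-1)
    # Remove the b1 component from a2, then normalize
    b2 = F.normalize(a2 - (b1 * a2).sum(-1, keepdim=True) * b1, dim=-1)
    b3 = torch.cross(b1, b2, dim=-1)
    return torch.stack((b1, b2, b3), dim=-2)
```

Unlike quaternions, this representation has no discontinuities or double cover, which is the "smoother gradient flow" property noted above.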
5. Training Configuration
| Parameter | Value |
|---|---|
| GPU | NVIDIA RTX 4090 |
| Backbone | EfficientNet-B3 / B4 + FPN |
| Input Channels | 2 (Polarity Frames) |
| Augmentations | Density Augmentation, Scale/Shift |
| Optimizer | AdamW (LR=1e-4) |
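The optimizer row of the table corresponds directly to a standard PyTorch setup; a minimal sketch, using a placeholder module only so the snippet is self-contained (the real model is the PoseNet of Section 3):

```python
import torch

model = torch.nn.Linear(2, 7)  # placeholder standing in for PoseNet
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
```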
6. Post-Processing & Refinement
Hybrid Pose Construction
By ensembling multiple models, we found that certain training versions (v7) excelled at translation, while others (v15) were more stable in rotation. Our final submission combined the outputs from these specialized experts.
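The hybrid construction itself is simple: per frame, take the translation from the translation expert and the quaternion from the rotation expert. A sketch, assuming each model's predictions are a list of per-frame dicts with keys `'t'` and `'q'` (the data layout is an assumption for illustration):

```python
import numpy as np

def build_hybrid_poses(translation_expert_preds, rotation_expert_preds):
    """Combine two specialists frame by frame: translation from one model
    (e.g. v7), rotation from the other (e.g. v15)."""
    hybrid = []
    for p_t, p_r in zip(translation_expert_preds, rotation_expert_preds):
        hybrid.append({
            't': np.asarray(p_t['t'], dtype=np.float64),
            'q': np.asarray(p_r['q'], dtype=np.float64),
        })
    return hybrid
```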
SLERP Smoothing
To eliminate temporal jitter in rotation, we applied Spherical Linear Interpolation (SLERP) across the sequence of predicted quaternions. This ensured that the estimated spacecraft trajectory was physically consistent and smooth.
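A minimal NumPy sketch of this smoothing pass: a standard SLERP between consecutive unit quaternions, applied causally along the sequence (the blending factor `alpha` and the causal scheme are illustrative choices, not necessarily the submitted configuration):

```python
import numpy as np

def slerp(q0, q1, alpha):
    """Spherical linear interpolation between unit quaternions q0 and q1."""
    q0, q1 = np.asarray(q0, float), np.asarray(q1, float)
    dot = np.dot(q0, q1)
    if dot < 0.0:          # take the shorter arc (quaternion double cover)
        q1, dot = -q1, -dot
    if dot > 0.9995:       # nearly parallel: linear interpolation is stable
        out = q0 + alpha * (q1 - q0)
        return out / np.linalg.norm(out)
    theta = np.arccos(np.clip(dot, -1.0, 1.0))
    s = np.sin(theta)
    return (np.sin((1 - alpha) * theta) * q0 + np.sin(alpha * theta) * q1) / s

def smooth_quaternions(quats, alpha=0.5):
    """Causal smoothing: blend each prediction toward the previous smoothed pose."""
    smoothed = [np.asarray(quats[0], float)]
    for q in quats[1:]:
        smoothed.append(slerp(smoothed[-1], q, alpha))
    return smoothed
```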
7. Results
- Global Rank: 9th Place (Stream 2)
- Total Error: 0.125 (Aggregated Metric)
- Competition: SPARK 2026 @ CVPR 2026 AI4Space workshop