Q&A: Answers 2 🛠️
Answers to the Q&A sections of the Hands-on Geometric Deep Learning articles
Table of Contents
From Nodes To Complexes: A Guide to Topological Deep Learning
Exploring Simplicial Complexes for Deep Learning: Concepts & Code
Note: The first 32 Q&A responses are described at Q&A Answers 1
Curvature-informed Graph Learning
Q1: What GNN shortcomings are addressed by the application of discrete differential geometry?
A1:
Reducing Over-squashing
Mitigating Over-smoothing
Capturing Higher-Order Structures
Optimizing Geometric Convergence
Q2: What is the alternative to using the joint distribution as input when computing the Earth Mover's distance between distributions X and Y?
A2: Using the marginal distributions of X and Y, noted r (for rows) and c (for columns) in the article.
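As an illustration, the Earth Mover's distance can be computed directly from the two marginals r and c by solving a small linear program with SciPy (the function name is mine, not from the article):

```python
import numpy as np
from scipy.optimize import linprog

def earth_mover_distance(r, c, cost):
    """EMD between marginals r (rows) and c (columns) for a given cost matrix."""
    n, m = len(r), len(c)
    A_eq, b_eq = [], []
    for i in range(n):                      # row sums of the plan must equal r
        row = np.zeros((n, m)); row[i, :] = 1
        A_eq.append(row.ravel()); b_eq.append(r[i])
    for j in range(m):                      # column sums must equal c
        col = np.zeros((n, m)); col[:, j] = 1
        A_eq.append(col.ravel()); b_eq.append(c[j])
    res = linprog(cost.ravel(), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=(0, None), method="highs")
    return res.fun
```

The joint distribution (the transport plan) is the unknown of the optimization; only the marginals r and c are supplied as input.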
Q3: Which two variables or components constitute the formula for Ollivier-Ricci curvature in a mesh setting?
A3: Wasserstein distance W and the shortest path d.
Q4: What geometric feature is indispensable for computing curvature when a graph lies on a non-flat manifold?
A4: Geodesic. This is the shortest path between nodes on a manifold.
Q5: If a graph is equally distributed on a perfectly flat plane (e.g., grid), what is the resulting curvature of its edges?
A5: Zero.
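Putting A3 through A5 together, here is a minimal sketch of Ollivier-Ricci curvature κ(x, y) = 1 − W(μx, μy)/d(x, y) on a graph, using uniform distributions over neighbors and a linear program for the Wasserstein distance (assumes NetworkX and SciPy; the function name is illustrative):

```python
import networkx as nx
import numpy as np
from scipy.optimize import linprog

def ollivier_ricci_curvature(graph, x, y):
    """kappa(x, y) = 1 - W(mu_x, mu_y) / d(x, y), mu_z uniform over neighbors of z."""
    src, tgt = list(graph.neighbors(x)), list(graph.neighbors(y))
    r = np.full(len(src), 1.0 / len(src))   # mass at each neighbor of x
    c = np.full(len(tgt), 1.0 / len(tgt))   # mass at each neighbor of y
    # Transport cost between supports = shortest-path (geodesic) distance
    cost = np.array([[nx.shortest_path_length(graph, u, v) for v in tgt]
                     for u in src], dtype=float)
    A_eq, b_eq = [], []
    for i in range(len(src)):               # row marginals
        row = np.zeros_like(cost); row[i, :] = 1
        A_eq.append(row.ravel()); b_eq.append(r[i])
    for j in range(len(tgt)):               # column marginals
        col = np.zeros_like(cost); col[:, j] = 1
        A_eq.append(col.ravel()); b_eq.append(c[j])
    w = linprog(cost.ravel(), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                bounds=(0, None), method="highs").fun
    return 1.0 - w / nx.shortest_path_length(graph, x, y)
```

On a triangle (K3) an edge gets κ = 0.5, while an interior edge of a flat grid gets κ = 0, consistent with A5.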
Animation Tools for Geometric Deep Learning
Q1: What is the primary performance limitation when utilizing Manim for large-scale model animations?
A1: Memory and speed. Manim requires a significant amount of memory for mobjects and scenes in high-quality video settings. It is recommended to use low-quality video during development and debugging.
Q2: Which of the most common Matplotlib API functions is responsible for managing animations on a frame-by-frame basis?
A2: matplotlib.animation.FuncAnimation
animation = FuncAnimation(fig,
                          update_func,
                          frames=None,
                          init_func=None,
                          fargs=None,
                          save_count=None,
                          *,
                          cache_frame_data=True,
                          **kwargs)
Note: The API class matplotlib.animation.ArtistAnimation is more flexible but less commonly used.
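A minimal working sketch of frame-by-frame animation with FuncAnimation (the figure contents and update function are illustrative):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

fig, ax = plt.subplots()
x = np.linspace(0, 2 * np.pi, 200)
line, = ax.plot(x, np.sin(x))

def update(frame):
    # Called once per frame: shift the sine wave and return the modified artists
    line.set_ydata(np.sin(x + 0.1 * frame))
    return (line,)

anim = FuncAnimation(fig, update, frames=60, interval=50, blit=True)
# anim.save("sine.gif", writer="pillow")  # optional; requires Pillow
```

With blit=True, only the artists returned by the update function are redrawn, which keeps per-frame cost low.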
Q3: What is the standard method used to define and construct a Scene within the Manim framework?
A3: construct
Q4: Which animation library provides native, comprehensive support for dynamic LaTeX rendering?
A4: Manim is the only library among those reviewed that allows dynamic updates of LaTeX formulas.
Q5: What is the correct CLI command to render a specific scene - NewScene - in Manim using medium quality video?
A5: manim -pqm main.py NewScene
Geometry in Abstract World Models
Q1: How does the latent space geometry in Abstract World Models diverge from the Euclidean assumptions found in standard Convolutional or Transformer architectures?
A1: The latent space relies on non-Euclidean geometry (Riemannian manifolds) for representing states, predictions, actions, and so on.
Q2: What are the three components of an agent in the architecture of the Ha & Schmidhuber (2018) world model?
A2: The agent orchestrates the following components:
Vision model - compresses observed data, such as pixels, into latent features
Memory model - a recurrent neural network predicting the next state or value in latent space
Controller - selects actions from the output of the memory model.
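The three components above can be sketched in PyTorch (layer types and sizes are illustrative placeholders, not the original World Models architecture):

```python
import torch
import torch.nn as nn

class WorldModelAgent(nn.Module):
    """Sketch of the Vision-Memory-Controller decomposition."""
    def __init__(self, obs_dim=64, latent_dim=16, hidden_dim=32, action_dim=4):
        super().__init__()
        self.vision = nn.Linear(obs_dim, latent_dim)                     # V: observation -> latent
        self.memory = nn.GRU(latent_dim, hidden_dim, batch_first=True)   # M: latent dynamics
        self.controller = nn.Linear(latent_dim + hidden_dim, action_dim) # C: action head

    def forward(self, obs_seq):
        z = torch.tanh(self.vision(obs_seq))    # (batch, time, latent)
        h_seq, _ = self.memory(z)               # (batch, time, hidden)
        # The controller sees both the current latent feature and the memory state
        return self.controller(torch.cat([z, h_seq], dim=-1))
```

In the original design the three parts are trained separately; the sketch only shows how data flows between them.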
Q3: To what extent do Abstract World Models prioritize the reconstruction of raw sensory data (e.g., images or video) compared to latent-only representations?
A3: Fundamentally, Abstract World Models do not reconstruct the raw sensory or observed data.
Q4: Within the JEPA framework, what is the specific nature of the output generated during a prediction, planning or simulation phase?
A4: Both the input and output of prediction, planning or simulation are latent.
Q5: What are some of the geometric representations commonly employed to structure the latent spaces of Abstract World Models?
A5: Smooth or discrete manifolds, graphs, and topological complexes.
Q6: Can you list some of the fields of mathematics that might be involved with the design and execution of Abstract World Models?
A6: Algebraic topology, differential geometry, graph theory, category theory.
Graphs Deserve Some Attention
Q1: What are the main shortcomings of Graph Convolutional Networks (GCNs) that the attention mechanism in GATs was designed to fix?
A1: GATs use learnable attention weights assigned to different neighboring nodes based on their features, while GCNs assign fixed weights to neighbors. GCNs also follow a transductive learning strategy that requires loading the entire graph.
Q2: Can you name a few PyTorch Geometric (PyG) classes that allow you to use Graph Attention Layers in your code?
A2: Here is a partial list: GATConv, RGATConv, GATv2Conv, TransformerConv, FusedGATConv.
Q3: Is a pooling layer required after a graph attention layer if the goal is to classify individual nodes?
A3: The short answer is no. GATs are designed for node-level representation learning. Attention layers do not need a pooling mechanism for node classification or link prediction, as the attention mechanism already pools information from a node's neighbors.
Q4: Do modules like GATConv or RGATConv handle non-linear activation functions internally, or do you need to add them separately?
A4: A LeakyReLU activation with a negative slope of 0.2 is built into most PyTorch Geometric attention modules. This is not the case, however, for TransformerConv or AGNNConv.
Benchmarking Topological Deep Learning
Q1: Which topological domains (or structures) are most commonly supported in modern TDL and TDA workflows?
A1: The most common topological domains are simplicial complexes, hypergraphs, cell complexes, and combinatorial complexes.
Q2: Is it possible to perform topological lifting on structures other than standard graphs?
A2: Yes. Although lifting a graph into a simplicial complex or a hypergraph is quite common, the lifting process can also map between topological domains, such as lifting a simplicial complex into a hypergraph.
Q3: Can you provide examples of lifting algorithms used to generate higher-order features?
A3: There are many algorithms for lifting graphs and topological complexes, such as Clique Complex Lifting, Curvature-Based Lifting, and Cycle Lifting.
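As an example, Clique Complex Lifting can be sketched with NetworkX (the function name is mine): every (k+1)-clique of the graph becomes a k-simplex of the lifted complex.

```python
import networkx as nx

def clique_complex_lifting(graph, max_dim=2):
    """Lift a graph to its clique (flag) complex, up to dimension max_dim."""
    simplices = {d: set() for d in range(max_dim + 1)}
    # enumerate_all_cliques yields cliques in nondecreasing size
    for clique in nx.enumerate_all_cliques(graph):
        k = len(clique) - 1          # a (k+1)-clique is a k-simplex
        if k > max_dim:
            break
        simplices[k].add(tuple(sorted(clique)))
    return simplices
```

For a triangle graph, the lifting produces three 0-simplices, three 1-simplices, and one 2-simplex (the filled triangle).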
Q4: Which coding frameworks does TopoBench leverage for its modeling and evaluation pipelines?
A4: TopoBench uses PyTorch Lightning for modeling and training. Moreover, the following libraries are used:
NetworkX for graph initialization and visualization
TopoNetX for topological structures and algorithms
TopoModelX for topological neural networks
PyTorch Geometric for graph datasets and loaders
Guided Tour of Joint-Embedding Prediction Architecture
Q1: What are the key components of the Joint-Embedding Prediction Architecture?
A1: The components are a context encoder and a target encoder that convert observations into latent states, and a predictor that processes these latent states.
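These components can be sketched in PyTorch (the encoder and predictor architectures are illustrative placeholders; a real implementation would update the target encoder by an exponential moving average of the context encoder):

```python
import torch
import torch.nn as nn

class JEPA(nn.Module):
    """Minimal JEPA sketch: context encoder, target encoder, predictor."""
    def __init__(self, obs_dim=32, latent_dim=16):
        super().__init__()
        self.context_encoder = nn.Sequential(
            nn.Linear(obs_dim, latent_dim), nn.ReLU(), nn.Linear(latent_dim, latent_dim))
        # Target encoder: same shape, no gradients (would be EMA-updated in practice)
        self.target_encoder = nn.Sequential(
            nn.Linear(obs_dim, latent_dim), nn.ReLU(), nn.Linear(latent_dim, latent_dim))
        for p in self.target_encoder.parameters():
            p.requires_grad_(False)
        self.predictor = nn.Linear(latent_dim, latent_dim)

    def forward(self, context_obs, target_obs):
        z_ctx = self.context_encoder(context_obs)
        with torch.no_grad():
            z_tgt = self.target_encoder(target_obs)
        z_pred = self.predictor(z_ctx)
        # Loss lives entirely in latent space: no decoder, no pixel reconstruction
        return nn.functional.mse_loss(z_pred, z_tgt)
```

The key point the sketch illustrates is that both the prediction and its target are latent vectors, so no decoder is needed.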
Q2: What is the main purpose of regularization in JEPA (VICReg or SIGReg)?
A2: The purpose of regularizing the loss function in the encoder is to avoid collapse of the representation, in which the latent states become indistinguishable.
Q3: Can you describe Latent Planning?
A3: By mapping both observations and actions into a latent space, the JEPA model predicts the world's next state in latent space. It performs planning by generating a sequence of future latent steps that lead to a specific target.
Q4: Can you list some of the limitations in JEPA?
A4:
The mitigation of representation collapse is not very reliable
The non-generative approach to prediction is not universally accepted
The procedure of masking latent data as input to the predictor is error prone
Keeping track of new iterative improvements in JEPA can be challenging
Q5: How does invariance-based reconstruction differ from generative models?
A5: Invariance-based training does not use observed (labeled) data in evaluating the predictive loss and therefore does not need a decoder.