Jason’s research continues in multiple directions — though the threads often connect.
The first direction uses association as a learning signal: discovering what goes together, continuously, from lived experience. The idea has been tested on literature and drug discovery (papers below) and is currently being explored in embodied AI.
The second is the optical encoding of weight matrices in simulated holographic volumes to reveal interesting properties of distributed storage (papers below).
The third is reservoir computing in networks of oscillators spanning many frequency decades, where a slow rhythm selects which fast dynamics are doing the computing (currently underway).
Each paper has links to arXiv, Zenodo and the associated Git repo.
The foundation: the first encoding scheme, and the first hint of the effect. A simulation study of storing many weight matrices in one shared wave-optical volume.
The companion to the paper above: the mechanism behind the effect, the geometry that scales it, and the 64-matrix demonstration. A simulation study.
The same contrastive learning method that improved text retrieval and discovered narrative structure in 9,766 novels also works in molecular biology. Trained on protein interaction data from the STRING database, the model recovers functional gene relationships that are invisible to standard expression similarity — achieving strong discrimination (AUC 0.908) in precisely the regime where similarity-based methods fall to chance (0.518). A second gene dataset confirms the result. Two drug experiments identify where the method fails and why. Three findings emerge that the text experiments alone could not have revealed: biological associations transfer to completely unseen genes where text associations do not, improvement concentrates on understudied genes with the greatest need for functional characterisation, and tighter association quality outperforms larger but noisier training data — reversing the pattern seen in text.
PAM trained at scale on 9,766 Project Gutenberg novels (25 million text chunks, 373 million temporal relationships) discovers hierarchical narrative structure without supervision. The model learns what passages do rather than what they’re about, grouping a Victorian chase scene with a Russian chase scene because they perform the same structural beat, despite sharing no vocabulary. Clusters range from broad narrative modes at coarse resolution to specific registers like courtroom cross-examination and sailor dialect at fine resolution. Held-out novels receive coherent assignments through single-pass inference, demonstrating inductive transfer. An interactive demo is live.
The foundational paper. A JEPA-style predictor trained on temporal co-occurrence learns to retrieve items linked through shared experience rather than representational proximity. On a synthetic benchmark, the predictor’s top retrieval is a true temporal associate 97% of the time, recovering associations across representational boundaries where cosine similarity scores zero. A temporal shuffle control confirms the signal is genuine co-occurrence structure, not embedding geometry. A held-out evaluation confirms the predictor remembers what it experienced, from the perspectives at which it experienced it.
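To make the difference between association and similarity concrete, here is a minimal sketch. It is not the paper's model: a linear map trained with the delta rule stands in for the JEPA-style predictor, and the eight toy items, their embeddings, the `partner` pairings and the learning rate are all assumptions chosen for illustration. Items in the two hypothetical groups occupy disjoint dimensions, so the cosine between an item and its associate is exactly zero. The shuffled pairing plays the role of the temporal shuffle control.

```python
import math

HALF = 4  # dimensions per hypothetical group; total embedding size is 2 * HALF

def unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def embed(index, group):
    """Toy unit embedding. Groups 0 and 1 use disjoint dimensions, so every
    cross-group cosine is exactly zero; same-group items share a common offset."""
    block = [0.15] * HALF
    block[index] += 1.0
    zeros = [0.0] * HALF
    return unit(block + zeros) if group == 0 else unit(zeros + block)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def train(pairs, epochs=200, lr=0.5):
    """Delta-rule fit of a linear map w so that w applied to source approximates target."""
    dim = len(pairs[0][0])
    w = [[0.0] * dim for _ in range(dim)]
    for _ in range(epochs):
        for src, tgt in pairs:
            for i in range(dim):
                err = tgt[i] - dot(w[i], src)
                for j in range(dim):
                    w[i][j] += lr * err * src[j]
    return w

items = {("a", i): embed(i, 0) for i in range(HALF)}
items.update({("b", i): embed(i, 1) for i in range(HALF)})

partner = [2, 0, 3, 1]  # assumed experience: a_i co-occurred with b_partner[i]
shuffled = [partner[(i + 1) % HALF] for i in range(HALF)]  # control: scrambled pairings

def top1(query_key, query_vec):
    """Best-matching item other than the query. Candidates are unit vectors,
    so ranking by dot product equals ranking by cosine."""
    others = [k for k in items if k != query_key]
    return max(others, key=lambda k: dot(query_vec, items[k]))

def hit_rate(retrieve):
    """Fraction of a-items whose top retrieval is the true associate."""
    hits = sum(retrieve(("a", i)) == ("b", partner[i]) for i in range(HALF))
    return hits / HALF

def make_predictor(mapping):
    w = train([(items[("a", i)], items[("b", mapping[i])]) for i in range(HALF)])
    return lambda key: top1(key, [dot(row, items[key]) for row in w])

cosine_rate = hit_rate(lambda key: top1(key, items[key]))   # similarity baseline
assoc_rate = hit_rate(make_predictor(partner))              # trained on true pairings
control_rate = hit_rate(make_predictor(shuffled))           # trained on scrambled pairings
print(cosine_rate, assoc_rate, control_rate)
```

In this toy setting the similarity baseline can only return same-group neighbours, while the trained map reaches across the boundary. A predictor trained on scrambled pairings learns the wrong partners, which is the sense in which a shuffle control separates co-occurrence structure from embedding geometry.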
The applied paper. A lightweight MLP (4.2M parameters) trained on passage co-occurrence reranks dense retrieval results for multi-hop question answering. On HotpotQA, Recall@5 improves by 8.6 points without evaluation-set tuning, with gains concentrated on the hardest questions (+28.5 points where the dense baseline fails). On MuSiQue, +10.1 points. Training on similar-but-not-associated pairs actively degrades retrieval; association and similarity aren’t just different, they’re sometimes opposite. An inductive variant shows no significant improvement, confirming the method captures corpus-specific co-occurrences. The method adds 3.7ms per query and trains in under two minutes.
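As a rough illustration of what reranking changes for a multi-hop question, the sketch below re-sorts a dense candidate list using an extra association score and measures Recall@5 before and after. The additive combination, the `weight` value, the passage IDs and every score are hypothetical; the paper's actual scoring rule is not described here.

```python
def rerank(candidates, dense_score, assoc_score, weight=0.5):
    """Re-sort candidates by dense score plus a weighted association score.
    The additive form and the weight are assumptions for this sketch."""
    return sorted(
        candidates,
        key=lambda pid: dense_score[pid] + weight * assoc_score.get(pid, 0.0),
        reverse=True,
    )

def recall_at_k(ranked, gold, k):
    """Fraction of gold passages that appear in the top k results."""
    return len(set(ranked[:k]) & set(gold)) / len(gold)

# Hypothetical two-hop question: p1 resembles the question, while p7 is the
# bridge passage that co-occurs with p1 but shares little surface content.
dense_score = {"p1": 0.90, "p2": 0.80, "p3": 0.75, "p4": 0.70,
               "p5": 0.65, "p6": 0.60, "p7": 0.55}
assoc_score = {"p7": 0.9, "p3": 0.1}  # made-up association scores
gold = ["p1", "p7"]

dense_ranked = sorted(dense_score, key=dense_score.get, reverse=True)
reranked = rerank(dense_ranked, dense_score, assoc_score)

recall_before = recall_at_k(dense_ranked, gold, 5)
recall_after = recall_at_k(reranked, gold, 5)
print(recall_before, recall_after)
```

The bridge passage sits outside the dense top five because it does not resemble the question; an association score is one way such a passage could be pulled into range.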
Pack several patterns into one piece of holographic material and they fight for space — each new pattern muddies the ones already there. That’s the usual story for shared storage. We’ve been chasing a setup that breaks it.
In simulation, we send light through a shared three-dimensional volume and write many weight matrices into it at once, each tagged by its own angular signature. As we add matrices, the ones already stored get sharper. The volume carries far more structure than any single matrix needs, so every matrix we add pins that spare structure down a little further, tightening everything at once. We call it cooperative encoding.
The trick that unlocks it is geometric. Casting the optics in cylindrical coordinates gives exact, clean separation between channels — and because this lives in simulation, we can keep adding rotational axes that no physical optical system could have. Four axes give an enormous number of addressable slots. In one run, 64 matrices share a single volume, every one read back cleanly, none of them degraded.
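One way to see why angular addressing can separate channels exactly is the orthogonality of discrete angular harmonics. The sketch below is a stand-in, not the papers' encoding: it involves no wave propagation and does not reproduce the cooperative sharpening effect. Each matrix is tagged with a product of harmonics exp(i·m·θ) across four rotational axes, everything is summed into one shared array, and each matrix is read back by projecting onto its own tag. The values of `AXES` and `MODES`, the four-entry matrices and the addressing scheme are all assumptions for illustration.

```python
import cmath
import itertools
import math
import random

AXES = 4         # rotational axes (assumed)
MODES = 3        # angular harmonics per axis, giving MODES ** AXES addressable slots
SAMPLES = MODES  # angle samples per axis; harmonics 0..MODES-1 are orthogonal on this grid

GRID = list(itertools.product(range(SAMPLES), repeat=AXES))

def carrier(address, point):
    """Product over axes of exp(i * m * theta), with theta = 2*pi*j / SAMPLES."""
    phase = sum(m * 2.0 * math.pi * j / SAMPLES for m, j in zip(address, point))
    return cmath.exp(1j * phase)

def write(store):
    """Superpose every matrix, tagged by its address, into one shared volume."""
    size = len(next(iter(store.values())))
    volume = {point: [0j] * size for point in GRID}
    for address, weights in store.items():
        for point in GRID:
            c = carrier(address, point)
            cell = volume[point]
            for i, w in enumerate(weights):
                cell[i] += w * c
    return volume

def read(volume, address, size):
    """Project the shared volume onto one address to recover its matrix."""
    out = [0j] * size
    for point, cell in volume.items():
        c = carrier(address, point).conjugate()
        for i in range(size):
            out[i] += cell[i] * c
    return [(z / len(volume)).real for z in out]

random.seed(0)
addresses = list(itertools.product(range(MODES), repeat=AXES))[:64]
# Each "matrix" is a flattened 2x2 block of random weights.
store = {a: [random.uniform(-1.0, 1.0) for _ in range(4)] for a in addresses}
volume = write(store)

worst = 0.0
for address, weights in store.items():
    recovered = read(volume, address, len(weights))
    worst = max(worst, max(abs(r - w) for r, w in zip(recovered, weights)))

print(f"slots: {MODES ** AXES}, stored: {len(store)}, worst readback error: {worst:.2e}")
```

Because the slot count grows as the number of modes raised to the number of axes, each added axis multiplies the address space, which is the sense in which a simulation can go beyond what a physical system's rotational freedom allows.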
A caveat we hold firmly: this is all simulation. We’re mapping the mathematical structure of wave-optical storage before we commit it to hardware. The structure is what excites us — a shared medium where room is abundant, channels stay clean, and packing things in tighter actually helps. Whether that becomes associative memory, a dense store for model weights, or something we haven’t pinned down, is what the next stretch of work is for.
Two papers tell the full story:
→ Interference-Resistant Weight Matrix Updates in a Shared Holographic Volume — the foundation: the first encoding scheme, and the first hint of the effect.
→ Cooperative Encoding of Weight Matrices in a Simulated Multi-Rotational Wave-Optical Volume — the mechanism behind it, the geometry that scales it, and the 64-matrix demonstration.
Reservoir computing has a long-standing rule of thumb: these systems tend to compute best when poised at the “edge of chaos,” balanced between order and disorder. A natural idea is to turn that edge into a control. Take a network of oscillators whose natural frequencies span several decades, add a slow periodic drive, and use the drive to move the network across the edge — selecting how it computes as the rhythm advances. We set out to test whether that works. It doesn’t, and the reason it doesn’t is the substance of the result.
In simulation we drive a multi-decade oscillator network with a slow periodic signal and check whether the drive reshapes the network’s effective dynamics — whether performance tracks the bifurcations the drive is meant to engineer. Across four independent lines of attack the answer is consistent: it does not. The regime where the network computes best turns out to have no edge to sit on. It is deeply damped, held active only by the input passing through it, and no setting of the drive’s strength or frequency moves its stability appreciably.
The explanation comes down to bandwidth. A single rhythm can phase-match a set of oscillators packed into one narrow band, but a network spread across decades has no single collective mode for one rhythm to act on. The wider the frequency spread, the less any drive can engage it. Stated as a principle: parametric control of this kind requires spectral coherence, and a broadband reservoir does not have it.
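The bandwidth argument can be illustrated with a much simpler system than the one studied here. The sketch below treats each oscillator on its own, with no coupling and no input, as a damped oscillator whose stiffness is modulated by the drive, and computes its Floquet multipliers by integrating over one drive period. A multiplier outside the unit circle means the drive has engaged that oscillator. The equation of motion, the modulation depth `eps`, the damping values and the frequency grids are assumptions chosen for illustration, not parameters from the work described above.

```python
import cmath
import math

def floquet_radius(omega, gamma, eps, drive, steps=2000):
    """Largest |Floquet multiplier| over one drive period for
    x'' + 2*gamma*x' + omega**2 * (1 + eps*cos(drive*t)) * x = 0.
    A value above 1 means the drive destabilises that oscillator."""
    period = 2.0 * math.pi / drive
    dt = period / steps

    def deriv(t, x, v):
        stiffness = omega * omega * (1.0 + eps * math.cos(drive * t))
        return v, -2.0 * gamma * v - stiffness * x

    columns = []
    for x, v in ((1.0, 0.0), (0.0, 1.0)):  # two independent initial conditions
        t = 0.0
        for _ in range(steps):  # classical fourth-order Runge-Kutta
            k1x, k1v = deriv(t, x, v)
            k2x, k2v = deriv(t + dt / 2, x + dt / 2 * k1x, v + dt / 2 * k1v)
            k3x, k3v = deriv(t + dt / 2, x + dt / 2 * k2x, v + dt / 2 * k2v)
            k4x, k4v = deriv(t + dt, x + dt * k3x, v + dt * k3v)
            x += dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
            v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
            t += dt
        columns.append((x, v))

    (m11, m21), (m12, m22) = columns  # monodromy matrix, column by column
    trace, det = m11 + m22, m11 * m22 - m12 * m21
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return max(abs((trace + root) / 2), abs((trace - root) / 2))

def engaged(freqs, gamma, eps=0.3, drive=2.0, tol=1e-3):
    """Count oscillators whose largest multiplier leaves the unit circle."""
    return sum(floquet_radius(w, gamma, eps, drive) > 1.0 + tol for w in freqs)

narrow = [0.95 + 0.01 * k for k in range(11)]      # one band around omega = 1
broad = [10.0 ** (k / 4.0) for k in range(-6, 7)]  # three decades, 13 oscillators

narrow_hits = engaged(narrow, gamma=0.0)
broad_hits = engaged(broad, gamma=0.0)
damped_hits = engaged(broad, gamma=0.5)
print("narrow band, undamped:", narrow_hits, "of", len(narrow))
print("three decades, undamped:", broad_hits, "of", len(broad))
print("three decades, damped:", damped_hits, "of", len(broad))
```

With the drive set to twice the band centre, the whole narrow band falls inside the primary resonance, whereas the same drive reaches only the part of a multi-decade spread that happens to sit near resonance. Adding strong damping removes even that. This toy omits the coupling and the input that the real network depends on, so it shows only the spectral-coherence intuition, not the result itself.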
The usual caveat applies — this is simulation throughout — and so does a second one: this is a negative result. We think it is a useful one. It closes a tempting line of work with a mechanism rather than a guess, and it leaves a positive finding standing. The right way to read this network’s memory is its driven stability spectrum, a diagnostic the standard edge-of-chaos picture overlooks in this setting.
The work also points forward. The same coupling that damps the network — oscillators of different speeds shedding energy into one another — is also a channel between them. The drive cannot tune the network, but information injected into one frequency band does reach the others through that lossy exchange. The next phase asks how much of it survives: whether a reading taken from the slow band can recover what was written into the fast band, through the coupling alone.
One paper is in preparation:
→ Floquet Engineering Fails on a Multi-Decade Oscillator Reservoir, and Why — the four null results, the spectral-coherence mechanism that unifies them, and the memory diagnostic that survives.
A reliability-weighted learning mechanism where component adaptability is determined by predictive accuracy. Plasticity reactivates automatically during distribution shifts. An investigation into managing the handoff from bootstrapping to intrinsic cortical function.