[R] Continuous Thought Machines: neural dynamics as representation. by /u/Gramious
Monday May 12th, 2025 at 1:18 PM

upvoted by clumma

Try our interactive maze-solving demo: https://pub.sakana.ai/ctm/

Continuous Thought Machines

arXiv: https://arxiv.org/abs/2505.05522
Interactive Website: https://pub.sakana.ai/ctm/
Blog Post: https://sakana.ai/ctm/
GitHub Repo: https://github.com/SakanaAI/continuous-thought-machines

We're excited to share our new research on Continuous Thought Machines (CTMs), a novel approach aiming to bridge the gap between computational efficiency and biological plausibility in artificial intelligence. We're sharing this work openly with the community and would love to hear your thoughts and feedback!

What are Continuous Thought Machines?

Most deep learning architectures simplify neural activity by abstracting away temporal dynamics. In our paper, we challenge that paradigm by reintroducing neural timing as a foundational element. The Continuous Thought Machine (CTM) is a model designed to leverage neural dynamics as its core representation.

Core Innovations:

The CTM has two main innovations:

Neuron-Level Temporal Processing: Each neuron uses unique weight parameters to process a history of incoming signals. This moves beyond static activation functions to cultivate richer neuron dynamics.
Neural Synchronization as a Latent Representation: The CTM employs neural synchronization as a direct latent representation for observing data (e.g., through attention) and making predictions. This is a fundamentally new type of representation distinct from traditional activation vectors.
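To make the two ideas above concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the linear-plus-tanh neuron model, the random input signal, and every function name are assumptions chosen for illustration. Each neuron gets its own private weights over a short history of inputs, and "synchronization" is taken here to be the matrix of pairwise inner products between the neurons' activation traces over time.

```python
import math
import random


def make_neuron_weights(num_neurons, history_len, seed=0):
    """One private weight vector per neuron (hypothetical stand-in for a per-neuron model)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(history_len)
    return [[rng.gauss(0.0, scale) for _ in range(history_len)]
            for _ in range(num_neurons)]


def neuron_step(weights, histories):
    """Each neuron maps its own recent input history to a new activation."""
    return [math.tanh(sum(w * h for w, h in zip(w_i, h_i)))
            for w_i, h_i in zip(weights, histories)]


def synchronization(traces):
    """Pairwise inner products of activation traces; traces[i] is neuron i over all ticks."""
    n = len(traces)
    return [[sum(a * b for a, b in zip(traces[i], traces[j])) for j in range(n)]
            for i in range(n)]


def run(num_neurons=4, history_len=3, ticks=5, seed=0):
    rng = random.Random(seed)
    weights = make_neuron_weights(num_neurons, history_len, seed)
    histories = [[0.0] * history_len for _ in range(num_neurons)]
    traces = [[] for _ in range(num_neurons)]
    for _ in range(ticks):
        # Assumption: a random signal stands in for whatever feeds each neuron.
        for h in histories:
            h.pop(0)
            h.append(rng.gauss(0.0, 1.0))
        for trace, act in zip(traces, neuron_step(weights, histories)):
            trace.append(act)
    return synchronization(traces)


sync = run()
```

In this toy version `sync` is a symmetric matrix with one row per neuron. A downstream layer would read (part of) that matrix instead of a single activation vector, which is the sense in which the representation depends on how neurons co-vary over time.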

Why is this exciting?

Our research demonstrates that this approach allows the CTM to:

Perform a diverse range of challenging tasks: Including image classification, solving 2D mazes, sorting, parity computation, question-answering, and RL tasks.
Exhibit rich internal representations: Offering a natural avenue for interpretation due to its internal process.
Perform tasks requiring sequential reasoning.
Leverage adaptive compute: The CTM can stop earlier for simpler tasks or continue computing for more challenging instances, without needing additional complex loss functions.
Build internal maps: For example, when solving 2D mazes, the CTM can attend to specific input data without positional embeddings by forming rich internal maps.
Store and retrieve memories: It learns to synchronize neural dynamics to store and retrieve memories beyond its immediate activation history.
Achieve strong calibration: For instance, in classification tasks, the CTM showed surprisingly strong calibration, a feature that wasn't explicitly designed for.
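As an illustration of the adaptive-compute point in the list above, the sketch below shows one possible stopping rule. It assumes the model emits class logits at every internal step and stops once a certainty score, defined here as 1 minus normalized entropy, crosses a threshold. The threshold value, the function names, and the stopping rule itself are assumptions for illustration; the post does not specify how stopping is decided.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def certainty(probs):
    """1 - normalized entropy: near 1.0 for a one-hot distribution, 0.0 for a uniform one."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return 1.0 - entropy / math.log(len(probs))


def predict_with_early_stop(logits_per_tick, threshold=0.8):
    """Return (tick, predicted_class) at the first tick whose certainty reaches
    the threshold, falling back to the last tick if none does.
    Expects a non-empty list of per-tick logit lists."""
    for tick, logits in enumerate(logits_per_tick):
        probs = softmax(logits)
        if certainty(probs) >= threshold:
            break
    return tick, probs.index(max(probs))


# Hypothetical logits: an "easy" input is confident immediately,
# a "hard" one only becomes confident at the third step.
easy = [[10.0, 0.0, 0.0], [10.0, 0.0, 0.0], [10.0, 0.0, 0.0]]
hard = [[0.1, 0.0, 0.0], [0.5, 0.0, 0.0], [10.0, 0.0, 0.0]]
easy_result = predict_with_early_stop(easy)
hard_result = predict_with_early_stop(hard)
```

With these made-up logits the easy input stops at the first step and the hard one uses all three, so compute scales with difficulty without any change to the training loss.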

Our Goal:

It is crucial to note that our approach advocates for borrowing concepts from biology rather than insisting on strict, literal plausibility. We took inspiration from a critical aspect of biological intelligence: that thought takes time.

The aim of this work is to share the CTM and its associated innovations, rather than solely pushing for new state-of-the-art results. We believe the CTM represents a significant step toward developing more biologically plausible and powerful artificial intelligence systems. We are committed to continuing work on the CTM, given the potential avenues of future work we think it enables.

We encourage you to check out the paper, interactive demos on our project page, and the open-source code repository. We're keen to see what the community builds with it and to discuss the potential of neural dynamics in AI!

submitted by /u/Gramious to r/MachineLearning
clumma

19 hours ago

Berkeley, CA

The Fastest Way Yet to Color Graphs by Steve Nadis
Monday May 12th, 2025 at 1:18 PM

Quanta Magazine

Here’s a scary scenario: You’ve been put in charge of air traffic control at Newark airport near New York. You need to make sure every plane can taxi between the runway and its gate without hitting any other planes. Let’s bring the power of mathematics to bear on your problem. First, create a big, abstract map of your airport. For each runway, taxiway and gate, mark a point. Then…

clumma

19 hours ago

Berkeley, CA

Absolute Zero Reasoner by jonbaer
Monday May 12th, 2025 at 11:38 AM

Hacker News - beefman's favorites

Article URL: https://andrewzh112.github.io/absolute-zero-reasoner/

Comments URL: https://news.ycombinator.com/item?id=43922341

Points: 69

# Comments: 14

clumma

20 hours ago

Berkeley, CA

First concrete for US advanced reactor by /u/Spare-Pick1606
Friday May 9th, 2025 at 12:22 AM

upvoted by clumma

https://www.world-nuclear-news.org/articles/first-concrete-for-us-advanced-reactor

submitted by /u/Spare-Pick1606 to r/nuclear
clumma

4 days ago

Berkeley, CA

[P] Introducing the Intelligent Document Processing (IDP) Leaderboard – A Unified Benchmark for OCR, KIE, VQA, Table Extraction, and More by /u/SouvikMandal
Thursday May 8th, 2025 at 1:47 PM

upvoted by clumma

The most comprehensive benchmark to date for evaluating document understanding capabilities of Vision-Language Models (VLMs).

What is it?
A unified evaluation suite covering 6 core IDP tasks across 16 datasets and 9,229 documents:

Key Information Extraction (KIE)
Visual Question Answering (VQA)
Optical Character Recognition (OCR)
Document Classification
Table Extraction
Long Document Processing (LongDocBench)
(Coming soon: Confidence Score Calibration)

Each task uses multiple datasets, including real-world, synthetic, and newly annotated ones.

Highlights from the Benchmark

Gemini 2.5 Flash leads overall, but surprisingly underperforms its predecessor on OCR and classification.
All models struggled with long document understanding – top score was just 69.08%.
Table extraction remains a bottleneck — especially for long, sparse, or unstructured tables.
Surprisingly, GPT-4o's performance decreased in the latest version (gpt-4o-2024-11-20) compared to its earlier release (gpt-4o-2024-08-06).
Token usage (and thus cost) varies dramatically across models — GPT-4o-mini was the most expensive per request due to high token usage.

Why does this matter?
There’s currently no unified benchmark that evaluates all IDP tasks together — most leaderboards (e.g., OpenVLM, Chatbot Arena) don’t deeply assess document understanding.

Document Variety
We evaluated models on a wide range of documents: invoices, forms, receipts, charts, tables (structured and unstructured), handwritten documents, and even text with diacritics.

Get Involved
We’re actively updating the benchmark with new models and datasets.

This was developed in collaboration with IIT Indore and Nanonets.

Leaderboard: https://idp-leaderboard.org/
Release blog: https://idp-leaderboard.org/details/
GitHub: https://github.com/NanoNets/docext/tree/main/docext/benchmark

Feel free to share your feedback!

submitted by /u/SouvikMandal to r/MachineLearning
clumma

4 days ago

Berkeley, CA

WeightWatchers files bankruptcy by bookofjoe
Thursday May 8th, 2025 at 1:20 AM

Hacker News - beefman's favorites

Article URL: https://www.wsj.com/articles/weightwatchers-files-bankruptcy-to-adapt-to-chemically-induced-weight-loss-future-a63aa8ac

Comments URL: https://news.ycombinator.com/item?id=43916411

Points: 60

# Comments: 216

clumma

5 days ago

Berkeley, CA
