clumma's blurblog

[R] DeepSeek-R1’s paper was updated 2 days ago, expanding from 22 pages to 86 pages and adding a substantial amount of detail. by /u/Nunki08
Thursday January 8^th, 2026 at 2:03 PM

upvoted by clumma

arXiv:2501.12948 [cs.CL]: https://arxiv.org/abs/2501.12948

submitted by /u/Nunki08 to r/MachineLearning
[link] [comments]

Read the whole story

clumma

2 days ago

reply

Berkeley, CA

The biggest breakthroughs in longevity science in 2025 by /u/statto
Thursday January 8^th, 2026 at 2:03 PM

upvoted by clumma

The biggest breakthroughs in longevity science in 2025

At The Longevity Initiative, we’re welcoming the new year with five articles reviewing the old one. From billion-dollar bets on cellular reprogramming to mice living longer, Netflix documentaries and even a leaked hot-mic of Xi and Putin discussing living to 150, 2025 kept aging science in the headlines. The field saw progress, setbacks, and growing debates about policy, equity, and hype.

We’ll be releasing these over the course of this week, starting with 2025 in longevity science (the link in this post). This will be followed by 2025 in longevity business, funding, medicine, and comms, policy and politics on Friday.

I’ll update this post with links as the new pieces go live, or you can follow us on social media (X, LinkedIn, Instagram, Bluesky and Facebook) or sign up for our newsletter to hear about them too.

submitted by /u/statto to r/longevity
[link] [comments]

Read the whole story

clumma

2 days ago

reply

Berkeley, CA

[D] Clean, self-contained PyTorch re-implementations of 50+ ML papers (GANs, diffusion, meta-learning, 3D) by /u/papers-100-lines
Thursday January 8^th, 2026 at 2:01 PM

upvoted by clumma

This repository collects clean, self-contained PyTorch reference implementations of over 50 machine learning papers, spanning GANs, VAEs, diffusion models, meta-learning, representation learning, and 3D reconstruction.

The implementations aim to:

Stay faithful to the original methods
Minimize boilerplate while remaining readable
Be easy to run and inspect as standalone files
Reproduce key qualitative or quantitative results where feasible

Repository (open-source):
https://github.com/MaximeVandegar/Papers-in-100-Lines-of-Code

Interested in hearing where clean, self-contained implementations are sufficient for understanding and reproducing results, and where additional engineering or scale becomes unavoidable.

submitted by /u/papers-100-lines to r/MachineLearning
[link] [comments]

Read the whole story

clumma

2 days ago

reply

Berkeley, CA

Stackoverflow: Questions asked per month over time. by /u/lelanthran
Thursday January 8^th, 2026 at 2:01 PM

upvoted by clumma

submitted by /u/lelanthran to r/programming
[link] [comments]

Read the whole story

clumma

2 days ago

reply

Berkeley, CA

Biosplice submits first osteoarthritis drug to FDA for consideration by /u/sg3510
Tuesday January 6^th, 2026 at 8:05 PM

upvoted by clumma

https://www.biospace.com/press-releases/biosplice-announces-the-submission-of-its-new-drug-application-nda-to-the-fda-for-lorecivivint-lor-to-treat-knee-osteoarthritis

This is the first to be submitted for FDA approval with disease-modifying potential and, if approved, would be the first ever drug to show improvement in pain and joint space width (ie evidence of regrowing cartilage).

Their first phase 2 study failed but their phase 2b with a smaller patient subset (with pain only in one knee) showed improvement.

submitted by /u/sg3510 to r/longevity
[link] [comments]

Read the whole story

clumma

3 days ago

reply

Berkeley, CA

[D] r/MachineLearning - a year in review by /u/Everlier
Sunday January 4^th, 2026 at 6:03 PM

upvoted by clumma

This is a review of most upvoted posts on this sub in 2025, loosely grouped into high-level themes. Many important news will be missing, however that is indicative of discussion lying elsewhere at that time. I hope that you'll find it informative.

Open-Source Parity and Training Efficiency

The year began with excitement about frontier models becoming accessible. DeepSeek R1 and its open-source distillations dominated discussion (386 upvotes, by u/Brief-Zucchini-180), though users noted that locally-runnable versions were distilled models (8B or 32B) rather than the full 671B version, performing at roughly GPT-3.5 level. The broader story was DeepSeek's decision to open-source (965 upvotes, by u/we_are_mammals), despite reportedly achieving 45x training efficiency gains. Discussion centered on monetization models - commenters drew parallels to Meta's Llama strategy, noting open-source drives adoption and hosting revenue while reducing self-hosting friction. By late year, a researcher replicated DeepSeek-R1-Zero's RL recipe on a 3B model for under $30 (278 upvotes, by u/Happysedits), though skepticism emerged about whether improvements represented genuine iterative development or data leakage.

The Conference Crisis

NeurIPS became a cautionary tale about scale. The community watched submission volumes climb to unprecedented levels (from 9k in 2022 to 25k in 2025 according to one discussion (243 upvotes, by u/lapurita)), with acceptance becoming "increasingly lottery-like." Reports emerged that NeurIPS was instructing Senior Area Chairs to reject already-accepted papers due to venue constraints (433 upvotes, by u/impatiens-capensis), despite positive reviews. AAAI 2026 received 29,000 submissions (201 upvotes, by u/Adventurous-Cut-7077), with roughly 20,000 from China, but reviewers reported widespread quality issues (incomplete implementations, unreproducible code, trivial errors). A researcher published a position paper arguing the current conference model is unsustainable (399 upvotes, by u/NuoJohnChen), citing environmental costs and mental health concerns alongside publication saturation.

The infrastructure groaned under the load. Overleaf went down ahead of a NeurIPS deadline (192 upvotes, by u/), overwhelming with simultaneous users. ArXiv announced it will stop accepting literature reviews and surveys without prior peer-review (399 upvotes, by u/NamerNotLiteral), citing LLM-generated spam, though discussion questioned whether a preprint site requiring prior publication undermined its original purpose. The arXiv migration from Cornell to Google Cloud Platform (265 upvotes, by u/sh_tomer) sparked concern about combining a platform rewrite with cloud migration (a risky dual undertaking).

Visa and Access Barriers

International researchers faced mounting obstacles. A researcher denied a U.S. B1 visa for ICCV 2025 in Hawaii (202 upvotes, by u/AnyIce3007) raised concerns that major venues should relocate to countries with fewer visa barriers. The discussion revealed widespread frustration - commenters shared personal experiences of visa denials and expressed reluctance to submit work to U.S.-based conferences. One commenter noted that AAAI 2026's Singapore location attracted significantly higher submissions from China, partly because visa accessibility was easier than previous U.S./Canada venues.

Research Integrity and Review Quality

The year exposed systemic problems in peer review and publishing integrity. A Tsinghua paper was withdrawn from ICLR after all four reviewers identified AI-generated citations (360 upvotes, by u/fourDnet), including fabricated references with fictitious authors like "Jane Doe." The incident sparked broader concerns about publication pressure in Chinese institutions where citation metrics drive promotion decisions.

More damaging was a researcher who discovered critical data quality issues in an Apple paper under review for ICLR 2026 (1555 upvotes, by u/diyer22). After adapting their model to the benchmark and getting poor results, they debugged the official code and found a critical bug: image content wasn't being passed to the vision language model. Manual inspection revealed approximately 30% error rates in the dataset. Reviewers missed it. After the researcher filed a public comment on OpenReview, the authors withdrew the paper and deleted the repository. The discussion praised the researcher's due diligence while acknowledging such issues are unfortunately common and often go undetected.

Another case involved a published 2024 ACL ArgMining paper on scientific fraud detection using fraudulent methodology itself (288 upvotes, by u/WhiteBear2018). The authors trained separate models per class and reported results as a single model, hardcoded a seed that collapsed one model, and deleted the repository when issues were raised.

Discussion coalesced around declining review quality at top ML conferences (191 upvotes, by u/BetterbeBattery). A researcher noted their theory-heavy paper received thoughtful reviews at AISTATS but faced dismissive reviews at NeurIPS and ICLR from reviewers who only commented on missing baselines. The consensus pointed to massive submission volumes forcing underqualified reviewers, zero incentive structures for quality, suspected AI-generated template reviews, and insufficient mathematical training. Commenters noted this affects both theoretical and empirical work and suggested alternative venues like TMLR and domain-specific journals showed better review standards.

Mamba's Disappearance

The year saw continued discussion of why Mamba faded despite initial hype. A discussion titled "Why Mamba disappeared" (190 upvotes, by u/Alarming-Power-813) revealed Mamba hasn't actually disappeared (7 survey papers were published last year), but lacks practical adoption outside research. Commenters noted that while Mamba showed theoretical promise, transformers have become deeply optimized across hardware and software stacks, making retraining massive models with unproven architecture economically unjustifiable when results are comparable or worse. Another thread tackled "Why MAMBA did not catch on" (264 upvotes, by u/TwoSunnySideUp) with commenters identifying that Mamba's real-world performance matches or underperforms well-optimized transformers, the mature transformer software stack creates significant switching costs, and Mamba's fixed state memory cannot selectively retrieve ignored tokens.

Vision Transformers vs. CNNs

The field remained unsettled on whether Vision Transformers have won in Computer Vision (197 upvotes, by u/Amgadoz). Discussion revealed a nuanced landscape: while transformers are increasingly preferred for many tasks and excel with large datasets, CNNs and hybrid architectures remain competitive in low-data regimes, medical imaging, and specialized domains. Commenters noted ConvNext provides strong alternatives, transformers require more memory and complicate variable image resolutions, and dataset quality matters more than architecture choice.

Infrastructure and GPU Competition

NVIDIA's cuML team announced GPU acceleration for scikit-learn, UMAP, and HDBSCAN without code changes (453 upvotes, by u/celerimo), reporting speedups including 25x for Random Forest and 175x for HDBSCAN. Users expressed interest though some questioned memory limitations compared to CPU-only execution.

Huawei's 96GB GPU under $2k sparked discussion (243 upvotes, by u/pmv143) about inference economics, but commenters identified critical limitations (the memory is LPDDR4 with lower bandwidth, lacks BF16 support, and software ecosystem remains immature). The consensus was that despite theoretical efficiency benefits, CUDA's dominance persists due to AMD's similar struggles and the software ecosystem maturity around GPUs.

Discussion emerged on why TPUs haven't achieved GPU prominence (212 upvotes, by u/DryHat3296). Commenters identified multiple factors: TPUs are primarily available only through Google Cloud (vendor lock-in), lack local development capabilities, have limited software support requiring JAX, and lag in features like FP8 training support.

Emerging Techniques and Tools

A developer released Termite, a CLI that generates terminal UIs from natural language prompts (310 upvotes, by u/jsonathan), though discussion centered on security implications of executing generated code and comparison to existing tools. Another developer released Promptimal, a CLI for optimizing prompts using a genetic algorithm (236 upvotes, by u/jsonathan).

A user built a Snake game with a Diffusion model as the game engine (537 upvotes, by u/jurassimo), predicting next frames from user input in near real-time. Discussion focused on training data, diffusion steps, and sampling schedulers.

A developer created torchvista, an interactive PyTorch visualization package for notebooks (283 upvotes, by u/Dev-Table) showing model forward passes as expandable computation graphs. Another created ml-visualized.com combining interactive visualizations with mathematical derivations (426 upvotes, by u/Bright_Aioli_1828) using marimo and Jupyter notebooks.

A developer released an LLM-powered Python debugger allowing natural language queries about program state (202 upvotes, by u/jsonathan). Someone created a lightweight manga generation model by finetuning Pixart-Sigma on 20 million manga images (191 upvotes, by u/fumeisama), supporting character consistency through embeddings from a pre-trained manga encoder.

A researcher introduced DF11 (Dynamic-Length Float) compression reducing BF16 models to 70% size during inference (199 upvotes, by u/choHZ), enabling models like Llama 3.1 405B to fit on 8x H100s.

Diffusion and Generative Models

Researchers demonstrated generative models trained only on furniture and cars somehow generalized to segment basically everything else (321 upvotes, by u/PatientWrongdoer9257). The approach finetuned Stable Diffusion and MAE for instance segmentation using only furniture and cars with novel instance coloring loss, yet generalized to unseen categories.

Google released Gemini Diffusion, a text generation model using diffusion rather than autoregressive approaches (270 upvotes, by u/hiskuu). Commenters noted diffusion models theoretically allow tokens to be refined globally rather than generated strictly left-to-right, potentially addressing limitations of autoregressive generation.

Researchers built NeuralOS, an experimental generative operating system generating every screen pixel from user inputs (590 upvotes, by u/yuntiandeng) at 1.8fps on H100 using an RNN and diffusion model. Discussion acknowledged impracticality but noted novelty as a tech demonstrator.

Interpretability and Understanding

Anthropic released a paper on interpretability using attribution graphs to trace internal mechanisms (230 upvotes, by u/hiskuu) across tasks including reasoning, poetry planning, and refusal. Discussion focused heavily on biological metaphors, with critics arguing these anthropomorphize pattern-matching without genuine foresight.

A researcher shared work on LLM circuit visualization extending 3Blue1Brown concepts (213 upvotes, by u/ptarlye) using mechanistic interpretability to decompose how models process specific examples. Discussion addressed framings of model behavior, with commenters noting attention works through learned statistical processes rather than symbolic rules.

Researchers showed LLMs can be converted to locally linear systems at inference time (239 upvotes, by u/jamesvoltage), achieving reconstruction error around 10⁻⁶. However, limitations emerged - the linear system is input-sequence-specific and takes 10+ seconds to compute for 3B models.

Performance Benchmarking and Reasoning

Gemini officially achieved gold-medal standard at the International Mathematical Olympiad (227 upvotes, by u/currentscurrents). Discussion centered on concerns about validation, compute requirements, and contradictions with models struggling on easier problems. A post analyzed reasoning model limitations (208 upvotes, by u/Fair-Rain3366) finding they exhibit catastrophic failure rather than graceful degradation - maintaining high accuracy up to a complexity threshold before collapsing.

CompressARC achieved 34.75% on ARC without pretraining (246 upvotes, by u/currentscurrents), training small networks during inference on individual puzzles in roughly 20 minutes. Discussion touched connections to test-time adaptation and whether just-in-time training will become more prevalent.

A researcher evaluated LLMs on real-world software engineering tasks from Upwork (197 upvotes, by u/Successful-Western27), creating a $1M benchmark with Claude 3.5 Sonnet earning $208,050 but resolving only 26.2% of tasks. Discussion centered on whether benchmarks capture isolated task completion rather than realistic scenarios within established codebases.

A post analyzing 400+ ML competitions from 2024 (391 upvotes, by u/hcarlens) found Python nearly universal among winners, PyTorch dominates at 9:1 over TensorFlow, CNNs still outpace transformers in computer vision, and quantization/LoRA increasingly common in language model competitions.

Activation Functions and Architecture Components

A post discussed why cosine similarity isn't the silver bullet (460 upvotes, by u/skeltzyboiii) from Netflix and Cornell researchers. Discussion revealed disagreement about novelty (commenters noted the issue is using cosine similarity on embeddings trained with losses that don't optimize for angular distances, not with cosine similarity itself).

A user sparked discussion critiquing softmax (269 upvotes, by u/Sad-Razzmatazz-5188), highlighting that it only cares about differences between inputs, not absolute magnitudes. Discussion revealed fundamental disagreements about whether properties are bugs or features (defenders argued invariance to scaling is intentional and desirable for learning probability distributions).

Researchers introduced SUGAR (Surrogate Gradient Learning for ReLU) (235 upvotes, by u/Radiant_Situation340) addressing dying ReLU by using smooth surrogate gradients during backpropagation. Discussion raised concerns about overhead and inconsistencies between claimed benefits and evidence.

Meta researchers proposed Transformers without Normalization using Dynamic Tanh (270 upvotes, by u/Nunki08). Discussion was mixed (some found work interesting, others criticized lack of theoretical justification and questioned results at small scales).

A researcher introduced the Periodic Linear Unit (PLU) based on Fourier synthesis (230 upvotes, by u/bill1357). Commenters highlighted insufficient literature review, lack of comparison with existing periodic functions like SIREN, and unfair baselines, cautioning the core idea may have merit but requires substantial additional work.

Training Techniques and Adaptation

Sakana AI introduced Transformer², a framework for real-time LLM adaptation (188 upvotes, by u/hardmaru) modifying only singular components rather than full fine-tuning. However, discussion revealed mixed results (significant gains on smaller models but minimal improvement on 70B models).

A researcher presented TMemNet-I with irreversible memory updates (264 upvotes, by u/No_Release_3665) using entropy-based decay. Discussion revealed skepticism about whether irreversibility is necessary versus a biological constraint, and questions about architectural details.

LeJEPA was presented as theoretically grounded for self-supervised learning (303 upvotes, by u/jacobgorm), using Sketched Isotropic Gaussian Regularization to enforce optimal embedding representations. Discussion praised theoretical contribution but raised questions about practical efficiency and generalization.

Emerging Research Areas

Andrew Barto and Richard Sutton were awarded the 2024 ACM A.M. Turing Award (422 upvotes, by u/MTGTraner) for foundational reinforcement learning contributions. Discussion emphasized the 40-year journey from 1980s breakthroughs to real-world applications.

AI-designed proteins neutralized lethal snake venom (242 upvotes, by u/prototypist) using AlphaFold 2 and RFdiffusion. Discussion noted while de novo design is significant, the actual therapeutic challenge is achieving selectivity without harming human tissue.

Meta released DINOv3 trained on 1.7B images (219 upvotes, by u/say_wot_again) achieving state-of-the-art results with linear probing, plus satellite imagery-specific variants. Discussion focused on evaluation methodology and whether compute requirements justify adoption.

A GPU mini-grant program was announced (186 upvotes, by u/tczoltan) to provide computational resources where computing power is the limiting factor. The initiative aimed to democratize access similar to how personal computing replaced mainframes.

Bloat in machine learning shared libraries was quantified at >70% (353 upvotes, by u/Specialist_Square818), with Negativa-ML reducing device code by up to 75% and total size by 55%. Discussion attributed bloat to historical gaps in GPU programming expertise and redundant operations across libraries.

Reasoning About Economic Impact

Ilya Sutskever expressed puzzlement at the gap between AI benchmarks and economic impact (442 upvotes, by u/we_are_mammals). Commenters offered several explanations: AI tools struggle with end-to-end task completion, benchmarks may overfit to specific metrics, and institutional integration takes time similar to Solow Paradox patterns.

Larry Ellison claimed inference is where AI money will be made (210 upvotes, by u/pmv143). While there was agreement that inference represents monetization (versus training as a cost), skepticism dominated about Oracle's competitive viability given custom chips from cloud providers.

Career and Community Issues

A senior ML engineer with 9 years of experience expressed concern about career stagnation (398 upvotes, by u/Only_Emergencies) as roles shifted from building models to integrating APIs. Discussion revealed widespread agreement that engineers who trained models in the 2010s now spend most time on API integration and infrastructure.

A PhD student with strong publication credentials asked how to secure research scientist internships (190 upvotes, by u/ParticularWork8424). Responses emphasized that venue prestige matters less than real-world impact, but networking and referrals remain critical barriers.

A user posted about mental health struggles during NLP graduate research (209 upvotes, by u/moji-mf-joji), motivated by Felix Hill's encouragement before his death. The essay sparked discussions where readers shared difficult experiences in PhD programs, emphasizing the importance of normalizing conversations about these challenges.

Discussion emerged on preparing for a DeepMind Gemini Team interview (238 upvotes, by u/Healthy_Fisherman_88). Respondents emphasized ML system design differs from traditional software engineering - focusing on throughput, memory constraints, latency tradeoffs, and KV cache optimization rather than conventional distributed systems.

A candidate who rejected a solid offer while waiting for a dream job (193 upvotes, by u/DNNenthusiast) found themselves unemployed when both fell through. Most commenters agreed they should have accepted and resigned later - a strategy several reported using successfully.

Information Quality and Misinformation

A user raised concerns about the proliferation of misinformation on social media (375 upvotes, by u/Striking-Warning9533). Commenters identified self-appointed experts using imprecise terminology, LLMs enabling people to appear knowledgeable without understanding mechanics, and media personalities offering conflicting narratives on unsettled questions.

A discussion emerged about LLMs validating people with delusional thinking (319 upvotes, by u/GodIsAWomaniser). Concerns centered on LLM sycophancy creating reinforcing feedback loops - when external criticism is faced, users return to chatbots for validation, further isolating them from reality.

Educational Resources

3Blue1Brown's video explaining attention mechanisms received appreciation (395 upvotes, by u/yogimankk) for visual explanations and pedagogical approach. Commenters clarified a rushed explanation about causal masking and referenced complementary resources.

A developer released beyond-nanoGPT, a 20k+ line educational repository (247 upvotes, by u/tanishqkumar07) implementing modern deep learning from scratch. While praised for bridging theory and practice, critiques included missing test suites, specific technical errors, and skepticism about AI-generated portions.

Stanford announced an updated Deep Learning course (273 upvotes, by u/al3arabcoreleone). Discussion noted alternative courses from CMU and Andrew Ng while expressing interest in what specifically changed.

Discussion on Yann LeCun's Positions

Discussion emerged around Yann LeCun's claim that auto-regressive LLMs are fundamentally limited (356 upvotes, by u/hiskuu). While commenters acknowledged concerns about scaling limitations, several challenged his specific probability-of-correctness argument. Broader consensus suggested auto-regressive approaches may not be sufficient for AGI but remain practical SOTA.

A user asked for clarification on LeCun's comparison of human sensory data to YouTube uploads (434 upvotes, by u/turhancan97). Discussion centered on whether this highlights multimodal sensory learning advantages over text-based training, with counterpoints about blind children learning language effectively.

Hardware Utilization and System Design

An interview question about calculating hardware utilization (559 upvotes, by u/Arqqady) sparked discussion showing calculations arriving at approximately 21.6% utilization. Commenters highlighted significant ambiguities (the answer depends heavily on unstated architectural details, making precise answers difficult). Skepticism emerged about the question's pedagogical value, with some noting it functions as trivia rather than assessing practical ML engineering ability.

Miscellaneous Technical Work

A discussion examined legacy tools like Performer Attention (244 upvotes, by u/theMonarch776) as potential game-changers. Commentary revealed practical limitations (underperformance in LLMs and being superseded by alternatives like Flash Attention).

A researcher built an LSTM-based malware packer (343 upvotes, by u/Acanthisitta-Sea) storing code in model weights through intentional overfitting. Security engineers raised significant practical limitations (the technique only evades trivial static detection and would still be detected once unpacked in memory).

A user shared a knowledge graph traversal approach for RAG systems (312 upvotes, by u/Alieniity). Discussion clarified the work implements semantic similarity graph traversal rather than true knowledge graph construction requiring typed entities and relations.

A user addressed image denoising model performance (608 upvotes, by u/Nyaalice) on smooth noise types. Suggestions included treating as upsampling problem, switching to U-Net, artificially varying noise distributions, and exploring Plug-and-Play methods.

A researcher proposed using eigenvalues as computational primitives (205 upvotes, by u/alexsht1). Discussion highlighted significant concerns (non-differentiability and set-valued nature pose implementation challenges, and eigendecomposition is O(n³)).

Year-End Reflections

The subreddit discussed best papers of 2025 (228 upvotes, by u/ArtisticHamster). Top-voted comment identified DeepSeek R1/V3 and Diffusion-based models as most impactful, followed by Vision Language Action models for robotics and Reasoning models.

A proposal to add "no AI slop" as a subreddit rule (207 upvotes, by u/qalis) received mixed support. While commenters endorsed the idea for giving moderators clearer grounds, detractors raised concerns about enforcement and distinguishing AI assistance from human-written content.

P.S. Special recognition to the researcher who publicly documented data quality issues in the Apple ICLR paper (1555 upvotes) - due diligence like this was rare and needed. Also notable: the Tsinghua fake citations case, the ACL fraud detection paper using fraudulent methodology, and the broader recognition that massive submission volumes have broken the peer review system in ways that need structural rather than merely procedural fixes. The award to Barto and Sutton for reinforcement learning reminded the field that foundational work takes 40 years to pay off.

submitted by /u/Everlier to r/MachineLearning
[link] [comments]

Read the whole story

clumma

5 days ago

reply

Berkeley, CA

[R] DeepSeek-R1’s paper was updated 2 days ago, expanding from 22 pages to 86 pages and adding a substantial amount of detail. by /u/Nunki08 Thursday January 8th, 2026 at 2:03 PM

The biggest breakthroughs in longevity science in 2025 by /u/statto Thursday January 8th, 2026 at 2:03 PM

[D] Clean, self-contained PyTorch re-implementations of 50+ ML papers (GANs, diffusion, meta-learning, 3D) by /u/papers-100-lines Thursday January 8th, 2026 at 2:01 PM

Stackoverflow: Questions asked per month over time. by /u/lelanthran Thursday January 8th, 2026 at 2:01 PM

Biosplice submits first osteoarthritis drug to FDA for consideration by /u/sg3510 Tuesday January 6th, 2026 at 8:05 PM

[D] r/MachineLearning - a year in review by /u/Everlier Sunday January 4th, 2026 at 6:03 PM

Open-Source Parity and Training Efficiency

The Conference Crisis

Visa and Access Barriers

Research Integrity and Review Quality

Mamba's Disappearance

Vision Transformers vs. CNNs

Infrastructure and GPU Competition

Emerging Techniques and Tools

Diffusion and Generative Models

Interpretability and Understanding

Performance Benchmarking and Reasoning

Activation Functions and Architecture Components

Training Techniques and Adaptation

Emerging Research Areas

Reasoning About Economic Impact

Career and Community Issues

Information Quality and Misinformation

Educational Resources

Discussion on Yann LeCun's Positions

Hardware Utilization and System Design

Miscellaneous Technical Work

Year-End Reflections

[R] DeepSeek-R1’s paper was updated 2 days ago, expanding from 22 pages to 86 pages and adding a substantial amount of detail. by /u/Nunki08
Thursday January 8^th, 2026 at 2:03 PM

The biggest breakthroughs in longevity science in 2025 by /u/statto
Thursday January 8^th, 2026 at 2:03 PM

[D] Clean, self-contained PyTorch re-implementations of 50+ ML papers (GANs, diffusion, meta-learning, 3D) by /u/papers-100-lines
Thursday January 8^th, 2026 at 2:01 PM

Stackoverflow: Questions asked per month over time. by /u/lelanthran
Thursday January 8^th, 2026 at 2:01 PM

Biosplice submits first osteoarthritis drug to FDA for consideration by /u/sg3510
Tuesday January 6^th, 2026 at 8:05 PM

[D] r/MachineLearning - a year in review by /u/Everlier
Sunday January 4^th, 2026 at 6:03 PM