1-dan master of the unyielding fist of Bayesian inference
6231 stories · 1 follower

NRC issues first commercial reactor construction approval in 10 years [pdf]


Article URL: https://www.nrc.gov/sites/default/files/cdn/doc-collection-news/2026/26-028.pdf

Comments URL: https://news.ycombinator.com/item?id=47254516

Points: 142

# Comments: 163

clumma, Berkeley, CA · 5 minutes ago

Apple Studio Display and Studio Display XDR


Article URL: https://www.apple.com/newsroom/2026/03/apple-unveils-new-studio-display-and-all-new-studio-display-xdr/

Comments URL: https://news.ycombinator.com/item?id=47232421

Points: 234

# Comments: 322

clumma, Berkeley, CA · 5 minutes ago

AI-generated art can’t be copyrighted after Supreme Court declines review


Article URL: https://www.theverge.com/policy/887678/supreme-court-ai-art-copyright

Comments URL: https://news.ycombinator.com/item?id=47232289

Points: 193

# Comments: 151

clumma, Berkeley, CA · 5 minutes ago

Claude’s Cycles - Don Knuth

submitted by /u/mttd to r/compsci
clumma, Berkeley, CA · 3 days ago

BullshitBench v2 dropped and… most models still can’t smell BS (Claude mostly can)


This benchmark is kind of brutal.

BullshitBench v2 just came out, and unlike most evals where every new release “wins,” this one says a lot of models are basically not improving at detecting confident nonsense.

What changed in v2:

- 100 new questions

- domain split: coding (40), medical (15), legal (15), finance (15), physics (15)

- 70+ model variants tested

- fully open: questions, scripts, responses, judgments
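
As a quick sanity check on the domain split listed above, here is a minimal Python sketch. The counts are copied from the post; the variable names are illustrative and not taken from the benchmark repo. It confirms that the per-domain counts add up to the 100 new questions and shows each domain's share.

```python
# Domain counts as listed in the post; names here are illustrative only,
# not identifiers from the BullshitBench repository.
domain_counts = {
    "coding": 40,
    "medical": 15,
    "legal": 15,
    "finance": 15,
    "physics": 15,
}

# Total number of questions across all domains.
total = sum(domain_counts.values())

# Fraction of the question set that each domain contributes.
shares = {domain: count / total for domain, count in domain_counts.items()}

for domain, share in shares.items():
    print(f"{domain:<8} {domain_counts[domain]:>3}  {share:.0%}")
print(f"total    {total:>3}")
```

Coding makes up the largest single slice, so an unweighted overall score leans more on coding questions than on any other one domain.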

Main takeaways:

- Anthropic’s latest models are crushing it

- Qwen is also very strong

- OpenAI + Google models reportedly still struggling here

- domain barely changes outcomes (BS detection is similarly hard across fields)

- reasoning mode doesn’t help much, maybe even hurts

- newer model ≠ better model on this task

The data explorer is honestly the best part: you can inspect results question by question and see where models confidently hallucinate.

Links:

- https://petergpt.github.io/bullshit-benchmark/viewer/index.v2.html

- https://github.com/petergpt/bullshit-benchmark

Curious what people think this is actually measuring: calibration? epistemic humility? something else?

Because whatever it is, most models still look shaky.

submitted by /u/snakemas to r/CompetitiveAI
clumma, Berkeley, CA · 3 days ago

France to increase nuclear arsenal, stop sharing warhead numbers, and potentially deploy weapons across Europe


In a speech at the SSBN base at Île Longue, French President Macron said that, due to "an increasing risk of conflicts globally crossing the nuclear threshold," France will increase its nuclear arsenal and will "no longer communicate the number of nuclear warheads."

France also plans to potentially deploy French nuclear forces in other countries, and has invited Germany, Greece, Poland, the Netherlands, Belgium, and Denmark to participate in nuclear drills. The US already deploys weapons in several European countries under a so-called nuclear umbrella.

France currently has an estimated 290 warheads, the UK ~225, while the US and Russia both have well over 5,000.

https://www.reuters.com/world/europe/macron-says-france-will-increase-size-its-nuclear-arsenal-2026-03-02/

https://www.wsj.com/world/europe/france-floats-nuclear-deployment-across-europe-056a5cbc

submitted by /u/Afrogthatribbits to r/nuclearweapons
clumma, Berkeley, CA · 3 days ago