Curated by Shen Huang · 88 stories · ~13 min read
DIGEST · 2026-06-24

OrangeBot.AI Digest — 2026-06-24

88 headlines across 8 sources, aggregated for this day.

Hacker News (15)

  1. OpenAI unveils its first custom chip, built by Broadcom (techcrunch.com)
  2. NSA lost access to Mythos amid Anthropic dispute (www.nytimes.com)
  3. For Most of the World, Open-Source AI Is the Only Way Forward (techstrong.ai)
  4. There are a few things that I look back on as my mistakes in the early days (twitter.com)
  5. RubyLLM: A Ruby framework for all major AI providers (rubyllm.com)
  6. Stealing Is a Skill (ben-mini.com)
  7. A Practical Guide to SSH Tunnels: Local and Remote Port Forwarding (labs.iximiuz.com)
  8. Slate EV truck starts at $24,950 (www.slate.auto)
  9. Founding a company in Germany: €9600, 152 days and I still can't send an invoice (paolino.me)
  10. Reid Hoffman says SpaceX 'not an AI company', xAI 'complete train wreck' (fortune.com)
  11. Krea 2: SOTA open-weights 12B image model (www.krea.ai)
  12. We’re making Bunny DNS free (bunny.net)
  13. "Fix" MacBook Neo Cursor Lag: Record 1 Pixel of the Screen Every 10 Seconds (gist.github.com)
  14. Show HN: An ASCII 3D Rendering Engine (glyphcss.com)
  15. Raspberry Pi Pico W as USB Wi-Fi Adapter (gitlab.com)

GitHub Trending (13)

  1. calesthio / OpenMontage
  2. ZhuLinsen / daily_stock_analysis
  3. apple / container
  4. interviewstreet / hiring-agent
  5. JCodesMore / ai-website-cloner-template
  6. revfactory / harness
  7. flutter / flutter
  8. andreknieriem / headunit-revived
  9. stablyai / orca
  10. google-labs-code / design.md
  11. Flowseal / zapret-discord-youtube
  12. kunchenguid / no-mistakes
  13. NousResearch / hermes-agent

Product Hunt (15)

  1. Crewdle AI

    Use every business AI tool without every subscription

  2. Tencent EdgeOne Makers

    Ship AI agents like web apps, in minutes.

  3. Prospector by Synter

    Your outbound agent, right inside Slack

  4. Stripe.Directory

    New way for you & agents to search for businesses on Stripe

  5. Mindstone Rebel

    AI workspace for agents that know your work and ask first

  6. React UI Kit V7

    All the chat components you need. None of the complexity

  7. Ruby

    Ask better questions, live on every call

  8. Nimt

    Your AI Search Coworker in Slack

  9. FUTO Swipe

    Open models for on-device swipe typing

  10. StaleMate PR

    Your menu bar turns red when PRs pile up

  11. Customer Relationship Agents by Clarify

    The M in CRM shouldn't be you

  12. Swimio

    AI swim coach with Apple Watch tracking & smart workouts

  13. Propane

    Automatic customer context for product teams and agents

  14. Buy by Agentcard

    Order DoorDash from Claude

  15. jebi

    A supercharged terminal for Mac with built-in local AI

Hugging Face (15)

  1. Qwen-AgentWorld: Language World Models for General Agents

    A world model predicts environment dynamics based on current observations and actions, serving as a core cognitive mechanism for reasoning and planning. In this work, we investigate how world modeling based on language models can further push the boundaries of general agents. (i) We first focus on building foundation models for agentic environment simulation. We introduce Qwen-AgentWorld-35B-A3B and Qwen-AgentWorld-397B-A17B, the first language world models capable of simulating agentic environments covering 7 domains via long chain-of-thought reasoning. Leveraging more than 10M environment interaction trajectories of 7 domains in real-world environments, we develop Qwen-AgentWorld through a three-stage training pipeline: CPT injects general-purpose world modeling capabilities from the state transition dynamics and augmented professional corpora, SFT activates next-state-prediction reasoning, and RL sharpens simulation fidelity through a tailored framework with hybrid rubric-and-rule rewards. To evaluate language world models, we present AgentWorldBench, a comprehensive benchmark constructed from real-world interactions of 5 frontier models on 9 established benchmarks. Empirical results demonstrate that Qwen-AgentWorld significantly outperforms existing frontier models. (ii) Beyond foundation models, we further investigate two complementary paradigms through which world modeling enhances general agents. First, as a decoupled environment simulator, Qwen-AgentWorld supports scalable and controllable simulation of thousands of real-world environments for agentic RL, yielding gains that surpass real-environment training alone. Second, as a unified agent foundation model, world-model training acts as a highly effective warm-up that improves downstream performance across 7 agentic benchmarks. Code: https://github.com/QwenLM/Qwen-AgentWorld

  2. NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?

    We introduce NatureBench, a cross-discipline benchmark of 90 tasks distilled from peer-reviewed Nature-family publications, designed to evaluate whether AI coding agents can move beyond reproduction toward discovery on real scientific problems. NatureBench is built on NatureGym, an automated pipeline that constructs a standardized, per-task containerized environment from a source paper, addressing the environment-fragmentation problem that has limited the credibility of prior agent-on-research benchmarks. Evaluating ten frontier agent configurations under a strict web-search-disabled protocol, we find that the strongest model surpasses SOTA on only 17.8% of tasks under the g>0.1 criterion. Analysis of method pathways reveals that agents succeed primarily through methodological translation, converting scientific tasks into familiar supervised prediction problems, rather than through genuine scientific invention. Failures are dominated by wrong method choice and insufficient compute budget, not by task misunderstanding. We release the benchmark, the NatureGym pipeline, and a public leaderboard with maintainer-side reproduction. Code: https://github.com/FrontisAI/NatureBench

  3. MobileForge: Annotation-Free Adaptation for Mobile GUI Agents with Hierarchical Feedback-Guided Policy Optimization

    MLLM-based mobile GUI agents have made substantial progress in UI understanding and action execution, but adapting them to real target apps remains costly because mobile apps are numerous, frequently updated, and hard to cover with human-written tasks, demonstrations, or reward labels. Existing annotation-free GUI learning reduces manual supervision, yet lacks a unified substrate connecting target-app exploration, curriculum mining, rollout execution, and feedback, while policy optimization often relies on isolated rollouts and coarse rewards that are hard to convert into reliable improvement signals. We present MobileForge, an annotation-free adaptation system for mobile GUI agents. MobileForge consists of MobileGym, which grounds task generation and rollout evaluation in real mobile app interaction, and Hierarchical Feedback-Guided Policy Optimization (HiFPO), which turns trajectory outcomes, step-level process feedback, and corrective hints into hint-contextualized step-level GRPO updates. Using only automatically generated annotation-free adaptation data, MobileForge adapts Qwen3-VL-8B to 67.2% Pass@3 on AndroidWorld, close to the closed-data GUI-specialized GUI-Owl-1.5-8B base model at 69.0%. The MobileForge-adapted ForgeOwl-8B further reaches 77.6% Pass@3 on AndroidWorld and 41.0% success on the out-of-domain MobileWorld GUI-only split, establishing the strongest open-data mobile GUI agent in our evaluation. Code, data, and trained models will be released at https://mobile-forge.github.io/.

  4. MemGUI-Agent: An End-to-End Long-Horizon Mobile GUI Agent with Proactive Context Management

    MLLM-based mobile GUI agents have made substantial progress on short-horizon tasks, yet remain unreliable on long-horizon tasks that require retaining intermediate facts across many steps and app transitions. We attribute this limitation to ReAct-style prompting, which passively accumulates per-step records, leading to prompt explosion and dilution of critical cross-app facts. To address this, we introduce MemGUI-Agent, an end-to-end long-horizon mobile GUI agent with proactive context management. MemGUI-Agent is built on Context-as-Action (ConAct), which casts context management as first-class actions emitted by the same policy that selects UI actions. Instead of passively appending history, ConAct maintains three structured context fields: folded action history, folded UI state, and recent step record, preserving critical UI facts while keeping context compact. To make proactive context management learnable across model scales, we construct MemGUI-3K, a 2,956-trajectory dataset with full ConAct annotations for supervised training and offline analysis. Training an 8B model on MemGUI-3K produces MemGUI-8B-SFT, an 8B MemGUI-Agent that achieves the best open-data 8B performance on MemGUI-Bench and generalizes to the out-of-distribution MobileWorld benchmark. Code, data, and trained models will be released at https://memgui-agent.github.io/.

  5. AOHP: An Open-Source OS-Level Agent Harness for Personalized, Efficient and Secure Interaction

    AI agents are driving a new software paradigm, with the ability to autonomously call tools, extract information, manage memory, and complete tasks that span applications and data sources. Most existing end-user operating systems, however, are designed for application-centric workflows and offer little native support for AI agents. This mismatch limits the wider adoption of agents and leads to execution overhead and safety risks when running agents on conventional systems. While the concept of agent-native operating systems is emerging, the research community lacks an open testbed to explore the architectural primitives desired for agent-mediated interaction. We present AOHP (Android Open Harness Project), an OS-level agent harness built on the Android Open Source Project (AOSP). The core design principle of AOHP is to treat agents as first-class OS actors, enabling adaptive user interfaces and agent-friendly runtime environments. AOHP preserves the mature Android software and hardware ecosystem while introducing three agent-oriented system mechanisms: personalized service composition, efficient agent interfaces, and secure information flow. Based on preliminary experiments on challenging tasks covering key capabilities of OS agents, AOHP shows clear advantages in task completion (+21.12% completion rate), execution cost (-51.55% token cost), and security-policy compliance.

  6. OpenThoughts-Agent: Data Recipes for Agentic Models

    Agentic language models dramatically expand the applications of AI, yet little is publicly known about how to curate training data for broadly capable agents. Existing open efforts such as SWE-Smith, SERA, and Nemotron-Terminal typically target a single benchmark, leaving open the question of how to train models that generalize across diverse agentic tasks. The OpenThoughts-Agent (OT-Agent) project addresses this gap with a fully open data curation pipeline for training agentic models. We conduct more than 100 controlled ablation experiments to systematically investigate each stage of the pipeline, yielding insights on the importance of task sources and diversity. We then assemble a training set of 100K examples from our pipeline and fine-tune Qwen3-32B on this dataset, which yields an average accuracy of 44.8% across seven agentic benchmarks and a 3.9 percentage point improvement over the strongest existing open data agentic model (Nemotron-Terminal-32B, 40.9%). Moreover, our training data exhibits strong scaling properties, outperforming alternative open datasets at every training set size in compute-controlled comparisons. We publicly release our training sets, data pipeline, experimental data, and models at openthoughts.ai to support future open research on agentic model training.

  7. LingxiDiagBench: A Multi-Agent Framework for Benchmarking LLMs in Chinese Psychiatric Consultation and Diagnosis

    Mental disorders are highly prevalent worldwide, but the shortage of psychiatrists and the inherent subjectivity of interview-based diagnosis create substantial barriers to timely and consistent mental-health assessment. Progress in AI-assisted psychiatric diagnosis is constrained by the absence of benchmarks that simultaneously provide realistic patient simulation, clinician-verified diagnostic labels, and support for dynamic multi-turn consultation. We present LingxiDiagBench, a large-scale multi-agent benchmark that evaluates LLMs on both static diagnostic inference and dynamic multi-turn psychiatric consultation in Chinese. At its core is LingxiDiag-16K, a dataset of 16,000 EMR-aligned synthetic consultation dialogues designed to reproduce real clinical demographic and diagnostic distributions across 12 ICD-10 psychiatric categories. Through extensive experiments across state-of-the-art LLMs, we establish key findings: (1) although LLMs achieve high accuracy on binary depression-anxiety classification (up to 92.3%), performance deteriorates substantially for depression-anxiety comorbidity recognition (43.0%) and 12-way differential diagnosis (28.5%); (2) dynamic consultation often underperforms static evaluation, indicating that ineffective information-gathering strategies significantly impair downstream diagnostic reasoning; (3) consultation quality assessed by LLM-as-a-Judge shows only moderate correlation with diagnostic accuracy, suggesting that well-structured questioning alone does not ensure correct diagnostic decisions. We release LingxiDiag-16K and the full evaluation framework to support reproducible research at https://github.com/Lingxi-mental-health/LingxiDiagBench.

  8. Semantic Browsing: Controllable Diversity for Image Generation

    Modern text-to-image models excel in visual fidelity and prompt adherence. However, this strict adherence comes at the cost of diversity: generated samples tend to collapse into a single visual interpretation. Existing methods to improve diversity produce outputs driven by incidental variations rather than meaningful design choices. This motivates a new variant of the diversity task where structure is enforced on the generated samples. We introduce a method for controlled diversity that enables Semantic Browsing, where users can navigate structured image galleries and experience creative exploration through a systematic traversal of meaningful, interpretable axes of variation. Achieving this level of semantic control requires a deep understanding of the scene. We exploit the fact that recent text-to-image models are trained on elaborated captions, effectively decoupling semantic decision-making from pixel generation. This enables a paradigm shift: instead of relying on stochastic variation within the text-to-image model, we induce diversity directly at the text level. By leveraging rich textual representations, we allow a Vision Language Model (VLM) to operate on the full scene context. To overcome the generic outputs typical of standard VLMs, we employ an agentic workflow that explicitly enforces structured variation attuned to the original prompt. We demonstrate that our method produces diverse and navigable design spaces where every variation corresponds to a specific, user-understandable semantic decision.

  9. FLAT: Feedforward Latent Triangle Splatting for Geometrically Accurate Scene Generation

    Generating explorable 3D scenes from a single image requires strong generative priors and accurate geometric representations suitable for downstream use. Current video diffusion models offer high-quality generation and implicitly encode multi-view geometric structure in latent space. However, existing feedforward latent scene decoders typically output volumetric 3D Gaussians that lack a well-defined surface, limiting their use in simulation or standard graphics pipelines. This motivates decoding surface-aligned primitives that are not only renderable but also closer to explicit geometric assets. We ask whether compressed video diffusion latents can be mapped directly to explicit surface primitives in a single pass. To this end, we introduce FLAT and, for the first time, show that triangle splats can be decoded directly from video diffusion latents. Compared with decoding 3D Gaussians, predicting flat primitives is notoriously more challenging due to high sensitivity to primitive orientations, often leading to poor gradient flow. FLAT solves this with two key ingredients: a ray-centered rotation parameterization for triangle regression and a novel product window function that improves gradient flow during differentiable triangle rendering. On standard benchmarks, FLAT achieves significantly better geometric accuracy while maintaining competitive visual quality compared to state-of-the-art feedforward baselines. We further show that a lightweight test-time refinement step converts the predicted triangle soup into a fully opaque, game-engine-ready representation that supports real-time rendering. By evaluating 3DGS, 2DGS, and triangle splatting variants under an identical training setup, we provide the first systematic analysis of representation tradeoffs in feedforward scene generation. The project page is available at https://flat-splat.github.io

  10. FedOT: Ownership Verification and Leakage Tracing via Watermarks for Federated LDMs

    Training Latent Diffusion Models (LDMs) within Federated Learning (FL) has attracted increasing attention due to its ability to combine the powerful generative capacity of LDMs with the privacy-preserving properties of FL. However, FL requires sharing the global model with multiple participants, which risks unauthorized model distribution or resale by malicious clients. While an intuitive approach is to adopt existing VAE-based watermarking techniques for LDMs in FL, this strategy falls short in addressing such threats due to two fundamental challenges: (1) Existing methods support ownership verification but lack the ability to trace model leakage to a specific malicious client; (2) VAE-based watermarks are vulnerable, as they can be removed simply by replacing the decoder with a clean counterpart. In this paper, we propose FedOT, the first framework for ownership verification and leakage tracing in federated LDMs. Specifically, to address the first challenge, we design a chunked watermark, where the first part is used for ownership verification and the second part for client identification. Furthermore, to overcome the second challenge and secure the model against VAE replacement attacks, we introduce Latent Vector Transformation (LVT), which strengthens the connection between the VAE and U-Net latent spaces by modifying the original latent distribution of the VAE. Consequently, any attempt to replace the VAE for watermark removal leads to significant image quality degradation, making the LDM unusable. Extensive experiments demonstrate that FedOT achieves superior performance in both ownership verification and traceability. Project page: https://spyzixuan.github.io/FedOT/.

  11. Escaping the Self-Confirmation Trap: An Execute-Distill-Verify Paradigm for Agentic Experience Learning

    Experience-driven self-evolution is critical for large language model (LLM) agents to improve through open-world interaction. However, existing experience learning methods mostly rely on single-agent loops, where the same agent executes tasks, summarizes outcomes, and determines memory content. This setup makes agents vulnerable to the Self-Confirmation Trap: wrong-but-self-consistent trajectories are misidentified as successful experience, leading to cumulative errors during retrieval and reuse. To address this issue, we propose EDV, an Execute-Distill-Verify framework for reliable experience learning. In the Execute stage, multiple heterogeneous agents explore the same task space in parallel to generate diverse candidate trajectories. In the Distill stage, a dedicated third-party agent comparatively analyzes these trajectories to produce candidate experiences, reducing executor-centric summarization bias. In the Verify stage, the execution group validates candidates via a consensus mechanism, and only approved experiences are written into shared or private memory. By decoupling the three stages, EDV transforms experience learning from isolated self-reflection into collaborative construction, filtering erroneous and noisy content before memory insertion. We evaluate EDV on three challenging long-horizon benchmarks: tau2-bench, Mind2Web and MMTB. Results show EDV consistently outperforms strong baselines, validating that reliable experience construction is essential for robust agent self-evolution. Our code is available at https://github.com/shidingz/EDV.

  12. Are Text-to-Image Models Inductivist Turkeys? A Counterfactual Benchmark for Causal Reasoning

    Text-to-image (T2I) generation models have achieved remarkable progress in producing visually realistic images from natural language prompts. Yet it remains unclear whether their success reflects genuine causal understanding or sophisticated pattern matching over visual-textual correlations. Inspired by Russell's inductivist turkey, we introduce Counterfactual-World (CF-World), a counterfactual benchmark designed to investigate whether text-to-image models can generate images under rules that systematically contradict real-world priors. CF-World organizes each scenario into three progressive levels: factual generation under ordinary world knowledge, explicit counterfactual generation with direct visual instructions, and implicit counterfactual generation requiring causal deduction from altered rules. We evaluate both open-source and closed-source T2I models using a Vision Language Model (VLM)-based evaluator (CF-Eval). Furthermore, we introduce two metrics: Prior Resistance Rate (PRR), which measures a model's ability to overcome entrenched real-world priors, and Reasoning Retention Rate (RRR), which assesses whether models can maintain reasoning-dependent counterfactual generation without explicit visual cues. Experiments show that all models exhibit sharp degradation from factual to counterfactual settings. Further analyses suggest that these failures arise because current T2I models encode world knowledge and visual appearances as tightly coupled patterns. Consequently, their heavy reliance on frequent visual co-occurrences within the training data forces them to default to familiar commonsense priors when tasked with rendering counterfactual worlds.

  13. DiffusionBench: On Holistic Evaluation of Diffusion Transformers

    Diffusion transformer (DiT) research on image generation has converged to a single evaluation setup: class-conditional generation on ImageNet. While methods improve the FID and related metrics, it is increasingly unclear whether they reflect real progress in generative modeling. The natural alternative, i.e., text-to-image (T2I) generation, is perceived as too costly or inconvenient to train and evaluate and is often skipped. We argue that this perception no longer holds. We introduce NanoGen, a unified DiT training and evaluation framework. NanoGen matches state-of-the-art DiT baselines on ImageNet and, with 12 lines of configuration change, also trains competitive text-to-image models. It currently supports RAE, VAE, pixel-space, and MeanFlow diffusion methods under both ImageNet and T2I setups. Under NanoGen, training T2I requires comparable compute to ImageNet. After training 21 latent diffusion models with NanoGen, we observe that method ranking shows no strong correlation between ImageNet and T2I generation: Pearson correlation is between -0.377 and -0.580 across three metrics. This suggests that a method which improves class-conditional ImageNet FID may show no corresponding improvement on T2I, clearly indicating the necessity of evaluating DiTs on both tasks. To this end, we summarize ImageNet and text-to-image results, which yields DiffusionBench, a holistic benchmark for DiT research. We recommend reporting DiffusionBench in place of ImageNet alone: methods that improve DiffusionBench are more likely to reflect broader progress.

  14. Holistic Data Scheduler for LLM Pre-training via Multi-Objective Reinforcement Learning

    The composition of training data, governed by the diversity of sources and their mixing strategy, is a cornerstone of Large Language Model (LLM) pre-training. Online Data Mixing (ODM), the technique of adaptively adjusting data mixtures during training, has emerged as a promising direction to improve efficiency. However, existing methods are constrained by their reliance on a singular optimization perspective, which fundamentally overlooks the need for complex LLM pre-training to consider the dynamic data composition from multiple dimensions. To overcome this limitation, we introduce the Holistic Data Scheduler (HDS), a novel online data mixing framework. HDS formulates the data scheduling challenge as a reinforcement learning problem in a continuous control space and leverages the Soft Actor-Critic (SAC) algorithm for its stability and sample efficiency in exploring the high-dimensional policy space. At the core of HDS lies a novel multi-objective, holistic reward function that integrates three critical perspectives: a data-driven reward for quality, a loss-driven reward capturing inter-domain influence, and a model-driven reward based on weight norms. To validate our design and determine its optimal configuration, we conducted systematic experiments on LLMs of various sizes. On The Pile benchmark, HDS reaches the final validation perplexity of the next best method with 44% fewer training iterations. Furthermore, it achieves a 7.2% improvement on the MMLU 0-shot task along with consistent gains on other benchmarks, showcasing its ability to enhance both training efficiency and final model capability.

  15. ChartWalker: Benchmarking the Cross-Chart RAG Task

    Cross-Chart Retrieval-Augmented Generation (RAG) is critical for complex multi-modal analytical tasks in scientific, business, and political domains. However, existing benchmarks either focus on tables, which are well-structured and textualized, or generate cross-chart questions by simply extracting key points, which often induces lexical overlap between queries and evidence and yields logically inconsistent reasoning chains. To address this, we introduce ChartWalker, a novel framework for constructing challenging cross-chart RAG tasks. ChartWalker features a hierarchical knowledge graph construction method tailored to charts, which organizes entities and relations by granularity to preserve analytical structure. We then propose a structure-aware sampling algorithm that synthesizes semantically coherent, multi-hop reasoning paths, enabling explicit control over query difficulty and granularity for QA generation. Built with this framework, we release ChartWalker-Bench, a comprehensive benchmark spanning diverse domains and cross-chart query types. Extensive evaluations across major RAG paradigms reveal significant performance gaps, underscoring the benchmark's difficulty and utility. Furthermore, we provide ChartWalker-Agent, an agentic baseline to facilitate analysis and inspire future system design.
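
    Note on item 13 (DiffusionBench): the abstract reports a Pearson correlation between -0.377 and -0.580 when comparing methods on ImageNet and on T2I. The sketch below shows how such a coefficient is computed from two lists of per-method scores. The score values and variable names are hypothetical, invented only for illustration; the abstract does not say whether the authors correlate raw metric values or ranks, so this should not be read as their exact procedure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, all left unnormalized: the 1/n factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five methods (NOT taken from the paper).
imagenet_scores = [2.1, 2.4, 2.9, 3.3, 3.8]
t2i_scores = [7.9, 8.4, 7.2, 8.8, 7.5]

r = pearson(imagenet_scores, t2i_scores)
print(f"Pearson r = {r:.3f}")
```

    A coefficient near +1 would mean that methods scoring well on one task also score well on the other; values near zero or below, as reported in the abstract, mean that results on one task say little about the other.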

Techmeme (15)

  1. Qualcomm expects $15B in data center chip sales by 2029, raises its non-handset chip revenue forecast to $40B by 2029, up from $22B; QCOM jumps 13%+ after hours (Reuters)

    Reuters: Qualcomm (QCOM.O) said it expects to generate $15 billion in sales from its data center business by 2029 as it moves beyond …

  2. Cloudflare partners with Google, Microsoft, and Mozilla on PACT, a protocol to distinguish legitimate human or bot traffic from undesirable network requests (Thomas Claburn/The Register)

    Thomas Claburn / The Register: Makers of Chrome, Edge, Firefox back bot-fraud defense called Private Access Control Tokens. Cloudflare on Monday …

  3. Micron reports Q3 revenue up 346% YoY to $41.46B, vs. $35.84B est., gross margin above estimates, and forecasts Q4 revenue above est.; MU jumps 14%+ after hours (Kif Leswing/CNBC)

    Kif Leswing / CNBC: Micron's revenue more than quadrupled in the fiscal third quarter, the company said on Wednesday, as the memory maker continued …

  4. Sources: in a letter to US officials, Anthropic accused Alibaba of adversarial distillation, accessing Claude 28.8M times from April to June via ~25K accounts (Maggie Eastland/Bloomberg)

    Maggie Eastland / Bloomberg: Anthropic PBC accused Chinese technology giant Alibaba Group Holding Ltd. of waging a large-scale effort to “illicitly” …

  5. Qualcomm unveils Dragonfly C1000, a new data center CPU built for agentic AI, and says Meta will use the chip when production starts in 2028 (Kif Leswing/CNBC)

    Kif Leswing / CNBC: Qualcomm shares jumped 15% in extended trading on Wednesday after the chipmaker said non-handset revenue in fiscal 2029 will be $40 billion, up from a prior forecast of $22 billion.

  6. Sources: Kalshi is in talks to raise a funding round at a ~$40B valuation that may close as soon as Q3; Kalshi raised $1B at a $22B valuation in May (Financial Times)

    Financial Times: Rapidly growing company is increasingly challenging established derivatives and betting rivals. Kalshi is in talks to raise funds …

  7. Documents: Meta's planned prediction markets app will use Meta AI models to generate questions from trending topics, make recommendations, and resolve markets (Bobby Allyn/NPR)

    Bobby Allyn / NPR: Meta is planning to launch its own prediction market app to compete with companies like Kalshi and Polymarket in a booming sector …

  8. Sources: Google AI researchers Jonas Adler and Alexander Pritzel, both viewed internally as key contributors to Gemini, are planning to leave for Anthropic (Bloomberg)

    Bloomberg: Two leading artificial intelligence researchers at Alphabet Inc.'s Google are planning to leave for rival Anthropic PBC …

  9. Google says computer use is now a built-in tool supported in Gemini 3.5 Flash, available via the Gemini API and Gemini Enterprise Agent Platform (Mateo Quiros/The Keyword)

    Mateo Quiros / The Keyword: Computer use is now a built-in tool in Gemini 3.5 Flash to build agents that can interact across platforms.

  10. Binance says it will make a fresh push for permission to operate in the EU after its MiCA license application in Greece failed ahead of the June 30 deadline (Reuters)

    Reuters : Binance says it will make a fresh push for permission to operate in the EU after its MiCA license application in Greece failed ahead of the June 30 deadline —  Crypto platform Binance intends to stay in the European Union and will make a fresh push for permission to operate there …

  11. Runlayer, which provides an infrastructure and control layer for enterprise AI agents, raised a $30M Series A led by Felicis, bringing its total funding to $42M (Lily Mae Lazarus/Fortune)

    Lily Mae Lazarus / Fortune : Runlayer, which provides an infrastructure and control layer for enterprise AI agents, raised a $30M Series A led by Felicis, bringing its total funding to $42M —  When longtime tech investor Vinod Khosla heard that Runlayer—the startup trying to become the default infrastructure layer governing …

  12. Seltz, which is building a web search engine that can be used by AI agents, raised a $12.5M seed led by Speedinvest and B Capital (Jeremy Kahn/Fortune)

    Jeremy Kahn / Fortune : Seltz, which is building a web search engine that can be used by AI agents, raised a $12.5M seed led by Speedinvest and B Capital —  The rise of AI has rekindled the long-dormant search wars.  Chatbots and AI agents need to surface timely, relevant information about news and all kinds of products and services.

  13. Nature publishes a peer-reviewed paper alleging that Microsoft's 2025 quantum breakthrough claims were based on "basic Python errors" and data cherry-picking (Thomas Claburn/The Register)

    Thomas Claburn / The Register : Nature publishes a peer-reviewed paper alleging that Microsoft's 2025 quantum breakthrough claims were based on “basic Python errors” and data cherry-picking —  Nature paper argues researchers cherry-picked data.  Redmond insists its work is sound

  14. Sources: the Trump administration has been happier talking to Anthropic lately after Dario Amodei was replaced by cofounder Tom Brown in meetings about Fable 5 (Hugo Lowell/Wired)

    Hugo Lowell / Wired : Sources: the Trump administration has been happier talking to Anthropic lately after Dario Amodei was replaced by cofounder Tom Brown in meetings about Fable 5 —  At high-stakes meetings with the White House, Anthropic's cofounder—a “weirdo,” per one official—has been replaced by cofounder Tom Brown.

  15. Ornn, which plans to launch a marketplace for GPU capacity designed to function like an exchange to trade oil contracts, raised a $33M seed led by a16z (Katherine Doherty/Bloomberg)

    Katherine Doherty / Bloomberg : Ornn, which plans to launch a marketplace for GPU capacity designed to function like an exchange to trade oil contracts, raised a $33M seed led by a16z —  Ornn raised $33 million in a funding round led by Andreessen Horowitz as the marketplace startup seeks to build a venue where the power …

Solidot(15)

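  1. Scientists push back the date of early human fire use to 1.8 million years ago

    Scientists have found new evidence at Wonderwerk Cave in South Africa indicating that human ancestors were using fire 1.07 to 1.79 million years ago, the earliest known record of human fire use. Researchers found traces of repeated fire use about 30 meters inside the cave, well beyond the reach of natural wildfires, which suggests that early humans deliberately carried naturally occurring fire into the cave and kept it burning. Early humans could not make fire at will; they most likely collected it from lightning-ignited fires or grassland wildfires.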

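  2. China's PC shipments fell 2% in the first quarter

    According to market analysis firm Omdia, China's PC shipments fell 2% in the first quarter and tablet shipments fell 5%. PC shipments dropped to 8.9 million units and tablet shipments to 8.3 million units. Notebook shipments (including mobile workstations) fell 19% year over year, while desktop shipments (including desktop workstations) rose 41%, reaching 5.3 million and 3.6 million units respectively. Omdia attributes the weak market to higher device prices caused by rising component costs, and to weaker consumer subsidies. Omdia forecasts that full-year 2026 PC shipments will fall 14% to 36 million units and that tablet shipments will fall 11% to 32 million units. The leading PC makers include Lenovo, Huawei, Apple, iSoftStone, and HP.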

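  3. Early childhood screen use is linked to poorer academic performance and weaker working memory

    With screens now nearly ubiquitous in young children's lives, a study examined their effect on learning. The study followed children from ages 1 to 8 and found that more screen viewing time was associated with poorer academic performance at age 9 and weaker working memory at age 10.5. The WHO and the American Academy of Pediatrics recommend that children avoid screens before 18 to 24 months and that children aged 2 to 5 spend no more than 1 hour a day on screens, but many young children exceed these limits. The study tracked the development of 502 children from infancy to middle childhood and found that children with more screen time during particular developmental stages later had poorer academic performance and weaker working memory. The association was strongest in infancy and the early school years, suggesting these stages may be especially sensitive windows for cognitive development. Children with higher total screen exposure across childhood also tended to perform worse academically. The findings suggest that the timing of screen exposure may matter as much as total exposure, and they support the "less is better" principle for children's screen time.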

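  4. Europe is the fastest-warming continent

    The UK, France, Italy, and Spain all issued red heat warnings this week as Europe experiences its second heatwave since May. Global temperatures are about 1.4°C above the pre-industrial (1850-1900) level, while data from the EU's Copernicus Climate Change Service show that European temperatures are about 2.4°C above the pre-industrial level. The continued rise in global average temperature is mainly due to greenhouse gas emissions from burning oil, gas, and coal, but a combination of factors makes the amount of warming differ by region. Land warms faster than the ocean because water can absorb more heat and cools through evaporation. The Copernicus Climate Change Service says changes in atmospheric circulation are making European summer heatwaves more frequent and more intense. Another major factor is geography: Europe adjoins the Arctic, where temperatures are 3.2°C above the pre-industrial level. Part of the Arctic warming is due to albedo. Bright snow and ice reflect most of the sun's heat back into space, but melting exposes darker, heat-absorbing land. In parts of Europe that see frequent winter snowfall, snow cover is shrinking and exposing dark land.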

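  5. Only about 2,000 IPs could reach the outside internet during Iran's shutdown

    Iran imposed a nationwide internet shutdown earlier this year that lasted for months. During the shutdown, Iran enforced a whitelist: only the very small number of IP addresses on the whitelist could reach the outside internet. Researchers used one VPS inside Iran together with VPSes in Hungary, the United States, and Japan. Working from the roughly 11,766,454 IP addresses in the prefixes that Iranian autonomous systems announce via BGP, they spoofed each of these addresses exhaustively and observed which ones could reach the outside internet. The results showed that about 2,000 IPs could. The researchers also found that even these IPs could not reach arbitrary websites and were instead subject to SNI-based filtering. Not every whitelisted IP was subject to SNI filtering, however: at least half of the tested IPs saw no SNI filtering at all. This means that whitelisted IPs are themselves subject to different access policies.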

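  6. Alibaba sues the US Department of Defense

    Alibaba and its US subsidiary have jointly filed a complaint against the US Department of Defense in a federal district court in California, asking the court to declare invalid the department's designation of Alibaba, announced on June 8. Alibaba has branches in the United States, where it runs e-commerce and cloud computing businesses. Even when placed on the list, a company can in theory still work with US companies, but the Department of Defense has the power to impose restrictions, such as contract termination, on US companies that work with listed Chinese firms. Alibaba argues in the complaint that the designation lacks a factual basis and that the process was improper. It says the designation has left the company unable to keep retaining lobbying firms, among other effects, that it harms the company's legitimate rights and interests, and that it also violates the US Constitution.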

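  7. Heart rate synchrony can indicate the level of social engagement

    According to a study published in the journal PNAS Nexus, the degree of heart rate synchrony can indicate the level of social engagement. When people are physically and emotionally close, their heart rates gradually synchronize. The research team worked with a dataset collected while 72 students traveled to New York City for an audio engineering competition. The students collected data using hearing aids that recorded ambient noise, wristbands that monitored heart rate, and phones that recorded location. The study defined physical proximity as people being within 20 meters of each other. Heart rate synchrony was stronger when participants were together, and the effect was especially pronounced during close interaction and when they attended to the same external stimulus, such as sitting in a lecture together. Participants who already knew each other before the trip showed significantly higher synchrony.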

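  8. GCC adds support for Hygon's Suzhou x86 CPU

    The GCC compiler has merged a patch supporting Hygon's Model 8 c86-4g-m8 processor, codenamed Suzhou. Hygon started out as a semiconductor company partnered with AMD, licensed to offer localized versions of AMD Zen 1 CPUs, and its products are sold only in the Chinese domestic market. A few months ago GCC merged a patch supporting the Hygon C86-4G CPU. The Model 8 Suzhou CPU is the successor to the previous-generation Model 7 Chengdu CPU. Little information is available about the processor so far; its instruction set architecture is nearly identical to the previous generation's and includes support for AVX-512.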

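  9. German rail service halted by an IT failure

    Germany's rail network was shut down nationwide on Tuesday evening because of an IT failure. At 1 a.m., national rail operator Deutsche Bahn announced that the problem had been resolved and that service was gradually resuming. The company said the cause was a nationwide failure of GSM-R, the digital communications system used for internal communication on the rail network. It said it had identified the cause but gave no details. During the outage the company issued taxi and hotel vouchers to passengers and, where possible, made trains available at stations for stranded travelers. The company apologized for the incident.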

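  10. Extreme heat conference planned for London is canceled due to an extreme heat warning

    The extreme heat conference "Extreme Heat: Improving governance and strengthening action around the world", scheduled to take place in London this week, has been canceled because of the red extreme heat warning issued by the UK Met Office. The rare red heat warning covers London, parts of central England, southeast Wales, and southern England, and runs from 09:00 BST on Wednesday to 21:00 BST on Thursday. The Met Office warned that the heat could pose a risk of serious illness or death. Temperatures in southern England are expected to rise to around 37-38°C, and Wednesday's high could even reach 39°C.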

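  11. Valve says it cannot negotiate prices with memory makers

    Valve has announced the Steam Machine with a starting price above $1,000, and says the pricing reflects the prices of the memory and storage components it was able to secure over the past 6 months. Prices for DDR5 and SSDs have risen several-fold over the past half year. Valve said in an interview that it has no choice when buying memory: it can only accept the vendor's quote, negotiation is impossible, and trying to negotiate would end in a complete loss of supply. One Valve employee said memory vendors "give us a quote every month and say 'you can buy this much,' and the only options are yes or no. If we say no, they never talk to us again." The Steam Machine comes in two memory configurations, either two 8GB DDR5 modules or a single 16GB DDR5 module, and Valve says its testing shows nearly identical performance between the two.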

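  12. Soybean protein content falls under heat, drought, and high CO2

    Soybeans are an important source of protein, but climate change is increasingly affecting their yield and nutritional quality. According to a study published in Food Research International, elevated carbon dioxide increases soybean seed yield by up to 142%, while heat and drought reduce yield by 91% and 60% respectively. Under the combined effect of elevated carbon dioxide, heat, and drought, soybean seed yield may increase by 50%, soluble sugar content by 35%, and amino acid content by 175%, while starch content falls by 20% and protein content by 6%.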

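  13. China's new supercomputer Lingsheng (灵晟) tops the Top500 list

    Top500 has published its latest supercomputer list, and Lingsheng (灵晟), at the National Supercomputing Center in Shenzhen, took first place on its debut. Lingsheng has a theoretical peak of 2.736 Exaflop/s and reached 2.198 Exaflop/s on the HPL benchmark, making it the first system on the Top500 list to sustain more than 2 Exaflop/s of double-precision floating-point performance using CPUs alone. It uses 304-core LX2 CPUs with 13.79 million cores in total, runs at 1.55 GHz, uses the Kylin operating system, and draws 42.2 megawatts. The top five systems all reach at least 1 Exaflop/s: Lingsheng; El Capitan at Lawrence Livermore National Laboratory in the US, using 4th-generation AMD EPYC processors, at 1.809 Exaflop/s; Frontier at Oak Ridge National Laboratory (ORNL), using 3rd-generation AMD EPYC, at 1.353 Exaflop/s; Aurora at Argonne National Laboratory, using Intel Xeon CPUs, at 1.012 Exaflop/s; and JUPITER Booster at Germany's Jülich Supercomputing Centre, using the Nvidia GH Superchip 72C 3GHz, at 1 Exaflop/s. These are followed by Italy's HPC7, Microsoft's Azure supercomputer Eagle, Italy's HPC6, Japan's Fugaku, and Switzerland's Alps. Of the top ten systems, four use AMD EPYC processors, two use Nvidia processors, and two use Intel processors; Lingsheng's CPU architecture was not disclosed. Across the Top500, the US has 162 systems, Japan 44, Germany 41, and China 30. Lenovo built the most systems, with 129, followed by HPE with 124, BULL with 58, Dell with 49, and Nvidia with 37.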

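  14. Oracle cut 21,000 jobs over the past year

    According to Oracle's latest annual report, the company cut about 21,000 jobs worldwide over the past year as it reshapes its business around AI. As of May 31, 2026, Oracle had about 141,000 full-time employees, down from 162,000 a year earlier. Oracle says in the report that deploying AI technology in its operations has led, and may continue to lead, to a reduction in total headcount. The cuts amount to about 13% of Oracle's workforce. Employment tracking firms estimate that more than 100,000 tech workers were laid off over the past year. Oracle says it paid $1.8 billion in severance and other restructuring costs over the past year.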

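  15. Wikipedia co-founder Larry Sanger is blocked

    Larry Sanger, the Wikipedia co-founder who has embraced conservatism and supports MAGA, reappeared on Wikipedia, saying he wanted to help reform it, in other words to take it back from liberals. He launched a "WikiProject Intellectual Diversity" proposal aimed at adding more conservative voices. He promoted the proposal through his social media accounts, which violated Wikipedia's policy on stealth canvassing. This sparked controversy in the wiki community, and he was ultimately blocked.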
