AI Infrastructure

Developments in the underlying infrastructure of AI: chips, training frameworks, inference optimization, cloud services, and more.

Architectural Proprioception in State Space Models: Thermodynamic Training Induces Anticipatory Halt Detection
Infrastructure

The Probability Navigation Architecture (PNA) framework trains State Space Models using thermodynamic principles to pena...
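
The blurb above gives no equations, so as a rough illustration of the moving parts, the sketch below runs a diagonal linear SSM scan and attaches an energy-style ("thermodynamic") penalty to its hidden states. The recurrence is standard; the penalty form and all names here are our assumptions, not PNA's actual loss.

```python
import numpy as np

def ssm_scan(A, B, x):
    """Run a diagonal linear state space model: h_t = A * h_{t-1} + B * x_t."""
    h = np.zeros_like(A)
    states = []
    for x_t in x:
        h = A * h + B * x_t
        states.append(h.copy())
    return np.stack(states)

def thermodynamic_penalty(states, beta=0.1):
    """Energy-style regularizer (illustrative): penalize hidden-state 'heat'
    (squared norm), preferring trajectories that settle to low-energy states."""
    energy = (states ** 2).sum(axis=1)  # per-step state energy
    return beta * energy.mean()

rng = np.random.default_rng(0)
A = np.full(8, 0.9)                  # stable diagonal transition
B = rng.standard_normal(8) * 0.1
x = rng.standard_normal(32)          # a length-32 input sequence
states = ssm_scan(A, B, x)
print(states.shape, float(thermodynamic_penalty(states)))
```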

Bielik-Q2-Sharp: A Comparative Study of Extreme 2-bit Quantization Methods for a Polish 11B Language Model
Infrastructure

An independent study systematically evaluated six state-of-the-art 2-bit quantization methods on the Polish Bielik-11B-v...
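
For readers unfamiliar with what "extreme 2-bit quantization" entails, here is a minimal toy sketch of symmetric group-wise 2-bit quantization. The group size, the codebook {-1.5, -0.5, 0.5, 1.5}·scale, and the rounding rule are illustrative assumptions; the production methods evaluated in the study are considerably more sophisticated.

```python
import numpy as np

def quantize_2bit(w, group_size=8):
    """Toy symmetric 2-bit group quantization.

    Each group of `group_size` weights shares one float scale; every weight
    is stored as a 2-bit code mapping to {-1.5, -0.5, 0.5, 1.5} * scale."""
    g = w.reshape(-1, group_size)
    scale = np.abs(g).max(axis=1, keepdims=True) / 1.5
    scale[scale == 0] = 1.0
    codes = np.clip(np.floor(g / scale + 2.0), 0, 3).astype(np.uint8)
    return codes, scale

def dequantize_2bit(codes, scale, shape):
    """Map 2-bit codes back to the 4-level codebook and rescale."""
    levels = codes.astype(np.float64) - 1.5   # -> {-1.5, -0.5, 0.5, 1.5}
    return (levels * scale).reshape(shape)

rng = np.random.default_rng(0)
w = rng.standard_normal(64)
codes, scale = quantize_2bit(w)
w_hat = dequantize_2bit(codes, scale, w.shape)
print(codes.dtype, float(np.abs(w - w_hat).mean()))
```

Per-element reconstruction error is bounded by half a quantization step (0.5 × the group's scale), which makes the accuracy cost of dropping to 4 levels easy to see.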

Data-Aware Random Feature Kernel for Transformers
Infrastructure

DARKFormer (Data-Aware Random-feature Kernel transformer) is a novel transformer variant that addresses the high varianc...
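
The random-feature idea DARKFormer builds on can be sketched as follows: positive random features approximate the softmax kernel exp(q·k), turning attention into a linear-time product. The fixed Gaussian projection W below is the plain data-agnostic baseline whose variance data-aware methods target; a data-aware variant would fit W to query/key statistics. This is a generic sketch, not the paper's construction.

```python
import numpy as np

def random_feature_map(x, W):
    """Positive random features approximating exp(q.k):
    phi(x) = exp(W x - |x|^2 / 2) / sqrt(m), so E[phi(q).phi(k)] = exp(q.k)."""
    proj = x @ W.T
    norm = (x ** 2).sum(-1, keepdims=True) / 2.0
    return np.exp(proj - norm) / np.sqrt(W.shape[0])

def rf_attention(q, k, v, W):
    """Linear-time attention: softmax kernel replaced by phi(q) phi(k)^T."""
    qp, kp = random_feature_map(q, W), random_feature_map(k, W)
    num = qp @ (kp.T @ v)              # (n, d_v)
    den = qp @ kp.sum(axis=0)          # (n,) normalizer
    return num / den[:, None]

rng = np.random.default_rng(1)
d, m, n = 16, 256, 10
W = rng.standard_normal((m, d))        # fixed random projection (data-agnostic)
q = rng.standard_normal((n, d)) * 0.3
k = rng.standard_normal((n, d)) * 0.3
v = rng.standard_normal((n, 4))
out = rf_attention(q, k, v, W)

# exact softmax attention, for comparison
s = np.exp(q @ k.T)
exact = (s / s.sum(1, keepdims=True)) @ v
print(out.shape, float(np.abs(out - exact).mean()))
```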

Efficient Point Cloud Processing with High-Dimensional Positional Encoding and Non-Local MLPs
Infrastructure

HPENet introduces a novel architectural framework for point cloud processing using high-dimensional positional encoding ...
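
The abstract does not specify HPENet's encoding, so as a stand-in the sketch below lifts 3-D coordinates into a higher-dimensional embedding with multi-frequency sin/cos features (NeRF-style); treat the frequency schedule and dimensions as assumptions.

```python
import numpy as np

def high_dim_positional_encoding(xyz, num_freqs=8):
    """Lift 3-D point coordinates into a 3 * 2 * num_freqs dimensional
    embedding using sin/cos features at geometrically spaced frequencies."""
    freqs = 2.0 ** np.arange(num_freqs)           # (F,)
    ang = xyz[..., None] * freqs                  # (N, 3, F)
    enc = np.concatenate([np.sin(ang), np.cos(ang)], axis=-1)  # (N, 3, 2F)
    return enc.reshape(xyz.shape[0], -1)          # (N, 6F)

pts = np.random.default_rng(2).uniform(-1, 1, size=(100, 3))
emb = high_dim_positional_encoding(pts)
print(emb.shape)
```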

Measuring AI R&D Automation
Infrastructure

A new research paper proposes a comprehensive metric framework to measure the automation of AI research and development ...

TFWaveFormer: Temporal-Frequency Collaborative Multi-level Wavelet Transformer for Dynamic Link Prediction
Infrastructure

TFWaveFormer is a novel Transformer architecture that integrates temporal-frequency analysis with learnable multi-resolu...
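
The multi-level wavelet decomposition such temporal-frequency models rest on can be shown with the classical Haar transform: each level splits the signal into a coarse approximation and a detail band. TFWaveFormer's wavelets are learnable; the fixed Haar filters below are only the textbook starting point.

```python
import numpy as np

def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform:
    returns (approximation, detail) coefficients."""
    s = np.asarray(signal, dtype=float)
    approx = (s[0::2] + s[1::2]) / np.sqrt(2)
    detail = (s[0::2] - s[1::2]) / np.sqrt(2)
    return approx, detail

def multilevel_haar(signal, levels):
    """Multi-level decomposition: repeatedly split the approximation band."""
    bands, approx = [], np.asarray(signal, dtype=float)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        bands.append(detail)
    bands.append(approx)                  # final coarse band
    return bands

x = np.arange(16, dtype=float)
bands = multilevel_haar(x, levels=3)
print([b.shape[0] for b in bands])
```

Because the Haar transform is orthonormal, the total signal energy is preserved across all bands, which is what lets models reason about the signal per frequency band without losing information.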

BD-Merging: Bias-Aware Dynamic Model Merging with Evidence-Guided Contrastive Learning
Infrastructure

BD-Merging is a novel unsupervised model merging framework designed to maintain reliability when test data diverges from...
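
As background on what "model merging" means, here is the standard task-arithmetic baseline: add weighted task vectors (fine-tuned minus base weights) to the base model. BD-Merging's contribution is choosing such coefficients dynamically and bias-aware at test time; the fixed coefficients below are purely illustrative.

```python
import numpy as np

def merge_models(base, finetuned, coeffs):
    """Task-arithmetic merging (a common baseline, not BD-Merging itself):
    merged = base + sum_i coeff_i * (finetuned_i - base)."""
    merged = {}
    for name, w in base.items():
        task_vecs = sum(c * (ft[name] - w) for c, ft in zip(coeffs, finetuned))
        merged[name] = w + task_vecs
    return merged

base = {"w": np.zeros(4)}
ft_a = {"w": np.ones(4)}        # fine-tuned on task A
ft_b = {"w": 2 * np.ones(4)}    # fine-tuned on task B
merged = merge_models(base, [ft_a, ft_b], coeffs=[0.5, 0.25])
print(merged["w"])
```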

Joint Hardware-Workload Co-Optimization for In-Memory Computing Accelerators
Infrastructure

Researchers have developed a joint hardware-workload co-optimization framework for in-memory computing accelerators that...
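
The evolutionary-search loop such co-optimization frameworks use can be sketched as below. The search space (crossbar size, ADC bit-width, tile count) and the closed-form cost model are invented for illustration; real frameworks use calibrated latency and energy models.

```python
import random

# Hypothetical search space for an in-memory computing accelerator.
SPACE = {"xbar": [64, 128, 256], "adc_bits": [4, 6, 8], "tiles": [8, 16, 32]}

def cost(cfg, workload_macs=1e9):
    """Toy cost model: latency falls with parallelism, energy rises with
    ADC precision, array size, and tile count (illustrative only)."""
    parallelism = cfg["xbar"] ** 2 * cfg["tiles"]
    latency = workload_macs / parallelism
    energy = cfg["adc_bits"] ** 2 * cfg["tiles"] * cfg["xbar"] * 1e-4
    return latency + energy

def evolve(generations=30, pop_size=12, seed=0):
    """Simple elitist evolutionary search with point mutations."""
    rng = random.Random(seed)
    sample = lambda: {k: rng.choice(v) for k, v in SPACE.items()}
    pop = [sample() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        survivors = pop[: pop_size // 2]
        children = []
        for _ in range(pop_size - len(survivors)):
            child = dict(rng.choice(survivors))
            key = rng.choice(list(SPACE))       # mutate one field
            child[key] = rng.choice(SPACE[key])
            children.append(child)
        pop = survivors + children
    return min(pop, key=cost)

best = evolve()
print(best, round(cost(best), 2))
```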

Relational In-Context Learning via Synthetic Pre-training with Structural Prior
Infrastructure

RDB-PFN is the first foundation model for relational databases trained purely on synthetic data, overcoming data scarcit...

When and Where to Reset Matters for Long-Term Test-Time Adaptation
Infrastructure

The Adaptive and Selective Reset (ASR) framework addresses model collapse in continual test-time adaptation by dynamical...
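
The "when" (a drift trigger) and "where" (which tensors to reset) of a selective reset can be illustrated with a toy routine. The drift statistic, threshold, and reset fraction below are our assumptions, not ASR's actual criteria.

```python
import numpy as np

def selective_reset(params, source, drift_threshold=0.5, frac=0.5):
    """Toy 'when and where to reset': if mean parameter drift from the source
    model exceeds a threshold (the *when*), restore the most-drifted fraction
    of tensors to their source values (the *where*)."""
    drifts = {k: float(np.abs(params[k] - source[k]).mean()) for k in params}
    if np.mean(list(drifts.values())) <= drift_threshold:
        return params, []                       # no reset triggered
    order = sorted(drifts, key=drifts.get, reverse=True)
    to_reset = order[: max(1, int(len(order) * frac))]
    new_params = {k: (source[k].copy() if k in to_reset else params[k])
                  for k in params}
    return new_params, to_reset

source = {"a": np.zeros(4), "b": np.zeros(4)}
adapted = {"a": np.full(4, 2.0), "b": np.full(4, 0.1)}   # tensor "a" drifted far
params, reset = selective_reset(adapted, source)
print(reset)
```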

Towards Effective Orchestration of AI x DB Workloads
Infrastructure

AIxDB (AI x Database) integration embeds artificial intelligence directly into database management systems, moving beyon...

Boundless Frontiers, Building the Future | 无问智科 Launches the Industry's First Physical-AI Data Foundation Platform
Infrastructure

The Shanghai AI Laboratory, together with Tsinghua University, Peking University, and other institutions, has released ComplexBench, the world's first general-purpose simulation dataset for complex physical scenarios. The dataset contains more than 10 million high-quality simulation samples covering six core physical scenarios, including fluids, rigid bodies, soft bodies, and multi-physics coupling, and uses an "adversarial validation" mech...

The Transformer Author Rebuilds the Lobster: A Hardened Rust Version, No More Running OpenClaw Bare
Infrastructure

Cybersecurity firm IronClaw has announced a complete from-scratch rewrite of the open-source remote access tool RustDesk and released a security-hardened version. The effort uses a "clean-room design" approach intended to eliminate any potential backdoors and vulnerabilities in the original codebase, targeting high-security sectors such as finance, critical infrastructure, and government. This...

Unveiling of the Super-Intelligent Computing Center and Activation Ceremony for AI Compute Equipment Held Successfully
Infrastructure

Beijing has issued the Beijing Computing Infrastructure Construction Implementation Plan (2024–2027), a systematic roadmap for AI compute capacity. The plan sets quantified targets: intelligent-computing supply of 45 EFLOPS (FP16) by 2025 and over 50 EFLOPS by 2027, and calls for building a latency layout of "1 ms within Beijing, 2 ms around Beijing, Beijing-Tianjin-Hebei 3...

Local Shapley: Model-Induced Locality and Optimal Reuse in Data Valuation
Infrastructure

The research paper 'Local Shapley: Efficient Data Valuation via Model-Induced Locality' introduces a paradigm-shifting m...
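
To see what Local Shapley is accelerating, here is the exact, brute-force (O(n!)) Shapley data valuation it avoids, with a toy 1-NN utility. Locality enters because distant training points have near-zero marginal utility for a given query, so computation can be confined to a neighborhood; the utility function here is an illustrative stand-in.

```python
import itertools
import math
import numpy as np

def exact_shapley(n, utility):
    """Exact Shapley values over n data points via full permutation
    enumeration — the exponential baseline locality methods sidestep."""
    values = np.zeros(n)
    for perm in itertools.permutations(range(n)):
        coalition, prev = [], utility([])
        for i in perm:
            coalition.append(i)
            cur = utility(coalition)
            values[i] += cur - prev                # marginal contribution
            prev = cur
    return values / math.factorial(n)

# Toy 1-NN utility: how close is the nearest selected training point to a query?
train = np.array([0.0, 0.4, 3.0])
query = 0.0
def utility(idx):
    if not idx:
        return 0.0
    return 1.0 / (1.0 + min(abs(train[i] - query) for i in idx))

vals = exact_shapley(len(train), utility)
print(np.round(vals, 3))
```

By the efficiency axiom, the values sum to the utility of the full set; the training point nearest the query receives the largest share.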

Graph Negative Feedback Bias Correction Framework for Adaptive Heterophily Modeling
Infrastructure

Researchers developed the Graph Negative Feedback Bias Correction (GNFBC) framework to address Graph Neural Networks' re...

A Clean Break from VE and VAE: SenseTime Radically Rebuilds Multimodality by Removing All Intermediate Encoders
Infrastructure

Google DeepMind's JEST (Joint Example Selection and Training) method represents a paradigm shift in large language model...

Directional Neural Collapse Explains Few-Shot Transfer in Self-Supervised Learning
Infrastructure

Researchers from the University of Chicago and Stanford identified directional Class-Dependent Neural Variability (CDNV) as ...
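
One plausible reading of "directional CDNV" is class-dependent neural variability measured along the direction joining two class means; the sketch below implements that reading, which may differ from the paper's exact definition. Low values indicate collapsed, well-separated classes, the regime associated with good few-shot transfer.

```python
import numpy as np

def directional_cdnv(feats_a, feats_b):
    """CDNV between two classes, with within-class variance measured along
    the unit direction joining their feature means (assumed definition)."""
    mu_a, mu_b = feats_a.mean(0), feats_b.mean(0)
    d = mu_b - mu_a
    dist2 = float(d @ d)                        # squared inter-mean distance
    u = d / np.sqrt(dist2)                      # unit inter-class direction
    var_a = float(((feats_a - mu_a) @ u).var())
    var_b = float(((feats_b - mu_b) @ u).var())
    return (var_a + var_b) / (2 * dist2)

rng = np.random.default_rng(3)
tight = rng.normal(0, 0.01, (50, 8))            # collapsed class at the origin
far   = rng.normal(1, 0.01, (50, 8))            # collapsed class at (1, ..., 1)
noisy = rng.normal(0, 1.0,  (50, 8))            # high-variance class
print(directional_cdnv(tight, far), directional_cdnv(noisy, far))
```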