Redmond, Washington, United States
42K followers
500+ connections
Articles by Eric
-
Reflections as AAAI 2026 Concludes
I'm on my way back from the 40th Annual AAAI Conference in Singapore. AAAI is a broad, wide-ranging meeting that covers…
214
6 Comments -
Toward holistic evaluation of AI models for medical tasks: MedHELM (Jan 21, 2026)
In the late-summer of 2022, a senior leader at OpenAI reached out to me about evaluating their latest model, hot off…
192
5 Comments -
A Paradigm Shift for Building and Testing AI in Medicine (Jun 30, 2025)
In medicine, diagnosis is rarely a one-shot answer. It’s an unfolding process of generating, testing, and refining…
498
37 Comments -
A Leap Forward in Chemistry (Jun 18, 2025)
Today, our AI for Science team at Microsoft Research announced Skala, a deep learning–based approach that offers a more…
742
25 Comments -
Toward an Era of AI-Enabled Clinical Collaboration (May 19, 2025)
Returning to clinical medicine to complete my MD/PhD training at Stanford University, after finishing a PhD in AI, was…
515
22 Comments -
Breakthrough in Quantum Computing (Feb 20, 2025)
In March 2012, a bold roadmap landed in my inbox. It was an ambitious plan for building a quantum computer, authored by…
785
26 Comments -
Advancing Healthcare AI: Progress in Medical Reasoning with LLMs (Dec 18, 2024)
Our team has been rigorously evaluating the performance of large language models (LLMs) on medical tasks using…
497
20 Comments -
Protecting Scientific Integrity in an Age of Generative AI (May 22, 2024)
I enjoyed collaborating with a diverse team of scientists on a set of aspirational principles aimed at “Protecting…
42
3 Comments -
Fortifying the Resilience of our Critical Infrastructure (Feb 28, 2024)
Since the days of Franklin D. Roosevelt, U.S.…
89
4 Comments -
Better Together: Joining Forces on Digital Media Provenance (Feb 10, 2024)
Eric Horvitz, Chief Scientific Officer, Microsoft. February 9, 2024. No single solution exists to confront the complex…
975
53 Comments
Activity
-
Yesterday I joined the roundtable on “AI Value Creation” at The Wharton School. The discussion with leaders from different industries focused on how…
Liked by Eric Horvitz
-
Enjoyed this wide-ranging conversation with Stanford GSB Dean Sarah Soule. Stanford University Graduate School of Business Stanford University…
Shared by Eric Horvitz
-
A tragic loss for the AI community and world.
Shared by Eric Horvitz
Experience & Education
-
Microsoft
Recommendations received
1 person has recommended Eric
More activity by Eric
-
Today, I had the opportunity to hear from Eric Horvitz, Chief Scientific Officer at Microsoft, and his take on the current AI landscape was a crucial…
Liked by Eric Horvitz
-
Today Matt Booty and I had the chance to recognize Phil Spencer and the run he’s had leading Xbox. And what a run it’s been! For more than a decade…
Liked by Eric Horvitz
Explore more posts
-
Andrew Ng
DeepLearning.AI • 2M followers
Automated software testing is growing in importance in the era of AI-assisted coding. Agentic coding systems accelerate development but are also unreliable. Agentic testing — where you ask AI to write tests and check your code against them — is helping. Automatically testing infrastructure software components that you intend to build on top of is especially helpful and results in more stable infrastructure and less downstream debugging.

Software testing methodologies such as Test Driven Development (TDD), a test-intensive approach that involves first writing rigorous tests for correctness and only then making progress by writing code that passes those tests, are an important way to find bugs. But it can be a lot of work to write tests. (I personally never adopted TDD for that reason.) Because AI is quite good at writing tests, agentic testing enjoys growing attention.

First, coding agents do misbehave! My teams use them a lot, and we have seen:
- Numerous bugs introduced by coding agents, including subtle infrastructure bugs that take humans weeks to find.
- A security loophole that was introduced into our production system when a coding agent made password resets easier to simplify development.
- Reward hacking, where a coding agent modified test code to make it easier to pass the tests.
- An agent running "rm *.py" in the working directory, leading to deletion of all of a project's code (which, fortunately, was backed up on GitHub).

In the last example, when pressed, the agent apologized and agreed “that was an incredibly stupid mistake.” This made us feel better, but the damage had already been done!

I love coding agents despite such mistakes and see them making us dramatically more productive. To make them more reliable, I’ve found that prioritizing where to test helps. I rarely write (or direct an agent to write) extensive tests for front-end code. If there's a bug, hopefully it will be easy to see and also cause little lasting damage. For example, I find generated code’s front-end bugs, say in the display of information on a web page, relatively easy to find. When the front end of a web site looks wrong, you’ll see it immediately, and you can tell the agent and have it iterate to fix it. (A more advanced technique: use MCP to let the agent integrate with software like Playwright to automatically take screenshots, so it can autonomously see if something is wrong and debug.)

In contrast, back-end bugs are harder to find. I’ve seen subtle infrastructure bugs — for example, one that led to a corrupted database record only in certain corner cases — that took a long time to find. Putting in place rigorous tests for your infrastructure code might help spot these problems earlier and save you many hours of challenging debugging. [Truncated due to length limit. Full text: https://lnkd.in/ghMqMcV4 ]
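The test-first loop the post describes can be sketched in a few lines of Python. This is an illustrative toy, not code from any real project: the function `normalize_scores` and its test are hypothetical. The rigorous test is written first and held fixed; the implementation is then written to pass it. Keeping the test file outside the agent's write access is one guard against the reward-hacking failure mode mentioned above.

```python
def normalize_scores(scores):
    """Scale a list of non-negative numbers so they sum to 1.0."""
    total = sum(scores)
    if total == 0:
        raise ValueError("scores must not sum to zero")
    return [s / total for s in scores]

# The "tests first" half: assertions a coding agent could be asked to
# write up front, and must NOT be allowed to edit afterward.
def test_normalize_scores():
    result = normalize_scores([1, 1, 2])
    assert abs(sum(result) - 1.0) < 1e-9
    assert result == [0.25, 0.25, 0.5]
    try:
        normalize_scores([0, 0])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError on all-zero input")

test_normalize_scores()
print("all tests passed")
```

In a TDD workflow the test function exists before `normalize_scores` does; the agent iterates on the implementation until the fixed assertions pass.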
2,620
185 Comments -
Andreas Maier
Friedrich-Alexander-Universitä… • 7K followers
AI on Review: How Large Language Models Are Reshaping Peer Review

The Peer Review Crunch and New "Reviewer Duties"

Peer review is the backbone of scientific quality control, ensuring that research findings are vetted for accuracy and significance before publication (pmc.ncbi.nlm.nih.gov). In fast-moving fields like machine learning and computer vision, top conferences function much like journals - and the integrity of science depends on rigorous peer evaluation.

However, the system is straining under an avalanche of submissions. Major AI conferences now routinely receive well over 10,000 papers, a surge that has stretched the reviewer pool to its limits (arxiv.org). This deluge has led to radical policy changes: some conferences now essentially conscript all submitting authors into service as reviewers. For example, ICLR 2025 explicitly warned authors that any paper without at least one author signed up to review would be desk-rejected (reddit.com). NeurIPS and others have similarly pleaded that "all authors help with reviewing, if asked," to tackle the reviewer shortage. https://lnkd.in/d7KDP99K
26
-
Yann LeCun
Advanced Machine Intelligence… • 1M followers
An excellent piece in Newsweek by former Bell Labs President Marcus Weldon on "the 8 principles for the future of AI." In a remarkably clear and concise manner, the piece distills some basic predictions from a series of interviews with roboticist Rodney Brooks, neuroscientist David Eagleman, and myself. https://lnkd.in/dsJSSvGM
859
49 Comments -
The Quantum Insider
59K followers
Quantum needs data—not just qubits. aqora’s new public datasets hub marks a practical step forward for the field. By creating a central space to upload, access, and collaborate on quantum-specific datasets, they’re addressing a quiet but critical challenge in the ecosystem: reproducibility and shared benchmarks.

Why does this matter? Quantum machine learning and benchmarking rely on consistent, high-quality data—but until now, much of it has been scattered, inconsistent, or locked in silos. This hub offers:
➡️ Versioned, well-structured datasets
➡️ Integration with tools like pandas and polars
➡️ Options for both public and private data sharing
➡️ A base for teaching, testing, and real-world use cases

As the field matures, tools like this can help shift the conversation from isolated experiments to collaborative progress. 🔗 Full article in the comments. #QuantumComputing #QML #OpenData #QuantumResearch #TQI #TheQuantumInsider #Aqora #EcosystemBuilding
79
5 Comments -
Ganesh Venkatesh
Waymo • 2K followers
🚀 Exciting News! Our new paper on helping multimodal LLMs see beyond language priors got accepted to the CVPR Workshop on Visual Concepts! 🚀

Ever wonder why multimodal LLMs (MLLMs), despite seeing images, sometimes miss the bigger picture or rely too much on text? While text-only LLMs get rich feedback from every token, MLLMs often face a "sparse feedback" problem. They struggle to learn about image concepts not explicitly mentioned in text descriptions and can default to just predicting text based on language patterns, rather than truly seeing.

🔥 Our latest research tackles this head-on! We've developed novel training strategies that:
1️⃣ Deepen visual understanding using a visual loss: teach the MLLM to build a much richer internal representation of ALL visual concepts in an image.
2️⃣ Boost visual attention using blank tokens: encourage the model to pay significantly more attention to what it sees by subtly weakening its over-reliance on predicting just from previous text tokens.

💡 The impact? We're seeing strong performance improvements on demanding visual tasks in both upstream (core understanding) and downstream (application-level) settings!

🏆 Key achievement: our approach enables our Llama 3.1 8B-based LLaVA model – which is smaller, uses a simpler model architecture, and operates on lower-resolution visual inputs – to MATCH the performance of the much larger Llama 3.2 11B on challenging visual reasoning benchmarks! 🤯 This shows the power of a smarter training recipe!

This is a crucial first step toward MLLMs that are more visually intelligent, reliable, and truly understand the world around them. Please stay tuned for our upcoming updates on model architectures more conducive to capturing visual and language concepts, as well as on post-training MLLMs to reason through tricky questions.

👉 Dive into the details! Read the full paper here: https://lnkd.in/gZcbw5yC

A huge thank you to my incredible co-authors Aarti Ghatkesar and Uddeshya Upadhyay, and everyone who supported this journey! #MultimodalLLM #MLLM #VisualGrounding #VisCon
69
2 Comments -
David Talby
John Snow Labs • 25K followers
Our recent peer-reviewed paper in JMIR AI (The Leading Journal for Real-World AI Applications in Medicine), with Mount Sinai Health System, evaluates the performance of Tailored #NLP Pipelines for Radiology, Pathology, and Progress Notes in #ClinicalDecisionSupport: https://lnkd.in/gcmFpxEK #ai #research #healthcareai #healthai #datascience #textmining
20
-
Furu Wei
Microsoft Research Asia • 12K followers
Introducing Generative Adversarial Distillation (GAD): a novel GAN-style formulation and framework that facilitates both on-policy and black-box distillation of large language models (LLMs). GAD is the first technique to enable black-box on-policy distillation from proprietary teachers where internal logits or parameters are inaccessible, or distillation between teacher and student LLMs with incompatible vocabularies. GAD expands our prior work on white-box on-policy distillation (i.e., MiniLLM), pioneering black-box on-policy distillation for LLM training.

Specifically, GAD frames the student LLM as a generator and trains a discriminator to distinguish its responses from the teacher LLM’s, creating a minimax game. The discriminator acts as an on-policy reward model that co-evolves with the student, providing stable, adaptive feedback. Experimental results show that GAD consistently surpasses the commonly used sequence-level knowledge distillation. In particular, Qwen2.5-14B-Instruct (student) trained with GAD becomes comparable to its teacher, GPT-5-Chat, on the LMSYS-Chat automatic evaluation. The results establish GAD as a promising and effective paradigm for black-box LLM distillation.

Our team has been conducting fundamental research in knowledge distillation, with wide adoption across the industry:
- MiniLM: We introduced multi-head attention distillation, establishing the most effective distillation method for BERT-style models. The open-source MiniLM models (e.g., 6x384) have become the most widely utilized small encoder models on Hugging Face.
- MiniLLM: Our proposed reverse KLD is recognized as one of the most effective, de facto on-policy distillation approaches for modern LLM training, and has been widely used by Thinking Machines, Gemma, and many other teams and models.
- BitDistill: We proposed BitNet Distillation to finetune off-the-shelf full-precision LLMs (e.g., Qwen) into 1.58-bit precision (ternary weights {-1, 0, 1}), achieving performance parity with the full-precision counterparts on specific downstream tasks.
- GAD: The development of Generative Adversarial Distillation (GAD) now allows for black-box on-policy distillation, overcoming two major prior limitations: (1) distillation from proprietary teachers where internal logits or parameters are inaccessible; (2) distillation between teacher and student LLMs with incompatible vocabularies. https://lnkd.in/gMaP2c7w
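The minimax structure described in the post can be caricatured with 1-D numbers standing in for LLM responses. Everything below (the noisy scalar "teacher", the running-mean "discriminator", the update rules) is an illustrative stand-in, not the paper's actual method: the only point is that the student improves using feedback derived from a discriminator that sees teacher and student samples, never the teacher's internals.

```python
import random

random.seed(0)

def teacher_response():
    # Black-box teacher: we can only sample responses, never see internals.
    return 5.0 + random.uniform(-1, 1)

student_mean = 0.0   # the student "policy" is a single scalar parameter
t_avg = 0.0          # discriminator's running picture of teacher outputs
s_avg = 0.0          # ...and of student outputs

for step in range(4000):
    t = teacher_response()
    s = student_mean + random.uniform(-1, 1)   # on-policy student sample

    # Discriminator step: refresh its view of each side. Its (implicit)
    # decision rule is the gap between the two running means.
    t_avg += 0.05 * (t - t_avg)
    s_avg += 0.05 * (s - s_avg)

    # Student step: follow the discriminator's feedback, moving in the
    # direction that shrinks the gap it uses to tell the sides apart.
    # This is the co-evolving "reward model" idea in miniature.
    student_mean += 0.05 * (t_avg - s_avg)

# student_mean ends up near the teacher's mean of 5.0
```

The design point the toy preserves: the student never needs the teacher's logits or vocabulary, only samples of its behavior, which is what makes the black-box setting workable.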
125
2 Comments -
Bo Wang
Xaira Therapeutics • 19K followers
🚨 Excited (and honored) to be invited by Stanford University AI+Biomedicine to give a seminar on BioReason: Toward Biological Foundation Models That Reason. Foundation models are rapidly transforming genomics, single-cell biology, and protein science—but most remain powerful pattern recognizers, not biological thinkers. In this talk, I’ll share our journey:
🧬 scGPT – learning broadly transferable cellular representations from single-cell data
🧠 BioReason – a DNA–LLM hybrid that pairs a genomic encoder with a language model trained via supervised + reinforcement learning to generate explicit, step-by-step biological logic
🧪 BioReason-Pro – a multimodal protein reasoning model integrating sequence, structure, domains, and interaction networks, trained on 130k curated reasoning traces

These models don’t just improve performance on tasks like KEGG disease pathway inference and variant-effect prediction—they expose why a prediction is made. Why this matters: accuracy without explanation doesn’t scale in biology. Discovery requires mechanistic reasoning, not just correlations. Together, these efforts point toward a Virtual Cell: an AI system that can predict, explain, and guide biological discovery—not just score benchmarks.

📍 Feb 10 | Stanford (CoDa E160) + Zoom. Looking forward to deep discussions with the Stanford community.
514
7 Comments -
NVIDIA AI
2M followers
Get faster and smarter MoE inference straight out of the box. 👇 Deep dive on scaling expert parallelism with TensorRT-LLM.

LLMs with MoE promise higher model capacity without linearly increasing compute costs, but they introduce new challenges -- more conditional computation, dynamic routing, and non-uniform GPU utilization -- solved by TensorRT-LLM.

✨ TensorRT-LLM now has native support for expert parallelism—designed for fast, efficient inference with MoE models like Mixtral (Mistral AI) and DeepSeek (DeepSeek AI). This gives you:
✅ Dynamic expert routing: automatically route tokens to the top-k experts with minimal overhead.
✅ Efficient expert scheduling: balance expert loads across GPUs using smart sharding and token bucketization.
✅ Memory-aware execution: maximize hardware utilization while respecting memory budgets.
✅ Drop-in support: use @HuggingFace models with minimal code changes via TensorRT-LLM's #Python API.

🧠 How it works: MoE models activate only a subset of "experts" for each token. This dynamic nature is powerful—but hard to optimize. It’s all done under the hood using custom #CUDA kernels and NCCL-based communication primitives—giving you low latency, high throughput, and better GPU scaling.

✨ TensorRT-LLM handles:
✅ Token-expert mapping using the gating network.
✅ Token sorting to batch same-expert tokens together.
✅ Expert-parallel execution across GPUs.
✅ Merging outputs for final predictions.

🛠️ Developer workflow - here is the code to get started:
# Clone the repo
git clone https://lnkd.in/g-GiDX23
# Use included examples to load and run a Mixtral model
cd TensorRT-LLM/examples/mixtral

From there, the Python API lets you load the model, convert it with TensorRT, and run expert-parallel inference—all with a few lines of code.

Results? 📈 Performance at scale. Tests show up to 2.3x faster inference throughput compared to standard tensor parallelism when using 8 GPUs and top-2 experts per token. Even better—TensorRT-LLM keeps efficiency high across increasing batch sizes.

Want to see it in action or contribute?
👉 Read the full tech blog: https://lnkd.in/g_7YV3vV
👉 Explore the code on GitHub: https://lnkd.in/gNjQ5W2U
👉 Follow updates in the TensorRT-LLM repo: https://lnkd.in/gqSHYQ4u
Share your experiences with us.
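The routing steps the post lists (token-expert mapping via the gating network, sorting tokens into per-expert batches, merging outputs) can be illustrated with a pure-Python toy. This assumes nothing about TensorRT-LLM's actual API; in the real system this logic lives in fused CUDA kernels, and the sketch only shows the logical flow.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def route_tokens(gate_logits, top_k=2):
    """Map each token to its top-k (expert_id, weight) pairs."""
    routed = []
    for logits in gate_logits:             # one row of gate logits per token
        probs = softmax(logits)
        top = sorted(range(len(probs)), key=lambda i: -probs[i])[:top_k]
        norm = sum(probs[i] for i in top)  # renormalize over chosen experts
        routed.append([(i, probs[i] / norm) for i in top])
    return routed

def bucketize(routed):
    """Group token indices by expert so each expert runs one batch
    (the 'token sorting' step); outputs are later merged per token
    using the routing weights."""
    buckets = {}
    for tok, pairs in enumerate(routed):
        for expert, weight in pairs:
            buckets.setdefault(expert, []).append((tok, weight))
    return buckets

# 3 tokens, 4 experts: hypothetical gate logits
gate_logits = [[2.0, 0.1, 0.0, 1.5],
               [0.0, 3.0, 0.2, 0.1],
               [1.0, 0.9, 2.5, 0.0]]
routed = route_tokens(gate_logits)
print(bucketize(routed))
```

Each token ends up assigned to its two highest-scoring experts with weights that sum to 1, and each expert receives one contiguous batch of work, which is what makes expert-parallel scheduling across GPUs possible.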
78
3 Comments -
Evan Peikon
NNOXX Inc. • 8K followers
Co-expression network analysis is a powerful approach for identifying groups of genes, referred to as "modules," that exhibit coordinated expression patterns across a dataset. These modules often represent functionally related genes, or sets of genes, that work together in shared biological pathways or cellular processes. By capturing patterns of gene activity, co-expression networks enable researchers to infer relationships between gene expression and biological traits or conditions. This makes them invaluable tools for uncovering the mechanisms driving disease progression, developmental processes, or responses to environmental changes.

One of the most widely adopted tools for constructing co-expression networks is Weighted Gene Co-expression Network Analysis (WGCNA). This method leverages pairwise correlations between gene expression levels to group genes into clusters, or modules, of tightly interconnected genes. Additionally, WGCNA highlights key genes within each module—known as hub genes—that are highly connected and often play pivotal regulatory roles.

The latest article on Decoding Biology will introduce you to the principles of co-expression network analysis, walk you through constructing a co-expression network, and demonstrate how to identify key modules and hub genes associated with specific biological traits. PS - You can also read this article on Github with the following link: https://lnkd.in/eCJY9TZ2 #compbio #bioinformatics #genomics #biotech #systemsbiology #datascience
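The WGCNA-style workflow described above (correlate gene expression profiles, soft-threshold into an adjacency matrix, group co-expressed genes into modules, rank hub genes by connectivity) can be sketched as a self-contained NumPy toy. The data, the beta=6 soft-thresholding power, and the connected-components module step are simplifications of my own; real WGCNA builds a topological overlap matrix and uses hierarchical clustering instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy expression matrix: 6 genes x 20 samples. Genes 0-2 co-vary with one
# latent signal, genes 3-5 with another, so two modules should emerge.
base_a, base_b = rng.normal(size=20), rng.normal(size=20)
expr = np.vstack([base_a + 0.1 * rng.normal(size=20) for _ in range(3)] +
                 [base_b + 0.1 * rng.normal(size=20) for _ in range(3)])

corr = np.corrcoef(expr)            # pairwise gene-gene correlation
adjacency = np.abs(corr) ** 6       # soft-thresholding power beta = 6
np.fill_diagonal(adjacency, 0)

# Modules: connected components over "strong" edges (a toy stand-in for
# hierarchical clustering of the topological overlap matrix).
strong = adjacency > 0.5
modules, seen = [], set()
for g in range(len(expr)):
    if g in seen:
        continue
    stack, module = [g], set()
    while stack:
        node = stack.pop()
        if node in module:
            continue
        module.add(node)
        stack.extend(int(i) for i in np.flatnonzero(strong[node]))
    seen |= module
    modules.append(sorted(module))

# Hub gene per module: highest intramodular connectivity (row sum).
hubs = [max(m, key=lambda g: adjacency[g, m].sum()) for m in modules]
print(modules, hubs)
```

With this seed the two planted modules are recovered, and the hub of each module is simply its most strongly connected gene, which is the core of what WGCNA's hub-gene ranking formalizes.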
63
-
Paul Brian Contino
Paul Brian Contino • 11K followers
New Mount Sinai AI study finds model-driven bias in a stress test of nine large language models (LLMs). Despite identical clinical details, AI models altered recommendations based on a patient's socioeconomic and demographic profile. The authors urge more stringent model validation and refinement of AI tools to ensure they uphold the highest ethical standards and treat all patients fairly. “Socio-Demographic Biases in Medical Decision-Making by Large Language Models: A Large-Scale Multi-Model Analysis.” April 7, 2025, online issue of Nature Medicine. https://lnkd.in/eTe98Kin #AI #LLM #bias #AIassurance #healthequity #mountsinai
23
5 Comments -
Google AI for Developers
113K followers
The Stanford Center for Research on Foundation Models’ Marin project has released the first fully open model in JAX. It’s an 'open lab' sharing the entire research process - including code, data, and logs - to enable reproducibility and further innovation. Check out the project: https://goo.gle/44AMeLY
41
2 Comments -
Arnab Nandi
The Ohio State University • 5K followers
Tomorrow at #SSDBM2025: Tanya will be presenting our vision for "OmniMesh: Addressing Findability Challenges in Distributed Nature Data Repositories"! This is a joint effort with Wei-Lun (Harry) Chao, Hilmar Lapp, Carl Boettiger, and Rongjun Qin. Findability is a critical impediment to biodiversity work: datasets are scattered and hard to find across many siloed repositories. At the same time, biodiversity data is inherently multimodal, making annotation a bottleneck. OmniMesh combines embeddings and zero-shot models to stitch these silos together with a lightweight, standards-based search layer. #multimodal #search #FAIR #ssdbm
44
2 Comments -
Juliana Freire
2K followers
Thrilled to share that our research group #NYUVida will be presenting three projects at #SIGMOD2025. From combating wildlife trafficking with AI to advancing biomedical data integration, we're pushing the boundaries of data science for real-world impact.

Smart Sampling, Smarter Labels: LLM-Powered Data Labeling @ SIGMOD Research: What if you could achieve 95% accuracy in wildlife trafficking detection in online marketplaces while slashing labeling costs? Our Learning to Sample (LTS) framework makes it possible by strategically combining clustering with multi-armed bandit sampling to generate high-quality pseudo-labels from LLMs for training specialized classifiers. Born from collaborations with criminologists and environmental scientists, this approach tackles the notorious challenge of highly imbalanced datasets—turning the detection of illegal endangered species trade from needle-in-a-haystack to precision targeting. #MachineLearning #LLM #ConservationTech #Wildlife Juliana Silva Barbosa, Gohar Petrossian

From Chaos to Harmony: Agentic Data Harmonization @ NOVAS Workshop: Harmonia is an interactive system that combines #LLM reasoning with data harmonization primitives from the open-source bdi-kit library (https://lnkd.in/gKpnEGis) to synthesize data integration pipelines. The system empowers domain experts to harmonize datasets from diverse sources while streamlining the traditionally time-consuming process of resolving schema mismatches and terminology differences. We have demonstrated the effectiveness of Harmonia on biomedical data harmonization, creating reusable pipelines that map biomedical datasets to standards like GDC (Genomics Data Commons), making biomedical research faster and more reproducible. #DataIntegration #AIAgents #OpenSource Aécio Santos, Roque Lopez, Eduardo Pena

Hierarchical Table Semantics for Exploratory Table Discovery @ HILDA Workshop: Develops automated semantic representations of tables at multiple levels—from specific column types to general table semantics—enabling discovery through semantic alignment rather than traditional keyword or value matching. The system constructs hierarchical semantic trees that capture shared concepts across column groups, improving relevance detection in heterogeneous table collections. Particularly valuable for users exploring unfamiliar datasets where traditional search fails to capture semantic relevance. #SemanticSearch #DataDiscovery #DataIntegration Grace Fan, NYU Tandon School of Engineering, NYU Center for Data Science
103
1 Comment