Brian Uzzi


2026

Large language models are increasingly used in inventive problem-solving, but effective support requires more than open-ended idea generation: an inventive solution must improve one aspect of a technical system without unintentionally worsening another. TRIZ (Theory of Inventive Problem Solving) provides a unique and structured framework for this setting by representing engineering trade-offs as contradictions and linking them to standardized inventive principles. However, prior TRIZ–LLM evaluations are typically small-scale, limited to case studies in narrow areas of technology, and rarely grounded in patent text, which makes it difficult to assess structured reasoning at scale. We introduce TRIZBench, a dataset and benchmark for TRIZ reasoning grounded in open technical sources and U.S. patents. TRIZBench evaluates the core TRIZ workflow through three tasks: contradiction prediction, inventive principle prediction, and grounded TRIZ reasoning. Experiments with multiple LLM baselines show that detecting contradictions is easier than recovering the correct trade-off pairs, while principle prediction benefits from explicitly exploiting TRIZ structure. Our findings further underscore the importance of grounding: semantic retrieval enables evidence-based justifications and helps explain why LLMs fail. Dataset and code are available at https://github.com/ellenzhuwang/trizbench.
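The gap between detecting a contradiction and recovering the correct trade-off pair can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Contradiction` type, the `score` function, the parameter names, and the two metrics are hypothetical and are not taken from TRIZBench or its code.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Contradiction:
    """A trade-off: improving one parameter worsens another (hypothetical type)."""
    improving: str
    worsening: str


def score(gold: List[Optional[Contradiction]],
          pred: List[Optional[Contradiction]]) -> Tuple[float, float]:
    """Return (detection accuracy, exact pair match rate).

    None means "no contradiction". Detection only checks whether a
    contradiction was flagged; pair match also requires both parameters
    to be correct, and is computed over items that have a gold pair.
    """
    detection = sum((g is None) == (p is None) for g, p in zip(gold, pred)) / len(gold)
    with_gold = [(g, p) for g, p in zip(gold, pred) if g is not None]
    pair_match = (sum(g == p for g, p in with_gold) / len(with_gold)) if with_gold else 0.0
    return detection, pair_match


# Illustrative labels only; the parameter names are made up for this sketch.
gold = [Contradiction("strength", "weight"), None, Contradiction("speed", "energy use")]
pred = [Contradiction("strength", "weight"), None, Contradiction("speed", "reliability")]

detection_acc, pair_acc = score(gold, pred)
print(detection_acc, pair_acc)
```

In this toy run every contradiction is detected, but only one of the two gold pairs is recovered exactly, which is the kind of gap the abstract describes.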

2024

Patent citation count is a good indicator of patent quality, and highly cited patents often generate monetary value for their inventors and organizations. However, the factors that lead a patent to receive many citations over the years are still not well understood. Using patents from the past two decades, we study the problem of patent citation prediction and formulate it as a binary classification problem. We create a semantic graph of patents based on their semantic similarities, enabling the use of Graph Neural Network (GNN)-based approaches for predicting citations. Our experimental results demonstrate the effectiveness of GNN-based methods applied to the semantic graph, showing that they can accurately predict patent citations using only patent text. More specifically, these methods achieve up to 94% recall for highly cited patents and outperform existing baselines. Furthermore, we leverage the constructed graph to gain insights into and explanations for the predictions made by the GNNs.
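One way to picture a semantic graph of patents is the sketch below. It assumes bag-of-words vectors, cosine similarity, and a fixed similarity threshold for adding edges; the abstract does not say how similarity is computed or how edges are chosen, so the function names, the threshold, and the toy texts are all hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def bag_of_words(text: str) -> Counter:
    """Lowercased whitespace tokens with counts (a stand-in for a real text encoder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_semantic_graph(texts: List[str], threshold: float = 0.3) -> List[Tuple[int, int, float]]:
    """Connect patents i < j whose text similarity reaches the (assumed) threshold."""
    vectors = [bag_of_words(t) for t in texts]
    edges = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            sim = cosine(vectors[i], vectors[j])
            if sim >= threshold:
                edges.append((i, j, sim))
    return edges


# Toy patent snippets, invented for illustration.
texts = [
    "battery electrode lithium coating",
    "lithium battery electrode separator",
    "antenna signal wireless receiver",
]
edges = build_semantic_graph(texts)
print(edges)
```

An edge list like this, together with per-patent text features and a binary high-citation label, is the kind of input a GNN node classifier would consume. The all-pairs loop is quadratic, so a corpus spanning two decades of patents would need an approximate nearest-neighbour search instead.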