Some of the most valuable information within organizations and markets has long been trapped in dense, unstructured text: annual reports, contracts and customer feedback. The challenge was not that this information didn’t exist, but that extracting it was slow, expensive and difficult to do consistently.
Modern generative AI lifts this limitation. With today’s models, leaders can transform “textual treasure mines” into structured, decision-ready information: a consistent signal that can be tracked over time, compared across peers, and connected to results.
We study one of the most relevant areas of opportunity today: what we call climate solutions, that is, products and services that enable decarbonization (batteries, electric vehicles, etc.).
Many leaders considering new investments to drive growth in these areas face a basic question: Who is actually developing and selling climate solutions, and where is the opportunity emerging? Financial statements do not break out “revenue from climate solutions” in a standardized way.
In our recent research, we show how generative AI can help answer this question by treating regulatory texts as data. We apply a fine-tuned GPT model to Item 1 (“Business Description”) of companies’ 10-K (annual report) forms to construct a firm-specific annual measure of whether they are developing or implementing climate solutions products and services.
We focus on Item 1 because it provides a rich description of the company’s products and services and is available every year for all U.S. public companies.
We then fine-tuned a GPT model to detect sentences about climate solutions, as opposed to generic climate language.
This distinction means that selling electric vehicles counts; merely using them in a corporate fleet does not.
To capture this nuance, we fine-tuned the GPT model on a set of around 3,500 Item 1 sentences, collected from companies across different industries.
We then used this model to classify every Item 1 sentence in 39,710 10-K forms from 4,483 U.S. companies between 2005 and 2022. In total, we processed nearly 10 million sentences.
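As a rough illustration of how sentence-level classifications could roll up into a firm-specific annual measure, here is a minimal Python sketch. It assumes the measure is the share of a filing's Item 1 sentences flagged as climate solutions; the article does not spell out the exact aggregation, so that choice, the function name `climate_intensity`, and the sample company names are all hypothetical.

```python
from collections import defaultdict

def climate_intensity(labeled_sentences):
    """Return {(firm, year): share of sentences flagged as climate solutions}.

    labeled_sentences: iterable of (firm, year, is_climate_solution) tuples,
    one per classified Item 1 sentence. The share-of-sentences aggregation
    is an assumption made for illustration.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for firm, year, flagged in labeled_sentences:
        key = (firm, year)
        totals[key] += 1
        if flagged:
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}

# Hypothetical classifier output for two made-up firms.
sample = [
    ("AutoCo", 2021, True),    # e.g. a sentence about selling electric vehicles
    ("AutoCo", 2021, False),
    ("AutoCo", 2021, False),
    ("AutoCo", 2021, True),
    ("RetailCo", 2021, False),  # e.g. a sentence about using EVs in a fleet
    ("RetailCo", 2021, False),
]
intensity = climate_intensity(sample)
for (firm, year), share in sorted(intensity.items()):
    print(firm, year, share)
```

Because the output is keyed by firm and year, the same structure can be tracked over time or compared across peers.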
How regulatory texts can be converted into useful information
Consider these five strategic areas:
1. Customer data: disclosures in regulatory filings about large customers, such as those representing more than 10% of a company’s sales
2. Competitors’ capabilities: product descriptions, technical documentation, patent registrations
3. Supply chain fragility: companies’ own disclosures about supplier risk, safety records, or purchasing requirements
4. Regulatory exposure and compliance posture: risk factor disclosures, enforcement language, public policy consultations
5. Workforce constraints: job postings, internal skills inventories, employee feedback
In each case, the following sequence can be followed: choose a relevant and recurring text source; precisely define the concept (what you want to detect); adjust or calibrate the model for that concept; validate based on available references; and then use the resulting measures to monitor changes and support decisions.
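The validation step in this sequence can be made concrete with a small check of model labels against a human-coded reference sample. The Python sketch below assumes binary labels and uses precision and recall as the comparison metrics; the article does not say which validation metrics were used, and the label lists are invented for illustration.

```python
def precision_recall(predicted, reference):
    """Compare model labels with reference labels (both lists of booleans)."""
    pairs = list(zip(predicted, reference))
    true_pos = sum(1 for p, r in pairs if p and r)
    false_pos = sum(1 for p, r in pairs if p and not r)
    false_neg = sum(1 for p, r in pairs if not p and r)
    # Guard against division by zero when a class is never predicted or present.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

# Hypothetical labels for six sentences: model output vs. human coding.
model_labels = [True, True, False, True, False, False]
human_labels = [True, False, False, True, True, False]

precision, recall = precision_recall(model_labels, human_labels)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Low precision would suggest the concept definition is too loose (generic climate language is slipping in), while low recall would suggest it is too narrow.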
Although we analyzed one topic and one type of text (regulatory), our findings offer three generalizable lessons for companies looking to transform their data into insights:
Identification of areas of opportunity
One reason Item 1 is useful is that it is closer to what companies claim to sell than what they claim to believe. When we use generative AI to quantify the intensity of a company’s climate solutions based on Item 1, the measure behaves like a real economic signal, not a vague narrative indicator.
When identifying areas of opportunity, companies must first find a measure that can be compared to external indicators.
In our study, we observed that the measure of climate solutions intensity derived from our analysis rises with external benchmarks of green revenue and green innovation, and tracks real product and market engagement rather than broad climate rhetoric.
Second, companies must connect their measures to business outcomes that matter to executives. We observed that companies with greater intensity in climate solutions showed greater revenue growth.
This relationship was stronger in contexts where demand and competitive advantage tend to be more durable: where innovation is protectable (e.g. patents) and in solution categories with greater carbon reduction potential.
In sum, the text-derived measure not only reflected activity but was also associated with growth.
Learning about changes in external factors
Generative AI offers a scalable “external sensing” layer. Instead of relying solely on the news cycle, trends at conferences, or a curated list of competitors, leaders can systematically monitor how companies across the economy describe what they do and spot early signs of convergence that could reshape value chains.
Depending on the data source used by the AI, this can provide an annual, quarterly, or even more frequent view of changes in the landscape.
We observe that, as decarbonization technologies gain scale, they begin to cross traditional sectoral lines. Biofuels, for example, are being developed both by oil and gas companies as alternative fuels and by actors in the agricultural sector through waste-based routes.
In practice, this means that the relevant ecosystem for a given solution is often broader than a single industry classification, and that the set of relevant competitors, partners, and bottlenecks can change more quickly than leaders expect.
Text analytics offers a way to observe this earlier. By examining the distribution of publicized climate solution topics across sectors, we can quantify when two industry groups begin to overlap in the solution areas they emphasize.
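One simple way to quantify such overlap is to compare sectors' topic profiles with a similarity score. The Python sketch below uses cosine similarity, which is an assumption chosen for illustration; the article does not specify the similarity measure, and the sector names, topic names, and weights are made up.

```python
import math

def cosine_similarity(profile_a, profile_b):
    """Cosine similarity between two {topic: weight} profiles."""
    topics = set(profile_a) | set(profile_b)
    dot = sum(profile_a.get(t, 0.0) * profile_b.get(t, 0.0) for t in topics)
    norm_a = math.sqrt(sum(v * v for v in profile_a.values()))
    norm_b = math.sqrt(sum(v * v for v in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile overlaps with nothing
    return dot / (norm_a * norm_b)

# Hypothetical shares of each sector's climate solutions text by topic.
oil_gas = {"biofuels": 0.6, "carbon_capture": 0.4}
agriculture = {"biofuels": 0.7, "precision_farming": 0.3}
software = {"energy_management": 1.0}

overlap_biofuels = cosine_similarity(oil_gas, agriculture)
overlap_none = cosine_similarity(oil_gas, software)
print(round(overlap_biofuels, 3), overlap_none)
```

Computed year by year, a rising score for a sector pair would flag the kind of convergence described above, such as oil and gas and agriculture both emphasizing biofuels.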
We relate this convergence to capital market activities: sector pairs that are most similar in their climate solutions theme profiles showed greater co-movement in stock returns, consistent with fundamentals moving together.
For leaders, this finding indicates that when adjacent industries begin to resemble each other in their solution focus, the competitive environment is likely changing, whether due to new entrants, new substitutes, or new dependencies on shared suppliers and platforms.
Executives operate with hypotheses: beliefs about what will limit adoption, where demand will come from, and which barriers are structural or temporary. But in uncertain markets, these hypotheses can become anchors, especially when executives don’t have direct visibility into what’s happening in the broader market.
A useful feature of generative AI-based measurement is that it can challenge these assumptions by aggregating evidence at scale. Our results illustrate this with a common belief about the climate transition: that politics will sharply divide opportunity and adoption.
We find that politics matters, but to a lesser extent than one might assume: companies with greater exposure to Republican-leaning states have a lower measure of climate solutions in our study than those more exposed to Democratic-leaning states.
However, the pattern is not uniform across technologies. The political difference disappears in the case of low-cost technologies. This nuance is strategically important because it reframes the constraint: politics may slow or shape adoption in some segments, but as costs fall and solutions become economically attractive, this constraint weakens.
This type of insight can help update executives’ beliefs. Rather than debating isolated cases, leaders can use systematic evidence from textual data to distinguish where political constraints actually hold sway and where economics dominates. This, in turn, can guide where to allocate capital, which geographies to prioritize, and how to plan entry into different solution categories.
© 2026 Harvard Business School Publishing Corp. Distributed by New York Times Licensing