Peak AI Hype: An Investing Perspective

Putting aside my general enthusiasm for tech developments, the recent wave of chatbots is distinct from many new technologies (sorry, NFTs) in that they are useful now, they’re pretty good at what they do - unless it’s maths - and they’re rapidly getting better.

Since ChatGPT’s launch late last year, Large Language Models have quickly captured the imagination, with OpenAI’s initial variant reaching a reported 100 million monthly active users by January. Even by the standards of modern technology platforms, that’s pretty fast:

[100M Users.png] Google Plus is the notable absentee - it took 14 months to reach 100M but was shut down in 2019. Google will be hoping Bard fares better.

At this relatively early stage in their development - remember, ChatGPT only launched last November - I’m going to jump on the bandwagon (this is your ‘nearing peak AI hype cycle’ warning) but offer a slightly different take by covering LLMs’ current utility as an ‘investment assistant’, as well as briefly looking at how markets have factored in the potential of this technology.

AI as Your Investment Copilot - The Current State of Play

To evaluate LLMs as they currently stand, I came up with a short list of actions that I’d expect an ideal investment assistant to be able to carry out:

1. Provide a summary of a sector, and the biggest companies within that sector.
2. Analyse company reports, accurately surfacing user-requested information.
3. Calculate and compare valuation metrics between companies and provide plausible suggestions for differences.
4. Given a user-defined set of characteristics, suggest businesses that fit these criteria.

As it turns out, at the time of writing (May 2023), 1 and 2 can already be accomplished:

1. Summarising Business Sectors

As with most tasks that require the models to provide a broad overview of a topic, either Bing Chat or GPT-4 can handle queries about business sectors with relative ease:

[Business Sector Summary.png] Not bad! Using GPT-4 + the ‘Browsing’ plugin

While this isn’t going to rival a Morningstar report, it’s at least a useful starting point for looking into the wider context of an investment.

2. Analysing Earnings Reports

Moving on to the next example, which involves a little more complexity, it’s worth highlighting quite how quickly some of the early issues here have been addressed. Microsoft’s initial public demo of Bing Chat was far from perfect, including a summary of Gap’s Q3 2022 financial report that bore little relationship to their real, published results:

[Bing Chat's Public Errors.png] Responses shown at the public reveal of Bing Chat. At least one bullet point was correct! Source: DKB Blog

When addressing this alongside the other, non-financial errors, Microsoft was quick to point out that this was a first-generation product and that stumbles were expected (and there were a lot of initial hiccups). They also promised quick improvements in functionality over time and, in fact, directly acknowledged the financial report issue:

“For queries where you are looking for a more direct and factual answers such as numbers from financial reports, we’re planning to 4x increase the grounding data we send to the model,” the company said in a statement. - Feb 16th

Now, a couple of months after that statement, I decided to retry the same prompt with the same report:

[Gap Analysis Retry.png] This time using Bing Chat’s recently introduced ‘Precise’ mode

The prompt that originally caused the hallucinated figures is now… answered correctly and matches the real metrics. This is great! But there’s a caveat. You can only expect Bing Chat to respond satisfactorily to prompts whose answers are explicitly stated in the text of the webpage. When it comes to parsing and extracting information from tables, the tech isn’t there yet, so given an open-ended question, it will rely on any available passages. As soon as you want specifics - ‘how much cash was used in investing activities?’, for instance - it shrugs its shoulders. Given the current pace of change, it will hopefully only be a matter of time before this is updated.

Update: In the 24 hours since I’d written the previous sentence, Anthropic announced this:

Introducing 100K Context Windows! We’ve expanded Claude’s context window to 100,000 tokens of text, corresponding to around 75K words. Submit hundreds of pages of materials for Claude to digest and analyze. Conversations with Claude can go on for hours or days. pic.twitter.com/4WLEp7ou7U
— Anthropic (@AnthropicAI) May 11, 2023

I don’t have access yet, but according to this subsequent post it now looks as if item number 2 is ready to be checked off too! 🤯 There’s no substitute for reading a 10-K, sure, but this is easily one of the most interesting applications of LLMs that’s been shown so far. Being able to download the past few years of annual reports for a company and query, say, how their risk disclosures have changed over that period, is powerful.
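As a rough way to gauge how many annual reports might fit in a window like that, here is a minimal Python sketch. It assumes only the ratio implied by the announcement (100,000 tokens to roughly 75,000 words); real token counts depend on the model's tokeniser and on how table-heavy a filing is, and the report lengths in the example are made-up numbers, not real filings.

```python
# Ratio taken from the announcement quoted above: ~75K words per 100K tokens.
# This is a rule of thumb, not a real tokeniser.
TOKENS = 100_000
WORDS = 75_000


def estimated_tokens(word_count: int) -> int:
    """Estimate tokens for a document, rounding up (integer arithmetic)."""
    return (word_count * TOKENS + WORDS - 1) // WORDS


def fits_in_context(word_counts, context_tokens=TOKENS):
    """Return (total estimated tokens, whether they fit in the window)."""
    total = sum(estimated_tokens(words) for words in word_counts)
    return total, total <= context_tokens


# Hypothetical word counts for annual reports (illustrative only).
two_reports = [30_000, 32_000]
three_reports = [30_000, 32_000, 35_000]

print(fits_in_context(two_reports))
print(fits_in_context(three_reports))
```

On these assumed lengths, two reports would fit in a single prompt and three would not, so a multi-year comparison may still mean trimming each filing down to the sections you care about.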

3. Calculating and Comparing Valuation Metrics

So now that this problem is apparently solved, LLMs can easily be used to calculate valuations too, surely? Unfortunately, this is a sticking point for the existing models. At the moment, although most of the major models will happily answer any numerical queries, none of them can claim to be fully accurate, which essentially makes them unfit for that purpose. However, assuming some programming knowledge (and API access), you can circumvent this by asking the LLM to pass any calculations to your own program (or alternatively, just ask an LLM for instructions on how to build one). If using the web-hosted chatbots, specific solutions like Wolfram Alpha’s OpenAI plugin should also reliably solve this issue, once the kinks have been ironed out.
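To make the 'pass calculations to your own program' idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the JSON request format, the function names and the example figures are hypothetical, and the step where you prompt a model to reply in this format is left out entirely. The point is only that the model chooses which calculation to run, while ordinary code does the arithmetic.

```python
import json


def price_to_earnings(price: float, eps: float) -> float:
    """Share price divided by earnings per share."""
    if eps <= 0:
        raise ValueError("P/E is not meaningful for zero or negative EPS")
    return price / eps


def ev_to_ebitda(market_cap: float, debt: float, cash: float, ebitda: float) -> float:
    """Enterprise value (market cap + debt - cash) divided by EBITDA."""
    if ebitda <= 0:
        raise ValueError("EV/EBITDA is not meaningful for zero or negative EBITDA")
    return (market_cap + debt - cash) / ebitda


# Hypothetical whitelist: the only calculations the model may request.
CALCULATORS = {
    "price_to_earnings": price_to_earnings,
    "ev_to_ebitda": ev_to_ebitda,
}


def run_calculation(request_json: str) -> float:
    """Parse a structured request (assumed format) and compute it locally."""
    request = json.loads(request_json)
    metric = request["metric"]
    if metric not in CALCULATORS:
        raise KeyError(f"Unsupported metric: {metric}")
    return CALCULATORS[metric](**request["inputs"])


# Stand-in for a model reply; the figures are made up.
model_reply = '{"metric": "price_to_earnings", "inputs": {"price": 50.0, "eps": 2.5}}'
print(run_calculation(model_reply))  # 20.0
```

Restricting the model to a fixed set of named calculations keeps the numbers deterministic and makes a malformed or unexpected request fail loudly rather than produce a plausible-looking wrong answer.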

Once valuation metrics have been obtained one way or another, thinking about reasons for differences in valuation is somewhere an AI assistant is able to help:

[GPT-4 Explains the Multiple Gap.png] All plausible explanations, although perhaps surprising it didn’t highlight the different markets each business operates in.

4. As a Screener

The utility of this last suggestion is perhaps questionable, given that there are plenty of stock screeners out there, but I was curious to see what GPT-4 could come up with, if anything:

[GPT-Fail.png] Computer says no.

Ah well, 2.5 out of 4 isn’t a bad start, all things considered! Needless to say, it’s unclear whether the frantic pace of change will keep up as we head into the second half of 2023 - it’s hard to imagine that’s even possible - but if so, it seems likely that LLMs will become a key part of an investor’s toolkit, if they aren’t already.

[Bing Chat Progress 1.png] Bonus: summarising Twilio’s latest quarterly waffle. Microsoft’s Edge browser allows querying of PDFs saved to your device.

The Big Tech AI Showdown (& Everyone Else in Software)

[Bing Chat Error.png] Microsoft and Google have moved with rare and telling speed to compete over AI, but not without missteps. Both Bing Chat and Bard had visible errors in their public reveals.

The biggest news for anyone with exposure to ‘Big Tech’ (which, at least via pension funds, includes pretty much everyone) is the fact that we now have the largest dust-up since Apple’s IDFA changes, and one with far greater potential consequences. The high-level takeaway is that Microsoft have positioned themselves brilliantly, recognising the promise of OpenAI early on, openly challenging Google, and putting out a ton of marketing to convince investors and potential customers alike of their superior AI prowess. Arda Capital has written a great piece on the topic. They assert that for all the noise Microsoft have made around challenging Google’s Search dominance, the real focus is on capturing developer mindshare:

Microsoft’s pitch to developers is simple: Azure will be the best platform to build, train, and fine-tune AI models and applications. - Arda Capital

This forces Google to react on two fronts: moving to defend their Search linchpin, and ensuring the AI-competitiveness of their own cloud platform, GCP, which has historically been a distant third to AWS and Azure. On the face of it, they’ve been outmanoeuvred, and pretty handily if the reports of a ‘code red’ issued late last year are to be believed. There’s also been action in Amazon’s corner, most notably through the offering of their own models via AWS, as well as a partnership with Hugging Face (a machine-learning platform). Apple has been characteristically quiet, though with their annual developer conference, WWDC, beginning in early June, that’s unlikely to remain the case.

How has the rest of the software world reacted? Typically, there have been a lot of announcements. Software businesses really want investors to know that they’re paying attention and have recognised that AI is a big deal. Just among the companies that I follow, in addition to Amazon’s product launches above, we’ve had:

Twilio releasing an AI research report
Cloudflare marketing R2 as ideal for AI startups
Datadog announcing an integration for monitoring OpenAI usage
Adobe unveiling Firefly, their own generative AI
AMD teasing a “Data Center and AI Technology Premiere”
Snowflake rumoured to be buying Neeva, an AI search startup

Given the meteoric rise in share price of those the market deems beneficially-positioned, maybe this knee-jerk reaction is justified:

Chegg, an online service that ‘assists’ students with their homework, has been the biggest loser so far. Surprisingly, their almost 50% one-day decline in share price wasn’t a direct result of what I hope will be the worst-named AI product integration announced, CheggMate, and was instead caused by their admission of slowing sign-ups following ChatGPT’s increasing popularity:

In the first part of the year, we saw no noticeable impact from ChatGPT on our new account growth and we were meeting expectations on new sign-ups. However, since March we saw a significant spike in student interest in ChatGPT. We now believe it’s having an impact on our new customer growth rate. - Prepared Remarks, Q1 ‘23.

Pearson (an educational publisher) shares fell in sympathy. That Duolingo’s stock has surged in the opposite direction is a point of intrigue, but has as much to do with two stellar consecutive quarterly reports as with advantageous positioning. The proactive announcement of their OpenAI-powered ‘Max’ subscription tier can’t have hurt either.

Moving from ed-tech back to software more broadly, a line of thinking that I tend to agree with is concerned with the increasing value of historical stores of data. Another great Tidal Wave (Arda Capital) post details some thoughts on how this dynamic might play out, particularly with a view to systems of record (Oracle, Workday, Salesforce etc.) and how these companies - if they execute - can leverage their treasure troves of business data gathered over the last decade+ to provide meaningful additional value to their customers using AI models. This often unused data, some of it potentially in storage for years, has a new purpose and is yet another section of moat for startups to have to bridge.

The question of where value may accrue was well put in this recent Economist piece:

[Clayton] Christensen’s insights make clear that revolutionary innovation does not always end up being revolutionary in business terms. Yet existing tech firms are now spending enormous sums on AI, suggesting they should be well-placed if the tech does turn out to revolutionise business.

As things stand, it looks more likely that the market value of the technology will end up as a new string to the bow of already giant tech firms.

Helpful Resources

Ethan Mollick is the go-to guy for anything chatbot-related - his Substack is full of creative and practical tips, plus updates on the latest in conversational AI.
Again, Arda Capital - fascinating, well-written posts on software strategy, often as it relates to LLMs.