First, let’s define generative AI: supervised and unsupervised algorithms that create new content using existing content. Like any creative endeavor, generative AI does not make something entirely out of nothing. As intelligent beings, we all stand on the shoulders of giants and our creative capacity is ever-expanding thanks to people who came before us and created stuff that is part of our mental catalog. The same is true for generative AI - it’s all generated from a corpus of training data and knowledge. It starts somewhere. But what exactly the term “content” means is a critical piece of the puzzle.
Of course, when I first think of content, my initial reaction is very internet heavy - like the content I read on social media, newsletters, blogs, etc. But I think generative AI is more robust than being able to write quick posts on LinkedIn or creating captions on Instagram. The book is currently being written as to what exactly that means, but there are some use cases that stick out presently.
Code Generation: there are a lot of tools that are being created that automatically create code based on a variety of parameters. Some of them are aiming to create software and applications based strictly on simple product designs, while others are helping developers clean up code while it is actively being written. Other tools are predictive and try to auto-fill lines of code for a developer based on what they are already writing. Many products are frequently checking code and suggesting edits based on commonly used practices.
All of these tools should, in theory, help alleviate some of the costs of development and allow for more efficient and effective developers. They will also remove some of the tedium involved in any development project and can act as a spell-check for the software-development industry. This should make shipping products easier, building technology simpler, and the pace of innovation faster.
Text Generation: there are millions of ways that text generation can impact the world, but basically any kind of text can be automated and personalized based on some basic inputs. There are dozens of LLMs that are standing up chatbots, search tools, and improved models to create extremely complex text based on very simple prompts. If you want to learn more about what’s going on in the text space, I wrote about it last week here.
Many of the early text-based AI companies are generating improved marketing tools like automated ads, targeted outreach to customers, SEO-enhanced blog posts, etc. Several are creating more robust products, like automated landing pages for specific clients or products like Jasper, which is a comprehensive suite for any content creation exercise. Chatbots and customer service products are getting better at handling even more complicated questions and concerns from consumers. Training manuals are being summarized and distilled in a manner that makes them easier to read and denser texts are becoming more digestible. Most of the text-generation tools in the space are focused on efficiency and improved content. While they could in theory just write all-new content from scratch, the technology doesn’t seem to be completely there yet.
Image Generation: there have also been tons of developments in this space recently, specifically around Stable Diffusion and GPT-3’s technology. We now have the ability to write up an idea and automatically get fairly high-quality images (like the one up top). This allows for a number of uses, like creating digital avatars, generating product mockups, and creating digital assets for video games. Really, generating creative images for any number of purposes is now fully in scope and getting more and more sophisticated.
This technology also enables improved photo editing technology such as removing objects from a photo, changing the lighting in a photo automatically, or expanding a photo outside of the lens of the camera based on context clues. It also can help create professional grade presentations to help reduce the time some folks spend in PowerPoint and/or the need for a deck designer altogether. Designs for architects, mockups for product designs, and 3D renderings for engineers are all in scope and products are being built around these ideas. Most of these images could be created today by non-AI technology, but in significantly more laborious and costly processes.
Music Generation: there are several companies today that are attempting to create automated music based on simple text prompts or existing musical preferences. These could be used in the background of a video or for music professionals to test out ideas and expand their creative boundaries. It’s still relatively early in this segment of the AI playground, but if text and sounds can work well today, it’s only a matter of time before we have a chart-topping AI-inspired hit.
Video Generation: much like image generation, there are several tools that enhance video editing by automatically updating certain features within films and bringing drag-and-drop simplicity to things like tracking an object through a shot. Videos can also be created strictly from basic inputs like text commands, allowing creators to better storyboard ideas or marketing agencies to more easily create video marketing content.
Speech Generation: several startups are creating speech technology that translates text to speech, but in a way that actually sounds authentic, to a certain extent. This can be used for voice-over work on marketing content, dubbing content into new languages, generating professional-grade videos without having to pay a videographer, and much more.
These six categories are really just the tip of the iceberg. The exciting thing about this space is that it inherently opens up the creativity of people in a way that I am sure will result in many use cases that haven’t even been conceived of yet. AI models are improving in an exponential fashion and that means that more and more use cases are likely to pop up over time.
Startups In The Thick of Things
Of course, these use cases aren’t just theoretical. There are a bunch of startups that are working on this technology as of this writing. Below is just a cross section of some of the interesting startups in the space and what they are doing.
Persado - referring to their product as motivation AI, they use their technology to allow companies to speak directly to customers in a way that encourages them to take action. The company uses predictive AI to know what the right words and phrases are to say to a person depending on a specific situation. The most common use case appears to be marketing messages - a client uploads an ad or an ad packet, and Persado determines the words, phrases and ideas that will motivate someone to engage with the ad. But even before the person interacts with the AI, Persado has data sets that can roughly approximate what to expect from the person.
They have raised $30mm.
Lately - is a social media-focused company that turns content (blogs, videos, podcasts, etc.) into social posts on Twitter, LinkedIn, etc. They digest the content and then spit out several social media posts using the customer’s specific voice in order to make it more authentic, while significantly reducing the time marketing teams spend repackaging existing content. The Company also does its best to ensure that the social posts generate higher engagement metrics and has a “post success indicator” to predict if social posts are going to be more engaging.
They have raised $3mm.
ai21 - provides tools to developers and off-the-shelf models for non-technical people to incorporate natural language processing into workflows and products. They have a variety of off-the-shelf products, like Wordtune, which will automatically rewrite something to make it more understandable or coherent, or Wordtune Read, which will ingest large documents and create summaries to make them easier to read. The company’s Jurassic-1 LLM can summarize, create, and improve content, while also interpreting large sets of data.
They have raised $98mm.
Jasper - a Chrome extension and SaaS platform that has been built to provide content creators and copywriters insights on the content that they are writing and the ability to generate more meaningful content. The Company can generate ads on social media, emails based on basic information, and blog posts that help enhance SEO. They also provide a content score to help companies rank better on search and make their content read more effectively for users. They also have templates to make writing easier and a chatbot that can answer questions during the writing process.
They have raised $146mm.
Mutiny - a website optimization tool that helps companies significantly increase conversion when someone goes to their website. Their tool set provides feedback on website copy, creates targeted landing pages for individual accounts, and enhances SEO.
They have raised $71mm.
Mavenoid - a support solution that provides automated customer success support for hardware companies. Makers of products like home appliances, consumer electronics, power equipment, and fitness equipment can develop self-service support programs for their customers to use when they have a question or issue with a product. The support tool will learn from each interaction and figure out the best way to solve recurring problems over time.
They have raised $39mm.
Gretel - creates synthetic data for databases based on smaller, existing data sets. The product allows users to assess how accurate the data could be based on provided inputs. The synthetic data can in turn be used to make larger predictions, understand what is going on with a data-set at a larger scale, and simulate how the data would act over time.
They have raised $67mm.
Latitude - a game developer that allows users to create and interact with games in a more personalized way. The Company utilizes AI to create games that are unique to the user and allow extra personalization - think Dungeons & Dragons where the game generates storylines, characters, etc. for someone based on some simple prompts. They are also developing a game engine for other folks to build games on top of this technology.
They have raised $4.0mm.
Runway - video and photo editing tools that make the editing process simpler and more straightforward. This is like the no-code-ification of the photography business. They allow users to modify existing images using text, generate original images using text, erase and replace a part of a photo, remove background noise from a video, do simple motion tracking, and collaborate with other members of the film and photography team.
They are also at the forefront of text-to-video technology.
They have raised $95mm.
Synthesia - creates automated, quality videos, avatars, and voiceovers using only text and some drag-and-drop features. For example, a nearly professional-grade tutorial on how to use a product can be created using only sections of a user manual. It can also create avatars and voice models for the user to make it look like the user is actually the one in the video, voicing the content created by their product. It is largely targeting B2B use cases, such as customer onboarding, training videos, and product-marketing videos.
They have raised $66mm.
beautiful - helps teams develop attractive, well-designed presentations. They are using AI to try and disrupt the PowerPoint industrial complex. It requires a user to upload notes, outlines, data, and other inputs and will design a deck automatically around that. Edits can be made by the user and the AI will automatically update the design based on those edits.
They have raised $16mm.
uizard - is a tool for product managers and companies without product managers to quickly design apps, wireframes, mockups, and other UI-focused outputs. It requires users to render a project in low fidelity (think back-of-napkin drawings of what something should look like) and automatically converts it into a professional-grade wireframe or mockup. Once the initial mockup is created, a user can toggle through themes and make automatic adjustments to the mockups over time as they see fit.
They have raised $18mm.
Stability.ai - is one of the artificial intelligence platforms that is commercializing various open-source projects to utilize in a variety of settings. They have tools that allow for text to image, image-to-image, and a variety of other uses. I wasn’t going to include them originally on this list because some of the big players like them and OpenAI are less focused on use case and more so on the underlying technology, but figured it was worth reviewing a company like them in brief.
They have raised $98mm.
Replit - a tool used by developers to start instantly working on code-based projects. The AI angle is that they have a tool called Ghostwriter which sort of acts like an auto-fill for coders. It can start entering in code as you are writing it to increase the efficiency and efficacy of the project.
They have raised $100mm.
Debuild - is a no-code app builder that uses text commands to generate web applications. This allows anybody with an idea for an app to create one without having to understand any coding language or even complicated back-end infrastructure. They currently only have a waitlist, but the potential for the product is incredible.
They have raised an undisclosed amount. This technology might be more theoretical than literal.
Anima App - allows users to generate working prototypes of software and apps using only wireframes and mockups and some directions. Using those designs, the company will generate “developer-friendly” code that the development team can then use to fully flesh out the product. It doesn’t create a finished version yet.
They have raised $12mm.
MURF AI - creates professional grade voiceovers using a script and timing-based tools within an app. It also allows for turning home-produced recordings into professional-sounding content. It is targeting creators at the moment, but could be used in a variety of settings.
They have raised $11mm.
Soundraw - based on a handful of inputs from the user, the Company generates personalized music. Users can toggle mood, length, genre, themes, and story to make the songs they have in their heads already, as well as stuff they weren’t expecting. They are targeting folks who don’t want to deal with copyright issues and are looking for background music for videos and other content.
They have raised $2mm.
Papercup - provides video creators with automated dubbing technology, using just existing dialogue. Using their technology, you can create dubs that are in the same voice (or at least sort-of) as the original voice actor of the dialogue. This can be used in the film & entertainment industry, but also for eLearning companies who want to teach content in various languages, enterprise companies trying to communicate to a larger set of people, and marketers looking to speak with people outside of their native language on the cheap.
They have raised $33mm.
Balto - is a company BUILT IN THE MIDWEST!!! They provide software for the customer success and sales industries that supplies automated answers to questions customers ask over the phone, which an agent can then relay verbally, while also updating in real time as a customer responds. After the call, Balto’s technology analyzes what works and provides feedback to agents and managers on how each call went.
They have raised $51mm.