AI chatbots have become ubiquitous over the last year. For example, the basic version of ChatGPT is a conversational chatbot that takes natural language input and generates highly coherent text responses. Newer multimodal AI models like Google’s Gemini go further, working with more than one type of input, such as text and images.
What distinguishes these two kinds of artificial intelligence? How might multimodal systems extend what machine learning can do? And how might novel implementations that use multiple modalities secure patent protection?