Comparative Analysis of AI Chatbots: ChatGPT, Google Gemini, and Meta AI

GOOVER DAILY REPORT September 13, 2024

Summary
Subscription Offerings and Features
Performance Comparison Across Tasks
Task-Specific Evaluation
Overall Performance and Reliability
Conclusion

1. Summary

This report compares three leading AI chatbots: OpenAI's ChatGPT, Google's Gemini, and Meta AI. It reviews subscription offerings, task performance, and specialized features to highlight each chatbot's strengths and weaknesses. Key findings are that Google Gemini offers significant subscription perks, including cloud storage and planned integrations, while Meta AI excels in technical tasks such as math problem-solving and programming. ChatGPT is noted for its clarity in natural language understanding and reasoning. Because these tools can produce misinformation, the report stresses the importance of checking the reliability and accuracy of their outputs.

2. Subscription Offerings and Features

2-1. Overview of Subscription Plans

Both Google and OpenAI offer their AI chatbots as subscription products, each priced at $20 per month. Google launched its Gemini Advanced subscription in February 2024, offering access to its best AI model, Gemini Ultra 1.0. OpenAI offers its GPT-4-powered ChatGPT Plus at the same price. Microsoft also provides subscription access to its AI tool, Copilot Pro, for $20 per month.

2-2. Advanced Features and Benefits

Gemini Advanced provides significant perks beyond access to the chatbot itself, including 2 terabytes of cloud storage as part of the Google One subscription included in the plan. Users are also expected to benefit from a future integration of Gemini into Gmail and Docs. In contrast, while ChatGPT Plus does not offer additional perks like storage, it provides a unique feature known as the GPT store, which enables users to build and share custom versions of ChatGPT tailored for various applications.

2-3. User Caution and Misleading Information

The report cautions users against blindly trusting the outputs of AI chatbots, noting instances where the tools have provided inaccurate information. An example highlighted was ChatGPT mislabeling a daily multivitamin as a prescription pill for erectile dysfunction, underscoring the potential risks of relying on chatbot outputs without verification. Users are advised to critically evaluate the information provided by these AI tools.

3. Performance Comparison Across Tasks

3-1. Coding Proficiency

In the test of coding proficiency, both ChatGPT and Google Gemini were prompted to develop a Python script that serves as a personal expense tracker. Both models successfully created fully functional expense trackers. However, Gemini exceeded expectations by adding extra functionality, such as labels within categories and more granular reporting options. Therefore, the winner of this category was Gemini.
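For readers unfamiliar with the task, the following is a minimal sketch of what such an expense tracker might look like. It is not the output of either chatbot; the class and method names (`Expense`, `ExpenseTracker`, `by_category`, `by_label`) are hypothetical and chosen only to illustrate categories, labels within categories, and per-category reporting.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Expense:
    amount: float
    category: str
    label: str = ""  # optional sub-label within a category (assumed design)
    day: date = field(default_factory=date.today)


class ExpenseTracker:
    """Hypothetical in-memory tracker; a real script would also persist data."""

    def __init__(self):
        self.expenses = []

    def add(self, amount, category, label="", day=None):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.expenses.append(Expense(amount, category, label, day or date.today()))

    def total(self):
        return round(sum(e.amount for e in self.expenses), 2)

    def by_category(self):
        # Sum of all expenses, grouped by category.
        totals = defaultdict(float)
        for e in self.expenses:
            totals[e.category] += e.amount
        return {name: round(value, 2) for name, value in totals.items()}

    def by_label(self, category):
        # More granular report: totals per label inside one category.
        totals = defaultdict(float)
        for e in self.expenses:
            if e.category == category:
                totals[e.label or "(unlabeled)"] += e.amount
        return {name: round(value, 2) for name, value in totals.items()}


tracker = ExpenseTracker()
tracker.add(12.50, "food", "lunch")
tracker.add(30.00, "transport")
tracker.add(7.25, "food", "coffee")
print(tracker.by_category())
print(tracker.by_label("food"))
```

Amounts are stored as floats for brevity; a production tracker would more likely use integer minor units or `decimal.Decimal` to avoid rounding drift.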

3-2. Natural Language Understanding

The test for natural language understanding used a Cognitive Reflection Test (CRT) question: "A bat and a ball cost £1.10 in total. The bat costs £1.00 more than the ball. How much does the ball cost?" Both chatbots gave the correct answer: the ball costs 5p and the bat costs £1.05 (the intuitive but wrong answer is 10p). However, ChatGPT showed its working more clearly, making it the winner in this category.
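The arithmetic behind the answer can be sketched in a few lines. This is an illustrative check rather than either chatbot's working; the function name `solve_bat_and_ball` is hypothetical, and prices are handled in whole pence to avoid floating-point rounding.

```python
def solve_bat_and_ball(total_pence, difference_pence):
    """Return (ball, bat) prices in pence.

    ball + bat = total and bat = ball + difference,
    so 2 * ball = total - difference.
    """
    remainder = total_pence - difference_pence
    if remainder < 0 or remainder % 2:
        raise ValueError("no whole-pence solution")
    ball = remainder // 2
    bat = ball + difference_pence
    return ball, bat


ball, bat = solve_bat_and_ball(110, 100)
print(f"ball = {ball}p, bat = £{bat / 100:.2f}")  # ball = 5p, bat = £1.05
```

The common wrong answer of 10p fails the second condition: a 10p ball would make the bat £1.10 and the total £1.20.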

3-3. Creative Text Generation

The creative text generation task required the models to write a short story based in a futuristic city where technology controls every aspect of life. Both models delivered commendable stories, but Gemini adhered better to the specified rubric and produced a more compelling narrative, hence it was declared the winner of this category.

3-4. Reasoning and Ethical Decision-Making

In the reasoning test, both chatbots were presented with a classic logic puzzle involving two doors guarded by two guards, one who always tells the truth and one who always lies. The correct question to ask either guard is, "Which door would the other guard say leads to danger?" Whichever guard answers, the door named is the safe one, because the reply always passes through exactly one lie. Both chatbots provided the correct answer along with sound explanations. However, ChatGPT's response was clearer and more detailed, making it the winner in this category.
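The claim that this question works regardless of which guard is asked can be checked by brute force. The sketch below enumerates every combination of safe door and asked guard; the door names and helper functions are hypothetical, and it assumes exactly two doors so that lying about a door means naming the other one.

```python
from itertools import product

DOORS = ("left", "right")


def other(door):
    """Return the door that is not `door`."""
    return DOORS[1 - DOORS.index(door)]


def answer(asked_is_truthful, other_is_truthful, safe_door):
    """Door named in reply to:
    'Which door would the other guard say leads to danger?'
    """
    danger_door = other(safe_door)
    # What the other guard would say if asked directly about danger.
    other_would_say = danger_door if other_is_truthful else other(danger_door)
    # The asked guard either reports that faithfully or inverts it.
    return other_would_say if asked_is_truthful else other(other_would_say)


def check_all_cases():
    """Return one boolean per scenario: does the reply name the safe door?"""
    results = []
    for safe_door, asked_is_truthful in product(DOORS, (True, False)):
        named = answer(asked_is_truthful, not asked_is_truthful, safe_door)
        results.append(named == safe_door)
    return results


print(all(check_all_cases()))  # True
```

In all four scenarios the named door is the safe one, which is why the asker never needs to know which guard is the liar.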

4. Task-Specific Evaluation

4-1. Email Composition

Each AI chatbot (Meta AI, ChatGPT, and Google Gemini) was tasked with generating an email requesting a project extension. All three produced well-structured emails that addressed the prompt while maintaining a polite and professional tone. All three received perfect marks, showing that they meet basic user needs for this kind of task.

4-2. Recipe Generation

In a task where each chatbot was asked to provide a recipe for chili, all three—Meta AI, ChatGPT, and Google Gemini—offered detailed and accurate recipes. However, a critical difference was noted in their sourcing methods. Both Meta AI and Google Gemini credited the original sources at the bottom of their recipes, providing links to the websites used. Conversely, ChatGPT failed to provide any sourcing, which raises concerns about potential plagiarism and reliability in culinary instructions. Therefore, for food-related tasks, Meta AI and Google Gemini are recommended due to their transparent sourcing.

4-3. Math Problem Solving

Each chatbot was tested with two sets of math problems. In the first, they addressed a problem involving the expression A³ + B³ + C³ − 3ABC. All chatbots successfully provided correct answers using different methods. The second problem, related to geometry, posed challenges for all three. While ChatGPT and Gemini did not provide definitive numerical answers, Meta AI excelled by delivering an accurate solution. Hence, for math problem-solving tasks, Meta AI is recommended as the most reliable option.
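The report does not state the exact problem posed, so the following is offered only as background: problems built on this expression usually rely on its standard factorization, shown here with lowercase symbols.

```latex
a^3 + b^3 + c^3 - 3abc = (a + b + c)\,(a^2 + b^2 + c^2 - ab - bc - ca)
```

One common consequence is that when a + b + c = 0, the right-hand side vanishes and a³ + b³ + c³ = 3abc. Whether the tested problem used this route is an assumption.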

4-4. Programming

For a programming task to create a complex tic-tac-toe game in HTML and JavaScript, both Meta AI and ChatGPT delivered complete code in the requested languages. Google Gemini, however, mistakenly provided CSS instead of HTML, which detracts from its effectiveness. Therefore, if coding accuracy is a priority, users should consider either Meta AI or ChatGPT.
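The core logic of such a game is small and independent of the presentation layer. The sketch below shows one way to detect a winner; it is written in Python purely for illustration (the task itself asked for HTML and JavaScript), it is not any chatbot's output, and the names `WIN_LINES` and `winner` are hypothetical.

```python
# Board is a list of 9 cells, indexed row by row: "X", "O", or " " for empty.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winner(board):
    """Return 'X' or 'O' if a line is complete, 'draw' if full, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return "draw" if " " not in board else None


print(winner(list("XXXOO    ")))  # X
```

A browser version would pair the same check with HTML for the grid and JavaScript click handlers for moves.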

4-5. Mock Interviews

In a mock interview task for a role as a computing staff writer, all chatbots were able to simulate an interview scenario by generating relevant questions and answers. Each chatbot approached the task with distinct styles but ultimately produced satisfactory results that could serve as a solid foundation for preparing for real interviews.

5. Overall Performance and Reliability

5-1. Comparative Analysis and Conclusion

An evaluation of the three AI chatbots (ChatGPT, Google Gemini, and Meta AI) shows varying performance across use cases. In the head-to-head tests against ChatGPT, Gemini won five of nine evaluations, indicating strong overall performance in general utility. The two chatbots were otherwise similar in capability. Because the free-tier versions were evaluated, future tests of the paid tiers could provide insight into premium functionality.

5-2. Strengths and Weaknesses of Each Chatbot

Each AI chatbot has distinct strengths and weaknesses. Meta AI performs reliably across tasks, especially programming and technical inquiries, making it the most dependable option. ChatGPT stands out for its clarity in reasoning and natural language understanding, making it well suited to interactive dialogue and tasks requiring nuanced understanding. Google Gemini has shown itself to be creative and adept at ethical reasoning but struggles with technical tasks such as complex math problems.

5-3. Implications for AI Capabilities and Applications

The competition among the three chatbots demonstrates the rapid evolution of AI capabilities. Analyzing their performance indicates that while they are becoming increasingly adept at handling a range of user queries, each has unique applications based on their strengths. Meta AI's reliability can be leveraged in technical and academic environments, ChatGPT's conversational abilities can enhance user interactions, and Gemini's creative strengths may be best utilized in creative fields.

6. Conclusion

In conclusion, each AI chatbot—ChatGPT, Google Gemini, and Meta AI—demonstrates unique strengths suitable for different applications. Meta AI shines in technical and academic environments due to its reliable performance in math problem-solving and programming tasks. ChatGPT's strengths in natural language understanding and reasoning make it ideal for interactive dialogue and nuanced tasks. Google Gemini proves adept in creative text generation and ethical reasoning but struggles with some technical tasks. The evolving capabilities of these AI tools highlight their potential utility across various domains, emphasizing the need for users to select a chatbot that aligns with their specific requirements. Future prospects suggest continued advancements and specialized applications for each AI, encouraging users to stay informed about their development and capabilities. To mitigate limitations, users should critically assess and verify the information provided by these chatbots.

7. Glossary

7-1. ChatGPT [Technology]

Developed by OpenAI, ChatGPT excels in natural language understanding and reasoning clarity. It is often considered for tasks requiring detailed and clear communication. The chatbot is available through subscription plans, including a premium version, ChatGPT Plus.

7-2. Google Gemini [Technology]

Google Gemini is an AI chatbot offering strengths in coding proficiency and creative text generation. It includes additional benefits like cloud storage through Google One. The bot is noted for its nuanced approach to ethical decision-making, though it struggles with some technical operations like math problem solving.

7-3. Meta AI [Technology]

Meta AI, developed by Meta (formerly Facebook), outperforms other chatbots in technical tasks such as math problem solving and programming. It is regarded as the most reliable chatbot among those compared, showcasing consistent performance across various applications.

7-4. GPT store [Feature]

The GPT store is a feature of ChatGPT Plus that lets users build, share, and access custom versions of ChatGPT. It extends the usability and customization of ChatGPT for specific tasks.

8. Source Documents

ChatGPT vs. Gemini: Which AI Chatbot Subscription Is Right for You? https://www.wired.com/story/chatgpt-vs-gemini-ai-chatbot-comparison/
I tested Google Gemini vs OpenAI ChatGPT in a 9-round face-off — here’s the winner https://www.tomsguide.com/ai/google-gemini-vs-openai-chatgpt
Meta AI vs ChatGPT vs Google Gemini: we tell you which chatbot is the best https://www.techradar.com/computing/artificial-intelligence/meta-ai-vs-chatgpt-vs-google-gemini-we-tell-you-which-chatbot-is-the-best
