Navigating the artificial intelligence landscape as a user and a student
In this article, two Varon Consulting interns, Peter and Jennifer, share their thoughts about AI and their personal experiences with it.

Peter – I didn’t understand the excitement around large language model (LLM) AI when OpenAI’s GPT and Meta’s Llama models were first released. I consider myself technology savvy, but I wasn’t seeing the added value at the time. Several months after the GPT-3.5 release, I still questioned whether the technology was ripe enough for mass adoption. Even though a few people were experimenting with AI in their workflows, I remained uncertain.
In late 2023, a YouTube video about Google’s Gemini AI project appeared in my recommendations, and I watched it to quench my curiosity. I was genuinely impressed by the level of cognition and cohesiveness that AI models had achieved in just a few years of development. In May 2024, I decided to try the models out for myself. The driving reason was personal curiosity: I wanted to see how far the current research had progressed, and whether this new technology could help me build new skills for my career.
Since then, I have been using AI as a briefing tool to distill articles, and having conversations with models to learn about new concepts. I started with Mixtral 8x7B, my favorite model at the time, developed by the French AI startup Mistral AI. It is lightweight, versatile, and well suited to my personal use. AI development has changed quite a lot since then, and I’ve tested and jumped across many great models like WizardLM 2, Llama 3, Gemma 2, Phi 3, InternLM 2.5, DeepSeek Coder V2, and so on. Each model has its own strengths and weaknesses for a particular use case, but overall they are quite impressive. I am still excited to try the recently announced models like Claude 3.5, Mistral NeMo, and Llama 3.1.
After testing many different models, my personal advice is to be cautious with generated information and always fact-check the output. Even though the technology is moving forward at an unprecedented rate, today’s AI models do not reason as comprehensively as humans. They also tend not to retain context after more than a few exchanges within a single conversation.
Google’s AI search told users to put glue on pizza and eat rocks
That article is from a while back, but it reflects my concerns with this technology. For this reason, I tend to choose models with specific benchmark performance in mind, mainly IFEval, BBH, and GPQA scores, which suit my use case as an alternative search/answering tool. Previously I used TruthfulQA and ARC scores to select models. Learn about the benchmarks here. I suggest actually trying many different models and seeing whether they perform well for your specific needs. Some models are stronger in particular areas, and scores might not truly reflect real-world performance. One issue with relying on benchmarks is that many models are trained on datasets derived from human-preference evaluations such as LMSYS. Some models are also trained to score high on paper benchmarks, such as Microsoft’s Phi models.
Another downside of tuning models to our preferences and social guidelines is that they will try their best to give a pleasing answer, even if the information is non-factual. Model capabilities are also limited by users’ preferences and the tasks used for evaluation. Add built-in moderation on top of that, and we risk narrowing the range of available information. When using LLMs for information gathering, some models might outright refuse to answer or give fictitious information around a fact, because model creators impose restrictions on what they deem an appropriate representation of their organizations. Phi and Claude models come to mind. Some level of prompt engineering might produce a decent result, but it is not guaranteed, and even then the output would still be lackluster compared to models without strict alignment. There are also issues when the user asks the AI for sources: it may point to non-existent links and books. Therefore, not all generated information can be taken at face value. If you have books or PDFs that you would like to use as references, you can embed them and retrieve answers using RAG (retrieval-augmented generation). The output is more accurate that way than when the AI generates answers from its pool of training data alone.
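To make the RAG idea concrete, here is a minimal sketch: embed your document chunks, find the chunk closest to the question, and hand only that chunk to the model as context. The embedding below is a toy bag-of-words vector purely for illustration; real pipelines use a neural embedding model, and `ask_llm` is a hypothetical stand-in for whatever chat model you call.

```python
# Minimal RAG sketch: retrieve the best-matching chunk, then prompt with it.
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks: list[str]) -> str:
    """Return the chunk most similar to the question."""
    q = embed(question)
    return max(chunks, key=lambda c: cosine(q, embed(c)))

# Chunks would normally come from splitting your PDFs or books.
chunks = [
    "Mixtral 8x7B is a sparse mixture-of-experts model from Mistral AI.",
    "RAG grounds a model's answer in retrieved reference text.",
    "Benchmarks like IFEval and BBH measure instruction following.",
]

context = retrieve("What does RAG do for a model's answers?", chunks)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: ..."
# ask_llm(prompt)  # hypothetical call to your chosen model
print(context)
```

Because the answer is grounded in the retrieved text rather than the model’s training data, it is much easier to check against the original source.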
Another thing I learned is that we can supplement AI with online search on platforms like Perplexity AI. It is a powerful complement to traditional engines, feeding real-time information to the AI. If I had to use AI for my research, I would use something like this for assistance. Search engines are also starting to provide AI search and AI chat functions, which make searching more compelling. DuckDuckGo, for example, offers GPT-3.5, Claude 3 Haiku, Llama 3 70B, and Mixtral 8x7B, so we can be a little creative with our information search for research projects. Overall, it is exciting to use AI, and learning the technology, with its workarounds and troubleshooting, can be a fun experience. As a personal project, I’d like to build a platform with vision language models (VLMs), LLMs, audio LMs, and web search capabilities as an ultimate research tool. I haven’t actually used AI for educational purposes yet, but I might integrate some of it in the future to help my research. I believe GPT-4o and Claude 3.5 offer some of the mentioned capabilities built into their platforms, and I encourage you to try the features if you are interested. For my other AI and education discussion, you can read the “AI learning: Can it teach humanity?” section in the newsletter.
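The search-augmented pattern behind tools like Perplexity can be sketched in a few lines: run a web query first, then let the model answer from the fresh results. Both `web_search` and `ask_llm` below are hypothetical stubs, not real APIs; in practice you would wire in an actual search API and a model endpoint.

```python
# Sketch of search-augmented answering: fetch snippets, then prompt with them.
def web_search(query: str) -> list[str]:
    """Stub: pretend these snippets came back from a live search API."""
    return [
        "Llama 3.1 was announced in July 2024.",
        "Mistral NeMo is a 12B model built with NVIDIA.",
    ]

def ask_llm(prompt: str) -> str:
    """Stub: a real implementation would call a chat model here."""
    return f"(model answer grounded in {prompt.count('- ')} snippets)"

def answer_with_search(question: str) -> str:
    snippets = web_search(question)
    # Put the retrieved snippets into the prompt so the model can use
    # current information instead of relying on stale training data.
    sources = "\n".join(f"- {s}" for s in snippets)
    prompt = f"Using only these results:\n{sources}\n\nQuestion: {question}"
    return ask_llm(prompt)

print(answer_with_search("What new models came out recently?"))
```

The design choice that matters is simply ordering: search happens before generation, so the model’s answer can reference events newer than its training cutoff.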
Jennifer – For the past few years, artificial intelligence (AI) has been a spectacle for humanity, with the emergence of various innovative AI models from Microsoft’s very own Copilot to Google’s Gemini, but none has had the same impact as the release of OpenAI’s ChatGPT back in November 2022. ChatGPT is a chatbot and virtual assistant developed by OpenAI, an AI company founded in 2015. As a form of generative AI, ChatGPT uses natural language processing to create humanlike conversational dialogue in response to user-entered prompts. While ChatGPT isn’t the first AI model ever produced, it revolutionized the way people work, from automating mundane tasks to optimizing workflows, with an accuracy and precision that give it a competitive advantage over humans wherever human error is prevalent. At the same time, AI isn’t as advanced as many would think. I am wary of its widespread use and integration, from math learning programs like Google’s Photomath, which solves mathematical problems when the user simply takes a photo for the app to scan and then provides step-by-step solutions, to Grammarly, which implemented AI to improve its accuracy in catching grammatical errors and to provide robust feedback that strengthens one’s writing. As a Brooklyn College student, nowhere is the impact of AI in the classroom more evident to me than in the emphasis on plagiarism and the ban on using AI in place of original content handcrafted by students. At Brooklyn College, all English Composition I students are required to read Ted Chiang’s “ChatGPT Is a Blurry JPEG of the Web” in an effort to dissuade students from relying heavily on ChatGPT to get through their written assignments.
While many may paint the use of AI in a bad light, there is value in using it to your advantage. A writer experiencing writer’s block, struggling to put thought to pen, can have ChatGPT generate great sentence starters to encourage you to write that essay you’ve been procrastinating on, or even generate ideas to hook the reader.

