Navigating AI in Ocean Science: Balancing Innovation and Accuracy
SWMS Clerk, Megan Howson, shares her experiences using AI as a coding tool in her work!
The debate around AI has permeated nearly every discipline. There have been House committee panels, LinkedIn posts, and breakout sessions at conferences. While many of these conversations center on ethics, education, and the role AI will play in art and intellectual property, AI also has a powerful role in the ocean sciences. So how can we use AI to enhance our science rather than hurt it, particularly in an age when there is already distrust of both science and AI?
First and foremost, I think AI is a wonderful tool, one I use daily both for work and in my personal life (if you have never given ChatGPT a list of ingredients at the end of a week, when you have a bunch of odds and ends and no idea what to cook, I highly recommend it). But it needs to be viewed as a tool, not as something that has all the answers.
While I currently bounce around between different models depending on what I am working on, I primarily use AI for assistance in coding. The key word there is assistance. The best example I can give is the day I asked the AI model I was using to write code to analyze and aggregate vessel sound profiles. The code it provided ran without any problems. The results that came out of that same code? Wildly inaccurate. I do not give this example to suggest you should not use AI, but rather to show that it is a tool that requires the user to have both expertise in the subject matter and experience. I needed the expertise not only to understand what I needed to ask, but also to look at the data and have an idea of what the results should be. I still use AI assistance with my coding. For example, asking a model to help spot check saves me HOURS of troubleshooting, especially when the errors are small, such as an extra space in a line of code. But I still need to know enough about what I am trying to do that AI can assist me, rather than be the only way of accomplishing the task.
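To make that concrete, here is a minimal sketch of the kind of script I might ask a model to spot check. This is not my actual analysis code; the file name and column names are made up for illustration, and it assumes a simple CSV of per-vessel sound levels.

```python
# Minimal sketch (hypothetical data) of aggregating vessel sound profiles.
# Assumes a CSV with columns "vessel_id" and "sound_level_db".
import pandas as pd

profiles = pd.read_csv("vessel_sound_profiles.csv")  # hypothetical input file

# The kind of bug a spot check catches: a stray space inside a quoted
# column name ("vessel_id ") looks identical on screen but raises a KeyError.
summary = (
    profiles.groupby("vessel_id")["sound_level_db"]
    .agg(["mean", "max", "count"])
    .reset_index()
)

print(summary.head())
```

And code that runs, like this sketch, can still be wrong: if the aggregation does not match the question you are actually asking, only subject-matter knowledge will tell you the numbers are off.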
Coding is a fairly niche task, and not something everyone needs to do. While writing this blog post, I started thinking about other ways AI can be tested for ocean science use. How would it respond to ocean science questions? As someone who uses multiple AI models, I also wondered how they might perform against each other. Like any true scientist, I tested it out.
I admit, I purposely selected a trickier question. As a marine mammal biologist who works in the Gulf, much of my research is focused on Rice's whale, a relatively new species only defined in 2021. Because we know so little about it, information is rapidly changing as we race to learn more, and rapidly changing information is not something AI is known to handle well. As part of my mini experiment, I asked three different well-known AI models, "What is a Rice's whale?" They all started out strong and could tell me that it is an endangered species, found in the Gulf, and was only discovered in 2021 (I could argue discovered vs. defined, but close enough). ChatGPT said that Rice's whales are 10-12 m (33-39 ft), Claude provided a length of 40-45 ft, and Gemini had to be specifically asked for a length, answering with 41.5 ft (12.65 m). While these answers are close, if three clueless people asked this question, they would each walk away thinking that Rice's whales are a different size. And to us tiny humans, a difference of 8 ft is quite a lot.

Again, this is a bit of an unfair question. We know so little about this species that, with only a handful of sightings, how accurately can we provide measurements? Officially, however, the NOAA species record for Rice's whales lists them at up to 41 ft. ChatGPT performed the best: 33-39 ft is certainly "up to 41 ft," and it gets bonus points for also listing meters. Claude overestimated slightly with 40-45 ft, but it did include a disclaimer that this is a new species, information is changing, and facts should be verified. Gemini was closest but slightly over at 41.5 ft, though listing a single length rather than a range is bold. No model performed perfectly.
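If you want to sanity check the unit math yourself, here is a quick sketch; the lengths are the ones the models gave me, the 41 ft figure is NOAA's, and the conversion factor is the standard 1 m = 3.28084 ft.

```python
# Compare each model's reported length against NOAA's "up to 41 ft" record.
M_TO_FT = 3.28084

answers_ft = {
    "ChatGPT": (10 * M_TO_FT, 12 * M_TO_FT),  # gave 10-12 m, ~32.8-39.4 ft
    "Claude": (40.0, 45.0),                   # gave a range directly in feet
    "Gemini": (41.5, 41.5),                   # gave a single length, not a range
}

NOAA_MAX_FT = 41.0  # NOAA species record: up to 41 ft

for model, (low, high) in answers_ft.items():
    verdict = "within" if high <= NOAA_MAX_FT else "exceeds"
    print(f"{model}: {low:.1f}-{high:.1f} ft ({verdict} the 41 ft record)")
```

Running this reproduces the comparison above: ChatGPT's range stays under the record, while Claude and Gemini both edge past it.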
The current climate surrounding science is clear: there is a lack of understanding of the scientific process, of the fact that as we learn more, our answers will change. This does not leave us much room for error. Many of the questions surrounding AI are fair: when it has such a high margin for error, how can we use it safely and accurately? What are the ethics of when it is okay to use an AI model and when it is not? While I most definitely do not have the answer to any of these questions, I think it is important to keep in mind that the power behind AI lies in the hands of the user. It is a powerful tool that we can use to further our science and to reach conclusions more quickly (every biologist who has spent hours combing through PhotoID pictures greatly appreciates when we can use machine learning to make identifications). But we need to use our knowledge and expertise to verify the answers or work done by these AI models. At the end of the day, the tool is only as good as the user, and the first rule of science is to always verify, verify, verify.