Everybody is talking about machine learning and big data these days, but what does it mean in reality, and how do geoscientists embrace it in their daily work? To learn a bit more about that, I caught up with Tom Marsh and Elias Ortiz from Rock Flow Dynamics in their Aberdeen office. Both of them work with and develop reservoir modelling software, a discipline that lends itself well to coding and machine learning applications.
The first thing that Elias told me immediately reminded me of something I had heard before. “The subsurface geoscience sector is not a big data environment. That’s where the social media platforms operate,” he says, kicking off the conversation.
“Instead,” he continues, “our geoscience niche is rather an environment where there is a lack of data, so machine learning algorithms taken from the big data realms are not necessarily useful in our case. In addition, we work with spatial and stratigraphic data, which is more complex in some ways.”
“No generic data analysis code will be developed with superposition in mind,” adds Tom. “For us, it makes sense, but it is important to be mindful of that when using off-the-shelf solutions.”
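To make Tom's point concrete, here is a minimal sketch of the kind of check a generic data analysis library will not run for you. It assumes an undisturbed sequence (no overturned beds or thrust repeats), with depths in metres, positive downwards, listed from youngest to oldest horizon at a single location. The function name and the depth values are hypothetical and for illustration only; this is not part of any Rock Flow Dynamics product.

```python
def superposition_violations(horizon_depths):
    """Return index pairs (i, i + 1) where a younger horizon sits deeper
    than the older horizon that should lie beneath it.

    horizon_depths: depths in metres (positive downwards), ordered from
    youngest to oldest at one location. Assumes an undisturbed sequence.
    """
    violations = []
    for i in range(len(horizon_depths) - 1):
        if horizon_depths[i] > horizon_depths[i + 1]:
            violations.append((i, i + 1))
    return violations


# Hypothetical depths, as if each horizon were predicted independently
# by an off-the-shelf model that knows nothing about stratigraphic order.
predicted = [1510.0, 1580.0, 1565.0, 1640.0]
print(superposition_violations(predicted))  # [(1, 2)]
```

In this made-up case the second horizon is predicted deeper than the third, which breaks the stratigraphic order. A geologist would spot that at once, but a general-purpose algorithm has no reason to.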
But how does it work in practice?
“While our core software is developed in C++, we offer a native Python extension and API, allowing users to work and develop workflows in Python,” Tom continues. “This is the environment that most developers use these days. Our clients can output the workflows they develop in our software as a Python script, and subsequently add their own twist to it. It’s the new way of working when it comes to operating software, and it offers users the flexibility they need and increasingly demand.”
But that means users will need to have some Python skills too.
“Python is not as scary as you think,” says Elias. “Admittedly, I started doing elements of coding when I attended university, and I have a natural interest in it, but the time of learning all the commands one by one is surely over. In the Python environment, notebooks allow you to quickly test something for which code already exists. The tool I use a lot is called Jupyter Notebook, and it hosts a wide range of Python applications and libraries for interactive development. In addition, ChatGPT will spit out some code for you when you ask.”
“Are you afraid that all this will replace the geologist at some point soon?” I asked at the end of the conversation. “No, I’m not,” says Tom. “The modules we develop all require an element of expert supervision to verify what the machine comes up with. I would rather say that we can do more and we can do it faster, but the geologist will always be an essential part of the process.”