Liberating well data with modern data science and AI

The Norwegian Continental Shelf (NCS) Big Data Well Conditioning project illustrates how AI-driven workflows transform subsurface analysis. As shown in Figures A through E, the process begins with large-scale data ingestion and harmonization (A), followed by ML-based enrichment using models trained across diverse geological settings (B and C). This enables prediction of missing logs and key properties such as porosity and water saturation (D). Automated petrophysical analysis, including uncertainty quantification (E), ensures reliability for exploration and Carbon Capture and Storage (CCS) applications. Together, these workflows provide scalable, high-resolution insights that improve data quality, accelerate decision-making, and support strategic initiatives such as carbon storage and field development.

AI-driven well data revolution: Unifying traditional expertise with modern data science

The oil and gas industry is navigating a transformative era fueled by the convergence of geoscience, data science, and artificial intelligence. As the volume and complexity of subsurface data grow exponentially, traditional well data workflows – once rooted in manual QC and interpretation – are becoming insufficient. In response, new methodologies have emerged, leveraging modern data science to transform static, fragmented well datasets into dynamic digital resources. Two key efforts that exemplify this transformation are the “Go Digital for Wells” initiative, using EarthNET, and the Earth Science Analytics large-scale Big Data Well Conditioning project on the NCS.

The NCS Big Data Well Conditioning project is a living example of this workflow in practice. Incorporating over 2,000 wells and 30,000 km of log data, it has produced a unified and consistent dataset for the region. ML models were trained to impute missing logs, resulting in a 5-fold increase in log coverage and enabling downstream prediction of porosity, lithology, and water saturation across the dataset.

Figure 1: Data ingestion and standardization schematic, showing how raw well data from multiple sources is ingested uniformly at scale, harmonized, validated, and connected through the digital data lake.

Integrated workflow: From data to insight (Figure A)

Both workflows begin with a foundational data management phase. In the NCS study, logs, core data, lithology descriptions, and metadata were aggregated from diverse sources and merged into a centralized digital data lake. A harmonization process compared overlapping data from different channels, flagged inconsistencies, and resolved discrepancies to produce a high-quality baseline.
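The harmonization step can be pictured with a small sketch. The article does not describe the actual comparison or resolution rules, so everything here is an assumption for illustration: the function name `harmonize`, the source labels, the tolerance, and the use of the median as a stand-in resolution rule.

```python
from statistics import median

def harmonize(samples, tolerance):
    """Merge overlapping readings of one curve delivered by several sources.

    samples: dict mapping depth -> {source_name: value}
    Returns (merged, flagged): merged maps depth -> median value, and
    flagged lists depths where sources disagree by more than tolerance.
    """
    merged, flagged = {}, []
    for depth in sorted(samples):
        values = list(samples[depth].values())
        merged[depth] = median(values)  # placeholder resolution rule (assumption)
        if max(values) - min(values) > tolerance:
            flagged.append(depth)       # inconsistency to be reviewed
    return merged, flagged

# Hypothetical bulk-density readings (g/cm3) from two delivery channels
rhob = {
    1000.0: {"vendor_las": 2.45, "archive_dlis": 2.45},
    1000.5: {"vendor_las": 2.47, "archive_dlis": 2.31},  # sources disagree
    1001.0: {"archive_dlis": 2.50},                      # single source only
}
merged, flagged = harmonize(rhob, tolerance=0.05)
print(flagged)
```

In a real data lake the flagged depths would more likely be routed to a rule set or a reviewer than resolved by a simple median.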

Multiscale well evaluation and machine learning for data enrichment (Figures B and C)

The ML pipeline uses the curated, high-diversity datasets to train algorithms – including Random Forest, LightGBM, and XGBoost. In the NCS project, more than 350 models were trained across four geological provinces. These models predicted missing logs (e.g., density, sonic, shear sonic) and derived properties, such as lithology, porosity (PHIE), and water saturation (SW). Cross-validation and blind testing ensured model robustness, and output logs were ranked with priority flags to guide usage based on prediction quality.
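One way to read the priority-flag idea is as a gap-filling rule: keep the measured value where it exists, otherwise take the prediction from the highest-ranked model that has one, and record where each sample came from. The sketch below assumes that reading. The flag convention (0 for measured, 1 and 2 for models), the function names, and the sample values are hypothetical and not taken from the project.

```python
def impute(measured, predictions):
    """Fill gaps in a measured log from ranked model predictions.

    measured: list of floats, with None where the log is missing.
    predictions: list of (flag, values) tuples, highest priority first.
    Returns (filled, flags): flag 0 means measured, any other value is the
    flag of the model that supplied the sample, None means still missing.
    """
    filled, flags = [], []
    for i, value in enumerate(measured):
        source = 0 if value is not None else None
        if value is None:
            for flag, values in predictions:
                if values[i] is not None:
                    value, source = values[i], flag
                    break
        filled.append(value)
        flags.append(source)
    return filled, flags

def coverage(log):
    """Fraction of samples that carry a value."""
    return sum(v is not None for v in log) / len(log)

# Hypothetical shear sonic samples; the values are made up for illustration
dts = [None, 210.0, None, None, None]
model_full = (1, [205.0, 211.0, None, 198.0, None])      # needs more input logs
model_reduced = (2, [207.0, 209.0, 201.0, 199.0, None])  # fewer input logs
filled, flags = impute(dts, [model_full, model_reduced])
print(coverage(dts), coverage(filled))
```

Keeping the flag next to every sample lets a user later filter the dataset down to measured values only, or to predictions above a chosen priority.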

Automated interpretation of petrophysical properties (Figure D)

These workflows emphasize automation without compromising geological context. In “Go Digital for Wells,” petrophysical properties are interpreted with AI support, reducing manual workload while maintaining consistency. The NCS project extended this automation to include adaptable pay analytics: users can apply custom porosity and water saturation cutoffs to evaluate reservoir quality, calculate net pay, and generate maps of prospectivity.
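The cutoff-based pay query can be illustrated with a minimal sketch. It assumes a regularly sampled well, porosity and water saturation as fractions, and the common convention that a sample counts as pay when porosity is at or above its cutoff and water saturation is at or below its cutoff. The function name `net_pay`, the cutoff values, and the data are illustrative only.

```python
def net_pay(depths, phie, sw, phie_min, sw_max):
    """Return (net pay thickness, per-sample pay flags).

    Assumes a constant depth step; samples with missing PHIE or SW
    are treated as non-pay.
    """
    step = depths[1] - depths[0]
    pay_flags = [
        p is not None and s is not None and p >= phie_min and s <= sw_max
        for p, s in zip(phie, sw)
    ]
    return step * sum(pay_flags), pay_flags

# Hypothetical interval sampled every 0.5 m
depths = [2000.0, 2000.5, 2001.0, 2001.5, 2002.0]
phie = [0.05, 0.18, 0.22, 0.20, 0.08]
sw = [0.90, 0.40, 0.35, 0.70, 0.95]

thickness, pay = net_pay(depths, phie, sw, phie_min=0.10, sw_max=0.60)
relaxed, _ = net_pay(depths, phie, sw, phie_min=0.10, sw_max=0.75)
print(thickness, relaxed)
```

Re-running the same query with different cutoffs, as in the second call, is what makes the analysis adaptable: the enriched logs stay fixed and only the thresholds change.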

Uncertainty quantification and quality assurance (Figure E)

Every prediction is accompanied by uncertainty estimates, enabling probabilistic analysis. This feature allows users to make informed decisions with transparency on data reliability – crucial for exploration, development, and CCS site evaluation. In the NCS project, a caliper-based QC process flags ‘bad hole’ conditions, further enhancing prediction confidence.
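The article does not say how the uncertainty estimates or the bad-hole flags are computed, so the following is only a sketch under stated assumptions: a sample is flagged when the caliper exceeds the bit size by more than a chosen enlargement, and uncertainty is summarized as P10/P50/P90 across a hypothetical ensemble of model predictions. All names, thresholds, and values are illustrative.

```python
from statistics import median, quantiles

def bad_hole_flags(caliper, bit_size, max_enlargement):
    """Flag samples where the borehole is enlarged beyond the tolerance."""
    return [(c - bit_size) > max_enlargement for c in caliper]

def percentile_bands(ensemble):
    """Summarize each sample's ensemble predictions as (P10, P50, P90).

    ensemble: one list of predictions per depth sample.
    """
    bands = []
    for preds in ensemble:
        cuts = quantiles(preds, n=10, method="inclusive")  # 9 decile cut points
        bands.append((cuts[0], median(preds), cuts[-1]))
    return bands

# Hypothetical caliper readings (inches) in an 8.5 inch hole
caliper = [8.6, 8.7, 11.2, 12.0, 8.5]
flags = bad_hole_flags(caliper, bit_size=8.5, max_enlargement=2.0)

# Hypothetical porosity predictions from five models at two depths
ensemble = [
    [0.18, 0.20, 0.19, 0.22, 0.21],
    [0.08, 0.15, 0.11, 0.05, 0.12],
]
bands = percentile_bands(ensemble)
print(flags, bands)
```

A wide P10 to P90 band or a bad-hole flag at a given depth would both be reasons to treat the prediction there with more caution.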

Figure 2: Image showing the fivefold increase in log coverage for the NCS project.

Real-world applications and results

Figure 3: DTS machine learning model prediction and imputation, built using combinations of logs as features. The imputation priority goes from left to right, while the origin of the log prediction is shown by the ML-flag.

The outcomes of these workflows are both quantitative and strategic:

In the NCS project, the coverage of shear sonic logs increased 5-fold, and interpreted reservoir properties expanded significantly (Figures 3 and 4).

Exploration teams can now identify previously overlooked pay zones using ML-enriched data (Figures 4 and 5).

Field development geoscientists and engineers leverage high-resolution petrophysical models for depth conversion and uncertainty assessment.

This process also adds significant value to CCS screening by leveraging the consistent, high-quality datasets to assess rock properties and injectivity. Earth Science Analytics exemplifies this approach with a comprehensive evaluation of CCUS capacity and risk in the Gulf of Mexico. Integrating seismic data from Geoex MCG, 4,000 wells, and ML-predicted reservoir properties enables data-driven site assessments and informed decisions for offshore carbon storage (Figure 6).

Transformational benefits across the energy sector

  • Scalability: ML models handle thousands of wells across basins, enabling macro-scale geological interpretations.
  • Data accessibility: Cloud-based digital lakes break down data silos, promoting cross-functional collaboration.
  • Improved interpretation quality: AI models offer consistent predictions and reduce subjectivity.
  • Efficiency gains: Automated QC and interpretation reduce time-to-insight, enhancing project delivery.
  • Informed decision-making: Uncertainty quantification enables risk-aware planning and strategy.

Figure 4: Adaptable pay analytics by querying the data with a cutoff on water saturation and porosity.

Paving the way for the energy transition

Although these workflows were originally designed for oil and gas development, they are proving crucial in the evolving energy landscape. In CCS applications, consistent and enriched well data support site characterization, reservoir modelling, and injectivity prediction. The same tools are adaptable to geothermal energy, unconventional resources, and basin-scale modelling.

Figure 5: Geoex MCG CCUS project: basin-scale reservoir property analysis from well data (e.g., Vshale, water saturation, and lithology) and automated fault interpretation using ML models.

Conclusion

The NCS data set, now enriched through ML, provides a robust foundation for subsurface analysis across multiple use cases. The CCS project offers a real-world example of how digital transformation can directly contribute to climate solutions by enabling data-driven carbon storage evaluation.

The digital revolution in well data is not a future possibility; it is a current reality. By embracing AI-driven workflows such as “Go Digital for Wells”, the industry is unlocking new efficiencies, insights, and opportunities. These approaches demonstrate that when traditional geoscience expertise is combined with advanced data science, the result is a smarter, faster, and more confident path to understanding the subsurface. In an era where decisions must be made quickly, confidently, and with long-term sustainability in mind, intelligent, well-designed data workflows are no longer optional; they are essential.
