
Parsing the Past: Using Modern Technology to Visualize Seismic Data

An exploration of building custom parsers and high-performance visualizations for SEG-Y seismic data files, detailing the technical challenges and solutions in modernizing legacy data formats.

Geoscientists and geophysicists store data from seismic surveys in a file format called "SEG-Y." Much like the protocols behind industrial control systems and SCADA platforms, SEG-Y is a niche format that most engineers never have to touch. As with most older file formats found in manufacturing environments, parsing it raises problems that engineers must address, such as:

  1. Inconsistencies across SEG-Y versions.
  2. Older versions have non-standard parsing requirements (e.g. legacy floating-point encodings).
  3. An insular community with proprietary data they are reluctant to share.
  4. Existing solutions tend to be either inflexible or expensive (sometimes both).

So what do you do when a client asks you to gather, parse, analyze, and visualize something based on a lesser-known or older spec?

Investigate

The first thing we did was find the SEG-Y format documentation for the latest version; in other words, we read the instruction manual. Much like working with PLC systems, we needed to understand the core specifications. Afterward, with some general research and googling, we found revision information and file format specifications that told us where to look for various bits and bytes to verify we were on the right track. Then we looked at open-source solutions and assessed each parser's stability, flexibility, and efficiency, so that we wouldn't accidentally duplicate work the community had already done.
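As an illustration of "where to look for various bits and bytes," here is a minimal sketch that reads a few fields from the 400-byte binary file header that follows the 3200-byte textual header. It is not the parser described in this article; `readBinaryHeader` and `SAMPLE_FORMATS` are hypothetical names, and the byte positions and format codes are the ones we understand the published SEG-Y revisions to use, so check them against the spec for the revision you are targeting.

```javascript
// Sizes of the two file-level headers at the start of a SEG-Y file.
const TEXT_HEADER_BYTES = 3200;
const BINARY_HEADER_BYTES = 400;

// A subset of the data sample format codes (assumed from the SEG-Y spec).
const SAMPLE_FORMATS = {
  1: "4-byte IBM floating point",
  2: "4-byte two's complement integer",
  3: "2-byte two's complement integer",
  5: "4-byte IEEE floating point",
  8: "1-byte two's complement integer",
};

// Hypothetical helper: expects an ArrayBuffer holding at least the first
// 3600 bytes of the file. Offsets are 0-based, i.e. the spec's 1-based
// byte numbers minus one. Fields are read as big-endian, the traditional
// SEG-Y byte order.
function readBinaryHeader(buffer) {
  if (buffer.byteLength < TEXT_HEADER_BYTES + BINARY_HEADER_BYTES) {
    throw new RangeError("File is too short to contain the SEG-Y file headers");
  }
  const view = new DataView(buffer);
  const formatCode = view.getUint16(3224, false);
  return {
    sampleIntervalMicros: view.getUint16(3216, false), // spec bytes 3217-3218
    samplesPerTrace: view.getUint16(3220, false),      // spec bytes 3221-3222
    formatCode,                                        // spec bytes 3225-3226
    formatName: SAMPLE_FORMATS[formatCode] ?? "unknown or unsupported",
    revisionMajor: view.getUint8(3500),                // spec byte 3501
    revisionMinor: view.getUint8(3501),                // spec byte 3502
  };
}
```

A format code of 1 is the signal that the trace samples are IBM floats and need the special handling discussed later, while a revision of 0 usually means the file predates the revised specs and deserves extra suspicion.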

It's important to note that not all parsers are created equal; when someone builds an open-source parser, there's usually a personal or economic need behind it, which influences how the parser is optimized. We're explicitly optimizing for visualizations and high-performance data streaming, similar to modern manufacturing execution systems. This means waiting 20 seconds for a 4MB file to parse is out of the question. The fear of "rolling your own" is always there, but given our specific needs, we had to go the custom route in this case.

Technical Spikes

Now for the fun part: time to dig in and see what you get. Luckily, I know a lot of really smart people who can help out. One of the perks of running a small dev shop and being part of a growing tech community is the network of engineers I have available to help answer some of the more niche questions, especially in industrial automation. I reached out to a referral, @mochetts, and asked if he'd be interested in working on this project with the rest of my product team. Luckily, he's flexible and used to fast-paced prototyping.

We were able to spike several small solutions and test them in various environments. Mochetts was also able to locate and identify many minor issues. Some older versions of the SEG-Y format use IBM HFP (IBM 360) floating-point numbers, a fascinating problem that the maintainers of Segy-IO were able to help us identify. So we had to dig deep into our bag of tricks; using some bitwise operators in JavaScript and the help of a local legend, Yury, we were able to create a small library for parsing IBM HFP (IBM 360) floating-point values into something recognizable.
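To make the IBM HFP problem concrete, here is a minimal sketch of the conversion using bitwise operators in JavaScript. It is not the library mentioned above; `ibm32ToNumber` and `decodeIbmSamples` are hypothetical names. It assumes the standard single-precision IBM layout: 1 sign bit, a 7-bit base-16 exponent biased by 64, and a 24-bit fraction with no implicit leading bit.

```javascript
// Convert one 32-bit IBM hexadecimal float (given as an integer word)
// into a regular JavaScript number.
function ibm32ToNumber(word) {
  const sign = (word >>> 31) === 1 ? -1 : 1; // top bit
  const exponent = (word >>> 24) & 0x7f;     // next 7 bits, excess-64
  const fraction = word & 0x00ffffff;        // low 24 bits
  if (fraction === 0) return 0;
  // value = sign * (fraction / 2^24) * 16^(exponent - 64)
  return sign * (fraction / 0x1000000) * Math.pow(16, exponent - 64);
}

// Decode `count` big-endian IBM floats starting at `byteOffset`.
// Float64Array is used because the IBM range exceeds that of float32.
function decodeIbmSamples(buffer, byteOffset, count) {
  const view = new DataView(buffer, byteOffset, count * 4);
  const out = new Float64Array(count);
  for (let i = 0; i < count; i++) {
    out[i] = ibm32ToNumber(view.getUint32(i * 4, false));
  }
  return out;
}

console.log(ibm32ToNumber(0xc276a000)); // -118.625
console.log(ibm32ToNumber(0x41100000)); // 1
```

Reading these same four bytes as an IEEE 754 float would produce a plausible-looking but wrong number rather than an error, which is why this kind of issue is easy to miss without checking the format code first.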

Putting Together the Team (starting product sprints)

After running the technical spikes and answering most of the fundamental questions, it was time to assemble a team with experience in both data processing and industrial systems. First, we solidified ourselves, the core product team; then we looked for the specialists, the type of engineers that FAANG companies wish they could hire. The kind of engineers your senior engineer friends consider their seniors and admire. In this case, I sought out a team I'd worked with before, the Simiancraft team.

Jesse Harlin and Ben Van Treese are two of the best engineers I've ever worked with. They are also heavily into game development, computer science, and building cool stuff. We tapped into that to form an interesting yet weird data visualization duo. True to that reputation, they immediately understood where we were and where we were planning to go. Older (lesser-known) file format: check. Weird floating points: check. Visualizations that need to be merged, processed, downsampled, upsampled, scaled, zoomed, and transformed in new and exciting ways, for incredibly smart geophysicists and such: double-check. The team quickly got started and created a short visualization spike rendering 1.2M data points in roughly 80ms with color shaders and hue shifts, performance that rivals modern SCADA systems!
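For a sense of what "downsampled" can mean when a trace has far more samples than the screen has pixels, here is a minimal sketch of min/max decimation, one common approach that keeps peaks and troughs visible. We are not claiming this is the technique the spike used; `downsampleMinMax` is a hypothetical name, and the input is assumed to be an array or typed array of finite numbers.

```javascript
// Reduce `samples` to at most `bucketCount` buckets, keeping the minimum
// and maximum of each bucket so that extremes survive the reduction.
// Returns a Float32Array laid out as [min0, max0, min1, max1, ...].
function downsampleMinMax(samples, bucketCount) {
  if (!Number.isInteger(bucketCount) || bucketCount <= 0) {
    throw new RangeError("bucketCount must be a positive integer");
  }
  const n = samples.length;
  const buckets = Math.min(bucketCount, n); // never more buckets than samples
  const out = new Float32Array(buckets * 2);
  for (let b = 0; b < buckets; b++) {
    // Bucket boundaries are spread as evenly as integer division allows.
    const start = Math.floor((b * n) / buckets);
    const end = Math.floor(((b + 1) * n) / buckets);
    let min = Infinity;
    let max = -Infinity;
    for (let i = start; i < end; i++) {
      const v = samples[i];
      if (v < min) min = v;
      if (v > max) max = v;
    }
    out[2 * b] = min;
    out[2 * b + 1] = max;
  }
  return out;
}

console.log(Array.from(downsampleMinMax([0, 5, -3, 2, 9, 1, -7, 4], 2)));
// [-3, 5, -7, 9]
```

Plain averaging would be cheaper, but it flattens exactly the amplitude spikes a geophysicist is usually looking for, which is the reason to prefer a min/max pair per bucket in a sketch like this.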

Next Steps

In terms of product development, we're still in our infancy: finding new edge cases every day, putting together the product side of things to make clients happy, and interviewing potential users and customers. For now, it's smooth sailing; it's just a matter of focusing on the correct problems to solve for the right reasons, whether they're in manufacturing, industrial automation, or other sectors.

My crew and I have a relatively simple ethos that maps across all our projects and relationships:

  1. We build cool stuff.
  2. We make good money.
  3. We have a great time doing it.


Project Highlights:

1. Custom Parser Development

Engineered highly specialized parsers from the ground up to handle SEG-Y seismic data files with exceptional performance. Our custom solution enables real-time streaming and visualization capabilities while maintaining data integrity across different SEG-Y format versions, similar to modern SCADA systems.

2. Legacy Format Handling

Developed sophisticated solutions to handle complex legacy data formats, with particular focus on IBM Hexadecimal Floating Point (HFP) conversion from IBM 360 systems. Our implementation includes custom bitwise operations and format translation layers to ensure accurate data representation, crucial for industrial control systems.

3. High-Performance Visualization

Created an advanced visualization engine capable of rendering over 1.2 million data points in approximately 80 milliseconds. The system utilizes modern GPU-accelerated color shaders and dynamic hue shifts to represent seismic data variations, enabling real-time interaction with massive datasets, similar to modern manufacturing execution systems (MES).

Key Features:

Fast parsing and processing of SEG-Y files

Support for multiple SEG-Y versions

Custom floating point conversion

Real-time data visualization capabilities

Industrial-grade performance monitoring

Process control integration ready
