Komprise Targets AI’s Unstructured Data Challenge with Transparent File Tables

Data lakehouses have become central to enterprise analytics. But they still have a blind spot: unstructured file data.

Most organizations have done a decent job getting structured data into platforms like Snowflake and Databricks. The same can’t be said for the billions of files spread across NAS systems and hybrid cloud storage, many of which remain disconnected from analytics and AI.

Komprise is looking to close that gap with its new Transparent File Tables, which present enterprise file data as Apache Iceberg tables that can be queried without first migrating the underlying content.

The nature of enterprise file data itself is the source of the challenge. Unlike structured data in databases, unstructured data typically lacks a consistent schema and is often spread across multiple storage systems.

As a result, copying petabytes of files into a lakehouse can take weeks or months, adding cost before any analytics or AI work begins.

One often-cited IDC statistic says more than 80% of enterprise data is unstructured. Yet, according to Komprise, less than 1% of that data is currently being used in AI.

The company says Transparent File Tables addresses that challenge by exposing a structured metadata layer while leaving the underlying files in place.

“The reason 99% of enterprise unstructured data has been dark to AI and analytics is because discovering and generating its schema and moving it is inherently complex and costly,” said Kumar K. Goswami, CEO and co-founder of Komprise.

“Komprise brings to light the huge petabytes of enterprise unstructured data in a form that data teams can access easily and transparently for analytics. Komprise Transparent File Tables opens a whole new world to AI.”

Behind the scenes, Komprise indexes enterprise file data across on-premises and cloud storage into what it calls a Global Metadatabase. Transparent File Tables then exposes enriched metadata and pointers to the original files, giving analytics platforms a structured view of file-based data.

Komprise says this offers a simpler experience for users. They can query enterprise file data directly from familiar tools such as Snowflake and Databricks. If an AI application needs the full files, only the relevant data is retrieved, according to Komprise.
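
The pattern described here, querying a metadata table first and retrieving only the files it points to, can be illustrated with a small sketch. This is not Komprise's implementation or API: it uses an in-memory SQLite table as a stand-in for an Iceberg table, and the table name, column names, file URIs and the select_file_pointers helper are all hypothetical.

```python
import sqlite3

# Hypothetical metadata rows: (file_uri, project, file_type, size_bytes).
# The schema is an assumption for illustration, not Komprise's actual layout.
ROWS = [
    ("nas01:/research/p1/run_001.csv", "p1", "csv", 1200),
    ("nas01:/research/p1/notes.docx", "p1", "docx", 300),
    ("s3://archive/p2/run_044.csv", "p2", "csv", 5000),
]


def build_metadata_table(conn):
    """Create a stand-in metadata table; the files themselves are never copied."""
    conn.execute(
        "CREATE TABLE file_metadata ("
        "file_uri TEXT, project TEXT, file_type TEXT, size_bytes INTEGER)"
    )
    conn.executemany("INSERT INTO file_metadata VALUES (?, ?, ?, ?)", ROWS)


def select_file_pointers(conn, project, file_type):
    """Return pointers (URIs) for only the files that match the metadata filter."""
    cursor = conn.execute(
        "SELECT file_uri FROM file_metadata "
        "WHERE project = ? AND file_type = ? ORDER BY file_uri",
        (project, file_type),
    )
    return [row[0] for row in cursor]


conn = sqlite3.connect(":memory:")
build_metadata_table(conn)

# Only the matching pointers would be handed to a downstream fetch step.
pointers = select_file_pointers(conn, "p1", "csv")
print(pointers)
```

In this sketch the query touches only metadata; an application would open the returned URIs afterward, so the bulk of the archive is never read.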

The company highlighted several potential use cases.

“A data analyst at a pharmaceutical company can create dashboards in Snowflake or Databricks for their drug research projects by querying a Komprise Transparent File Table for project files generated by each instrument and lab,” stated Komprise in the press release.

“The analyst can then join the data with financial tables from their ERP systems and instrument information from Benchling, thus combining structured and unstructured data from different sources in a single interface.”
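
A rough sketch of what such a join might look like follows. It again uses in-memory SQLite rather than Snowflake or Databricks, and every table, column and value in it (file_metadata, project_budget, the project identifiers and budgets) is invented for illustration; nothing here reflects the actual schema of a Transparent File Table or an ERP system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: file metadata (unstructured side) and budgets (ERP side).
conn.executescript(
    """
    CREATE TABLE file_metadata (
        file_uri TEXT, project TEXT, instrument TEXT, size_bytes INTEGER
    );
    CREATE TABLE project_budget (project TEXT, budget_usd INTEGER);

    INSERT INTO file_metadata VALUES
        ('nas01:/lab_a/p1/run_001.raw', 'p1', 'spectrometer', 1000),
        ('nas01:/lab_a/p1/run_002.raw', 'p1', 'spectrometer', 3000),
        ('nas02:/lab_b/p2/scan_01.tif', 'p2', 'microscope', 500);

    INSERT INTO project_budget VALUES ('p1', 250000), ('p2', 90000);
    """
)


def project_summary(conn):
    """Join file metadata with budget rows and aggregate per project."""
    query = """
        SELECT m.project,
               COUNT(*)          AS file_count,
               SUM(m.size_bytes) AS total_bytes,
               b.budget_usd
        FROM file_metadata AS m
        JOIN project_budget AS b ON b.project = m.project
        GROUP BY m.project, b.budget_usd
        ORDER BY m.project
    """
    return conn.execute(query).fetchall()


summary = project_summary(conn)
for row in summary:
    print(row)
```

The point of the sketch is that once file metadata is exposed as a table, combining it with structured data is an ordinary SQL join keyed on a shared field such as a project identifier.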

Similarly, in media and entertainment, AI agents could use project metadata to identify only the scripts needed for a particular project rather than searching the entire archive.

The announcement reflects a broader challenge facing enterprise AI. While organizations have spent years modernizing databases and building data lakehouses, much of their institutional knowledge still lives in file-based content that has remained difficult to incorporate into analytics workflows.

The industry has responded with a growing number of tools focused on AI-ready data preparation, metadata management and open table formats. Rather than moving ever-larger datasets into centralized repositories, many vendors are now looking for ways to make existing enterprise data easier to discover and query where it already resides.

Komprise is taking that approach a step further by targeting file-based data, an area that has historically been much harder to integrate with analytics platforms.

As enterprises move AI projects from pilots into production, reducing the time and cost of preparing unstructured data could become just as important as advances in AI models or compute infrastructure.

The post Komprise Targets AI’s Unstructured Data Challenge with Transparent File Tables appeared first on AIwire.