Komprise recently announced a new ingestion engine solution aimed at the curation of unstructured data. We spoke with co-founder and COO Krishna Subramanian to learn more about how the company is poised to enable enterprise AI adoption and other strategic business outcomes.
Decade-old company now answering pivotal questions for AI adoption success
Komprise was founded in 2014 by experienced technology leaders looking for their next opportunity following two successful acquisitions.
The company’s co-founders initially set out to address sprawling cloud and backup storage costs, which enterprises found were getting in the way of efficient operations.
“We were hearing from customers of our other companies that they were having a lot of issues with their unstructured data, and we decided to address those problems with our new venture,” Subramanian said.
Subramanian uses a comparison to phone storage to illustrate the complexity many enterprises face. Everyone has photos on their smartphones, she says, but not all of them are what a person would consider worth sharing or using. Still, they all sit there, taking up space. Now, multiply that across thousands of employees over several decades, and you get a sense of how much data enterprises hold.
“Nobody sits there cleaning up all of that data, and so it just sits there. Plus, in industries like healthcare, you are regulated to keep data for certain lengths of time,” Subramanian said. “Plus, even if you had a person doing that, you can’t possibly scale that.”
Komprise’s technological solution to the problem: ingest engines
On Sept. 23, the company announced the general availability of Komprise Intelligent AI Ingest, a new solution it says will provide the following benefits to enterprise users:
- Metadata-rich Global File Index: Komprise automatically builds metadata and delivers a single view of all file data within the enterprise, at scale, so that you can find precisely the correct data for your AI use case with simple queries.
- Precise curation boosts RAG efficiency: Unlike traditional ETL and data ingestion approaches that provide connectors to indiscriminately copy data from a source, Komprise delivers a surgical approach with rich filters to eliminate low-quality and sensitive data during ingestion.
- 2X ingestion performance improvement: Komprise doubles the performance in benchmark tests against a data transfer tool from a primary cloud provider. This is made possible by a purpose-built transfer engine that minimizes file overhead for AI, supported by a massively parallel architecture.
- High-performance parallel architecture: The Komprise elastic grid architecture is parallelized across multiple network interfaces, shared engines, and thread pools, enabling high performance. This enables the solution to index and enrich metadata across billions of files rapidly, and to transfer large volumes of files to various AI tools and services as needed.
- Built-in PII and sensitive data handling: Komprise provides standard and custom sensitive data classification, allowing you to reduce the risk of sensitive data leakage and compliance violations.
- Automated data governance auditing: Komprise automatically maintains an audit trail of each ingestion workflow for data governance and auditing, documenting the who, what, when, and data lineage for compliance reporting.
“We’re building a metadata base, if you will,” Subramanian said. “This is a powerful way to find the right unstructured data for what you need.”
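To make the curation idea in the list above concrete, here is a minimal Python sketch of metadata-based filtering with an audit trail. It is an illustration only, not Komprise's implementation or API: the `FileRecord` type, the `curate` function, the single SSN-style pattern, and the sample index are all hypothetical names and data chosen for this example.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field
from datetime import datetime, timezone


# Hypothetical metadata record; not Komprise's actual index schema.
@dataclass
class FileRecord:
    path: str
    owner: str
    modified: datetime
    text_sample: str = ""
    tags: set[str] = field(default_factory=set)


# Assumption: one illustrative pattern (US SSN format) stands in for
# a real sensitive-data classifier.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def tag_sensitive(record: FileRecord) -> FileRecord:
    """Add a 'pii' tag when the sample text matches the pattern."""
    if SSN_PATTERN.search(record.text_sample):
        record.tags.add("pii")
    return record


def curate(index, *, modified_after, allowed_suffixes, user):
    """Select files by metadata and log every decision for auditing."""
    selected, audit = [], []
    for rec in map(tag_sensitive, index):
        if "pii" in rec.tags:
            decision = "excluded: sensitive"
        elif rec.modified < modified_after:
            decision = "excluded: stale"
        elif not rec.path.lower().endswith(allowed_suffixes):
            decision = "excluded: file type"
        else:
            decision = "selected"
            selected.append(rec)
        audit.append({
            "who": user,
            "what": rec.path,
            "when": datetime.now(timezone.utc).isoformat(),
            "decision": decision,
        })
    return selected, audit


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Made-up sample index for illustration.
index = [
    FileRecord("/proj/report_2025.pdf", "ana", utc(2025, 3, 1), "Quarterly results"),
    FileRecord("/hr/payroll.pdf", "bob", utc(2025, 4, 1), "SSN 123-45-6789"),
    FileRecord("/old/notes_2009.pdf", "cy", utc(2009, 6, 1), "Meeting notes"),
    FileRecord("/proj/build.tmp", "ana", utc(2025, 5, 1), "scratch"),
]

selected, audit = curate(
    index,
    modified_after=utc(2024, 1, 1),
    allowed_suffixes=(".pdf", ".docx"),
    user="analyst",
)
print([rec.path for rec in selected])
for entry in audit:
    print(entry["what"], "->", entry["decision"])
```

In this sketch the filtering happens before anything is copied, so excluded files never reach the AI tool, and each decision is recorded with the who, what, and when that a compliance report would need.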
Why unstructured data remains a pain point for so many enterprises
Unstructured data now poses a challenge to another business goal: AI adoption. As enterprises continue to struggle to realize ROI and determine the best use cases for AI tooling, it remains a hurdle for many.
“AI has a finite amount of compute. If you clutter that up with junk, you will get junk answers,” said Subramanian. “AI has people’s attention right now, and some of the really mature companies have built their own models and cleaned their own data to do so. But that’s not scalable, and now the market is shifting towards enterprises using off-the-shelf models to see success.”
While structured datasets are easier to get under control and feed into many models, unstructured data in particular comes with a unique set of challenges, including:
- Suboptimal AI, RAG, and LLM outcomes: Unstructured data is unorganized, containing large quantities of irrelevant, outdated, and duplicate files. This reduces precision, clutters context windows, and adds latency in AI pipelines. Studies show a 10% efficiency drop for every 10,000 additional unstructured documents in a typical RAG pipeline, resulting in reduced accuracy and poor outcomes.
- High cost of inferencing: Irrelevant unstructured data wastes expensive AI processing resources, drives up costs, reduces accuracy, and ultimately erodes AI ROI. The costs compound in AI inferencing as the augmentation occurs for each prompt.
- Sensitive data leakage security risk: Ingesting data in bulk can lead to inadvertent sensitive data exposure in AI tools, violating privacy, security, and compliance policies.
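The point about inferencing costs compounding per prompt can be shown with simple arithmetic. The Python sketch below uses entirely hypothetical figures (token counts, chunk counts, prompt volume, and price are assumptions for illustration, not numbers from Komprise or any provider) to show how extra retrieved context is paid for again on every prompt.

```python
def tokens_per_prompt(question_tokens, retrieved_chunks, tokens_per_chunk):
    """Input tokens sent to the model for one augmented prompt."""
    return question_tokens + retrieved_chunks * tokens_per_chunk


def monthly_input_cost(prompts, tokens, usd_per_million_tokens):
    """Total input cost when the same context size is paid on every prompt."""
    return prompts * tokens * usd_per_million_tokens / 1_000_000


# All values below are hypothetical.
QUESTION_TOKENS = 100
TOKENS_PER_CHUNK = 500
PROMPTS_PER_MONTH = 100_000
USD_PER_MILLION_TOKENS = 1.0

# Curated index: fewer, more relevant chunks are retrieved per prompt.
curated_tokens = tokens_per_prompt(QUESTION_TOKENS, 4, TOKENS_PER_CHUNK)
# Bulk-ingested index: more chunks are needed to surface the same answer.
bulk_tokens = tokens_per_prompt(QUESTION_TOKENS, 10, TOKENS_PER_CHUNK)

curated_cost = monthly_input_cost(PROMPTS_PER_MONTH, curated_tokens, USD_PER_MILLION_TOKENS)
bulk_cost = monthly_input_cost(PROMPTS_PER_MONTH, bulk_tokens, USD_PER_MILLION_TOKENS)

print(f"curated: {curated_tokens} tokens/prompt, ${curated_cost:,.2f}/month")
print(f"bulk:    {bulk_tokens} tokens/prompt, ${bulk_cost:,.2f}/month")
```

Because the retrieved context is resent with each prompt, any per-prompt waste is multiplied by the prompt volume rather than paid once at ingestion time.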
“There is greater recognition now that unstructured data is a problem, but I think a lot of enterprises still don’t necessarily know the solution to fix that problem,” Subramanian said.
How channel partners play a key role in building success for enterprises and Komprise itself
Subramanian calls the company “100 percent partner-driven” and stresses the company does not have a direct selling motion.
“There’s a last mile with enterprise products. We need channel partners to enter these markets. Someone who knows the customer well has to take the tech and know how it fits into the organization, and that’s what channel partners do,” said Subramanian.
Komprise also leverages strategic alliances with several technology companies. It recently added two new AI-focused ecosystem partners:
- NVIDIA: Nvidia customers can ingest the right unstructured data to Nvidia GPUDirect Storage and Nvidia NeMo DataStores and automatically manage the AI data lifecycle using Komprise. As an Nvidia Connect partner, Komprise collaborates with Nvidia to curate AI-ready unstructured data for model training and inferencing.
- SUSE Linux: Komprise enables SUSE Rancher customers to catalog their unstructured data, profile it, and ingest the relevant data into Rancher for AI use cases. This partnership allows both companies to develop and validate joint solutions for AI-ready data and data lifecycle management.
To Subramanian, channel partners have a robust opportunity to build a wider portfolio of services around the AI and data needs of modern enterprise customers.
“Every enterprise is really still trying to figure all of this out,” Subramanian said. “Partners have a real opportunity to not just talk about modernizing infrastructure, but also getting data AI ready and bringing forward all of the services to enable them to do that.”