Auto-classification, a statistical technique for sorting content or documents into predefined categories, is a hot topic today. And with good reason: it promises to help companies efficiently tackle records management, eDiscovery, information security and information governance in the face of overwhelming data volumes.

“What’s interesting is that this ‘new’ technology isn’t really so new. In fact, it’s been around for more than half a century, and many of the algorithms we use today were invented in the ’50s and ’60s.” – Chris McHenry, Integro VP of Technology
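To make the idea concrete, here is a minimal sketch of one long-established statistical approach, a naive Bayes text classifier, written in plain Python. The categories, sample documents and function names are hypothetical and chosen only for illustration. This is not Integro's product or method, and real deployments rely on far more training data and more sophisticated models.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def train(labeled_docs):
    """Count documents and words per category from (text, category) pairs."""
    doc_counts = Counter()              # documents seen per category
    word_counts = defaultdict(Counter)  # word frequencies per category
    vocab = set()                       # every word seen during training
    for text, category in labeled_docs:
        doc_counts[category] += 1
        for word in tokenize(text):
            word_counts[category][word] += 1
            vocab.add(word)
    return doc_counts, word_counts, vocab


def classify(text, model):
    """Return the category with the highest naive Bayes log-probability."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        # Prior: how common this category was in the training set.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Likelihood with add-one (Laplace) smoothing.
            count = word_counts[category][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical training examples; a real system would use many more.
training = [
    ("this agreement is entered into by the parties", "contract"),
    ("the parties agree to the terms of this agreement", "contract"),
    ("invoice number amount due payment terms net 30", "invoice"),
    ("please remit payment for the amount due on this invoice", "invoice"),
    ("employee performance review and salary adjustment", "hr"),
    ("employee onboarding benefits and salary information", "hr"),
]
model = train(training)

print(classify("amount due on invoice", model))            # invoice
print(classify("employee salary review", model))           # hr
print(classify("the parties signed an agreement", model))  # contract
```

The classifier learns word frequencies from a handful of pre-labeled documents and then assigns each new document to the category whose vocabulary it most resembles, which is the essence of sorting content into predefined categories.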

Over the decades, auto-classification has seen incremental improvements, but the concept never achieved the popularity and success it’s seeing today. In the past, computing resources were expensive and not nearly as accessible to the mainstream. We also had much less data, so manual approaches to going through files were sufficient. Today’s environment is drastically different. You might say we’re experiencing a perfect storm of:

  1. Pain: we have far too much data to deal with manually, and
  2. Technology availability and familiarity: with resources like the cloud and other new technologies, we can perform auto-classification more easily and affordably.

Thus, 50 years later, auto-classification seems to be experiencing something of a second life, and we find ourselves in the early stages of visibility and maturity along Gartner’s “Hype Cycles” for a second time.

A recent survey Integro conducted supports this view. In the survey, 72% of respondents felt that auto-classification is in one of the first three of Gartner’s five key phases of a technology’s lifecycle (see definitions of the phases below).

Figure: Gartner phases of the technology lifecycle, with Integro survey results

Clearly, there’s a lot more to be learned and developed in the area of auto-classification, and companies are eager to hear about the successes. Integro works in this field every day, and we’re seeing strong technology solutions achieve great success in a variety of compelling use cases. We’re using those solutions to help our customers govern information retention periods, identify risky and confidential content for better information security, respond to eDiscovery requests, and clean up their massive file shares.

We’ll continue to share our experiences and success stories through future webinars and case studies, so stay tuned. If you’re ready to start exploring how auto-classification can benefit your company, we invite you to schedule a one-on-one with us today.


Gartner “Hype Cycles” for the Five Key Phases of a Technology’s Lifecycle (source: Gartner.com)

Technology Trigger: A potential technology breakthrough kicks things off. Early proof-of-concept stories and media interest trigger significant publicity. Often no usable products exist and commercial viability is unproven.

Peak of Inflated Expectations: Early publicity produces a number of success stories — often accompanied by scores of failures. Some companies take action; many do not.

Trough of Disillusionment: Interest wanes as experiments and implementations fail to deliver. Producers of the technology shake out or fail. Investments continue only if the surviving providers improve their products to the satisfaction of early adopters.

Slope of Enlightenment: More instances of how the technology can benefit the enterprise start to crystallize and become more widely understood. Second- and third-generation products appear from technology providers. More enterprises fund pilots; conservative companies remain cautious.

Plateau of Productivity: Mainstream adoption starts to take off. Criteria for assessing provider viability are more clearly defined. The technology’s broad market applicability and relevance are clearly paying off.