ECM and Machine Learning – What are Box, IBM, OpenText and other Vendors doing?

There are many use cases in Enterprise Content Management (ECM) for which Machine Learning can be deployed. In fact, I’d argue that you can apply machine learning at every stage of the content life cycle. You can apply:

  • Supervised learning, e.g., to automatically classify images, archive documents, delete files that are no longer required (and not likely to be required in future), classify records and much more
  • Unsupervised learning, e.g., to tag audio and video, improve your business processes (e.g., approve a credit limit based on a machine learning algorithm instead of fixed rules), bundle related documents using clustering and so on
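To make the supervised case concrete, here is a minimal sketch of automatic document classification. It assumes a tiny, hand-labelled training set and a naive Bayes word-count model written with only the Python standard library; the labels ("invoice", "contract"), the sample texts and the function names are all hypothetical and only there for illustration. It is not how any particular ECM product does it.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(labelled_docs):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    doc_counts = Counter()              # label -> number of training documents
    for text, label in labelled_docs:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocabulary


def classify(text, model):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)  # prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Laplace (add-one) smoothing so unseen words do not zero the score.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical, hand-labelled examples standing in for repository content.
TRAINING = [
    ("Invoice number 1042, amount due 30 days, payment by bank transfer", "invoice"),
    ("Payment reminder: invoice overdue, total amount payable", "invoice"),
    ("This agreement is made between the parties and governs termination", "contract"),
    ("The parties agree to the terms, liability and confidentiality clauses", "contract"),
]

model = train(TRAINING)
predicted = classify("Invoice amount overdue, please arrange payment by bank transfer", model)
print(predicted)
```

With real content you would need far more training examples per class, and you would store the predicted label as metadata on the document rather than just printing it.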

What are ECM vendors currently offering?

Not much, I’d say. These are still early days.

To be fair, Artificial Intelligence and Machine Learning have been used for a long time in enterprise applications, but mostly for complicated scenarios such as enterprise search (e.g., for proximity, sounds etc.) or sentiment analysis of social media content. It has never been easy to use machine learning for relatively simple use cases. Additionally, no vendor provided SDKs or APIs with which you could apply machine learning yourself to your own specific use cases.

But things are gradually changing and vendors are upping their game.

In particular, the “infrastructure” ECM vendors (IBM, Oracle, OpenText and Microsoft) all have AI and ML offerings that integrate with their ECM systems to varying degrees.

OpenText Magellan is OpenText’s AI + ML engine based on open source technologies such as Apache Spark (for data processing), Spark ML (for machine learning), Jupyter and Hadoop. Magellan is integrated with other OpenText products (including Content, Experience Suites and others) and offers some pre-integrated solutions. Specifically for ECM, you can apply machine learning algorithms to find related documents, classify them, do content analysis and analyse patterns. You can of course also create your own machine learning programs using Python, R or Scala.


Figure: Predictive analytics using OpenText Magellan. Source: OpenText

IBM’s Watson and Microsoft Azure Machine Learning integrate with several other enterprise applications and also have connectors for their own repositories (FileNet P8 and Office 365).

Amongst the specialised ECM vendors, Box is going to make its offerings generally available this year.

Box introduced Box Skills in October 2017. It’s still in beta but appears promising. You can apply machine learning to images, audio and video stored in Box to extract additional metadata, create transcripts (for audio and video files), use facial recognition to identify people and so on. In addition, you will also be able to integrate with external providers (e.g., IBM’s Watson) to create your own machine learning use cases with content stored in Box.


Figure: Automatic classification (tags) using image recognition in Box. Source: Box.com

Finally, there are some service providers such as Zaizi who provide machine learning solutions for specific products (Zaizi is an Alfresco partner).

Don’t wait for your vendors to start offering AI and ML

At the rate at which content repositories are growing, you will need to resort to automatic ways of classifying content and automating other aspects of the content life cycle. It will soon be impossible to do all of that manually, and Machine Learning provides a good alternative for that type of functionality. If your ECM vendor provides AI/ML capabilities, that’s excellent, because you not only need access to machine learning libraries but also need to integrate them with the underlying repository, security model and processes. An AI/ML engine that is pre-integrated will be hugely useful. But if your vendor doesn’t provide these capabilities yet, you still have alternatives. I’ve said this before and it applies to ECM as well:

There is no need to wait for your vendors to start offering additional AI/ML capabilities. Almost all programming languages provide APIs and libraries for all kinds of machine learning algorithms for clustering, classifications, predictions, regression, sentiment analysis and so on. The key point is that AI and ML have now evolved to a point where entry barriers are really low. You can start experimenting with simpler use cases and then graduate to more sophisticated use cases, once you are comfortable with basic ones.
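As an illustration of how low the entry barrier is, here is a minimal sketch of one of the simpler use cases: finding the most closely related document for each document in a small set. It assumes TF-IDF weighting and cosine similarity, implemented with only the Python standard library so that it runs anywhere; in practice you would more likely reach for an established library such as scikit-learn. The sample documents and function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse {word: weight} TF-IDF vector per document."""
    token_lists = [tokenize(d) for d in docs]
    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter(word for tokens in token_lists for word in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in token_lists:
        term_freq = Counter(tokens)
        vectors.append(
            {w: count * math.log(n / doc_freq[w]) for w, count in term_freq.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_related(index, vectors):
    """Index of the document most similar to vectors[index], excluding itself."""
    scores = [
        (cosine(vectors[index], v), i) for i, v in enumerate(vectors) if i != index
    ]
    return max(scores)[1]


# Hypothetical document titles standing in for repository content.
DOCS = [
    "Quarterly sales report for the northern region",
    "Employee onboarding checklist and HR policies",
    "Sales report: northern region results for the quarter",
    "HR policies update for employee leave",
]

vectors = tfidf_vectors(DOCS)
related = {i: most_related(i, vectors) for i in range(len(DOCS))}
print(related)
```

On this sample the two sales reports are paired with each other, and so are the two HR documents. The same vectors could feed a clustering algorithm if you want to bundle more than two documents at a time.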

If you would like more information or advice, we’d be happy to help. Please feel free to fill in the form below or email us.

 
