Allocation of spend categories based on free text analysis

We have a request to create spend categories based on the analysis of free text data. This may involve a combination of Omniscope functions, i.e. text analysis and machine learning.

Are there any guidelines, or can anyone provide some pointers on the best way to manage this?




Omniscope is used by many spend analysis specialists for this purpose... 

Are you able to post a small sample of your dataset (you can scramble/remove anything sensitive) and show which text is to be used for this classification?


Hi Karen, we're The Classification Guru and use Omniscope for this very thing! We use it in a slightly different way for spend classification, but we always recommend doing it manually with people the first time before automating the process. If you want to have a chat, you can email me.


Attached is a sample dataset. The classification is to be based on the 'linedescription' field.




Hi Karen, 

This is pretty much what we do for a living. It looks like it might be credit card or expense data. We tend to classify this manually, as too much can go wrong with keyword searches on data like this. Here are a couple of videos I've made showing what we do:

Hi Susan

Thank you for taking the time to respond. 

We are looking to automate the process as far as possible, as there will potentially be high data volumes. We are also bidding against an organisation that provides automatic classification as standard.

Many thanks


Not a problem. There are a lot out there; I'd just be careful about what they can promise you. Hopefully they can give you a sample.

Good luck  :)

Hi Paola

I posted a data sample for text analysis. Are you able to advise on the most efficient way to manage this?
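
As a rough illustration of the trade-off discussed in this thread (automate what is safe, keep people in the loop for the rest), here is a minimal Python sketch of keyword-based classification on the 'linedescription' field. It is not an Omniscope workflow and not anyone's actual method: the category names, keywords, sample rows and the classify function are all hypothetical, made up for illustration. Anything with no match, or with conflicting matches, is flagged for manual review rather than guessed.

```python
import re

# Hypothetical keyword rules (category -> keywords); not taken from the thread.
RULES = {
    "Travel": ["hotel", "taxi", "flight", "rail"],
    "IT": ["laptop", "software", "licence"],
    "Office Supplies": ["paper", "toner", "stationery"],
}


def classify(description):
    """Return (category, needs_review) for one free-text line description."""
    # Lower-case and split into alphabetic words so matching is on whole words.
    words = set(re.findall(r"[a-z]+", description.lower()))
    matches = [cat for cat, keywords in RULES.items() if words & set(keywords)]
    if len(matches) == 1:
        return matches[0], False
    # No match, or more than one category matched: leave it for a person.
    return "Unclassified", True


# Made-up sample rows standing in for the real dataset.
rows = [
    {"linedescription": "Taxi to airport"},
    {"linedescription": "Toner for laptop dock printer"},
    {"linedescription": "Misc charge 0042"},
]

for row in rows:
    row["category"], row["needs_review"] = classify(row["linedescription"])
    print(row)
```

The rows flagged for review can then be classified by hand and, once there are enough of them, used to extend the rules or as training data for a machine learning step.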



