In Brief
Function: The Omniscope Basket Analysis block finds relationships between a subset of items and other items outside of this subset. For example, a typical rule (association) in supermarket point of sale data could be {Bread, Butter} -> {Milk}. The Basket Analysis block finds these relationships and provides an assessment of the relative strength of the associations.
Typical Use Case: The internal algorithm for Basket Analysis has a wide range of applications, from marketing to bioinformatics. For example, if you have point of sale transactional data, you can use basket analysis to assess which goods/SKUs are commonly purchased with other goods or combinations of goods. This type of analysis can be useful for making recommendations to customers, choosing store shelf layouts or making stock selections.
Case Study
Workflow
Here we have a case with Belgian Supermarket data where each record is the purchase of an individual item. The data must have a transaction (or group) field, along with a label for each item in the group. Here we also have a broader "Category" label which can also be used. We set up the following workflow in Data Manager:
Options
Note that the advanced options are not highlighted. The data needs to be in a 'long' format (more rows than columns), with one item per row and the corresponding group name on the same row (an example is shown below).
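To make the 'long' format concrete, here is a minimal sketch in Python of how one-item-per-row records correspond to baskets. The transaction IDs, item names and the `to_baskets` helper are made up for illustration; they are not part of Omniscope, which does this grouping internally.

```python
from collections import defaultdict

# Hypothetical long-format records: one (transaction, item) pair per row.
rows = [
    ("T1", "Bread"),
    ("T1", "Butter"),
    ("T1", "Milk"),
    ("T2", "Bread"),
    ("T2", "Butter"),
    ("T3", "Milk"),
]

def to_baskets(records):
    """Collect the items of each transaction into a set (one basket per transaction)."""
    baskets = defaultdict(set)
    for transaction_id, item in records:
        baskets[transaction_id].add(item)
    return dict(baskets)

baskets = to_baskets(rows)
```

Each key of `baskets` is a transaction and each value is the set of items bought in it, which is the structure that association rules are mined from.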
Specifying an Item Group Field is optional. This field should contain the name of the broader category (for example, the item "Apple" may have the Item Group Field value "Fruit"). Providing an Item Group Field means that a broader set of rules can be found. By default the best set of rules is provided, but if “Provide rules for Items and Groups” is selected then both item-level and group-level rules will be provided in the output data.
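As a rough illustration of why a group field can surface a broader set of rules, the sketch below replaces each item with its group before any rules are mined. The `item_group` lookup and the `to_group_baskets` helper are hypothetical; how Omniscope combines item-level and group-level rules internally is not described here.

```python
# Hypothetical item -> group lookup (the role played by the Item Group Field).
item_group = {
    "Apple": "Fruit",
    "Banana": "Fruit",
    "Milk": "Dairy",
    "Butter": "Dairy",
}

def to_group_baskets(baskets, lookup):
    """Replace every item with its group; items without a group are kept as-is."""
    return [{lookup.get(item, item) for item in basket} for basket in baskets]

item_baskets = [
    {"Apple", "Milk"},
    {"Banana", "Butter"},
    {"Apple", "Banana"},
]

group_baskets = to_group_baskets(item_baskets, item_group)
```

In this toy data no pair of individual items occurs together more than once, but the group pair {Fruit, Dairy} occurs in two of the three baskets, so an association can be visible at group level even when it is too rare at item level.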
The minimum support and minimum confidence are more advanced, technical settings that control the number of associations found. Lower values of each lead to more rules being provided, but doing so can lead to memory errors. Leaving the values empty means that Omniscope’s internal smart defaults will be used. The minimum and maximum length values restrict the number of items in the Item Combination for each association found. For example, setting the minimum length to three means that {X} -> Y would not be included, as there are only two items in the association.
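The effect of these four settings can be sketched with a brute-force rule search in Python. This is only an illustration under stated assumptions: `find_rules` and its parameter names are hypothetical, the search enumerates every item combination (so it only suits tiny data), and it is not the algorithm Omniscope uses. Length is counted as basket items plus the target item, matching the example above.

```python
from itertools import combinations

def find_rules(baskets, min_support, min_confidence, min_length=2, max_length=3):
    """Return (basket, target, support, confidence) tuples that pass all thresholds.

    Assumes `baskets` is a non-empty list of sets.
    """
    n = len(baskets)
    items = sorted(set().union(*baskets))

    def freq(itemset):
        # Share of baskets that contain every item in `itemset`.
        return sum(1 for b in baskets if itemset <= b) / n

    rules = []
    # A rule needs at least one basket item plus a target, hence a floor of 2.
    for size in range(max(min_length, 2), max_length + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            support = freq(itemset)
            if support == 0 or support < min_support:
                continue
            for target in combo:
                basket = itemset - {target}
                confidence = support / freq(basket)
                if confidence >= min_confidence:
                    rules.append((tuple(sorted(basket)), target, support, confidence))
    return rules

baskets = [
    {"Bread", "Butter", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Milk"},
    {"Milk"},
]

loose = find_rules(baskets, min_support=0.25, min_confidence=0.5, min_length=2, max_length=3)
three_only = find_rules(baskets, min_support=0.25, min_confidence=0.5, min_length=3, max_length=3)
strict = find_rules(baskets, min_support=0.25, min_confidence=0.9, min_length=3, max_length=3)
```

With a minimum length of three, two-item associations such as {Bread} -> Butter drop out of `three_only`, and raising the minimum confidence in `strict` leaves fewer rules still. Lowering either threshold has the opposite effect, which is why very low values can produce very large rule sets.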
Input data taken from a supermarket till
Basket analysis produces results by “Baskets” of items for which relationships have been found with a “Target Item”. In this case the block output is as follows.
Output from the study
The metrics given for each of these relationships are defined as follows (the common academic term is provided in brackets):
Target Relative Strength (Difference of Confidence) - Strength of association relative to all other associations for the target item.
Correlation (Phi) - Basket and target item correlation (correlation of rule); has a range between 0 and 1.
Frequency (Support) - Frequency of item(s) or association.
Probability (Confidence) - Probability of the associated item being purchased given the known basket; has a range between 0 and 1.
Lift - Strength of the association.
Quality - The quality of a rule, decided by the rule's correlation ordered by quintile (fifths of the population).
Rules Level - Indicates if the rule is found from the Group or the Item (indicated by “Group Level” or “Item Level” respectively).
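For readers who want to see how some of these metrics relate to each other, the sketch below computes Frequency (Support), Probability (Confidence), Lift and Correlation (Phi) for a single rule using the usual textbook definitions. The `rule_metrics` helper and the sample baskets are hypothetical, and Omniscope's own calculations (including any scaling, and the Target Relative Strength and Quality columns) may differ.

```python
from math import sqrt

# Hypothetical baskets, one set of items per transaction.
baskets = [
    {"Bread", "Butter", "Milk"},
    {"Bread", "Butter", "Milk"},
    {"Bread", "Butter"},
    {"Milk"},
    {"Eggs"},
]

def rule_metrics(baskets, basket_items, target):
    """Textbook metrics for the rule basket_items -> target.

    Assumes the basket and the target each occur in some, but not all, baskets;
    otherwise the divisions below are undefined.
    """
    n = len(baskets)
    p_x = sum(1 for b in baskets if basket_items <= b) / n   # basket frequency
    p_y = sum(1 for b in baskets if target in b) / n         # target frequency
    p_xy = sum(1 for b in baskets if basket_items <= b and target in b) / n

    confidence = p_xy / p_x                 # P(target | basket)
    lift = confidence / p_y                 # how much the basket raises P(target)
    phi = (p_xy - p_x * p_y) / sqrt(p_x * (1 - p_x) * p_y * (1 - p_y))
    return {
        "frequency": p_xy,
        "probability": confidence,
        "lift": lift,
        "correlation": phi,
    }

metrics = rule_metrics(baskets, {"Bread", "Butter"}, "Milk")
```

For {Bread, Butter} -> Milk in this toy data the frequency is 0.4, the probability is about 0.67, the lift is about 1.11 and the correlation is about 0.17. A lift above 1 means the target is bought more often with the basket than it is overall.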
Output
Combining these metrics enables users to easily find relevant associations in the data set and visualise them in sets of views using the DataExplorer visualisation interface. For example, a combination of a Venn view and a Scatter plot lets end users visually explore associations based on specified goods in the basket: