Word Similarity

Get a feel for word similarity, the artificial intelligence algorithm behind the Tasting Intelligence Data-driven Flavor Profile.

Summary

In this article, we describe how we constructed our Tasting Intelligence Data-driven Flavor Profile. This software extracts flavors from texts and represents them as a flavor profile. Here we explain one of the artificial intelligence algorithms in our software, word similarity. The article also includes an app that lets you try word similarity yourself and see how it works. It is an exclusive sneak peek at the software we have developed.

Introduction

Tasting is not an exact science: taste and tasting notes may differ from person to person. They depend on how developed a person's palate is and on the tasting experiences that person has had. Other factors that can play a role are the environment or location where you taste, and the occasion. Everyone has different associations with a taste; it is personal. The Tasting Intelligence Flavor Profile is a data-driven flavor profile. Its flavor notes are optimized for a mathematical model that calculates taste from texts.

The tasting wheel looks like this:

The Tasting Intelligence Flavor Profile is in line with other common profiles. The goal is a multi-purpose tasting wheel to extract flavors with our software. It is a universal wheel for alcoholic drinks but it applies to non-alcoholic drinks and food as well.

In this article, we explain one of the algorithms from our software, word similarity. We give you an exclusive sneak peek at the software we have developed. An app is available to play with and get a feel for word similarity.

If you are looking for an example of an analysis with word similarity, please visit our review example: rum analysis example.

Method: Word Similarity

The key algorithm for extracting flavors from a text is cosine similarity between words. It forms the basis for deriving a flavor wheel from texts. Each word is represented by a word vector, also called a "word embedding", taken from a statistical model, and the similarity of two words is the cosine of the angle between their vectors.
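As an illustration of the measure itself, here is a minimal sketch of cosine similarity in plain Python. The three-dimensional vectors are invented for this example and are not taken from our software or from any real model; real word embeddings have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for words without a usable vector
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors, invented for illustration only.
toy_vectors = {
    "lemon": [0.9, 0.1, 0.2],
    "lime": [0.8, 0.2, 0.1],
    "smoke": [0.1, 0.9, 0.3],
}

sim_citrus = cosine_similarity(toy_vectors["lemon"], toy_vectors["lime"])
sim_mixed = cosine_similarity(toy_vectors["lemon"], toy_vectors["smoke"])
print(f"lemon vs lime:  {sim_citrus:.2f}")
print(f"lemon vs smoke: {sim_mixed:.2f}")
```

Vectors that point in nearly the same direction score close to 1, and unrelated directions score close to 0. With these toy values, lemon and lime come out far more similar than lemon and smoke.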

The software that calculates the similarity is spaCy, an open-source library for advanced Natural Language Processing (NLP) in Python. spaCy describes its similarity feature as "comparing words, text spans, and documents and how similar they are to each other". The model can compare words with each other and compare words against a text or document.

In our experiments, we created a flavor profile and optimized it using the algorithm. The goal was to have unique words that contribute to the main flavors. We selected flavors and compared their similarity in an iterative process.

This cosine similarity method takes into account both the flavor words themselves and non-identical words. For a given word (flavor), it looks for similar words in a text (reviews or tasting notes). When the similarity is high enough, that is, above a threshold value, the word contributes to the main flavor. Check this in the app below!
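The threshold step can be sketched as follows. This is a simplified illustration, not our production code. The function name `similar_words`, the toy vectors, the whitespace tokenization, and the threshold of 0.9 are all assumptions made for this example. In spaCy, the score would come from the `similarity` method of a token or document instead of the hand-written function.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical toy vectors, invented for illustration only.
TOY_VECTORS = {
    "vanilla": [0.9, 0.1, 0.1],
    "custard": [0.8, 0.3, 0.1],
    "cream": [0.7, 0.2, 0.3],
    "oak": [0.1, 0.9, 0.2],
    "finish": [0.3, 0.3, 0.9],
}

def similar_words(flavor, text, threshold, vectors=TOY_VECTORS):
    """Return (word, score) pairs from text that score above threshold."""
    hits = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?")
        if word not in vectors:
            continue  # skip words that have no vector in the toy model
        score = cosine_similarity(vectors[flavor], vectors[word])
        if score > threshold:
            hits.append((word, score))
    return hits

note = "Custard and cream up front, oak on the finish."
hits = similar_words("vanilla", note, threshold=0.9)
for word, score in hits:
    print(f"{word}: {score:.2f}")
```

Raising the threshold makes the match stricter, so fewer words in the text count towards the flavor. Lowering it lets more loosely related words through.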

Results and Discussion

After an iterative process, the Tasting Intelligence Flavor Profile was constructed. The heatmap below shows the pairwise similarity between the words (flavors) of the profile.
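The data behind such a heatmap is a square matrix of pairwise similarities. The sketch below shows how a matrix of this kind could be computed. The four words and their vectors are invented for illustration and are not the values from our profile.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical toy vectors, invented for illustration only.
TOY_VECTORS = {
    "lemon": [0.9, 0.2, 0.1],
    "orange": [0.8, 0.3, 0.2],
    "pepper": [0.2, 0.9, 0.1],
    "clove": [0.1, 0.8, 0.3],
}

def similarity_matrix(words, vectors):
    """Pairwise cosine similarity, one row and one column per word."""
    return [
        [cosine_similarity(vectors[row], vectors[col]) for col in words]
        for row in words
    ]

words = list(TOY_VECTORS)
matrix = similarity_matrix(words, TOY_VECTORS)

# Print the matrix as a plain-text table.
print(" " * 8 + "".join(f"{w:>8}" for w in words))
for word, row in zip(words, matrix):
    print(f"{word:>8}" + "".join(f"{value:8.2f}" for value in row))
```

The diagonal is 1 because every word is identical to itself, and the matrix is symmetric. Blocks of high values off the diagonal indicate words that belong to the same main flavor.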


We see that the words are highly similar to other words of the same main flavor. This means that the software can identify a distinct main flavor from a handful of descriptive words.

Interesting cases: fruity and spicy

In real life, spices can be related to fruits. Take lemon zest as an example: lemon zest and peel have a high similarity with spices. The Collins dictionary defines zest as follows: “the zest of a lemon, orange, or lime is the outer skin when it is used to give flavor to something such as a cake or a drink.” This has a lot in common with its definition of a spice: “a spice is a part of a plant, or a powder made from that part, which you put in food to give it flavor. Cinnamon, ginger, and paprika are spices.” Both are extracts that add flavor. In the statistical model, lemon zest, lemon peel, and burned mandarin peel are more similar to spicy. We therefore treat them as spices.

Conclusion

The model is based on similarity analysis with spaCy and its word vector statistical model. Luckily for us, others have already built models that capture the similarity of words. To construct tasting notes, the algorithm computes word-to-word similarities and counts hits. In the case of a hit, the flavor from the flavor profile contributes to the corresponding main flavor. Based on these counts, the software constructs the AI-projected flavor map.
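The counting step can be sketched as below. This is a hedged illustration under several assumptions: the two-entry `FLAVOR_PROFILE` mapping, the toy vectors, the threshold, and the rule that every word-flavor pair above the threshold counts as one hit are all invented for this example. The actual software may count differently.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical flavor profile: flavor word -> main flavor.
FLAVOR_PROFILE = {"lemon": "fruity", "pepper": "spicy"}

# Hypothetical toy vectors, invented for illustration only.
TOY_VECTORS = {
    "lemon": [0.9, 0.2, 0.1],
    "orange": [0.8, 0.3, 0.2],
    "lime": [0.85, 0.15, 0.1],
    "pepper": [0.2, 0.9, 0.1],
    "clove": [0.1, 0.8, 0.3],
}

def main_flavor_counts(text, profile, vectors, threshold):
    """Count hits per main flavor for all words in text."""
    counts = Counter()
    for raw in text.lower().split():
        word = raw.strip(".,;:!?")
        if word not in vectors:
            continue  # skip words that have no vector in the toy model
        for flavor, main_flavor in profile.items():
            if cosine_similarity(vectors[flavor], vectors[word]) > threshold:
                counts[main_flavor] += 1  # one hit for this main flavor
    return counts

review = "Orange and lime on the nose, then clove."
counts = main_flavor_counts(review, FLAVOR_PROFILE, TOY_VECTORS, threshold=0.9)
print(dict(counts))
```

The resulting counts per main flavor are the kind of numbers from which a flavor map can be drawn. Note that none of the words in the review appear in the flavor profile itself; they are matched through similarity alone.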

The data-driven tasting wheel is kept as simple as possible. It is optimized for our software to analyze text and recognize taste and flavor. Using other software might result in a different optimized flavor wheel. Words that are not in the flavor list are also taken into account, so the software catches words that are similar to the listed flavors.

Try it out!

Use this app to explore how similar a word is to the words in your text. Enter a word and add a text, or choose one of our presets. Use the Tasting Intelligence Dashboards to calculate your mapped tasting wheel. This and other text analytics are available in the Dashboard for Bloggers and Writers.