How Data Mining Reveals the World’s Healthiest Cuisines
Jean Brillat-Savarin was a 19th-century French lawyer famed for his writings on gastronomy. In his most famous work, he said: “Dis-moi ce que tu manges, je te dirai ce que tu es.” Or “Tell me what you eat and I will tell you what you are.”
This idea—that you are what you eat—has become increasingly popular. Since Brillat-Savarin’s time it has been used as the title of various cookbooks and health guides; for some it is a way of life.
If there is any truth in Brillat-Savarin’s phrase, it should have important implications for public health. We know from experience that Indian cuisine, for example, is very different from Mexican or Italian or Chinese. But we have little idea how to quantify these differences. Indeed, exactly how cuisines vary around the world and how they influence health is poorly understood.
Today that changes, thanks in part to the work of Sina Sajadmanesh at the Sharif University of Technology in Iran and a few pals. They have gathered a huge database of recipes from the Web, categorized them by cuisine, and then analyzed their relationship to each other and to other factors such as health measures in different parts of the world.
The work shows for the first time just how different cuisines are linked by similar ingredients, how specific ingredients help define certain cuisines, and how foods influence our health.
Sajadmanesh and co began by assembling a database from the recipe recommendation app Yummly. They downloaded some 150,000 recipes from 200 different cuisines, although they confined their analysis to the 82 cuisines that have more than 100 recipes. Together these recipes use some 3,000 ingredients.
They then determined the nutritional qualities of each recipe by calculating the amount of carbohydrate, protein, and fat each contains.
And they downloaded various country-level statistics such as the health expenditure as a percentage of GDP, the prevalence of obesity, and net immigration levels.
Finally, they used various data-mining and machine-intelligence techniques to mine all this data for interesting nuggets.
One measure that Sajadmanesh and co look at is the diversity of ingredients in a cuisine. So they measure how many different ingredients appear in the dishes from each country (their global diversity) and look at how these ingredients vary between dishes (their local diversity).
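The two diversity measures can be sketched roughly as follows. This is a minimal illustration, not the team's own code: the toy recipes are invented, and using the mean pairwise Jaccard distance between dishes as the local measure is an assumption, since the description above does not give an exact formula.

```python
from itertools import combinations

# Toy data, invented for illustration: cuisine -> list of dishes (ingredient sets).
recipes = {
    "A": [{"rice", "soy sauce", "ginger"}, {"rice", "soy sauce", "garlic"}],
    "B": [{"beef", "corn", "chili"}, {"wheat", "butter", "apple"}, {"salmon", "dill", "lemon"}],
}

def global_diversity(dishes):
    """Number of distinct ingredients across all dishes of a cuisine."""
    return len(set().union(*dishes))

def local_diversity(dishes):
    """Mean pairwise Jaccard distance between dishes (an assumed measure)."""
    pairs = list(combinations(dishes, 2))
    if not pairs:
        return 0.0
    return sum(1 - len(a & b) / len(a | b) for a, b in pairs) / len(pairs)

for name, dishes in recipes.items():
    print(name, global_diversity(dishes), local_diversity(dishes))
```

In this toy example, cuisine B draws on more ingredients overall and its dishes share none of them, so it scores higher on both measures.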
It turns out that countries with big immigrant populations tend to have the greatest diversity—places like the U.S. and Australia, for example. These countries have the greatest number of ingredients and the biggest variation between dishes, too. “This is mainly due to immigrants bringing their native culinary culture with them, which in turn makes the cuisines of their target country richer,” say Sajadmanesh and co.
Another interesting measure is the complexity of the dishes in each cuisine—in other words, the number of ingredients they use. For example, about half the dishes from the Southeast Asian country of Laos have more than 15 ingredients, whereas half the dishes from Russia have fewer than seven. So the cuisine in Laos is significantly more complex than Russian cuisine.
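A statement such as "half the dishes have more than 15 ingredients" is a statement about the median ingredient count. A minimal sketch of that comparison, using invented counts for two hypothetical cuisines rather than the real data, might look like this:

```python
from statistics import median

# Invented ingredient counts per dish; these are not figures from the study.
ingredient_counts = {
    "cuisine_x": [12, 16, 18, 15, 20],
    "cuisine_y": [5, 7, 6, 9, 4],
}

def dish_complexity(counts):
    """Median number of ingredients per dish: half the dishes use at least this many."""
    return median(counts)

for name, counts in ingredient_counts.items():
    print(name, dish_complexity(counts))
```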
In general, Sajadmanesh and co say, countries with large numbers of ingredients on offer tend to have the most complex dishes. But there are some exceptions. Chinese and Indian cuisine both have relatively few ingredients to choose from, but these are used in relatively complex dishes.
Why this happens isn’t clear. “Perhaps, these countries had or have good chefs that could cook more complex foods with the available ingredients,” suggest Sajadmanesh and co. Another possibility is that the cuisine from older cultures in these countries is more complex because it has had longer to evolve.
The team also looks at similarities between cuisines by comparing the ingredients they use. It turns out that some ingredients tend to define cuisines. For example, mozzarella cheese appears only in Italian cuisine, while the ground spice garam masala is a signature of Indian cuisine.
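One simple way to make both ideas concrete is to treat each cuisine as a set of ingredients. The sketch below is an assumption about how such a comparison could be done, not the method the team reports: it uses Jaccard similarity for the overlap between cuisines and a plain set difference to find ingredients unique to one cuisine. The ingredient lists are invented, apart from the two signature examples mentioned above.

```python
# Toy ingredient sets, invented for illustration.
cuisines = {
    "italian": {"mozzarella", "tomato", "basil", "olive oil", "garlic"},
    "indian": {"garam masala", "tomato", "garlic", "ginger", "onion"},
    "mexican": {"tomato", "onion", "chili", "garlic", "corn"},
}

def jaccard(a, b):
    """Shared ingredients divided by all ingredients used by either cuisine."""
    return len(a & b) / len(a | b)

def signature_ingredients(name, cuisines):
    """Ingredients that appear in one cuisine and in no other."""
    others = set().union(*(s for c, s in cuisines.items() if c != name))
    return cuisines[name] - others

print(jaccard(cuisines["italian"], cuisines["indian"]))
print(sorted(signature_ingredients("indian", cuisines)))
```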
Finally, the team looks at the correlation between the nutritional qualities of a cuisine and the health of the populations that eat it. They show that there is a clear correlation between obesity and cuisines that are dominated by sugar and carbohydrate. Conversely, health-related problems are lower among peoples who eat protein-rich cuisines.
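A correlation of this kind can be measured with a Pearson coefficient computed across countries. The sketch below assumes that approach, since the specific statistic is not named here, and the numbers are invented purely to show the calculation.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Invented values, one entry per hypothetical country:
# share of recipe calories from carbohydrate, and obesity prevalence.
carb_share = [0.40, 0.45, 0.50, 0.55, 0.60]
obesity_rate = [0.10, 0.14, 0.15, 0.22, 0.25]

print(round(pearson(carb_share, obesity_rate), 3))
```

A value close to 1 indicates that the two quantities rise together across countries, while a value close to -1 indicates that one falls as the other rises.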
That’s interesting work, but it comes with some caveats. Perhaps most significant is the limitation of the data set itself. An important question is how accurately the recipes from Yummly represent those from different cuisines around the world.
For example, the curries on offer in Indian restaurants in London are very different from those in Mumbai or Kolkata. Would both types of recipe be labeled as Indian on Yummly? Indeed, it’s hard to see how to classify Indian cuisines under a single label at all.
That raises the question of who is posting the Indian recipes on Yummly. Is it cooks from the Indian subcontinent or gastronomers from Soho?
It’s probably not hard to guess. It may be that Yummly’s recipes offer a view of global cuisine through the peculiar prism of wealthy, tech-savvy foodies from the developed world. Sajadmanesh and co could do more to check for any potential bias.
Even so, this kind of data mining offers a fascinating insight into the cuisines of the world and how they vary. Brillat-Savarin would surely be amazed.
Ref: arxiv.org/abs/1610.08469 : Kissing Cuisines: Exploring Worldwide Culinary Habits on the Web