Abstract

Understanding human perception and requirements on food for cancer prevention and condition management is important so that food applications can be tailored to cancer patients. In this paper, web scraping was conducted to understand the public’s perception, attitude, and requirements related to a plant-based diet as a recommended diet for cancer prevention and condition management. Text and sentiment analyses were carried out on results gathered from 82 social sites to determine whether noncancer and cancer patients use plant-based diets, how these diets are consumed, their benefits in the prevention and condition management of cancers, the existing myths/fake news about cancer, and what cancer patients need in a food app. The results of the text analysis highlighted gaps in existing apps, including a lack of credibility, owing to the abundance of fake news and myths about cancer, and a lack of endorsement by professionals. Future food apps should provide personalized diets that include both plant-based foods and meat, symptom management, a good user experience, credibility, and emotional and mental health support.

1. Introduction

Cancer is an important cause of morbidity and mortality worldwide, with an estimated 18.1 million new cases and 9.6 million deaths in 2018. The cumulative risk of incidence indicates that 1 in 8 men and 1 in 10 women will develop the disease in a lifetime [1]. Long-term cancer survivors represent a sizeable portion of the population. Plant-based foods may enhance the prevention of cancer-related outcomes in these patients [2]. Plant-based diets are a diverse family of dietary patterns defined by infrequent consumption of animal foods along with frequent intake of plant-based foods in the usual diet. A diet high in fruits and vegetables reduces the risk of cancer of the mouth, esophagus, lung, stomach, colon, and rectum. Furthermore, there is evidence of probable risk reduction for cancer of the larynx, pancreas, breast, and bladder [3]. According to the World Cancer Research Fund (WCRF) cancer prevention recommendations, 30-50% of all cancer cases are preventable by following a healthy diet and lifestyle [4]. Although these facts are widely known, only a fraction of the population follows the WCRF recommendations. Urgent action is needed to promote healthy plant-based foods in dietary guidelines to effectively reduce the risk of cancer.

Most people now use smartphones, and these devices, coupled with embedded sensors and modern communication technologies, are an attractive technology for monitoring an individual’s health [5]. For example, one study used smartphones and wearables to monitor the alcohol consumption of college students and provided insights into mobile technology [6]. Mobile health (mHealth) applications (apps) have gained popularity as interventions for health behavior change [7]. Research into developing apps aimed at modifying key lifestyle behaviors associated with chronic diseases and other health issues has yielded positive findings [8]. For example, symptom monitoring during chemotherapy via an app has been found to lengthen survival: 75% of patients using the app were still alive, compared to 49% of nonusers [9]. Reviews conducted by Schoeppe et al. [10] and Villinger et al. [11] provide comprehensive evidence that app-based mobile interventions are effective and highly promising for changing nutrition behaviors and nutrition-related health outcomes. However, current health apps often lack an evidence base and the involvement of medical professionals in their development. For example, Mobasheri et al. [12] examined 185 health apps that focused on breast cancer awareness and found that only 13% were developed with professional medical input.

In this paper, we have examined many online forums from Malaysia and Singapore to address the following five research questions:
(i) RQ1: do cancer patients use a plant-based diet and is it effective?
(ii) RQ2: what are the types, volume, frequency, and methods of cooking (where applicable) of fruits and vegetables used for the prevention and condition management of cancer?
(iii) RQ3: how does eating a plant help with a particular organ? How does it prevent or/and help during and after the different stages of cancer?
(iv) RQ4: what are the existing myths/fake news about cancer?
(v) RQ5: what are the requirements of a food app for cancer patients?

To facilitate answering the last question, three subquestions emerged:
(i) RQ5.1: what do cancer patients need in a food app?
(ii) RQ5.2: what are the missing features in the current cancer apps they might be using?
(iii) RQ5.3: how do cancer patients/caretakers search for food information/recipes online?

The rest of the paper is organized as follows, with the next section detailing related work. The following sections outline the methodology used and findings of the public’s perception, attitude, and requirements related to plant-based diets, followed by discussions. This paper contributes to providing greater insights and understanding of how a food app can be designed to help improve cancer patients’ diets and outcomes.

2. Literature Review

2.1. The Role of a Plant-Based Diet in Cancer Prevention

A plant-based diet has been shown to protect against the 15 leading causes of death in the world, including many cancers, and may offer benefits as a nutrition intervention to improve the management and treatment of these conditions [13]. Although the role of diet and lifestyle factors in health and disease is gaining more attention and emphasis, the benefits are still underestimated and undervalued. Commonly cited reasons for not eating fruits and vegetables are a lack of knowledge (e.g., not knowing how to cook them) and a dislike for their texture, smell, and taste [14]. However, the same study [14] found that people are willing to try them when they are taught about their health benefits or how to prepare/eat them. In addition, the kind of plant-based diet and the amount of intake depend on various factors such as whether a person is undergoing chemotherapy, the stage of cancer, and his/her age, gender, sex, and other psychosocial factors. It is also important to ascertain whether a diet is being recommended for prevention or to manage a particular stage of cancer. Unfortunately, there is still a lack of mHealth apps that can provide detailed dietary guidance or recommendations on fruits and vegetables for the prevention and condition management of cancer.

The findings of a recent study indicate that greater adherence to a plant-based diet index (PDI) is inversely associated with the risk of breast cancer [15]. A plant-based diet is also valuable in the primary and secondary prevention of colorectal cancer: epidemiological studies show a 46%-88% reduced risk of colorectal cancer for those following a plant-based diet [16]. A study involving a good representation of an ethnically diverse population (including both men and women of Asian, American Indian, Black, and Caucasian ethnicities of different ages, smokers and nonsmokers, and consumers and nonconsumers of alcohol) reveals that lower consumption of vegetables, fruits, fiber, and whole grains is associated with higher pancreatic cancer risk [17].

The National Cancer Society Malaysia, for example, provides useful nutrition tips for people living with cancer to be taken during cancer treatment and recovery [18]. However, the types of food, volume, and frequency change during and after treatment, and thus, a personalized diet is essential. This is to ensure that users will be able to adapt to what is recommended according to their body’s changing nutritional needs. Through a cognitive approach by understanding the patient’s needs, values, and psychosocial factors involved in nutritional behavior and food-related decisions alongside other variables (sex, age, and race), researchers have found that it is possible to achieve important clinical targets, to develop a personalized approach and to support concrete actions towards healthier diets thus preventing recurrences, monitoring chronic conditions, and supporting a good quality of life [19].

2.2. The Impact of Mobile Health Applications on Cancer

The appeal of smartphones for assistance in health promotion concurs with the trend that more people are seeking health information via mobile devices [20]. Researchers like Wang et al. [21] have advocated for smartphone interventions for long-term health management of chronic diseases. In this context, apps provide the opportunity to bring behavioral interventions into real-life situations where people make decisions about their health [22]. A study done by Viitala et al. [23] showed that patients’ sense of security and freedom increased after using mHealth apps targeted for cancer. Research has shown that, compared to those without mHealth apps, individuals with mHealth apps have significantly higher odds of using their smart devices to track progress on a health-related goal, to make a health-related decision, and in health-related discussions with care providers. Middelweerd et al. [24] highlight that smartphone users value health behavior apps that require low effort, are pleasant to use, and enable self-monitoring. The other requirements are that the apps should be developed by credentialed experts, provide advice on how to change (dietary) behavior, include positively framed alerts/reminders, provide accurate tracking functions, incorporate adequate privacy settings, and clearly show what the app will do. These factors need to be taken into consideration to improve the engagement and retention of the user [25].

The use of mobile health apps to provide help with nutrition has yielded positive findings [10, 11]. There are food apps catered for cancer, such as OncoFood, which helps patients track their daily dietary habits [26]. However, the app has limitations: patients have to spend too much time inputting data, and many of them wish for recipe suggestions as well as the ability to change existing and past data on food and prepared meals. Another study by Keaver et al. [27] reviewed the quality, nutrition content, and behavior techniques of 1149 apps aimed at those with cancer, but after two rounds of screening, only 12 apps were identified. There was a lack of strategies for implementation and a lack of indication of whether the information available is catered for specific cancer types or for specific stages of cancer or treatment. Out of the 12 apps, 6 also provided nonevidence-based information. This study concludes that little nutrition information is currently available on publicly available apps for cancer. Moreover, only 3% of those apps have had their content developed or evaluated by health providers, leaving open the question of whether those apps are reliable. The challenge is in developing apps that are appropriate for health tracking, monitoring, and interventions using evidence-based strategies. In addition, there is a lack of understanding of how wearable or smartphone sensors can be used for personalized diet management and interventions [28]. Research done by Cai et al. [29] further emphasizes the need for patients, nurses, and healthcare professionals to collaborate in the design of an mHealth app. Uncertainty is a common factor in general healthcare, and knowing how to navigate it is quite useful in any health technology [30]. In Malaysia, a cancer dietary app was developed to provide a healthy eating guide (advice ranging from healthy eating to eating problems, weight loss prevention, and increasing protein intake) that is uniquely tailored to local food choices, preferences, and ingredients [31].

2.3. The Use of Web Scraping and Text Analysis to Identify Requirements

Web scraping is a method used to extract a huge amount of information automatically from websites. It is also known as screen scraping, web data scraping, or web harvesting [32]. Web scraping allows the immense amount of information found on the internet to be compiled and analyzed to make sense of what is happening in a short amount of time [33]. Currently, web scraping has mostly been used to research food prices [34] and to extract recipes [35]. Previously, to understand the perception towards diet and food, surveys, records, 24-hour recall, and questionnaires were used [36]. The use of wearable cameras was previously implemented to understand the food consumption life cycle [14]. So far, web scraping has not been used as a digital ethnography method to find, analyze, and understand human perception of food, diet, and cancer.

Many people, including cancer patients, use social media to express how they feel and what they are going through, and to share their experiences with each other. For example, a breast cancer patient, Lisa Bonchek Adams, tweeted over 176,000 times about her own cancer experience [37]. Shaiket et al. [38] highlight that data analysis has previously been used to conduct an online diagnosis of diabetes with Twitter data, to estimate the average happiness of cancer patients from patient tweets, and to conduct a sentiment analysis on breast cancer screening, among many other applications.

To design an app for cancer patients, it is important to know what cancer patients want and need in an app. It is common to use focus groups to find out more about a particular cancer [39, 40]. However, the use of text analysis to identify requirements for the design and development of cancer apps is still rare and little understood. While a plant-based diet is effective in improving outcomes for cancer patients, there is a lack of mHealth apps that provide reliable plant-based dietary information and recommendations for cancer patients. As a result, the public tends to accept whatever information is at hand, which may not be true. As such, we conducted a text analysis to understand the public’s perception of these topics and to identify requirements for a food app to help improve cancer patients’ diets and outcomes.

Sentiment analysis, also known as opinion mining, is the process of automatically extracting information such as opinions, attitudes, emotions, and feelings from text. It is usually applied to reviews and social media. It calculates the aggregate sentiment polarity and classifies the sentiment as positive, neutral, or negative [41]. In sentiment analysis, results are represented as a score for each term: a positive score, a neutral score, and a negative score. Each score is used to determine how that sentence is perceived [42]. Microsoft Azure Machine Learning is a cloud service that can be used to calculate the contribution score of the user based on the metrics, and it offers about 100 techniques, including regression, classification, text analysis, and recommendation [43]. This study applied these methods to identify requirements for a plant-based food app, particularly suitable for cancer patients.

3. Methods

3.1. Categorizing and Source Gathering

Before starting the process of requirement gathering, we mapped the research questions from Section 1 into 5 categories. Each category had a main question and subquestions related to a certain topic. Category A focused on the reason for a plant-based diet, category B on the types of plants used, category C on the association between a plant-based diet and cancer, category D on myths about cancer and sources of cancer-related information, and finally, category E focused on the requirement for cancer prevention and condition management via a food app. For each category, search keywords were identified to get information from online forums such as LowYat and Reddit and social networking sites such as Twitter, Facebook, YouTube, and Instagram. The keywords were manually typed onto those sites, and relevant links were gathered to be used as sources of web scraping. This process resulted in a total of 82 links gathered with a minimum of 10 links per category required. Table 1 shows the number of links gathered for each category.

3.2. Web Scraping

To extract information from the links gathered, the links were run through a web scraper. Links from Facebook and Instagram were excluded from the scraping as they provided information in the form of pictures that could not be scraped by a text-based web scraper. YouTube videos whose comments were blocked or that did not have any comments were also excluded. The Beautiful Soup library in Python was used to scrape the links gathered. Multiple Python files were created, one per platform, each modified to cater to the structure of that platform and to match its source code. The source code indicated where the needed information was located on the page, such as under the HTML paragraph tag <p>. The Beautiful Soup library was then used to return all instances of the tag, and the resulting textual information was exported as Excel sheets. The results of web scraping provided more than 100 posts for each category, adding up to 3908 posts in total. Table 2 shows the entries provided by the links in each category.
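The tag-based extraction step can be illustrated with a minimal sketch. The study used Beautiful Soup; to keep the example runnable without third-party packages, this sketch swaps in Python's standard `html.parser` module instead. The class name `ParagraphExtractor`, the helper `extract_paragraphs`, and the sample page are hypothetical and are not taken from the study's scripts.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text inside every <p> element of an HTML page."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []  # finished paragraph strings
        self._depth = 0       # greater than 0 while inside a <p> element
        self._buffer = []     # text fragments of the current paragraph

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Join fragments and collapse runs of whitespace.
                text = " ".join("".join(self._buffer).split())
                if text:  # skip empty paragraphs
                    self.paragraphs.append(text)
                self._buffer = []

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def extract_paragraphs(html):
    """Return the text of all <p> elements found in an HTML string."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


# Hypothetical forum page; real pages would come from the gathered links.
SAMPLE_HTML = """
<html><body>
  <div class="post"><p>I switched to a <b>plant-based</b> diet last year.</p></div>
  <div class="post"><p>Meat is still my main source of protein.</p></div>
  <p>   </p>
</body></html>
"""

posts = extract_paragraphs(SAMPLE_HTML)
```

Each platform would likely need its own variant of the tag test in `handle_starttag`, since, as noted above, the page structure differs between sites.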

3.3. Text Analysis

So far, text analysis has not been used for requirement gathering, although it falls under the cognitive techniques of requirement gathering, which aim to obtain knowledge from stakeholders and from the stakeholders’ perspective and perception of the solution domain [44]. In this project, text analysis was used as a tool to gather requirements for a food app that caters to a user’s diet needs with a focus on cancer prevention and condition management.

Text analysis has made it easier to become aware of the public’s opinions on what they want or do not want. For example, from the analysis, we discovered that the words in favor of plant-based diets carry less weight than those not in favor, showing that the public mainly prefers diets that are not plant-based. Accordingly, the application can be designed to offer diets that are not entirely plant-based but that mix plant-based and meat recipes to keep general users healthy. Text analysis can also help identify trends among the public. For example, keywords such as ‘research’ and ‘patients’ showed that there is a trend emphasizing reliable sources of information among the public.

Text analysis also allowed a wider audience to be reached. Interviews, in contrast, can involve only a limited number of people, but they give researchers the freedom to ask follow-up questions. Instead of obtaining keywords and having to infer what they mean, researchers can ask questions to gain a better understanding of the interviewee’s point of view. As seen from the keyword findings, especially in category C, text analysis may provide only surface-level answers without the opportunity to dive further into the keywords obtained.

Sentiment analysis has proven to be very useful and plays a huge role in understanding human perception. Sentiment analysis is being used for business, politics, disease outbreaks, sports, data security, and health care [45]. Sentiment analysis has helped in providing useful information for designers. Knowing how the public perceives a particular product allows the designer to know whether to venture deeper into that product or to completely stray away from it [46].

Sentiment analysis has also been applied to data from the online vegan community, where there are a lot of recipes and reviews. The purpose of that study was to find the sentiments in the reviews and comments. The results of the analysis gave an idea of which recipes vegans like the most [47]. Sentiment analysis has also been used as part of a systematic review of health in online communities. From the results, five roles of authors were identified as well as demographic factors. Health-related problems and healthcare treatments were categorized and studied by sentiment analysis [47]. Sentiment analysis has also been used to find out about users’ food preferences. Food names from comments made by users were extracted and analyzed. This analysis showed a high level of precision in identifying the user’s preference [48]. In another study, Twitter messages were explored and analyzed to determine the contents of tweets related to four eating situations: breakfast, lunch, dinner, and snack. This study provided a framework for understanding food intake and food selection [49].

Before proceeding with text analysis, the records obtained from web scraping needed to go through a process of cleaning to remove unnecessary keywords, time stamps, and usernames associated with the posts. The unnecessary keywords included those that were used to search for the links such as Malaysia, cancer, vegetarian, and Singapore. Those keywords were not needed as they describe the research broadly and do not cater to specific categories.

A Python script was used to remove null spaces and common English stop words such as “of, for, at, to, you, a, i, the”. Stop words are words that occur often in the English language but do not provide significant meaning in terms of Natural Language Processing (NLP). Lemmatization was also performed to reduce the different forms of a word to a single form. It aims to remove the inflectional endings of a word to give its base form. For instance, the tokens (words) “eaten” and “ate” would both be reduced to “eat”; the lemma returned for a token can depend on whether the word is used as a noun or a verb [50]. The pandas and spaCy libraries were used for data handling and pre-processing, while Python’s built-in time module and collections.defaultdict were used for timing operations and word frequency counting, respectively.
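The cleaning steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's script: the study used spaCy for lemmatization, whereas this sketch substitutes a small hand-written lookup table (`LEMMAS`, hypothetical) so that it runs with the standard library only. The stop-word set contains only the words quoted in the text, and the function names are illustrative.

```python
import re
from collections import defaultdict

# Stop words quoted in the text; a real list would be far longer.
STOP_WORDS = {"of", "for", "at", "to", "you", "a", "i", "the"}

# Hypothetical lemma lookup standing in for a trained lemmatizer.
LEMMAS = {"eaten": "eat", "ate": "eat", "eating": "eat", "vegetables": "vegetable"}


def preprocess(post):
    """Lowercase, tokenize, drop stop words, and map tokens to base forms."""
    tokens = re.findall(r"[a-z]+", post.lower())
    return [LEMMAS.get(token, token) for token in tokens if token not in STOP_WORDS]


def word_frequencies(posts):
    """Count how often each cleaned token occurs across all posts."""
    counts = defaultdict(int)
    for post in posts:
        if not post or not post.strip():
            continue  # skip empty records
        for token in preprocess(post):
            counts[token] += 1
    return dict(counts)


# Hypothetical scraped posts, including one empty record.
posts = ["I ate the vegetables", "", "You have eaten vegetables for a year"]
freq = word_frequencies(posts)
```

In this toy input, "ate" and "eaten" are counted under the same base form, which is the effect lemmatization is meant to achieve before keyword weights are computed.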

Using the ‘Add-in’ feature in Microsoft Excel, Microsoft Azure Machine Learning was applied to each category to determine the sentiment of each sentence. Each category was then divided into 2 parts, one for positive sentiments and one for negative sentiments. The documents were then passed through a term frequency-inverse document frequency (TF-IDF) calculator to identify the 50 key words or terms based on their relevance to each category and to generate their associated weights. The more often a word occurs in a text document, the higher its TF, while the IDF measures the importance of a term across the documents. The TF-IDF weight associated with a keyword is obtained by multiplying its TF and IDF [51]. From the derived 50 keywords, the top 10 with the heaviest weights were chosen to represent each category. Choosing the top 10 keywords narrows down the results to provide a clear representation of the public’s sentiments to address our research questions. All documents obtained were combined and classified per category.
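A minimal sketch of the TF-IDF weighting and top-keyword selection is given below. The exact TF-IDF variant used by the calculator is not stated, so this sketch assumes a common one: TF as the term count divided by the document length, and IDF as the natural logarithm of the number of documents divided by the number of documents containing the term. Summing weights across documents to rank keywords is also an assumption, and the function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            # weight = TF * IDF, under the variant assumed above
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def top_keywords(documents, k):
    """Rank terms by their TF-IDF weight summed over all documents."""
    totals = Counter()
    for doc_weights in tf_idf(documents):
        totals.update(doc_weights)
    return [term for term, _ in totals.most_common(k)]


# Hypothetical cleaned posts from one category.
docs = [
    ["meat", "protein", "meat"],
    ["protein", "fruit"],
    ["protein", "water", "water", "water"],
]
keywords = top_keywords(docs, 2)
```

In this toy input, "protein" appears in every document, so its IDF is zero and it drops out of the ranking, while terms concentrated in a single post rise to the top. This is the sense in which the IDF measures a term's importance across the documents.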

3.4. Analysis of Keywords

The final step in requirements gathering was analyzing the meaning of the keywords to understand the public’s sentiment about the category they represent. The context behind the keywords was understood by scanning the documents for where they occurred and how they were used by users who mentioned those keywords.

3.5. Analysis of Sentiments Using Microsoft Azure Machine Learning

All documents in each category were consolidated into one document before it was analyzed. This process was done manually and allowed us to understand how negatively or positively the public reacts to a certain topic and their views towards it.

4. Results and Discussions

4.1. Findings

The data gathered from text analysis was the public’s opinions about plant-based diets and their correlation with cancer. The text analysis provided keywords that can be used to create an application that best suits their needs concerning cancer, especially those who are already diagnosed with it. The keywords were not based on scientific research but rather on the public’s own beliefs. It is important to analyze those keywords as they can give information about what people might expect from an application that would cater to their health. The keywords can be used to meet the needs of users and their expectations.

For each of the categories identified, the top ten keywords that best answered the questions of the category were chosen to represent it, in order of their positive and negative weights. The percentage of each sentiment was calculated by dividing the number of responses with that sentiment by the total number of responses. For example, if there are 100 responses, of which 50 are positive, 25 are negative, and 25 are neutral, then 50% of the responses in this category are positive. Words with a higher weight were considered first, and if those words did not satisfy the category, text analysis was run again to get substitute words that were better suited for the category. Then, those words were gathered to represent each category, and their weights were summed to determine the importance and relevance of a category in comparison to others. Table 3 displays each category, its keywords, and its associated weight. The keywords are put in order of their respective weights.
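The percentage calculation described above amounts to a count divided by a total. The short sketch below reproduces the worked example from the text; the function name and the label strings are illustrative assumptions.

```python
from collections import Counter


def sentiment_percentages(labels):
    """Return the percentage of responses carrying each sentiment label."""
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(labels)
    return {
        label: 100 * counts[label] / total
        for label in ("positive", "negative", "neutral")
    }


# The worked example from the text: 100 responses in one category.
labels = ["positive"] * 50 + ["negative"] * 25 + ["neutral"] * 25
shares = sentiment_percentages(labels)
```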

4.2. Analysis of Data Collected

The data obtained from each category were analyzed to evaluate whether they answered each question.

4.2.1. Category A

This analysis identified top keywords that were either positive or negative sentiments in response to the question, “Do cancer patients use a plant-based diet and is it effective?” This category gave insight as to why a plant-based diet would be considered or not.

(1) Positive Sentiment: Good, Vegetarian, Want, Body, and Healthy. A portion of the public acknowledged that eating a plant-based diet, or even being a vegetarian, is good for the body and is, in fact, healthy.

(2) Negative Sentiment: Meat, Vegan, Protein, Fat, and Sugar. For the negative sentiment, meat came in as the first word (with the heaviest weight), showing that many people would prefer not to go on a plant-based diet because they preferred meat in their meals. The negative sentiment came from having to reduce or completely eliminate meat from their meals. As such, meat was the main barrier to pursuing a plant-based diet. Many users also had a very negative sentiment towards the word vegan, which showed that a majority of people do not want to be vegan. There was also mention of how meat, compared to a plant-based diet, provides the protein and fat that the human body requires. Of all the posts analyzed in this category, 50% expressed negative sentiment towards a plant-based diet, only 39% were positive, and 11% were neutral. The keywords discovered through text analysis indicated that many people in Malaysia and Singapore were not willing to completely give up their current diets to pursue plant-based ones due to a preference for meat in their meals. Those who pursued a plant-based diet mainly did so as a way of taking better care of their health and not out of personal preference or an attraction to the diet. In general, the reaction toward a plant-based diet was negative. A study by Tee [52] shows that 86% of vegetarians want to switch back to a non-plant-based diet. The focus of a food app for cancer patients should be on providing users with information about plants and their correlation with cancer, followed by how to use plants in recipes, pointing to reliable sources of information for users who would like to know more about cancer, and finally, showing how effective plant-based diets are in battling cancer.

4.2.2. Category B

The public perception in response to the question, “What are the types, volume, frequency, and methods of cooking (where applicable) of fruits and vegetables used for prevention and condition management of cancer?” can be seen in this analysis.

(1) Positive Sentiment: Water, Protein, Alkaline, Fruits, and Acidic. This category gave insight into what kinds of fruits and vegetables were used in plant-based diets and how they were used. The keywords obtained revealed that most people preferred plants with a high protein content and were knowledgeable about the alkaline and acidic content of their food. There also seemed to be a slight inclination towards alkaline foods rather than acidic foods. Water carried the most weight among the keywords describing how vegetables were cooked and eaten, with 2% of the responses expressing a preference for cooking such food in water.

(2) Negative Sentiment: Bread, Protein, Food, Meat, and Need. Bread was mentioned a few times as a common staple in Malaysian households that would be hard to give up on any diet. Moreover, there was a perception that protein-based foods suitable for vegans are mostly processed and, therefore, not healthy either. This result is consistent with the observation that the supply of animal-based protein in Malaysia has been increasing at a faster rate than that of vegetable-sourced protein: total animal-based protein products have increased by 59.1%, while vegetable protein increased by only 11.9% [53]. Not only is the frequency of daily consumption of both vegetables and fruits far lower than desired, but Malaysia has also topped the league tables as the most obese nation in Southeast Asia since 2014 [14]. The strong evidence that obesity increases the risk of several types of cancer, including colorectal, breast, and prostate, shows how imperative it is that this research addresses the lack of plant-based diets to provide better health outcomes for Malaysians [54].

Sentiment analysis revealed that 41% of the posts in this category were positive, 50% were negative, and 9% neutral. One of the underlying factors for the negative responses was the cost of treatment and/or hospital. These words were mentioned in relation to our question “prevention and condition management of cancer” and while the results of this analysis did not answer our main question, it showed how patients are concerned about the cost of food in order to eat healthier.

4.2.3. Category C

This category responded to the question, “How does eating a plant help with a particular organ, and how does it prevent or/and help during and after the different stages of cancer?”

(1) Positive Sentiment: Risk, Breast, Genetic, Colorectal, and Colon. Breast, colorectal, and colon cancers were the most frequently mentioned types of cancer, reflecting people’s beliefs. Plant-based diets could be used during the treatment of cancer to maintain health, to battle other diseases by boosting immunity, and to help vaccines maintain the immunity gained. Plant-based diets are also believed to extend a person’s lifespan and give a higher chance of survival [2]. Genetic was a word mentioned several times, referring to how most cancer patients that these users had come across were related to someone else who had cancer, or to the view that cancer is mostly genetic.

(2) Negative Sentiment: Diagnosed, Stomach, Report, Medical, and Research. Most of the top keywords from the negative sentiment referred to the stories of different people being diagnosed and treated. These results did not contribute to the question. While this category surfaced different cancers, it did not go deep enough to answer the question for this category. This raised the following suggestions as to what this could mean: (1) the public did not see how eating a certain plant helps with a particular type of cancer, and (2) the public seemed to view eating a plant-based diet as providing benefits, but they did not have any idea about how consumption of a plant-based diet affects someone in different stages of cancer. We therefore conclude that the web scraping and text analysis did not provide enough information to answer the question. Users would prefer recipes that combine plants with other food items such as meat rather than meals that focus solely on fruits and vegetables. Therefore, when designing a food app, recipes that involve both meat and plant components would have to be considered.

4.2.4. Category D

This category focused on identifying the existing myths or fake news about cancer. The analysis aimed to identify the claims that the public has encountered and might consider true.

(1) Positive Sentiment: Breast, Cure, Vitamin, Good, and Water. The public perception of the keyword ‘breast’ centered on how to find out whether someone has breast cancer and on debating whether taking certain pills affects the breast. The findings showed that users of food applications focused on the prevention and condition management of cancers were not only cancer patients but also people who would use those applications to maintain their health and prevent cancer. As such, future food apps will need to be designed in a way that aids cancer patients on their road to recovery and also considers users who are conscious of their health. ‘Cure’ was mentioned when several people were giving different ideas on how to get cured of cancer. Internet threads are, therefore, a common source for receiving or correcting facts. This information is useful for deciding how to present information to users on a food app based on sources they would trust rather than on strangers. The keyword ‘water’ was associated with the belief that drinking a lot of water can help prevent cancer. It was also believed that taking vitamins regularly can improve immunity and reduce the risk of cancer, while others believed that vitamin C can cure cancer.

(2) Negative Sentiment: Treatment, Health, Years, Risk, and People. The keyword ‘treatment’ was used several times in discussions stressing that it was important not to believe that vitamin C will cure cancer and that it was not healthy to take vitamin C alone without any treatment.

From this question, these were some common myths and facts that the public perceived:
(i) Pills affect breast cancer
(ii) Drinking water can reduce cancer
(iii) Eating vitamin C can cure cancer
(iv) Treatment is better than just taking vitamins

The results from this category indicated that information provided by any food app will need to be concise, accurate, and drawn from trustworthy sources to gain user trust.

4.2.5. Category E

This last category focused on identifying requirements to support app development for cancer patients and referred to the questions:
(1) What do cancer patients need in an app?
(2) What are the missing features in the current cancer apps they might be using?
(3) How do cancer patients/caretakers search for information/recipes online?

(1) Positive Sentiment: Patients, Help, Research, BookDoc, and Android. This category shows what expectations potential users would have for an application that caters to cancer care. BookDoc, an application that connects patients to healthcare professionals, was mentioned several times as an application that could help cancer patients. BookDoc’s features can be used as an indication of what makes a healthcare application successful. BookDoc allows teleconsultation 24/7, letting users communicate with a doctor without the need to travel. Other features include searching for and booking the user’s preferred healthcare professional, providing wellness programs, serving as a platform for users to buy products, and even providing nutritional and dietary advice from professionals [55].

(2) Negative Sentiment: Novartis, iPhone, Breast, Bra, and Detect. Novartis was mentioned a few times as an iPhone application developed for mobile phone users. Novartis’s aim is to provide a community of people who can support each other in this journey of cancer. The most prominent keyword on the list was ‘iPhone’, suggesting that more of the cancer applications launched are for iOS than for Android. One suggestion is to ensure that the application lets users provide feedback on how it functions. Based on available apps such as BookDoc, users would expect the application to help cancer patients on their journey, to provide news and research information on cancer, and to be free. The application should also provide help for high-risk individuals or patients.

Results from this category provided suggestions as to what features cancer patients would like to have in a food app:
(i) To help cancer patients while they are going through treatment and surgery
(ii) To provide real news and research on cancer that is backed up by professionals
(iii) To provide help in terms of consultation, dietary features, and dietary advice for high-risk individuals or patients
(iv) To incorporate more plant-based ingredients in their diet
(v) To get emotional and mental help

5. Features

Most cancer apps such as BookDoc [55] and OncoFood [26] do not prescribe personalized diet recommendations. They offer patients community support via an online network, reduce stress and aid the healing process via meditative music and art, serve as a calendar to help schedule appointments, work as an online health journal that could improve treatment by tracking medications and blood counts, provide educational videos for pain management, identify suitable physical activities for patients, and act as a handy dictaphone, equipped with medical jargon, to record answers from doctors and nurses. Most of these apps, however, do not offer recipe recommendations or specify what to eat or avoid depending on the cancer type, the cancer stage, and the treatment the patient is currently undergoing, based on reliable sources of information confirmed or endorsed by a health professional.

5.1. Personalized Diet

The results of web scraping showed that patients wanted an app that helps them with their diet. However, the app will need to provide patients with a personalized diet, as patients might have different food preferences and allergies and may need different types of food elements that can help them at different stages of cancer. Because patients also undergo different types of cancer therapy, the food they consume can affect them differently. Therapies can also affect the patient’s sense of smell, taste, appetite, gastric capacity, or nutrient absorption [56].

5.2. Symptoms Management

Another result of the web scraping was the concern about recurrence after the first cancer treatment. Different treatments also affect patients differently, in both positive and negative ways. Symptom management is an important part of caring for cancer patients [57] and will need to be taken into consideration in the design of an app.

5.3. User Experience

User experience is a factor that needs to be considered when developing the app. In designing this app, there was a need to understand current public perception, that is, to identify the problems that users might face with current apps and ultimately to provide a design that solves them [58]. From the research done, it can be concluded that users prefer a low-cost app that is easy to use and reduces manual input [26]. Since users are also not yet fully ready for plant-based diets, it is important to still include meat in this app. An app with a good user experience should aim to meet at least these factors.

5.4. Credibility

Web scraping and background research show that 3% of 1149 apps lack credibility or the input of professionals, based on the research by Keaver et al. [27]. Having accurate medical information is an important aspect of credibility. Web-based medical information is viewed skeptically, as it is known to be misleading and inaccurate. The results in Categories C and D also showed that there was a lot of misinformation about cancer on the Internet. Designing the app with a health professional will help ensure that the information available on the app is accurate and not misleading [59].

5.5. Emotional and Mental Health Support

The results from the web scraping and text analysis showed that patients need emotional and mental help. There is a lack of mental health support in the cancer community [60, 61]. To close this gap, there are different guidelines on how to support a patient in these times, which include helping them with personal health goals, coping skills, healthy sleeping, and relaxation methods, among others [62].

5.6. Limitations

While a plant-based diet is generally good for enhancing health and reducing cancer risk, certain plant-based diets might not be beneficial for cancer patients. First, a plant-based diet might not provide the sufficient nutrition needed by cancer patients [2]. Second, while β-carotene is an important source of vitamin A [63], it has also been found to increase the risk of lung cancer in smokers [64]. As such, it is important to also consider foods that can be dangerous for cancer patients when suggesting any recipes.

Gathering sources through web scraping was the first step in this process and came with a few limitations. A plant-based diet, and even diet in general, differs from region to region due to the availability of fruits and vegetables, environment, culture, and many other factors [65]. With that in mind, there was a need to be specific about the regions we scraped from, since the project aimed to create a food app for cancer patients in Malaysia and Singapore. We also had only a few forums and social media sites to choose from and had to exclude information found in videos and images. In forums, information could be repetitive, and some posts were meant to be ads. The same text was sometimes posted throughout a source by the same person or even by different people.
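
The repetition problem described above can be reduced with a deduplication pass before analysis. The following is a minimal sketch only, assuming scraped posts are available as plain strings; the helper names `normalize` and `deduplicate_posts` and the sample posts are hypothetical and are not part of the pipeline used in this study.

```python
from __future__ import annotations

import re


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace
    so that trivially different copies of the same post compare equal."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def deduplicate_posts(posts: list[str]) -> list[str]:
    """Keep the first occurrence of each post, comparing normalized text."""
    seen: set[str] = set()
    unique: list[str] = []
    for post in posts:
        key = normalize(post)
        if key and key not in seen:
            seen.add(key)
            unique.append(post)
    return unique


# Hypothetical sample: three copies of one post plus one advertisement.
posts = [
    "Eat more vegetables to lower cancer risk!",
    "eat more vegetables to lower cancer risk",
    "BUY our miracle juice now",
    "Eat more   vegetables, to lower cancer risk.",
]
unique_posts = deduplicate_posts(posts)
```

This only catches exact copies after normalization. Near-duplicates with reworded sentences, and advertisements such as the third sample post, would still need separate filtering.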

Although the documents were cleaned at the beginning to remove stop words and unnecessary keywords, they contained a large amount of data, so it was difficult to go through each of them to remove all unwanted keywords, and the cleaning process had to be repeated. Keywords extracted through TF-IDF were not always used in the same context by all users. For example, one user could mention the keyword “meat” to indicate they are in favor of meat-based diets while another could mention it to indicate they were against such diets. The majority opinion was taken as the main context behind a given keyword.
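
To make the two ideas in this paragraph concrete, the sketch below scores keywords with a common textbook form of TF-IDF and then picks the majority sentiment label for one keyword. It is illustrative only: the exact TF-IDF variant and tooling used in this study are not specified here, and the function names `tf_idf` and `majority_context`, the toy documents, and the sentiment labels are all assumptions.

```python
from __future__ import annotations

import math
from collections import Counter


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: score} dict per tokenized, non-empty document.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def majority_context(labels: list[str]) -> str:
    """Return the most common sentiment label attached to a keyword."""
    return Counter(labels).most_common(1)[0][0]


# Toy documents (assumed already cleaned and tokenized).
docs = [
    "meat meat protein diet".split(),
    "vegetables fruit diet".split(),
    "vegetables diet water".split(),
]
scores = tf_idf(docs)
top_keyword = max(scores[0], key=scores[0].get)

# Hypothetical sentiment labels for three posts mentioning "meat".
meat_mentions = ["positive", "negative", "positive"]
meat_context = majority_context(meat_mentions)
```

In this toy corpus, "diet" appears in every document and therefore scores zero, while "meat" is frequent in one document and absent elsewhere, so it ranks highest there. A majority vote discards the minority context entirely, which is the limitation noted above.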

While this paper focused on the use of TF-IDF and Microsoft Azure Machine Learning for analysis, various other techniques can also be used for text analysis. Behaviour Change Techniques Taxonomy version 1 (BCTTv1) has been applied to some fitness and food mobile apps together with sentiment analysis, with user feedback collected through online reviews [66]. Other text mining techniques have been used as big data analysis tools for food science and nutrition. These include word association analysis, text classification, text clustering, and text modeling [67].

6. Conclusion

With the increase in the use of food apps and those dedicated to health, there are still gaps when it comes to mHealth apps for cancer patients and the use of a plant-based diet to help cancer patients with symptom management. This paper aimed to analyze results from social media to identify human perception and sentiment on plant-based diets and cancer. The results of the web scraping showed that the majority of people did not want to have a completely plant-based diet, that there was a lack of apps dedicated to helping with personalized diet, and that there were a lot of myths and fake news about cancer. In conclusion, the following design requirements should be considered in the development of a plant-based food app for cancer patients: support for personalized diet, good symptom management, good user experience, good credibility, and emotional and mental health support.

Data Availability

The sentiment analysis data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors have no conflicts of interest to declare.

Acknowledgments

We would like to thank Research Square, which published a pre-print of this manuscript [68]. The research has been supported by the Fundamental Research Grant Scheme (FRGS) through the project: mHealth App: Prevention and Management of Cancer via an AI-Integrated Mobile Application to Recommend Plant-Based Diets by the Ministry of Higher Education (MOHE), Malaysia.