Abstract

The growth of the elderly population has posed several challenges that current healthcare systems are not equipped to handle. In the past few years, the scientific and industry communities have collaborated closely to provide feasible solutions capable of addressing the growing demands of people with special needs, namely, in terms of assistance and improvement of their overall quality of life, which led to the development of ambient assisted living (AAL). Despite the general consensus regarding their positive impact on the user’s daily life, several challenges compromise the overall adoption of these ecosystems. As a consequence, the research undertaken so far has focused on the mitigation of technical limitations, overshadowing user-related limitations, namely, the ecosystem’s usability. This article presents a parametrization of the literature guidelines, which provides end-users with a consistent and accurate way of using the heuristic methodology to assert an interface’s usability without relying on external entities with specialized know-how.

1. AAL Ecosystems

The shift of the world’s demographic pyramid is a phenomenon that affects both western and eastern civilizations. The effects of higher life expectancy, declining birth rates, and the overall improvement of healthcare services are noticeable upon a close inspection of current worldwide population distribution trends (see Figure 1) [15].

This new paradigm presents a unique set of conditions that will have a direct impact on the modus operandi of multiple sectors, with special emphasis on the health sector, where the demand for services capable of assuring the elderly’s wellbeing has grown over the years. This demand, combined with human resource shortages, the lack of patient-oriented approaches, and the rapidly rising costs of elderly care, threatens to hinder core service efficiency and availability, leading to their long-term collapse. As a consequence, an attempt was made to develop ICT-based solutions capable of tackling the economic and functional challenges of the assistance services within the health sector and promoting the elderly’s quality of life and autonomy—the ambient assisted living (AAL) ecosystems [5]. Despite significant improvements in the field, multiple challenges still need to be tackled to make widespread market adoption viable. These challenges can be segmented into technical challenges (the system’s bottlenecks from an engineering standpoint) and end-user challenges (the main barriers to the product’s acceptance as perceived by users) (see Table 1).

Regardless, an effort has been made to overcome all these drawbacks and promote the ecosystems’ mass adoption. In terms of usability, multiple studies have been conducted to identify the range of factors that have a direct impact on a product’s usability, and several approaches have been proposed to assist in the process used to evaluate this property within the defined context [19–24]. Despite the research effort undertaken, the results produced are still not sufficient to consider this a closed issue. To address it, the authors’ initial proposition was to empower product manufacturers and provide them with a simple and feasible way of evaluating and monitoring their product’s usability. From all the available methodologies, the one eligible to be executed in an enterprise setup, due to its inherent cost and speed of execution, was the heuristic-based one. However, it also presents limitations that compromise its adoption, namely, the accuracy of its results and its restricted applicability.

Considering the challenges presented, this article provides a parametrization of the usability guidelines. Our aim is to (1) reduce the subjectivity level typically found in heuristic-based methodologies, (2) improve their overall accuracy and the consistency of their results, (3) extend their accessibility/applicability to nonusability experts, and (4) minimize the effort typically related to their automation.

A thorough literature search was conducted in order to identify what can be learned from the best practices depicted, how they can be applied in a practical scenario, and how the inclusion of automation can be a feasible option in an enterprise context to mitigate the flaws typically detected in multiple medical applications in the field.

1.1. Usability

Usability is a multidimensional property that reflects the scope in which a product/service is expected to be used [25, 26]. This characteristic varies depending on its end-users, the product/service purpose, and the application context. To ensure its compliance, the design process is required to take into account multiple elements [25, 27–31]:
(i) Efficiency: measures the speed at which users accurately perform a task in the interface
(ii) Effectiveness: measures the accuracy with which the actions required to complete a task in the interface are executed by the user
(iii) Satisfaction: measures the interface’s friendliness and its level of compliance with the user’s needs
(iv) Learning curve: measures the speed at which the end-user is able to assimilate the knowledge needed to use the system properly
(v) Memorability: measures the difficulty behind the retention of the knowledge required for the user to handle the system properly
(vi) Context compliance: measures the compliance of the product/service created in terms of its applicational context
(vii) Security: measures how the use of the depicted product/service impacts the user’s data integrity
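As an illustration only, the first two dimensions lend themselves to direct quantification from task logs. The following Python sketch (the TaskRun structure and the sample values are hypothetical, not taken from the study) computes effectiveness as the task completion rate and efficiency as completed tasks per minute of interaction time:

from dataclasses import dataclass

@dataclass
class TaskRun:
    completed: bool      # did the user finish the task successfully?
    duration_s: float    # time spent on the task, in seconds

def effectiveness(runs: list[TaskRun]) -> float:
    """Share of task executions completed successfully."""
    return sum(r.completed for r in runs) / len(runs)

def efficiency(runs: list[TaskRun]) -> float:
    """Successfully completed tasks per minute of interaction time."""
    total_minutes = sum(r.duration_s for r in runs) / 60
    return sum(r.completed for r in runs) / total_minutes

runs = [TaskRun(True, 42.0), TaskRun(True, 55.5), TaskRun(False, 90.0)]
print(f"effectiveness: {effectiveness(runs):.0%}")         # 67%
print(f"efficiency: {efficiency(runs):.2f} tasks/minute")  # 0.64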

To enforce usability throughout the development lifecycle, it is mandatory to integrate into the process practices that aid both developers and designers. For this purpose, it is imperative to stipulate a set of guidelines to be adopted, as well as evaluation scales and methodologies for the identification and prioritization of usability bottlenecks.

1.2. Guidelines

The search for a set of generic rules, not bound to any applicational context, to assist the team during the development cycle is a theme thoroughly explored in the literature. The first set of principles was proposed by Gould and Lewis in 1985 [32, 33]; however, their limited applicability hindered their adoption and acceptance by the scientific community [34, 35]. To address the gap between theoretical definition and practical application, in the following years, several authors proposed their own versions of usability principles, which considered their applicability in a real environment.

In 1990, Nielsen and Molich proposed 10 heuristic principles [36–39] that emphasized the following guidelines: (1) visibility of system status, (2) match between system and the real world, (3) user’s control and freedom, (4) consistency and standards, (5) error prevention, (6) recognition rather than recall, (7) flexibility and efficiency of use, (8) aesthetic and minimalist design, (9) help users recognize, diagnose, and recover from errors, and (10) help and documentation. In 1995, Constantine proposed 11 rules based on multiple interface-related subjects (access, efficiency, progression, support, context, structure, simplicity, visibility, reusability, feedback, and tolerance) [40]. In 1996, Powalsa proposed 10 cognitive principles [33, 37, 41] focused on a holistic analysis of the usability evaluation process. In 1998, Shneiderman proposed a set of 8 golden rules [42] whose definition stressed scopes already enforced by its predecessors. In 2000, Susan Weinschenk and Dean Barker proposed 20 principles, which resulted from the combination of the Jakob Nielsen principles with vendor-specific guidelines (Apple and Windows) [43], focused on multiple topics (user’s control, human limitations, modal integrity, and linguistic clarity, among others). In 2003, Tognazzini proposed another set of principles covering a total of 19 areas that range from broad subjects, such as learnability and readability, to more practical and specific ones, such as colour blindness and Fitts’s law (a human interaction prediction model published by Paul Fitts in 1954) [44, 45]. This set was the first to include accessibility elements within its heuristic set and stress their importance to the product’s acceptance.

Additionally, the emergence and mass adoption of new interface devices (smartphones and smart TVs) motivated the creation of specific guidelines capable of taking their design restrictions into account. In the mobile domain, several proposals with a well-defined set of heuristics have been made by Silva et al. [46] and Inostroza et al. [47]. In the smart TV domain, the study by Martins et al. [48] should be highlighted, which aimed at adapting the Jakob Nielsen heuristics to that platform’s specific technical constraints.

Among the previously mentioned principles, the most widely accepted, due to their extensive applicability in multiple use cases, maturity level, and adaptability, are the Jakob Nielsen and Rolf Molich heuristic principles.

1.3. Scales

The scales used in the evaluation process are aimed either at quantifying the product’s usability level or at providing a formal procedure to prioritize each bottleneck detected according to its criticality and severity. They can be segmented into two groups:
(i) User-based: a quantification process based on the information extrapolated from the user’s reactions, opinions, desires, and feelings expressed during the interaction process (e.g., the ICF-usability scale [49] and Likert [50])
(ii) Heuristic-based: a prioritization process performed within the scope of heuristic methodologies that identifies the critical flaws in need of further improvement (e.g., the Jakob Nielsen scale [51] and the Joe Dumas and Ginny Redish scale [52]); a sketch of the Jakob Nielsen scale is given below
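As an illustration of the heuristic-based group, the sketch below encodes Jakob Nielsen’s commonly cited 0–4 severity rating scale and uses it to prioritize a list of findings; the findings themselves are invented for the example:

from enum import IntEnum

class NielsenSeverity(IntEnum):
    """Jakob Nielsen's 0-4 severity rating scale for usability problems."""
    NOT_A_PROBLEM = 0  # not considered a usability problem at all
    COSMETIC = 1       # fix only if extra time is available
    MINOR = 2          # low-priority fix
    MAJOR = 3          # important to fix, high priority
    CATASTROPHE = 4    # imperative to fix before release

findings = [
    ("missing progress bar", NielsenSeverity.MAJOR),
    ("inconsistent button colour", NielsenSeverity.COSMETIC),
    ("data loss on navigation", NielsenSeverity.CATASTROPHE),
]

# Prioritize the bottlenecks in need of further improvement first.
for label, severity in sorted(findings, key=lambda f: f[1], reverse=True):
    print(f"[{severity.name}] {label}")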

1.4. Methodologies

Evaluation methods used to identify usability bottlenecks and provide feedback to the development team are required in order to mitigate those bottlenecks accordingly [53]. According to their execution requirements, they are divided, on a high level, into three typologies, each one with its unique characteristics:
(i) Inspection-based: an analytic methodology whose execution depends on the availability of formal evaluators or field specialists capable of performing an accurate assessment of the interaction between system and end-user. This scope includes the following techniques: (1) heuristic evaluation, (2) feature inspection, (3) consistency inspection, (4) standard inspection, (5) formal usability inspection, (6) cognitive walkthrough, and (7) pluralistic walkthrough [28, 53, 54]
(ii) Enquiry-based: an empirical methodology based on the execution of questionnaires specifically designed to take into account the inner characteristics of the product/service being evaluated. The subjectivity of the collected data is a relevant asset in the identification of user needs and usability bottlenecks. Additionally, their execution speed and low implementation costs make them a common choice within the usability evaluation scope [54]
(iii) Test-based: an empirical methodology in which a human evaluator observes users executing specific tasks in the interface and collects enough data to extrapolate empirical evidence to optimize the interaction mechanisms within the interface [54, 55]

From the multiple methodologies described, this article focuses on the heuristic methodology. For this purpose, it defines quantifiable metrics based on the parametrization of literature guidelines. However, before applying any guideline breakdown, it is important to identify the advantages and disadvantages behind the adoption of such a methodology, in order to clarify what drove that decision.

Regarding the main advantages, the heuristic methodology is known as a quick and low-cost approach capable of providing feedback to designers from an early stage, without the need for real end-users. This approach uses literature guidelines to evaluate the interface, which assists designers in identifying the corrective measures needed to solve the usability bottlenecks detected. In terms of disadvantages, the ones most frequently highlighted are (1) the approach’s efficiency and viability depend on expert know-how, (2) it is unable to evaluate usability to its full extent (for example, the user’s satisfaction is out of the methodology’s scope), and (3) the produced end-results lack reliability, which can be tackled by including additional specialists in the evaluation process [56, 57]. According to the best practices, its applicability requires a group of 3 to 5 evaluators [58].
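The 3-to-5 evaluator recommendation is commonly justified with the Nielsen–Landauer estimate, in which the expected share of problems found by i independent evaluators is N(1 - (1 - λ)^i), where λ is the average single-evaluator detection rate (often reported around 0.31). A minimal sketch, assuming that typical value:

def problems_found_pct(n_evaluators: int, detection_rate: float = 0.31) -> float:
    """Expected percentage of usability problems found by n independent
    evaluators, following the Nielsen-Landauer estimate."""
    return 100 * (1 - (1 - detection_rate) ** n_evaluators)

for n in (1, 3, 5, 10):
    print(f"{n} evaluator(s): ~{problems_found_pct(n):.0f}% of problems found")
# 1 -> ~31%, 3 -> ~67%, 5 -> ~84%, 10 -> ~98%: diminishing returns past 5.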

The dependency on experts’ know-how and the execution restrictions identified are challenges that the proposed parametrization intends to mitigate, ensuring the methodology’s accessibility to anyone who intends to use it. However, it should be noted that applying a set of well-defined metrics to manually evaluate an interface in terms of guideline compliance level is a time-consuming task. Since the parametrization is a first step towards defining business rules to be consumed by a yet-to-be-defined tool, it is reasonable to explore the use of automation mechanisms to handle such a procedure.

2. Heuristic Methodology Roadmap

This methodology has been a theme explored in the literature by several authors whose contributions led to its typification into four categories, each one with its own unique characteristics:
(i) Interaction-based: an approach focused on the use of users’ interactions to evaluate the interface’s usability. The analysis performs a comparison between the interaction samples collected and the ones considered optimal within the applied context, so as to identify usability bottlenecks (a minimal sketch of such a comparison is given after this list). Despite being acceptable for checking the implemented interface at the interaction level, its use entails multiple drawbacks, such as (1) the dependence on real users and an already testable interface and (2) the limited reliability of the results and the significant number of samples required to ensure the process viability. In terms of architecture, these solutions usually delegate the sample processing and analysis to a remote server, but inherent performance drawbacks have led to the adoption of a new approach in which the load is divided between server and client accordingly, with the objective of improving the results’ response times [59–63]
(ii) Metric-based: an approach focused on the definition of metrics used to quantify the interface’s compliance level with the usability guidelines defined in the literature. These solutions do not require the intervention of end-users, which makes them a viable option for use in the initial stages of the interface’s development. Despite the correlation shown between these approaches and the outcomes of manual tests, the metrics’ lack of context awareness and their overall dependence on experts’ know-how compromise their adoption in multiple use cases [59–61]
(iii) Model-based: an approach developed to mitigate the drawbacks identified in the metric- and interaction-based approaches through the use of artificial intelligence mechanisms; these allow for the definition of the interaction model then applied in the evaluation process, which makes the approach independent from end-users and enhances its context awareness. Although efforts have been made to enhance the model creation and training algorithms and their context adaptability, the current solutions’ scalability and performance still need improvement before being deployed in a real environment [59–61, 64]
(iv) Hybrid-based: an approach that combines the mechanisms of each of the previously described categories to provide a holistic solution capable of identifying usability flaws and proposing measures to address them [59, 60]
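To make the interaction-based comparison concrete, the following minimal sketch contrasts a sampled user session with an optimal interaction path; the event names and paths are invented, and a simple sequence-similarity ratio stands in for the more elaborate analyses used by the cited solutions:

from difflib import SequenceMatcher

# Optimal interaction path defined at design time vs. a sampled user session.
optimal = ["open_menu", "select_patient", "open_vitals", "register_sample"]
sampled = ["open_menu", "open_vitals", "back", "select_patient",
           "open_vitals", "register_sample"]

# A ratio close to 1.0 means the user followed the intended path; low values
# hint at navigation detours, i.e., potential usability smells.
similarity = SequenceMatcher(None, optimal, sampled).ratio()
print(f"path similarity: {similarity:.0%}")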

Despite their differences, there is a common denominator among the categories defined—the automation mechanisms—which aim to reduce the number of tasks to be handled by a human agent. Based on such typification, a development effort has been performed by both the scientific and industry communities to create solutions capable of evaluating, to a certain degree, the interface’s usability.

2.1. Industry Context

In the industry context, external factors, such as government accessibility guideline compliance policies [65, 66], combined with the time typically required to perform a full interface assessment, fostered the development of standalone metric-based solutions capable of evaluating a specific usability subset—accessibility—for both web and mobile interfaces (see Tables 2 and 3).

2.2. Academic Context

In the academic context, an analysis of 99 scientific articles ranging from 1997 to 2022 provided an overview of the heuristic methodology’s evolution and the respective trends explored throughout the years [60, 62–64, 67–142] (see Figures 2 and 3).

Based on the results obtained, it is perceivable that 40% of the scientific articles published focused on interaction-based solutions. However, this tendency has been shifting towards model-based approaches from 2005 onwards, due to the improvements achieved in machine learning and data mining algorithms.

From the several articles sampled, the following ones are highlighted:
(i) In 2013, Dingli and Cassar proposed a tool that automates the usability evaluation process of web interfaces using several artificial intelligence mechanisms. The results obtained demonstrate the capability of the tool in the identification of critical bottlenecks otherwise identified by a human agent [69]
(ii) In 2014, Yáñez Gómez et al. compiled a set of heuristic evaluation checklists and adapted them by taking into consideration the unique requirements of mobile interfaces. As a result, an assessment tool based on a mobile-oriented best practices checklist was generated. Such a tool provided the means for trained and nontrained developers to assess and identify critical usability bottlenecks in mobile interfaces in an accurate and feasible way [139]
(iii) In 2017, Sun proposed a usability evaluation approach based on mixed intelligent optimization, focused on the assessment of educational resources software [140]. In the same year, Ferre et al. proposed an extension to the Google Analytics functionality, which stored the actions executed by the user during the usability evaluation process. The solution itself was divided into three functional blocks: the first one identifies the tasks executed and events triggered in the interface, the second one maps the event/task information sampled and logs it on the client device, and the third one focuses on the identification of interaction patterns in the samples collected, through the execution of a set of data mining algorithms, and compares them with the defined expected interactions [73]
(iv) In 2018, Othman et al. published a study that compares the Jakob Nielsen principles and the smart heuristics in the detection of usability bottlenecks within a museum guide app developed for the mobile environment. The comparison aimed at identifying which heuristic subset was suitable to evaluate/assert the interface’s usability—a generic heuristic set (Nielsen’s heuristics) or a tailored heuristic set whose metrics had been customized to consider the unique characteristics of applications developed for smartphones. The output provided highlighted the importance of the application context during the selection of the heuristic set to maximize the accuracy and efficiency of the evaluation process [141]
(v) In 2019, Ribeiro et al. proposed an approach to perform the assessment and identification of usability bottlenecks in an automatic manner through the use of user interactions with the interface in a production environment. Such an approach is aimed at mitigating the costs and complexity related to the user interaction sampling process during the execution of usability tests [105]. In the same year, Virtanen used the Robot Framework to automate the analysis of the interface’s compliance level with the Jakob Nielsen heuristics. The study compared the usability bottlenecks identified with the ones detected by the traditional manual approach to assert the approach’s effectiveness and accuracy [143]
(vi) In 2020, Bures et al. proposed a crawler specialized in the generation of interaction models within the context of smart TV applications. The generated models provide the end-user a mechanism to quantify the feasibility and effort related to the execution of each action within the interface and to evaluate it in terms of usability [100]
(vii) In 2021, Ripalda et al. proposed a tool that correlates the adopted usability metrics defined in the literature with the feedback retrieved automatically from Likert questionnaires designed by the development team. Such a tool provides designers, developers, and usability specialists the mechanisms required to evaluate interfaces, to identify the side effects of design changes in the interaction process, and to perform an assessment with recommendations meant to optimize the overall results obtained [137]
(viii) In 2022, Muhanna et al. proposed a set of heuristics to tackle the lack of explicit usability evaluation methods capable of assessing usability within a specific application context—Arabic mobile games. For this purpose, Nielsen’s heuristics were revised and adapted to meet the unique characteristics of the software being evaluated. Through such adaptation, it became possible for the evaluators to detect additional critical bottlenecks within the defined test scope [142]

The approach explored in this article falls within the metric-based type.

2.3. Heuristic Optimization
2.3.1. Principles’ Breakdown

The guidelines selected for the parametrization process were the following: (1) Jakob Nielsen’s principles, (2) Shneiderman’s golden rules, and (3) Weinschenk and Barker’s cognitive principles. Each principle was grouped according to its scope within the interface and its direct relation with the interface’s main building blocks, the components and the actions [144]. As a result, four typologies were defined:
(i) Component oriented (CO): focused on the compliance of the interface’s native components with the look and feel defined during the design phase. As the name implies, its applicability requires an assessment of the components within each application section in terms of typology (active or passive), family (button, checkbox, and input tag, among others), and name
(ii) Action oriented (AO): focused on the actions provided to navigate across the interface and manipulate the business data consumed by the system
(iii) Section oriented (SeO)
(iv) Screen oriented (ScO)

For each principle, the respective parametrization is presented in Tables 4–6.
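To illustrate how a parametrized guideline can be reduced to a check that is asserted in a binary manner over the component-oriented (CO) typology, consider the sketch below; the Component structure and the tooltip rule are hypothetical stand-ins for the metrics listed in Tables 4–6:

from dataclasses import dataclass

@dataclass
class Component:
    name: str
    family: str        # e.g., "button", "checkbox", "input"
    active: bool       # typology: active or passive
    has_tooltip: bool

def co_tooltip_metric(component: Component) -> bool:
    """Hypothetical CO metric: every active component exposes a tooltip."""
    return component.has_tooltip if component.active else True

screen = [
    Component("save", "button", active=True, has_tooltip=True),
    Component("title", "label", active=False, has_tooltip=False),
    Component("delete", "button", active=True, has_tooltip=False),  # violation
]

violations = [c.name for c in screen if not co_tooltip_metric(c)]
print(violations)  # ['delete']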

It should be noted that, within the Shneiderman and the Weinschenk and Barker subsets, some principles were not described, since they share common ground, in terms of concept and definition, with principles whose evaluation process had already been discussed.

In the Shneiderman subset, principles such as (1) “Strive for consistency,” (2) “Seek universal usability,” (3) “Prevent errors,” (4) “Permit easy reversal of actions,” and (5) “Reduce short-term memory load” share the same evaluation process described for their counterparts in the Jakob Nielsen and Rolf Molich set (“Consistency and standards,” “Flexibility and efficiency of use,” “Error prevention,” “User’s control and freedom,” and “Recognition rather than recall,” respectively).

In the Weinschenk and Barker subset, principles covering several topics, such as (4) “Accommodation,” (10) “Accuracy,” (16) “Consistency,” (17) “Support,” and (20) “Responsiveness,” were discarded from the evaluation process. According to Nayebi et al. [156], the use of proper language, terminology, and adequate metaphors (daily life objects used within the interface components as a visual representation of their purpose) plays a major role in ensuring the interface’s suitability to the user’s needs and behaviours. These key factors are already considered in the second Jakob Nielsen and Rolf Molich principle (“Match between system and the real world”). In terms of “Accuracy,” the principle states that the interface should be deprived of errors or, in other words, that errors should be prevented to a certain degree. This topic is already discussed and addressed by the fifth Jakob Nielsen and Rolf Molich principle (“Error prevention”). The remaining categories share a direct connection with their Jakob Nielsen and Rolf Molich counterparts (“Consistency and standards,” “Help and documentation,” and “Visibility of system status”).

2.3.2. Real Environment Applicability

The parametrization provided a checklist to assert the interface’s usability compliance level. To foster its development, it is imperative to identify the current bottlenecks that may hinder its adoption by applying it to a real use case. For this purpose, two e-health applications were evaluated (an academic prototype and an enterprise solution).

2.3.3. Doctor Helper (Prototype)

The Doctor Helper is an academic prototype created as an attempt to replicate the functionalities identified in typical e-health applications, namely, (1) account creation/authentication, (2) sensor sample registration, (3) presentation of the sensor sample history in multiple visual formats, and (4) creation of data reports and inclusion of a notification mechanism to aid users in their daily tasks or to report any abnormal event in the system.

The performed evaluation considered the 106 actions and 356 components available in the 15 screens of the entire interface (see Figure 4). The end-results are presented in Figures 5–7.

The evaluation results allowed the identification of a total of 1781 usability smells. In terms of “Visibility of system status,” the main bottlenecks in the Jakob Nielsen subset are the lack of confirmation/conclusion dialogs and the lack of a task completion rate indicator. Regarding the “Error prevention” principle, the most prominent issues were related to the lack of an autocomplete mechanism in the components that receive input from the user, the lack of a mechanism that automatically saves the user’s work, the lack of error messages providing clear indications of the type of unconformities detected in the user’s input, and the lack of mechanisms capable of disabling the action-related controls when the view requirements are not met. Finally, for the “Help and documentation” principle, the main bottleneck detected was the lack of an option in the interface to access the platform’s official documentation.

In terms of “Support internal locus of control,” the main bottleneck in the Shneiderman subset was related to the lack of confirmation dialogues to assert the user’s intentions before executing a certain task within the interface.

In terms of “Human limitation,” it should be emphasized that, in the Weinschenk and Barker subset, the components lack the capability to store their previous state, preventing users from being aware of their previous interactions without appealing to memory. Regarding the “Interpretation” principles, the main bottleneck detected was related to the lack of a mechanism in the system capable of predicting the user’s intentions or input while the interaction process is taking place.

2.3.4. SmartAL (Enterprise Application)

SmartAL (https://www.alticelabs.com/site/smartal/) is a solution developed by Altice to monitor chronically ill and elderly patients in real time. This platform is intended to perform a thorough follow-up of the patients by providing mechanisms to monitor their vital signs, perform video calls with their physician, manually register vital sign samples, and schedule video call appointments, among other functionalities.

The analysis took into consideration 523 actions, 1918 components, and 103 screens across the entire interface (see Figure 8). Note that there is a significant discrepancy between the number of objects and actions identified in the academic prototype and in a commercial application running in a real environment, due to the number of functionalities and user roles supported (patient and caregiver). The interface’s technical depth, combined with the extensive number of elements to be tackled and the project’s time constraints, imposed compromises. As a consequence, the analysis scope was restricted to the most mature and emphasized subset in the literature—the Jakob Nielsen and Rolf Molich subset. The end-results are presented in Figure 9.

The results obtained highlighted 2488 usability smells. Among the evaluated principles, the ones that failed to achieve the minimum acceptable score (70%) were “Visibility of system status,” “User’s control and freedom,” “Error prevention,” “Flexibility and efficiency of use,” “Help users recognize, diagnose, and recover from errors,” and “Help and documentation.” The sketch below illustrates how such per-principle scores can be checked against that threshold.
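As an illustration only (the pass/total counts below are invented, not the study’s data), a per-principle compliance score can be computed as the fraction of binary checks passed and compared against the 70% threshold:

THRESHOLD = 0.70  # minimum acceptable score used in the study

# Hypothetical (passed, total) counts of binary checks per principle.
results = {
    "Visibility of system status": (38, 60),
    "Error prevention": (41, 70),
    "Recognition rather than recall": (52, 58),
}

for principle, (passed, total) in results.items():
    score = passed / total
    status = "OK" if score >= THRESHOLD else "BELOW THRESHOLD"
    print(f"{principle}: {score:.0%} ({status})")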

Some of the most prominent flaws identified in the interface, which led to the obtained results, are the following:
(i) The lack of progress bar indicators for time-consuming operations
(ii) The lack of dialogues to signal the user when a certain action is concluded
(iii) The lack of meaningful tooltips with information regarding the object’s or action’s intent
(iv) The existence of hints with system-specific terminology
(v) The lack of dialogues to assert the user’s intentions when an action is being executed
(vi) The lack of advice within the error messages to mitigate the abnormal event
(vii) The lack of mechanisms capable of cancelling an action’s execution at any given time
(viii) The presentation of error messages with technical terminology
(ix) The lack of a link in the interface section to redirect the user to the official documentation
(x) The lack of a tooltip in the interface controls describing the keyboard shortcut that could be used to trigger the component’s inherent action (see Figure 10)
(xi) The lack of a mechanism to assure that the user’s work is not compromised by any abnormal event that may be triggered within the interface

All the results of the evaluation process were obtained through a manual analysis of the interface, a time-consuming and error-prone approach. To maximize its scalability, it is necessary to implement automation mechanisms capable of assisting the end-user with the evaluation tasks. A sketch of one such automated check is given below.
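As a sketch of what such automation could look like for one of the flaws listed above (interactive controls without tooltips), the following uses only Python’s standard-library HTML parser; the tag set and the sample markup are assumptions made for illustration, not part of the study’s tooling:

from html.parser import HTMLParser

class TooltipAudit(HTMLParser):
    """Flags interactive controls exposing no tooltip ('title' attribute)."""
    INTERACTIVE = {"a", "button", "input", "select"}

    def __init__(self):
        super().__init__()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag in self.INTERACTIVE and "title" not in attr_map:
            self.violations.append((tag, attr_map.get("id", "?")))

audit = TooltipAudit()
audit.feed('<button id="save" title="Save record">Save</button>'
           '<button id="delete">Delete</button>')
print(audit.violations)  # [('button', 'delete')]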

3. Implications

The AAL ecosystem’s rapid growth in the past few years led to the development of multiple use case-oriented solutions and studies addressing different subjects and challenges identified in such ecosystems. Depending on the scope of the studies and solutions devised, different approaches have been used to explore and tackle usability.

Some studies focus on the usability assessment of existing solutions to identify critical bottlenecks and suggest optimizations accordingly. This scope includes, but is not limited to, the following application contexts: a wearable camera system for dementia patients in 2014 by Matthews et al. [162], diabetes-monitoring applications in 2015 by Isaković et al. [163], an IPTV application running in a commercial environment that integrates multiple solutions to support home care by Ribeiro et al. [164], emergency alert devices in 2020 by Lersilp et al. [165], and smart bands in 2021 by Correia et al. [166], among others.

Other studies focus on the usability processes adopted and their role within the development cycle. In 2017, Sili et al. presented a user-centred design process for an indoor and outdoor navigation solution suitable for the defined target group [167]. In 2021, Bastardo et al. went a step further by questioning the methodological quality of the user-centred usability evaluation of AAL ecosystems. The heterogeneity of the methods adopted in this scope imposes the definition of guidelines and instruments to assess the quality of the evaluation procedures adopted. Through an accurate assessment, it becomes possible to identify bottlenecks and optimize the process in an overall manner [168].

Other studies emphasize how the lack of attention given to usability has hindered the acceptance and adoption of AAL ecosystems’ solutions. In 2013, Agha et al. proposed a study which aimed to raise readers’ awareness of the complexities and importance of addressing the human factors and usability aspects of telehealth. According to the authors, the rapid growth of mHealth technologies imposes a shift in terms of priorities during the development cycle. Feedback provided by both patients and caregivers must be considered to maximize the solution’s acceptability and adaptability [169]. In the same year, Queirós et al. performed a review of AAL ecosystems to identify how end-users are involved during the development cycle. The end-results revealed a certain lack of involvement of the end-users in the process, especially in the assessment of usability and accessibility bottlenecks [170].

It is perceptible that usability is gaining momentum in AAL ecosystems. However, the way it is tackled tends to be centred on each solution’s specific needs. Within this article’s context, usability is explored in an independent manner. The parametrization of the literature guidelines is aimed at creating the cornerstones required for the definition of heuristic metrics suitable for any solution within the AAL ecosystems’ scope. In opposition to the typical approaches adopted so far, the objective was to avoid coupling the metrics defined to any specific solution. The unique characteristics and the nature of the tasks executed within such ecosystems impose the definition of tailor-made metrics. An analysis of the literature showed that such customization has not been explored thoroughly, due to the heterogeneity of the solutions devised within this scope; with this article, it is intended to perform the first step towards achieving that goal.

The findings present a practical application of the parametrization adopted and highlight how its use allows asserting the level of compliance with the guidelines covered in the study. Despite providing the foundations of the metrics to be used within the evaluation process, their scope still needs to be further refined, in order to make them suitable for the ecosystems that will be evaluated. Therefore, in a second iteration, it is expected to test the parameters in other e-health applications and collect users’ feedback regarding the usability bottlenecks identified, so as to determine which metrics have contributed to the identification of critical issues and in which way they can be further optimized to take into account the user’s standpoint when handling the application.

4. Conclusions

The research performed compiled the knowledge available in the literature, which culminated in the proposed parametrization. The parametrization provides an objective manner of addressing the usability principles frequently applied in a heuristic evaluation process. Its main differentiating factor is related to how the metrics were defined. The definition of each principle was analysed thoroughly to identify the practices applied with the objective of ensuring, to a certain degree, its compliance. By isolating the typically used approaches, it was possible to define metrics that can be asserted in a binary manner. Their applicability allowed the maximization of the results’ accuracy and consistency, making the heuristic methodology more accessible to users without expert usability know-how.

However, there are inherent limitations within the current process which hinder its long-term scalability:
(i) Manual analysis: the interfaces’ compliance level with the defined metrics was checked by a human agent. The multiple scopes expected to be analysed (components, actions, and sections), in combination with the respective interface’s technical debt, tend to significantly increase the time needed to perform a thorough analysis of the interface. Therefore, it is imperative to automate the current approach. The use of automation in the heuristic methodology brings multiple benefits, such as evaluation cost reduction and test coverage maximization [59, 93]
(ii) Context adaptation: the current approach provides the groundwork in terms of the parametrization of the metrics applied. However, the parameters defined are too generic and must be further refined and adapted to the environment where they will be used. Therefore, user feedback gathered through the evaluation of multiple AAL ecosystems is required to refine the parametrization process and improve the metrics’ accuracy in the detection of critical usability bottlenecks. Such refinement is expected to be performed through a comparison between the usability bottlenecks identified by the current approach and those identified by end-users

Data Availability

All the data collected within the study scope can be requested from the authors through their email ([email protected] and [email protected]).

Disclosure

A preprint has previously been published [171].

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

This work was funded by FCT/MCTES through national funds and, when applicable, cofunded by EU funds under the project UIDB/50008/2020.