Abstract

We propose a system that automatically generates portrait drawings for the purpose of human emotional care. Our system comprises two parts: a smartphone application and a server. The smartphone application enables the user to take photographs throughout the day while acquiring heart rates from the smartwatch worn by the user. The server collects the photographs and heart rates and displays portrait drawings automatically stylized from the photograph of the most exciting moment of the day. With the system, the user can recall the exciting and happy moments of the day by admiring the drawings and thereby soothe their emotions. To stylize photographs as portrait drawings, we employ nonphotorealistic rendering (NPR) methods, including a portrait etude stylization proposed in this paper. Finally, the effectiveness of our system is demonstrated through user studies.

1. Introduction

The Internet of Things (IoT) has undergone rapid advances following the popularization of smartphones and wearable sensor devices. Owing to these developments, IoT has become nearly ubiquitous in our lives, widely supporting facets of our well-being as well. In healthcare especially, many IoT-based applications (e.g., Apple’s HealthKit) have been developed and deployed. However, such applications generally focus on physical healthcare only. Consequently, the need for emotional and mental healthcare applications is gradually increasing.

We present a system developed with the aim of caring for the user’s emotions; the proposed system automatically generates and displays the user’s portrait drawings by utilizing IoT. In this system, when a user takes photographs throughout the day using a smartphone, the most exciting portrait is selected automatically based on the user’s heart rate, measured by a smartwatch equipped with a heart rate sensor. The selected photograph is then rendered as portrait drawings in various styles by using nonphotorealistic rendering (NPR) techniques on a server located at the user’s home. Finally, when the user returns home, the stylized portrait drawings are displayed in a digital photo frame connected to the server. Consequently, the user can recall the most exciting, and also the happiest, moment of the day by admiring the drawings and soothe their emotions accordingly.

The major contributions of this study are as follows. First, we present an IoT-based application for human emotional care as described above. This is achieved by employing NPR techniques, which aim to produce aesthetically pleasing images. The study shows not only that NPR can be utilized for emotion-aware applications but also that IoT technology can further broaden its reach. Second, we propose a novel portrait stylization method inspired by Henri de Toulouse-Lautrec’s etudes, which stylizes mostly the facial region while representing the other regions abstractly. In this study, our system cares for human emotions using automatically stylized portrait drawings; selectively focusing on and expressing the facial region during the stylization process therefore plays a key role in enabling the resulting drawings to convey the facial expression of the most exciting moment of the day. Finally, through user studies, we show that our system makes users feel better after recalling the most exciting moments of their days by admiring the drawings our system generates. Through the user study, we also confirm that the stylization method proposed in this paper generates results that reasonably resemble Henri de Toulouse-Lautrec’s etudes.

The remainder of this paper is organized as follows. In Section 2, we provide an overview of related work on IoT applications and artistic stylization methods. Next, in Section 3, we describe our system, which automatically and remotely generates portrait drawings using IoT technology. We then present our portrait stylization approaches in Section 4. In Section 5, we present experimental results and their evaluation. Finally, in Section 6, we conclude with a summary of our method and the scope for future development.

2. Related Work

Owing to the popularization of mobile devices, the range of Internet of Things (IoT) technology has been rapidly broadening. This has made IoT technology nearly ubiquitous in our lives, widely supporting facets of our well-being. In this regard, many studies have focused on various IoT applications. For healthcare, the studies in [1, 2] developed applications that monitor the heart rate measured by a smartwatch’s sensor and provide personalized heart-care information without requiring a visit to a hospital or pharmacy for a doctor’s diagnosis. For fitness, applications in which an activity tracker communicates with fitness equipment, such as treadmills and exercise bikes, to obtain precise data (e.g., exercise distance, speed, and slope) and tracks these on a smartphone to suggest a personalized exercise plan are widely used [3, 4]. However, despite the increasing interest in physical health, studies on emotional care have rarely been conducted. In this paper, we aim at developing a system for human emotional care. To do this, we utilize stylization techniques.

Artistic image stylization is a technique that aims at converting an original image into an artistic image with a certain style [5]. Early studies on artistic image stylization [6, 7] focused on expressing a painterly style by generating brush strokes; these studies mainly manipulated brush stroke properties such as direction, size, and texture to stylize the original image. Meanwhile, several studies have focused on expressing a specific artistic style. Van Gogh’s painting style was imitated by manipulating brush stroke properties [8, 9], and Seurat’s pointillism was simulated by employing color juxtaposition theory [10]. These studies targeted the painting styles of the impressionists and their successors because the brush strokes those painters used were quite distinctive. In this paper, we aim at generating portrait etudes in the style of Toulouse-Lautrec, one of the famous painters of the late 19th century.

3. Remote Portrait Generation and Display System for Human Emotional Care

3.1. System Overview

The system proposed in this paper consists of a mobile part, comprising a smartphone application and a smartwatch, and a server part, comprising a server and a digital photo frame. In our framework, the user wears a smartwatch and takes portrait photographs throughout the day using our smartphone application. While taking a photograph, the application also acquires the heart rate from the smartwatch and sends it to the server located at the user’s home. The server then stylizes a photograph as portrait drawings in the given styles if it is judged to capture the most exciting moment of the day based on the heart rate. When the user comes back home, the server displays the drawings on a digital photo frame, and by admiring these drawings and recalling the moments, the user experiences emotional healing. We present the details of each part in the next two sections.

3.2. Mobile Application Part for Acquiring Photograph and Heart Rate

The user’s photographs are acquired throughout the day using a smartphone equipped with a built-in camera and a depth perception sensor, together with a smartwatch equipped with a heart rate sensor. When the user takes a photograph with the smartphone, our smartphone application (Figure 1) simultaneously obtains the user’s heart rate from the smartwatch and sends the photograph together with the heart rate to the server. In this study, we used a Samsung Gear S, a smartwatch equipped with a heart rate sensor, and integrated it with our smartphone application developed for the Android environment.

Our system mainly aims at generating portrait drawings. Consequently, dividing the input photograph into regions such as the facial area and the background enables the system to produce better-quality drawings. To do this, we segment the input photograph using the depth perception of Google’s Tango, an augmented reality computing platform. Figure 2 shows this process performed on a Lenovo Phab 2 Pro, the Tango-enabled smartphone employed in this study. Tango reconstructs a three-dimensional surface from the photograph using its depth and motion sensors. Starting from this surface, we locate the point of greatest depth and the center of the bounding box detected as the face in the photograph. Using these two points as seeds, we then perform watershed segmentation in the depth domain to separate the subject from the background. Optionally, we also provide semiautomatic segmentation using the seeds mentioned above plus user-selected seeds for hair and body. Finally, the segmentation result is sent, along with the input photograph, to the server to be used in the portrait drawing generation process described in the following section.
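
The following is a minimal sketch of this depth-seeded watershed step, assuming the depth map and the detected face bounding box are already available; the function name and the single-pixel seeding are illustrative choices, not the authors’ actual implementation.

```python
import cv2
import numpy as np

def segment_subject(depth_map, face_box):
    """Separate the subject from the background by running watershed on
    the depth map, seeded at the deepest point (background) and at the
    center of the detected face (foreground)."""
    # Background seed: the point of greatest depth.
    bg_y, bg_x = np.unravel_index(np.argmax(depth_map), depth_map.shape)

    # Foreground seed: the center of the face bounding box (x, y, w, h).
    x, y, w, h = face_box
    fg_x, fg_y = x + w // 2, y + h // 2

    # Marker image: 0 = unknown, 1 = background, 2 = subject.
    markers = np.zeros(depth_map.shape, np.int32)
    markers[bg_y, bg_x] = 1
    markers[fg_y, fg_x] = 2

    # cv2.watershed expects an 8-bit, 3-channel image; feed it the depth.
    depth_8u = cv2.normalize(depth_map, None, 0, 255,
                             cv2.NORM_MINMAX).astype(np.uint8)
    cv2.watershed(cv2.cvtColor(depth_8u, cv2.COLOR_GRAY2BGR), markers)

    return markers == 2  # boolean mask of the subject region
```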

3.3. Server Part for Generating and Displaying Portrait Drawings

When a photograph is received with its segmentation result and heart rate, the server, located at the user’s home, judges whether the photograph captures the most exciting moment of the day. In this study, we regard the photograph associated with the highest heart rate as the most exciting moment. Therefore, the server renders portrait drawings only if the heart rate of the received photograph is higher than those of previously received photographs. For the styles of the portrait drawings, we provide oil painting, pastel drawing, pencil drawing, and portrait etude, as described in Section 4. Using these four styles, the server generates portrait drawings and preserves them until a new photograph corresponding to a more exciting moment is received. The server monitors the user’s location, which is computed on the smartphone using GPS. When the user is within a predefined distance from the server, the server displays the portrait drawings on the digital photo frame connected to it.
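
A compact sketch of this selection rule follows; the handler and the stylize() stub are hypothetical names used for illustration, not the authors’ code.

```python
def stylize(photo, segmentation, style):
    """Placeholder for the NPR renderers described in Section 4."""
    raise NotImplementedError

best_heart_rate = 0.0   # highest heart rate seen so far today
best_drawings = []      # drawings for the current most exciting moment

def on_photo_received(photo, segmentation, heart_rate):
    """Re-render the four styles only when an upload sets a new daily
    maximum heart rate; otherwise keep the existing drawings."""
    global best_heart_rate, best_drawings
    if heart_rate > best_heart_rate:
        best_heart_rate = heart_rate
        best_drawings = [stylize(photo, segmentation, style)
                         for style in ("oil", "pastel", "pencil", "etude")]
```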

This paper assumes the system is intended for home use. Therefore, a low-power device is recommended as the server for practical deployment. To this end, we used a Raspberry Pi 2 B+ and a Camel PF1710IPS as the server and digital photo frame, respectively, as shown in Figure 3. When a user admires the resulting drawings, we provide interactions for browsing and selecting preferred drawings via an infrared remote controller. To do this, we capture infrared remote signals from the GPIO 22 pin of the Raspberry Pi.
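
As an illustration, the signal could be captured on that pin with the RPi.GPIO library roughly as follows; the pulse-train decoding, which depends on the remote’s protocol, is omitted, and the callback body is a placeholder.

```python
import RPi.GPIO as GPIO

IR_PIN = 22  # GPIO pin wired to the IR receiver, as in the paper

GPIO.setmode(GPIO.BCM)
GPIO.setup(IR_PIN, GPIO.IN)

def on_ir_edge(channel):
    # A real decoder would time the pulse train here (e.g., for the NEC
    # protocol) and map the decoded key to browse/select actions.
    print("IR edge on GPIO", channel)

# Invoke the callback on both rising and falling edges of the IR signal.
GPIO.add_event_detect(IR_PIN, GPIO.BOTH, callback=on_ir_edge)
```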

4. Portrait Stylization Methods

This study aims at enabling users to admire portrait drawings in various styles to arouse their interest. To achieve this, algorithms for multiple styles are required. As mentioned in Section 2, many stylization methods have been proposed. Among them, we employ Seo et al.’s painterly rendering algorithm [11], which uses a stroke texture database. In their study, each individual brush stroke texture, captured from a glass sheet on which an artist paints, is stored in a stroke texture database and is used to generate brush strokes in the rendering process. If the database is replaced with one consisting of brush strokes of a different medium, the style of the resulting drawing changes accordingly. In this manner, we used three media (i.e., oil paint, pastel, and pencil) to provide each style (Figure 4).

In this study, we focus on generating portrait drawings from the most exciting moments captured by users throughout the day. Conveying the facial expression while representing everything else abstractly is therefore sometimes more effective for giving a strong impression to the user. In addition to the styles mentioned above, we propose a portrait stylization inspired by Henri de Toulouse-Lautrec’s etudes, which stylizes mostly the facial region while representing the other regions abstractly. As shown in Figure 5, Toulouse-Lautrec left many portrait etudes. The main feature of his etude style is the simplification of everything except the facial region: the background is generally eliminated, and only the face is depicted in detail, while most of the body is simplified into a few lines. Accordingly, we extract simplified edges to represent the body and manipulate them to mimic the lines he used. We then paint brush strokes mostly on the facial region to make the user concentrate on the facial expression. We present the details of the proposed algorithm in the following section.

4.1. Portrait Etude Stylization

Figure 6 shows our portrait etude stylization process, which is divided into three steps. The first step is line processing, in which we extract edges from the input image and merge them into long, simple lines. The second step is region processing, in which we divide the input image into regions such as face, hair, body, and background. The final step is rendering, in which we draw lines and brush strokes.

4.1.1. Line Processing

In this step, we focus on the simple lines seen throughout Toulouse-Lautrec’s portrait etudes, especially in his depiction of the body. To generate simplified lines, we first extract edges from the input image and vectorize them. In this study, flow-based difference of Gaussians (FDoG) filtering [12] is used to extract edges. As shown in Figure 7, the filtered result is a raster image containing thick edges; therefore, vectorizing the edges is necessary to obtain simplified lines. To achieve this, we first apply a thinning algorithm to obtain 1-pixel-wide edges (Figure 7(d)). We then apply a tracing algorithm that detects and traces each point on the edges using 3 × 3 kernels, as shown in Figure 8. In the figure, the tracing starts from the end point of the edge (a) and proceeds along the edge (b–d). In cases (a), (b), and (c), there is only one candidate for the next edge point that has not yet been traced. In case (d), however, two candidates are given, so we must choose the one that keeps the edge simpler. To do so, we select the candidate that minimizes the angle difference, computed via the inner product between the direction of the already-traced edge points and the direction to each candidate point.
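
A minimal sketch of this candidate selection follows; maximizing the inner product of unit direction vectors is equivalent to minimizing the angle difference. The function name and point representation are illustrative.

```python
import numpy as np

def pick_next_point(prev_pt, cur_pt, candidates):
    """Among the untraced neighbor pixels, pick the one that keeps the
    traced line straightest, i.e., the one whose step direction has the
    largest inner product with the current tracing direction."""
    direction = np.asarray(cur_pt, float) - np.asarray(prev_pt, float)
    direction /= np.linalg.norm(direction)

    best, best_dot = None, -2.0
    for cand in candidates:
        step = np.asarray(cand, float) - np.asarray(cur_pt, float)
        step /= np.linalg.norm(step)
        dot = float(np.dot(direction, step))
        if dot > best_dot:  # larger dot product = smaller angle change
            best, best_dot = cand, dot
    return best
```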

If the edge is broken by a gap of more than one pixel (Figure 9(a)), it will not be traced as a single line (Figure 9(b)) because the kernel covers only adjacent neighbor pixels. To solve this, we merge nearby lines by considering the distance and direction between them (Figure 9(c)): if the distance between the facing end points of two lines and the difference between their directions are both below predefined thresholds, we merge them into a single line. In this paper, the thresholds for distance and direction are experimentally set to default values of 5 pixels and 15°, respectively.
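
The merge test could be written roughly as follows, using the paper’s default thresholds; treating a line’s direction as the orientation of its end-to-end vector is an assumption made for illustration.

```python
import numpy as np

DIST_THRESHOLD = 5.0    # pixels, default value from the paper
ANGLE_THRESHOLD = 15.0  # degrees, default value from the paper

def can_merge(line_a, line_b):
    """Check whether two polylines (arrays of (x, y) points) should be
    merged: the facing end points must be close, and the overall line
    orientations must be similar."""
    gap = np.linalg.norm(np.asarray(line_a[-1], float) -
                         np.asarray(line_b[0], float))

    def orientation_deg(line):
        dx, dy = np.asarray(line[-1], float) - np.asarray(line[0], float)
        return np.degrees(np.arctan2(dy, dx))

    # Compare orientations modulo 180° so direction of travel is ignored.
    diff = abs(orientation_deg(line_a) - orientation_deg(line_b)) % 180.0
    diff = min(diff, 180.0 - diff)
    return gap <= DIST_THRESHOLD and diff <= ANGLE_THRESHOLD
```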

Among the separated line segments, we select only long and strong ones, based on their lengths and average gradient values, to represent the overall structure abstractly. Figure 10 shows the line segments and a histogram of their lengths and counts. We obtain a threshold value for dividing them into long and short line segments by applying the k-means clustering algorithm with k = 2 to the histogram. In the figure, long lines (blue) and short lines are divided by the resulting threshold, whose value is 32. In the same manner, we divide the line segments into strong and weak lines by their gradient intensities. In our method, strong lines tend to be detected more in areas where detailed expression, such as the face, is needed. These two groups of lines are used to express each part of the input image separately in Section 4.1.3.
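
A one-dimensional k-means with k = 2 reduces to alternating between assigning each length to the nearer centroid and recomputing the centroids; a minimal sketch, with illustrative names, is shown below.

```python
import numpy as np

def length_threshold(lengths, iters=20):
    """Split line lengths into 'short' and 'long' groups with 1-D
    k-means (k = 2); the threshold is the midpoint between the two
    converged cluster centroids."""
    lengths = np.asarray(lengths, dtype=float)
    c_short, c_long = lengths.min(), lengths.max()  # initial centroids
    for _ in range(iters):
        is_long = np.abs(lengths - c_long) < np.abs(lengths - c_short)
        if is_long.any() and (~is_long).any():
            c_short = lengths[~is_long].mean()
            c_long = lengths[is_long].mean()
    return (c_short + c_long) / 2.0
```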

4.1.2. Region Processing

To express each region according to its content, we need to divide the input image into regions and label them. As shown in Figure 5, Toulouse-Lautrec’s etudes can mainly be divided into face, hair, body, and a halo in the background. As mentioned in Section 3.2, we obtain these regions by using depth perception and user assistance.

In his etudes, Toulouse-Lautrec briefly depicted facial components such as the eyebrows, eyes, nose, and mouth using simple lines rather than colored brush strokes. To express this, we draw only the lines that correspond to these components. To achieve this, we extract 77 face landmarks from the input image using an active shape model (ASM). We then construct a distance map around the facial components, which are generated by connecting the landmarks, as shown in Figure 11. Finally, we remove out-of-range lines by using the distance map.
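
Such a distance map could be built, for instance, with OpenCV’s distance transform, as sketched below under the assumption that each facial component is a polyline of connected landmarks; the names and the cutoff policy are illustrative.

```python
import cv2
import numpy as np

def component_distance_map(image_shape, component_polylines):
    """Distance (in pixels) from every pixel to the nearest facial
    component, each given as a polyline of ASM landmarks. Lines whose
    points all lie beyond a cutoff distance can then be discarded."""
    mask = np.full(image_shape[:2], 255, np.uint8)
    for pts in component_polylines:
        pts = np.asarray(pts, np.int32).reshape(-1, 1, 2)
        cv2.polylines(mask, [pts], isClosed=False, color=0, thickness=1)

    # distanceTransform measures the distance to the nearest zero pixel,
    # i.e., to the nearest drawn facial component.
    return cv2.distanceTransform(mask, cv2.DIST_L2, 3)
```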

For the halo region in the background, we represent the backlight behind the face. In Toulouse-Lautrec’s etudes, the halo is often represented as straight hatching lines drawn in an arbitrary direction or following the silhouette of the head. To express this, we calculate a modified flow map whose direction is generated by mixing the flow direction obtained by the edge tangent flow (ETF) [12] with an arbitrary direction, as shown in Figure 12.
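
One way to blend the two directions is a normalized linear mix, as in the sketch below; the blending weight and its linear form are assumptions for illustration, not the paper’s exact scheme.

```python
import numpy as np

def halo_flow(etf_flow, mix=0.5, seed=0):
    """Blend the ETF flow field (an H x W x 2 array of unit vectors)
    with a single arbitrary hatching direction; `mix` in [0, 1] weights
    the arbitrary direction."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, np.pi)
    arbitrary = np.array([np.cos(theta), np.sin(theta)])

    blended = (1.0 - mix) * etf_flow + mix * arbitrary
    norms = np.linalg.norm(blended, axis=-1, keepdims=True)
    return blended / np.maximum(norms, 1e-8)  # renormalize to unit length
```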

4.1.3. Rendering Lines and Brush Strokes

Before rendering lines and brush strokes, we create a rectangular background using a groundwood paper texture like that seen in Toulouse-Lautrec’s etudes. We then render in the order of long lines, halo, face, hair, and facial components. As shown in Figure 13, each line is rendered through the following steps: we first create a triangle-strip mesh following the gradient direction; we then color it with the color sampled from the input image at the mesh’s location; and we finally map onto it a predefined line texture captured from Toulouse-Lautrec’s etudes.
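
A minimal sketch of generating such a strip follows: each polyline point is offset to both sides along the local perpendicular, producing two vertices per point in triangle-strip order. The function name and the uniform width are illustrative simplifications.

```python
import numpy as np

def line_strip_vertices(points, width):
    """Turn a polyline into triangle-strip vertices by offsetting each
    point perpendicular to the local line direction; color sampling and
    texture mapping are applied afterwards, as in Figure 13."""
    points = np.asarray(points, dtype=float)
    vertices = []
    for i, p in enumerate(points):
        nxt = points[min(i + 1, len(points) - 1)]
        prv = points[max(i - 1, 0)]
        d = nxt - prv                        # local tangent direction
        d = d / max(np.linalg.norm(d), 1e-8)
        normal = np.array([-d[1], d[0]])     # perpendicular direction
        vertices.append(p + normal * width / 2.0)
        vertices.append(p - normal * width / 2.0)
    return np.asarray(vertices)              # two vertices per point
```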

Our line drawing is performed using the long and strong line segments obtained previously. To render lines with various widths, we adjust the width of each line segment as

width = α · ḡ · min(w, h),

where w and h are the width and height of the input image, ḡ is the normalized average gradient strength of the area through which the line segment passes, and α controls the maximum width of lines. Figure 14 shows the lines resulting from adjusting α.

Brush strokes are rendered on each region separately. We depict the face and hair in detail but the body and halo roughly. To do this, we place strokes with narrow spacing and short lengths on the face and hair, whereas for the body and halo, we widen the spacing between strokes and lengthen them.
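
This per-region policy can be captured as a small parameter table; the numbers below are illustrative assumptions, not the paper’s settings.

```python
# Illustrative per-region stroke parameters (values are assumptions):
# dense, short strokes for detail; sparse, long strokes for abstraction.
STROKE_PARAMS = {
    "face": {"spacing": 4,  "length": 8},
    "hair": {"spacing": 4,  "length": 10},
    "body": {"spacing": 16, "length": 40},
    "halo": {"spacing": 16, "length": 48},
}
```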

After rendering brush strokes on the entire facial region, we draw the facial components defined in Section 4.1.2 to convey the facial expression. To achieve this, we first render additional thin brushes on the facial region and then draw the edges corresponding to the facial components. Figure 15 shows the results without and with the facial components. As shown in the figure, drawing the facial components conveys the facial expression better.

Figure 16 shows our portrait etude rendering results. As shown in the figure, our method mainly focuses on conveying the facial expression while representing the other regions abstractly. In Section 5.2, we evaluate our results through a user study in terms of their similarity to Toulouse-Lautrec’s etudes.

5. Experimental Results

Figure 17 shows the portrait drawings our system generated. As mentioned above, our system generates four styles of portrait drawings from the most exciting photograph of the day and allows users to select preferred results to display on the system’s digital photo frame. In the figure, each marked result represents the drawing selected by the user who took the input photograph. When a user does not perform the optional semiautomatic segmentation described in Section 3.2, only the facial region and the background are distinguished; consequently, the portrait etude is not generated. The results that lack a portrait etude correspond to this case. As shown in the figure, the system generated drawings of the most exciting moments of the day. We can observe that the system rendered the facial region in greater detail than the other regions. This is because our smartphone application semiautomatically divided the input photograph into the face and the other parts, and the system used this segmentation to render each part separately.

In the next two sections, we show that our system makes users feel better after recalling the most exciting moments of their days through the generated drawings, and that our portrait etude stylization method produces results that reasonably resemble Toulouse-Lautrec’s etudes.

5.1. Evaluation of the System

We assume that a user takes photographs throughout the day and uses our system after returning home, where the system is installed. For the user study, however, we installed the system in an office at the university. We lent each participant a smartphone with our application installed and a smartwatch, demonstrated how the devices worked, and let them freely take portrait photographs of themselves throughout the day while wearing the watch. We then had them visit the office, demonstrated how the digital photo frame worked, and finally let them admire the original photograph they took and the drawings generated. The participants were 19 undergraduate and graduate students without expertise in fine art. While they admired the photograph and drawings displayed on the digital photo frame, we recorded which styles they chose and whether they had performed semiautomatic segmentation after taking the photograph. We then asked them the following three questions. (1) Do you think you felt better after admiring the drawings? (2) Do you agree that the drawings correspond to the most exciting moment among the photographs you took? (3) Do you think the drawings represent the mood and atmosphere of the moment of photographing well? Answers used the following scale: 1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, and 5 = strongly agree. In addition, we asked them to give reasons and any other feedback.

Figure 18(a) shows the styles participants chose. In the experiments, participants could choose multiple styles, including the original photograph, to display; alternatively, they could select none, in which case all of them were displayed. As shown in the figure, among the 19 participants, 15 chose preferred styles and the others chose nothing. Regarding the chosen styles, the photograph, oil painting, pastel drawing, pencil drawing, and portrait etude were chosen 15, 13, 8, 2, and 7 times, respectively. Note that portrait etudes were not generated in the 7 cases where semiautomatic segmentation was not performed. The result shows that participants mostly preferred the original photograph; among the drawings, oil painting was the most preferred style in our system. Participants responded that the original photograph was easy to recognize and that oil painting was the most familiar style among those we provided. Meanwhile, pencil drawing was chosen only twice; participants mentioned that this was because facial detail was not depicted well in our pencil drawings.

Figure 18(b) shows the answers to question 1. The answers ranged from 3 to 5, with an average of 3.74. From this result, we confirm that participants mostly felt better after admiring the drawings our system generated.

Figure 18(c) shows the answers to question 2. The answers were distributed from 2 to 5, with an average of 3.37, the worst score among our three questions. We assume this is because our system depends solely on heart rate to select the most exciting moment. For instance, a participant answered 2 (disagree) for Figure 19, whose input photograph was taken after exercise; we can suppose that the heart rate was elevated by the exercise rather than by emotion. We present a possible solution for this problem in Section 6.

Lastly, Figure 18(d) shows the answers to question 3. The answers ranged from 3 to 5, with an average of 3.95, the best score in our experiments. Notably, the average score among participants who chose our portrait etude as one of their preferred styles was 4.42. We attribute this to the algorithm effectively conveying the facial expression by focusing mainly on the facial region while representing the other regions abstractly.

Taken together, these results confirm that our system generated portrait drawings that effectively conveyed the mood of the moment of photographing and that users felt better after admiring the drawings, although the drawings did not always correspond to the most exciting moment of the day. Beyond the multiple-choice answers, the participants’ feedback was as follows: the provided styles were too limited; the portrait etude looked incomplete; the pencil drawing lacked detailed depiction; the segmentation process was bothersome; moments other than the best one could also be displayed; and the smartphone application drained the battery too quickly. We suggest several solutions for these issues in Section 6.

5.2. Evaluation of Portrait Etude Rendering

For the portrait etude style proposed in Section 4.1, we conducted a user study as well. The 31 participants, including the 19 from the experiment in Section 5.1, were undergraduate and graduate students without expertise in fine art. We showed them 6 portrait artworks (Figure 20), including Toulouse-Lautrec’s etude, and let them choose the artworks whose styles were similar to our portrait etude results. We allowed multiple selections; consequently, 58 selections were made in total. Figure 20 shows the artworks and the number of participants’ selections. As shown in the figure, Toulouse-Lautrec’s artwork was selected as the most similar to our portrait stylization, being chosen by 28 participants. The artworks of Edgar Degas, Édouard Vuillard, Berthe Morisot, Vincent van Gogh, and Édouard Manet were chosen by 13, 10, 3, 3, and 1 participants, respectively. From this result, we confirm that our portrait etude stylization method generated drawings quite similar to Toulouse-Lautrec’s artwork, which we targeted. Participants who chose the artworks of Toulouse-Lautrec, Degas, and Vuillard said that the line style, halo effect, selective abstraction, and incompleteness were similar to the characteristics found in our drawings. On the other hand, participants who chose Morisot and Van Gogh explained that the strokes used for the facial area in our results were similar to those in their artworks.

We also asked participants how similar Toulouse-Lautrec’s artwork and our resulting drawings are, using the same scale as in Section 5.1. Figure 21 shows the result. The answers were widely distributed from 1 to 5, with an average of 3.32. Although participants chose Toulouse-Lautrec’s artwork as the style most similar to ours, the score was not very high when our results were directly compared with his artwork. In this regard, participants mentioned that painting the nonfacial parts and using more abstract lines would be required to better mimic his artwork.

6. Conclusions and Future Work

In this study, we proposed a system that automatically generates portrait drawings for the purpose of human emotional care. In the system, our smartphone application enables the user to take photographs throughout the day while acquiring heart rates from the smartwatch worn by the user. Meanwhile, the server collects the photographs and heart rates and displays portrait drawings automatically stylized from the photograph of the most exciting moment of the day. Using the system, the user can recall the exciting and happy moments of the day by admiring the drawings and soothe their emotions accordingly. To stylize photographs as portrait drawings, we employed NPR methods, including the portrait etude stylization proposed in this paper. Finally, our system and the proposed stylization method were evaluated through user studies.

We collected several ideas for future work from the participants’ feedback. As mentioned before, participants answered that the drawings our system generated occasionally did not correspond to the most exciting moment of the day. We attribute this to the system’s sole reliance on heart rate for this judgment. To solve this problem, we will employ emotion recognition from facial expressions. In addition, participants mentioned that the semiautomatic segmentation was bothersome. To provide a better user experience, we will employ automatic segmentation and labelling methods based on semantic recognition. We also believe this will make it possible to represent more regions with their own styles, further enhancing our portrait etude stylization.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

The authors thank the participants of the user studies of the paper. This work was supported by the National Research Foundation (NRF) of Korea grant funded by the Korea government (MSIT) (no. NRF-2017R1A2B4007481) and Chung-Ang University research grant in 2017.