Abstract

Using keywords crawled from online big data as a reference for survey design, this research applies latent category cluster analysis and binary logistic regression to analyze differences in community group buying behavior among residents of cities of different scales and to summarize the shopping behavior and characteristics of different types of residents. The aim is to suggest differentiated marketing methods for different types of urban residents, so that community e-commerce can achieve precision marketing and promote the further development of the industry.

1. Introduction

1.1. Research Background

In early 2020, the sudden outbreak of COVID-19 greatly changed people’s way of life. Due to the pandemic, residents were forced to isolate at home. Under these circumstances, work and social life could only be conducted through the Internet, and it became difficult for people to purchase daily necessities. The community group buying industry in China therefore grew explosively during the pandemic, as it offered a convenient way to purchase daily necessities. According to iiMedia Research (a consulting agency) [1], the community group buying market grew by more than 100% and its scale reached 72 billion yuan. After falling into a trough in 2019, community group buying thus ushered in a new development boom in 2020.

In addition, we found that over the past two years community group buying developed mainly in the sinking (lower-tier) market and did not enter first-tier and new first-tier cities such as Beijing, Shanghai, Guangzhou, and Shenzhen until the first half of 2020 [2]. We therefore ask whether residents’ purchase intentions and consumption behavior in community group buying differ significantly with the scale of the city where they live, and whether the influencing factors differ as well.

Now we are in the post-epidemic era; the community group buying industry is accelerating its integration, and more entrepreneurs want a share of it, making competition in this industry increasingly fierce. At the same time, people’s reliance on community group buying has decreased as the pandemic has eased [3]. If residents’ consumption behavior in community group buying differs significantly with the scale of the city, then different marketing should be adopted for residents of cities of different scales. Such an approach would help community e-commerce achieve precision marketing and promote the further development of the industry.

1.2. Current Status of Research Conducted in China and Abroad

A review of domestic and foreign literature shows that people’s consumption behavior is mainly affected by internal factors such as gender, age, education, income, price awareness, quality awareness, and personal preferences, and is also affected by external stimuli such as the platform and the social environment. In their analysis of the causes of willingness to buy fresh agricultural products online, Chen and Lu [4] showed that basic personal characteristics such as gender, age, and income are significantly related to whether consumers buy fresh products online. In research on the factors affecting consumers’ willingness to buy imported fresh fruits online, He [5] found that the main factors are personal cognitive characteristics, such as the perceived saving of time and labor and the rich variety of imported fruits. Guo and Xu [6] explored the factors influencing customers’ online shopping behavior against the background of “Internet +” and found that external environmental factors such as merchants, logistics, websites, and commodities are the most direct factors affecting customer satisfaction.

In their research on the influence of online shopping festivals on consumers’ online shopping intentions during the “Double Eleven Shopping Carnival,” Bai and Liu [7] found that external environmental stimuli such as festival atmosphere and panic buying have a positive effect on consumers’ shopping mood and, therefore, enhanced consumers’ shopping intention.

In summary, existing research mainly analyzes the factors that affect consumers’ willingness to shop online and rarely addresses consumers’ community group buying willingness and behavior from the perspective of city scale. Therefore, taking community group buying behavior as the carrier and city scale as the perspective, this research applies latent category cluster analysis, the chi-square test, and a binary logistic regression model to study the impact of different factors on urban residents’ consumption behavior.

1.3. Collect the Data with the Octopus Software

In the 21st century, information is growing explosively. The era of big data arrived long ago, and people handle big data with a variety of tools, such as cloud storage, cloud computing, and Python. As a product of the high-tech era, big data serves as a driving force of world economic development, and contemporary industries, social networks, and companies are inseparable from it. According to the “Big Data Special” report of Chinese Entrepreneurs, Nongfu Spring uses big data to sell mineral water. In 2011, SAP launched the database platform SAP HANA, with which real-time reporting could be achieved, in contrast to earlier practice without big data. Big data can thus support forward-looking decisions and help optimize the allocation of resources. This study therefore analyzes the online shopping behavior of urban residents on the basis of big data. Starting from the current background, the authors put forward the research topic, reviewed research in China and abroad, and determined the entry point of the research. They then used the Octopus crawler software to collect data and processed the data into a word cloud map to analyze which online shopping modes residents prefer and how these preferences relate to shopping behavior, products, and other information, thereby obtaining the characteristics of consumer shopping behavior.

2. Questionnaire Design and Sample Composition of a Community Group Buying Behavior

2.1. Investigation Method

This study adopts a questionnaire survey method using online questionnaires. On the one hand, the questionnaires were distributed through relatives and friends to people in their areas. On the other hand, they were distributed to randomly searched community social groups on the Internet and posted at random on social media platforms such as Weibo and Douban, so as to obtain more valid sample data. A total of 800 questionnaires were issued and 750 were collected, of which 672 were valid, a validity rate of 89.60% of those collected. All the collected questionnaires were recorded and sorted. In the end, 672 samples from first-tier, second-tier, and lower-tier cities in Beijing, Guangdong Province, Hunan Province, and Henan Province were obtained to understand the basic situation of community group buying behavior in cities of different scales.

2.2. Questionnaire Design

On the basis of a literature review, the questionnaire was designed in combination with the opinions of relevant experts. We included items such as gender, age, city, and monthly income to capture respondents’ basic information. We also included items on monthly purchase frequency, consumption level, purchase channels, shopping type, and so on to capture the characteristics of consumption behavior in community group buying.

This survey uses a presurvey method to evaluate the reliability and validity of the questionnaire and adjusts the content and structure of the questionnaire based on the survey situation and evaluation opinions. After the survey was completed, we checked the completeness of the questionnaire and logic of all questionnaires and removed the unqualified questionnaires.

2.3. Statistical Description of Individual Information

Among the 672 respondents, the two city scales were roughly balanced: 311 respondents (46.3%) came from large-scale cities and 361 (53.7%) from small-scale cities. More women were interviewed than men, with women accounting for 56.1% of respondents, and the majority were unmarried, accounting for 58.9%. Respondents were mainly young, with 53.9% aged 18–25 years. More respondents lived with others than alone; those in households of more than four people accounted for 38.4%. Monthly income was concentrated at the low to medium level: respondents with a monthly income of 5,000 yuan or less accounted for 66.7% of the total, and those with 10,000 yuan or more accounted for 10.1%. Education was mostly at the undergraduate level, accounting for 62.4% of the total. Students were the largest group, followed by employees of enterprises and public institutions and by individual business owners, accounting for 46.6%, 22.9%, and 9.8%, respectively. Among the respondents, 287 have participated in community group buying, accounting for 42.7% of the total.

3. Data Processing and Analysis of Group Buying Behavior of Urban Residents

3.1. Data Statistical Tools and Methods

This study used SPSS 25.0 and Mplus 8.0 to process the data. Mplus was used to perform the latent category cluster analysis. SPSS was used to perform chi-square tests of the correlation between the latent categories and residents’ basic information. Binary logistic regression models were then used to test how strongly each basic information variable influences community group buying behavior in each latent category of residents.

3.2. Model Construction and Variable Selection
3.2.1. NVivo Software Analysis

NVivo is a powerful qualitative analysis software package that can analyze a variety of data. This research uses NVivo to compute statistics on the collected text data, form a word cloud map, and then perform cluster analysis for further research. The authors interpret the word cloud as follows: the larger the area a term occupies in the word cloud map, the stronger consumers’ willingness to shop online in that way and the more they are inclined to choose that type of purchasing.
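The word-frequency step behind a word cloud can be sketched in a few lines of Python. This is only an illustration of the counting idea, not NVivo’s implementation; the text snippets, the stop-word list, and the function names below are hypothetical and are not taken from the study’s corpus.

```python
import re
from collections import Counter

# Hypothetical crawled snippets; the study's actual corpus is not reproduced here.
SNIPPETS = [
    "Community group buying platform offers fresh fruit at a low price",
    "Price and quality matter most when buying fresh food in the community",
    "Delivery service and after-sales quality of the group buying platform",
]

# Small illustrative stop-word list (an assumption, not NVivo's built-in list).
STOP_WORDS = {"a", "at", "and", "the", "of", "in", "when", "most", "offers"}


def word_frequencies(texts, stop_words=STOP_WORDS):
    """Count how often each non-stop word occurs across all texts."""
    counts = Counter()
    for text in texts:
        # Lower-case tokens; hyphenated words such as "after-sales" stay whole.
        tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
        counts.update(t for t in tokens if t not in stop_words)
    return counts


def relative_weights(counts):
    """Share of each word in the total count; a word cloud scales size by this."""
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


freqs = word_frequencies(SNIPPETS)
weights = relative_weights(freqs)
for word, n in freqs.most_common(5):
    print(f"{word:12s} {n:2d} {weights[word]:.3f}")
```

Words with a larger relative weight would be drawn larger in the word cloud, which is the quantity the interpretation above relies on.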

3.2.2. Latent Category Cluster Analysis Model Construction and Variable Selection

Latent category analysis (also referred to below as potential category analysis) is a statistical model that describes the interrelationship among a set of categorical variables; as a clustering method, it is suitable for exploratory research.

This study first uses Mplus 8.0. From the perspective of different city scales, it takes the consumption behavior characteristics of community group buying and the scale of the city of permanent residence as observed variables, conducts an exploratory potential category analysis, and assesses model fit through specific indicators.

In the end, this study set a total of 11 categorical variables, such as urban scale, shopping channels, consumption level, purchase frequency, shopping preference, and important characteristics.
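To make the model concrete: in a latent category model each respondent belongs to one unobserved class, and the categorical indicators are assumed independent within a class. Given class proportions and class-conditional response probabilities, the probability that a respondent belongs to each class follows from Bayes’ rule. The sketch below shows only this classification step, under stated assumptions; the two classes, two binary indicators, and all probabilities are hypothetical, and parameter estimation (done by Mplus in the study) is not shown.

```python
from math import prod


def class_posteriors(pattern, class_weights, item_probs):
    """Posterior probability of each latent class for one response pattern.

    pattern       -- tuple of observed category indices, one per indicator
    class_weights -- prior class proportions, summing to 1
    item_probs    -- item_probs[c][j][k] = P(indicator j = category k | class c)
    """
    # Joint probability of the pattern and each class (local independence).
    joint = [
        w * prod(item_probs[c][j][k] for j, k in enumerate(pattern))
        for c, w in enumerate(class_weights)
    ]
    total = sum(joint)  # marginal probability of the response pattern
    return [v / total for v in joint]


# Hypothetical parameters: 2 classes, 2 binary indicators.
class_weights = [0.6, 0.4]
item_probs = [
    [[0.8, 0.2], [0.7, 0.3]],  # class 0
    [[0.3, 0.7], [0.2, 0.8]],  # class 1
]

post = class_posteriors((1, 1), class_weights, item_probs)
print([round(p, 4) for p in post])
```

A respondent is usually assigned to the class with the highest posterior probability.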

3.2.3. Chi-Square Test Analysis

The chi-square test is a widely used hypothesis testing method. It measures the degree of difference between the observed frequencies of a sample and the frequencies expected in theory. It is usually expressed by Pearson’s chi-square statistic, and the asymptotic significance is used as the indicator of whether two categorical variables are associated. This study takes an asymptotic significance of less than 0.01 as indicating a strong correlation between the two variables.

In this study, SPSS 25.0 was used to perform a chi-square test to analyze the basic information and potential categories of the interviewees and to initially screen out variables with strong correlations with the potential categories.
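As an illustration of the statistic, the following sketch computes Pearson’s chi-square for a contingency table and, for the 2 × 2 case, its asymptotic significance. It is not the SPSS procedure: the counts are hypothetical, no continuity correction is applied, and the p-value helper is only valid for one degree of freedom.

```python
from math import erfc, sqrt


def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def p_value_one_dof(stat):
    """Asymptotic p-value for 1 degree of freedom: erfc(sqrt(stat / 2))."""
    return erfc(sqrt(stat / 2.0))


# Hypothetical counts: rows = city scale (small, large),
# columns = (participated, did not participate).
table = [[60, 40], [35, 65]]
stat, dof = pearson_chi_square(table)
p = p_value_one_dof(stat)
print(f"chi-square={stat:.3f}, dof={dof}, significant at 0.01: {p < 0.01}")
```

With the 0.01 threshold adopted in this study, a p-value below 0.01 would be read as a strong correlation between the two variables.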

3.2.4. Binary Logistic Regression Model Construction and Variable Selection

Binary logistic regression is a generalized linear regression model for a binary categorical dependent variable. It is often used in data mining, economic forecasting, and other fields. This study establishes a binary logistic regression model for each potential category and examines whether the personal basic information variables that are strongly correlated with the potential categories have a significant impact on the community group buying behavior of residents in each category. The explanatory variables are the respondent’s age, gender, occupation, marital status, monthly income, and number of people living together, and the dependent variable is whether the respondent belongs to the potential category.
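A minimal sketch of a binary logistic regression with class membership as the outcome is given below. It assumes a single hypothetical binary predictor and made-up data, and it swaps in plain gradient ascent on the log-likelihood for the estimation routine of a statistics package, so it illustrates the model form rather than the SPSS procedure used in the study.

```python
from math import exp, log


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit intercept and coefficients by batch gradient ascent on the log-likelihood."""
    beta = [0.0] * (len(rows[0]) + 1)  # beta[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * len(beta)
        for x, y in zip(rows, labels):
            p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))
            err = y - p  # gradient contribution of one observation
            grad[0] += err
            for k, v in enumerate(x, start=1):
                grad[k] += err * v
        beta = [b + lr * g / len(rows) for b, g in zip(beta, grad)]
    return beta


# Hypothetical data: one binary predictor (1 = married, 0 = unmarried) and a
# binary outcome (1 = respondent belongs to the class of interest).
rows = [[1]] * 8 + [[0]] * 8
labels = [1] * 6 + [0] * 2 + [1] * 2 + [0] * 6

beta = fit_logistic(rows, labels)
odds_ratio = exp(beta[1])
print(f"intercept={beta[0]:.3f}, coefficient={beta[1]:.3f}, odds ratio={odds_ratio:.2f}")
```

For this toy table the maximum-likelihood solution is known in closed form (intercept ln(1/3), coefficient ln 9), so the fitted values can be checked against it. The exponentiated coefficient is the odds ratio: how many times higher the odds of class membership are when the predictor equals 1.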

4. Results and Analysis of Group Buying Behavior in Three Urban Communities

4.1. NVivo Analysis

After the text data was collected by Octopus crawler software, NVivo software was used to perform statistical analysis on text big data, count the frequency of word occurrences, and analyze the concerns of the consumer community during group purchases. Figure 1 shows the information that urban residents cared about during online shopping.

All the words in Figure 1 are closely related to community group buying. The results indicate that, because of the pandemic, commodity operation in the market has further developed towards community group buying. “Sink” and “city” reflect that the city scale is changing. The goods purchased are mainly fresh produce and fruit. Shopping channels mainly include Meituan selection optimization, Xingsheng Optimal, and related stores and convenience stores; the products are then delivered to the home to improve online shopping efficiency, and a supply chain is formed in this way. Customers, cost, capital, commission, and so forth reflect residents’ consumption level. Price, demand, quality, and after-sales reflect the factors that consumers care about when shopping. Community group buying platforms are mainly provided through mini programs, WeChat, online communities, and so on. The analysis is based on factors such as city scale, monthly purchase frequency, consumption level, purchase channels, and shopping types, to reflect consumers’ purchase characteristics in community group buying under given circumstances.

4.2. Cluster Analysis

The study starts with the single-category initial model and selects latent category models from single category to 7 categories to explore the minimum number of potential categories that can fully explain the relationship between the explicit variables of residents’ consumption behavior.

The indicators used in this study are as follows: Log (L), the log likelihood function value; AIC, the Akaike information criterion; BIC, the Bayesian information criterion; aBIC, the sample-size-adjusted BIC; Entropy; LMR, the Lo-Mendell-Rubin likelihood ratio test; and BLRT, the parametric bootstrap likelihood ratio test.

Studies have shown that the smaller the values of Log, AIC, BIC, and aBIC, the better the model fits; the higher the Entropy value, the more accurate the latent category classification; and significant LMR and BLRT values indicate that the model with n categories fits better than the model with n − 1 categories. Table 1 reports the data fit from the single-category model to the 7-category model.
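For reference, the information criteria and the entropy index can be computed from a model’s log-likelihood, number of free parameters, sample size, and posterior class probabilities using their standard definitions, as sketched below. The log-likelihood and parameter count in the example are hypothetical and are not the values in Table 1; only the sample size of 287 comes from the study.

```python
from math import log


def information_criteria(log_likelihood, n_params, n_samples):
    """AIC, BIC, and sample-size-adjusted BIC from the standard definitions."""
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * log(n_samples),
        # The adjusted BIC replaces n with (n + 2) / 24.
        "aBIC": deviance + n_params * log((n_samples + 2) / 24),
    }


def relative_entropy(posteriors):
    """Entropy index in [0, 1]; 1 means every case is classified with certainty."""
    n = len(posteriors)
    k = len(posteriors[0])
    uncertainty = -sum(p * log(p) for row in posteriors for p in row if p > 0)
    return 1.0 - uncertainty / (n * log(k))


# Hypothetical fit: log-likelihood -1500 with 40 free parameters, n = 287.
ic = information_criteria(-1500.0, 40, 287)
print({name: round(value, 2) for name, value in ic.items()})
print(round(relative_entropy([[0.9, 0.1], [0.2, 0.8]]), 3))
```

Because each added class adds parameters, the penalty terms grow with the number of classes, so the criteria can reach a minimum at an intermediate number of classes.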

The results in Table 1 show that Log (L) decreases as the number of categories increases. The information criteria AIC and aBIC reach their minimum values when the model has 4 categories, and the BLRT value reaches a very significant level, indicating that the 4-category model fits better than the 3-category model. Generally speaking, when the number of samples is not more than 1000, it is recommended to judge model fit with the AIC index. In total, 287 samples are analyzed in this study, so AIC can be used as the decision-making indicator for model suitability. Among the 7 models, the AIC value is lowest when the number of categories is 4 (3229.823), so this study selects the best-fitting model with 4 latent categories (Class1, Class2, Class3, and Class4).

The cluster icicle diagram is shown in Figure 2. By observing the height of the white strips, the 287 samples can be divided into 4 categories. According to the Mplus results, the frequency of Class1 is 42, that of Class2 is 106, that of Class3 is 53, and that of Class4 is 86. The x-axis represents the observed samples, and the y-axis represents the frequency that can be divided into each category (Class). Each sample corresponds to a blue strip, and the 287 sample strips have the same length. There is also a white strip between every two sample strips, whose length indicates the degree of similarity between the two samples: the higher the similarity, the longer the white strip.
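The share of each class follows directly from the four frequencies reported above. A small check, using only those counts (the variable names are illustrative):

```python
# Class frequencies as reported from the Mplus output.
class_sizes = {"Class1": 42, "Class2": 106, "Class3": 53, "Class4": 86}

total = sum(class_sizes.values())
# Percentage share of each class, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in class_sizes.items()}

print(total)
for name, share in shares.items():
    print(f"{name}: {share}%")
```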

The average attribution probability matrices of 4 potential categories are shown in Table 2. The average probability distribution of each potential category is between 72% and 90%, indicating that the models with 4 potential categories are reliable.
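An average attribution probability matrix of this kind is typically obtained by assigning each respondent to the most likely class and then averaging the posterior probabilities within each assigned class. The sketch below illustrates that calculation under this assumption; the posterior probabilities are hypothetical and the function name is illustrative.

```python
def average_attribution_matrix(posteriors):
    """Row c holds the mean posterior vector of cases whose most likely class is c."""
    k = len(posteriors[0])
    sums = [[0.0] * k for _ in range(k)]
    counts = [0] * k
    for row in posteriors:
        modal = max(range(k), key=lambda c: row[c])  # most likely class
        counts[modal] += 1
        for c, p in enumerate(row):
            sums[modal][c] += p
    # Classes with no assigned cases get a row of zeros.
    return [
        [s / counts[c] if counts[c] else 0.0 for s in sums[c]]
        for c in range(k)
    ]


# Hypothetical posterior probabilities for four respondents and two classes.
posteriors = [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8], [0.4, 0.6]]
matrix = average_attribution_matrix(posteriors)
for row in matrix:
    print([round(p, 2) for p in row])
```

The diagonal entries are the average probabilities with which respondents are attributed to their own class; values close to 1 indicate a clear separation between classes.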

A comprehensive analysis of Table 3 and Figure 3 shows that the 287 urban residents with community group buying behavior are classified into Class1, Class2, Class3, and Class4, with the following characteristics.

Class1 contains 42 residents, accounting for 14.6%. Their city scale is relatively evenly distributed, and their monthly purchase frequency is high. They usually conduct community group buying through a community group buying app, and the rich variety of goods is not a reason for their purchases. More than 75% of them buy both food and nonfood goods through community group buying.

Class2 contains 106 residents, accounting for 36.9%. They mostly live in smaller cities and have a low monthly purchase frequency. Most purchase through community group buying apps, and they choose community group buying because of the rich variety of goods, time saving, affordable prices, quality assurance, and good service quality. All of them buy nonfood goods through community group buying.

Class3 contains 53 residents, accounting for 18.5%. They mostly live in small-scale cities and have a low monthly purchase frequency and a high consumption level. Most of their purchases are self-organized through WeChat group chats, and they choose community group buying because of the rich variety of goods, time saving, better quality assurance, and distribution service. Price concessions are less important to them, and all of them buy food goods.

Class4 contains 86 residents, accounting for 30.0%. Their city scale is relatively evenly distributed. Most purchase through community group buying apps, and their consumption level is low. They choose community group buying because of the rich variety of commodities, time saving, affordable prices, quality assurance, and good distribution service, and all of them buy food goods.

It can also be seen from Figure 2 that the consumption behavior characteristics of Class2 and Class4 are similar. The main difference lies in the types of goods purchased through community group buying. Most residents of Class2 live in small-scale cities, and all of them buy nonfood products through community group buying. Most residents of Class4, whose city scale is evenly distributed, do not buy nonfood products through community group buying. Most residents of Class3 live in small-scale cities; their response probabilities for monthly purchase frequency, purchase channel, and consumption level are significantly higher than those of the other three categories, while their response probability for choosing community group buying because of affordable prices is significantly lower. The city scale of Class1 is evenly distributed, and its response probabilities for participating in community group buying because of the rich variety of goods, time saving, better quality assurance, and good distribution service are significantly lower than those of the other three categories.

4.3. Inspection and Analysis of the Chi-Square Test

First, we analyze the correlation between city scale and two variables: residents’ understanding of community group buying and whether they have participated in community group buying.

It can be seen from Table 4 that residents of small-scale cities understand community group buying better than residents of large-scale cities. The share of residents participating in community group buying in small-scale cities (50.1%) is significantly higher than that in large-scale cities (34.1%).

According to the chi-square test in Table 5, the asymptotic significance is below 0.01, which reaches a significant level. This indicates that both variables, respondents’ understanding of community group buying and whether they have participated in it, are strongly correlated with city scale.

Second, we study the correlation between respondents’ basic information and the potential categories and make a preliminary screening of the basic information variables that affect residents’ community group buying behavior. The basic information comprises age, gender, occupation, marital status, number of people living together, and education level.

From Table 6, it can be seen that, except for education level (asymptotic significance >0.01), the basic information variables have a strong correlation with the potential category.

In conclusion, since education level has no significant correlation with the potential category, it is excluded from the explanatory variables when analyzing the data with the binary logistic regression model.

4.3.1. Analysis of the Binary Logistic Regression Model

In order to further explore the difference of community group consumption behavior of residents from cities of different scale, this study uses the binary logistic regression model on four potential categories with different characteristics by using basic information that is highly correlated with potential categories as independent variables.

According to Tables 7 and 8, the impact of the birth year on all 4 groups was insignificant. This may be caused by the rapid development of Internet big data, as well as the increasing advancement of science and technology in China. Under this background, consumers of different birth years are all affected by the same external environment.

When studying the factors that influence the community group buying behavior of the different groups, the authors find the following.

For the Class1 group, monthly income has a significant impact on community group buying behavior. The monthly income of Class1 consumers is at the middle and low levels among the four groups. Such consumers generally do not have high requirements for the richness of product categories. Their focus is on meeting daily needs, and they pay more attention to affordable prices.

For the Class2 group, none of the basic information variables is significant. The reason is that in smaller cities households are more closely connected, information is exchanged more frequently, and the information received is similar, so basic information has less influence on the consumption behavior of this group.

For the Class3 group, gender, the number of people living together, marital status, monthly income, and occupation all have a significant impact on community group buying behavior. The reason is that married, high-income men are the main shoppers in Class3; because they have more family members, they need to buy a large quantity of daily necessities each time to meet family needs. High-income men are also busy with work, so they participate in community group purchases relatively infrequently and care about saving time and about the quality of family life. The variety of commodities, product quality, and the speed of delivery service on the community group buying platform are therefore particularly important to them.

For the Class4 group, marital status and occupation have a significant influence. Its members are mostly unmarried, and more than half are students. They usually only need to cover daily necessities; most live collectively and lack good food storage conditions, so they tend to buy only for short-term needs. Their community group purchase frequency is therefore higher, and their purchase amount lower, than those of the other groups.

5. Conclusions and Suggestions of Differences in Four Urban Residents

5.1. Main Conclusions

Based on the survey data of community group buying behaviors of 672 respondents in cities of different sizes, latent category clustering analysis and binary logistic regression analysis are carried out, and the following research conclusions are drawn.

The differences in community group buying behavior among residents of cities of different sizes are as follows.

Small-scale urban residents account for a larger proportion of the two consumer behavior categories, Class2 and Class3, which is because, in small-scale cities, community group buying is more popular and has a wider audience. Through latent category cluster analysis, the 287 urban residents who have participated in community group purchases are roughly divided into 4 types of consumption behavior, among which the consumption behaviors of urban residents in the Class3 and Class4 categories have obvious grouping characteristics. On the whole, only the year of birth has no significant impact on the four groups. Monthly income has a significant impact on the consumption behavior of Class1 urban residents, and the number of people living together, occupation, marital status, and gender have a significant impact on the consumption behavior of Class3 urban residents. Marital status and occupation have a significant impact on the consumption behavior of Class4 urban residents.

5.2. Countermeasures and Suggestions

On the whole, the vast majority of consumers value the safety and quality of commodities, the quality of the community group buying platform, and good distribution service. Strengthening the quality of goods and improving the platform’s after-sales service will therefore encourage more users to choose community group buying, and thoughtful distribution service is a further advantage. In addition, more than half of the urban residents surveyed have not used community group buying, and the chi-square test results show that understanding of community group buying is strongly correlated with city scale. Residents of large-scale cities participate less in community group buying and understand it less. It is recommended to use big data advertising to promote community group buying as a new shopping mode.

Starting from the consumption behavior of urban residents, it is found that consumers of different groups also show different urban scale distribution and shopping behavior characteristics. Therefore, for the community e-commerce company to accurately identify the target customers and achieve accurate marketing, the specific suggestions for four types of urban residents are as follows.

For Class1 group, because they have low demand for the richness of product types and higher purchase frequency and they are more concerned about price benefits, community e-commerce companies can carry out a large promotion of several commodities on the e-commerce platform according to the psychology of such consumers, so as to attract such people to come to consume.

In view of Class2 group having no obvious influencing factors, community e-commerce platforms should be more targeted in advertisement. They can conduct in-depth investigation into all walks of life, explore the more detailed shopping needs of such groups, and provide a clear target direction for future advertisement, as well as discount activities.

For Class3 group, community platforms should increase this group’s dependence on the platform and make the platform harder to replace, for example through more efficient delivery service and better quality assurance, because most individuals in this group prefer to organize group purchases on their own through WeChat group chats. The community e-commerce platform can promote the further growth of its shopping groups by attracting more enthusiastic mothers or community store owners to become group leaders.

Aiming at the Class4 group, based on the characteristics that most of its individuals are students, the e-commerce platform can adopt means such as product promotions and student discounts to develop more potential student users or also set up more community group buying sites on campus to improve the convenience of community group buying so as to attract more students to join in the community group buying.

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.