
How to conduct effective data analysis

angryTom (original), 2019-07-22 11:44:45


In the second half of the Internet era, with its constant push toward refined operations, product managers can no longer build products on intuition alone. They also need to cultivate data awareness and use data as a basis for continuously improving their products.

Unlike the company's professional data analysts, product managers can look at data from the user and business perspectives, and can therefore find the reasons behind data changes faster and more thoroughly.

So, assuming the data has already been recorded properly, how do you analyze it effectively?

1. Clarify the purpose of data analysis

 1. If the purpose of the analysis is to compare a page before and after a redesign and determine which is better, the indicators to measure should start with page-level dimensions such as click-through rate and bounce rate. E-commerce applications should also watch the order conversion rate, while social applications should focus on visit duration and interactions such as likes and shares.

When designing their own products, many newcomers spend a lot of time on the design itself but put no effort into thinking about how to measure the product's success. Writing an empty phrase such as "the user experience has been improved" in the product documentation neither helps the design pass review smoothly nor helps improve the product's KPIs quickly and effectively.
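As a minimal sketch of what "measurable" could look like, the following Python snippet computes the page-level indicators mentioned above for a page before and after a redesign. The `PageStats` fields and all counts are hypothetical, and the rate definitions (everything divided by visits) are one common convention, not necessarily the one your team uses.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    """Aggregated counts for one page version (field names are hypothetical)."""
    visits: int   # sessions that landed on the page
    clicks: int   # sessions with at least one click on the page
    bounces: int  # sessions that left without any further action
    orders: int   # sessions that ended in an order


def rates(stats: PageStats) -> dict:
    """Return click-through, bounce, and order conversion rates as fractions."""
    if stats.visits <= 0:
        raise ValueError("visits must be positive")
    return {
        "click_through_rate": stats.clicks / stats.visits,
        "bounce_rate": stats.bounces / stats.visits,
        "order_conversion_rate": stats.orders / stats.visits,
    }


def compare(before: PageStats, after: PageStats) -> dict:
    """Difference (after minus before) for each rate, in percentage points."""
    b, a = rates(before), rates(after)
    return {name: round((a[name] - b[name]) * 100, 2) for name in b}


# Made-up numbers, only to show the shape of the comparison.
before = PageStats(visits=10_000, clicks=4_000, bounces=3_000, orders=500)
after = PageStats(visits=10_000, clicks=4_300, bounces=2_800, orders=540)
print(compare(before, after))
```

Writing the expected direction and size of each difference into the product document before launch gives the design review something concrete to check.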

 2. If the purpose of the analysis is to explore why the data of a certain module fluctuates abnormally, the analysis should be broken down step by step following the pyramid principle: version -> time -> user segment.

For example, suppose you find that the click-through rate of the Guess You Like module on the homepage has recently dropped from 40% to 35%, a sharp drop of 5 percentage points. First check which version the fluctuation occurs in: is it caused by an omission or error in the launch of the new version?

If the fluctuation is consistent across versions, look at when the data started to change. Is it due to holiday factors such as Christmas or New Year's Day, or did a new campaign launched in another module on the page pull conversions away from Guess You Like?

If not, break the data down further to see whether the composition of traffic sources has changed, for example whether the drop is caused by a larger share of impressions going to new users.

Product managers need to analyze data with a clear purpose and think about which dimensions must be examined to verify a hypothesis. Most of the time, they need to be very patient and break the problem down step by step to track down the cause.
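The version -> time -> user segment breakdown can be sketched in a few lines of Python. The log rows, field layout, and numbers below are all invented for illustration; in practice the rows would come from your own tracking data.

```python
from collections import defaultdict

# Hypothetical log rows: (version, date, user_segment, impressions, clicks)
ROWS = [
    ("5.0", "12-20", "returning", 1000, 400),
    ("5.0", "12-20", "new", 200, 60),
    ("5.0", "12-27", "returning", 1000, 400),
    ("5.0", "12-27", "new", 800, 240),
    ("5.1", "12-27", "returning", 500, 200),
    ("5.1", "12-27", "new", 400, 120),
]
FIELDS = {"version": 0, "date": 1, "segment": 2}


def ctr_by(rows, dimension):
    """Click-through rate per value of one dimension (version, date, or segment)."""
    idx = FIELDS[dimension]
    totals = defaultdict(lambda: [0, 0])  # value -> [impressions, clicks]
    for row in rows:
        totals[row[idx]][0] += row[3]
        totals[row[idx]][1] += row[4]
    return {value: round(c / i, 3) for value, (i, c) in totals.items()}


def segment_share_by_date(rows, segment):
    """Share of each day's impressions that went to the given user segment."""
    totals = defaultdict(lambda: [0, 0])  # date -> [segment impressions, all impressions]
    for _version, date, seg, impressions, _clicks in rows:
        totals[date][1] += impressions
        if seg == segment:
            totals[date][0] += impressions
    return {date: round(s / t, 3) for date, (s, t) in totals.items()}


# Walk down the pyramid one dimension at a time.
for dim in ("version", "date", "segment"):
    print(dim, ctr_by(ROWS, dim))
print("new-user share", segment_share_by_date(ROWS, "new"))
```

In this made-up data the click-through rate within each segment is the same on both dates, while the share of impressions going to new users rises, so the overall drop comes from the traffic mix rather than from the module itself.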

2. Multi-channel data collection

There are generally four types of collection methods.

 1. Obtain data from external industry analysis reports such as those from Analysys or iResearch. Treat this data with caution: extract the information that is valid and accurate, strip out figures that may be inflated, and always be wary of secondary data that has been processed by others.

 2. Actively collect user feedback from the App Store, customer service, Weibo, and other community forums. In my free time I often browse community forums to read what users post. Such comments tend to be extreme, either glowing praise or harsh criticism, but they are still very useful for improving my own product design: you can try to infer why the user felt that way at that moment.

 3. Take part in questionnaire design, user interviews, and other research yourself. Face users directly, collect first-hand data, and observe the problems users run into and how they feel when using the product. A questionnaire should distill the core questions and keep the number of questions small, and invalid or perfunctory responses should be discarded from the returned results. In user interviews, take care not to use leading words or questions that bias the user's natural reactions.

 4. Study data from recorded user behavior trails. Large companies generally have scheduled reports or emails that give daily or even real-time feedback on online user data. They also provide SQL query platforms to product managers and data analysts so that they can explore and compare data in more depth.
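To give a feel for the kind of query one might run on such a SQL platform, here is a small self-contained sketch using Python's built-in `sqlite3` module. The `events` table, its columns, and the sample rows are assumptions made up for this example; a real platform will have its own schema and SQL dialect.

```python
import sqlite3

# In-memory database with a hypothetical behavior log table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (day TEXT, user_id INTEGER, module TEXT, action TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("2019-07-01", 1, "guess_you_like", "impression"),
        ("2019-07-01", 1, "guess_you_like", "click"),
        ("2019-07-01", 2, "guess_you_like", "impression"),
        ("2019-07-02", 1, "guess_you_like", "impression"),
        ("2019-07-02", 3, "guess_you_like", "impression"),
        ("2019-07-02", 3, "guess_you_like", "click"),
        ("2019-07-02", 4, "guess_you_like", "impression"),
        ("2019-07-02", 4, "guess_you_like", "click"),
    ],
)

# Daily impressions, clicks, and click-through rate for one module.
# In SQLite a comparison evaluates to 0 or 1, so SUM counts matching rows.
DAILY_CTR_SQL = """
SELECT day,
       SUM(action = 'impression') AS impressions,
       SUM(action = 'click')      AS clicks,
       ROUND(1.0 * SUM(action = 'click') / SUM(action = 'impression'), 3) AS ctr
FROM events
WHERE module = 'guess_you_like'
GROUP BY day
ORDER BY day
"""

daily_ctr = conn.execute(DAILY_CTR_SQL).fetchall()
for row in daily_ctr:
    print(row)
```

The same grouping idea extends to version or user segment by adding the corresponding column to `GROUP BY`, provided the log records it.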

3. Effectively eliminate interference data

 1. Choose an appropriate sample size, large enough to eliminate the influence of extreme or accidental data. In the 2008 Olympics, Yao Ming's three-point shooting percentage was 100% and Kobe's was 32%. Does that mean Yao Ming was the better three-point shooter? Clearly there is a problem with that conclusion, because in that Olympics Yao Ming attempted only one three-pointer, while Kobe attempted 53.

 2. Apply the same sampling rules to reduce bias in the conclusions. For example, take two Push messages. The first: "You have a heart-warming takeaway red envelope you haven't claimed yet. The biggest red envelope is reserved for you, the best eater of all; click to enter." The second: "Here's a cold-weather takeaway perk: enjoy hot, delicious food without leaving the house; click to claim." Experimental data showed that the second message's click-through rate was 30% higher than the first's. So is the second message really more attractive? It turned out that the recipients of the second message were significantly more active users than the recipients of the first, so the two groups were not comparable.

 3. Exclude interference from version or holiday factors. Data often looks very good right after a new version launches, because users who upgrade proactively are generally highly active users. When a weekend or major holiday approaches, users' desire to spend is triggered, and the order conversion rate of e-commerce applications rises sharply. Therefore, when comparing data, the experimental group and the control group should cover the same time period.

 4. Forget old historical data. Humans differ from data technology: data technology has 100% recall, whereas according to the Ebbinghaus forgetting curve humans retain only 33% after 1 day, 25% after 6 days, and 21% after 31 days. We therefore need to choose the time window for filtering data sensibly. For example, the Guess You Like module not only applies weighting to interest-tag scores, but also runs a series of regression experiments based on factors such as the product life cycle to obtain decay curves for the audience's various interests and purchase tendencies. Using these time-based patterns to discard stale data effectively improves the module's click-through rate.

 5. Split off an A1 group in the experiment. That is, in addition to experimental group B and control group A, add another group A1 whose rules are identical to A's. Then compare the difference between A and B against the difference between A and A1 to rule out natural or abnormal fluctuations in the data. My own A/B experiments show that setting up the A1 group is important and necessary. No matter how large the data volume is, two groups with identical rules will still show small differences, and in today's refined operations even such small fluctuations can seriously bias our judgment.
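The A/A1/B comparison in the last point can be sketched as follows. This is a deliberately crude rule of thumb, assuming click-through rate is the metric and using made-up counts; it is not a statistical significance test, and the article does not specify a precise decision rule.

```python
def ctr(clicks: int, impressions: int) -> float:
    """Click-through rate as a fraction."""
    return clicks / impressions


def judge_experiment(a, a1, b):
    """Compare the A/B difference against the A/A1 noise floor.

    Each argument is a (clicks, impressions) tuple. A and A1 follow the same
    rules, so the gap between them estimates natural fluctuation.
    """
    noise = abs(ctr(*a) - ctr(*a1))
    lift = ctr(*b) - ctr(*a)
    return {
        "noise_floor_pp": round(noise * 100, 2),  # percentage points
        "lift_pp": round(lift * 100, 2),          # percentage points
        "lift_exceeds_noise": abs(lift) > noise,
    }


# Made-up counts for illustration only.
result = judge_experiment(a=(4000, 10000), a1=(4030, 10000), b=(4050, 10000))
print(result)
```

If the B-versus-A difference is no larger than the A-versus-A1 difference, the apparent improvement cannot be distinguished from ordinary fluctuation and should not drive a decision on its own.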

4. Review the data reasonably and objectively

 1. Don't ignore silent users

Product managers sometimes make decisions after hearing feedback from a few users and spend a lot of time developing the corresponding features. Often, it turns out that these features are urgent needs for only a very small number of users, and most users don't care. They may even run counter to the demands of core users, causing the data to plummet after the new version launches.

Ignoring silent users and failing to comprehensively consider the core needs of most target users of the product may result in a waste of manpower and material resources, or worse, missed business opportunities.

 2. Comprehensively understand the data results

If the experimental results deviate markedly from what our experience and understanding would lead us to expect, do not blindly jump to conclusions and dismiss your intuition. Instead, analyze the data more thoroughly.

For example, I once ran an experiment that showed campaign pop-ups to users on the homepage. The experimental group far outperformed the control group in homepage click-through rate, order conversion rate, and even 7-day retention, and the conversion rate of every module on the homepage improved significantly, far beyond our expectations. So was it really the pop-up that boosted user conversion?

Later we found that users for whom the pop-up was actually displayed tended to be on better networks, typically Wi-Fi, whereas users for whom it was not displayed were more likely in mobile scenarios such as buses, subways, and shopping malls, where connectivity may be poor. This difference skewed the results of the A/B experiment.
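One way to check for this kind of hidden factor is to compare the groups within each network type rather than overall. The sketch below uses invented records and group names; it only illustrates the idea and does not reproduce the actual experiment.

```python
def make_users(group, network, users, orders):
    """Build hypothetical per-user records: (group, network, converted)."""
    return [(group, network, i < orders) for i in range(users)]


# Invented data: the pop-up group is mostly on Wi-Fi, the other mostly on mobile.
records = (
    make_users("popup_shown", "wifi", 80, 40)
    + make_users("popup_shown", "mobile", 20, 4)
    + make_users("popup_not_shown", "wifi", 20, 10)
    + make_users("popup_not_shown", "mobile", 80, 16)
)


def overall_rate(rows, group):
    """Conversion rate of one group, ignoring network type."""
    flags = [converted for g, _network, converted in rows if g == group]
    return sum(flags) / len(flags)


def rate_by_stratum(rows):
    """Conversion rate for each (network, group) pair."""
    counts = {}
    for group, network, converted in rows:
        users, orders = counts.get((network, group), (0, 0))
        counts[(network, group)] = (users + 1, orders + int(converted))
    return {key: orders / users for key, (users, orders) in counts.items()}


print(overall_rate(records, "popup_shown"), overall_rate(records, "popup_not_shown"))
for key, rate in sorted(rate_by_stratum(records).items()):
    print(key, rate)
```

In this made-up data the pop-up group looks much better overall, yet within Wi-Fi and within mobile the two groups convert at the same rate, which is the pattern you would expect when network quality, not the pop-up, drives the difference.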

 3. Don’t rely too much on data

Over-reliance on data will, on the one hand, make us do a lot of data analysis that has no value; on the other hand, it will also limit the inspiration and creativity that product managers should have.

As Luo Zhenyu put it in his "Friends of Time" New Year's Eve speech: give users whatever they want, and guess it even before they say it. This is called the maternal-love algorithm, and in content distribution no one does it better than Toutiao. However, the maternal-love algorithm has a big drawback: the recommendations become narrower and narrower over time.

On the other side is the paternal-love algorithm: standing high and seeing far. It tells users, "Put down the crap in your hands; I'll show you something good, follow me." This is like the iPhone line created by Steve Jobs, who did not rely on market analysis or user research and still built products that exceeded user expectations.

5. Summary

Netflix, the most successful video site in the United States, uses big data to analyze user habits. That analysis reached deep into the creative process and helped shape the hit American series "House of Cards". However, Netflix staff tell us not to be obsessed with big data.

If a TV series scoring 9 counts as a masterpiece, big data can save us from the risk of a low score of 6 or below, but it will also lead us step by step toward mediocrity: the vast majority of such work ends up between 7 and 8.
