Amazon product dataset

Aug 27, 2024 am 06:02 AM

Hi, I found a dataset of Amazon products on Kaggle and decided to look for a relationship between price and star rating.

Full code:
https://github.com/victordalet/Kaggle_analysis/tree/feat/amazon_products


I - Preparing data

To do this, I use pandas and SQLAlchemy to load the csv file into a small SQLite database, and plotly to display the information.

pip install pandas
pip install SQLAlchemy
pip install plotly

In the following script, I extract the data and plot three relationships:

  • price and star rating
  • price and number of ratings
  • star rating and number of ratings

import pandas as pd
from sqlalchemy import create_engine, text
import plotly.express as px


class Main:
    def __init__(self):
        self.result = None

        self.engine = create_engine("sqlite:///my_database.db", echo=False)
        self.df = pd.read_csv("amazon_product.csv")
        # "replace" rebuilds the table on every run; "append" would
        # duplicate every row each time the script is executed.
        self.df.to_sql("products", self.engine, index=False, if_exists="replace")
        self.connection = self.engine.connect()

        # 1) Price vs star rating
        self.get_data()
        self.transform_data()
        self.display_graph("Price vs Star Rating", "Price ($)", "Star rating")

        # 2) Price vs number of ratings
        self.get_data_number_start_and_price()
        self.transform_data()
        self.display_graph(
            "Price vs Number of Ratings", "Price ($)", "Number of ratings"
        )

        # 3) Star rating vs number of ratings
        self.get_data_number_start_and_start()
        self.display_graph(
            "Star Rating vs Number of Ratings", "Star rating", "Number of ratings"
        )

        self.connection.close()

    def get_data(self):
        # The != filter also drops rows whose price is NULL.
        query = text(
            "SELECT product_price, product_star_rating FROM products where product_price != '$0.00'"
        )
        self.result = self.connection.execute(query).fetchall()

    def get_data_number_start_and_price(self):
        query = text(
            "SELECT product_price, product_num_ratings FROM products where product_price != '$0.00'"
        )
        self.result = self.connection.execute(query).fetchall()

    def get_data_number_start_and_start(self):
        query = text(
            "SELECT product_star_rating, product_num_ratings FROM products where product_price != '$0.00'"
        )
        rows = self.connection.execute(query).fetchall()
        # Convert Row objects to plain lists before plotting.
        self.result = [[row[0], row[1]] for row in rows]

    def transform_data(self):
        # Turn a price string such as "$12.99" into a float. Stripping commas
        # is a precaution in case some prices use thousands separators.
        self.result = [
            [float(row[0].replace("$", "").replace(",", "")), row[1]]
            for row in self.result
        ]

    def display_graph(self, title, x_label, y_label):
        # Named columns give each scatter plot readable axis labels.
        data = pd.DataFrame(self.result, columns=[x_label, y_label])
        fig = px.scatter(data, x=x_label, y=y_label, title=title)
        fig.show()


Main()
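
The price column is stored as text (for example "$12.99"), so it has to be converted to a number before plotting. The helper below is a minimal sketch of a more defensive conversion, assuming some rows may contain thousands separators, missing values, or non-numeric text. The name parse_price and the sample values are hypothetical and are not part of the original script.

```python
from typing import Optional


def parse_price(raw: object) -> Optional[float]:
    """Convert a price string like "$1,299.99" to a float.

    Returns None when the value is missing or cannot be parsed.
    """
    if not isinstance(raw, str):
        # Covers None and NaN values coming from empty CSV cells.
        return None
    cleaned = raw.strip().replace("$", "").replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        return None


# Hypothetical sample values, not taken from the dataset.
prices = ["$12.99", "$1,299.00", None, "N/A"]
print([parse_price(p) for p in prices])
```

Rows for which the helper returns None could then be dropped before building the scatter plot, instead of stopping the whole script on the first bad value.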

II - Result

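Price and star rating

[Figure: scatter plot of price against star rating]

Price and number of ratings

[Figure: scatter plot of price against number of ratings]

Star rating and number of ratings

[Figure: scatter plot of star rating against number of ratings]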

III - Conclusion

We can see that there is no strong relationship between price and rating, although higher-priced products tend to be rated somewhat lower. Products with more reviews also tend to have higher ratings.
This seems logical: if a product is bought a lot, it is probably popular.
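
These observations come from reading the scatter plots by eye. One possible way to put a number on them is a correlation coefficient. The sketch below is not part of the original script: it computes Pearson's r in plain Python, and the sample prices and ratings are made up for illustration only.

```python
import math


def pearson(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up sample values, not taken from the dataset.
sample_prices = [10.0, 20.0, 30.0, 40.0]
sample_ratings = [4.8, 4.5, 4.4, 4.0]
print(pearson(sample_prices, sample_ratings))  # negative: rating falls as price rises
```

A value near 0 would support "no relationship", while a clearly negative value would support "the higher the price, the lower the rating". With the data already in pandas, DataFrame.corr offers the same measure, and its method="spearman" option may be a better fit for skewed columns such as the number of ratings.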
