Integrating Apache Spark with MySQL to Read Database Tables
To connect Apache Spark to MySQL and use a database table as a Spark DataFrame, follow these steps:
Create a Spark session:
<code class="python">from pyspark.sql import SparkSession

# Create a Spark session object
spark = SparkSession.builder \
    .appName("Spark-MySQL-Integration") \
    .getOrCreate()</code>
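The JDBC read only works if the MySQL Connector/J jar is on Spark's classpath, which the session above does not arrange by itself. The following is a minimal sketch of one way to do that through the `spark.jars.packages` setting. The Maven coordinates and version are an assumption, and `connector_conf` and `build_session` are hypothetical helper names, not part of PySpark.

```python
# Assumed Maven coordinates for MySQL Connector/J; choose the version that suits your server.
MYSQL_CONNECTOR = "com.mysql:mysql-connector-j:8.0.33"


def connector_conf(coords=MYSQL_CONNECTOR):
    """Return Spark config entries that ask Spark to fetch the driver jar."""
    return {"spark.jars.packages": coords}


def build_session(app_name, conf):
    """Create (or reuse) a SparkSession with the given config entries applied."""
    # Imported lazily so connector_conf() can be used without Spark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Usage (needs PySpark and network access to a Maven repository):
# spark = build_session("Spark-MySQL-Integration", connector_conf())
```

Because `getOrCreate()` reuses a running session, this setting should be supplied before the first session is started; alternatively, the jar can be passed on the command line with `--jars` or `--packages`.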
Get a DataFrameReader for the MySQL connection:
<code class="python"># spark.read returns the session's DataFrameReader, so no extra import
# or manual construction is needed
jdbc_df_reader = spark.read</code>
Configure the MySQL connection parameters:
<code class="python"># Set MySQL connection parameters
jdbc_params = {
    "url": "jdbc:mysql://localhost:3306/my_db",
    # Connector/J 8.x class name; Connector/J 5.x uses "com.mysql.jdbc.Driver"
    "driver": "com.mysql.cj.jdbc.Driver",
    "dbtable": "my_table",
    "user": "root",
    "password": "password"
}</code>
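Hardcoding `root` and a literal password is fine for a local demo but risky elsewhere. As a sketch, the same parameters could be assembled from environment variables instead. The variable names (`MYSQL_HOST`, `MYSQL_PORT`, `MYSQL_DATABASE`, `MYSQL_USER`, `MYSQL_PASSWORD`) and the helper `mysql_jdbc_options` are assumptions made for illustration.

```python
import os


def mysql_jdbc_options(table, env=None):
    """Build Spark JDBC options for MySQL from an environment-style mapping."""
    env = os.environ if env is None else env
    host = env.get("MYSQL_HOST", "localhost")
    port = env.get("MYSQL_PORT", "3306")
    database = env.get("MYSQL_DATABASE", "my_db")
    return {
        "url": f"jdbc:mysql://{host}:{port}/{database}",
        "driver": "com.mysql.cj.jdbc.Driver",  # Connector/J 8.x class name
        "dbtable": table,
        # Credentials are required; a missing variable raises KeyError early.
        "user": env["MYSQL_USER"],
        "password": env["MYSQL_PASSWORD"],
    }


# Usage: jdbc_params = mysql_jdbc_options("my_table")
```

The resulting dictionary has the same keys as `jdbc_params`, so it can be passed to `.options(**...)` unchanged.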
Read the database table:
<code class="python"># Read the MySQL table as a Spark DataFrame
dataframe_mysql = jdbc_df_reader.format("jdbc") \
    .options(**jdbc_params) \
    .load()

# Print the DataFrame schema
dataframe_mysql.printSchema()</code>
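A plain JDBC read like the one above pulls the whole table through a single connection. For larger tables, Spark's JDBC source accepts `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions` to split the read into parallel queries. The sketch below only builds the option dictionary; the helper name `partitioned_read_options` and the column `id` in the usage comment are hypothetical.

```python
def partitioned_read_options(base_options, column, lower, upper, num_partitions):
    """Return a copy of base_options extended with Spark's JDBC partitioning options."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    options = dict(base_options)  # leave the caller's dictionary untouched
    options.update({
        "partitionColumn": column,      # must be a numeric, date, or timestamp column
        "lowerBound": str(lower),       # bounds set the partition stride only;
        "upperBound": str(upper),       # they do not filter rows
        "numPartitions": str(num_partitions),
    })
    return options


# Usage (assumes the table has a numeric column named "id"):
# opts = partitioned_read_options(jdbc_params, "id", 1, 1_000_000, 8)
# dataframe_mysql = spark.read.format("jdbc").options(**opts).load()
```

The bounds should roughly match the actual range of the partition column; rows outside the range are still read, but they all land in the first or last partition.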