How to save files from Spark using Python


To save a DataFrame to a file from Spark using Python (PySpark), use the function and steps below.

Functions used :

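df.write.save("Filepath&name", format="fileformat"): Saves the Spark DataFrame to the given path in the given format. Supported formats include parquet (the default), json, and csv. Note that the path names an output directory into which Spark writes one or more part files, not a single file.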


  Import necessary libraries

  Initialize the Spark session

  Create the required data frame

  Use the predefined function to save the DataFrame from Spark

Sample Code

from pyspark.sql import SparkSession
# Set up the SparkSession (it also provides the SparkContext)
spark = SparkSession \
    .builder \
    .appName("Python spark example") \
    .getOrCreate()

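# Load the file (the path and read options here are placeholders, not from the original)
df1 = spark.read.csv("input.csv", header=True, inferSchema=True)
# Add a Total_salary column as the sum of SALARY and TA
df2 = df1.withColumn("Total_salary", (df1.SALARY + df1.TA))
# To save the result (the output path is a placeholder)
df2.write.save("output_path", format="csv", header=True)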