Merging different schemas in Apache Spark, by Thiago Cordon (in Data Arena), Dec 21, 2020. This article explores an approach to merge different schemas using Apache Spark.
Developing PySpark UDFs, by Adrian Lam, May 7, 2019. PySpark User Defined Functions (UDFs) are an easy way to turn your ordinary Python code into something scalable. There are two basic ways to…
Read JSON using PySpark, by Nabarun Chakraborti, Oct 4, 2020. JSON (JavaScript Object Notation) is a lightweight format to store and exchange data. The input JSON may be in different format…