good explanation
Very well explained!
Thank you
Good job.
Thanks
It's very informative.
Glad to know that.
Good concise video but the background music was unnecessary.
What does "JSON is not splittable" mean? Does it mean everything gets processed in one executor? I assume that's not correct. But it's an informative video, thanks.
No, I said that non-binary formats (JSON, XML) "cannot be split".
@GKTechplex So does that mean if I process a JSON file in Spark, all of it will go to one executor?
Let me put it this way: say you have a single JSON file and you are reading it with spark.read.json.
When the "wholeFile" option (called "multiLine" in released Spark versions) is set to true, the JSON file is NOT splittable, so Spark creates a single task to process it.
So you have to figure out a way to pre-process your JSON file into chunks before feeding those to Spark!
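As a rough illustration of that pre-processing step, here is a minimal sketch in plain Python. It assumes the input file holds one top-level JSON array and fits in memory. The function name split_json_array, the part-NNNNN.jsonl file naming, and the chunk size are made up for this example and are not part of Spark.

```python
import json
from pathlib import Path


def split_json_array(src_path, out_dir, records_per_chunk=1000):
    """Split one JSON file holding a top-level array into JSON Lines chunk files."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Assumption: the whole file fits in memory and its top level is a list.
    with open(src_path, encoding="utf-8") as f:
        records = json.load(f)

    chunk_paths = []
    for start in range(0, len(records), records_per_chunk):
        # Hypothetical naming scheme: part-00000.jsonl, part-00001.jsonl, ...
        chunk_path = out / f"part-{start // records_per_chunk:05d}.jsonl"
        with open(chunk_path, "w", encoding="utf-8") as chunk_file:
            for record in records[start:start + records_per_chunk]:
                # One compact JSON record per line (JSON Lines).
                chunk_file.write(json.dumps(record) + "\n")
        chunk_paths.append(chunk_path)
    return chunk_paths
```

The output directory can then be passed to spark.read.json without the multi-line option, since every line is a complete record. With several files to read, Spark is no longer tied to one task for the whole data set.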
@GKTechplex Thanks, that cleared up my doubt.