bigdata - How can I know how much data is written to which executor during shuffle write in Spark?
I ran a simple word count and am trying to understand how Spark processed the data. There are 3 executors, listed below. I want to know more about the shuffle: the first executor wrote 16.2 KB of data, so how did it write that data to each executor? Is shuffle write to disk only, or to disk and memory?
Aggregated Metrics by Executor
Time    | Tasks | Failed | Succeeded | Input / Records      | Shuffle Write / Records
1.4 min | 6     | 0      | 6         | 1536.0 MB / 15571058 | 16.2 KB / 1638
1.4 min | 6     | 0      | 6         | 1536.0 MB / 15571061 | 16.4 KB / 1638
1.5 min | 7     | 0      | 7         | 1682.5 MB / 17056569 | 19.0 KB / 1911
I finally figured out how shuffle works in Spark.
Shuffle write: after the map stage, each executor writes its output to local files on its own disk.
Shuffle read: the cumulative data an executor fetches from the other executors.
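The write and read sides can be illustrated with a small pure-Python sketch. It is a simulation, not Spark code: the function names, `NUM_REDUCERS`, and the sample input are hypothetical, and `zlib.crc32` stands in for the key hash that Spark's default hash partitioning uses. It also assumes the word count combines counts on the map side (as an operation like `reduceByKey` does), which would explain why roughly 1.5 GB of input per executor turns into only a few KB of shuffle write.

```python
import zlib
from collections import Counter, defaultdict

NUM_REDUCERS = 3  # hypothetical number of reduce partitions


def partition_for(key, num_partitions):
    # Stand-in for hash partitioning (key hash modulo partition count).
    # crc32 is used only so the result is stable across Python runs.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def shuffle_write(lines, num_partitions):
    """Map side: count words locally, then bucket the (word, count)
    pairs by the reduce partition that will consume them."""
    combined = Counter(word for line in lines for word in line.split())
    buckets = defaultdict(dict)
    for word, count in combined.items():
        buckets[partition_for(word, num_partitions)][word] = count
    # In Spark these buckets would be written to local disk on the
    # executor; nothing is sent to other executors at this point.
    return buckets


def shuffle_read(all_map_outputs, reducer_id):
    """Reduce side: fetch this reducer's bucket from every map output
    and merge the partial counts."""
    total = Counter()
    for buckets in all_map_outputs:
        total.update(buckets.get(reducer_id, {}))
    return total


if __name__ == "__main__":
    # Hypothetical input split across two "executors".
    map_inputs = {
        "executor-1": ["spark shuffle write", "spark shuffle read"],
        "executor-2": ["shuffle read fetches remote data"],
    }
    map_outputs = {
        name: shuffle_write(lines, NUM_REDUCERS)
        for name, lines in map_inputs.items()
    }
    for name, buckets in map_outputs.items():
        records = sum(len(bucket) for bucket in buckets.values())
        print(name, "shuffle write records:", records)
    for reducer_id in range(NUM_REDUCERS):
        merged = shuffle_read(list(map_outputs.values()), reducer_id)
        print("reducer", reducer_id, "read:", dict(merged))
```

In this model the "Shuffle Write / Records" figure corresponds to the number of distinct (word, count) pairs an executor produced locally, and each reducer later pulls only its own bucket from every map output.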