bigdata - How can I know how much data is written to which executor during shuffle write in Spark?


I ran a simple word count to try to understand how Spark processes data. There are 3 executors, shown below, and I want to know more about the shuffle. The first executor wrote 16.2 KB of data: how was that data written for each executor? Is shuffle write to disk only, or to disk and memory?

Aggregated metrics by executor:

Time    | Tasks | Failed | Succeeded | Input / Records       | Shuffle Write / Records
1.4 min | 6     | 0      | 6         | 1536.0 MB / 15571058  | 16.2 KB / 1638
1.4 min | 6     | 0      | 6         | 1536.0 MB / 15571061  | 16.4 KB / 1638
1.5 min | 7     | 0      | 7         | 1682.5 MB / 17056569  | 19.0 KB / 1911

I finally figured out how shuffle works in Spark.

Shuffle write: each executor produces local files on disk after the map stage.
Shuffle read: the cumulative data an executor fetches from the other executors.
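To make the write/read split concrete, here is a minimal sketch in plain Python (not Spark code). It assumes the word count combines counts on the map side, as `reduceByKey` does, which would also explain why the shuffle write is only a few KB against more than 1.5 GB of input. The function names `shuffle_write` and `shuffle_read`, the toy input, and the byte-sum hash are all hypothetical stand-ins for illustration; Spark's actual partitioner and file layout differ.

```python
from collections import Counter, defaultdict


def shuffle_write(lines, num_reducers):
    """Simulate one map task: combine counts locally, then bucket them by reducer."""
    # Map-side combine: one record per distinct word, not one per occurrence.
    counts = Counter(word for line in lines for word in line.split())
    buckets = defaultdict(dict)
    for word, n in counts.items():
        # Deterministic toy hash (an assumption, not Spark's real partitioner).
        reducer = sum(word.encode("utf-8")) % num_reducers
        buckets[reducer][word] = n
    # In Spark this bucketed output would be written to local disk.
    return buckets


def shuffle_read(all_map_outputs, reducer):
    """Simulate one reduce task: fetch its bucket from every map output and merge."""
    total = Counter()
    for buckets in all_map_outputs:
        total.update(buckets.get(reducer, {}))
    return total


# Two hypothetical map tasks, two reducers.
map_inputs = [["a b a", "c a"], ["b b c"]]
map_outputs = [shuffle_write(lines, 2) for lines in map_inputs]

final = Counter()
for r in range(2):
    final.update(shuffle_read(map_outputs, r))

# Records written by each map task (the analogue of "Shuffle Write / Records").
records_written = [sum(len(b) for b in out.values()) for out in map_outputs]
print(records_written, dict(final))
```

In this sketch the "shuffle write" figure for a map task depends only on how many distinct keys it saw, and each reducer's "shuffle read" is the sum of the matching buckets across all map outputs.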

