Joshua Rodgers on 09 Jul 2024 23:48:43
It would be great if a Spark session could start as soon as a pipeline run begins, so that it is ready by the time notebook activities kick off. Currently, each notebook activity is delayed while its Spark session starts up.
The common pattern below would benefit greatly from having the Spark session start before the notebook activity begins.
Current:
1-Pipeline Starts
2-Copy Activity
3-Notebook Activity (1-3 minute delay while the session starts up)
Proposed:
1-Pipeline Starts
2-Spark Session Starts
3-Copy Activity
4-Notebook Activity (no delay as session has already started)
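To put a rough number on the benefit, here is a minimal sketch of the two flows as simple arithmetic. It assumes the proposed session startup would run in parallel with the Copy Activity rather than block it. The function names and the durations (a 2-minute copy, a 2-minute startup, a 5-minute notebook) are hypothetical values chosen for illustration, not measurements or product APIs.

```python
def current_total(copy_s, startup_s, notebook_s):
    """Current flow: the session starts only when the notebook activity begins."""
    return copy_s + startup_s + notebook_s


def proposed_total(copy_s, startup_s, notebook_s):
    """Proposed flow (assumption): session startup overlaps the copy activity."""
    # The notebook can begin once both the copy and the session startup finish.
    return max(copy_s, startup_s) + notebook_s


# Hypothetical durations in seconds, for illustration only.
copy_s, startup_s, notebook_s = 120, 120, 300

before = current_total(copy_s, startup_s, notebook_s)
after = proposed_total(copy_s, startup_s, notebook_s)
print(before, after, before - after)  # prints: 540 420 120
```

Under these assumptions the saving per run is the smaller of the copy time and the startup time, so the whole startup delay is hidden whenever the Copy Activity takes at least as long as the session startup.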