Apollo Apps

Spark Applications

The Spark application is an extension of the current app(let) framework. App(let)s already have a specification for their VM (instance type, OS, packages); this specification has been extended to allow an additional, optional cluster specification with type=dxspark.
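As an illustration, a cluster specification of this kind might appear in the app(let)'s dxapp.json roughly as follows. The instance type, Spark version, and node count shown here are placeholder values, not recommendations:

```json
{
  "runSpec": {
    "systemRequirements": {
      "*": {
        "instanceType": "mem1_ssd1_x4",
        "clusterSpec": {
          "type": "dxspark",
          "version": "2.4.4",
          "initialInstanceCount": 3
        }
      }
    }
  }
}
```

Here initialInstanceCount is the total number of nodes in the cluster, including the master VM.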

  • Calling /app(let)-xxx/run for Spark apps creates a Spark cluster (+ master VM).

  • The master VM (where the app shell code runs) acts as the driver node for Spark.

  • Code in the master VM leverages the Spark infrastructure.

  • Job mechanisms (monitoring, termination, etc.) are the same for Spark apps as for any regular app(let) on the Platform.

  • Spark apps use the same platform "dx" communication between the master VM and DNAnexus API servers.

  • There's a new log collection mechanism to collect logs from all nodes.

  • You can use the Spark UI to monitor a running job via SSH tunneling.

Spark apps can be launched over a distributed Spark cluster.