Spark Config Parameters

The following parameters can be specified:

| Parameter | Definition | Example |
|---|---|---|
| spark.openlineage.transport.type | The transport type used to emit events. Default: console. | http |
| spark.openlineage.namespace | The default namespace applied to any submitted jobs. | MyNamespace |
| spark.openlineage.parentJobNamespace | The job namespace to use for the parent job facet. | ParentJobNamespace |
| spark.openlineage.parentJobName | The job name to use for the parent job facet. | ParentJobName |
| spark.openlineage.parentRunId | The RunId of the parent job that initiated this Spark job. | xxxx-xxxx-xxxx-xxxx |
| spark.openlineage.appName | Custom value that overwrites the Spark app name in events. | AppName |
| spark.openlineage.facets.disabled | List of facets to disable, enclosed in [] (required from 0.21.x) and separated by semicolons (;). Default: [spark_unknown;]. The value currently must contain a ;. | [spark_unknown;spark.logicalPlan] |
| spark.openlineage.capturedProperties | Comma-separated list of properties to capture in the Spark properties facet. Default: spark.master, spark.app.name. | "spark.example1,spark.example2" |
| spark.openlineage.dataset.removePath.pattern | Java regular expression that removes the ?<remove> named group from the dataset path. Can be used to remove the last path subdirectories from paths like s3://my-whatever-path/year=2023/month=04. | (.*)(?<remove>\/.*\/.*) |
| spark.openlineage.jobName.appendDatasetName | Decides whether the output dataset name is appended to the job name. Default: true. | false |
| spark.openlineage.jobName.replaceDotWithUnderscore | Replaces dots in the job name with underscores. Can be used to mimic legacy behaviour on the Databricks platform. Default: false. | false |
| spark.openlineage.debugFacet | Determines whether the debug facet is generated and included in the event. Set to enabled to turn it on. By default, the facet is disabled. | enabled |
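
As a rough illustration of how the parameters listed above might be put together, the following Python sketch assembles a few of them into spark-submit `--conf` arguments, formats a facets.disabled value in the bracketed, semicolon-separated form, and shows the effect described for the removePath pattern on the sample path. The helper names (`format_disabled_facets`, `to_submit_args`, `strip_remove_group`) are hypothetical and not part of OpenLineage or Spark. The regex helper is only an approximation of the described behaviour: OpenLineage evaluates the pattern with Java's regex engine, so the sketch translates the `(?<remove>...)` group into Python's `(?P<remove>...)` spelling before matching.

```python
import re
import shlex


def format_disabled_facets(facets):
    """Render facet names as '[a;b]'.

    A single name keeps a trailing ';' (as in the default '[spark_unknown;]'),
    because the value currently must contain a ';'.
    """
    body = ";".join(facets)
    if ";" not in body:
        body += ";"
    return "[" + body + "]"


def to_submit_args(conf):
    """Turn a {key: value} mapping into a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


def strip_remove_group(java_pattern, path):
    """Illustrative only: drop the text matched by the 'remove' named group.

    Assumes the pattern defines a group named 'remove'. Java writes named
    groups as (?<name>...), Python as (?P<name>...), so that one group is
    translated before matching.
    """
    py_pattern = java_pattern.replace("(?<remove>", "(?P<remove>")
    match = re.search(py_pattern, path)
    if match is None or match.group("remove") is None:
        return path  # nothing to remove, keep the path unchanged
    start, end = match.span("remove")
    return path[:start] + path[end:]


# Example values are taken from the Example column of the table.
conf = {
    "spark.openlineage.transport.type": "http",
    "spark.openlineage.namespace": "MyNamespace",
    "spark.openlineage.facets.disabled": format_disabled_facets(
        ["spark_unknown", "spark.logicalPlan"]
    ),
    "spark.openlineage.dataset.removePath.pattern": r"(.*)(?<remove>\/.*\/.*)",
}

submit_args = to_submit_args(conf)
# shlex.join quotes values containing characters such as ';', '[' or '(' for the shell.
print(shlex.join(submit_args))

trimmed = strip_remove_group(
    conf["spark.openlineage.dataset.removePath.pattern"],
    "s3://my-whatever-path/year=2023/month=04",
)
print(trimmed)  # s3://my-whatever-path
```

The printed arguments can be appended to a spark-submit command line. Values containing semicolons, brackets, or parentheses need shell quoting, which is why the sketch uses `shlex.join` rather than joining with plain spaces.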