1 vote
0 answers
67 views

I am using Databricks Runtime 15.4 (Spark 3.5 / Scala 2.12) on AWS. My goal is to use the latest Google BigQuery connector because I need the direct write method (BigQuery Storage Write API): option("...
Thilina's user avatar
  • 157
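For reference, the newer connector is typically installed as a cluster library by Maven coordinate, and the Storage Write API is then selected per write. A hedged configuration sketch, assuming the standard spark-bigquery-connector coordinate and option names (the version, project, dataset, and table names are illustrative; note DBR also ships a built-in connector that can conflict with a newer one):

```python
# Cluster library (Maven coordinate; version illustrative):
#   com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.41.0
# Then select the direct write path (BigQuery Storage Write API) per write:
(df.write.format("bigquery")
    .option("writeMethod", "direct")                     # Storage Write API
    .option("table", "my_project.my_dataset.my_table")   # illustrative
    .save())
```

This is a non-runnable fragment: it assumes an active Databricks cluster, a `df` DataFrame, and BigQuery credentials already configured.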
1 vote
0 answers
98 views

To reduce compute costs on Databricks, I changed the Databricks job bundle configuration as below: Original job_clusters: - job_cluster_key: ... new_cluster: ... ... ...
Morgan Fisher's user avatar
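For context, a hedged sketch of where `job_clusters` sits in a Databricks Asset Bundles resource and which `new_cluster` fields usually drive cost (the job name and values are illustrative; only the key names follow the bundle/Jobs schema):

```yaml
resources:
  jobs:
    my_job:                        # illustrative job name
      job_clusters:
        - job_cluster_key: main
          new_cluster:
            spark_version: 15.4.x-scala2.12
            node_type_id: m5.large   # smaller instance type to cut cost
            num_workers: 1           # fewer workers to cut cost
```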
0 votes
1 answer
434 views

I’m currently learning Databricks using a trial account. I created a volume and successfully loaded data into it. However, when trying to access the file using Spark, I encountered the following error:...
Learn Hadoop's user avatar
  • 3,058
0 votes
0 answers
85 views

I have been getting my schema passed in with custom_schema = create_StructType_schema(access_key, secret_access_key, schema_bucket_name, schema_folder, schema_file_name) metadata_fp = {"comment"...
Ethan Bonsall's user avatar
1 vote
1 answer
429 views

I'm setting up an external location in Databricks using Unity Catalog via Terraform. During terraform apply, I encounter the following error: > 2025-05-12T21:28:43 Error: cannot create external ...
Olfa2's user avatar
  • 131
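An external location needs a storage credential that Unity Catalog can actually assume, and `terraform apply` validation failures often trace back to the IAM role or to creating the location before its credential exists. A minimal sketch using the Databricks provider's resources (all names and the ARN are illustrative):

```hcl
resource "databricks_storage_credential" "this" {
  name = "uc-credential"                                   # illustrative
  aws_iam_role {
    role_arn = "arn:aws:iam::123456789012:role/uc-access"  # illustrative
  }
}

resource "databricks_external_location" "this" {
  name            = "uc-ext-location"        # illustrative
  url             = "s3://my-bucket/prefix"  # illustrative
  credential_name = databricks_storage_credential.this.name
}
```

Referencing the credential by resource attribute (rather than a hard-coded string) also gives Terraform the correct create ordering.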
0 votes
0 answers
46 views

I have a simple Python project with the following structure:
root/
├── src/
│   ├── package_name/
│   │   ├── __init__.py
│   │   ├── main.py
│   │   ├── submodules1/
│   │   │   ├── ...
NaineeL SoyantaR's user avatar
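With a src layout like this, imports typically fail until the package is installed (e.g. `pip install -e .`) with the build backend pointed at `src`. A minimal `pyproject.toml` sketch assuming setuptools (the project name and version are illustrative):

```toml
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "package_name"   # illustrative
version = "0.1.0"

[tool.setuptools.packages.find]
where = ["src"]
```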
0 votes
0 answers
5 views

I have website tracking data with session_ids, where each hit is recorded with the timestamp at which it occurred. I'm trying to create search cases with an ID number within those sessions. Every time ...
RainGardner's user avatar
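Numbering hits within a session is a classic window-function job: in Spark it would be `row_number()` partitioned by session_id and ordered by timestamp. A plain-Python sketch of the same logic (the function name and tuple shape are illustrative):

```python
from itertools import groupby
from operator import itemgetter

def assign_hit_ids(hits):
    """Give each hit a 1-based sequence number within its session,
    ordered by timestamp. `hits` is a list of (session_id, timestamp)."""
    ordered = sorted(hits, key=itemgetter(0, 1))
    numbered = []
    for _, session_hits in groupby(ordered, key=itemgetter(0)):
        for n, (session_id, ts) in enumerate(session_hits, start=1):
            numbered.append((session_id, ts, n))
    return numbered
```

In Databricks SQL this corresponds to `ROW_NUMBER() OVER (PARTITION BY session_id ORDER BY timestamp)`.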
0 votes
0 answers
131 views

We are using Databricks Auto Loader to process Parquet files into Delta format. The job is scheduled to run once per day and the code looks like this: def run_autoloader(table_name, checkpoint_path, ...
Boris's user avatar
  • 906
0 votes
1 answer
37 views

I have a JSON structure that I am trying to match all the cpe_match nodes for, using a JSONPath expression. Using Databricks SQL, I have the following query, where "nodes" contains my JSON: ...
Neil P's user avatar
  • 3,250
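Databricks SQL's JSONPath support (e.g. in `get_json_object`) is a limited subset without recursive descent, which is often why matching every `cpe_match` node at arbitrary depth fails. A plain-Python sketch of the recursive collection such an expression is after (the function name and sample document are illustrative):

```python
import json

def find_all(obj, key):
    """Collect every value stored under `key` at any depth in a
    nested structure of dicts and lists."""
    hits = []
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key:
                hits.append(v)
            hits.extend(find_all(v, key))
    elif isinstance(obj, list):
        for item in obj:
            hits.extend(find_all(item, key))
    return hits

# Illustrative stand-in for the "nodes" JSON column
nodes = json.loads(
    '{"nodes": [{"cpe_match": [1], "children": [{"cpe_match": [2]}]}]}'
)
```

The same logic could run as a Python UDF when the SQL JSONPath dialect cannot express the recursive match.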
3 votes
0 answers
184 views

I created skewed data to test a salting approach and tried three different solutions, but none achieved the desired results with a significant runtime improvement. Can you guide me on the best ...
Learn Hadoop's user avatar
  • 3,058
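For reference, the usual salting scheme is a two-stage aggregation: scatter each hot key across N sub-keys, aggregate per (key, salt), then merge the partials back per key. A plain-Python sketch of the idea (names illustrative; in Spark the salt would be a `rand()`-derived column):

```python
import random
from collections import defaultdict

def salted_count(keys, num_salts=4, seed=0):
    """Two-stage count with key salting: stage 1 spreads each key over
    num_salts sub-keys so no single group holds all records of a hot
    key; stage 2 merges the partial counts back per original key."""
    rng = random.Random(seed)
    partial = defaultdict(int)
    for key in keys:                              # stage 1: salted partials
        partial[(key, rng.randrange(num_salts))] += 1
    merged = defaultdict(int)
    for (key, _salt), count in partial.items():   # stage 2: merge salts
        merged[key] += count
    return dict(merged)
```

Salting mainly helps shuffle-heavy joins and wide aggregations; for plain counts, Spark's map-side partial aggregation already spreads the work, which is one reason a salting test can show no runtime improvement.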
0 votes
1 answer
106 views

Below is my code snippet. spark.read.table('schema.table_1').createOrReplaceTempView('d1') # 400 million records spark.read.table('schema.table_2').createOrReplaceTempView('d1') # 300 million records ...
Learn Hadoop's user avatar
  • 3,058
0 votes
1 answer
827 views

Is there any way to log Azure Databricks cluster usage metrics such as CPU, memory, network, and throughput? There were ways before the 13.3 series; is there any way post-13.3 (...
MJ029's user avatar
  • 319
1 vote
1 answer
278 views

I am trying to create a workspace in Databricks linked to AWS. It's failing on the last step. It says: MALFORMED_REQUEST: Failed storage configuration validation checks: List,Put,...
Priyanka Gupta's user avatar
0 votes
2 answers
340 views

I'm very new to Spark Structured Streaming and Auto Loader and had a query on how we might get Auto Loader to read a text file with "§" as the delimiter. Below I tried reading the file as a ...
beingmanny's user avatar
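Since "§" is a single character, it is a legal delimiter for CSV-style parsing; Spark's CSV reader accepts it via `option("sep", "§")`, and Auto Loader generally passes such reader options through. A plain-Python check of the split itself (the sample data is illustrative):

```python
import csv
import io

# Illustrative §-delimited file contents
raw = "id§name§value\n1§foo§10\n2§bar§20\n"

# csv accepts any single-character delimiter, including non-ASCII ones
rows = list(csv.reader(io.StringIO(raw), delimiter="§"))
# rows[0] is the header; each later row splits on "§"
```

If files arrive without any row delimiter quirks, the same `sep` option on the Spark reader is usually the first thing to try before falling back to reading whole lines as text and splitting manually.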
0 votes
1 answer
117 views

I want to password-protect an Excel file that is stored in an S3 bucket and save it back to S3. I tried openpyxl and xlsxwriter; both generate an xlsx file, but it opens without asking for ...
MOHAMMED SALMAN's user avatar
