
I have data in Parquet format in ADLS Gen2, and I want to implement Delta layers in my project. I copied all the data from on-prem into ADLS Gen2 via ADF, into a separate container called landing zone.

I then created a separate container called Bronze, where I want to keep the Delta table. To do this, I created a database in Databricks and then created a Delta table in it using the SQL below.

create table if not exists externaltables.actv_snap_view(
id String,
mbr_id String,
typ_id String,
strt_dttm String,
otcome_typ_id String,
cdc String
)
using delta
location '/mnt/Storage/Bronze/actv_snap_view'

The table does not contain any data yet.

  1. How can I load the data from the data lake landing zone into the Delta table I created?
  2. My database is in Databricks. After data is added to the table, where will the underlying data be stored?
  • If externaltables.actv_snap_view is your destination table, you can insert the data from your Parquet files with: insert into externaltables.actv_snap_view select * from parquet.`<your ADLS location>` (the ADLS location must be enclosed in backticks). Commented Jun 6, 2022 at 12:43
  • If you specify a location pointing to your ADLS when creating the database, the database will be created there. Similarly, if you specify a location for a table, its data will be stored there. Commented Jun 6, 2022 at 12:55
  • If no location is specified, the data is stored in the default /dbfs location. Commented Jun 6, 2022 at 12:56

1 Answer

You can follow the steps below to create a table from the data in the landing zone (the source of the Parquet files), where the table belongs to a database located in the bronze container.

  • Assuming your ADLS containers are mounted, create a database and set its location to your bronze container's mount point, as suggested by @Ganesh Chandrasekaran.
create database demo location "/mnt/bronzeoutput/";
  • Now use the following SQL to create a table over the Parquet file at the landing zone container's mount point.
create table demo.<table_name> (<columns>) using parquet location '/mnt/landingzoneinput/<parquet_file_name>';

With these steps, you have a database in your bronze container in which to store your tables. The table created in this database is populated from the files in your landingzone container.
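
One way to check where a table's files actually live is to inspect its metadata. The sketch below is only an illustration: demo.landing_snapshot is a hypothetical table name standing in for demo.<table_name>, and it assumes the table was created with the "using parquet location" statement above.

```sql
-- Hypothetical table name; substitute the table you created above.
-- The "Location" row of the output shows the path backing the table.
describe table extended demo.landing_snapshot;
```

If the Location row points at the landing zone mount rather than the bronze mount, the table is only reading the source files in place.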

Update:

  • The create table statement above creates a table over the data in the Parquet file, but this table is not reflected in the data lake, i.e. no data is written to the bronze container.

  • You can instead use the statements below. The first creates a table in the database (located in the bronze container); the second inserts the rows from your Parquet file in the landing zone.

create table demo.<table_name> (<columns>);
-- demo database is inside bronze container

insert into demo.<table_name> select * from <data_source>.`/mnt/landingzoneinput/source_file`
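
Applied to the table from the question, the same pattern might look like the sketch below. The landing zone path /mnt/Storage/landingzone/actv_snap_view is an assumption (the question does not give it), and the source Parquet files are assumed to use the same column names as the table. Listing the columns explicitly, instead of select *, avoids depending on the column order of the source files.

```sql
-- Assumed landing zone path; replace with your own mount point and folder.
insert into externaltables.actv_snap_view
select
  cast(id as string),
  cast(mbr_id as string),
  cast(typ_id as string),
  cast(strt_dttm as string),
  cast(otcome_typ_id as string),
  cast(cdc as string)
from parquet.`/mnt/Storage/landingzone/actv_snap_view`;

-- For a Delta table, the "location" column shows where the data files are stored.
describe detail externaltables.actv_snap_view;
```

Because the table in the question was created with an explicit location, the inserted rows should be written as Delta files (Parquet data files plus a _delta_log folder) under /mnt/Storage/Bronze/actv_snap_view, regardless of where the database itself lives.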
