We have an enterprise Hadoop cluster installed on Linux servers in our organisation. I am trying to load a CSV file into one of our Hive tables. The CSV file is on my local Windows machine. I am using Python with the jaydebeapi package to connect to Hive through JDBC. When I run a "LOAD DATA LOCAL..." command from my code, it throws an error saying the path is not a legal file path. How can I resolve this? The error is shown below.

Note: I do not want to copy the CSV file to HDFS manually with hdfs commands and then load it. I want to load it directly from my Windows machine into the Hive table.

com.cloudera.hiveserver2.support.exceptions.ErrorException

Error while compiling statement: FAILED: SemanticException [Error 10028]: Line 1:23 Path is not legal

I am using the code below to load the data.

query="""LOAD DATA LOCAL INPATH 'C:/Users/Docs/file.csv' INTO TABLE default.logstable"""
conn = getConnection()
curr = conn.cursor()
curr.execute(query)

The connection is already set up, and SELECT queries work fine.

  • Form the query as a raw string (realpython.com/python-raw-strings): query=r"""LOAD DATA LOCAL INPATH 'C:/Users/Docs/file.csv' INTO TABLE default.logstable""" or escape the path: query="""LOAD DATA LOCAL INPATH 'C://Users//Docs//file.csv' INTO TABLE default.logstable""" Commented Aug 26, 2024 at 17:36
  • Hi @Abbas, thanks for your answer. I tried both ways and I am getting the same error. Commented Aug 26, 2024 at 18:22
  • I would recommend using PySpark for this, not a direct SQL statement. Commented Aug 27, 2024 at 0:13

1 Answer

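C:/ is not a legal Unix file path, which is why Hive rejects the statement with "Path is not legal".

A local path would have to be written as a file URI, such as file:///C:/Users/Docs/file.csv. Keep in mind, though, that a statement sent over JDBC is compiled and run by HiveServer2, so LOCAL refers to the filesystem of the HiveServer2 host (a Linux server in your case), not to the Windows machine running the Python script. A file that exists only on your PC is not visible to it.

Otherwise, yes, upload the CSV directly to HDFS, then define an EXTERNAL TABLE over it.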
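
If the file has to stay on the Windows machine, one alternative to LOAD DATA is to read the CSV in Python and send the rows through the JDBC connection you already have. Below is a minimal sketch of that idea, not a tested recipe for your cluster. The helper names insert_csv and iter_batches are made up for illustration. It assumes the JDBC driver accepts ? parameters and batched execution through cursor.executemany (a standard DB-API method that jaydebeapi provides), that the Hive version supports INSERT ... VALUES, and that every target column can take the string values read from the CSV.

```python
import csv
from itertools import islice


def iter_batches(rows, size):
    """Yield lists of up to `size` rows from any iterable of rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def insert_csv(cursor, csv_path, table, n_cols, batch_size=500, has_header=False):
    """Read a client-side CSV and insert its rows via a DB-API cursor.

    Hypothetical helper: assumes the driver supports '?' placeholders
    and that all columns accept the CSV's string values.
    Returns the number of rows sent.
    """
    placeholders = ", ".join("?" * n_cols)
    sql = f"INSERT INTO TABLE {table} VALUES ({placeholders})"
    total = 0
    with open(csv_path, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)
        if has_header:
            next(reader, None)  # skip the header line
        data_rows = (row for row in reader if row)  # ignore blank lines
        for batch in iter_batches(data_rows, batch_size):
            cursor.executemany(sql, batch)
            total += len(batch)
    return total


# Usage with the connection from the question (getConnection is the asker's own helper):
#   conn = getConnection()
#   curr = conn.cursor()
#   insert_csv(curr, r"C:\Users\Docs\file.csv", "default.logstable", n_cols=3)
```

Row inserts into Hive are much slower than a bulk load, so this only suits small files. For anything large, getting the file onto HDFS and loading it from there remains the better route.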
