SQL
This is a legacy Apache Ignite documentationThe new documentation is hosted here: https://ignite.apache.org/docs/latest/
Python thin client fully supports Ignite SQL queries. These examples are similar to the ones given in the Apache Ignite SQL Documentation: Getting Started.
Setup
First let us establish a connection.
client = Client()
client.connect('127.0.0.1', 10800)
Then create tables. Begin with Country table, than proceed with related tables City and CountryLanguage.
COUNTRY_CREATE_TABLE_QUERY = '''CREATE TABLE Country (
Code CHAR(3) PRIMARY KEY,
Name CHAR(52),
Continent CHAR(50),
Region CHAR(26),
SurfaceArea DECIMAL(10,2),
IndepYear SMALLINT(6),
Population INT(11),
LifeExpectancy DECIMAL(3,1),
GNP DECIMAL(10,2),
GNPOld DECIMAL(10,2),
LocalName CHAR(45),
GovernmentForm CHAR(45),
HeadOfState CHAR(60),
Capital INT(11),
Code2 CHAR(2)
)'''
CITY_CREATE_TABLE_QUERY = '''CREATE TABLE City (
ID INT(11),
Name CHAR(35),
CountryCode CHAR(3),
District CHAR(20),
Population INT(11),
PRIMARY KEY (ID, CountryCode)
) WITH "affinityKey=CountryCode"'''
LANGUAGE_CREATE_TABLE_QUERY = '''CREATE TABLE CountryLanguage (
CountryCode CHAR(3),
Language CHAR(30),
IsOfficial BOOLEAN,
Percentage DECIMAL(4,1),
PRIMARY KEY (CountryCode, Language)
) WITH "affinityKey=CountryCode"'''
for query in [
COUNTRY_CREATE_TABLE_QUERY,
CITY_CREATE_TABLE_QUERY,
LANGUAGE_CREATE_TABLE_QUERY,
]:
client.sql(query)
Create indexes
CITY_CREATE_INDEX = '''
CREATE INDEX idx_country_code ON city (CountryCode)'''
LANGUAGE_CREATE_INDEX = '''
CREATE INDEX idx_lang_country_code ON CountryLanguage (CountryCode)'''
for query in [CITY_CREATE_INDEX, LANGUAGE_CREATE_INDEX]:
client.sql(query)
Fill tables with data.
COUNTRY_INSERT_QUERY = '''INSERT INTO Country(
Code, Name, Continent, Region,
SurfaceArea, IndepYear, Population,
LifeExpectancy, GNP, GNPOld,
LocalName, GovernmentForm, HeadOfState,
Capital, Code2
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)'''
CITY_INSERT_QUERY = '''INSERT INTO City(
ID, Name, CountryCode, District, Population
) VALUES (?, ?, ?, ?, ?)'''
LANGUAGE_INSERT_QUERY = '''INSERT INTO CountryLanguage(
CountryCode, Language, IsOfficial, Percentage
) VALUES (?, ?, ?, ?)'''
for row in COUNTRY_DATA:
client.sql(COUNTRY_INSERT_QUERY, query_args=row)
for row in CITY_DATA:
client.sql(CITY_INSERT_QUERY, query_args=row)
for row in LANGUAGE_DATA:
client.sql(LANGUAGE_INSERT_QUERY, query_args=row)
Data samples are taken from Ignite GitHub repository.
That concludes the preparation of data. Now let us answer some questions.
SQL Query
Display all the information about a given city.
CITY_INFO_QUERY = '''SELECT * FROM City WHERE id = ?'''
result = client.sql(
CITY_INFO_QUERY,
query_args=[3802],
include_field_names=True,
)
field_names = next(result)
field_data = list(*result)
print('City info:')
for field_name, field_value in zip(field_names*len(field_data), field_data):
print('{}: {}'.format(field_name, field_value))
# City info:
# ID: 3802
# NAME: Detroit
# COUNTRYCODE: USA
# DISTRICT: Michigan
# POPULATION: 951270
SQL Fields Query
What are the 10 largest cities in our sample data (population-wise)?
MOST_POPULATED_QUERY = '''
SELECT name, population FROM City ORDER BY population DESC LIMIT 10'''
result = client.sql(MOST_POPULATED_QUERY)
print('Most 10 populated cities:')
for row in result:
print(row)
# Most 10 populated cities:
# ['Mumbai (Bombay)', 10500000]
# ['Shanghai', 9696300]
# ['New York', 8008278]
# ['Peking', 7472000]
# ['Delhi', 7206704]
# ['Chongqing', 6351600]
# ['Tianjin', 5286800]
# ['Calcutta [Kolkata]', 4399819]
# ['Wuhan', 4344600]
# ['Harbin', 4289800]
The sql() method returns a generator, that yields the resulting rows.
SQL Join Query
What are the 10 most populated cities throughout the 3 chosen countries?
Note that if you set the include_field_names
argument to True
, the sql()
method will generate a list of column names as a first yield. You can access field names with Python built-in next
function.
MOST_POPULATED_IN_3_COUNTRIES_QUERY = '''
SELECT country.name as country_name, city.name as city_name, MAX(city.population) AS max_pop FROM country
JOIN city ON city.countrycode = country.code
WHERE country.code IN ('USA','IND','CHN')
GROUP BY country.name, city.name ORDER BY max_pop DESC LIMIT 10
'''
result = client.sql(
MOST_POPULATED_IN_3_COUNTRIES_QUERY,
include_field_names=True,
)
print('Most 10 populated cities in USA, India and China:')
print(next(result))
print('----------------------------------------')
for row in result:
print(row)
# Most 10 populated cities in USA, India and China:
# ['COUNTRY_NAME', 'CITY_NAME', 'MAX_POP']
# ----------------------------------------
# ['India', 'Mumbai (Bombay)', 10500000]
# ['China', 'Shanghai', 9696300]
# ['United States', 'New York', 8008278]
# ['China', 'Peking', 7472000]
# ['India', 'Delhi', 7206704]
# ['China', 'Chongqing', 6351600]
# ['China', 'Tianjin', 5286800]
# ['India', 'Calcutta [Kolkata]', 4399819]
# ['China', 'Wuhan', 4344600]
# ['China', 'Harbin', 4289800]
Finally, delete the tables used in this example with the following queries:
DROP_TABLE_QUERY = '''DROP TABLE {} IF EXISTS'''
for table_name in [
CITY_TABLE_NAME,
LANGUAGE_TABLE_NAME,
COUNTRY_TABLE_NAME,
]:
result = client.sql(DROP_TABLE_QUERY.format(table_name))
Python example filesPython thin client contains fully workable examples to demonstrate the behavior of the client.
Updated 5 months ago