

With SAS/ACCESS interface to JDBC installed and valid AWS Access keys placed at SAS compute server, you can access the Athena table by executing the following statement from SAS Studio application. BASE SAS access to S3 data file via Athena. Select * from baseball_athena_s3 limit 10 Ĥ. The following screen-shot describes the data read from S3 datafile under Athena query using schema/table stored in Glue catalog. Verify Athena query reads S3 data files with schema in Glue catalog database. The following screen-shot describes the SQL execution at Athena SQL editor to create a table under a specific database to read S3 data files.ģ. TBLPROPERTIES ('has_encrypted_data'='false') LOCATION 's3://sas-viyadeploymentworkshop/gel/Data/baseball_prqt/' Notice the data file location and row format SerDe used to read the file.ĬREATE EXTERNAL TABLE If NOT EXISTS baseball_athena_S3( The SQL executed from Athena query editor. The following SQL statement can be used to create a table under Glue database catalog for above S3 Parquet file. To read a data file stored on S3, the user must know the file structure to formulate a create table statement.

Create metadata/table for S3 datafiles under Glue catalog database. Option to switch to AWS Glue Data Catalog:Ĭreating a new database under Glue data catalog:Ģ.

On AWS Glue dashboard, under database section user can add a new database. User can switch to Glue catalog application from Athena query editor dashboard. To create and store metadata for S3 data file, a user needs to create a database under Glue data catalog. Create a database in AWS Glue Data catalog. The following screen-shot describes an S3 bucket and folder having Parquet files and needs to be read into SAS and CAS using the following steps.
S3 json query driver#

JDBC Data Connector installed on CAS servers.SAS/ACCESS Interface to JDBC installed on Viya (SAS) compute server.The following method describes a step-by-step process to access S3 files from CAS and BASE SAS via Athena. The following diagram describes the data access from CAS to S3 via Athena. Mobile users: To view the images, select the "Full" version at the bottom of the page. Select any image to see a larger version. The following diagram describes the data access from BASE SAS to S3 via Athena.
S3 json query serial#
Means, user can use serial data load method to load an Athena table to CAS. CAS can access Athena (Glue) S3 table using JDBC Data Connector. When metadata/schema for data files are available in AWS Glue catalog, BASE SAS can access Athena (Glue) S3 table using JDBC connection. CAS and SAS Access to S3 data files via Athena As per AWS documentation, the user pays $5.00 per TB of data scanned. The charges are based on the amount of data scanned by each query. The schema for the S3-data files created and stored under AWS Glue catalog.ĭata file format supported by Athena Query:Ī user only pays for the query executed on S3 data files. User needs to define a schema for the data file and can start executing SQL queries on S3 data. A user can access Athena through either AWS Management console, API or JDBC driver. AWS Athena is a serverless query service, the user does not need to manage any underlying compute infrastructure unlike AWS EMR Hadoop cluster. AthenaĪmazon Web Services Athena is a service which enables a user to perform interactive queries on data files stored in S3.
S3 json query how to#
This post is about how to read various data files stored on S3 locations using AWS Athena to SAS and CAS. Users can store various formats of a data files on S3 locations from different applications. It’s a secure, reliable, scalable, and affordable environment for storing huge data. Amazon S3 is one of the most widely used platforms to store big data.
