hbutani's comments

hbutani · on April 2, 2016

I don't know the specific state of the DataSources you mention, but yes, providing a DataSource alone is not enough: with it you can only push Filters and Projections to the underlying engine. To push down Joins, Group Bys, Having, Limit, etc., you need to develop Query Rewrite Rules that rewrite them for the underlying engine. For example, our Rewrite engine for Druid is here: https://github.com/SparklineData/spark-druid-olap/blob/maste...
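To make the distinction concrete, here is a minimal toy sketch of what a rewrite rule does. It is not code from spark-druid-olap and does not use Spark or Druid APIs; the plan nodes (Scan, Aggregate, Limit), the EngineQuery class, and the rewrite function are all hypothetical names invented for illustration. The Scan node already carries the filters and projections that a plain DataSource can push down; the rule folds an Aggregate and a Limit sitting on top of that scan into a single query for the underlying engine.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple, Union


# Hypothetical, simplified plan nodes. These are NOT Spark or Druid classes.
@dataclass(frozen=True)
class Scan:
    """A table scan; filters and columns are what a plain DataSource can push."""
    table: str
    columns: Tuple[str, ...] = ()
    filters: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Aggregate:
    child: "Plan"
    group_by: Tuple[str, ...]
    aggregates: Tuple[str, ...]


@dataclass(frozen=True)
class Limit:
    child: "Plan"
    n: int


@dataclass(frozen=True)
class EngineQuery:
    """One query handed to the underlying engine (hypothetical shape)."""
    table: str
    filters: Tuple[str, ...]
    group_by: Tuple[str, ...]
    aggregates: Tuple[str, ...]
    limit: Optional[int] = None


Plan = Union[Scan, Aggregate, Limit, EngineQuery]


def rewrite(plan: Plan) -> Plan:
    """Bottom-up rewrite: fold Aggregate and Limit into one engine query."""
    if isinstance(plan, Aggregate):
        child = rewrite(plan.child)
        if isinstance(child, Scan):
            # Aggregate directly over a scan: the engine can run the group-by.
            return EngineQuery(child.table, child.filters,
                               plan.group_by, plan.aggregates)
        return Aggregate(child, plan.group_by, plan.aggregates)
    if isinstance(plan, Limit):
        child = rewrite(plan.child)
        if isinstance(child, EngineQuery) and child.limit is None:
            # Limit over an already pushed-down query: push the limit as well.
            return replace(child, limit=plan.n)
        # Anything else stays in the host engine in this sketch.
        return Limit(child, plan.n)
    return plan


# SELECT region, sum(amount) FROM sales WHERE year = 2015
# GROUP BY region LIMIT 10
plan = Limit(
    Aggregate(
        Scan("sales", ("region", "amount"), ("year = 2015",)),
        group_by=("region",),
        aggregates=("sum(amount)",),
    ),
    10,
)
rewritten = rewrite(plan)
print(rewritten)
```

In this sketch, any operator the rule does not recognize is left in the plan, so the remaining portion would still be executed by the host engine rather than pushed down.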

Can you be specific about your ANSI SQL compliance requirements? Spark SQL is closing the gap on Hive SQL; both have decent support for analytical queries: Cubes, Rollups, Windowing, etc. The only major gap between Spark and Hive SQL that I know of is SubQuery predicates (exists/not exists).

hbutani · on April 2, 2016

We have a couple of companies running Tableau on top of this. The deployment is Tableau - Spark ThriftServer (with our extension) - Druid. We push down Slice and Dice and Star Join Queries as Druid Queries; all of Spark SQL is supported, with some portions of a Query Plan being executed in Spark. We are working on pushing more Spark UDFs down to Druid, on performance improvements, and on more coverage for Tableau. Further down the road, we will support Star Schemas where some or all dimensions are not indexed. Happy to discuss specific SQL support or deployment questions. Please reach out to us.

- Harish.