Hacker News | volderette's comments

How do you query your iceberg tables? We are looking into moving away from Bigquery and Starrocks [1] looks like a good option.

[1] https://www.starrocks.io/


Trino is pretty good (an open-source fork of Presto).

https://trino.io/


Common open-source options (other than Spark and Flink):

1. Dremio: https://www.dremio.com/
2. Trino: https://trino.io/
3. Iceberg Java API: https://iceberg.apache.org/docs/1.6.1/api/


Starburst (full disclosure: I work there) provides a query engine (Trino under the hood) with Iceberg support [1] -- worth checking out.

[1] https://www.starburst.io/platform/icehouse/


Right now, StarRocks or Trino are likely your best options, but all the major query engines (ClickHouse, Snowflake, Databricks, even DuckDB) are improving their support too.


Why away from bigquery? Just wondering if it’s a cost thing.


Yes, mainly driven by cost. BigQuery costs are really unpredictable when dashboards with filters are used intensively. We don’t want to limit our users in their data exploration.
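To make the unpredictability concrete, here is a rough back-of-the-envelope sketch. It assumes scan-based (on-demand) pricing where every filter change triggers one query. The user counts, bytes scanned, and the price per TiB are all hypothetical placeholders, not our real workload or actual BigQuery prices.

```python
# Rough cost sketch for scan-priced (on-demand) dashboard queries.
# All numbers below are hypothetical placeholders, not real prices or workloads.

TIB = 1024 ** 4  # bytes in one tebibyte


def monthly_scan_cost(users, filter_changes_per_user_per_day,
                      bytes_scanned_per_query, price_per_tib, days=30):
    """Estimate monthly cost when every filter change triggers one query."""
    queries = users * filter_changes_per_user_per_day * days
    tib_scanned = queries * bytes_scanned_per_query / TIB
    return tib_scanned * price_per_tib


# Same dashboard, light vs. heavy exploration (hypothetical inputs).
light = monthly_scan_cost(users=20, filter_changes_per_user_per_day=10,
                          bytes_scanned_per_query=50 * 1024 ** 3,
                          price_per_tib=5.0)
heavy = monthly_scan_cost(users=20, filter_changes_per_user_per_day=100,
                          bytes_scanned_per_query=50 * 1024 ** 3,
                          price_per_tib=5.0)
print(f"light: ${light:,.2f}  heavy: ${heavy:,.2f}")
```

Under this model the bill scales linearly with how much people click around, which is exactly the behavior we do not want to discourage.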


Will this stay open source or will it end up as another limited open core data product?


This project will always remain open-source.


That sounds great, thanks!


Airbyte is definitely one of the MDS vendors. Plus they have a ton of bugs because their only focus is having the most connectors on the market. A lot of them are broken or badly implemented.


I don't doubt the bugs. In fact, I expect them because so many of the connectors are community contributed, so I try to think of it as more like GitHub than a curated set of expertly built connectors. That said, for the time being it's more important that I not break my other rules. The promise of a perfectly working connector (if such a thing exists) in exchange for unknowable pricing or an SDK I'm gonna have a hard time training people on is a tradeoff I feel I can't reasonably make right now, but I'm very open to the idea that my calculus may change.


There is sqlmesh, which implements some nice concepts like blue-green deployments.

https://sqlmesh.readthedocs.io/en/latest/


Docker integrated docker-compose a while ago, so the command is correct.


There is also sqlmesh (https://sqlmesh.com/). Pretty new as well. It introduces some interesting concepts. For smaller dbt projects it could be a drop-in replacement as it allows importing dbt projects.


There are browser extensions like PrivacyRedirect (iOS app) and LibRedirect (Chrome, Firefox …) that can redirect you to alternative Reddit frontends (teddit and so on).


This is not how GTM server side works. There is not a single call to Google domains from the client, when GTM server side is set up to its fullest. The config (gtm.js) will be loaded from my subdomain and not googletagmanager.com. Also gtm.js can be renamed.


Per the docs here [1], that is not true. You continue to load gtag.js off the googletagmanager.com domain; subsequent events can flow to a custom domain.

[1] https://developers.google.com/tag-platform/tag-manager/serve...


Couldn't you still recognize the script by its content?


No because the script contents can change from site to site. Maintaining an index for every site would get you closer, but individual sites can trivially tweak things to break fingerprinting as often as they want. Even on every request.
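A minimal sketch of why exact content matching is fragile, assuming the blocker fingerprints scripts with a plain SHA-256 hash. The snippet text and the `fingerprint` helper are hypothetical, made up for illustration; real blockers may use fuzzier heuristics that this does not model.

```python
import hashlib


def fingerprint(script_text: str) -> str:
    """Content fingerprint: SHA-256 hex digest of the script body."""
    return hashlib.sha256(script_text.encode("utf-8")).hexdigest()


# Hypothetical tracking snippet and a blocklist built from its fingerprint.
original = "function track(e){send('/collect', e);}"
blocklist = {fingerprint(original)}

# A trivial per-site (or per-request) tweak: rename a function, add a comment.
tweaked = "function t_8f3a(e){send('/collect', e);} /* v2 */"

print(fingerprint(original) in blocklist)  # exact copy is recognized
print(fingerprint(tweaked) in blocklist)   # same behavior, different hash
```

The tweaked script does the same thing but no longer matches, so the index would have to be rebuilt for every variant.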


Exactly, this is already done for tracking scripts, since it's common to use proxies to load tracking scripts.


Not with dynamic obfuscation.


Posthog as well as Snowplow are open source solutions that can be self hosted. Snowplow is always hosted in your cloud infrastructure even if you use their managed service.


Fathom's EU docs (https://usefathom.com/features/eu-isolation) seem to suggest that EU-hosted but US-owned cloud infrastructure isn't sufficient either though - you're exposing any data stored/transferred through there to access by the US government.

That means Posthog self-hosted on an AWS server in Frankfurt wouldn't avoid this issue.

What're the best options for non-US owned cloud providers? AFAICT Canada or many other countries with privacy laws would be fine, it's really the US specifically that's problematic.


Well, according to them they use Hetzner, and I'm not sure, but since Hetzner now has US servers they might be in the wrong, too. It has nothing to do with US companies...


When creating a new VM at Hetzner, you have to explicitly pick the US location, for the exact same GDPR reasons. Hetzner has to do this or risk losing customers, because hosting in the US and/or using US services is forbidden for some kinds of infrastructure.


Let's be honest, no one wants to self-host a website analytics application, because in most cases they just want to focus on their core business, or at least that's the value proposition of SaaS. This will ultimately limit innovation to bigger companies that can afford to maintain their own infrastructure for everything.


I think the article misses the point of what a billionaire tax, and a progressive tax system in general, should be about. Tax systems in modern societies are not solely about generating income for the state but partly also about distributing wealth more fairly.

We are coming from a time when the top marginal tax rates were between 70% and 90%. Now it's under 40% [1]. This is probably one reason why we came to the point where we have to discuss a billionaire tax in the first place.

[1] https://eml.berkeley.edu/~saez/course/Labortaxes/taxableinco...


A couple of thoughts:

1) I think that the redistributionist view of taxation is still quite controversial (at least in the US)

2) I think redistribution via taxation is exceedingly unpopular if you are paying for it (i.e., people are all for other people paying more, but not themselves)

3) While marginal tax rates were higher previously, a) the highest marginal tax brackets started as high as $100m in today’s dollars, and b) effective tax rates were not massively different (can’t find a source here, but recall seeing data on this) because of avoidance, deductions, etc., such that tax collected per dollar of income actually hasn’t changed much.
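To illustrate point 3a with arithmetic only: a sketch of how a very high top bracket threshold keeps the effective rate far below the headline marginal rate. The bracket schedule is entirely hypothetical (not historical US data), and it does not model the deductions or avoidance mentioned in 3b.

```python
def tax_owed(income, brackets):
    """Progressive tax: each rate applies only to income inside its bracket.

    brackets: list of (lower_bound, rate), sorted by lower_bound ascending.
    """
    owed = 0.0
    for i, (lower, rate) in enumerate(brackets):
        upper = brackets[i + 1][0] if i + 1 < len(brackets) else float("inf")
        if income > lower:
            owed += (min(income, upper) - lower) * rate
    return owed


# Hypothetical schedule: the 90% rate only starts at a very high threshold.
brackets = [(0, 0.20), (100_000, 0.40), (10_000_000, 0.90)]

for income in (80_000, 1_000_000, 20_000_000):
    owed = tax_owed(income, brackets)
    print(f"income {income:>12,}: effective rate {owed / income:.1%}")
```

In this made-up schedule, someone earning 1,000,000 never touches the 90% bracket at all, and even at 20,000,000 the effective rate stays well under 90%.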

