Launch HN: HiGeorge (YC W21) – Real-time data visualizations for public datasets
89 points by saigal on Feb 19, 2021 | 44 comments
Hi HN!

Anuj here. My co-founder Amir (Aazo11) and I are building HiGeorge (https://hi-george.com/). We make localized drag-and-drop data visualizations so that all publishers, even the small ones, can better leverage data in their storytelling. Think Tableau with all the necessary data attached.

At the onset of the pandemic, Amir and I were looking for local data on the spread of the virus. We visited the sites of large national newsrooms like the NYTimes and were impressed by the quality of their data visualizations and maps, but they lacked the geographic granularity for our own neighborhood.

We then turned to our local newsrooms but found they presented data in tables and lists that made it difficult to comprehend the virus’ spread and trends. We wondered why. After talking to local journalists and publishers, we found that newsrooms simply do not have the resources to make sense of large datasets.

Public datasets are hard to clean, poorly structured, and constantly updated. One publisher explained to us that she would refresh her state health department’s website 5 times a day waiting for updated COVID data, then manually download a CSV and clean it in Excel. This process could take hours, and it needed to happen every day.

This is where HiGeorge comes in. We clean and aggregate public datasets and turn them into auto-updating data visualizations that anyone can instantly use with a simple copy/paste. Our data visualizations can be drag-and-dropped into articles, allowing news publishers to offer compelling data content to their communities.

Check out a few versions of what we’re doing with customers -- COVID-19 data reporting at North Carolina Health News [1], COVID-19 vaccine site mapping at SFGATE [2], real-time crime reporting in Dallas, TX [3], and police use of force at Mission Local [4].

Today, HiGeorge works with dozens of newsrooms across the country. Our visualizations have driven a 2x increase in pageviews and a 75% increase in session duration for our partner publishers. We charge a monthly subscription for access to our data visualization library – a fraction of the cost of an in-house data engineer. In the long run, we are building HiGeorge so that it becomes the single place to collaborate on and publish data content.

We’d love to hear from the HN community and we’ll be hanging out in the comments if you have any questions or feedback.

[1] https://www.northcarolinahealthnews.org/2021/02/09/coronavir... [2] https://www.sfgate.com/bayarea/article/vaccine-sites-San-Fra... [3] https://lakehighlands.advocatemag.com/2021/02/data-crime-tre... [4] https://missionlocal.org/crime-data/



Very cool! Selling to news publishers is really hard, though. They're cash-strapped, and the industry is in a lot of turmoil. The PE firms that are buying up papers are not known for investing in the journalism. How are you approaching this challenge?

Also, what public data sets do you use? This seems like a natural evolution of Steve Ballmer's USAFacts: https://usafacts.org/


PE firms and hedge funds that buy papers want to maintain quality journalism in a cost-efficient manner. HiGeorge does exactly this. It enables newsrooms to have geographic coverage while maintaining low overhead. Because the ROI on interactive data content is high (increased time spent on page, increased pageviews, etc.), buyers of papers look very favorably on a solution such as HiGeorge.

We can ingest any public datasets. USAFacts is fantastic and would be a fabulous partner for us!


Woah, this is exactly what I was looking for. I had vaccination data that I wanted to visualise but seemingly nothing in the market makes it easy to put it out there fast.

The Google Charts library is great (if you have heard of it), but it comes with restrictions on its GeoCharts: they require a Google Maps Platform API key in the client, which made me reconsider using it. I only needed a simple choropleth chart, for which a static GeoJSON feature file would work, with no need to fetch from Google Maps for every user.
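For context, the sketch below is roughly all I needed: a minimal D3 choropleth in TypeScript over a static GeoJSON file, no Maps API key involved. The file name and the numeric "rate" property on each feature are hypothetical.

    // Minimal choropleth sketch: D3 + a static GeoJSON file, no Google Maps key.
    // Assumes a hypothetical "regions.geojson" whose features carry a numeric "rate".
    import * as d3 from "d3";
    import type { FeatureCollection } from "geojson";

    async function drawChoropleth(): Promise<void> {
      const geo = (await d3.json("regions.geojson")) as FeatureCollection;
      const width = 640;
      const height = 480;

      // Fit a projection to the features and build a path generator from it.
      const projection = d3.geoMercator().fitSize([width, height], geo);
      const path = d3.geoPath(projection);

      // Quantized color scale over the value we shade by.
      const rates = geo.features.map((f) => (f.properties?.rate as number) ?? 0);
      const color = d3
        .scaleQuantize<string>()
        .domain([d3.min(rates) ?? 0, d3.max(rates) ?? 1])
        .range(d3.schemeBlues[5]);

      d3.select("#map")
        .append("svg")
        .attr("width", width)
        .attr("height", height)
        .selectAll("path")
        .data(geo.features)
        .join("path")
        .attr("d", path)
        .attr("fill", (f) => color((f.properties?.rate as number) ?? 0));
    }

    drawChoropleth();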

Great work. I will check it out.


We should have what you need. Feel free to email me at anuj [at] hi-george.com

Also you can browse our library at https://hi-george.com/visualizations


Thanks. I saw the pricing in the comments. It wouldn't be affordable for me for the project I wanted to do above (I am not getting anything from it). Nonetheless, good luck. It's a real problem.


Similar to what I mentioned in another reply, for now we offer a limited set of data visualizations that are self-serve and entirely free here: https://hi-george.com/selfserve

If there is a data set that we don't cover, please feel free to request it and we will get back to you ASAP.


I like the concept a lot, but aren't the newsrooms outsourcing their due diligence to a company that may not have the same standards (or lack thereof) as their own? If I'm the NYT (hint: I'm not), why would I trust you to provide data that is the central point of my narrative, and what if that data goes the other way as time progresses?


Every data visualization that HiGeorge publishes is reproducible using the Associated Press data guidelines (link below; note that account creation may be required). This means that anyone (journalist, editor, even the reader) can easily verify that the data is accurate and can assess for themselves whether the data source is reputable.

Each of our data visualizations is connected to a data feed so that it is always up to date. They are of the "set it and forget it" variety.

https://www.apstylebook.com/ap_stylebook?az_chapter=80&other...


Assuming news orgs only want to tell the truth may be a bit too naive.


It is true that some newsrooms have a point of view that they push. While data can be manipulated, the hope is that data visualizations from reputable sources (e.g. the CDC) can help separate fact from fiction.


One man's trash is another man's treasure.


Congrats Anuj & Amir for launching! It's a really exciting problem to solve, and I like your take on it.

> Public datasets are hard to clean, poorly structured, and constantly updated.

We relate strongly to this at Monitoro [0]; the situation is even more dire for data published only as HTML pages.

Here's a demo [1] where we're tracking European market indexes at Euronext. The data we capture is much more involved, but that's what you get with Airtable embeds :)

Do you have plans for integrations, or are you open to chatting about potential synergies?

I'd love to chat. My email is omar @ our domain.

[0]: https://www.monitoro.xyz

[1]: https://www.monitoro.xyz/demo-euronext


I think this has potential. I would move away from the "Corporate illustration style" if I were you; it's getting some pushback lately. Also, a minor nitpick about the logo, which reads "HIGEORGE" as a single word.

I instantly searched for pricing, and was put off by the lack of it. I think if I were a small publication, my chances of subscribing would be a lot higher if I knew how much it would cost me without going through a salesperson. Maybe you are still figuring out prices right now?


Pricing begins at $199/month for small publishers, increasing for publishers with larger size/scope. Your feedback regarding the availability of pricing on the website is noted, and I can understand the friction caused by needing to speak with sales. It is a change that we are currently weighing.


I didn't know links in the description of HN posts were clickable. Is this true only for Launch/Show HN? Or maybe because they're a YC company they get special treatment?


I think it's only for Launch HN (Show HN is a link post, not a text post).

It doesn't seem to be for YC only. Here's an example from a non-YC co: https://news.ycombinator.com/item?id=23466470


I would explore small starter plans - a couple of published charts/stories per month, or even one per month. IMHO, a lot of marketers like myself are professional storytellers and use data extensively. We'd love your product, potentially. It isn't news, per se, but we still use data viz to communicate to the public. I can't justify $199/mo., but wish your team well with the concept.


Ah, I love this. While we are starting with news publishers, we are HIGHLY aware that others (marketers, bloggers, press teams at enterprises, and many more) are interested in storytelling that is supplemented with data. Another great example is Substack writers. We are exploring individual plans. May I ask what you would be willing to pay as an individual?


I had one more question... Can you tell me just a bit more about your use case as a marketer? We've received a lot of similar requests and I'm trying to learn more about the use cases. Thank you in advance.


This is a very nice idea. Glad you found a model to make this a viable business.

I wonder, in the long run, how you decide what data/vis to publish? People care a lot about trendy topics, e.g. the Mars rover now, the GME short squeeze a few weeks back, and wildfires last year. But those are short-lived, which might not be as profitable as things like COVID and elections?


In these early days we perhaps spend more time thinking about this question than any other. After speaking with hundreds of news publishers, the truth seems to be that certain topic areas have "staying power" while others are more fleeting. For example, rental price trends, real estate trends, economic data (e.g. unemployment rates), and environmental issues (e.g. air quality) are always relevant. The other nuance is that these topics have constantly updating data feeds since the realities of the world are constantly changing. Kind of like the Pareto principle.


Understand. It's unfortunate that those fleeting trendy datasets are hard to monetize. At the same time, they seem equally important in terms of how much people care about them. Just look at the stuff on r/dataisbeautiful, which is a mix of the two. IMO fleeting datasets could be pretty good marketing if done right.


It's true. The one-off data viz often get the most engagement. For marketing purposes, this is fantastic. As a company, however, our product is more "sticky" the more often the underlying datasets update, since the content is always fresh. I like to say it's kind of like a Netflix movie, except you don't mind watching it again and again :-)


Visualisations look good.

But I have to say the landing page is simply the weirdest I have come across in a long time :)

The visualisations on the landing page & the "how it works" section are chopped off on the right side!

It is not clear how it works with existing datasets. Say I have a CSV or a database: can HiGeorge connect and show some visualisations? How does this compare to Superset/Metabase, etc.?


Our eventual goal is to remove the need for anyone to do redundant data engineering work on public datasets. So while Superset/Metabase are tools to explore your enterprise data, we are focused on building a growing library of feeds from public data. That being said, we allow users to add to the library from datasets they can point us to.

Thanks for the feedback on the landing page. We will take it into account in the next iteration of the design.


I'm curious what the business model is. This seems like a service business: you have to manually create each graphic, then hope that it can be used long-term.

Seems less like a SaaS and more like a service business, which typically doesn't scale the way YC companies are expected to.

Love to hear your feedback


We do not manually create any graphs. The infrastructure we have developed allows us to create new graphs without writing any code:

- On the backend we can add new data pipelines to any public data by setting a few configuration parameters

- The front end is completely modular with re-usable D3 components

While we are doing the configuration ourselves for now, we plan to open this up over time.
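To make that concrete, here's a rough illustrative sketch (not our actual schema; the dataset URL and field names are hypothetical) of what declaring a new feed could look like:

    // Illustrative sketch only -- not HiGeorge's real configuration format.
    // A new public-data feed is described by a handful of parameters, and the
    // front end picks a reusable D3 component based on the declared chart type.
    interface FeedConfig {
      id: string;                                // unique feed name
      sourceUrl: string;                         // public CSV/JSON endpoint
      refreshInterval: "15m" | "1h" | "1d";      // how often to re-pull the source
      geoLevel: "state" | "county" | "city";     // geographic granularity
      fields: { name: string; type: "date" | "number" | "category" }[];
      chart: "line" | "bar" | "map" | "table";   // which D3 component renders it
    }

    const dallasCrime: FeedConfig = {
      id: "tx-dallas-crime",
      sourceUrl: "https://example.gov/open-data/crime.csv", // hypothetical URL
      refreshInterval: "1h",
      geoLevel: "city",
      fields: [
        { name: "date", type: "date" },
        { name: "offense", type: "category" },
        { name: "count", type: "number" },
      ],
      chart: "line",
    };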


In addition, because each of our data viz is connected to a live data feed, it creates an ever-expanding library with decreasing effort/time from HiGeorge. More content, more revenue, less time/effort as time goes on.


So cool, this idea makes a lot of sense to me. I sold a data viz article to a newspaper and was surprised to learn that the biggest value we provided was compiling the (publicly available) data.


Nice name. :-)


hahaha


You might consider a free plan that limits the overall views/requests over a period (day/month). Once exceeded, you can show a simple message stating this and asking users to come back tomorrow, etc. It would be great for small-time bloggers and websites.


I like this idea a lot. For now we offer a limited set of data visualizations that are self-serve and entirely free here: https://hi-george.com/selfserve

Requests for other data sets are welcome


This is super-cool, thanks.

You might want to highlight this on the landing page.

Can I please ask for some weather-related datasets - snow received over time, over the past years, etc.?


Good recommendation. May I ask what the use case is -- blog post, academic journal, news article? If you send us the URL to where the data is, we can certainly take a look. Please feel free to email me at anuj@hi-george.com


I couldn't find any pricing info on your site. Can you please share more details here?


Pricing depends on the size of the publisher/newsroom; it starts at $199/month for access to our data viz library. For individual newsletter writers and bloggers, we are experimenting with lower-priced plans.


Do you get a lot of backlinks to your site from major domains, hence improving your SEO?


Currently our data viz are embedded through an iframe, so the backlinks come from our front-end domain unless we get an explicit shoutout in the article. However, we plan on moving the "Powered by HiGeorge" HTML tag outside the iframe to help with SEO.
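Roughly, the pattern looks like the sketch below (hypothetical markup and embed path, not our actual snippet): the attribution link sits in the publisher's page outside the iframe, so crawlers see an ordinary backlink.

    // Hypothetical embed snippet generator -- the real HiGeorge embed code may differ.
    function embedSnippet(vizId: string): string {
      return [
        '<div class="higeorge-embed">',
        // The visualization itself stays inside the iframe...
        `  <iframe src="https://hi-george.com/embed/${vizId}"`,
        '          width="100%" height="480" frameborder="0"></iframe>',
        // ...while the attribution link lives in the host page, outside the iframe.
        '  <a href="https://hi-george.com">Powered by HiGeorge</a>',
        "</div>",
      ].join("\n");
    }

    console.log(embedSnippet("sf-vaccine-sites")); // hypothetical viz id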


But whatever domain is embedding your iframe is still sending you backlink juice, be it to your front end domain.


If I remember correctly, most crawlers treat an iframe as a separate window rather than as a backlink.


Would love to hear what your favorite and least favorite datasets are


"Favorite" is probably the wrong word but the most 'insightful' data viz could be the one that shows the disproportionate impact of COVID-19 on different racial groups. Here's an example for San Francisco. https://missionlocal.org/2021/02/2-19-tracker/


Least favorite would definitely be California campaign finance data.



