Our use case: we ingest hundreds of millions of data points into Hadoop, then run Spark ETL jobs that partition the data on HDFS.
The next day we receive several million updates to data points from the previous day(s).
What would you recommend on a Hadoop setup? HBase? Parquet with Hoodie (Hudi) to handle the deltas? Iceberg?
Or Hive 3 with ACID tables?
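For concreteness, the daily merge semantics we need are "upsert by record key, latest timestamp wins" — the operation Hudi, Iceberg, or Hive ACID would perform at scale. A plain-Python sketch (the field names `id` and `ts` are placeholders, not our real schema):

```python
def upsert(base, deltas, key="id", ts="ts"):
    """Merge delta records into a base set: insert new keys,
    and for existing keys keep the record with the newer timestamp."""
    merged = {r[key]: r for r in base}
    for r in deltas:
        current = merged.get(r[key])
        if current is None or r[ts] >= current[ts]:
            merged[r[key]] = r
    return sorted(merged.values(), key=lambda r: r[key])

base = [{"id": 1, "ts": 1, "v": "a"}, {"id": 2, "ts": 1, "v": "b"}]
deltas = [{"id": 2, "ts": 2, "v": "b2"}, {"id": 3, "ts": 2, "v": "c"}]
print(upsert(base, deltas))
```

The question is which table format does this efficiently over billions of rows on HDFS, without us hand-rolling the merge as another Spark join job.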