

We don't use snapshots for backups, as the Postgres backups are more convenient. Postgres provides point-in-time recovery, which is useful. There are also tools like wal-e for automatically writing and restoring Postgres backups to S3.

As for stability, there have been two major sources of instability with ZFS: The first issue was with the default value of arc_shrink_shift.

I'm one of the database engineers at Heap. There are several reasons why we use EC2. First of all, I will say I love RDS as a product. We actually do use RDS for a number of our services. We use Postgres on EC2 only for our primary data store. As for reasons why we use EC2:

Cost - Our primary data store has >1 petabyte of raw data stored across dozens of Postgres instances. The amount of data we store is at the point where RDS is too expensive for us. The cost of an instance on RDS is more than twice the cost on EC2. For example, an on-demand r4.8xl EC2 instance costs $2.13 an hour, while an RDS r4.8xl costs $4.80 an hour.

Performance - The only kind of disk available on RDS is EBS. EBS is slow compared to the NVMe the i3s provide. We used to use r3s with EBS and got a major speedup when we switched to i3s. As a side note, the cost of an i3 is also less than the cost of an r3 with an equivalent amount of EBS.

Configuration - By using EC2 we can configure our machines in ways we wouldn't be able to if we used RDS. For example, we run ZFS on our EC2 instances, which compresses our data by 2x. There isn't an easy way to compress your data if you use RDS. By compressing our data, we get a major cost saving and a major performance boost at the same time!

Introspection - There are times when we've needed to debug performance problems with Postgres and EXPLAIN ANALYZE won't suffice. A good example is when we used flame graphs to see what Postgres was using CPU for. We made a small change that resulted in a 10x improvement to ingestion throughput. If you are curious, I wrote a blog post on this investigation.
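For a rough sense of what the EC2 vs RDS price gap quoted above means at fleet scale, here is a small sketch. The hourly prices are the ones from the cost example; the instance count (24) and the 730 hours per month are assumptions picked purely for illustration, not real figures.

```python
# On-demand hourly prices quoted above for an r4.8xl.
EC2_HOURLY = 2.13
RDS_HOURLY = 4.80

HOURS_PER_MONTH = 730  # approximation: 8760 hours / 12 months
INSTANCES = 24         # hypothetical stand-in for "dozens"; not a real count


def monthly_cost(hourly_price: float, instances: int = INSTANCES) -> float:
    """Monthly on-demand cost for a fleet of identical instances."""
    return hourly_price * HOURS_PER_MONTH * instances


ratio = RDS_HOURLY / EC2_HOURLY
ec2_monthly = monthly_cost(EC2_HOURLY)
rds_monthly = monthly_cost(RDS_HOURLY)

print(f"RDS / EC2 price ratio: {ratio:.2f}x")
print(f"EC2 fleet: ${ec2_monthly:,.0f}/month")
print(f"RDS fleet: ${rds_monthly:,.0f}/month")
print(f"Difference: ${rds_monthly - ec2_monthly:,.0f}/month")
```

This only compares instance prices. It leaves out storage, and it ignores the 2x compression from ZFS, which would widen the gap further by halving the disk needed on the EC2 side.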
