For those of you who, like me, have been shying away from serverless architecture because “how the hell do you run a web server without a server?”, only to realize that serverless means something completely different, I wanted to write a very short first-impressions take on Google Cloud’s Firebase platform. So what is it? You can read the official feature list and explanation on their website, but I’d summarize it as a cloud-powered platform and set of tools that gives your web or mobile app a backend without you needing to code much of that backend yourself. The suite includes realtime client/server communication, a database, file storage, and many other things I didn’t bother to use for my pet project.
For my first use of Firebase, I built a very simple React web app that finds when local swimming pools in my home town of Toronto are open. You can view the project here: https://poolfinder.podrezo.com/
Firebase access using Ruby
Once I had the library set up and started trying to upload data to Firestore, I ran into the question of “what credentials do I use to do this?”, since the web credentials are not suitable for running server/background code like my Ruby data uploader. It turns out one must set up a service account (which, confusingly, lives inside the project but outside of Firebase, as part of Google Cloud), then install the Firebase CLI client, log in with it, and use it to generate a key file. The process worked without a hitch but felt clunky: I would have preferred some sort of API secret string that I could store, without having to jump through the hoops of installing a client and storing a JSON file.
With the Ruby script set up and the data loaded into the Firestore database, I began working on my client. I immediately ran into a whole bunch of issues here. Firstly, my data had two “tables”: one for locations and one for the schedule. What I wanted to do was load the five closest pools to the user’s geographic coordinates and then, for each of those pools, load the schedule for the next 24 hours. It seemed simple enough, but I quickly fell prey to Firestore’s oddities.
Firestore and geo data 👎
The first pain was that, despite supporting a data type called GeoPoint, Firestore cannot actually query by geographic distance. I can’t tell it “give me the five closest pools to (x,y)” or even “give me all pools within 5km of (x,y)”, because the only way to query GeoPoints is by their raw floating-point values, using simple less-than/greater-than operators. That leaves two options for getting the pools closest to me. One is to use client-side logic to compute the north-western-most and south-eastern-most corners of a bounding box and query for locations between them with less-than/greater-than comparisons, which is annoying because a bounding box doesn’t have a radius like a circle does, making it inconvenient for finding the closest locations. The other is to query for all locations (there were only about 120 that I was interested in, which is not a lot) and filter them by distance on the client side; this is the approach I went with initially, but it turned out to be really bad for another reason I’ll get to further down.
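To make the client-side option concrete, here is a minimal sketch of filtering by distance in the browser. It is not the code from my app: the `Pool` shape, its field names, and the helper names are assumptions for illustration, and the distance uses the standard haversine formula on a spherical Earth.

```typescript
// Hypothetical shape for a pool location; the field names are assumptions.
interface Pool {
  name: string;
  lat: number; // latitude in degrees
  lng: number; // longitude in degrees
}

const EARTH_RADIUS_KM = 6371;

function toRadians(degrees: number): number {
  return (degrees * Math.PI) / 180;
}

// Great-circle distance between two coordinates (haversine formula).
function haversineKm(lat1: number, lng1: number, lat2: number, lng2: number): number {
  const sinHalfLat = Math.sin(toRadians(lat2 - lat1) / 2);
  const sinHalfLng = Math.sin(toRadians(lng2 - lng1) / 2);
  const a =
    sinHalfLat * sinHalfLat +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * sinHalfLng * sinHalfLng;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Sort a copy of the list by distance from the user and keep the first `count`.
function closestPools(pools: Pool[], userLat: number, userLng: number, count: number): Pool[] {
  return pools
    .slice()
    .sort(
      (a, b) =>
        haversineKm(userLat, userLng, a.lat, a.lng) -
        haversineKm(userLat, userLng, b.lat, b.lng)
    )
    .slice(0, count);
}
```

With only about 120 locations, sorting the whole list on every lookup is cheap enough that nothing smarter (like a spatial index) seems necessary.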
Firestore and “complex” queries 👎
With the locations all set up, I wanted to query the schedule for each of the five closest locations, restricted to swim time blocks that fall within a certain time range. This proved impossible. I wish I had read up on Firestore’s query limitations before uploading my data, because I didn’t realize that compound queries (i.e. ones with more than one condition) cannot apply range operators such as less-than/greater-than to more than one field at a time. That is, I can’t query the “from” time as being greater than some value while simultaneously querying the “to” time as being less than some other value. This was a huge flop, and on top of that, not-equal comparisons are not supported: the official docs say to replace “age != 30” with two separate queries, one for “age < 30” and one for “age > 30”, and merge the results. The docs’ example uses numbers; the same range operators also work on strings, which are ordered lexicographically, but running and merging two queries just to express “not equal” is clunky either way. In the end I opted to simplify the whole architecture of this application to fit more neatly with how Firestore works, and that ended up being fine, but before I get to that I wanted to mention another Firestore issue I had: limits!
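One way to live with the single-range-field restriction is to let the query handle the range on one field and apply the second range on the client. The sketch below only illustrates that split: `ScheduleRow` and both function names are hypothetical, and `queryFromAtLeast` is an in-memory stand-in for whatever single-field range query the backend would run, not a Firestore call.

```typescript
// Hypothetical shape of one swim time block; times are epoch milliseconds.
interface ScheduleRow {
  poolId: string;
  from: number;
  to: number;
}

// Stand-in for the backend query: a single range condition on a single field.
function queryFromAtLeast(rows: ScheduleRow[], minFrom: number): ScheduleRow[] {
  return rows.filter((row) => row.from >= minFrom);
}

// The second range condition, applied on the client to the query results.
function keepEndingBy(rows: ScheduleRow[], maxTo: number): ScheduleRow[] {
  return rows.filter((row) => row.to <= maxTo);
}
```

The catch is that every row matched by the first condition still counts as a read, even if the client-side filter throws it away.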
Firestore’s usage limits
My pet project was a way to try out Firebase for something small and very basic; I thought it would be a perfect fit. What I didn’t really pay attention to is how the limits work, especially for Firestore. For the free tier you get something like 50k reads per day. I read this as 50,000 queries per day, which given the low volume I expected was totally fine, so it came as a total shock when, during development and before I had even launched the project, the Firebase console told me “you have exceeded your daily quota” while I was merely testing the project myself.
It turns out the 50k figure is for the number of documents read, not the number of queries. Remember when I said “oh yeah, we only have 120 pool locations so loading them all is fine”? Well, that’s 120 document reads on each page load. Couple that with loading the schedules for the five closest locations, where each swim time block is its own document, and it adds up to several hundred reads per page load. Refresh the page a few dozen times during testing and you’re done for the day! I ended up solving this problem with a simple but very impactful optimization I’ll go over next, but suffice it to say I was not pleased with how this went. It seems pretty reasonable to want to split my data into a schedule table and a locations table and then query each one individually, but Firestore is very much not conducive to that, so the moral of my Firestore experience is “do things in a Firestore way or else you’re asking for trouble.”
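The arithmetic is worth spelling out. In this back-of-the-envelope sketch the 120 locations, five pools, and roughly 50k daily reads come from the numbers above, while the number of schedule documents per pool is purely an assumed figure for illustration.

```typescript
const DAILY_FREE_READS = 50000; // approximate free-tier figure mentioned above
const LOCATION_DOCS = 120; // every location document is read on each page load
const POOLS_SHOWN = 5; // schedules are loaded for the five closest pools
const ASSUMED_BLOCKS_PER_POOL = 150; // hypothetical: one document per swim time block

// Document reads for one page load: all locations plus every schedule block fetched.
function readsPerPageLoad(locationDocs: number, poolsShown: number, blocksPerPool: number): number {
  return locationDocs + poolsShown * blocksPerPool;
}

// Whole page loads that fit into the daily quota.
function pageLoadsPerDay(dailyQuota: number, readsPerLoad: number): number {
  return Math.floor(dailyQuota / readsPerLoad);
}

const readsPerLoad = readsPerPageLoad(LOCATION_DOCS, POOLS_SHOWN, ASSUMED_BLOCKS_PER_POOL);
console.log(readsPerLoad); // 870
console.log(pageLoadsPerDay(DAILY_FREE_READS, readsPerLoad)); // 57
```

Under that assumption the whole day's quota is gone after a few dozen refreshes, which matches what I saw while testing.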
Optimizing for querying Firestore 🤔
I rewrote my Ruby data uploader script to upload the data in a different format and then altered the client side to work a bit differently. Firstly, I compiled my list of locations, which is static, to be stored on the client side (without any scheduling data). This means that when I want to find the five closest pools, I don’t need to query a server at all, and I gain the ability to query by distance since Firestore’s geo data limitations no longer apply. Then, once I have the five closest pools, I query the schedules for those pools, but with a twist from the previous implementation: I used Firestore’s support for hierarchical data to my advantage by bundling each pool’s schedule into the locations table and getting rid of the schedule table completely. That is, each location stores its usual properties like name and coordinates, but it also has a complex field for the schedule that contains all of the open time blocks I have uploaded. What this means is that on a single page load, I grab the five closest pools, make exactly five queries to Firestore to get their schedule data, and filter that schedule data on the client side. This cuts the number of read operations per page load from several hundred to five, roughly two orders of magnitude, and I no longer ran into the Firestore limits.
My friend Steve T. added the following, which may be of interest to you as well:
One of the things that I wasn’t aware of at first moving from their Realtime database to Firestore was that queries were shallow by default. So my queries returned my documents but not their sub collections which required another read operation. I went the same route as you did where I used a complex field (or as they refer to as maps), to store values I previously had in a subcollection which helped reduce the amount of reads I made.
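To illustrate the restructured data, here is a minimal sketch of a location document with its schedule embedded, plus the client-side filter for the next 24 hours. The type and field names are assumptions rather than the exact ones from my app, and times are taken to be epoch milliseconds.

```typescript
// One open time block; epoch milliseconds are an assumption for this sketch.
interface TimeBlock {
  from: number;
  to: number;
}

// Hypothetical location document with the schedule stored as a nested field.
interface PoolDocument {
  name: string;
  lat: number;
  lng: number;
  schedule: TimeBlock[];
}

const MS_PER_HOUR = 60 * 60 * 1000;

// Keep blocks that overlap the window [nowMs, nowMs + windowHours), earliest first.
function upcomingBlocks(pool: PoolDocument, nowMs: number, windowHours: number = 24): TimeBlock[] {
  const windowEndMs = nowMs + windowHours * MS_PER_HOUR;
  return pool.schedule
    .filter((block) => block.to > nowMs && block.from < windowEndMs)
    .sort((a, b) => a.from - b.from);
}
```

Each pool is now a single document read regardless of how many time blocks it holds, so a page load costs five reads plus some filtering in the browser.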
Static assets & deployment 🎊
The next thing I wanted to do was host my client-side code so that others could actually hit the app. This turned out to be dead simple using the “hosting” part of Firebase, which was really awesome for a few reasons. Number one is the simplicity: all I had to do was build my project the way I had already set up (e.g. running webpack and outputting a “dist” folder), add a JSON file pointing at said folder, and use the Firebase CLI to say “hey, deploy this project”, and it would do everything else. It uploads, takes snapshots, logs what happened, etc. and puts it all on your app’s subdomain on Google Cloud with HTTPS already there out of the box.
I then wondered “how would I actually put this on a subdomain under my own domain name instead of Google’s?” and felt like it was going to be some huge hassle, but it turned out to be surprisingly easy as well. You only need to click a button to get a verification code, create a DNS “TXT” record on your domain with that value, and click “confirm”, and voila: everything’s done, HTTPS and all.
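For reference, the JSON file in question is `firebase.json` in the project root. A minimal sketch, assuming the build output lands in a “dist” folder as described above (the `ignore` list is a typical default rather than something specific to my project):

```json
{
  "hosting": {
    "public": "dist",
    "ignore": ["firebase.json", "**/.*", "**/node_modules/**"]
  }
}
```

With that in place, running `firebase deploy --only hosting` from the project folder should upload the contents of “dist”.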
Final note: Firestore security
I didn’t have to take database security into consideration at all for this project because the data was all public, so I just had to enable public reads for all of it and disable writing completely (service accounts, such as the one used by my Ruby data uploader script, are exempt from the security rules). In a more real-world application where users are submitting data, this would obviously not fly. For this reason, Google lets you write little scripts that define the rules for who can read and write which documents. I will definitely be looking into this for any new Firebase projects, but at first glance it seemed a little overcomplicated for most use cases. Beware that you might be looking at a lot of documentation-reading to set up your Firestore rules if you plan to have users write to it, because it did not seem very straightforward.
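For a fully public, read-only dataset like this one, the rules can stay very short. A minimal sketch in the Firestore security rules language, assuming every document should be readable by anyone and writable by no client (this is an illustration of the idea, not a copy of my project's rules file):

```
rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    // Applies to every document in every collection.
    match /{document=**} {
      allow read: if true;   // anyone may read
      allow write: if false; // no client may write
    }
  }
}
```

Anything involving user-submitted data would need per-collection conditions instead of this blanket rule, which is where the documentation-reading comes in.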
My final verdict on Firebase, based on my admittedly pretty limited experience with it, is as follows:
Pros:
- Fast set up time for a base project
- Once you model your data and application in a way that takes Firebase’s limitations into account, it is actually really simple to work with and quite efficient
- Deployment of static website content was a breeze; HTTPS comes out of the box with no configuration required
- Using a custom domain or subdomain with the hosting is incredibly simple as well
Cons:
- The Firebase library is not well documented and assumes you understand that the method signatures are the same across languages
- The Firebase CLI client is required for many tasks, and you have to set it up and configure it
- Firestore is pretty limited in its querying capabilities
- Geo data and Firestore do not go hand-in-hand at all; there is no way to query by geographic proximity
- Firestore quotas are based on the number of document reads/writes, not on the number of operations, which is really hard to work with if you want to structure your data in a meaningful way in certain applications
Overall, I’d definitely recommend it for small projects but would think twice for anything more complicated. With the generous amount of credit Google gives out to first-time customers, plus the free tier that’s available to everyone, I think it is very much worth a shot.