by Alan Yao, Dipak Pawar, Blair Wu, and Abhishek Parmar
The Problem
Over the last couple of years, Airbnb engineering moved from a monolithic Ruby on Rails architecture to a service-oriented architecture (SOA). In our Rails architecture, we had an API per resource to access the underlying data. These APIs had authorization checks to protect sensitive data. As there was a single way to access a resource’s data, managing these checks was easy. In the transition to SOA, we moved to a layered architecture in which data services wrap databases and presentation services hydrate from multiple data services. Our initial approach was to move the permission checks from the monolith into the presentation services. However, this led to several problems:
To tackle these issues, we made two changes:
Himeji exposes a check API for data services to perform authorization checks. The API signature is as follows:
// Can the principal do relation on entity?
boolean check(entity, relation, principal)
A permissions check will look like the following, which states “can user 123 write to listing 10’s description?”:
check(entity: "LISTING : 10 : DESCRIPTION",
      relation: WRITE,
      principal: User(123))
This is interpreted by Himeji as the statement “is user 123 in the set of users that can write to listing 10’s description?”.
Similar to Zanzibar, the basic unit of storage for Himeji is a tuple in the form entity # relation @ principal.
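As a minimal sketch of this data model, a tuple can be represented as an immutable record and parsed from the entity # relation @ principal form. The names RelationTuple and parse_tuple are hypothetical and chosen for illustration; they are not Himeji's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationTuple:
    """One stored fact: `principal` is in the `relation` set of `entity`.

    Hypothetical illustration of the tuple shape, not Himeji's real type.
    """
    entity: str      # e.g. "LISTING : 10"
    relation: str    # e.g. "OWNER"
    principal: str   # e.g. "User(123)"

    def __str__(self) -> str:
        return f"{self.entity} # {self.relation} @ {self.principal}"


def parse_tuple(text: str) -> RelationTuple:
    """Parse 'entity # relation @ principal' into a RelationTuple.

    Only the first '#' and the first '@' after it are treated as separators,
    so a principal that itself contains '#' is kept intact.
    """
    entity, rest = text.split("#", 1)
    relation, principal = rest.split("@", 1)
    return RelationTuple(entity.strip(), relation.strip(), principal.strip())


owner = parse_tuple("LISTING : 10 # OWNER @ User(123)")
print(owner.entity, "|", owner.relation, "|", owner.principal)
```

Making the record immutable and hashable lets tuples be stored in sets or used as cache keys, which is convenient for the set-membership view of a check.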
If we had to write a tuple for each exact permission that is checked, the volume of data and denormalization would grow exponentially. For example, we’d have to write both LISTING : 10 # WRITE @ User(123) and LISTING : 10 # READ @ User(123) for the listing owner to be able to both read and write.
Based on the Zanzibar configuration, we use a YAML-based configuration language that allows for the resolution of permissions checks via set algebra, allowing a developer to map a check to a set operation:
Suppose user 123 is the owner of listing 10. Then the database will have the tuple LISTING : 10 # OWNER @ User(123).
When we request check(entity: "LISTING : 10", relation: WRITE, principal: User(123)), Himeji interprets LISTING # READ as the union of READ and WRITE and, transitively, LISTING # WRITE as the union of WRITE and OWNER. It therefore fetches the following from its database, and any match belongs to the LISTING # WRITE set:
Query LISTING : 10 # WRITE @ User(123) => Empty
Query LISTING : 10 # OWNER @ User(123) => Match User(123)
So for example, user 123 need only have LISTING : 10 # OWNER @ User(123) to be in the LISTING : 10 # WRITE set.
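The resolution described above can be sketched as a small in-memory model. This is an assumption-laden illustration: the CONFIG dictionary stands in for the YAML configuration, and the names CONFIG, TUPLES, expand, and check are hypothetical rather than Himeji's implementation.

```python
# Hypothetical stand-in for the YAML configuration: each (entity type,
# relation) resolves to the union of the relations listed.
# Mirrors the text: READ = READ or WRITE, WRITE = WRITE or OWNER.
CONFIG = {
    ("LISTING", "READ"): ["READ", "WRITE"],
    ("LISTING", "WRITE"): ["WRITE", "OWNER"],
}

# Stored tuples as (entity, relation, principal).
TUPLES = {("LISTING : 10", "OWNER", "User(123)")}


def expand(entity_type, relation, seen=None):
    """Return every relation whose members also belong to `relation`."""
    seen = set() if seen is None else seen
    if relation in seen:
        return seen  # already visited; avoids looping on self-references
    seen.add(relation)
    for child in CONFIG.get((entity_type, relation), []):
        expand(entity_type, child, seen)
    return seen


def check(entity, relation, principal):
    """Is `principal` in the `relation` set of `entity`?"""
    entity_type = entity.split(":")[0].strip()
    return any(
        (entity, candidate, principal) in TUPLES
        for candidate in expand(entity_type, relation)
    )


print(check("LISTING : 10", "WRITE", "User(123)"))  # True, via OWNER
print(check("LISTING : 10", "READ", "User(123)"))   # True, via WRITE -> OWNER
print(check("LISTING : 10", "WRITE", "User(456)"))  # False
```

Only the OWNER tuple is stored, yet both READ and WRITE checks succeed, which is the denormalization saving the configuration is meant to provide.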
We observed that entities at Airbnb frequently grant access to other entities as a result of their existence. For example, a guest of a reservation gains access to a listing’s location, along with other pieces of the listing’s information. We represent this use-case with a tuple where the principal is a reference to an entity, i.e. LISTING : $id # RESERVATION @ Reference(RESERVATION : $reservationId). This allows us to express the concept that a user in the ‘guest’ set of a reservation that is in the ‘reservation’ set of a listing is in the LISTING : LOCATION # READ set, minimizing the amount of data that needs to be stored:
- LISTING : $id # RESERVATION @
Reference(RESERVATION : $reservationId # GUEST)
Where this approach differs from Zanzibar is that such a tuple does not contain a relation (e.g. Reference(RESERVATION:$id # GUEST)) within the principal. The relation following a referenced entity is static and retrieved from configuration. Taking the listing example and then checking against other use cases, we found that a reference is typically followed to multiple relations. In our product, there is no variance in the set of relations used between two entity types; a change in the set means a product change and applies across all entities of those types. If there are N references between two entity types and the set of relations followed through them (e.g. Reference(RESERVATION:$id # GUEST), Reference(RESERVATION:$id # COTRAVELLER), Reference(RESERVATION:$id # BOOKER), … ) has size M, writing a tuple for each relation leads to N*M tuples. By pulling the relation into configuration, we reduce the stored data to N tuples.
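To make the N versus N*M comparison concrete, here is a small worked example. The listing and reservation ids are made up for illustration; only the three relation names come from the text.

```python
# M = 3 relations followed from a listing to a reservation (from the text).
RELATIONS = ["GUEST", "COTRAVELLER", "BOOKER"]

# Made-up data: listing id -> reservation ids, giving N = 4 references.
RESERVATIONS = {10: [500, 501], 11: [502, 503]}

# Relation stored inside the principal: one tuple per (reference, relation).
with_relation = [
    f"LISTING : {listing} # RESERVATION @ Reference(RESERVATION : {res} # {rel})"
    for listing, reservations in RESERVATIONS.items()
    for res in reservations
    for rel in RELATIONS
]

# Relation kept in configuration: one tuple per reference.
without_relation = [
    f"LISTING : {listing} # RESERVATION @ Reference(RESERVATION : {res})"
    for listing, reservations in RESERVATIONS.items()
    for res in reservations
]

print(len(with_relation))     # 12 == N * M
print(len(without_relation))  # 4  == N
```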
At read execution time, suppose the following tuples are stored in the database:
LISTING : 10 # OWNER @ User(123)
LISTING : 10 # RESERVATION @ Reference(RESERVATION : 500)
RESERVATION : 500 # GUEST @ User(456)
Now, if a client sends a request like:
check(LISTING : 10 : LOCATION # READ, User(456))
then based on the configuration, Himeji issues the first DB fetch based on the information from the request and the above config:
Query LISTING : 10 # RESERVATION => Match Reference(RESERVATION:500)
Query LISTING : 10 # OWNER @ User(456) => Empty
Himeji will then issue the 2nd DB fetch, substituting in the id of the reservation found, where a match indicates that the user 456 is in the set of users allowed to read listing 10’s location.
Query RESERVATION : 500 # GUEST @ User(456) => Match User(456)
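The two-step fetch above can be sketched as follows. This is a simplified in-memory model under stated assumptions: DIRECT_RELATIONS and REFERENCES are a hypothetical stand-in for the configuration of LISTING : LOCATION # READ, and query and check_location_read are illustrative names, not Himeji's API.

```python
# Tuples from the example, as (entity, relation, principal).
TUPLES = [
    ("LISTING : 10", "OWNER", "User(123)"),
    ("LISTING : 10", "RESERVATION", "Reference(RESERVATION : 500)"),
    ("RESERVATION : 500", "GUEST", "User(456)"),
]

# Hypothetical configuration for LISTING : LOCATION # READ:
# relations checked directly on the listing...
DIRECT_RELATIONS = ["OWNER"]
# ...and references to follow, with the relations checked on the target.
REFERENCES = {"RESERVATION": ["GUEST"]}


def query(entity, relation, principal=None):
    """Return matching principals; `principal=None` matches any principal."""
    return [
        p for (e, r, p) in TUPLES
        if e == entity and r == relation and (principal is None or p == principal)
    ]


def check_location_read(listing, principal):
    """Can `principal` read the location of `listing` (e.g. 'LISTING : 10')?"""
    # First fetch: direct membership on the listing.
    if any(query(listing, rel, principal) for rel in DIRECT_RELATIONS):
        return True
    # First fetch also returns outgoing references; the second fetch follows
    # each one, substituting in the referenced entity's id.
    for ref_relation, target_relations in REFERENCES.items():
        for ref in query(listing, ref_relation):
            target = ref[len("Reference("):-1]  # "RESERVATION : 500"
            if any(query(target, rel, principal) for rel in target_relations):
                return True
    return False


print(check_location_read("LISTING : 10", "User(456)"))  # True, via GUEST
print(check_location_read("LISTING : 10", "User(123)"))  # True, via OWNER
print(check_location_read("LISTING : 10", "User(789)"))  # False
```

Note that the stored reference carries no relation: the GUEST relation to check on the reservation comes from REFERENCES, matching the choice to keep that relation in configuration.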
Himeji is split into three layers:
The most significant changes we made to Himeji over Zanzibar’s setup are to:
We implement the same reliability (hedging, tiered caching) and load shedding features as Zanzibar for availability.
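For readers unfamiliar with request hedging, the general idea is to send a second copy of a slow request to another replica and use whichever response arrives first. The sketch below shows that generic technique only; the function name hedged_call, the delay value, and the simulated replicas are assumptions for illustration and say nothing about how Himeji implements it.

```python
import concurrent.futures as cf
import time


def hedged_call(primary, backup, hedge_delay_s):
    """Call `primary`; if it has not answered within `hedge_delay_s` seconds,
    also call `backup` and return whichever result arrives first."""
    pool = cf.ThreadPoolExecutor(max_workers=2)
    try:
        first = pool.submit(primary)
        try:
            return first.result(timeout=hedge_delay_s)
        except cf.TimeoutError:
            # Primary is slow: issue the hedge request and race the two.
            second = pool.submit(backup)
            done, _ = cf.wait([first, second], return_when=cf.FIRST_COMPLETED)
            return next(iter(done)).result()
    finally:
        # Do not block on the slower request; its result is discarded.
        pool.shutdown(wait=False)


def slow_replica():
    time.sleep(0.3)  # simulated slow replica
    return "slow"


def fast_replica():
    return "fast"


print(hedged_call(slow_replica, fast_replica, hedge_delay_s=0.05))  # fast
print(hedged_call(fast_replica, slow_replica, hedge_delay_s=0.05))  # fast
```

The hedge delay trades extra load for tail latency: a shorter delay sends more duplicate requests but cuts off more of the slow tail.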
Himeji has been serving checks in production for about a year. Its throughput scaled up from 0 in March 2020 to 850k entities / sec in March 2021 while maintaining its availability and latency targets:
P50 Latency 1.8 ms
P95 Latency 7 ms
P99 Latency 12 ms
To cut down integration time and drive developer adoption, we built tools such as:
The Himeji authorization system, based on Zanzibar, unifies authorization data and logic for Airbnb. Prior to its introduction, maintaining consistency and performance across disjoint pieces of logic was difficult. Himeji utilizes a simple data model with a flexible logic configuration to centralize all product and data authorization. Himeji expands on Zanzibar’s scalability and performance attributes, and pushes latencies lower through its high hit rate tiered distributed cache. All these together result in Himeji storing tens of billions of relations and serving nearly a million entity authorizations a second while maintaining low latency and high availability.
Himeji was made possible through the contributions of many members of the team within Airbnb. We thank previous and current members of the team: Max Burkhardt, Alex Rosenblatt, Jefferson Lee, Divya Gupta, Clare Liu, Houkun Li, Leelakrishna Nukala, Karen Kim, Gary Leung, Ryan Flood, Tony Tran, and Gurer Kiratli. Additional thanks to our current and previous management, who have been incredibly supportive of this work: Anish Das Sarma, Vijaya Kaza, Jason Sobel, Bipin Suresh, Marc Blanchou, Raymie Stata, and Aristotle Balogh.
This work and many other exciting things are always happening at Airbnb. If you want to join us, check out our Airbnb Careers page.
“Rails” and “Ruby on Rails” are registered trademarks of David Heinemeier Hansson.
Apache Kafka, Apache Airflow, Apache Spark and Apache are either registered trademarks or trademarks of The Apache Software Foundation in the United States and/or other countries.
AWS and Amazon Aurora are the trademarks of Amazon.com, Inc. or its affiliates in the United States and/or other countries.
Java is a registered trademark of Oracle and/or its affiliates.
All trademarks are the properties of their respective owners. Any use of these is for identification purposes only and does not imply sponsorship or endorsement.
Himeji: A Scalable Centralized System for Authorization at Airbnb was originally published in Airbnb Engineering & Data Science on Medium.