Breaking the Black Box with the Contents Intact: Refactoring the Transaction Domain
In October 2021, Mercari launched the “RFS” project to strengthen the company’s foundation for service and feature development.
“RFS” is an acronym for “Robust Foundation for Speed.” The project was launched to support disruptive growth across the entire Mercari Group, and aims to solve complex technical challenges and drastically strengthen the shared foundation of all Mercari Group companies.
It has now been more than a full year since the launch of the RFS project. So, in this article, we wanted to answer the question that’s on everyone’s mind: “How is it going?” In this series spanning three articles, we will talk to three key people overseeing the three areas of focus for RFS, namely C2C transactions, ID platform, and CS tool, and ask them about the problems faced by the project and how they are addressed. The first interview is with Engineering Manager Yuichi Takada and Tech Lead Eric Jelliffe, who belong to the Transaction Team overseeing the C2C transactions domain.
Featured in this article
Yuichi Takada
After working as a software engineer at Rakuten and Sansan, he joined Mercari in 2017. He initially worked as a backend engineer for Mercari US (US@Tokyo), later taking up the role of tech lead within Mercari Japan. He is currently an engineering manager overseeing the Transaction Team.
Eric Jelliffe
After graduating from the New Jersey Institute of Technology, he gained multiple years of work experience in New York as a frontend/backend developer using LAMP and JS. Following that, he moved to Tokyo. He joined Fast Retailing in 2017 as a backend developer. In March 2021, he joined Mercari as a backend engineer.
Formed as a team that can focus on the transaction domain 100%
──The Transaction Team was formed at the start of RFS. What has been the mission of the team since?
@takady：C2C transactions were picked as a domain for RFS to focus on.
C2C transactions have been the core of Mercari’s business since its launch. It is the function that makes Mercari a marketplace for people to buy and sell things with one another, and it is central to all other functions and features of Mercari. The systems in this domain were complex and tightly coupled, which affected development speed significantly. This was a major issue back then. By skillfully pulling these tangled domains apart from each other, we would be able to swiftly develop and implement necessary features and services. This is why we decided to focus on C2C transactions.
Splitting domains was necessary not just for technological purposes, but also for the sake of the organization. In order to ensure the extensibility of the product, we had to form a development structure that allowed for microservices and modular monoliths. Engineers, EMs, and product managers had to come together, figure out the optimal way of splitting the domains, and build a structure based around that plan. The Transaction Team was formed around the time RFS was kicked off as the unit to take on these issues.
──Why wasn’t there a Transaction Team up until that point?
@takady：There had been other teams who owned the C2C transaction domain up to that point, but their focus was more on managing the Mercari API monolith, while also looking after the transaction domain. However, considering the scope of the RFS project, we needed a team that could focus on the transaction domain 100%. The team that Eric and I were a part of became the base, a few more members joined, and the Transaction Team was formed.
──I see. So then, what kind of a team has it grown into since last October?
@takady： Initially, not all members had knowledge of the transaction domain. As it was a new team as well, the members had to deepen their understanding of each other.
We started running “code reading circles” as a team, through which we intended to grow everyone’s domain knowledge. In the first quarter, which was Q1 of last year, we ran two circles per week and covered 30 endpoints. Looking back now, I think we benefited greatly from those code reading circles. They served as a platform not just for learning, but also for communication. We were able to observe the communication style of each member and the kind of workstyle they wanted to achieve.
──Is it a popular practice among engineers to hold code reading circles?
@takady：Forming study groups and reading books on technology is a practice that I think is common, but I had never heard of code reading circles before.
@eric.j：Since the chain of events that led to this group of people owning this particular domain is a little out of the ordinary in its own right, we thought a similarly unorthodox method would serve us well in getting up to speed. I think the circles also helped us build a sense of camaraderie as a team.
@takady： Even so, the first session of the code reading circle did come with its own issues, or should I say difficulties. Eric was the host of the first one, so the proceedings were mainly in English. This put the members who are not as skilled in English in a tough spot and made it difficult to have deep exchanges and discussions. We made sure to assure them that Japanese was welcome too, but I think there was a bit of awkwardness as we were still a fresh team. Following that, we held a retrospective before the second session and took steps to ensure that the practice of holding code reading circles would be more meaningful for the team.
──By the way, what kind of talent do you have in the team?
@takady：We have a lot of people who have held leadership or management positions. But as most of the members were relatively new to Mercari, they did not know the full picture of the C2C transaction domain. That is why I believe documentation is very important for driving projects or teams successfully. We have documented each of the code reading circle sessions and have made good use of the recording feature on Google Meet. The team has now reached a state where we can easily onboard any member who may join us in the future, covering not just the technical side but also the reasoning and history behind the decisions made.
@eric.j：The documentation of the pre-existing monolith was written entirely in comments. There is a feature that can parse those comments and export proper documents, but no feature or mechanism in place to check that the knowledge recorded there is indeed true and accurate. Because of that, we chose not to continue that tradition and instead opted for writing API documentation based on OpenAPI specs.
The documents prepared with that process are also currently used by the QA Team, which shows that all the effort was indeed worth it. (laughs)
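To illustrate the contrast Eric draws between comment-based documentation and a machine-readable spec, here is a minimal sketch in Python. It is not Mercari's actual specification or tooling: the `/transactions/{id}` path, the field names, and the helper functions `response_schema` and `violations` are all hypothetical, and the checker handles only a tiny subset of OpenAPI. The point is only that a spec expressed as data can be checked against real responses, which free-form comments cannot.

```python
# Hypothetical OpenAPI 3.0 fragment, written as a Python dict for brevity.
# The path and field names are illustrative assumptions, not Mercari's API.
SPEC = {
    "openapi": "3.0.3",
    "info": {"title": "Transaction API (illustrative)", "version": "0.1.0"},
    "paths": {
        "/transactions/{id}": {
            "get": {
                "responses": {
                    "200": {
                        "description": "A single transaction",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "required": ["id", "status"],
                                    "properties": {
                                        "id": {"type": "string"},
                                        "status": {"type": "string"},
                                    },
                                }
                            }
                        },
                    }
                }
            }
        }
    },
}

# Deliberately tiny type map; a real validator supports far more.
JSON_TYPES = {"string": str, "integer": int}


def response_schema(spec, path, method, status):
    """Look up the JSON response schema declared for one operation."""
    response = spec["paths"][path][method]["responses"][status]
    return response["content"]["application/json"]["schema"]


def violations(schema, payload):
    """Return a list of ways the payload disagrees with the schema."""
    problems = []
    for name in schema.get("required", []):
        if name not in payload:
            problems.append(f"missing required field: {name}")
    for name, prop in schema.get("properties", {}).items():
        expected = JSON_TYPES.get(prop.get("type"))
        if name in payload and expected and not isinstance(payload[name], expected):
            problems.append(f"wrong type for field: {name}")
    return problems


schema = response_schema(SPEC, "/transactions/{id}", "get", "200")
ok = violations(schema, {"id": "t-1", "status": "done"})
bad = violations(schema, {"id": 1})
```

With a check like this in a test suite, documentation that has drifted from the implementation shows up as a failing test rather than as a misleading comment.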
Tackling a large and complex black box
──What were the issues you had to address as a team?
@takady：The code reading circles taught us with full certainty that the biggest issue was the structure of the transaction domain, which was essentially a large and complex black box. A black box means no one can see the full picture and fiddling with it without the proper knowledge can easily break the whole thing. We had to bring it to a state where development could be driven safely and swiftly. We unanimously decided to tackle that first and foremost.
──And how did you end up resolving the issue of complexity?
@takady：We have different system units and team units. In terms of system units, we first defined the transaction domain to be composed of nine components.
Meanwhile, the team is split into two. These team units are namely “Checkout” and “Transaction”. The Checkout component is vital to making payments when buying something on the marketplace and requires a substantial amount of expert knowledge to handle properly. This is why we thought it should have its own sub-team. The Transaction Sub-team, on the other hand, owns every other part of transactions from start to finish.
After we decided to split the transaction domain into nine components around the beginning of 2022, we worked with PMs to prioritize those components. As we addressed them from the top, some of them were completed and became loosely coupled. This provided the foundation to build new features on top of them, which is major progress.
──Can you explain “loosely coupled” in more detail?
@takady：In the process of defining components, we realized that some of them were heavily dependent on one another. The problem with such dependencies is not just that a change in one component affects another, but that we could not even predict how large the impact would be. “Loosely coupled” as we envisioned it meant drawing the proper boundaries, providing the appropriate interfaces, and making it possible to access other components only via these official interfaces. All of this makes it possible to develop features safely and swiftly.
Measured against the goal, we have finished around 40–50% of the work. We are roughly halfway there. The design and examination of the system are also happening alongside the project, so we aren’t as much in the dark as we were initially. Still, some aspects of the domain are buried deep and waiting to be uncovered, so maybe those parts are still in the aforementioned black box.
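The idea of “loosely coupled” components described in this answer (proper boundaries, with access only through official interfaces) can be sketched roughly as follows. This is an illustrative Python example, not Mercari's code: `CheckoutInterface`, `InMemoryCheckout`, `TransactionService`, and the status string are hypothetical names chosen for the sketch.

```python
from dataclasses import dataclass
from typing import Protocol


class CheckoutInterface(Protocol):
    """Hypothetical official interface exposed by a Checkout component."""

    def pay(self, buyer_id: str, item_id: str, amount: int) -> str:
        """Charge the buyer and return a payment ID."""
        ...


@dataclass
class InMemoryCheckout:
    """Toy Checkout implementation; its internals stay hidden from callers."""

    _next_id: int = 1

    def pay(self, buyer_id: str, item_id: str, amount: int) -> str:
        if amount <= 0:
            raise ValueError("amount must be positive")
        payment_id = f"pay-{self._next_id}"
        self._next_id += 1
        return payment_id


class TransactionService:
    """Depends only on CheckoutInterface, never on Checkout internals."""

    def __init__(self, checkout: CheckoutInterface) -> None:
        self._checkout = checkout
        self.status = {}  # item_id -> transaction status

    def purchase(self, buyer_id: str, item_id: str, price: int) -> str:
        # The only way into the Checkout component is its official interface.
        payment_id = self._checkout.pay(buyer_id, item_id, price)
        self.status[item_id] = "waiting_for_shipping"
        return payment_id


service = TransactionService(InMemoryCheckout())
first_payment = service.purchase("buyer-1", "item-42", 1200)
```

Because `TransactionService` knows only the interface, the Checkout implementation can be rewritten or replaced without touching transaction code, and the impact of a change is bounded by the interface itself.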
──How do you feel about the progress you’ve made so far?
@takady：A whole year went by very quickly. We invested time into team building in the early stages. Thanks to that, looking back at our results these past two quarters, I think it’s moving along smoothly. The team was indeed formed to progress RFS, but that’s not all we do. Our work in the past year has allowed for the development of all-new features, so we can use our time for such projects as well. What about you, Eric?
@eric.j：Compared to the early stages of the team, we all have deeper knowledge of the domain, and I say that with confidence. When we get an inquiry about the system from another team, we immediately know which part to examine. When we receive the requirements for a new feature from a PM, we can tell how feasible it is and how to go about it. These things alone are enough to make me feel happy about our progress.
Boldly redesigning components that directly connect to profits with the ultimate goal of eventually running feature development and system improvement in parallel
──What will the Transaction Team be doing in the future?
@takady：For RFS work, now we know that not all components need to be loosely coupled. We will prioritize the components based on the magnitude of their business impact and work our way down from the top. The team’s work will continue beyond that. I want to tackle feature development alongside other PMs and engineers. That is something I’m personally looking forward to.
The transaction domain is still a complex one, and it will always be vital for the Mercari marketplace. We have the opportunity to push forward the evolution of such an important area, and then build more features on top of it. I believe there is a lot that can be done.
@eric.j：Before, even when we were handed large-scale requirements, we weren’t able to get to work very quickly, which was frustrating. But after pulling the domain apart and making each module loosely coupled, we can build new features on top, meaning we can make the lives of our users even easier, and I’m looking forward to that very much.
@takady：We have a history of prioritizing feature development that will benefit the user. System redesign or maintenance hadn’t been prioritized very well before RFS, making it increasingly difficult to build new features. Of course there were other efforts to fix some of those problems in the past, but the executive decision to invest resources in a major way has certainly changed the picture.
@eric.j：I have felt firsthand how large the impact of this redesign is. It is very difficult to redesign an existing system without discarding or even suspending any of it. I don’t think you can experience something like this in another company.
──What is most interesting about working in the Transaction Team?
@takady：Refactoring work will be a necessity sooner or later in any company. It is still rare for a service of this size to receive support from the company to refactor its core domains. It’s stimulating work to understand business needs and “uncover the mystery” of expert knowledge lost to time. Having impact on a part of the product that is so integral to our revenue is thrilling, too. This would be a great experience for any team, and it certainly is for ours too.
@eric.j：What we are trying to do now is basically break a black box while leaving the contents intact. Mercari’s developers have been afraid of touching the black box that is the transaction domain. This makes it harder for them to add new features to it that would make Mercari users’ lives easier. The intention behind this redesign is to create interfaces that can eliminate such fears in developers and ultimately help them provide new and better features to end users.
@takady：RFS is not the goal of the Transaction Team. What we have to do is continue building new features on top of what RFS intends to achieve. Even if we rebuild our infrastructure with this project now, continuously developing new features might eventually require that we refactor our code base again. We have to run feature development and system improvements in parallel to one another, and maintain a code base that makes it easy to build new things on top. I believe that is the ultimate purpose of this team.